Lecture: Indexing Structures for Files and Physical Database Design
Readings: Chapter 17, Fundamentals of database systems, Seventh Edition (R. Elmasri, S. Navathe).
Indexes
- indexes are used to speed up record retrieval in response to certain search conditions
- indexes structures provide secondary access paths
- any field can be used as an index
- multiple indexes can be constructed
- most indexes are based on ordered files
- index files are small and can be retrieved with less block accesses
Single-Level Ordered Indexes
- ordered index similar to an index on a book
- the indexing field stores the value of the index and a pointer to the block where the record can be retrieved
- values in index are ordered
- types of single-level indexes
- primary index
- specified on the
ordering
key field of ordered file of records - index file is an ordered file with two keys
- primary key
- pointer to a disk block
- one index entry in the index file for each block in the data file
- disadvantage: insertion and deletion of records
- move records around and change index values
- solutions: use unordered overflow file, use linked list of overflow records
- specified on the
- clustering index
- used if numerous records can have the same value for the
ordering
field - file records are physically ordered on a non-key field without a distinct value for each record
- index file is an ordered file with two keys
- clustering field value
- pointer to a disk block of the first appearance of the field value
- used if numerous records can have the same value for the
- secondary index
- can be specified on any
non ordering
field - index file is file with two keys
- indexing field
- block pointer or record pointer
- usually needs more storage and longer search time than primary index (because it’s dense)
- can be specified on any
- primary index
- indexes may be dense or sparse
- dense index has an index entry for every search key value in the data file
- sparse index has entries for only some search values
- applicable when records are sequentially ordered on search-key
Other Types of Indexes
- multilevel Indexes
- designed to reduce remaining search space as search is conducted
- index file
- first level of the multilevel index
- second level
- primary index to the first level
- third level
- primary index to the second level
- index file
- designed to reduce remaining search space as search is conducted
- hash indexes
- secondary structure for file access
- uses hashing on a search key other than the one used for the primary data file organization
- index entries of form
(k, pr)
or(k, p)
pr
: pointer to the record containing the keyp
: pointer to the block containing the record for that key
Some General Issues Concerning Indexing
- physical index
- pointer specifies physical record address
- disadvantage: pointer must be changed if record is moved
- secondary indexes can be created for any primary record organization
- complements other primary access methods
- tuning indexes
- tuning goals
- dynamically evaluate requirements
- reorganize indexes for best performance
- reasons for revising initial index choice
- certain queries may take too long to run due to lack of an index
- certain indexes may not get utilized
- certain indexes may undergo too much updating if based on an attribute that undergoes frequent changes
- tuning goals