Index (database)
An index is a database feature that allows quick access to the rows of a table. An index is created on one or more columns of the table. Not only is the index smaller than the original table (because it has fewer columns), but it is also optimized for quick searching, usually via a balanced tree.
Some databases extend the power of indexes further by allowing them to be created on functions or expressions. For example, an index could be created on upper(last_name), which would store only the uppercase versions of the last_name field in the index. Indexes can also be defined as unique or non-unique. A unique index acts as a constraint on the table by preventing duplicate entries in the index and thus duplicate values in the indexed columns.
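The following is a minimal sketch of an expression index and a unique index. It assumes SQLite accessed through Python's standard sqlite3 module; the people table, its email column, and the index names are hypothetical and chosen only for illustration. Other databases use different syntax for the same ideas.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, email TEXT)"
)

# Expression index: stores upper(last_name) rather than last_name itself.
conn.execute("CREATE INDEX idx_people_upper_last ON people (upper(last_name))")

# Unique index: acts as a constraint that rejects duplicate email values.
conn.execute("CREATE UNIQUE INDEX idx_people_email ON people (email)")

conn.execute(
    "INSERT INTO people (first_name, last_name, email) "
    "VALUES ('Ira', 'Finkelstein', 'ira@example.com')"
)

# A second row with the same email violates the unique index.
try:
    conn.execute(
        "INSERT INTO people (first_name, last_name, email) "
        "VALUES ('Irene', 'Fink', 'ira@example.com')"
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# A query written against the same expression can be served by the expression index.
matches = conn.execute(
    "SELECT first_name FROM people WHERE upper(last_name) = 'FINKELSTEIN'"
).fetchall()
```

In this sketch the second insert is rejected, so the table still holds a single row.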
= Architecture =
There are two index architectures, clustered and non-clustered. Both use a balanced tree (b-tree) data structure in which the index is arranged as a tree with two types of pages. At the top of the tree is the index set of pages; these act as an index to the index. At the lowest (or leaf) level is the sequence set of pages.
In a non-clustered index, the leaf level contains the values of one or more columns together with pointers to the data pages containing the rows with those index values. In a clustered index, the leaf level is the actual data page, and the data is physically stored on the data pages in ascending order. The order of values in the index pages is also ascending.
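The difference between the two leaf levels can be sketched in a few lines of Python. This is a deliberate simplification, not a real b-tree: the upper index pages are replaced by a binary search over a sorted list, and all names and sample rows are hypothetical.

```python
import bisect

# Hypothetical table stored in insertion order (an unordered heap of rows).
heap_rows = [
    ("Zimmer", "Hans"),
    ("Adams", "Ann"),
    ("Finkelstein", "Ira"),
    ("Moore", "Lee"),
]

# Non-clustered index: the leaf level holds (key, row position) pairs in key
# order, kept separately from the rows themselves.
nonclustered_leaf = sorted((row[0], pos) for pos, row in enumerate(heap_rows))
nonclustered_keys = [key for key, _ in nonclustered_leaf]

def lookup_nonclustered(last_name):
    i = bisect.bisect_left(nonclustered_keys, last_name)
    if i < len(nonclustered_keys) and nonclustered_keys[i] == last_name:
        _, pos = nonclustered_leaf[i]
        return heap_rows[pos]  # follow the pointer to the data
    return None

# Clustered index: the leaf level is the rows themselves, stored in key order.
clustered_leaf = sorted(heap_rows)
clustered_keys = [row[0] for row in clustered_leaf]

def lookup_clustered(last_name):
    i = bisect.bisect_left(clustered_keys, last_name)
    if i < len(clustered_keys) and clustered_keys[i] == last_name:
        return clustered_leaf[i]  # the leaf entry is the row
    return None
```

Both lookups return the same row. The non-clustered version needs one extra step to follow the pointer from the index entry to the stored row.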
== Column order ==
The order in which columns are listed in the index definition is important. It is possible to retrieve a set of row identifiers using only the first indexed column. However, on most databases it is not possible or not efficient to retrieve the set of row identifiers using only the second or a later indexed column.
For example, imagine a phone book that is organized by city first, then by last name, and then by first name. If given the city, you can easily extract the list of all phone numbers for that city. However, in this phone book it would be very tedious to find all the phone numbers for a given last name. You would have to look within each city's section for the entries with that last name. Some databases can do this; others just won't use the index.
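The phone book example can be tried out with a small sketch, again assuming SQLite through Python's sqlite3 module. The phone_book table and the index name are hypothetical. EXPLAIN QUERY PLAN is SQLite's way of reporting how it intends to run a query; other databases have their own equivalents and may make different choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE phone_book (city TEXT, last_name TEXT, first_name TEXT, phone TEXT)"
)

# Composite index organized like the phone book: city, then last name, then first name.
conn.execute(
    "CREATE INDEX idx_phone_book ON phone_book (city, last_name, first_name)"
)

def plan(sql):
    """Return SQLite's query plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Filtering on the leading column can use the index.
by_city = plan("SELECT phone FROM phone_book WHERE city = 'Springfield'")

# Filtering only on the second column cannot seek into the index.
by_last_name = plan("SELECT phone FROM phone_book WHERE last_name = 'Smith'")

print(by_city)
print(by_last_name)
```

In this setup the first plan names idx_phone_book, while the second falls back to scanning the table.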
= Applications and limitations =
Indexes are useful for many applications but do come with some limitations. Consider the following SQL statement: SELECT first_name FROM people WHERE last_name = 'Finkelstein';. To process this statement without an index, the database software must look at the last_name column of every row in the table (this is known as a full table scan). With an index, the database simply follows the b-tree data structure until the Finkelstein entry has been found; this is much less computationally expensive than a full table scan.
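The effect of the index on this statement can be observed with a short sketch. It assumes SQLite through Python's sqlite3 module; the index name idx_people_last_name and the sample rows are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (first_name TEXT, last_name TEXT)")
conn.executemany(
    "INSERT INTO people (first_name, last_name) VALUES (?, ?)",
    [("Ira", "Finkelstein"), ("Ann", "Adams"), ("Lee", "Moore")],
)

query = "SELECT first_name FROM people WHERE last_name = 'Finkelstein'"

def plan(sql):
    """Return SQLite's query plan description for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)

# Without an index, every row has to be examined (a full table scan).
plan_without_index = plan(query)

# With an index on last_name, the database can seek directly to the entry.
conn.execute("CREATE INDEX idx_people_last_name ON people (last_name)")
plan_with_index = plan(query)

result = conn.execute(query).fetchall()

print(plan_without_index)
print(plan_with_index)
```

The query returns the same row in both cases; only the access path changes.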
Consider this SQL statement: SELECT email_address FROM customers WHERE email_address LIKE '%@yahoo.com';. This query would yield an email address for every customer whose email address ends with @yahoo.com, but even if the email_address column has been indexed, the database still must perform a full table scan. This is because the index is built with the assumption that words go from left to right. With a wildcard at the beginning of the search term, the database software is unable to use the underlying b-tree data structure. This problem can be solved by adding another index, created on reverse(email_address), and using a SQL query like this: SELECT email_address FROM customers WHERE reverse(email_address) LIKE reverse('%@yahoo.com');. This puts the wildcard at the rightmost part of the search term (now 'moc.oohay@%'), which the index on reverse(email_address) can satisfy.
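Why reversing helps can be illustrated without a database at all. The sketch below is a simplification that stands in for the index on reverse(email_address): a sorted Python list of reversed addresses, searched with the standard bisect module. The sample addresses and the ends_with helper are hypothetical, and the upper bound assumes the keys contain no character at the very top of the Unicode range.

```python
import bisect

emails = [
    "ann@yahoo.com",
    "bob@example.org",
    "cy@yahoo.com",
    "dee@yahoo.co.uk",
]

# Stand-in for an index on reverse(email_address): reversed keys in sorted
# order, each paired with the original address.
reversed_index = sorted((address[::-1], address) for address in emails)
reversed_keys = [key for key, _ in reversed_index]

def ends_with(suffix):
    """Find addresses ending in suffix via a prefix range scan on reversed keys."""
    prefix = suffix[::-1]  # "@yahoo.com" becomes "moc.oohay@"
    lo = bisect.bisect_left(reversed_keys, prefix)
    # Upper bound: the prefix followed by the highest code point
    # (an assumption that suffices for ordinary text keys).
    hi = bisect.bisect_left(reversed_keys, prefix + "\U0010ffff")
    return sorted(address for _, address in reversed_index[lo:hi])

yahoo_customers = ends_with("@yahoo.com")
```

Because all keys that share a prefix sit next to each other in sorted order, the lookup touches only the matching range rather than every entry. A leading wildcard on the unreversed column offers no such contiguous range.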