Indexes in Oracle DB part 1

Submitted by epilecki on
Blog article:

Indexes… what indexes?

Indexes are one of the key Oracle features in terms of improving query performance and one might think it should be clear to everybody when and how to use them. Not at all... Working as a DBA for several years I was surprised to notice that so many developers do not know what indexes are, how they work and how to use them efficiently. I thought it would be good to put this knowledge down in a series of blog posts… and here we are!  Let’s start with…


Fundamentals

According to Oracle documentation - “Indexes are optional structures associated with tables and clusters that allow SQL statements to execute more quickly against a table. … Oracle Database index provides a faster access path to table data. You can use indexes without rewriting any queries. Your results are the same, but you see them more quickly.

This is a good definition. I should also mention that indexes are automatically maintained by Oracle throughout DML operations and store rows in a sorted fashion for indexed column or combination of columns.

Apart from primary and foreign key constraint-enforcing, indexes are optional so you don’t have to create any if you don’t need them. In fact it’s better not to create them if they are not useful for the query workload. Some developers tend to index all columns of all tables just in case to guarantee fast query response for any possible queries – this is a very bad practice. Although logically and physically independent from the base table indexes are not always beneficial and not completely transparent.

They can also have negative implications such as:

  • Performance degradation of DML statements: indexes must be updated in case of DML operations on the base table
  • Additional contention when multiple sessions modify data covered by the same index blocks; especially true for indexes that grow monotonically from sequences and most severe for bitmap indexes due to the implementation of bitmap locking mechanism
  • Indexing all columns in the table needs a lot of extra storage, sometimes more than the table data itself consume
  • If broken (in UNUSABLE state) they can cause user queries to fail unless SKIP_UNUSABLE_INDEXES parameter is also set to TRUE

So, are the indexes bad? Definitely no!

Should we create them? Well… it depends. In general the answer is yes, but only those that are really needed for a given configuration and workload.

There are some DB configurations which can survive with little or no indexes - for example Oracle Exadata machines that do automatic in-memory indexing of all columns on storage level and can therefore do well without ordinary indexes.

Small tables that can be read in a few multi-block IO operations through full table scan will likely not benefit from any indexes.

You should only index columns if you expect that:

  • queries will use those columns as filters in their queries (conditions in WHERE clause)
  • queries  will need to perform sorts or groupings on those columns

I would recommend starting with very few indexes that seem to be most important for the SQL workload and successively add new ones as they are needed or remove those that are not really in use. A new feature of Oracle 11g – invisible indexes may be very handy in discovering unused indexes.


Internal structure and basic types

We have 2 basic types of indexes in Oracle: B-tree indexes and bitmap indexes; and a bunch of sub-types or flavours of the basic types: unique, composite, function-based, reverse key, partitioned, etc.

How does an index look like? In Oracle the base structure of all indexes is a B-tree (or balanced tree). Many Oracle specialists tend to think about bitmap indexes as sets of bitmaps, but not so many of them know that even the bitmap indexes have B-tree structure. What is really different between normal B-tree index and bitmap index is the leaf block format.

B-tree structure used in Oracle indexes has some nice advantages:

  • the tree always stays balanced
  • all leaf blocks are on the same level so retrieval of each single row takes approximately the same amount of time
  • typical B-tree has 3 levels  (sometimes 4) even for very large tables, so a single row retrieval performance is not degraded as the table grows

Normal B-tree index stores key-rowid pairs as ordered lists in each index leaf block; so to get to the specific row Oracle first browses from the root to the leaf that contains the requested value and then iterates through subsequent rows as needed. For each rowid it then goes to the base table and retrieves the full row using single block access mode.

In case of a bitmap index the leaf blocks do not contain lists of key-rowid pairs, but rather so called bitmap pieces. Each bitmap piece consists of the key value, start rowid, end rowid and a bitmap that describes if rows within the given range of rowids have the specific key value set. Note that bitmaps in bitmap pieces are compressed and thus the rows can be stored in bitmap index in very space efficient manner. Because each indexed value requires a separate set of bitmaps such indexes are most space efficient for low cardinality columns. However, high cardinality columns achieve better compression ratios due to the fact that compression algorithm groups zeros in bitmaps and bitmaps for such columns contain mostly zeros; so the storage overhead is not huge even for columns having many distinct values.

To retrieve rows having some specific value oracle has to decompress the bitmaps, translate them into rowids and then access the base table with single block reads. Although bitmaps need to be decompressed and translated to rowids upon access performance is not really negatively impacted as being smaller in size bitmap indexes take fewer IO operations when blocks are fetched from disk.  This means that bitmap indexes tend to use slightly more CPU power while reducing the burden on IO and it might be beneficial to replace normal indexes with bitmap indexes if the system is IO bound. However one should also note that the choice of index types should mainly be driven by the shape of user queries.


Which of the 2 basic index types should I use?

Do not create your indexes blindly – take your time to capture typical query workload and analyse the queries. Or if you lack time or experience use Oracle supplied advisors (SQL Tuning advisor, SQL Access advisor) on a typical workload to generate initial index recommendations for you.

As soon as you know which columns should be indexed you should decide what types and flavours of indexes to use. Right now let’s talk only about choosing between normal B-tree and a bitmap index.

Normal B-tree index should be created if:

  • the query contains range-based WHERE conditions, such as >,<, <=,>=; especially for high cardinality columns
  • the query retrieves single row or few rows through equality condition filter (=) on medium or high cardinality column; especially true for unique column
  • the query needs to order or group by data by the specified column or group of columns
  • there is high DML activity concerning indexed column(s)

Note that Oracle can only use one B-tree index for accessing table data. If you observe several columns often used together as filters it might be very beneficial to create a composite (multi-column) B-tree index that covers all of those columns. As Oracle is only able to use the whole index or the leading subset of columns - the columns that are most likely to be used in queries should appear as the first in the index; the second criterion is selectivity of the columns – more selective columns (with higher cardinality) should also be given priority.

B-tree indexes do not store NULL values, so they cannot be used to speed up queries if indexed columns are filtered with IS NULL predicate.

Bitmap index should be created if:

  • the query contains equality WHERE condition (=) on low or medium cardinality column
  • the query contains multiple equality WHERE conditions linked with any combination of logical operations such as: AND, OR, NOT
  • the query performs count(*) or count(<column>) aggregations
  • the workload is mostly read only, with little or no DML activity
  • the query contains inequality WHERE condition: !=, <> on low cardinality column
  • the query contains IS NULL clause in the WHERE conditions
  • the query contains IN operator with list containing few values (or subquery returning few values): dozens rather than hundreds or thousands, but the exact boundary depends on the column cardinality and may be lower

Note that in contrary to normal B-tree indexes Oracle can combine more than one bitmap index in a single access if WHERE query filters are connected with logical operators. Therefore if there is a group of columns of which random subsets are used as filters in queries it might be a good idea to create a single column bitmap index on each of such columns and let Oracle combine them when queries are executed.

Many Oracle books state that bitmap indexes are only useful for low cardinality columns, containing less than 10 distinct values. This is not really true. Bitmap indexes can also be beneficial for medium cardinality columns containing dozens, hundreds or even a few thousands distinct values, provided that the conditions listed earlier in this post are met.

Bitmap indexes should not be used for high DML contention workloads due to the locking behaviour implemented. To modify a single row in a bitmap index Oracle needs to lock 1 or 2 bitmap pieces (for the old and for the new value). Because each bitmap piece describes many rows no other rows from the same bitmap pieces may be modified at the same time. In case of high DML activity it may introduce significant performance issues.

It's important to note that bitmap indexes, unlike normal B-tree, also index NULL values - there is a separate set of bitmaps for NULL.


Topic to be continued…

This post describes some basic facts about Oracle indexes. Watch for the next posts in this thread – I will be describing index sub-types and flavours (when to use them), various index scans and also in more details performance considerations of using indexes in Oracle.

Add new comment

CAPTCHA
Enter the characters shown in the image.
This question is for testing whether or not you are a human visitor and to prevent automated spam submissions.

Disclaimer

The views expressed in this blog are those of the authors and cannot be regarded as representing CERN’s official position.

CERN Social Media Guidelines

 

Blogroll