Indexes in Oracle DB part 2

In my first post about indexes I promised that more in this topic will follow up and here it is... This series of articles is based on observation how developers fail to correctly implement indexing in their applications based on Oracle and aims to provide guidelines on how indexes should be used. Today let’s focus on index scans. Understanding how their work might be very helpful in planning the indexing strategy!


Index scans fundamentals

An index scan is an operation of reading index blocks to quickly identify rows matching indexed criteria. Depending on the scan type, Oracle can either read the leaf blocks directly (full scan) or navigate through branch blocks first to find the desired leaf blocks. Every index leaf entry contains the ROWID of rows in the table or in case of bitmap indexes a bitmap together with ROWID ranges for calculating the exact physical location of each row that fits the search conditions. With the ROWIDs at hand the rows are retrieved from the table via single block reads.

The following operations are performed in the course of each index scan:

  • Navigating an index tree to select the starting leaf block associated with the column key values specified in the query WHERE condition; except for full scans
  • Jumping to additional leaf blocks using pointers (if needed)
  • Loading, decompressing and optionally merging bitmaps – only for bitmap indexes
  • Returning ROWIDs of all matching rows

Note that for a small table (or large portion of a big table) it might be cheaper to read it with full table scan as such uses multi block reads. However index scans have one significant advantage in comparison to full scans – they are cached. Blocks retrieved through index scans are placed in buffer cache and the more often they are accessed the longer they remain cached.

The following query displays the most frequently accessed data blocks:

SELECT obj object, dbarfil file#, dbablk block#, tch touches
FROM x$bh ORDER BY tch desc;

This also means that if a database is restarted or buffer cache flushed, the cache will need some time to fill in with data and all queries that use indexes will be temporarily slower. Beware, as the slowdown may be huge, even by an order of magnitude. Exactly how long the degradation lasts depends on the database workload. In a very active database it might be just 15-30 minutes, but some less active might need even 24hrs to achieve good cache efficiency.

Now let’s talk about various index scan types…


B-tree index scans

Index unique scan

This scan returns, at most, a single ROWID. The optimizer chooses unique scan when all indexed columns are specified in WHERE clause with equality conditions and in addition the index was created as UNIQUE, for example to support PRIMARY KEY constraint. This scan is typically used for primary key lookup and also by nested loops join algorithm if the child/detail table is chosen as a driving table. This is also the fastest way of retrieving a single row.

Index range scan

An index range scan is a common operation for accessing selective data. It can be used when the query specifies a range predicate (>, >=, <, <=, BETWEEN) matching indexed columns, any combination of range predicates joined with AND, or an equality predicate (=) if the index is not unique. A text search with a trailing wildcard (for example, ‘ABC%’) can also result in an index range scan.

The first rows to be returned are selected through navigating from the root to the leaf, but the next rows are accessed sequentially. Oracle uses pointers to jump to adjacent leaf blocks if necessary. Data is returned in a sorted order with multiple rows having identical values being sorted by ROWIDs.

Note that index range scan might also be used to satisfy ORDER BY clause. This might happen even if large portion of the table is being accessed and where full table scan would normally be faster given no ORDER BY is specified.

Having the table rows ordered in the same way as they appear in the index greatly improves the table access through index range scans. This is because adjacent rows are located in the same data blocks and fewer blocks must be accessed to satisfy query criteria, thus resulting in reduced I/O overhead.

You can use the optimizer hint /*+ INDEX(table index) */ to manually influence Oracle to use a range scan with a specific index.

Index full scan

Index full scan means scanning all leaf blocks in the index in the sorted order. This is rarely faster than a full table scan as single block access is used to get table data for each indexed row, so you won’t see this scan type too often. 

However there are some cases when full index scan might perform well enough to be employed and that is when the query needs data in the same sorted order as present in the index – thus sorting can be avoided. Even if the query needs sorted data full table scan and a sort might still be better, so index FS is typically employed when one of the additional conditions apply on top of the need for sorting:

  • Full table scan followed by a sort will be more expensive than full index scan; which typically happens when disk sort is envisaged (much slower than memory sort)
  • All columns referenced by the query are included by the index; no need to access the table itself
  • LIKE operator with leading wildcard is used for indexed column
  • Inequality predicate (!= or <>)  is used for indexed column
  • Equality or range predicate is used for non-leading column of the index

Index fast full scan

Fast  full index scan is similar to index full scan in a way that it also accesses all leaf blocks of the index, the difference is that it does so in non-sorted fashion, reading the blocks as they appear on disk. This means that index FFS cannot be used to satisfy a sort, but it can use multi-block access method (same as full table scan) to access the index data very quickly. It can only be used when at least one of the indexed columns is NOT NULL.

This type of scan is typically employed when the query references all columns stored in the index and the data is not requested in sorted order. As an index, containing only small subset of all columns, is usually much smaller than the table itself it can be read much faster than the whole table. Index FFS can also be used to serve count(*) queries, although bitmap indexes are usually better in such queries.

This is the only index scan that can be parallelized for non-partitioned tables.

You can use optimizer hint /*+ INDEX_FFS (table index) */ to manually influence Oracle to use a fast full scan with a specific index.

Index skip scan

Index skip scan can be used for composite (multi-column) indexes when the leading column is not specified in the query WHERE clause filters but the other trailing columns are. Skip scanning enables a composite index to be split logically into smaller sub-indexes.  The number of logical sub-indexes is determined by the number of distinct values in the initial column. Skip scanning is advantageous if the leading column of the composite index has very few distinct values and non-leading part of the index has good selectivity. Each sub-index is scanned individually for values matching the query predicates, and because multiple independent scans are needed it might only be faster than other access types if the first column has low cardinality – for example 2 or 3 distinct values.

You can use optimizer hint /*+ INDEX_SS (table index) */ to manually influence Oracle to use the skip scan method on a specific index.

Example

SQL> CREATE TABLE employees (id NUMBER(4), name VARCHAR2(20), gender VARCHAR2(1));
SQL> -- populate the table, e.g. (4,’SMITH’,’M’)
SQL> CREATE INDEX emp_gen_nam_idx on employees(gender,name);
SQL> SELECT * FROM employees WHERE name=’SMITH’;

Splitting this composite index would result in two logical sub-indexes: the first has the keys with the gender value ‘F’, the second sub-index has the keys with the gender value ‘M’. Each sub-index will be scanned independently to find rows matching name = ‘SMITH’ criteria.


Bitmap index scans

Bitmap indexes only differ from normal b-tree indexes by the leaf block format so the basic principles of a normal b-tree scan apply also to bitmap indexes. The difference is that instead of traversing the lists of key-value pairs in leaf blocks (as it happens for normal indexes) oracle loads the bitmaps, decompresses them and converts to rowid ranges that match filter criteria. Note that multiple bitmaps can easily be combined with logical operators so the conversion to rowid don’t have to be performed for each bitmap separately but rather for the resulting final bitmap after logical operations have been applied on all bitmaps used to filter data in a particular query.

Note that hints for using specific index scans listed for b-tree indexes also work for bitmap indexes.  For example you can use /*+ INDEX(table index) */ to influence optimizer to perform BITMAP INDEX (RANGE SCAN).

BITMAP INDEX (SINGLE VALUE)

Single value bitmap index scan is performed by Oracle when the query specifies an equality condition for a bitmap indexed column. A single bitmap is being built for the value specified in the WHERE condition. Note that IS NULL predicate can also be served by this access method because bitmap indexes include NULL values as an additional set of bitmap pieces.

BITMAP INDEX (RANGE SCAN)

This type of scan is performed by Oracle when the query specifies a range condition (<, <=, >, >=, BETWEEN, or LIKE with a trailing ‘%’ wildcard) for a bitmap indexed column. A set is constructed containing one bitmap for each value within the filtered range.  This operation is then followed by BITMAP MERGE, which combines independent single-value bitmaps into one result bitmap covering all values within the range. Note that returned rows are ordered in the same way as for B-tree indexes—first by the indexed values and then by ROWID.

BITMAP INDEX (FULL SCAN)

The use-cases of bitmap index full scan are similar to those of normal index full scans, which is, when data is needed in sorted order. A set is constructed containing one bitmap for each value of the indexed column that were not filtered out by query WHERE conditions. This operation is then followed by BITMAP MERGE, which combines independent single-value bitmaps into one result bitmap covering all values. Note that full scan on bitmap indexes is usually more expensive than the same scan would be on a normal b-tree index, so it is even less likely to see this scan type employed by the optimizer.

BITMAP INDEX (FAST FULL SCAN)

The use-cases of bitmap index full scan are similar to those of normal index fast full scans, which is, when all columns referenced in the query are contained in the index and data don’t have to be returned in sorted order.

A bitmap index fast full scan is often used in conjunction with BITMAP CONVERSION (COUNT) operation to serve COUNT(…) aggregate queries in a very efficient manner.

BITMAP CONVERSION (COUNT)

This type of scan counts and returns the number of set bits in one or more bitmaps. This operation serves to implement COUNT() aggregates and is considered the fastest way of answering such queries, especially for small bitmap indexes built on low cardinality columns.

BITMAP CONVERSION (COUNT) can be used for count(*) queries, or count(column) queries if the column specified has NOT NULL constraint. The count(column) for nullable columns or count(distinct column) can also be served by an index fast full scan, but in such cases, BITMAP CONVERSION (TO ROWIDS) is performed instead of BITMAP CONVERSION (COUNT), and ROWIDs are aggregated using either the SORT GROUP BY or SORT AGGREGATE algorithm, which is slightly slower but still very efficient. Note that this operation works even if WHERE clause filter is present for the indexed column, in which case only bitmaps for values matching the filter are included in the counting.

BITMAP CONVERSION (TO ROWIDS)

This is typically the last step of bitmap index related operations. When all possible bitmap indexes are applied and merged into the final bitmap Oracle converts the final bitmap into ROWIDs; then the ROWIDs are used to retrieve matching rows from the table segment.

BITMAP CONVERSION (FROM ROWIDS)

This technique is used to create a bitmap on the fly (if there is no bitmap index present) when the optimizer judges that it is faster to use logical operations on bitmaps than any other data access method. This means that non-existing bitmap index is kind of simulated, typically for being used in conjunction (combination of AND/OR/NOT logical operations) with other bitmap indexes, either real or also simulated.  In most cases this kind of conversion is only beneficial if the WHERE clause contains OR operator, which is typically very slow with other access methods.

Example

SQL> EXPLAIN PLAN FOR
SELECT COUNT(*) FROM table
WHERE col1 = 1234 AND col 2 = ‘ABC’ OR col 3 BETWEEN 10 AND 20;
 
SELECT STATEMENT
     SORT AGGREGATE
       BITMAP CONVERSION COUNT
         BITMAP OR
           BITMAP AND
             BITMAP INDEX SINGLE VALUE col1_ind
             BITMAP CONVERSION FROM ROWIDS
               INDEX RANGE SCAN col2_ind
           BITMAP CONVERSION FROM ROWIDS
             SORT ORDER BY
               INDEX RANGE SCAN col3_ind

BITMAP MERGE

This type of scan combines independent single-value bitmaps into one result bitmap covering multiple values. It is used after bitmap index range or full scans to build a single bitmap for the specified group of values.

BITMAP AND/OR/NOT/MINUS

Oracle can perform logical AND, OR, NOT, or MINUS (AND NOT) operations between two or more bitmaps very efficiently, so having multiple single-column bitmap indexes is the fastest option for satisfying queries with multiple equality predicates connected with logical operators.

Example

The following query can be satisfied by efficient logical operations on bitmap indexes:

SQL> CREATE TABLE passengers (id NUMBER(10), name VARCHAR2(20), gender VARCHAR2(1), age_group VARCHAR2(20));
SQL> INSERT INTO passengers VALUES (1,’SMITH’,’M’,’ADULT’);
SQL> INSERT INTO passengers VALUES (2,’BROWN’,’F’,’ADULT’);
SQL> INSERT INTO passengers VALUES (3,’SPARROW’,’F’,’INFANT’);
SQL> INSERT INTO passengers VALUES (4,’MCDONALD’,’M’,’CHILD’);
SQL> commit;
SQL> CREATE BITMAP INDEX pas_gen_idx on passengers(gender);
SQL> CREATE BITMAP INDEX pas_age_idx on passengers(age_group);
SQL> SELECT COUNT(*) FROM passengers WHERE gender = 'M' AND age_group IN ('INFANT','CHILD');
 
SELECT STATEMENT 
  SORT AGGREGATE 
    BITMAP CONVERSION COUNT  
      BITMAP AND 
        BITMAP INDEX SINGLE VALUE PAS_GEN_IDX
        BITMAP OR
          BITMAP INDEX SINGLE VALUE PAS_AGE_IDX
          BITMAP INDEX SINGLE VALUE PAS_AGE_IDX

Topic to be continued…

After covering the basic scan types in this post, I will move on to more advenced aspects. In the next article I will describe index block splits and index performance considerations/tips, including related optimizer parameters.

 

 

 

 

 

 

 

 

 

 

 

Add new comment

You are here