Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part III (“Load...
source link: https://richardfoote.wordpress.com/2022/04/28/automatic-indexes-scenarios-where-automatic-indexes-not-created-part-iii-loaded/
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
Automatic Indexes: Scenarios Where Automatic Indexes NOT Created Part III (“Loaded”) April 28, 2022
Posted by Richard Foote in 19c, Advanced Index Compression, Automatic Indexing, Autonomous Database, Autonomous Transaction Processing, CBO, Clustering Factor, Data Clustering, Exadata, Index Access Path, Index Column Order, Index Compression, Oracle, Oracle 21c, Oracle Cloud, Oracle Cost Based Optimizer, Oracle General, Oracle Indexes, Oracle19c, Overloading.
trackback
In my previous two posts, I’ve discussed scenarios where Automatic Indexing (AI) does not currently created automatic indexes and you may need to manually create the necessary indexes.
In this post, I’ll discuss a third scenario where AI will create an index, but you may want to manually create an even better one…
I’ll start by creating a relatively “large” table, with 20+ columns:
SQL> create table bowie_overload (id number, code 1 number, code 2 number, stuff 1 varchar 2 ( 42 ), stuff 2 varchar 2 ( 42 ), stuff 3 varchar 2 ( 42 ), stuff 4 varchar 2 ( 42 ), stuff 5 varchar 2 ( 42 ), stuff 6 varchar 2 ( 42 ), stuff 7 varchar 2 ( 42 ), stuff 8 varchar 2 ( 42 ), stuff 9 varchar 2 ( 42 ), stuff 10 varchar 2 ( 42 ), stuff 11 varchar 2 ( 42 ), stuff 12 varchar 2 ( 42 ), stuff 13 varchar 2 ( 42 ), stuff 14 varchar 2 ( 42 ), stuff 15 varchar 2 ( 42 ), stuff 16 varchar 2 ( 42 ), stuff 17 varchar 2 ( 42 ), stuff 18 varchar 2 ( 42 ), stuff 19 varchar 2 ( 42 ), stuff 20 varchar 2 ( 42 ), name varchar 2 ( 42 )); Table created. SQL> insert into bowie_overload select rownum, mod(rownum, 1000 )+ 1 , '42' , 'David Bowie' , 'Major Tom' , 'Ziggy Stardust' , 'Aladdin Sane' , 'Thin White Duke' , 'David Bowie' , 'Major Tom' , 'Ziggy Stardust' , 'Aladdin Sane' , 'Thin White Duke' , 'David Bowie' , 'Major Tom' , 'Ziggy Stardust' , 'Aladdin Sane' , 'Thin White Duke' , 'David Bowie' , 'Major Tom' , 'Ziggy Stardust' , 'Aladdin Sane' , 'Thin White Duke' , 'The Spiders From Mars' from dual connect by level <= 10000000 ; 10000000 rows created. SQL> commit; Commit complete. SQL> exec dbms_stats.gather_table_stats(ownname=>null, tabname=> 'BOWIE_OVERLOAD' ); PL/SQL procedure successfully completed. |
The main columns to note here are CODE1 which contains 1000 distinct values (and so is kinda selective on a 10M row table, but not spectacularly so, especially with a poor clustering factor) and CODE2 which always contains the same value of “42” (and so will compress wonderfully for maximum effect).
I’ll next run the following query a number of times:
SQL> select code 1 , code 2 from bowie_overload where code 1 = 42 ; 10000 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1883860831 -------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | -------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10000 | 70000 | 74817 ( 1 ) | 00: 00: 03 | | * 1 | TABLE ACCESS STORAGE FULL | BOWIE_OVERLOAD | 10000 | 70000 | 74817 ( 1 ) | 00: 00: 03 | -------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - storage( "CODE1" = 24 ) filter( "CODE1" = 24 ) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 869893 consistent gets 434670 physical reads 0 redo size 183890 bytes sent via SQL*Net to client 7378 bytes received via SQL*Net from client 668 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10000 rows processed |
Without an index, the CBO currently has no choice but to perform a FTS. An index on the CODE1 column would provide the necessary filtering to fetch and return the required rows.
BUT, if this query was important enough, we could improve things further by “Overloading” this index with the CODE2 column, so we could use the index exclusively to get all the necessary data, without having to access the table at all. Considering an index on just the CODE1 column would need to fetch a reasonable number of rows (10000) and would need to visit a substantial number of different table blocks due to its poor clustering, overloading the index in this scenario would substantially reduce the necessary workloads of this query.
So what does AI do in this scenario, is overloading an index considered?
If we look at the AI report:
GENERAL INFORMATION ------------------------------------------------------------------------------- Activity start : 28 -APR -2022 12: 15: 45 Activity end : 28 -APR -2022 12: 16: 33 Executions completed : 1 Executions interrupted : 0 Executions with fatal error : 0 ------------------------------------------------------------------------------- SUMMARY (AUTO INDEXES) ------------------------------------------------------------------------------- Index candidates : 1 Indexes created ( visible / invisible) : 1 ( 1 / 0 ) Space used ( visible / invisible) : 134.22 MB ( 134.22 MB / 0 B) Indexes dropped : 0 SQL statements verified : 2 SQL statements improved (improvement factor) : 2 ( 47.1 x) SQL plan baselines created : 0 Overall improvement factor : 47.1 x ------------------------------------------------------------------------------- SUMMARY (MANUAL INDEXES) ------------------------------------------------------------------------------- Unused indexes : 0 Space used : 0 B Unusable indexes : 0 ------------------------------------------------------------------------------- INDEX DETAILS ------------------------------------------------------------------------------- The following indexes were created: ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- | Owner | Table | Index | Key | Type | Properties | ------------------------------------------------------------------------------- | BOWIE | BOWIE_OVERLOAD | SYS_AI_aat 8 t 6 ad 0 ux 0 h | CODE 1 | B-TREE | NONE | ------------------------------------------------------------------------------- ------------------------------------------------------------------------------- VERIFICATION DETAILS ------------------------------------------------------------------------------- The performance of the following statements improved: ------------------------------------------------------------------------------- Parsing Schema Name : BOWIE SQL ID : bh 5 cuyv 8 ga 0 bt SQL Text : select code 1 , code 2 from bowie_overload where code 1 = 42 Improvement Factor : 46.9 x Execution Statistics: ----------------------------- Original Plan Auto Index Plan ---------------------------- ---------------------------- Elapsed Time (s): 42619069 241844 CPU Time (s): 25387841 217676 Buffer Gets: 12148771 18499 Optimizer Cost: 74817 10021 Disk Reads: 6085380 9957 Direct Writes: 0 0 Rows Processed: 140000 10000 Executions: 14 1 PLANS SECTION --------------------------------------------------------------------------------------------- - Original ----------------------------- Plan Hash Value : 1883860831 -------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | -------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | | | 74817 | | | 1 | TABLE ACCESS FULL | BOWIE_OVERLOAD | 10000 | 70000 | 74817 | 00: 00: 03 | -------------------------------------------------------------------------------- - With Auto Indexes ----------------------------- Plan Hash Value : 2541132923 --------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost | Time | --------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 9281 | 64967 | 10021 | 00: 00: 01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_OVERLOAD | 9281 | 64967 | 10021 | 00: 00: 01 | | * 2 | INDEX RANGE SCAN | SYS_AI_aat 8 t 6 ad 0 ux 0 h | 10000 | | 18 | 00: 00: 01 | --------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): ------------------------------------------ * 2 - access( "CODE1" = 42 ) Notes ----- - Dynamic sampling used for this statement ( level = 11 ) |
We see that an automatic index on just the CODE1 column was created.
SQL> select index_name, auto , visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name= 'BOWIE_OVERLOAD' ; INDEX_NAME AUT VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------- --- --------- ------------- -------- ---------- ----------- ----------------- SYS_AI_aat 8 t 6 ad 0 ux 0 h YES VISIBLE ADVANCED LOW VALID 10000000 15363 10000000 SQL> select index_name, column_name, column_position from user_ind_columns where table_name= 'BOWIE_OVERLOAD' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------- --------------- --------------- SYS_AI_aat 8 t 6 ad 0 ux 0 h CODE 1 1 |
If we now re-run the query (noting in Oracle21c after you invalidate the current cursor):
SQL> select code 1 , code 2 from bowie_overload where code 1 = 42 ; 10000 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 2541132923 ------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 10000 | 70000 | 10021 ( 1 )| 00: 00: 01 | | 1 | TABLE ACCESS BY INDEX ROWID BATCHED | BOWIE_OVERLOAD | 10000 | 70000 | 10021 ( 1 )| 00: 00: 01 | | * 2 | INDEX RANGE SCAN | SYS_AI_aat 8 t 6 ad 0 ux 0 h | 10000 | | 18 ( 0 )| 00: 00: 01 | ------------------------------------------------------------------------------------------------------------ Predicate Information (identified by operation id): --------------------------------------------------- 2 - access( "CODE1" = 42 ) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 10021 consistent gets 0 physical reads 0 redo size 50890 bytes sent via SQL*Net to client 63 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10000 rows processed |
The query now uses the newly created automatic index.
BUT, at 10021 consistent gets, it’s still doing a substantial amount to work here.
If we manually create another index that overloads the only other column (CODE2) required in this query:
SQL> create index bowie_overload_code 1 _code 2 _i on bowie_overload(code 1 ,code 2 ) compress advanced low ; Index created. |
I’m using COMPRESS ADVANCED LOW as used by the automatic index, noting that CODE2 only contains the value “42” for all rows, making it particularly perfect for compression and a “best case” scenario when it comes to the minimal overheads potentially associated with overloading this index (I’m trying yo give AI every chance here):
SQL> select index_name, auto , constraint_index, visibility, compression, status, num_rows, leaf_blocks, clustering_factor from user_indexes where table_name= 'BOWIE_OVERLOAD' ; INDEX_NAME AUT CON VISIBILIT COMPRESSION STATUS NUM_ROWS LEAF_BLOCKS CLUSTERING_FACTOR ------------------------------ --- --- --------- ------------- -------- ---------- ----------- ----------------- SYS_AI_aat 8 t 6 ad 0 ux 0 h YES NO VISIBLE ADVANCED LOW VALID 10000000 15363 10000000 BOWIE_OVERLOAD_CODE 1 _CODE 2 _I NO NO VISIBLE ADVANCED LOW VALID 10000000 15363 10000000 SQL> select index_name, column_name, column_position from user_ind_columns where table_name= 'BOWIE_OVERLOAD' order by index_name, column_position; INDEX_NAME COLUMN_NAME COLUMN_POSITION ------------------------------ --------------- --------------- BOWIE_OVERLOAD_CODE 1 _CODE 2 _I CODE 1 1 BOWIE_OVERLOAD_CODE 1 _CODE 2 _I CODE 2 2 SYS_AI_aat 8 t 6 ad 0 ux 0 h CODE 1 1 |
In fact, my manually created index is effectively the same size as the automatic index, with the same number (15363) of leaf blocks.
So I’m giving AI the best possible scenario in which it could potentially create an overloaded index.
But I’ve never been able to get AI to create overloaded indexes. Only columns in filtering predicates are considered for inclusion in automatic indexes.
If I now re-run my query again:
SQL> select code 1 , code 2 from bowie_overload where code 1 = 42 ; 10000 rows selected. Execution Plan ---------------------------------------------------------- Plan hash value: 1161047960 ------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 10000 | 70000 | 18 ( 0 )| 00: 00: 01 | | * 1 | INDEX RANGE SCAN | BOWIE_OVERLOAD_CODE 1 _CODE 2 _I | 10000 | 70000 | 18 ( 0 )| 00: 00: 01 | ------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access( "CODE1" = 42 ) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 21 consistent gets 0 physical reads 0 redo size 50890 bytes sent via SQL*Net to client 63 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 10000 rows processed |
We notice the CBO now uses the manually created index without any table access path, as it can just use the index to access the necessary data.
The number of consistent gets as a result has reduced significantly, down to just 21, a fraction of the previous 10021 when the automatic index was used.
So the scenario an of overloaded index that could significantly reduce database resources, which is currently not supported by AI, is another example of where may want to manually create a necessary index.
As always, this may change in the future releases…
Recommend
About Joyk
Aggregate valuable and interesting links.
Joyk means Joy of geeK