Indexing strategy for dimension tables.
+3
Jeff Smith
hkandpal
platforminc
7 posters
Page 1 of 1
Indexing strategy for dimension tables.
Hi All,
I want to find out what the best practice is when it comes to indexing dimensions, is it better to create covering indexes or perhaps one per column where the column will be used for joins/search predicates. Also, any guidance on dimension tables with very few rows ?
The DBMS in use is SQL server.
Thanks.
I want to find out what the best practice is when it comes to indexing dimensions, is it better to create covering indexes or perhaps one per column where the column will be used for joins/search predicates. Also, any guidance on dimension tables with very few rows ?
The DBMS in use is SQL server.
Thanks.
platforminc- Posts : 7
Join date : 2012-05-25
Indexing strategy for dimension tables.
Hi,
one way you can go about is to have a Primary key in the surrogate column as that column will be used in your Fact tables.
If the database is SQL server you can try clustering for the natural key it may help you in the queries (you need to test the select and insert as index may help the query but will slow inserts and updates).
For dimensions's where the number of rows are not many you can have one table which will store all those types of dimensions, search for junk dimension you will get many article.
thanks
Himanshu
one way you can go about is to have a Primary key in the surrogate column as that column will be used in your Fact tables.
If the database is SQL server you can try clustering for the natural key it may help you in the queries (you need to test the select and insert as index may help the query but will slow inserts and updates).
For dimensions's where the number of rows are not many you can have one table which will store all those types of dimensions, search for junk dimension you will get many article.
thanks
Himanshu
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Indexing strategy for dimension tables.
I think the best way is to check the execution plan to see where the bottle is, apply an index, recheck the execution plan to see if it helped.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Indexing strategy for dimension tables.
I suggested to apply indices on each useful columns.
One Bitmap index on columns for Oracle
Multiple BTree/Hash indices on columns for MySQL
One Bitmap index on columns for Oracle
Multiple BTree/Hash indices on columns for MySQL
winipcfg- Posts : 2
Join date : 2012-07-30
Re: Indexing strategy for dimension tables.
I would suggest following general tips on indexing
1. For surrogate key columns -Btree Indexs
2. For Business Key columns - Btree indexs
3. For low cardinality columns - BitMap Indexs
4. If your queries are using any function utilizing dim column use function indexs or try creating a separate column in dim table with after applying functions
e.g. concat (fname,lname) instead of this create separate column with full name and store pre- concatenated value.
After this do a sql plan analysis to optimize indexes.
1. For surrogate key columns -Btree Indexs
2. For Business Key columns - Btree indexs
3. For low cardinality columns - BitMap Indexs
4. If your queries are using any function utilizing dim column use function indexs or try creating a separate column in dim table with after applying functions
e.g. concat (fname,lname) instead of this create separate column with full name and store pre- concatenated value.
After this do a sql plan analysis to optimize indexes.
Re: Indexing strategy for dimension tables.
From a SQL Server perspective, I make the surrogate key column a Pk which is clustered. From there, I typically put an unique index on the natural key to both prevent accidental duplicates and aid in look-ups when updating. From there, most indexes are results of performance tuning and need to be reviewed periodically.
chumeniuk- Posts : 3
Join date : 2010-05-17
Bitmap / Stats
If you're working on Oracle, bitmap indices work very well - you can use them on common columns that will be used for filtering, and if you decide to use them on join columns - it works very well if you define the same index on your fact table as well. Oracle will process the join using just the index data which makes the joins happen fairly fast.
If Teradata is your database, it is best to simply collect statistics on important columns and not define any Secondary Indices to begin with (you would have defined a Primary Index when you created the table). Here you will have to focus a lot more on the distribution of data across Amps and redistribution during join operations. Choosing the right indices here is a bit more complicated and goes a long way to dictate your the performance.
If Teradata is your database, it is best to simply collect statistics on important columns and not define any Secondary Indices to begin with (you would have defined a Primary Index when you created the table). Here you will have to focus a lot more on the distribution of data across Amps and redistribution during join operations. Choosing the right indices here is a bit more complicated and goes a long way to dictate your the performance.
vickyejain- Posts : 7
Join date : 2012-08-20
Similar topics
» Dimension Table Indexing Strategy
» Fact Table Indexing Strategy
» Loading fact table and dimension indexing.
» Very Large Dimension Strategy
» Too much of inner joins-ETL Strategy to load into Dimension
» Fact Table Indexing Strategy
» Loading fact table and dimension indexing.
» Very Large Dimension Strategy
» Too much of inner joins-ETL Strategy to load into Dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum