Indexing strategy for dimension tables.

platforminc Fri Aug 10, 2012 10:39 am

Hi All,

I want to find out what the best practice is when it comes to indexing dimensions, is it better to create covering indexes or perhaps one per column where the column will be used for joins/search predicates. Also, any guidance on dimension tables with very few rows ?

The DBMS in use is SQL server.

Thanks.

hkandpal Wed Aug 15, 2012 12:27 pm

Hi,

one way you can go about is to have a Primary key in the surrogate column as that column will be used in your Fact tables.
If the database is SQL server you can try clustering for the natural key it may help you in the queries (you need to test the select and insert as index may help the query but will slow inserts and updates).
For dimensions's where the number of rows are not many you can have one table which will store all those types of dimensions, search for junk dimension you will get many article.

thanks

Himanshu

Jeff Smith Wed Aug 15, 2012 3:17 pm

I think the best way is to check the execution plan to see where the bottle is, apply an index, recheck the execution plan to see if it helped.

winipcfg Wed Aug 15, 2012 9:28 pm

I suggested to apply indices on each useful columns.

One Bitmap index on columns for Oracle
Multiple BTree/Hash indices on columns for MySQL

kuldeepchitrakar Thu Aug 16, 2012 4:19 am

I would suggest following general tips on indexing

1. For surrogate key columns -Btree Indexs
2. For Business Key columns - Btree indexs
3. For low cardinality columns - BitMap Indexs
4. If your queries are using any function utilizing dim column use function indexs or try creating a separate column in dim table with after applying functions

e.g. concat (fname,lname) instead of this create separate column with full name and store pre- concatenated value.

After this do a sql plan analysis to optimize indexes.

chumeniuk Thu Aug 16, 2012 3:48 pm

From a SQL Server perspective, I make the surrogate key column a Pk which is clustered. From there, I typically put an unique index on the natural key to both prevent accidental duplicates and aid in look-ups when updating. From there, most indexes are results of performance tuning and need to be reviewed periodically.

vickyejain Mon Aug 20, 2012 3:30 am

If you're working on Oracle, bitmap indices work very well - you can use them on common columns that will be used for filtering, and if you decide to use them on join columns - it works very well if you define the same index on your fact table as well. Oracle will process the join using just the index data which makes the joins happen fairly fast.

If Teradata is your database, it is best to simply collect statistics on important columns and not define any Secondary Indices to begin with (you would have defined a Primary Index when you created the table). Here you will have to focus a lot more on the distribution of data across Amps and redistribution during join operations. Choosing the right indices here is a bit more complicated and goes a long way to dictate your the performance.

Sponsored content

Indexing strategy for dimension tables.

Indexing strategy for dimension tables.

Indexing strategy for dimension tables.

Re: Indexing strategy for dimension tables.

Re: Indexing strategy for dimension tables.

Re: Indexing strategy for dimension tables.

Re: Indexing strategy for dimension tables.

Bitmap / Stats

Re: Indexing strategy for dimension tables.