Data Warehouse - Indexes

AzeemFarooqui Tue Feb 24, 2009 12:28 pm

Hi,

I am currently at the point where I need to think about Indexes on my data warehouse. We are currently using SQL Server 2005. As a general rule of thumb would the following hold true (Please note that the reporting requirements are still vague):

1) All primary key's will have a clustered Index
2) All foreign keys will have a non clustered index
3) Any columns which require ordering will have an index.

I would be interested to know what other peoples thoughts are on this strategy?

I would also be interested to know how other people have defined an index strategy within their data warehouses.

Thanks
Azeem

Joy Tue Feb 24, 2009 1:48 pm

For SQL Server 2005, my standard suggested starting point for indexes on a dimensional DW that's used for direct querying & reports is:

- Dimension tables get clustered single-column PK index on the (integer) primary keys
- For a very large dimension, I may add a few indexes on the most popular non-key columns (popular = often used in queries)
- For large dimensions (100k rows), I'd consider indexing the source system key (non-unique if you have any Type2 attributes in the dimension). This is not for query time, but for the surrogate key pipeline during ETL. (Though if you use SSIS and cache the lookup, the index may not be very valuable.)

- Fact tables get single-column nonclustered indexes on the FKs to the dimension tables
- If your PK is composite (consisting of some or all of the dimension FKs), do not make it a clustered index. In this case, make a 2 or at most 3-column nonunique clustered index, with DateKey as the first column in the index.
- If you've defined a single-column surrogate PK for the fact table, of course make it clustered.

I'm not sure what you mean by "any column which requires ordering will have an index."

If your relational DW exists only to populate the SSAS database, your indexing can be lighter, even zero. Though this is an unusual case -- most systems have at least some query activity into the relational DW.

AzeemFarooqui Thu Feb 26, 2009 4:52 am

Thanks for getting back on this Joy.

Are you aware of any setting within SQL Sever 2005 that makes the data warehouse perform better? I have heared of a similar parameter in Oracle called star schema enabled. Does SQL Server 2005 have a similar thing?

Apart from that are there any other factors that need to be considered when building a data warehouse in SQL Server 2005?

I appreciate your help.

Regards
Azeem

BoxesAndLines Thu Feb 26, 2009 9:51 am

I've never heard of a setting like that for Oracle. Anyone else know what that is?

inglev Fri Feb 27, 2009 11:36 am

AzeemFarooqui wrote:Thanks for getting back on this Joy.

Are you aware of any setting within SQL Sever 2005 that makes the data warehouse perform better? I have heared of a similar parameter in Oracle called star schema enabled. Does SQL Server 2005 have a similar thing?

Apart from that are there any other factors that need to be considered when building a data warehouse in SQL Server 2005?

I appreciate your help.

Regards
Azeem

I think you are referring to the Oracle parameter star_transformation_enabled. Refer to what Jonathan Lewis wrote about it http://www.dbazine.com/oracle/or-articles/jlewis6
If properly applied, it can really improve query performance.
However, it's not that switching it on makes the data warehouse automatically more performant. A magic FAST=ON switch has not yet been found ;-)

Cheers

TStahr Fri Feb 27, 2009 7:19 pm

- If you've defined a single-column surrogate PK for the fact table, of course make it clustered.

I've always wondered... why use a meaningless surrogate PK of the fact table as your clustered index? Wouldn't it make more sense to identify a more business-meaningful key to build the clustered index?

ubethke Mon Mar 02, 2009 12:48 pm

Star transformation does not exist in SQL Server 2005.

Was introduced in SQL Server 2008 though. As outlined by previous posters this is no magic bullet though.

Sponsored content

Data Warehouse - Indexes

Data Warehouse - Indexes

Re: Data Warehouse - Indexes

Re: Data Warehouse - Indexes

Re: Data Warehouse - Indexes

Re: Data Warehouse - Indexes

Re: Data Warehouse - Indexes

Re: Data Warehouse - Indexes

Re: Data Warehouse - Indexes