Data Warehouse - Indexes
+2
Joy
AzeemFarooqui
6 posters
Page 1 of 1
Data Warehouse - Indexes
Hi,
I am currently at the point where I need to think about Indexes on my data warehouse. We are currently using SQL Server 2005. As a general rule of thumb would the following hold true (Please note that the reporting requirements are still vague):
1) All primary key's will have a clustered Index
2) All foreign keys will have a non clustered index
3) Any columns which require ordering will have an index.
I would be interested to know what other peoples thoughts are on this strategy?
I would also be interested to know how other people have defined an index strategy within their data warehouses.
Thanks
Azeem
I am currently at the point where I need to think about Indexes on my data warehouse. We are currently using SQL Server 2005. As a general rule of thumb would the following hold true (Please note that the reporting requirements are still vague):
1) All primary key's will have a clustered Index
2) All foreign keys will have a non clustered index
3) Any columns which require ordering will have an index.
I would be interested to know what other peoples thoughts are on this strategy?
I would also be interested to know how other people have defined an index strategy within their data warehouses.
Thanks
Azeem
AzeemFarooqui- Posts : 6
Join date : 2009-02-23
Re: Data Warehouse - Indexes
For SQL Server 2005, my standard suggested starting point for indexes on a dimensional DW that's used for direct querying & reports is:
- Dimension tables get clustered single-column PK index on the (integer) primary keys
- For a very large dimension, I may add a few indexes on the most popular non-key columns (popular = often used in queries)
- For large dimensions (100k rows), I'd consider indexing the source system key (non-unique if you have any Type2 attributes in the dimension). This is not for query time, but for the surrogate key pipeline during ETL. (Though if you use SSIS and cache the lookup, the index may not be very valuable.)
- Fact tables get single-column nonclustered indexes on the FKs to the dimension tables
- If your PK is composite (consisting of some or all of the dimension FKs), do not make it a clustered index. In this case, make a 2 or at most 3-column nonunique clustered index, with DateKey as the first column in the index.
- If you've defined a single-column surrogate PK for the fact table, of course make it clustered.
I'm not sure what you mean by "any column which requires ordering will have an index."
If your relational DW exists only to populate the SSAS database, your indexing can be lighter, even zero. Though this is an unusual case -- most systems have at least some query activity into the relational DW.
- Dimension tables get clustered single-column PK index on the (integer) primary keys
- For a very large dimension, I may add a few indexes on the most popular non-key columns (popular = often used in queries)
- For large dimensions (100k rows), I'd consider indexing the source system key (non-unique if you have any Type2 attributes in the dimension). This is not for query time, but for the surrogate key pipeline during ETL. (Though if you use SSIS and cache the lookup, the index may not be very valuable.)
- Fact tables get single-column nonclustered indexes on the FKs to the dimension tables
- If your PK is composite (consisting of some or all of the dimension FKs), do not make it a clustered index. In this case, make a 2 or at most 3-column nonunique clustered index, with DateKey as the first column in the index.
- If you've defined a single-column surrogate PK for the fact table, of course make it clustered.
I'm not sure what you mean by "any column which requires ordering will have an index."
If your relational DW exists only to populate the SSAS database, your indexing can be lighter, even zero. Though this is an unusual case -- most systems have at least some query activity into the relational DW.
Re: Data Warehouse - Indexes
Thanks for getting back on this Joy.
Are you aware of any setting within SQL Sever 2005 that makes the data warehouse perform better? I have heared of a similar parameter in Oracle called star schema enabled. Does SQL Server 2005 have a similar thing?
Apart from that are there any other factors that need to be considered when building a data warehouse in SQL Server 2005?
I appreciate your help.
Regards
Azeem
Are you aware of any setting within SQL Sever 2005 that makes the data warehouse perform better? I have heared of a similar parameter in Oracle called star schema enabled. Does SQL Server 2005 have a similar thing?
Apart from that are there any other factors that need to be considered when building a data warehouse in SQL Server 2005?
I appreciate your help.
Regards
Azeem
AzeemFarooqui- Posts : 6
Join date : 2009-02-23
Re: Data Warehouse - Indexes
I've never heard of a setting like that for Oracle. Anyone else know what that is?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data Warehouse - Indexes
AzeemFarooqui wrote:Thanks for getting back on this Joy.
Are you aware of any setting within SQL Sever 2005 that makes the data warehouse perform better? I have heared of a similar parameter in Oracle called star schema enabled. Does SQL Server 2005 have a similar thing?
Apart from that are there any other factors that need to be considered when building a data warehouse in SQL Server 2005?
I appreciate your help.
Regards
Azeem
I think you are referring to the Oracle parameter star_transformation_enabled. Refer to what Jonathan Lewis wrote about it http://www.dbazine.com/oracle/or-articles/jlewis6
If properly applied, it can really improve query performance.
However, it's not that switching it on makes the data warehouse automatically more performant. A magic FAST=ON switch has not yet been found ;-)
Cheers
inglev- Posts : 3
Join date : 2009-02-17
Re: Data Warehouse - Indexes
- If you've defined a single-column surrogate PK for the fact table, of course make it clustered.
I've always wondered... why use a meaningless surrogate PK of the fact table as your clustered index? Wouldn't it make more sense to identify a more business-meaningful key to build the clustered index?
TStahr- Posts : 4
Join date : 2009-02-27
Re: Data Warehouse - Indexes
Star transformation does not exist in SQL Server 2005.
Was introduced in SQL Server 2008 though. As outlined by previous posters this is no magic bullet though.
Was introduced in SQL Server 2008 though. As outlined by previous posters this is no magic bullet though.
Similar topics
» Ebook The Data Warehouse Lifecycle Toolkit, 2nd Edition: Practical Techniques for Building Data Warehouse and Business Intelligence Systems
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
» Is it a best practice that Data warehouse follows the source system data type?
» data warehouse and data warehouse system
» difference between data mart and data warehouse at logical/physical level
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
» Is it a best practice that Data warehouse follows the source system data type?
» data warehouse and data warehouse system
» difference between data mart and data warehouse at logical/physical level
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum