Kimball or not Kimball type 2 dimension (sk as primary key)
2 posters
Page 1 of 1
Kimball or not Kimball type 2 dimension (sk as primary key)
We're using Oracle and I received the following (see below) from our architect about why we shouldn't create a new SK for every type 2 change. i.e. use a combination of date and sk as the primary key.
Is this correct. Will it affect the creation of bitmap indexes that Oracle craves?
Cheers
------------------------------------------
Type 2 keys
This is in response to Page 258 on the Kimball DW Lifecycle Toolkit. The approach that is strongly recommended on this page will not be followed.
Choice 1: Kimball
In this the SK and the Natural acct key are not synonyms. The purpose of the SK is to uniquely identify each row, so it changes over time, even though the Natural key may remain the same.
Account table
Acct_SK Effective Date Acct_Name Current Row Ind Natural_Acct_Key
001 1/1/05 Mr Jones F 80808012345678
002 2/2/06 Mr & Mrs Jones T 80808012345678
Account Balance table
Acct_SK Effective Date Balance
001 1/1/05 $1
001 1/1/06 $1,000
002 2/2/06 $10
Choice 2: Alternate
In this the SK and the Natural acct key are synonyms. The purpose of the SK is to provide a compressed pseudo key that can be used to join to other tables. The PK of this table is the SK + Effective Date.
Also note that this is the preferred design choice in all large data warehouses, using Teradata hardware.
Account table
Acct_SK Effective Date Acct_Name Current Row Ind Natural _Acct_Key
001 1/1/05 Mr Jones F 80808012345678
001 2/2/06 Mr & Mrs Jones T 80808012345678
Account Balance table
Acct_SK Effective Date Balance
001 1/1/05 $1
001 1/1/06 $1,000
001 2/2/06 $10
Consideration
Choice 1
Plus - Joins between tables do not require date as part of the where clause.
When looking at the history of a table over time, the child table must be joined back to the parent to ensure that the like is compared with like. This means that the SK should not be joined for any external tables.
Choice 2
Plus- The SK may be used to uniquely identify the object that the table represents over time.
Joins between tables always require date as part of the where clause.
---------------
Choice 2 will be used in the design of all Type 2 tables.
[b]
Is this correct. Will it affect the creation of bitmap indexes that Oracle craves?
Cheers
------------------------------------------
Type 2 keys
This is in response to Page 258 on the Kimball DW Lifecycle Toolkit. The approach that is strongly recommended on this page will not be followed.
Choice 1: Kimball
In this the SK and the Natural acct key are not synonyms. The purpose of the SK is to uniquely identify each row, so it changes over time, even though the Natural key may remain the same.
Account table
Acct_SK Effective Date Acct_Name Current Row Ind Natural_Acct_Key
001 1/1/05 Mr Jones F 80808012345678
002 2/2/06 Mr & Mrs Jones T 80808012345678
Account Balance table
Acct_SK Effective Date Balance
001 1/1/05 $1
001 1/1/06 $1,000
002 2/2/06 $10
Choice 2: Alternate
In this the SK and the Natural acct key are synonyms. The purpose of the SK is to provide a compressed pseudo key that can be used to join to other tables. The PK of this table is the SK + Effective Date.
Also note that this is the preferred design choice in all large data warehouses, using Teradata hardware.
Account table
Acct_SK Effective Date Acct_Name Current Row Ind Natural _Acct_Key
001 1/1/05 Mr Jones F 80808012345678
001 2/2/06 Mr & Mrs Jones T 80808012345678
Account Balance table
Acct_SK Effective Date Balance
001 1/1/05 $1
001 1/1/06 $1,000
001 2/2/06 $10
Consideration
Choice 1
Plus - Joins between tables do not require date as part of the where clause.
When looking at the history of a table over time, the child table must be joined back to the parent to ensure that the like is compared with like. This means that the SK should not be joined for any external tables.
Choice 2
Plus- The SK may be used to uniquely identify the object that the table represents over time.
Joins between tables always require date as part of the where clause.
---------------
Choice 2 will be used in the design of all Type 2 tables.
[b]
FactProvider- Posts : 4
Join date : 2009-03-27
Age : 47
Location : Sydney
Re: Kimball or not Kimball type 2 dimension (sk as primary key)
I don't think your architect understands why one implements a type 2 in the first place. The point is to associate the fact with a specific dimensional state... the unique surrogate key identifies that specific state.
If you do what he/she recommends, joins become more complex and slower as they now have to consider date. How do you find the correct row for the transaction on 1/1/06? You'll need to store an expiration date in the dimension as well, which you should do anyway. Fundimentally, both methods will achive the same result, but the unique surrogate key approach is much simpler and unambiguous. Why complicate things?
If you do what he/she recommends, joins become more complex and slower as they now have to consider date. How do you find the correct row for the transaction on 1/1/06? You'll need to store an expiration date in the dimension as well, which you should do anyway. Fundimentally, both methods will achive the same result, but the unique surrogate key approach is much simpler and unambiguous. Why complicate things?
Similar topics
» How to handle a Type I or II dimension with a snowflaked customer sub dimension (kimball book page 337, 338)
» dimension attribute denormalisation in fact table
» rationale behind dimension with Type 0 and missing Type 5
» Primary Key of Dimension and Fact Tables
» Type 2 dimension or type 2 column?
» dimension attribute denormalisation in fact table
» rationale behind dimension with Type 0 and missing Type 5
» Primary Key of Dimension and Fact Tables
» Type 2 dimension or type 2 column?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum