Slowly Changing Dimensions - Design Review (Need More Clarification)
3 posters
Page 1 of 1
Slowly Changing Dimensions - Design Review (Need More Clarification)
Hello,
Suppose I have following Fact and Dim table
CustomerDim
CustomerKey
CustomerID
CustomerNumber
CustomerName
CreditLimit - Slowly Changing, wants history
IsAnnualMember - Slowly Changing, can be different from year to year or order to order
.
.
SalesFact
CustomerKey - surrogate key
CustomerID - Natural Kay
ProductKey - surrogate key
TimeKey - surrogate key
OrderAmount
OrderQty
Now I have read from
http://www.intelligententerprise.com/showArticle.jhtml?articleID=59301280&pgno=1
1.
The Mini Dimension with "Current" Overwrite
When you need historical tracking but are faced with semi-rapid changes in a large dimension, pure type 2 tracking is inappropriate. If you use a mini dimension, you can isolate volatile dimension attributes in a separate table rather than track changes in the primary dimension table directly. The mini-dimension grain is one row per "profile," or combination of attributes, while the grain of the primary dimension might be one row per customer. The number of rows in the primary dimension may be in the millions, but the number of mini-dimension rows should be a fraction of that. You capture the evolving relationship between the mini dimension and primary dimension in a fact table. When a business event (transaction or periodic snapshot) spawns a fact row, the row has one foreign key for the primary dimension and another for the mini-dimension profile in effect at the time of the event.
2.
Type 2 with "Current" Overwrite
Another variation for tracking unpredictable changes while supporting rollup of historical facts to current dimension attributes is a hybrid of type 1 and type 2. In this scenario, you capture a type 2 attribute change by adding a row to the primary dimension table. In addition, you have a "current" attribute on each row that you overwrite (type 1) for the current and all previous type 2 rows. You retain the historical attribute in its own original column. When a change occurs, the most current dimension row has the same value in the uniquely labeled current and historical ("as was" or "at time of event") columns.
3.
Type 2 with Natural Keys in the Fact Table
If you have a million-row dimension table with many attributes requiring historical and current tracking, the last technique we described becomes overly burdensome. In this situation, consider including the dimension natural key as a fact table foreign key, in addition to the surrogate key for type 2 tracking. This technique gives you essentially two dimension tables associated with the facts, but for good reason. The type 2 dimension has historically accurate attributes for filtering or grouping based on the effective values when the fact table was loaded. The dimension natural key joins to a table with just the current type 1 values. Again, the column labels in this table should be prefaced with "current" to reduce the risk of user confusion. You use these dimension attributes to summarize or filter facts based on the current profile, regardless of the values in effect when the fact row was loaded. Of course, if the natural key is unwieldy or ever reassigned, then you should use a durable surrogate reference key instead.
Now in my suitation I want two customer dimensions as (please be easy on me as I have close to 3 months DW/BI knowledge).
CustomerStaticDim
CustomerID
CustomerNumber
CustomerName
CustomerChangingDim
CustomerKey
CustomerID
CreditLimit
IsAnnualMember
StartTimeKey
EndTimeKey
Based on my little knowledge I cannot decide if I am right in my design or I am completely out of my mind.
I need your Expert Opinion please.
Thanks in advance,
Shahzad
Houston.
Suppose I have following Fact and Dim table
CustomerDim
CustomerKey
CustomerID
CustomerNumber
CustomerName
CreditLimit - Slowly Changing, wants history
IsAnnualMember - Slowly Changing, can be different from year to year or order to order
.
.
SalesFact
CustomerKey - surrogate key
CustomerID - Natural Kay
ProductKey - surrogate key
TimeKey - surrogate key
OrderAmount
OrderQty
Now I have read from
http://www.intelligententerprise.com/showArticle.jhtml?articleID=59301280&pgno=1
1.
The Mini Dimension with "Current" Overwrite
When you need historical tracking but are faced with semi-rapid changes in a large dimension, pure type 2 tracking is inappropriate. If you use a mini dimension, you can isolate volatile dimension attributes in a separate table rather than track changes in the primary dimension table directly. The mini-dimension grain is one row per "profile," or combination of attributes, while the grain of the primary dimension might be one row per customer. The number of rows in the primary dimension may be in the millions, but the number of mini-dimension rows should be a fraction of that. You capture the evolving relationship between the mini dimension and primary dimension in a fact table. When a business event (transaction or periodic snapshot) spawns a fact row, the row has one foreign key for the primary dimension and another for the mini-dimension profile in effect at the time of the event.
2.
Type 2 with "Current" Overwrite
Another variation for tracking unpredictable changes while supporting rollup of historical facts to current dimension attributes is a hybrid of type 1 and type 2. In this scenario, you capture a type 2 attribute change by adding a row to the primary dimension table. In addition, you have a "current" attribute on each row that you overwrite (type 1) for the current and all previous type 2 rows. You retain the historical attribute in its own original column. When a change occurs, the most current dimension row has the same value in the uniquely labeled current and historical ("as was" or "at time of event") columns.
3.
Type 2 with Natural Keys in the Fact Table
If you have a million-row dimension table with many attributes requiring historical and current tracking, the last technique we described becomes overly burdensome. In this situation, consider including the dimension natural key as a fact table foreign key, in addition to the surrogate key for type 2 tracking. This technique gives you essentially two dimension tables associated with the facts, but for good reason. The type 2 dimension has historically accurate attributes for filtering or grouping based on the effective values when the fact table was loaded. The dimension natural key joins to a table with just the current type 1 values. Again, the column labels in this table should be prefaced with "current" to reduce the risk of user confusion. You use these dimension attributes to summarize or filter facts based on the current profile, regardless of the values in effect when the fact row was loaded. Of course, if the natural key is unwieldy or ever reassigned, then you should use a durable surrogate reference key instead.
Now in my suitation I want two customer dimensions as (please be easy on me as I have close to 3 months DW/BI knowledge).
CustomerStaticDim
CustomerID
CustomerNumber
CustomerName
CustomerChangingDim
CustomerKey
CustomerID
CreditLimit
IsAnnualMember
StartTimeKey
EndTimeKey
Based on my little knowledge I cannot decide if I am right in my design or I am completely out of my mind.
I need your Expert Opinion please.
Thanks in advance,
Shahzad
Houston.
MeRookie- Posts : 5
Join date : 2009-02-03
Location : Houston
Re: Slowly Changing Dimensions - Design Review (Need More Clarification)
My EXPERT opinion is that you create one Customer dimension table.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Slowly Changing Dimensions - Design Review (Need More Clarification)
Thanks for reply.
Will you please be kind enough to explain then how to maintain and report history changes/data.
Will you please be kind enough to explain then how to maintain and report history changes/data.
MeRookie- Posts : 5
Join date : 2009-02-03
Location : Houston
Re: Slowly Changing Dimensions - Design Review (Need More Clarification)
Make your dimension a type 2. In the ETL process only process updates for the selected columns that you want history on. Ignore the changes in columns that you don't care about history.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Slowly Changing Dimensions - Design Review (Need More Clarification)
Thanks again.
So in my above example the customer dim will be
CustomerDim
CustomerKey
CustomerID
CustomerNumber
CustomerName
CreditLimit
IsAnnualMember
StartDate
EndDate
And every time any of the data changes insert new record in this table. Value will change only for 2-3 columns and will stay constant or will be of Type 1 for 3-4 columns. I have 5-7 years background of OLTP and it is little difficult to digest keep on inserting the repeating values like customer name and number.
So in my above example the customer dim will be
CustomerDim
CustomerKey
CustomerID
CustomerNumber
CustomerName
CreditLimit
IsAnnualMember
StartDate
EndDate
And every time any of the data changes insert new record in this table. Value will change only for 2-3 columns and will stay constant or will be of Type 1 for 3-4 columns. I have 5-7 years background of OLTP and it is little difficult to digest keep on inserting the repeating values like customer name and number.
MeRookie- Posts : 5
Join date : 2009-02-03
Location : Houston
Re: Slowly Changing Dimensions - Design Review (Need More Clarification)
Close. Anytime any of the columns change that you want to capture history for, insert a new record. If you don't want to carry history for a column and only that column changes, don't do anything.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Make Credit Limit a degenerate dimension
An alternative is to make the Credit Limit a Degenerate dimension on the Fact Table. That will reduce the amount of change in the Customer dimension.
If you are concerned about the size of the customer dimension, then you could remove Customer Name from the Customer Dimension and create a Name Dimension. Then put them back together in a Customer Fact Table. This isn't something that I would do, but it would make the Customer Dimension narrower, which might be an issue for huge customer tables.
If you are concerned about the size of the customer dimension, then you could remove Customer Name from the Customer Dimension and create a Name Dimension. Then put them back together in a Customer Fact Table. This isn't something that I would do, but it would make the Customer Dimension narrower, which might be an issue for huge customer tables.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Slowly Changing Dimensions - Design Review (Need More Clarification)
Thanks guys. I pretty much took the words of BoxesAndLines.
MeRookie- Posts : 5
Join date : 2009-02-03
Location : Houston
Similar topics
» Not so slowly changing dimensions
» Slowly Changing Dimensions Are Not Always as Easy as 1, 2, 3
» Slowly changing heterogeneous dimensions
» Slowly changing fact with SCD2 Dimensions
» Bridging Tables and Slowly Changing Dimensions
» Slowly Changing Dimensions Are Not Always as Easy as 1, 2, 3
» Slowly changing heterogeneous dimensions
» Slowly changing fact with SCD2 Dimensions
» Bridging Tables and Slowly Changing Dimensions
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum