Handling a SCD Type 2 change in a fact
2 posters
Page 1 of 1
Handling a SCD Type 2 change in a fact
Hi,
I'm in the process of designing a dimensional model for a product warehouse, and a question I have is about how to handle a change of a Type 2 dimension in a related fact. I am keeping the details of each of our products in a Type 2 dimension (name, price, size, etc with effective start and end dates) and a fact keeps track of each sale we made to our customers in terms of customer_SID, product_SID, time_SID, dollars_sale.
Now if I a name of our product changes, say instead of product 'A' we now want to call it product 'D', my Type 2 logic would insert a new row into the product dimension. Since our business folks want to see details on the products sold and know of the product only as D not as A, should I insert new rows for all the facts that had an 'A' SID with the 'D' SID or should I put in the natural key into both the dimension and the fact so that our reports could just use natural key join and max effective dating from the dimension to pick the attributes from the 'D' dimension row?
Thanks for your help
I'm in the process of designing a dimensional model for a product warehouse, and a question I have is about how to handle a change of a Type 2 dimension in a related fact. I am keeping the details of each of our products in a Type 2 dimension (name, price, size, etc with effective start and end dates) and a fact keeps track of each sale we made to our customers in terms of customer_SID, product_SID, time_SID, dollars_sale.
Now if I a name of our product changes, say instead of product 'A' we now want to call it product 'D', my Type 2 logic would insert a new row into the product dimension. Since our business folks want to see details on the products sold and know of the product only as D not as A, should I insert new rows for all the facts that had an 'A' SID with the 'D' SID or should I put in the natural key into both the dimension and the fact so that our reports could just use natural key join and max effective dating from the dimension to pick the attributes from the 'D' dimension row?
Thanks for your help
markfranco- Posts : 2
Join date : 2010-09-24
Re: Handling a SCD Type 2 change in a fact
Even though you have a type 2 dimension, it doesn't mean all the attributes need to be treated as type 2. It is common that only some number of attributes require history while other attribues (such as product name) do not. If history is not required for an attribute, changes to that attribute should not trigger a new row (only changes to the type 2 attributes would do so). Type 1 (non historical) attributes would be updated for all versions of that product... basically you update type 1 attributes based on the natural key so all rows for that product are updated.
If the question is, given a type 2 dimension, how do you report the current version of all dimension values for all rows in the fact... you do a self join in the dimension table using the natural key and filtering for the current row (usually using a current row flag maintained in the dimension table).
If the question is, given a type 2 dimension, how do you report the current version of all dimension values for all rows in the fact... you do a self join in the dimension table using the natural key and filtering for the current row (usually using a current row flag maintained in the dimension table).
Re: Handling a SCD Type 2 change in a fact
Thanks for your advice.
I think I will go with a regular type 2 dimension since I do want to keep the product name history in the table, so we won't lose that detail, but I can still use the self join on the current indicator to pick the latest row for reporting.
I was concerned that doing this would slow down the reporting process since I would have to do this extra join and the natural key to join with the fact would be a varchar possibly slowing down the process more, if the database (Oracle) does a table scan despite my indexes.
I think I will go with a regular type 2 dimension since I do want to keep the product name history in the table, so we won't lose that detail, but I can still use the self join on the current indicator to pick the latest row for reporting.
I was concerned that doing this would slow down the reporting process since I would have to do this extra join and the natural key to join with the fact would be a varchar possibly slowing down the process more, if the database (Oracle) does a table scan despite my indexes.
markfranco- Posts : 2
Join date : 2010-09-24
Re: Handling a SCD Type 2 change in a fact
Yes, it does slow things down. There are alternative approaches. One is to keep a type 1 key as an alternate key to the dimension. Basically the type 1 key is set to the initial primary (type 2) key value when a new entry is created. In subsequent updates to that entry, the type 1 key value is propagated from the previous row. You keep both the type 2 and type 1 versions of the key in the fact table and join on the type 1 key and current flag when you want the current dimension row. This eliminates the self join.
Another alternative, particularly when there are only a handful of type 2 attributes, is to maintain two dimension tables a full type 1 and a type 2 covering historical attributes and have both keys in the fact.
Another alternative, particularly when there are only a handful of type 2 attributes, is to maintain two dimension tables a full type 1 and a type 2 covering historical attributes and have both keys in the fact.
Similar topics
» What to do when facts change? Update the fact table with update and deletes? or use SCD type 2?
» Kimbal Fact Table Type - Transactional Fact Type Issue
» SCD Type 2 Change Reasons
» Type 2 Change Handling
» Much ado about overlapping type 2 change
» Kimbal Fact Table Type - Transactional Fact Type Issue
» SCD Type 2 Change Reasons
» Type 2 Change Handling
» Much ado about overlapping type 2 change
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum