Type 2 SCD handling using the SSIS SCD Wizard
3 posters
Page 1 of 1
Type 2 SCD handling using the SSIS SCD Wizard
Hi
We have a Kimball style DW which we have implemented using SQL Server 2008 R2 technology.
Our DimAccount dimension has a field called Status which is being tracked as Type 2 attribute. We are using the Wizard generated code and following the techniques and advice given in the Kimball book that is specific to SQL Server 2008 R2.
But, we have now hit a major issue: each time an Account record undergoes a Type 2 change, a new record is created and the old record is expired. But, the new record is assigned a new surogate key (automatically by SQL Server) which is technically OK but it means that unless we use the business keys, we have no way of knowing that the expired record and the new record "point" to the same Account. Here is an example from our DW:
SKey BizKey Account Name Status RowIsCurrent
397620 55173846 Joe Bloggs inactive N
446150 55173846 Joe Bloggs dormant Y
We don't want to use the BizKey to link these two records because what if we start extracting Account data from another system? Their business key will probably be different.
So, should we generate two surrogate keys:
SKey DurableKey BizKey Account Name Status RowIsCurrent
397620 66666666 55173846 Joe Bloggs inactive N
446150 66666666 55173846 Joe Bloggs dormant Y
This way we can the (surrogate) DurableKey to link expired rows to current rows.
Is this a good approach?
We have a Kimball style DW which we have implemented using SQL Server 2008 R2 technology.
Our DimAccount dimension has a field called Status which is being tracked as Type 2 attribute. We are using the Wizard generated code and following the techniques and advice given in the Kimball book that is specific to SQL Server 2008 R2.
But, we have now hit a major issue: each time an Account record undergoes a Type 2 change, a new record is created and the old record is expired. But, the new record is assigned a new surogate key (automatically by SQL Server) which is technically OK but it means that unless we use the business keys, we have no way of knowing that the expired record and the new record "point" to the same Account. Here is an example from our DW:
SKey BizKey Account Name Status RowIsCurrent
397620 55173846 Joe Bloggs inactive N
446150 55173846 Joe Bloggs dormant Y
We don't want to use the BizKey to link these two records because what if we start extracting Account data from another system? Their business key will probably be different.
So, should we generate two surrogate keys:
SKey DurableKey BizKey Account Name Status RowIsCurrent
397620 66666666 55173846 Joe Bloggs inactive N
446150 66666666 55173846 Joe Bloggs dormant Y
This way we can the (surrogate) DurableKey to link expired rows to current rows.
Is this a good approach?
amir2- Posts : 29
Join date : 2010-07-29
Re: Type 2 SCD handling using the SSIS SCD Wizard
You could do that. if you do, store the durable key as an FK in the fact along with the normal type 2 key. This will allow you to locate the current dimension row from a fact without having to do a self-join on the dimension.
Re: Type 2 SCD handling using the SSIS SCD Wizard
Thanks.
I would have thought this must be a common problem, for which there would be a common set of solution(s) (?). I wonder why it isn't covered more widely in variuos writings...
I would have thought this must be a common problem, for which there would be a common set of solution(s) (?). I wonder why it isn't covered more widely in variuos writings...
amir2- Posts : 29
Join date : 2010-07-29
Re: Type 2 SCD handling using the SSIS SCD Wizard
It has come up in a number of threads. There are a number of variations for implementing a type 2 to make life easier for current data queries.
Re: Type 2 SCD handling using the SSIS SCD Wizard
ngalemmo wrote:You could do that. if you do, store the durable key as an FK in the fact along with the normal type 2 key. This will allow you to locate the current dimension row from a fact without having to do a self-join on the dimension.
I concur, but be careful to always constrain by RowIsCurrent = 'Y' when joining the fact table to the dimension using the durable key or you will get too many records!
(If you SCD2 dimension also contained begin/end effective dates, you could also constrain on the row that was current at a given point in time.)
Given how frequently this comes up, I am surprised that Kimball Group hasn't defined a "best practice" in this area. In at least one example, they even suggested it was OK to use the business key to identify rows belonging to the same entity, which seems to go against fundamental Kimball practices!
Last edited by VHF on Wed Oct 12, 2011 10:14 am; edited 4 times in total (Reason for editing : afterthought; typo)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Type 2 SCD handling using the SSIS SCD Wizard
...they even suggested it was OK to use the business key to identify rows belonging to the same entity, which seems to go against fundamental Kimball practices!.
Not really, the basic technique for locating the current row in a type 2 relies on using the natural key to identify rows belonging to the same entity. So it has been a practice for a long time. However, as we know, it is not ideal. A 'type 1' surrogate is more reliable and controllable.
Similar topics
» Microsoft SSIS BIDS SCD Type 2 Development
» rationale behind dimension with Type 0 and missing Type 5
» Type-2 Dates as Date Data Type ?
» Incremental Load Testing -Type 1 & Type 2
» SCD object in SSIS
» rationale behind dimension with Type 0 and missing Type 5
» Type-2 Dates as Date Data Type ?
» Incremental Load Testing -Type 1 & Type 2
» SCD object in SSIS
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum