Tracking history of multiple SCD type 2 attributes
4 posters
Page 1 of 1
Tracking history of multiple SCD type 2 attributes
Hi,
I am new to dimensional modelling and this could be a silly question.
In a dimension table, for a type-2 attribute the entire record is expired using an end_date or a flag column. How to track history of multiple type 2 attributes. For example, let say I have an Account Dimension, where "Account Number" is natural key and "Account Status" and "External Credit Rating"
are type-2 attributes.
In case accounts credit rating changes a new row will be entered and old row will be expired even though account status has not changed and vice versa.
Account Number | Account Status | External Credit rating | Effective From | Effective To
101 | Active | AAA | 01-JAN-2013 | 28-FEB-2013
101 | Dormant | AAA | 01-MAR-2013 | 30-JUN-2013
101 | Inoperative | AAA | 01-JUL-2014 | 9999-12-31
If I have to answer a questions like
x. How many accounts with rating AAA were present in the period 01-JAN-2014 to 31-MAR-2014?
x. How many accounts were rated AAA in period 01-MAR-2013 and 31-MAY-2013 and what was there previous ratings?
This would require some complex SQLs at reporting end. Can we add Effective From and Effective To columns for each attribute we want to track history. Can this be designed differently to answer above questions effectively.
Thanks
Aditya
I am new to dimensional modelling and this could be a silly question.
In a dimension table, for a type-2 attribute the entire record is expired using an end_date or a flag column. How to track history of multiple type 2 attributes. For example, let say I have an Account Dimension, where "Account Number" is natural key and "Account Status" and "External Credit Rating"
are type-2 attributes.
In case accounts credit rating changes a new row will be entered and old row will be expired even though account status has not changed and vice versa.
Account Number | Account Status | External Credit rating | Effective From | Effective To
101 | Active | AAA | 01-JAN-2013 | 28-FEB-2013
101 | Dormant | AAA | 01-MAR-2013 | 30-JUN-2013
101 | Inoperative | AAA | 01-JUL-2014 | 9999-12-31
If I have to answer a questions like
x. How many accounts with rating AAA were present in the period 01-JAN-2014 to 31-MAR-2014?
x. How many accounts were rated AAA in period 01-MAR-2013 and 31-MAY-2013 and what was there previous ratings?
This would require some complex SQLs at reporting end. Can we add Effective From and Effective To columns for each attribute we want to track history. Can this be designed differently to answer above questions effectively.
Thanks
Aditya
trulyaditya- Posts : 2
Join date : 2014-03-14
Re: Tracking history of multiple SCD type 2 attributes
You can have current row indicator to always get the latest data. Also account created and expiry date may help you query you effectively.
manickam- Posts : 27
Join date : 2013-04-26
Re: Tracking history of multiple SCD type 2 attributes
Accessing current row can be done by flag but what if I want to track history of one of the type 2 attributes. I have two attributes for which I want to track history but only one pair of Effective To and Effective From columns. So even though one of the column value has changed and others didn't I am still expiring the record. So in above given example if I access the period 01-JAN-2013 to 28-FEB-2013 (1st row) I see that account status was ACTIVE and rating was AAA till 28-FEB-2013 which is right for account status but not for rating.
trulyaditya- Posts : 2
Join date : 2014-03-14
Re: Tracking history of multiple SCD type 2 attributes
As you have probably found out, this is not an easy set of issues to resolve: I think you will always struggle to track two (or more) independently changing attributes in the same table. Can you provide any further information on exactly what your reporting requirements are? For example, if the same account, between 01-JAN-2014 to 31-MAR-2014, changes credit rating from AAA to BBB to AAA to ABB is that a count of 1 or 2 in your report of AAA-rated accounts? If you changed your first requirement slightly to:
Show the count of accounts by rating in the period 01-JAN-2014 to 31-MAR-2014
then would this one account appear in 4 different buckets in the report (and so be counted 4 times)? If not, and it should only be counted once, then what are the rules for which credit rating value it should be assigned to?
Cross-row comparison is never easy: your "compare current and previous value" requirement sounds like something SCD Type 3 might be used for.
Moving your 2 SCD2 attributes out of your main Dim and into separate SCD4 mini-dimensions may be a way forward. I was also thinking about using factless fact tables to track these changes - but if you include both of your changing attributes in dimensions attached to the fact then you still get the issue of double counting rating queries when status changes and vic. versa.
Hopefully someone else on this forum has some more thoughts on this.
Show the count of accounts by rating in the period 01-JAN-2014 to 31-MAR-2014
then would this one account appear in 4 different buckets in the report (and so be counted 4 times)? If not, and it should only be counted once, then what are the rules for which credit rating value it should be assigned to?
Cross-row comparison is never easy: your "compare current and previous value" requirement sounds like something SCD Type 3 might be used for.
Moving your 2 SCD2 attributes out of your main Dim and into separate SCD4 mini-dimensions may be a way forward. I was also thinking about using factless fact tables to track these changes - but if you include both of your changing attributes in dimensions attached to the fact then you still get the issue of double counting rating queries when status changes and vic. versa.
Hopefully someone else on this forum has some more thoughts on this.
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Tracking history of multiple SCD type 2 attributes
How about
SELECT COUNT(DISTINCT ACCOUNT_NUMBER)
WHERE ACCOUNT_STATUS = 'AAA' AND EFFECTIVE_FROM <= 3/31/2014 AND EFFECTIVE_TO => 1/1/2014
it becomes a matter of business definition. The above would count any account that held that status at any time during the period. If what they really want is accounts with that status at the start or end of the period, it's even simpler:
SELECT COUNT(*)
WHERE ACCOUNT_STATUS = 'AAA' AND 1/1/2014 BETWEEN EFFECTIVE_FROM AND EFFECTIVE_TO
or
SELECT COUNT(*)
WHERE ACCOUNT_STATUS = 'AAA' AND 3/31/2014 BETWEEN EFFECTIVE_FROM AND EFFECTIVE_TO
SELECT COUNT(DISTINCT ACCOUNT_NUMBER)
WHERE ACCOUNT_STATUS = 'AAA' AND EFFECTIVE_FROM <= 3/31/2014 AND EFFECTIVE_TO => 1/1/2014
it becomes a matter of business definition. The above would count any account that held that status at any time during the period. If what they really want is accounts with that status at the start or end of the period, it's even simpler:
SELECT COUNT(*)
WHERE ACCOUNT_STATUS = 'AAA' AND 1/1/2014 BETWEEN EFFECTIVE_FROM AND EFFECTIVE_TO
or
SELECT COUNT(*)
WHERE ACCOUNT_STATUS = 'AAA' AND 3/31/2014 BETWEEN EFFECTIVE_FROM AND EFFECTIVE_TO
Similar topics
» Relationship between a history tracking table and a non-history tracking table?
» History tracking in a CRM data warehouse
» Tracking history in huge hierarchies
» Tracking fact table history
» Dimension Attributes and Fact attributes storing same data in multiple data marts??
» History tracking in a CRM data warehouse
» Tracking history in huge hierarchies
» Tracking fact table history
» Dimension Attributes and Fact attributes storing same data in multiple data marts??
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum