Handling late changes to Type 2 attributes
3 posters
Page 1 of 1
Handling late changes to Type 2 attributes
Hi. I'm trying to find a good way to handle late changes to Type 2 attributes. Please note that I am not asking about early-arriving facts that could be handled with an inferred dimension member. Rather, I am interested in the situation where the dimension member already exists, and you find out about a change to a Type 2 attribute after the fact.
As an example of what I mean, in my client's data warehouse, each customer is associated with (covered by) a particular salesperson. Over time, this coverage may change, therefore the covering salesperson is a Type 2 attribute in my DimCustomer table.
In a perfect world, these changes in coverage will be entered on the day they happen, and each fact row will be associated with the correct row in the dimension table from the time it is entered, but this is not how it actually happens. The reality is that coverage changes are actually entered some period of time after the date of the change.
In the simplest situation, I find out about a coverage change that happened on a particular date. To reflect this in my DimCustomer table, I need to insert a new row with the correct start date reflecting the new coverage, modify the end date of the previous row, and THEN realign all the affected fact rows so that they have the correct surrogate key based on the new row start and end dates.
But in a more complex situation, I need to consider that changes to other Type 2 attributes may have occurred after the date of the coverage change. If this is the case, then I also need to modify the subsequent rows to reflect the correct coverage value.
My client handles this now by reloading the past 90 days of data every day. But sometimes there are changes outside of this window, and this current solution won't work for the long-term. If we had a good way to handle this, we wouldn't even need to run 90 days of data every day.
Surely other people have run into this. How do other people handle this? What would be the best practices way to handle this? I would love any thoughts or suggestions that anyone may have.
Thanks so much!
Jill Kaszynski
As an example of what I mean, in my client's data warehouse, each customer is associated with (covered by) a particular salesperson. Over time, this coverage may change, therefore the covering salesperson is a Type 2 attribute in my DimCustomer table.
In a perfect world, these changes in coverage will be entered on the day they happen, and each fact row will be associated with the correct row in the dimension table from the time it is entered, but this is not how it actually happens. The reality is that coverage changes are actually entered some period of time after the date of the change.
In the simplest situation, I find out about a coverage change that happened on a particular date. To reflect this in my DimCustomer table, I need to insert a new row with the correct start date reflecting the new coverage, modify the end date of the previous row, and THEN realign all the affected fact rows so that they have the correct surrogate key based on the new row start and end dates.
But in a more complex situation, I need to consider that changes to other Type 2 attributes may have occurred after the date of the coverage change. If this is the case, then I also need to modify the subsequent rows to reflect the correct coverage value.
My client handles this now by reloading the past 90 days of data every day. But sometimes there are changes outside of this window, and this current solution won't work for the long-term. If we had a good way to handle this, we wouldn't even need to run 90 days of data every day.
Surely other people have run into this. How do other people handle this? What would be the best practices way to handle this? I would love any thoughts or suggestions that anyone may have.
Thanks so much!
Jill Kaszynski
jkaszynski- Posts : 3
Join date : 2010-01-26
Location : Boston, MA
Re: Handling late changes to Type 2 attributes
I'm just thinking off the top of my head... this may not be a good approach, but...
Since this is a regularly occuring problem, have you considered removing the sales person from the customer dimension? A factless fact table containing customer key, salesperson key and effective dates would certainly handle the issue without needing to reload anything. It isn't pretty, and it would impact performance a bit when reporting sales person, but it would solve the problem.
Since this is a regularly occuring problem, have you considered removing the sales person from the customer dimension? A factless fact table containing customer key, salesperson key and effective dates would certainly handle the issue without needing to reload anything. It isn't pretty, and it would impact performance a bit when reporting sales person, but it would solve the problem.
Re: Handling late changes to Type 2 attributes
It is apparent that the business does not care about history. That is if a salesperson arrives 60 days late, the previous reporting for 60 days was incorrect. But when you go back and scrub history, you lose the fact that for the past 60 days the reports have been wrong. I've handled situations like this in two different ways. The first way (on a Teradata platform) I captured business effective and end dates in addition to EDW effective and end dates. This way I can go back and restate history and still have insight to the fact that the history for the past two months was just altered yesterday. The second method, on a dimensional model, was to just accept that the day that I process an event might not actually be the real business date for that event. For example, if I process a customer disconnect on today that was effective 2 weeks ago, I count the disconnect as of today. The backdated disconnects average out over time which ends up having nominal effect on the metrics.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» alternate approaches for late arriving dimension attributes
» Design Alternatives for Late Arriving/Updated Type 2 Dimension Data
» Aggregating type 2 attributes for all facts
» Incremental Load Testing -Type 1 & Type 2
» Tracking history of multiple SCD type 2 attributes
» Design Alternatives for Late Arriving/Updated Type 2 Dimension Data
» Aggregating type 2 attributes for all facts
» Incremental Load Testing -Type 1 & Type 2
» Tracking history of multiple SCD type 2 attributes
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum