Not quite late arriving dimensions but similar case....
4 posters
Page 1 of 1
Not quite late arriving dimensions but similar case....
Say, My source system received a record which gets recorded in the source system may be a day later(day 2) than it's actual effectiveness(day 1) for some reason.
Now, when we store that in the Dimension table, we are going to use the day when it got recorded in the system using say cycle date concept (day 2) as a Type-2 version start date.
This would cause problem when a fact record comes with the date of day 1.
Is it safe to check some date column in the dimension table for it's first effectiveness and use that for type-2 version start date rather than using cycle date(which is day-2) ?
We have never done this and would like to know if we are going to cause some other issue trying to fix this one case...
Any input is appreciated.
Now, when we store that in the Dimension table, we are going to use the day when it got recorded in the system using say cycle date concept (day 2) as a Type-2 version start date.
This would cause problem when a fact record comes with the date of day 1.
Is it safe to check some date column in the dimension table for it's first effectiveness and use that for type-2 version start date rather than using cycle date(which is day-2) ?
We have never done this and would like to know if we are going to cause some other issue trying to fix this one case...
Any input is appreciated.
VTK- Posts : 50
Join date : 2011-07-15
Re: Not quite late arriving dimensions but similar case....
Don't you use a constant date value like 1900-01-01 as start_date for first versions?
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Not quite late arriving dimensions but similar case....
We do that for the initial load (aka history records) but for a daily on going load, would not that become a mis representation of the data ? Is that a good practice ?
Does other companies do this ?
Does other companies do this ?
VTK- Posts : 50
Join date : 2011-07-15
Re: Not quite late arriving dimensions but similar case....
Ask the business if they care that yesterday's report can change slightly due to late arriving facts. Explain that this is an artifact of receiving information as soon as possible. Further explain that if they want to be able to replicate yesterday's report as it actually was as well as how it should have been then it will require additional date logic on queries, additional date logic on ETL, and additional dates in the data model. Be sure to quantify the late arriving information so the business can understand the potential cost of missing information. Most companies will just accept that business + 1 day reports are slightly less accurate than business + 2 days and move on.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Not quite late arriving dimensions but similar case....
As far as the dates go, you can record both business effective period and as loaded (system date) effective periods, both are valid. As far as assigning FKs to facts, you can use either effective period as long as you are consistent.
Re: Not quite late arriving dimensions but similar case....
From the replies what I understand is that it's not good practice to mix the business effectiveness and system effectiveness in a single column.
In my case, I may not have both business effective and expire date on all the entities meaning I may have only say Policy effective date not Policy expired date...which is why the option of using the policy effective date for Version from date if it's before the system effective date...as this coulld be issue only when the record shows up first in the system.
I understand the concept of having to use either of the date combinations to pick dimension record....but I got a question on having both set of dates in the dimensions...
Do we still version it on system effective date only ? If so and if we use business effective dates to attach to fact record then we may find more than one record as something else could have changed in that row.
If not, what is the point of having that date also in the table ?
Am I making sense here with my question ?
Thanks
In my case, I may not have both business effective and expire date on all the entities meaning I may have only say Policy effective date not Policy expired date...which is why the option of using the policy effective date for Version from date if it's before the system effective date...as this coulld be issue only when the record shows up first in the system.
I understand the concept of having to use either of the date combinations to pick dimension record....but I got a question on having both set of dates in the dimensions...
Do we still version it on system effective date only ? If so and if we use business effective dates to attach to fact record then we may find more than one record as something else could have changed in that row.
If not, what is the point of having that date also in the table ?
Am I making sense here with my question ?
Thanks
VTK- Posts : 50
Join date : 2011-07-15
Re: Not quite late arriving dimensions but similar case....
In my case, I may not have both business effective and expire date on all the entities meaning I may have only say Policy effective date not Policy expired date...which is why the option of using the policy effective date for Version from date if it's before the system effective date...as this coulld be issue only when the record shows up first in the system.
Problems with getting good dates from source systems is often the reason a DW relies solely on its internal system timestamps to track this stuff.
Do we still version it on system effective date only ? If so and if we use business effective dates to attach to fact record then we may find more than one record as something else could have changed in that row.
If not, what is the point of having that date also in the table ?
If you get a good business effective timestamp you would use it to set the business expiration timestamp of the previous version. You should not run into situations where there are two possible choices.
Similar topics
» Preserving prior history for late arriving dimensions
» Late Arriving Facts
» alternate approaches for late arriving dimension attributes
» Design Alternatives for Late Arriving/Updated Type 2 Dimension Data
» Large number of late arriving facts
» Late Arriving Facts
» alternate approaches for late arriving dimension attributes
» Design Alternatives for Late Arriving/Updated Type 2 Dimension Data
» Large number of late arriving facts
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum