Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Late Arriving Dimension Data

2 posters

Go down

Late Arriving Dimension Data Empty Late Arriving Dimension Data

Post  GregDC Fri Jan 09, 2015 3:29 pm

I have a Warehouse with a grain of monthly. Data from the HR department is reported on as of the last working day of the month. However we are being told that are counts by status are off. It would seem that on the month after the report is issued that HR provides records with status changes that are backdated to the prior month. So when HR looks at the counts for the July Status report they are correct according to the data in the dimension/fact tables at the time the report was run, but HR did not get all the field reports until several days (or even weeks) into the next month. They know this happens so they change their systems and backdate the change. When the ETL picks up the change status record there is confusion. Do I change the Status with the backdate (hence making the end-of-month report non-duplicatable, but in agreement with the current HR system) or do I make the change and reflect the change in the month the data was reported (hence disagreeing with the HR system)?

Thanks for helping me think through this problem,

GregDC

GregDC

Posts : 17
Join date : 2015-01-09

Back to top Go down

Late Arriving Dimension Data Empty Re: Late Arriving Dimension Data

Post  ngalemmo Fri Jan 09, 2015 3:37 pm

Yes.  

Why not do both? It depends on what the requirement is.  If the business can't decide then providing both covers all possibilities.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Late Arriving Dimension Data Empty Re: Late Arriving Dimension Data

Post  GregDC Sat Jan 10, 2015 9:50 am

Thank you for the suggestion. The overall business requires reports to be repeatable (July reports always show the same July results), the HR group however insists that the "correct" answer for the July report needs to agree with their system in mid-August. So I have been trying to think how to implement you "both/and" suggestion.

I will admit that I am not seeing how to do this without some big changes to the dimension's attributes. Can you enlighten me a little about what you are thinking?

Thanks

GregDC

Posts : 17
Join date : 2015-01-09

Back to top Go down

Late Arriving Dimension Data Empty Re: Late Arriving Dimension Data

Post  ngalemmo Sat Jan 10, 2015 3:06 pm

Implement a type 2 dimension. Facts will be tied to the dimension at the point in time the fact was added. If a user wants the current dimension row, you do a self join on the dimension, based on the natural key, to locate the current version of the row.

If you want to avoid the self join there are two ways to handle it. You can have a single dimension table with a type 2 key and a persistent (i.e. type 1) key and store both keys in the fact. If they want current, you query the dimension using the persistent key and filter on the current flag. The other way is to have two dimension tables, one type 1 and the other type 2, again with both keys in the fact table.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Late Arriving Dimension Data Empty Re: Late Arriving Dimension Data

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum