Late Arriving Facts
3 posters
Page 1 of 1
Late Arriving Facts
Hello All,
Need some inputs regarding handling of late arriving facts:
I have a demographics related fact table that holds the following measures - Population_Count, Birth_Count, Death Count.
The behaviour of the Population_Count is that it stores the count as of the refresh date. However, the Birth and Death Counts are for a particular period - (monthly).
Now the fact table mentioned above is populated from a Level 1 data mart which is refreshed daily and holds data at a lower granularity than Level 2. The Level 2 data mart is refreshed monthly. As a result, there are mismatches in the Counts.
Example:
For the period 1 JAN - 31st JAN the Population Count = 200 for Level 1 mart refreshed as on 31st JAN and Level 2 mart refreshed on 31st JAN. However, Level 2 will not be refreshed until the 28th FEB, whereas Level 1 will be refreshed daily. As a result there will be an obvious mismatch between the count in Level 1 and LEvel 2. Say in 15th Feb new data is received in LEvel 1 for Births/Deaths in Jan. This should be reflected in the LEvel 2 as well.
What are the best ways of resolving this issue?
Thanks in advance!
Need some inputs regarding handling of late arriving facts:
I have a demographics related fact table that holds the following measures - Population_Count, Birth_Count, Death Count.
The behaviour of the Population_Count is that it stores the count as of the refresh date. However, the Birth and Death Counts are for a particular period - (monthly).
Now the fact table mentioned above is populated from a Level 1 data mart which is refreshed daily and holds data at a lower granularity than Level 2. The Level 2 data mart is refreshed monthly. As a result, there are mismatches in the Counts.
Example:
For the period 1 JAN - 31st JAN the Population Count = 200 for Level 1 mart refreshed as on 31st JAN and Level 2 mart refreshed on 31st JAN. However, Level 2 will not be refreshed until the 28th FEB, whereas Level 1 will be refreshed daily. As a result there will be an obvious mismatch between the count in Level 1 and LEvel 2. Say in 15th Feb new data is received in LEvel 1 for Births/Deaths in Jan. This should be reflected in the LEvel 2 as well.
What are the best ways of resolving this issue?
Thanks in advance!
dwbi_rb- Posts : 17
Join date : 2009-02-19
Re: Late Arriving Facts
You are talking of two kinds of snapshot data in a single fact table.
Transactional(Daily) for BIRTH_COUNT and DEATH_COUNT and
Monthly for POPULATION.
What I believe is the grain of this fact table is not defined appropriately.
If the Level 2 Datamart is scheduled to be loaded monthly basis then I would suggest to create a monthly snapshot fact table and load this table for the level 2 Datamart. This way you can have access to both Transactional and Monthly snapshots at any point of time by the means of these Two Fact tables.
But If you really want to maintain those two grains in the same fact table then you can have a Routine or SQL procedure which can update the BIRTH_COUNT and DEATH_COUNT fields in the Level 2 Data mart every time the LEVEL 1 data mart is refreshed. In this case the LEVEL 2 Data MArt will no longer remain an absolute monthly snapshot of the data.
Thanks
Transactional(Daily) for BIRTH_COUNT and DEATH_COUNT and
Monthly for POPULATION.
What I believe is the grain of this fact table is not defined appropriately.
If the Level 2 Datamart is scheduled to be loaded monthly basis then I would suggest to create a monthly snapshot fact table and load this table for the level 2 Datamart. This way you can have access to both Transactional and Monthly snapshots at any point of time by the means of these Two Fact tables.
But If you really want to maintain those two grains in the same fact table then you can have a Routine or SQL procedure which can update the BIRTH_COUNT and DEATH_COUNT fields in the Level 2 Data mart every time the LEVEL 1 data mart is refreshed. In this case the LEVEL 2 Data MArt will no longer remain an absolute monthly snapshot of the data.
Thanks
Mj1978- Posts : 8
Join date : 2010-03-10
Re: Late Arriving Facts
Now the fact table mentioned above is populated from a Level 1 data mart which is refreshed daily and holds data at a lower granularity than Level 2. The Level 2 data mart is refreshed monthly. As a result, there are mismatches in the Counts.
Yes, of course they don't match. What were you expecting to happen? What is the 'level 2' summary supposed to represent?
It's very common to have month-end snapshots of things in a data warehouse. If that is what the 2nd table is supposed to be, so be it. If it is a snapshot and the user want's current data, why can't they query the first table (the one that is updated daily)?
If the second table is not a monthly snapshot, but is supposed to be a summary of the first table, then update it daily... or eliminate it altogther.
Similar topics
» Large number of late arriving facts
» Preserving prior history for late arriving dimensions
» Late Arriving Dimension Data
» Late arriving dim *change* - how to update the fact?
» Not quite late arriving dimensions but similar case....
» Preserving prior history for late arriving dimensions
» Late Arriving Dimension Data
» Late arriving dim *change* - how to update the fact?
» Not quite late arriving dimensions but similar case....
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|