Storing data that changes periodically - should I use a periodic snapshot fact?
4 posters
Page 1 of 1
Storing data that changes periodically - should I use a periodic snapshot fact?
Hi everyone,
I need to make additions to the data warehouse that I'm working on, and I am hoping for some guidance.
Some context on the data that I'm working with:
In simple terms, the system that will form the source of this data tracks how employees are allocated to various projects. From 200901, Employee X allocates 40% of her time to Project A and 60% to Project B. In 201102, she changes roles and her time is now split 20% to Project B and 80% to Project C. The tact table needs to show employee allocations to each project they are working on for a given time (at a month level), so that these allocations for each employee can be reported on a monthly basis.
This does not seem transactional to me - as such, a transactional fact does not look like the correct approach. Would a periodic snapshot fact be more suited for this?
My confusion comes from the fact that from 200901 to 201101, the first allocation of Employee X is valid. From 201102, the allocation changes, and is now valid until a new allocation is created. Do I somehow fill in the gaps between 200901 and 201101 in the fact able, and have a record for each month during this time period, even though the allocation hasn't changed, or do I only record instances where allocations change and use logic in my queries to work out what allocations were in between these records?
Thanks.
I need to make additions to the data warehouse that I'm working on, and I am hoping for some guidance.
Some context on the data that I'm working with:
In simple terms, the system that will form the source of this data tracks how employees are allocated to various projects. From 200901, Employee X allocates 40% of her time to Project A and 60% to Project B. In 201102, she changes roles and her time is now split 20% to Project B and 80% to Project C. The tact table needs to show employee allocations to each project they are working on for a given time (at a month level), so that these allocations for each employee can be reported on a monthly basis.
This does not seem transactional to me - as such, a transactional fact does not look like the correct approach. Would a periodic snapshot fact be more suited for this?
My confusion comes from the fact that from 200901 to 201101, the first allocation of Employee X is valid. From 201102, the allocation changes, and is now valid until a new allocation is created. Do I somehow fill in the gaps between 200901 and 201101 in the fact able, and have a record for each month during this time period, even though the allocation hasn't changed, or do I only record instances where allocations change and use logic in my queries to work out what allocations were in between these records?
Thanks.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
I've spent some more time researching the problem (on this forum) and have found a few possible solutions:
1. Take a monthly snapshot of the data, and create a new record in the fact table for all Employees that are active (regardless of whether their allocations have changed since the last month). We could therefore have 12 records of exactly the same allocation data for an employee but for 12 different months (if their allocations have stayed the same for that 12 month period).
2. Use a type-2 fact, with two date fields added to the fact table to determine when a particular record was active.
A question comes to mind - aren't you limited in the types of aggregation that can be done using method 2? If you want to sum an Employees allocation for the last three months, this might be difficult as there is only one record, and not three identical records as there would be using method 1.
1. Take a monthly snapshot of the data, and create a new record in the fact table for all Employees that are active (regardless of whether their allocations have changed since the last month). We could therefore have 12 records of exactly the same allocation data for an employee but for 12 different months (if their allocations have stayed the same for that 12 month period).
2. Use a type-2 fact, with two date fields added to the fact table to determine when a particular record was active.
A question comes to mind - aren't you limited in the types of aggregation that can be done using method 2? If you want to sum an Employees allocation for the last three months, this might be difficult as there is only one record, and not three identical records as there would be using method 1.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
Option 2, the accumulating snapshot, does make some calculations more difficult, but it is more compact. Its more difficult to maintain, requiring you to update expiration dates on superceded rows. It is also open to interpretation as users can choose what date they want to look at... not necessarily the month-end.
For this particular application, a periodic snapshot, with monthly rows, is probaly the better choice. Its easy to understand, not open to interpretation (explicity shows the 'month' numbers based on business rules that determine when such numbers are counted) and simple to use.
For this particular application, a periodic snapshot, with monthly rows, is probaly the better choice. Its easy to understand, not open to interpretation (explicity shows the 'month' numbers based on business rules that determine when such numbers are counted) and simple to use.
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
Thanks again ngalemmo for your help.
I was leaning towards the periodic snapshot approach - we will possibly be adding 100k records to the fact table every month (at the most), and will partition the data after a year, so a million or so records is easy enough to work with. It also makes life easier if they somehow want to update a record in the past. Using the type-2 fact approach would mean that we have to insert a record somewhere in the middle and change the dates accordingly, whereas with periodic snapshots we only have to updated a single record.
I was leaning towards the periodic snapshot approach - we will possibly be adding 100k records to the fact table every month (at the most), and will partition the data after a year, so a million or so records is easy enough to work with. It also makes life easier if they somehow want to update a record in the past. Using the type-2 fact approach would mean that we have to insert a record somewhere in the middle and change the dates accordingly, whereas with periodic snapshots we only have to updated a single record.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
I like the periodic snapshot idea as well. Especially when the volumes are lower.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
Is a rolling window useful when loading a periodic snapshot fact?
There are scenarios where adjustments can be made in the future for past entries into the fact (late arriving fact data). If I want these late arriving facts to be processed into the fact table, could I use a rolling window based off a record's updated date to determine whether data has been updated or is newly inserted since data was last processed? Bearing in mind that I will be taking a snapshot monthly - I also want to include updates that were made to previous months.
There are scenarios where adjustments can be made in the future for past entries into the fact (late arriving fact data). If I want these late arriving facts to be processed into the fact table, could I use a rolling window based off a record's updated date to determine whether data has been updated or is newly inserted since data was last processed? Bearing in mind that I will be taking a snapshot monthly - I also want to include updates that were made to previous months.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
For snapshots, I do not generally apply late arriving facts. That's the beauty of the snapshot. It is what it is.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Storing data that changes periodically - should I use a periodic snapshot fact?
Fully agreed. If you update the snapshot facts, it will not be the true picture at particular point of time. But why bother updating? the future snapshots will cover the adjustments anyway.BoxesAndLines wrote:For snapshots, I do not generally apply late arriving facts. That's the beauty of the snapshot. It is what it is.
I also prefer the periodic snapshot in this case, as it is more cube friendly for trend analysis than the accumulating ones.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Periodic snapshot fact tables with sparse data
» Variable period data in a single periodic snapshot fact table
» Transaction Fact or periodic snapshot fact
» transactional fact vs periodic snapshot fact
» Insurance - Policies and Riders Periodic Snapshot Fact Table Design Suggestion
» Variable period data in a single periodic snapshot fact table
» Transaction Fact or periodic snapshot fact
» transactional fact vs periodic snapshot fact
» Insurance - Policies and Riders Periodic Snapshot Fact Table Design Suggestion
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum