Periodic snapshot: insert when not changed?
2 posters
Page 1 of 1
Periodic snapshot: insert when not changed?
Hi there,
Just a little Kimball theory question. This is about my earlier described student dwh, but to explain it easier I'll use the inventory chapter of the DWH Toolkit.
If we would use a periodic snapshot fact table which will store the amount of inventory each week per product per warehouse, then there would be 10000 products and 5000 warehouses. If I understand the book well, every week there would be inserted 10000x5000 rows with inventory facts, together with a reference date.
Another option would be about the same, but we only insert all records that have been changed. So there would be an initial load with the same 10000 x 5000, but after that every week only all products would be added of which something have been changed in inventory records. Together with a date_valid_from and date_valid_to (which could be n/a).
I think in the second option, querying would be a bit more complicated but not too complicated. If necessarry we could always build a view in which we can simulate the first option.
So would it be better to always insert all products/warehouses or only insert the updated ones?
Thank you,
Jochem
Just a little Kimball theory question. This is about my earlier described student dwh, but to explain it easier I'll use the inventory chapter of the DWH Toolkit.
If we would use a periodic snapshot fact table which will store the amount of inventory each week per product per warehouse, then there would be 10000 products and 5000 warehouses. If I understand the book well, every week there would be inserted 10000x5000 rows with inventory facts, together with a reference date.
Another option would be about the same, but we only insert all records that have been changed. So there would be an initial load with the same 10000 x 5000, but after that every week only all products would be added of which something have been changed in inventory records. Together with a date_valid_from and date_valid_to (which could be n/a).
I think in the second option, querying would be a bit more complicated but not too complicated. If necessarry we could always build a view in which we can simulate the first option.
So would it be better to always insert all products/warehouses or only insert the updated ones?
Thank you,
Jochem
jochem_van_grondelle- Posts : 11
Join date : 2009-09-22
Re: Periodic snapshot: insert when not changed?
For inventory it would be a bad thing, but for something like school/class enrollment, maintaining time based incrementals is not so bad.
The problem with time based incremental snapshots is, to get a total count you need to sum from the last full snapshot up to the point in time you are interested in. For something like inventory, a lot of time may pass since the last full snapshot, making queries run long as well as requiring potentially complex selection logic to get the right count. But, in your case, I would assume you would have semester as one of the dimensions to the snapshot, so it is fairly easy to control when the first (and probably only) full snapshot occurs and the scope of the incremental changes.
Its going to depend on the planned query environment and how much you can hide from the end user. Full periodic snapshots are dead simple to use, whereas incremental snapshots require a subquery to construct the particular point in time desired. If the BI tool you are using has a robust meta layer that makes the difference transparent, go ahead.
The problem with time based incremental snapshots is, to get a total count you need to sum from the last full snapshot up to the point in time you are interested in. For something like inventory, a lot of time may pass since the last full snapshot, making queries run long as well as requiring potentially complex selection logic to get the right count. But, in your case, I would assume you would have semester as one of the dimensions to the snapshot, so it is fairly easy to control when the first (and probably only) full snapshot occurs and the scope of the incremental changes.
Its going to depend on the planned query environment and how much you can hide from the end user. Full periodic snapshots are dead simple to use, whereas incremental snapshots require a subquery to construct the particular point in time desired. If the BI tool you are using has a robust meta layer that makes the difference transparent, go ahead.
Similar topics
» Can you combine an accumulating and periodic snapshot into a single accumulating periodic snapshot table?!
» Balance Sheet Fact
» Is periodic snapshot the right choice?
» How to Track SCD Type 2 for Accumlating or Periodic Snapshot
» Periodic snapshot or Transaction type of FI
» Balance Sheet Fact
» Is periodic snapshot the right choice?
» How to Track SCD Type 2 for Accumlating or Periodic Snapshot
» Periodic snapshot or Transaction type of FI
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum