Large Fact Table and Maintaining Periodic Snapshot: Practice
3 posters
Page 1 of 1
Large Fact Table and Maintaining Periodic Snapshot: Practice
I would like some opinions with my approach here. I am trying to replace an overkill lift and load ETL process that basically replicated the entire universe of dataset every period instead of doing just Delta Load. In this delta approach, I have a stage table with new fact rows and I merge it to the target base table to load delta. I also a current flag to the new and old records. For reporting purposes, the current picture is all we need and I use the flag. For weekly and monthly view if needed, I have a summary like table that gets loaded periodically to point me to the weekly or monthly view.
FactTable (daily currrent view is a just selection of all currrent records)
FactKey inv_id locid Sumval IS_CURRENT Dateid
1000 990 100 50 N 20090101
1001 991 101 25 N 20090101
1002 992 102 75 Y 20090101
----------- ---------------- ----------------------------
1003 990 102 55 Y 20090102 -------> new changes inserted and flag set to mark current
1004 991 100 30 Y 20090102
For a monhthly view, I have a MonthlyInvoiceSnapshot table that gets loaded from my fact table above at the end of the month picking all current records for say invoice fact
Monthid FactKey inv_id
M1 1000 990
M1 1001 991
M1 1002 992
-----------------------
M2 1003 990
M2 1004 991
M2 1002 992
Similarly I could have weekly, quarterly and yearly look. Lets just say all Dimension tables are TYPE1 and therefore no need to keep history. Is there something I am overlooking? I would like to hear some opinions
Thanks
Buz
FactTable (daily currrent view is a just selection of all currrent records)
FactKey inv_id locid Sumval IS_CURRENT Dateid
1000 990 100 50 N 20090101
1001 991 101 25 N 20090101
1002 992 102 75 Y 20090101
----------- ---------------- ----------------------------
1003 990 102 55 Y 20090102 -------> new changes inserted and flag set to mark current
1004 991 100 30 Y 20090102
For a monhthly view, I have a MonthlyInvoiceSnapshot table that gets loaded from my fact table above at the end of the month picking all current records for say invoice fact
Monthid FactKey inv_id
M1 1000 990
M1 1001 991
M1 1002 992
-----------------------
M2 1003 990
M2 1004 991
M2 1002 992
Similarly I could have weekly, quarterly and yearly look. Lets just say all Dimension tables are TYPE1 and therefore no need to keep history. Is there something I am overlooking? I would like to hear some opinions
Thanks
Buz
buzzer75- Posts : 2
Join date : 2009-06-18
Re: Large Fact Table and Maintaining Periodic Snapshot: Practice
The first part is ok, but I don't understand the purpose of the monthly snapshot. It does not appear to be a summary of any kind because you are including the invoice number in the table. How is it different than just querying the base fact table?
Why have separate tables?
Most DBMS's have optimizers built into their query plan generation engines that will recognize true aggregate queries versus a detailed fact table and will redirect to an aggregate view that will provide the results.
Often to use this functionality you will create an Indexed View (or your DBMS flavor equivalent) that contains the appropriate aggregate query of the detailed fact table.
The point of this is it minimizes the ETL effort as it would require only designing and implementing ETL load for the base detailed fact table.
A great further reference for this and other aggregation stratgies can be found in "Mastering Data Warehouse Aggregates - Solutions for Star Schema Performance" by Christopher Adamson [Wiley 2006 ISBN 0-471-77709-9].
Additionally, this all assumes that you are not using any OLAP or other analysis engines that already provide this behavior.
Often to use this functionality you will create an Indexed View (or your DBMS flavor equivalent) that contains the appropriate aggregate query of the detailed fact table.
The point of this is it minimizes the ETL effort as it would require only designing and implementing ETL load for the base detailed fact table.
A great further reference for this and other aggregation stratgies can be found in "Mastering Data Warehouse Aggregates - Solutions for Star Schema Performance" by Christopher Adamson [Wiley 2006 ISBN 0-471-77709-9].
Additionally, this all assumes that you are not using any OLAP or other analysis engines that already provide this behavior.
JoeSalvatore- Posts : 4
Join date : 2009-06-19
Re: Large Fact Table and Maintaining Periodic Snapshot: Practice
Thank you so much for the responses. To answer the question on the snapshot, I may have used a bad example. I think of this snapshot as a hook table that will point me to the fact key in the main fact table as of the period in question. This will avoid date logic complications and also as you pointed out, I can leverage join indexes and have some good performance for retrieving historic data.
I do plan on type2 with some dimension tables where profile changes are required to capture. My main goal is to get rid of massive lift and load (version) etl that basically is a overkill in my case. With this delta process, I can still answer those rare AS OF questions. For majority of my need, I believe the current picture will suffice.
I do plan on type2 with some dimension tables where profile changes are required to capture. My main goal is to get rid of massive lift and load (version) etl that basically is a overkill in my case. With this delta process, I can still answer those rare AS OF questions. For majority of my need, I believe the current picture will suffice.
buzzer75- Posts : 2
Join date : 2009-06-18
Similar topics
» Aggregates in Periodic Snapshot Fact Table
» Is this a Correct Periodic Snapshot Fact Table?
» Muliple currencies for periodic snapshot fact table
» Variable period data in a single periodic snapshot fact table
» Updating Periodic Snapshot Fact Tables
» Is this a Correct Periodic Snapshot Fact Table?
» Muliple currencies for periodic snapshot fact table
» Variable period data in a single periodic snapshot fact table
» Updating Periodic Snapshot Fact Tables
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum