DW refresh strategy
3 posters
Page 1 of 1
DW refresh strategy
Hi everyone,
I have a few questions about loading dimension and fact tables and I would appreciate your help.
I have 4 fact and 6 dimension tables that am updating/loading every night. We also load/update these fact tables two times during business hours. How do you guys typically load your DW tables? I look at the last modified date in the staging table and do an incremental load. For example, I have a fact table called Orders and it has Last Modified Date as a column. Each time the DW is refreshed, it will look at the last modified date field and pull only those orders that were modified since the last update. Do you guys have any better ideas? Should I rebuild the fact tables each time the DW is refreshed?
Thanks for your advise with this.
I have a few questions about loading dimension and fact tables and I would appreciate your help.
I have 4 fact and 6 dimension tables that am updating/loading every night. We also load/update these fact tables two times during business hours. How do you guys typically load your DW tables? I look at the last modified date in the staging table and do an incremental load. For example, I have a fact table called Orders and it has Last Modified Date as a column. Each time the DW is refreshed, it will look at the last modified date field and pull only those orders that were modified since the last update. Do you guys have any better ideas? Should I rebuild the fact tables each time the DW is refreshed?
Thanks for your advise with this.
SnowShine429- Posts : 36
Join date : 2013-02-16
Re: DW refresh strategy
What type of fact tables do you have? Generally, unless you are loading an accumulating snapshot fact, all transactions should be treated as inserts.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: DW refresh strategy
I have the same typical tables Orders etc. Can you please explain what you mean by "all transactions should be treated as inserts".
SnowShine429- Posts : 36
Join date : 2013-02-16
Re: DW refresh strategy
Orders is not a type of fact table. There are three basic types of fact tables, transaction, snapshot, and accumulating snapshot. You can browse the Kimball site and find the details about these three types. An Order fact is typically a transaction fact (if you wait until the order is completed). Each order line (a transaction) is inserted and you should never need to touch that row again. That's the beauty of a transaction fact table, insert and your done. The metric should never change. If the customer, product, or channel change, it's an update to a dimension table. You should never need to reload a transaction fact unless something goes drastically wrong.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: DW refresh strategy
Thanks for your help. So you are suggesting I do just inserts(incremental). What about updates? For example, let's say there was a transaction entered yesterday and later it was modified to edit the quantity. Wouldn't this require an update to the fact table?
SnowShine429- Posts : 36
Join date : 2013-02-16
Re: DW refresh strategy
That is why you want to wait until the order is completed (i.e. no longer updateable). You want to avoid updates to the facts.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: DW refresh strategy
In recent years I've been taking the opposite approach - by default I fully refresh all Fact tables unless I run out of refresh window or business requirements get complex. This seems to work well for 95% of Facts.
This has advantages that it is much simpler to design and build and therefore more reliable and easier to test. It also becomes trivial to add new dimensions or measures with a complete history.
With modern ETL tools this approach can be suprisingly scalable e.g. fully refresh a 160m row Fact each night in around 1.5 hours.
This has advantages that it is much simpler to design and build and therefore more reliable and easier to test. It also becomes trivial to add new dimensions or measures with a complete history.
With modern ETL tools this approach can be suprisingly scalable e.g. fully refresh a 160m row Fact each night in around 1.5 hours.
Last edited by Mike Honey on Tue Feb 26, 2013 10:30 pm; edited 1 time in total (Reason for editing : deleted "technically")
Re: DW refresh strategy
Thanks again, based on my experience, waiting for transactions to "fully complete and not editable later" is an ideal situation. Just wondering what you recommend if one has to update the fact table due to the way business works.
SnowShine429- Posts : 36
Join date : 2013-02-16
Re: DW refresh strategy
Depends on what you are trying to accomplish. Pipeline order analysis is best accomplished using an accumulating snapshot fact (where you do update the fact). Otherwise, most order analysis is done on final orders using transaction facts.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» BI Strategy Vs Big Data Strategy Vs Datawarehousing solution
» Many to many relationship question
» Date Dimension refresh for all data marts nightly ??
» Staging Strategy and CDC
» My Index Strategy
» Many to many relationship question
» Date Dimension refresh for all data marts nightly ??
» Staging Strategy and CDC
» My Index Strategy
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum