Inventory Fact Table: How to load it?
5 posters
Page 1 of 1
Inventory Fact Table: How to load it?
Hi all, I need to load a huge inventory fact snapshot table. My source table is not tracking changes, but I need to be carefully loading only changed row to fact table because this is a huge fact table.
What is the best method to track changes between source and destination table in SSIS?
We have a source table with (for instance) ProductID, Units. We have a fact table with Date, SK_Product, Units. When we process a row from source, we need to compare Units on source to Units on the "last date" record on fact table, if there are any changes, then we need to add the row to the fact table.
Thx.
What is the best method to track changes between source and destination table in SSIS?
We have a source table with (for instance) ProductID, Units. We have a fact table with Date, SK_Product, Units. When we process a row from source, we need to compare Units on source to Units on the "last date" record on fact table, if there are any changes, then we need to add the row to the fact table.
Thx.
ZZ- Posts : 2
Join date : 2011-06-27
Re: Inventory Fact Table: How to load it?
It's a snapshot table. You load everything every day.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Inventory Fact Table: How to load it?
...unless you store rows with effective time periods. If that is the case, the most efficient way is a sequential merge pass through the data. Use two feeds, one from the source the other from the fact, sort both in the same sequence and compare the two.
Re: Inventory Fact Table: How to load it?
I do something similar with my product product pricing fact table. Pricing changes infrequently, so maintaining a true snaphot fact table would entail storing the same information over and over again. I use what I call a "historical fact table" (disclaimer: I made this patern up--haven't found anything quite like it Kimball books.) My table looks something like this:
ProductKey
BeginEffectiveDate
EndEffectiveDate
UnitPrice
Note that the primary key is a compound key consisting of ProductKey + BeginEffectiveDate. Some might question whether that is a good idea and although I would never use that type of primary key in an SCD2 dimension I have found it works well in my admitedly modest historical fact table. The key design might need some additional thought in a really large fact table. I set EndEffectiveDate to '2999-12-31' for all current records.
I don't do the ETL merge in SSIS as there is currently no functionality equivilient to the SQL 2008 MERGE command. I use the following steps (running on SQL 2008):
Step 1: Use SSIS to bring source data into a SQL staging table.
Step 2: Use SQL MERGE command to expire (UPDATE) any current records where the inventory balance has changed. When expiring a record I set it's EndEffectiveDate to one day before the ETL "loading as-of" date.
Step 3: Use SQL MERGE command to insert any unmatched records. This will create new records. I use the ETL "loading as-of" date as teh BeginEffectiveDate.
Although the MERGE command can do INSERT, UPDATE, and DELETE in a single operation, it can't both UPDATE and INSERT for a given given record at the same time, which is what we need when a price changes, so I run two separate MERGE commands.
(I don't currently expire prices for discontinued items, but I could handle this with the WHEN NOT MATCHED BY SOURCE clause to expire the final price for that item. In your case you might want to expire the previous record and create a final zero balance record if an item is no longer reported in your inventory source data.)
So after a price change for product 123 there would be two records for that product in the history table:
As long as the BeginEffectiveDate of the new record is exactly one day later than the EndEffectiveDate of the previous record, I can query prices on any given date by using WHERE @PricingDate BETWEEN BeginEffectiveDate AND EndEffectiveDate. (I think this use of BETWEEN is not ANSI92 compliant but is supported in T-SQL.)
In fact, I go one step further and provide a view which joins the date dimension with my historical pricing fact table and exposes it as if it were a traditional snapshot with every price populated for every product for every date that exists in the date dimension. Hopefully the users will filter by date and/or product when using this view!
If the marjority of values change every day then there is not much value to this "historical fact table" pattern. But if many of the values stay the same from day to day, it will store substantially less records than a standard daily snapshot fact table.
ProductKey
BeginEffectiveDate
EndEffectiveDate
UnitPrice
Note that the primary key is a compound key consisting of ProductKey + BeginEffectiveDate. Some might question whether that is a good idea and although I would never use that type of primary key in an SCD2 dimension I have found it works well in my admitedly modest historical fact table. The key design might need some additional thought in a really large fact table. I set EndEffectiveDate to '2999-12-31' for all current records.
I don't do the ETL merge in SSIS as there is currently no functionality equivilient to the SQL 2008 MERGE command. I use the following steps (running on SQL 2008):
Step 1: Use SSIS to bring source data into a SQL staging table.
Step 2: Use SQL MERGE command to expire (UPDATE) any current records where the inventory balance has changed. When expiring a record I set it's EndEffectiveDate to one day before the ETL "loading as-of" date.
Step 3: Use SQL MERGE command to insert any unmatched records. This will create new records. I use the ETL "loading as-of" date as teh BeginEffectiveDate.
Although the MERGE command can do INSERT, UPDATE, and DELETE in a single operation, it can't both UPDATE and INSERT for a given given record at the same time, which is what we need when a price changes, so I run two separate MERGE commands.
(I don't currently expire prices for discontinued items, but I could handle this with the WHEN NOT MATCHED BY SOURCE clause to expire the final price for that item. In your case you might want to expire the previous record and create a final zero balance record if an item is no longer reported in your inventory source data.)
So after a price change for product 123 there would be two records for that product in the history table:
ProductKey | BeginEffectiveDate | EndEffectiveDate | UnitPrice |
123 | 2008-01-01 | 2011-06-27 | 42.50 |
123 | 2011-06-28 | 2999-12-31 | 47.25 |
As long as the BeginEffectiveDate of the new record is exactly one day later than the EndEffectiveDate of the previous record, I can query prices on any given date by using WHERE @PricingDate BETWEEN BeginEffectiveDate AND EndEffectiveDate. (I think this use of BETWEEN is not ANSI92 compliant but is supported in T-SQL.)
In fact, I go one step further and provide a view which joins the date dimension with my historical pricing fact table and exposes it as if it were a traditional snapshot with every price populated for every product for every date that exists in the date dimension. Hopefully the users will filter by date and/or product when using this view!
If the marjority of values change every day then there is not much value to this "historical fact table" pattern. But if many of the values stay the same from day to day, it will store substantially less records than a standard daily snapshot fact table.
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Inventory Fact Table: How to load it?
I think Kimball's transaction dimension is the closest to what VHF described. I guess the purpose of having this kind of effective dated dimension is to store some fast changing SCD attributes, rather than FKs, in a fact-like dimension table that will be referenced by other true fact tables. Its major benefit is the space saving as opposed to periodic snapshot fact table.
Calling a transaction dimension a fact table may be misleading as none of attributes in a transaction dimension is additive or semi-additive, and you would have a proper single field PK, say UnitPriceChangeKey, without concern of trying to make a fact table to be FKed by other fact tables. Similarly you could track any one or more fast changing attributes in a single or separate transaction dimensions if changes only happen to small set of dimension records. However joining to date dimension, instead of having a physical periodic snapshot fact table, may have significant performance cost.
Calling a transaction dimension a fact table may be misleading as none of attributes in a transaction dimension is additive or semi-additive, and you would have a proper single field PK, say UnitPriceChangeKey, without concern of trying to make a fact table to be FKed by other fact tables. Similarly you could track any one or more fast changing attributes in a single or separate transaction dimensions if changes only happen to small set of dimension records. However joining to date dimension, instead of having a physical periodic snapshot fact table, may have significant performance cost.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Inventory Fact Table: How to load it?
OK, Itīs very clear to me now. Effective time periods + merge can work fine for Data Warehouse, we will obtain some kind of "sparse snapshot table", but how do I deal in SSAS with? Supose We have daily workdays inventory closes and we have a product with no changes from January, something like
Units measure is defined with LastNonEmpty aggregate method (I think is the only option) , for June there is not any record, so Unit measure (LastNonEmpty child in time hierarchy) is zero instead of 2. ŋ? Thatīs why you (VHF) create a "non-sparse/traditional snapshot table" view from the "sparse snapshot table" that we already have?
So I have to choose between lose disk space or time to execute the view? Ummmhhh !! Let me think for a while...
ProductKey | BeginEffectiveDate | EndEffectiveDate | Units |
123 | 1/1/2011 | 17/1/2011 | 1 |
123 | 18/1/2011 | 1/1/2999 | 2 |
Units measure is defined with LastNonEmpty aggregate method (I think is the only option) , for June there is not any record, so Unit measure (LastNonEmpty child in time hierarchy) is zero instead of 2. ŋ? Thatīs why you (VHF) create a "non-sparse/traditional snapshot table" view from the "sparse snapshot table" that we already have?
So I have to choose between lose disk space or time to execute the view? Ummmhhh !! Let me think for a while...
ZZ- Posts : 2
Join date : 2011-06-27
Re: Inventory Fact Table: How to load it?
For trend analysis, a date key in the fact table is necessary. With SSAS, you could use VHF's view to produce a virtual fact table and let the cube work out all the necessary measures. However down to the physical level, a physical periodic snapshot fact table is a preferred model based on Kimball's methodology as the model would deliver reasonable performance with SQL even without using the cubes.
Anyway, I would suggest not to get carried away by saving the disk space and avoiding a proper periodic fact table which is supposed to duplicate many factual rows in nature and hence provides a powerful and fast trend analysis mechanism. If the fact table is so lean, why bother worrying about its going too deep, the deeper, the more valuable.
Anyway, I would suggest not to get carried away by saving the disk space and avoiding a proper periodic fact table which is supposed to duplicate many factual rows in nature and hence provides a powerful and fast trend analysis mechanism. If the fact table is so lean, why bother worrying about its going too deep, the deeper, the more valuable.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Inventory Fact Table: How to load it?
hang wrote:I think Kimball's transaction dimension is the closest to what VHF described.
I'll have to read up on that... it sounds like a fit as prices, inventory, etc. are really dimension attributes... just not nescessarly slowly changing!
In the Dimensional Modeling in Depth class Ralph presents a pattern he calls "instantaneous balances", which is a fact table with begin and end effective dates (using a full date/time timestamp in his example.) Using that pattern you could answer questions like "What were all the customer account balances as of 2:00pm last Friday?"
Ralph warns that using this pattern is akin to getting hooked on drugs... when you use this pattern, you will have to use a date qualifier in your queries for the rest of your life!
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Inventory Fact Table: How to load it?
I am working on an employment BI system, and I found the most challenging issue, and perhaps also in this forum, is to identify the facts and dimensions because of the sheer size of the data. If someone has correctly identified the facts and dimensions for me, dimensional modeling and the rest of the down-stream activities are really very straight-forward given that Kimball has laid out almost everything for us in his books.VHF wrote:Ralph warns that using this pattern is akin to getting hooked on drugs... when you use this pattern, you will have to use a date qualifier in your queries for the rest of your life!
You are right in that the effective date design pattern is a very tempting choice for almost everything, dimensions and facts. It sounds like a silver bullet to all the difficult modeling scenarios as long as historical correlations need to be kept. However I think dimensional modeling is far more artistic than that, every technique in Kimball's books has been targeted at some common problems in real practice. If sometimes we can't find a proper solution in Kimball's book, I guess most likely, we have over looked or have not quite digested the relevant part in his books.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» The Role of the Inventory Transaction Fact Table Within a Value Chain DW
» Reduce Rows in Inventory Snapshot Fact Table
» Fact Table Load Question...
» Incremental Load for fact table help
» FACT table Incremental load!
» Reduce Rows in Inventory Snapshot Fact Table
» Fact Table Load Question...
» Incremental Load for fact table help
» FACT table Incremental load!
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum