Large Dimension table compared to fact table?
3 posters
Page 1 of 1
Large Dimension table compared to fact table?
I may be misguided in my attempt to make a dimensional model here, but I will explain the scenario. We provide a value add service in the supply chain business...in this case by depicting a scorecard of the accuracy of vendors' product sizing specifications. We will show for any particular date the average of % diffs and actual counts of all products that passed and failed a retailers tolerance level.
Vendors send retailers send their product sizes (e.g. height, depth, width, weight). At any given time it is possible that one retailer may have received a different set of specs for the same item than a second retailer has for the same vendor's item. The retailers perform their own size measurements randomly after receiving the product. We calculate the percentage differences between vendor and retailer and also compare the % to a retailer's overall tolerance level to determine pass/fail.
So, step 1 - a vendor sends us the product specs for a vendor/retailer/product specification. Step 2 - the retailer measures some of the the vendor/retailer/products and sends us those measurements. Step 3 - we calculate the percentage diffs and pass/fail status. Step 4 - the vendor may change measurements for that vendor/retailer/product for which we recalculate the diffs and pass/fail status.
My initial design thoughts are to create a "vendor retailer product" dimension of SCD type 2 measurements as recorded by the vendor and retailer. Step 2 above would trigger a fact recording the % diff along with dimensional keys to "vendor retailer product", calendar, product group, and maybe other small dimensions.
OK, so I am newbie to dimensional design armed only with the Kimball Dimensional modeling book and the Kimball class from over 7 months ago. I need some help with some fundamental data modeling suggestions for this reporting scenario. One thing I know is that my "vendor retailer product dimension" dwarfs the actual fact table and that is certainly not normal for a dimensional model.
Any suggestions...thanks, Tim
Vendors send retailers send their product sizes (e.g. height, depth, width, weight). At any given time it is possible that one retailer may have received a different set of specs for the same item than a second retailer has for the same vendor's item. The retailers perform their own size measurements randomly after receiving the product. We calculate the percentage differences between vendor and retailer and also compare the % to a retailer's overall tolerance level to determine pass/fail.
So, step 1 - a vendor sends us the product specs for a vendor/retailer/product specification. Step 2 - the retailer measures some of the the vendor/retailer/products and sends us those measurements. Step 3 - we calculate the percentage diffs and pass/fail status. Step 4 - the vendor may change measurements for that vendor/retailer/product for which we recalculate the diffs and pass/fail status.
My initial design thoughts are to create a "vendor retailer product" dimension of SCD type 2 measurements as recorded by the vendor and retailer. Step 2 above would trigger a fact recording the % diff along with dimensional keys to "vendor retailer product", calendar, product group, and maybe other small dimensions.
OK, so I am newbie to dimensional design armed only with the Kimball Dimensional modeling book and the Kimball class from over 7 months ago. I need some help with some fundamental data modeling suggestions for this reporting scenario. One thing I know is that my "vendor retailer product dimension" dwarfs the actual fact table and that is certainly not normal for a dimensional model.
Any suggestions...thanks, Tim
tcash- Posts : 4
Join date : 2009-12-01
Location : Princeton, NJ
Re: Large Dimension table compared to fact table?
OK, I've given this design more thought and tried to put my dimensional brain in gear. I had been thinking that the product height, width, and weights were simply descriptive attributes of a dimension. But, I can treat them as semi-additive numbers in a fact table by creating a fact row with the original measurements recorded as is. Modifications to a measurements be can be represented as the delta from the original measurement. By defining the grain as Daily Product Measurement Transactions from Suppliers and Retailers and incorporating the Retailer, Supplier, Product, and Transaction Date as dimensions, perhaps I have a workable solution. I will also need a summary fact table to depict the current scenario as well as a monthly snapshot fact table to give end of month historical perspective. Am I on the right track now?
Tim
Tim
tcash- Posts : 4
Join date : 2009-12-01
Location : Princeton, NJ
Re: Large Dimension table compared to fact table?
In your case, the measurements are the facts, not dimensional attributes. And you have two sets of facts: the stated dimensions and the measured dimensions. Supplier, Retailer and Product are all independent dimensions to those facts.
The stated measurement facts would be by supplier and product as well as date. If the supplier restates the dimensions, you may also want to include a current flag as a degenerate dimension to identify the most recent stated measurements.
The measured facts would be by retailer, supplier, product and date. In addition to the measurements from the retailer, you may also want to include the current stated measurements (or variance from those measurements) at the time of receipt... essentially to make queries simpler. I would not store percentages (as they are not additive)... prefering to calculate those in the query.
The stated measurement facts would be by supplier and product as well as date. If the supplier restates the dimensions, you may also want to include a current flag as a degenerate dimension to identify the most recent stated measurements.
The measured facts would be by retailer, supplier, product and date. In addition to the measurements from the retailer, you may also want to include the current stated measurements (or variance from those measurements) at the time of receipt... essentially to make queries simpler. I would not store percentages (as they are not additive)... prefering to calculate those in the query.
Re: Large Dimension table compared to fact table?
Thank you for the response. My original thoughts were off target but it seems that my second post was close to your proposal. The trick was to think of those measurements as additive.
If I store the diffs or actuals at time of receipt they would not be additive either....however, I can see where they would facilitate some querying (especially if I add a current flag degenerative dimension). But would storing percentage may facilitate queries in much the same manner?
If I store the diffs or actuals at time of receipt they would not be additive either....however, I can see where they would facilitate some querying (especially if I add a current flag degenerative dimension). But would storing percentage may facilitate queries in much the same manner?
tcash- Posts : 4
Join date : 2009-12-01
Location : Princeton, NJ
Re: Large Dimension table compared to fact table?
I guess it depends if quantities are involved... which in your case may not be an issue. If you store a % variance and each row represents a single instance, you can correctly calculate averages using simple SQL queries. So you should be ok. Storing ratios or unit values (such as unit price) is more an issue when quantities are involved. For example, in a sales fact table, average price would not be AVG(Unit_Price), but rather SUM(Extended_Price)/SUM(Quantity).
With that said, I would still prefer to store actual variance, rather than %, because it makes it easier to calculate other statistics, such as standard deviation.
With that said, I would still prefer to store actual variance, rather than %, because it makes it easier to calculate other statistics, such as standard deviation.
Re: Large Dimension table compared to fact table?
Thank you once again. I just presented the revised model to a bunch of non-dimensional techies and they actually understood and liked it! So, even if we do not implement this model, at least we've begun the education of my peers (and myself!) in dimensional modeling.
tcash- Posts : 4
Join date : 2009-12-01
Location : Princeton, NJ
Large Dimension table compared to fact table?
Large Dimension table compared to fact table? I guess it depends if quantities are involved... RE: Large Dimension table compared to fact table? ...
__________________
Download Microsoft 2010
__________________
Download Microsoft 2010
Steve Dunning- Posts : 4
Join date : 2009-09-29
Similar topics
» Wide and large Dimension or Survey Factless Fact Table
» Large Student dimension or new Student Fact table?
» Please Help- Drill Through To Very Large Dimension Table confusion...
» Partitioning Large MS SQL-Server FACT table
» Large Student dimension or new Student Fact table?
» Please Help- Drill Through To Very Large Dimension Table confusion...
» Partitioning Large MS SQL-Server FACT table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum