Dimensions Directly Tied to Facts vs. Snowflaking
4 posters
Page 1 of 1
Dimensions Directly Tied to Facts vs. Snowflaking
Hi all,
Thanks in advance for your input on this.
We are in the midst of a building out a portion of our data warehouse. The example here has to do with insurance data. Hopefully the discussion can be abstracted to a more generic modeling level.
We are looking at two options.
Option 1 (see image) joins a dimension directly to the fact table
https://2img.net/r/ihimizer/i/dimproductoption1.png/
Option 2 (see image) snowflake the table allowing for a view of the data independent of the fact table.
https://2img.net/r/ihimizer/i/dimproductoption2.png/
I'm looking for feedback about the pros/cons/reasons for these two options. Reference to Kimball books, Kimball Design Tips, links, etc. are always welcome.
The concern we have is that we originally built our data warehouse using option 1, but now with the type 3 SCD it seems to make more sense to use option 2. Using a hybrid of both options (e.g. a fact with FKs to both DimProduct and DimSubmission which has a FK also to DimProduct) seems scary since you have multiple ways/paths to get to DimProduct.
Design constraints:
DimProduct is a dimension that logically can be tied to many different facts across the enterprise.
The basis (starting point) for a product is a submission.
Thanks in advance for your input on this.
We are in the midst of a building out a portion of our data warehouse. The example here has to do with insurance data. Hopefully the discussion can be abstracted to a more generic modeling level.
We are looking at two options.
Option 1 (see image) joins a dimension directly to the fact table
https://2img.net/r/ihimizer/i/dimproductoption1.png/
Option 2 (see image) snowflake the table allowing for a view of the data independent of the fact table.
https://2img.net/r/ihimizer/i/dimproductoption2.png/
I'm looking for feedback about the pros/cons/reasons for these two options. Reference to Kimball books, Kimball Design Tips, links, etc. are always welcome.
The concern we have is that we originally built our data warehouse using option 1, but now with the type 3 SCD it seems to make more sense to use option 2. Using a hybrid of both options (e.g. a fact with FKs to both DimProduct and DimSubmission which has a FK also to DimProduct) seems scary since you have multiple ways/paths to get to DimProduct.
Design constraints:
- We've been asked to provide a type 3 SCD of the Product. At this point we were thinking of adding this to DimSubmission as two seperate FKs to DimProduct. One FK would be something like OriginalProduct and the other would be CurrentProduct.
- You CAN have a Product without a submission. This was originally why we chose option 1 since you would lose products from your results if there was no Submission for that Product.
DimProduct is a dimension that logically can be tied to many different facts across the enterprise.
The basis (starting point) for a product is a submission.
kbarrett- Posts : 2
Join date : 2009-02-03
RE:Dimensions Directly Tied to Facts vs. Snowflaking
Option 2 with the requirement (SCD 3 having Current,PrevProduct) is fine.
But the conflicting factor is what is the granularity of InventoryCount. Does it totally depends on Submission Number?i.e., only those products which have submission number are accounted for Inventory Count or all products can come under it? If all products are required, I dont think how would you use any data for products which are missing or not required for your Facts.
If there is another fact say as shown in Option1 fig (FactSomeOtherFact) which solely depends on Product table, then you can directly link that table with that FactSomeOtherFact though you have snowflaked the other Dimensions as that of Option2.
But the conflicting factor is what is the granularity of InventoryCount. Does it totally depends on Submission Number?i.e., only those products which have submission number are accounted for Inventory Count or all products can come under it? If all products are required, I dont think how would you use any data for products which are missing or not required for your Facts.
If there is another fact say as shown in Option1 fig (FactSomeOtherFact) which solely depends on Product table, then you can directly link that table with that FactSomeOtherFact though you have snowflaked the other Dimensions as that of Option2.
Prasanna- Posts : 6
Join date : 2009-10-20
Re: Dimensions Directly Tied to Facts vs. Snowflaking
What are you trying to gain from the snowflake? From your desciption you are not changing attributes about a product, but rather changing the product associated with a submission or policy... correct? So 'product' as a dimension is not type 3. Is what is changing actually the collection of coverages, riders, deductables, etc... that make up a 'product'?
What kind of changes are you trying to capture?
What kind of changes are you trying to capture?
Resist the urge to snowflake
In almost 20 years of building DSS and data warehouse systems, I cannot recall EVER needing what we now call a Type 3 SCD. Really. A Type 2 SCD provides the same information and then some, with maybe a bit more work to get a Type 3 out of it.
What is really being tracked? The changes to the product or the changes to the relationships that the product has with other dimensions and facts? That is the real question. Sort that out and you are almost done.
What is really being tracked? The changes to the product or the changes to the relationships that the product has with other dimensions and facts? That is the real question. Sort that out and you are almost done.
Colin Davies- Posts : 8
Join date : 2009-05-20
Similar topics
» Snowflaking Dimensions
» No of Dimensions and Facts
» Outrigger dimensions in Facts
» Static facts in dimensions?
» Cross reference facts and dimensions
» No of Dimensions and Facts
» Outrigger dimensions in Facts
» Static facts in dimensions?
» Cross reference facts and dimensions
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum