Modeling a fact with multiple sources
5 posters
Page 1 of 1
Modeling a fact with multiple sources
We have a situation where multiple business processes will generate a recommendation. A recommendation is a set of instructions, estimated costs, risk analysis, etc. for identified hazards. Examples of various business processes that generate these recommendations are a mishap investigation, workplace evaluation, hazard identification, inspection report, and a few others.
Now, the end-user wants to slice and dice either by individual source processes or all recommendations in general. For instance, I may want to see recommendations from inspection reports in California, or recommendations from workplace evaluations at construction sites. Each source process has its own unique criteria that I may want to query (i.e. mishap reports have causal factors, inspections have inspection types). The recommendations themselves, though, have common data (i.e. cost, risk analysis, etc.) regardless of source. Therefore, the customer wants to see recommendations across the board as well. Analyze all recommendations received by an organization during 2008 (NOTE: organization is common across business processes).
Would I have a fact table for each inidividual source prossess-recommendation combination and then a fact for all recommendations? Or, is there a way to have a single fact for all recommendations and somehow type back to the appropriate source process.
Thank you.
Now, the end-user wants to slice and dice either by individual source processes or all recommendations in general. For instance, I may want to see recommendations from inspection reports in California, or recommendations from workplace evaluations at construction sites. Each source process has its own unique criteria that I may want to query (i.e. mishap reports have causal factors, inspections have inspection types). The recommendations themselves, though, have common data (i.e. cost, risk analysis, etc.) regardless of source. Therefore, the customer wants to see recommendations across the board as well. Analyze all recommendations received by an organization during 2008 (NOTE: organization is common across business processes).
Would I have a fact table for each inidividual source prossess-recommendation combination and then a fact for all recommendations? Or, is there a way to have a single fact for all recommendations and somehow type back to the appropriate source process.
Thank you.
schnedar- Posts : 4
Join date : 2009-04-23
Re: Modeling a fact with multiple sources
Usually you have one fact table with appropriate dimensions to segment and analyze the data.
One fact or many with a view
There are a couple of different ways to go on this and the correct method "depends". If all of the seperate source systems have the same columns and the same measures, then go with one fact table. If the source systems have different columns but you can map them to common dimension tables (for example, you have 3 source systems, each source system has 3 unique columns, you could create a dimension table with all 9 columns - this would work great if the columns from the 3 source systems rolled up to a common hierarchy).
What you want to avoid is a situation where you have dimension key columns in your fact table that only apply to data from one source.
Another alternative to to create seperate facts tables and a view across the fact tables that contain the common dimensions. Another option is seperate fact tables with an aggregate table populated from the seperate fact tables.
What you want to avoid is a situation where you have dimension key columns in your fact table that only apply to data from one source.
Another alternative to to create seperate facts tables and a view across the fact tables that contain the common dimensions. Another option is seperate fact tables with an aggregate table populated from the seperate fact tables.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Multiple Facts with a View
Could you expand on the following statement? What would be wrong with having some of the dimensional keys being N/A?
"What you want to avoid is a situation where you have dimension key columns in your fact table that only apply to data from one source."
"What you want to avoid is a situation where you have dimension key columns in your fact table that only apply to data from one source."
cbusch- Posts : 4
Join date : 2009-02-03
Age : 63
Location : Albany NY
Re: Modeling a fact with multiple sources
Sounds like this might be a candidate for Level 1 and Level 2 fact tables. Level 1 facts would look exactly like they do from their respective source system. Then all the ubiquitous facts shared between all sources roll up into a single fact record (level 2). It would look kinda like this:
LEVEL 1 FACT TABLES
================
Source 1:
Fact A, Fact B, Fact C, Fact D
Source 2:
Fact A, Fact B, Fact E, Fact F
LEVEL 2 FACT TABLE
===============
Fact A, Fact B
Since facts A and B are shared across all sources they're the only ones that make it into your level 2 fact table. Facts C & D only belong to source 1, Facts E & F only belong to source 2. End users can still get to them, they just have to go to the lower level fact tables. This avoids having to "plug" missing facts within a single fact table row that don't exist across all systems. This also applies to dimension keys as well, not just facts. I think this is essentially what Jeff Smith is saying when he mentioned an aggregate fact table. I also agree within him that you should avoid having "n/a" dimensions foreign surrogate keys in your fact tables whenever possible. Those are really just acting as a "plug".
LEVEL 1 FACT TABLES
================
Source 1:
Fact A, Fact B, Fact C, Fact D
Source 2:
Fact A, Fact B, Fact E, Fact F
LEVEL 2 FACT TABLE
===============
Fact A, Fact B
Since facts A and B are shared across all sources they're the only ones that make it into your level 2 fact table. Facts C & D only belong to source 1, Facts E & F only belong to source 2. End users can still get to them, they just have to go to the lower level fact tables. This avoids having to "plug" missing facts within a single fact table row that don't exist across all systems. This also applies to dimension keys as well, not just facts. I think this is essentially what Jeff Smith is saying when he mentioned an aggregate fact table. I also agree within him that you should avoid having "n/a" dimensions foreign surrogate keys in your fact tables whenever possible. Those are really just acting as a "plug".
BrianJarrett- Posts : 61
Join date : 2009-02-03
Age : 50
Location : St. Louis, MO
Similar topics
» modeling multiple fact tables
» Design Fact Table in Dimensional Modeling with Multiple Grain
» Dimension with different sources (multiple business keys) ?
» 2 sources using Dimensional Modeling Workbook
» Multiple sources for the same data - which one to extract from?
» Design Fact Table in Dimensional Modeling with Multiple Grain
» Dimension with different sources (multiple business keys) ?
» 2 sources using Dimensional Modeling Workbook
» Multiple sources for the same data - which one to extract from?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum