Multiple different grain fact tables with lot of common dimensions.
5 posters
Page 1 of 1
Multiple different grain fact tables with lot of common dimensions.
Hi,
I am faced with one scenario where we have 3 fact tables -
1. Case fact - one record per case
2. Case status fact - one record per case per status change with time
3. Case status aggregate - one record per case per status ( kind of aggregate where data is rolled up from Case status fact however resulting in not so impressive compression ratio)
Apart from that we have large number of attributes called as "Case attributes" present in Case fact such as - Case type, Case state, Case owner user, Case Priority etc. ( 20 + such Case attributes), and all are implemented through Reference Code Dimension. (Let me know if more description is required about its design, for better understanding).
Problem -
We have customers creating reporting using common dimensions and "Case attributes" from Case fact and taking measures from Case status aggregate or Case status fact which is resulting in "No Data" as these case attributes (implemented through Reference Code Dimension) are not conformed across fact tables.
Hence to address such requirements (which are many !) we are planning to propagate these Case attributes to all the other fact tables, which will enable them to create such reports with out any problem. Doing this will not cause any grain mismatch as all listed fact tables have CASE_ID (PK for Case) and can accommodate these "Case attributes". Please provide your suggestions on this approach, any other suggestions are also welcomed.
Regards,
Abhiraizada
I am faced with one scenario where we have 3 fact tables -
1. Case fact - one record per case
2. Case status fact - one record per case per status change with time
3. Case status aggregate - one record per case per status ( kind of aggregate where data is rolled up from Case status fact however resulting in not so impressive compression ratio)
Apart from that we have large number of attributes called as "Case attributes" present in Case fact such as - Case type, Case state, Case owner user, Case Priority etc. ( 20 + such Case attributes), and all are implemented through Reference Code Dimension. (Let me know if more description is required about its design, for better understanding).
Problem -
We have customers creating reporting using common dimensions and "Case attributes" from Case fact and taking measures from Case status aggregate or Case status fact which is resulting in "No Data" as these case attributes (implemented through Reference Code Dimension) are not conformed across fact tables.
Hence to address such requirements (which are many !) we are planning to propagate these Case attributes to all the other fact tables, which will enable them to create such reports with out any problem. Doing this will not cause any grain mismatch as all listed fact tables have CASE_ID (PK for Case) and can accommodate these "Case attributes". Please provide your suggestions on this approach, any other suggestions are also welcomed.
Regards,
Abhiraizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Re: Multiple different grain fact tables with lot of common dimensions.
When you say propogate, what exactly are you planning on doing? Adding dimensions or copying columns?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Multiple different grain fact tables with lot of common dimensions.
Any fact table should contain references to its appropriate dimensions. Integration across facts can only occur where common dimensions and/or attributes exist. So, if a dimension is appropriate for a fact, it should be included.
I am not a fan of this "reference code" dimension you refer to. I assume what you are talking about is a generalized code/description table with an FK on the fact to reference the particular code and desription that applies. Where there may be 20+ codes involved, you have a corresponding 20+ foreign keys in the fact. Is my assumption correct?
I am not a fan of this "reference code" dimension you refer to. I assume what you are talking about is a generalized code/description table with an FK on the fact to reference the particular code and desription that applies. Where there may be 20+ codes involved, you have a corresponding 20+ foreign keys in the fact. Is my assumption correct?
Multiple different grain fact tables with lot of common dimensions.
Hi,
if you want to measure the facts against a common dimension then it should be present in all the fact tables.
How many different dimension will you be including ?
Could you please give us more detail about how the different fact tables are related to each other(Case fact, Case status fact and Case status aggregate ).
Thanks
Himanshu
if you want to measure the facts against a common dimension then it should be present in all the fact tables.
How many different dimension will you be including ?
Could you please give us more detail about how the different fact tables are related to each other(Case fact, Case status fact and Case status aggregate ).
Thanks
Himanshu
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Multiple different grain fact tables with lot of common dimensions.
BoxesAndLines wrote:When you say propogate, what exactly are you planning on doing? Adding dimensions or copying columns?
We are planning to add more dimensions to the fact tables ( Case status aggregate and Case status Fact). However all of the new dimension fk's in fact tables will only refer to Reference Code Dimension.
Hope his help !
Regards,
Abhiraizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Re: Multiple different grain fact tables with lot of common dimensions.
ngalemmo wrote:Any fact table should contain references to its appropriate dimensions. Integration across facts can only occur where common dimensions and/or attributes exist. So, if a dimension is appropriate for a fact, it should be included.
I am not a fan of this "reference code" dimension you refer to. I assume what you are talking about is a generalized code/description table with an FK on the fact to reference the particular code and desription that applies. Where there may be 20+ codes involved, you have a corresponding 20+ foreign keys in the fact. Is my assumption correct?
Thanks for this quick response...now i can start working on implementing its logical design.
And yes the 2 fact tables - Case status Fact and Case status aggregate will have 20+ additional Fk's from Reference Code Dimension.
I can also understand why you don't like the common "Reference Code" dimension, but in our scenario its easy way out as something similar has been implemented in source system and having this simplifies things for me.
Regards,
Abhiraizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Re: Multiple different grain fact tables with lot of common dimensions.
I would get rid of Reference Code Dimension and regroup the attributes into a number of junk dimensions for all the low cardinality attributes, and create a few stand alone dimensions for high cardinality attributes. Hopefully that will cut down the number of dimensions singnificantly so that you can conform your fact tables on minimal number of dimensions.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Multiple different grain fact tables with lot of common dimensions.
Big, unified code tables are bad design in OLTP and dimensional modeling. +1 on what Hang said.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Multiple different grain fact tables with lot of common dimensions.
BoxesAndLines wrote:Big, unified code tables are bad design in OLTP and dimensional modeling. +1 on what Hang said.
What are major draw backs because of such code tables ? Where we have 3 major columns as - Code type, Code value and Code desc. Currently our reference code table has around 30000 records which is never going to increase as its configurable data.
The major advantage i see with them being present in OLTP/OLAP as it standardize all such Ref. Codes together which is better in terms of reducing number of tables and also make it more flexible in terms of extension/standardization/translation etc.
Let me know if i am missing any point here
Regards,
Abhiraizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Re: Multiple different grain fact tables with lot of common dimensions.
It removes the ability of the database to enforce business rules. One code is varchar, another code is numeric. I can no longer enforce the minimum basics of selecting the appropriate datatype. Additionally, I can no longer enforce RI, add hierarchies, and may introduce concurrency issues since everyone in the company now uses the same table.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Multiple different grain fact tables with lot of common dimensions.
From a performance standpoint, having a wide fact table will hurt. It can also be problematic for some BI tools to deal with it.
As it has been suggested, you are better off grouping correlated or low cardinality attributes into junk dimensions to significantly reduce the number of FKs in the fact table.
To make change detection easier, you could maintain the code table in the DW to support the ETL process, but do not expose it to users and do not reference it from fact tables. Another use for it would be to support 'list of values' type dropdowns in a BI tool.
As it has been suggested, you are better off grouping correlated or low cardinality attributes into junk dimensions to significantly reduce the number of FKs in the fact table.
To make change detection easier, you could maintain the code table in the DW to support the ETL process, but do not expose it to users and do not reference it from fact tables. Another use for it would be to support 'list of values' type dropdowns in a BI tool.
Re: Multiple different grain fact tables with lot of common dimensions.
Almost all the legacy systems have this type of generic code table in both OLTP and DW systems due to ignorance of industry best practice. As a dimensional modeler for DW system, it's crucial to not get influenced by OLTP design, let alone a bad one like code table. Instead, you should remodel the same data in dimensional mindset.
What do you gain by reducing many entities into one table? very little but at huge price. Firstly for OLTP system, it confuses model users as they have to guess how to use the table or rely on some documentation outside the ER diagram. The design goes against normalization principle which is supposed to break up big anomaly tables into more normalized tables. The only possible benefit is that it somehow could help front-end developer to centralize their code. However data modeling should not compromise its principle to just serve the front-end at price of an unclear model.
More importantly in dimensional modeling, this over-normalized 20+ dimension code table should be the very target that needs to be flatend/denormalized into fewer dimensions to achieve two major goals in a DW system. A ease of use, as the new dimensional model becomes clear. B performance, as number of FK in fact table is minimized hence saving the space and joins.
What do you gain by reducing many entities into one table? very little but at huge price. Firstly for OLTP system, it confuses model users as they have to guess how to use the table or rely on some documentation outside the ER diagram. The design goes against normalization principle which is supposed to break up big anomaly tables into more normalized tables. The only possible benefit is that it somehow could help front-end developer to centralize their code. However data modeling should not compromise its principle to just serve the front-end at price of an unclear model.
More importantly in dimensional modeling, this over-normalized 20+ dimension code table should be the very target that needs to be flatend/denormalized into fewer dimensions to achieve two major goals in a DW system. A ease of use, as the new dimensional model becomes clear. B performance, as number of FK in fact table is minimized hence saving the space and joins.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Design Question - Multiple Fact Tables at the same Grain
» Using a dimension in multiple fact tables with different grain and support SCD
» Multiple Fact table with different grain w/conformed dimensions - BO Challenge
» Do I need multiple fact tables or dimensions
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Using a dimension in multiple fact tables with different grain and support SCD
» Multiple Fact table with different grain w/conformed dimensions - BO Challenge
» Do I need multiple fact tables or dimensions
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum