Fact dimensionality
2 posters
Page 1 of 1
Fact dimensionality
I often run into the situation the we have several fact tables with different grains, but are related through a single ID at the transaction level. Typically the dimensionality is the same, but the related information doesn't exist in the source tables.
My question is, what is the best practice to model this dimensionally? To help explain further I will give an example:
You have a source Loan table which has several attributes related to the loan when it was originated, like Borrower, interest rate, date, loan amount, loan type, etc. This table is keyed off of the loan ID.
You have a source Loan History table which has any payments made against the loans. It has payment date, payment amount, payment type, etc. This table also has the Loan ID it is related to.
If I were to dimensionally model these two tables, I would break it down like this:
Loan Table Dimensions: Borrower, Loan Type, Origination Date, Loan ID (Degenerate)
Loan Table Measures: Loan Balance, Int Rate
Loan History Table Dimension: Payment Date, Payment Type, Loan ID (Degenerate)
Loan History Table Measures: Payment Amount
But you could realistically join the two tables by Loan ID, and:
1) Create a single fact table with all combined dimensionality with mixed grains, which isn't recommended, or
2) Join the Loan Table to the Loan History table to utilize the dimensionality of the Loan origination dimensionality
Now imagine this schema has several fact tables, different grains, different dimensionality, but still related by Loan ID. Each table could be combined with each other through Loan ID to introduce the dimensionality of the other.
My usual way of handling this is to join the tables so the lower-grained fact table includes Borrower, Origination Date, and Loan Type in addition to the Payment Type and Payment Date. But I go back and forth on doing this in ETL or with views.
What are the best practices for these situations?
My question is, what is the best practice to model this dimensionally? To help explain further I will give an example:
You have a source Loan table which has several attributes related to the loan when it was originated, like Borrower, interest rate, date, loan amount, loan type, etc. This table is keyed off of the loan ID.
You have a source Loan History table which has any payments made against the loans. It has payment date, payment amount, payment type, etc. This table also has the Loan ID it is related to.
If I were to dimensionally model these two tables, I would break it down like this:
Loan Table Dimensions: Borrower, Loan Type, Origination Date, Loan ID (Degenerate)
Loan Table Measures: Loan Balance, Int Rate
Loan History Table Dimension: Payment Date, Payment Type, Loan ID (Degenerate)
Loan History Table Measures: Payment Amount
But you could realistically join the two tables by Loan ID, and:
1) Create a single fact table with all combined dimensionality with mixed grains, which isn't recommended, or
2) Join the Loan Table to the Loan History table to utilize the dimensionality of the Loan origination dimensionality
Now imagine this schema has several fact tables, different grains, different dimensionality, but still related by Loan ID. Each table could be combined with each other through Loan ID to introduce the dimensionality of the other.
My usual way of handling this is to join the tables so the lower-grained fact table includes Borrower, Origination Date, and Loan Type in addition to the Payment Type and Payment Date. But I go back and forth on doing this in ETL or with views.
What are the best practices for these situations?
kskistad- Posts : 11
Join date : 2009-02-03
Re: Fact dimensionality
I think you may have stumbled on the beauty of "conformed dimensions". Dr. Kimball has written extensively on this concept. It is how you drill across different fact tables.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Fact table dimensionality
Thanks for the reply. I understand the concept of conformed dimensions, but I didn't know in practice whether modelers try to stuff all dimension keys possible (that relate in some way) into every single one of their fact tables, even if in the relational world they only indirectly relate. In my previous example, the second fact table was for loan payments only, so Date Paid is an obvious dimension on this table, but because the payment relates to a loan, and the loan has attributes of it's own, like Loan Origination Date (or any or all other loan-level dimensions), then in practice should I put a dimension key for Loan Origination Date into the payment fact table, simply because I can? What is the common practice? Maybe this comes down to a business requirements decision? In other words, maybe you only put the dimensions necessary to do the analysis the business needs rather than putting all dimensions in all fact tables just because it's possible...
kskistad- Posts : 11
Join date : 2009-02-03
Re: Fact dimensionality
My approach is more of the latter. I do try to anticipate things the business might want to report though. This is what I consider one of the pitfalls of dimensional modeling. In the normalized world, the data relationships are always present. As we've all heard and read from Dr. Kimball's myths of dimensional model, the dimensional model can support any relationship from the normalized world. The reality is that I don't include all of these relationships for performance reasons. What you do and do not include is a discussion best carried out with the business.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» Relative Rank with Varying Dimensionality - Fact or MDX?
» Dimensionality of facts and Microsoft SSAS Cubes
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
» Single fact table vs multiple fact tables - what is the right way in a dimensional model?
» Unknown number of relationships from dimension to fact until fact loaded
» Dimensionality of facts and Microsoft SSAS Cubes
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
» Single fact table vs multiple fact tables - what is the right way in a dimensional model?
» Unknown number of relationships from dimension to fact until fact loaded
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|