Fact and dimension tables - avoiding same number of rows in both
4 posters
Page 1 of 1
Fact and dimension tables - avoiding same number of rows in both
I have a Prospect Follow-up activity fact table. Follow-up activities are Email, Sales Call, Send Invitation etc... performed by Sales person to sell a Unit to a Prospect resident. Follow-up activity fact table's grain is one row for each Prospect, Follow-up action by Sales person.
Simultaneously, a Prospect goes through different steps of Statuses. Source system keeps history of Status changes and Follow-up activities.
So we have a Prospect dimension common to Status fact table (with effective st and end dates) and Follow-up Activity table.
My question is.. Prospect has several many-to-many attributes such as personal needs, care type, inquiry source and couple more. Is it right to create a Prospect fact table that has one row per Prospect (= rows in Prospect Dim) that joins to a Personal Need Group bridge table that joins to Personal need dimension table? Also is it right to use this fact table to relate a Prospect to other dimensions that are not specific to an event?
From design principles, I learned that dimensions are not identified if Fact and Dimension has same number of rows. However, I'm not able to avoid in this model...
For clarity I'm attaching the design model..
Thanks for your time!
Simultaneously, a Prospect goes through different steps of Statuses. Source system keeps history of Status changes and Follow-up activities.
So we have a Prospect dimension common to Status fact table (with effective st and end dates) and Follow-up Activity table.
My question is.. Prospect has several many-to-many attributes such as personal needs, care type, inquiry source and couple more. Is it right to create a Prospect fact table that has one row per Prospect (= rows in Prospect Dim) that joins to a Personal Need Group bridge table that joins to Personal need dimension table? Also is it right to use this fact table to relate a Prospect to other dimensions that are not specific to an event?
From design principles, I learned that dimensions are not identified if Fact and Dimension has same number of rows. However, I'm not able to avoid in this model...
For clarity I'm attaching the design model..
Thanks for your time!
Last edited by snpr01 on Fri Sep 21, 2012 12:30 pm; edited 2 times in total (Reason for editing : Sorry, I updated the image as it had incorrect IDs in Follow-up Activity fact table.)
snpr01- Posts : 13
Join date : 2009-02-03
Re: Fact and dimension tables - avoiding same number of rows in both
snpr01 wrote:
From design principles, I learned that dimensions are not identified if Fact and Dimension has same number of rows. However, I'm not able to avoid in this model...
I hadn't heard this one before - which "design principles" exactly are you referring to?
Re: Fact and dimension tables - avoiding same number of rows in both
That your fact to dimension relationship cardinality should not be 1 to 1.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Fact and dimension tables - avoiding same number of rows in both
Thanks for clarifying B&L. As you said, I was meaning to avoid 1 to 1 cardinality relationship from fact to dimension.
What is the best way to design this when there are multivalued dimensions and other related dimensions to a dimension that are not specific to an event, gathered over a period of time, and that are not needed to be maintained with date history?
What is the best way to design this when there are multivalued dimensions and other related dimensions to a dimension that are not specific to an event, gathered over a period of time, and that are not needed to be maintained with date history?
snpr01- Posts : 13
Join date : 2009-02-03
Re: Fact and dimension tables - avoiding same number of rows in both
Seems like you should be able to leverage some of the other existing facts to capture Personal Needs. It seems overkill to me to create a fact table solely to capture multi-valued relationships. Is it really a factless fact or are there other metrics for Prospect Fact?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Fact and dimension tables - avoiding same number of rows in both
Right now, it is a factless fact table. Other two fact tables are also factless. Really, measures are going to be when the Prospect becomes a Resident by signing an agreement: in resident agreement fact (that also joins to Prospect dim).
So, it is a bit unclear whether I'm doing the right design..
So, it is a bit unclear whether I'm doing the right design..
snpr01- Posts : 13
Join date : 2009-02-03
Re: Fact and dimension tables - avoiding same number of rows in both
I'm still trying to justify the necessity of Prospect fact that has 1:1 cardinality with Prospect Dim.
Initially, it seemed like a very common scenario that anyone would face. How do you design this without going against the principal of not having 1:1 cardinality fact and dimension tables?
Just to clarify: this model is to analyze Prospect bank/pool, their counts in each status, how many follow-ups are being done on the Prospects, how many sales persons are working on them, their movement in the pipeline etc.. along with other entities that are related to a Prospect. Unfortunately, there are no transactions that occur in this process. The measures would be counts of Prospects by different dimensions.
Initially, it seemed like a very common scenario that anyone would face. How do you design this without going against the principal of not having 1:1 cardinality fact and dimension tables?
Just to clarify: this model is to analyze Prospect bank/pool, their counts in each status, how many follow-ups are being done on the Prospects, how many sales persons are working on them, their movement in the pipeline etc.. along with other entities that are related to a Prospect. Unfortunately, there are no transactions that occur in this process. The measures would be counts of Prospects by different dimensions.
snpr01- Posts : 13
Join date : 2009-02-03
Re: Fact and dimension tables - avoiding same number of rows in both
Hi snpr01,
I suggest that you relax your "principles" and continue with 1:1 cardinality.
As your model matures you may add other facts beyond just a count e.g. Target $ for each Prospect. You may need an "Unknown" member/row in your Prospect Dimension when relating it to other Facts, which would not be present in your Prospect Fact. You may need to track SCD type 2 changes in your Prospect Dimension for relating it to other Facts. Any of those requirements will be awkward with a single-table design.
I also used to follow the "principal of not having 1:1 cardinality" as received wisdom, but I'm over that now. I attended a course taught by Ralph recently where he specifically debunked it. I now regret the models I designed with one table serving a fact+dimension role. They all ran into those requirements eventually and the solutions are awkward.
Good luck!
Mike
I suggest that you relax your "principles" and continue with 1:1 cardinality.
As your model matures you may add other facts beyond just a count e.g. Target $ for each Prospect. You may need an "Unknown" member/row in your Prospect Dimension when relating it to other Facts, which would not be present in your Prospect Fact. You may need to track SCD type 2 changes in your Prospect Dimension for relating it to other Facts. Any of those requirements will be awkward with a single-table design.
I also used to follow the "principal of not having 1:1 cardinality" as received wisdom, but I'm over that now. I attended a course taught by Ralph recently where he specifically debunked it. I now regret the models I designed with one table serving a fact+dimension role. They all ran into those requirements eventually and the solutions are awkward.
Good luck!
Mike
Last edited by Mike Honey on Thu Oct 11, 2012 4:50 pm; edited 1 time in total (Reason for editing : typo)
Re: Fact and dimension tables - avoiding same number of rows in both
So I'm not alone on this boat..
Thanks Mike!
Thanks Mike!
snpr01- Posts : 13
Join date : 2009-02-03
Re: Fact and dimension tables - avoiding same number of rows in both
I don't quite get it. Personal Needs are multivalued attributes in the prospect dimension, not a multivalued dimension to a fact. Why do you have to fabricate a fact table that is 1-1 to a dimension.
I remember Kimball said a bridge table is between dimensions. If there is a multivalued attribute in a dimension, like in your case, you create a bridge table with prospect key and personal need Key in it. If a dimension is a multivalued to the fact, you then create a bridge table with group key repeated for different dimension key in the bridge, so that group key becomes a single valued dimension key in the fact table. From referential perspective between fact and dimension table, there is an imaginary group dimension table between fact and bridge table, but it is still a valid design to omit the group table and connect the bridge to the fact table directly. So in either case, multivalued attribute in a dimension, or multivalued dimension in a fact table, the bridge is between dimensions.
I remember Kimball said a bridge table is between dimensions. If there is a multivalued attribute in a dimension, like in your case, you create a bridge table with prospect key and personal need Key in it. If a dimension is a multivalued to the fact, you then create a bridge table with group key repeated for different dimension key in the bridge, so that group key becomes a single valued dimension key in the fact table. From referential perspective between fact and dimension table, there is an imaginary group dimension table between fact and bridge table, but it is still a valid design to omit the group table and connect the bridge to the fact table directly. So in either case, multivalued attribute in a dimension, or multivalued dimension in a fact table, the bridge is between dimensions.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Fact and dimension tables - avoiding same number of rows in both
Hi hang, I'm designing this fact table not only for (a) Prospect's multivalued attributes but also for (b) other 1-1 entities' relationships with the Prospect.
1:1 entities are: to name a couple, Referral Source organization dim, Referral Source associate dim etc..
1:N are : Personal needs, Inquiry sources (multiple family members or friends may inquire about the Prospect's admission) etc.
If I do not have this fact table, then I would have to snow-flake 1:1 entities like Referral Source Org on the Prospect dim which is not the right thing to do. Since I'm making this fact table to avoid Snowflakes, I'm using it as a place for the 1:n dimensions as well.
Would you still make Personal Need group table join to the Prospect Dim? (and two other similar 1:N attributes that are not mentioned here for simplicity)
On a side note, as the design is evolving, I'm making the Prospect fact as an Accumulating fact table for the purpose of Prospect's statuses lag calculation (assuming a rigid, one directional status pipeline).
This reason is also adding to the list of purposes to introduce this Prospect fact table.
To add clarity to my explanation, here is the updated model..
1:1 entities are: to name a couple, Referral Source organization dim, Referral Source associate dim etc..
1:N are : Personal needs, Inquiry sources (multiple family members or friends may inquire about the Prospect's admission) etc.
If I do not have this fact table, then I would have to snow-flake 1:1 entities like Referral Source Org on the Prospect dim which is not the right thing to do. Since I'm making this fact table to avoid Snowflakes, I'm using it as a place for the 1:n dimensions as well.
Would you still make Personal Need group table join to the Prospect Dim? (and two other similar 1:N attributes that are not mentioned here for simplicity)
On a side note, as the design is evolving, I'm making the Prospect fact as an Accumulating fact table for the purpose of Prospect's statuses lag calculation (assuming a rigid, one directional status pipeline).
This reason is also adding to the list of purposes to introduce this Prospect fact table.
To add clarity to my explanation, here is the updated model..
snpr01- Posts : 13
Join date : 2009-02-03
Re: Fact and dimension tables - avoiding same number of rows in both
Now I can see your point. I guess the accumulating snapshot fact is very useful for analyzing the lifecycle of a prospect becoming a resident. The schema looks like a quite reasonable star and it would work if prospect is a type 1 dimension in relation to other multivalued attributes. Just bear in mind that every time you need further details about prospect, you have to join with a fact table to get them, which in a sense, is snowflaking through a fact table instead of dimension directly. However in this case, 1:1 between prospect dimension and the accumulatiing snapshot is a sound dimensional design, given the nature of this type of fact table. Whether you should reflect the relationship through a fact table or within dimensions is a different issue.
An important point about leveraging fact table to resolve the dimension/attribute relationship is that a time series (date key) in the fact table can time-stamp the relationship for change tracking to alleviate type 2 impact on big dimension. However the accumulating snapshot fact does not give you that capability, as it has pivoted the dates into columns that will be updated with current values.
Another point is, you could potentially need personal details for other fact tables in your model. Would you repeat the relationships in all the relevant fact tables or you would rather resolve them within dimensions in one place.
An important point about leveraging fact table to resolve the dimension/attribute relationship is that a time series (date key) in the fact table can time-stamp the relationship for change tracking to alleviate type 2 impact on big dimension. However the accumulating snapshot fact does not give you that capability, as it has pivoted the dates into columns that will be updated with current values.
Another point is, you could potentially need personal details for other fact tables in your model. Would you repeat the relationships in all the relevant fact tables or you would rather resolve them within dimensions in one place.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Reducing number of rows in fact tables
» Number of Columns in Fact Tables vs. Dimension Tables
» Avoiding Nulls in Dimension Tables
» Large number of snowflake code tables per dimension
» Unknown number of relationships from dimension to fact until fact loaded
» Number of Columns in Fact Tables vs. Dimension Tables
» Avoiding Nulls in Dimension Tables
» Large number of snowflake code tables per dimension
» Unknown number of relationships from dimension to fact until fact loaded
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum