How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
3 posters
Page 1 of 1
How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
Hi everyone,
I'm new to dimensional modeling so please bear with me.
The question is how to resolve a many to many relationship between a fact and dimension table.
The business case is that we have an unemployment counselor who will review a case and refer the case (claimant) to one or more services. Services can be additional counseling, school, etc. lets say we have 20 different services.
Design #1: Utilizes a bridge table.
Service Dimension table with columns Service_id (surrogate key), Service Code and Service Description columns
Claimant Case Review Fact table with columns Claimant_Case_Review_id(lets assume I used a surrogate key on the fact table to make things easier), review start date, review end date, referred to services date, completed services date, exempt from services flag.
Service Claimant Case Review Bridge table with foreign keys Service_id and Claimant_Case_Review_id. Lets assume I have load_date and a time dimension.
For example if the case was referred to 3 services then the fact table will have 1 record and the bridge table will have 3 records.
Design #2: Utilize the fact table
Service Dimension table with columns Service_id (surrogate key), Service Code and Service Description columns
Claimant Case Review Fact table with columns Claimant_Case_Review_id(lets assume I used a surrogate key on the fact table to make things easier), Service_id (foreign key), review start date, review end date, referred to services date, completed services date, exempt from services flag.
For example, if the case was referred to 3 services the fact table will create 3 rows (one for each service). The rationale behind this design is that since services and the review are tightly couple and the business is seamless that it will be ok to save the records in the fact table. Also if we selected Design #1 then in order to get the services for a particular case review an expensive join will have to occur between the fact and bridge table (which is like joining two fact tables which is considered taboo). With Design #2 if the same question was ask, one wouldn't have to do any join. Also with Design #1 the bridge table will be growing faster than your fact table when fact tables are suppose to be the one growing.
I'm not saying I agree with Design #1 or Design #2 it just seem #2 had some strong points. I wanted to go along with #1 because its a clean design and that's what the Data Warehouse toolkit recommends (I believe).
Please tell me the pros and and cons to both and in what scenario I should use one over the other.
Thanks!
I'm new to dimensional modeling so please bear with me.
The question is how to resolve a many to many relationship between a fact and dimension table.
The business case is that we have an unemployment counselor who will review a case and refer the case (claimant) to one or more services. Services can be additional counseling, school, etc. lets say we have 20 different services.
Design #1: Utilizes a bridge table.
Service Dimension table with columns Service_id (surrogate key), Service Code and Service Description columns
Claimant Case Review Fact table with columns Claimant_Case_Review_id(lets assume I used a surrogate key on the fact table to make things easier), review start date, review end date, referred to services date, completed services date, exempt from services flag.
Service Claimant Case Review Bridge table with foreign keys Service_id and Claimant_Case_Review_id. Lets assume I have load_date and a time dimension.
For example if the case was referred to 3 services then the fact table will have 1 record and the bridge table will have 3 records.
Design #2: Utilize the fact table
Service Dimension table with columns Service_id (surrogate key), Service Code and Service Description columns
Claimant Case Review Fact table with columns Claimant_Case_Review_id(lets assume I used a surrogate key on the fact table to make things easier), Service_id (foreign key), review start date, review end date, referred to services date, completed services date, exempt from services flag.
For example, if the case was referred to 3 services the fact table will create 3 rows (one for each service). The rationale behind this design is that since services and the review are tightly couple and the business is seamless that it will be ok to save the records in the fact table. Also if we selected Design #1 then in order to get the services for a particular case review an expensive join will have to occur between the fact and bridge table (which is like joining two fact tables which is considered taboo). With Design #2 if the same question was ask, one wouldn't have to do any join. Also with Design #1 the bridge table will be growing faster than your fact table when fact tables are suppose to be the one growing.
I'm not saying I agree with Design #1 or Design #2 it just seem #2 had some strong points. I wanted to go along with #1 because its a clean design and that's what the Data Warehouse toolkit recommends (I believe).
Please tell me the pros and and cons to both and in what scenario I should use one over the other.
Thanks!
cmp66- Posts : 6
Join date : 2014-03-12
Re: How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
The bridge doesn't make much sense in this case. One row per service per case gives you more opportunities for analysis.
Re: How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
Thanks for the prompt reply!
Again I'm new to the dimensional modeling world so I'm trying to come up with a set of conventions that may make me lean one way (bridge table) or the other way (no bridge table).
In what scenarios should a bridge table be utilize?
Reason I ask is because someone will make the case that I should always go without the bridge table for a many to many relationship to avoid an expensive join. The notion at my job is space is not an issue and performance wins out all the time.
Thanks!
Again I'm new to the dimensional modeling world so I'm trying to come up with a set of conventions that may make me lean one way (bridge table) or the other way (no bridge table).
In what scenarios should a bridge table be utilize?
Reason I ask is because someone will make the case that I should always go without the bridge table for a many to many relationship to avoid an expensive join. The notion at my job is space is not an issue and performance wins out all the time.
Thanks!
cmp66- Posts : 6
Join date : 2014-03-12
Re: How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
A classic example is book authors. Sales are tracked by the book, but there may be multiple authors each receiving some portion of the royalty. A bridge is used because book related facts are represented by a single row for the book. If you include authors as a dimension, it requires a bridge because there may be more than one author.
You don't have a situation like that. You are better off treating each service referral as an individual event. A case may have multiple referrals, but case is just one of many dimensions of the referral.
You don't have a situation like that. You are better off treating each service referral as an individual event. A case may have multiple referrals, but case is just one of many dimensions of the referral.
Re: How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
Ok great. thank you!
Now I have another business scenario:
I have a Claimant Claim Fact table which represents a Claimant filing for benefits.
I also have an Appeal Case Fact table which could represent a Claimant or Employer appealing a benefit decision.
Some times an Appeal Case can be filed by multiple claimants which means an Appeal Case can be comprise of multiple Claimant Claim.
The opposite is true as well; A Claimant Claim can be associated with many Appeal Cases.
Solution #1:
I was going to put the Claimant Claim id (this is a surrogate key of the fact table not coming from the source system) as a (soft) foreign key in the Appeal Case Fact table. So if Three Claimant Claims are associated with an Appeal Case, I would insert three records in the Appeal Case table. The assumption I'm making is that the source system doesn't provide a Claimant Case Id. I wouldn't be able to answer the following question: What are the distinct Appeal Cases? Because I just loaded three Claimant Claim records into the Appeals Case table that are not group.
Solution #2:
Use a bridge/association/junction table. The problem here is that I'm essentially joining the Claimant Claims Fact table with the Appeal Case Fact table. This will help me to answer the question above with no issues since the association isn't stored in the Appeal Case Fact table. However, I read that I shouldn't join a fact table to another fact table because that could lead to a cartesian product.
So I'm not to sure how to model this scenario.
Now I have another business scenario:
I have a Claimant Claim Fact table which represents a Claimant filing for benefits.
I also have an Appeal Case Fact table which could represent a Claimant or Employer appealing a benefit decision.
Some times an Appeal Case can be filed by multiple claimants which means an Appeal Case can be comprise of multiple Claimant Claim.
The opposite is true as well; A Claimant Claim can be associated with many Appeal Cases.
Solution #1:
I was going to put the Claimant Claim id (this is a surrogate key of the fact table not coming from the source system) as a (soft) foreign key in the Appeal Case Fact table. So if Three Claimant Claims are associated with an Appeal Case, I would insert three records in the Appeal Case table. The assumption I'm making is that the source system doesn't provide a Claimant Case Id. I wouldn't be able to answer the following question: What are the distinct Appeal Cases? Because I just loaded three Claimant Claim records into the Appeals Case table that are not group.
Solution #2:
Use a bridge/association/junction table. The problem here is that I'm essentially joining the Claimant Claims Fact table with the Appeal Case Fact table. This will help me to answer the question above with no issues since the association isn't stored in the Appeal Case Fact table. However, I read that I shouldn't join a fact table to another fact table because that could lead to a cartesian product.
So I'm not to sure how to model this scenario.
cmp66- Posts : 6
Join date : 2014-03-12
Re: How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
ngalemmo wrote:The bridge doesn't make much sense in this case. One row per service per case gives you more opportunities for analysis.
What if the grain of the fact should be at Claimant_Case_Review_id level? If the dimension of service is a rarely visited aspect of the fact, in this situation bridge table makes sense?
modeling12345- Posts : 2
Join date : 2014-01-30
Re: How to resolve a many to many relationship? Bridge Table or Utilize the same fact table?
No, because the business activity is a service referral, so an atomic level fact table should maintain information at the activity level. A case level representation of this activity would be an aggregate of the atomic level facts and would not contain dimensions that are being aggregated (i.e. services).
Similar topics
» many to many fact table relationship - use dimension, bridge or ?
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» separate fact table/different grain - do I need a bridge table
» Data in a fact or dimenzion table or bridge table
» Bridge Table - What is in the Fact tbl?
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» separate fact table/different grain - do I need a bridge table
» Data in a fact or dimenzion table or bridge table
» Bridge Table - What is in the Fact tbl?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum