Handling Many to Many Relationships Between Dimensions
3 posters
Page 1 of 1
Handling Many to Many Relationships Between Dimensions
I am working on a dimensional model for the data warehouse we are building. We've received a lot of help and done a whole lot of research but there is still an area where there is some confusion and I could use some reassurance or redirection with the approach we're taking, specifically dealing with our many to many relationships.
Our solution is based on reporting on performance observations of teachers. Each observation has an observer(user1), a learner(user2), and a building that observation took place in.
Also, there are some reports that need to be done about the user themselves. This involves a few many to many relationships. For example a user can be in many buildings, and can have many roles.
The way we're dealing with these many to many relationships is with bridge tables. We have a user dimension, a building dimension, and a user-building fact which has the surrogate key of both user and building. I've seen this example many times in searches which is why we've gone this route but there are two problems we face that I can't find anywhere when searching what approach to take. Maybe I'm not using the right terminology.
1) What happens when in the source system a user leaves a building? Should we delete that fact? Or use dates to expire that building-user relationship?
2) What happens when in the source system there are changes made to the building or the user. Because these are SCD2 tables, the surrogate key will be changed. This makes sense to me in our other facts, because they are instances in time, where a history should be kept of what things were like at that point in time, but here we need to show what a user's current buildings are. I'm assuming that we might have to use instead of the surrogate key a key that doesn't change when the dimension changes, but I'm confused how our reporting solution will know which of the two users to show.
Our solution is based on reporting on performance observations of teachers. Each observation has an observer(user1), a learner(user2), and a building that observation took place in.
Also, there are some reports that need to be done about the user themselves. This involves a few many to many relationships. For example a user can be in many buildings, and can have many roles.
The way we're dealing with these many to many relationships is with bridge tables. We have a user dimension, a building dimension, and a user-building fact which has the surrogate key of both user and building. I've seen this example many times in searches which is why we've gone this route but there are two problems we face that I can't find anywhere when searching what approach to take. Maybe I'm not using the right terminology.
1) What happens when in the source system a user leaves a building? Should we delete that fact? Or use dates to expire that building-user relationship?
2) What happens when in the source system there are changes made to the building or the user. Because these are SCD2 tables, the surrogate key will be changed. This makes sense to me in our other facts, because they are instances in time, where a history should be kept of what things were like at that point in time, but here we need to show what a user's current buildings are. I'm assuming that we might have to use instead of the surrogate key a key that doesn't change when the dimension changes, but I'm confused how our reporting solution will know which of the two users to show.
shastings- Posts : 1
Join date : 2013-05-22
Re: Handling Many to Many Relationships Between Dimensions
Hi shastings,
I am no expert, but I am happy to share my thoughts.
An alternative to this, which shows only the current relationship between building and user, might be to use "durable" keys in the bridge table. When using the bridge table in the reporting solution, you need to ensure that the interface knows to join to each SCD2 on the durable key and at the same time filter for only the "current" records in each dimension.
I am no expert, but I am happy to share my thoughts.
In the case of a user leaving a building, you probably need to go back to the business and ask what should happen to the historical data including facts and relationships between users and buildings. Although I get the feeling that no matter the answer from the business, deleting a fact is frowned upon unless it is truly a mistake.shastings wrote:1) What happens when in the source system a user leaves a building? Should we delete that fact? Or use dates to expire that building-user relationship?
I think Ralph talks about two ways to figure this out. The first is implied in Design Tip #136. The way I read the article, you need to create a new row for each update to either the building or the user table. This helps track the history of the relationship between the two.shastings wrote:2) What happens when in the source system there are changes made to the building or the user. Because these are SCD2 tables, the surrogate key will be changed. This makes sense to me in our other facts, because they are instances in time, where a history should be kept of what things were like at that point in time, but here we need to show what a user's current buildings are. I'm assuming that we might have to use instead of the surrogate key a key that doesn't change when the dimension changes, but I'm confused how our reporting solution will know which of the two users to show.
An alternative to this, which shows only the current relationship between building and user, might be to use "durable" keys in the bridge table. When using the bridge table in the reporting solution, you need to ensure that the interface knows to join to each SCD2 on the durable key and at the same time filter for only the "current" records in each dimension.
LizBaron- Posts : 2
Join date : 2013-05-22
Location : Long Island, NY
Re: Handling Many to Many Relationships Between Dimensions
1) If you delete the fact record, you lose the history of the relation. You have to mark that record to show the relation between dimensions is expired. Expire date is possible.
2) You don't need to change the fact table when any change occurs in dimensions. You can do a self-join on the dimension using the natural key to find the version you want see.
2) You don't need to change the fact table when any change occurs in dimensions. You can do a self-join on the dimension using the natural key to find the version you want see.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 42
Location : Istanbul, Turkey

» Fuzzy relationships between fact and dimensions
» Schema Roadblock: Dimension vs Fact [Dimensions w/FKs to other Dimensions?]
» One to many relationships
» Same attribute in multiple dimensions or Create new dimension?
» Many-to-many Relationships
» Schema Roadblock: Dimension vs Fact [Dimensions w/FKs to other Dimensions?]
» One to many relationships
» Same attribute in multiple dimensions or Create new dimension?
» Many-to-many Relationships
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum