Should I create one or two dimensions?
2 posters
Page 1 of 1
Should I create one or two dimensions?
Hello,
I work for a travel company. In the past I built a data mart for our sales department having a Traveler dimension. I now am building some data marts for our operations department and the same dimension is used but from a slightly different perspective. My instinct is to use the same dimension (as suggested when using the bus matrix) but there are some complexities and I'd like to get some feedback.
I think it's the way to go because the Traveler concept is the same in both the Sales and Operations perspectives. The same person we sold to will be the one traveling. It would facilitate traversing the data marts.
There are two reasons to split into two dimensions. The first is that a small minority of the attributes are shared between systems (though isn't a surprise). The second is complexity.
There complexity is that each system tracks its own Traveler object independently which means the Traveler_id (natural key) from the Sales system is a different value than the one from the Operations system. Furthermore, there are multiple Sales systems and one Operations system (even though I only care about 1 sales system). This means that the Traveler can exist in:
- just my Sales system (Right after the sale. Data integration may take months due to business process.)
- just the Operations system (if sale originated in a different Sales system)
- both systems
If I combine them into 1 dimension the unique identifier would be (SalesTraveler_id, OpsTraveler_id), though that will change once the Traveler is integrated from Sales to Ops (st_id: 1, ot_id: NULL -> st_id:1, ot_id:200). This means I probably need a linking table with a surrogate key just to manage them before I even create the dimension table with its own surrogate key.
From experience out there, does the convenience of having one Traveler dimension to use with all the data marts outweigh the complexity of combining them from the multiple systems? Certainly the ETL would be much simpler otherwise.
I appreciate any advice ahead of time.
Thanks,
Arturo
I work for a travel company. In the past I built a data mart for our sales department having a Traveler dimension. I now am building some data marts for our operations department and the same dimension is used but from a slightly different perspective. My instinct is to use the same dimension (as suggested when using the bus matrix) but there are some complexities and I'd like to get some feedback.
I think it's the way to go because the Traveler concept is the same in both the Sales and Operations perspectives. The same person we sold to will be the one traveling. It would facilitate traversing the data marts.
There are two reasons to split into two dimensions. The first is that a small minority of the attributes are shared between systems (though isn't a surprise). The second is complexity.
There complexity is that each system tracks its own Traveler object independently which means the Traveler_id (natural key) from the Sales system is a different value than the one from the Operations system. Furthermore, there are multiple Sales systems and one Operations system (even though I only care about 1 sales system). This means that the Traveler can exist in:
- just my Sales system (Right after the sale. Data integration may take months due to business process.)
- just the Operations system (if sale originated in a different Sales system)
- both systems
If I combine them into 1 dimension the unique identifier would be (SalesTraveler_id, OpsTraveler_id), though that will change once the Traveler is integrated from Sales to Ops (st_id: 1, ot_id: NULL -> st_id:1, ot_id:200). This means I probably need a linking table with a surrogate key just to manage them before I even create the dimension table with its own surrogate key.
From experience out there, does the convenience of having one Traveler dimension to use with all the data marts outweigh the complexity of combining them from the multiple systems? Certainly the ETL would be much simpler otherwise.
I appreciate any advice ahead of time.
Thanks,
Arturo
arturopw- Posts : 1
Join date : 2013-02-26
Re: Should I create one or two dimensions?
I personally like to consolidate multiple sources of the same information into a single dimension. Likewise, if I can (i.e. I have the tools available and the data supports it), I like to consolidate the data within those dimensions. I view this type of functionality, the value add of data warehousing.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» When to add attribute to Dimension or Create new Dimensions
» Same attribute in multiple dimensions or Create new dimension?
» Dimensions - To Create Based on Type or Role ?
» How to create a schema with unrelated client dimensions
» Schema Roadblock: Dimension vs Fact [Dimensions w/FKs to other Dimensions?]
» Same attribute in multiple dimensions or Create new dimension?
» Dimensions - To Create Based on Type or Role ?
» How to create a schema with unrelated client dimensions
» Schema Roadblock: Dimension vs Fact [Dimensions w/FKs to other Dimensions?]
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum