Reusing dimensions in model
3 posters
Page 1 of 1
Reusing dimensions in model
I'm new to data modeling, and I'd like some input on the following scenario.
For example, we have vendor information (all coming from same source table) that can be present for a tag in our inventory (depending on tag type) in three different ways. It can be the producer, vendor, or outside processor (which can only be certain types of vendors so a filtered list). I see this as one dimension say DIM_VENDOR. My question is since I need to join this in 3 times, should I create sql views (we are using Microsoft SQL server, if it matters) to make DIM_VENDOR display as DIM_Outside_Processor and DIM_Producer (with results filtered on the vendor types this can be)? Is it okay to use views in the model or stick with physical tables only aliasing the reused tables as a different name at join time and not filtering for the producer table? Also, in the fact table, If I continued with the naming convention of putting the surrogate key field in the fact table with the same name it has in the dimension, I’d be showing vendor_sk three times in the fact table these dimensions are using, and you'd just have to know how the data was loaded to know which vendor you were getting (vendor, outside processor, or producer). This doesn’t seem like a very good way to go for understandability. I know the model is suppose to be easily understood by the business so I'd think that I'd rename the surrogate keys for the producer and outside processor surrogate key values in the fact table from something like Vendor_SK to OutsideProcess_SK and Producer_SK and use sql views of DIM_VENDOR.
For example, we have vendor information (all coming from same source table) that can be present for a tag in our inventory (depending on tag type) in three different ways. It can be the producer, vendor, or outside processor (which can only be certain types of vendors so a filtered list). I see this as one dimension say DIM_VENDOR. My question is since I need to join this in 3 times, should I create sql views (we are using Microsoft SQL server, if it matters) to make DIM_VENDOR display as DIM_Outside_Processor and DIM_Producer (with results filtered on the vendor types this can be)? Is it okay to use views in the model or stick with physical tables only aliasing the reused tables as a different name at join time and not filtering for the producer table? Also, in the fact table, If I continued with the naming convention of putting the surrogate key field in the fact table with the same name it has in the dimension, I’d be showing vendor_sk three times in the fact table these dimensions are using, and you'd just have to know how the data was loaded to know which vendor you were getting (vendor, outside processor, or producer). This doesn’t seem like a very good way to go for understandability. I know the model is suppose to be easily understood by the business so I'd think that I'd rename the surrogate keys for the producer and outside processor surrogate key values in the fact table from something like Vendor_SK to OutsideProcess_SK and Producer_SK and use sql views of DIM_VENDOR.
shelzalee- Posts : 6
Join date : 2011-12-06
Re: Reusing dimensions in model
Having different views of the vendor dimension is fine if it helps the users understand the model. I assume the view would contain a filter on type. But, at the same time, the vendor dimension is valid as itself (when type doesn't matter).
As far as the fact goes, there should still be only one foreign key in the physical fact table if only one vendor is involved. You could define a view of the fact that maps the single column into 3 aliases, but I don't think it would help. If a user tries to join the Producer_SK from the fact to DIM_Producer, AND join OutsideProcessor_SK to DIM_OutsideProcessor, they would get no results every time because the sets are mutually exclusive.
As far as the fact goes, there should still be only one foreign key in the physical fact table if only one vendor is involved. You could define a view of the fact that maps the single column into 3 aliases, but I don't think it would help. If a user tries to join the Producer_SK from the fact to DIM_Producer, AND join OutsideProcessor_SK to DIM_OutsideProcessor, they would get no results every time because the sets are mutually exclusive.
Re: Reusing dimensions in model
Create three FK's on your fact table for each 'role' that a vendor can participate in. The FK column name should match the role so you know which FK goes to which relationship.
The other thing you need to watch out for are the optionality of the columns based on role type. If the set of attributes for a Vendor are significantly different than for a Producer, you may want to consider creating separate dimensions or views to filter out the non-applicable attributes.
The other thing you need to watch out for are the optionality of the columns based on role type. If the set of attributes for a Vendor are significantly different than for a Producer, you may want to consider creating separate dimensions or views to filter out the non-applicable attributes.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Reusing dimensions in model
From reading the original post, I got the impression that a fact has only one vendor involved and that the various categories of vendors are types, not necessarily roles. Based on that interpretation, the idea of multiple FKs is not correct as it would lead to null FKs which don't apply based on the type of vendor involved, and using more than one FK in a query will again return no results if inner joins are used.
Reusing dimensions in model
I believe I have my answer that resuing the physical table DIM_VENDOR as a view is fine when joining. The DIM_VENDOR would serve as three roles vendor, producer, and oustide processor. These all have the same attributes, just that producer is only a certain list of the types in vendor so I could filter on those types which are one of my attributes for the view. When creating the views I'll alias my PK column in those tables so for the producer and oustide processor views I'd have Producer_SK and OustideProcessor_SK instead of Vendor_SK in all three.
These three roles won't be used with every type of our inventory. Our outside processor material will have both a vendor and an outside processor, our wip and finished goods material will not use any of these, and our raw will use both producer and vendor. The grain of the star schema is tag so the grain is consistent, but I'm planning to use dummy records for the types of material missing these dimensions in order for the inner join to continue to return results as expected.
These three roles won't be used with every type of our inventory. Our outside processor material will have both a vendor and an outside processor, our wip and finished goods material will not use any of these, and our raw will use both producer and vendor. The grain of the star schema is tag so the grain is consistent, but I'm planning to use dummy records for the types of material missing these dimensions in order for the inner join to continue to return results as expected.
shelzalee- Posts : 6
Join date : 2011-12-06
Similar topics
» How to model relations between dimensions?
» Recursive Dimensions in Dimensional Model
» Number of dimensions in a dimensional model.
» Model with Attributes Based on Two Separate Dimensions
» How to model a FACT where its relationship to multiple dimensions has changed from M:1 to M:M
» Recursive Dimensions in Dimensional Model
» Number of dimensions in a dimensional model.
» Model with Attributes Based on Two Separate Dimensions
» How to model a FACT where its relationship to multiple dimensions has changed from M:1 to M:M
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum