Composite Dimension Natural Keys - Attribute Used In Multiple Dimensions as well as Fact Tables
2 posters
Page 1 of 1
Composite Dimension Natural Keys - Attribute Used In Multiple Dimensions as well as Fact Tables
We have multiple instances of the same ERP system as well as their own Databases, differing by their locale. (International Facilities)
Each of these locales are assigned a unique ID to identify their database source. (As well as the Facility - I.e. for America the ID would be USA)
Within each satellite location, the DB schema is the same as the others, same tables, same fields, but with differing values based upon users entry and sales/costs/etc.
When staging the tables, for example the Parts table, from two or more of these locales, their Natural Operational Key will be the same (Same Part ID in the ERP system regardless of locale), but their additional attribute data may differ (A part's Stock QTY would differ by location, as well as the description assigned for that Part ID). They differ due to user entry, as well as Facility needs.
When analyzing and reporting on Parts, it is crucial that we be able to look up not only by part ID but additionally be able to discern from where that particular part ID record came from, i.e. what facility.
Having this requirement, I would consider the Natural Operational Key to be a Composite one, based upon Part ID and Satellite Locale.
Now this solution would work fine for a single dimension like this this single dimension, however, this Satellite ID is used in the same manner as with parts elsewhere and in many other dimensions.
It is also needed to a FK slicer in many Fact Tables.
How should I treat this attribute? Should I it in its own dimension (Satellite), and then snowflake the dimensions that require this composite key to have a FK relationship to it?
Or should I push the value into each dimension repeating the Operational key (duplication), but then have the fact table hold the sole FK to the Satellite Dimension?
Any insight would be appreciated.
Each of these locales are assigned a unique ID to identify their database source. (As well as the Facility - I.e. for America the ID would be USA)
Within each satellite location, the DB schema is the same as the others, same tables, same fields, but with differing values based upon users entry and sales/costs/etc.
When staging the tables, for example the Parts table, from two or more of these locales, their Natural Operational Key will be the same (Same Part ID in the ERP system regardless of locale), but their additional attribute data may differ (A part's Stock QTY would differ by location, as well as the description assigned for that Part ID). They differ due to user entry, as well as Facility needs.
When analyzing and reporting on Parts, it is crucial that we be able to look up not only by part ID but additionally be able to discern from where that particular part ID record came from, i.e. what facility.
Having this requirement, I would consider the Natural Operational Key to be a Composite one, based upon Part ID and Satellite Locale.
Now this solution would work fine for a single dimension like this this single dimension, however, this Satellite ID is used in the same manner as with parts elsewhere and in many other dimensions.
It is also needed to a FK slicer in many Fact Tables.
How should I treat this attribute? Should I it in its own dimension (Satellite), and then snowflake the dimensions that require this composite key to have a FK relationship to it?
Or should I push the value into each dimension repeating the Operational key (duplication), but then have the fact table hold the sole FK to the Satellite Dimension?
Any insight would be appreciated.
Srixon- Posts : 4
Join date : 2015-01-21
Re: Composite Dimension Natural Keys - Attribute Used In Multiple Dimensions as well as Fact Tables
Satellite should be an attribute in dimensions and should be part of the natural key. My personal preference is to define the natural key as a separate varchar column that is a delimited concatenation of whatever pieces make up the key. The components of the key are also stored as attributes. This allows you to handle multiple sources in the same dimension in a consistent manner when different sources may have different key structures.
I don't know what you mean by "It is also needed to a FK slicer in many Fact Tables.". If you mean users need to filter by satellite, you would be covered if it is carried as a dimensional attribute. However, since this code would be carried on all dimensions, it could be confusing as to which one to use, it can also be problematic if you are building an 'auto filter' in the BI layer based on user id or are implementing row level security. For consistency in querying, it may be worthwhile to include the code as a degenerate dimension value on facts. This gives a single column that can always be used as a filter regardless of which dimensions are used in the query.
I imagine these different instances are autonomous operations. So I would carry the satellite on 'corporate standard' dimensions as well, as it is conceivable that they could have been modified locally.
As far as satellite being its own dimension, it would depend on wither there are additional attributes other than the code. If that is the case and it warrants a dimension, I add it as a dimension to the facts in place of the degenerate dimension. I would also carry the code as attributes on the dimensions and as a component of the dimension natural key. I would not snowflake this dimension with other dimensions as there is no benefit in doing so.
I don't know what you mean by "It is also needed to a FK slicer in many Fact Tables.". If you mean users need to filter by satellite, you would be covered if it is carried as a dimensional attribute. However, since this code would be carried on all dimensions, it could be confusing as to which one to use, it can also be problematic if you are building an 'auto filter' in the BI layer based on user id or are implementing row level security. For consistency in querying, it may be worthwhile to include the code as a degenerate dimension value on facts. This gives a single column that can always be used as a filter regardless of which dimensions are used in the query.
I imagine these different instances are autonomous operations. So I would carry the satellite on 'corporate standard' dimensions as well, as it is conceivable that they could have been modified locally.
As far as satellite being its own dimension, it would depend on wither there are additional attributes other than the code. If that is the case and it warrants a dimension, I add it as a dimension to the facts in place of the degenerate dimension. I would also carry the code as attributes on the dimensions and as a component of the dimension natural key. I would not snowflake this dimension with other dimensions as there is no benefit in doing so.
Similar topics
» Same attribute in multiple dimensions or Create new dimension?
» Storing Date Keys in dimension tables versus fact tables
» Do I need multiple fact tables or dimensions
» Business keys or Natural keys in the Fact table
» Multiple different grain fact tables with lot of common dimensions.
» Storing Date Keys in dimension tables versus fact tables
» Do I need multiple fact tables or dimensions
» Business keys or Natural keys in the Fact table
» Multiple different grain fact tables with lot of common dimensions.
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum