Tracking changes in a hierarchy
2 posters
Page 1 of 1
Tracking changes in a hierarchy
Hi,
I'm working in a DW project on a bank. They don't like to much the use of subrrogate keys and they prefer to use natural keys. One of the problems we have is a product hierarchy. I'm going to simplify it. The hierarchy is something like:
Category
|
SubCategory
|
Product
The problem is that one product can belong to more than one Subcategory and one Subcategory can belong to more than one Category. So We can have more than one father for each element. Because of this they have created three diferent dimension tables (category, subcategory and product) and two relation tables (Cat-Sub and Sub-Prod)
We want to handle this dimensions as SCD type II. So If for example I have a change in one subcategory I will add a new row in the subcategory dimension tables.
My question is, Imagine one product is changed to another subcategory. How can you handle this changes in the relation table? The relations are defined in the relations tables and of course users want to have keep historical information about the relation also.
Thanks in advance!!
Regards
Fico
I'm working in a DW project on a bank. They don't like to much the use of subrrogate keys and they prefer to use natural keys. One of the problems we have is a product hierarchy. I'm going to simplify it. The hierarchy is something like:
Category
|
SubCategory
|
Product
The problem is that one product can belong to more than one Subcategory and one Subcategory can belong to more than one Category. So We can have more than one father for each element. Because of this they have created three diferent dimension tables (category, subcategory and product) and two relation tables (Cat-Sub and Sub-Prod)
We want to handle this dimensions as SCD type II. So If for example I have a change in one subcategory I will add a new row in the subcategory dimension tables.
My question is, Imagine one product is changed to another subcategory. How can you handle this changes in the relation table? The relations are defined in the relations tables and of course users want to have keep historical information about the relation also.
Thanks in advance!!
Regards
Fico
fico- Posts : 3
Join date : 2010-09-12
Re: Tracking changes in a hierarchy
Just add effective and expiration date bounds to the relation table.
Re: Tracking changes in a hierarchy
ngalemmo wrote:Just add effective and expiration date bounds to the relation table.
That's what I said to the customer but they answered "Then We have to use the date on the joins. Where is the adventage of using a subrrogate key then?"
We are going to use MicroStrategy to report.
As I said they wants to use natural key instead of subrrogate keys. They will have natural key + start date + end date in all tables (dimension and relation). Then they will create a view joining each table with the Day Dimension table so they obtain a daily snapshot of every dimension and relation table with natural key+day as identifier of each row.
In MicroStrategy they want to model each attribute against the view and using natural key+day as ID.
* They will model Day Attribute as father of all attributes
* They will relate Product-Subcategory and Subcategory-Category using the natural key+day
* They will join each attribute against fact tables using natural key+day.
What do you think of this approach?
fico- Posts : 3
Join date : 2010-09-12
Re: Tracking changes in a hierarchy
Natural or surrogate keys don't matter. The relationship tables define the relationship between entities, not the entity itself. I am assuming a change in the relationship does not affect either entity... the product is still the product and the subcategory is still the subcategory. So, no matter what key you use, the state of the relationship is an independent event and must be accounted for independently.
Surrogate keys do not change the nature of such relationships. They are used for other reasons... primarily to protect the data warehouse integrity when business keys change, and secondarily, they provide small, stable keys that reduce index size and improve performance.
As for the solution, the snapshot is kind of overkill. If you define an index of key(ASC), expiration date (DSC), on the relationship tables, doing a join using BETWEEN (effective, expiration) would work just fine. If they usually just want current relationships, use a standard open expiration date (say, 12/31/2999) and filter on that value. This filter could be hard-coded in the BI layer.
Surrogate keys do not change the nature of such relationships. They are used for other reasons... primarily to protect the data warehouse integrity when business keys change, and secondarily, they provide small, stable keys that reduce index size and improve performance.
As for the solution, the snapshot is kind of overkill. If you define an index of key(ASC), expiration date (DSC), on the relationship tables, doing a join using BETWEEN (effective, expiration) would work just fine. If they usually just want current relationships, use a standard open expiration date (say, 12/31/2999) and filter on that value. This filter could be hard-coded in the BI layer.
Similar topics
» Relationship between a history tracking table and a non-history tracking table?
» tracking customer activity
» Customer with Multiple Address types
» Tracking repeated issue "status"
» History tracking in a CRM data warehouse
» tracking customer activity
» Customer with Multiple Address types
» Tracking repeated issue "status"
» History tracking in a CRM data warehouse
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum