Modeling SCD Type 2 Dimension
3 posters
Page 1 of 1
Modeling SCD Type 2 Dimension
As per our requirement we need to model Dimensions as Type 2
We have the following ETL Layers
1. Source
2. Staging
3. HDW (History Warehouse)
4. Datamart
We generate Surrogate Keys in HDW and Publish the same in Datamart.
For this reason we have kept same surrogate key for a record and is identical to Logical Key
Ex: Employee No = 123 Surrogate Key is always 1 irrespective of the History Tracked
We have taken this approach so that same key will be carried to Datamart and Data lineage can be produced at any given time
Let me know if this is the right approach?
The Data base is Neoview. So we have key Surrogate Key and Effective Date Number (YYYYMMDD) as the composite Primary Key or Surrogate Key?
If we key running sequence surrogate keys, updates to Fact and tracing that to the Data mart would be difficult.
Let me know your thoughts if this approach is correct??
In Datamart, For every Dimension we have the Surrogate Key + Date Effective Key as primary key for Dimension tabke. Date Effective Key is nothing but Time dimension key. So to all Dimension there is a relationship with Time Dimension. Let me know if this approach is correct? Or is there any other better approach to the way it can be handled. Another thought is use of version id but that might complicate the ETL?
Share your thoughts as I'm doing modeling for the first time
We have the following ETL Layers
1. Source
2. Staging
3. HDW (History Warehouse)
4. Datamart
We generate Surrogate Keys in HDW and Publish the same in Datamart.
For this reason we have kept same surrogate key for a record and is identical to Logical Key
Ex: Employee No = 123 Surrogate Key is always 1 irrespective of the History Tracked
We have taken this approach so that same key will be carried to Datamart and Data lineage can be produced at any given time
Let me know if this is the right approach?
The Data base is Neoview. So we have key Surrogate Key and Effective Date Number (YYYYMMDD) as the composite Primary Key or Surrogate Key?
If we key running sequence surrogate keys, updates to Fact and tracing that to the Data mart would be difficult.
Let me know your thoughts if this approach is correct??
In Datamart, For every Dimension we have the Surrogate Key + Date Effective Key as primary key for Dimension tabke. Date Effective Key is nothing but Time dimension key. So to all Dimension there is a relationship with Time Dimension. Let me know if this approach is correct? Or is there any other better approach to the way it can be handled. Another thought is use of version id but that might complicate the ETL?
Share your thoughts as I'm doing modeling for the first time
ranjitkumars- Posts : 7
Join date : 2012-03-15
Age : 44
Location : United Kingdom
Re: Modeling SCD Type 2 Dimension
A type 2 dimension has a single surrogate primary key (as is the case with all dimensions). It does not have a compound primary key,
Re: Modeling SCD Type 2 Dimension
I would drop the DWH and keep all history in what you call data marts. Then I wouldn't call it a data mart, I would call it the EDW. You will soon discover with your current approach that it is cost prohibitive (in terms of labor and time) to manage two data warehouses.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Modeling SCD Type 2 Dimension
Boxes & Lines,
I totally agree with. I'm able to envision that while building the model.
The issue here is we work for a Retail client and as per their standards they want HDW (In the lines of IBM RLDM)
and then the Datamart for this specific subject area.
The approach we have taken is costly for the ETL coding and maintainence persay as well.
The intention of IBM RLDM was to leverage to a Enterprise Datawarehouse, but give the understanding of the source system,
we do not have reference data required for IBM RLDM.
Probably we should raise this point up with client so that we can do away with the Datamart
I totally agree with. I'm able to envision that while building the model.
The issue here is we work for a Retail client and as per their standards they want HDW (In the lines of IBM RLDM)
and then the Datamart for this specific subject area.
The approach we have taken is costly for the ETL coding and maintainence persay as well.
The intention of IBM RLDM was to leverage to a Enterprise Datawarehouse, but give the understanding of the source system,
we do not have reference data required for IBM RLDM.
Probably we should raise this point up with client so that we can do away with the Datamart
ranjitkumars- Posts : 7
Join date : 2012-03-15
Age : 44
Location : United Kingdom
Similar topics
» rationale behind dimension with Type 0 and missing Type 5
» Type 2 dimension or type 2 column?
» Type 3 Slowly Changing Dimension
» How to handle a Type I or II dimension with a snowflaked customer sub dimension (kimball book page 337, 338)
» Modeling an Employee Dimension to a Fact which has two columns relating to the Dimension
» Type 2 dimension or type 2 column?
» Type 3 Slowly Changing Dimension
» How to handle a Type I or II dimension with a snowflaked customer sub dimension (kimball book page 337, 338)
» Modeling an Employee Dimension to a Fact which has two columns relating to the Dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum