Loading dimension when source already has effective to and from dates
4 posters
Page 1 of 1
Loading dimension when source already has effective to and from dates
Hi,
Wondering what the best way would be to handle the following:
I had initially thought I would complete the first load of the dimension deriving the SCD effective dates from the source effective dates and would then only consider the currently active records in the source for future loads.
However, the problems are that:
For the above reasons I'm thinking that my previous idea is not going to work but the only other option I can come up with is to model each row in the source table (which has it's own unique id) as it's own entity in the dimension. I would have columns for CourseEffectiveFrom and CourseEffectiveTo based on those values in the source along with the standard SCD CURRENT/ROW_EFFECTIVE_FROM/TO columns.
The initial load would therefore have the same number of rows as the source, with duplicate entries for a course "code".
For some reason this feels strange but I'm unsure of a better way to handle it.
I'm sure this is a relatively common scenario so if anyone has any suggestions (or validation, if it makes sense) then I'd love to hear!
Thanks all,
lcm
Wondering what the best way would be to handle the following:
- Educational Institution
- Currently modelling the courses offered
- Need point in time so there are SCD type 2 requirements
- Source has multiple rows for a course with effective from and to dates going back many year to delineate the changes in course attributes
I had initially thought I would complete the first load of the dimension deriving the SCD effective dates from the source effective dates and would then only consider the currently active records in the source for future loads.
However, the problems are that:
- Courses are entered with effective dates in the future - for planning and enrolment reasons
- Occasionally the effective from date on a course is adjusted to an earlier date (eg. changed from 2007-01-01 to 2006-01-01)
- Occasionally a new row is added for an existing course with effective from/to placing it earlier in the timeline
For the above reasons I'm thinking that my previous idea is not going to work but the only other option I can come up with is to model each row in the source table (which has it's own unique id) as it's own entity in the dimension. I would have columns for CourseEffectiveFrom and CourseEffectiveTo based on those values in the source along with the standard SCD CURRENT/ROW_EFFECTIVE_FROM/TO columns.
The initial load would therefore have the same number of rows as the source, with duplicate entries for a course "code".
For some reason this feels strange but I'm unsure of a better way to handle it.
I'm sure this is a relatively common scenario so if anyone has any suggestions (or validation, if it makes sense) then I'd love to hear!
Thanks all,
lcm
lcm- Posts : 4
Join date : 2011-08-24
Re: Loading dimension when source already has effective to and from dates
Please refer to this related topic:
http://forum.kimballgroup.com/t836-loading-future-dated-rows-to-dimensions#3575
http://forum.kimballgroup.com/t836-loading-future-dated-rows-to-dimensions#3575
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Loading dimension when source already has effective to and from dates
Thanks hang - that was a good little read. It answers the question regarding the future rows but I can forsee that being tricky when handling future rows when dealing with both SCD type 1 and 2.
Probably the larger query is around the "historic" records changing date coverage. I'll try and give an example below:
Source table - this shows Scenario 1 - a future record in the source table
Scenario 2 - Effective from date on a course is adjusted to an earlier date
Scenario 3 - A new row is added for an existing course with effective from/to placing it earlier in the timeline
So, as mentioned before, my original plan was to derive the SCD ROW_EFFECTIVE dates in the dimension from the EffectiveFrom/To dates in the source and store the relevant attributes. However, now that I'm aware of the above scenarios I don't think that plan is going to work. At least not without a whole lot of existing dimension record manipulation.
In the interests of keeping it simple I was instead thinking that I might have a dimension table as follows:
But looking at it I don't get the sense that it's ideal. However, if I did go down the above then future/back dated rows would just slot in and modifications to the dates of existing course ranges would be handled by a standard SCD2 new row.
As I said, seems to do what I need but still doesn't feel "right".
Any ideas folks??
Thanks,
lcm
Probably the larger query is around the "historic" records changing date coverage. I'll try and give an example below:
Source table - this shows Scenario 1 - a future record in the source table
UQID | CourseCode | CourseName | CourseCredits | EffectiveFrom | EffectiveTo |
1 | ABS100 | Absolutely Enthralling Course | 300 | 2001-01-01 | 2006-12-31 |
2 | ABS100 | Absolutely Enthralling Course | 600 | 2007-01-01 | 2011-12-31 |
3 | ABS100 | Absolutely Enthralling Course | 800 | 2012-01-01 | 2199-12-31 |
Scenario 2 - Effective from date on a course is adjusted to an earlier date
UQID | CourseCode | CourseName | CourseCredits | EffectiveFrom | EffectiveTo |
1 | ABS100 | Absolutely Enthralling Course | 300 | 1997-01-01 | 2006-12-31 |
2 | ABS100 | Absolutely Enthralling Course | 600 | 2007-01-01 | 2011-12-31 |
3 | ABS100 | Absolutely Enthralling Course | 800 | 2012-01-01 | 2199-12-31 |
Scenario 3 - A new row is added for an existing course with effective from/to placing it earlier in the timeline
UQID | CourseCode | CourseName | CourseCredits | EffectiveFrom | EffectiveTo |
1 | ABS100 | Absolutely Enthralling Course | 300 | 2001-01-01 | 2006-12-31 |
2 | ABS100 | Absolutely Enthralling Course | 600 | 2007-01-01 | 2011-12-31 |
3 | ABS100 | Absolutely Enthralling Course | 800 | 2012-01-01 | 2199-12-31 |
4 | ABS100 | Absolutely Enthralling Course | 200 | 1997-01-01 | 2000-12-31 |
So, as mentioned before, my original plan was to derive the SCD ROW_EFFECTIVE dates in the dimension from the EffectiveFrom/To dates in the source and store the relevant attributes. However, now that I'm aware of the above scenarios I don't think that plan is going to work. At least not without a whole lot of existing dimension record manipulation.
In the interests of keeping it simple I was instead thinking that I might have a dimension table as follows:
CourseKey(SK) | CourseUQID(NK) | CourseCode | CourseName | CourseCredits | EffectiveFrom | EffectiveTo | IS_CURRENT | ROW_EFFECTIVE_FROM | ROW_EFFECTIVE_TO |
But looking at it I don't get the sense that it's ideal. However, if I did go down the above then future/back dated rows would just slot in and modifications to the dates of existing course ranges would be handled by a standard SCD2 new row.
As I said, seems to do what I need but still doesn't feel "right".
Any ideas folks??
Thanks,
lcm
lcm- Posts : 4
Join date : 2011-08-24
Re: Loading dimension when source already has effective to and from dates
The idea of having two sets of dates is fine. It's been discussed before.
You simply have two conflicting sets of date ranges. One relates to the business view of when something goes into effect and the other is the technical view of when it was applied to the data warehouse.
You simply have two conflicting sets of date ranges. One relates to the business view of when something goes into effect and the other is the technical view of when it was applied to the data warehouse.
Back Date Business Effective Date
Hi
I am looking for an answer to back date business effective From and To dates listed on this issue. Did anyone has a solution for that ? I am also having a similar issue.
Thanks
Manjith
I am looking for an answer to back date business effective From and To dates listed on this issue. Did anyone has a solution for that ? I am also having a similar issue.
Thanks
Manjith
manjith.gunatilaka- Posts : 1
Join date : 2013-02-28
Location : Australia
Similar topics
» can i use other two effective dates column in SCD2
» Dimensions with effective and end dates - best design and best way to join to fact tables?
» Combination of SCD2, partial snowflake, Effective and end dates in fact table
» Fact with eff / exp dates referencing dimension with eff / exp dates
» Shrunken Dimension: Model and loading
» Dimensions with effective and end dates - best design and best way to join to fact tables?
» Combination of SCD2, partial snowflake, Effective and end dates in fact table
» Fact with eff / exp dates referencing dimension with eff / exp dates
» Shrunken Dimension: Model and loading
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|