Adding new fields to a live system
+8
smithcarter210
hamburg113
jimmycarter165
jamiehout
rpcasey001
Jeff Smith
bgray
D_Pons
12 posters
Page 1 of 1
Adding new fields to a live system
So, the DW DB is loaded with data of various SCD types.
The data is in live use.
New requirements arise that require new SCD 2 fields to be added.
Backups exist of the source systems.
Are there any design tips, experiences that anyone can share on how to populate the new SCD 2 fields.
Is it just a question of rebuilding the whole DW DB using those backups until you've caught up with 'Now'?
Thanks
Ponzie
The data is in live use.
New requirements arise that require new SCD 2 fields to be added.
Backups exist of the source systems.
Are there any design tips, experiences that anyone can share on how to populate the new SCD 2 fields.
Is it just a question of rebuilding the whole DW DB using those backups until you've caught up with 'Now'?
Thanks
Ponzie
D_Pons- Posts : 16
Join date : 2009-02-10
Location : UK
Adding new fields to a live system
I'm first going to make the assumption that the new SCD2 field(s) is going to be added to a dimension table that already contains at least one SCD2 field.
By default, an SCD2 field does not start recording historical changes until the day you implement it. When you initially add the column, all instances (current and historical rows) of the particular row would be assigned the current column value.
That behavior may or may not meet your business requirements because the column values for historic rows are likely to be historically inaccurate. If you need to recreate the past it can be an immense undertaking because you have to consider the impact from the existing SCD2 columns. The historic rows that exist were created based on the change activity dates relative to the existing SCD2 columns and are likely not at all related to changes that may have occurred with your new SCD2 columns. If you start inserting new historic rows you run the strong likelihood of creating time overlapping dimension rows, or worse, corrupting the foreign keys in your fact tables if you aren't meticulously careful. This approach is NOT for the faint of heart or someone only casually familiar with how everything works in the DW.
Rebuilding from scratch from historic backups might work if you thoroughly think it through. Keep in mind that standard SCD2 change detection methods create new dimension rows and assign effectivity dates when the ETL "detects" the change. Simply running repeated loads from backup using your standard ETL would NOT work. Lets say for example you are somehow able to reload all of your data from multiple backups in one day...Today. The initial load would set the Effective_From date to some default way in the past and the Effective_To date to some default way in the future. The next sequential load would recognize the changes in your SCD2 columns, create new rows and adjust effectivity dates. The problem is that the change was detected "Today" and by default your effectivity dates would reflect "Today" rather than when the change really occurred sometime in the past. When you adjust and assign the effectivity dates, you need to set them relative to the date of the backup...not Today. If the business requires you to recreate dimensional history, this is probably the safest approach but as I said, you need to think it through. I suspect for most people this is not an option since totally rebuilding from scratch may takes months to process.
Hope that helps!
By default, an SCD2 field does not start recording historical changes until the day you implement it. When you initially add the column, all instances (current and historical rows) of the particular row would be assigned the current column value.
That behavior may or may not meet your business requirements because the column values for historic rows are likely to be historically inaccurate. If you need to recreate the past it can be an immense undertaking because you have to consider the impact from the existing SCD2 columns. The historic rows that exist were created based on the change activity dates relative to the existing SCD2 columns and are likely not at all related to changes that may have occurred with your new SCD2 columns. If you start inserting new historic rows you run the strong likelihood of creating time overlapping dimension rows, or worse, corrupting the foreign keys in your fact tables if you aren't meticulously careful. This approach is NOT for the faint of heart or someone only casually familiar with how everything works in the DW.
Rebuilding from scratch from historic backups might work if you thoroughly think it through. Keep in mind that standard SCD2 change detection methods create new dimension rows and assign effectivity dates when the ETL "detects" the change. Simply running repeated loads from backup using your standard ETL would NOT work. Lets say for example you are somehow able to reload all of your data from multiple backups in one day...Today. The initial load would set the Effective_From date to some default way in the past and the Effective_To date to some default way in the future. The next sequential load would recognize the changes in your SCD2 columns, create new rows and adjust effectivity dates. The problem is that the change was detected "Today" and by default your effectivity dates would reflect "Today" rather than when the change really occurred sometime in the past. When you adjust and assign the effectivity dates, you need to set them relative to the date of the backup...not Today. If the business requires you to recreate dimensional history, this is probably the safest approach but as I said, you need to think it through. I suspect for most people this is not an option since totally rebuilding from scratch may takes months to process.
Hope that helps!
bgray- Posts : 8
Join date : 2009-02-10
Re: Adding new fields to a live system
Thanks for that.
Pretty much as I feared!
Our ETL is designed to distinguish between the date the data is loaded in to the Data Warehouse Database and the date that the source data is representing - i.e. the date of the backup.
This was so that we could load at least some historic type 2 data.
I think the business case is going to have to be really strong to keep re-loading historic data - as you say, it takes a long time!
thanks
Pretty much as I feared!
Our ETL is designed to distinguish between the date the data is loaded in to the Data Warehouse Database and the date that the source data is representing - i.e. the date of the backup.
This was so that we could load at least some historic type 2 data.
I think the business case is going to have to be really strong to keep re-loading historic data - as you say, it takes a long time!
thanks
D_Pons- Posts : 16
Join date : 2009-02-10
Location : UK
It depends if you have to update history
If the new column only applies to data loaded into the DW after it is added to the dimension table, then it's a simple process. Add the column, and treat the old rows as if the data came in with a null in the new column.
If you need the new column to be polulated for history, then you have no choice but to reload all of the historical data.
If you need the new column to be polulated for history, then you have no choice but to reload all of the historical data.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Adding new fields to a live system
let me try that myself....
Last edited by jamiehout on Thu Dec 30, 2010 1:48 am; edited 1 time in total (Reason for editing : want to see my signature)
jamiehout- Posts : 3
Join date : 2010-12-22
Re: Adding new fields to a live system
May be you will try.
jimmycarter165- Posts : 1
Join date : 2011-04-19
Re: Adding new fields to a live system
Hi
You can find this info by using search box in the top of website with some keywords related before posting questions.
You can find this info by using search box in the top of website with some keywords related before posting questions.
hamburg113- Posts : 3
Join date : 2011-04-22
Re: Adding new fields to a live system
I agree with you man.
smithcarter210- Posts : 2
Join date : 2011-05-12
Age : 39
Re: Adding new fields to a live system
hamburg113 wrote:Hi
You can find this info by using search box in the top of website with some keywords related before posting questions.
Thank you. I was trying to find that.
alank- Posts : 1
Join date : 2011-07-28
Re: Adding new fields to a live system
thank you to share such a interesting and help full post with us, i like it very much and i hope that you will keep it up
BISE peshawaar
BISE peshawaar
sabila1230- Posts : 2
Join date : 2011-08-11
Re: Adding new fields to a live system
sabila1230 wrote:thank you to share such a interesting and help full post with us, i like it very much and i hope that you will keep it up
i agree with you sabila...
________________________
http://termlifeinsurancemales.com/life-insurance-with-health-risks/life-insurance-for-people-with-diabetes
http://termlifeinsurancemales.com/life-insurance-by-state/california-life-insurance
jackkutcher- Posts : 1
Join date : 2012-01-17
Re: Adding new fields to a live system
Has it been resolved then guys?
__________________
http://www.fitnesssupplementreviews.com/decatrim-reviews/ and http://www.fitnesssupplementreviews.com/hydroxycut-review/
__________________
http://www.fitnesssupplementreviews.com/decatrim-reviews/ and http://www.fitnesssupplementreviews.com/hydroxycut-review/
james565- Posts : 1
Join date : 2012-04-18
Similar topics
» employee dimension - adding extra fields
» How to model dimension data including dynamic fields from the OLTP system?
» Dimension fields depend on other fields
» Attn: Joy - Please respond :) re: SSIS 2008 Adv Works Model
» updating a live database
» How to model dimension data including dynamic fields from the OLTP system?
» Dimension fields depend on other fields
» Attn: Joy - Please respond :) re: SSIS 2008 Adv Works Model
» updating a live database
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|