Repopulating a dw for specific customers
4 posters
Page 1 of 1
Repopulating a dw for specific customers
Hi,
I am new to datawarehousing and have created a small datawarehouse in sql2008r2.It has been running smoothly until the other day.
WThe have lost 6 months worth of a customers data due to there database becoming corrupt.The data is now back in the live system.
However i now have the task of trying to get this data into my dw.I have no idea what is the best way or how to deal with this situation should it have in the future.
i am thinking of the following,
1.Extract the customer specific details to a seperate test database,maniulate the data there and then insert it into the datawarehouse.
2.Repopulate the whole of the dw.
Option 1 could get messy as i could end up missing data.
Option 2 i think would take time,but at the same time it would mean the data is correct but would take the dw out for sometime.
Please can anyone suggest what is the best option?.I have never come across this and am really unsure what to do in this scenario.
Any advise would be very much appreciated.
Thank you
JTigger
I am new to datawarehousing and have created a small datawarehouse in sql2008r2.It has been running smoothly until the other day.
WThe have lost 6 months worth of a customers data due to there database becoming corrupt.The data is now back in the live system.
However i now have the task of trying to get this data into my dw.I have no idea what is the best way or how to deal with this situation should it have in the future.
i am thinking of the following,
1.Extract the customer specific details to a seperate test database,maniulate the data there and then insert it into the datawarehouse.
2.Repopulate the whole of the dw.
Option 1 could get messy as i could end up missing data.
Option 2 i think would take time,but at the same time it would mean the data is correct but would take the dw out for sometime.
Please can anyone suggest what is the best option?.I have never come across this and am really unsure what to do in this scenario.
Any advise would be very much appreciated.
Thank you
JTigger
jtigger- Posts : 2
Join date : 2012-04-27
Re: Repopulating a dw for specific customers
Database backups and restores.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Repopulating a dw for specific customers
Reload and do regular backups.
My worst DW experience was similar. It was in the mid-1990's at an HMO. It was a large DW for the time and operations dutifully did full and incremental backups on schedule. There was a power issue that destroyed a number of disk drives. So they went about restoring the hardware and recovering the database. They had an automated tape library for backups. It was a large machine that stored tape cartridges and mounted them into a dozen different drives. A full backup was 80-90 cartridges. As they started the restore they discovered bad tapes. It turns out that one of the tape drives was faulty and would occasionally crimp and damage a tape. They tried successive backups with the same result. After a few days they managed to restore from a good set that was around 10 weeks old. We then had to reload daily batches until we caught up. It took a few weeks.
My worst DW experience was similar. It was in the mid-1990's at an HMO. It was a large DW for the time and operations dutifully did full and incremental backups on schedule. There was a power issue that destroyed a number of disk drives. So they went about restoring the hardware and recovering the database. They had an automated tape library for backups. It was a large machine that stored tape cartridges and mounted them into a dozen different drives. A full backup was 80-90 cartridges. As they started the restore they discovered bad tapes. It turns out that one of the tape drives was faulty and would occasionally crimp and damage a tape. They tried successive backups with the same result. After a few days they managed to restore from a good set that was around 10 weeks old. We then had to reload daily batches until we caught up. It took a few weeks.
Re: Repopulating a dw for specific customers
I've found these scenarios quite common - most solutions I've been involved with strike something similar every year or two. Overall I think there has been a steady decline in the technical quality of source systems over the years. If it's not outright data corruption, its the discovery that source system timestamps that you were relying on for your incremental extract are actually not reliable...
My strategy to defend against this is to fully refresh fact tables as far as I possibly can. It can be surprising how much data you can thump in overnight on modern hardware, especially with a "massively parallel" data load design.
My first fall back position is to load incrementally on weeknights, then fully refresh each weekend. The typical incremental extract is actually my last resort.
Obviously there are wasted resources as redundant data is flowing around each night. But the result is a simpler, more robust design - which implies less effort and lower risk for development, testing, support and maintenance. You also gain a lot of design flexibility as it is far quicker and easier to add or modify fact columns.
With a full refresh design, the solution to your problem would be ... simply to wait until the next full refresh.
Good luck!
Mike
My strategy to defend against this is to fully refresh fact tables as far as I possibly can. It can be surprising how much data you can thump in overnight on modern hardware, especially with a "massively parallel" data load design.
My first fall back position is to load incrementally on weeknights, then fully refresh each weekend. The typical incremental extract is actually my last resort.
Obviously there are wasted resources as redundant data is flowing around each night. But the result is a simpler, more robust design - which implies less effort and lower risk for development, testing, support and maintenance. You also gain a lot of design flexibility as it is far quicker and easier to add or modify fact columns.
With a full refresh design, the solution to your problem would be ... simply to wait until the next full refresh.
Good luck!
Mike
Similar topics
» Many-to-many attributes for Customers and DW design
» Percentage of customers from a district
» Customers from 2 sources on different granulaties
» Is it worth modeling these customers?
» Handling customers merged via MDM
» Percentage of customers from a district
» Customers from 2 sources on different granulaties
» Is it worth modeling these customers?
» Handling customers merged via MDM
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum