updating a live database
3 posters
Page 1 of 1
updating a live database
Hello,
We get the previous day transactions at 8 am in the morning. It takes about four hours to load the data into the dw. So, the load happens during the regular business hours while the users are actively querying the database. One of the questions I was asked today is how do we ensure the integrity of the data when the data is being consumed as it gets updated. Some records may still show the data that is two days old and some others may show the current as of yesterday.
Here is a tentative solution that was recommended by someone: We will have two databases/schemas, lets call db1 and db2. The reports will run off of db1. We will have an exact copy of it as db2. We will load the daily jobs into db2 while the reports are being run from db1. Once the load is complete for that day, get the users out of the system for a brief period of time, drop db1, rename db1 to db2, bring the users back, then copy db1 to db2. Lather, rinse and repeat.
First time when I heard this, all I said was 'crazeeee'. This is way overkill if you ask me. But what other options do we have?
Any ideas anyone?
We get the previous day transactions at 8 am in the morning. It takes about four hours to load the data into the dw. So, the load happens during the regular business hours while the users are actively querying the database. One of the questions I was asked today is how do we ensure the integrity of the data when the data is being consumed as it gets updated. Some records may still show the data that is two days old and some others may show the current as of yesterday.
Here is a tentative solution that was recommended by someone: We will have two databases/schemas, lets call db1 and db2. The reports will run off of db1. We will have an exact copy of it as db2. We will load the daily jobs into db2 while the reports are being run from db1. Once the load is complete for that day, get the users out of the system for a brief period of time, drop db1, rename db1 to db2, bring the users back, then copy db1 to db2. Lather, rinse and repeat.
First time when I heard this, all I said was 'crazeeee'. This is way overkill if you ask me. But what other options do we have?
Any ideas anyone?
scott.kan- Posts : 3
Join date : 2012-05-02
Re: updating a live database
How about transactions? Any update process against a relational database needs to operate in a transaction state. Database integrity rules are such that no concurrent user will see any changes performed by another transaction. Only after the transaction has finished and a commit performed will the updates become visible to new queries.
There are other issues more pertinent to your situation. First is resource contention during the day, second is what does it mean to load in the middle of a business day? Reports run in the morning will be missing data of reports run in the afternoon. If consistency during the day is more important than the currency of the data (day old versus 2 days old), you may want to consider delaying the load until after business hours.
There are other issues more pertinent to your situation. First is resource contention during the day, second is what does it mean to load in the middle of a business day? Reports run in the morning will be missing data of reports run in the afternoon. If consistency during the day is more important than the currency of the data (day old versus 2 days old), you may want to consider delaying the load until after business hours.
Re: updating a live database
Very valid points. We do have the resource contention. Delaying until after the business hours is something I suggested which is on the table. However as you can expect, this option will not go well with some users. Hence we are looking for alternate solutions for this problem.
scott.kan- Posts : 3
Join date : 2012-05-02
Re: updating a live database
If you decide to load during the day, you separate fact loading from dimension loading. Do dimensions first. If the dimension is a type 2 dimension, users will never see the update with the current fact tables, since no fact table would reference a newly added dimension row.
Load all the fact tables at once if possible. Preprocess the source data to produce a temporary table containing the new fact rows. When all the preprocessing (validation, key assignment, etc...) is done, insert the rows from the temp tables to the target fact table. This will minimize the time window when specific fact tables may be out of sync. Besides, even if they are, it only matters if a query being run at the time needs both fact tables.
The only issue is type 1 or type 3 dimensions. After you run a process that updates the dimension, new queries will see the updated dimension. This may affect some reports as they may be using an attribute that changed. However, dimension updates are relatively small compared to transaction volumes. This shortcoming may have no material effect on most reports. Other than this case, a user should never see dimensions out of sync with the facts.
Load all the fact tables at once if possible. Preprocess the source data to produce a temporary table containing the new fact rows. When all the preprocessing (validation, key assignment, etc...) is done, insert the rows from the temp tables to the target fact table. This will minimize the time window when specific fact tables may be out of sync. Besides, even if they are, it only matters if a query being run at the time needs both fact tables.
The only issue is type 1 or type 3 dimensions. After you run a process that updates the dimension, new queries will see the updated dimension. This may affect some reports as they may be using an attribute that changed. However, dimension updates are relatively small compared to transaction volumes. This shortcoming may have no material effect on most reports. Other than this case, a user should never see dimensions out of sync with the facts.
Re: updating a live database
Thank you for all the pointers. They all make sense to me. I understand that I have to try and re-architect the integration architecture, given the complex requirement.
scott.kan- Posts : 3
Join date : 2012-05-02
Re: updating a live database
The part that confuses me is the join, the concept of loading the fact into a new table makes perfect sense, however I assumed the problem would be how to know which dimension record to use for the fact... since none of the fact attributes exist in the dimension and the PK in the dim no longer exists.
________________
watson
________________
watson
Blackcat- Posts : 1
Join date : 2014-12-17
Similar topics
» Attn: Joy - Please respond :) re: SSIS 2008 Adv Works Model
» Live office connection with Bo4 UNX universe
» Adding new fields to a live system
» Updating facts
» Updating Fact!
» Live office connection with Bo4 UNX universe
» Adding new fields to a live system
» Updating facts
» Updating Fact!
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum