Warehouse as source for another Warehouse
5 posters
Page 1 of 1
Warehouse as source for another Warehouse
We will be needing both fact and dimension information for a new subject area in our warehouse. The data we need exists in a different warehouse in the same company (star schema, but a different platform, different tool suite etc.).
The issue is whether we should source from the other warehouse, or build our own ETL from the original source. I'm wondering if there's some academic, theoretical, or common practice favoring one side or the other. We can always just decide based on what we think is best...
Some say we should pull from the system of record because a) it's the system of record and b) we don't want to be subject to changes in the other warehous. Others say we should pull from the other warehouse because a) the business rules and transformations have already been applied, why do all that extra work, and b) if we had different transformations, we'd have to worry about reconciliation and make changes in two places.
Is there a "correct" answer in theory or practice? I'm inclined to pull from the other warehouse, we can get cooperation and notification on changes there. But someone will say the only legitimate way is to pull from the source. Thanks for any direction.
(Yes we should have one warehouse, with different marts using all conformed dimensions that get loaded once from a common source. Not happening, at least anywhere near the time we need to get this new subject area going).
The issue is whether we should source from the other warehouse, or build our own ETL from the original source. I'm wondering if there's some academic, theoretical, or common practice favoring one side or the other. We can always just decide based on what we think is best...
Some say we should pull from the system of record because a) it's the system of record and b) we don't want to be subject to changes in the other warehous. Others say we should pull from the other warehouse because a) the business rules and transformations have already been applied, why do all that extra work, and b) if we had different transformations, we'd have to worry about reconciliation and make changes in two places.
Is there a "correct" answer in theory or practice? I'm inclined to pull from the other warehouse, we can get cooperation and notification on changes there. But someone will say the only legitimate way is to pull from the source. Thanks for any direction.
(Yes we should have one warehouse, with different marts using all conformed dimensions that get loaded once from a common source. Not happening, at least anywhere near the time we need to get this new subject area going).
pbackstrom- Posts : 3
Join date : 2012-08-10
Re: Warehouse as source for another Warehouse
If the other data warehouse can be treated as the 'source of truth' for the processed data you are looking for - I would strongly suggest pulling from it. Applying the same logic at 2 different places can result in significant effort to reconcile data and reduce business confidence if ever the numbers are different, requiring reconciliation regularly.
vickyejain- Posts : 7
Join date : 2012-08-20
Re: Warehouse as source for another Warehouse
If the other source already exists, it already is in star-schema, and it already correctly applies business rules and transformations ..... Why do you need to do anything?
LAndrews- Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada
Re: Warehouse as source for another Warehouse
You already know the correct answer in theory, "Yes we should have one warehouse, with different marts using all conformed dimensions that get loaded once from a common source...". In practice, I would look to load from the intermediate staging tables instead of the final fact and dimension tables. This will allow your ETL processing to kick off earlier and reduce your dependencies on upstream processing. It will also ideally reduce the amount of data you need to process on a daily basis. Don't be surprised when the political infighting starts when the numbers from each of the "replicated" fact tables do not match.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Warehouse as source for another Warehouse
I work in a Telco where we love data warehousing (we have at least 6 large scale datawarehouses, plus many more smaller reporting systems).
The principle we work on is "Always go directly to the source". If the other data warehouse is decommissioned or falls over - you're screwed. Do you know that the other system's numbers are even correct?
The principle we work on is "Always go directly to the source". If the other data warehouse is decommissioned or falls over - you're screwed. Do you know that the other system's numbers are even correct?
Re: Warehouse as source for another Warehouse
Thanks. Yes, your points are part of our issue too.
Right now it looks like we'll be retaining control of the "source" warehouse in our same department, so hopefully if it ever gets shut down by the business, we'll have time to code to it. I think, too, that it means the data is about as correct as it would be if we went to the source as second time.
Right now it looks like we'll be retaining control of the "source" warehouse in our same department, so hopefully if it ever gets shut down by the business, we'll have time to code to it. I think, too, that it means the data is about as correct as it would be if we went to the source as second time.
pbackstrom- Posts : 3
Join date : 2012-08-10
Similar topics
» data warehouse and data warehouse system
» Using the Dimensional Data Warehouse as source data for the OLTP process
» Is it a best practice that Data warehouse follows the source system data type?
» when source contains surrogate key instead of natural key
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
» Using the Dimensional Data Warehouse as source data for the OLTP process
» Is it a best practice that Data warehouse follows the source system data type?
» when source contains surrogate key instead of natural key
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum