Interface/Views Between the ETL and Source Data
3 posters
Page 1 of 1
Interface/Views Between the ETL and Source Data
I have always been of the opinion that the ETL for the data warehouse should have virtually unobstructed access to the source data. This way, the BI/DW team can then be completely accountable for the integrity of the data in the data warehouse. I like to refer to it as the 'chain of custody', liken unto evidence on a crime show.
The owners of our operational/transactional system are insisting on establishing an interface, in this case views, through which the ETL would pull data. Their reasoning is that they would know immediately when a development change to the underlying database breaks the view and therefore avoid breaking our ETL. I understand the benefit here but feel strongly that there should be no filters/layers/interfaces introduced between the source and the data warehouse processes. The ETL, in my opition, should be that layer/interface.
Having views will leave open the possibility of modifying them for convenience without the knowledge of the BI/DW team, affecting our ability to guarantee the accuracy of our deliverables. Also, whenever we need to adjust our ETL to pull additional data, we will have the additional dependency now of requesting changes to the views first.
There was a posting on this a couple of years ago, but I would really like to hear an opinion on this from the Kimball folks, or have a reference to an article where they may have already addressed it.
Any help here would be appreciated.
Thanks,
Bob L.
The owners of our operational/transactional system are insisting on establishing an interface, in this case views, through which the ETL would pull data. Their reasoning is that they would know immediately when a development change to the underlying database breaks the view and therefore avoid breaking our ETL. I understand the benefit here but feel strongly that there should be no filters/layers/interfaces introduced between the source and the data warehouse processes. The ETL, in my opition, should be that layer/interface.
Having views will leave open the possibility of modifying them for convenience without the knowledge of the BI/DW team, affecting our ability to guarantee the accuracy of our deliverables. Also, whenever we need to adjust our ETL to pull additional data, we will have the additional dependency now of requesting changes to the views first.
There was a posting on this a couple of years ago, but I would really like to hear an opinion on this from the Kimball folks, or have a reference to an article where they may have already addressed it.
Any help here would be appreciated.
Thanks,
Bob L.
Bob L- Posts : 1
Join date : 2011-07-12
Re: Interface/Views Between the ETL and Source Data
I would not care one way or the other wither I used views or accessed the table directly. Technically it doesn't really matter.
The stronger argument is the issues you raised about changes and communication of those changes. The idea that they can 'fix it' (whatever 'it' is) in a view without affecting ETL is plainly ignorant. Sure, they can make a technical change to satisfy the technical requirements of a view, but what are the data and business implications? ETL should not be 'protected' from change. The ETL/BI/DW team needs to be actively involved in being aware of and assessing the impact of changes to the source system.
The reason I say view or table doesn't matter is regardless of how its done, you still have problems unless the organization recognizes the need to keep you in the loop when things change. Just as they could change a view without informing you, they could just as easily change a table without you knowing about it.
The stronger argument is the issues you raised about changes and communication of those changes. The idea that they can 'fix it' (whatever 'it' is) in a view without affecting ETL is plainly ignorant. Sure, they can make a technical change to satisfy the technical requirements of a view, but what are the data and business implications? ETL should not be 'protected' from change. The ETL/BI/DW team needs to be actively involved in being aware of and assessing the impact of changes to the source system.
The reason I say view or table doesn't matter is regardless of how its done, you still have problems unless the organization recognizes the need to keep you in the loop when things change. Just as they could change a view without informing you, they could just as easily change a table without you knowing about it.
Re: Interface/Views Between the ETL and Source Data
I wonder how the transaction system guys will react to query missing a join between 2 big tables or big queries that pound the server.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Similar topics
» Getting Dimension Source Data from Existing Views
» Alternative views of data
» Using the Dimensional Data Warehouse as source data for the OLTP process
» Extracting data when there is no timestamp in the source system
» Designing data marts from an EAV data source
» Alternative views of data
» Using the Dimensional Data Warehouse as source data for the OLTP process
» Extracting data when there is no timestamp in the source system
» Designing data marts from an EAV data source
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|