Kimball Forum

How to consolidate numerous ETL processes?

Post  c Thu May 20, 2010 9:26 am

Hi,

Within the department there are numerous users whose ETL processes are nearly identical to one another (e.g. they differ by one extra variable). This not only puts a burden on the server, but also creates problems where a certain measure is not identical across the department.

Therefore we're venturing down a new path to consolidate all ETL processes into one, which will eliminate the repetitive processes but most importantly achieve a single source of truth. We have 10 data sources and an ETL tool to use for this project.

Currently, the ETL process for each user is a long piece of code that brings together the source datasets, with a lot of transposing and derived variables based on business rules.

What should we do as we venture down this new path?
Should we design a data model where all the data sources will fit neatly into the relevant dimensions?
Or should we just consolidate all the ETL code into a single process?
What will be the pros and cons of each alternative?

Thanks.

c

Posts : 3
Join date : 2009-08-20

Post  ngalemmo Thu May 20, 2010 12:25 pm

It sounds like you have a data repository that was built piecemeal without any overall vision or structure. The problem you have isn't the fact that there is a lot of ETL code, but, more importantly, that there is no integrated view of the data, i.e. "certain measure is not identical across the department."

The best approach is to redo the model.
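
To make the "integrated view" point concrete, here is a minimal sketch in Python of defining a measure once and letting every user read it from one shared fact set, instead of each user's ETL recomputing it. Everything named here (`Measure`, `SHARED_MEASURES`, `net_revenue`, the column names) is hypothetical and only for illustration; it is not taken from the original poster's system or from any particular ETL tool.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, float]


@dataclass(frozen=True)
class Measure:
    """A business rule defined once and shared by every consumer."""
    name: str
    compute: Callable[[Row], float]


# Hypothetical shared rule: the single place where net_revenue is defined.
SHARED_MEASURES: List[Measure] = [
    Measure("net_revenue", lambda r: r["gross_revenue"] - r["discounts"]),
]


def build_fact_rows(source_rows: List[Row],
                    measures: List[Measure] = SHARED_MEASURES) -> List[Row]:
    """Run once per load: copy each source row and add every shared measure."""
    facts = []
    for row in source_rows:
        fact = dict(row)
        for measure in measures:
            fact[measure.name] = measure.compute(row)
        facts.append(fact)
    return facts


def user_view(fact_rows: List[Row], columns: List[str]) -> List[Row]:
    """A user's 'extra variable' becomes a column selection, not a new ETL."""
    return [{c: row[c] for c in columns} for row in fact_rows]
```

With this shape, two users who want different column sets call `user_view` on the same fact rows, so a measure such as `net_revenue` cannot drift between them.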
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Post  c Thu May 20, 2010 9:17 pm

ngalemmo wrote: It sounds like you have a data repository that was built piecemeal without any overall vision or structure. The problem you have isn't the fact that there is a lot of ETL code, but, more importantly, that there is no integrated view of the data, i.e. "certain measure is not identical across the department."

The best approach is to redo the model.

Thanks!

c

Posts : 3
Join date : 2009-08-20

Post  sgudavalli Thu Jun 10, 2010 10:28 am

First, we need to list a few things for each ETL process in the department:

1) At what time is the content available for the ETL process?
2) Does the ETL process have any dependencies on external systems?
3) Are the schema changes, latency, etc. (the outcomes of the new path) acceptable to the downstream applications of these ETL processes?

If they are acceptable, then we can go ahead and rebuild.

If not, we can fill in the gaps with a federated approach.
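
The checklist above could be captured as a small inventory, one record per existing ETL process. This is only a rough Python sketch: the class name, field names, and the `recommend` function are hypothetical, and the decision rule is a simplification of the third question.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EtlProcessProfile:
    """Answers to the three questions for one existing ETL process."""
    name: str
    content_available_at: str          # 1) when the source content is ready, e.g. "02:00"
    external_dependencies: List[str]   # 2) external systems the process waits on
    downstream_accepts_changes: bool   # 3) are schema/latency changes acceptable?


def recommend(profile: EtlProcessProfile) -> str:
    """Return 'rebuild' if downstream can absorb the change, else 'federate'."""
    return "rebuild" if profile.downstream_accepts_changes else "federate"


def plan(profiles: List[EtlProcessProfile]) -> dict:
    """Group process names by the recommended approach."""
    result = {"rebuild": [], "federate": []}
    for profile in profiles:
        result[recommend(profile)].append(profile.name)
    return result
```

Filling this in for every process gives a quick picture of which ones can move to the consolidated model right away and which need a federated stopgap.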

Regards
Shiv

sgudavalli

Posts : 29
Join date : 2010-06-10
Age : 36
Location : Pune, India
