Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Staging area.. is it a must?

4 posters

Go down

Staging area.. is it a must? Empty Staging area.. is it a must?

Post  umutiscan Fri Jun 27, 2014 7:57 am

Hi,

Target data warehouse model can be feeded directly from source systems. I mean, I can directly read from source, transform the data in ETL jobs, and then load the target data model.

Or I may have two flows. First, I can copy source data to staging layer and then I can feed target model from staging layer.

Dou you think that staging layer is a must for DW architecture? In my opininon, it is a must. Because we have to validate our transformations and final model with source data. How can I be sure that source to target mappings are processing without any error, if I don't see the data before transformation.
On the other hand, we request some test data to verify our transformations during test phases, and I need to be sure that source system is generated all the data that I need to validate my transformations.

Thanks in advance..

umutiscan

Posts : 51
Join date : 2010-11-26
Age : 43
Location : Istanbul, Turkey

Back to top Go down

Staging area.. is it a must? Empty Re: Staging area.. is it a must?

Post  nick_white Fri Jun 27, 2014 10:27 am

Hi - I guess it depends on what aspect of the whole ETL process it is that you are looking at.

If you can technically implement your ETL process from source to target without using staging tables then there is no technical reason why you shouldn't. That doesn't necessarily mean it is a good technical design though. For example, if I had multiple different source writing to the same target I could write different ETL processes for each source but it wouldn't be very efficient - instead I normally would write separate ETL processes to write to a common staging format and then a single ETL process to write from staging to target.

However, as you say, if you don't use staging tables then you have no record of the source records you've extracted, no record of how you might have transformed them prior to loading into your target and is likely to make your ETL unnecessarily complicated.

Given that the cost of storage these days is effectively zero (i'm now channelling Mr Kimball whose ETL course I attended a few weeks ago!) I would say that using staging tables is a no-brainer for 99% of all cases

nick_white

Posts : 364
Join date : 2014-01-06
Location : London

Back to top Go down

Staging area.. is it a must? Empty Re: Staging area.. is it a must?

Post  hang Sat Jun 28, 2014 9:57 pm

Nick made a good point that staging is not a must, but recommended for most serious data warehouses. Especially for big transaction data, staging delta data is a performance booster for down-stream activities like aggregations, calculated measures, and unavoidably, surrogate dimension key pipeline process. For reconciliation purpose, staging could also be useful to provide you snapshots of source data at the time of loading while source system might have lost those snapshots.

I would not stage small dimension data only to replicate them in DW without significant transformation, or that come from well implemented master data system. I would only stage data for these purposes: Improve the performance for DW transformation and source system online operation, divide the complex transformation into modules or preserve the historical snapshot for data lineage and reconciliation.

hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Staging area.. is it a must? Empty Re: Staging area.. is it a must?

Post  ngalemmo Mon Jun 30, 2014 4:42 pm

The main reason for pulling data from a source and depositing it in a file or table somewhere else is to avoid having to go back to the operational data if a problem occurs later in the process. The data extracts put a load on the operational system, so you want to keep them as simple as possible and avoid having to do it again. The staged data gives you a restart point should there be an issue with the load process.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Staging area.. is it a must? Empty Re: Staging area.. is it a must?

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum