Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Data redundancy in the staging area

3 posters

Go down

Data redundancy in the staging area Empty Data redundancy in the staging area

Post  figareg Mon Sep 26, 2011 11:55 am

I have to design the ETL process and staging area for a new data warehouse and I am not sure whether it makes sense to keep a copy of all dimension and fact tables in the staging area.

The ETL process will be fed by a daily extract from SAP. Master data like material, customer etc. come as a complete download whereas transactional data come in daily buckets.

It seems like a waste of storage to keep the tables in the staging area and then copy them into the final data marts. Staging area and presentation area are within the same data base so I can easily access the tables in the presentation area if necessary.

Regards,
Fabio

figareg

Posts : 2
Join date : 2011-09-26
Location : Germany

Back to top Go down

Data redundancy in the staging area Empty Re: Data redundancy in the staging area

Post  BoxesAndLines Mon Sep 26, 2011 12:48 pm

I would design the process to load the facts and dimensions directly.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Data redundancy in the staging area Empty Re: Data redundancy in the staging area

Post  ngalemmo Mon Sep 26, 2011 11:08 pm

All you need to do is compare the incoming table with the dimension and update/insert if necessary. If needed you may have to reformat or transform the source before you compare.

An easy way to compare is to calculate a CRC code on the source, and when you insert or update, store the new CRC in the dimension. To check for change all you need to do is compare the new CRC with the CRC on the dimension row.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Data redundancy in the staging area Empty Re: Data redundancy in the staging area

Post  figareg Thu Sep 29, 2011 3:43 am

Thanks,

I thought it would make sense to compare directly with the final tables but was not sure.

Fabio

figareg

Posts : 2
Join date : 2011-09-26
Location : Germany

Back to top Go down

Data redundancy in the staging area Empty Re: Data redundancy in the staging area

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum