Daily Summaries in Data Warehouse

View previous topic View next topic Go down

Daily Summaries in Data Warehouse

Post  pankaj.tiwari on Tue Dec 10, 2013 2:52 am

Hi,

I am maintaining daily aggregate and monthly aggregates for my Data warehouse.

The only problem is that I have late arriving Facts. i.e. a fact record can come as late as 2 months but 90% of facts will arrive on first 7 days.

So if I am doing daily aggregates how can I do aggregation of late arriving Facts. What is the right approach.
Also what is the right partitioning strategy for Daily aggregates : Should it be ETL process date or business date.

Note: Daily records can be around 150 million

Regards,
Pankaj

pankaj.tiwari

Posts : 2
Join date : 2013-12-10

View user profile

Back to top Go down

Re:Daily Summaries in Data Warehouse

Post  hkandpal on Tue Dec 10, 2013 9:57 am

Hi ,

generally it depends upon the data on which you are doing summary and generating report on that, if they are for goverment/authority reporting then we dont update the data for that day, but create a new entry when the updated fact information was received.

When you are generating report based on business date then the you should use the business date as the partitioning column.

thanks

hkandpal

Posts : 113
Join date : 2010-08-16

View user profile

Back to top Go down

Re: Daily Summaries in Data Warehouse

Post  gvarga on Thu Dec 12, 2013 5:25 am

Hi,

I suggest to have 2 various day keys not only in your elementary fact table but in your aggregate table as well:

Business day key and ETL day key.

In your aggregate table you will group fact data according these 2 day keys ( and other dimension keys of course). That means for a given business day there might be more rows with various ETL day and you will not update aggregate data but insert new rows.

So you can follow in your database the arrival of your data for a specific business day, even you can have report about loading.

gvarga

Posts : 43
Join date : 2010-12-15

View user profile

Back to top Go down

Daily Summaries in Data Warehouse

Post  pankaj.tiwari on Thu Dec 12, 2013 7:18 am

My concern....

Say I have 100 CDR's for 1st of Dec out of which 90 CDRs are received by warehouse and 10 CDRs been late arriving will come on 2nd  of Dec.

I have partition my Transaction Fact(TF) on Business date and Summary Fact (SF) also on Business date.
On 1st TF will have 90 records on 1 Dec partition and the aggregated records would also be on 1st Dec partition in SF.
On 2nd Dec I got another 100 CDR (90 CDR for 2nd and 10 CDR of 1st).
Since my TF is partitioned on Business Date the 10 CDR will go to 1st partition and 90 CDR's will go into 2nd Partition, which is perfect
But if I do summary aggregates for 2nd I will only select records from 2nd partition and aggregate those 90 CDR's.

How can I aggregate those remaining 10 CDR's which should as per rule should be in 1st Dec partition.

Any workaround solution would be much appreciated.

pankaj.tiwari

Posts : 2
Join date : 2013-12-10

View user profile

Back to top Go down

Re: Daily Summaries in Data Warehouse

Post  BoxesAndLines on Thu Dec 12, 2013 4:50 pm

If your aggregates are views, you don't have to do anything. If the aggregates are tables, simply rebuild any aggregate fact that had an underlying partition change.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Daily Summaries in Data Warehouse

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top


 
Permissions in this forum:
You cannot reply to topics in this forum