Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Daily snapshot fact table-any chance to reduce data volume?

3 posters

Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Daily snapshot fact table-any chance to reduce data volume?

Post  e2patrick Wed Oct 05, 2011 1:03 pm

Hi,

our customer wants to have a daily snapshot fact table, so they can see how many customers they have on a specific date. This alone would be easy to solve, but they also want to able to see how many male/female customers they have, where they live etc. So every day round about 2 mio rows have to be load in a fact table. Is there any way to reduce the data volume? I think only a fraction of these are changing over a day. Thought about adding additional columns (start date, end date) but it doesnīt solve the problem.

So, is there any other technique to avoid loading 2 mio rows per day?

e2patrick

Posts : 5
Join date : 2011-07-13

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  ngalemmo Wed Oct 05, 2011 1:18 pm

As you describe it, the table does not make sense if you have a sales fact, as you can easily get the number of customers for any day by looking at who you sold to.

So, it should be safe to assume that what they have in mind is some form of aggregate, meaning you would not be loading millions of rows a day. Find out what attributes they are interested in. You mention sex and location (city? postal code?). I am sure there are a few others. Build an aggregate fact table using dimensions (other than the customer dimension) to cover the attributes you need. Create junk dimensions if necessary.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  BoxesAndLines Wed Oct 05, 2011 2:06 pm

Customer data is dimensional data. Do your count from your customer dimension.
BoxesAndLines
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  e2patrick Thu Oct 06, 2011 3:37 am

I think I didnīt describe it exactly. The customer not really sell anything, itīs more a service they offer and they like to know how many customers are registered on a specific day. Itīs like an inventory of a warehouse and the customers are the products.
Usually we would be a fact table that have a date key and customer key as references and a customer and date dimension. In this case we have to load about 2 mio customers a day and this would be over 700 mio a year and so on.
I hope that you can now understand our problem better.

e2patrick

Posts : 5
Join date : 2011-07-13

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  ngalemmo Thu Oct 06, 2011 10:45 am

Just use date ranges so there is just one row per 'active' period rather than one row per day.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  e2patrick Thu Oct 06, 2011 11:29 am

Can you give a small example? We thought about something like date ranges, but donīt know exactly how to do it. Any source in the Internet where we can read about it?

Thanks!

e2patrick

Posts : 5
Join date : 2011-07-13

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  ngalemmo Thu Oct 06, 2011 12:49 pm

Instead of one date you have an effective and expiration date. For active customer you use a standard future date for the expiration date (such as 1/1/3000). To locate active customer for a particular date, query where the desired date is between the effective and expiration dates.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Daily snapshot fact table-any chance to reduce data volume? Empty Re: Daily snapshot fact table-any chance to reduce data volume?

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum