Daily snapshot fact table-any chance to reduce data volume?
3 posters
Page 1 of 1
Daily snapshot fact table-any chance to reduce data volume?
Hi,
our customer wants to have a daily snapshot fact table, so they can see how many customers they have on a specific date. This alone would be easy to solve, but they also want to able to see how many male/female customers they have, where they live etc. So every day round about 2 mio rows have to be load in a fact table. Is there any way to reduce the data volume? I think only a fraction of these are changing over a day. Thought about adding additional columns (start date, end date) but it doesnīt solve the problem.
So, is there any other technique to avoid loading 2 mio rows per day?
our customer wants to have a daily snapshot fact table, so they can see how many customers they have on a specific date. This alone would be easy to solve, but they also want to able to see how many male/female customers they have, where they live etc. So every day round about 2 mio rows have to be load in a fact table. Is there any way to reduce the data volume? I think only a fraction of these are changing over a day. Thought about adding additional columns (start date, end date) but it doesnīt solve the problem.
So, is there any other technique to avoid loading 2 mio rows per day?
e2patrick- Posts : 5
Join date : 2011-07-13
Re: Daily snapshot fact table-any chance to reduce data volume?
As you describe it, the table does not make sense if you have a sales fact, as you can easily get the number of customers for any day by looking at who you sold to.
So, it should be safe to assume that what they have in mind is some form of aggregate, meaning you would not be loading millions of rows a day. Find out what attributes they are interested in. You mention sex and location (city? postal code?). I am sure there are a few others. Build an aggregate fact table using dimensions (other than the customer dimension) to cover the attributes you need. Create junk dimensions if necessary.
So, it should be safe to assume that what they have in mind is some form of aggregate, meaning you would not be loading millions of rows a day. Find out what attributes they are interested in. You mention sex and location (city? postal code?). I am sure there are a few others. Build an aggregate fact table using dimensions (other than the customer dimension) to cover the attributes you need. Create junk dimensions if necessary.
Re: Daily snapshot fact table-any chance to reduce data volume?
Customer data is dimensional data. Do your count from your customer dimension.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Daily snapshot fact table-any chance to reduce data volume?
I think I didnīt describe it exactly. The customer not really sell anything, itīs more a service they offer and they like to know how many customers are registered on a specific day. Itīs like an inventory of a warehouse and the customers are the products.
Usually we would be a fact table that have a date key and customer key as references and a customer and date dimension. In this case we have to load about 2 mio customers a day and this would be over 700 mio a year and so on.
I hope that you can now understand our problem better.
Usually we would be a fact table that have a date key and customer key as references and a customer and date dimension. In this case we have to load about 2 mio customers a day and this would be over 700 mio a year and so on.
I hope that you can now understand our problem better.
e2patrick- Posts : 5
Join date : 2011-07-13
Re: Daily snapshot fact table-any chance to reduce data volume?
Just use date ranges so there is just one row per 'active' period rather than one row per day.
Re: Daily snapshot fact table-any chance to reduce data volume?
Can you give a small example? We thought about something like date ranges, but donīt know exactly how to do it. Any source in the Internet where we can read about it?
Thanks!
Thanks!
e2patrick- Posts : 5
Join date : 2011-07-13
Re: Daily snapshot fact table-any chance to reduce data volume?
Instead of one date you have an effective and expiration date. For active customer you use a standard future date for the expiration date (such as 1/1/3000). To locate active customer for a particular date, query where the desired date is between the effective and expiration dates.
Similar topics
» Reduce Rows in Inventory Snapshot Fact Table
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
» Large volume of hospital data into fact table
» Variable period data in a single periodic snapshot fact table
» Accumulating Snapshot Fact Table Data Model (Order Management)
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
» Large volume of hospital data into fact table
» Variable period data in a single periodic snapshot fact table
» Accumulating Snapshot Fact Table Data Model (Order Management)
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum