Calculated attributes in Customer Dimension?
+2
BoxesAndLines
boernard
6 posters
Page 1 of 1
Calculated attributes in Customer Dimension?
Hi there,
I am building a DWH for an e-commerce shop. The customer dimension is connected to the transaction fact table and i am wondering if I should implement calculated figures in the customer dimension.
For example:
customer_lifetime_value
total_revenue_by_customer
total_amount_of_returns
abc_analysis_segment
...
Is that ok to do so or are there any pitfalls?
I am building a DWH for an e-commerce shop. The customer dimension is connected to the transaction fact table and i am wondering if I should implement calculated figures in the customer dimension.
For example:
customer_lifetime_value
total_revenue_by_customer
total_amount_of_returns
abc_analysis_segment
...
Is that ok to do so or are there any pitfalls?
boernard- Posts : 13
Join date : 2012-01-19
Re: Calculated attributes in Customer Dimension?
The pitfall is you have to update it every time you receive a new transaction.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Calculated attributes in Customer Dimension?
Segmentation analysis I presume...
Updating a presumably very large customer table on a periodic basis can be unwieldy on most DBMS platforms. You may want to consider 'shadow' dimensions. Basically, another table with the same PK as the dimension it is shadowing. Or, another way to think of it, is as an aggregate fact table with only one dimension FK.
Updating a presumably very large customer table on a periodic basis can be unwieldy on most DBMS platforms. You may want to consider 'shadow' dimensions. Basically, another table with the same PK as the dimension it is shadowing. Or, another way to think of it, is as an aggregate fact table with only one dimension FK.
Re: Calculated attributes in Customer Dimension?
Thanks for the reply! Yes, thought of that "shadow dimension". I think I will put all the information in the customer dimension.
boernard- Posts : 13
Join date : 2012-01-19
Re:Calculated attributes in Customer Dimension?
How frequently you plan to update the summary number, will it be as the detail is updated for a customer or will it be once a day or weekly or monthly ? How big is your customer table and How many transactions are happening.
thanks
Himanshu
thanks
Himanshu
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Calculated attributes in Customer Dimension?
I plan to update the figures every night. We have aroun 20-40k customers, numbers growing. The calculation itself shouldn't be a too big burden for the server as i plan to implement this in PLpython / pandas (big recommendation for analysis by the way!)
boernard- Posts : 13
Join date : 2012-01-19
Re: Calculated attributes in Customer Dimension?
One pitfall is having a rapidly changing dimension because of rapidly changing attributes.
Customer facts like total_revenue_by_customer, total_amount_of_returns etc might be put in a fact table and calculated periodically. This will be useful for trend analysis and some aggregation operations.
If you can easily calculate customer_lifetime_value, I don't prefer to put this information as a new column on customer dimension. This may be calculated at reporting level with a simple database function.
I recommend you to read the article below
http://www.kimballgroup.com/html/articles_search/articles1999/9908bIE.html?TrkID=IE199908_1
Customer facts like total_revenue_by_customer, total_amount_of_returns etc might be put in a fact table and calculated periodically. This will be useful for trend analysis and some aggregation operations.
If you can easily calculate customer_lifetime_value, I don't prefer to put this information as a new column on customer dimension. This may be calculated at reporting level with a simple database function.
I recommend you to read the article below
http://www.kimballgroup.com/html/articles_search/articles1999/9908bIE.html?TrkID=IE199908_1
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Often changed dimension information
Simple scenario:
Customer dimension
FACT table of sales activity (purchases, credits - NOT payments, applied transactions, etc.)
If I wanted more information about the customer e.g. high balance, avg days to pay then I would have two options:
(1) Create aggregate formula of purchases, credits, applied payments, etc. and then update dimension each time customer id transactions change
(2) Create separate aggregate fact table with two columns - high balance, avg days to pay with one primary key linked to customer dimension - and one record for each key
It appears that solution number 2 would be more efficient than having to update the dimension each time a transaction for that dimension occurs, whereas in option number 2, only the secondary fact table changes.
Are there any downfalls with option number 2(e.g. we now have two fact tables - sales, and aggregate customer payment info). Am I even looking at this correctly?
Customer dimension
FACT table of sales activity (purchases, credits - NOT payments, applied transactions, etc.)
If I wanted more information about the customer e.g. high balance, avg days to pay then I would have two options:
(1) Create aggregate formula of purchases, credits, applied payments, etc. and then update dimension each time customer id transactions change
(2) Create separate aggregate fact table with two columns - high balance, avg days to pay with one primary key linked to customer dimension - and one record for each key
It appears that solution number 2 would be more efficient than having to update the dimension each time a transaction for that dimension occurs, whereas in option number 2, only the secondary fact table changes.
Are there any downfalls with option number 2(e.g. we now have two fact tables - sales, and aggregate customer payment info). Am I even looking at this correctly?
cpv- Posts : 1
Join date : 2012-04-23
Re: Calculated attributes in Customer Dimension?
If two facts are required, you have to create them.
But I don't like to create too much aggregated facts, If the data is not huge or aggregated facts are not used often.
But I don't like to create too much aggregated facts, If the data is not huge or aggregated facts are not used often.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Calculated attributes in Customer Dimension?
CPV, option 2 is technically correct. What is being described is an aggregate fact table with one dimension (the customer).
It's a matter of deployment: Which structure performs better for typical queries? That will depend a lot on specific hardware, so it would need some testing to pick the best choice.
Also, the solution does not work very well if customer is a type2 dimension. You wind up having to aggregate on the natural key, and then wind up choosing to generate a fact table based on the type2 key or simply updating the dimension table using the natural key. The latter may take longer, but if it helps query performance, it may be worth the bother.
It's a matter of deployment: Which structure performs better for typical queries? That will depend a lot on specific hardware, so it would need some testing to pick the best choice.
Also, the solution does not work very well if customer is a type2 dimension. You wind up having to aggregate on the natural key, and then wind up choosing to generate a fact table based on the type2 key or simply updating the dimension table using the natural key. The latter may take longer, but if it helps query performance, it may be worth the bother.
Similar topics
» dimension table design question for around 100 attributes and higher level calculated attributes
» Calculated dimension attributes based on fact
» Add customer attributes to fact or new dimension
» Additional customer attributes or new dimensions - when is a dimension too wide?
» Merging customer data from disparate sources to create a master customer dimension
» Calculated dimension attributes based on fact
» Add customer attributes to fact or new dimension
» Additional customer attributes or new dimensions - when is a dimension too wide?
» Merging customer data from disparate sources to create a master customer dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum