Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Additional customer attributes or new dimensions - when is a dimension too wide?

2 posters

Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Additional customer attributes or new dimensions - when is a dimension too wide?

Post  dk2014 Thu Nov 20, 2014 1:29 pm

I'm having a dilemma as to which direction I should go in when it comes to modeling our Customer dimension.
I want to keep all the customer attributes in a single dimension and avoid the need for additional joins when retrieving the data from the DW but the dimension is getting quite wide - around 70 columns at this point.
After reading some of the articles online a lot of time this is the way to go although some people do recommend breaking the table into separate dimensions. I've read an article that recommends breaking a single dimension into one or more (snowflake design) but only if 80% of the time the main dimension will suffice and 20% of the time we may need to grab additional data from the snowflaked dimension (visitors vs reliable/regular customers). In my case, however, 98% of the time I need all the information that I have so if I split the data into few dimensions, 98% of the time I would need to join all those dimensions and pull the data. It sounds like while I will have a cleaner design and narrower dimensions the performance will inevitably suffer. Btw, I'm talking about attributes such as customer's billing address, customer's shipping address, and a bunch of other unrelated customer's characteristics. The addresses are the only attributes that to me it could make sense to keep in a separate dimension but even with those I'd rather keep them in the customer dimension unless this is not something that is usually done.
What is your opinion and a common approach in this situation and are very wide dimensions (if say 100 column wide dimension is even considered very wide) an issue?

Thanks,

dk2014

Posts : 15
Join date : 2014-11-10

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  ngalemmo Thu Nov 20, 2014 2:38 pm

Given that customer will (hopefully) be your largest dimension, you don't want to go too wide to avoid performance issues.

I would put address off into another table. It takes up a lot of space and most of the information is useless for analysis. You may want to leave a handful of important address attributes on the customer dim itself (zip, state, etc…) that are commonly used in queries.

If you have to deal with segmentation attributes, particularly if marketing keeps adding new ones, you should consider going vertical. Have a segmentation attribute dimension and a bridge between customer and attribute.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  dk2014 Thu Nov 20, 2014 3:46 pm

Thanks for your reply ngalemmo! I'll definitely consider doing it the way you suggested.
DimCustomer is/will be the largest dimension but is still relatively small when initially loaded and before SCDs start inserting new rows - around 550k records.
Would I combine all the addresses (shipping & billing) into one dimension, sayDim_CustomerAddress, or would you have one for shipping and another one for billing?
The whole address part is only 5-6 attributes and if I leave State, Zip, City in the main Dim_Customer as you suggested then I would only be storing 2-3 fields, per address type, in this new address dimension.
Makes me wonder if it's worth it?

dk2014

Posts : 15
Join date : 2014-11-10

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  ngalemmo Thu Nov 20, 2014 4:35 pm

There would be one address dimension with all addresses. I would include city, state and zip in both the address dimension (for completeness) and customer. I would probably not include versions of city, state and zip in customer for all address types (billing, shipping) as it may have little or infrequent use. Also, these address could vary by order. If someone wanted to do analysis by shipping address, for example, they could join to the address from the fact.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  dk2014 Fri Nov 21, 2014 9:55 am

Thank you ngalemmo!

dk2014

Posts : 15
Join date : 2014-11-10

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  dk2014 Fri Nov 21, 2014 12:53 pm

One more thing, when you say 'to do analysis by shipping address, for example, they could join to the address from the fact' do you mean that I should include the CustomerAddressKey in the Fact table rather than in the Dim_Customer table or that maybe I should include it in both!?
My thought was that if I have the OrderSalesFact table with the granularity of an order item, the fact table would include OrderNumber degenerate dimension, CustomerKey, StoreKey, OrderDateKey, etc. and then DimCustomer table would have a FK CustomerAddressKey (from DimCustomerAddress) in which case I don't see how one could join the DimCustomerAddress directly from the Fact table.

dk2014

Posts : 15
Join date : 2014-11-10

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  ngalemmo Fri Nov 21, 2014 1:46 pm

The fact table would contain foreign keys to the address dimension. There would be separate keys for each role: customer address, shipping address, etc… Using a FK off customer is snowflaking, which is unnesscessary and can hamper query performance.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  dk2014 Fri Nov 21, 2014 2:53 pm

Thank you!

dk2014

Posts : 15
Join date : 2014-11-10

Back to top Go down

Additional customer attributes or new dimensions - when is a dimension too wide? Empty Re: Additional customer attributes or new dimensions - when is a dimension too wide?

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum