How to reduce size of a customer like dimension
5 posters
Page 1 of 1
How to reduce size of a customer like dimension
Is it possible to reduce size of dimension like customer? In our scenario we have a customer dimension with following attributes. There are a total of 46 Million customers. Our dimension is not portioned. Any inputs will be appreciated. Database is Oracle 11g
Customer_id surrogate key
Customer_num Primary Key
First_name
Last_name
DOB
Active
Customer_orignal_store
Customer_id surrogate key
Customer_num Primary Key
First_name
Last_name
DOB
Active
Customer_orignal_store
wizard- Posts : 13
Join date : 2010-11-30
Re: How to reduce size of a customer like dimension
When you say "reduce the size" do you mean the number of bytes it takes up or the number of rows?
Is Active and/or Customer_Original_Store type 1 or type 2?
Just curious, but how do you deal with changes in first or last name and how do you distinguish between members with the same first name, last name, and DOB? I know the probability of 2 people having the same first and last name and DOB at the same original store, but it could happen?
What do you do if you get the following? This is one customer.
J|Smith|2/1/2000|Store 1
Jeff|Smith|2/1/2000|Store 1
Jeffrey|Smith|2/1/2000|Store 1
J|Smith|1/2/2000|Store 1
Jeff|Smith|1/2/2000|Store 1
Jeffrey|Smith|1/2/2000|Store 1
J|Smith|2/1/2000|Store 2
Jeff|Smith|2/1/2000|Store 2
Jeffrey|Smith|2/1/2000|Store 2
J|Smith|1/2/2000|Store 2
Jeff|Smith|1/2/2000|Store 2
Jeffrey|Smith|1/2/2000|Store 2
Is Active and/or Customer_Original_Store type 1 or type 2?
Just curious, but how do you deal with changes in first or last name and how do you distinguish between members with the same first name, last name, and DOB? I know the probability of 2 people having the same first and last name and DOB at the same original store, but it could happen?
What do you do if you get the following? This is one customer.
J|Smith|2/1/2000|Store 1
Jeff|Smith|2/1/2000|Store 1
Jeffrey|Smith|2/1/2000|Store 1
J|Smith|1/2/2000|Store 1
Jeff|Smith|1/2/2000|Store 1
Jeffrey|Smith|1/2/2000|Store 1
J|Smith|2/1/2000|Store 2
Jeff|Smith|2/1/2000|Store 2
Jeffrey|Smith|2/1/2000|Store 2
J|Smith|1/2/2000|Store 2
Jeff|Smith|1/2/2000|Store 2
Jeffrey|Smith|1/2/2000|Store 2
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: How to reduce size of a customer like dimension
- By "reduce the size" I mean reduce the number of rows.
- Active is type 1
- We use a NOT_EXISTS strategy to deal with changes in first or last name.
- Customer_id is unique for each customer. In our customer dimension we do not keep track of old names.
Thanks
- Active is type 1
- We use a NOT_EXISTS strategy to deal with changes in first or last name.
- Customer_id is unique for each customer. In our customer dimension we do not keep track of old names.
Thanks
wizard- Posts : 13
Join date : 2010-11-30
Re: How to reduce size of a customer like dimension
I imagine if you could delete rows, you would have already done that. The only other option is partitioning. The partitioning strategy would be driven by access paths (queries).
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: How to reduce size of a customer like dimension
What are you trying to accomplish with a "smaller" customer dimension? Is it performance?
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: How to reduce size of a customer like dimension
You could try partitioning, it may or may not help much, as you are dealing with a dimension. A local partition index will be smaller, but indexes are typically order lnn structures, so speed improvement probably wouldn't be much.
Partitioning is more effective with fact tables as it helps reduce the size of the table it needs to deal with. Dimensions, on the other hand, are accessed broadly, so it would typically need to access all partitions anyway.
Partitioning is more effective with fact tables as it helps reduce the size of the table it needs to deal with. Dimensions, on the other hand, are accessed broadly, so it would typically need to access all partitions anyway.
Re: How to reduce size of a customer like dimension
True. My thought was active/inactive. If inactive customers are there for historical purposes but rarely accessed, that could improve query performance. Another option is to remove the customer dimension as a filter and only provide on drill down capabilities.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: How to reduce size of a customer like dimension
For a dimension with size like that, I would be mindful of following points:
1. Try to make the dimension as static as possible by removing all the dynamic attributes into mini dimensions and leverage fact table to track changes. It looks like all the attributes, except Active, can be type 1 in your existing table. If you include an Age measure/DD in the fact table, you may not need DOB for your analytics.
2. Use mini dimensions to slice and dice your facts if the reports is of analytic nature. Only use customer dimension for drill down purpose.
3. If only a small portion of the customer base is referenced in the respective fact table, you may create and maintain fact specific customer subset dimensions, so that the fact tables are only connected by smaller customer dimensions. Make sure the subset dimensions share the same surrogate key domain with the base dimension so that you can easily consolidate different fact tables, a concept similar to subtype dimensional modeling.
1. Try to make the dimension as static as possible by removing all the dynamic attributes into mini dimensions and leverage fact table to track changes. It looks like all the attributes, except Active, can be type 1 in your existing table. If you include an Age measure/DD in the fact table, you may not need DOB for your analytics.
2. Use mini dimensions to slice and dice your facts if the reports is of analytic nature. Only use customer dimension for drill down purpose.
3. If only a small portion of the customer base is referenced in the respective fact table, you may create and maintain fact specific customer subset dimensions, so that the fact tables are only connected by smaller customer dimensions. Make sure the subset dimensions share the same surrogate key domain with the base dimension so that you can easily consolidate different fact tables, a concept similar to subtype dimensional modeling.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: How to reduce size of a customer like dimension
Thanks for the replies everyone. I am trying to reduce the number of rows in my dimension to reduce the row_id lookup. Secondly, a smaller dimension means less distinct values in the fact table which helps bitmap index.
"Active" column is used very frequently and is type 1.
Thanks
"Active" column is used very frequently and is type 1.
Thanks
wizard- Posts : 13
Join date : 2010-11-30
Similar topics
» Dimension Size
» Merging customer data from disparate sources to create a master customer dimension
» Dimension same size as Fact
» De-normalizing Customer Information to create a Customer Dimension
» Customer Dimension
» Merging customer data from disparate sources to create a master customer dimension
» Dimension same size as Fact
» De-normalizing Customer Information to create a Customer Dimension
» Customer Dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum