User fields design issue
4 posters
Page 1 of 1
User fields design issue
We have a huge data in DWH with each member having around 50 attributes assigned to it. The number and the attributes change with each member participating in the type of program. So, the number of distinct attributes in the DWH reached 2500. How to model this data to be able to use retrieve and use all attributes as dimensions?
If I store this data in fact as memberid+attributeValue+attributedimid, it becomes a 3 billion records table.
examples of attributes are language, contact_during_day, transfernumber, doctor_line etc.
If I store this data in fact as memberid+attributeValue+attributedimid, it becomes a 3 billion records table.
examples of attributes are language, contact_during_day, transfernumber, doctor_line etc.
anusha.jalla@gmail.com- Posts : 3
Join date : 2012-04-03
Re: User fields design issue
I've dealt with this in the past few years working with retailers and the financial industries. While a name/value pair structure is the most flexible, querying it is a pain. You would need to implement a publication process that creates flattened structures of the desired attributes. It may work for you or it may not, and you still need to materialize a flat image (or a view, if your hardware can handle it) to allow direct user queries.
You need to consider use cases and where the same attributes exist in different situations. I would imagine a flat image of all attributes would be extremely sparse, besides being unusable (or exceed table limits of your DBMS). You most likely should consider a sub-type cluster, with common attributes in the main dimension and additional dimension tables for groups of attributes. These sub-type dimensions would carry the same PK as the main dimension so that they can be joined directly to the fact.
On the other hand, if they are truly user defined attributes, you can just store them as 50 varchar values in a row and develop views for each user group with appropriate column names for the attributes. Again, it depends on your use case.
You need to consider use cases and where the same attributes exist in different situations. I would imagine a flat image of all attributes would be extremely sparse, besides being unusable (or exceed table limits of your DBMS). You most likely should consider a sub-type cluster, with common attributes in the main dimension and additional dimension tables for groups of attributes. These sub-type dimensions would carry the same PK as the main dimension so that they can be joined directly to the fact.
On the other hand, if they are truly user defined attributes, you can just store them as 50 varchar values in a row and develop views for each user group with appropriate column names for the attributes. Again, it depends on your use case.
Re: User fields design issue
What ngalemmo calls a sub-type dimension is really just a mini dimension. Store the attributes that vary by type in a mini dimension. You will have a mini dimension for each type. The primary key of the mini dimension is the same as the primary key of the primary dimension. I would even look at creating junk mini dimensions if I have too many types.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: User fields design issue
In other words the attributes which changes a lot throw them into a separate table as you will still need to connect this child and its parent dimension have same primary key column in both.
vishy
www.cognos-ibm.blogspot.com
vishy
www.cognos-ibm.blogspot.com
Vishy- Posts : 75
Join date : 2012-02-28
Location : India
Similar topics
» DW design question - user/role integration to the data
» Design Tip #25 Modeling Issue
» Unusual(?) parent/child Fact table design issue
» User to User subscription model
» Dimension fields depend on other fields
» Design Tip #25 Modeling Issue
» Unusual(?) parent/child Fact table design issue
» User to User subscription model
» Dimension fields depend on other fields
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum