Sparse Product Dim vs Heterogeneous Product Dim

View previous topic View next topic Go down

Sparse Product Dim vs Heterogeneous Product Dim

Post  DataDude on Wed Jan 15, 2014 5:08 pm

What is a best practice or rule of thumb for deciding when to use a single product dim vs using the Heterogeneous Product design where some attributes will be mutually exclusive to certain product types.

For our dimensional model, we maintain type 7 dimensions (type 2 dim table with a surrogate dim key as well as a separate type 1 view with a durable dim key). If we used the Heterogeneous method, we would maintain a single durable dim key across all of the subtypes, but would probably maintain separate type2 hist keys for each subtype dim (instead of trying to keep them in sync) and thus have separate keys on the fact tables. So, the fact table may look like this....

Fact Table
prod_durable_key (scd1 FK)
prod_subtype1_key (scd2 FK)
prod_subtype2_key (scd2 FK)
prod_subtype3_key (scd2 FK)
prod_subtype4_key (scd2 FK)

What is the justifications for choosing one method over the other?
sparsity? what if 90% overlap, what if 50% overlap, what if 25% overlap
ease of use? - knowing which attributes apply to which type or
other?


DataDude

Posts : 2
Join date : 2014-01-15

View user profile

Back to top Go down

Re: Sparse Product Dim vs Heterogeneous Product Dim

Post  ngalemmo on Wed Jan 15, 2014 5:51 pm

Isn't an item of only one subtype? I don't understand the purpose of four different subtype keys.

When you update an item, would you not update its particular subtype and use that type 2 key in the fact? One would assume any subtype table would only contain members of that subtype and when used you would not include other subtypes in such a query.

If, on the other hand, your intent is to allow reporting any subtype's attributes at any time (i.e. against the entire population) what would be the point of having subtypes in the first place? Just have a single wide sparse dimension.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Sparse Product Dim vs Heterogeneous Product Dim

Post  BoxesAndLines on Thu Jan 16, 2014 9:13 am

How varied (heterogeneous) are your products? Do you have specific requirements for global views as well as line of business views of products? Do you have special line of business metrics not applicable to all products? That is the thought process I work through to determine whether I need to implement a heterogeneous product solution.
avatar
BoxesAndLines

Posts : 1212
Join date : 2009-02-03
Location : USA

View user profile

Back to top Go down

Re: Sparse Product Dim vs Heterogeneous Product Dim

Post  DataDude on Thu Jan 16, 2014 4:38 pm

BoxesAndLines,
I am working in a more generic data service environment where we are developing common models and some tools that will then be deployed to multiple clients.  Thus, I don't have specific answers to questions like "How varied are the products", etc.  I was looking more for general rules of thumb that says if there are less than x% overlap, I would lean towards heterogeneous.  There would be some line of business processes (facts) as well as some more generic processes that are common to all lines of business.

ngalemmo,
Yes, an item would only belong to one subtype.  We are modeling our dims as TYPE 7, that is to say we have both a type II as well as a type I.  The product_durable_key represented the reused key for all tables for the type I.  For the type II dims, in order to reuse the same key in the fact table to join to both the common dim as well as any of the subtype dims, then the ETL would have to manage creating a new subtype record (even if nothing changed) if something in the common dim changed, or vice versa.  Instead of that maintenance, I was thinking I would need a separate key for each subtype type II table.

DataDude

Posts : 2
Join date : 2014-01-15

View user profile

Back to top Go down

Re: Sparse Product Dim vs Heterogeneous Product Dim

Post  ngalemmo on Thu Jan 16, 2014 5:08 pm

If you think of the subtype row as an extension of the supertype row, a type 2 change in the supertype should also be reflected in the subtype even if none of the subtype attributes changed. This allows both supertype and subtype to carry the same type 2 key, making queries far simpler.

The problem with the multiple key solution, aside from overcomplicating queries, is where does it end? If you are trying to establish a generic environment, you have no control over how many of these subtypes may exist. You wind up with a sparse fact table rather than a sparse dimension table, which is a much worse situation.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Sparse Product Dim vs Heterogeneous Product Dim

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum