Snowflake Dimensions and the joins between them.

View previous topic View next topic Go down

Snowflake Dimensions and the joins between them.

Post  cmosquera on Tue Nov 19, 2013 8:44 pm

It's kind of tricky to find the correct information on this exact topic:

When resorting to snowflaking dimensions. Do you join them by using their Surrogate Keys or do you join them using natural keys? In our scenario we have dimensions in which there are Type 2 SCDs and we're wondering if we're better off using natural (business ids) or surrogate keys (which would point to the SCD record at a point in time...

The design tip for this mentions using an attribute key but it's hard to say what is meant by that.


Thanks!

cmosquera

Posts : 6
Join date : 2013-11-18

View user profile

Back to top Go down

Re: Snowflake Dimensions and the joins between them.

Post  ngalemmo on Wed Nov 20, 2013 3:08 pm

Surrogate keys whenever possible. The idea of using an attribute is generally limited to simple lookup tables rather than full on dimensions.
avatar
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

View user profile http://aginity.com

Back to top Go down

Re: Snowflake Dimensions and the joins between them.

Post  cmosquera on Thu Dec 12, 2013 8:14 am

ngalemmo wrote:Surrogate keys whenever possible.  The idea of using an attribute is generally limited to simple lookup tables rather than full on dimensions.

Thank you very much ngalemmo for the advice! Much appreciated!

Considering the above,


if you have a base dimension DimCustomers with a PK CustomerSKey (and a natural key called CustomerID) and snowflake dimension DimCustomerRegions that has a PK called CustomerRegionSKey and another field called CustomerSKey. Both of these are tracked for type 2 changes so they have the scdfrom/to and scdstatus fields. If a change happens to a Type2 attribute in the DimCustomer dimension, it should only affect that dimension, NOT the DimCustomerRegions dimension, as changes to that dimensions should be handled separately, am I correct in stating this or is it correct to insert a new record containing my new CustomerSKey in the DimCustomerRegions table every time there is a change to my base dimension?

I believe I read you state somewhere that you have to do a self join on the base dimension by the natural key in order to get both SKeys since no change may have happened in the DimCustomerRegions dimension pointing to the new CustomerSKey...

That being said, I've also read that it is ok to put the natural key from the base dimension in the snowflake dimension to accomplish the same and then constrain both using the SCDs....

What is the best approach to use when you intend to use modern BI Tools for analysis.

Thanks in advance.





cmosquera

Posts : 6
Join date : 2013-11-18

View user profile

Back to top Go down

Re: Snowflake Dimensions and the joins between them.

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top


 
Permissions in this forum:
You cannot reply to topics in this forum