Snowflake a dimension if facts are at different granularities?
4 posters
Page 1 of 1
Snowflake a dimension if facts are at different granularities?
Hi
I'm currently modelling a data warehouse that will contain property management data. Properties consist of one or more suites, and suites can only belong to one property.
My initial thought was to normalize properties and suites into a single dimension. However, I have a fact table with a measure that will be at the property level, and another fact table with a measure that will be at the suite level. This is because the measure in the former is not stored at the suite level in the source system, and is only meaningful at the property level (for instance, the price that a property was sold for - suites aren't sold off individually).
My question is whether it would be considered bad practice to snowflake the design so that suites and properties are stored in two separate dimensions that are linked to one another (i.e.: a suite will have a foreign key to a property). This could cause problems as both are SCD dimensions of type 2, however.
Are there any other approaches that may be better than this?
Thanks.
I'm currently modelling a data warehouse that will contain property management data. Properties consist of one or more suites, and suites can only belong to one property.
My initial thought was to normalize properties and suites into a single dimension. However, I have a fact table with a measure that will be at the property level, and another fact table with a measure that will be at the suite level. This is because the measure in the former is not stored at the suite level in the source system, and is only meaningful at the property level (for instance, the price that a property was sold for - suites aren't sold off individually).
My question is whether it would be considered bad practice to snowflake the design so that suites and properties are stored in two separate dimensions that are linked to one another (i.e.: a suite will have a foreign key to a property). This could cause problems as both are SCD dimensions of type 2, however.
Are there any other approaches that may be better than this?
Thanks.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Snowflake a dimension if facts are at different granularities?
Yes. Just carry both dimensions on the suite level facts.
Re: Snowflake a dimension if facts are at different granularities?
Hi Nick
Thanks for the response. By both dimensions, do you mean to suggest that it it's okay for me to snowflake the one denormalized dimension into two dimensions (one for property and one for suite)? Apologies for being slow - I'm still a bit confused as to how I deal with the property-level facts.
I was thinking of perhaps doing something like this:
For the fact tables that do not go down to the suite level, I can link to properties in the PropertyKey dimension that have a SuiteName of 'N/A' . So each building will have a record in the dimension with a SuiteName of 'N/A'.
Thanks for the response. By both dimensions, do you mean to suggest that it it's okay for me to snowflake the one denormalized dimension into two dimensions (one for property and one for suite)? Apologies for being slow - I'm still a bit confused as to how I deal with the property-level facts.
I was thinking of perhaps doing something like this:
PropertySuiteKey | PropertyName | SuiteName |
1 | Property One | Suite One |
2 | Property One | Suite Two |
3 | Property One | N/A |
For the fact tables that do not go down to the suite level, I can link to properties in the PropertyKey dimension that have a SuiteName of 'N/A' . So each building will have a record in the dimension with a SuiteName of 'N/A'.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Snowflake a dimension if facts are at different granularities?
No. For suite level facts, carry both the properly and suite dimension keys on the fact.
Re: Snowflake a dimension if facts are at different granularities?
Hi Nick
Thanks again for the response. Do you mean that I should create two dimensions (one for property, one for suite), and associate properties to suites using the fact table?
DimProperty
DimSuite
Fact
This should show that Suite One belongs to Property One.
Thanks again.
Thanks again for the response. Do you mean that I should create two dimensions (one for property, one for suite), and associate properties to suites using the fact table?
DimProperty
PropertyKey | PropertyName |
1 | Property One |
DimSuite
SuiteKey | SuitName |
1 | Suite One |
Fact
FactKey | PropertyKey | SuiteKey |
1 | 1 | 1 |
This should show that Suite One belongs to Property One.
Thanks again.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Snowflake a dimension if facts are at different granularities?
Something that Kimball suggested in his DW Toolkit book is to use roll-up dimensions, which seem to involve creating a new dimension that is a subset of a more complex dimension.
In my situation, I'd keep the denormalized dimension with both property and suite data:
DimPropertySuite
But will create a new dimension that contains an exact subset of the attributes in the above table:
DimProperty
In the fact tables that go down to the suite level, I'll use the DimPropertySuite dimension, and I'll use the DimProperty dimension for facts that only go to the property level.
In my situation, I'd keep the denormalized dimension with both property and suite data:
DimPropertySuite
PropertySuiteKey | PropertyName | SuiteName |
1 | Property One | Suite One |
2 | Property One | Suite Two |
But will create a new dimension that contains an exact subset of the attributes in the above table:
DimProperty
PropertyKey | PropertyName |
1 | Property One |
In the fact tables that go down to the suite level, I'll use the DimPropertySuite dimension, and I'll use the DimProperty dimension for facts that only go to the property level.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Snowflake a dimension if facts are at different granularities?
I attempted to use roll-up dimensions (aka mini-dimensions) at one point to handle facts at different granularities, but ran into difficulty reporting across fact tables with BI tools that didn't understand that an attribute of the mini-dimension represented the same thing as the attribute in the original dimension. In your example, this would be that the tool didn't understand that DimPropertySuite.PropertyName represented the same thing as DimProperty.PropertyName, and therefore the users couldn't do easy ad hoc drill-across fact tables.
I ended up snowflaking, which has worked fine in my modest (11GB) DW on SQL Server 2008.
I did some performance testing and found there was a performance hit of about 30% overall, but most queries still finish in a second or two (10 seconds for the slow ones) so this isn't an issue for me. I would not recommend snowflaking for a very large DW, and I would only snowflake to the extent needed to support various fact table granularies (i.e. don't try to 3NF your dimension tables!)
I ended up snowflaking, which has worked fine in my modest (11GB) DW on SQL Server 2008.
I did some performance testing and found there was a performance hit of about 30% overall, but most queries still finish in a second or two (10 seconds for the slow ones) so this isn't an issue for me. I would not recommend snowflaking for a very large DW, and I would only snowflake to the extent needed to support various fact table granularies (i.e. don't try to 3NF your dimension tables!)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Snowflake a dimension if facts are at different granularities?
VHF
Thanks for the response. What you say makes sense to me. The warehouse I'll be building will be relatively small (50GB), so I don't think that snowflaking where needed will kill it.
Thanks again.
Thanks for the response. What you say makes sense to me. The warehouse I'll be building will be relatively small (50GB), so I don't think that snowflaking where needed will kill it.
Thanks again.
min.emerg- Posts : 39
Join date : 2011-02-25
Re: Snowflake a dimension if facts are at different granularities?
I think we should question some common believes more often.
There is no guarantee that the snowflaking leads to a poor query performance, i.e. to a query which won't be acceptable by the end user.
It could delay a query, but the real question is how slower it will become.
A good understanding of the RDBMS engine and how a join will be actually implemented, i.e. the ability to tune the SQL statement, can make the delay acceptable or even not noticable..
So, be sure to have an experienced team member, who can test possible design models (by quick prototyping of the riskier queries).
There is no guarantee that the snowflaking leads to a poor query performance, i.e. to a query which won't be acceptable by the end user.
It could delay a query, but the real question is how slower it will become.
A good understanding of the RDBMS engine and how a join will be actually implemented, i.e. the ability to tune the SQL statement, can make the delay acceptable or even not noticable..
So, be sure to have an experienced team member, who can test possible design models (by quick prototyping of the riskier queries).
hayrabedian- Posts : 7
Join date : 2011-04-01
Similar topics
» Partially vs Fully denormalized dimension and different facts granularities
» Only way to pull data from star/snowflake schema is by using facts?
» Large number of snowflake code tables per dimension
» Why not snowflake if the aggregate dimension is needed for a different fact table?
» Static Facts on Dimension Table?
» Only way to pull data from star/snowflake schema is by using facts?
» Large number of snowflake code tables per dimension
» Why not snowflake if the aggregate dimension is needed for a different fact table?
» Static Facts on Dimension Table?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum