Modeling date hierarchy within a dimension
4 posters
Page 1 of 1
Modeling date hierarchy within a dimension
First let me say that I am familiar with date dimensions and use them throughout my models... usually a whole bunch of roll-playing dimensions based on a single date dimension object.
However, I am often confronted with situations where a date may be a dimension attribute, for example, the effective date of an insurance policy. In this case the effective date is not an attribute of any transaction, its an attribute of the policy it's self. It's easy enough to just store a field of type 'date' on the policy dimension, but it then lacks the richness of the hierarchies available through a dedicated date dimension object. Now you can replicate that functionality through the use of additional fields on the dimension like, effective year, effective month, effective day, etc., but it feels like that's generating a lot of clutter and maintenance overhead, especially if you need to multiply this by a whole handful of date attributes. It also seems possible to go with a snowflake and implement the dimension attributes as foreign keys to a date dimension. In addition, using a view, the snowflake can be flattened, so you would get the simplicity and re-usability in the physical model, while still having access to a flat logical model.
In my case I'm working on a Microsoft platform, mostly with SSAS Cubes, but there will eventually be some relational report models that run directly against the warehouse tables and not the cubes. The dimension in question has a few million rows, and given the high cardinality of dates, hierarchies are high desirable for their slicers and Aggregators.
There are performance and usability considerations, not to mention the maintenance overhead. I'm wondering if there is a best practice for this or good arguments for or against a particular implementation.
However, I am often confronted with situations where a date may be a dimension attribute, for example, the effective date of an insurance policy. In this case the effective date is not an attribute of any transaction, its an attribute of the policy it's self. It's easy enough to just store a field of type 'date' on the policy dimension, but it then lacks the richness of the hierarchies available through a dedicated date dimension object. Now you can replicate that functionality through the use of additional fields on the dimension like, effective year, effective month, effective day, etc., but it feels like that's generating a lot of clutter and maintenance overhead, especially if you need to multiply this by a whole handful of date attributes. It also seems possible to go with a snowflake and implement the dimension attributes as foreign keys to a date dimension. In addition, using a view, the snowflake can be flattened, so you would get the simplicity and re-usability in the physical model, while still having access to a flat logical model.
In my case I'm working on a Microsoft platform, mostly with SSAS Cubes, but there will eventually be some relational report models that run directly against the warehouse tables and not the cubes. The dimension in question has a few million rows, and given the high cardinality of dates, hierarchies are high desirable for their slicers and Aggregators.
There are performance and usability considerations, not to mention the maintenance overhead. I'm wondering if there is a best practice for this or good arguments for or against a particular implementation.
tlum- Posts : 2
Join date : 2012-12-27
Re: Modeling date hierarchy within a dimension
There is a "broken relationship" between any date field and the date dimension table. I will join any date column to the date dimension as needed to support a given report. I never snowflake the model though as the experienced practitioner implicitly knows that this join is always available.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Modeling date hierarchy within a dimension
I'm not sure that I understand what you mean. You say
Does "to support a given report" mean you do the join in each report, at run time as needed?
Here I'm talking the actual warehouse model that will support the OLAP cubes, among other things, which is maintained during ETL processing. Joins are expensive, so you want to do them in the right place, if at all.
BoxesAndLines wrote:I will join any date column to the date dimension as needed to support a given report. I never snowflake the model though as the experienced practitioner implicitly knows that this join is always available.
Does "to support a given report" mean you do the join in each report, at run time as needed?
Here I'm talking the actual warehouse model that will support the OLAP cubes, among other things, which is maintained during ETL processing. Joins are expensive, so you want to do them in the right place, if at all.
tlum- Posts : 2
Join date : 2012-12-27
Re: Modeling date hierarchy within a dimension
My date dimension has an index on the date columns. So joining to the date dimension on a date value works well. I only do this when it doesn't make sense to add the date to the fact and when I want to leverage the date hierarchy.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Modeling date hierarchy within a dimension
This is where the art comes into it.
If a date on a dimension is queried frequently, then consider moving it to the fact. If the date is queried infrequently but when queried it is usually rolled up to the Month or Year, then consider keeping the Date on the dimension as a key and snowflake it to the date dimension. If a date on the dimension is used to keep track of changes to the dimension table or other stuff and is rarely queried and never queried above the date, then leave as a date field on the dimension.
If a date on a dimension is queried frequently, then consider moving it to the fact. If the date is queried infrequently but when queried it is usually rolled up to the Month or Year, then consider keeping the Date on the dimension as a key and snowflake it to the date dimension. If a date on the dimension is used to keep track of changes to the dimension table or other stuff and is rarely queried and never queried above the date, then leave as a date field on the dimension.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Modeling date hierarchy within a dimension
Right. You want to keep the model form as true to a star schema as possible. But that doesn't mean you cannot occasionally throw in an 'off-label' join to resolve an infrequent need. The particular query may not perform optimally, but it is a better option than trying to build a model to handle every exceptional case.
Similar topics
» Date Dimension: Representing partial dates/Imputing date values
» Eliminate Date Dimension Surrogate Key
» Always link date fields to Date Dimension?
» Modeling Fact tables for a Hierarchy
» Hierarchy in dimension table
» Eliminate Date Dimension Surrogate Key
» Always link date fields to Date Dimension?
» Modeling Fact tables for a Hierarchy
» Hierarchy in dimension table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|