rollup fact table
4 posters
Page 1 of 1
rollup fact table
If creating a roll up fact table would you consider using something other than the surrogate key for your date dimension join? I would not but my colleagues think we should be using month_id for example if we are creating a monthly roll up table from a fact table at the daily grain.
Initially I dismissed the notion but they are persistent and tell me it is best practice and even Kimball advocates this approach. So I dusted off my Kimball books but could not settle the debate either way. In addition to the original question which I suspect most are going to agree with using the date dimension surrogate key can you point me to a Kimball article, tip, etc. that I can show my colleagues that specifically states to always use the surrogate when creating date based roll ups.
Initially I dismissed the notion but they are persistent and tell me it is best practice and even Kimball advocates this approach. So I dusted off my Kimball books but could not settle the debate either way. In addition to the original question which I suspect most are going to agree with using the date dimension surrogate key can you point me to a Kimball article, tip, etc. that I can show my colleagues that specifically states to always use the surrogate when creating date based roll ups.
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: rollup fact table
What is really important is Conformance, namely Month should be a conformed rollup of the Day dimension. Kimball has the definition for conformed dimensions in his famous book 'The Complete Guide to Dimensional Modeling': http://www.rkimball.com/html/booksDWT2.html.
In your case, it really depends on what you mean by month_id. Is it yyyymm integer primary key for the month dimension, or is it just month number without year part which obviously cannot be qualified as a unique key. Now with date dimension, there is one exception about the its surrogate key, that is to use smart yyyymmdd integer key. The recent trend indicates smart integer date key has many advantages over the meaningless surrogate key, defying Kimball's early justification in his book. It seems Kimball has also accepted the smart date key concept based on some other books endorsed by him.
So back to your question, I would suggest to use yyyymm integer key as dimension key for month for the same reason for date key, and include all the relevant month, quarter, and year attributes that also exist in date dimension. In another word, month dimension must be conformed with date dimension so it is possible to compare and reconcile measurements between fact tables at different aggregate levels.
In your case, it really depends on what you mean by month_id. Is it yyyymm integer primary key for the month dimension, or is it just month number without year part which obviously cannot be qualified as a unique key. Now with date dimension, there is one exception about the its surrogate key, that is to use smart yyyymmdd integer key. The recent trend indicates smart integer date key has many advantages over the meaningless surrogate key, defying Kimball's early justification in his book. It seems Kimball has also accepted the smart date key concept based on some other books endorsed by him.
So back to your question, I would suggest to use yyyymm integer key as dimension key for month for the same reason for date key, and include all the relevant month, quarter, and year attributes that also exist in date dimension. In another word, month dimension must be conformed with date dimension so it is possible to compare and reconcile measurements between fact tables at different aggregate levels.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: rollup fact table
Conformed dimensions are in place and the overall health of the data warehouse is good, just want to keep it that way.
robber- Posts : 41
Join date : 2009-02-28
Location : Canada
Re: rollup fact table
I always stick to the dateKey but use the first day of each month when I need a monthly key e.g. 20100101 for January 2010.
Re: rollup fact table
I agree with John. I use a single date dimension and use a predetermined row for month or quarter or annual based aggregates. In the model, I would add a role name to the FK, such as MONTH_DATE_KEY to make it clear it references a monthly aggregate amount.
For the date dimension itself, I add additional natural key columns for month, quarter and year rather than use a set date. Natural key values (such as "201007") are only populated on the row for which aggregates will be maintained, in other words, only one row in a month would have a month NK value, the NK would be null for other days in the month. This way, processes that load monthly values can use the standard surrogate key assignment process without hard-coding anything. You can also use flags to identify the correct rows, but I found the NK columns to be a little cleaner for ETL.
For the date dimension itself, I add additional natural key columns for month, quarter and year rather than use a set date. Natural key values (such as "201007") are only populated on the row for which aggregates will be maintained, in other words, only one row in a month would have a month NK value, the NK would be null for other days in the month. This way, processes that load monthly values can use the standard surrogate key assignment process without hard-coding anything. You can also use flags to identify the correct rows, but I found the NK columns to be a little cleaner for ETL.
Similar topics
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
» Healthcare Hospital Stay fact table - transaction fact table - a row for In and a row for Out?
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
» Healthcare Hospital Stay fact table - transaction fact table - a row for In and a row for Out?
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum