Current and Historic Dimensions (one table or two?)
4 posters
Page 1 of 1
Current and Historic Dimensions (one table or two?)
Hello all,
I have a situation where I have modeled an Item Dimension as per business requirements to be SCD type 2.
It is a conformed dimension.
Our first business process dimensional model is customer grocery sales.
The grain is one fact line for each cash register receipt detail line.
Dimensions include date, store, customerCard, promotion and item
Facts include item count, item amount, discount count, discount amount.
90 percent of users/user queries will only be concerned with the current view of the item (what it is today – or the day the report is run)
10% of usage is anticipated to be concerned with what items looked like at the point in time of the measurements.
With this in mind, I have designed the table with a type 1 “current_attribute” value for each type 2 attribute. (for example - item category is tracked as type 2 so I have included both item_category and current_item_category)
I have some sophisticated users reviewing the dimensional model and they are pushing back for including 2 item dimension tables:
Item Dimension History – SCD type 2
Current Item Dimension – SCD type 1
This can be done, but should it be?
All it would mean is adding a new dimension and a key to the fact table. But it forces maintenance of 2 tables, instead of one. It feels like a compromise of best practices.
Perhaps I could create a Current Item Dimension View, off of the single dimension table, but I wanted to see if the community had any thoughts to address this situation.
I have been asked point blank what is my reluctance to add an additional table? My argument was about the additional maintenance required. I would appreciate any points of view on this.
Thank you,
Patrick
I have a situation where I have modeled an Item Dimension as per business requirements to be SCD type 2.
It is a conformed dimension.
Our first business process dimensional model is customer grocery sales.
The grain is one fact line for each cash register receipt detail line.
Dimensions include date, store, customerCard, promotion and item
Facts include item count, item amount, discount count, discount amount.
90 percent of users/user queries will only be concerned with the current view of the item (what it is today – or the day the report is run)
10% of usage is anticipated to be concerned with what items looked like at the point in time of the measurements.
With this in mind, I have designed the table with a type 1 “current_attribute” value for each type 2 attribute. (for example - item category is tracked as type 2 so I have included both item_category and current_item_category)
I have some sophisticated users reviewing the dimensional model and they are pushing back for including 2 item dimension tables:
Item Dimension History – SCD type 2
Current Item Dimension – SCD type 1
This can be done, but should it be?
All it would mean is adding a new dimension and a key to the fact table. But it forces maintenance of 2 tables, instead of one. It feels like a compromise of best practices.
Perhaps I could create a Current Item Dimension View, off of the single dimension table, but I wanted to see if the community had any thoughts to address this situation.
I have been asked point blank what is my reluctance to add an additional table? My argument was about the additional maintenance required. I would appreciate any points of view on this.
Thank you,
Patrick
pcs- Posts : 20
Join date : 2009-02-03
Re: Current and Historic Dimensions (one table or two?)
There are 3 ways to handle this. The one your users have identified. The second is a self join filtering on the natural dimension key and current row indicator. The last is to add an additional FK to the fact table which provides a read peformance boost for option 2. I would push for option 2 or 3 before creating another dimension due to the redundant data.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Current and Historic Dimensions (one table or two?)
Thank you for the response -
If I follow your thought correctly - option 2 would remove the redundant columns and is appealing from that point of view (do that and create a view for that join...) performance could be troublesome (maybe)
Just to make sure I understand - option 3 would require adding the natural key to the fact table? Is that correct or am I missing your point?
If I follow your thought correctly - option 2 would remove the redundant columns and is appealing from that point of view (do that and create a view for that join...) performance could be troublesome (maybe)
Just to make sure I understand - option 3 would require adding the natural key to the fact table? Is that correct or am I missing your point?
pcs- Posts : 20
Join date : 2009-02-03
Dim View with a Join to the Current Dim row..
When faced with this situation, I have most often done option 2 - built a VIEW on the Dim, that contains a column for CurrentItemID, which links to the Dim row that IsCurrent and matches the business key. Has worked well.
owvi4- Posts : 3
Join date : 2009-04-16
Location : Indianapolis, IN
Re: Current and Historic Dimensions (one table or two?)
There are three ways to implement type 2 dimensions and designing the table with both type 1 and type 2 versions of the attributes is not one of them.
Options are:
1: Textbook type 2. Use a self join based on business key and current flag to get current attributes (can be done with a view).
2: Type 1 and type 2 versions of the table, two keys on the fact. Simple and fastest for queries. Don't worry about redundancy... the warehouse is already full of redundancy.
3: A type 2 with two keys, a type 1 key and type 2 key. Carry both on the fact. Type 1 key stays same for all versions of the same NK, filter on current flag to get current data. Better than 1 and easier to implement than 2.
Options are:
1: Textbook type 2. Use a self join based on business key and current flag to get current attributes (can be done with a view).
2: Type 1 and type 2 versions of the table, two keys on the fact. Simple and fastest for queries. Don't worry about redundancy... the warehouse is already full of redundancy.
3: A type 2 with two keys, a type 1 key and type 2 key. Carry both on the fact. Type 1 key stays same for all versions of the same NK, filter on current flag to get current data. Better than 1 and easier to implement than 2.
Re: Current and Historic Dimensions (one table or two?)
Not the natural key, another surrogate key that represents the natural key. Put that on the fact. It works the same way except you get the added benefit of abstracting the natural keys of the source system.pcs wrote:Thank you for the response -
If I follow your thought correctly - option 2 would remove the redundant columns and is appealing from that point of view (do that and create a view for that join...) performance could be troublesome (maybe)
Just to make sure I understand - option 3 would require adding the natural key to the fact table? Is that correct or am I missing your point?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Current and Historic Dimensions (one table or two?)
Ok - this makes much better sense to me now. thank you Nick, Boxes and Lines, and owvi4. I appreciate your time, thought and feedback. This has helped a lot.
pcs- Posts : 20
Join date : 2009-02-03
Similar topics
» Fact, factless fact, and current view dimensions
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Dimensions in fact table
» Bridge Table and Degenerate Dimensions
» Number of Dimensions around a FACT Table.
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Dimensions in fact table
» Bridge Table and Degenerate Dimensions
» Number of Dimensions around a FACT Table.
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum