Multiple measures in a fact table- modelling question
3 posters
Page 1 of 1
Multiple measures in a fact table- modelling question
Hi all,
First, sorry for what may be the super easy question for you...
There are around 100 different metrics with the same grain to be analyzed against the same set of dimensions. The initial idea was to build one big table ~100columns wide with a column for each and every metric (this seems to be a pretty straightforward and classical solution). However, the second idea that came to our minds, was to create only two columns for all the metrics, with 1st column serving as a dimension 'Metric ID' and 2nd serving as a regular measure and storing the values for the corresponding 'Metric ID' elements. So, generally the choice is between:
1) Dimensions|Measure Id 1|Measure Id 2|Measure Id 3|....
and
2) Dimensions|Measure Id|Value
Basing on this information, would you be able to tell which solution would provide us with better performance (RDBMS used would be mostly probably Oracle and data from this table would serve as a source for the dashboard design)? Is the 2nd solution not going to be problematic in some way? Also, if the 1st solution is chosen, wouldn't it be a good idea to split this table into some topic-oriented tables (e.g. if they are going to be presented on different set of cockpits)?
I am aware of the fact that the answer may not be a straightforward one but it would be great if you could share your views on that.
Thanks.
First, sorry for what may be the super easy question for you...
There are around 100 different metrics with the same grain to be analyzed against the same set of dimensions. The initial idea was to build one big table ~100columns wide with a column for each and every metric (this seems to be a pretty straightforward and classical solution). However, the second idea that came to our minds, was to create only two columns for all the metrics, with 1st column serving as a dimension 'Metric ID' and 2nd serving as a regular measure and storing the values for the corresponding 'Metric ID' elements. So, generally the choice is between:
1) Dimensions|Measure Id 1|Measure Id 2|Measure Id 3|....
and
2) Dimensions|Measure Id|Value
Basing on this information, would you be able to tell which solution would provide us with better performance (RDBMS used would be mostly probably Oracle and data from this table would serve as a source for the dashboard design)? Is the 2nd solution not going to be problematic in some way? Also, if the 1st solution is chosen, wouldn't it be a good idea to split this table into some topic-oriented tables (e.g. if they are going to be presented on different set of cockpits)?
I am aware of the fact that the answer may not be a straightforward one but it would be great if you could share your views on that.
Thanks.
BenD- Posts : 2
Join date : 2010-11-14
Re: Multiple measures in a fact table- modelling question
If the measures do share the same dimensionality, and they represent the most granular measurements you want analyse, storing them side by side in one fact table is the most efficient approach, saving you from over-normalising the fact and joining them back in future.
However, with so many measures in a single table does warrant some review on the model. Are there many derived values, and if the underlying granular values are also useful, the base metrics may not share the same set of dimension keys.
Another point is about the ‘Metric’ dimension. Is it something useful that business wants as part of dimension portfolio, or would it create more confusion to users.
However, with so many measures in a single table does warrant some review on the model. Are there many derived values, and if the underlying granular values are also useful, the base metrics may not share the same set of dimension keys.
Another point is about the ‘Metric’ dimension. Is it something useful that business wants as part of dimension portfolio, or would it create more confusion to users.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Multiple measures in a fact table- modelling question
I agree with Hang - adding a "metric" dimension will be less efficient for any query that requires multiple metrics (depending on how you store the table). Let's say your fact table has 4 byte integer surrogate keys to 10 dimensions, plus 100 measures, each 8 bytes long. Your total row size, not counting overhead, is 840 bytes. If you switch to a metric dimension, you will have 11 dimension keys plus one measure for a total of 52 bytes per row, BUT, you now need 100 fact rows to get the same information. This totals 5,200 bytes for the same information. Your fact table is over 6x bigger.
You also need to consider the kinds of reports users need. Retrieving multiple metrics and comparing them in the same report can be a real challenge with a metric dimension in SQL. Each metric is essentially a SQL set, and it can be struggle lining them up next to each other in report columns. It depends on your BI tool, but give it a try with a test table.
--Warren
You also need to consider the kinds of reports users need. Retrieving multiple metrics and comparing them in the same report can be a real challenge with a metric dimension in SQL. Each metric is essentially a SQL set, and it can be struggle lining them up next to each other in report columns. It depends on your BI tool, but give it a try with a test table.
--Warren
warrent- Posts : 41
Join date : 2008-08-18
Re: Multiple measures in a fact table- modelling question
Thanks a lot for your insight! My first thought was also that introducing Metric attribute would rather complicate the issues rather than make life easier. Anyway, creating this attribute is not the customer requirement it was rather an alternative for the column-wide modelling. I am simply worried about the size of the table (~4 dimensions, ~100 metrics), I am not used to working with such big tables, so I am not sure how do they perform. But as you say, dividing this table into topic-oriented smaller tables doesn't make much sense and having one table shouldn't be a big deal.
So, summing up, the main reason for discouraging my customer from the Metric attribute solution will be the performance issue, can you see any other clear point that I could give him? (thet want to be clear about the choice of the solution).
Once again thanks for help!
So, summing up, the main reason for discouraging my customer from the Metric attribute solution will be the performance issue, can you see any other clear point that I could give him? (thet want to be clear about the choice of the solution).
Once again thanks for help!
BenD- Posts : 2
Join date : 2010-11-14
Similar topics
» multiple currencies for measures in Fact Table
» Question on Multiple fact table design
» Modelling question - Multiple fact tables with one to many relationships and querying from many fact tables
» How to create fact table with measures derived from comparing two fact table rows
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Question on Multiple fact table design
» Modelling question - Multiple fact tables with one to many relationships and querying from many fact tables
» How to create fact table with measures derived from comparing two fact table rows
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum