Fact with large columns, 320.
5 posters
Page 1 of 1
Fact with large columns, 320.
We have a fact table with 316 columns and approximately 9 million records. Of these 316 columns, just 11 are dimension keys. We use this warehouse for SSAS 2008. Currently the cube processes in about 30-40 minutes. I have recently started in warehousing/BI, so I am trying to improve step by step.
Questions
#1 Is it normal to have this many measures in a fact table?
#2 Is there a better approach than having that many measures in a fact? An article or example would really help.
Thanks to all in advance.
vipinkrsharma- Posts : 1
Join date : 2012-01-24
Re: Fact with large columns, 320.
1. No, it is very unusual.
2. Without knowing what is being modeled, it is hard to recommend an alternative.
Re: Fact with large columns, 320.
Hi,
To me it is not that unusual. Specifically for predictive analytics, we use views that represent a snapshot of a system's state in a single row. Usually this sits at the presentation layer, meaning it is built from normalised data in the previous layer. You could split the fact by section, but then you would need to move into the no-no area of joining fact tables together to get a single view of your data. Having said that, I have created several data marts this way and never had a complaint; quite the opposite. The major disadvantage is inflexibility when expanding the model.
I hope this helps.
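To make the single-row snapshot idea concrete, here is a minimal sketch of such a presentation-layer view. All table, column, and metric names are hypothetical (nothing here comes from aristo's actual system): the normalised layer is assumed to hold one row per (snapshot, system, metric), and the view pivots a few chosen metrics back into columns.

```sql
-- Hypothetical names throughout. The normalised layer stores one row per
-- (SnapshotDateKey, SystemKey, MetricCode); the presentation view pivots
-- selected metrics into columns for a single-row system-state snapshot.
CREATE VIEW dbo.vw_SystemStateSnapshot AS
SELECT
    f.SnapshotDateKey,
    f.SystemKey,
    MAX(CASE WHEN f.MetricCode = 'CPU_PCT'   THEN f.MetricValue END) AS CpuPct,
    MAX(CASE WHEN f.MetricCode = 'MEM_PCT'   THEN f.MetricValue END) AS MemPct,
    MAX(CASE WHEN f.MetricCode = 'QUEUE_LEN' THEN f.MetricValue END) AS QueueLen
FROM dbo.FactSystemMetric AS f
GROUP BY f.SnapshotDateKey, f.SystemKey;
```

Because the wide shape lives only in a view, the base fact stays narrow and the pivot can be widened later without reloading data.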
aristo- Posts : 2
Join date : 2012-01-26
Re: Fact with large columns, 320.
aristo, can you clarify that a bit?
aristo wrote:
You could split the fact by section but then you would need to move into the no-no area of joining fact tables together to get a single view of your data.
It sounds like you have a list of system-state metrics that you're pivoting from rows to columns. Why would that require a fact-to-fact table join?
Re: Fact with large columns, 320.
ykud wrote:
aristo, can you clarify that a bit?
aristo wrote:
You could split the fact by section but then you would need to move into the no-no area of joining fact tables together to get a single view of your data.
It sounds like you have a list of system-state metrics that you're pivoting from rows to columns. Why would that require a fact-to-fact table join?
Apologies for the late reply.
Apologies also for mixing up concepts. In the logical model there is only one fact, which is represented by several tables in the physical model. Think of partitioning the fact table both horizontally and vertically; it is only needed in some implementations. Horizontal (row) partitioning is available natively in most major DW platforms. For vertical partitioning, I tend to create several tables that share the same PK. This makes it easier to add more columns in the future, as you can get away with updating only part of the fact. So the main reason, I guess, is ETL. There may also be some benefit in letting the optimiser parallelise more easily, but that depends on the implementation.
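A minimal sketch of the shared-PK approach, with hypothetical table and column names (the grain, measures, and naming are illustrative, not aristo's real design): two physical tables at the same grain, reassembled into one logical fact by a view.

```sql
-- Hypothetical names. The logical fact is split into two physical tables
-- that share the same grain and primary key; a view reassembles them.
CREATE TABLE dbo.FactSales_Core (
    DateKey     INT NOT NULL,
    ProductKey  INT NOT NULL,
    SalesAmount DECIMAL(18,2),
    SalesQty    INT,
    CONSTRAINT PK_FactSales_Core PRIMARY KEY (DateKey, ProductKey)
);

CREATE TABLE dbo.FactSales_Costs (
    DateKey      INT NOT NULL,
    ProductKey   INT NOT NULL,
    FreightCost  DECIMAL(18,2),
    HandlingCost DECIMAL(18,2),
    CONSTRAINT PK_FactSales_Costs PRIMARY KEY (DateKey, ProductKey)
);

-- ETL can load or reload each table independently; the cube (or any
-- query) sees one logical fact through the view.
CREATE VIEW dbo.vw_FactSales AS
SELECT c.DateKey, c.ProductKey,
       c.SalesAmount, c.SalesQty,
       x.FreightCost, x.HandlingCost
FROM dbo.FactSales_Core  AS c
JOIN dbo.FactSales_Costs AS x
  ON x.DateKey = c.DateKey AND x.ProductKey = c.ProductKey;
```

Adding a new group of measures then means adding a third table with the same PK and extending the view, rather than altering and reloading one very wide table.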
I hope this helps.
aristo- Posts : 2
Join date : 2012-01-26
Re: Fact with large columns, 320.
Hi vipinkrsharma,
Coming back to your original question, your current model does sound a bit out of balance to me. If I understand you right, you have 300+ measures by 11 dimensions? I'd be expecting a ratio of 5:1 at most (not roughly 30:1). I generally end up with lower ratios e.g. 2:1. To be clear, I'm counting each instance of role-playing dimensions (Cube Dimensions in SSAS-speak).
This isn't a design objective per se, but it just seems to be how data naturally models for SSAS once you understand both the data and SSAS fully.
So I'd be looking for repeated groups of measures, e.g. Pending Customer Count, Open Customer Count, Closed Customer Count. Sometimes you can rationalise these by adding new dimensions or attributes (in this example, perhaps a Customer Status dimension with values of Pending, Open, or Closed).
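As a sketch of that rationalisation, with hypothetical table and column names (the Customer Status example only, not the poster's real schema): three repeated count measures collapse into one measure plus a status dimension key, using T-SQL's UNPIVOT to migrate the wide data.

```sql
-- Hypothetical names. Before: FactCustomer(DateKey, PendingCount,
-- OpenCount, ClosedCount). After: one measure keyed by a status dimension.
CREATE TABLE dbo.DimCustomerStatus (
    CustomerStatusKey INT PRIMARY KEY,
    StatusName        VARCHAR(20)   -- 'Pending', 'Open', 'Closed'
);

CREATE TABLE dbo.FactCustomerCount (
    DateKey           INT NOT NULL,
    CustomerStatusKey INT NOT NULL
        REFERENCES dbo.DimCustomerStatus (CustomerStatusKey),
    CustomerCount     INT NOT NULL
);

-- Populate the narrow fact from the wide one with UNPIVOT; StatusCol
-- receives the source column name, which is mapped to a status key.
INSERT INTO dbo.FactCustomerCount (DateKey, CustomerStatusKey, CustomerCount)
SELECT u.DateKey, s.CustomerStatusKey, u.CustomerCount
FROM dbo.FactCustomer
UNPIVOT (CustomerCount FOR StatusCol
         IN (PendingCount, OpenCount, ClosedCount)) AS u
JOIN dbo.DimCustomerStatus AS s
  ON s.StatusName = CASE u.StatusCol
                      WHEN 'PendingCount' THEN 'Pending'
                      WHEN 'OpenCount'    THEN 'Open'
                      WHEN 'ClosedCount'  THEN 'Closed'
                    END;
```

Each group of N repeated measures rationalised this way removes N columns from the fact in exchange for one dimension key, which is usually a much better fit for SSAS.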
Another thing to look for would be sparse measures across the rows of fact data. You may find many rows (e.g. of a certain category, or from a certain source) that have null values for many measures. These could be candidates for splitting into a separate fact, allowing you to model the data more naturally (with just the relevant measures in each fact).
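Sparsity is easy to measure before deciding what to split out. A quick profiling query along these lines (column names are hypothetical stand-ins for the poster's 300+ measures) reports the NULL percentage per measure; measures that are NULL on most rows are the natural candidates for a separate, narrower fact.

```sql
-- Hypothetical names. Profile measure sparsity: the share of NULLs per
-- measure column suggests candidates for a separate fact table.
SELECT
    COUNT(*) AS TotalRows,
    100.0 * SUM(CASE WHEN ReturnAmount   IS NULL THEN 1 ELSE 0 END)
          / COUNT(*) AS ReturnAmountNullPct,
    100.0 * SUM(CASE WHEN DiscountAmount IS NULL THEN 1 ELSE 0 END)
          / COUNT(*) AS DiscountAmountNullPct
FROM dbo.FactSales;
-- Measures that are NULL on most rows (e.g. return-related amounts on
-- non-return transactions) can move to a narrower fact at its own grain.
```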
Good luck!
Mike
Similar topics
» very large/wide fact table considerations?
» Partitioning Large MS SQL-Server FACT table
» Too many columns in fact table
» SK and ID columns in a Fact table
» Disadvantage of Fact table with 44 columns