Fact with large columns, 320.
5 posters
Page 1 of 1
Fact with large columns, 320.
We have a fact table with 316 columns and approximately 9 million records. Of these 316 columns, just 11 are dimension keys. We use this warehouse for SSAS 2008. Currently the cube processes in about 30-40 minutes. I have recently started in warehousing/BI, so I am trying to improve step by step.
Questions
#1 Is it normal to have this many measures in a fact table?
#2 Is there a better approach than having that many measures in a fact? An article or example would really help.
Thanks to all in advance.
vipinkrsharma- Posts : 1
Join date : 2012-01-24
Re: Fact with large columns, 320.
1. No, it is very unusual.
2. Without knowing what is being modeled, it is hard to recommend an alternative.
Re: Fact with large columns, 320.
Hi,
To me it is not that unusual. Specifically for predictive analytics, we use views that represent a snapshot of a system's state in a single row. Usually this sits at the presentation layer, meaning it is built from normalised data in the previous layer. You could split the fact by section, but then you would need to move into the no-no area of joining fact tables together to get a single view of your data. Having said that, I have created several data marts this way and never had a complaint; quite the opposite. The major disadvantage is inflexibility when expanding the model.
I hope this helps.
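To make the single-row snapshot idea concrete, here is a minimal sketch of such a presentation-layer view. All table, column, and metric names are hypothetical (nothing here comes from aristo's actual system): the normalised layer is assumed to hold one row per (snapshot, system, metric), and the view pivots a few chosen metrics back into columns.

```sql
-- Hypothetical names throughout. The normalised layer stores one row per
-- (SnapshotDateKey, SystemKey, MetricCode); the presentation view pivots
-- selected metrics into columns for a single-row system-state snapshot.
CREATE VIEW dbo.vw_SystemStateSnapshot AS
SELECT
    f.SnapshotDateKey,
    f.SystemKey,
    MAX(CASE WHEN f.MetricCode = 'CPU_PCT'   THEN f.MetricValue END) AS CpuPct,
    MAX(CASE WHEN f.MetricCode = 'MEM_PCT'   THEN f.MetricValue END) AS MemPct,
    MAX(CASE WHEN f.MetricCode = 'QUEUE_LEN' THEN f.MetricValue END) AS QueueLen
FROM dbo.FactSystemMetric AS f
GROUP BY f.SnapshotDateKey, f.SystemKey;
```

Because the wide shape lives only in a view, the base fact stays narrow and the pivot can be widened later without reloading data.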
aristo- Posts : 2
Join date : 2012-01-26
Re: Fact with large columns, 320.
aristo, can you clarify that a bit?
aristo wrote:
You could split the fact by section but then you would need to move into the no-no area of joining fact tables together to get a single view of your data.
It sounds like you have a list of system-state metrics that you're pivoting from rows to columns. Why would that require a fact-to-fact table join?
Re: Fact with large columns, 320.
ykud wrote:
aristo, can you clarify that a bit?
aristo wrote:
You could split the fact by section but then you would need to move into the no-no area of joining fact tables together to get a single view of your data.
It sounds like you have a list of system-state metrics that you're pivoting from rows to columns. Why would that require a fact-to-fact table join?
Apologies for the late reply.
Apologies also for mixing up concepts. In the logical model there is only one fact, which is represented by several tables in the physical model. Think of partitioning the fact table both horizontally and vertically; it is only needed in some implementations. Horizontal (row) partitioning is available natively in most major DW platforms. For vertical partitioning, I tend to create several tables that share the same PK. This makes it easier to add more columns in the future, as you can get away with updating only part of the fact. So the main reason, I guess, is ETL. There may also be some benefit in letting the optimiser parallelise more easily, but that depends on the implementation.
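A minimal sketch of the shared-PK approach, with hypothetical table and column names (the grain, measures, and naming are illustrative, not aristo's real design): two physical tables at the same grain, reassembled into one logical fact by a view.

```sql
-- Hypothetical names. The logical fact is split into two physical tables
-- that share the same grain and primary key; a view reassembles them.
CREATE TABLE dbo.FactSales_Core (
    DateKey     INT NOT NULL,
    ProductKey  INT NOT NULL,
    SalesAmount DECIMAL(18,2),
    SalesQty    INT,
    CONSTRAINT PK_FactSales_Core PRIMARY KEY (DateKey, ProductKey)
);

CREATE TABLE dbo.FactSales_Costs (
    DateKey      INT NOT NULL,
    ProductKey   INT NOT NULL,
    FreightCost  DECIMAL(18,2),
    HandlingCost DECIMAL(18,2),
    CONSTRAINT PK_FactSales_Costs PRIMARY KEY (DateKey, ProductKey)
);

-- ETL can load or reload each table independently; the cube (or any
-- query) sees one logical fact through the view.
CREATE VIEW dbo.vw_FactSales AS
SELECT c.DateKey, c.ProductKey,
       c.SalesAmount, c.SalesQty,
       x.FreightCost, x.HandlingCost
FROM dbo.FactSales_Core  AS c
JOIN dbo.FactSales_Costs AS x
  ON x.DateKey = c.DateKey AND x.ProductKey = c.ProductKey;
```

Adding a new group of measures then means adding a third table with the same PK and extending the view, rather than altering and reloading one very wide table.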
I hope this helps.
aristo- Posts : 2
Join date : 2012-01-26
Re: Fact with large columns, 320.
Hi vipinkrsharma,
Coming back to your original question, your current model does sound a bit out of balance to me. If I understand you right, you have 300+ measures by 11 dimensions? I'd be expecting a ratio of 5:1 at most (not roughly 30:1). I generally end up with lower ratios e.g. 2:1. To be clear, I'm counting each instance of role-playing dimensions (Cube Dimensions in SSAS-speak).
This isn't a design objective per se, but it just seems to be how data naturally models for SSAS once you understand both the data and SSAS fully.
So I'd be looking for repeated groups of measures, e.g. Pending Customer Count, Open Customer Count, Closed Customer Count. Sometimes you can rationalise these by adding new dimensions or attributes (in this example, perhaps a Customer Status dimension with values of Pending, Open, or Closed).
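As a sketch of that rationalisation, with hypothetical table and column names (the Customer Status example only, not the poster's real schema): three repeated count measures collapse into one measure plus a status dimension key, using T-SQL's UNPIVOT to migrate the wide data.

```sql
-- Hypothetical names. Before: FactCustomer(DateKey, PendingCount,
-- OpenCount, ClosedCount). After: one measure keyed by a status dimension.
CREATE TABLE dbo.DimCustomerStatus (
    CustomerStatusKey INT PRIMARY KEY,
    StatusName        VARCHAR(20)   -- 'Pending', 'Open', 'Closed'
);

CREATE TABLE dbo.FactCustomerCount (
    DateKey           INT NOT NULL,
    CustomerStatusKey INT NOT NULL
        REFERENCES dbo.DimCustomerStatus (CustomerStatusKey),
    CustomerCount     INT NOT NULL
);

-- Populate the narrow fact from the wide one with UNPIVOT; StatusCol
-- receives the source column name, which is mapped to a status key.
INSERT INTO dbo.FactCustomerCount (DateKey, CustomerStatusKey, CustomerCount)
SELECT u.DateKey, s.CustomerStatusKey, u.CustomerCount
FROM dbo.FactCustomer
UNPIVOT (CustomerCount FOR StatusCol
         IN (PendingCount, OpenCount, ClosedCount)) AS u
JOIN dbo.DimCustomerStatus AS s
  ON s.StatusName = CASE u.StatusCol
                      WHEN 'PendingCount' THEN 'Pending'
                      WHEN 'OpenCount'    THEN 'Open'
                      WHEN 'ClosedCount'  THEN 'Closed'
                    END;
```

Each group of N repeated measures rationalised this way removes N columns from the fact in exchange for one dimension key, which is usually a much better fit for SSAS.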
Another thing to look for would be sparse measures across the rows of fact data. You may find many rows (e.g. of a certain category, or from a certain source) that have null values for many measures. These could be candidates for splitting into a separate fact, allowing you to model the data more naturally (with just the relevant measures in each fact).
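Sparsity is easy to measure before deciding what to split out. A quick profiling query along these lines (column names are hypothetical stand-ins for the poster's 300+ measures) reports the NULL percentage per measure; measures that are NULL on most rows are the natural candidates for a separate, narrower fact.

```sql
-- Hypothetical names. Profile measure sparsity: the share of NULLs per
-- measure column suggests candidates for a separate fact table.
SELECT
    COUNT(*) AS TotalRows,
    100.0 * SUM(CASE WHEN ReturnAmount   IS NULL THEN 1 ELSE 0 END)
          / COUNT(*) AS ReturnAmountNullPct,
    100.0 * SUM(CASE WHEN DiscountAmount IS NULL THEN 1 ELSE 0 END)
          / COUNT(*) AS DiscountAmountNullPct
FROM dbo.FactSales;
-- Measures that are NULL on most rows (e.g. return-related amounts on
-- non-return transactions) can move to a narrower fact at its own grain.
```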
Good luck!
Mike
Similar topics
» very large/wide fact table considerations?
» Partitioning Large MS SQL-Server FACT table
» Too many columns in fact table
» SK and ID columns in a Fact table
» Disadvantage of Fact table with 44 columns