Dimensions in fact table
+3
Jeff Smith
ngalemmo
fico
7 posters
Page 1 of 1
Dimensions in fact table
Hello,
I'm working definining a dimensional model in a bank. We have Teradata as DB and MicroStrategy as BI tool.
We have many fact tables and most of the fact table lowest level is the "Contract" dimension.
We have many other dimensions that have a relation with "Contract" like "Customer", "Product", "Sales Manager", "Branch",...
They have more than 15 million customers and 400 million Contracts. We have to keep 13 months of history in the fact tables. For example the Balance fact have more than 5200 millon rows.
I want to define a star model with the Balance Fact table but I have a doubt about the dimensions that I should include in this fact table. I have to different approach:
Approach 1: Define the fact table with dimensions Contract, Customer, Product, Sales Manager and Month. and not define any relation between the dimensions.
Approach 2: Define the fact table with only the Contract dimension, and define the relations between the other dimension tables as a hierarchy in MicroStrategy. The model will be more a snowflake than a star.
Sometime my customer wants to query for example how many customers have a contract of type "X" and I don't want to go throw the huge fact table to find the relation between contracts and customers.
Any help will be really apreciated.
Best Regards,
Fico
I'm working definining a dimensional model in a bank. We have Teradata as DB and MicroStrategy as BI tool.
We have many fact tables and most of the fact table lowest level is the "Contract" dimension.
We have many other dimensions that have a relation with "Contract" like "Customer", "Product", "Sales Manager", "Branch",...
They have more than 15 million customers and 400 million Contracts. We have to keep 13 months of history in the fact tables. For example the Balance fact have more than 5200 millon rows.
I want to define a star model with the Balance Fact table but I have a doubt about the dimensions that I should include in this fact table. I have to different approach:
Approach 1: Define the fact table with dimensions Contract, Customer, Product, Sales Manager and Month. and not define any relation between the dimensions.
Approach 2: Define the fact table with only the Contract dimension, and define the relations between the other dimension tables as a hierarchy in MicroStrategy. The model will be more a snowflake than a star.
Sometime my customer wants to query for example how many customers have a contract of type "X" and I don't want to go throw the huge fact table to find the relation between contracts and customers.
Any help will be really apreciated.
Best Regards,
Fico
fico- Posts : 3
Join date : 2010-09-12
Re: Dimensions in fact table
Approach 1. However the comment, "not define any relation between the dimensions" is somewhat incorrect since the fact table itself implies relationships between the dimensions.
You may want to consider additional dimensions, breaking down the 'contract' into smaller dimensions to eliminate the huge contract dimension itself.
You may want to consider additional dimensions, breaking down the 'contract' into smaller dimensions to eliminate the huge contract dimension itself.
Re: Dimensions in fact table
Does Contract need to be a dimension table or can the Contract identifier be a degenerate dimension? In place of the Contract, you could include the Contract Type as a dimension or as part of a junk dimension.
If Contract is the entity that pulls everything together, then it sounds to me that it is the fact.
If Contract is the entity that pulls everything together, then it sounds to me that it is the fact.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: Dimensions in fact table
I assume you are creating these as views since you're using Teradata?
Why not try approach 1, as well as another factless fact table for the relationship between customer and contract?
Why not try approach 1, as well as another factless fact table for the relationship between customer and contract?
Re: Dimensions in fact table
Thinking along the line that ngalemmo suggested, eliminating the massive contract dimension and introducing mini-dimensions in your star schema is a sensible compromise for the benefit of significant performance gain when dealing with Very Large Tables (VLT). You may have your schema built at aggregate levels on both dimensions and facts to give you quick high level analytic views and also allow navigations into lower level details. Querying against 5200 million rows for any aggregation on the fly is challenging in terms of performance whereas retrieving recalculated aggregates against a few million summarized rows would be a matter of seconds.
I would also create a mini-dimension for customer dimension to include all the low cardinality but analytically useful attributes, say demographic dimension, so that you could also build the aggregate fact related only by much smaller mini-dimensions.
Stick to star schema as a general guideline for dimensional modeling, but snowflake sensibly in certain circumstances. For instance you could snowflake the customer dimension by including FK of demographic dimension (outrigger) so that you can have the relationship without going through massive fact table.
I would also create a mini-dimension for customer dimension to include all the low cardinality but analytically useful attributes, say demographic dimension, so that you could also build the aggregate fact related only by much smaller mini-dimensions.
Stick to star schema as a general guideline for dimensional modeling, but snowflake sensibly in certain circumstances. For instance you could snowflake the customer dimension by including FK of demographic dimension (outrigger) so that you can have the relationship without going through massive fact table.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Dimensions in fact table
Hi,
In a banking DW Contract and Customer have a very special role, they are dimensions and in some sence facts also.
I designed the following DW structure concerning your objects:
Contract
Contract Id
Customer Id (FK)
Product Id (FK)
Branch Id (FK)
Manager Id (FK)
Start date
End date etc.
Customer, Product, Branch, Manager etc dimensions
Date dimension
Balances fact table
Id
Contract Id
Date Id
Various balances etc.
When your customer wants to query how many customers have a contract of product "X" then you will get the info from your Contract table .
So the model is snowflake like.
I just miss from the model the Account object.
In a banking DW Contract and Customer have a very special role, they are dimensions and in some sence facts also.
I designed the following DW structure concerning your objects:
Contract
Contract Id
Customer Id (FK)
Product Id (FK)
Branch Id (FK)
Manager Id (FK)
Start date
End date etc.
Customer, Product, Branch, Manager etc dimensions
Date dimension
Balances fact table
Id
Contract Id
Date Id
Various balances etc.
When your customer wants to query how many customers have a contract of product "X" then you will get the info from your Contract table .
So the model is snowflake like.
I just miss from the model the Account object.
gvarga- Posts : 43
Join date : 2010-12-15
Re: Dimensions in fact table
i will go with approach 2..
it doesnt make sense to query facts to identify the relationship b/w contract and dimension... heirarchies shuld be nice.
it doesnt make sense to query facts to identify the relationship b/w contract and dimension... heirarchies shuld be nice.
sgudavalli- Posts : 29
Join date : 2010-06-10
Age : 40
Location : Pune, India
Re: Dimensions in fact table
Approach 2 is a terrible choice. It is not thinking dimensionally and the query performance would be absolutely horrendous.
The idea of a contract dimension, with a near 1:1 relationship to the fact is ER thinking. The idea that you have a business key (i.e. the contract ID) that identifies an entity and that attributes directly relating to that business key must exist in that entity is, plain and simple, 3NF.
Dimensional thinking is that you have a business event (customer enters into a contract) or a business state (customer has a contract) and you represent it with a fact table. You surround that fact with context... aka dimensions. Now, the particular contract is a context, but it is far to broad to be usable as a dimension, so you break it down into smaller contexts. What product, terms, conditions, etc... Each smaller context becomes a dimension and appear as FK references in the fact table. With the smaller scope, each dimension is much, much smaller that the combined (contract) dimension. The result is much faster and simpler queries, and a much smaller database overall. Of course, you retain the contract ID, but it becomes a degenerate dimension in the fact table and a side note in terms of business analysis.
The idea of a contract dimension, with a near 1:1 relationship to the fact is ER thinking. The idea that you have a business key (i.e. the contract ID) that identifies an entity and that attributes directly relating to that business key must exist in that entity is, plain and simple, 3NF.
Dimensional thinking is that you have a business event (customer enters into a contract) or a business state (customer has a contract) and you represent it with a fact table. You surround that fact with context... aka dimensions. Now, the particular contract is a context, but it is far to broad to be usable as a dimension, so you break it down into smaller contexts. What product, terms, conditions, etc... Each smaller context becomes a dimension and appear as FK references in the fact table. With the smaller scope, each dimension is much, much smaller that the combined (contract) dimension. The result is much faster and simpler queries, and a much smaller database overall. Of course, you retain the contract ID, but it becomes a degenerate dimension in the fact table and a side note in terms of business analysis.
Re: Dimensions in fact table
thnks for a perfect explanation.
i see contract as a dimension not sure if its degenerate in nature. and i see snow flakes with contract to the rest of dimensions.
regards
shiv
i see contract as a dimension not sure if its degenerate in nature. and i see snow flakes with contract to the rest of dimensions.
regards
shiv
sgudavalli- Posts : 29
Join date : 2010-06-10
Age : 40
Location : Pune, India
Re: Dimensions in fact table
Don't snowflake. The contract ID is degenerate.
If you need to know the state of a contract, implement it using a factless fact table... a simple row with all the FK's referencing the components of the contract.
If you need to know the state of a contract, implement it using a factless fact table... a simple row with all the FK's referencing the components of the contract.
Similar topics
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Number of Dimensions around a FACT Table.
» Relating dimensions together and fact table design
» should I connect the dimensions or the bridge to fact table?
» Allocating Header Level FACT table Dimensions FKs into Line Level Fact
» Number of Dimensions around a FACT Table.
» Relating dimensions together and fact table design
» should I connect the dimensions or the bridge to fact table?
» Allocating Header Level FACT table Dimensions FKs into Line Level Fact
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum