Question on joining dimensions to avoid possible snowflake
4 posters
Page 1 of 1
Question on joining dimensions to avoid possible snowflake
I have A CostCenter dimension and Division Dimension that both have keys in an "ClaimPayment" fact table.
In the OLTP system Each cost center has a division. In a previous thread a reply recommended splitting cost center and division out. My only problem now is that I cannot tell which cost centers are in which divisions without going to a fact table. And so unless each cost center has a key in a fact table I cannot tell the division it belongs to.
I am contemplating either putting a FK from Division to cost center or moving the Division Data back into the Cost Center dimension.
Any suggestions on which one of the two would be better. I was trying to avoid snowflaking so I figured the FK solution would not be good.
Thanks,
In the OLTP system Each cost center has a division. In a previous thread a reply recommended splitting cost center and division out. My only problem now is that I cannot tell which cost centers are in which divisions without going to a fact table. And so unless each cost center has a key in a fact table I cannot tell the division it belongs to.
I am contemplating either putting a FK from Division to cost center or moving the Division Data back into the Cost Center dimension.
Any suggestions on which one of the two would be better. I was trying to avoid snowflaking so I figured the FK solution would not be good.
Thanks,
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
If each cost center belongs to a division, then having the division attributes in the CostCenter dimension seems reasonable (and sounds like a good denormalized dimension.)
The only reason to keep them separate would be if you had other fact data at the division level. If you did keep them separate, you could either snowflake or create a factless fact table to store the "belongs to" relationship. But a single dimension table is the better design choice if there are no requirements forcing you otherwise!
The only reason to keep them separate would be if you had other fact data at the division level. If you did keep them separate, you could either snowflake or create a factless fact table to store the "belongs to" relationship. But a single dimension table is the better design choice if there are no requirements forcing you otherwise!
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Question on joining dimensions to avoid possible snowflake
In your last sentence I assume you meant "single CostCenter" rather than division.
You brought up a good point, There are certain reports I have to create that aggregate up to the division but I think could still do that in the single costcenter dimension, especially with SSAS and creating a User Hierarchy.
Assuming I went the other route and decided to snowflake the dimension but still keep a key directly to the fact table would that lead to reduced performance ? From a complexity standpoint it does seem to be more complex.
You brought up a good point, There are certain reports I have to create that aggregate up to the division but I think could still do that in the single costcenter dimension, especially with SSAS and creating a User Hierarchy.
Assuming I went the other route and decided to snowflake the dimension but still keep a key directly to the fact table would that lead to reduced performance ? From a complexity standpoint it does seem to be more complex.
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
One de-normalised CostCenter dimension sounds a better option, especially when you report off SSAS cube. The Division-CostCenter hierarchy should normally be maintained within a single dimension, unless multivalued attributes are involved in which you don't have choice. So even if I snowflake CostCenter by Division and the changes can be ignored, I would not have both FKs in the fact table which may create two channels to the same hierarchy, a confusion to user.
The dimension driven hierarchy can give you a full picture about the dimension relationship independent of facts, while fact driven correlations may only give you part of the picture as many dimension members may not appear in the fact at all.
One reason you may consider having two FKs in the fact table, for CostCenter and Dimension, is when you are dealing with two SCD 2 dimensions, in which case FKs in fact table give you correlation at point in time whereas the snowflake between dimensions give current relationship.
The dimension driven hierarchy can give you a full picture about the dimension relationship independent of facts, while fact driven correlations may only give you part of the picture as many dimension members may not appear in the fact at all.
One reason you may consider having two FKs in the fact table, for CostCenter and Dimension, is when you are dealing with two SCD 2 dimensions, in which case FKs in fact table give you correlation at point in time whereas the snowflake between dimensions give current relationship.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Question on joining dimensions to avoid possible snowflake
It appears switching to the single cost center dimension will best fit my requirements. I won't need to bring the PK from the dimension table so having two FK's in the fact table will be unnecessary.
It is a Type 2 SCD but I will only care about Divisionname. DivsionShortName and DivisionRateGoal as attribute changes that will cause a new row to be created.
Thanks for the feedback its been very helpful.
It is a Type 2 SCD but I will only care about Divisionname. DivsionShortName and DivisionRateGoal as attribute changes that will cause a new row to be created.
Thanks for the feedback its been very helpful.
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
I generally just have an 'organizational unit' dimension to hold cost centers, divisions and stuff like that. Usually all they are are hierachical units with IDs and names that the business uses to group things. Sometimes they don't even have designations... not quite a cost center or a division or whatever... just levels in the hierarchy.
I would include a designation code in the dimension if the org unit has a particular label and manage rollups using a hierarchy bridge. When a specific designation is used in the fact, such as cost center, I would have a role named FK referencing the org unit.
I would include a designation code in the dimension if the org unit has a particular label and manage rollups using a hierarchy bridge. When a specific designation is used in the fact, such as cost center, I would have a role named FK referencing the org unit.
Re: Question on joining dimensions to avoid possible snowflake
The designation code idea sounds very siimilar to the "CostCenter" number that I have included in the dimension. I have created a BridgeTable to handle the recursive hierarchies. I'm assuming that the FK from the Bridge table will be FK'd into any fact table that needs a reference from to the CostCenterDivision dimension correct.
Can you elaborate a bit on the idea of a "role" named FK to the organization until ?
Thank you,
Can you elaborate a bit on the idea of a "role" named FK to the organization until ?
Thank you,
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
I have one other delimma that I forgot to mention.
While currently having Division and Cost Center separate dimensions, I had Division Goals which change yearly in the division dimension.
Since the company only has less than 10 divisions, i was adding 10 new division records each year for the new yearly goals.
By moving Division and Cost Center into one dimension, I either have to carry the Division key into that single dimension and Keep Division Goal a separate table, or I bring the division Goal information into the same dimension as well.
By doing this however, year cost center records would grow much faster over 10 years for example there would be about 100,000 records as opposed to maybe 2000 or so depending on any attribute changes since this is a SCD. I don't see this being as big problem but wondered if there is anything that I am not considering which could become problematic down the road.
Thanks,
While currently having Division and Cost Center separate dimensions, I had Division Goals which change yearly in the division dimension.
Since the company only has less than 10 divisions, i was adding 10 new division records each year for the new yearly goals.
By moving Division and Cost Center into one dimension, I either have to carry the Division key into that single dimension and Keep Division Goal a separate table, or I bring the division Goal information into the same dimension as well.
By doing this however, year cost center records would grow much faster over 10 years for example there would be about 100,000 records as opposed to maybe 2000 or so depending on any attribute changes since this is a SCD. I don't see this being as big problem but wondered if there is anything that I am not considering which could become problematic down the road.
Thanks,
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
What about putting the division goals into a fact table? Reporting on goals vs. actual would then require drilling across facts. And of course this would introduce having a fact that points to a division, not a specific cost center.
Based on recent discussions, you could either
(a.) snowflake your Cost Center and Division (there we go again!)
(b.) designate certain rows in your denormalized dimension to represent the Division (ex: first Cost Center in each Division)
(c.) add special rows in your denormalized dimension to represent a Division
I think SSAS will handle any of these approaches equally well, and of course it provides good support for specifying the granularity at which a fact relates to a dimension.
Based on recent discussions, you could either
(a.) snowflake your Cost Center and Division (there we go again!)
(b.) designate certain rows in your denormalized dimension to represent the Division (ex: first Cost Center in each Division)
(c.) add special rows in your denormalized dimension to represent a Division
I think SSAS will handle any of these approaches equally well, and of course it provides good support for specifying the granularity at which a fact relates to a dimension.
Last edited by VHF on Mon Aug 15, 2011 12:02 pm; edited 2 times in total (Reason for editing : added comment about SSAS)
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Question on joining dimensions to avoid possible snowflake
It think I'm going to try the following schema for one single dimension:
CostCenterKey(PK)
CostCenterId (NK) for glue to track SCD
CostCenterNumber,
CorporateName,
DisplayName,
IsActive,
DivisionName,
DivisionShortName,
DivisionPrimaryCostCenterNumber,
DivisionSortOrder,
DivisionGoalFiscalYear,
DivisionGoalPreventableVehicularAccidentRate,
DivisionGoalOshaAccidentRate,
DivisionGoalLostTimeAccidentRate,
EffectiveDate,
ExpirationDate
Good point on SSAS being able to equally handle any of the solutions rather equally. I want to try and avoid snowflaking and drilling across facts if possible.
Thanks for presenting those options.
CostCenterKey(PK)
CostCenterId (NK) for glue to track SCD
CostCenterNumber,
CorporateName,
DisplayName,
IsActive,
DivisionName,
DivisionShortName,
DivisionPrimaryCostCenterNumber,
DivisionSortOrder,
DivisionGoalFiscalYear,
DivisionGoalPreventableVehicularAccidentRate,
DivisionGoalOshaAccidentRate,
DivisionGoalLostTimeAccidentRate,
EffectiveDate,
ExpirationDate
Good point on SSAS being able to equally handle any of the solutions rather equally. I want to try and avoid snowflaking and drilling across facts if possible.
Thanks for presenting those options.
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
I think goal related attributes should belong to a fact table at division and yearly level. You then need a shrunken dimension for division to be referenced by goal fact table. The division dimension should be built based on de-normalised CostCenter dimension with its own surrogate key so that the two dimensions are conformed.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Question on joining dimensions to avoid possible snowflake
I thought about putting goal attributes in a fact table but then I have the division and cost center either separated out again with a foreign key from division to cost center which leads to snowflaking. None of the yearly goal information is ever aggregated, its only used in a side by comparison with measures from another Fact table that I have for injuries and accidents.
However maybe snowflaking is the best way but for now I think im gong to try the single dimension path.
However maybe snowflaking is the best way but for now I think im gong to try the single dimension path.
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
No snowflake! Use de-normalised dimension with all attributes for CostCenter and Division. Create another SHRUNKEN/AGGREGATE dimension with only division specific attributes for the goal fact table.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Question on joining dimensions to avoid possible snowflake
The problem that some of us have bumped into with shrunken dimensions is that some BI tools don't understand that a shrunken dimension is a conformed subset.
Trying to get it to work in a Business Objects (BObj) universe was problematic enough that I am introducing some limited snowflaking/outriggers in my DW.
And I think the same issue exists in SSAS... it understands denormalized star schema dimensions, it understands normalized snowlfake dimensions, (and it has good support for specifying the granularity at which a fact relates to a dimension), but I don't know how one would define the relationship between a dimension and conformed shrunken dimension. They end up being two separate unrelated dimensions in the cube.
Any SSAS experts please chime in if you know how to handle this!
Trying to get it to work in a Business Objects (BObj) universe was problematic enough that I am introducing some limited snowflaking/outriggers in my DW.
And I think the same issue exists in SSAS... it understands denormalized star schema dimensions, it understands normalized snowlfake dimensions, (and it has good support for specifying the granularity at which a fact relates to a dimension), but I don't know how one would define the relationship between a dimension and conformed shrunken dimension. They end up being two separate unrelated dimensions in the cube.
Any SSAS experts please chime in if you know how to handle this!
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: Question on joining dimensions to avoid possible snowflake
Division goals definitly belong in a fact table. It's like a budget. Creating new rows in a dimension does not make sense.
As far as role goes, say you had an OrganizationUnit dimension with OrgUnitKey as the PK, if you have a fact who's FK specifically references a cost center, you would name the FK CostCenterOrgUnitKey, appending the name with the specific entity it is referencing. If the FK references any org unit, you would just name it OrgUnitKey without the role suffix.
This leave the org unit dimension to nothing more than:
OrgUnitKey
OrgUnitID
OrgUnitType (code indicating cost center, division, etc...)
CorporateName
DisplayName
..other attributes as necessary
It makes things much simpler.
As far as role goes, say you had an OrganizationUnit dimension with OrgUnitKey as the PK, if you have a fact who's FK specifically references a cost center, you would name the FK CostCenterOrgUnitKey, appending the name with the specific entity it is referencing. If the FK references any org unit, you would just name it OrgUnitKey without the role suffix.
This leave the org unit dimension to nothing more than:
OrgUnitKey
OrgUnitID
OrgUnitType (code indicating cost center, division, etc...)
CorporateName
DisplayName
..other attributes as necessary
It makes things much simpler.
Re: Question on joining dimensions to avoid possible snowflake
So I am a bit confused then either I have a smaller division dimension on top of the denormalized CostCenterDivisionDimension or I have to snowflake ?
Here is what I make of this discussion so far and maybe I am missing something
FactDivisionGoal
DivisionKey
OshaRate
LTARate
FiscalYear (Maybe make FK dateKey)
SmallerDivisionDimension to reference FactGoal ?
FactInjury
CostCenterDivisionKey
InjuredDayCount
OshaDayCount
OtherMeasures..etc
Denormalized CostCenterDivision
costcenterName
Divisionname
etc.....
Or am I missing something here. If I merge CostCenterDivision then either I snowflack or make another smaller dimension as Hang Mentioned.
Thanks,
Here is what I make of this discussion so far and maybe I am missing something
FactDivisionGoal
DivisionKey
OshaRate
LTARate
FiscalYear (Maybe make FK dateKey)
SmallerDivisionDimension to reference FactGoal ?
FactInjury
CostCenterDivisionKey
InjuredDayCount
OshaDayCount
OtherMeasures..etc
Denormalized CostCenterDivision
costcenterName
Divisionname
etc.....
Or am I missing something here. If I merge CostCenterDivision then either I snowflack or make another smaller dimension as Hang Mentioned.
Thanks,
mru22- Posts : 34
Join date : 2011-06-14
Re: Question on joining dimensions to avoid possible snowflake
VHF wrote:The problem that some of us have bumped into with shrunken dimensions is that some BI tools don't understand that a shrunken dimension is a conformed subset.
Trying to get it to work in a Business Objects (BObj) universe was problematic enough that I am introducing some limited snowflaking/outriggers in my DW.
And I think the same issue exists in SSAS... it understands denormalized star schema dimensions, it understands normalized snowlfake dimensions, (and it has good support for specifying the granularity at which a fact relates to a dimension), but I don't know how one would define the relationship between a dimension and conformed shrunken dimension. They end up being two separate unrelated dimensions in the cube.
Any SSAS experts please chime in if you know how to handle this!
Since I am still relatively new to Data Warehousing and am reading Kimball as I go, I now see that maybe what I need is to treat this an an outrigger Dimension as described in his book so I am only one level deeper. I think like you VHF I too might be introducing some limited Outriggers but for now it seems to work ok and will allow me to create the Fact table for the Division Goals as ngalemmo recommended. .
I was looking at the conformed dimension option as recommend earlier and wondered how this is any more beneficial than an outrigger they both seem to introduce snowflaking if i need to match up Injury with DivisionGoals.
This feedback has been very helpful.
Thank,
mru22- Posts : 34
Join date : 2011-06-14
Similar topics
» Joining facts through conforming dimensions
» Using larger data type to avoid sum overflows?
» Question about Dimensions with SKeys to join with other dimensions.
» TWO FACTS JOINing?
» How to avoid snowflaking?
» Using larger data type to avoid sum overflows?
» Question about Dimensions with SKeys to join with other dimensions.
» TWO FACTS JOINing?
» How to avoid snowflaking?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum