Snowflake or Star Schema?
3 posters
Page 1 of 1
Snowflake or Star Schema?
Hi,
I am doing some dimensional modelling for a "not for profit" Fundraising Company
They solicit Funds from the WEB, Direct Mail and Phone Solitications, etc and all Transactions are stamped with the FundID.
The legacy system has all the "smarts" of the fund code in 1 column (WEB1009RA, DM1009PA)
Where:
WEB = CampaignType
1009 = AccountingPeriod (i.e., September 2010)
R = CampaignSubType (Retention, Prospect)
A = CampaignSegment
In the new dimensional Model, I want to create 4 new dimension tables
d_CampaignType
d_AccountingPeriod
d_CampaignSubType
d_CampaignSegment
Consider the following Operational Data Model
=============================================
Fact Table
RevenueTransaction (3 million records)
Transaction ID (PK)
FundID (FK to Fund)
CurrencyAmount
Dimension Table
Fund (1000 records)
FundID (PK)
FundCode (WEB1009RA)
CampaignID (FK to Campaign)
Dimension Table
Campaign (300 records)
CampaignID (PK)
CampaignName
My question is, in the Dimensional Model is it better to snowflake this model into 7 tables as follows
Option 1 (Snowflake)
====================
f_RevenueTransaction (links to d_Fund on FundID)
d_Fund links to d_Campaign
d_Fund links to d_CampaignType
d_Fund links to d_AccountingPeriod
d_Fund linkt to d_CampaignSubType
d_Fund links to d_CampaignSegment
Or is it better to move all the FKs of the fund dimension to the fact table...
Option 2 (Star Schema)
======================
f_RevenueTransaction
TransactionID (PK - Surrogate)
OperationalTransactionID
FundID (links to d_Fund)
CampaignTypeID (links to d_CampaignType)
AccountingPeriodID (links to d_AccountingPeriod)
CampaignSubTypeID (links to d_CampaignSubType)
CampaignSegmentID (links to d_CampaignSegment)
I am leaning towards Option 2. I think the queries would run faster.
Any opinions/suggestions?
Hopefully the explanation above makes sense.
Thanks,
Garry
I am doing some dimensional modelling for a "not for profit" Fundraising Company
They solicit Funds from the WEB, Direct Mail and Phone Solitications, etc and all Transactions are stamped with the FundID.
The legacy system has all the "smarts" of the fund code in 1 column (WEB1009RA, DM1009PA)
Where:
WEB = CampaignType
1009 = AccountingPeriod (i.e., September 2010)
R = CampaignSubType (Retention, Prospect)
A = CampaignSegment
In the new dimensional Model, I want to create 4 new dimension tables
d_CampaignType
d_AccountingPeriod
d_CampaignSubType
d_CampaignSegment
Consider the following Operational Data Model
=============================================
Fact Table
RevenueTransaction (3 million records)
Transaction ID (PK)
FundID (FK to Fund)
CurrencyAmount
Dimension Table
Fund (1000 records)
FundID (PK)
FundCode (WEB1009RA)
CampaignID (FK to Campaign)
Dimension Table
Campaign (300 records)
CampaignID (PK)
CampaignName
My question is, in the Dimensional Model is it better to snowflake this model into 7 tables as follows
Option 1 (Snowflake)
====================
f_RevenueTransaction (links to d_Fund on FundID)
d_Fund links to d_Campaign
d_Fund links to d_CampaignType
d_Fund links to d_AccountingPeriod
d_Fund linkt to d_CampaignSubType
d_Fund links to d_CampaignSegment
Or is it better to move all the FKs of the fund dimension to the fact table...
Option 2 (Star Schema)
======================
f_RevenueTransaction
TransactionID (PK - Surrogate)
OperationalTransactionID
FundID (links to d_Fund)
CampaignTypeID (links to d_CampaignType)
AccountingPeriodID (links to d_AccountingPeriod)
CampaignSubTypeID (links to d_CampaignSubType)
CampaignSegmentID (links to d_CampaignSegment)
I am leaning towards Option 2. I think the queries would run faster.
Any opinions/suggestions?
Hopefully the explanation above makes sense.
Thanks,
Garry
garrywh- Posts : 2
Join date : 2010-08-30
Re: Snowflake or Star Schema?
I like option 2 better as well.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Snowflake or Star Schema?
Thanks,
I'll benchmark both models. The space requirements are a little more for option 2 (4 new 4 byte integer fields added to the fact table (16 bytes * 3 million = extra 45 MB)), but it will be direct JOINS from the fact table dimension tables and easier for the user to navigate...
I put together a visual to show the Options more clearly...
www.garry.ca/SampleDimensionalModel.pdf
Garry
I'll benchmark both models. The space requirements are a little more for option 2 (4 new 4 byte integer fields added to the fact table (16 bytes * 3 million = extra 45 MB)), but it will be direct JOINS from the fact table dimension tables and easier for the user to navigate...
I put together a visual to show the Options more clearly...
www.garry.ca/SampleDimensionalModel.pdf
Garry
garrywh- Posts : 2
Join date : 2010-08-30
Re: Snowflake or Star Schema?
I'd also put the fund id as a degenerate dimension in the fact.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Snowflake or Star Schema?
garrywh wrote:Thanks,
I'll benchmark both models. The space requirements are a little more for option 2 (4 new 4 byte integer fields added to the fact table (16 bytes * 3 million = extra 45 MB)), but it will be direct JOINS from the fact table dimension tables and easier for the user to navigate...
I put together a visual to show the Options more clearly...
www.garry.ca/SampleDimensionalModel.pdf
Garry
I'll chip in $1.00 to cover the cost of the extra disk...
If it is a choice between hardware and improvement in clarity, useablilty, and utility, the latter wins hands down.
Similar topics
» Only way to pull data from star/snowflake schema is by using facts?
» Converting Snowflake to Star
» Star vs Snowflake with many different attributes
» Can Snowflake schema be used in OLTP?
» Customer addresses in a high volume retail environment
» Converting Snowflake to Star
» Star vs Snowflake with many different attributes
» Can Snowflake schema be used in OLTP?
» Customer addresses in a high volume retail environment
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum