Star Schema put to test!
2 posters
Page 1 of 1
Star Schema put to test!
I am trying to model the following:
My fact table contains a row for every game played on a machine. Every game played is attached to a PayID with attributes. Every game played could also contribute to a prize pool with its own attributes. This means that in addition to other dimensions I have a PayID dimension and a PrizePool dimension and these two dimensions have a many-to-many relationship which is perfectly fine because every fact row would have a PayID key and a PrizePoolID key (PrizePool dimension would have a key for "Does not contribute to a prize pool" records).
The problem is that it does not end there. These two dimensions when combined have additional attributes, meaning PayID 1 with PrizePool 2 has additinal attributes such as contribution rates. Also not all games played contribute to a pool. How would I go about designing that? Would I combine these two dimensions into one big dimension (what about the games that don't have a PrizePoolID)?
Thanks,
My fact table contains a row for every game played on a machine. Every game played is attached to a PayID with attributes. Every game played could also contribute to a prize pool with its own attributes. This means that in addition to other dimensions I have a PayID dimension and a PrizePool dimension and these two dimensions have a many-to-many relationship which is perfectly fine because every fact row would have a PayID key and a PrizePoolID key (PrizePool dimension would have a key for "Does not contribute to a prize pool" records).
The problem is that it does not end there. These two dimensions when combined have additional attributes, meaning PayID 1 with PrizePool 2 has additinal attributes such as contribution rates. Also not all games played contribute to a pool. How would I go about designing that? Would I combine these two dimensions into one big dimension (what about the games that don't have a PrizePoolID)?
Thanks,
arowshan- Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada
Re: Star Schema put to test!
If what you are saying is, the prize pool represents some form of progressive jackpot and that a play contributes some portion of the drop to the pool, then I would simply store the amount of the contribution as a measure in the fact. You could easily back into the contribution rate by dividing the contribution by the drop.
But if you do want to store the rate and some other attributes, place them into one or more junk dimensions and hang those dimensions off the fact. Do not try to somehow cross join pay ID and pool.
Also, you may not actually need a pay ID dimension. Pay ID may be stored as a degenerate and represent the attributes as other dimensions off the fact.
But if you do want to store the rate and some other attributes, place them into one or more junk dimensions and hang those dimensions off the fact. Do not try to somehow cross join pay ID and pool.
Also, you may not actually need a pay ID dimension. Pay ID may be stored as a degenerate and represent the attributes as other dimensions off the fact.
Re: Star Schema put to test!
Thanks ngalemmo. I have some additional questions on that:
So if I do want to store the rate and maybe other attributes in a junk dimension, wouldn't that be a the same size as the cross join of the Pay ID and Pool? Why would I do that? can you explain? When would you decide to use a junk dimension?
Also, you mention that Pay ID could be a degenerate dimension but it has other attributes. Why would I seperate the attributes in other dimensions?
Here is the actual structure of the Pay ID table:
PayID can be associated with multiple PayOut Set indexes which determine the percentate of pay. These payout sets have descriptive attributes. I am pretty sure this structure is based on an XML schema with Payout Sets as sub-elements of PayID. Also the PayID itself has a many to 1 relationship with a game theme which is really the game title. Here is an example:
Because of the fact that the PayOutSets don't mean anything on their own and are just indexes of the Pay ID. I was planning on creating one dimension for the table above. Also, I would include the Theme ID and other them attributes as well as the hierarchy above it in the same dimension such as Game Series which is a level above game theme.
So if I do want to store the rate and maybe other attributes in a junk dimension, wouldn't that be a the same size as the cross join of the Pay ID and Pool? Why would I do that? can you explain? When would you decide to use a junk dimension?
Also, you mention that Pay ID could be a degenerate dimension but it has other attributes. Why would I seperate the attributes in other dimensions?
Here is the actual structure of the Pay ID table:
PayID can be associated with multiple PayOut Set indexes which determine the percentate of pay. These payout sets have descriptive attributes. I am pretty sure this structure is based on an XML schema with Payout Sets as sub-elements of PayID. Also the PayID itself has a many to 1 relationship with a game theme which is really the game title. Here is an example:
- Code:
PayID ThemeID Theme PayOutSet PayOutDescription PayOutPercent
1000 44 "My Perfect Game" 1 "description 5" 95
1000 44 "My Perfect Game" 2 "description 6" 97
1000 44 "My Perfect Game" 3 "description 7" 98
2000 44 "My Perfect Game" 1 "description 10" 90
Because of the fact that the PayOutSets don't mean anything on their own and are just indexes of the Pay ID. I was planning on creating one dimension for the table above. Also, I would include the Theme ID and other them attributes as well as the hierarchy above it in the same dimension such as Game Series which is a level above game theme.
arowshan- Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada
Re: Star Schema put to test!
A payID represents a single play or a single machine, correct? If theme is one of the attributes of payID, why would you want to repeat the theme in the payID dimension rather than simply have a theme dimension off the fact? The latter would be much much smaller and perform better. Also with machines that offer multiple different games, relating the theme to the particular play (fact row) makes more sense.
The payout sets need to be treated as a multivalued dimension or maintained in a different fact table at a more detailed grain. Not sure which is better... it depends on what the business want's to do with the data.
The payout sets need to be treated as a multivalued dimension or maintained in a different fact table at a more detailed grain. Not sure which is better... it depends on what the business want's to do with the data.
Re: Star Schema put to test!
I thought Kimball suggests to collapse many-to-one relationships into one dimension. In this case PayID and Theme. Each game played is assigned a PayID and the payID is associated to a theme. So, why would I seperate that out? Isn't this the same as combining Store and Region into one dimension?
Also, I don't think payout set can be a multivalued dimension because each game played is only attached to one payout set. The payout set happens to be part of the PayID table, kind of like a composite key with PayID and payout set index.
Also, I don't think payout set can be a multivalued dimension because each game played is only attached to one payout set. The payout set happens to be part of the PayID table, kind of like a composite key with PayID and payout set index.
arowshan- Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada
Re: Star Schema put to test!
No, its not the same thing. Store does not have a 1:1 relationship with the facts. What you are attempting is equivalent to creating a purchase dimension table in a retail environment. It simply isn't done. You don't record characteristics of an event into a single dimension, you break it down into dimensions that have meaning to the business and can be reused. A single play has no meaning to anything else but the play. The fact that your source may have theme and pay ID on the same record, does not mean they are dependent on each other.
Re: Star Schema put to test!
I think I get your point. It is a bit different from your example of the purchase ID. We only have about 3000 PayID records (each of which have multiple payout sets). So, the play fact rows are not 1-1 with PayID's. However, I get your point that the collapsing of dimensions should only happen if there is a logical hierarchy. In other words, a theme is not logically broken down to PayID's where as a Region is logically broken down to Stores.
You did not answer my second quesion for payout sets. Since each fact record is associated with a PayID and a payoutset (index in the PayID object) and the Payout sets only mean anything if they go with the PayID, I think I have no choice but to combine them into one dimension to be able to include descriptive attributes such as Payout Set Description. A mulit-valued dimension would not work because only one of the payout sets applies.
Thanks again.
You did not answer my second quesion for payout sets. Since each fact record is associated with a PayID and a payoutset (index in the PayID object) and the Payout sets only mean anything if they go with the PayID, I think I have no choice but to combine them into one dimension to be able to include descriptive attributes such as Payout Set Description. A mulit-valued dimension would not work because only one of the payout sets applies.
Thanks again.
arowshan- Posts : 23
Join date : 2011-10-18
Location : Vancouver, Canada
Similar topics
» Help designing star schema
» Snowflake or Star Schema?
» Star Schema for MPP databases
» Star schema or using reference tables?
» Star Schema for Surgeries
» Snowflake or Star Schema?
» Star Schema for MPP databases
» Star schema or using reference tables?
» Star Schema for Surgeries
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum