is it ok to join two fact tables on a degenerate dimension?
3 posters
Page 1 of 1
is it ok to join two fact tables on a degenerate dimension?
The title question is kind of rhetorical because I didn't actually think it was a common practice but I was hoping someone could offer an alternative :).
Let's say I have a fact_service_provided table where each row describes a service that the business provided to a customer as show below.
fact_service_provided
Now, after the service has been provided Quality Control might take place on random services. And this process is modelled by a fact table called fact_quality_control_transaction, where each row represents an action that a quality control officer (these could be different people) carried out on a service. A specific action may also be carried out multiple times on the same service at different times.
fact_quality_control_transaction
A typical question that the business might ask is: How many of the services that we delivered in March had the following action(s) performed on the service?
I can't see how joining on the degenerate dimension makes any sense (one of the fact tables would have 800+ million rows) but I can't figure how best to work around this. Because the business want to constrain on service it almost seems like service_id should sit in a dimension of its own but having a one-to-one mapping between a dimension and a fact table doesn't seem right either (not least because I'd have a dimension table with 800 million rows in it).
Any ideas?
Let's say I have a fact_service_provided table where each row describes a service that the business provided to a customer as show below.
fact_service_provided
customer_key |
date_completion_key |
service_id (DD) |
some facts |
Now, after the service has been provided Quality Control might take place on random services. And this process is modelled by a fact table called fact_quality_control_transaction, where each row represents an action that a quality control officer (these could be different people) carried out on a service. A specific action may also be carried out multiple times on the same service at different times.
fact_quality_control_transaction
quality_control_officer_key |
date_key |
action_key |
service_id (DD) |
A typical question that the business might ask is: How many of the services that we delivered in March had the following action(s) performed on the service?
I can't see how joining on the degenerate dimension makes any sense (one of the fact tables would have 800+ million rows) but I can't figure how best to work around this. Because the business want to constrain on service it almost seems like service_id should sit in a dimension of its own but having a one-to-one mapping between a dimension and a fact table doesn't seem right either (not least because I'd have a dimension table with 800 million rows in it).
Any ideas?
salaman- Posts : 21
Join date : 2011-03-24
Re: is it ok to join two fact tables on a degenerate dimension?
First off, you do not join fact tables. Period.
A fact table has a many-to-many relationship with any other fact table.
What you DO join, are the results of sub-queries against the fact tables. The sub query aggregates each fact table to the same grain (i.e. attributes of common dimensions). The aggregates have a 1-1 relationship with each other.
The join can involve degenerate dimensions or full dimensions, it doesn't matter.
A fact table has a many-to-many relationship with any other fact table.
What you DO join, are the results of sub-queries against the fact tables. The sub query aggregates each fact table to the same grain (i.e. attributes of common dimensions). The aggregates have a 1-1 relationship with each other.
The join can involve degenerate dimensions or full dimensions, it doesn't matter.
Re: is it ok to join two fact tables on a degenerate dimension?
Thanks for the quick response. So if I understand you correctly, the question "how many services that were delivered in March had a specific action" could be answered like this:
Aggregate the fact_quality_control_transaction table (which is more granular) for a given time period and specified action. And then join those results against my fact_service table (by the degenerate dimension) to find how many of my aggregated result set exist in fact_service for the same specified time period.
Aggregate the fact_quality_control_transaction table (which is more granular) for a given time period and specified action. And then join those results against my fact_service table (by the degenerate dimension) to find how many of my aggregated result set exist in fact_service for the same specified time period.
salaman- Posts : 21
Join date : 2011-03-24
Re: is it ok to join two fact tables on a degenerate dimension?
Or count distinct services that had the action from the QA Fact table.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: is it ok to join two fact tables on a degenerate dimension?
Jeff Smith wrote:Or count distinct services that had the action from the QA Fact table.
Thanks Jeff.
I forgot that the date_key in the QA fact table relates to the date when a QA action was taken - which could be days or weeks after the service was completed.
Would adding a service_completed_date_key to the QA fact table be advisable? Although this would result in repeating values for a given service_id it does mean that I would be able to filter on this date_key to make the "count distinct services" query more efficient.
salaman- Posts : 21
Join date : 2011-03-24
Similar topics
» use of degenerate dimension to physically join two logically related fact tables
» Need to join dimension tables
» Join two fact tables?
» Can we join to 2 fact tables directly ?
» It's good ideia join 2 tables in 1 dimension
» Need to join dimension tables
» Join two fact tables?
» Can we join to 2 fact tables directly ?
» It's good ideia join 2 tables in 1 dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum