Multiple Fact Tables vs. Consolidated Fact Table
2 posters
Page 1 of 1
Multiple Fact Tables vs. Consolidated Fact Table
I am trying to put together the bus matrix (to identify the facts tables) for a government correctional agency and am debating between multiple facts for each event in the life cycle of an offender vs. fewer fact tables. I wanted to avoid joining multiple facts (my hunch is that it is going to impede performance). My question is whether fewer fact tables are a better design or I need not worry about that?
The Business: The agency receives offenders sentenced from a court case (to serve a term for an offense) and then supervises their time until the term is served. Offenders are supervised in Prisons and/or through Probation and Parole programs. During their sentence several events (violation, transfer, release from prison to probation/parole, escape, new offense, revocation of probation etc.) could happen which could add or subtract (because of good behavior) their time of supervision. Before starting the lifecycle and during this lifecycle, the agency also administers various investigations, assessments, programs, and treatments to offenders in order to identify their level and correct their offending behavior to be able to ultimately release them back into the society.
For the first iteration, I am currently leaning towards FactSentence (one row for each sentence to one offender) and FactOffenderHistory (Transcation Fact - captures all events for each offender for each sentence). The dimensions are offender, offense, date, court case, supervision type, types of programs, treatments, assessments, etc.
For second iteration I was leaning towards a consolidated FactEnrollment table (one row for each participation by an offender) that would cover their enrollments in programs, treatments, assessments, investigations, employment etc. vs. 5 or more different fact tables. Programs and treatments do have some similarities but assessments and investigations are very much different in the outcomes (measures) they produce. Several reports cut across several events and enrollment of an offender in various things and hence my question - Would it be better to have them as separate fact tables vs. one FactEnrollment table (to avoid joins)?
I would appreciate any feedback. Thanks.
-kv30
The Business: The agency receives offenders sentenced from a court case (to serve a term for an offense) and then supervises their time until the term is served. Offenders are supervised in Prisons and/or through Probation and Parole programs. During their sentence several events (violation, transfer, release from prison to probation/parole, escape, new offense, revocation of probation etc.) could happen which could add or subtract (because of good behavior) their time of supervision. Before starting the lifecycle and during this lifecycle, the agency also administers various investigations, assessments, programs, and treatments to offenders in order to identify their level and correct their offending behavior to be able to ultimately release them back into the society.
For the first iteration, I am currently leaning towards FactSentence (one row for each sentence to one offender) and FactOffenderHistory (Transcation Fact - captures all events for each offender for each sentence). The dimensions are offender, offense, date, court case, supervision type, types of programs, treatments, assessments, etc.
For second iteration I was leaning towards a consolidated FactEnrollment table (one row for each participation by an offender) that would cover their enrollments in programs, treatments, assessments, investigations, employment etc. vs. 5 or more different fact tables. Programs and treatments do have some similarities but assessments and investigations are very much different in the outcomes (measures) they produce. Several reports cut across several events and enrollment of an offender in various things and hence my question - Would it be better to have them as separate fact tables vs. one FactEnrollment table (to avoid joins)?
I would appreciate any feedback. Thanks.
-kv30
dwuser30- Posts : 7
Join date : 2010-08-29
Re: Multiple Fact Tables vs. Consolidated Fact Table
Sounds like a model for a typical HR system....
When identifying facts you need to take into account the measures and relation of the events. Generally speaking, if the events are independent and have different measures and grains, they need to be in their own fact.
However, like in HR, there is a need to maintain a history of such events, so there is usually an 'Actions' fact that includes an action type as a dimension which records every event that has occured with that person. It may or may not have any measures, but if it does it would be ones common across most events. And, like HR, if you are tracking location, such a fact table would carry from and to location dimensional references as part of the event. These would be the same if the action does not involve a transfer.
Once you have established the various atomic facts, it is common to create aggregates across various facts to accomodate life cycle analysis as well as improve performance of other common areas of analysis. One such aggregate may be a population snapshot showing locations and status at a point in time, such as month-end.
When identifying facts you need to take into account the measures and relation of the events. Generally speaking, if the events are independent and have different measures and grains, they need to be in their own fact.
However, like in HR, there is a need to maintain a history of such events, so there is usually an 'Actions' fact that includes an action type as a dimension which records every event that has occured with that person. It may or may not have any measures, but if it does it would be ones common across most events. And, like HR, if you are tracking location, such a fact table would carry from and to location dimensional references as part of the event. These would be the same if the action does not involve a transfer.
Once you have established the various atomic facts, it is common to create aggregates across various facts to accomodate life cycle analysis as well as improve performance of other common areas of analysis. One such aggregate may be a population snapshot showing locations and status at a point in time, such as month-end.
Re: Multiple Fact Tables vs. Consolidated Fact Table
ngalemmo- Thanks so much. That really helps. Your suggestion about the population snapshot is going to be good addition my design! Thanks again. For now I will plan for multiple facts.
dwuser30- Posts : 7
Join date : 2010-08-29
Similar topics
» Single fact table vs multiple fact tables - what is the right way in a dimensional model?
» Consolidated fact table or separate facts?
» Modelling question - Multiple fact tables with one to many relationships and querying from many fact tables
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Multiple fact tables for invoice
» Consolidated fact table or separate facts?
» Modelling question - Multiple fact tables with one to many relationships and querying from many fact tables
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Multiple fact tables for invoice
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum