Static facts in dimensions?
3 posters
Page 1 of 1
Static facts in dimensions?
Hello,
I am somewhat new to dimensional modelling and need a bit of advice for the following business scenario.
Business functions
A university accepts applications for "research grants" in various subject areas.
Accepted applications go through 8 different "processing stages" (eg. New, Validated, Accepted, Incomplete, Complete, Withdrawn, Rejected, Unknown). Withdrawn\Rejected\Accepted are the end stages. Applications can arrive at the same stage multiple times.
The whole journey through processing stages can take days to years.
Each submitted application has a "requested grant value" ($ value).
Each approved application has an "approved grant value", which can be different to the requested value.
Queries
Analyse total processing time by "subject area", submitted date\month\year.
Analyse requested grant amounts by "subject area", submitted date\month\year.
Analyse applications in each processing stage (by subject area)
Analyse approved grants amounts
The questions I have are
1. Is "Requested grant amount" a dimension (because it is static) or a fact (because sum/avg operations are to be done)?
2. Can "Requested grant amount" and "Approved grant amount" be placed in the same fact table? They have the same granularity, but different event-times.
3. Should "processing stages" be a separate dimension or attributes in application dimension?
4. How to record applications changing their processing stages, so that we can analyse time spent in each stage?
Any insight is much appreciated.
sman
I am somewhat new to dimensional modelling and need a bit of advice for the following business scenario.
Business functions
A university accepts applications for "research grants" in various subject areas.
Accepted applications go through 8 different "processing stages" (eg. New, Validated, Accepted, Incomplete, Complete, Withdrawn, Rejected, Unknown). Withdrawn\Rejected\Accepted are the end stages. Applications can arrive at the same stage multiple times.
The whole journey through processing stages can take days to years.
Each submitted application has a "requested grant value" ($ value).
Each approved application has an "approved grant value", which can be different to the requested value.
Queries
Analyse total processing time by "subject area", submitted date\month\year.
Analyse requested grant amounts by "subject area", submitted date\month\year.
Analyse applications in each processing stage (by subject area)
Analyse approved grants amounts
The questions I have are
1. Is "Requested grant amount" a dimension (because it is static) or a fact (because sum/avg operations are to be done)?
2. Can "Requested grant amount" and "Approved grant amount" be placed in the same fact table? They have the same granularity, but different event-times.
3. Should "processing stages" be a separate dimension or attributes in application dimension?
4. How to record applications changing their processing stages, so that we can analyse time spent in each stage?
Any insight is much appreciated.
sman
sman- Posts : 22
Join date : 2011-01-30
Re: Static facts in dimensions?
1. yes
2. yes (accumulating snapshot fact)
3. see accumulating snapshot. Each stage is simply identified by a date and optionally a binary metric indicating that the stage has occurred.
4. Subtract the time period between any two stage dates
2. yes (accumulating snapshot fact)
3. see accumulating snapshot. Each stage is simply identified by a date and optionally a binary metric indicating that the stage has occurred.
4. Subtract the time period between any two stage dates
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Static facts in dimensions?
1. Is "Requested grant amount" a dimension (because it is static) or a fact (because sum/avg operations are to be done)?
2. Accumulating Snapshot fact will look like;
Thanks BoxesAndLines for your input.
2. Accumulating Snapshot fact will look like;
- Code:
<Surrogate keys to dims>
...
New_Start_Date_Key
New_End_Data_Key
New_Days
..
<repeat for all 8 stages>
..
Total_Start_Date_key
Total_End_Date_Key
Total_Days
Requested Amount
Approved Amount
Thanks BoxesAndLines for your input.
sman- Posts : 22
Join date : 2011-01-30
Re: Static facts in dimensions?
The requested and approved grant values are measures and belong in a fact table.
The issue is you are trying to put two facts into one table. Since you are modelling the process of approving a grant each decision or action during that process is a business event (request, review, denial. etc...). So, at the atomic level you track the events... date, grant requestor, officer/staff involved, decision/action, etc... and value as a measure.
The second fact is an aggregation of these atomic facts. This can be implemented as a view or materialized as a table. It would be derived from the atomic table by selecting the request and approval actions. This could be bound by date should there be a need to restrict to the earliest or latest event of that type.
The issue is you are trying to put two facts into one table. Since you are modelling the process of approving a grant each decision or action during that process is a business event (request, review, denial. etc...). So, at the atomic level you track the events... date, grant requestor, officer/staff involved, decision/action, etc... and value as a measure.
The second fact is an aggregation of these atomic facts. This can be implemented as a view or materialized as a table. It would be derived from the atomic table by selecting the request and approval actions. This could be bound by date should there be a need to restrict to the earliest or latest event of that type.
Re: Static facts in dimensions?
Thanks Ngalemmo.
How do I link the other dimensions (ie. Geography, Subject Area etc) to the star schema? They don't change when an Application changes stages while being processed. So, should they be modeled as outriggers to Application dimension? Or, should they be attributes in the Application Dimension?
Is it normal to have a dimension that has the same granularity as a fact table?
What I mean is that, if we move out "Requested Amount" and "Approved Amount" to a fact table, then the fact table will have same granularity as the Application dimension.
Regards
How do I link the other dimensions (ie. Geography, Subject Area etc) to the star schema? They don't change when an Application changes stages while being processed. So, should they be modeled as outriggers to Application dimension? Or, should they be attributes in the Application Dimension?
Is it normal to have a dimension that has the same granularity as a fact table?
What I mean is that, if we move out "Requested Amount" and "Approved Amount" to a fact table, then the fact table will have same granularity as the Application dimension.
Regards
sman- Posts : 22
Join date : 2011-01-30
Re: Static facts in dimensions?
All dimensions are referenced by foreign keys from the fact table. So other dimensions (geography, etc...) would appear as foreign key references. It doesn't matter if they don't change from one action to the next.
It is not normal to have a dimension that has a near 1:1 correspondance to a fact table. You should consider breaking the application dimension down into usable (ie conforming) components, such as applicant, and so forth.
It is not normal to have a dimension that has a near 1:1 correspondance to a fact table. You should consider breaking the application dimension down into usable (ie conforming) components, such as applicant, and so forth.
Re: Static facts in dimensions?
ngalemmo,
A given Application has one and only one "Requested Grant Amount" and an "Approved Grant Amount", so the relationship is 1:1.
So, if those two values are moved out from Application Dim to a fact table, that fact table will have the same 1:1 grain as the Application Dimension. How to avoid this?
The users don't analyse data "by applicants". They analyse/group "dollar amounts" by geography, subject area, Application etc.
For a given Application, geography and subject area are static.
It is not normal to have a dimension that has a near 1:1 correspondance to a fact table. You should consider breaking the application dimension down into usable (ie conforming) components, such as applicant, and so forth.
A given Application has one and only one "Requested Grant Amount" and an "Approved Grant Amount", so the relationship is 1:1.
So, if those two values are moved out from Application Dim to a fact table, that fact table will have the same 1:1 grain as the Application Dimension. How to avoid this?
The users don't analyse data "by applicants". They analyse/group "dollar amounts" by geography, subject area, Application etc.
For a given Application, geography and subject area are static.
sman- Posts : 22
Join date : 2011-01-30
Re: Static facts in dimensions?
If you are modeling a workflow process, you don't need both start and end dates. Start date should be sufficient. I also don't normally store the day counts between the stages or total day counts since these are easily calculated on the fly and it avoids performing updates every day just to add 1 to all the active applications. I'm not sure why you are focused on whether an attribute is static or not. This has no implication on dimensional modeling or relational modeling for that matter. What attributes of application do users want to analyze facts by? Those are the attributes you should look to move to other dimensions to avoid 1-1 relationship to the fact table. It may be that you still end up with a 1-1, but the smaller the dimension, the better the performance.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Static facts in dimensions?
The users don't analyse data "by applicants". They analyse/group "dollar amounts" by geography, subject area, Application etc.
For a given Application, geography and subject area are static.
What are the attributes of an application? For example, when modeling sales orders, to it rare to have an order dimension table. Attributes of the order header, such as customer, dates, addresses and so on, are represented by separate dimensions. The only thing left is usually the order number which is stored as a degenerate dimension value on the fact table. This is done to avoid near 1:1 correspondence with facts, because queries using such relationships generally perform very poorly. It also provides more useable dimensions to conform with other facts.
As B&L points out, static or dynamic makes no difference. In fact, most of the time, measures are static, and if they are not, often techniques are used (such as delta based fact tables) to make them so.
Re: Static facts in dimensions?
B&L and Ngalemmo,
1. Application dimensions corresponds to the paper-based application for grants.
It has attributes like "Project Title", "Project Description", "Subject Area", "Federal Government Funded Flag", "State Government Funded Flag", etc, and "Requested Amount" & "Approved Amount". A lot of these attributes are textual descriptors rather than strict dimensions. What I mean is that, two applications have the same "Project Title", but they are not the same project. Subject Area is, however, picked from a list of values prepared by the department.
2. If we move "Requested Amount" to the Application fact table (transaction fact), then it will look like
Note that the static "Requested Amount" is appearing on the fact table that records events of an application progressing, because as far as application processing is concerned, "Requested Amount" does not change. So, sum of "Requested Amount" from this fact table is not useful because of the duplicates.
1. Application dimensions corresponds to the paper-based application for grants.
It has attributes like "Project Title", "Project Description", "Subject Area", "Federal Government Funded Flag", "State Government Funded Flag", etc, and "Requested Amount" & "Approved Amount". A lot of these attributes are textual descriptors rather than strict dimensions. What I mean is that, two applications have the same "Project Title", but they are not the same project. Subject Area is, however, picked from a list of values prepared by the department.
2. If we move "Requested Amount" to the Application fact table (transaction fact), then it will look like
- Code:
App Key, Processing Stage Key, Requested Amount,....<date keys>
1, 1, $485.00,...
1, 2, $485.00,...
1, 3, $485.00,...
1, 5, $485.00,...
1, 7, $485.00,...
3, 1, $1473.00,...
3, 3, $1473.00,...
3, 2, $1473.00,...
3, 7, $1473.00,...
3, 4, $1473.00,...
Note that the static "Requested Amount" is appearing on the fact table that records events of an application progressing, because as far as application processing is concerned, "Requested Amount" does not change. So, sum of "Requested Amount" from this fact table is not useful because of the duplicates.
Last edited by sman on Thu Jan 12, 2012 11:07 pm; edited 1 time in total
sman- Posts : 22
Join date : 2011-01-30
Re: Static facts in dimensions?
Aha! You're not doing a transaction fact, you are doing an accumulating snapshot. At least that's what I told you to do. ;-)
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» Static Facts on Dimension Table?
» No of Dimensions and Facts
» Arriving at Facts and Dimensions
» What Impact do dimensions have that actually are facts ?
» Creating facts and dimensions from raw data
» No of Dimensions and Facts
» Arriving at Facts and Dimensions
» What Impact do dimensions have that actually are facts ?
» Creating facts and dimensions from raw data
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum