Accumulating Snapshot Fact table
5 posters
Page 1 of 1
Accumulating Snapshot Fact table
I have some questions about modeling an accumulating snapshot fact table.
I will take as an example a loan application analysis process.
1- I will have a 1-1 relationship between dim_application and fact_application_process (accumulating snapshot). So in this case should I merge everything in the same table (fact_application_process ) to avoid useless joins?
2- My fact table will contain a client_key FK to dim_client. Giving dim_client is a SCD2, will I have to update all the client_key of my fact table every time a change occurs in client_dim? I would say yes. What do you think?
I will take as an example a loan application analysis process.
1- I will have a 1-1 relationship between dim_application and fact_application_process (accumulating snapshot). So in this case should I merge everything in the same table (fact_application_process ) to avoid useless joins?
2- My fact table will contain a client_key FK to dim_client. Giving dim_client is a SCD2, will I have to update all the client_key of my fact table every time a change occurs in client_dim? I would say yes. What do you think?
Re: Accumulating Snapshot Fact table
1. No. Question is, do you need an application dimension, or can it be handled by a combination of smaller dimensions?
2. Definitely no. If you are going to do that, why implement a type 2?
2. Definitely no. If you are going to do that, why implement a type 2?
Re: Accumulating Snapshot Fact table
1. This is actually the point. I am not sure I need a dim_application. Here are briefly the attributes of an application:
General attributes:
the application number, a client, a loan type, a branch, a currency, a purpose
Step attributes:
At every step (request,review,approval,disbursement) I would have a date, an amount, a term and a comment.
So I could put the general attributes in a dim_application or everything in my fact_application_process
2. I need to keep client changes history. So I have to model dim_client as a SCD2. Then to keep my fact_application_process table up to date I will have to keep updated the client_key to the most actual client record.
For example, if a client changes of branch then a new row with a new SK will be added in my dim_client table. Then I will need my applications corresponding to that client to be updated in fact_application_process. Am I missing something?
General attributes:
the application number, a client, a loan type, a branch, a currency, a purpose
Step attributes:
At every step (request,review,approval,disbursement) I would have a date, an amount, a term and a comment.
So I could put the general attributes in a dim_application or everything in my fact_application_process
2. I need to keep client changes history. So I have to model dim_client as a SCD2. Then to keep my fact_application_process table up to date I will have to keep updated the client_key to the most actual client record.
For example, if a client changes of branch then a new row with a new SK will be added in my dim_client table. Then I will need my applications corresponding to that client to be updated in fact_application_process. Am I missing something?
Re: Accumulating Snapshot Fact table
I think what you are missing is that the Fact table joins to the client dimension using the SK, but your where clause would not filter by the SK. That is why the fact table would not need updated. If you were trying to filter by a particular client, you would be using the client's natural key. That natural key will be represented multiple times in the dimension table.
select sum(loanAmt) from factLoan l join dimClient c on l.ClientKey = c.ClientKey where clientSalesForceID = 777389
I don't care that my client switched branches 6 times, and has 6 client records in the dimClient, the natural key stayed the same so my query works to get total loan amount for this client.
Hope that makes it a little clearer.
select sum(loanAmt) from factLoan l join dimClient c on l.ClientKey = c.ClientKey where clientSalesForceID = 777389
I don't care that my client switched branches 6 times, and has 6 client records in the dimClient, the natural key stayed the same so my query works to get total loan amount for this client.
Hope that makes it a little clearer.
TheNJDevil- Posts : 68
Join date : 2011-03-01
Re: Accumulating Snapshot Fact table
It is not clear. If in my cube (let's say application process) I need to display the branch of my client it will get the wrong branch if I did not update the client_key in my fact table.
I want to specify that this question makes sense only with an accumulating snapshot fact table because we are making updates not only inserts.
I want to specify that this question makes sense only with an accumulating snapshot fact table because we are making updates not only inserts.
Re: Accumulating Snapshot Fact table
I'm not that familiar with cube generation, but if the client dimension has a current_record flag, we still get the result of bringing back everything when filtering on the natural key. The client cube dimension would use the current_record flag to bring back proper client record separately.
If this is incorrect in cube generation, then you should look into making client an SCD 3 if at all possible.
If this is incorrect in cube generation, then you should look into making client an SCD 3 if at all possible.
TheNJDevil- Posts : 68
Join date : 2011-03-01
Re: Accumulating Snapshot Fact table
remiby wrote:1. This is actually the point. I am not sure I need a dim_application. Here are briefly the attributes of an application:
General attributes:
the application number, a client, a loan type, a branch, a currency, a purpose
Step attributes:
At every step (request,review,approval,disbursement) I would have a date, an amount, a term and a comment.
So I could put the general attributes in a dim_application or everything in my fact_application_process
You could have other dimensions, such as client, branch, currency, etc... and reference them as FKs on the fact. Stuff like type and purpose can be combined into a single junk dimension. You then place the application number as a degenerate dimension on the fact. You do not create a big wide fact table, as performance would be terrible.
You do not change fact keys. The point of a type 2 dimension is not to keep dimension history, it is to represent facts using historical context. If there is no need to represent facts in a historical context, there is no need for dimensional history. But that's beside the point... you get current values from a type 2 using a self join on the dimension using the natural key to locate the current row version. You do not re-key facts. The FK on the fact represents the version of the dimension at the time of the event represented by the fact.
2. I need to keep client changes history. So I have to model dim_client as a SCD2. Then to keep my fact_application_process table up to date I will have to keep updated the client_key to the most actual client record.
For example, if a client changes of branch then a new row with a new SK will be added in my dim_client table. Then I will need my applications corresponding to that client to be updated in fact_application_process. Am I missing something?
Re: Accumulating Snapshot Fact table
Why performance would be terrible since I have only one row per application in the fact table?performance would be terrible
Thanks so much for your help.
Re: Accumulating Snapshot Fact table
Searching the web I tumbled on this article confirming my initial thought:
blog.oaktonsoftware.com/2010/12/deeper-into-accumulating-snapshot.html
Here is what the author says:
blog.oaktonsoftware.com/2010/12/deeper-into-accumulating-snapshot.html
Here is what the author says:
If the item or entity being processed undergoes slow changes, update its surrogate key in its corresponding fact table row. This will be sure the fact table always points to the most recent version of the item.
Re: Accumulating Snapshot Fact table
Well if you found the answer you were looking for on the internet, it must be best practice! Try to find that answer anywhere on Kimball's site.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Accumulating Snapshot Fact table
remiby wrote:Searching the web I tumbled on this article confirming my initial thought:
blog.oaktonsoftware.com/2010/12/deeper-into-accumulating-snapshot.html
Here is what the author says:If the item or entity being processed undergoes slow changes, update its surrogate key in its corresponding fact table row. This will be sure the fact table always points to the most recent version of the item.
It is incorrect. It is not best practice. Anything that requires retroactive updates of fact FKs should be avoided.
Anyway, you are building a cube. If you need current dimensional state from a type 2, handle it in the query that feeds the cube. Perform a self join on the dimension table as previously mentioned. Its very simple.
Re: Accumulating Snapshot Fact table
Here you go: "Both foreign keys and measured facts may be changed during the revisit" on Accumulating Snapshot for Admissions Tracking, p244, chapter 12 Education, Kimball's dimensional modeling toolkit.
My take is, if you use accumulating snapshot to track status and measures in columns, all the columns in such a fact table are supposed to be updated by current value. It's obvious for date key to change, mostly from 0/-1 (null), but it seems a bit uncomfortable for other FK's. I guess having SCD2 SK in this type of fact table is purely for RI, and NK combined with SCD current status would be more pragmatic approach, saving some ETL work but compromising on modeling sanity.
My take is, if you use accumulating snapshot to track status and measures in columns, all the columns in such a fact table are supposed to be updated by current value. It's obvious for date key to change, mostly from 0/-1 (null), but it seems a bit uncomfortable for other FK's. I guess having SCD2 SK in this type of fact table is purely for RI, and NK combined with SCD current status would be more pragmatic approach, saving some ETL work but compromising on modeling sanity.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Accumulating Snapshot fact table
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
» Underlying fact for accumulating snapshot
» Accumulating Snapshot Fact Table Data Model (Order Management)
» Can you combine an accumulating and periodic snapshot into a single accumulating periodic snapshot table?!
» Conformed Dimension for Transaction Fact and Accumulating Snapshot Fact Table
» Underlying fact for accumulating snapshot
» Accumulating Snapshot Fact Table Data Model (Order Management)
» Can you combine an accumulating and periodic snapshot into a single accumulating periodic snapshot table?!
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum