SCD and Fact update
2 posters
Page 1 of 1
SCD and Fact update
I need some advice for design on new fact table and type 2 dimension table.
The subject area surrounds employees and their scheduled hours and actual worked hours.
An employee has a home account, but when working may punch in and out of other accounts.
An employee's home account may change over time and it is a requirement to track historically what their home account is with their scheduled and actual work hour facts.
So, I have designed a user_account_dim as a SCD:
The natural key in the source system is employeeid=33. The source system assigns a new surrogate key (homeacctid) everyime the employee changes to a new account.
As part of our normal design, a surrogate key is assigned to every new row when it comes into the data warehouse (labor_account_dim_id)
The source system has an employee's work schedule which includes their schedule from day 1 of employment to months into the future (31-DEC-2015). It has a row for each day, with a starttime and endtime.
A snippit of fact table would look like this correctly assigning the labor_account_dim_id which was in effect at the time of the schedule.
Now, the employee get a job and moves into a new account effective 01-APR-2014. We insert a new row into our dimnesion.
We want the facts for scheduled work starting from 01-APR-2014 to reflect the new employee home account. (so, the last row in the fact example above). THere is no need to ever see the original employee home account assigned. But, we always have to show the home account that is in effect for the schedule date - and we know that the home account can change over time.
Is it ok to go back and update the fact table for those rows affected? I know that I can do a self join back to the dimension on the employeeid and get the correct dim_id, but I never want the historic one if it no longer is valid. Any thoughts on the best approach would be appreciated. Thanks.
The subject area surrounds employees and their scheduled hours and actual worked hours.
An employee has a home account, but when working may punch in and out of other accounts.
An employee's home account may change over time and it is a requirement to track historically what their home account is with their scheduled and actual work hour facts.
So, I have designed a user_account_dim as a SCD:
The natural key in the source system is employeeid=33. The source system assigns a new surrogate key (homeacctid) everyime the employee changes to a new account.
As part of our normal design, a surrogate key is assigned to every new row when it comes into the data warehouse (labor_account_dim_id)
LABOR_ACCOUNT_DIM_ID | EMPLOYEEID | HOMEACCTID | EFFECTIVEDTM | EXPIRATIONDTM |
1 | 33 | 32 | 01-JAN-1953 12.00.00 AM | 18-MAR-2006 12.00.00 AM |
2 | 33 | 10419 | 18-MAR-2006 12.00.00 AM | 14-FEB-2011 12.00.00 AM |
3 | 33 | 29692 | 14-FEB-2011 12.00.00 AM | 16-SEP-2013 12.00.00 AM |
4 | 33 | 44237 | 16-SEP-2013 12.00.00 AM | 11-NOV-2013 12.00.00 AM |
5 | 33 | 45372 | 11-NOV-2013 12.00.00 AM | 01-JAN-2015 12.00.00 AM |
A snippit of fact table would look like this correctly assigning the labor_account_dim_id which was in effect at the time of the schedule.
LABOR_ACCOUNT_DIM_ID | COMBHOMEACCTID | LABOR_ACCOUNT_DESC | EMPLOYEEID | SHIFT_START_DATE | SHIFT_END_DATE | EXPECTED_DAILY_HOURS_QTY |
2 | 10419 | Strategic Operations | 33 | 03-DEC-2008 06:00:00 PM | 04-DEC-2008 06:00:00 AM | 8 |
2 | 10419 | Strategic Operations | 33 | 04-DEC-2008 06:00:00 PM | 05-DEC-2008 06:00:00 AM | 8 |
2 | 10419 | Strategic Operations | 33 | 05-DEC-2008 06:00:00 PM | 06-DEC-2008 06:00:00 AM | 8 |
3 | 29692 | Sales Operations | 33 | 14-FEB-2011 06:00:00 PM | 15-FEB-2011 06:00:00 AM | 8 |
3 | 29692 | Sales Operations | 33 | 15-FEB-2011 06:00:00 PM | 16-FEB-2011 06:00:00 AM | 8 |
4 | 44237 | Human Resources | 33 | 16-SEP-2013 08:30:00 AM | 16-SEP-2013 05:00:00 PM | 8 |
4 | 44237 | Human Resources | 33 | 17-SEP-2013 08:30:00 AM | 17-SEP-2013 05:00:00 PM | 8 |
5 | 45372 | Production Business | 33 | 11-NOV-2013 08:30:00 AM | 11-NOV-2013 05:00:00 PM | 8 |
5 | 45372 | Production Business | 33 | 11-NOV-2013 08:30:00 AM | 11-NOV-2013 05:00:00 PM | 8 |
5 | 45372 | Production Business | 33 | 31-MAR-2014 08:30:00 AM | 31-MAR-2014 05:00:00 PM | 8 |
5 | 45372 | Production Business | 33 | 01-APR-2014 08:30:00 AM | 01-APR-2014 05:00:00 PM | 8 |
LABOR_ACCOUNT_DIM_ID | EMPLOYEEID | HOMEACCTID | EFFECTIVEDTM | EXPIRATIONDTM |
1 | 33 | 32 | 01-JAN-1953 12.00.00 AM | 18-MAR-2006 12.00.00 AM |
2 | 33 | 10419 | 18-MAR-2006 12.00.00 AM | 14-FEB-2011 12.00.00 AM |
3 | 33 | 29692 | 14-FEB-2011 12.00.00 AM | 16-SEP-2013 12.00.00 AM |
4 | 33 | 44237 | 16-SEP-2013 12.00.00 AM | 11-NOV-2013 12.00.00 AM |
5 | 33 | 45372 | 11-NOV-2013 12.00.00 AM | 01-APR-2014 12.00.00 AM |
6 | 33 | 99999 | 01-APR-2014 12.00.00 AM | 01-JAN-2015 12.00.00 AM |
We want the facts for scheduled work starting from 01-APR-2014 to reflect the new employee home account. (so, the last row in the fact example above). THere is no need to ever see the original employee home account assigned. But, we always have to show the home account that is in effect for the schedule date - and we know that the home account can change over time.
Is it ok to go back and update the fact table for those rows affected? I know that I can do a self join back to the dimension on the employeeid and get the correct dim_id, but I never want the historic one if it no longer is valid. Any thoughts on the best approach would be appreciated. Thanks.
kjfischer- Posts : 28
Join date : 2011-05-04
Re: SCD and Fact update
COMBHOMEACCTID shouldn't be in the fact table. It's a dimension attribute. Once you remove it you no longer have issues with updating the fact table. The EMPLOYEEID shouldn't be in your fact table either, but since it is, it offers a solution to your problem. Simply join to the labor dimension using the EMPLOYEEID where current row = 'Y' (add it if you don't have it already) and you get the current HOMEACCTID.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: SCD and Fact update
BoxesAndLines wrote:COMBHOMEACCTID shouldn't be in the fact table. It's a dimension attribute. Once you remove it you no longer have issues with updating the fact table. The EMPLOYEEID shouldn't be in your fact table either, but since it is, it offers a solution to your problem. Simply join to the labor dimension using the EMPLOYEEID where current row = 'Y' (add it if you don't have it already) and you get the current HOMEACCTID.
Thanks BoxesAndLines. Agreed that COMHOMEACCTID and LABOR_ACCOUNT_DESC shouldn't be in the fact table - I only put it in the post for illustration. But, in any case, I do not want the CURRENT homeacctid - I ALWAYS want the homeacctid that is in effect for the scheduled fact row. So, for example, for the (yellow highlighted) fact rows in 2006, I want to retrieve the homeacctid/desc of 10419/Strategic Operations and for the first three blue rows, the homeacctid should reflect 45372/Production Business and the last blue line starting Apr-1, should reflect their new homeacctid of 99999/New Home Acct.
kjfischer- Posts : 28
Join date : 2011-05-04
Re: SCD and Fact update
OK, got it. Yes, updating the fact rows is your best option. How feasible it is depends on database volumetrics, tuning, number of updates, etc.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA

» What to do when facts change? Update the fact table with update and deletes? or use SCD type 2?
» Update dim ID in fact rows or create new fact row?
» Transactional Fact and update of records
» Updating a Fact Table
» Late arriving dim *change* - how to update the fact?
» Update dim ID in fact rows or create new fact row?
» Transactional Fact and update of records
» Updating a Fact Table
» Late arriving dim *change* - how to update the fact?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|