Attendance data and Fact Table
2 posters
Page 1 of 1
Attendance data and Fact Table
hi,
We have a requirement to design a fact table for employee attendance and leave.
From the source system, daily clocked time is in the form of number of hours.
Leave information is in the form of from_date and to_date.
Reporting requirements include,
1) Show Emp Leave details for the period ie "From date" and "To date"
2) "Number of days of leave" for arriving at derived measures
one of the approaches we can think of is,
Generate a fact transaction for each date as applicable in the leave duration.
ie. if the leave period is "02-Jan-2012 Half day to 05-Jan-2012 Full day" then
generate separate fact entries for Earned Leave as,
Emp _Id ==== Date === Leave ==== Hours
==================================================
E_01 ==== 02-Jan-12 === On Duty ==== 4
E_01 ==== 02-Jan-12 === Earned Leave ====
E_01 ==== 03-Jan-12 === Earned Leave ====
E_01 ==== 04-Jan-12 === Earned Leave ====
E_01 ==== 05-Jan-12 === Earned Leave ====
E_01 ==== 06-Jan-12 === On duty ==== 8
But considering the number of days of leave this may make the Fact table quite large.
What will be the most effective design approach here?
thanks,
Pullai
We have a requirement to design a fact table for employee attendance and leave.
From the source system, daily clocked time is in the form of number of hours.
Leave information is in the form of from_date and to_date.
Reporting requirements include,
1) Show Emp Leave details for the period ie "From date" and "To date"
2) "Number of days of leave" for arriving at derived measures
one of the approaches we can think of is,
Generate a fact transaction for each date as applicable in the leave duration.
ie. if the leave period is "02-Jan-2012 Half day to 05-Jan-2012 Full day" then
generate separate fact entries for Earned Leave as,
Emp _Id ==== Date === Leave ==== Hours
==================================================
E_01 ==== 02-Jan-12 === On Duty ==== 4
E_01 ==== 02-Jan-12 === Earned Leave ====
E_01 ==== 03-Jan-12 === Earned Leave ====
E_01 ==== 04-Jan-12 === Earned Leave ====
E_01 ==== 05-Jan-12 === Earned Leave ====
E_01 ==== 06-Jan-12 === On duty ==== 8
But considering the number of days of leave this may make the Fact table quite large.
What will be the most effective design approach here?
thanks,
Pullai
PJK- Posts : 4
Join date : 2012-09-26
Re: Attendance data and Fact Table
Hi Pullai
I would first determine whether you'll be building a transactional or periodic-snapshot fact. For transactional facts, I understand that a record is inserted only when an event takes place (i.e.: an insurance claim is submitted, a credit card transaction is captured). For periodic snapshots, there isn't really an event that occurs - instead, the state of the data is recorded for a particular period in time (i.e.: it is snapshotted).
I would argue that you can consider time logged as an event (i.e.: transactional), and that a time will be associated with it (i.e.: the employee logs time for a particular day). That being said, I would have a single measure in the fact that stores the number of hours logged for a particular day. The type of time being logged would be a dimension with specific types of time (i.e.: time worked, sick leave, annual leave, public holiday, unpaid leave, study leave etc.) What I have found useful with storing time at an hour level is that it was easy for me to introduce tasks to the fact, whereby people assign time not at a day level, but at a task level (i.e.: today I spend 2 hours on development, 3 hours on training, and 3 hours in meetings. You'd have multiple fact records per person per day as a result, if they did more than one task during the day).
In terms of people taking leave for an extended period of time, I would enter a record into the fact for each of the days that they were on leave for, and log the time for each day as 8 hours (or however many number of hours a standard working is). So if someone is on leave for 5 days, enter 5 records into the fact, with the measure being set as 8 (hours) for each record.
I wouldn't worry too much about the number of records that this approach will generate. If you're modeling an organization with 10 000 employees, you'll have at most 365 x 10,000 = 3,650 000 records per year, which isn't very much at all. Worry when you get into the hundreds of millions or billions. And even then, there are ways to manage it.
I would first determine whether you'll be building a transactional or periodic-snapshot fact. For transactional facts, I understand that a record is inserted only when an event takes place (i.e.: an insurance claim is submitted, a credit card transaction is captured). For periodic snapshots, there isn't really an event that occurs - instead, the state of the data is recorded for a particular period in time (i.e.: it is snapshotted).
I would argue that you can consider time logged as an event (i.e.: transactional), and that a time will be associated with it (i.e.: the employee logs time for a particular day). That being said, I would have a single measure in the fact that stores the number of hours logged for a particular day. The type of time being logged would be a dimension with specific types of time (i.e.: time worked, sick leave, annual leave, public holiday, unpaid leave, study leave etc.) What I have found useful with storing time at an hour level is that it was easy for me to introduce tasks to the fact, whereby people assign time not at a day level, but at a task level (i.e.: today I spend 2 hours on development, 3 hours on training, and 3 hours in meetings. You'd have multiple fact records per person per day as a result, if they did more than one task during the day).
In terms of people taking leave for an extended period of time, I would enter a record into the fact for each of the days that they were on leave for, and log the time for each day as 8 hours (or however many number of hours a standard working is). So if someone is on leave for 5 days, enter 5 records into the fact, with the measure being set as 8 (hours) for each record.
I wouldn't worry too much about the number of records that this approach will generate. If you're modeling an organization with 10 000 employees, you'll have at most 365 x 10,000 = 3,650 000 records per year, which isn't very much at all. Worry when you get into the hundreds of millions or billions. And even then, there are ways to manage it.
min.emerg- Posts : 39
Join date : 2011-02-25
Similar topics
» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse
» Data in a fact or dimenzion table or bridge table
» SCD or data in FACT TABLE
» Loading data into fact table
» Show just data that are relavant to fact table
» Data in a fact or dimenzion table or bridge table
» SCD or data in FACT TABLE
» Loading data into fact table
» Show just data that are relavant to fact table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum