Date and Time Dimension Combined or Separate
4 posters
Page 1 of 1
Date and Time Dimension Combined or Separate
Hi Guys,
I am modelling the employee attendance process. The grain of the fact table is every-time the employee uses the scanning system which is multiple records for each day for an employee. The transactions recorded are check-in, check-out, personal-check out etc..
This fact should be able to answer questions such as how many employees checked in late, how many checked out early, the number of hours they worked and so on.
Can you guys suggest the grain of the fact table and the dimensions I should be targeting.
I have an option of setting the grain at one record for each date for every employee but its always recommended to record the detail and the lowest level of granularity possible.
Lastly can you tell me how I can query to find out employees who late checked in.
thank you
I am modelling the employee attendance process. The grain of the fact table is every-time the employee uses the scanning system which is multiple records for each day for an employee. The transactions recorded are check-in, check-out, personal-check out etc..
This fact should be able to answer questions such as how many employees checked in late, how many checked out early, the number of hours they worked and so on.
Can you guys suggest the grain of the fact table and the dimensions I should be targeting.
I have an option of setting the grain at one record for each date for every employee but its always recommended to record the detail and the lowest level of granularity possible.
Lastly can you tell me how I can query to find out employees who late checked in.
thank you
businessintelligence- Posts : 14
Join date : 2015-06-23
Re: Date and Time Dimension Combined or Separate
The fact should be one row for each event. Dimensions should include employee, shift (so you can identify 'late'), event type and anything else that matters. As far as temporal dimensions, at minimum you would have a date dimension, you may also have a timestamp as a degenerate dimension. What to do with time of day depends on what you need to know about time of day. If the business has attributes that describe the time of day, then you need a dimension. Normally a time dimension is independent of the date dimension. But if the attribute values describing the time vary depending on the day, you may need to incorporate both time and day in the same dimension.
Re: Date and Time Dimension Combined or Separate
Thank You. The potential complication I can see is that staff can have multiple check-ins during a day, lets suppose that staff are expected to start office at 10AM, anyone later than that would be considered Late. However lets take an example, a staff checks in at 09:45AM and then does a personal checkout at 11:00 AM and then checks in back at 11:30 AM. There will be 3 events or records for this staff during that day.
If I need to identify employees who have checked in late this design could be mis-leading because I may consider a check-in at 11:30 AM as Late Check. Do you see this as a potential design issue.
Lastly what do you think would be the attributes for Shift and how at a query level it could be used to identify employees who have checked-in late. Thank You for collaborating and your help. Much Appreciated!
If I need to identify employees who have checked in late this design could be mis-leading because I may consider a check-in at 11:30 AM as Late Check. Do you see this as a potential design issue.
Lastly what do you think would be the attributes for Shift and how at a query level it could be used to identify employees who have checked-in late. Thank You for collaborating and your help. Much Appreciated!
businessintelligence- Posts : 14
Join date : 2015-06-23
Re: Date and Time Dimension Combined or Separate
Further to my above reply I just thought of modifying the grain of the fact table to just be one record for each employee on one date.
So instead of multiple records for each day for every event, I will only have one row. I will then add pseudo facts such as Late CheckIn (1 or 0), Early Check Out (1 or 0), Hours Worked, Late by Minutes, Early Check out by minutes
One thing I can see is that there is no time of day key, what do you think of this design. Do you see potential disadvantages
So instead of multiple records for each day for every event, I will only have one row. I will then add pseudo facts such as Late CheckIn (1 or 0), Early Check Out (1 or 0), Hours Worked, Late by Minutes, Early Check out by minutes
One thing I can see is that there is no time of day key, what do you think of this design. Do you see potential disadvantages
businessintelligence- Posts : 14
Join date : 2015-06-23
Accumulating snapshot table with history
I agree with one record for each employee on one date. This is a special accumulating snapshot fact table. The only problem is that an employee can have multiple personnel check outs. Therefore I suggest an accumulating snapshot with history. Main attributes are following
Emp Id
Day Key
Check-in time
Check-out time
Personal-check out time
Personal (second) check-in time in the day
Hours worked
Actual record (I/N)
You may add of course your other pseudo facts.
In this I don’t see the need for Time dimension. But if there is a need to analyze the check-in and out events in special periods of day, then of course you have to add this Time dimension as well.
The model is suitable for drill down in one day and trace the work periods of employees also with multiple personal check-outs.
Emp Id
Day Key
Check-in time
Check-out time
Personal-check out time
Personal (second) check-in time in the day
Hours worked
Actual record (I/N)
You may add of course your other pseudo facts.
In this I don’t see the need for Time dimension. But if there is a need to analyze the check-in and out events in special periods of day, then of course you have to add this Time dimension as well.
The model is suitable for drill down in one day and trace the work periods of employees also with multiple personal check-outs.
gvarga- Posts : 43
Join date : 2010-12-15
Re: Date and Time Dimension Combined or Separate
Hi,
Thanks for your reply. One possible issue here is that we clearly do not know how many times an employee can personal check out and check in during a day. It could be zero, once or for that matter any number of times.
Do you think a snapshot fact table is a good design at the date and employee level.
Thanks for your reply. One possible issue here is that we clearly do not know how many times an employee can personal check out and check in during a day. It could be zero, once or for that matter any number of times.
Do you think a snapshot fact table is a good design at the date and employee level.
businessintelligence- Posts : 14
Join date : 2015-06-23
Re: Date and Time Dimension Combined or Separate
It is not a snapshot, but a so called accumulating snapshot: once the employee checks-in, the row will be created with the time of check-in. Normally he checks-out, then the row will be updated with check-out time ( maybe a status is updated) and the Hours worked is calculated. This is a very simple daily process.
When he checks-out personally, the row will be again updated with the Personal-check out time and you have to calculate Working hours till this check-out. He comes back, again update the row. Meanwhile the row is Actual (I).
When he checks-out again personally then you set the Actual attribute of this row to N, and insert a new row for the same day, where
Emp Id (same as in previous row)
Day Key (same as in previous row)
Check-in time (same as in previous row)
Personal-check out time: CURRENT CHECK_OUT time
Personal (second) check-in time in the day: not known
Hours worked : updated with the duration of the previous period he was in the office
Actual record (I).
If he checks out personally 3 times in a day there will be 3 rows .
When you analyze the early and late check-ins and the working hours you have filter the table for the actual rows ( or use a view).
But when you have to see the details if all check ins and outs in a day, you use all the history rows.
When he checks-out personally, the row will be again updated with the Personal-check out time and you have to calculate Working hours till this check-out. He comes back, again update the row. Meanwhile the row is Actual (I).
When he checks-out again personally then you set the Actual attribute of this row to N, and insert a new row for the same day, where
Emp Id (same as in previous row)
Day Key (same as in previous row)
Check-in time (same as in previous row)
Personal-check out time: CURRENT CHECK_OUT time
Personal (second) check-in time in the day: not known
Hours worked : updated with the duration of the previous period he was in the office
Actual record (I).
If he checks out personally 3 times in a day there will be 3 rows .
When you analyze the early and late check-ins and the working hours you have filter the table for the actual rows ( or use a view).
But when you have to see the details if all check ins and outs in a day, you use all the history rows.
gvarga- Posts : 43
Join date : 2010-12-15
Re: Date and Time Dimension Combined or Separate
The problem with a snapshot is it loses information. An event level fact does not. Most important is the time at which these events occur, which, since you are using it to track time, could be significant. You can always create an aggregate or a view to simplify some queries.
As far as checking in, you just need to find the earliest check-in time. The 'shift' dimension should have info as to when they are expected to be at work, which would be used to identify if they are late. This assumes you have a shifts defined somewhere in your HR system and employees are assigned to them. If you don't then you would use some other method to define 'late'.
As far as checking in, you just need to find the earliest check-in time. The 'shift' dimension should have info as to when they are expected to be at work, which would be used to identify if they are late. This assumes you have a shifts defined somewhere in your HR system and employees are assigned to them. If you don't then you would use some other method to define 'late'.
Re: Date and Time Dimension Combined or Separate
Hi All,
Guys I have created the transaction fact table with the lowest level of detail. This is a great start. Now I need to answer questions such as how many employees checked-in late but the problem is an employee can login at 08:45 AM before the official work time of 09:00 AM then do a personal check out later during the day and check-in again. The second check-in considers the employee is late to work when I develop a daily report for each employee.
How should I tackle this issue, Should I create another fact table using the transaction fact as my source, create a view, add a sequence number to the transaction fact so I would consider sequence number 1 as the first check-in and will use that to generate the late check-in report.
I am just thinking loud, thanks guys
Guys I have created the transaction fact table with the lowest level of detail. This is a great start. Now I need to answer questions such as how many employees checked-in late but the problem is an employee can login at 08:45 AM before the official work time of 09:00 AM then do a personal check out later during the day and check-in again. The second check-in considers the employee is late to work when I develop a daily report for each employee.
How should I tackle this issue, Should I create another fact table using the transaction fact as my source, create a view, add a sequence number to the transaction fact so I would consider sequence number 1 as the first check-in and will use that to generate the late check-in report.
I am just thinking loud, thanks guys
businessintelligence- Posts : 14
Join date : 2015-06-23
Re: Date and Time Dimension Combined or Separate
I'd look at adding a sequence (e.g. "Daily_Event_Number") to the fact. Perhaps also a "Latest_Daily_Event_Flag".
That should give you the most flexibility.
Hours of work could be a tricky calculation, as it would assume perfect data (i.e. no missed events).
That should give you the most flexibility.
Hours of work could be a tricky calculation, as it would assume perfect data (i.e. no missed events).
LAndrews- Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada
Similar topics
» Combined Date Time dimension vs. Separate Day and Time dimensions and timezones
» Transaction Hour in Fact table or Separate Time Dimension?
» How to handle a Time component when using a Date Dimension
» Subsetting date/time Dimensions and Role Playing Date/Time Dimensions
» Date Dimension: Representing partial dates/Imputing date values
» Transaction Hour in Fact table or Separate Time Dimension?
» How to handle a Time component when using a Date Dimension
» Subsetting date/time Dimensions and Role Playing Date/Time Dimensions
» Date Dimension: Representing partial dates/Imputing date values
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum