modelling cashier utilisation

salaman Thu Mar 24, 2011 1:09 pm

Hi,

I have a relational background (in other words go easy on me) but would like to get more involved with dimensional modelling and I'm trying to figure out how to best model the following scenario.

A cashier in a supermarket logs onto the cash till at various points during the day and remains logged on for some time, after which point they log off. During the period of time that they are logged on they will process sales (recorded in a fact_sales table) but they will also spend some time (on the order of minutes) where they might not be doing anything.

The business would like to identify how many minutes within a given hour a cashier was being utilised (i.e. processing sales).

A colleague suggested we create a fact table called fact_cashier_hours, for example, into which a record is inserted for every hour that a cashier was logged on. So, if a cashier logged on at 9am and logged off at 12pm then 3 records would be inserted into this table against the corresponding time_keys.

Somehow this idea doesn't quite sit right with me although I admit that may just be because of my lack of experience with dimensional modelling.

I figured that this kind of problem must have been faced by many as I imagine most business would need to identify times of the day for which they need to allocate more staff (or less) depending on utilisation patterns.

As such I was hoping someone could outline any best practice approaches to this kind of scenario.

Thanks

Jeff Smith Thu Mar 24, 2011 2:07 pm

If you have the time they logged on and the time the logged off - you can calculate total logged on time.

If you have the time in which the transaction started and ended, you can calculate the total transaction time. Sum the transaction time for the individual transactions. For each cashier, for each day (or shift), you have total logged on time and total transaction time. Subsctract the total transaction time from the total time logged on, and you have total idle time.

Usually, a plsae has a staffing model based on the average length of time it takes to complete a transaction. Total number of transactions during a given time will tell you the number of cashiers you need. They'll work faster when the line is long and slower when the line is short or non-existent.

salaman Thu Mar 24, 2011 3:01 pm

Thanks for the quick response Jeff.

Your approach is pretty much what I had in mind, however the users want to be able to drill down on utilisation for cashiers down to the hour level . Which I guess means having to perform that calculation for each hour that every cashier was logged on.

My colleague seems to think that his approach, where each hour a cashier is logged in is stored in a fact_table, would be much quicker. Would this be called an accumulating snapshot fact or is that something completely different? And I can kind of see his point but I wasn't sure if this approach was ideal or not or if performing the calculation you suggested for each hour that every cashier is logged on can be done more efficiently in a way I haven't considered - MDX perhaps?

Jeff Smith Thu Mar 24, 2011 6:09 pm

Actually, I think it's manufacturing data, which isn't necessarily wrong. I think you would have to add a row into the fact table for every hour.

Ah, I've got it. I assume you have a time dimension with the hours flagged. Assume you have a table with the cashiers start and end time. Join the this table to the time dimension using Time Key between the Start Time Key and End Time Key filtering for just the hours. Add this data to a table that already has the start time. This gives you a table with Period Start times. Do a self join, staggering the rows so that the earlier Start time is matched to the second earliest time (first clock hour at work) and this becomes the period end time. For the last Clock Hour, make the quitting time the period end time. It's a little clugie but I use a similar technique whenever I get data from a source system with Begin and End Dates (there's always gaps and overlaps).

I personally think the better way to get to what they want to know is to count how many cashiers are working at any given time and the number of transactions they processed. That's how we did it when working with staffing models at bank branches. I bet money that the average transaction time is shorter when they are busy. Plus, during slow periods with multiple cashiers, one will be busier than the others simply because it's the closest cashier.

salaman Fri Mar 25, 2011 5:05 am

Thanks for the replies Jeff - I'll try your suggestions out and see if it makes any appreciable difference to manufacturing the data.

Sponsored content

modelling cashier utilisation

modelling cashier utilisation

Re: modelling cashier utilisation

Re: modelling cashier utilisation

Re: modelling cashier utilisation

Re: modelling cashier utilisation

Re: modelling cashier utilisation