CRM DW, measuring activity status durations

View previous topic View next topic Go down

CRM DW, measuring activity status durations

Post  hugoV on Sat Jul 05, 2014 11:06 am

Hi,

I'm designing a CRM DW.

Activities can be created due to an order, a request or a complaint.
There are several types of activities, some are typically opened and closed in a few minutes others may take up to a couple of month.

In any given day (for example the 15th of June) the business would like to be able to efficiently get the following information:

  1. Number of activities created in that day
  2. Number of activities that entered the Assigned status on that day
  3. Number of activities that entered the Waiting customer info status on that day
  4. Number of activities that entered the completed status on that day
  5. Number of activities that entered the canceled status on that day
  6. Number of activities on the Assigned status on that day
  7. Number of activities on the Waiting customer info status on that day

Note: there are more status.
Besides the Nş of activities we also need to know the duration of the activities, so we can see that Activities of Type A are taking on average X days.

Take into consideration that activities can go from Assigned to Waiting customer info to Assigned to Waiting customer info ...

I can have a transnational fact to answer the first 5 questions.
For duration and for the last 2 questions I was considering to have a daily snapshot with the status, total activity duration and total current status duration of each activity.
The activity could change status more than once per day so  should I have 1 row per status per activity each day, and would need a flag to identify the last status of the day or use the end time to pick the last.

Is this the recommended approach? Is there a better one?

I know about accumulating snapshots but it doesn't allow to correctly track an activity that was on status Assigned multiple times.
I also know about time stamped accumulating snapshots but I don't understand how I can efficiently query the DB with them.
I mean, with start and end dates for the status I want to know the nş of activities on Assigned status on day X how many days of data will the query search?

Thank you,
Hugo

hugoV

Posts : 6
Join date : 2014-07-05

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  manickam on Mon Jul 07, 2014 12:54 am

A transaction fact table with your grain as status,activity,timestamp will meet your requirements.

manickam

Posts : 27
Join date : 2013-04-26

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  hugoV on Mon Jul 07, 2014 4:33 am

Are you advinsing on something like:
20140101 10:00:00 Activity1 Assigned
20140110 09:00:00 Activity1 Waiting for user info
20140310 09:00:00 Activity1 Assigned

When the business needs information about the nş of activities in status "Waiting for user info" on day 2014-03-01.
How do you efficiently gather this information?
And by that I mean that I dont want a query that crosses several months of data to figure out the status of an activitiy in any given day.

The daily snapshot would allow me to search only one partition for the answer.
I was just wondering if there is any other efficient aproach that I'm missing.

hugoV

Posts : 6
Join date : 2014-07-05

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  nick_white on Mon Jul 07, 2014 6:01 am

Just for clarification, for points 6 & 7 do you mean given that status at some point during the day or they have that status at some arbitrary end-of-day cutoff time?
For example, if a record started that day with a status of Assigned, went to Waiting customer info, went back to Assigned and then went to Completed - all in one day - would you expect the count to be:
2 - it had the status Assigned twice in the day
1 - it had the status Assigned at some point in the day (but you don't care how many times)
0 - it didn't end the day with a status of assigned

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  hugoV on Mon Jul 07, 2014 6:25 am

For some reports we need to show the full history of the activities, and knowing that it was on assigned status 2x is important.

Most reports should show aggregated data like:
- There where Y1 activities of type T1 closed in January. Those activities where in the assigned status on average 6 hours (involves counting total assigned time).
- On day D1, there where Y2 activities that ended the day in the assigned status.

But there are some reports where business needs to know on average how log each "assigned" iteration takes, so an activity may be 3x on assigned in a given day, for a total of 3 hours but, on average, each assigned iteration took 1 hour.

Also, when something is off business needs to be able to drill down and see the details of one activity, so they expect to be able to see the full history of one activity.
Detais include, who worked on the activity in each step, durations, ...

hugoV

Posts : 6
Join date : 2014-07-05

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  nick_white on Mon Jul 07, 2014 6:51 am

Have a look at "Timespan Fact Tables" in Kimball's book - this may be a solution to your problem. Basically you add effective start/end dates to your fact table and whenever your activity status changes you update the end date on the existing fact record and create a new fact record - basically its like an SCD2 but for facts rather than dimensions. You can also calculate the duration and hold it as a measure when you end each record or any SQL query can calculate the duration by taking the difference between the two effective dates. You'd need to be a bit careful calculating durations for records that haven't been ended e.g. substituting the current time as the effective end time at runtime.
There's probably other useful info. you can add to the fact or calculate in SQL at run time. For example (and assuming my logic is correct) if the date portion of the end date is greater than the date portion of the start date then you know that this record was the current one at e.o.d for that day (assuming a midnight e.o.d) - or if you have a different e.o.d cutoff point you can play around with the data and time portions of these two columns to flag the record if is was an e.o.d record

nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  hugoV on Mon Jul 07, 2014 9:20 am

Using the sugested method to know the number of activities on status X on day Y I would have to query all historical data having date < day Y.
This worries me since we could end up with poor performance on reports.

Currently I'm analysing if a mix between a "timespan fact table" and an "accumulating snapshot fact table" is reasonable to maintain and to use in reports.

I would need to:
- maintain the data in a "timespan fact table" or traditional "transactional fact table".
- create an "accumulating snapshot timespan fact table".

Assuming daily snapshots we would daily:
- copy all non finished activities from previous day snapshot.
- load today records like a timespan fact table.

Adding the indicators "last activity record" and "last activity record of the day" I could answer all questions quering only 1 day of data.

I believe most of activities (around 80%) end on the same day they are created and only about 1% should take more than 1 month.
Currently I'm estimating that this model will require close to 2x the amount of space of a traditional snapshot fact table.

Any "alarms" you can think of regarding this aproach?
Any better aproach?

Or any other comment?

Thank you,
Hugo

hugoV

Posts : 6
Join date : 2014-07-05

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  nick_white on Mon Jul 07, 2014 10:20 am

If you want to know the number of activities of status X on day Y then you just filter on STATUS = 'X' and EFF_START_DATE <= Y and EFF_END_DATE >= Y. Assuming you've added appropriate indexes, and/or partitioned the table appropriately performance shouldn't be an issue. Or am I missing something in what you are trying to do? How many activity change of status records would you expect per day? If the answer is less than 10's of millions then any appropriately spec'd and configured DB should be able to handle this without any trouble.

While the snapshot approach would work I'm sure I'm not convinced it gives you much/any benefit and has the downside of adding considerably to your build/maintenance and storage - but always happy to be proved wrong


nick_white

Posts : 366
Join date : 2014-01-06
Location : London

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  hugoV on Mon Jul 07, 2014 11:18 am

That's the query I imagine:

SELECT ... WHERE EFF_START_DATE <= Y and EFF_END_DATE >= Y

I'll partition the table by START DATE but currently I'm not confident on production environment performance.
It's a new customer to me and the feedback I got is that the hardware is outdated and is being used for several projects and may not deliver the performance we hoped (haven't been able to properly test it yet).

Nevertheless, I agree with you but would add 1 restriction: with partitions, indexes, decent hardware and up to a few millions records per day I wouldn't think twice

My next step is to make a few tests on the production environment.

hugoV

Posts : 6
Join date : 2014-07-05

View user profile

Back to top Go down

Re: CRM DW, measuring activity status durations

Post  Sponsored content


Sponsored content


Back to top Go down

View previous topic View next topic Back to top


 
Permissions in this forum:
You cannot reply to topics in this forum