service fulfillment - one fact or many?
3 posters
Page 1 of 1
service fulfillment - one fact or many?
I'm trying to model a service fulfillment process which essentially begins with the receipt of an audio message and ends up with the delivery of a text conversion of that audio.
In between the start and end events there can be several other steps that happen before the conversion. I've modelled this by creating a dimension with all of the different events that can act upon an audio message. The fact table looks like:
fact_audio_transaction
dim_event_key
dim_day_key
dim_time_key
dim_language_key
audio_id (dd)
transaction_duration
Possible values for the dimension table are (but not limited to):
audio_received
audio_cleaned_up
audio_converted
audio_language_identified
audio_converted
audio_quality_set
audio_unconverted
conversion_delivered
The business is generally interested in knowing how many converted messages we processed yesterday, how many were unconverted, what was the avg transaction_duration for each individual audio file. In addition they may want to know how long individual steps in the process took.
On average we process 2 million audio_messages each day and each audio has about 10 events associated with it so the total duration to convert an audio_message is the sum of all of durations for its associated events.
Is there any value in creating a fact_delivered table that represents the final conversion event or would it make no sense to do this? Something like:
fact_delivered
dim_day_key
dim_time_key
dim_language_key
audio_id (dd)
total_conversion_duration
Any pointers or tips would be much appreciated.
Thanks,
In between the start and end events there can be several other steps that happen before the conversion. I've modelled this by creating a dimension with all of the different events that can act upon an audio message. The fact table looks like:
fact_audio_transaction
dim_event_key
dim_day_key
dim_time_key
dim_language_key
audio_id (dd)
transaction_duration
Possible values for the dimension table are (but not limited to):
audio_received
audio_cleaned_up
audio_converted
audio_language_identified
audio_converted
audio_quality_set
audio_unconverted
conversion_delivered
The business is generally interested in knowing how many converted messages we processed yesterday, how many were unconverted, what was the avg transaction_duration for each individual audio file. In addition they may want to know how long individual steps in the process took.
On average we process 2 million audio_messages each day and each audio has about 10 events associated with it so the total duration to convert an audio_message is the sum of all of durations for its associated events.
Is there any value in creating a fact_delivered table that represents the final conversion event or would it make no sense to do this? Something like:
fact_delivered
dim_day_key
dim_time_key
dim_language_key
audio_id (dd)
total_conversion_duration
Any pointers or tips would be much appreciated.
Thanks,
Last edited by salaman on Fri Mar 25, 2011 1:01 pm; edited 1 time in total (Reason for editing : provided additional record count information)
salaman- Posts : 21
Join date : 2011-03-24
Re: service fulfillment - one fact or many?
I've been reading up a bit on pipelines and order fullfilment in Kimball's "Complete Guide to Dimensional Modelling". It seems to me that each process within the fulfilment process above can be described more accurately by a seperate fact table. Whether I need that much detail or not is another question altogether.
Kimball then goes on to state that when users are interested in analyzing the entire pipeline that an accumulating snapshot can provide this.
In my example users are particularly interested in knowing how long it takes for a conversion to move through the pipeline so maybe an accumulating snapshot would help. Here's where I get lost however.....
What happens when one or more of the events don't happen?
For example, depending on what happens along the pipeline, the audio_converted event might not happen, in which case an audio is marked as unconverted. Do I set the audio_converted value to 9999-12-31?
Kimball then goes on to state that when users are interested in analyzing the entire pipeline that an accumulating snapshot can provide this.
In my example users are particularly interested in knowing how long it takes for a conversion to move through the pipeline so maybe an accumulating snapshot would help. Here's where I get lost however.....
What happens when one or more of the events don't happen?
For example, depending on what happens along the pipeline, the audio_converted event might not happen, in which case an audio is marked as unconverted. Do I set the audio_converted value to 9999-12-31?
salaman- Posts : 21
Join date : 2011-03-24
Re: service fulfillment - one fact or many?
You have a metric for each stage as well as a date. The date is set to the unknown value in your date dimension. The metric is set to 0.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: service fulfillment - one fact or many?
I would think that if the Audio Coverted never happened, it wouldn't get loaded into the fact.
I assume the "Audio Coverted" is a row in the Event dimension. If an event never occurs, then you have nothing to load into the fact table.
I guess the design isn't clear to me.
I assume the "Audio Coverted" is a row in the Event dimension. If an event never occurs, then you have nothing to load into the fact table.
I guess the design isn't clear to me.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: service fulfillment - one fact or many?
BoxesAndLines wrote:You have a metric for each stage as well as a date. The date is set to the unknown value in your date dimension. The metric is set to 0.
Thanks, that make sense.
In which situation would I use one approach over the other (i.e. using my original fact_audio_transaction table where each event appears on a new row or the accumulating snapshot)? Or is it not a question of one approach being better but rather catering to a specific need?
I must admit that I'm not sure on the accumulating snapshot - there are something like 20 possible events (some of which hardly come to pass) in this business process and I'm not sure about having a really wide fact table. Also, there's a lot of potential for change or new events coming on board and implementing those changes into the fact table in the future would mean adding columns to a fact table rather than rows to a dim_events table.
Last edited by salaman on Mon Mar 28, 2011 11:42 am; edited 2 times in total
salaman- Posts : 21
Join date : 2011-03-24
Re: service fulfillment - one fact or many?
Jeff Smith wrote:I would think that if the Audio Coverted never happened, it wouldn't get loaded into the fact.
I assume the "Audio Coverted" is a row in the Event dimension. If an event never occurs, then you have nothing to load into the fact table.
I guess the design isn't clear to me.
Hi Jeff,
You're right that in my original design if an event never occurs then there's nothing to load into the fact table. I was however referring to a possible accumulating snapshot that would look something like this.
fact_audio
dim_event_key
dim_day_key
dim_time_key
dim_language_key
audio_id (dd)
audio_received_date_key
audio_received_duration
conversion_date_key
conversion_date_duration
etc....
salaman- Posts : 21
Join date : 2011-03-24
Similar topics
» Order fulfillment accumulating fact - problems with status movement
» A fact table for each service line of business?
» Modelling Service Visits
» Customer - Account - Service modelling
» Questions about a subscription based online service
» A fact table for each service line of business?
» Modelling Service Visits
» Customer - Account - Service modelling
» Questions about a subscription based online service
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum