Tracking repeated issue "status"
2 posters
Page 1 of 1
Tracking repeated issue "status"
I am designing a star schema for issue tracking system, each issue has a status and this status is keeping changing along the time till the issue solved. In my opinion the accumulative snapshot fact table is the perfect match to model this schema with the following columns
Accum_Fact_Issue
{
Issue start Date (FK)
Issue Assigned Date (FK)
Issue Owned Date (FK)
Issue Delayed Date (FK)
Issue Returned Date (FK)
Issue Reassigned Date (FK)
.....
Solved Date (FK)
Assigned Lag
Owned Lag
....
Solved Lag
}
But there is one more thing that confuse me, some of the status can be repeated with no limit of repeating times,for example, the same issue can be reassigned again and then enters the same cycle till being solved and if it is not solved, it can be reassigned again and again, and also it can be returned many times and so on. So if we want to analyze the progress of solving the issues and determine the bottlenecks through determining the lags between dates and in the same time keep tracking of intermediate status changes I think the standard accumulative snapshot fact will not fit this requirements, so I changed my mind to design it as time stamping accumulating snapshot like in Design Tip #145.
The question is, is the new suggested design is matching the requirements, if yes, how can I easily determine the lags while it is saved in separate rows ?, and if not, in you opinion what is the perfect design that match this requirements ?
Accum_Fact_Issue
{
Issue start Date (FK)
Issue Assigned Date (FK)
Issue Owned Date (FK)
Issue Delayed Date (FK)
Issue Returned Date (FK)
Issue Reassigned Date (FK)
.....
Solved Date (FK)
Assigned Lag
Owned Lag
....
Solved Lag
}
But there is one more thing that confuse me, some of the status can be repeated with no limit of repeating times,for example, the same issue can be reassigned again and then enters the same cycle till being solved and if it is not solved, it can be reassigned again and again, and also it can be returned many times and so on. So if we want to analyze the progress of solving the issues and determine the bottlenecks through determining the lags between dates and in the same time keep tracking of intermediate status changes I think the standard accumulative snapshot fact will not fit this requirements, so I changed my mind to design it as time stamping accumulating snapshot like in Design Tip #145.
The question is, is the new suggested design is matching the requirements, if yes, how can I easily determine the lags while it is saved in separate rows ?, and if not, in you opinion what is the perfect design that match this requirements ?
a_sherbeeny- Posts : 15
Join date : 2009-02-04
Re: Tracking repeated issue "status"
I think the time stamped accumulating snapshot is the solution to your problem. The solution gives you more granular monitoring on the milestone fulfillment in which you can report the issue progress at any point in time in history.
Suppose you have multiple rows for the same issue with different effective dates, the lag calculation should be always based on the row relevant to the cycle. If an issue gets reassigned, a new mini cycle will start on a new row for the same issue and all the subsequent lags will be calculated based on the new set of milestones that follow the new reassigned date. At the end, you will end up with multiple versions of the same lag at different point in time.
Like other forms of effective dated fact tables, for trend analysis, you might also need a periodic snapshot fact table based on the time stamped accumulating snapshot fact.
Suppose you have multiple rows for the same issue with different effective dates, the lag calculation should be always based on the row relevant to the cycle. If an issue gets reassigned, a new mini cycle will start on a new row for the same issue and all the subsequent lags will be calculated based on the new set of milestones that follow the new reassigned date. At the end, you will end up with multiple versions of the same lag at different point in time.
Like other forms of effective dated fact tables, for trend analysis, you might also need a periodic snapshot fact table based on the time stamped accumulating snapshot fact.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Tracking repeated issue "status"
Thank you Hang for your reply it is very useful.
Just to be sure that I understood your reply, I will draw a part of the suggested design:
Accum_Fact_Issue
No other columns need to be added ?
Also I will include periodic snapshot fact table to help users generate trend analysis reports. But for lag calculating , because it would be difficult to calculate it from different rows, I would suggest to add a standard accumulating snapshot for the key milestone status values to calculate the lag between them. it that right ?
Just to be sure that I understood your reply, I will draw a part of the suggested design:
Accum_Fact_Issue
ID | OtherDimKeys | StatusDimKey | StatusChangeDateDimKey | ValidFromDateTimeStamp | ValidToDateTimeStamp | CurrentFlag | Count(Always=1) |
1 | ... | 4 | 122 | 23/12/2012 02:34:42 PM | 24/12/2012 02:34:42 PM | 0 | 1 |
2 | ... | 5 | 123 | 24/12/2012 02:34:42 PM | 01/01/2100 12:00:00 AM | 1 | 1 |
No other columns need to be added ?
Also I will include periodic snapshot fact table to help users generate trend analysis reports. But for lag calculating , because it would be difficult to calculate it from different rows, I would suggest to add a standard accumulating snapshot for the key milestone status values to calculate the lag between them. it that right ?
Last edited by a_sherbeeny on Mon Dec 24, 2012 6:26 am; edited 1 time in total (Reason for editing : Add some information)
a_sherbeeny- Posts : 15
Join date : 2009-02-04
Re: Tracking repeated issue "status"
I assume the OtherDimKeys also includ all your milestone date keys in the table, similar to your initial table structure but also having effective date pair. I would also include a Current flag to indicate the final version of all the milestone vlaues. You may not need Reassigned Date as it should be reflected by previous version of Assigned Date in another row. Think of the time stamped fact as a type 2 fact but we don't call it that way to avoid any confusion with type 2 dimension when using it.
Not quite sure about the lag calculation involving multiple rows. Normally lags in accumulating snapshot are derived from two dates in the same row depending on business requirements. Is assigned lag just the duration between Start and Assigned date on the same row? As I have said, you may have two Assigned dates for the same issue because of reassignment, one current in the same row, and one expired in another row which contains another set of milestone dates and their derived lags depending how far the previous mini-cycle had progressed before the reasignment.
Sometimes the lag could be more complicated than just taking away one milestone date from another one, even involving multirow calculation. In that case, the lag is more important to be expressed physically in the fact table to cut down runtime calculation in reports. I am not sure if it is the lag you are talking about. It looks to me all the lags in your initial post should be based on the milestone dates on the same row.
Not quite sure about the lag calculation involving multiple rows. Normally lags in accumulating snapshot are derived from two dates in the same row depending on business requirements. Is assigned lag just the duration between Start and Assigned date on the same row? As I have said, you may have two Assigned dates for the same issue because of reassignment, one current in the same row, and one expired in another row which contains another set of milestone dates and their derived lags depending how far the previous mini-cycle had progressed before the reasignment.
Sometimes the lag could be more complicated than just taking away one milestone date from another one, even involving multirow calculation. In that case, the lag is more important to be expressed physically in the fact table to cut down runtime calculation in reports. I am not sure if it is the lag you are talking about. It looks to me all the lags in your initial post should be based on the milestone dates on the same row.
Last edited by hang on Tue Dec 25, 2012 6:05 am; edited 1 time in total (Reason for editing : Typo)
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Tracking repeated issue "status"
Thanks Hang for your valuable clarification.
What I understand now is that we will include all milestones date dimension keys as well as physically storing the lag values between milestones. In regard to data loading, when running the ETL job, we update the same row when the status of the issue moves in the pipeline, for example , we update the same row when the status changes from "New" to "Assigned", or from "Assigned" to "Returned". However when the issue status changes to a previously happened before status, then instead of updating the same record, we will create a new effective record including the same info from previous record except the status that has been changed and close the old record. right ?
What I understand now is that we will include all milestones date dimension keys as well as physically storing the lag values between milestones. In regard to data loading, when running the ETL job, we update the same row when the status of the issue moves in the pipeline, for example , we update the same row when the status changes from "New" to "Assigned", or from "Assigned" to "Returned". However when the issue status changes to a previously happened before status, then instead of updating the same record, we will create a new effective record including the same info from previous record except the status that has been changed and close the old record. right ?
a_sherbeeny- Posts : 15
Join date : 2009-02-04
Re: Tracking repeated issue "status"
Close, but still not quite the same as what the time stamped accumulating snapshot fact is supposed to behave in Tip #145, as far I understand. For single cycle, it would be like type 2 SCD in which any status change will be responded by expiring the previuos record and inserting a new record with all the attributes set to previous values, as well as the new values that cause the change.
Part from normal features of accumulating snapshot when filtering on current status, the time stamped version will give you extra information about when the status changes with a by-product of catering for repeating cycles. When reassigning an issue, I think you should reset milestone dates after assign date so that you can record the mini-cycle with new set of milestones.
Part from normal features of accumulating snapshot when filtering on current status, the time stamped version will give you extra information about when the status changes with a by-product of catering for repeating cycles. When reassigning an issue, I think you should reset milestone dates after assign date so that you can record the mini-cycle with new set of milestones.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Is it a good idea to have a single STATUS table if we have to store multiple status in the data mart?
» Status Code vs Status Description
» Dimension hierarchy and repeated attributes
» Relationship between a history tracking table and a non-history tracking table?
» Tracking changes in a hierarchy
» Status Code vs Status Description
» Dimension hierarchy and repeated attributes
» Relationship between a history tracking table and a non-history tracking table?
» Tracking changes in a hierarchy
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum