Storing timestamps in fact table?
3 posters
Page 1 of 1
Storing timestamps in fact table?
Hello
I've decided to take a step back from my original post and rethink the design again, gradually building up my dimensional modelling.
For context I am aiming to build a dimensional model that is based on operating theatre data, the data is structured in a parent-child relationship between sessions and operations. There is planned data and actual data.
I am concentrating on the lowest grain in the OLTP system which is operations and I am firstly looking at the measures for actual operations. I have joined the actual and planned data to get all the necessary dimension data required for the actual operation fact.
There are measures in the actual operation fact table which are derived from various timestamps which are captured as the operation starts and ends. I would need to be able to provide the timestamps for the different stages of each operation to the end user.
My question is should I store the timestamps for each operation within the fact table or do I create an operation dimension which contains the PK of the fact table along with the timestamps as the dimension attributes. There are currently 6 timestamps associated with each operation which are captured with an accuracy of minutes. My understanding is that a fact table should not contain this data, though I am seeking clarification before progressing further.
Any help is greatly appreciated.
I've decided to take a step back from my original post and rethink the design again, gradually building up my dimensional modelling.
For context I am aiming to build a dimensional model that is based on operating theatre data, the data is structured in a parent-child relationship between sessions and operations. There is planned data and actual data.
I am concentrating on the lowest grain in the OLTP system which is operations and I am firstly looking at the measures for actual operations. I have joined the actual and planned data to get all the necessary dimension data required for the actual operation fact.
There are measures in the actual operation fact table which are derived from various timestamps which are captured as the operation starts and ends. I would need to be able to provide the timestamps for the different stages of each operation to the end user.
My question is should I store the timestamps for each operation within the fact table or do I create an operation dimension which contains the PK of the fact table along with the timestamps as the dimension attributes. There are currently 6 timestamps associated with each operation which are captured with an accuracy of minutes. My understanding is that a fact table should not contain this data, though I am seeking clarification before progressing further.
Any help is greatly appreciated.
djphatic- Posts : 20
Join date : 2012-04-21
Re: Storing timestamps in fact table?
Timestamps should be stored in the fact and not the dimension, especially if you are going to slice by them. E.g., number of operations performed within a certain date range. You should consider an Accumulating Snapshot Fact table that has all six timestamps for each operation, so you will have one row per operation. Each of these timestamps will be stored a surrogate key to the Date dimension and if you would like to store the times along with the dates, you can consider a Time of Day dimension as well. This approach can also help you identify the time taken to perform each step of an operation.
gsidhu- Posts : 10
Join date : 2012-05-10
Location : Southern California
Re: Storing timestamps in fact table?
gsidhu wrote:Timestamps should be stored in the fact and not the dimension, especially if you are going to slice by them. E.g., number of operations performed within a certain date range. You should consider an Accumulating Snapshot Fact table that has all six timestamps for each operation, so you will have one row per operation. Each of these timestamps will be stored a surrogate key to the Date dimension and if you would like to store the times along with the dates, you can consider a Time of Day dimension as well. This approach can also help you identify the time taken to perform each step of an operation.
Thanks for your reply.
I should have added that there is already a date field in the fact table. At present it is not expected that I would be slicing by the hour/minute stages of operations occurred. The times taken to perform steps of an operation have already been calculated in my T-SQL query. The timestamps are merely for information purpose that is why I was unsure whether to store them in a fact or dimension.
Interested to hear your thoughts.
djphatic- Posts : 20
Join date : 2012-04-21
Re: Storing timestamps in fact table?
If I understood correctly, your reporting requirements include providing the time it took to perform each stage of each operation (which you have calculated in your query) and not necessarily how many operations/stages were performed within a date range. Still according to me, here is how the design should look like:
Fact Operation
Operating Theater Key
Patient Key
Operation Stage 1 Completed Date Key
Operation Stage 1 Completed Time Key (Optional)
Operation Stage 2 Completed Date Key
Operation Stage 2 Completed Time Key (Optional)
Operation Stage 3 Completed Date Key
Operation Stage 3 Completed Time Key (Optional)
Operation Stage 4 Completed Date Key
Operation Stage 4 Completed Time Key (Optional)
Operation Stage 5 Completed Date Key
Operation Stage 5 Completed Time Key (Optional)
Operation Stage 6 Completed Date Key
Operation Stage 6 Completed Time Key (Optional)
Operation Count (= 1)
From these date and time keys, you can either calculate the time lags in your queries, or store the time lags in the fact table or a view like you already have.
The reason for this design is that you are building for potential future requirements and not just current requirements. In the future, you may need to filter on dates, or even if you don't, this design should serve your current needs and some other potential requirements that may or may not come your way in the future.
At this point, is there any information you have in your Operation Dimension other than the timestamps for each stage?
Fact Operation
Operating Theater Key
Patient Key
Operation Stage 1 Completed Date Key
Operation Stage 1 Completed Time Key (Optional)
Operation Stage 2 Completed Date Key
Operation Stage 2 Completed Time Key (Optional)
Operation Stage 3 Completed Date Key
Operation Stage 3 Completed Time Key (Optional)
Operation Stage 4 Completed Date Key
Operation Stage 4 Completed Time Key (Optional)
Operation Stage 5 Completed Date Key
Operation Stage 5 Completed Time Key (Optional)
Operation Stage 6 Completed Date Key
Operation Stage 6 Completed Time Key (Optional)
Operation Count (= 1)
From these date and time keys, you can either calculate the time lags in your queries, or store the time lags in the fact table or a view like you already have.
The reason for this design is that you are building for potential future requirements and not just current requirements. In the future, you may need to filter on dates, or even if you don't, this design should serve your current needs and some other potential requirements that may or may not come your way in the future.
At this point, is there any information you have in your Operation Dimension other than the timestamps for each stage?
gsidhu- Posts : 10
Join date : 2012-05-10
Location : Southern California
Re: Storing timestamps in fact table?
Thanks for your reply. There isn't anything in my Operation dimension at present other than the unique ID of the operation.
Here is my current structure of the actual operation fact table.
OperationRecordID (PK)
DateOfOperation (FK - Dim Date)
ActualSession (FK - Dim Session/Fact Session)
PatientNumber (FK - Dim Patient)
ActualConsultant (FK - Dim Consultant)
ActualSpecialty (FK - Dim Specialty)
ActualTheatre (FK - Dim Theatre)
ActualAnaesthetic (FK - Dim Anaesthetic Type)
ActualASAGrade (FK - Dim ASA Grade)
ActualOperationType (FK - Dim Operation Type)
FinishWard (FK - Dim Ward)
OperationStatus (FK - Dim Operation Status, could become measure as it is either Completed or Abandoned)
PlannedSession (FK - Dim Session/Fact Session)
PlannedConsultant (FK - Dim Consultant)
PlannedSpecialty (FK - Dim Specialty)
PlannedTheatre (FK - Dim Theatre)
AdmissionType (FK - Dim Admission Type)
PlannedAnaestheticType (FK - Dim Anaesthetic)
PlannedASAGrade (FK - Dim ASA Grade)
PlannedOperationType (FK - Dim Operation Type)
StartWard (FK - Dim Ward)
ElectiveBasis (FK Dim Elective Basis)
ExpectedTheatreTime (Measure)
OperationVerified (Measure)
AnaestheticTime (Measure - Derived from Timestamps)
SurgicalTime (Measure - Derived from Timestamps)
NeedleToSkinTime (Measure - Derived from Timestamps)
GapTime (Measure - Derived from Timestamps)
Here is my current structure of the actual operation fact table.
OperationRecordID (PK)
DateOfOperation (FK - Dim Date)
ActualSession (FK - Dim Session/Fact Session)
PatientNumber (FK - Dim Patient)
ActualConsultant (FK - Dim Consultant)
ActualSpecialty (FK - Dim Specialty)
ActualTheatre (FK - Dim Theatre)
ActualAnaesthetic (FK - Dim Anaesthetic Type)
ActualASAGrade (FK - Dim ASA Grade)
ActualOperationType (FK - Dim Operation Type)
FinishWard (FK - Dim Ward)
OperationStatus (FK - Dim Operation Status, could become measure as it is either Completed or Abandoned)
PlannedSession (FK - Dim Session/Fact Session)
PlannedConsultant (FK - Dim Consultant)
PlannedSpecialty (FK - Dim Specialty)
PlannedTheatre (FK - Dim Theatre)
AdmissionType (FK - Dim Admission Type)
PlannedAnaestheticType (FK - Dim Anaesthetic)
PlannedASAGrade (FK - Dim ASA Grade)
PlannedOperationType (FK - Dim Operation Type)
StartWard (FK - Dim Ward)
ElectiveBasis (FK Dim Elective Basis)
ExpectedTheatreTime (Measure)
OperationVerified (Measure)
AnaestheticTime (Measure - Derived from Timestamps)
SurgicalTime (Measure - Derived from Timestamps)
NeedleToSkinTime (Measure - Derived from Timestamps)
GapTime (Measure - Derived from Timestamps)
djphatic- Posts : 20
Join date : 2012-04-21
Re: Storing timestamps in fact table?
Can there be a time gap between end of one stage and the beginning of the next?
gsidhu- Posts : 10
Join date : 2012-05-10
Location : Southern California
Re: Storing timestamps in fact table?
gsidhu wrote:Can there be a time gap between end of one stage and the beginning of the next?
No. The gap time field is the gap between the previous operation ending and current operation starting.
djphatic- Posts : 20
Join date : 2012-04-21
Re: Storing timestamps in fact table?
Is the OperationRecordID in your fact table just an auto increment field or does it refer to some other table, e.g., Operation table? Because if it is just an auto increment field, then you probably don't need it. The uniqueness within the fact table should be defined by the grain of the fact, which could be one surrogate key or a set of surrogate keys. Here, it should be one record per operation if I'm not wrong.
Here's what I would suggest:
1) Add all the date keys and also time keys into this fact table. You will need a separate Time of Day Dimension for the Time keys. Since majority of the operations or at least some stages of an operation will be conducted within the same day, you'll be better off storing the time keys.
2) Get rid of the Operation Dimension and consider adding Operation Unique ID into the fact table as a Degenerate Dimension.
3) Add measures for Operation Status if you feel like but the Operation Status Key should remain as such.
Rest of your fact table looks good to me. Here's how the table structure will look:
Operation ID (Unique Operation ID - Degenerate Dimension)
ActualSession (FK - Dim Session/Fact Session)
PatientNumber (FK - Dim Patient)
ActualConsultant (FK - Dim Consultant)
ActualSpecialty (FK - Dim Specialty)
ActualTheatre (FK - Dim Theatre)
ActualAnaesthetic (FK - Dim Anaesthetic Type)
ActualASAGrade (FK - Dim ASA Grade)
ActualOperationType (FK - Dim Operation Type)
FinishWard (FK - Dim Ward)
OperationStatus (FK - Dim Operation Status)
PlannedSession (FK - Dim Session/Fact Session)
PlannedConsultant (FK - Dim Consultant)
PlannedSpecialty (FK - Dim Specialty)
PlannedTheatre (FK - Dim Theatre)
AdmissionType (FK - Dim Admission Type)
PlannedAnaestheticType (FK - Dim Anaesthetic)
PlannedASAGrade (FK - Dim ASA Grade)
PlannedOperationType (FK - Dim Operation Type)
StartWard (FK - Dim Ward)
ElectiveBasis (FK Dim Elective Basis)
ExpectedTheatreTime (Measure)
OperationVerified (Measure)
AnaestheticTime (Measure - Derived from Timestamps)
SurgicalTime (Measure - Derived from Timestamps)
NeedleToSkinTime (Measure - Derived from Timestamps)
GapTime (Measure - Derived from Timestamps)
Operation Stage 1 Start Date Key
Operation Stage 1 Start Time Key
Operation Stage 1 End Date Key
Operation Stage 1 End Time Key
Operation Stage 2 Start Date Key
Operation Stage 2 Start Time Key
Operation Stage 2 End Date Key
Operation Stage 2 End Time Key
Operation Stage 3 Start Date Key
Operation Stage 3 Start Time Key
Operation Stage 3 End Date Key
Operation Stage 3 End Time Key
Operation Stage 4 Start Date Key
Operation Stage 4 Start Time Key
Operation Stage 4 End Date Key
Operation Stage 4 End Time Key
Operation Stage 5 Start Date Key
Operation Stage 5 Start Time Key
Operation Stage 5 End Date Key
Operation Stage 5 End Time Key
Operation Stage 6 Start Date Key
Operation Stage 6 Start Time Key
Operation Stage 6 End Date Key
Operation Stage 6 End Time Key
Operation Count (= 1)
Completed Operation Count (= 1 when Operation Status = Completed)
Abandoned Operation Count (= 1 when Operation Status = Abandoned)
Of course the names for the date and time columns will be based on the stage names. Please feel free to point out if I missed anything.
Another option that I can think of is storing one record per Stage instead of one record per Operation which will make the table slimmer:
Operation ID (Unique Operation ID - Degenerate Dimension)
Operation Stage Key (FK - Dim Operation Stage - This will store names of all stages)
ActualSession (FK - Dim Session/Fact Session)
PatientNumber (FK - Dim Patient)
ActualConsultant (FK - Dim Consultant)
ActualSpecialty (FK - Dim Specialty)
ActualTheatre (FK - Dim Theatre)
ActualAnaesthetic (FK - Dim Anaesthetic Type)
ActualASAGrade (FK - Dim ASA Grade)
ActualOperationType (FK - Dim Operation Type)
FinishWard (FK - Dim Ward)
OperationStatus (FK - Dim Operation Status)
PlannedSession (FK - Dim Session/Fact Session)
PlannedConsultant (FK - Dim Consultant)
PlannedSpecialty (FK - Dim Specialty)
PlannedTheatre (FK - Dim Theatre)
AdmissionType (FK - Dim Admission Type)
PlannedAnaestheticType (FK - Dim Anaesthetic)
PlannedASAGrade (FK - Dim ASA Grade)
PlannedOperationType (FK - Dim Operation Type)
StartWard (FK - Dim Ward)
ElectiveBasis (FK Dim Elective Basis)
ExpectedTheatreTime (Measure)
OperationVerified (Measure)
AnaestheticTime (Measure - Derived from Timestamps)
SurgicalTime (Measure - Derived from Timestamps)
NeedleToSkinTime (Measure - Derived from Timestamps)
GapTime (Measure - Derived from Timestamps)
Operation Stage Start Date Key
Operation Stage Start Time Key
Operation Stage End Date Key
Operation Stage End Time Key
Here's what I would suggest:
1) Add all the date keys and also time keys into this fact table. You will need a separate Time of Day Dimension for the Time keys. Since majority of the operations or at least some stages of an operation will be conducted within the same day, you'll be better off storing the time keys.
2) Get rid of the Operation Dimension and consider adding Operation Unique ID into the fact table as a Degenerate Dimension.
3) Add measures for Operation Status if you feel like but the Operation Status Key should remain as such.
Rest of your fact table looks good to me. Here's how the table structure will look:
Operation ID (Unique Operation ID - Degenerate Dimension)
ActualSession (FK - Dim Session/Fact Session)
PatientNumber (FK - Dim Patient)
ActualConsultant (FK - Dim Consultant)
ActualSpecialty (FK - Dim Specialty)
ActualTheatre (FK - Dim Theatre)
ActualAnaesthetic (FK - Dim Anaesthetic Type)
ActualASAGrade (FK - Dim ASA Grade)
ActualOperationType (FK - Dim Operation Type)
FinishWard (FK - Dim Ward)
OperationStatus (FK - Dim Operation Status)
PlannedSession (FK - Dim Session/Fact Session)
PlannedConsultant (FK - Dim Consultant)
PlannedSpecialty (FK - Dim Specialty)
PlannedTheatre (FK - Dim Theatre)
AdmissionType (FK - Dim Admission Type)
PlannedAnaestheticType (FK - Dim Anaesthetic)
PlannedASAGrade (FK - Dim ASA Grade)
PlannedOperationType (FK - Dim Operation Type)
StartWard (FK - Dim Ward)
ElectiveBasis (FK Dim Elective Basis)
ExpectedTheatreTime (Measure)
OperationVerified (Measure)
AnaestheticTime (Measure - Derived from Timestamps)
SurgicalTime (Measure - Derived from Timestamps)
NeedleToSkinTime (Measure - Derived from Timestamps)
GapTime (Measure - Derived from Timestamps)
Operation Stage 1 Start Date Key
Operation Stage 1 Start Time Key
Operation Stage 1 End Date Key
Operation Stage 1 End Time Key
Operation Stage 2 Start Date Key
Operation Stage 2 Start Time Key
Operation Stage 2 End Date Key
Operation Stage 2 End Time Key
Operation Stage 3 Start Date Key
Operation Stage 3 Start Time Key
Operation Stage 3 End Date Key
Operation Stage 3 End Time Key
Operation Stage 4 Start Date Key
Operation Stage 4 Start Time Key
Operation Stage 4 End Date Key
Operation Stage 4 End Time Key
Operation Stage 5 Start Date Key
Operation Stage 5 Start Time Key
Operation Stage 5 End Date Key
Operation Stage 5 End Time Key
Operation Stage 6 Start Date Key
Operation Stage 6 Start Time Key
Operation Stage 6 End Date Key
Operation Stage 6 End Time Key
Operation Count (= 1)
Completed Operation Count (= 1 when Operation Status = Completed)
Abandoned Operation Count (= 1 when Operation Status = Abandoned)
Of course the names for the date and time columns will be based on the stage names. Please feel free to point out if I missed anything.
Another option that I can think of is storing one record per Stage instead of one record per Operation which will make the table slimmer:
Operation ID (Unique Operation ID - Degenerate Dimension)
Operation Stage Key (FK - Dim Operation Stage - This will store names of all stages)
ActualSession (FK - Dim Session/Fact Session)
PatientNumber (FK - Dim Patient)
ActualConsultant (FK - Dim Consultant)
ActualSpecialty (FK - Dim Specialty)
ActualTheatre (FK - Dim Theatre)
ActualAnaesthetic (FK - Dim Anaesthetic Type)
ActualASAGrade (FK - Dim ASA Grade)
ActualOperationType (FK - Dim Operation Type)
FinishWard (FK - Dim Ward)
OperationStatus (FK - Dim Operation Status)
PlannedSession (FK - Dim Session/Fact Session)
PlannedConsultant (FK - Dim Consultant)
PlannedSpecialty (FK - Dim Specialty)
PlannedTheatre (FK - Dim Theatre)
AdmissionType (FK - Dim Admission Type)
PlannedAnaestheticType (FK - Dim Anaesthetic)
PlannedASAGrade (FK - Dim ASA Grade)
PlannedOperationType (FK - Dim Operation Type)
StartWard (FK - Dim Ward)
ElectiveBasis (FK Dim Elective Basis)
ExpectedTheatreTime (Measure)
OperationVerified (Measure)
AnaestheticTime (Measure - Derived from Timestamps)
SurgicalTime (Measure - Derived from Timestamps)
NeedleToSkinTime (Measure - Derived from Timestamps)
GapTime (Measure - Derived from Timestamps)
Operation Stage Start Date Key
Operation Stage Start Time Key
Operation Stage End Date Key
Operation Stage End Time Key
gsidhu- Posts : 10
Join date : 2012-05-10
Location : Southern California
Re: Storing timestamps in fact table?
The OperationRecordID is the unique identifier for the operation from the source system, which should be a degenerate dimension.
The interval times will usually be on the same day expect for emergency operations which occur late at night and go into the next day.
I'll look into using a Time dimension, along with Date dimension for the different intervals of the operation. There are occasions when only some of the intervals are populated though I don't think this will cause any issues?
The interval times will usually be on the same day expect for emergency operations which occur late at night and go into the next day.
I'll look into using a Time dimension, along with Date dimension for the different intervals of the operation. There are occasions when only some of the intervals are populated though I don't think this will cause any issues?
djphatic- Posts : 20
Join date : 2012-04-21
Re: Storing timestamps in fact table?
You can keep the OperationRecordID as a Degenerate Dimension, so you won't need the Operation Dimension. If some of the stage timestamps don't apply to an operation, it won't be a problem as you can populate those timestamps with the date key that corresponds to blank date. E.g. I use "-1" as date key which corresponds to a date = 1900-01-01 in the Date dimension. Your time key will be the one that corresponds to time = 00:00:00, i.e. hour = 0, minute = 0 and second = 0.
gsidhu- Posts : 10
Join date : 2012-05-10
Location : Southern California
Re: Storing timestamps in fact table?
Hi,
If you have the requirement that you want to count the no. of patients in a particular operation stage based on the date, you can have the following design.
Operating Theater Key
Patient Key
Calendar Key
Operation Stage 1 Count
Operation Stage 2 Count
Operation Stage 3 Count
Operation Stage 4 Count
Operation Stage 5 Count
Operation Stage 6 Count
.
So you will have 1 record for each of the six stages where Operating Theater Key & Patient Key will be the same but calendar key will change as you move from 1 stage to the next. Each stage count measure will have value '1' at that particular stage and the other measure columns will have a 0.
.
If you have the requirement that you want to count the no. of patients in a particular operation stage based on the date, you can have the following design.
Operating Theater Key
Patient Key
Calendar Key
Operation Stage 1 Count
Operation Stage 2 Count
Operation Stage 3 Count
Operation Stage 4 Count
Operation Stage 5 Count
Operation Stage 6 Count
.
So you will have 1 record for each of the six stages where Operating Theater Key & Patient Key will be the same but calendar key will change as you move from 1 stage to the next. Each stage count measure will have value '1' at that particular stage and the other measure columns will have a 0.
.
Jebyjj- Posts : 1
Join date : 2012-05-31
Similar topics
» Storing Fiscal Year in a table
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
» Storing data that changes periodically - should I use a periodic snapshot fact?
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» Integrating new fact table which has one to many relationship with the main fact table in existing star schema
» How to create fact table with measures derived from comparing two fact table rows
» Storing data that changes periodically - should I use a periodic snapshot fact?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum