Advice for modelling planned relational data
Page 1 of 1
Advice for modelling planned relational data
Hi All,
I’m hoping someone might be able to offer a little advice for a KPI/measure I’m looking at modelling in a dimension model. The data I want to provide to the business is around operational planning. Periodically through the day our planning team will submit plans out of our plan software which is passed via BizTalk into our operational (execution software). These plans can be sent often throughout the day for a customisable horizon window. I want to provide them that information along with the actual data as completed.
i.e. Planners will tend to submit plan now to weeks into the future, the operational guys who work within a 24 hour window will only go for a day or 2.
If I break down the plan message (being mindful I only want to look at accepted plans), I have something of this schema.
• Plan (PlanID, UserID, PlanDate, PlanHorizon)
• Plan.Train (TrainNumber, StationPlatform, ETA, ETD)
• Plan.Train.Rake (RakeNo)
• Plan.Train.Rake.Task (TaskID, StartTime, EndTime, Tonnes)
• Plan.DeletedTasks (TaskID)
So for example, a plan is made up of trains that are made up of rakes (or packages) and for each package (as they are of different product) there are tasks associated with them.
So how do I model this?
I’m really at a loss at this stage on the best design and heck the dimensions to use. The data isn’t really additive or semi-additive.
So I’ve decided to work from my immediate need/goal as a proof of concept to see how it goes.
factPlanAnalysis (PlanID, UserID, PlanDate, PlanHorizon, TrainNumberID (FK), StationPlatform, RakeNo, TaskID, StartTime, EndTime, Tonnes, DeletedTaskID)
dimTrain (TrainNumberID (PK), TrainNumber, StationPlatform , ATA, ATD)
My concern with this is that as Train and DeletedTask data aren’t dependant in the plan I end up with lots of rows in the fact table.
i.e
Plan Data columns.., Train Data columns.., DeletedTasks columns..
1.., Train1, NULL
1.., Train2, NULL
1.., Train3, NULL
1.., NULL , 1000
1.., NULL , 1001
2.., Train1, NULL
2.., Train2, NULL
2.., Train3, NULL
2.., Train4, NULL
2.., NULL , 1002
2.., NULL , 1003
Also, when it comes to the analysis it’s across the fact AND the dimension – so not really a ‘slice-and-dice’ design.
Has anyone come up against this type of data? Maybe I need to simplify it further and create more fact tables – factPlanAnalysis_Trains, factPlanAnalysis_DeletedTasks.
I’ve been working through the Kimball modelling book but the KPI/measures are very basic.
Maybe I’m tackling something too complex as a start?
Any help would be great!
I’m hoping someone might be able to offer a little advice for a KPI/measure I’m looking at modelling in a dimension model. The data I want to provide to the business is around operational planning. Periodically through the day our planning team will submit plans out of our plan software which is passed via BizTalk into our operational (execution software). These plans can be sent often throughout the day for a customisable horizon window. I want to provide them that information along with the actual data as completed.
i.e. Planners will tend to submit plan now to weeks into the future, the operational guys who work within a 24 hour window will only go for a day or 2.
If I break down the plan message (being mindful I only want to look at accepted plans), I have something of this schema.
• Plan (PlanID, UserID, PlanDate, PlanHorizon)
• Plan.Train (TrainNumber, StationPlatform, ETA, ETD)
• Plan.Train.Rake (RakeNo)
• Plan.Train.Rake.Task (TaskID, StartTime, EndTime, Tonnes)
• Plan.DeletedTasks (TaskID)
So for example, a plan is made up of trains that are made up of rakes (or packages) and for each package (as they are of different product) there are tasks associated with them.
So how do I model this?
I’m really at a loss at this stage on the best design and heck the dimensions to use. The data isn’t really additive or semi-additive.
So I’ve decided to work from my immediate need/goal as a proof of concept to see how it goes.
factPlanAnalysis (PlanID, UserID, PlanDate, PlanHorizon, TrainNumberID (FK), StationPlatform, RakeNo, TaskID, StartTime, EndTime, Tonnes, DeletedTaskID)
dimTrain (TrainNumberID (PK), TrainNumber, StationPlatform , ATA, ATD)
My concern with this is that as Train and DeletedTask data aren’t dependant in the plan I end up with lots of rows in the fact table.
i.e
Plan Data columns.., Train Data columns.., DeletedTasks columns..
1.., Train1, NULL
1.., Train2, NULL
1.., Train3, NULL
1.., NULL , 1000
1.., NULL , 1001
2.., Train1, NULL
2.., Train2, NULL
2.., Train3, NULL
2.., Train4, NULL
2.., NULL , 1002
2.., NULL , 1003
Also, when it comes to the analysis it’s across the fact AND the dimension – so not really a ‘slice-and-dice’ design.
Has anyone come up against this type of data? Maybe I need to simplify it further and create more fact tables – factPlanAnalysis_Trains, factPlanAnalysis_DeletedTasks.
I’ve been working through the Kimball modelling book but the KPI/measures are very basic.
Maybe I’m tackling something too complex as a start?
Any help would be great!
colabus- Posts : 1
Join date : 2013-01-23
Similar topics
» Dimensional Modelling compared to Relational (3NF +) Modelling for DSS
» Cube design Planned and Actual data?
» Modelling - Financial Advice (Sales)
» Is it the end of the Relational Dimensional Data Warehouse ?
» Track hierarchical slowly changing data which allows relational multiple future dates and retro terms
» Cube design Planned and Actual data?
» Modelling - Financial Advice (Sales)
» Is it the end of the Relational Dimensional Data Warehouse ?
» Track hierarchical slowly changing data which allows relational multiple future dates and retro terms
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum