Question - creating a dimensional model for facility management
3 posters
Page 1 of 1
Question - creating a dimensional model for facility management
Hello, folks. I'd like to ask some newbie questions.
I'm starting a multidimensional model for facilities management from scratch, and I'll certainly need a SPACE dimension, since it's the grain in which all the spatial information is stored in the transactional system. Here goes one of its full hierarchies (left 1 : M right, except for Site 1 : 1 Address), each one being a relational table in the transactional system:
CONTINENT -> COUNTRY -> STATE -> CITY -> ADDRESS -> SITE -> BUILDING -> FLOOR -> SPACE
Question 1 - spaces and some higher hierarchy members have a wide number of measurement fields such as gross area, raw area, usable area, etc. Which of these would be the best way of modeling my space information from a dimensional perspective? Is more than one of those correct? If so, in which cases each approach would be more effective? Are there other options?
- create a DIM_SPACE with the "pure descritive" attributes and a FACT_SPACE_MEASUREMENTS with the measurement fields, using drill accross from other fact tables to retrieve these metrics when needed
- put both the "pure descritive" attributes and measurement fields all on a DIM_SPACE
- create only a DIM_SPACE with the "pure descritive" attributes, skip the FACT_SPACE_MEASUREMENTS and include only the measurement fields in future fact tables that eventually needs them
Although the 1st option seems the best one to me, I'm kinda resistant to create both a dimension and a fact tables to reference data from the same relational table. I've considered the 2nd option, but going that way could compromise something in the future. Anyway, if you could comment each one, I'd be very thankful!
Question 2 - the measures will be mostly used in meters, but some reports will need them in feet as well. In this case, since only those two units of measurement are needed (and that's really not going to change), would it be better to create two columns for each measure, one for meters and other for feet, or using a single column and a calculated attribute in my front-end application with a formula like measure_in_feet = measure (already in meters) * 0.3?
Question 3 - suppose that to store the complete hierarchy in a conformed DIM_SPACE I'd need 60 columns from space and their parents. While my relational space table has 200k records, there are no more than 1k rows in the floor table. Again, I have considered the following alternatives:
- create a single DIM_SPACE table with 200k records and the 60 fields for all hierarchies
- create a DIM_SPACE table with 200k records and about 20 fields, plus a DIM_FLOOR table witk 1k records and about 40 fields for the hierarchies above FLOOR
Although the disk space saved by the 2nd approach is not significant, the impact on user queries using a DIM_FLOOR containing only a few thousand rows instead of a considerably bigger DIM_SPACE could be significant(?). How much does a 60 column table seems less user-friendly than 1 with 20 columns coupled with another one with 40 columns IF NEEDED (FLOOR )? Should I discard this option anyway since The Data Warehouse Toolkit, pg 327 stated this as a mistake ("Mistake 8: Split hierarchies and hierarchy levels into multiple dimensions"), and use a single table?
That's all for today. I have another big problem to solve (modeling a M:M relationship between employees and seats that can change over time), but more on that later. Thanks in advance, and sorry for my poor English :-)
I'm starting a multidimensional model for facilities management from scratch, and I'll certainly need a SPACE dimension, since it's the grain in which all the spatial information is stored in the transactional system. Here goes one of its full hierarchies (left 1 : M right, except for Site 1 : 1 Address), each one being a relational table in the transactional system:
CONTINENT -> COUNTRY -> STATE -> CITY -> ADDRESS -> SITE -> BUILDING -> FLOOR -> SPACE
Question 1 - spaces and some higher hierarchy members have a wide number of measurement fields such as gross area, raw area, usable area, etc. Which of these would be the best way of modeling my space information from a dimensional perspective? Is more than one of those correct? If so, in which cases each approach would be more effective? Are there other options?
- create a DIM_SPACE with the "pure descritive" attributes and a FACT_SPACE_MEASUREMENTS with the measurement fields, using drill accross from other fact tables to retrieve these metrics when needed
- put both the "pure descritive" attributes and measurement fields all on a DIM_SPACE
- create only a DIM_SPACE with the "pure descritive" attributes, skip the FACT_SPACE_MEASUREMENTS and include only the measurement fields in future fact tables that eventually needs them
Although the 1st option seems the best one to me, I'm kinda resistant to create both a dimension and a fact tables to reference data from the same relational table. I've considered the 2nd option, but going that way could compromise something in the future. Anyway, if you could comment each one, I'd be very thankful!
Question 2 - the measures will be mostly used in meters, but some reports will need them in feet as well. In this case, since only those two units of measurement are needed (and that's really not going to change), would it be better to create two columns for each measure, one for meters and other for feet, or using a single column and a calculated attribute in my front-end application with a formula like measure_in_feet = measure (already in meters) * 0.3?
Question 3 - suppose that to store the complete hierarchy in a conformed DIM_SPACE I'd need 60 columns from space and their parents. While my relational space table has 200k records, there are no more than 1k rows in the floor table. Again, I have considered the following alternatives:
- create a single DIM_SPACE table with 200k records and the 60 fields for all hierarchies
- create a DIM_SPACE table with 200k records and about 20 fields, plus a DIM_FLOOR table witk 1k records and about 40 fields for the hierarchies above FLOOR
Although the disk space saved by the 2nd approach is not significant, the impact on user queries using a DIM_FLOOR containing only a few thousand rows instead of a considerably bigger DIM_SPACE could be significant(?). How much does a 60 column table seems less user-friendly than 1 with 20 columns coupled with another one with 40 columns IF NEEDED (FLOOR )? Should I discard this option anyway since The Data Warehouse Toolkit, pg 327 stated this as a mistake ("Mistake 8: Split hierarchies and hierarchy levels into multiple dimensions"), and use a single table?
That's all for today. I have another big problem to solve (modeling a M:M relationship between employees and seats that can change over time), but more on that later. Thanks in advance, and sorry for my poor English :-)
Demitri- Posts : 9
Join date : 2010-07-27
Re: Question - creating a dimensional model for facility management
Welcome to the fun of dimensional modelling.
Question 1 :
Here's how I would think about it.
The space is an physical object, so any descriptive attributes of the physical object itself could exist in the dimension. In your case, Gross Area, Raw Area, Usable Area are valid descriptive attributes of the physical item, and could be on the dimension as Type-2 attributes.
Other "area" measurements are more likely the result of business events, and are best left to fact tables. E.g. Occupied_area, Vacant_Area.
Some of the physical attributes may also exist on a fact table , for example, you could have a snapshot fact table where the grain is one record for each site at the beginning of every month. Measures could include Gross Area, Usable Area, Occupied Area, Vacant Area
So the answer is "Both".
Question 2 :
Given that there are only 2 units of measure, I'd precalculate both (e.g. Gross_Area_Feet,Gross_Area_Meters) rather than embedding a runtime calculation in the BI layer.
Question 3 :
DIM_SPACE should have all the attributes in it. The only reason to create a "DIM_FLOOR" would be if you had facts at that level, but even so, it would be a subset of the attributes in DIM_SPACE, not a replacement.
For example, the hierarchy (and associated attributes)
DIM_SPACE : CONTINENT -> COUNTRY -> STATE -> CITY -> SITE -> BUILDING -> FLOOR -> SPACE
DIM_FLOOR : CONTINENT -> COUNTRY -> STATE -> CITY -> SITE -> BUILDING -> FLOOR
You'll notice I left out address .... its not part of the hierarchy, but rather an attribute of the SITE.
Employees to seats ... most likely this is a fact relationship ... FACT_EMPLOYEE_SEATING...one record for every seat assignment.
Hope this helps .... have fun!
LAndrews- Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada
Re: Question - creating a dimensional model for facility management
Hello, LAndrews. Thanks for your reply. Here goes some comments, and new questions :-)
About question 1, I really designed the DIM_SPACE with the almost unchangeable measures on it, such as total, gross and raw areas. But a FACT_SPACE_MEASUREMENTS wouldn't be wrong either, would it? Also, since we don't intend to track "construction facts" (read: create a specific fact table for this), does attributes like occuped area and vacant area fit in my dimension as well? Is this logic applyable in most cases: "If an event that changes an attribute will not be stored as a fact, these attributes can be stored in a dimension."?
About question 2, I agree with you. But now I'm curious about a somewhat similar scenario: let's say I have a fact table with 15 attributes that express values in dollars. Should the need to analyze these values using 6 or 7 other currencies arise, how could a new table handling currencies fit in my model?
About question 3, you're right, it's much more practical that way, and thanks for the input on the addres being not part of the hierarchy, now I understand it better.
About my relationship between employees and seats, it's not that easy. Some seats are "rotating", meaning that they can be assigned to more than one person at a time. Also, our salespersons can use rotating seats in different offices at the same time, so we have a M:M relationship. I'll do some reading about bridges and try to figure out something, then I'll post it here to ask for opinions; yours would be more than welcome.
Once again, thanks forhelping me!
About question 1, I really designed the DIM_SPACE with the almost unchangeable measures on it, such as total, gross and raw areas. But a FACT_SPACE_MEASUREMENTS wouldn't be wrong either, would it? Also, since we don't intend to track "construction facts" (read: create a specific fact table for this), does attributes like occuped area and vacant area fit in my dimension as well? Is this logic applyable in most cases: "If an event that changes an attribute will not be stored as a fact, these attributes can be stored in a dimension."?
About question 2, I agree with you. But now I'm curious about a somewhat similar scenario: let's say I have a fact table with 15 attributes that express values in dollars. Should the need to analyze these values using 6 or 7 other currencies arise, how could a new table handling currencies fit in my model?
About question 3, you're right, it's much more practical that way, and thanks for the input on the addres being not part of the hierarchy, now I understand it better.
About my relationship between employees and seats, it's not that easy. Some seats are "rotating", meaning that they can be assigned to more than one person at a time. Also, our salespersons can use rotating seats in different offices at the same time, so we have a M:M relationship. I'll do some reading about bridges and try to figure out something, then I'll post it here to ask for opinions; yours would be more than welcome.
Once again, thanks forhelping me!
Demitri- Posts : 9
Join date : 2010-07-27
Re: Question - creating a dimensional model for facility management
When dealing with currencies, it is always a good idea to store measures in both local currency and a standard currency (dollars, euros, or yen usually) that the corporation uses to keep its books. You then maintain a conversion table to restate fact in some other currency. Work with someone in Accounting to figure out what rates should be used and when.
New scenario: additive x hierarchy
Thank you ngalemmo for your reply. But this conversion table wouldn't fit neither as a fact nor a dimension, right? How should I represent it in my multidimensional data model? Would it be a "loose" table, without any relationship?
Back to my "facilities problem", new entry about question 3: I've talked a bit with some people from the transactional system, and discovered that floor, building and site all have metrics that are exclusive for these kind of entities, thus they are not stored in the lowest grain (space). In fact, some of these measurements are stored in specific tables (Space_Measurements, Floor_Measurements, etc).
If I put all the data in a single table, it'd look something like:
This complicates things a little, because the measurements of the members above space in the hierarchy won't be additive. For example: building A has three floors; the first one has Floor_Space_Actual=1800, the second one has Floor_Space_Actual=1650 and the third one has Floor_Space_Actual=2000, so my "Total Floor Space Actual in building A" is 1800+1650+2000=5450 ft, but if I submit a SELECT SUM(Floor_Space_Actual) FROM DIM_SPACE WHERE BUILDING = 'BUILDING A' the answer will be 1800+1800+1650+1650+2000=8900, but that won't be correct.
Would I need a DIM_SPACE, a DIM_FLOOR, a DIM_BUILDING and a DIM_SITE to solve this problem without appealing to a snow flake? That's a REALLY new situation to me, so any inputs, once again, would be more than welcome!
Back to my "facilities problem", new entry about question 3: I've talked a bit with some people from the transactional system, and discovered that floor, building and site all have metrics that are exclusive for these kind of entities, thus they are not stored in the lowest grain (space). In fact, some of these measurements are stored in specific tables (Space_Measurements, Floor_Measurements, etc).
If I put all the data in a single table, it'd look something like:
Space | Space_Raw_Sq_Ft | Floor | Floor_Space_Actual | Building | Building_Rentable_Eligible_For_Building_Sharing | Site | Site_Usable_Rentable |
Seat 1 | 12 | 1st Floor | 1800 | Building A | 5000 | Site | 1400 |
Seat 2 | 13 | 1st Floor | 1800 | Building A | 5000 | Site | 1400 |
Seat 3 | 12 | 2nd Floor | 1650 | Building A | 5000 | Site | 1400 |
Seat 4 | 15 | 2nd Floor | 1650 | Building A | 5000 | Site | 1400 |
Seat 5 | 15 | 3rd Floor | 2000 | Building A | 5000 | Site | 1400 |
Seat 6 | 18 | 1st Floor | 3200 | Building B | 800 | Site | 1400 |
Seat 7 | 14 | 1st Floor | 3200 | Building B | 800 | Site | 1400 |
This complicates things a little, because the measurements of the members above space in the hierarchy won't be additive. For example: building A has three floors; the first one has Floor_Space_Actual=1800, the second one has Floor_Space_Actual=1650 and the third one has Floor_Space_Actual=2000, so my "Total Floor Space Actual in building A" is 1800+1650+2000=5450 ft, but if I submit a SELECT SUM(Floor_Space_Actual) FROM DIM_SPACE WHERE BUILDING = 'BUILDING A' the answer will be 1800+1800+1650+1650+2000=8900, but that won't be correct.
Would I need a DIM_SPACE, a DIM_FLOOR, a DIM_BUILDING and a DIM_SITE to solve this problem without appealing to a snow flake? That's a REALLY new situation to me, so any inputs, once again, would be more than welcome!
Demitri- Posts : 9
Join date : 2010-07-27
Re: Question - creating a dimensional model for facility management
The conversion table would be a bridge... Dimensional models are really made up of facts, dimensions and the occasional bridge, which sits between a fact and a dimension.
As far as the rest of it goes, the hierarchy of concern is SITE -> BUILDING -> FLOOR -> SPACE. The address info are best treated as attributes.
To get to the kind of reporting you want, you need to implement SITE -> BUILDING -> FLOOR -> SPACE as a single dimension, call it "Facility" and support the heirarchy with a hierarchy bridge. All four types of facilities would be represented in the dimension, all with similar attributes (type (site, building, ect...), description, gross space, rentable space, address, etc...) and an exploded bridge of the parent/child relationships.
The bridge will allow you to capture facts at any level in the hierarchy as well as report at any level of the hierarchy.
As far as the rest of it goes, the hierarchy of concern is SITE -> BUILDING -> FLOOR -> SPACE. The address info are best treated as attributes.
To get to the kind of reporting you want, you need to implement SITE -> BUILDING -> FLOOR -> SPACE as a single dimension, call it "Facility" and support the heirarchy with a hierarchy bridge. All four types of facilities would be represented in the dimension, all with similar attributes (type (site, building, ect...), description, gross space, rentable space, address, etc...) and an exploded bridge of the parent/child relationships.
The bridge will allow you to capture facts at any level in the hierarchy as well as report at any level of the hierarchy.
Re: Question - creating a dimensional model for facility management
Ok, Nick, I think I got the idea. With this approach I'd have a DIM_FACILITY with all the common attributes for space, floor, building and site, plus a DIM_FACILITY_SPACE_HIERARCHY_BRIDGE with parent SK, child SK, level, distance from parent, top flag, bottom flag.
The only thing I did't figured out yet is how to deal with attributes that are exclusive for a member of the hierarchy in records that stores information of a parent of this kind of record. For example, the "level-in-relation-to-ground" information only makes sense when we're talking about floors. If my DIM_FACILITY has a "Level" field, I can populate this field with valid data when I'm loading floors (or spaces, which are always children of a floor), but the information isn't applyable for buildings or sites. To avoid normalizing the model, I should create the fields in the dimension anyway, and fill them with dummy values in these cases, right?
The only thing I did't figured out yet is how to deal with attributes that are exclusive for a member of the hierarchy in records that stores information of a parent of this kind of record. For example, the "level-in-relation-to-ground" information only makes sense when we're talking about floors. If my DIM_FACILITY has a "Level" field, I can populate this field with valid data when I'm loading floors (or spaces, which are always children of a floor), but the information isn't applyable for buildings or sites. To avoid normalizing the model, I should create the fields in the dimension anyway, and fill them with dummy values in these cases, right?
Demitri- Posts : 9
Join date : 2010-07-27
Re: Question - creating a dimensional model for facility management
Yes. If the particular attribute doesn't apply to a particular type, just leave it blank or null.
If the number of such attributes are large, you may want to consider sub-type tables sharing the same PK. But I get the sense that is not the case and you can reasonably handle all attributes in one table.
If the number of such attributes are large, you may want to consider sub-type tables sharing the same PK. But I get the sense that is not the case and you can reasonably handle all attributes in one table.
Re: Question - creating a dimensional model for facility management
Great, there are few of these "exclusive" attributes indeed, so a single table will be enough.
One last question: a hierarchy with country -> state -> city fields in the same dimension can be easily handled by OLAP tools. Do they also support the structure with a bridge to resolve hierarchies, allowing the end-users to perform drill operations?
One last question: a hierarchy with country -> state -> city fields in the same dimension can be easily handled by OLAP tools. Do they also support the structure with a bridge to resolve hierarchies, allowing the end-users to perform drill operations?
Demitri- Posts : 9
Join date : 2010-07-27

» Question - creating a dimensional model for incident management
» Dimensional Model for Property Management
» creating dimensional model of log data
» Dimensional Model for Project management software
» Dimensional Model question
» Dimensional Model for Property Management
» creating dimensional model of log data
» Dimensional Model for Project management software
» Dimensional Model question
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum
|
|