Person Dimension - Split or Consolidate
4 posters
Page 1 of 1
Person Dimension - Split or Consolidate
We are a small university in the beginning stages for laying out our data model for a data warehouse. One question that has come up, is what to do with the person dimension. We have admittees, students, alumni, faculty, staff, administration, donor's... etc that are all "persons"... Probably 85% of the information is common, however, each type of person has it's own set of special person information that would not be relevant to other person types. So, do you still combine them into one big person dimension, or do you split these out as student dim, employee dim, donor dim... etc... Or do you make a hybrid that has the common information stored in a person dim, then have the student dim, employee dim, donor dim in separate mini dimensions?
Thanks for your feedback..
Chris
Thanks for your feedback..
Chris
sandercd- Posts : 5
Join date : 2010-04-20
Re: Person Dimension - Split or Consolidate
Given you are in the beginning stages I would suggest keeping is simple. I would create one dimension for your customers (admittees, students, alumni) and one for your employees (faculty, staff, administration) and one for your donors. If needed, you could role play these dimension. You would probably want attributes on your donor dimension like IsAlumni and IsEmployee, but those could be added during the ETL.
cjrinpdx- Posts : 51
Join date : 2011-07-14
Location : Portland, OR
Re: Person Dimension - Split or Consolidate
Cjrinpdx,
Thank you for your post. I had not thought about grouping them in that manner (Employees, Customers, donors). Very interesting idea. I love forums where you can get the input from others.
I was probably leaning towards putting all of these groups in one big dimension called person and then have a whole bunch of flags for isAlumni, isDonor, isStudent, isEmployee, ... etc.. My thought on this is that we seem to always want to slice and dice so many ways, and this way would keep it simple for analyzing any person type. I do know however that it means there will be information that is empty from person to person if they were not in a specific role.
Thoughts?
Thanks, sandercd
Thank you for your post. I had not thought about grouping them in that manner (Employees, Customers, donors). Very interesting idea. I love forums where you can get the input from others.
I was probably leaning towards putting all of these groups in one big dimension called person and then have a whole bunch of flags for isAlumni, isDonor, isStudent, isEmployee, ... etc.. My thought on this is that we seem to always want to slice and dice so many ways, and this way would keep it simple for analyzing any person type. I do know however that it means there will be information that is empty from person to person if they were not in a specific role.
Thoughts?
Thanks, sandercd
sandercd- Posts : 5
Join date : 2010-04-20
Re: Person Dimension - Split or Consolidate
How many records will this dimension have?
cjrinpdx- Posts : 51
Join date : 2011-07-14
Location : Portland, OR
Re: Person Dimension - Split or Consolidate
I estimate about 150,000 to start.
Thanks, Chris
Thanks, Chris
sandercd- Posts : 5
Join date : 2010-04-20
Re: Person Dimension - Split or Consolidate
I think one physical table is an acceptible approach. You could always create views (ie vStudent, vAlumni) to be used by SSAS when you create your dimension. Also, those flags comein hand for create calculated measures. Try to keep the number of attributes in your dimension table as small as possible. Keep an eyeoout for attributes that could become theirown dimension, Like PersonType.
cjrinpdx- Posts : 51
Join date : 2011-07-14
Location : Portland, OR
Re: Person Dimension - Split or Consolidate
One person may be an alumni and a student at the same time. (For example, an alumni studying for master degree)
Or one staff may be graduted from that university, so he is a staff and an alumni at the same time. So matching one person with a single person type may be impossible for this kind of cases. You can not easily design the relation between an alumni and a staff without having alumni and staff dimensions.
"Person" entity is usually used in normalized models and it is created as the supertype of all kind of persons. Dimensional models are good because they are more understandable than normalized models. I this case, I like to see seperate dimensions.
Or one staff may be graduted from that university, so he is a staff and an alumni at the same time. So matching one person with a single person type may be impossible for this kind of cases. You can not easily design the relation between an alumni and a staff without having alumni and staff dimensions.
"Person" entity is usually used in normalized models and it is created as the supertype of all kind of persons. Dimensional models are good because they are more understandable than normalized models. I this case, I like to see seperate dimensions.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Person Dimension - Split or Consolidate
umutiscan,
You are correct and getting at my concern with finding the right design. We have people who were students, graduated, so they are alum's then became staff, then left to further their education and then came back to became faculty. (Side note, do we split employees into faculty and then staff/admin? Faculty have a bunch of descriptive information that employees don't have ).
However, one thing I am wondering about... you mentioned that "You can not easily design the relation between an alumni and a staff without having alumni and staff dimensions"... I guess I was thinking that if you had a flag in the person dimension that indicated isAlumni, isStaff, isFaculty, isEmployee, isFriend, isAdministration, isStudent, isRecruit, isAdmittee.. That these flags would give you the ability to do whatever analysis you would need.
I do however see your point that a "Person" entity is usually used in normalized models... When you have been designing relational models for 30 years... old habits creep back in... From an end user perspective, I can see how having a dimension for each type could be much more understandable.. I do wonder though, what do you think of a hybrid approach... Make a person dimension, then use mini dimensions to cover the additional elements for alum, staff, faculty, employee... (or is this just me trying to be efficient with space and am I just going back to my years of a relational db designer?)
Thanks for your input... I really enjoy and learn from hearing other's perspectives.
You are correct and getting at my concern with finding the right design. We have people who were students, graduated, so they are alum's then became staff, then left to further their education and then came back to became faculty. (Side note, do we split employees into faculty and then staff/admin? Faculty have a bunch of descriptive information that employees don't have ).
However, one thing I am wondering about... you mentioned that "You can not easily design the relation between an alumni and a staff without having alumni and staff dimensions"... I guess I was thinking that if you had a flag in the person dimension that indicated isAlumni, isStaff, isFaculty, isEmployee, isFriend, isAdministration, isStudent, isRecruit, isAdmittee.. That these flags would give you the ability to do whatever analysis you would need.
I do however see your point that a "Person" entity is usually used in normalized models... When you have been designing relational models for 30 years... old habits creep back in... From an end user perspective, I can see how having a dimension for each type could be much more understandable.. I do wonder though, what do you think of a hybrid approach... Make a person dimension, then use mini dimensions to cover the additional elements for alum, staff, faculty, employee... (or is this just me trying to be efficient with space and am I just going back to my years of a relational db designer?)
Thanks for your input... I really enjoy and learn from hearing other's perspectives.
sandercd- Posts : 5
Join date : 2010-04-20
Re: Person Dimension - Split or Consolidate
It depends on what measures (fact) you are planning to slice and dice. I guess you would only be concerned with a subtype of person, say employee, in one star schema with a fact table in the center. I would start with subtype dimensions relevant to fact tables. If you don't have any fact table that requires a supertype person dimension, you don't need one.
Generally, dimensional modeling should avoid abstract generic dimensions as they are unnecessarily big and inefficient to connect to fact table. Even if you do need a supertype dimension for a fact table, you should create one that represents only relevant subtypes, in which case, the supertype and subtype dimension may coexist to serve their respective fact tables.
Don't be too concerned about common attributes repeated in subtypes/supertypes. Dimensional modeling is about performance, small dimensions and fewer joins, and ease of use by avoiding snowflakes.
Generally, dimensional modeling should avoid abstract generic dimensions as they are unnecessarily big and inefficient to connect to fact table. Even if you do need a supertype dimension for a fact table, you should create one that represents only relevant subtypes, in which case, the supertype and subtype dimension may coexist to serve their respective fact tables.
Don't be too concerned about common attributes repeated in subtypes/supertypes. Dimensional modeling is about performance, small dimensions and fewer joins, and ease of use by avoiding snowflakes.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: Person Dimension - Split or Consolidate
Hang,
That makes a lot of sense... Thanks for your time and input.
I appreciate everyone's input. I had heard varying thoughts in the past about placing these person's in a supertype... but I can see how working from the subtypes and their relevant fact tables would be best. It puzzles me, why others feel that they should be placed in a supertype.
Thanks..
That makes a lot of sense... Thanks for your time and input.
I appreciate everyone's input. I had heard varying thoughts in the past about placing these person's in a supertype... but I can see how working from the subtypes and their relevant fact tables would be best. It puzzles me, why others feel that they should be placed in a supertype.
Thanks..
sandercd- Posts : 5
Join date : 2010-04-20
Similar topics
» Tip #136 Adding a Mini-Dimension to a Bridge Table: What if you can't split the dimension?
» Flattening Dimension
» Party (Person/Organization) Dimension
» How to split the dimension members with SCD Type 2?
» Split hierarchies and hierarchy levels into multiple dimensions
» Flattening Dimension
» Party (Person/Organization) Dimension
» How to split the dimension members with SCD Type 2?
» Split hierarchies and hierarchy levels into multiple dimensions
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum