Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Loading Fact Table with Type 2 Slowly Changing Dimension

5 posters

Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Loading Fact Table with Type 2 Slowly Changing Dimension

Post  jgaull Tue Jun 01, 2010 5:01 pm

I'm new to data warehousing .. sorry if this is a basic question.

I have a type 2 slowly changing dimension already loaded and now I want to load a fact table. At least, I think it should be a fact table. A lot of the entities in the dimension table have a schedule associated with them. I thought it would be best to implement the schedule as a fact table. The problem I'm having is that the schedule should apply to all instances of an entity in the dimension table. Is the best solution to create a bridge dimension table that contains only unique entities from my dimension table?

Here's a made up example scenario:

I have students in a student dimension. I track changes (type 2) on the students' address so each student can have multiple records in the dimension table. I have another table (fact table) with each student's class schedule. I want to relate the class schedule to all instances of the student. Should I create another dimension table that only contains 1 row per student and relate the fact table to this "bridged" table?

Thanks!

jgaull

Posts : 5
Join date : 2010-06-01

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Re: Loading Fact Table with Type 2 Slowly Changing Dimension

Post  ngalemmo Tue Jun 01, 2010 5:17 pm

jgaull wrote: I want to relate the class schedule to all instances of the student.

Why?

The purpose of a type 2 dimension is to provide the state of a dimension at the time the fact occured. A type 2 can also provide the current state of the dimension with a self-join on the dimension table. Other than the point-in-time state and/or the current state of the dimension, what would be the reason to see any other instance of the dimension?

To get the current state from a type 2 dimension, you join the fact table to the dimension table. You then join the dimension table to itself using the natural key and filtering on current flag = true. If you had effective and expiration dates in the dimension table, you could get something other than the current row by self-joining on the natural key and filtering on some date between the effective and expiration dates, but I would be hard pressed to come up with a reason why someone would want to do that.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Re: Loading Fact Table with Type 2 Slowly Changing Dimension

Post  jgaull Wed Jun 02, 2010 7:48 am

Thanks for your reply!

Are you saying that you do recommend that I keep the schedule in a fact table and that I should join the fact table row to the dimension row that is active at the time that I receive the schedule data? Would I ever need to update the foreign key in the fact table as the data in the dimension table changes?

So for the student example ...

If I have the following dimension table (Sorry for the formatting)

StudentKey LastName FirstName City RowIsCurrent RowStartDate RowEndDate
1 Smith John Chicago 0 5/1/2008 3/20/2009
2 Smith John Austin 0 3/21/2009 5/15/2010
3 Smith Jonn San Francisco 1 5/16/2010 12/31/9999


Here would be a made up fact table for data I received on 3/22/2009.
ScheduleKey DateKey StudentKey Class
1 20100510 2 Science
2 20100610 2 Math
3 20100525 2 History

The business case calls for the schedule to apply to all instances of John Smith regardless of where he's living but I only have the fact table joining to one instance of the student. Would it be better to have 3 versions of the fact table data (1 for each instance of the student)?

Thanks!!!!!

jgaull

Posts : 5
Join date : 2010-06-01

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Re: Loading Fact Table with Type 2 Slowly Changing Dimension

Post  ngalemmo Wed Jun 02, 2010 11:58 am

You most definiately do not want 3 versions of the fact data for each version of the student.

A critical piece missing in the student dimension is a natural key, that is, the business key that uniquely identifies the student. This would be the same value on each student row. That value ties the three rows together, allowing you to self-join on the dimension to locate whichever version of student you want to use for your report.

There are other techniques as well, covered in other discussions: http://forum.kimballgroup.com/dimensional-modeling-and-data-architecture-f6/scd-type-1-and-2-t433.htm
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Thanks!

Post  jgaull Wed Jun 02, 2010 12:09 pm

Thank you for taking the time and helping me out.

I definitely do have a natural key in my data warehouse and I should have made one up for my example. I am able to move forward now based on your recommendations. I will have only one version of the fact data and I can use the natural key in the dimension table to find what instance of the dimension entity I wish to view. This is great advice and I really appreciate your help!

jgaull

Posts : 5
Join date : 2010-06-01

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty type 2 slowly changing dimension already loaded

Post  jmamedov Sun Mar 06, 2011 12:42 pm

Hi. I just registered into group.
Your business requirement is clear.
The student schedule relates to a student not the location that he or she resides on.
You need to normalize your student dimension by separating the location information into its own dimension and create location key in the fact table that points to this (location) dimension.
Based on what you have stated it is clear that you need to tack changings on location dimension only.
This simple change will meet your requirements. (I think so anyway).

jmamedov

Posts : 1
Join date : 2011-03-06

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Re: Loading Fact Table with Type 2 Slowly Changing Dimension

Post  hang Sun Mar 06, 2011 7:47 pm

jgaull wrote:Here would be a made up fact table for data I received on 3/22/2009.
ScheduleKey DateKey StudentKey Class
1 20100510 2 Science
2 20100610 2 Math
3 20100525 2 History
I wonder if you really meant StudentKey=3 in the fact table, as the SK-3 is the latest version of John Smith. There would be an extra work, but still achievable, to retrieve the second last version of SCD2 dimension.

Another point is, do you also tend store the snapshot datekey=20090322 in the fact table.
jmamedov wrote:
You need to normalize your student dimension by separating the location information into its own dimension and create location key in the fact table that points to this (location) dimension.
In the sample case, I don’t see any need for a separate location dimension, unless the student dimension is a monster dimension and there is another hierarchy tying up schools, regions and states together in a multi-school scenario, and a student could be enrolled in different schools at the same time. Otherwise, keep it simple by denormalising the city name in your student dimension and set it to SCD1 or 2 attribute based on your business requirement.


hang

Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Re: Loading Fact Table with Type 2 Slowly Changing Dimension

Post  bidw_kk Tue Jul 26, 2011 11:07 am

I fully understand that this technique can be used to display the dimension instance we want while displaying the schedule.
However, I am still confused about answering questions like:

What is the scheduled class for John Smith on 06/10/2010?
Assuming that I am running this query on 05/17/2010 by which time the latest instance for John Smith is having studentkey of 3.
How do I know which instance of John smith has to be joined with fact table to get the schedule? It could be 1,2 or 3 in real scenario.
Should I perform a right outer join between the student dimension (filtered for instances of John Smith) and schedule fact table?

bidw_kk

Posts : 5
Join date : 2011-07-16

Back to top Go down

Loading Fact Table with Type 2 Slowly Changing Dimension Empty Re: Loading Fact Table with Type 2 Slowly Changing Dimension

Post  Sponsored content


Sponsored content


Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum