Kimball Forum
Would you like to react to this message? Create an account in a few clicks or log in to continue.

Kimball Dimensions

2 posters

Go down

Kimball Dimensions Empty Kimball Dimensions

Post  kdot Fri Sep 24, 2010 11:16 am

I got into a nice heated debate the other day. The item of discussion was flattened dimensions. I’d like to know your opinion (or facts) about the following, (Some of this is common sense… please still give me some feedback…). Any article citations would be greatly appreciated.

1.Flattening data (such as many [but limited] codes to describe one event) can entail creating columns to describe each code and populating the rows with a flag to indicate the value occurred. Correct?

2.Flattening data will bring the dimension’s data to the granularity of the related fact table (one row per fact row = 1 row in the dimension). Correct?

3.Flattening the data for all permutations can save data storage space and ETL processing time (when built for all permutations). Correct?

4.The flattened dimension could be considered friendlier for customers interacting directly with the fact and dimension tables (enterprise users working through Business Objects & SAS). Correct?

5.The “Kimball camp” prefers to flatten data (dimensions) unless there is a data, business, or technical requirement or limitation forcing an outrigger or bridge scenario. Correct?

6.All granularity of the dimensional model is determined by the fact tables. This is the level where the “measured amounts” occurred in the source system. Flattening the dimension’s data is not “changing the granularity” rather; it’s bringing dimension’s data to the established granularity. Correct?

In statements 1, 2, and 3 let’s say the codes are static – they don’t change – and are limited to six different codes. These codes are AA, AB, AC, BB, BC, BD. With no requirements for either format, which layout would the “Kimball camp” prefer?


Method A: (This table will have the same row count as the source system -millions of rows – and require an ETL for every DW update.)

Surrogate Code Code Description
1234 AA Entry sent to Account Sys
1234 AB Entry sent to Reconcile Sys
1234 AC Entry sent to Archive Sys
4321 BB Entry sent to Audit Sys
4321 BD Entry sent to Federal Review Sys
6677 BC Entry sent to State Review Sys


Method B: (This table will be built for all permutations – 64 rows – and requires no ETL updates)

Surrogate AA Account AB Reconcile AC Archive BB Audit BC State BD Federal
1234 Y Y Y N N N
4321 N N N Y N Y
6677 N N N N Y N

Thanks in advance for any feedback.

kdot

Posts : 3
Join date : 2010-07-20

Back to top Go down

Kimball Dimensions Empty Re: Kimball Dimensions

Post  ngalemmo Fri Sep 24, 2010 12:00 pm

Sure, this will work if each attribute is well defined and the number of attributes are relatively small. It makes sense in the scenario you defined.

It doesn't work well when there are a lot of potential code values, all representing the same thing, and multiple values apply to the same fact.... i.e. a many-to-many relationship. Situations like that require a bridge table. A common example are diagnoses relating to a medical claim.
ngalemmo
ngalemmo

Posts : 3000
Join date : 2009-05-15
Location : Los Angeles

http://aginity.com

Back to top Go down

Back to top

- Similar topics

 
Permissions in this forum:
You cannot reply to topics in this forum