location dimension
+3
BoxesAndLines
ngalemmo
scabral
7 posters
Page 1 of 1
location dimension
Hi,
I have a location dimension which contains County, State, and Zip fields.
I have a fact table that only has County and State values, so how do i do a lookup to the location dimension to only get 1 row for the combination of County and State? Right now it will return multiple rows each with a different surrogate key.
thanks
scott
I have a location dimension which contains County, State, and Zip fields.
I have a fact table that only has County and State values, so how do i do a lookup to the location dimension to only get 1 row for the combination of County and State? Right now it will return multiple rows each with a different surrogate key.
thanks
scott
scabral- Posts : 58
Join date : 2012-05-02
Re: location dimension
ngalemmo,
I'm a bit confused. My fact data has only County and State. What would be contained in a City dimension? How would I do a lookup to find the right combination of County and State?
I'm a bit confused. My fact data has only County and State. What would be contained in a City dimension? How would I do a lookup to find the right combination of County and State?
scabral- Posts : 58
Join date : 2012-05-02
Re: location dimension
He meant to say create a country/state dimension.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: location dimension
You could also add rows to your existing dimension for unique County/State combinations with the zip code left blank (or populated with a * or some other special value.) You would point to those rows when loading facts that don't have a specific zip code.
That way you only have a single location dimension, which could be advantages if you want to query across fact tables.
On the other hand, you add some complexity by having entries both with and without zip code in your you location dimension.
That way you only have a single location dimension, which could be advantages if you want to query across fact tables.
On the other hand, you add some complexity by having entries both with and without zip code in your you location dimension.
VHF- Posts : 236
Join date : 2009-04-28
Location : Wisconsin, US
Re: location dimension
Is solution covered in any of the Kimball Books? Are there any issues other than handling the complexities? The reason I ask is, our old Data Warehouse has separate dimensions for situations like this. We recently had some consultants come in to build a prototype for a new DW. In their design they combined 2 of our dimensions into a single location dimension as you described. I thought it was a pretty slick way to handle fact table with different levels of granularity for the same dimension but, was wondering if there are any 'Gotchas'. In the past when I've deviated from the standard Kimbal Dimensional Model for something that seemed better or more logical, I've regretted it.
Thanks
Thanks
KimballFan- Posts : 11
Join date : 2014-01-15
Location : Tucson
Re: location dimension
The gotcha is what the OP stated. Which row do I select when may dimension rows are applicable? An analogy is picking a date for a monthly snapshot fact when joining to the date dimension. Any of the days of a given month can be applicable.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Location Dimension
Hi,
just as an FYI, according to Kimball we should not be creating Location Dimensions:
"Geographic information is naturally handled as attributes within multiple dimensions, not as a standalone location dimension or outrigger."
There is a page on this in Ch. 11 - Telecommunications.
But assuming you have a valid business reason for creating this dimension, is the reason you are facing this challenge due to the issue of grain? You have a fact table with a declared grain of county/state (plus other attributes presumably) but you are trying to join it to a dimension with a grain of zip/county/state. This would never work cleanly - though as discussed in this chain there are possible workarounds. Instead you need to roll up your zip/county/state location dimension to create a new county/state dimension and then join your fact table to that new location dimension.
This is the same design principle where if you create a monthly snapshot/aggregate fact table you would join it to a month dimension and not a date dimension (as the 'time' grain of the fact is now month and not day).
Regards,
just as an FYI, according to Kimball we should not be creating Location Dimensions:
"Geographic information is naturally handled as attributes within multiple dimensions, not as a standalone location dimension or outrigger."
There is a page on this in Ch. 11 - Telecommunications.
But assuming you have a valid business reason for creating this dimension, is the reason you are facing this challenge due to the issue of grain? You have a fact table with a declared grain of county/state (plus other attributes presumably) but you are trying to join it to a dimension with a grain of zip/county/state. This would never work cleanly - though as discussed in this chain there are possible workarounds. Instead you need to roll up your zip/county/state location dimension to create a new county/state dimension and then join your fact table to that new location dimension.
This is the same design principle where if you create a monthly snapshot/aggregate fact table you would join it to a month dimension and not a date dimension (as the 'time' grain of the fact is now month and not day).
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: location dimension
It depends on what information you have or the intent (such as creating aggregates). Certainly in telecommunications you have precise geographic information all over the place. In the original case, if all you have is ZIP or County, then you wind up implementing two dimensions, one for ZIP the other for County. You can't really piggyback county or state off ZIP as ZIP codes do not conform to political boundaries.
Re: location dimension
I've done lots of work for telco's and we used geography all over the place. I'd have to say this is one of the few times I would have to disagree with the Doc.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: location dimension
BoxesAndLines wrote:The gotcha is what the OP stated. Which row do I select when may dimension rows are applicable? An analogy is picking a date for a monthly snapshot fact when joining to the date dimension. Any of the days of a given month can be applicable.
Selecting which record to use is not an issue in this design. The Dimension was created with an additional set of records at the higher dimension level. In this example, the lowest level is Service Point (This is a Utility Industry model). For the most part, each service point represents where a meter is installed at a Premise (Service Address, abount 700,000 rows). The highler level in the dimension is the Premise. So, the table has a record for every Service Point which contains details about the premise. There is an Additional record added for each Premise where the service point information is blank. THe table contains a Primary Surrogate Key (Location Key), a Premise Key and an SP Key. So, if a fact table is at the Service Point level, it joins to the record where the Location Key = the Service Point Key, if a fact record is at the Premise Level, it will join to the record where the Location Key = the Premise Key. I did find a brief paragraph in Kimball's most current Data Warehouse tookit where, it seemed like it was alluding to this design and recommended against it. I left that book at home today so, I don't have the exact quote or page but, I believe it was in Chapter 7. As I stated earlier, when I've deviated from Dimensional Modelling standards, I've come to regret it and, usually too late to go back and fix it without a lot of effort. The 1 gotcha I see is, If 2 tables are joined to the same dimension and, thos fact tables are linking to different grains of the dimension, it might be difficult to enforce how this data would be used by the end user. We could aggregate the table that is at the lower level but, why have the lower level if you can't use it. I realize that handling fact tables at multiple grains is an issue even when they are at the same dimension level, this just seems to compound that.
KimballFan- Posts : 11
Join date : 2014-01-15
Location : Tucson
Re: location dimension
If I have understood correctly, you have put Service Points and Premises in the same Dimension? If so then I'm pretty sure this is wrong - they are two separate entities and so should be in separate Dims. It sounds like there is a hierachical relationship between Service Point (child) and Premise (parent); if this is the case then you wouldn't need to assign the Premise SK to the fact table as the Fact-Service Point relationship covers the Fact-Premise relationship anyway.
If you create an aggregate fact table at the premise level then you would not include the Service Point Dim but you would include the Premise Dim.
This is the same principle as the Date/Month issue - if you have an aggregate table at the Month level you don't assign a (random) Date Dim SK to it, you create a Month Dim and assign the relevant Month Dim SK to each fact record.
Regards,
If you create an aggregate fact table at the premise level then you would not include the Service Point Dim but you would include the Premise Dim.
This is the same principle as the Date/Month issue - if you have an aggregate table at the Month level you don't assign a (random) Date Dim SK to it, you create a Month Dim and assign the relevant Month Dim SK to each fact record.
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
.
Thanks for the response! My sentiments exactly. Our old Data Warehouse had separte dimensions for these entities as well as the PERSON, ACCOUNT and SERVICE AGREEMENT entities that were also rolled into one dimension in this new model. I was just looking for confirmation as I'm trying to convince my boss that we need to revise this design before moving foreward. We are in the very early stages of implementing a new model and I want a solid foundation to build on. There is a reference to this structure on page 302 of Kimball's latest book but, it didn't go into much detail as to the impact of doing this.nick_white wrote:If I have understood correctly, you have put Service Points and Premises in the same Dimension? If so then I'm pretty sure this is wrong - they are two separate entities and so should be in separate Dims. It sounds like there is a hierachical relationship between Service Point (child) and Premise (parent); if this is the case then you wouldn't need to assign the Premise SK to the fact table as the Fact-Service Point relationship covers the Fact-Premise relationship anyway.
If you create an aggregate fact table at the premise level then you would not include the Service Point Dim but you would include the Premise Dim.
This is the same principle as the Date/Month issue - if you have an aggregate table at the Month level you don't assign a (random) Date Dim SK to it, you create a Month Dim and assign the relevant Month Dim SK to each fact record.
Regards,
KimballFan- Posts : 11
Join date : 2014-01-15
Location : Tucson
Re: location dimension
I am presuming that when you add a location dimension this has no effect on the address fields in the Customer dimension, right?
davej- Posts : 6
Join date : 2014-02-17
Similar topics
» Location in Each Dimension?
» Location Dimension(s) with multiple hierarchies
» Nearby Locations in a Location Dimension
» Master Data and Dimension location
» Location Dimension for locating different types of infrastruture
» Location Dimension(s) with multiple hierarchies
» Nearby Locations in a Location Dimension
» Master Data and Dimension location
» Location Dimension for locating different types of infrastruture
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum