How to Handle Data that serves as both a dimension and attributes of another dimension
4 posters
Page 1 of 1
How to Handle Data that serves as both a dimension and attributes of another dimension
I have a invoicing warehouse with two fact tables. First is an invoice header fact table which reports on invoice totals and includes a surrogate key for a stat code that was assigned to the invoice. The grain is one row per invoice. Next is a part detail fact table that basically is a break-down of all the invoices in our header fact table showing line item details of the parts that were included in an invoice. The grain is one row per part per invoice. This second fact table uses a part dimension that includes the part number and associated attributes.
The users would now like to see the currently assigned stat code for a specific part. I could solve this by placing the stat code and all of it's attributes into the part dimension as additional attributes for the part.
So which would you recommend.
Option 1: I would continue loading my stat code dimension for use by the invoice header fact table. Then when loading my part dimension used by the part detail fact, I'd load the stat code and then lookup it's remaining attributes from the stat code dimesnion. Then I would have stat code information duplicated in both the stat code dimension and the part dimension (180,000 records). The stat code dimension is only 130 records with five columns (1 key & 4 attribute excluding audit columns).
Option 2: I would continue loading my stat code dimension for use by the first fact table. When loading the part dimension, I'd load a surrogate key to the stat code dimension creating a snowflake.
Is there a better way that I have not identified? Please note that there may be stat codes with no parts assigned.
Thanks!
The users would now like to see the currently assigned stat code for a specific part. I could solve this by placing the stat code and all of it's attributes into the part dimension as additional attributes for the part.
So which would you recommend.
Option 1: I would continue loading my stat code dimension for use by the invoice header fact table. Then when loading my part dimension used by the part detail fact, I'd load the stat code and then lookup it's remaining attributes from the stat code dimesnion. Then I would have stat code information duplicated in both the stat code dimension and the part dimension (180,000 records). The stat code dimension is only 130 records with five columns (1 key & 4 attribute excluding audit columns).
Option 2: I would continue loading my stat code dimension for use by the first fact table. When loading the part dimension, I'd load a surrogate key to the stat code dimension creating a snowflake.
Is there a better way that I have not identified? Please note that there may be stat codes with no parts assigned.
Thanks!
spaul- Posts : 10
Join date : 2010-08-24
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
Why wouldn't you put the stat code dimension on the second fact?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
I would not put the stat code on the 2nd fact table for the part detail because the stat code for a part may change over time. They want to see the current stat code for a part even if they are pulling a part detail invoice line from many years ago. So the stat code is associated with the part number rather than the part detail fact record. The invoice header fact is different, the stat code is associated with a particular invoice.
spaul- Posts : 10
Join date : 2010-08-24
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
Put the Stat on the Part Dimension. You could also put it on your second fact table. You can either state histroy as it looked at the time or you can restate history based on the current Stat/Part relationship.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
Jeff,
You stated, "Put the Stat on the Part Dimension". Are you suggesting option 1 which is adding the stat code and it's attributes in the part dimension but also keeping the stat code dimension? Sorry, I'm kinda new at this and want to be certain that I understand your advice.
You also mentioned "You could also put it on your second fact table. You can either state histroy as it looked at the time or you can restate history based on the current Stat/Part relationship." Just an FYI, we do have a stat code surrogate on each part detail fact table row which has proven to be confusing to our users. We will be hiding this in our report model. They do not want the historical stat code at the time the part detail invoice line was created. They only want the current stat code assigned to a part.
Thanks much.
You stated, "Put the Stat on the Part Dimension". Are you suggesting option 1 which is adding the stat code and it's attributes in the part dimension but also keeping the stat code dimension? Sorry, I'm kinda new at this and want to be certain that I understand your advice.
You also mentioned "You could also put it on your second fact table. You can either state histroy as it looked at the time or you can restate history based on the current Stat/Part relationship." Just an FYI, we do have a stat code surrogate on each part detail fact table row which has proven to be confusing to our users. We will be hiding this in our report model. They do not want the historical stat code at the time the part detail invoice line was created. They only want the current stat code assigned to a part.
Thanks much.
spaul- Posts : 10
Join date : 2010-08-24
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
spaul wrote:I would not put the stat code on the 2nd fact table for the part detail because the stat code for a part may change over time. They want to see the current stat code for a part even if they are pulling a part detail invoice line from many years ago. So the stat code is associated with the part number rather than the part detail fact record. The invoice header fact is different, the stat code is associated with a particular invoice.
How can it be a stat code for a part if you are getting the stat code from the invoice header?
If individual parts have their own stat code, it should be attributes on the part dimension and sourced from the part master, not the invoice header.
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
The source provides me with stat code on invoice header, on part detail line, and on part number. The stat code used for an invoice header is assigned at the time of invoicing based on the most expensive part for that invoice. There are rules in place in the source for which types of invoices are assigned a stat code and how the number is assigned. They most definitely want this stat code as it was at time of invoicing as it is used for different types of reporting. They also want the current stat code for the part. I know it seems a bit contradictory, but that's how they use the data.
spaul- Posts : 10
Join date : 2010-08-24
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
spaul wrote:The source provides me with stat code on invoice header, on part detail line, and on part number. The stat code used for an invoice header is assigned at the time of invoicing based on the most expensive part for that invoice. There are rules in place in the source for which types of invoices are assigned a stat code and how the number is assigned. They most definitely want this stat code as it was at time of invoicing as it is used for different types of reporting. They also want the current stat code for the part. I know it seems a bit contradictory, but that's how they use the data.
No, its not contradictory. You have three different status codes and you should capture all three. The code and description based on the part number should be in the part dimension. Those on the invoice line and invoice header should be referencing a status dimension off the fact table(one on the invoice header fact and both on the invoice line fact). If you want to retain part status at the time of invoicing, include a third FK to the status dimension on the invoice line fact.
By the way, why do you have an invoice header fact? Why not simply carry the header dimensions on the invoice line fact? Are invoices so many and so long that it makes much of a difference to sum lines to get a total?
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
The source system contains many invoice types (full maintenance, rentals, service/parts, etc.) Not every invoice is represented in the part detail table because some of invoice types do not contain parts. Invoice totals are not allocated to the part level. These two fact tables provide different types of reporting. There may have been many other great reasons, but the original developer did not leave much documenation!
I really appreciate your time. Thanks.
I really appreciate your time. Thanks.
spaul- Posts : 10
Join date : 2010-08-24
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
I would have a Stat Dimension and put the Stat Surrogate Key on the Parts Dimension.
Here's another way. Create a Type 3 SCD. It has the Stat code at the time of the transaction as well as the current Stat Code (maybe even the prior Stat Code if you want to track it that way.
There's a couple of different ways to do it. Alot of it depends on if you are dealing with existing dimensions or designing a brand new DW. If you have 2 dimension tables (product and Stat) with a lot of data already loaded into fact tables, then it might be easier to do a snow flake from the product dimension to the Stat dimension and call it the "Current Stat Code". You could put the Product Dimension and the Stat Dimension of the fact table which would enable you to report on the Stat Code in use at the time of the transaction (always a good thing) and run a report rolling up to the current Stat Code.
Here's another way. Create a Type 3 SCD. It has the Stat code at the time of the transaction as well as the current Stat Code (maybe even the prior Stat Code if you want to track it that way.
There's a couple of different ways to do it. Alot of it depends on if you are dealing with existing dimensions or designing a brand new DW. If you have 2 dimension tables (product and Stat) with a lot of data already loaded into fact tables, then it might be easier to do a snow flake from the product dimension to the Stat dimension and call it the "Current Stat Code". You could put the Product Dimension and the Stat Dimension of the fact table which would enable you to report on the Stat Code in use at the time of the transaction (always a good thing) and run a report rolling up to the current Stat Code.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: How to Handle Data that serves as both a dimension and attributes of another dimension
Thanks so much for the detailed answer. You've made some good points.
spaul- Posts : 10
Join date : 2010-08-24
Similar topics
» Dimension Attributes and Fact attributes storing same data in multiple data marts??
» How to Handle Updates to Dimension Attributes (SCD 1) that have a Cardinality > 1 (Context is Data Warehouse Star Schema)
» dimension table design question for around 100 attributes and higher level calculated attributes
» Status attributes on main dimension or as separate dimension
» Attributes as part of employee dimension and/or own dimension
» How to Handle Updates to Dimension Attributes (SCD 1) that have a Cardinality > 1 (Context is Data Warehouse Star Schema)
» dimension table design question for around 100 attributes and higher level calculated attributes
» Status attributes on main dimension or as separate dimension
» Attributes as part of employee dimension and/or own dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum