Header/Detail Fact Problem
4 posters
Page 1 of 1
Header/Detail Fact Problem
Hi guys,
In OLTP I have 2 fact tables : ORDERs and ORDER_LINEs .
ORDER_TABLE
----------------
For an ORDER there are multiple ORDER_LINE. I don't draw a table for order line. It has multiple product items for each order.
In dimensional modeling I can push the some of the ORDER attributes to the ORDER_LINE fact ( customer_key, transaction_date ). And my orderItems fact now have this fields:
ORDER_ID(DD), CUSTOMER_SK, TRANSACTION_DATE, PRODUCT_SK, PRICE etc.
But I have ORDER_SHIPMENT_TYPE in the header table ( ORDERS ). And it specifies the shipment status of that order ( not order item level ). If the ORDER_SHIPMENT_TYPE is 1, it means that it has given to the shipper, if it is 2, the parcel is on its way, if it is 3 the parcel has been delivered. And so this status changing should be tracked ( something like scd ) ( accumulating snapshot not fit because the # of steps is not certain )
So my question is, where should I put the ORDER_SHIPMENT_TYPE key in my dimensional modeling when I have ORDER and ORDER_ITEM tables in my OLTP environment?
Thanks.
In OLTP I have 2 fact tables : ORDERs and ORDER_LINEs .
ORDER_TABLE
----------------
ORDERID | ORDER_DATE | CUSTOMER_SK | STORE_SK | ORDER_SHIPMENT_TYPE_SK |
1 | 2014-10-10 01:00:00 | 1 | 2 | 3 |
2 | 2014-10-10 01:00:00 | 2 | 3 | 3 |
3 | 2014-10-10 02:00:00 | 3 | 3 | 2 |
For an ORDER there are multiple ORDER_LINE. I don't draw a table for order line. It has multiple product items for each order.
In dimensional modeling I can push the some of the ORDER attributes to the ORDER_LINE fact ( customer_key, transaction_date ). And my orderItems fact now have this fields:
ORDER_ID(DD), CUSTOMER_SK, TRANSACTION_DATE, PRODUCT_SK, PRICE etc.
But I have ORDER_SHIPMENT_TYPE in the header table ( ORDERS ). And it specifies the shipment status of that order ( not order item level ). If the ORDER_SHIPMENT_TYPE is 1, it means that it has given to the shipper, if it is 2, the parcel is on its way, if it is 3 the parcel has been delivered. And so this status changing should be tracked ( something like scd ) ( accumulating snapshot not fit because the # of steps is not certain )
So my question is, where should I put the ORDER_SHIPMENT_TYPE key in my dimensional modeling when I have ORDER and ORDER_ITEM tables in my OLTP environment?
Thanks.
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
Put it in your Order Shipment Type transaction fact table.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Header/Detail Fact Problem
BoxesAndLines wrote:Put it in your Order Shipment Type transaction fact table.
Well, how should I connect with my OrderItem (FACT) and the OrderShipmentType (FACT) ? What dimension should be used as conformed?
Thanks.
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
Order ID seems like a great candidate.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Header/Detail Fact Problem
Great, but as a best practice I should not link the 2 fact tables ( ORDER_STATUS_FACT and ORDER_ITEM_FACT ). There should be a conformed dimension.
And also I should not create ORDERS table as a conformed dimension to link those 2 fact tables. Becase as a best practice header table (ORDERs) should not be treated as a dimension.
I am in hesitance. Should I create ORDERs dimension table to link two fact tables ?
Thanks.
And also I should not create ORDERS table as a conformed dimension to link those 2 fact tables. Becase as a best practice header table (ORDERs) should not be treated as a dimension.
I am in hesitance. Should I create ORDERs dimension table to link two fact tables ?
Thanks.
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
The Order Id is a Degenerate Dimension and so joining Facts using it doesn't break the best practice of joining facts by Conformed Dims.
However, I understand the point you are making and my understanding of the theory is as follows (others may be able to give a better explanation):
The rule about joining via conformed dimensions applies to what I would call analytical queries i.e. high-level summaries of the data your star schemas contain. Where the resultset you want contains data from multiple facts, you query each fact (and its related dims) independently and then join the resultsets from each query using the conformed dims. You don't join all the tables together in a single query.
However, where you have multiple facts, usually in the same fact table, that all relate to the same "parent" record and you need to know which relate to the same parent record, then it is OK to use an ID, such as Order ID, to link them. You would use the ID to query just this table to get the resultset you need and then, if necessary, join this resultset to the results of querying other facts using conformed dims.
Hope this helps
However, I understand the point you are making and my understanding of the theory is as follows (others may be able to give a better explanation):
The rule about joining via conformed dimensions applies to what I would call analytical queries i.e. high-level summaries of the data your star schemas contain. Where the resultset you want contains data from multiple facts, you query each fact (and its related dims) independently and then join the resultsets from each query using the conformed dims. You don't join all the tables together in a single query.
However, where you have multiple facts, usually in the same fact table, that all relate to the same "parent" record and you need to know which relate to the same parent record, then it is OK to use an ID, such as Order ID, to link them. You would use the ID to query just this table to get the resultset you need and then, if necessary, join this resultset to the results of querying other facts using conformed dims.
Hope this helps
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Header/Detail Fact Problem
As far as I understand If I have an ORDER_ITEM fact ( ORDER_ID, ORDER_LINE_ID, ORDER_DATE, STORE_ID etc. ), and ORDER_SHIPMENT_STATUS fact ( ORDER_ID, ORDER_SHIPMENT_STATUS_TYPE_ID, SHIPMENT_STATUS_CHANGE_DATE ), I can join both two tables ( ORDER_ITEM and ORDER_SHIPMENT_STATUS ) via ORDER_ID column.
Is that correct?
Thanks for all help.
Is that correct?
Thanks for all help.
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
You can join on an aggregate of the two tables. Fact tables are assumed to have a M:M relation between them. Two facts are aggregated then joined on the common dimensions(or attributes). The aggregation eliminates the M:M relationship.
Re: Header/Detail Fact Problem
In ORDER_ITEMS_FACT fact table, ORDER_ID column is duplicated, because it is allocated from header into the detail table. Consider this data:
ORDER_ID|ORDER_ITEM_ID|PRODUCT_ID|CUSTOMER_ID ...
1 | 1 | 2 | 5
1 | 2 | 3 | 5
1 | 3 | 4 | 5
Same ORDER_ID with many different order items, products.
When I am looking for the status of an order, I get the ORDER_ID value from ORDER_ITEMS_FACT fact table and then I go to the ORDER_STATUS_FACT table. But because of ORDER_ID is duplicated in the ORDER_ITEMS table I would get duplicated rows. Should I run 2 different query ( by using distinct order_id ) then merge together in BI tool ?
ORDER_ID|ORDER_ITEM_ID|PRODUCT_ID|CUSTOMER_ID ...
1 | 1 | 2 | 5
1 | 2 | 3 | 5
1 | 3 | 4 | 5
Same ORDER_ID with many different order items, products.
When I am looking for the status of an order, I get the ORDER_ID value from ORDER_ITEMS_FACT fact table and then I go to the ORDER_STATUS_FACT table. But because of ORDER_ID is duplicated in the ORDER_ITEMS table I would get duplicated rows. Should I run 2 different query ( by using distinct order_id ) then merge together in BI tool ?
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
I would create a type 1 Order Status dimension that contains the current status for your order items fact.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Header/Detail Fact Problem
I will store the current status of an order in the ORDER_ITEM_FACT like this:
ORDER_ID | ORDER_ITEM_ID | PRODUCT_ID | CUSTOMER_ID | ORDER_STATUS_ID
1 | 1 | 2 | 3 | 2
1 | 2 | 3 | 3 | 2
1 | 3 | 4 | 3 | 2
2 | 4 | 2 | 2 | 4
2 | 5 | 3 | 2 | 4
2 | 6 | 3 | 2 | 4
And history is stored in another fact table ORDER_STATUS_FACT like this:
ORDER_ID | ORDER_STATUS_ID | START_DATE | END_DATE | IS_CURRENT
1 | 1 | 2014-10-10 | 2014-10-15 | N
1 | 2 | 2014-10-15 | 2014-10-19 | N
1 | 3 | 2014-10-19 | 9999-12-31 | Y
What do you think about it ? ORDER_STATUS_FACT table behaves like scd 2 dimension table.
ORDER_ID | ORDER_ITEM_ID | PRODUCT_ID | CUSTOMER_ID | ORDER_STATUS_ID
1 | 1 | 2 | 3 | 2
1 | 2 | 3 | 3 | 2
1 | 3 | 4 | 3 | 2
2 | 4 | 2 | 2 | 4
2 | 5 | 3 | 2 | 4
2 | 6 | 3 | 2 | 4
And history is stored in another fact table ORDER_STATUS_FACT like this:
ORDER_ID | ORDER_STATUS_ID | START_DATE | END_DATE | IS_CURRENT
1 | 1 | 2014-10-10 | 2014-10-15 | N
1 | 2 | 2014-10-15 | 2014-10-19 | N
1 | 3 | 2014-10-19 | 9999-12-31 | Y
What do you think about it ? ORDER_STATUS_FACT table behaves like scd 2 dimension table.
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
I guess I'm a bit confused about the notion of an order status in an analytic database. From my experience with distributors and manufacturers, an order isn't an order until it is accepted and released. Prior to that it is considered an indication or quote. The fulfillment (shipping) and invoicing of an order are separate, independent business processes that are usually handled by separate fact tables with references (degenerate dimensions) back to the order line. Potentially you can have 4 subject areas: indications (not common), orders, fulfillment and invoicing. If necessary, you can create an aggregate from these to provide an end-to-end picture.
Re: Header/Detail Fact Problem
Don't like it. Just build the type 1 status dimension. It will only have as many rows as there are distinct statuses. It will answer the majority of your questions.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Header/Detail Fact Problem
ngalemmo wrote:I guess I'm a bit confused about the notion of an order status in an analytic database. ...
The telecoms track it as all one process. Submitted, accepted, processing, provisioning, etc. I used an accumulating snapshot to report stage, lags, fallout, how many completed, etc. It was populated from the Order status fact. So I think the OP is almost on the right track.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Header/Detail Fact Problem
BoxesAndLines wrote:Don't like it. Just build the type 1 status dimension. It will only have as many rows as there are distinct statuses. It will answer the majority of your questions.
I could not understand exactly what you said. What is the structure of table do you suggest and how can I join it with Order_Items_Fact table? By the way OrderId is duplicated in that fact table.
Thx
ozisamur- Posts : 30
Join date : 2014-10-27
Re: Header/Detail Fact Problem
Create an Order Status dimension. This table has all distinct order statuses. Add this table to both fact tables.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» Header/Detail Fact Granularity
» Data Modelling -- linking Header and Detail Fact Tables.
» data model for 2 fact tables (Header / Detail scenario)
» Header/Detail Both should be two facts?
» Allocating Header Level FACT table Dimensions FKs into Line Level Fact
» Data Modelling -- linking Header and Detail Fact Tables.
» data model for 2 fact tables (Header / Detail scenario)
» Header/Detail Both should be two facts?
» Allocating Header Level FACT table Dimensions FKs into Line Level Fact
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum