Booking number and description in the fact table?
5 posters
Page 1 of 1
Booking number and description in the fact table?
Hi there,
we have a fact table that stores the items of a costplan of a project. Each item has a booking number and a description. If we would have a booking dimensions, the dimension would have as many rows as the fact table. My question is, whether we should build this mentioned dimension or should we add two attributes to the fact table -> booking_nr and booking_nr_description?
Actually this is a degenerate dimenion, but I'm not sure because of the description field.
Thanks
we have a fact table that stores the items of a costplan of a project. Each item has a booking number and a description. If we would have a booking dimensions, the dimension would have as many rows as the fact table. My question is, whether we should build this mentioned dimension or should we add two attributes to the fact table -> booking_nr and booking_nr_description?
Actually this is a degenerate dimenion, but I'm not sure because of the description field.
Thanks
bustaliz- Posts : 4
Join date : 2009-11-17
Re: Booking number and description in the fact table?
It is generally not a good idea to put text, particularly lengthy text, columns in a fact table. It can significantly bloat the size of the facts and signficantly impact performance of all queries wither they need the description or not.
What I usually do is a case like this is create a generic text dimension with a surrogate primary key, a non-unique hash value as an alternate key and the text. The hash serves as the natural key and is indexed. The text value is not indexed. When a new text value comes in, I caclulate the hash (most DB's and ETL tools have functions that will do this) and do a lookup to for a matching hash and text value. The lookup is efficient because the hash will usually limit the search to only a few rows (use as large a hash range as feasable with the hash function you have). Basically treat it like any type 1 dimension.
The fact table would then contain the booking number (a degenerate dimension) and the FK to the descriptive text. Usually you will find, particularly in an application such as this, the same description is used repeatedly, so often the text dimension is much smaller than what it would be had you stored every instance individually.
What I usually do is a case like this is create a generic text dimension with a surrogate primary key, a non-unique hash value as an alternate key and the text. The hash serves as the natural key and is indexed. The text value is not indexed. When a new text value comes in, I caclulate the hash (most DB's and ETL tools have functions that will do this) and do a lookup to for a matching hash and text value. The lookup is efficient because the hash will usually limit the search to only a few rows (use as large a hash range as feasable with the hash function you have). Basically treat it like any type 1 dimension.
The fact table would then contain the booking number (a degenerate dimension) and the FK to the descriptive text. Usually you will find, particularly in an application such as this, the same description is used repeatedly, so often the text dimension is much smaller than what it would be had you stored every instance individually.
Re: Booking number and description in the fact table?
Thanks ngalemmo, that sounds good.
bustaliz- Posts : 4
Join date : 2009-11-17
Re: Booking number and description in the fact table?
Interesting. I've never thought of that. A junk dim of text descriptions. You ever add multiple text columns to this dim or create a new dim for each text value?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Booking number and description in the fact table?
You only need one table with one text value. If there are multiple texts associated with a fact, you would have multiple foreign keys for each text. Another aspect of this approach is, if you wanted to add a keyword search capability, it is fairly easy to add a keyword/text association table using the dimension's primary key.
Re: Booking number and description in the fact table?
Nick,
Could you please eloborate on the aspect of multiple texts tied to one fact record. I did not understand the part of "having multiple foreign keys for the text".
I have a situation where I have an order that can have upto 20-25 messages tied to it. How do I get handle this situation. Could you please explain?
Thanks,
Kris
Could you please eloborate on the aspect of multiple texts tied to one fact record. I did not understand the part of "having multiple foreign keys for the text".
I have a situation where I have an order that can have upto 20-25 messages tied to it. How do I get handle this situation. Could you please explain?
Thanks,
Kris
AKris- Posts : 6
Join date : 2010-02-28
Re: Booking number and description in the fact table?
It depends.
If the texts fit into clearly defined roles, are a fixed number and coexist at the same time with a single fact, then you can handle it by defining multiple foreign keys, one for each role. Not unlike having a ship-to customer and bill-to customer on an order. The case where I used this was a data mart to track projects where the PM could describe the 3 top issues relating to the project at the time of his report. The business wanted to collect these texts in the mart and were stored as 3 role based FK's off the project status fact table.
If the texts accumulate over time and/or the number is unknown, then they are best modeled in its own fact table with appropriate dimensions and one text FK per row.
If the texts fit into clearly defined roles, are a fixed number and coexist at the same time with a single fact, then you can handle it by defining multiple foreign keys, one for each role. Not unlike having a ship-to customer and bill-to customer on an order. The case where I used this was a data mart to track projects where the PM could describe the 3 top issues relating to the project at the time of his report. The business wanted to collect these texts in the mart and were stored as 3 role based FK's off the project status fact table.
If the texts accumulate over time and/or the number is unknown, then they are best modeled in its own fact table with appropriate dimensions and one text FK per row.
Re: Booking number and description in the fact table?
In my case, the messages fall under 4 different catogeriesM,Q,T,S.Each order can have 1 or more from each category. The number of messages is variable. Some orders can have 10 messages (3 M,2Q,1S,4T messages) or even more and all the four categories may not have to exist. I am trying to think of using 4 bridge tables between message dim and order header fact each having it own grouids,combination of messages (ex: M-Message bridge will have groupID's,FKs of M Messages ) and insert this groupids' as FKs to the fact table. Does this work?Could you please let me know?
AKris- Posts : 6
Join date : 2010-02-28
Re: Booking number and description in the fact table?
A factless fact table with order, message type, and message (maybe message date) as dimensions is essentially the same thing... you can call it a bridge it you want. I don't think you would need to group them as you would with a multivalue dimension (but I may be wrong).
I wouldn't create separate tables, but would prefer to include the type as a dimension (it could simply be a degenerate code value). The reason is the users may invent new types in the future and if they do, it becomes a simple matter to accomodate new types.
I wouldn't create separate tables, but would prefer to include the type as a dimension (it could simply be a degenerate code value). The reason is the users may invent new types in the future and if they do, it becomes a simple matter to accomodate new types.
Re: Booking number and description in the fact table?
Hi Kris,
I work in the Healthcare Insurance domain and have a similar kind of a thing which we implemented in our Enterprise Data Warehouse as Below.
I suggest you hit this problem in two phases.
Phase 1: How to maintain the Messages with their categories separately
Phase2: How to attach appropriate messages to your Orders.
Phase 1 can be implemented as below:
-----------------------------------
Create Two tables one for Messages and one for Message categories.
Add a surrogate key to the Message table below called MSG_ID which will uniquely identify a Message along with its description.
Do not worry about which Category it belongs to at this point.
ORDER_MSG
---------
MSG_ID
MSG_CD
MSG_DESC
Create another table for the categories called ORDER_MSG_CAT as shown below and maintain all the categories(M,Q,T,S) there along with some description
if applicable.
ORDER_MSG_CAT
-------------
MSG_CAT_ID
MSG_CAT_CD
MSG_CAT_DESC
Now create a Cross-Reference table ORDER_MSG_CAT_XREF which links all messages to its categories as shown below.
ORDER_MSG_CAT_XREF
-------------------
MSG_CAT_XREF_ID
MSG_ID
MSG_CAT_ID
Phase 2 can be implemented as below:
-----------------------------------
Create a Cross reference table ORDER_MSG_XREF which links each message to the appropriate order.
Two cases arise here:
Case1: A message can belong one and only one Category. In this case ORDER_MSG_XREF will have the following FKs
ORDER_MSG_XREF
--------------
ORDER_ID
MSG_ID
Case2: A message can belong to more than 1 categories. In this case ORDER_MSG_XREF will have the following FKs
ORDER_MSG_XREF
--------------
ORDER_ID
MSG_CAT_XREF_ID
Or if you like you can build the ORDER_MSG table with the Category_CD column as well, in that case the natural key of this table should be
Combination of MSG_CD and MSG_CAT_CD. Add a Surrogate Key MSG_ID and use this as a FK in the ORDER_MSG_XREF table.
I hope it helps
Manik
I work in the Healthcare Insurance domain and have a similar kind of a thing which we implemented in our Enterprise Data Warehouse as Below.
I suggest you hit this problem in two phases.
Phase 1: How to maintain the Messages with their categories separately
Phase2: How to attach appropriate messages to your Orders.
Phase 1 can be implemented as below:
-----------------------------------
Create Two tables one for Messages and one for Message categories.
Add a surrogate key to the Message table below called MSG_ID which will uniquely identify a Message along with its description.
Do not worry about which Category it belongs to at this point.
ORDER_MSG
---------
MSG_ID
MSG_CD
MSG_DESC
Create another table for the categories called ORDER_MSG_CAT as shown below and maintain all the categories(M,Q,T,S) there along with some description
if applicable.
ORDER_MSG_CAT
-------------
MSG_CAT_ID
MSG_CAT_CD
MSG_CAT_DESC
Now create a Cross-Reference table ORDER_MSG_CAT_XREF which links all messages to its categories as shown below.
ORDER_MSG_CAT_XREF
-------------------
MSG_CAT_XREF_ID
MSG_ID
MSG_CAT_ID
Phase 2 can be implemented as below:
-----------------------------------
Create a Cross reference table ORDER_MSG_XREF which links each message to the appropriate order.
Two cases arise here:
Case1: A message can belong one and only one Category. In this case ORDER_MSG_XREF will have the following FKs
ORDER_MSG_XREF
--------------
ORDER_ID
MSG_ID
Case2: A message can belong to more than 1 categories. In this case ORDER_MSG_XREF will have the following FKs
ORDER_MSG_XREF
--------------
ORDER_ID
MSG_CAT_XREF_ID
Or if you like you can build the ORDER_MSG table with the Category_CD column as well, in that case the natural key of this table should be
Combination of MSG_CD and MSG_CAT_CD. Add a Surrogate Key MSG_ID and use this as a FK in the ORDER_MSG_XREF table.
I hope it helps
Manik
Mj1978- Posts : 8
Join date : 2010-03-10
Similar topics
» Number of Dimensions around a FACT Table.
» How to create fact table with measures derived from comparing two fact table rows
» Fact Table with huge number of Blank (or Empty) foreign keys
» Unknown number of relationships from dimension to fact until fact loaded
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
» How to create fact table with measures derived from comparing two fact table rows
» Fact Table with huge number of Blank (or Empty) foreign keys
» Unknown number of relationships from dimension to fact until fact loaded
» 'Routing' the grain of the fact table to multpile members of multiple dimensions causes the fact table to 'explode'
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum