Modelling inventory quantities - fact or dimension
2 posters
Page 1 of 1
Modelling inventory quantities - fact or dimension
We currently have a part_dimension that is FK referenced in a invoice_detail_fact table.
This part_dimension is only current view with daily updates. Maybe incorrectly, it has a mix of attributes, including descriptive (part_class, cataegory, family, etc), status indicators (discontinued, in_catalog, on_internet, etc), and factual measures (available_quantity, current_cost, current_price, etc).
We have a requirement for viewing daily inventory levels: available_qty, shipping_reserved_qty. This will evolve to a hourly requirement...
I am thinking that in building a new fact table, such as, inventory_snapshot_fact, we would cleanup our part_dimension by removing factual measures and status indicators and create new part_status_dim in the process.
We have recently built a couple of other new fact tables for pricing and cost:
INVENTORY_COST_FACT
PART_DIM_ID
PURCHASE_COST_TYPE_DIM_ID
COST_EFFECTIVE_DATE
PART_ID
MIN_BREAK_QTY
MAX_BREAK_QTY
PART_COST_AMT
INTERNAL_INVENTORY_PRICE_FACT
PART_DIM_ID
VENDOR_DIM_ID
CURRENCY_DIM_ID
PRICE_EFFECTIVE_DATE
PART_ID
MIN_BREAK_QTY
MAX_BREAK_QTY
PART_PRICE_AMT
I am thinking that INVENTORY_SNAPSHOT_FACT would look like this:
PART_DIM_ID
VENDOR_DIM_ID
PART_STATUS_DIM_ID
PART_STATUS_TIMESTAMP
AVAILABLE_QTY
SHIPPING_RESERVED_QTY
And that the PART_STATUS_DIM might be a junk dimension holding all the various statuses that a part could have.
The original PART_DIMENSION could stay current view because it would just have the decriptive attributes which don't really change. The PART_STATUS_DIM might need to be SCD. If the requirement for viewing inventory quantities changes from daily to hourly, is there anything that we should plan for doing differently?
I appreciate any thoughts on this from those of you who have tackled this before.
This part_dimension is only current view with daily updates. Maybe incorrectly, it has a mix of attributes, including descriptive (part_class, cataegory, family, etc), status indicators (discontinued, in_catalog, on_internet, etc), and factual measures (available_quantity, current_cost, current_price, etc).
We have a requirement for viewing daily inventory levels: available_qty, shipping_reserved_qty. This will evolve to a hourly requirement...
I am thinking that in building a new fact table, such as, inventory_snapshot_fact, we would cleanup our part_dimension by removing factual measures and status indicators and create new part_status_dim in the process.
We have recently built a couple of other new fact tables for pricing and cost:
INVENTORY_COST_FACT
PART_DIM_ID
PURCHASE_COST_TYPE_DIM_ID
COST_EFFECTIVE_DATE
PART_ID
MIN_BREAK_QTY
MAX_BREAK_QTY
PART_COST_AMT
INTERNAL_INVENTORY_PRICE_FACT
PART_DIM_ID
VENDOR_DIM_ID
CURRENCY_DIM_ID
PRICE_EFFECTIVE_DATE
PART_ID
MIN_BREAK_QTY
MAX_BREAK_QTY
PART_PRICE_AMT
I am thinking that INVENTORY_SNAPSHOT_FACT would look like this:
PART_DIM_ID
VENDOR_DIM_ID
PART_STATUS_DIM_ID
PART_STATUS_TIMESTAMP
AVAILABLE_QTY
SHIPPING_RESERVED_QTY
And that the PART_STATUS_DIM might be a junk dimension holding all the various statuses that a part could have.
The original PART_DIMENSION could stay current view because it would just have the decriptive attributes which don't really change. The PART_STATUS_DIM might need to be SCD. If the requirement for viewing inventory quantities changes from daily to hourly, is there anything that we should plan for doing differently?
I appreciate any thoughts on this from those of you who have tackled this before.
kjfischer- Posts : 28
Join date : 2011-05-04
Sounds okay so far...
Looks like you are moving down the right path. Moving the part "metrics" into their own facts is a good move. I do have a few questions:
1. You list "date" for some columns; are these real dates or dimension surrogates? If they are dates, I would recommend making these surrogate keys tied to a Date Dimension.
2. Same for the new inventory fact, instead of a timestamp, you should implement a Date Dimension surrogate (grain = 1 row per day) as well as a Time Dimension surrogate (grain = 1 row per second), if the inventory is going to move to an hourly snapshot.
3. Why/how would you implement a SCD on the "status junk" dimension? Junk usually means the status/flags/etc the are used to describe a fact, but do not belong in their own dimension. If the various statuses are truly junk, then I do not think SCD would apply.
1. You list "date" for some columns; are these real dates or dimension surrogates? If they are dates, I would recommend making these surrogate keys tied to a Date Dimension.
2. Same for the new inventory fact, instead of a timestamp, you should implement a Date Dimension surrogate (grain = 1 row per day) as well as a Time Dimension surrogate (grain = 1 row per second), if the inventory is going to move to an hourly snapshot.
3. Why/how would you implement a SCD on the "status junk" dimension? Junk usually means the status/flags/etc the are used to describe a fact, but do not belong in their own dimension. If the various statuses are truly junk, then I do not think SCD would apply.
elmorejr- Posts : 25
Join date : 2011-10-20
Location : Columbus, Ohio, United States of America, Earth, Sol
Re: Modelling inventory quantities - fact or dimension
Thanks for your reply.
We do have a "date dimension". The primary key of the date dimension is a date; it is not a sequence generated key. We currently do not have a time dimension. I have not worked in an environment where near real-time reporting was required.
In the source system, inventory levels (available_qty) is changing fequently, i.e. everytime a part is ordered, then the available_qty is changed. I have looked at Kimball's Human Resource snapshot fact example, but employee facts do not change as rapidly as inventory facts. So, I don't know if the modelling approach is transferable.
Regarding the part status flags and indicators, this is what I want to be able to report:
example,
available_quantity for parts that are available_on_internet and are stocked_in_warehouse
Maybe those are just descriptors and belong or should stay in the part_dimension?
We do have a "date dimension". The primary key of the date dimension is a date; it is not a sequence generated key. We currently do not have a time dimension. I have not worked in an environment where near real-time reporting was required.
In the source system, inventory levels (available_qty) is changing fequently, i.e. everytime a part is ordered, then the available_qty is changed. I have looked at Kimball's Human Resource snapshot fact example, but employee facts do not change as rapidly as inventory facts. So, I don't know if the modelling approach is transferable.
Regarding the part status flags and indicators, this is what I want to be able to report:
example,
available_quantity for parts that are available_on_internet and are stocked_in_warehouse
Maybe those are just descriptors and belong or should stay in the part_dimension?
kjfischer- Posts : 28
Join date : 2011-05-04
Junk dim
Based on your example,
I would say these are junk dimension attributes. They change as rapidly as the snapshot is changing (i.e. one hour a part is available in the warehouse, the next hour it could not be). Having your Part dimension change that rapidly would lead to a large number of Type 2 updates.
By putting those into a junk dim, you would still be able to capture the context.
"at this snapshot date (Date dim) and time (Time dim), there were 3 (measure) widgets (Part dim) in warehouse A (Warehouse dim) with a status of X & indicator Z (both in Status Junk dim)"
And if you are taking an hourly snapshot, then yes, your fact will grow very large, quickly (depending on the number of part/warehouse combinations). You may want to consider keeping an hourly fact that only maintains 24 or 72 hours of data. With another fact that keeps a daily snapshot for summary. And possibly other rollups if the daily is also too much data to maintain over time (weekly, monthly, etc.).
available_quantity for parts that are available_on_internet and are stocked_in_warehouse
I would say these are junk dimension attributes. They change as rapidly as the snapshot is changing (i.e. one hour a part is available in the warehouse, the next hour it could not be). Having your Part dimension change that rapidly would lead to a large number of Type 2 updates.
By putting those into a junk dim, you would still be able to capture the context.
"at this snapshot date (Date dim) and time (Time dim), there were 3 (measure) widgets (Part dim) in warehouse A (Warehouse dim) with a status of X & indicator Z (both in Status Junk dim)"
And if you are taking an hourly snapshot, then yes, your fact will grow very large, quickly (depending on the number of part/warehouse combinations). You may want to consider keeping an hourly fact that only maintains 24 or 72 hours of data. With another fact that keeps a daily snapshot for summary. And possibly other rollups if the daily is also too much data to maintain over time (weekly, monthly, etc.).
elmorejr- Posts : 25
Join date : 2011-10-20
Location : Columbus, Ohio, United States of America, Earth, Sol
Similar topics
» Modelling cutomer dimension and sales orders fact
» Modelling Product Dimension when incoming fact records have missing lowest level
» Modelling many to many relationships in a dimension
» The Role of the Inventory Transaction Fact Table Within a Value Chain DW
» Reduce Rows in Inventory Snapshot Fact Table
» Modelling Product Dimension when incoming fact records have missing lowest level
» Modelling many to many relationships in a dimension
» The Role of the Inventory Transaction Fact Table Within a Value Chain DW
» Reduce Rows in Inventory Snapshot Fact Table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum