Tracking history in huge hierarchies
2 posters
Page 1 of 1
Tracking history in huge hierarchies
Let's review the following dimensional challenge:
Challenge
There is a big organizational tree with the 6 levels: OrgLevel1 -> OrgLevel2 -> .. -> OrgLevel6
Each level has its own set of attributes of type 2 and could be used for facts analysis
The relationships between levels are changing pretty fast: each month up to 10% of members in one level could change their parent members and the history of changes should be kept.
The question is how to store all the data and track all the changes efficiently.
Possible implementation
I have following vision how it could be implemented:
What do you thing of this approach?
Thanks
[i]
Challenge
There is a big organizational tree with the 6 levels: OrgLevel1 -> OrgLevel2 -> .. -> OrgLevel6
Each level has its own set of attributes of type 2 and could be used for facts analysis
The relationships between levels are changing pretty fast: each month up to 10% of members in one level could change their parent members and the history of changes should be kept.
The question is how to store all the data and track all the changes efficiently.
Possible implementation
I have following vision how it could be implemented:
- Keep separate dimensional table for each OrgLevel with SCD type 2 structure (row_start_date, row_end_date columns etc.)
- Create a special bridge table to keep relationships between organization levels with time range indicators. So the bridge table would have following structure:
OrgHierarchyBridgeTable
BridgeTableSurrogateKey
OrgLevel_1_BusinessKey
OrgLevel_2_BusinessKey
...
OrgLevel_6_BusinessKey
RowStartDate
RowEndDate - Keep in fact table the BridgeTableSurrogateKey
What do you thing of this approach?
Thanks
[i]
andriy.zabavskyy- Posts : 18
Join date : 2011-09-12
Re: Tracking history in huge hierarchies
Adding effective periods to the bridge table, as well as the source relationships is fine. Problem is, the bridge table is incorrect.
A bridge is not a flattening of the hierarchy. If you flatten it, you just store a flat picture keyed at the lowest level. Its a dimension, not a bridge.
A bridge table is an explosion of the recursive parent/child keys. The table contains every possible parent-child relationship across all levels. Each relationship can be date bound to track versions. A bridge contains a child key, a parent key, child level, parent level or distance, date ranges, and other attributes as appropriate. Since the bridge is every possible relationship pair, you can join at any level and traverse up or down the hierarchy using simple non-recursive SQL.
The update process is to first update the recursive structure (parent/child relationships), then create a new exploded bridge table. Do a match/merge pass of the new exploded bridge table and the current one (full outer join on parent key and child key). Expire old relationships that are not in the new table and add new relationships that are not in the old table.
A bridge is not a flattening of the hierarchy. If you flatten it, you just store a flat picture keyed at the lowest level. Its a dimension, not a bridge.
A bridge table is an explosion of the recursive parent/child keys. The table contains every possible parent-child relationship across all levels. Each relationship can be date bound to track versions. A bridge contains a child key, a parent key, child level, parent level or distance, date ranges, and other attributes as appropriate. Since the bridge is every possible relationship pair, you can join at any level and traverse up or down the hierarchy using simple non-recursive SQL.
The update process is to first update the recursive structure (parent/child relationships), then create a new exploded bridge table. Do a match/merge pass of the new exploded bridge table and the current one (full outer join on parent key and child key). Expire old relationships that are not in the new table and add new relationships that are not in the old table.
Re: Tracking history in huge hierarchies
I agree, I used not proper naming - it is not a real bridge table.
But my question if the approach in general will work properly? Are there any disadvantages or difficulties related to it?
I personally don't think we need to build a real bridge table because the depth of the hierarchy is fixed.
Thanks
But my question if the approach in general will work properly? Are there any disadvantages or difficulties related to it?
I personally don't think we need to build a real bridge table because the depth of the hierarchy is fixed.
Thanks
andriy.zabavskyy- Posts : 18
Join date : 2011-09-12
Re: Tracking history in huge hierarchies
Both would support similar queries, but an exploded bridge is a more efficient and flexible structure.
Similar topics
» Relationship between a history tracking table and a non-history tracking table?
» History tracking in a CRM data warehouse
» Tracking fact table history
» Tracking history of multiple SCD type 2 attributes
» Designing huge incremental factless fact
» History tracking in a CRM data warehouse
» Tracking fact table history
» Tracking history of multiple SCD type 2 attributes
» Designing huge incremental factless fact
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum