How does dimension hierarcy affect granularity?
2 posters
Page 1 of 1
How does dimension hierarcy affect granularity?
Hi,
I hope you can help with this problem, and I hope I can explain it well enough!
I have a few different source systems which have drug data at different levels. E.g. One has a tree structure where drugs are grouped into sets for prescribing. Typically there are three levels or fewer. But potentially more levels. Another system has two levels, at the drug level and then broken down into the chemical level. Another has drug broken down into multiple EAN-products. Another system groups them into arbitrarily.
After a bit of head scratching I decided that the way to go was putting all these into a self-referencing or tree structure dimension with unique key of [source_system] & [natural_key]. And having a [level] and a [parent_sk] attribute to explain the tree structure.
But now I come to build the fact tables I am really stuck. What granularity do they have? We are told not to mix granularities in one table, but the system that has a tree structure inherently has a mixture of granularities. (Potentially it could have no end of different granularities in it's facts.) I don't want three fact tables. To put it another way, it has a "Product" level at the base. But that product may be prescribed at the top level, or by a set, or at the individual drug level. So to report all prescriptions, at the level the prescribers will recognise, will we need to union 3 (or more) fact tables?
Alternatively I could have one dimension that flattens out the tree structure by having e.g.
1 | System1 | DrugName1 | Set1 | Regime1
3 | System1 | DrugName1 | Set2 | Regime1
4 | System1 | DrugName1 | Set1 | Regime3
5 | System1 | DrugName2 | Set1 | Regime1
But then I need chemical in there too, and that data is from another system.
That doesn't seem to make sense.
1 | System2 | Chemical1 | DrugName1 | NONE | NONE
2 | System2 | Chemical2 | DrugName1 | NONE | NONE
3 | System? | Chemical1 | DrugName1 | Set2 | Regime1
6 | System? | Chemical2 | DrugName1 | Set2 | Regime1
4 | System? | Chemical1 | DrugName1 | Set1 | Regime3
5 | System? | Chemical1 | DrugName2 | Set1 | Regime1
Hope you can help,
Al Wood
I hope you can help with this problem, and I hope I can explain it well enough!
I have a few different source systems which have drug data at different levels. E.g. One has a tree structure where drugs are grouped into sets for prescribing. Typically there are three levels or fewer. But potentially more levels. Another system has two levels, at the drug level and then broken down into the chemical level. Another has drug broken down into multiple EAN-products. Another system groups them into arbitrarily.
After a bit of head scratching I decided that the way to go was putting all these into a self-referencing or tree structure dimension with unique key of [source_system] & [natural_key]. And having a [level] and a [parent_sk] attribute to explain the tree structure.
But now I come to build the fact tables I am really stuck. What granularity do they have? We are told not to mix granularities in one table, but the system that has a tree structure inherently has a mixture of granularities. (Potentially it could have no end of different granularities in it's facts.) I don't want three fact tables. To put it another way, it has a "Product" level at the base. But that product may be prescribed at the top level, or by a set, or at the individual drug level. So to report all prescriptions, at the level the prescribers will recognise, will we need to union 3 (or more) fact tables?
Alternatively I could have one dimension that flattens out the tree structure by having e.g.
1 | System1 | DrugName1 | Set1 | Regime1
3 | System1 | DrugName1 | Set2 | Regime1
4 | System1 | DrugName1 | Set1 | Regime3
5 | System1 | DrugName2 | Set1 | Regime1
But then I need chemical in there too, and that data is from another system.
That doesn't seem to make sense.
1 | System2 | Chemical1 | DrugName1 | NONE | NONE
2 | System2 | Chemical2 | DrugName1 | NONE | NONE
3 | System? | Chemical1 | DrugName1 | Set2 | Regime1
6 | System? | Chemical2 | DrugName1 | Set2 | Regime1
4 | System? | Chemical1 | DrugName1 | Set1 | Regime3
5 | System? | Chemical1 | DrugName2 | Set1 | Regime1
Hope you can help,
Al Wood
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How does dimension hierarcy affect granularity?
Ignore chemistry, it is a multivalued dimension based on drug. (see your other thread).
I don't know where you are, but I assume there is a standard idenfication system for all prescription drugs, if not an international standard, at least a national one. I would also assume your sources would include that identifier. If that is the case, use that identifier as the natural key to the drug dimension, this would allow you to leverage the chemical information from one source with the drugs received from other sources.
As far as the hierarchies go, keep it in a separate table keyed by drug and include a hierarchy type as part of the PK. This would allow you to keep multiple hierarchies if necessary for the same drugs.
I don't know where you are, but I assume there is a standard idenfication system for all prescription drugs, if not an international standard, at least a national one. I would also assume your sources would include that identifier. If that is the case, use that identifier as the natural key to the drug dimension, this would allow you to leverage the chemical information from one source with the drugs received from other sources.
As far as the hierarchies go, keep it in a separate table keyed by drug and include a hierarchy type as part of the PK. This would allow you to keep multiple hierarchies if necessary for the same drugs.
We are ahead of the standards
Thanks for the reply.
I'm in the UK. Maybe half our drugs are trial or investigational so the standard codes don't exist yet. Our source systems don't use them.
Do you mean keep the hierarchy in a table separate from the Dim? Then how would I use it?
In the other post you recommend having a multivalued dimension. I think I will, but apart from that I'm hoping to use a single table: a drug Dimension, with a source_system column, and having a flattened hierarcy with a column for each level. But then what if they configure a drug with 4 or more levels in its hierarchy?
I'm going to have to rely on Data Stewards to conform the drugs. Part of the need is for a mapping between systems, but its not clear at what level this should exist. The two main systems are Prescribing and Pharmacy, and the link is manual. So to fulfill a prescription of aspirin/tablet/500mg there might be two issues of aspirin/tablet/250mg. They should fulfill the prescription with a trial-coded drug if its part of a trial drugset. This is to keep charges seperate, but the Pharmacy staff break this rule sometimes. So the mapping may need to be at a high level, e.g. aspirin, in which case there will be a many-to-many mapping which I can't see how to handle.
This is my toughest design challenge yet. I'm really hoping for a simple answer that gives some practical wins.
Al Wood
I'm in the UK. Maybe half our drugs are trial or investigational so the standard codes don't exist yet. Our source systems don't use them.
Do you mean keep the hierarchy in a table separate from the Dim? Then how would I use it?
In the other post you recommend having a multivalued dimension. I think I will, but apart from that I'm hoping to use a single table: a drug Dimension, with a source_system column, and having a flattened hierarcy with a column for each level. But then what if they configure a drug with 4 or more levels in its hierarchy?
I'm going to have to rely on Data Stewards to conform the drugs. Part of the need is for a mapping between systems, but its not clear at what level this should exist. The two main systems are Prescribing and Pharmacy, and the link is manual. So to fulfill a prescription of aspirin/tablet/500mg there might be two issues of aspirin/tablet/250mg. They should fulfill the prescription with a trial-coded drug if its part of a trial drugset. This is to keep charges seperate, but the Pharmacy staff break this rule sometimes. So the mapping may need to be at a high level, e.g. aspirin, in which case there will be a many-to-many mapping which I can't see how to handle.
This is my toughest design challenge yet. I'm really hoping for a simple answer that gives some practical wins.
Al Wood
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How does dimension hierarcy affect granularity?
I got the impression by your desription that there were different hierarchies for the same drugs. If you go with a flat hierarchy, they cannot coexist in the drug dimension table. Instead have a hierarchy table that is keyed by drug key and hierarchy type, this way you can apply a selected heirarchy (ie hierarchy type) to a fact using the drug key.
As far as flattened hierarchies in general, they are only usful if the number of levels are known and identifiable. If the levels change or interpretation of a level varies from one application to the next, you are better off using an exploded hierarchy bridge table.
As far as flattened hierarchies in general, they are only usful if the number of levels are known and identifiable. If the levels change or interpretation of a level varies from one application to the next, you are better off using an exploded hierarchy bridge table.
Similar topics
» Can a conformed dimension have a varying granularity?
» Actual and Plan Facts at different granularity - one conformed dimension?
» Facts Tables linking to different granularity of a Conformed Dimension
» Time dimension design for cross time zone and custom calendar warehouse, 1min granularity
» Model Design With Several 0 to Many Joins?
» Actual and Plan Facts at different granularity - one conformed dimension?
» Facts Tables linking to different granularity of a Conformed Dimension
» Time dimension design for cross time zone and custom calendar warehouse, 1min granularity
» Model Design With Several 0 to Many Joins?
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum