Help designing star schema
4 posters
Page 1 of 1
Help designing star schema
Relational DB report writer swinging into dimensional modeling, so please be patient.
Essentially I am creating a Fact Table which holds the Numerator and Denominator for various healthcare Measures per Physician.
Each Measure has a Goal and a Score.
This will serve to populate a Scorecard which each Physician will receive. (In the report, we'll compare the % from Num/Den to the Goal and Score appropriately.)
A Physician has the usual demographic info (Name, Practice Name, etc.) and a Primary Specialty.
Modeling this seems relatively simple as I read up on Dimensions and Facts. The tricky part, which is causing me to stumble, is that there are different types of Scorecards. The Scorecard a Physician receives is dependent upon their Primary Specialty. The Scorecard itself has a subset of Measures.
So it's this relationship between Physician, Scorecard, and Measure that's throwing me off.
Physician to Scorecard is many to one. Scorecard to Measure is many to many.
While we may store some Measurements for some Physicians, they won't be part of the Scorecard report.
Before running into this, I imagined a Dimension for each of the following Quarter, Physician, Measure around the Fact Table for Measurements (Num and Denom).
Now I need help working in the Scorecard with its relationships in mind. Maybe I'm over-thinking, maybe I haven't come across the complexity in my reading, but any help would be appreciated!
Thank you in advance!
Joe
Essentially I am creating a Fact Table which holds the Numerator and Denominator for various healthcare Measures per Physician.
Each Measure has a Goal and a Score.
This will serve to populate a Scorecard which each Physician will receive. (In the report, we'll compare the % from Num/Den to the Goal and Score appropriately.)
A Physician has the usual demographic info (Name, Practice Name, etc.) and a Primary Specialty.
Modeling this seems relatively simple as I read up on Dimensions and Facts. The tricky part, which is causing me to stumble, is that there are different types of Scorecards. The Scorecard a Physician receives is dependent upon their Primary Specialty. The Scorecard itself has a subset of Measures.
So it's this relationship between Physician, Scorecard, and Measure that's throwing me off.
Physician to Scorecard is many to one. Scorecard to Measure is many to many.
While we may store some Measurements for some Physicians, they won't be part of the Scorecard report.
Before running into this, I imagined a Dimension for each of the following Quarter, Physician, Measure around the Fact Table for Measurements (Num and Denom).
Now I need help working in the Scorecard with its relationships in mind. Maybe I'm over-thinking, maybe I haven't come across the complexity in my reading, but any help would be appreciated!
Thank you in advance!
Joe
avgjoe- Posts : 7
Join date : 2014-08-25
Re: Help designing star schema
If you have a many-to-many relationship between Dimensions then the solution is probably to implement a bridge table.
So in your case you'd create a Scorecard Dimension and then a bridge table that relates Measures to Scorecards
Regards
So in your case you'd create a Scorecard Dimension and then a bridge table that relates Measures to Scorecards
Regards
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
re:suggestion
But the Measurement Fact Table holds information about Measures, not Scorecards. Not sure how a bridge helps accomplish this.
avgjoe- Posts : 7
Join date : 2014-08-25
Re: Help designing star schema
Isn't a scorecard just a report? If so, why would you try to model it?
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Help designing star schema
Sorry, I was assuming that a Physician could get many different scorecards; if they can...avgjoe wrote:But the Measurement Fact Table holds information about Measures, not Scorecards. Not sure how a bridge helps accomplish this.
The joins would be from your fact table to your Measures Dim to the Bridge to the Scorecard Dim.
If you want to report on a particular Scorecard then you would constrain your query to the appropriate Scorecard record, this would in turn only give you the measures related to that Scorecard and therefore only the Facts related to it.
Obviously you can also constrain by any other dimension e.g. The measures for a particular Scorecard for a particular Physician
If a Physician can only get one type of Scorecard, and will only be given measures relevant to that Scorecard then isn't the Scorecard just an attribute of the Physician?
Regards,
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
re:above
Yes, a Physician would only have one Scorecard. If I follow the line of thought that Scorecard is just a report, then the report itself would be responsible to use the Physician attribute Scorecard to determine which Measures to use, correct? Which would require lots of coding in the report.
avgjoe- Posts : 7
Join date : 2014-08-25
Re: Help designing star schema
Can you clarify: do you have data for measures for a physician that don't appear on that physician's scorecard?
If the answer is no then you don't need to worry about which measures apply to which scorecard - as you don't have any measures that don't apply to the physician's scorecard.
If the answer is yes then my first question is why? If measures don't appear on a physician's scorecard why are they being collected for that physician?
Regardless of the reason why you would have measures unallocated to a scorecard, using a bridge table to link Scorecard to Measures would allow you to constrain a report to only those measures that apply to a specific scorecard
Feel free to keep asking questions if you don't think I have understood the issue correctly
If the answer is no then you don't need to worry about which measures apply to which scorecard - as you don't have any measures that don't apply to the physician's scorecard.
If the answer is yes then my first question is why? If measures don't appear on a physician's scorecard why are they being collected for that physician?
Regardless of the reason why you would have measures unallocated to a scorecard, using a bridge table to link Scorecard to Measures would allow you to constrain a report to only those measures that apply to a specific scorecard
Feel free to keep asking questions if you don't think I have understood the issue correctly
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Help designing star schema
Maybe. Most fact tables don't have data driven facts. That is, the measures within a fact table generally apply to all dimensions. That in part drives the grain of the fact table. Here, your measures vary by dimension value, which strikes me as a performance hit in maintenance and reporting as well. I'm not sure I completely understand your situation and could be completely off base (happens a lot these days).
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
re:accumulating measures
Most measures are available from the source (picture an excel spreadsheet with Physicians along the Y-axis and all Measurement Num and Denom across the X-axis), but maybe in ETL, we should only capture Measurements relevant to each Physician's Scorecard.
avgjoe- Posts : 7
Join date : 2014-08-25
Re: Help designing star schema
You are mixing two things here. On one hand you have measures relating to physicians, and you have reports (scorecards) the present one or more measures. Is the intent to create a single fact table that can only produce predefined scorecards or do you want a fact table that can support general analysis and reporting as well as produce predefined scorecard reports?
If you want the former, the design is fine. You will need to repeat measures that appear on more than one scorecard.
If you want the latter, drop scorecard as a dimension. Have measure type as a dimension and use a bridge between measure type and scorecard to control which measures are used for a particular report.
If you want the former, the design is fine. You will need to repeat measures that appear on more than one scorecard.
If you want the latter, drop scorecard as a dimension. Have measure type as a dimension and use a bridge between measure type and scorecard to control which measures are used for a particular report.
follow-up
Initially this is just to produce scorecards, but I think they will want to do other analysis down the road so I'd prefer to work towards the latter.
If I have:
DimPhysicians
DimQuarters
DimMeasures->bridgeMeasuresScorecards->DimScorecards
FactMeasurements
Is it permissible/wise to have a ForeignKey to DimScorecard within DimPhysicians?
My thinking: On the report, I need to show only those measurements associated with a physician's scorecard.
As an aside, I will need to to show all of those measures whether they have measurements or not. How is this accomplished?
If I have:
DimPhysicians
DimQuarters
DimMeasures->bridgeMeasuresScorecards->DimScorecards
FactMeasurements
Is it permissible/wise to have a ForeignKey to DimScorecard within DimPhysicians?
My thinking: On the report, I need to show only those measurements associated with a physician's scorecard.
As an aside, I will need to to show all of those measures whether they have measurements or not. How is this accomplished?
avgjoe- Posts : 7
Join date : 2014-08-25
sigh
I think if I map this out in an ER Diagram, I have a circular reference going on.
Provider->Scorecard->Measurements->Measures->Provider
I probably need to fix that before trying to move to DM, if that's even appropriate.
Provider->Scorecard->Measurements->Measures->Provider
I probably need to fix that before trying to move to DM, if that's even appropriate.
avgjoe- Posts : 7
Join date : 2014-08-25
Re: Help designing star schema
That is a strategy I often use when I'm uncertain on the dimensional model design, build out the logical model so you can clearly identify the dependencies.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Help designing star schema
It's circular, but it works. You would join the fact to provider, provider to scorecard, scorecard to measure and measure to the fact. That will ensure you have facts related to the physician and scorecard. So, yes, two bridges: provider/scorecard and scorecard/measure.
Coverage table?
I modeled this as DimProvider<-BridgeProviderScorecard->DimScorecard<-BridgeScorecardMeasure->DimMeasure<-FactMeasurements(links also to DimProvider and DimQuarter).
Running a simple query of course gets only those Measures for an appropriate providers scorecard which have recordings. I need to also include ones without recordings. This makes me look into a factless table (which I think is called a Coverage Table.) This make me wonder if Scorecard itself should be instead modeled as a coverage table.
Running a simple query of course gets only those Measures for an appropriate providers scorecard which have recordings. I need to also include ones without recordings. This makes me look into a factless table (which I think is called a Coverage Table.) This make me wonder if Scorecard itself should be instead modeled as a coverage table.
avgjoe- Posts : 7
Join date : 2014-08-25
Re: Help designing star schema
Hi - I would say it depends on how many measures a physician can have. If you have a relatively small number of measures then I would just create a record for each one in your fact table and include a "Not recorded" flag somewhere. You'd need to make sure that these records didn't skew your queries incorrectly - i.e. you can exclude them when required.
A coverage table and a factless fact table are not necessarily the same thing - though a coverage fact table could be a subtype of a factless fact table if you decide not to include a measure in it.
The classic example of using a coverage table is a supermarket sales promotion - where 1000s of products in every store could be part of the promotion - so creating a dummy sales records for every possible promotion/store/product would generate a huge number of fact records which would clog up your sales fact table; so instead you move all the possible sales to a coverage table.
Nothing to stop you creating a coverage table for your measures but if you only have a few 10s (or possibly 100s) then the overhead of maintaining the coverage table and the additional query complexity of unioning your measure fact and your coverage fact queries probably means just having a single fact table is a better solution
A coverage table and a factless fact table are not necessarily the same thing - though a coverage fact table could be a subtype of a factless fact table if you decide not to include a measure in it.
The classic example of using a coverage table is a supermarket sales promotion - where 1000s of products in every store could be part of the promotion - so creating a dummy sales records for every possible promotion/store/product would generate a huge number of fact records which would clog up your sales fact table; so instead you move all the possible sales to a coverage table.
Nothing to stop you creating a coverage table for your measures but if you only have a few 10s (or possibly 100s) then the overhead of maintaining the coverage table and the additional query complexity of unioning your measure fact and your coverage fact queries probably means just having a single fact table is a better solution
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Similar topics
» star schema designing
» Need help designing this star for education sector/university
» Star Schema for MPP databases
» Fact 1:N Dimension with Star Schema
» Star Schema vs All in one table
» Need help designing this star for education sector/university
» Star Schema for MPP databases
» Fact 1:N Dimension with Star Schema
» Star Schema vs All in one table
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum