How many columns is too many OR how to reduce the columns
4 posters
Page 1 of 1
How many columns is too many OR how to reduce the columns
I have a fact table that holds Radiology Examination events:
[ServiceProviderExam_sk]
,[ServiceProviderReport_sk]
,[Patient_sk]
,[Radiologist_Typed_By_sk]
,[Staff_Reported_By_sk]
,[Staff_Referrer_sk]
,[Staff_Radiographer_1_sk]
,[Specialty_sk]
,[Exam_Day_sk]
,[Exam_Datetime]
,[Reported_Date]
,[Latest_Clock_Reset_Date]
,[Exam_sk]
,[Location_Referring_sk]
,[Location_sk]
,[Request_Category_sk]
,[Urgency_sk]
,[Exam_Key]
,[Event_Key]
,[Waiting Not Planned]
I know I should have a narrower fact table, but how to do it? I could have a Exam Status fact which is filled as the Exam goes through the steps of Referred, Booked, Attended, Examined, Reported etc.
OR I could have separate fact tables describing these different events?
Which is best?
Al Wood
[ServiceProviderExam_sk]
,[ServiceProviderReport_sk]
,[Patient_sk]
,[Radiologist_Typed_By_sk]
,[Staff_Reported_By_sk]
,[Staff_Referrer_sk]
,[Staff_Radiographer_1_sk]
,[Specialty_sk]
,[Exam_Day_sk]
,[Exam_Datetime]
,[Reported_Date]
,[Latest_Clock_Reset_Date]
,[Exam_sk]
,[Location_Referring_sk]
,[Location_sk]
,[Request_Category_sk]
,[Urgency_sk]
,[Exam_Key]
,[Event_Key]
,[Waiting Not Planned]
I know I should have a narrower fact table, but how to do it? I could have a Exam Status fact which is filled as the Exam goes through the steps of Referred, Booked, Attended, Examined, Reported etc.
OR I could have separate fact tables describing these different events?
Which is best?
Al Wood
Al Wood- Posts : 46
Join date : 2010-12-08
Re:How many columns is too many OR how to reduce the columns
Hi,
what you have mentioned the should be fine as from the list of columns that you have mentioned they dont look too many.
I dont thing you need to split the table as per the stage.
thanks
what you have mentioned the should be fine as from the list of columns that you have mentioned they dont look too many.
I dont thing you need to split the table as per the stage.
thanks
hkandpal- Posts : 113
Join date : 2010-08-16
Re: How many columns is too many OR how to reduce the columns
The issue with the number of fact tables has to do with the dimensionality of the event. When a referral is logged, does it make sense to record the date of the examination or the radiographer involved in the examination?
Re: How many columns is too many OR how to reduce the columns
Thanks ngalemmo.
That makes me think I need different fact tables for the different events because they have quite different attributes.
Hi hkandpal, I'm not sure the number of columns is fine. I need to put 1.3 million rows in it, perhaps much more.
The existing table has 20 columns. It is a result of the organisation trying to use a fact table to relpace a very wide reporting table.
In a four-column fact table you can have all possible combinations of index! e.g.
a,b,c,d
a,b,d,c
a,c,b,d
etc
It's 4 factorial (written as 4!) which is 24.
The existing table has 20 columns so we need 20! = 2432902008176640000 indexes to do that.
So without the right indexes the queries will chug along and fast analysis will not be possible.
Of course, in production you can build them as needed if a query goes slow. But out of 2432902008176640000 possible ones,
will you ever have all the indexes that are likely to be needed for the next query that hits?
Al Wood
That makes me think I need different fact tables for the different events because they have quite different attributes.
Hi hkandpal, I'm not sure the number of columns is fine. I need to put 1.3 million rows in it, perhaps much more.
The existing table has 20 columns. It is a result of the organisation trying to use a fact table to relpace a very wide reporting table.
In a four-column fact table you can have all possible combinations of index! e.g.
a,b,c,d
a,b,d,c
a,c,b,d
etc
It's 4 factorial (written as 4!) which is 24.
The existing table has 20 columns so we need 20! = 2432902008176640000 indexes to do that.
So without the right indexes the queries will chug along and fast analysis will not be possible.
Of course, in production you can build them as needed if a query goes slow. But out of 2432902008176640000 possible ones,
will you ever have all the indexes that are likely to be needed for the next query that hits?
Al Wood
Al Wood- Posts : 46
Join date : 2010-12-08
Re: How many columns is too many OR how to reduce the columns
Regarding indexes, unless I've misunderstood what you are saying then I don't think that's how they work or should be implemented.
You have single indexes on each surrogate key to implement the join between the Dims and the Fact and then you have (possibly bit-mapped) indexes on the Dimension attributes that you constrain your queries on.
Having 20-30 SKs on a fact table is quite normal so if your indexing is slowing down rather than speeding up your queries then I'm guessing there is an issue with your index design rather than your Dimensional design
You have single indexes on each surrogate key to implement the join between the Dims and the Fact and then you have (possibly bit-mapped) indexes on the Dimension attributes that you constrain your queries on.
Having 20-30 SKs on a fact table is quite normal so if your indexing is slowing down rather than speeding up your queries then I'm guessing there is an issue with your index design rather than your Dimensional design
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Similar topics
» how to "reduce the duplicated rows"
» How to reduce size of a customer like dimension
» Reduce Rows in Inventory Snapshot Fact Table
» Daily snapshot fact table-any chance to reduce data volume?
» SCD 2 Audit / Metadata Columns
» How to reduce size of a customer like dimension
» Reduce Rows in Inventory Snapshot Fact Table
» Daily snapshot fact table-any chance to reduce data volume?
» SCD 2 Audit / Metadata Columns
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum