Insurance single fact vs multiple facts
2 posters
Page 1 of 1
Insurance single fact vs multiple facts
Hi,
Having read the Kimball DW toolkit chapter on insurance, it seems that recommendation is to have a single Policy Transaction fact, measuring the count of transactions that occur by Date, PolicyHolder, Product/Coverage etc.
Does anyone think that this would be better as separate fact tables? In the insurance company who I work for, we have New Sale transactions, Renewal Transactions, Adjustment transactions (where the customer calls and changes their cover). So these can all go into one fact with a Transaction Type dim as per the Kimball design. However, as an example, the Renewals team want to measure the #Renewals that have occurred. There is also a bit of extra dimensionality for renewals, such as the Renewal Reason, or the Renewal Provider. Does the fact that users want to see measures like #Renewals, plus the few extra dims, warrant a separate fact table?
Also - related to the above - users are interested in things like #Renewals, #New Sales as measures. The fact table will be at a lower grain than this - due to the fact that transactions occur at a 'Coverage' level, e.g. Legal Cover, Public Liability or Personal Accident. So I don't think the #Transactions will be that useful to people as measures, as its the #sales that are more important than the #Transactions. This is a shame, because I obviously want to model the fact at the lowest level (which is coverage), but to get things like #Renewals or #New Sales I'm going to have to do distinct counts everywhere in the Analysis Services cubes that will sit on top. On the other hand, the Premium at the Cover level will be very interesting to users! Its just the counts that really will need to be at a higher/distinct level. Can anyone please advise - what's the best practice approach here?
Thank you
Having read the Kimball DW toolkit chapter on insurance, it seems that recommendation is to have a single Policy Transaction fact, measuring the count of transactions that occur by Date, PolicyHolder, Product/Coverage etc.
Does anyone think that this would be better as separate fact tables? In the insurance company who I work for, we have New Sale transactions, Renewal Transactions, Adjustment transactions (where the customer calls and changes their cover). So these can all go into one fact with a Transaction Type dim as per the Kimball design. However, as an example, the Renewals team want to measure the #Renewals that have occurred. There is also a bit of extra dimensionality for renewals, such as the Renewal Reason, or the Renewal Provider. Does the fact that users want to see measures like #Renewals, plus the few extra dims, warrant a separate fact table?
Also - related to the above - users are interested in things like #Renewals, #New Sales as measures. The fact table will be at a lower grain than this - due to the fact that transactions occur at a 'Coverage' level, e.g. Legal Cover, Public Liability or Personal Accident. So I don't think the #Transactions will be that useful to people as measures, as its the #sales that are more important than the #Transactions. This is a shame, because I obviously want to model the fact at the lowest level (which is coverage), but to get things like #Renewals or #New Sales I'm going to have to do distinct counts everywhere in the Analysis Services cubes that will sit on top. On the other hand, the Premium at the Cover level will be very interesting to users! Its just the counts that really will need to be at a higher/distinct level. Can anyone please advise - what's the best practice approach here?
Thank you
jryan- Posts : 33
Join date : 2010-09-27
Re: Insurance single fact vs multiple facts
Hi - the design should be driven by your business' requirements. Assuming that the different dimensionality of different transaction types is not too great then you can technically put them all in the same fact table - with some types having zero values (or whatever your 'dummy' key value is) for the Dims that don't apply.
However, I would only put them in the same fact table if that is the best way of supporting the queries your business wants to run e.g. if 90% of queries are about how many transactions there have been, sliced/diced by a common set of attributes then it makes sense to have them all in the same fact table. However, if most queries about renewals are specific to renewal-only attributes it may make more sense to have a renewal fact table.
There is also nothing to stop you having both fact tables - a generic transaction fact table and a renewal-specific one (and a "new business" one and any others you have). Remember - a dimensional model is not normalised. Obviously if you have multiple fact tables containing related information then it makes it more challenging for you to present the information correctly to the end users in a way they will understand: if they are writing a renewals report which fact table should they be querying?
Holding data at the lowest grain: one of the reasons for doing this is it is then easy (or at least easier!) to roll up the data to higher levels - you have all the information necessary to meet business requirements as they evolve in the future and you can roll up the data in multiple different "directions" to different summarised fact tables. But bear in mind that you do not have to expose these lowest level fact tables to the end users if there is no need to - you can effectively hold them in the background almost as staging tables. How you actually present the data to the end users depends on data volumes, performance, etc. For the "Distinct Count" issue you mention you might get away with just creating a normal view on top of your tables though you are probably better off creating an indexed/materialised view in your DB or populating an aggregate table from your base fact tables using your ETL process.
Hope this is of some help?
However, I would only put them in the same fact table if that is the best way of supporting the queries your business wants to run e.g. if 90% of queries are about how many transactions there have been, sliced/diced by a common set of attributes then it makes sense to have them all in the same fact table. However, if most queries about renewals are specific to renewal-only attributes it may make more sense to have a renewal fact table.
There is also nothing to stop you having both fact tables - a generic transaction fact table and a renewal-specific one (and a "new business" one and any others you have). Remember - a dimensional model is not normalised. Obviously if you have multiple fact tables containing related information then it makes it more challenging for you to present the information correctly to the end users in a way they will understand: if they are writing a renewals report which fact table should they be querying?
Holding data at the lowest grain: one of the reasons for doing this is it is then easy (or at least easier!) to roll up the data to higher levels - you have all the information necessary to meet business requirements as they evolve in the future and you can roll up the data in multiple different "directions" to different summarised fact tables. But bear in mind that you do not have to expose these lowest level fact tables to the end users if there is no need to - you can effectively hold them in the background almost as staging tables. How you actually present the data to the end users depends on data volumes, performance, etc. For the "Distinct Count" issue you mention you might get away with just creating a normal view on top of your tables though you are probably better off creating an indexed/materialised view in your DB or populating an aggregate table from your base fact tables using your ETL process.
Hope this is of some help?
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: Insurance single fact vs multiple facts
Hi Nick,
Thanks for the reply.
The differences in reporting requirements between the two isn't that different, so thinking about it I think having the odd -1 or 0 key for perhaps the 1 or 2 dimensions that don't apply is ok.
I do agree with all your ideas - you're right I could have 2 fact tables and I don't have to expose the lowest level. Thank you - this has helped clarify my thinking.
I guess the angle I'm coming from is what's the best practice for an insurance model here? Given that lots of people must have done insurance models already, what have people generally gone with?
Thanks for the reply.
The differences in reporting requirements between the two isn't that different, so thinking about it I think having the odd -1 or 0 key for perhaps the 1 or 2 dimensions that don't apply is ok.
I do agree with all your ideas - you're right I could have 2 fact tables and I don't have to expose the lowest level. Thank you - this has helped clarify my thinking.
I guess the angle I'm coming from is what's the best practice for an insurance model here? Given that lots of people must have done insurance models already, what have people generally gone with?
jryan- Posts : 33
Join date : 2010-09-27
Similar topics
» Multiple Facts or Single Facts and Status Table?
» Single or Multiple Fact : Single or Multiple Dimension
» Multiple Facts Mapped to Single Dimension Record
» Single fact table vs multiple fact tables - what is the right way in a dimensional model?
» Multiple Date Values for a Single Fact Row
» Single or Multiple Fact : Single or Multiple Dimension
» Multiple Facts Mapped to Single Dimension Record
» Single fact table vs multiple fact tables - what is the right way in a dimensional model?
» Multiple Date Values for a Single Fact Row
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum