Fact Table Design

Cavalera Sat Aug 27, 2011 5:29 am

Hello everyone....first post so please ignore my naivety.....

The Fact tables that I have seen created have all been very 'standard' i.e. typically one line per order (as an example) and the resultant cube aggregations generally have been simple to create. My source data is not like this as it comes from what we call a "Cashbook" table which in effect is like an audit trail as to what has happened on a client's account.

Sample source data:

'Branch' 'PolicyReference' 'TransactionType' 'RecordType' 'Premium' 'Commission'
'1' 'ABCD01PC01' 'New Business' 'New Transactions' '£500.00' '£50.00'
'1' 'SCDD76HQ02' 'New Business' 'New Transactions' '£240.00' '£36.00'
'1' 'WASR12CV01' 'Renewal' 'New Transactions' '£657.56' '£67.75'
'1' 'ABCD01PC01' 'New Business' 'DeletedTransactions' '£500.00' '£50.00'
'1' 'ABCD01PC01' 'New Business' 'Commission Adjs' '£0.00' '£50.00'

Records 1, 2 and 5 are a good example of what I am querying. I work for an Insurance Brokers by the way.

To get a count of policies, we need to add up all 'New Transaction' RecordTypes and take away all 'Deleted Transaction' Record Types.

To get a sum of the Commissions, we need to add up the Commission amounts for 'New Transaction' and 'Commission Adjs' RecordTypes and take away all 'DeletedTransactions' RecordTypes.

This is not practical (or probably even possible) to transform this data into the typical 'easy' fact table examples that you see in AdventureWorks for example i.e. nice and tidy with one 'order' per line.

So, the question is, can I and should I, do the following:

Create derived columns and show the 'net value' of what I am hoping to calculate and use these new columns as the basis for calculations/aggregations? In other words, create a 'Net Commission' column and in it, place the net value i.e. positive values for 'NewTransaction' and 'CommissionAdjs' Record Types and negative values for 'DeletedTransaction' RecordTypes?

In addition, this table contains a lot of other RecordTypes which, while not used to calculate Commissions or Premiums, may be used to calculate something else i.e. retention rates. If the previous suggestion above is correct, should I use a single Fact table with multiple derived columns, many of which would contain zero values, or should I create multiple Fact tables, one to measure Premiums and Commissions, one to measure policies, one to calculate Retention Rates etc (obviously restricting those Fact table to only the Record Types that are pertinent to the particular measure that I am trying to calculate. Each of these three examples would use a different set of 'RecordTypes' in order to generate the net totals.

I hope that this makes sense. Your thoughts and advice would be much appreciated.

ngalemmo Mon Aug 29, 2011 11:05 am

Adjusting the sign to properly calculate the ins and outs of cashflow is fine. Assuming policy reference allows you to identify other dimensions, such as customer, I would add those dimensions to the table as well.

You may want to consider aggregate tables for some of the more complex queries, such as retention, if performance becomes an issue.

Cavalera Tue Aug 30, 2011 2:39 am

Many thanks for taking the time to reply and for your assistance.

Accepting that it is quite normal to negate certain values, can you offer some insights as to best practice design for counting policies and retentions, in terms of whether or not I try and do this with a single fact table or multiple fact tables?

All measures (premiums/commissions/fees/policy counts/retention rates) can be calculated from this single table (with multiple calculated fields i.e. one for Premium, one for Fees etc) but I am a little concerned about populating a lot of zero's if I use this method. There are far more RecordType values within our Cashbook that do not relate to monetary or policy count values and if I added a calculated measure for each element that I am trying to report on, I would need to populate zeros for probably 3/4's of the records. Each of the measures listed above use a different 'subset' of RecordTypes in order to calculate their values.

The question is therefore is this OK/acceptable and good design or should I create separate Fact tables containing only the RecordTypes that would 'count towards' the measure that I am trying to report on?

Hope this makes sense - and thanks in advance.

Sponsored content

Fact Table Design

Fact Table Design

Re: Fact Table Design

Re: Fact Table Design

Re: Fact Table Design