Modeling history (versions) of a measure

DBADS Sun Oct 17, 2010 10:28 am

Hello,

We have measures that get updated for the same set of dimensions.
A figure gets inserted in the fact for a certain day, and then (may be at the end of the month) this figure gets audited and can be updated.
How can I keep the history of the figure (i.e.: old and new figure) in my dimensional model? I tried adding a version column to the fact key but the problem is each record in the fact has several measures, so I won't know which measure got updated unless through comparing both records (version 1 & version 2 records).
Is there a better way to model this, taking into consideration that the figure can be changed several times (not just twice)?

Thank you

ngalemmo Mon Oct 18, 2010 12:30 pm

There are basically two forms of historical fact tables: transactional (net change) facts and accumulating snapshots.

A transactional fact table is insert only and stores changes as the difference between the current state and the new state. Depending on what changes, it would require one or two new rows.

An accumulating snapshot stores each new version of the fact bound by effective and expiration dates. It requires updating the expiration date of the previous version of the fact.

Both will allow you to generate a set of facts at any point in time, the first by summing up to the desired date and the other by filtering facts within the effective range. A transactional fact has the advantage of being able to easily query the magnitude and direction of change.

Jeff Smith Mon Oct 18, 2010 2:56 pm

I have a claims fact table. A claim can get reprocessed which results in the amount covered and or the amount we pay to be altered. Each time a claim is modified in the source, it creates a new generation. A claim can be reworked a few days after it was originally processed or weeks later.

We pay the claim and charge our clients when the claim is processed. If the claim gets reprocessed at a later date, we will pay the difference (if more) or credit our clients with the overpayment.

To handle this, I load the claim as it comes to me. If it's been reworked, I search for the prior version of the claim in my history, clone the original except for the amounts and "paid date". I grab the amount from the new version of the claim and make it a negative and take the paid date from the new version. I also make the "counter" on the record a negative. I then add in the new version of the claim.

I can correctly state the amount we paid on the claim when we paid it. I can correctly state the amount we paid after it was reworked.

ngalemmo Mon Oct 18, 2010 3:12 pm

That's the transactional method when source transactions do not exist... but in "I grab the amount from the new version of the claim and make it a negative " I believe you meant to say "old version".

An extended technique, when there can be multiple changes over time, is to extract and negate all versions of the fact rows into a holding table (old positives become negative and old negatives become positive), with the addition of a positive version of the new row. Then summarize the holding table across all dimensions (except activity/posting date... use MAX() if it is in the holding table), and filter out zero rows... the result is the total net change resulting in one or two rows per fact. You would wind up with two rows if there is a change in one or more dimensions, one row if the dimensions did not change and no rows if there was no change (all measures are zero).

Sponsored content

Modeling history (versions) of a measure

Modeling history (versions) of a measure

Re: Modeling history (versions) of a measure

Re: Modeling history (versions) of a measure

Re: Modeling history (versions) of a measure

Re: Modeling history (versions) of a measure