Supplement claim process - Additional data
Page 1 of 1
Supplement claim process - Additional data
Hello,
i am currently creating a dimensional model for supplement claims. I have identified some dimensions like "person", "claim type", "support program", "support program group" (since group is N:M to program), and some more.
I have identified fact tables for "supplements granted" and "supplements payed".
But i have trouble understanding if i should create dimensions for "claim" and "claim position" or if i can extract all of the relevant data and just keep the claim_number and claim_position_number as degenerate dimensions in the fact tables. A claim has one or more claim positions.
An applicant only knows the claim. The claim positions are created to manage where the money from the supplements comes from etc. on a more detailed level. Most of the dimensions are unique in the claim position.
I would like to have claim and claim position only as degenerate dimensions, because there usually are only about 3 transactions per claim position (Some only have 2, some may have 20, but the usual case is about 3). (Transactions meaning supplement granted, supplement payed, supplement revoked, ...). So the claim position dimension would be almost as large as many of the fact tables.
My problem is that there is quite a lot of "additional data" in the source system, that applies to either claim or claim position. After extracting the descriptive data from claim and claim position to other, more specific dimensions, i am left with some numeric and date fields.
E.g. "number of jobs created" is recorded at the claim level, if the supplements are used by a company to create jobs. The number of jobs can be changed independently from trainsactions like supplement granted or supplement payed. But usually these additional data fields are recorded once and not changed at all. There are other data fields that are recroded at the claim position level.
If i would create a claim dimension, i am tempted to include the number of jobs created in the claim dimension. This would be easy in the etl process and since the data doesn't change much, it wouldn't increase the size of the claim dimension very much, even when tracking it as SCD2.
But there might be reports that want to sum over the number of jobs created. I feel like i am not following the best practices if i include such a field for summarization in the dimension table.
On the other hand i am not sure how i should call a transaction fact table that records the changes in these additional data fields. Especially since the data fields are very heterogeneous, since there are lots of different support programs. Instead of creating jobs, some other programs supplement the building of houses, restoration to make buildings more energy efficient, etc.
The source system doesn't create transactions for changes in the number of jobs created, but it does save the history.
So for example on 2014/07/01 there might be 78 jobs created by claim 1234, on 2014/08/05 the number is changed to 58 jobs created. But it is not recorded that the number changed by -20. I would have to calculate these changes in the etl process if i want to create a transaction fact for it.
If there was only additional data for the number of jobs created i might creaet a transaction fact table "jobs created" where i would store the changes, e.g. +78 on 2014/07/01 and -20 on 2014/08/05. Some of the additional data fields are kept in the same source table and changed at the same time. So in addition to "jobs created" there is "jobs for males created", "jobs for females created", "jobs for males saved", ...
These could be saved in the same transaction fact table, although some of the changes might be 0.
But i would need to create quite a few additional fact tables for the other additional data fields that only apply to housing related supplements and so on.
Sorry for the long text. I hope everything is understandable.
Do you think i should create dimensions for claim and claim position? Or do you think i could create appropriate fact tables? How would you call those fact tables?
Thank you very much for your help!
Tobias
i am currently creating a dimensional model for supplement claims. I have identified some dimensions like "person", "claim type", "support program", "support program group" (since group is N:M to program), and some more.
I have identified fact tables for "supplements granted" and "supplements payed".
But i have trouble understanding if i should create dimensions for "claim" and "claim position" or if i can extract all of the relevant data and just keep the claim_number and claim_position_number as degenerate dimensions in the fact tables. A claim has one or more claim positions.
An applicant only knows the claim. The claim positions are created to manage where the money from the supplements comes from etc. on a more detailed level. Most of the dimensions are unique in the claim position.
I would like to have claim and claim position only as degenerate dimensions, because there usually are only about 3 transactions per claim position (Some only have 2, some may have 20, but the usual case is about 3). (Transactions meaning supplement granted, supplement payed, supplement revoked, ...). So the claim position dimension would be almost as large as many of the fact tables.
My problem is that there is quite a lot of "additional data" in the source system, that applies to either claim or claim position. After extracting the descriptive data from claim and claim position to other, more specific dimensions, i am left with some numeric and date fields.
E.g. "number of jobs created" is recorded at the claim level, if the supplements are used by a company to create jobs. The number of jobs can be changed independently from trainsactions like supplement granted or supplement payed. But usually these additional data fields are recorded once and not changed at all. There are other data fields that are recroded at the claim position level.
If i would create a claim dimension, i am tempted to include the number of jobs created in the claim dimension. This would be easy in the etl process and since the data doesn't change much, it wouldn't increase the size of the claim dimension very much, even when tracking it as SCD2.
But there might be reports that want to sum over the number of jobs created. I feel like i am not following the best practices if i include such a field for summarization in the dimension table.
On the other hand i am not sure how i should call a transaction fact table that records the changes in these additional data fields. Especially since the data fields are very heterogeneous, since there are lots of different support programs. Instead of creating jobs, some other programs supplement the building of houses, restoration to make buildings more energy efficient, etc.
The source system doesn't create transactions for changes in the number of jobs created, but it does save the history.
So for example on 2014/07/01 there might be 78 jobs created by claim 1234, on 2014/08/05 the number is changed to 58 jobs created. But it is not recorded that the number changed by -20. I would have to calculate these changes in the etl process if i want to create a transaction fact for it.
If there was only additional data for the number of jobs created i might creaet a transaction fact table "jobs created" where i would store the changes, e.g. +78 on 2014/07/01 and -20 on 2014/08/05. Some of the additional data fields are kept in the same source table and changed at the same time. So in addition to "jobs created" there is "jobs for males created", "jobs for females created", "jobs for males saved", ...
These could be saved in the same transaction fact table, although some of the changes might be 0.
But i would need to create quite a few additional fact tables for the other additional data fields that only apply to housing related supplements and so on.
Sorry for the long text. I hope everything is understandable.
Do you think i should create dimensions for claim and claim position? Or do you think i could create appropriate fact tables? How would you call those fact tables?
Thank you very much for your help!
Tobias
T-Otte- Posts : 1
Join date : 2014-08-18
Similar topics
» Using the Dimensional Data Warehouse as source data for the OLTP process
» Would you include additional Data Integration and ETL solutions to this list of Top providers?
» Bridge table for patient diagnosis
» Insurance Claim Model
» cross process(data marts) ratios
» Would you include additional Data Integration and ETL solutions to this list of Top providers?
» Bridge table for patient diagnosis
» Insurance Claim Model
» cross process(data marts) ratios
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum