Modeling invoice payment
2 posters
Page 1 of 1
Modeling invoice payment
Hi there,
I am modeling a data warehouse that represents the payment of invoices. The fact table register the outstanding for each invoice (or in other word the amount 'yet-to-be-paid')
For instance, let's say I issue an invoice of $100.
The purchaser n°1 (the debtor) pays $10, then $20 and finally clear the debt with a $70 payment.
In this case I will have four facts in my fact table:
I have a dimension with information about the invoice such as functional key, initial amount, due date etc..
I wonder if I should put the invoice information into the fact table rather than into a dimension table?
The reasons why I am asking are :
1. I have an amount in the invoice dimension which looks like a measure
2. Each invoice generate an average of 2 or 3 payment events, consequently my invoice dimension grows almost as fast as my fact tables
3. If I need to know the purchaser for a specific invoice I had to go through the fact table which costs me a join
4. On the other hand fact table is huge (1 millions lines) and I'd rather limit the number of column in it (particularly VARCHAR column like functionnal key)
What is your opinion on that point ?
I am modeling a data warehouse that represents the payment of invoices. The fact table register the outstanding for each invoice (or in other word the amount 'yet-to-be-paid')
For instance, let's say I issue an invoice of $100.
The purchaser n°1 (the debtor) pays $10, then $20 and finally clear the debt with a $70 payment.
In this case I will have four facts in my fact table:
- Code:
invoice_key purchaser_key date outstanding
1 1 yy/mm/dd 100
1 1 yy/mm/dd 90
1 1 yy/mm/dd 70
1 1 yy/mm/dd 0
I have a dimension with information about the invoice such as functional key, initial amount, due date etc..
I wonder if I should put the invoice information into the fact table rather than into a dimension table?
The reasons why I am asking are :
1. I have an amount in the invoice dimension which looks like a measure
2. Each invoice generate an average of 2 or 3 payment events, consequently my invoice dimension grows almost as fast as my fact tables
3. If I need to know the purchaser for a specific invoice I had to go through the fact table which costs me a join
4. On the other hand fact table is huge (1 millions lines) and I'd rather limit the number of column in it (particularly VARCHAR column like functionnal key)
What is your opinion on that point ?
wawanco- Posts : 6
Join date : 2014-05-28
Location : Paris
Re: Modeling invoice payment
My thoughts on your points:
1. Why not put the invoice amount, amount paid and outstanding amount on each fact record? You then have all the information available on each record to support any business queries
2. Your Dim grows at half or third of the rate of your fact - there's nothing wrong with this. Many examples that model an invoicing process have a grain of the invoice line rather than the invoice that you have and so have relatively much higher record counts
3. Fact tables are how you relate Dimensions in a dimensional model. Joining through the fact table is unlikely to have a significant impact assuming proper design, indexing, performance tuning. However, rules are there to be broken! If you want to start creating FKs between dimensions because you have a valid reason to and you fully understand the impact of what you are doing then go ahead.
4. 1 million records in a fact table is not huge (I'm working on a project likely to have to handle 10s of millions of fact records being created per day). However, as you say, putting varchar columns in a fact table should be avoided if possible: a varchar(20) column holding an invoice number as a degenerate dimension is probably not a big deal; a varchar(250) column holding a text comment is probably not a good idea
1. Why not put the invoice amount, amount paid and outstanding amount on each fact record? You then have all the information available on each record to support any business queries
2. Your Dim grows at half or third of the rate of your fact - there's nothing wrong with this. Many examples that model an invoicing process have a grain of the invoice line rather than the invoice that you have and so have relatively much higher record counts
3. Fact tables are how you relate Dimensions in a dimensional model. Joining through the fact table is unlikely to have a significant impact assuming proper design, indexing, performance tuning. However, rules are there to be broken! If you want to start creating FKs between dimensions because you have a valid reason to and you fully understand the impact of what you are doing then go ahead.
4. 1 million records in a fact table is not huge (I'm working on a project likely to have to handle 10s of millions of fact records being created per day). However, as you say, putting varchar columns in a fact table should be avoided if possible: a varchar(20) column holding an invoice number as a degenerate dimension is probably not a big deal; a varchar(250) column holding a text comment is probably not a good idea
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Similar topics
» Invoice dimensional modeling question
» rethinking sales invoice line modeling
» Dimensional modeling of product and vendor for invoice fact
» Modeling Invoice Level Sales With a Volatile Sales Org
» Modeling invoice detail with rebate detail
» rethinking sales invoice line modeling
» Dimensional modeling of product and vendor for invoice fact
» Modeling Invoice Level Sales With a Volatile Sales Org
» Modeling invoice detail with rebate detail
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum