# Sparse data in fact table

2 posters

Page

**1**of**1**## Sparse data in fact table

Hello,

I have a transaction fact table with measurements like:

grand total original (eg before voucher)

grand total original net

grand total

grand total net

total net product costs

quantity of items

Now I am thinking about how to integrate returned products. My colleague and I have two different views:

1.

Extend the fact table with the facts

grand total original after return

grand total original net after return

grand total after return

grand total net after return

total net product costs after return

quantity of items after return

Here I see two problems:

a) sparse data! we have about 10% returns, maybe in the worst case 25% of returns

b) You have a more complex ETL process as you have to look up the historical order data and edit it

2.

Every time there is a return, add a row to the fact table with negative values for the mentioned attributes (but the same DegenerateDimension order number). Maybe add an attribute "return reason" in the junk dimension.

What are your thoughts about this? In general, shouldn't one avoid editing historical data in the fact table?

I have a transaction fact table with measurements like:

grand total original (eg before voucher)

grand total original net

grand total

grand total net

total net product costs

quantity of items

Now I am thinking about how to integrate returned products. My colleague and I have two different views:

1.

Extend the fact table with the facts

grand total original after return

grand total original net after return

grand total after return

grand total net after return

total net product costs after return

quantity of items after return

Here I see two problems:

a) sparse data! we have about 10% returns, maybe in the worst case 25% of returns

b) You have a more complex ETL process as you have to look up the historical order data and edit it

2.

Every time there is a return, add a row to the fact table with negative values for the mentioned attributes (but the same DegenerateDimension order number). Maybe add an attribute "return reason" in the junk dimension.

What are your thoughts about this? In general, shouldn't one avoid editing historical data in the fact table?

**boernard**- Posts : 13

Join date : 2012-01-19

## Re: Sparse data in fact table

What is 'grand total' doing in a transactional fact table?

Anyway, just add a new row with negative amounts reflecting the value of the return.

Anyway, just add a new row with negative amounts reflecting the value of the return.

## Re: Sparse data in fact table

thx for the reply!

grand total is the price that is on the bill which the customer receives. Isn't that right?

Another question:

Should I store the monetary values as integers (in cent) or as decimal (in Euro) ?

And do you have an idea for:

-"In general, shouldn't one avoid editing historical data in the fact table?"

grand total is the price that is on the bill which the customer receives. Isn't that right?

Another question:

Should I store the monetary values as integers (in cent) or as decimal (in Euro) ?

And do you have an idea for:

-"In general, shouldn't one avoid editing historical data in the fact table?"

**boernard**- Posts : 13

Join date : 2012-01-19

## Re: Sparse data in fact table

If this is sales, you usually maintain line level facts, that is if someone buys 5 different items, there are 5 rows in the fact. The total purchase is simply a sum of the rows. If by 'grand total' you mean the sum of the 5 different items, then you are building an aggregate which is not advisable unless it is a summary of an existing atomic level fact table.

Working at the atomic (i.e. lowest level of detail) level, a return is simply another row with negative values. When summed with the other values (Fact tables queries are almost always a sum of some sort), it will properly reflect total gross revenues and quantities sold.

When storing monetary values, always use an exact (i.e. base 10) data type with appropriate decimal positions. Depending on your database, the data type may be called NUMBER or MONEY. Never use a non-exact binary type (FLOAT, REAL, etc...).

Working at the atomic (i.e. lowest level of detail) level, a return is simply another row with negative values. When summed with the other values (Fact tables queries are almost always a sum of some sort), it will properly reflect total gross revenues and quantities sold.

When storing monetary values, always use an exact (i.e. base 10) data type with appropriate decimal positions. Depending on your database, the data type may be called NUMBER or MONEY. Never use a non-exact binary type (FLOAT, REAL, etc...).

## Re: Sparse data in fact table

Aaah I didn't mention that I already modelled a fact table on line item level. The fact table on order level has additional information which can't be allocated on the line item level. Thank you!

**boernard**- Posts : 13

Join date : 2012-01-19

Similar topics

» How to report on sparse areas of sparse fact table

» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

» Periodic snapshot fact tables with sparse data

» Data in a fact or dimenzion table or bridge table

» SCD or data in FACT TABLE

» Design a snapshot Fact table in data mart from compliance-enable fact table in enterprise data warehouse

» Periodic snapshot fact tables with sparse data

» Data in a fact or dimenzion table or bridge table

» SCD or data in FACT TABLE

Page

**1**of**1****Permissions in this forum:**

**cannot**reply to topics in this forum