Categorizin Facts/Measures
5 posters
Page 1 of 1
Categorizin Facts/Measures
Hi,
I just wanted to ask how to properly identify a category of measure/fact: Additive, semi-additive or non-additive.
For example, I have the following measures
an amount
a count of days
amount/some divisor
average
also, if a flag is being used for counting, can it be considered a fact? if so, would you consider this a non-additive or semi-additive?
Thanks
Aris
I just wanted to ask how to properly identify a category of measure/fact: Additive, semi-additive or non-additive.
For example, I have the following measures
an amount
a count of days
amount/some divisor
average
also, if a flag is being used for counting, can it be considered a fact? if so, would you consider this a non-additive or semi-additive?
Thanks
Aris
Last edited by castro.aris on Tue Jul 02, 2013 10:56 am; edited 1 time in total (Reason for editing : Additional details)
castro.aris- Posts : 2
Join date : 2013-07-02
Re: Categorizin Facts/Measures
Additive - Measures can be summed up for all dimensions of the fact table.
Non Additive - Measures can be summed up for some dimensions of the fact table.
Semi Additive - Measures cannot be summed up for any dimension.
Let me give an example, this is my fact table :
-------------------------------
DATE
ACCOUNT_ID
-------------------------------
EOD_ACCOUNT_BALANCE
TOTAL_TXN_AMOUNT
LAST_TRANSACTION_DATE
-------------------------------
TOTAL_TXN_AMOUNT is additive for all dimensions of the fact table. You can sum up TOTAL_TXN_AMOUNT for DATE, or ACCOUNT_ID or both. So this is an additive fact.
EOD_ACCOUNT_BALANCE is additive for DATE dimension, you can find the total EOD balance of all accounts in a given day. But EOD Account balance is not additive for DATE and ACCOUNT_ID.
Summing up all EOD balances for a given account for each day of a week does not give any useful result. So this is a semi additive fact.
LAST_TRANSACTION_DATE is a date and it cannot be summed up for any dimension. This is a non additive fact.
Your flag attribute could be additive, non additive or semi additive. It depends on the definition of that fact.
Non Additive - Measures can be summed up for some dimensions of the fact table.
Semi Additive - Measures cannot be summed up for any dimension.
Let me give an example, this is my fact table :
-------------------------------
DATE
ACCOUNT_ID
-------------------------------
EOD_ACCOUNT_BALANCE
TOTAL_TXN_AMOUNT
LAST_TRANSACTION_DATE
-------------------------------
TOTAL_TXN_AMOUNT is additive for all dimensions of the fact table. You can sum up TOTAL_TXN_AMOUNT for DATE, or ACCOUNT_ID or both. So this is an additive fact.
EOD_ACCOUNT_BALANCE is additive for DATE dimension, you can find the total EOD balance of all accounts in a given day. But EOD Account balance is not additive for DATE and ACCOUNT_ID.
Summing up all EOD balances for a given account for each day of a week does not give any useful result. So this is a semi additive fact.
LAST_TRANSACTION_DATE is a date and it cannot be summed up for any dimension. This is a non additive fact.
Your flag attribute could be additive, non additive or semi additive. It depends on the definition of that fact.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Categorizin Facts/Measures
What he said except switch the last two definitions. :-)
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Example???
I understand that if a flag is used to count for something then it becomes additive.
Can you pleas provide an example on when it becomes non-additive or semi-additive?
Can you pleas provide an example on when it becomes non-additive or semi-additive?
umutiscan wrote:Additive - Measures can be summed up for all dimensions of the fact table.
Non Additive - Measures can be summed up for some dimensions of the fact table.
Semi Additive - Measures cannot be summed up for any dimension.
Let me give an example, this is my fact table :
-------------------------------
DATE
ACCOUNT_ID
-------------------------------
EOD_ACCOUNT_BALANCE
TOTAL_TXN_AMOUNT
LAST_TRANSACTION_DATE
-------------------------------
TOTAL_TXN_AMOUNT is additive for all dimensions of the fact table. You can sum up TOTAL_TXN_AMOUNT for DATE, or ACCOUNT_ID or both. So this is an additive fact.
EOD_ACCOUNT_BALANCE is additive for DATE dimension, you can find the total EOD balance of all accounts in a given day. But EOD Account balance is not additive for DATE and ACCOUNT_ID.
Summing up all EOD balances for a given account for each day of a week does not give any useful result. So this is a semi additive fact.
LAST_TRANSACTION_DATE is a date and it cannot be summed up for any dimension. This is a non additive fact.
Your flag attribute could be additive, non additive or semi additive. It depends on the definition of that fact.
castro.aris- Posts : 2
Join date : 2013-07-02
Re: Categorizin Facts/Measures
Oh yes, sorry =)BoxesAndLines wrote:What he said except switch the last two definitions. :-)
Hi castro.aris,
This is another example for flag attributes:
-------------------------------------
DATE
ACCOUNT_ID
-------------------------------------
HAS_UNPAID_BILL_FLAG (YES/NO)
HAS_UNPAID_BILL_FLAG_2 (1/0)
-------------------------------------
HAS_UNPAID_BILL_FLAG can be used to count number of "YES" or "NO". But it cannot be summed up, you can not find any result for
YES+YES+YES = ? .. So this is a non additive fact.
HAS_UNPAID_BILL_FLAG_2 is a numeric attribute, and it can be summed up. You can find total number of accounts that have unpaid bills in a given day. But summing up this flag for a given account for each day of a week makes no sense. So this is semi additive.
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Categorizin Facts/Measures
umutiscan wrote:
LAST_TRANSACTION_DATE is a date and it cannot be summed up for any dimension. This is a non additive fact.
Actually, last transaction date is a degenerate dimension, not a measure, so are the flags. They provide context to the fact.
A non-additive measure would be something like UNIT_PRICE.
Re: Categorizin Facts/Measures
ngalemmo wrote:Actually, last transaction date is a degenerate dimension, not a measure, so are the flags.
I don't agree. In my example, dates and flags are not dimensions. They are some metrics required by the business for some analytical reasons.
I don't see any difference between "Last Transaction Date" and "Last Transaction Amount". If Last Transaction Amount is a fact, Last Transaction Date has to be a fact =)
And this is a comment from Ralph Kimball about datetime stamp facts :
"I would consider the datetime stamp as a non additive numeric fact. The main thing is that it's not a candidate for being a dimension key to some sort of time dimension."
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Categorizin Facts/Measures
I live in Los Angeles. That is a fact.
What did I say? If I was to create a fact table that reflects the state of all persons, one of the attributes of that state would be where that person lives. The term 'fact' has different meanings in English and in Dimensional Modeling. In Dimensional Modeling a 'fact' (English) is classified as either a Measure or an Attribute. (The term Fact in Dimensional modeling qualifies a data structure (i.e. Fact Table)). A Measure represents the magnitude of the event or interaction. An Attribute represents context for the measures. We use attributes to filter, identify, and aggregate measures.
Where I live is clearly an Attribute. It is a context related to my current state in life. A count of the number of people living in Los Angeles is a measure derived from the fact row (i.e. a constant 1) based on the contents of an Attribute (filter).
In the particular case of a timestamp, it can be used to identify magnitude as well as provide context. Context in the sense it identifies when the event or interaction occurred as well as provide a measurement of time (interval) between events. It is an exceptional case.
What did I say? If I was to create a fact table that reflects the state of all persons, one of the attributes of that state would be where that person lives. The term 'fact' has different meanings in English and in Dimensional Modeling. In Dimensional Modeling a 'fact' (English) is classified as either a Measure or an Attribute. (The term Fact in Dimensional modeling qualifies a data structure (i.e. Fact Table)). A Measure represents the magnitude of the event or interaction. An Attribute represents context for the measures. We use attributes to filter, identify, and aggregate measures.
Where I live is clearly an Attribute. It is a context related to my current state in life. A count of the number of people living in Los Angeles is a measure derived from the fact row (i.e. a constant 1) based on the contents of an Attribute (filter).
In the particular case of a timestamp, it can be used to identify magnitude as well as provide context. Context in the sense it identifies when the event or interaction occurred as well as provide a measurement of time (interval) between events. It is an exceptional case.
Re: Categorizin Facts/Measures
We know that a fact table has some dimensions and facts.
List of dimensions defines the grain of the fact table, facts represent business measures.
And I return back to my example. This is an aggregated snapshot fact :
-------------------------------
DATE
ACCOUNT_ID
-------------------------------
EOD_ACCOUNT_BALANCE
TOTAL_TXN_AMOUNT
LAST_TRANSACTION_DATE
-------------------------------
DATE and ACCOUNT_ID defines the grain. LAST_TRANSACTION_DATE is the date of the last transaction produced by the account in a given date. It has no effect on the grain of this fact table. So it very difficult to say that LAST_TRANSACTION_DATE is a dimension of this entity. And it is also very difficult to say that LAST_TRANSACTION_DATE is a degenerate dimension. Degenerate dimensions usually occur in transaction level fact tables and a degenerate dimension also has effect on the grain like other dimensions.
Let me show you the query that generates the metrics, I group by the dimensions and calculate the facts. Red ones are the dimensions, blue ones are the facts. Aren't they?
SELECT TRUNC(TXN_DATE), ACCOUNT_ID, SUM(TXN_AMOUNT), MAX(TXN_DATE)
FROM TRANSACTION
GROUP BY TRUNC(TXN_DATE), ACCOUNT_ID
So what is your opininon on this case?
List of dimensions defines the grain of the fact table, facts represent business measures.
And I return back to my example. This is an aggregated snapshot fact :
-------------------------------
DATE
ACCOUNT_ID
-------------------------------
EOD_ACCOUNT_BALANCE
TOTAL_TXN_AMOUNT
LAST_TRANSACTION_DATE
-------------------------------
DATE and ACCOUNT_ID defines the grain. LAST_TRANSACTION_DATE is the date of the last transaction produced by the account in a given date. It has no effect on the grain of this fact table. So it very difficult to say that LAST_TRANSACTION_DATE is a dimension of this entity. And it is also very difficult to say that LAST_TRANSACTION_DATE is a degenerate dimension. Degenerate dimensions usually occur in transaction level fact tables and a degenerate dimension also has effect on the grain like other dimensions.
Let me show you the query that generates the metrics, I group by the dimensions and calculate the facts. Red ones are the dimensions, blue ones are the facts. Aren't they?
SELECT TRUNC(TXN_DATE), ACCOUNT_ID, SUM(TXN_AMOUNT), MAX(TXN_DATE)
FROM TRANSACTION
GROUP BY TRUNC(TXN_DATE), ACCOUNT_ID
So what is your opininon on this case?
umutiscan- Posts : 51
Join date : 2010-11-26
Age : 44
Location : Istanbul, Turkey
Re: Categorizin Facts/Measures
Here's my thinking.
Last_transaction_date is dimensional. It provides context to the measures you have aggregated.
If you want to leverage the date dimension, then you can join to it. Otherwise, it can be left as degenerate.
Last_transaction_date is dimensional. It provides context to the measures you have aggregated.
If you want to leverage the date dimension, then you can join to it. Otherwise, it can be left as degenerate.
LAndrews- Posts : 132
Join date : 2010-05-13
Location : British Columbia, Canada
Re: Categorizin Facts/Measures
ALL dimensions do not define the grain of a fact table... SOME dimensions define the grain. There is nothing to prevent a fact table having additional dimensions that do not affect the grain of the fact.
For example, you have an order line fact table. The grain is a line from an order (order #, line #). You have customer, product, ship to address, billing address as dimensions on the fact. None of these affect the grain of the fact.
Timestamps are unique cases as they can be both an attribute and a measure. It depends on how you use it.
For example, you have an order line fact table. The grain is a line from an order (order #, line #). You have customer, product, ship to address, billing address as dimensions on the fact. None of these affect the grain of the fact.
Timestamps are unique cases as they can be both an attribute and a measure. It depends on how you use it.
Similar topics
» Facts with same grain: some shared and some different measures
» Multiple Facts or Single Facts and Status Table?
» How best to model Timesheet facts against Sales Order facts
» Inventory facts and production facts
» Identify the facts and facts grain
» Multiple Facts or Single Facts and Status Table?
» How best to model Timesheet facts against Sales Order facts
» Inventory facts and production facts
» Identify the facts and facts grain
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum