How many fact columns is too many? 80-100? Grain Check.
3 posters
Page 1 of 1
How many fact columns is too many? 80-100? Grain Check.
Requirement: Present for analysis, banking account Credits & Debits with Begin Balance and End Balance per day by product (no account level information). The highly reliable source provides about 3 maybe 4 levels of roll-up metrics: Credits = Posted Credits + Non-Posted Credits. Posted Credits = Automatic Credits + Manual Credits. Automatic Credits = ATM Charges + Other Charges. ATM & Other Charges equate to the lowest level or "transaction grained" level (to an extent.)
Question is why build both a transaction grain level table and the snapshot table in this case? My source system provides both transaction and snapshot (summary) data. Even a lowest level ATM Charge, the lowest level of detail available, is actually a product level summary of all individual checking accounts that had an ATM charge on a given day. In essence my source is giving me snapshot data, albeit at varying levels of roll-up detail from individual transaction types all the way up to Total Credits and Debits.
Anyone have an issue if I put all 80 -100 of these metrics in one fact table that looks like this?:
DimProductKey, DimDateKey, DimBranchkey, BeginBalance, EndBalance, TotalCredits, TotalDebits, PostedDebits, AutomaticCredits, ManualCredits, TotalCharges, ATMCharges, OtherCharges...75 more snapshot metrics.
This fact table would have 4 dim keys and about 80 fact columns, all of which are needed to build canned reports and predetermined views. If I do a traditional transaction level table representing the most atomic facts currently available, I still will need about 50 snapshot fact columns. Why not just cut to the chase in this case and just build a wider snapshot table. This also eliminates the need for a Transaction Dimension.
Too wide? Any problem with the grain? For a given product on a given day, each of these facts make sense: Total Credits for Product A on a given Date is $10,000. Total ATM Fees on the same day for the same Product A = $1500, and so on.
Should I care that it does assume that report builders and ad-hoc users know that ATM + Other = Total Charges, and That Total Charges eventually rolls into Total Debits? It's their job to know this, right?
Input appreciated.
Thanks, PA
Question is why build both a transaction grain level table and the snapshot table in this case? My source system provides both transaction and snapshot (summary) data. Even a lowest level ATM Charge, the lowest level of detail available, is actually a product level summary of all individual checking accounts that had an ATM charge on a given day. In essence my source is giving me snapshot data, albeit at varying levels of roll-up detail from individual transaction types all the way up to Total Credits and Debits.
Anyone have an issue if I put all 80 -100 of these metrics in one fact table that looks like this?:
DimProductKey, DimDateKey, DimBranchkey, BeginBalance, EndBalance, TotalCredits, TotalDebits, PostedDebits, AutomaticCredits, ManualCredits, TotalCharges, ATMCharges, OtherCharges...75 more snapshot metrics.
This fact table would have 4 dim keys and about 80 fact columns, all of which are needed to build canned reports and predetermined views. If I do a traditional transaction level table representing the most atomic facts currently available, I still will need about 50 snapshot fact columns. Why not just cut to the chase in this case and just build a wider snapshot table. This also eliminates the need for a Transaction Dimension.
Too wide? Any problem with the grain? For a given product on a given day, each of these facts make sense: Total Credits for Product A on a given Date is $10,000. Total ATM Fees on the same day for the same Product A = $1500, and so on.
Should I care that it does assume that report builders and ad-hoc users know that ATM + Other = Total Charges, and That Total Charges eventually rolls into Total Debits? It's their job to know this, right?
Input appreciated.
Thanks, PA
parbie- Posts : 11
Join date : 2010-04-06
Re: How many fact columns is too many? 80-100? Grain Check.
You are kidding right?
How do you manage to get a wide fact table from ATM transactions? You have a transaction amount, a balance and thats about it. The rest of it are attributes in dimensions.
How do you manage to get a wide fact table from ATM transactions? You have a transaction amount, a balance and thats about it. The rest of it are attributes in dimensions.
Re: How many fact columns is too many? 80-100? Grain Check.
Not kidding. Take my word, there are roughly 80 - 100 transaction types. ATM Fee is meant to be an easy example. My attempt to economically craft a post has resulted in your missing the scenario. Please re-read the post and here is more info:
The snapshots or summaries are captured as they flow into a staging area destined for the General Ledger. The source is technically not the General Ledger, but it has GL attributes. (was hoping not to have to get too deep into the specifics, again for brevity and not wanting to divert your focus from the actual question: how wide is too wide? and why use a transaction grained fact table if you are only have summary data to begin with? The account level transactions are coming in the next version...)
The requirement is to report the activity that these ledgers represent, which are categorized transaction types down to the lowest transaction.
Think of your own checking account. If you earned interest during the month, the bank would credit your account for the amount they owe you. My source gives me that transaction and all the other update activity possible from all accounts like yours. Some additional transaction types and categories include, Accrued and Paid interest, Withholding Activity, Late charges, Accounting Transfers etc. The products are all deposit and loan account types that banks offer: Checking, Saving, Installment Loans.
Thanks,
PA
The snapshots or summaries are captured as they flow into a staging area destined for the General Ledger. The source is technically not the General Ledger, but it has GL attributes. (was hoping not to have to get too deep into the specifics, again for brevity and not wanting to divert your focus from the actual question: how wide is too wide? and why use a transaction grained fact table if you are only have summary data to begin with? The account level transactions are coming in the next version...)
The requirement is to report the activity that these ledgers represent, which are categorized transaction types down to the lowest transaction.
Think of your own checking account. If you earned interest during the month, the bank would credit your account for the amount they owe you. My source gives me that transaction and all the other update activity possible from all accounts like yours. Some additional transaction types and categories include, Accrued and Paid interest, Withholding Activity, Late charges, Accounting Transfers etc. The products are all deposit and loan account types that banks offer: Checking, Saving, Installment Loans.
Thanks,
PA
parbie- Posts : 11
Join date : 2010-04-06
Re: How many fact columns is too many? 80-100? Grain Check.
You are basically flattening out the measures. You could create a Measure dimension with 80-100 rows. That would give the fact table 1 measure. I bet this would run faster than the flattened design. You could also put a hierarchy into the dimension to group similar measures together. The various charges would roll up to a Total Charges. You could have the various fees roll up to total fees. This would lend itself to a nice a cube.
The fact table you are talking about seems to be an aggregate. You could create an Account level fact table with the same measure dimension. This would enable the user to drill down from the product into the detail fairly easily.
Picture a report with the Products across the top and the various measures in your fact table down the side, grouped together with sub totals. Or picture the report with the products down the side and the measures across the top. With a Measure Dimension you can do both.
The fact table you are talking about seems to be an aggregate. You could create an Account level fact table with the same measure dimension. This would enable the user to drill down from the product into the detail fairly easily.
Picture a report with the Products across the top and the various measures in your fact table down the side, grouped together with sub totals. Or picture the report with the products down the side and the measures across the top. With a Measure Dimension you can do both.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: How many fact columns is too many? 80-100? Grain Check.
Good point, and we do have the measure dimension you speak of. I am wanting to take advantage of the complimentary snapshot concept where the fact table is flattened. I like the snapshot table especially for including measures like ending balance, weighted rate, days to maturity. I like the measures across the top of the report image, but doesn't it make it easier to build the report if I simply drag over a column from the fact table from a single row rather than query a deeper narrower single amount fact column?
Thanks...PA
Thanks...PA
parbie- Posts : 11
Join date : 2010-04-06
Re: How many fact columns is too many? 80-100? Grain Check.
It depends on the software you are using to create the report.
If it's an ad hoc report, filter of the metrics you want and do a cross tab.
I like designs that have the most flexibility. If the table is relatively short, you can always create a view that flattens out the data to make it easier for users.
If it's an ad hoc report, filter of the metrics you want and do a cross tab.
I like designs that have the most flexibility. If the table is relatively short, you can always create a view that flattens out the data to make it easier for users.
Jeff Smith- Posts : 471
Join date : 2009-02-03
Re: How many fact columns is too many? 80-100? Grain Check.
Don't dig yourself into a hole by flattening the fact table. By the time you get done you will have to change it.
What you are doing is applying interpretation of attributes into columns rather than letting the query do the interpretation. Any one interpretation is fleeting as businesses change, hard-coding such interpretations into columns in a primary fact table only leads to significant problems down the road.
Capture the data in a proper atomic fact table. Apply interpretation in the BI layer or a view. And you can always create an aggregate later if necessary.
What you are doing is applying interpretation of attributes into columns rather than letting the query do the interpretation. Any one interpretation is fleeting as businesses change, hard-coding such interpretations into columns in a primary fact table only leads to significant problems down the road.
Capture the data in a proper atomic fact table. Apply interpretation in the BI layer or a view. And you can always create an aggregate later if necessary.
Re: How many fact columns is too many? 80-100? Grain Check.
Component One is the report software and Silverlight hand coding is being used for the graphical display. Yes a huge challenge for the coders, and it is hard to know what database design limitations are when coding your own BI from scratch. Obviously I would prefer an actual BI tool, but not an option at this point.
How would you handle balances or weighted interest rate, or number of new accounts open today, or closed today? Surely these don't belong in a single column Amount with a dimension pointing them do the? In other words, do I put those with the atomic facts? Isn't that the purpose of a "complimentary snapshot table"? See page 203 of Kimball's the Complete Guide to Dimensional Modeling or any other publication where he talks about 3 (atomic, snapshot and accumulating)types of fact tables. I am getting confused on when and how to use the snapshot table.
Anyone ever use the atomic/snapshot situation?
thanks,
PA
How would you handle balances or weighted interest rate, or number of new accounts open today, or closed today? Surely these don't belong in a single column Amount with a dimension pointing them do the? In other words, do I put those with the atomic facts? Isn't that the purpose of a "complimentary snapshot table"? See page 203 of Kimball's the Complete Guide to Dimensional Modeling or any other publication where he talks about 3 (atomic, snapshot and accumulating)types of fact tables. I am getting confused on when and how to use the snapshot table.
Anyone ever use the atomic/snapshot situation?
thanks,
PA
parbie- Posts : 11
Join date : 2010-04-06
Re: How many fact columns is too many? 80-100? Grain Check.
How would you handle balances or weighted interest rate, or number of new accounts open today, or closed today? Surely these don't belong in a single column Amount with a dimension pointing them do the?
They could, in part... but usually there are other facts relating to these other events.
In terms of facts relating to transactional activity against accounts there most certainly could be a transaction type attribute that indicates it is an initial deposit into an account, there is also no reason you could not include the interest rate applied as a degenerate dimension in the fact.
Similar topics
» load a table without unique indentifier
» Fact with large columns, 320.
» SK and ID columns in a Fact table
» Too many columns in fact table
» Should the informational columns be on the fact or dimension
» Fact with large columns, 320.
» SK and ID columns in a Fact table
» Too many columns in fact table
» Should the informational columns be on the fact or dimension
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum