Data Warehouse for mortgage tracking - need some advice
3 posters
Page 1 of 1
Data Warehouse for mortgage tracking - need some advice
Hello, I am fairly new to this, and hope that someone may have some insight on a project our company is looking into. We have acquired several thousand mortgages from a handful of originators, and would like to build a data warehouse to track payments, outstanding balances, late / delinquent loans, etc. It seems easy enough on the surface, but where we seem to be stuck is deciding if 'Loan' would be treated as a dimension or not. One thought is that we'd have a Loan dimension (source, original loan amount, loan number, etc.) and a Loan Payment fact (Loan ID, payment date, payment amount, principal payment, interest payment, remaining balance??). To me, this makes sense logically, but I have read the argument that this is not a good approach, as you'd have a 1 to 1 tie between the tables. So, we're not sure on the proper set up here and have been searching for similar scenarios, but really haven't found anything clear cut. Would any of the experts here have any insight they'd be willing to share? Thanks in advance.
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
With the volumes you are dealing with now, what you describe is fine. From a growth and expansion perspective, you would be better off by breaking down loan into multiple dimensions and carry those foreign keys in the fact. (You may still have a loan dimension that contains attributes distinct to the loan). This gives you greater selectivity in queries (a filter on a smaller dimension can eliminate large numbers of facts). It also gives greater opportunity to aggregate and/or integrate with future facts.
Re: Data Warehouse for mortgage tracking - need some advice
Hi ngalemmo, thanks for your response. That makes sense, and I'm glad to see our initial thoughts weren't way off base. Per your comment about growth, that is a great point, as it's very possible we could acquire additional loans in the future. I'm not sure how a loan dimension table could be split into smaller dimensions, though. Maybe something like the source of the loan could be a new dimension, but then we'd just be moving from a star schema to a snowflake schema, as the loan dimension would still need to be linked to the loan source. Maybe we could have a dimension for 'status' (current / delinquent / written off) as well? If anyone has done anything similar in the past, I'm up for any pointers or issues you may have encountered!
Also, we initially thought about a 'Loan Payment' fact table, but have since considered having a 'Loan Status' fact table that would update daily with each loan, and would include items such as last payment date (tied to date dimension), last payment amount, outstanding principal balance (as of that day), outstanding interest balance (as of that day), days delinquent (as of that day) and so on... So, our payments would be available in our daily status table. Does anyone have an opinion they'd be willing to share on that approach?
Also, we initially thought about a 'Loan Payment' fact table, but have since considered having a 'Loan Status' fact table that would update daily with each loan, and would include items such as last payment date (tied to date dimension), last payment amount, outstanding principal balance (as of that day), outstanding interest balance (as of that day), days delinquent (as of that day) and so on... So, our payments would be available in our daily status table. Does anyone have an opinion they'd be willing to share on that approach?
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
To ngalemmo's point, your volume will allow you to carry a loan dimension and fact table at a 1-1 relationship. Once you reach a Wells Fargo or BOA servicing level, you will run into issues. What you shouldn't do though, is throw 200 columns into your loan dimension. You are on the right approach is identifying delinquency stage as a candidate for it's own dimension. Payments are a different fact table than your "Loan" fact table. Payments are transactions. Your loan fact table will be a snapshot with all the current balances, payment amounts, etc.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data Warehouse for mortgage tracking - need some advice
Thanks, BoxesAndLines (love the name) - you guys have been super helpful. I sure don't see us getting to Wells / BoA numbers, and if that were to happen, the discussion would certainly change! Thanks for the note on delinquency stage, we'll add that as a dimension. Regarding the Payments fact table, would you suggest only including the date and amount of the payment (linking with the loan ID and time dimension), and not having a 'last paid date' in the Loan fact table? I'm assuming this table would get populated only when an actual payment was processed. Also, I'm glad you mentioned 'snapshot'. We were thinking of having an 'end of month' snapshot table, and also the primary loan fact table, which would be updated with the status of every loan daily. So, I'd actually see this as a 'daily' snapshot table. Would an approach like that make sense? Thanks again to you two for sharing your insight.
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
BoxesAndLines wrote:To ngalemmo's point, your volume will allow you to carry a loan dimension and fact table at a 1-1 relationship. Once you reach a Wells Fargo or BOA servicing level, you will run into issues.
One more question on this: As previously noted, I don't see us getting to that volume level, but out of curiosity... Do you have any examples of issues that would be encountered in that scenario? What would the challenges and implications be? Thanks!
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
Two problems primarily. The first is pure volume. Snapshot tables chew up tons of space. The second is performance. Joining two multi million row tables can be problematic.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data Warehouse for mortgage tracking - need some advice
Cool, thanks. If you don't mind, do you have any thoughts on my post above? Specifically : "Regarding the Payments fact table, would you suggest only including the date and amount of the payment (linking with the loan ID and time dimension), and not having a 'last paid date' in the Loan fact table? I'm assuming this table would get populated only when an actual payment was processed. Also, I'm glad you mentioned 'snapshot'. We were thinking of having an 'end of month' snapshot table, and also the primary loan fact table, which would be updated with the status of every loan daily. So, I'd actually see this as a 'daily' snapshot table. Would an approach like that make sense?"
I think this is clicking, but it's great to have input from others who have already tackled these issues... Thank you!
I think this is clicking, but it's great to have input from others who have already tackled these issues... Thank you!
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
While you can certainly determine the last pay date by looking at payment facts, it may be too cumbersome for users or not perform well. Carrying that value in the loan dimension makes it easily accessible and results in more efficient queries. Last paid date would not go into a loan fact table if calculated. It is an attribute of the loan dimension.
Re: Data Warehouse for mortgage tracking - need some advice
Every fact table will have the loan number as a degenerate dimension (you'll thank me later). Keeping last paid amount in the daily snapshot along with the payment amount in the payment fact is fine. When I last built one of these I kept 6 months of daily snapshots and then kept the end of month snapshot. We kept it all the same table using a sql script to do the partition pruning. I also did have an aggregate fact table for common queries above the loan number level.
I also had fact tables for bankruptcies, foreclosures, delinquencies, etc.
I also had fact tables for bankruptcies, foreclosures, delinquencies, etc.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data Warehouse for mortgage tracking - need some advice
You guys are tremendous resources, thanks so much. Wish I had stumbled across this board earlier.
Regarding adding the loan number as a degenerate dimension - so far, each fact table will have a key linking it to a record in the loan dimension, which would contain the loan number. Does that accomplish the same thing?
Regarding adding the loan number as a degenerate dimension - so far, each fact table will have a key linking it to a record in the loan dimension, which would contain the loan number. Does that accomplish the same thing?
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
Nope. Put the loan number in all the (loan grained) facts.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data Warehouse for mortgage tracking - need some advice
Pardon my ignorance, but why do that when you also have that loan number value via the linkage to the loan dimension table? What's the benefit? Thanks again!
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
The benefit is the ability to join anywhere directly via loan number or produce a report the requires loan number without joining to the loan dimension.
For example, say you're reporting on bankruptcies and you want the current outstanding principal balance. Principal balance is stored in the loan fact, you bankruptcy metrics are in the bankruptcy fact. You could drill across using the common loan dimension or you could join directly using the loan number.
You're absolutely right in that the loan number is in the loan dimension and you will be able to do the same thing, it will just be a little slower and slightly more complex.
For example, say you're reporting on bankruptcies and you want the current outstanding principal balance. Principal balance is stored in the loan fact, you bankruptcy metrics are in the bankruptcy fact. You could drill across using the common loan dimension or you could join directly using the loan number.
You're absolutely right in that the loan number is in the loan dimension and you will be able to do the same thing, it will just be a little slower and slightly more complex.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Data Warehouse for mortgage tracking - need some advice
Makes sense, thanks for the example! Obviously, I still need to shake some of my relational habits.
So just to be clear, there's no 'rule' stating the loan number can't be in a fact table as well as the loan dimension, right?
So just to be clear, there's no 'rule' stating the loan number can't be in a fact table as well as the loan dimension, right?
creosote- Posts : 8
Join date : 2016-03-28
Re: Data Warehouse for mortgage tracking - need some advice
Rules are more like guidelines, Arrrr!
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Similar topics
» History tracking in a CRM data warehouse
» Building data warehouse from scratch.. Need some advice.
» difference between data mart and data warehouse at logical/physical level
» Tracking of historical data using SCD2 in a non-dimensional data model
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
» Building data warehouse from scratch.. Need some advice.
» difference between data mart and data warehouse at logical/physical level
» Tracking of historical data using SCD2 in a non-dimensional data model
» data warehouse or not ? when is it okay to use OLAP without a data warehouse database
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum