Normalized Data Store + Dimensional Modeling
Hi there Kimball forum, and welcome to my first topic :-)
I just received my copy of "Kimball's Data Warehouse Toolkit Classics" on Monday. Unfortunately, I still cannot find a hint on how to solve my design problem.
We have a somewhat overcomplicated database here, from which I have the honor of extracting a subset to feed our not-yet-existing data warehouse. Our operational database does not save history most of the time, but at least some of the tables have "last changed" fields.
Since I expect growing BI requirements, but mostly on the same base data, it feels like a good idea to have a normalized data store between my data sources and my dimensional model. I would like to use this normalized data store (NDS; what would you call it in Kimball DW/BI terms?) as a simplified replica of our operational DB, enriched with historized information.
For example, we have a customer table, a subscription table, a customer_subscription junction table, and a subscription_referral table. For reporting purposes we would like reports about our subscriptions, and later about our subscription_referrals and the customer base itself (not every customer has a subscription). The current business process is that ex-customers simply vanish from the customer database, along with their entries in the junction table. Since I am not the one making decisions about the production database, I would love to have these tables in my NDS as well, with two added fields, "valid_from" and "valid_to", to track changes. I also want to define surrogate keys already in this step, and I want to enforce referential integrity. In the next step I want to extract my dimensional data from this historized and simplified database.
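To make the historization idea concrete, here is a minimal in-memory sketch of a type-2-style merge into such an NDS table. All table, column, and function names are hypothetical, and a real implementation would of course run as SQL/ETL against the staging area rather than in Python:

```python
from datetime import date

# Hypothetical sketch: merge a full source snapshot into a historized NDS
# table that carries a surrogate key plus valid_from / valid_to columns.

OPEN_END = date(9999, 12, 31)  # sentinel meaning "still valid"

def apply_snapshot(nds_rows, source_rows, surrogate_seq, load_date):
    """nds_rows:      list of dicts: sk, source_id, attrs, valid_from, valid_to
    source_rows:   dict source_id -> attrs (today's full snapshot)
    surrogate_seq: iterator yielding fresh surrogate keys
    """
    # Index the currently valid version of each source row.
    current = {r["source_id"]: r for r in nds_rows if r["valid_to"] == OPEN_END}

    # 1. New or changed rows: close the old version, insert a new one.
    for source_id, attrs in source_rows.items():
        old = current.get(source_id)
        if old is not None and old["attrs"] == attrs:
            continue  # unchanged, nothing to do
        if old is not None:
            old["valid_to"] = load_date  # close the superseded version
        nds_rows.append({
            "sk": next(surrogate_seq),   # surrogate key assigned in the NDS
            "source_id": source_id,
            "attrs": attrs,
            "valid_from": load_date,
            "valid_to": OPEN_END,
        })

    # 2. Vanished rows (e.g. deleted ex-customers): close their validity,
    #    so history survives even though the source physically deletes.
    for source_id, row in current.items():
        if source_id not in source_rows:
            row["valid_to"] = load_date

    return nds_rows
```

The key point for the vanishing-customer problem is step 2: because the NDS only ever closes `valid_to` instead of deleting, an ex-customer remains queryable as of any past date.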
My (current) main problem is how to transform the foreign keys. I am almost sure that I have to repoint the old source foreign keys to the new surrogate keys in my NDS. But how do I maintain the insert order, e.g. for the junction table? I would need to know all the new surrogate keys in advance. And for many-to-many relations that have vanished, I would have to do a lookup to replace the open-ended valid_to values with the current date. How do I manage all this? Is there a best practice? What should I read?
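One common way to handle the ordering problem is to load the parent tables first, build a lookup from each source natural key to its currently valid surrogate key, and only then resolve the junction rows. A minimal sketch, with hypothetical names and an in-memory key map standing in for what would normally be a lookup join in the ETL tool:

```python
# Hypothetical sketch: resolve junction-table foreign keys against NDS
# surrogate keys. Load order: customer and subscription are loaded first,
# so every surrogate key exists before the junction lookup runs.

OPEN_END = "9999-12-31"  # sentinel meaning "still valid"

def current_key_map(nds_rows):
    """source_id -> surrogate key, for the currently valid versions only."""
    return {r["source_id"]: r["sk"]
            for r in nds_rows if r["valid_to"] == OPEN_END}

def resolve_junction(source_links, customer_rows, subscription_rows):
    """Translate (customer_id, subscription_id) source pairs into
    surrogate-key pairs for the NDS junction table."""
    cust_map = current_key_map(customer_rows)
    subs_map = current_key_map(subscription_rows)
    resolved, orphans = [], []
    for cust_id, subs_id in source_links:
        try:
            resolved.append((cust_map[cust_id], subs_map[subs_id]))
        except KeyError:
            # Referential-integrity violation in the extract: park the row
            # in an error table instead of loading it.
            orphans.append((cust_id, subs_id))
    return resolved, orphans
```

Vanished many-to-many links can then be handled the same way as vanished customers: compare the resolved pairs against the currently open junction rows in the NDS and set valid_to on any pair that no longer arrives from the source.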
EDIT:
How do I manage junction tables when replacing source system IDs with surrogate keys?
kind regards,
Marius
Last edited by Bergtroll on Wed Feb 02, 2011 9:30 am; edited 1 time in total
Bergtroll- Posts : 15
Join date : 2011-02-02
Re: Normalized Data Store + Dimensional Modeling
Additionally, I would like to know whether it is correct to use my new surrogate keys from the NDS in my dimensional model as well. I will not have to create an additional sequence over there, will I?