DWH architecture problem
4 posters
Page 1 of 1
DWH architecture problem
Hello,
I have some trouble to design my data warehouse. Here's the context :
- Financial people register our deals and report a financial snapshot every month. When they register new deals, they also indicates some information like which equipment is sold, at which customer, etc. (our dimensions).
- Project managers add additionnal data to these deals with milestones information (startup project date, customer acceptance date, etc.), also on a monthly basis.
Finance will only use finance information, Project Manager could use both type of information.
Based on this information, I have many possible scenarios, which is the best ?
1st scenario : star schema
In this scenario, I have two separate tables for Finance and Project management. But the thing is that I will have to duplicate reference to dimensions (equipment, customer, etc.) as it is Finance that declare deals and that information have to stay consistant for a same deal.
2nd scenario : one common table
As we have the same granularity (both are monthly snapshot), we could merge Finance and Project management information in a single table and proposes two views to the users. But I fear that it will become a mess (different enterprise function in a single table...).
3nd scenario : snowflake schema
We also could add a "Deal" table, containing all references to other dimensions (customer, equipment, etc.).
Thanks in advice for any usefull advice !
I have some trouble to design my data warehouse. Here's the context :
- Financial people register our deals and report a financial snapshot every month. When they register new deals, they also indicates some information like which equipment is sold, at which customer, etc. (our dimensions).
- Project managers add additionnal data to these deals with milestones information (startup project date, customer acceptance date, etc.), also on a monthly basis.
Finance will only use finance information, Project Manager could use both type of information.
Based on this information, I have many possible scenarios, which is the best ?
1st scenario : star schema
In this scenario, I have two separate tables for Finance and Project management. But the thing is that I will have to duplicate reference to dimensions (equipment, customer, etc.) as it is Finance that declare deals and that information have to stay consistant for a same deal.
2nd scenario : one common table
As we have the same granularity (both are monthly snapshot), we could merge Finance and Project management information in a single table and proposes two views to the users. But I fear that it will become a mess (different enterprise function in a single table...).
3nd scenario : snowflake schema
We also could add a "Deal" table, containing all references to other dimensions (customer, equipment, etc.).
Thanks in advice for any usefull advice !
Skualys- Posts : 13
Join date : 2014-04-04
Re: DWH architecture problem
Hi, I would approach this as follows:
1. Define what the measures are that you want to report on
2. Define the grain of these measures, which also defines the base set of dimensions that will apply to them
3. Any measures that have the same grain can (but don't have to) go in the same fact table; measures that don't have the same grain cannot go in the same fact tale - so this give you your fact tables.
BTW - I would never expose your underlying tables to users. If you are not using a BI tool, that hides the tables, then I would put create Views and give the users access to the Views rather than the tables
a) This means the users don't have to know how to join the tables as you do this in the Views for them
b) It allows you some flexibility to change the underlying tables without breaking all the reports/queries that use the Views
1. Define what the measures are that you want to report on
2. Define the grain of these measures, which also defines the base set of dimensions that will apply to them
3. Any measures that have the same grain can (but don't have to) go in the same fact table; measures that don't have the same grain cannot go in the same fact tale - so this give you your fact tables.
BTW - I would never expose your underlying tables to users. If you are not using a BI tool, that hides the tables, then I would put create Views and give the users access to the Views rather than the tables
a) This means the users don't have to know how to join the tables as you do this in the Views for them
b) It allows you some flexibility to change the underlying tables without breaking all the reports/queries that use the Views
nick_white- Posts : 364
Join date : 2014-01-06
Location : London
Re: DWH architecture problem
Thanks for your response.
The granularity will be the same (monthly snapshot). I was just a bit annoyed by storing in the same table information coming from two different business process, as I thought one fact table = one business process.
Of course, if I stay on an unique table, I will make two views to users (it will ease our life in the BI tool - especially if we decide to go for QlikView).
The granularity will be the same (monthly snapshot). I was just a bit annoyed by storing in the same table information coming from two different business process, as I thought one fact table = one business process.
Of course, if I stay on an unique table, I will make two views to users (it will ease our life in the BI tool - especially if we decide to go for QlikView).
Skualys- Posts : 13
Join date : 2014-04-04
Re: DWH architecture problem
When we talk about one fact table = one business process, we are talking about atomic level facts. That is, transactional level detail of every activity. As such, the atomic level data across processes is almost never at the same grain and occur at different points in time, hence different fact tables.
In your case, this is an aggregate fact containing summary data collected from different systems at the same time. Combining them into one table is common.
To your question, 1 & 2 are options, 3 is not.
In your case, this is an aggregate fact containing summary data collected from different systems at the same time. Combining them into one table is common.
To your question, 1 & 2 are options, 3 is not.
Re: DWH architecture problem
Thanks you very much for the detail you gave me, I got a better understanding of the principle of segragation between fact tables.
I will go for an unique table with two views, it will be easier to maintain.
(And obviously, I apologize for my English, I'm French =))
I will go for an unique table with two views, it will be easier to maintain.
(And obviously, I apologize for my English, I'm French =))
Skualys- Posts : 13
Join date : 2014-04-04
Re: DWH architecture problem
I wonder if you should have a Deal coverage factless fact table to store the relationships specific only to Deals in a dedicated centralized place. This fact may not have any referential relationship with other fact table, but can be queried against other facts.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: DWH architecture problem
Could you explain it to me a bit further (with a schema for example) ? I'm new to dimensional modeling and I'm afraid I don't understand what you are suggesting.
Skualys- Posts : 13
Join date : 2014-04-04
Re: DWH architecture problem
I am not sure if your deals are similar to retail promotions that stores promotion details for certain products. If so, you would have a deal dimension containing deal price, start date and end date etc. Then you have daily/monthly snapshot to capture what products are associated with which deal, as sales fact my have many products that are not in the deals.
However in other fact, if all your equipments are sold through deals, then you might just store deal key along with other dimension keys in the fact table, as suggested by Negalemmo.
However in other fact, if all your equipments are sold through deals, then you might just store deal key along with other dimension keys in the fact table, as suggested by Negalemmo.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Re: DWH architecture problem
hang wrote:I am not sure if your deals are similar to retail promotions that stores promotion details for certain products. If so, you would have a deal dimension containing deal price, start date and end date etc. Then you have daily/monthly snapshot to capture what products are associated with which deal, as sales fact my have many products that are not in the deals.
However in other fact, if all your equipments are sold through deals, then you might just store deal key along with other dimension keys in the fact table, as suggested by Negalemmo.
Ok, I got it ! In our case (packaging & bottling machine industry), one deal = one equipment sold, it's not retail sales.
Skualys- Posts : 13
Join date : 2014-04-04
Similar topics
» The age problem
» too many dimension problem
» 15 month fiscal year
» Help with Complex One-to-Many relationships
» data model for 2 fact tables (Header / Detail scenario)
» too many dimension problem
» 15 month fiscal year
» Help with Complex One-to-Many relationships
» data model for 2 fact tables (Header / Detail scenario)
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum