Relational Datawarehouse
3 posters
Page 1 of 1
Relational Datawarehouse
Hi,
I have always understood that a Datawarehouse uses a Dimentional Model. But then i hear that there are Datawarehouses using the Relational model. My question is if the OLTP system by itself falls under the relational model, then what would the difference between the models(Relational Datawarehouse and OLTP) , except that relational datawarehouse would have historical data also.
Sorry, if my question sounds silly.
Thanks,
Manjula
I have always understood that a Datawarehouse uses a Dimentional Model. But then i hear that there are Datawarehouses using the Relational model. My question is if the OLTP system by itself falls under the relational model, then what would the difference between the models(Relational Datawarehouse and OLTP) , except that relational datawarehouse would have historical data also.
Sorry, if my question sounds silly.
Thanks,
Manjula
ManjulaSiva- Posts : 1
Join date : 2010-04-29
Re: Relational Datawarehouse
Not silly, in fact your summary is fairly accurate.
The use of ER models in data warehousing is often referred to as the Inmon architecture, as he (Bill Inmon) is its biggest proponents of this approach. The somewhat 3NF model serves as the data repository which supports an analytics layer, typically modeled using dimensional schema. Users access data in the analytics layer, while the core repository is generally off-limits to ad-hoc end-user queries.
This is done a lot in large organizations primarily because those in charge (both managerial and technical) are comfortable with 3NF relational databases and can't get their heads around the idea of a dimensional repository. This attitude has been slowly changing over the years, but most of those who are attempting dimensional repositories, are still not able to achive an integrated environment due to lack of a cohesive vision and, in most cases, internal politics and the way development projects are handled.
The use of ER models in data warehousing is often referred to as the Inmon architecture, as he (Bill Inmon) is its biggest proponents of this approach. The somewhat 3NF model serves as the data repository which supports an analytics layer, typically modeled using dimensional schema. Users access data in the analytics layer, while the core repository is generally off-limits to ad-hoc end-user queries.
This is done a lot in large organizations primarily because those in charge (both managerial and technical) are comfortable with 3NF relational databases and can't get their heads around the idea of a dimensional repository. This attitude has been slowly changing over the years, but most of those who are attempting dimensional repositories, are still not able to achive an integrated environment due to lack of a cohesive vision and, in most cases, internal politics and the way development projects are handled.
Re: Relational Datawarehouse
To me, normalised relational DW is a simple extension of OLTP system that can hold historical data in a centralised storage. To a lot of people with decades of relational modeling experience, dimensional modeling would somehow undo their skills and make them learn to adapt to something they have been fighting to avoid.
Admittedly, it's big challenge to jump to dimensional DW system from an OLTP system that is not even relationally normalised. In practice, an operational data store ODS may have to be created to tidy things up before moving to dimensional model. However an ODS is not a data warehouse, but rather a properly done OLTP system.
The biggest threat to building a proper dimensional datastore is the temptation of normalising dimension tables and denormlising fact tables. Even in SQL Server, some sample DW databases like AdventureWorks have been used to encourage people to snowflake their dimensional schema as an evident justification. But don't forget, AdventureWorks is a simplified retail system. In reality, snowflaked hierarchy becomes much more complex and rigid when you need to track SCD changes in all the levels in the hierarchy.
Take an example of Product-Category-Subcategory hierarchy. In real world, you would need to treat Category and Subcategory as type2 SCD dimensions. Trying to maintain related SCD dimensions dealing with multiple surrogate keys makes schema confusing and makes ETL more complex as the relationship really should exist between natural keys. However in denormalised product dimension, the hierarchy or relationship can be easily formed by the attributes in a single table and hence you are only dealing with a single surrogate key. Any hierarchical changes in history will be simply tracked like normal attribute changes through normal SCD process in the ETL.
A final nail on the coffin for the normalised dimension like product-category-subcategory, there are potentially more hierarchical relationships in the product. Obviously you would not feel comfortable to snowflake all the possible hierarchies even they are all potentially very useful. In dimensional schema, having normalised product-category-subcategory is a rigid and biased setup carried over from OLTP system. Users know what relationships they may get from the data by their business knowledge not by being forced to understand underlying ER models.
Admittedly, it's big challenge to jump to dimensional DW system from an OLTP system that is not even relationally normalised. In practice, an operational data store ODS may have to be created to tidy things up before moving to dimensional model. However an ODS is not a data warehouse, but rather a properly done OLTP system.
The biggest threat to building a proper dimensional datastore is the temptation of normalising dimension tables and denormlising fact tables. Even in SQL Server, some sample DW databases like AdventureWorks have been used to encourage people to snowflake their dimensional schema as an evident justification. But don't forget, AdventureWorks is a simplified retail system. In reality, snowflaked hierarchy becomes much more complex and rigid when you need to track SCD changes in all the levels in the hierarchy.
Take an example of Product-Category-Subcategory hierarchy. In real world, you would need to treat Category and Subcategory as type2 SCD dimensions. Trying to maintain related SCD dimensions dealing with multiple surrogate keys makes schema confusing and makes ETL more complex as the relationship really should exist between natural keys. However in denormalised product dimension, the hierarchy or relationship can be easily formed by the attributes in a single table and hence you are only dealing with a single surrogate key. Any hierarchical changes in history will be simply tracked like normal attribute changes through normal SCD process in the ETL.
A final nail on the coffin for the normalised dimension like product-category-subcategory, there are potentially more hierarchical relationships in the product. Obviously you would not feel comfortable to snowflake all the possible hierarchies even they are all potentially very useful. In dimensional schema, having normalised product-category-subcategory is a rigid and biased setup carried over from OLTP system. Users know what relationships they may get from the data by their business knowledge not by being forced to understand underlying ER models.
hang- Posts : 528
Join date : 2010-05-07
Location : Brisbane, Australia
Similar topics
» Is it the end of the Relational Dimensional Data Warehouse ?
» Customer looking for sub second performance in a relational implementation.
» Dimensional DB to completely replace Relational DB
» Start schema in relational database vs OLAP
» Advice for modelling planned relational data
» Customer looking for sub second performance in a relational implementation.
» Dimensional DB to completely replace Relational DB
» Start schema in relational database vs OLAP
» Advice for modelling planned relational data
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum