Materialized Views vs. Tables
Materialized Views vs. Tables
Hi All,
We are trying to implement a business solution where we store data in normalized tables (around 5-6 base tables, with the largest transaction table holding about 1 million rows). The data in these normalized tables also needs to be used for reporting, so we are planning to create flattened materialized views that store the data in columns, using a pivot query, roughly along the lines of the sketch below.
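Something like this rough sketch of what we have in mind (ORDER_MEASURES, MEASURE_CODE, MEASURE_VALUE and the MV name are just placeholders for our real normalized tables, and the pivot is done with conditional aggregation):

-- Illustrative only: flatten row-wise measures into columns for reporting
CREATE MATERIALIZED VIEW order_flat_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT order_id,
       MAX(CASE WHEN measure_code = 'QTY'    THEN measure_value END) AS qty,
       MAX(CASE WHEN measure_code = 'AMOUNT' THEN measure_value END) AS amount,
       MAX(CASE WHEN measure_code = 'TAX'    THEN measure_value END) AS tax
FROM   order_measures
GROUP  BY order_id;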
However, we are unsure whether we should go for tables or MVs. Please provide your suggestions, or the pros and cons of implementing MVs vs. tables.
Regards,
Abhiraizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Materialized Views vs. Tables
Hi,
You can go for materialized views if that works well for you, but for MVs you need to determine the refresh method: whether you refresh on demand or at a particular frequency (see the sketch below). How many MVs are you planning, one or more than one?
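For example, the two options might look roughly like this (SALES and the column names are just illustrative):

-- Scheduled refresh: the MV refreshes itself every night at 02:00
CREATE MATERIALIZED VIEW sales_summary_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE
  START WITH SYSDATE NEXT TRUNC(SYSDATE) + 1 + 2/24
AS
SELECT product_id, SUM(amount) AS total_amount
FROM   sales
GROUP  BY product_id;

-- On-demand refresh: nothing happens until you (or a scheduled job) ask for it
BEGIN
  DBMS_MVIEW.REFRESH(list => 'SALES_SUMMARY_MV', method => 'C');
END;
/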
thanks
Himanshu
hkandpal- Posts : 113
Join date : 2010-08-16
Re: Materialized Views vs. Tables
I would propose building a dimensional model to support the business process.
BoxesAndLines- Posts : 1212
Join date : 2009-02-03
Location : USA
Re: Materialized Views vs. Tables
Agree with B&L. Materialized views are somewhat restrictive and you wind up with a big flat table. Big flat tables don't perform very well on a lot of DB systems, and you may wind up with cross-products due to the joins. There can also be significant overhead maintaining the MV. Plus, you have little opportunity to optimize the structure.
Build a dimensional model and processes to load it.
Re: Materialized Views vs. Tables
ngalemmo wrote:Agree with B&L. Materialized views are somewhat restrictive and you wind up with a big flat table. Big flat tables don't perform very well on a lot of DB systems, and you may wind up with cross-products due to the joins. There can also be significant overhead maintaining the MV. Plus, you have little opportunity to optimize the structure.
Build a dimensional model and processes to load it.
Thanks for the valuable feedback, guys.
I would like to understand what these "overheads" are, because if I decide to move to tables (i.e. a dimensional model, which I initially thought was the best way to address the problem) I need to give pointers as to why we have to change. FYI, I am facing stiff opposition from other senior developers to a table-based approach, since it requires ETL to populate the tables while MVs can simply be refreshed.
I would really appreciate specific points where MVs can cause problems.
Many thanks,
Abhiraizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Re: Materialized Views vs. Tables
I did some reading and prototyping and found that materialized views are generally used for the following purposes:
Data distribution - Distributing data across multiple sites, reducing the network load on any one site.
Data sub-setting - Replicating data based on column- and row-level sub-setting.
Precomputed joins and aggregation - Storing precomputed joins and aggregated data to avoid computing them at run time.
Query rewrite - The optimizer transparently rewrites a request to use the materialized view. Queries are then directed to the materialized view rather than to the underlying detail tables, resulting in a significant performance gain (see the example after this list).
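For instance, a precomputed aggregate MV enabled for query rewrite might look roughly like this (SALES, PRODUCTS and the columns are illustrative names, and QUERY_REWRITE_ENABLED must be on for rewrite to kick in):

CREATE MATERIALIZED VIEW sales_by_product_mv
  BUILD IMMEDIATE
  REFRESH FORCE ON DEMAND
  ENABLE QUERY REWRITE
AS
SELECT p.product_name,
       SUM(s.amount) AS total_amount,
       COUNT(*)      AS row_count
FROM   sales s
JOIN   products p ON p.product_id = s.product_id
GROUP  BY p.product_name;

-- A query like this can now be transparently rewritten against the MV
-- instead of the detail tables:
SELECT p.product_name, SUM(s.amount)
FROM   sales s JOIN products p ON p.product_id = s.product_id
GROUP  BY p.product_name;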
However, the following points need to be considered before implementing them:
Complex MVs - A materialized view is considered complex when its defining query contains a CONNECT BY clause, set operations, or the DISTINCT or UNIQUE keyword. In some cases the use of aggregate functions and the presence of joins (other than those in a sub-query) also result in a complex MV, which cannot be FAST refreshed and therefore requires a costly COMPLETE refresh whenever it needs refreshing.
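For comparison, a simple aggregate MV that does qualify for FAST refresh needs materialized view logs on the base tables and the extra COUNT columns in the defining query; roughly (illustrative names again):

-- A materialized view log on the base table is required for FAST refresh
CREATE MATERIALIZED VIEW LOG ON sales
  WITH ROWID, SEQUENCE (product_id, amount)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW sales_fast_mv
  BUILD IMMEDIATE
  REFRESH FAST ON DEMAND
AS
SELECT product_id,
       SUM(amount)   AS total_amount,
       COUNT(amount) AS cnt_amount,   -- required for FAST refresh of SUM
       COUNT(*)      AS row_count     -- required for FAST refresh of aggregates
FROM   sales
GROUP  BY product_id;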
COMPLETE refresh - To perform a complete refresh of an MV, the server that manages the materialized view executes the MV's defining query, which essentially recreates the MV: the result set of the query replaces the existing materialized view data. Oracle can perform a complete refresh for any materialized view, but depending on the amount of data that satisfies the defining query, a complete refresh can take substantially longer than a fast refresh.
Frequent refreshes of an MV that uses the COMPLETE refresh mechanism can drastically reduce performance and can cause data consistency issues in the MV, so it is very important to understand the nature of the application before scheduling refreshes. Generally, COMPLETE refresh should only be used where refreshes are rarely needed.
From Oracle 10g onwards the ATOMIC_REFRESH parameter defaults to TRUE, so a complete refresh performs a DELETE instead of a TRUNCATE; this keeps the MV available during the refresh, but the DELETE is expensive in terms of refresh time. A DELETE is always expensive, and sometimes practically impossible, when we are talking about the complete refresh of materialized views with millions of rows.
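Where a complete refresh of a large MV is unavoidable and a short window with an empty MV is acceptable, atomic refresh can be switched off so Oracle uses TRUNCATE instead of DELETE; something like:

BEGIN
  DBMS_MVIEW.REFRESH(list           => 'ORDER_FLAT_MV',      -- illustrative MV name
                     method         => 'C',                  -- complete refresh
                     atomic_refresh => FALSE);               -- TRUNCATE + direct-path insert
END;
/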
ON COMMIT refresh - If the ON COMMIT refresh method is chosen, whenever a materialized view is affected by changes to the base table data, the materialized view is automatically updated to reflect the change. However, this update occurs as part of the commit processing of the transaction that changed the base table. The commit therefore takes longer, because changes are being made both to the original table and to any materialized view whose definition includes that table. In some cases deadlocks on base table records have been experienced with this approach.
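A minimal ON COMMIT sketch (it reuses the materialized view log shown earlier, and every commit on the base table now pays this maintenance cost):

CREATE MATERIALIZED VIEW sales_totals_oc_mv
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT
AS
SELECT product_id,
       SUM(amount)   AS total_amount,
       COUNT(amount) AS cnt_amount,
       COUNT(*)      AS row_count
FROM   sales
GROUP  BY product_id;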
Refresh groups - These are groups of MVs that are refreshed together in each refresh cycle. Generally you define a single refresh group for each materialized view group, although for efficiency a refresh group may contain objects from multiple materialized view groups. It is not recommended to use multiple refresh groups to refresh the contents of a single materialized view group: doing so can introduce inconsistencies in the materialized view data, which can cause referential integrity problems at the materialized view site.
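Refresh groups are managed through the DBMS_REFRESH package; a sketch with illustrative names:

BEGIN
  DBMS_REFRESH.MAKE(
    name      => 'REPORTING_GROUP',
    list      => 'SALES_BY_PRODUCT_MV, ORDER_FLAT_MV',
    next_date => SYSDATE,
    interval  => 'SYSDATE + 1/24');   -- refresh the whole group together, hourly
END;
/

-- Or refresh the group manually when needed:
BEGIN
  DBMS_REFRESH.REFRESH('REPORTING_GROUP');
END;
/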
MV indexes - Indexes can be created on MVs to optimize data retrieval. However, these indexes have a negative impact on refresh performance: more indexes mean more work for the refresh, which takes more time. Also, restrictive indexes such as unique indexes can interfere with the refresh operation and keep it from completing when those restrictions are not enforced on the master data.
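Indexes are created on an MV just as on a table, so the usual query-speed vs. maintenance trade-off applies; for example:

-- Illustrative: speeds up lookups against the flattened MV, slows its refresh.
-- Making this UNIQUE could abort the refresh if the source ever yields duplicates.
CREATE INDEX order_flat_mv_ord_ix
  ON order_flat_mv (order_id);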
MV invalidation - The staleness of data in an MV needs to be reviewed regularly. An MV can become invalidated if a DML operation is performed on a referenced table or a DDL modification is made to a referenced table. In most cases the materialized view is transparently re-validated; however, in some cases a stale MV has to be re-validated manually.
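Staleness and validity can be checked from the data dictionary, for example:

SELECT mview_name, staleness, compile_state, last_refresh_date
FROM   user_mviews;

-- Re-validate an MV manually if needed (illustrative MV name):
ALTER MATERIALIZED VIEW order_flat_mv COMPILE;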
The application might also see performance degradation where multiple MVs are implemented with the FAST refresh option, as this requires multiple MV logs/triggers to be created on the base tables.
More storage is required, since the materialized view executes its query and stores the resulting data.
Feel free to add more to this or ask questions if you want to; this might help people decide on the most suitable approach.
Regards,
Abhishek Raizada
Abhiraizada- Posts : 20
Join date : 2011-05-24
Re: Materialized Views vs. Tables
Good work. I'm glad to see you did your homework. It should give you some sense of what I meant by overhead.