SQL View Creation Standards - Guidance Needed
4 posters
Page 1 of 1
SQL View Creation Standards - Guidance Needed
Looking for guidance on the best method for creating an SQL view. The 2 methods in question are as follows:
(1)
CREATE VIEW “VIEW NAME” AS
SELECT *
FROM “TABLE NAME”
(2)
CREATE VIEW “VIEW NAME” AS
SELECT COLUM1, COLUMN 2 , COLUM3, COLUMN 4, COLUMN 5, etc…
FROM “TABLE NAME”
What are the drawbacks/advatages of each method, keeping in mind the view could possibly be updated to include join to another table, and calculations/comparisons on columns could aslo be done?
(1)
CREATE VIEW “VIEW NAME” AS
SELECT *
FROM “TABLE NAME”
(2)
CREATE VIEW “VIEW NAME” AS
SELECT COLUM1, COLUMN 2 , COLUM3, COLUMN 4, COLUMN 5, etc…
FROM “TABLE NAME”
What are the drawbacks/advatages of each method, keeping in mind the view could possibly be updated to include join to another table, and calculations/comparisons on columns could aslo be done?
ksaizon- Posts : 1
Join date : 2011-03-07
Re: SQL View Creation Standards - Guidance Needed
My opinion is that the best practice is to enumerate the columns returned by the view (i.e. #2). And, if you are doing joins or deriving columns, you really have no choice. You might as well be consistent.
SELECT *
Hi ksaizon,
I'd like to put the case for option 1 (SELECT *). This certainly has the advantage of much less code to maintain, and (for MS SQL Server) doesn't preclude any join or derived column code you might want to add.
A further advantage that isn't as obvious, is that I strongly prefer to drive most logic of any complexity (e.g. anything more complex than a single-column join or an ISNULL derived column) upstream and perform it in the ETL layer, not in Views. The ETL tools (e.g. SSIS) can use various techniques to optimize the logic and performance, and apply a single consistent definition. The result is then stored in columns which can be indexed to optimize query performance.
So when I'm reviewing or mentoring an ETL team using SELECT * views, I can quickly scan their View code - it should mostly be SELECT * style. Anything more complex triggers me to stop and review why they are not performing that logic in the ETL layer.
Good luck!
Mike
I'd like to put the case for option 1 (SELECT *). This certainly has the advantage of much less code to maintain, and (for MS SQL Server) doesn't preclude any join or derived column code you might want to add.
A further advantage that isn't as obvious, is that I strongly prefer to drive most logic of any complexity (e.g. anything more complex than a single-column join or an ISNULL derived column) upstream and perform it in the ETL layer, not in Views. The ETL tools (e.g. SSIS) can use various techniques to optimize the logic and performance, and apply a single consistent definition. The result is then stored in columns which can be indexed to optimize query performance.
So when I'm reviewing or mentoring an ETL team using SELECT * views, I can quickly scan their View code - it should mostly be SELECT * style. Anything more complex triggers me to stop and review why they are not performing that logic in the ETL layer.
Good luck!
Mike
Re: SQL View Creation Standards - Guidance Needed
SELECT * is bad practice.
Firstly, there is the performance hit. If you include all columns, then you require a scan of each table (unless you are silly enough to write a covering index for every column of every table).
Secondly, if you write SELECT * for a view with multiple tables, how do you know which tables the columns PRODUCTKEY and PRODUCTKEY1 refer to? You don't. If you are using a LEFT JOIN, you may be then trying to use the PRODUCTKEY that contains NULLS.
It's simply bad practice and any google search will show that. You should only include the columns you need in the view.
Firstly, there is the performance hit. If you include all columns, then you require a scan of each table (unless you are silly enough to write a covering index for every column of every table).
Secondly, if you write SELECT * for a view with multiple tables, how do you know which tables the columns PRODUCTKEY and PRODUCTKEY1 refer to? You don't. If you are using a LEFT JOIN, you may be then trying to use the PRODUCTKEY that contains NULLS.
It's simply bad practice and any google search will show that. You should only include the columns you need in the view.
In defence of SELECT *
Hi John,
I should've been specific that I'm referring to Microsoft SQL Server, where the issues you listed don't exist in my experience.
In MS SQL Server, the view code is effectively "compiled" when you create the view, locking in the current schema definition. At runtime, the query optimizer "looks through" the view definition and only requests the minimum required columns from the source tables. You can see this at work in the varying Excecution Plans generated when you select various combinations of columns from a "SELECT *" view. So in my experience there is no "performance hit".
Regarding your second point, the JOIN clause is separate to the SELECT clause, and using "SELECT *" does not prevent you from coding your JOIN with table prefixes - in fact you have to code with table prefixes, otherwise you will get an error for ambiguous column references.
In such a multi-table query, you can also use syntax like "SELECT Table1.*, Table2.ColumnB " if you want to return all columns from Table1, together with selected columns from Table2.
I'm not sure what google search you relied on to draw your conclusion - I couldn't turn up anything to support your view after 3 or 4 attempts.
In any case, I'd suggest this topic is more subtle than you think. I certainly didn't mean to imply that the more conventional style of view coding was somehow "bad" - I merely intended to present an alternative perspective.
I should've been specific that I'm referring to Microsoft SQL Server, where the issues you listed don't exist in my experience.
In MS SQL Server, the view code is effectively "compiled" when you create the view, locking in the current schema definition. At runtime, the query optimizer "looks through" the view definition and only requests the minimum required columns from the source tables. You can see this at work in the varying Excecution Plans generated when you select various combinations of columns from a "SELECT *" view. So in my experience there is no "performance hit".
Regarding your second point, the JOIN clause is separate to the SELECT clause, and using "SELECT *" does not prevent you from coding your JOIN with table prefixes - in fact you have to code with table prefixes, otherwise you will get an error for ambiguous column references.
In such a multi-table query, you can also use syntax like "SELECT Table1.*, Table2.ColumnB " if you want to return all columns from Table1, together with selected columns from Table2.
I'm not sure what google search you relied on to draw your conclusion - I couldn't turn up anything to support your view after 3 or 4 attempts.
In any case, I'd suggest this topic is more subtle than you think. I certainly didn't mean to imply that the more conventional style of view coding was somehow "bad" - I merely intended to present an alternative perspective.
Re: SQL View Creation Standards - Guidance Needed
Sorry Mike, you're quite right. I was focusing on the SELECT * which is a no-no.
If you do a SELECT * for the creation of the view, then there is no difference if the user selects only particular columns. However if they run a SELECT * from the view, that will indeed have a performance impact.
If you do a SELECT * for the creation of the view, then there is no difference if the user selects only particular columns. However if they run a SELECT * from the view, that will indeed have a performance impact.
Similar topics
» View or materialized view?
» How does dimension hierarcy affect granularity?
» Date Dimension Creation
» Measure Creation from Dimension
» Naming standards
» How does dimension hierarcy affect granularity?
» Date Dimension Creation
» Measure Creation from Dimension
» Naming standards
Page 1 of 1
Permissions in this forum:
You cannot reply to topics in this forum