A View in SQL Server can be considered as a virtual table defined on top of it. True to its name, a view offers another entrance to look at data. A conventional view does not store actual data but merely contains a SELECT statement and metadata about the tables involved. Through views, clients no longer need to understand the underlying table structures and their relationships, as views provide a unified interface to access data.
Why Use Views?
Views abstract the underlying table structures, simplifying data access operations.
By hiding the underlying table structures, security is significantly enhanced, allowing users to see only the data provided by the views.
Views facilitate permission management by granting users access to views instead of the underlying tables, further strengthening security.
Views provide an interface for users to access data. When the underlying tables change, altering the view’s statement to adapt ensures that client programs built on this view remain unaffected.
Classification of Views in SQL
Views in SQL can be categorized into three types:
Regular View
Indexed View
Partitioned View
Let’s discuss these view types in detail.
1) Regular View
A Regular View is defined by a SELECT statement and includes only its definition and the metadata of the referenced tables, without actually storing data. The template for creating a view as per MSDN is as follows:
Parameters are quite straightforward, and here is an explanation for the above:
ENCRYPTION: The view is encrypted. Selecting this option prevents modifications. Save the script when creating the view; otherwise, it cannot be modified later.
SCHEMABINDING: Binds the view definition to the schema of the referenced tables. If this option is selected, the schema of the referenced tables (such as column data types) cannot be changed casually. To alter the underlying table’s schema, first drop or alter the views bound to it.
VIEW_METADATA: An interesting option. If not selected, the metadata returned to the client is that of the tables referenced by the View. If selected, the View’s metadata is returned, making the view appear more like a table to clients.
WITH CHECK OPTION: This option is used for restricting data updates through the view, which will be explained further in the section on updating data through views.
There are some rules to follow when creating views, aside from complying with the above syntax:
In a View, unless the TOP keyword is present, the ORDER BY clause cannot be used (a hack to use ORDER BY is applying TOP 100 percent…).
Views must be uniquely named within each Schema.
Views nesting cannot exceed 32 levels (in practical work, nesting beyond two levels is discouraged).
Keywords like COMPUTE, COMPUTE BY, INTO are not allowed in Views.
2) Indexed View
An Indexed View is a regular view with a unique clustered index added to it, effectively turning the view into an object equivalent to a table!
Indexed views in SQL Server are conceptually similar to Oracle’s Materialized Views. To understand indexed views, one must first grasp the concept of clustered indexes. Simplified, a clustered index can be thought of as a primary key, with the database’s data physically stored in the table according to the primary key’s order, like a dictionary organized from A to Z. This organization avoids full table scans, thus improving performance. Therefore, a table can have only one clustered index.
For indexed views, adding a clustered index means the view no longer just contains a SELECT statement and table metadata; the indexed view physically stores data in the database and keeps it synchronized with the underlying tables.
Understanding the principle behind indexed views reveals their significant performance boost for LDAP, involving massive data analysis and queries, especially when indexed views contain aggregate functions and involve costly JOINs. Since the results of aggregate functions are physically stored in the indexed view, it’s not necessary to perform aggregation every time the view is used, markedly improving performance.
However, every time an underlying table involved in an indexed view is updated, inserted, or deleted, SQL Server needs to identify the changed rows to synchronize the indexed view. Thus, for OLTP systems with frequent modifications, the performance might degrade due to the extensive synchronization required.
3) Partitioned View
Partitioned views, from a micro-implementation perspective, return data sets obtained by UNION-joining several parallel tables (i.e., tables with the same structure - columns and data types, but storing different row sets).
There are generally two types of partitioned views:
Local Partitioned View
Distributed Partitioned View
Since local partitioned views are mainly for backward compatibility with versions before SQL Server 2005, this article will focus on distributed partitioned views.
Distributed partitioned views essentially connect parallel data sets obtained from different or the same data sources. The biggest advantage of using distributed partitioned views is performance enhancement. For example, in the scenario above, if I only want to retrieve information for EmployeeID=8 using a distributed view, SQL Server can intelligently scan only Table 2 that contains EmployeeID=8, avoiding a full table scan. This significantly reduces IO operations, thereby enhancing performance.
It’s important to note that the primary keys involved in the tables of a distributed partitioned view cannot overlap. For example, if Table A’s ContactID ranges from 1-4, then Table B’s ContactID cannot range from 2-8.
Another point to note is that Check constraints should be added to the primary keys of the distributed partitioned view, enabling the SQL Server’s query analyzer to know which table to scan. Here’s an example:
In the Microsoft sample database AdventureWorks, I store data from the first 100 rows and rows 100-200 into two tables, Employee100 and Employee200, respectively:
1 2 3 4 5 6 7 8 9 10 11
-- Create Employee100 SELECT TOP 100*INTO Employee100 FROM HumanResources.Employee ORDERBY EmployeeID -- Create Employee200 SELECT*INTO Employee200 FROM (SELECT TOP 100* FROM HumanResources.Employee WHERE EmployeeID NOTIN (SELECT TOP 100 EmployeeID FROM HumanResources.Employee ORDERBY EmployeeID) ORDERBY HumanResources.Employee.EmployeeID) AS e
Now, let’s create a distributed partitioned view:
1 2 3 4 5
CREATEVIEW v_part_view_test AS SELECT*FROM Employee100 UNION SELECT*FROM Employee200
When querying this view:
1 2
SELECT*FROM v_part_view_test WHERE EmployeeID=105
So, when different data tables are placed on different servers or use a RAID5 disk array, distributed partitioned views can further enhance query performance.
Can using distributed partitioned views always improve performance? Absolutely not. If the query involves aggregate functions, especially with DISTINCT in the aggregate functions, or sorts without a WHERE condition, it’s definitely a performance killer. Aggregate functions require scanning all tables in the distributed partitioned view, then performing UNION operations before calculating, which can significantly impact performance.
Updating Data Through Views
I do not recommend updating data through views since views cannot accept parameters. I prefer using stored procedures for implementation.
Updating data through views is done in the same manner as updating data in tables (as mentioned before, a view can be considered as a virtual table, and if it’s an indexed view, it’s practically a table).
When updating data through views, keep in mind:
At least one user table must follow the FROM clause in the view.
Regardless of how many tables the view’s query involves, only one table’s data can be updated at a time.
Columns calculated through expressions, constant columns, and columns resulting from aggregate functions cannot be updated.
Columns not affected by Group By, Having, and Distinct keywords cannot be updated.
Tips in Views
To find a view’s definition through its name:
1 2
SELECT*FROM sys.sql_modules WHERE object_id=OBJECT_ID('view_name')
As mentioned, regular views only store the SELECT statement and referenced tables’ metadata. When the underlying table data changes, sometimes the view’s table metadata is not synchronized in time. Use the following code for manual synchronization:
1
EXEC sp_refreshview 'view_name'
Best Practices in Views
These are some of my personal experiences, and I welcome additions:
Ensure the SELECT statement in the View is optimized for performance (seems obvious, but truths are often simple).
Avoid nesting views, if necessary, at most nest one level.
Use stored procedures and user-defined functions instead of Views when possible, as stored procedures cache execution plans, offer better performance, and have fewer restrictions.
Do not use aggregate functions in partitioned views, especially when they include DISTINCT.
If possible, include WHERE clauses inside the view rather than outside (because calling a view returns all rows, then filters, which is a performance killer, especially if you also add an ORDER BY…).
Conclusion
This article detailed the three types of views, each with its own use cases. Properly used, they can enhance performance, but improper use can drag down performance.
Remember the adage: “Practice makes perfect”… With practice and thought, there will surely be gains.