UserGuide Windows Server 2012 R2 with SQL Server 2012 SP3 Standard

Jun 28, 2024

5. Tutorial: Configure Replication between Two Fully Connected Servers (Transactional)
Updated: 2018-04-13

Create a subscription to the Transactional publication

In this section, you will add a subscriber to the publication that was previously created. This tutorial uses a remote subscriber (NODE2\SQL2016), but a subscription can also be added locally to the publisher.

To create the subscription

1. Connect to the Publisher in {Included-Content-Goes-Here}, expand the server node, and then expand the Replication folder.
2. In the Local Publications folder, right-click the AdvWorksProductTrans publication, and then select New Subscriptions. The New Subscription Wizard launches.
3. On the Publication page, select AdvWorksProductTrans, and then select Next.
4. On the Distribution Agent Location page, select Run all agents at the Distributor, and then select Next. For more information on pull and push subscriptions, see Subscribe to Publications.
5. On the Subscribers page, if the name of the Subscriber instance is not displayed, select Add Subscriber and then select Add SQL Server Subscriber from the drop-down. This launches the Connect to Server dialog box. Enter the Subscriber instance name and then select Connect.

6. Tutorial: Configure Replication between a Server and Mobile Clients (Merge)
Updated: 2018-04-13

The Employee table contains a column (OrganizationNode) that has the hierarchyid data type, which is only supported for replication in SQL 2017. If you're using a build lower than SQL 2017, you'll see a message at the bottom of the screen notifying you of potential data loss for using this column in bi-directional replication. For the purpose of this tutorial, this message can be ignored. However, this data type should not be replicated in a production environment unless you're using the supported build.
For more information about replicating the hierarchyid data type, see Using Hierarchyid Columns in Replication.

On the Filter Table Rows page, select Add and then select Add Filter. In the Add Filter dialog box, select Employee (HumanResources) in Select the table to filter. Select the LoginID column, select the right arrow to add the column to the WHERE clause of the filter query, and modify the WHERE clause as follows:

    WHERE [LoginID] = HOST_NAME()

a. Select A row from this table will go to only one subscription, and select OK.

On the Filter Table Rows page, select Employee (HumanResources), select Add, and then select Add Join to Extend the Selected Filter.

a. In the Add Join dialog box, select Sales.SalesOrderHeader under Joined table. Select Write the join statement manually, and complete the join statement as follows:

7. Query with Full-Text Search
Updated: 2018-04-13

More info about generation term searches

The inflectional forms are the different tenses and conjugations of a verb or the singular and plural forms of a noun. For example, search for the inflectional form of the word "drive." If various rows in the table include the words "drive," "drives," "drove," "driving," and "driven," all would be in the result set because each of these can be inflectionally generated from the word drive.

FREETEXT and FREETEXTTABLE look for inflectional terms of all specified words by default. CONTAINS and CONTAINSTABLE support an optional INFLECTIONAL argument.

Search for synonyms of a specific word

A thesaurus defines user-specified synonyms for terms. For more info about thesaurus files, see Configure and Manage Thesaurus Files for Full-Text Search. For example, if an entry, "{car, automobile, truck, van}," is added to a thesaurus, you can search for the thesaurus form of the word "car."
All rows in the queried table that include the words "automobile," "truck," "van," or "car" appear in the result set because each of these words belongs to the synonym expansion set containing the word "car."

FREETEXT and FREETEXTTABLE use the thesaurus by default. CONTAINS and CONTAINSTABLE support an optional THESAURUS argument.

8. Transparent Data Encryption with Bring Your Own Key support for Azure SQL Database and Data Warehouse
Updated: 2018-04-24

How to configure Geo-DR with Azure Key Vault

To maintain high availability of TDE Protectors for encrypted databases, you must configure redundant Azure Key Vaults based on the existing or desired SQL Database failover groups or active geo-replication instances. Each geo-replicated server requires a separate key vault that is co-located with the server in the same Azure region. If a primary database becomes inaccessible due to an outage in one region and a failover is triggered, the secondary database can take over using the secondary key vault.

For geo-replicated Azure SQL databases, the following Azure Key Vault configuration is required:

- One primary database with a key vault in region and one secondary database with a key vault in region.
- At least one secondary is required; up to four secondaries are supported.
- Secondaries of secondaries (chaining) are not supported.

The following section goes over the setup and configuration steps in more detail.

Azure Key Vault Configuration Steps

- Install PowerShell.
- Create two Azure Key Vaults in two different regions using PowerShell to enable the "soft-delete" property on the key vaults (this option is not available from the AKV Portal yet, but is required by SQL).
- Both Azure Key Vaults must be located in the two regions available in the same Azure Geo in order for backup and restore of keys to work.
If you need the two key vaults to be located in different geos to meet SQL Geo-DR requirements, follow the BYOK process that allows keys to be imported from an on-premises HSM.

9. PowerShell and CLI: Enable Transparent Data Encryption using your own key from Azure Key Vault
Updated: 2018-04-24

Prerequisites for CLI

- You must have an Azure subscription and be an administrator on that subscription.
- [Recommended but optional] Have a hardware security module (HSM) or local key store for creating a local copy of the TDE Protector key material.
- Command-Line Interface version 2.0 or later. To install the latest version and connect to your Azure subscription, see Install and Configure the Azure Cross-Platform Command-Line Interface 2.0.
- Create an Azure Key Vault and key to use for TDE:
  - Manage Key Vault using CLI 2.0
  - Instructions for using a hardware security module (HSM) and Key Vault
  - The key vault must have the following property to be used for TDE: soft-delete. See How to use Key Vault soft-delete with CLI.
- The key must have the following attributes to be used for TDE:
  - No expiration date
  - Not disabled
  - Able to perform get, wrap key, and unwrap key operations

Step: Create a server and assign an Azure AD identity to your server

    # create server (with identity) and database

10. About Change Data Capture (SQL Server)
Updated: 2018-04-17

Working with database and table collation differences

Be aware of the situation where the database and the columns of a table configured for change data capture (CDC) have different collations. CDC uses interim storage to populate side tables. If a table has CHAR or VARCHAR columns with collations that are different from the database collation, and if those columns store non-ASCII characters (such as double-byte DBCS characters), CDC might not be able to persist the changed data consistent with the data in the base tables.
This is because the interim storage variables cannot have collations associated with them.

Consider one of the following approaches to ensure that captured change data is consistent with the base tables:

- Use the NCHAR or NVARCHAR data type for columns containing non-ASCII data.
- Or, use the same collation for the columns and for the database.

For example, if you have a database that uses a collation of SQL_Latin1_General_CP1_CI_AS, consider the following table:

    CREATE TABLE T1(
        C1 INT PRIMARY KEY,
        C2 VARCHAR(10) collate Chinese_PRC_CI_AI)

CDC might fail to capture the binary data for column C2, because its collation is different (Chinese_PRC_CI_AI). Use NVARCHAR to avoid this problem:

    CREATE TABLE T1(
        C1 INT PRIMARY KEY,
        C2 NVARCHAR(10) collate Chinese_PRC_CI_AI --Unicode data type, CDC works well with this data type
    )

Similar articles about new or updated articles

This section lists very similar articles for recently updated articles in other subject areas, within our public GitHub.com repository: MicrosoftDocs/sql-docs.
Subject areas that do have new or recently updated articles

- New + Updated (11+6): Advanced Analytics for SQL docs
- New + Updated (18+0): Analysis Services for SQL docs
- New + Updated (218+14): Connect to SQL docs
- New + Updated (14+0): Database Engine for SQL docs
- New + Updated (3+2): Integration Services for SQL docs
- New + Updated (3+3): Linux for SQL docs
- New + Updated (7+10): Relational Databases for SQL docs
- New + Updated (0+2): Reporting Services for SQL docs
- New + Updated (1+3): SQL Operations Studio docs
- New + Updated (2+3): Microsoft SQL Server docs
- New + Updated (1+1): SQL Server Data Tools (SSDT) docs
- New + Updated (5+2): SQL Server Management Studio (SSMS) docs
- New + Updated (0+2): Transact-SQL docs
- New + Updated (1+1): Tools for SQL docs

Subject areas that do not have any new or recently updated articles

- New + Updated (0+0): Analytics Platform System for SQL docs
- New + Updated (0+0): Data Quality Services for SQL docs
- New + Updated (0+0): Data Mining Extensions (DMX) for SQL docs
- New + Updated (0+0): Master Data Services (MDS) for SQL docs
- New + Updated (0+0): Multidimensional Expressions (MDX) for SQL docs
- New + Updated (0+0): ODBC (Open Database Connectivity) for SQL docs
- New + Updated (0+0): PowerShell for SQL docs
- New + Updated (0+0): Samples for SQL docs
- New + Updated (0+0): SQL Server Migration Assistant (SSMA) docs
- New + Updated (0+0): XQuery for SQL docs

SQL Server Guides
5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

The following guides are available. They discuss general concepts and apply to all versions of SQL Server, unless stated otherwise in the respective guide.
- Always On Availability Groups Troubleshooting and Monitoring Guide
- Index Architecture and Design Guide
- Memory Management Architecture Guide
- Pages and Extents Architecture Guide
- Post-migration Validation and Optimization Guide
- Query Processing Architecture Guide
- SQL Server Transaction Locking and Row Versioning Guide
- SQL Server Transaction Log Architecture and Management Guide
- Thread and Task Architecture Guide

SQL Server Index Architecture and Design Guide
5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

Poorly designed indexes and a lack of indexes are primary sources of database application bottlenecks. Designing efficient indexes is paramount to achieving good database and application performance. This SQL Server index design guide contains information on index architecture, and best practices to help you design effective indexes to meet the needs of your application.

This guide assumes the reader has a general understanding of the index types available in SQL Server. For a general description of index types, see Index Types.

This guide covers the following types of indexes:

- Clustered
- Nonclustered
- Unique
- Filtered
- Columnstore
- Hash
- Memory-Optimized Nonclustered

For information about XML indexes, see XML Indexes Overview.
For information about Spatial indexes, see Spatial Indexes Overview.
For information about Full-text indexes, see Populate Full-Text Indexes.

Index Design Basics

An index is an on-disk or in-memory structure associated with a table or view that speeds retrieval of rows from the table or view. An index contains keys built from one or more columns in the table or view. For on-disk indexes, these keys are stored in a structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and efficiently.
An index stores data logically organized as a table with rows and columns, and physically stored in a row-wise data format called rowstore [1], or stored in a column-wise data format called columnstore.

The selection of the right indexes for a database and its workload is a complex balancing act between query speed and update cost. Narrow indexes, or indexes with few columns in the index key, require less disk space and maintenance overhead. Wide indexes, on the other hand, cover more queries. You may have to experiment with several different designs before finding the most efficient index. Indexes can be added, modified, and dropped without affecting the database schema or application design. Therefore, you should not hesitate to experiment with different indexes.

The query optimizer in SQL Server reliably chooses the most effective index in the vast majority of cases. Your overall index design strategy should provide a variety of indexes for the query optimizer to choose from and trust it to make the right decision. This reduces analysis time and produces good performance over a variety of situations. To see which indexes the query optimizer uses for a specific query, in SQL Server Management Studio, on the Query menu, select Include Actual Execution Plan.

Do not always equate index usage with good performance, and good performance with efficient index use. If using an index always helped produce the best performance, the job of the query optimizer would be simple. In reality, an incorrect index choice can cause less than optimal performance. Therefore, the task of the query optimizer is to select an index, or combination of indexes, only when it will improve performance, and to avoid indexed retrieval when it will hinder performance.

[1] Rowstore has been the traditional way to store relational table data. In SQL Server, rowstore refers to a table where the underlying data storage format is a heap, a B-tree (clustered index), or a memory-optimized table.
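The Include Actual Execution Plan option described in Index Design Basics also has a Transact-SQL counterpart. The following is a minimal sketch, assuming the AdventureWorks sample database is installed; the query is only an illustration and is not part of the original guide.

```sql
-- Return the actual execution plan (as XML) alongside the query results,
-- so you can check which index the query optimizer chose.
SET STATISTICS XML ON;

-- Illustrative query against the AdventureWorks sample schema (assumed to exist).
SELECT ProductID, Name
FROM Production.Product
WHERE ProductID BETWEEN 980 AND 999;

SET STATISTICS XML OFF;
```

The plan is returned as an extra result set; in SQL Server Management Studio, selecting the XML value opens the graphical plan.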
Index Design Tasks

The following tasks make up our recommended strategy for designing indexes:

1. Understand the characteristics of the database itself.
   For example, is it an online transaction processing (OLTP) database with frequent data modifications that must sustain a high throughput? Starting with SQL Server 2014 (12.x), memory-optimized tables and indexes are especially appropriate for this scenario, because they provide a latch-free design. For more information, see Indexes for Memory-Optimized Tables, or Nonclustered Index for Memory-Optimized Tables Design Guidelines and Hash Index for Memory-Optimized Tables Design Guidelines in this guide.
   Or is it a Decision Support System (DSS) or data warehousing (OLAP) database that must process very large data sets quickly? Starting with SQL Server 2012 (11.x), columnstore indexes are especially appropriate for typical data warehousing data sets. Columnstore indexes can transform the data warehousing experience for users by enabling faster performance for common data warehousing queries such as filtering, aggregating, grouping, and star-join queries. For more information, see Columnstore Indexes overview, or Columnstore Index Design Guidelines in this guide.
2. Understand the characteristics of the most frequently used queries. For example, knowing that a frequently used query joins two or more tables will help you determine the best type of indexes to use.
3. Understand the characteristics of the columns used in the queries. For example, an index is ideal for columns that have an integer data type and are also unique or nonnull columns. For columns that have well-defined subsets of data, you can use a filtered index in SQL Server 2008 and higher versions. For more information, see Filtered Index Design Guidelines in this guide.
4. Determine which index options might enhance performance when the index is created or maintained.
   For example, creating a clustered index on an existing large table would benefit from the ONLINE index option. The ONLINE option allows for concurrent activity on the underlying data to continue while the index is being created or rebuilt. For more information, see Set Index Options.
5. Determine the optimal storage location for the index. A nonclustered index can be stored in the same filegroup as the underlying table, or on a different filegroup. The storage location of indexes can improve query performance by increasing disk I/O performance. For example, storing a nonclustered index on a filegroup that is on a different disk than the table filegroup can improve performance because multiple disks can be read at the same time.
   Alternatively, clustered and nonclustered indexes can use a partition scheme across multiple filegroups. Partitioning makes large tables or indexes more manageable by letting you access or manage subsets of data quickly and efficiently, while maintaining the integrity of the overall collection. For more information, see Partitioned Tables and Indexes. When you consider partitioning, determine whether the index should be aligned, that is, partitioned in essentially the same manner as the table, or partitioned independently.

General Index Design Guidelines

Experienced database administrators can design a good set of indexes, but this task is very complex, time-consuming, and error-prone even for moderately complex databases and workloads. Understanding the characteristics of your database, queries, and data columns can help you design optimal indexes.

Database Considerations

When you design an index, consider the following database guidelines:

- Large numbers of indexes on a table affect the performance of INSERT, UPDATE, DELETE, and MERGE statements because all indexes must be adjusted appropriately as data in the table changes.
  For example, if a column is used in several indexes and you execute an UPDATE statement that modifies that column's data, each index that contains that column must be updated as well as the column in the underlying base table (heap or clustered index). Avoid over-indexing heavily updated tables and keep indexes narrow, that is, with as few columns as possible.
- Use many indexes to improve query performance on tables with low update requirements, but large volumes of data. Large numbers of indexes can help the performance of queries that do not modify data, such as SELECT statements, because the query optimizer has more indexes to choose from to determine the fastest access method.
- Indexing small tables may not be optimal because it can take the query optimizer longer to traverse the index searching for data than to perform a simple table scan. Therefore, indexes on small tables might never be used, but must still be maintained as data in the table changes.
- Indexes on views can provide significant performance gains when the view contains aggregations, table joins, or a combination of aggregations and joins. The view does not have to be explicitly referenced in the query for the query optimizer to use it.
- Use the Database Engine Tuning Advisor to analyze your database and make index recommendations. For more information, see Database Engine Tuning Advisor.

Query Considerations

When you design an index, consider the following query guidelines:

- Create nonclustered indexes on the columns that are frequently used in predicates and join conditions in queries. However, you should avoid adding unnecessary columns. Adding too many index columns can adversely affect disk space and index maintenance performance.
- Covering indexes can improve query performance because all the data needed to meet the requirements of the query exists within the index itself.
  That is, only the index pages, and not the data pages of the table or clustered index, are required to retrieve the requested data, thereby reducing overall disk I/O. For example, a query of columns a and b on a table that has a composite index created on columns a, b, and c can retrieve the specified data from the index alone.
- Write queries that insert or modify as many rows as possible in a single statement, instead of using multiple queries to update the same rows. Using a single statement lets the Database Engine exploit optimized index maintenance.
- Evaluate the query type and how columns are used in the query. For example, a column used in an exact-match query type would be a good candidate for a nonclustered or clustered index.

Column Considerations

When you design an index, consider the following column guidelines:

- Keep the length of the index key short for clustered indexes. Additionally, clustered indexes benefit from being created on unique or nonnull columns.
- Columns that are of the ntext, text, image, varchar(max), nvarchar(max), and varbinary(max) data types cannot be specified as index key columns. However, varchar(max), nvarchar(max), varbinary(max), and xml data types can participate in a nonclustered index as nonkey index columns. For more information, see the section 'Index with Included Columns' in this guide.
- An xml data type can be a key column only in an XML index. For more information, see XML Indexes (SQL Server). SQL Server 2012 SP1 introduces a new type of XML index known as a Selective XML Index. This new index can improve querying performance over data stored as XML in SQL Server, allow for much faster indexing of large XML data workloads, and improve scalability by reducing storage costs of the index itself. For more information, see Selective XML Indexes (SXI).
- Examine column uniqueness.
  A unique index instead of a nonunique index on the same combination of columns provides additional information for the query optimizer that makes the index more useful. For more information, see Unique Index Design Guidelines in this guide.
- Examine data distribution in the column. Frequently, a long-running query is caused by indexing a column with few unique values, or by performing a join on such a column. This is a fundamental problem with the data and query, and generally cannot be resolved without identifying this situation. For example, a physical telephone directory sorted alphabetically on last name will not expedite locating a person if all people in the city are named Smith or Jones. For more information about data distribution, see Statistics.
- Consider using filtered indexes on columns that have well-defined subsets, for example sparse columns, columns with mostly NULL values, columns with categories of values, and columns with distinct ranges of values. A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce storage costs.
- Consider the order of the columns if the index will contain multiple columns. The column that is used in the WHERE clause in an equal to (=), greater than (>), less than (<), or BETWEEN search condition, or participates in a join, should be placed first. Additional columns should be ordered based on their level of distinctness, that is, from the most distinct to the least distinct.
  For example, if the index is defined as LastName, FirstName, the index will be useful when the search criterion is WHERE LastName = 'Smith' or WHERE LastName = 'Smith' AND FirstName LIKE 'J%'. However, the query optimizer would not use the index for a query that searched only on FirstName (WHERE FirstName = 'Jane').
- Consider indexing computed columns. For more information, see Indexes on Computed Columns.
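The column-order and filtered-index guidelines above can be sketched as follows. This is an illustrative example only, assuming the AdventureWorks sample database; the index names are hypothetical and do not come from the guide.

```sql
-- Hypothetical composite index; LastName is the leading key column.
CREATE NONCLUSTERED INDEX IX_Person_LastName_FirstName_Demo
    ON Person.Person (LastName, FirstName);

-- The predicate filters on the leading key column, so the index can be used for a seek.
SELECT BusinessEntityID
FROM Person.Person
WHERE LastName = 'Smith' AND FirstName LIKE 'J%';

-- The predicate filters only on the second key column,
-- so there is no leading-column value to seek on.
SELECT BusinessEntityID
FROM Person.Person
WHERE FirstName = 'Jane';

-- Hypothetical filtered index covering only the well-defined subset
-- of rows where EndDate is populated.
CREATE NONCLUSTERED INDEX IX_BillOfMaterials_WithEndDate_Demo
    ON Production.BillOfMaterials (ComponentID, StartDate)
    WHERE EndDate IS NOT NULL;
```

Comparing the actual execution plans of the two SELECT statements is a reasonable way to confirm how the optimizer treats each predicate on your own data.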
Index Characteristics

After you have determined that an index is appropriate for a query, you can select the type of index that best fits your situation. Index characteristics include the following:

- Clustered versus nonclustered
- Unique versus nonunique
- Single column versus multicolumn
- Ascending or descending order on the columns in the index
- Full-table versus filtered for nonclustered indexes
- Columnstore versus rowstore
- Hash versus nonclustered for Memory-Optimized tables

You can also customize the initial storage characteristics of the index to optimize its performance or maintenance by setting an option such as FILLFACTOR. Also, you can determine the index storage location by using filegroups or partition schemes to optimize performance.

Index Placement on Filegroups or Partition Schemes

As you develop your index design strategy, you should consider the placement of the indexes on the filegroups associated with the database. Careful selection of the filegroup or partition scheme can improve query performance.

By default, indexes are stored in the same filegroup as the base table on which the index is created. A nonpartitioned clustered index and the base table always reside in the same filegroup. However, you can do the following:

- Create nonclustered indexes on a filegroup other than the filegroup of the base table or clustered index.
- Partition clustered and nonclustered indexes to span multiple filegroups.
- Move a table from one filegroup to another by dropping the clustered index and specifying a new filegroup or partition scheme in the MOVE TO clause of the DROP INDEX statement or by using the CREATE INDEX statement with the DROP_EXISTING clause.

By creating the nonclustered index on a different filegroup, you can achieve performance gains if the filegroups are using different physical drives with their own controllers. Data and index information can then be read in parallel by the multiple disk heads.
For example, if Table_A on filegroup f1 and Index_A on filegroup f2 are both being used by the same query, performance gains can be achieved because both filegroups are being fully used without contention. However, if Table_A is scanned by the query but Index_A is not referenced, only filegroup f1 is used. This creates no performance gain.

Because you cannot predict what type of access will occur and when it will occur, it could be a better decision to spread your tables and indexes across all filegroups. This would guarantee that all disks are being accessed because all data and indexes are spread evenly across all disks, regardless of which way the data is accessed. This is also a simpler approach for system administrators.

Partitions across multiple Filegroups

You can also consider partitioning clustered and nonclustered indexes across multiple filegroups. Partitioned indexes are partitioned horizontally, or by row, based on a partition function. The partition function defines how each row is mapped to a set of partitions based on the values of certain columns, called partitioning columns. A partition scheme specifies the mapping of the partitions to a set of filegroups.

Partitioning an index can provide the following benefits:

- Provide scalable systems that make large indexes more manageable. OLTP systems, for example, can implement partition-aware applications that deal with large indexes.
- Make queries run faster and more efficiently. When queries access several partitions of an index, the query optimizer can process individual partitions at the same time and exclude partitions that are not affected by the query.

For more information, see Partitioned Tables and Indexes.

Index Sort Order Design Guidelines

When defining indexes, you should consider whether the data for the index key column should be stored in ascending or descending order. Ascending is the default and maintains compatibility with earlier versions of SQL Server.
The syntax of the CREATE INDEX, CREATE TABLE, and ALTER TABLE statements supports the keywords ASC (ascending) and DESC (descending) on individual columns in indexes and constraints.

Specifying the order in which key values are stored in an index is useful when queries referencing the table have ORDER BY clauses that specify different directions for the key column or columns in that index. In these cases, the index can remove the need for a SORT operator in the query plan, which makes the query more efficient. For example, the buyers in the Adventure Works Cycles purchasing department have to evaluate the quality of products they purchase from vendors. The buyers are most interested in finding products sent by these vendors with a high rejection rate. As shown in the following query, retrieving the data to meet these criteria requires the RejectedQty column in the Purchasing.PurchaseOrderDetail table to be sorted in descending order (large to small) and the ProductID column to be sorted in ascending order (small to large).

    SELECT RejectedQty, ((RejectedQty/OrderQty)*100) AS RejectionRate,
        ProductID, DueDate
    FROM Purchasing.PurchaseOrderDetail
    ORDER BY RejectedQty DESC, ProductID ASC;

The execution plan for this query shows that the query optimizer used a SORT operator to return the result set in the order specified by the ORDER BY clause.

If an index is created with key columns that match those in the ORDER BY clause in the query, the SORT operator can be eliminated in the query plan and the query plan is more efficient.

    CREATE NONCLUSTERED INDEX IX_PurchaseOrderDetail_RejectedQty
    ON Purchasing.PurchaseOrderDetail
        (RejectedQty DESC, ProductID ASC, DueDate, OrderQty);

After the query is executed again, the execution plan shows that the SORT operator has been eliminated and the newly created nonclustered index is used.

The Database Engine can move equally efficiently in either direction.
An index defined as (RejectedQty DESC, ProductID ASC) can still be used for a query in which the sort direction of the columns in the ORDER BY clause is reversed. For example, a query with the ORDER BY clause ORDER BY RejectedQty ASC, ProductID DESC can use the index.

Sort order can be specified only for key columns. The sys.index_columns catalog view and the INDEXKEY_PROPERTY function report whether an index column is stored in ascending or descending order.

Metadata

Use these metadata views to see attributes of indexes. More architectural information is embedded in some of these views.

NOTE
For columnstore indexes, all columns are stored in the metadata as included columns. The columnstore index does not have key columns.

- sys.indexes (Transact-SQL)
- sys.index_columns (Transact-SQL)
- sys.partitions (Transact-SQL)
- sys.internal_partitions (Transact-SQL)
- sys.dm_db_index_operational_stats (Transact-SQL)
- sys.dm_db_index_physical_stats (Transact-SQL)
- sys.column_store_segments (Transact-SQL)
- sys.column_store_dictionaries (Transact-SQL)
- sys.column_store_row_groups (Transact-SQL)
- sys.dm_db_column_store_row_group_operational_stats (Transact-SQL)
- sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)
- sys.dm_column_store_object_pool (Transact-SQL)
- sys.dm_db_xtp_hash_index_stats (Transact-SQL)
- sys.dm_db_xtp_index_stats (Transact-SQL)
- sys.dm_db_xtp_object_stats (Transact-SQL)
- sys.dm_db_xtp_nonclustered_index_stats (Transact-SQL)
- sys.dm_db_xtp_table_memory_stats (Transact-SQL)
- sys.hash_indexes (Transact-SQL)
- sys.memory_optimized_tables_internal_attributes (Transact-SQL)

Clustered Index Design Guidelines

Clustered indexes sort and store the data rows in the table based on their key values. There can only be one clustered index per table, because the data rows themselves can only be sorted in one order.
With few exceptions, every table should have a clustered index defined on the column, or columns, that offer the following:

- Can be used for frequently used queries.
- Provide a high degree of uniqueness.
- Can be used in range queries.

NOTE
When you create a PRIMARY KEY constraint, a unique index on the column, or columns, is automatically created. By default, this index is clustered; however, you can specify a nonclustered index when you create the constraint.

If the clustered index is not created with the UNIQUE property, the Database Engine automatically adds a 4-byte uniqueifier column to the table. When it is required, the Database Engine automatically adds a uniqueifier value to a row to make each key unique. This column and its values are used internally and cannot be seen or accessed by users.

Clustered Index Architecture

In SQL Server, indexes are organized as B-trees. Each page in an index B-tree is called an index node. The top node of the B-tree is called the root node. The bottom nodes in the index are called the leaf nodes. Any index levels between the root and the leaf nodes are collectively known as intermediate levels. In a clustered index, the leaf nodes contain the data pages of the underlying table. The root and intermediate level nodes contain index pages holding index rows. Each index row contains a key value and a pointer to either an intermediate level page in the B-tree, or a data row in the leaf level of the index. The pages in each level of the index are linked in a doubly-linked list.

Clustered indexes have one row in sys.partitions, with index_id = 1 for each partition used by the index. By default, a clustered index has a single partition. When a clustered index has multiple partitions, each partition has a B-tree structure that contains the data for that specific partition.
For example, if a clustered index has four partitions, there are four B-tree structures, one in each partition.

Depending on the data types in the clustered index, each clustered index structure will have one or more allocation units in which to store and manage the data for a specific partition. At a minimum, each clustered index will have one IN_ROW_DATA allocation unit per partition. The clustered index will also have one LOB_DATA allocation unit per partition if it contains large object (LOB) columns. It will also have one ROW_OVERFLOW_DATA allocation unit per partition if it contains variable length columns that exceed the 8,060 byte row size limit.

The pages in the data chain and the rows in them are ordered on the value of the clustered index key. All inserts are made at the point where the key value in the inserted row fits in the ordering sequence among existing rows.

This illustration shows the structure of a clustered index in a single partition.

Query Considerations

Before you create clustered indexes, understand how your data will be accessed. Consider using a clustered index for queries that do the following:
- Return a range of values by using operators such as BETWEEN, >, >=, <, and <=. After the row with the first value is found by using the clustered index, rows with subsequent indexed values are guaranteed to be physically adjacent. For example, if a query retrieves records between a range of sales order numbers, a clustered index on the column SalesOrderNumber can quickly locate the row that contains the starting sales order number, and then retrieve all successive rows in the table until the last sales order number is reached.
- Return large result sets.
- Use JOIN clauses; typically these are foreign key columns.
- Use ORDER BY or GROUP BY clauses. An index on the columns specified in the ORDER BY or GROUP BY clause may remove the need for the Database Engine to sort the data, because the rows are already sorted. This improves query performance.
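As a rough illustration of the range-query case in the list above, the following sketch creates a clustered index on an order-number column and then runs a BETWEEN query against it. The table dbo.SalesOrderDemo, its columns, and the literal order numbers are hypothetical names and values chosen for this example; they are not objects from the sample database.

```sql
-- Hypothetical demo table; all names here are assumptions for illustration.
CREATE TABLE dbo.SalesOrderDemo
(
    SalesOrderNumber int           NOT NULL,
    OrderDate        date          NOT NULL,
    TotalDue         decimal(19,4) NOT NULL
);
GO

-- Cluster the table on the column that range queries filter on.
-- UNIQUE avoids the internal uniqueifier described earlier.
CREATE UNIQUE CLUSTERED INDEX CIX_SalesOrderDemo_SalesOrderNumber
    ON dbo.SalesOrderDemo (SalesOrderNumber);
GO

-- Range query: the engine can seek to the first qualifying key and
-- then read the following rows in key order, with no separate sort.
SELECT SalesOrderNumber, OrderDate, TotalDue
FROM dbo.SalesOrderDemo
WHERE SalesOrderNumber BETWEEN 43659 AND 43700
ORDER BY SalesOrderNumber;
```

Displaying the execution plan for the SELECT is one way to check whether the optimizer actually chose a clustered index seek for the range.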
Column Considerations

Generally, you should define the clustered index key with as few columns as possible. Consider columns that have one or more of the following attributes:

Are unique or contain many distinct values: For example, an employee ID uniquely identifies employees. A clustered index or PRIMARY KEY constraint on the EmployeeID column would improve the performance of queries that search for employee information based on the employee ID number. Alternatively, a clustered index could be created on LastName, FirstName, MiddleName because employee records are frequently grouped and queried in this way, and the combination of these columns would still provide a high degree of difference.

TIP: If not specified differently, when creating a PRIMARY KEY constraint, SQL Server creates a clustered index to support that constraint. Although a uniqueidentifier can be used to enforce uniqueness as a PRIMARY KEY, it is not an efficient clustering key. If using a uniqueidentifier as PRIMARY KEY, the recommendation is to create it as a nonclustered index, and use another column such as an IDENTITY to create the clustered index.

Are accessed sequentially: For example, a product ID uniquely identifies products in the Production.Product table in the AdventureWorks2012 database. Queries in which a sequential search is specified, such as WHERE ProductID BETWEEN 980 AND 999, would benefit from a clustered index on ProductID. This is because the rows would be stored in sorted order on that key column.

Defined as IDENTITY.

Used frequently to sort the data retrieved from a table: It can be a good idea to cluster, that is physically sort, the table on that column to save the cost of a sort operation every time the column is queried.

Clustered indexes are not a good choice for the following attributes:

Columns that undergo frequent changes: This causes the whole row to move, because the Database Engine must keep the data values of a row in physical order.
This is an important consideration in high-volume transaction processing systems in which data is typically volatile.

Wide keys: Wide keys are a composite of several columns or several large-size columns. The key values from the clustered index are used by all nonclustered indexes as lookup keys. Any nonclustered indexes defined on the same table will be significantly larger because the nonclustered index entries contain the clustering key and also the key columns defined for that nonclustered index.

Nonclustered Index Design Guidelines

A nonclustered index contains the index key values and row locators that point to the storage location of the table data. You can create multiple nonclustered indexes on a table or indexed view. Generally, nonclustered indexes should be designed to improve the performance of frequently used queries that are not covered by the clustered index.

Similar to the way you use an index in a book, the query optimizer searches for a data value by searching the nonclustered index to find the location of the data value in the table and then retrieves the data directly from that location. This makes nonclustered indexes the optimal choice for exact match queries because the index contains entries describing the exact location in the table of the data values being searched for in the queries. For example, to query the HumanResources.Employee table for all employees that report to a specific manager, the query optimizer might use the nonclustered index IX_Employee_ManagerID; this has ManagerID as its key column. The query optimizer can quickly find all entries in the index that match the specified ManagerID. Each index entry points to the exact page and row in the table, or clustered index, in which the corresponding data can be found. After the query optimizer finds all entries in the index, it can go directly to the exact page and row to retrieve the data.
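The exact-match lookup described above might be sketched as follows. To keep the example self-contained it uses a hypothetical table, dbo.EmployeeDemo, with an assumed ManagerID column rather than the HumanResources.Employee table; the index name and the manager ID value are likewise assumptions.

```sql
-- Hypothetical demo table; names and values are assumptions for illustration.
CREATE TABLE dbo.EmployeeDemo
(
    EmployeeID int          NOT NULL
        CONSTRAINT PK_EmployeeDemo PRIMARY KEY CLUSTERED,
    ManagerID  int          NULL,
    LastName   nvarchar(50) NOT NULL
);
GO

-- Nonclustered index with ManagerID as its only key column.
-- Because the table has a clustered index, each index row also
-- carries the clustering key (EmployeeID) as its row locator.
CREATE NONCLUSTERED INDEX IX_EmployeeDemo_ManagerID
    ON dbo.EmployeeDemo (ManagerID);
GO

-- Exact-match query that the optimizer may answer with an index seek.
SELECT EmployeeID, ManagerID
FROM dbo.EmployeeDemo
WHERE ManagerID = 16;
```

Adding LastName to the SELECT list would require the engine to follow the row locator into the clustered index for each match, which is the lookup step the paragraph above describes.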
Nonclustered Index Architecture

Nonclustered indexes have the same B-tree structure as clustered indexes, except for the following significant differences:
- The data rows of the underlying table are not sorted and stored in order based on their nonclustered keys.
- The leaf layer of a nonclustered index is made up of index pages instead of data pages.

The row locators in nonclustered index rows are either a pointer to a row or are a clustered index key for a row, as described in the following:
- If the table is a heap, which means it does not have a clustered index, the row locator is a pointer to the row. The pointer is built from the file identifier (ID), page number, and number of the row on the page. The whole pointer is known as a Row ID (RID).
- If the table has a clustered index, or the index is on an indexed view, the row locator is the clustered index key for the row.

Nonclustered indexes have one row in sys.partitions with index_id > 1 for each partition used by the index. By default, a nonclustered index has a single partition. When a nonclustered index has multiple partitions, each partition has a B-tree structure that contains the index rows for that specific partition. For example, if a nonclustered index has four partitions, there are four B-tree structures, with one in each partition.

Depending on the data types in the nonclustered index, each nonclustered index structure will have one or more allocation units in which to store and manage the data for a specific partition. At a minimum, each nonclustered index will have one IN_ROW_DATA allocation unit per partition that stores the index B-tree pages. The nonclustered index will also have one LOB_DATA allocation unit per partition if it contains large object (LOB) columns. Additionally, it will have one ROW_OVERFLOW_DATA allocation unit per partition if it contains variable length columns that exceed the 8,060 byte row size limit.
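One way to see the partitions and allocation units described above is to join sys.indexes, sys.partitions, and sys.allocation_units. The query below is a minimal sketch that assumes the Production.Product table from the AdventureWorks2012 sample database is available; substitute any table name.

```sql
-- Lists each index on a table with its partitions and allocation units.
-- Assumes Production.Product exists (AdventureWorks2012 sample database).
SELECT  i.name             AS index_name,
        i.index_id,        -- 0 = heap, 1 = clustered, > 1 = nonclustered
        p.partition_number,
        au.type_desc       AS allocation_unit_type,
        au.total_pages
FROM sys.indexes AS i
JOIN sys.partitions AS p
      ON p.object_id = i.object_id
     AND p.index_id  = i.index_id
JOIN sys.allocation_units AS au
      -- IN_ROW_DATA (1) and ROW_OVERFLOW_DATA (3) link through hobt_id;
      -- LOB_DATA (2) links through partition_id.
      ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
      OR (au.type = 2       AND au.container_id = p.partition_id)
WHERE i.object_id = OBJECT_ID(N'Production.Product')
ORDER BY i.index_id, p.partition_number, au.type;
```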
The following illustration shows the structure of a nonclustered index in a single partition.

Database Considerations

Consider the characteristics of the database when designing nonclustered indexes.

Databases or tables with low update requirements, but large volumes of data can benefit from many nonclustered indexes to improve query performance. Consider creating filtered indexes for well-defined subsets of data to improve query performance, reduce index storage costs, and reduce index maintenance costs compared with full-table nonclustered indexes. Decision Support System applications and databases that contain primarily read-only data can benefit from many nonclustered indexes. The query optimizer has more indexes to choose from to determine the fastest access method, and the low update characteristics of the database mean index maintenance will not impede performance.

Online Transaction Processing applications and databases that contain heavily updated tables should avoid over-indexing. Additionally, indexes should be narrow, that is, with as few columns as possible. Large numbers of indexes on a table affect the performance of INSERT, UPDATE, DELETE, and MERGE statements because all indexes must be adjusted appropriately as data in the table changes.

Query Considerations

Before you create nonclustered indexes, you should understand how your data will be accessed. Consider using a nonclustered index for queries that have the following attributes:
- Use JOIN or GROUP BY clauses. Create multiple nonclustered indexes on columns involved in join and grouping operations, and a clustered index on any foreign key columns.
- Do not return large result sets. Create filtered indexes to cover queries that return a well-defined subset of rows from a large table.
- Contain columns frequently involved in search conditions of a query, such as a WHERE clause, that return exact matches.
Column Considerations

Consider columns that have one or more of these attributes:
- Cover the query. Performance gains are achieved when the index contains all columns in the query. The query optimizer can locate all the column values within the index; table or clustered index data is not accessed, resulting in fewer disk I/O operations. Use an index with included columns to add covering columns instead of creating a wide index key. If the table has a clustered index, the column or columns defined in the clustered index are automatically appended to the end of each nonclustered index on the table. This can produce a covered query without specifying the clustered index columns in the definition of the nonclustered index. For example, if a table has a clustered index on column C, a nonclustered index on columns B and A will have as its key values columns B, A, and C.
- Have many distinct values, such as a combination of last name and first name, if a clustered index is used for other columns. If there are very few distinct values, such as only 1 and 0, most queries will not use the index because a table scan is generally more efficient. For this type of data, consider creating a filtered index on a distinct value that only occurs in a small number of rows. For example, if most of the values are 0, the query optimizer might use a filtered index for the data rows that contain 1.

Use Included Columns to Extend Nonclustered Indexes

You can extend the functionality of nonclustered indexes by adding nonkey columns to the leaf level of the nonclustered index. By including nonkey columns, you can create nonclustered indexes that cover more queries. This is because the nonkey columns have the following benefits:
- They can be data types not allowed as index key columns.
- They are not considered by the Database Engine when calculating the number of index key columns or index key size.
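The first benefit in the list above can be illustrated with a column type that is not allowed as an index key. In this minimal sketch, an nvarchar(max) column is added to a nonclustered index as an included column. The table dbo.DocumentDemo, its columns, and the search value are hypothetical names chosen for the example.

```sql
-- Hypothetical demo table; names are assumptions for illustration.
CREATE TABLE dbo.DocumentDemo
(
    DocumentID int           NOT NULL
        CONSTRAINT PK_DocumentDemo PRIMARY KEY CLUSTERED,
    Title      nvarchar(50)  NOT NULL,
    Summary    nvarchar(max) NULL
);
GO

-- Summary is nvarchar(max), which cannot be an index key column,
-- but it can be stored at the leaf level through INCLUDE.
CREATE NONCLUSTERED INDEX IX_DocumentDemo_Title
    ON dbo.DocumentDemo (Title)
    INCLUDE (Summary);
GO

-- Both referenced columns are in the index, so the index covers this query.
SELECT Title, Summary
FROM dbo.DocumentDemo
WHERE Title = N'Installation Guide';
```

Listing Summary in the key column list instead of the INCLUDE clause would cause the CREATE INDEX statement to fail, because the data type is not valid for a key column.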
An index with included nonkey columns can significantly improve query performance when all columns in the query are included in the index either as key or nonkey columns. Performance gains are achieved because the query optimizer can locate all the column values within the index; table or clustered index data is not accessed, resulting in fewer disk I/O operations.

NOTE: When an index contains all the columns referenced by the query it is typically referred to as covering the query.

While key columns are stored at all levels of the index, nonkey columns are stored only at the leaf level.

Using Included Columns to Avoid Size Limits

You can include nonkey columns in a nonclustered index to avoid exceeding the current index size limitations of a maximum of 16 key columns and a maximum index key size of 900 bytes. The Database Engine does not consider nonkey columns when calculating the number of index key columns or index key size.

For example, assume that you want to index the following columns in the Document table:

    Title nvarchar(50)
    Revision nchar(5)
    FileName nvarchar(400)

Because the nchar and nvarchar data types require 2 bytes for each character, an index that contains these three columns would exceed the 900 byte size limitation by 10 bytes (455 * 2 = 910). By using the INCLUDE clause of the CREATE INDEX statement, the index key could be defined as (Title, Revision) and FileName defined as a nonkey column. In this way, the index key size would be 110 bytes (55 * 2), and the index would still contain all the required columns. The following statement creates such an index.

    CREATE INDEX IX_Document_Title
    ON Production.Document (Title, Revision)
    INCLUDE (FileName);

Index with Included Columns Guidelines

When you design nonclustered indexes with included columns, consider the following guidelines:
- Nonkey columns are defined in the INCLUDE clause of the CREATE INDEX statement.
- Nonkey columns can only be defined on nonclustered indexes on tables or indexed views.
- All data types are allowed except text, ntext, and image.
- Computed columns that are deterministic and either precise or imprecise can be included columns. For more information, see Indexes on Computed Columns.
- As with key columns, computed columns derived from image, ntext, and text data types can be nonkey (included) columns as long as the computed column data type is allowed as a nonkey index column.
- Column names cannot be specified in both the INCLUDE list and in the key column list.
- Column names cannot be repeated in the INCLUDE list.

Column Size Guidelines

- At least one key column must be defined. The maximum number of nonkey columns is 1023 columns. This is the maximum number of table columns minus 1.
- Index key columns, excluding nonkeys, must follow the existing index size restrictions of 16 key columns maximum, and a total index key size of 900 bytes.
- The total size of all nonkey columns is limited only by the size of the columns specified in the INCLUDE clause; for example, varchar(max) columns are limited to 2 GB.

Column Modification Guidelines

When you modify a table column that has been defined as an included column, the following restrictions apply:
- Nonkey columns cannot be dropped from the table unless the index is dropped first.
- Nonkey columns cannot be changed, except to do the following:
  - Change the nullability of the column from NOT NULL to NULL.
  - Increase the length of varchar, nvarchar, or varbinary columns.

NOTE: These column modification restrictions also apply to index key columns.

Design Recommendations

Redesign nonclustered indexes with a large index key size so that only columns used for searching and lookups are key columns. Make all other columns that cover the query included nonkey columns.
In this way, you will have all columns needed to cover the query, but the index key itself is small and efficient.

For example, assume that you want to design an index to cover the following query.

    SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
    FROM Person.Address
    WHERE PostalCode BETWEEN N'98000' AND N'99999';

To cover the query, each column must be defined in the index. Although you could define all columns as key columns, the key size would be 334 bytes. Because the only column actually used as search criteria is the PostalCode column, having a length of 30 bytes, a better index design would define PostalCode as the key column and include all other columns as nonkey columns.

The following statement creates an index with included columns to cover the query.

    CREATE INDEX IX_Address_PostalCode
    ON Person.Address (PostalCode)
    INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);

Performance Considerations

Avoid adding unnecessary columns. Adding too many index columns, key or nonkey, can have the following performance implications:
- Fewer index rows will fit on a page. This could create I/O increases and reduced cache efficiency.
- More disk space will be required to store the index. In particular, adding varchar(max), nvarchar(max), varbinary(max), or xml data types as nonkey index columns may significantly increase disk space requirements. This is because the column values are copied into the index leaf level. Therefore, they reside in both the index and the base table.
- Index maintenance may increase the time that it takes to perform modifications, inserts, updates, or deletes, to the underlying table or indexed view.

You will have to determine whether the gains in query performance outweigh the effect on performance during data modification and the additional disk space requirements.
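When weighing the disk space cost mentioned above, it can help to compare the size of each index on the table before and after adding included columns. The query below is one possible sketch, using the sys.dm_db_partition_stats dynamic management view and assuming the Person.Address table from the sample database. Page counts are multiplied by 8 because a SQL Server page is 8 KB.

```sql
-- Approximate space used by each index on a table.
-- Assumes Person.Address exists (AdventureWorks sample database).
SELECT  i.name                     AS index_name,
        i.type_desc,
        SUM(ps.used_page_count)     AS used_pages,
        SUM(ps.used_page_count) * 8 AS used_kb,    -- 8 KB per page
        SUM(ps.row_count)           AS row_count
FROM sys.indexes AS i
JOIN sys.dm_db_partition_stats AS ps
      ON ps.object_id = i.object_id
     AND ps.index_id  = i.index_id
WHERE i.object_id = OBJECT_ID(N'Person.Address')
GROUP BY i.name, i.type_desc
ORDER BY used_pages DESC;
```

Running the query before and after creating an index such as IX_Address_PostalCode gives a rough measure of the storage that the included columns add.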
Unique Index Design Guidelines

A unique index guarantees that the index key contains no duplicate values and therefore every row in the table is in some way unique. Specifying a unique index makes sense only when uniqueness is a characteristic of the data itself. For example, if you want to make sure that the values in the NationalIDNumber column in the HumanResources.Employee table are unique, when the primary key is EmployeeID, create a UNIQUE constraint on the NationalIDNumber column. If the user tries to enter the same value in that column for more than one employee, an error message is displayed and the duplicate value is not entered.

With multicolumn unique indexes, the index guarantees that each combination of values in the index key is unique. For example, if a unique index is created on a combination of LastName, FirstName, and MiddleName columns, no two rows in the table could have the same combination of values for these columns.

Both clustered and nonclustered indexes can be unique. Provided that the data in the column is unique, you can create both a unique clustered index and multiple unique nonclustered indexes on the same table.

The benefits of unique indexes include the following:
- Data integrity of the defined columns is ensured.
- Additional information helpful to the query optimizer is provided.

Creating a PRIMARY KEY or UNIQUE constraint automatically creates a unique index on the specified columns. There are no significant differences between creating a UNIQUE constraint and creating a unique index independent of a constraint. Data validation occurs in the same manner and the query optimizer does not differentiate between a unique index created by a constraint or manually created. However, you should create a UNIQUE or PRIMARY KEY constraint on the column when data integrity is the objective. By doing this the objective of the index will be clear.
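The NationalIDNumber scenario above could be sketched as follows. The example uses a hypothetical table, dbo.EmployeeUniqueDemo, so that it can be run without modifying the sample database; the constraint names and inserted values are assumptions for illustration.

```sql
-- Hypothetical demo table; names and values are assumptions for illustration.
CREATE TABLE dbo.EmployeeUniqueDemo
(
    EmployeeID       int          NOT NULL
        CONSTRAINT PK_EmployeeUniqueDemo PRIMARY KEY,
    NationalIDNumber nvarchar(15) NOT NULL
);
GO

-- The UNIQUE constraint creates a unique index behind the scenes
-- and documents that data integrity is the objective.
ALTER TABLE dbo.EmployeeUniqueDemo
    ADD CONSTRAINT UQ_EmployeeUniqueDemo_NationalIDNumber
    UNIQUE (NationalIDNumber);
GO

INSERT INTO dbo.EmployeeUniqueDemo (EmployeeID, NationalIDNumber)
VALUES (1, N'295847284');
GO

-- This statement is expected to fail with a duplicate key error,
-- because the same NationalIDNumber is already present.
INSERT INTO dbo.EmployeeUniqueDemo (EmployeeID, NationalIDNumber)
VALUES (2, N'295847284');
```

A statement such as CREATE UNIQUE NONCLUSTERED INDEX on the same column would enforce the same rule; the constraint form is shown here because it makes the intent explicit.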
Considerations

- A unique index, UNIQUE constraint, or PRIMARY KEY constraint cannot be created if duplicate key values exist in the data.
- If the data is unique and you want uniqueness enforced, creating a unique index instead of a nonunique index on the same combination of columns provides additional information for the query optimizer that can produce more efficient execution plans. Creating a unique index (preferably by creating a UNIQUE constraint) is recommended in this case.
- A unique nonclustered index can contain included nonkey columns. For more information, see Index with Included Columns.

Filtered Index Design Guidelines

A filtered index is an optimized nonclustered index, especially suited to cover queries that select from a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed filtered index can improve query performance, reduce index maintenance costs, and reduce index storage costs compared with full-table indexes.

Applies to: SQL Server 2008 through SQL Server 2017.

Filtered indexes can provide the following advantages over full-table indexes:

Improved query performance and plan quality: A well-designed filtered index improves query performance and execution plan quality because it is smaller than a full-table nonclustered index and has filtered statistics. The filtered statistics are more accurate than full-table statistics because they cover only the rows in the filtered index.

Reduced index maintenance costs: An index is maintained only when data manipulation language (DML) statements affect the data in the index. A filtered index reduces index maintenance costs compared with a full-table nonclustered index because it is smaller and is only maintained when the data in the index is affected. It is possible to have a large number of filtered indexes, especially when they contain data that is affected infrequently.
Similarly, if a filtered index contains only the frequently affected data, the smaller size of the index reduces the cost of updating the statistics.

Reduced index storage costs: Creating a filtered index can reduce disk storage for nonclustered indexes when a full-table index is not necessary. You can replace a full-table nonclustered index with multiple filtered indexes without significantly increasing the storage requirements.

Filtered indexes are useful when columns contain well-defined subsets of data that queries reference in SELECT statements. Examples are:
- Sparse columns that contain only a few non-NULL values.
- Heterogeneous columns that contain categories of data.
- Columns that contain ranges of values such as dollar amounts, time, and dates.
- Table partitions that are defined by simple comparison logic for column values.

Reduced maintenance costs for filtered indexes are most noticeable when the number of rows in the index is small compared with a full-table index. If the filtered index includes most of the rows in the table, it could cost more to maintain than a full-table index. In this case, you should use a full-table index instead of a filtered index.

Filtered indexes are defined on one table and only support simple comparison operators. If you need a filter expression that references multiple tables or has complex logic, you should create a view.

Design Considerations

In order to design effective filtered indexes, it is important to understand what queries your application uses and how they relate to subsets of your data. Some examples of data that have well-defined subsets are columns with mostly NULL values, columns with heterogeneous categories of values, and columns with distinct ranges of values. The following design considerations give a variety of scenarios for when a filtered index can provide advantages over full-table indexes.

TIP: The nonclustered columnstore index definition supports using a filtered condition.
To minimize the performance impact of adding a columnstore index on an OLTP table, use a filtered condition to create a nonclustered columnstore index on only the cold data of your operational workload.

Filtered Indexes for subsets of data

When a column only has a small number of relevant values for queries, you can create a filtered index on the subset of values. For example, when the values in a column are mostly NULL and the query selects only from the non-NULL values, you can create a filtered index for the non-NULL data rows. The resulting index will be smaller and cost less to maintain than a full-table nonclustered index defined on the same key columns.

For example, the AdventureWorks2012 database has a Production.BillOfMaterials table with 2679 rows. The EndDate column has only 199 rows that contain a non-NULL value and the other 2480 rows contain NULL. The following filtered index would cover queries that return the columns defined in the index and that select only rows with a non-NULL value for EndDate.

    CREATE NONCLUSTERED INDEX FIBillOfMaterialsWithEndDate
    ON Production.BillOfMaterials (ComponentID, StartDate)
    WHERE EndDate IS NOT NULL;
    GO

The filtered index FIBillOfMaterialsWithEndDate is valid for the following query. You can display the query execution plan to determine if the query optimizer used the filtered index.

    SELECT ProductAssemblyID, ComponentID, StartDate
    FROM Production.BillOfMaterials
    WHERE EndDate IS NOT NULL
        AND ComponentID = 5
        AND StartDate > '20080101';

For more information about how to create filtered indexes and how to define the filtered index predicate expression, see Create Filtered Indexes.

Filtered Indexes for heterogeneous data

When a table has heterogeneous data rows, you can create a filtered index for one or more categories of data.
For example, the products listed in the Production.Product table are each assigned to a ProductSubcategoryID, which is in turn associated with one of the product categories Bikes, Components, Clothing, or Accessories. These categories are heterogeneous because their column values in the Production.Product table are not closely correlated. For example, the columns Color, ReorderPoint, ListPrice, Weight, Class, and Style have unique characteristics for each product category. Suppose that there are frequent queries for accessories, which have subcategories between 27 and 36 inclusive. You can improve the performance of queries for accessories by creating a filtered index on the accessories subcategories as shown in the following example.

    CREATE NONCLUSTERED INDEX FIProductAccessories
    ON Production.Product (ProductSubcategoryID, ListPrice)
    INCLUDE (Name)
    WHERE ProductSubcategoryID >= 27 AND ProductSubcategoryID <= 36;

The filtered index FIProductAccessories covers the following query because the query results are contained in the index and the query plan does not include a base table lookup. For example, the query predicate expression ProductSubcategoryID = 33 is a subset of the filtered index predicate ProductSubcategoryID >= 27 AND ProductSubcategoryID <= 36, the ProductSubcategoryID and ListPrice columns in the query predicate are both key columns in the index, and Name is stored in the leaf level of the index as an included column.

    SELECT Name, ProductSubcategoryID, ListPrice
    FROM Production.Product
    WHERE ProductSubcategoryID = 33 AND ListPrice > 25.00;

Key Columns

It is a best practice to include a small number of key or included columns in a filtered index definition, and to incorporate only the columns that are necessary for the query optimizer to choose the filtered index for the query execution plan. The query optimizer can choose a filtered index for the query regardless of whether it does or does not cover the query.
However, the query optimizer is more likely to choose a filtered index if it covers the query.

In some cases, a filtered index covers the query without including the columns in the filtered index expression as key or included columns in the filtered index definition. The following guidelines explain when a column in the filtered index expression should be a key or included column in the filtered index definition. The examples refer to the filtered index FIBillOfMaterialsWithEndDate that was created previously.

A column in the filtered index expression does not need to be a key or included column in the filtered index definition if the filtered index expression is equivalent to the query predicate and the query does not return the column in the filtered index expression with the query results. For example, FIBillOfMaterialsWithEndDate covers the following query because the query predicate is equivalent to the filter expression, and EndDate is not returned with the query results. FIBillOfMaterialsWithEndDate does not need EndDate as a key or included column in the filtered index definition.

    SELECT ComponentID, StartDate
    FROM Production.BillOfMaterials
    WHERE EndDate IS NOT NULL;

A column in the filtered index expression should be a key or included column in the filtered index definition if the query predicate uses the column in a comparison that is not equivalent to the filtered index expression. For example, FIBillOfMaterialsWithEndDate is valid for the following query because it selects a subset of rows from the filtered index. However, it does not cover the following query because EndDate is used in the comparison EndDate > '20040101', which is not equivalent to the filtered index expression. The query processor cannot execute this query without looking up the values of EndDate. Therefore, EndDate should be a key or included column in the filtered index definition.
    SELECT ComponentID, StartDate
    FROM Production.BillOfMaterials
    WHERE EndDate > '20040101';

A column in the filtered index expression should be a key or included column in the filtered index definition if the column is in the query result set. For example, FIBillOfMaterialsWithEndDate does not cover the following query because it returns the EndDate column in the query results. Therefore, EndDate should be a key or included column in the filtered index definition.

    SELECT ComponentID, StartDate, EndDate
    FROM Production.BillOfMaterials
    WHERE EndDate IS NOT NULL;

The clustered index key of the table does not need to be a key or included column in the filtered index definition. The clustered index key is automatically included in all nonclustered indexes, including filtered indexes.

Data Conversion Operators in the Filter Predicate

If the comparison operator specified in the filtered index expression of the filtered index results in an implicit or explicit data conversion, an error will occur if the conversion occurs on the left side of a comparison operator. A solution is to write the filtered index expression with the data conversion operator (CAST or CONVERT) on the right side of the comparison operator.

The following example creates a table with a variety of data types.

    USE AdventureWorks2012;
    GO
    CREATE TABLE dbo.TestTable (a int, b varbinary(4));

In the following filtered index definition, column b is implicitly converted to an integer data type for the purpose of comparing it to the constant 1. This generates error message 10611 because the conversion occurs on the left hand side of the operator in the filtered predicate.
    CREATE NONCLUSTERED INDEX TestTabIndex ON dbo.TestTable (a, b)
    WHERE b = 1;

The solution is to convert the constant on the right hand side to be of the same type as column b, as seen in the following example:

    CREATE INDEX TestTabIndex ON dbo.TestTable (a, b)
    WHERE b = CONVERT(varbinary(4), 1);

Moving the data conversion from the left side to the right side of a comparison operator might change the meaning of the conversion. In the above example, when the CONVERT operator was added to the right side, the comparison changed from an integer comparison to a varbinary comparison.

Columnstore Index Design Guidelines

A columnstore index is a technology for storing, retrieving and managing data by using a columnar data format, called a columnstore. For more information, refer to Columnstore Indexes overview.

For version information, see Columnstore indexes - What's new.

Columnstore Index Architecture

Knowing these basics will make it easier to understand other columnstore articles that explain how to use them effectively.

Data storage uses columnstore and rowstore compression

When discussing columnstore indexes, we use the terms rowstore and columnstore to emphasize the format for the data storage. Columnstore indexes use both types of storage.

A columnstore is data that is logically organized as a table with rows and columns, and physically stored in a column-wise data format. A columnstore index physically stores most of the data in columnstore format. In columnstore format, the data is compressed and uncompressed as columns. There is no need to uncompress other values in each row that are not requested by the query. This makes it fast to scan an entire column of a large table.

A rowstore is data that is logically organized as a table with rows and columns, and then physically stored in a row-wise data format.
This has been the traditional way to store relational table data such as a heap or clustered B-tree index.

A columnstore index also physically stores some rows in a rowstore format called a deltastore. The deltastore, also called delta rowgroups, is a holding place for rows that are too few in number to qualify for compression into the columnstore. Each delta rowgroup is implemented as a clustered B-tree index, and the deltastore stores its rows in rowstore format.

Operations are performed on rowgroups and column segments

The columnstore index groups rows into manageable units. Each of these units is called a rowgroup. For best performance, the number of rows in a rowgroup is large enough to improve compression rates and small enough to benefit from in-memory operations.

A rowgroup is a group of rows on which the columnstore index performs management and compression operations. For example, the columnstore index performs these operations on rowgroups:
- Compresses rowgroups into the columnstore. Compression is performed on each column segment within a rowgroup.
- Merges rowgroups during an ALTER INDEX REORGANIZE operation.
- Creates new rowgroups during an ALTER INDEX REBUILD operation.
- Reports on rowgroup health and fragmentation in the dynamic management views (DMVs).

The deltastore consists of one or more rowgroups called delta rowgroups. Each delta rowgroup is a clustered B-tree index that stores small bulk loads and inserts until the rowgroup contains 1,048,576 rows or until the index is rebuilt. When a delta rowgroup contains 1,048,576 rows, it is marked as closed and waits for a process called the tuple-mover to compress it into the columnstore.

Each column has some of its values in each rowgroup.
These values are called column segments. When the columnstore index compresses a rowgroup, it compresses each column segment separately. To uncompress an entire column, the columnstore index only needs to uncompress one column segment from each rowgroup.

A column segment is the portion of column values in a rowgroup. Each rowgroup contains one column segment for every column in the table.

Small loads and inserts go to the deltastore

A columnstore index improves columnstore compression and performance by compressing at least 102,400 rows at a time into the columnstore index. To compress rows in bulk, the columnstore index accumulates small loads and inserts in the deltastore. The deltastore operations are handled behind the scenes. To return the correct query results, the clustered columnstore index combines query results from both the columnstore and the deltastore.

Rows go to the deltastore when they are:
- Inserted with the INSERT INTO ... VALUES statement.
- At the end of a bulk load and they number fewer than 102,400.
- Updated. Each update is implemented as a delete and an insert.

The deltastore also stores a list of IDs for deleted rows that have been marked as deleted but not yet physically deleted from the columnstore.

When delta rowgroups are full they get compressed into the columnstore

Clustered columnstore indexes collect up to 1,048,576 rows in each delta rowgroup before compressing the rowgroup into the columnstore. This improves the compression of the columnstore index. When a delta rowgroup contains 1,048,576 rows, the columnstore index marks the rowgroup as closed. A background process, called the tuple-mover, finds each closed rowgroup and compresses it into the columnstore. You can force delta rowgroups into the columnstore by using ALTER INDEX to rebuild or reorganize the index.
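The row-count thresholds above can be made concrete with a small model. This is a minimal Python sketch, not SQL Server's implementation: the function name `route_bulk_load` is hypothetical, and the sketch assumes a single partition, no memory pressure, and that a bulk load is cut into rowgroups of at most 1,048,576 rows, with only the final remainder compared against the 102,400-row minimum.

```python
MAX_ROWGROUP_ROWS = 1_048_576   # rows at which a rowgroup is full (from the text)
MIN_COMPRESS_ROWS = 102_400     # minimum rows compressed directly (from the text)

def route_bulk_load(row_count):
    """Return (sizes of directly compressed rowgroups, rows sent to the deltastore).

    Hypothetical helper: a simplified model of one bulk load into one partition.
    """
    full_groups, remainder = divmod(row_count, MAX_ROWGROUP_ROWS)
    compressed = [MAX_ROWGROUP_ROWS] * full_groups
    deltastore_rows = 0
    if remainder >= MIN_COMPRESS_ROWS:
        compressed.append(remainder)      # large enough to compress directly
    else:
        deltastore_rows = remainder       # too few rows: held in a delta rowgroup
    return compressed, deltastore_rows

print(route_bulk_load(50_000))      # ([], 50000)
print(route_bulk_load(2_200_000))   # ([1048576, 1048576, 102848], 0)
```

In this model a 50,000-row load lands entirely in the deltastore, while a 2,200,000-row load produces two full rowgroups and one smaller compressed rowgroup.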
Note that if there is memory pressure during compression, the columnstore index might reduce the number of rows in the compressed rowgroup.

Each table partition has its own rowgroups and delta rowgroups

The concept of partitioning is the same in a clustered index, a heap, and a columnstore index. Partitioning a table divides the table into smaller groups of rows according to a range of column values. It is often used for managing the data. For example, you could create a partition for each year of data, and then use partition switching to archive data to less expensive storage. Partition switching works on columnstore indexes and makes it easy to move a partition of data to another location.

Rowgroups are always defined within a table partition. When a columnstore index is partitioned, each partition has its own compressed rowgroups and delta rowgroups.

Each partition can have multiple delta rowgroups

Each partition can have more than one delta rowgroup. When the columnstore index needs to add data to a delta rowgroup and the delta rowgroup is locked, the columnstore index will try to obtain a lock on a different delta rowgroup. If there are no delta rowgroups available, the columnstore index will create a new delta rowgroup. For example, a table with 10 partitions could easily have 20 or more delta rowgroups.

You can combine columnstore and rowstore indexes on the same table

A nonclustered index contains a copy of part or all of the rows and columns in the underlying table. The index is defined as one or more columns of the table, and has an optional condition that filters the rows.

Starting with SQL Server 2016 (13.x), you can create an updatable nonclustered columnstore index on a rowstore table. The columnstore index stores a copy of the data, so you do need extra storage. However, the data in the columnstore index will compress to a smaller size than the rowstore table requires.
By doing this, you can run analytics on the columnstore index and transactions on the rowstore index at the same time. The columnstore is updated when data changes in the rowstore table, so both indexes are working against the same data.

Starting with SQL Server 2016 (13.x), you can have one or more nonclustered rowstore indexes on a columnstore index. By doing this, you can perform efficient table seeks on the underlying columnstore. Other options become available too. For example, you can enforce a primary key constraint by using a UNIQUE constraint on the rowstore table. Since a non-unique value will fail to insert into the rowstore table, SQL Server cannot insert the value into the columnstore.

Performance considerations

The nonclustered columnstore index definition supports using a filtered condition. To minimize the performance impact of adding a columnstore index on an OLTP table, use a filtered condition to create a nonclustered columnstore index on only the cold data of your operational workload.

An in-memory table can have one columnstore index. You can create it when the table is created or add it later with ALTER TABLE (Transact-SQL). Before SQL Server 2016 (13.x), only a disk-based table could have a columnstore index.

For more information, refer to Columnstore indexes - Query performance.

Design Guidance

A rowstore table can have one updateable nonclustered columnstore index. Before SQL Server 2014 (12.x), the nonclustered columnstore index was read-only.

For more information, refer to Columnstore indexes - Design Guidance.

Hash Index Design Guidelines

All memory-optimized tables must have at least one index, because it is the indexes that connect the rows together. On a memory-optimized table, every index is also memory-optimized. Hash indexes are one of the possible index types in a memory-optimized table. For more information, see Indexes for Memory-Optimized Tables.

Applies to: SQL Server 2014 (12.x) through SQL Server 2017.
Hash Index Architecture

A hash index consists of an array of pointers, and each element of the array is called a hash bucket.
- Each bucket is 8 bytes, which are used to store the memory address of a link list of key entries.
- Each entry is a value for an index key, plus the address of its corresponding row in the underlying memory-optimized table.
- Each entry points to the next entry in a link list of entries, all chained to the current bucket.

The number of buckets must be specified at index definition time:
- The lower the ratio of buckets to table rows or to distinct values, the longer the average bucket link list will be.
- Short link lists perform faster than long link lists.
- The maximum number of buckets in hash indexes is 1,073,741,824.

TIP To determine the right BUCKET_COUNT for your data, see Configuring the hash index bucket count.

The hash function is applied to the index key columns and the result of the function determines what bucket that key falls into. Each bucket has a pointer to rows whose hashed key values are mapped to that bucket.

The hashing function used for hash indexes has the following characteristics:
- SQL Server has one hash function that is used for all hash indexes.
- The hash function is deterministic. The same input key value is always mapped to the same bucket in the hash index.
- Multiple index keys may be mapped to the same hash bucket.
- The hash function is balanced, meaning that the distribution of index key values over hash buckets typically follows a Poisson or bell curve distribution, not a flat linear distribution.
- Poisson distribution is not an even distribution. Index key values are not evenly distributed in the hash buckets.

If two index keys are mapped to the same hash bucket, there is a hash collision. A large number of hash collisions can have a performance impact on read operations. A realistic goal is for 30% of the buckets to contain two different key values.
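To see how the bucket-to-key ratio affects chain length and collisions, the following Python sketch distributes distinct keys over a fixed number of buckets. It is an illustration only: SQL Server's hash function is not described here, so SHA-256 stands in as an assumed deterministic hash, and `bucket_for` and `chain_stats` are hypothetical helper names.

```python
import hashlib
from collections import Counter

def bucket_for(key, bucket_count):
    """Map a key to a bucket with a deterministic stand-in hash (not SQL Server's)."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % bucket_count

def chain_stats(keys, bucket_count):
    """Summarize how distinct keys spread over the buckets."""
    chains = Counter(bucket_for(key, bucket_count) for key in keys)
    used = len(chains)                      # buckets holding at least one key
    return {
        "empty_buckets": bucket_count - used,
        "avg_chain": sum(chains.values()) / used,
        "max_chain": max(chains.values()),
    }

keys = range(10_000)                        # 10,000 distinct key values
for bucket_count in (1_024, 16_384):
    print(bucket_count, chain_stats(keys, bucket_count))
```

With far fewer buckets than keys, every bucket carries a long chain; with more buckets than keys, chains shorten but many buckets stay empty and some collisions still remain, which matches the uneven distribution described above.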
The interplay of the hash index and the buckets is summarized in the following image.

Configuring the hash index bucket count

The hash index bucket count is specified at index create time, and can be changed using the ALTER TABLE...ALTER INDEX REBUILD syntax.

In most cases the bucket count would ideally be between 1 and 2 times the number of distinct values in the index key. You may not always be able to predict how many values a particular index key may have, or will have. Performance is usually still good if the BUCKET_COUNT value is within 10 times of the actual number of key values, and overestimating is generally better than underestimating.

Too few buckets has the following drawbacks:
- More hash collisions of distinct key values.
- Each distinct value is forced to share the same bucket with a different distinct value.
- The average chain length per bucket grows.
- The longer the bucket chain, the slower the speed of equality lookups in the index.

Too many buckets has the following drawbacks:
- Too high a bucket count might result in more empty buckets.
- Empty buckets impact the performance of full index scans. If those are performed regularly, consider picking a bucket count close to the number of distinct index key values.
- Empty buckets use memory, though each bucket uses only 8 bytes.

NOTE Adding more buckets does nothing to reduce the chaining together of entries that share a duplicate value. The rate of value duplication is used to decide whether a hash is the appropriate index type, not to calculate the bucket count.

Performance considerations

The performance of a hash index is: Excellent when the predicate in the WHERE clause specifies an exact value for each column in the hash index key. A hash index will revert to a scan given an inequality predicate. Poor when the predicate in the WHERE clause looks for a range of values in the index key.
Poor when the predicate in the WHERE clause stipulates one specific value for the first column of a two column hash index key, but does not specify a value for other columns of the key.

TIP The predicate must include all columns in the hash index key. The hash index requires a key (to hash) to seek into the index. If an index key consists of two columns and the WHERE clause only provides the first column, SQL Server does not have a complete key to hash. This will result in an index scan query plan.

If a hash index is used and the number of unique index keys is 100 times (or more) than the row count, consider either increasing to a larger bucket count to avoid large row chains, or use a nonclustered index instead.

Declaration considerations

A hash index can exist only on a memory-optimized table. It cannot exist on a disk-based table.

A hash index can be declared as:
- UNIQUE, or can default to Non-Unique.
- NONCLUSTERED, which is the default.

The following is an example of the syntax to create a hash index, outside of the CREATE TABLE statement:

```sql
ALTER TABLE MyTable_memop
    ADD INDEX ix_hash_Column2 UNIQUE HASH (Column2) WITH (BUCKET_COUNT = 64);
```

Row versions and garbage collection

In a memory-optimized table, when a row is affected by an UPDATE, the table creates an updated version of the row. During the update transaction, other sessions might be able to read the older version of the row and thereby avoid the performance slowdown associated with a row lock.

The hash index might also have different versions of its entries to accommodate the update.

Later, when the older versions are no longer needed, a garbage collection (GC) thread traverses the buckets and their link lists to clean away old entries. The GC thread performs better if the link list chain lengths are short. For more information, refer to In-Memory OLTP Garbage Collection.
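The bucket count guidance given earlier for hash indexes (1 to 2 times the number of distinct key values, 8 bytes per bucket, at most 1,073,741,824 buckets) can be turned into a quick sizing estimate. The Python sketch below is a rough planning aid, not a SQL Server API: `suggest_bucket_count` and `bucket_memory_bytes` are hypothetical names, and rounding the count up to the next power of two is an assumption of this sketch about how the engine treats BUCKET_COUNT.

```python
MAX_BUCKETS = 1_073_741_824   # maximum bucket count stated in the text (2**30)
BYTES_PER_BUCKET = 8          # each bucket is an 8-byte pointer (from the text)

def suggest_bucket_count(distinct_keys, factor=2):
    """Suggest a BUCKET_COUNT of about `factor` times the distinct key count.

    Assumption: the result is rounded up to the next power of two.
    """
    target = max(1, distinct_keys * factor)
    rounded = 1 << (target - 1).bit_length()   # next power of two >= target
    return min(rounded, MAX_BUCKETS)

def bucket_memory_bytes(bucket_count):
    """Memory used by the bucket array alone, excluding the rows themselves."""
    return bucket_count * BYTES_PER_BUCKET

count = suggest_bucket_count(1_000_000)
print(count)                                   # 2097152
print(bucket_memory_bytes(count) / 2**20)      # 16.0 (MiB)
```

For one million distinct keys the sketch suggests about two million buckets, whose pointer array costs roughly 16 MiB, which shows why moderate overestimation is inexpensive.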
Memory-Optimized Nonclustered Index Design Guidelines

Nonclustered indexes are one of the possible index types in a memory-optimized table. For more information, see Indexes for Memory-Optimized Tables.

Applies to: SQL Server 2014 (12.x) through SQL Server 2017.

In-memory Nonclustered Index Architecture

In-memory nonclustered indexes are implemented using a data structure called a Bw-Tree, originally envisioned and described by Microsoft Research in 2011. A Bw-Tree is a lock and latch-free variation of a B-Tree. For more details please see The Bw-Tree: A B-tree for New Hardware Platforms.

At a very high level the Bw-Tree can be understood as a map of pages organized by page ID (PidMap), a facility to allocate and reuse page IDs (PidAlloc) and a set of pages linked in the page map and to each other. These three high level sub-components make up the basic internal structure of a Bw-Tree.

The structure is similar to a normal B-Tree in the sense that each page has a set of key values that are ordered, there are levels in the index each pointing to a lower level, and the leaf levels point to a data row. However, there are several differences.

Just like hash indexes, multiple data rows can be linked together (versions). The page pointers between the levels are logical page IDs, which are offsets into a page mapping table, that in turn has the physical address for each page.

There are no in-place updates of index pages. New delta pages are introduced for this purpose.
- No latching or locking is required for page updates.
- Index pages are not a fixed size.

The key value in each non-leaf level page depicted is the highest value that the child that it points to contains, and each row also contains that page's logical page ID. On the leaf-level pages, along with the key value, it contains the physical address of the data row.
Point lookups are similar to B-Trees except that because pages are linked in only one direction, the SQL Server Database Engine follows right page pointers, where each non-leaf page has the highest value of its child, rather than the lowest value as in a B-Tree.

If a Leaf-level page has to change, the SQL Server Database Engine does not modify the page itself. Rather, the SQL Server Database Engine creates a delta record that describes the change, and appends it to the previous page. Then it also updates the page map table address for that previous page, to the address of the delta record which now becomes the physical address for this page.

There are three different operations that can be required for managing the structure of a Bw-Tree: consolidation, split and merge.

Delta Consolidation

A long chain of delta records can eventually degrade search performance, because searching through the index then means traversing long chains. If a new delta record is added to a chain that already has 16 elements, the changes in the delta records will be consolidated into the referenced index page, and the page will then be rebuilt, including the changes indicated by the new delta record that triggered the consolidation. The newly rebuilt page will have the same page ID but a new memory address.

Split page

An index page in a Bw-Tree grows on an as-needed basis, starting from storing a single row to storing a maximum of 8 KB. Once the index page grows to 8 KB, a new insert of a single row will cause the index page to split. For an internal page, this means when there is no more room to add another key value and pointer, and for a leaf page, it means that the row would be too big to fit on the page once all the delta records are incorporated.
The statistics information in the page header for a leaf page keeps track of how much space would be required to consolidate the delta records, and that information is adjusted as each new delta record is added.

A Split operation is done in two atomic steps. In the picture below, assume a Leaf-page forces a split because a key with value 5 is being inserted, and a non-leaf page exists pointing to the end of the current Leaf-level page (key value 4).

Step 1: Allocate two new pages P1 and P2, and split the rows from old P1 page onto these new pages, including the newly inserted row. A new slot in Page Mapping Table is used to store the physical address of page P2. These pages, P1 and P2, are not accessible to any concurrent operations yet. In addition, the logical pointer from P1 to P2 is set. Then, in one atomic step, update the Page Mapping Table to change the pointer from old P1 to new P1.

Step 2: The non-leaf page points to P1 but there is no direct pointer from a non-leaf page to P2. P2 is only reachable via P1. To create a pointer from a non-leaf page to P2, allocate a new non-leaf page (internal index page), copy all the rows from old non-leaf page, and add a new row to point to P2. Once this is done, in one atomic step, update the Page Mapping Table to change the pointer from old non-leaf page to new non-leaf page.

Merge page

When a DELETE operation results in a page having less than 10% of the maximum page size (currently 8 KB), or with a single row on it, that page will be merged with a contiguous page.

When a row is deleted from a page, a delta record for the delete is added. Additionally, a check is made to determine if the index page (non-leaf page) qualifies for Merge. This check verifies if the remaining space after deleting the row will be less than 10% of maximum page size. If it does qualify, the Merge is performed in three atomic steps.
In the picture below, assume a DELETE operation will delete the key value 10.

Step 1: A delta page representing key value 10 (blue triangle) is created and its pointer in the non-leaf page Pp1 is set to the new delta page. Additionally a special merge-delta page (green triangle) is created, and it is linked to point to the delta page. At this stage, both pages (delta page and merge-delta page) are not visible to any concurrent transaction. In one atomic step, the pointer to the Leaf-level page P1 in the Page Mapping Table is updated to point to the merge-delta page. After this step, the entry for key value 10 in Pp1 now points to the merge-delta page.

Step 2: The row representing key value 7 in the non-leaf page Pp1 needs to be removed, and the entry for key value 10 updated to point to P1. To do this, a new non-leaf page Pp2 is allocated and all the rows from Pp1 are copied except for the row representing key value 7; then the row for key value 10 is updated to point to page P1. Once this is done, in one atomic step, the Page Mapping Table entry pointing to Pp1 is updated to point to Pp2. Pp1 is no longer reachable.

Step 3: The Leaf-level pages P2 and P1 are merged and the delta pages removed. To do this, a new page P3 is allocated and the rows from P2 and P1 are merged, and the delta page changes are included in the new P3. Then, in one atomic step, the Page Mapping Table entry pointing to page P1 is updated to point to page P3.

Performance considerations

The performance of a nonclustered index is better than that of a nonclustered hash index when querying a memory-optimized table with inequality predicates.

NOTE A column in a memory-optimized table can be part of both a hash index and a nonclustered index.

TIP When a column in the nonclustered index key has many duplicate values, performance can degrade for updates, inserts, and deletes. One way to improve performance in this situation is to add another column to the nonclustered index.
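The page mapping table, delta records, and delta consolidation described in this section can be sketched as a toy model. The Python code below is a simplified, single-threaded illustration and not the engine's data structure: the class and function names are hypothetical, keys are treated as a unique set, and the atomic pointer swap in the mapping table is modeled as a plain dictionary assignment.

```python
MAX_DELTA_CHAIN = 16   # chain length that triggers consolidation (from the text)

class BasePage:
    """A consolidated index page holding ordered key values."""
    def __init__(self, keys):
        self.keys = sorted(keys)

class Delta:
    """A delta record describing one change, linked to the previous page state."""
    def __init__(self, op, key, next_node):
        self.op = op            # "insert" or "delete"
        self.key = key
        self.next = next_node

def materialize(node):
    """Return (current keys, delta chain length) for a page's chain."""
    deltas = []
    while isinstance(node, Delta):
        deltas.append(node)
        node = node.next
    keys = set(node.keys)
    for delta in reversed(deltas):          # replay oldest change first
        if delta.op == "insert":
            keys.add(delta.key)
        else:
            keys.discard(delta.key)
    return sorted(keys), len(deltas)

class PageMap:
    """Maps a logical page ID to the current head of that page's chain."""
    def __init__(self):
        self.table = {}

    def allocate(self, keys):
        pid = len(self.table)
        self.table[pid] = BasePage(keys)
        return pid

    def apply(self, pid, op, key):
        head = self.table[pid]
        _, chain_length = materialize(head)
        new_head = Delta(op, key, head)     # the page itself is never modified
        if chain_length >= MAX_DELTA_CHAIN:
            keys, _ = materialize(new_head) # rebuild, including the new change
            new_head = BasePage(keys)       # same page ID, new object
        self.table[pid] = new_head          # stands in for the atomic swap

pages = PageMap()
pid = pages.allocate([1, 2, 3])
for key in range(10, 26):                   # 16 changes build a 16-record chain
    pages.apply(pid, "insert", key)
print(materialize(pages.table[pid])[1])     # 16
pages.apply(pid, "delete", 2)               # 17th change triggers consolidation
print(isinstance(pages.table[pid], BasePage))   # True
```

Because readers always start from the mapping table entry, swapping that single entry is what lets the rebuilt page replace the old chain without updating any page in place.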
Additional Reading

- Improving Performance with SQL Server 2008 Indexed Views
- Partitioned Tables and Indexes
- Create a Primary Key
- Indexes for Memory-Optimized Tables
- Columnstore Indexes overview
- Troubleshooting Hash Indexes for Memory-Optimized Tables
- Memory-Optimized Table Dynamic Management Views (Transact-SQL)
- Index Related Dynamic Management Views and Functions (Transact-SQL)
- Indexes on Computed Columns
- Indexes and ALTER TABLE
- CREATE INDEX (Transact-SQL)
- ALTER INDEX (Transact-SQL)
- CREATE XML INDEX (Transact-SQL)
- CREATE SPATIAL INDEX (Transact-SQL)

Memory Management Architecture Guide

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

Windows Virtual Memory Manager

The committed regions of address space are mapped to the available physical memory by the Windows Virtual Memory Manager (VMM).

For more information on the amount of physical memory supported by different operating systems, see the Windows documentation on Memory Limits for Windows Releases.

Virtual memory systems allow the over-commitment of physical memory, so that the ratio of virtual-to-physical memory can exceed 1:1. As a result, larger programs can run on computers with a variety of physical memory configurations. However, using significantly more virtual memory than the combined average working sets of all the processes can cause poor performance.

SQL Server Memory Architecture

SQL Server dynamically acquires and frees memory as required. Typically, an administrator does not have to specify how much memory should be allocated to SQL Server, although the option still exists and is required in some environments.

One of the primary design goals of all database software is to minimize disk I/O because disk reads and writes are among the most resource-intensive operations. SQL Server builds a buffer pool in memory to hold pages read from the database.
Much of the code in SQL Server is dedicated to minimizing the number of physical reads and writes between the disk and the buffer pool. SQL Server tries to reach a balance between two goals:
- Keep the buffer pool from becoming so big that the entire system is low on memory.
- Minimize physical I/O to the database files by maximizing the size of the buffer pool.

NOTE In a heavily loaded system, some large queries that require a large amount of memory to run cannot get the minimum amount of requested memory and receive a time-out error while waiting for memory resources. To resolve this, increase the query wait Option. For a parallel query, consider reducing the max degree of parallelism Option.

NOTE In a heavily loaded system under memory pressure, queries with merge join, sort and bitmap in the query plan can drop the bitmap when the queries do not get the minimum required memory for the bitmap. This can affect the query performance and, if the sorting process cannot fit in memory, it can increase the usage of worktables in the tempdb database, causing tempdb to grow. To resolve this problem, add physical memory or tune the queries to use a different and faster query plan.

Providing the maximum amount of memory to SQL Server

By using AWE and the Locked Pages in Memory privilege, you can provide the following amounts of memory to the SQL Server Database Engine.

NOTE The following table includes a column for 32-bit versions, which are no longer available.

| | 32-bit ¹ | 64-bit |
|---|---|---|
| Conventional memory | All SQL Server editions. Up to process virtual address space limit: 2 GB; 3 GB with /3gb boot parameter ²; 4 GB on WOW64 ³ | All SQL Server editions. Up to process virtual address space limit: 7 TB with IA64 architecture (IA64 not supported in SQL Server 2012 (11.x) and above); operating system maximum with x64 architecture ⁴ |
| AWE mechanism (allows SQL Server to go beyond the process virtual address space limit on 32-bit platform) | SQL Server Standard, Enterprise, and Developer editions: Buffer pool is capable of accessing up to 64 GB of memory. | Not applicable ⁵ |
| Lock pages in memory operating system (OS) privilege (allows locking physical memory, preventing OS paging of the locked memory) ⁶ | SQL Server Standard, Enterprise, and Developer editions: Required for SQL Server process to use AWE mechanism. Memory allocated through AWE mechanism cannot be paged out. Granting this privilege without enabling AWE has no effect on the server. | Only used when necessary, namely if there are signs that sqlservr process is being paged out. In this case, error 17890 will be reported in the Errorlog, resembling the following example: A significant part of sql server process memory has been paged out. This may result in a performance degradation. Duration: #### seconds. Working set (KB): ####, committed (KB): ####, memory utilization: ##%. |

¹ 32-bit versions are not available starting with SQL Server 2014 (12.x).
² /3gb is an operating system boot parameter. For more information, visit the MSDN Library.
³ WOW64 (Windows on Windows 64) is a mode in which 32-bit SQL Server runs on a 64-bit operating system.
⁴ SQL Server Standard Edition supports up to 128 GB. SQL Server Enterprise Edition supports the operating system maximum.
⁵ Note that the sp_configure awe enabled option was present on 64-bit SQL Server, but it is ignored.
⁶ If lock pages in memory privilege (LPIM) is granted (either on 32-bit for AWE support or on 64-bit by itself), we recommend also setting max server memory.
For more information on LPIM, refer to Server Memory Server Configuration Options.

NOTE Older versions of SQL Server could run on a 32-bit operating system. Accessing more than 4 gigabytes (GB) of memory on a 32-bit operating system required Address Windowing Extensions (AWE) to manage the memory. This is not necessary when SQL Server is running on 64-bit operating systems. For more information about AWE, see Process Address Space and Managing Memory for Large Databases in the SQL Server 2008 documentation.

Changes to Memory Management starting with SQL Server 2012 (11.x)

In earlier versions of SQL Server (SQL Server 2005, SQL Server 2008 and SQL Server 2008 R2), memory allocation was done using five different mechanisms:
- Single-page Allocator (SPA), including only memory allocations that were less than, or equal to 8-KB in the SQL Server process. The max server memory (MB) and min server memory (MB) configuration options determined the limits of physical memory that the SPA consumed. The buffer pool was simultaneously the mechanism for SPA, and the largest consumer of single-page allocations.
- Multi-Page Allocator (MPA), for memory allocations that request more than 8-KB.
- CLR Allocator, including the SQL CLR heaps and its global allocations that are created during CLR initialization.
- Memory allocations for thread stacks in the SQL Server process.
- Direct Windows allocations (DWA), for memory allocation requests made directly to Windows. These include Windows heap usage and direct virtual allocations made by modules that are loaded into the SQL Server process. Examples of such memory allocation requests include allocations from extended stored procedure DLLs, objects that are created by using Automation procedures (sp_OA calls), and allocations from linked server providers.
Starting with SQL Server 2012 (11.x), Single-page allocations, Multi-Page allocations and CLR allocations are all consolidated into an "Any size" Page Allocator, and it's included in memory limits that are controlled by max server memory (MB) and min server memory (MB) configuration options. This change provided a more accurate sizing ability for all memory requirements that go through the SQL Server memory manager.

IMPORTANT Carefully review your current max server memory (MB) and min server memory (MB) configurations after you upgrade to SQL Server 2012 (11.x) through SQL Server 2017. This is because starting in SQL Server 2012 (11.x), such configurations now include and account for more memory allocations compared to earlier versions. These changes apply to both 32-bit and 64-bit versions of SQL Server 2012 (11.x) and SQL Server 2014 (12.x), and 64-bit versions of SQL Server 2016 (13.x) through SQL Server 2017.

The following table indicates whether a specific type of memory allocation is controlled by the max server memory (MB) and min server memory (MB) configuration options:

| Type of memory allocation | SQL Server 2005, SQL Server 2008 and SQL Server 2008 R2 | Starting with SQL Server 2012 (11.x) |
|---|---|---|
| Single-page allocations | Yes | Yes, consolidated into "any size" page allocations |
| Multi-page allocations | No | Yes, consolidated into "any size" page allocations |
| CLR allocations | No | Yes |
| Thread stacks memory | No | No |
| Direct allocations from Windows | No | No |

Starting with SQL Server 2012 (11.x), SQL Server might allocate more memory than the value specified in the max server memory setting. This behavior may occur when the Total Server Memory (KB) value has already reached the Target Server Memory (KB) setting (as specified by max server memory). If there is insufficient contiguous free memory to meet the demand of multi-page memory requests (more than 8 KB) because of memory fragmentation, SQL Server can perform over-commitment instead of rejecting the memory request.
As soon as this allocation is performed, the Resource Monitor background task starts to signal all memory consumers to release the allocated memory, and tries to bring the Total Server Memory (KB) value below the Target Server Memory (KB) specification. Therefore, SQL Server memory usage could briefly exceed the max server memory setting. In this situation, the Total Server Memory (KB) performance counter reading will exceed the max server memory and Target Server Memory (KB) settings.

This behavior is typically observed during the following operations:
- Large Columnstore index queries.
- Columnstore index (re)builds, which use large volumes of memory to perform Hash and Sort operations.
- Backup operations that require large memory buffers.
- Tracing operations that have to store large input parameters.

Changes to "memory_to_reserve" starting with SQL Server 2012 (11.x)

In earlier versions of SQL Server (SQL Server 2005, SQL Server 2008 and SQL Server 2008 R2), the SQL Server memory manager set aside a part of the process virtual address space (VAS) for use by the Multi-Page Allocator (MPA), CLR Allocator, memory allocations for thread stacks in the SQL Server process, and Direct Windows allocations (DWA). This part of the virtual address space is also known as the "Mem-To-Leave" or "non-Buffer Pool" region.

The virtual address space that is reserved for these allocations is determined by the memory_to_reserve configuration option. The default value that SQL Server uses is 256 MB. To override the default value, use the SQL Server -g startup parameter. Refer to the documentation page on Database Engine Service Startup Options for information on the -g startup parameter.

Because starting with SQL Server 2012 (11.x), the new "any size" page allocator also handles allocations greater than 8 KB, the memory_to_reserve value does not include the multi-page allocations. Except for this change, everything else remains the same with this configuration option.
The following table indicates whether a specific type of memory allocation falls into the memory_to_reserve region of the virtual address space for the SQL Server process:

TYPE OF MEMORY ALLOCATION | SQL SERVER 2005, SQL SERVER 2008 AND SQL SERVER 2008 R2 | STARTING WITH SQL SERVER 2012 (11.X)
Single-page allocations | No | No, consolidated into "any size" page allocations
Multi-page allocations | Yes | No, consolidated into "any size" page allocations
CLR allocations | Yes | Yes
Thread stacks memory | Yes | Yes
Direct allocations from Windows | Yes | Yes

Dynamic Memory Management

The default memory management behavior of the SQL Server Database Engine is to acquire as much memory as it needs without creating a memory shortage on the system. The SQL Server Database Engine does this by using the Memory Notification APIs in Microsoft Windows.

When SQL Server is using memory dynamically, it queries the system periodically to determine the amount of free memory. Maintaining this free memory prevents the operating system (OS) from paging. If less memory is free, SQL Server releases memory to the OS. If more memory is free, SQL Server may allocate more memory. SQL Server adds memory only when its workload requires more memory; a server at rest does not increase the size of its virtual address space.

Max server memory controls the SQL Server memory allocation, compile memory, all caches (including the buffer pool), query execution memory grants, lock manager memory, and CLR¹ memory (essentially any memory clerk found in sys.dm_os_memory_clerks).

¹ CLR memory is managed under max_server_memory allocations starting with SQL Server 2012 (11.x).
The following query returns information about currently allocated memory:

    SELECT
        physical_memory_in_use_kb/1024 AS sql_physical_memory_in_use_MB,
        large_page_allocations_kb/1024 AS sql_large_page_allocations_MB,
        locked_page_allocations_kb/1024 AS sql_locked_page_allocations_MB,
        virtual_address_space_reserved_kb/1024 AS sql_VAS_reserved_MB,
        virtual_address_space_committed_kb/1024 AS sql_VAS_committed_MB,
        virtual_address_space_available_kb/1024 AS sql_VAS_available_MB,
        page_fault_count AS sql_page_fault_count,
        memory_utilization_percentage AS sql_memory_utilization_percentage,
        process_physical_memory_low AS sql_process_physical_memory_low,
        process_virtual_memory_low AS sql_process_virtual_memory_low
    FROM sys.dm_os_process_memory;

Memory for thread stacks¹, CLR², extended procedure .dll files, the OLE DB providers referenced by distributed queries, automation objects referenced in Transact-SQL statements, and any memory allocated by a non SQL Server DLL is not controlled by max server memory.

¹ Refer to the documentation page on how to Configure the max worker threads Server Configuration Option, for information on the calculated default worker threads for a given number of affinitized CPUs in the current host. SQL Server stack sizes are as follows:

SQL SERVER ARCHITECTURE | OS ARCHITECTURE | STACK SIZE
x86 (32-bit) | x86 (32-bit) | 512 KB
x86 (32-bit) | x64 (64-bit) | 768 KB
x64 (64-bit) | x64 (64-bit) | 2048 KB
IA64 (Itanium) | IA64 (Itanium) | 4096 KB

² CLR memory is managed under max_server_memory allocations starting with SQL Server 2012 (11.x).

SQL Server uses the memory notification API QueryMemoryResourceNotification to determine when the SQL Server Memory Manager may allocate memory and release memory.

When SQL Server starts, it computes the size of virtual address space for the buffer pool based on a number of parameters such as amount of physical memory on the system, number of server threads and various startup parameters.
SQL Server reserves the computed amount of its process virtual address space for the buffer pool, but it acquires (commits) only the required amount of physical memory for the current load.

The instance then continues to acquire memory as needed to support the workload. As more users connect and run queries, SQL Server acquires the additional physical memory on demand. A SQL Server instance continues to acquire physical memory until it either reaches its max server memory allocation target or Windows indicates there is no longer an excess of free memory; it frees memory when it has more than the min server memory setting, and Windows indicates that there is a shortage of free memory.

As other applications are started on a computer running an instance of SQL Server, they consume memory and the amount of free physical memory drops below the SQL Server target. The instance of SQL Server adjusts its memory consumption. If another application is stopped and more memory becomes available, the instance of SQL Server increases the size of its memory allocation. SQL Server can free and acquire several megabytes of memory each second, allowing it to quickly adjust to memory allocation changes.

Effects of min and max server memory

The min server memory and max server memory configuration options establish upper and lower limits to the amount of memory used by the buffer pool and other caches of the SQL Server Database Engine. The buffer pool does not immediately acquire the amount of memory specified in min server memory. The buffer pool starts with only the memory required to initialize. As the SQL Server Database Engine workload increases, it keeps acquiring the memory required to support the workload. The buffer pool does not free any of the acquired memory until it reaches the amount specified in min server memory. Once min server memory is reached, the buffer pool then uses the standard algorithm to acquire and free memory as needed.
The only difference is that the buffer pool never drops its memory allocation below the level specified in min server memory, and never acquires more memory than the level specified in max server memory. NOTE SQL Server as a process acquires more memory than specified by max server memory option. Both internal and external components can allocate memory outside of the buffer pool, which consumes additional memory, but the memory allocated to the buffer pool usually still represents the largest portion of memory consumed by SQL Server. The amount of memory acquired by the SQL Server Database Engine is entirely dependent on the workload placed on the instance. A SQL Server instance that is not processing many requests may never reach min server memory. If the same value is specified for both min server memory and max server memory, then once the memory allocated to the SQL Server Database Engine reaches that value, the SQL Server Database Engine stops dynamically freeing and acquiring memory for the buffer pool. If an instance of SQL Server is running on a computer where other applications are frequently stopped or started, the allocation and deallocation of memory by the instance of SQL Server may slow the startup times of other applications. Also, if SQL Server is one of several server applications running on a single computer, the system administrators may need to control the amount of memory allocated to SQL Server. In these cases, you can use the min server memory and max server memory options to control how much memory SQL Server can use. The min server memory and max server memory options are specified in megabytes. For more information, see Server Memory Configuration Options. Memory used by SQL Server objects specifications The following list describes the approximate amount of memory used by different objects in SQL Server. 
The amounts listed are estimates and can vary depending on the environment and how objects are created:
- Lock (as maintained by the Lock Manager): 64 bytes + 32 bytes per owner
- User connection: Approximately (3 * network_packet_size + 94 KB)

The network packet size is the size of the tabular data stream (TDS) packets that are used to communicate between applications and the SQL Server Database Engine. The default packet size is 4 KB, and is controlled by the network packet size configuration option.

When multiple active result sets (MARS) are enabled, the user connection is approximately (3 + 3 * num_logical_connections) * network_packet_size + 94 KB.

Buffer management

The primary purpose of a SQL Server database is to store and retrieve data, so intensive disk I/O is a core characteristic of the Database Engine. Because disk I/O operations can consume many resources and take a relatively long time to finish, SQL Server focuses on making I/O highly efficient. Buffer management is a key component in achieving this efficiency. The buffer management component consists of two mechanisms: the buffer manager to access and update database pages, and the buffer cache (also called the buffer pool), to reduce database file I/O.

How buffer management works

A buffer is an 8 KB page in memory, the same size as a data or index page. Thus, the buffer cache is divided into 8 KB pages. The buffer manager manages the functions for reading data or index pages from the database disk files into the buffer cache and writing modified pages back to disk. A page remains in the buffer cache until the buffer manager needs the buffer area to read in more data. Data is written back to disk only if it is modified. Data in the buffer cache can be modified multiple times before being written back to disk. For more information, see Reading Pages and Writing Pages.
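To make the page lifecycle described above concrete, the following is a minimal Python sketch of a buffer cache: pages are read from a backing store on a miss, can be modified several times in memory, and are written back on eviction only if they were modified. The class name TinyBufferCache, the dictionary standing in for the data files, and the least-recently-used eviction order are all assumptions made for illustration; the text above does not state which eviction policy the Database Engine uses.

```python
from collections import OrderedDict

PAGE_SIZE = 8 * 1024  # a buffer is one 8 KB page


class TinyBufferCache:
    """Toy model. `disk` is a dict of page_id -> bytes (hypothetical stand-in for data files)."""

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity      # number of 8 KB buffers available
        self.buffers = OrderedDict()  # page_id -> [bytearray, dirty flag]
        self.physical_reads = 0
        self.physical_writes = 0

    def _get(self, page_id):
        if page_id in self.buffers:
            # Hit: mark as most recently used (LRU order is an assumption).
            self.buffers.move_to_end(page_id)
        else:
            if len(self.buffers) >= self.capacity:
                self._evict()
            self.buffers[page_id] = [bytearray(self.disk[page_id]), False]
            self.physical_reads += 1
        return self.buffers[page_id]

    def _evict(self):
        # Remove the least recently used buffer; write it back only if dirty.
        victim_id, (data, dirty) = self.buffers.popitem(last=False)
        if dirty:
            self.disk[victim_id] = bytes(data)
            self.physical_writes += 1

    def read(self, page_id):
        return bytes(self._get(page_id)[0])

    def write(self, page_id, offset, payload):
        entry = self._get(page_id)
        entry[0][offset:offset + len(payload)] = payload
        entry[1] = True  # modified in memory, not yet on disk


disk = {n: bytes(PAGE_SIZE) for n in range(4)}
cache = TinyBufferCache(disk, capacity=2)
cache.write(0, 0, b"a")
cache.write(0, 1, b"b")  # second change to the same buffered page
cache.read(1)
cache.read(2)            # evicts dirty page 0: one physical write
cache.read(3)            # evicts clean page 1: no physical write
```

In this run, page 0 is changed twice but written to the backing store once, and the unmodified page 1 is dropped without any write.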
When SQL Server starts, it computes the size of virtual address space for the buffer cache based on a number of parameters such as the amount of physical memory on the system, the configured number of maximum server threads, and various startup parameters. SQL Server reserves this computed amount of its process virtual address space (called the memory target) for the buffer cache, but it acquires (commits) only the required amount of physical memory for the current load. You can query the bpool_commit_target and bpool_committed columns in the sys.dm_os_sys_info catalog view to return the number of pages reserved as the memory target and the number of pages currently committed in the buffer cache, respectively. The interval between SQL Server startup and when the buffer cache obtains its memory target is called ramp-up. During this time, read requests fill the buffers as needed. For example, a single 8 KB page read request fills a single buffer page. This means the ramp-up depends on the number and type of client requests. Ramp-up is expedited by transforming single page read requests into aligned eight page requests (making up one extent). This allows the ramp-up to finish much faster, especially on machines with a lot of memory. For more information about pages and extents, refer to Pages and Extents Architecture Guide. Because the buffer manager uses most of the memory in the SQL Server process, it cooperates with the memory manager to allow other components to use its buffers. The buffer manager interacts primarily with the following components: Resource manager to control overall memory usage and, in 32-bit platforms, to control address space usage. Database manager and the SQL Server Operating System (SQLOS) for low-level file I/O operations. Log manager for write-ahead logging. Supported Features The buffer manager supports the following features: The buffer manager is non-uniform memory access (NUMA) aware. 
Buffer cache pages are distributed across hardware NUMA nodes, which allows a thread to access a buffer page that is allocated on the local NUMA node rather than from foreign memory.

The buffer manager supports Hot Add Memory, which allows users to add physical memory without restarting the server.

The buffer manager supports large pages on 64-bit platforms. The page size is specific to the version of Windows.

NOTE
Prior to SQL Server 2012 (11.x), enabling large pages in SQL Server requires trace flag 834.

The buffer manager provides additional diagnostics that are exposed through dynamic management views. You can use these views to monitor a variety of operating system resources that are specific to SQL Server. For example, you can use the sys.dm_os_buffer_descriptors view to monitor the pages in the buffer cache.

Disk I/O

The buffer manager only performs reads and writes to the database. Other file and database operations such as open, close, extend, and shrink are performed by the database manager and file manager components.

Disk I/O operations by the buffer manager have the following characteristics:
- All I/Os are performed asynchronously, which allows the calling thread to continue processing while the I/O operation takes place in the background.
- All I/Os are issued in the calling threads unless the affinity I/O option is in use. The affinity I/O mask option binds SQL Server disk I/O to a specified subset of CPUs. In high-end SQL Server online transactional processing (OLTP) environments, this extension can enhance the performance of SQL Server threads issuing I/Os.
- Multiple page I/Os are accomplished with scatter-gather I/O, which allows data to be transferred into or out of noncontiguous areas of memory. This means that SQL Server can quickly fill or flush the buffer cache while avoiding multiple physical I/O requests.

Long I/O requests

The buffer manager reports on any I/O request that has been outstanding for at least 15 seconds.
This helps the system administrator distinguish between SQL Server problems and I/O subsystem problems. Error message 833 is reported and appears in the SQL Server error log as follows: SQL Server has encountered ## occurrence(s) of I/O requests taking longer than 15 seconds to complete on file [##] in database [##] (#). The OS file handle is 0x00000. The offset of the latest long I/O is: 0x00000. A long I/O may be either a read or a write; it is not currently indicated in the message. Long-I/O messages are warnings, not errors. They do not indicate problems with SQL Server but with the underlying I/O system. The messages are reported to help the system administrator find the cause of poor SQL Server response times more quickly, and to distinguish problems that are outside the control of SQL Server. As such, they do not require any action, but the system administrator should investigate why the I/O request took so long, and whether the time is justifiable. Causes of Long-I/O Requests A long-I/O message may indicate that an I/O is permanently blocked and will never complete (known as lost I/O), or merely that it just has not completed yet. It is not possible to tell from the message which scenario is the case, although a lost I/O will often lead to a latch timeout. Long I/Os often indicate a SQL Server workload that is too intense for the disk subsystem. An inadequate disk subsystem may be indicated when: Multiple long I/O messages appear in the error log during a heavy SQL Server workload. Perfmon counters show long disk latencies, long disk queues, or no disk idle time. Long I/Os may also be caused by a component in the I/O path (for example, a driver, controller, or firmware) continually postponing servicing an old I/O request in favor of servicing newer requests that are closer to the current position of the disk head. 
The common technique of processing requests in priority based upon which ones are closest to the current position of the read/write head is known as "elevator seeking." This may be difficult to corroborate with the Windows System Monitor (PERFMON.EXE) tool because most I/Os are being serviced promptly.

Long I/O requests can be aggravated by workloads that perform large amounts of sequential I/O, such as backup and restore, table scans, sorting, creating indexes, bulk loads, and zeroing out files.

Isolated long I/Os that do not appear related to any of the previous conditions may be caused by a hardware or driver problem. The system event log may contain a related event that helps to diagnose the problem.

Error Detection

Database pages can use one of two optional mechanisms that help ensure the integrity of the page from the time it is written to disk until it is read again: torn page protection and checksum protection. These mechanisms allow an independent method of verifying the correctness of not only the data storage, but also hardware components such as controllers, drivers, cables, and even the operating system. The protection is added to the page just before writing it to disk, and verified after it is read from disk.

SQL Server will retry any read that fails with a checksum, torn page, or other I/O error four times. If the read is successful in any one of the retry attempts, a message will be written to the error log and the command that triggered the read will continue. If the retry attempts fail, the command will fail with error message 824.

The kind of page protection used is an attribute of the database containing the page. Checksum protection is the default protection for databases created in SQL Server 2005 and later. The page protection mechanism is specified at database creation time, and may be altered by using ALTER DATABASE SET.
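The read-retry behavior described above (one failed read, then up to four retries, then error 824) can be modeled with a short Python sketch. This is an illustration only: read_with_retry, PageReadError, Error824, and make_flaky_reader are hypothetical names, and the wording of the logged message is invented for the example rather than taken from SQL Server.

```python
class PageReadError(Exception):
    """Hypothetical: a read failed with a checksum, torn page, or other I/O error."""


class Error824(Exception):
    """Hypothetical stand-in for the command failing with error message 824."""


MAX_RETRIES = 4  # the text states that a failed read is retried four times


def read_with_retry(read_page, page_id, error_log):
    """Try the read once; on failure retry up to MAX_RETRIES times."""
    try:
        return read_page(page_id)
    except PageReadError:
        pass  # fall through to the retry loop
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            data = read_page(page_id)
        except PageReadError:
            continue
        # A successful retry is logged, and the calling command continues.
        error_log.append(f"read of page {page_id} succeeded on retry {attempt}")
        return data
    raise Error824(f"page {page_id} still unreadable after {MAX_RETRIES} retries")


def make_flaky_reader(failures):
    """Build a reader that fails `failures` times, then succeeds (test helper)."""
    state = {"left": failures}

    def read_page(page_id):
        if state["left"] > 0:
            state["left"] -= 1
            raise PageReadError(page_id)
        return b"page-%d" % page_id

    return read_page


log = []
data = read_with_retry(make_flaky_reader(2), 7, log)
```

With a reader that fails twice, the initial read and the first retry fail, the second retry succeeds, and one message is appended to the log. A reader that fails five times in a row exhausts all four retries and raises Error824.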
You can determine the current page protection setting by querying the page_verify_option column in the sys.databases catalog view or the IsTornPageDetectionEnabled property of the DATABASEPROPERTYEX function.

NOTE
If the page protection setting is changed, the new setting does not immediately affect the entire database. Instead, pages adopt the current protection level of the database whenever they are written next. This means that the database may be composed of pages with different kinds of protection.

Torn Page Protection

Torn page protection, introduced in SQL Server 2000, is primarily a way of detecting page corruptions due to power failures. For example, an unexpected power failure may leave only part of a page written to disk. When torn page protection is used, a specific 2-bit signature pattern for each 512-byte sector in the 8-kilobyte (KB) database page is saved and stored in the database page header when the page is written to disk. When the page is read from disk, the torn bits stored in the page header are compared to the actual page sector information. The signature pattern alternates between binary 01 and 10 with every write, so it is always possible to tell when only a portion of the sectors made it to disk: if a bit is in the wrong state when the page is later read, the page was written incorrectly and a torn page is detected. Torn page detection uses minimal resources; however, it does not detect all errors caused by disk hardware failures. For information on setting torn page detection, see ALTER DATABASE SET Options (Transact-SQL).

Checksum Protection

Checksum protection, introduced in SQL Server 2005, provides stronger data integrity checking. A checksum is calculated for the data in each page that is written, and stored in the page header. Whenever a page with a stored checksum is read from disk, the database engine recalculates the checksum for the data in the page and raises error 824 if the new checksum is different from the stored checksum.
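The torn page idea described above can be illustrated with a small Python model: every write stamps each 512-byte sector with a 2-bit signature that alternates between binary 01 and 10, and a read flags the page as torn if any sector carries a signature that differs from the one recorded in the header. This is a deliberately simplified sketch and not the real on-disk page layout; the dictionary structure, the function names, and the assumption that the header is written together with the first sector are all choices made for the example.

```python
SECTOR_SIZE = 512
PAGE_SIZE = 8 * 1024
SECTORS_PER_PAGE = PAGE_SIZE // SECTOR_SIZE  # 16 sectors in an 8 KB page


def new_page():
    """A page whose header and sectors all agree on signature 01."""
    return {"header_sig": 0b01, "sector_sigs": [0b01] * SECTORS_PER_PAGE}


def next_signature(previous):
    """Alternate between binary 01 and 10 on every write."""
    return 0b10 if previous == 0b01 else 0b01


def write_page(page, sectors_written=SECTORS_PER_PAGE):
    """Stamp a new signature on the sectors that reach disk.

    Passing sectors_written < 16 simulates a power failure part-way
    through the write.
    """
    sig = next_signature(page["header_sig"])
    for i in range(sectors_written):
        page["sector_sigs"][i] = sig
    if sectors_written > 0:
        # Assumption for this model: the header travels with the first sector.
        page["header_sig"] = sig


def is_torn(page):
    """A page is torn if any sector disagrees with the header signature."""
    return any(sig != page["header_sig"] for sig in page["sector_sigs"])


page = new_page()
write_page(page)                      # complete write: signatures become 10
complete_ok = not is_torn(page)
write_page(page, sectors_written=5)   # interrupted write: 5 sectors get 01
torn_detected = is_torn(page)
```

After the interrupted write, the first five sectors carry the new signature and the remaining eleven still carry the old one, so the mismatch is visible on the next read.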
Checksum protection can catch more errors than torn page protection because it is affected by every byte of the page; however, it is moderately resource intensive. When checksum is enabled, errors caused by power failures and flawed hardware or firmware can be detected any time the buffer manager reads a page from disk. For information on setting checksum, see ALTER DATABASE SET Options (Transact-SQL).

IMPORTANT
When a user or system database is upgraded to SQL Server 2005 or a later version, the PAGE_VERIFY value (NONE or TORN_PAGE_DETECTION) is retained. We recommend that you use CHECKSUM. TORN_PAGE_DETECTION may use fewer resources but provides a minimal subset of the CHECKSUM protection.

Understanding Non-uniform Memory Access

SQL Server is non-uniform memory access (NUMA) aware, and performs well on NUMA hardware without special configuration. As clock speed and the number of processors increase, it becomes increasingly difficult to reduce the memory latency required to use this additional processing power. To circumvent this, hardware vendors provide large L3 caches, but this is only a limited solution. NUMA architecture provides a scalable solution to this problem. SQL Server has been designed to take advantage of NUMA-based computers without requiring any application changes. For more information, see How to: Configure SQL Server to Use Soft-NUMA.

See Also
Server Memory Server Configuration Options
Reading Pages
Writing Pages
How to: Configure SQL Server to Use Soft-NUMA
Requirements for Using Memory-Optimized Tables
Resolve Out Of Memory Issues Using Memory-Optimized Tables

Reading Pages
5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

The I/O from an instance of the SQL Server Database Engine includes logical and physical reads. A logical read occurs every time the Database Engine requests a page from the buffer cache.
If the page is not currently in the buffer cache, a physical read first copies the page from disk into the cache. The read requests generated by an instance of the Database Engine are controlled by the relational engine and optimized by the storage engine. The relational engine determines the most effective access method (such as a table scan, an index scan, or a keyed read); the access methods and buffer manager components of the storage engine determine the general pattern of reads to perform, and optimize the reads required to implement the access method. The thread executing the batch schedules the reads. Read-Ahead The Database Engine supports a performance optimization mechanism called read-ahead. Read-ahead anticipates the data and index pages needed to fulfill a query execution plan and brings the pages into the buffer cache before they are actually used by the query. This allows computation and I/O to overlap, taking full advantage of both the CPU and the disk. The read-ahead mechanism allows the Database Engine to read up to 64 contiguous pages (512KB) from one file. The read is performed as a single scatter-gather read to the appropriate number of (probably non-contiguous) buffers in the buffer cache. If any of the pages in the range are already present in the buffer cache, the corresponding page from the read will be discarded when the read completes. The range of pages may also be "trimmed" from either end if the corresponding pages are already present in the cache. There are two kinds of read-ahead: one for data pages and one for index pages. Reading Data Pages Table scans used to read data pages are very efficient in the Database Engine. The index allocation map (IAM) pages in a SQL Server database list the extents used by a table or index. The storage engine can read the IAM to build a sorted list of the disk addresses that must be read. 
This allows the storage engine to optimize its I/Os as large sequential reads that are performed in sequence, based on their location on the disk. For more information about IAM pages, see Managing Space Used by Objects.

Reading Index Pages

The storage engine reads index pages serially in key order. For example, this illustration shows a simplified representation of a set of leaf pages that contains a set of keys and the intermediate index node mapping the leaf pages. For more information about the structure of pages in an index, see Clustered Index Structures.

The storage engine uses the information in the intermediate index page above the leaf level to schedule serial read-aheads for the pages that contain the keys. If a request is made for all the keys from ABC to DEF, the storage engine first reads the index page above the leaf page. However, it does not just read each data page in sequence from page 504 to page 556 (the last page with keys in the specified range). Instead, the storage engine scans the intermediate index page and builds a list of the leaf pages that must be read. The storage engine then schedules all the reads in key order. The storage engine also recognizes that pages 504/505 and 527/528 are contiguous and performs a single scatter read to retrieve the adjacent pages in a single operation. When there are many pages to be retrieved in a serial operation, the storage engine schedules a block of reads at a time. When a subset of these reads is completed, the storage engine schedules an equal number of new reads until all the required reads have been scheduled.

The storage engine uses prefetching to speed base table lookups from nonclustered indexes. The leaf rows of a nonclustered index contain pointers to the data rows that contain each specific key value. As the storage engine reads through the leaf pages of the nonclustered index, it also starts scheduling asynchronous reads for the data rows whose pointers have already been retrieved.
This allows the storage engine to retrieve data rows from the underlying table before it has completed the scan of the nonclustered index. Prefetching is used regardless of whether the table has a clustered index. SQL Server Enterprise uses more prefetching than other editions of SQL Server, allowing more pages to be read ahead. The level of prefetching is not configurable in any edition. For more information about nonclustered indexes, see Nonclustered Index Structures. Advanced Scanning In SQL Server Enterprise, the advanced scan feature allows multiple tasks to share full table scans. If the execution plan of a Transact-SQL statement requires a scan of the data pages in a table and the Database Engine detects that the table is already being scanned for another execution plan, the Database Engine joins the second scan to the first, at the current location of the second scan. The Database Engine reads each page one time and passes the rows from each page to both execution plans. This continues until the end of the table is reached. At that point, the first execution plan has the complete results of a scan, but the second execution plan must still retrieve the data pages that were read before it joined the in-progress scan. The scan for the second execution plan then wraps back to the first data page of the table and scans forward to where it joined the first scan. Any number of scans can be combined like this. The Database Engine will keep looping through the data pages until it has completed all the scans. This mechanism is also called "merry-go-round scanning" and demonstrates why the order of the results returned from a SELECT statement cannot be guaranteed without an ORDER BY clause. For example, assume that you have a table with 500,000 pages. UserA executes a Transact-SQL statement that requires a scan of the table. When that scan has processed 100,000 pages, UserB executes another Transact-SQL statement that scans the same table. 
The Database Engine schedules one set of read requests for the pages starting with page 100,001, and passes the rows from each page back to both scans. When the scan reaches the 200,000th page, UserC executes another Transact-SQL statement that scans the same table. Starting with page 200,001, the Database Engine passes the rows from each page it reads back to all three scans. After it reads the 500,000th page, the scan for UserA is complete, and the scans for UserB and UserC wrap back and start to read the pages starting with page 1. When the Database Engine gets to page 100,000, the scan for UserB is completed. The scan for UserC then keeps going alone until it reads page 200,000. At this point, all the scans have been completed.

Without advanced scanning, each user would have to compete for buffer space and cause disk arm contention. The same pages would then be read once for each user, instead of read one time and shared by multiple users, slowing down performance and taxing resources.

See Also
Pages and Extents Architecture Guide
Writing Pages

Writing Pages
5/3/2018

The I/O from an instance of the Database Engine includes logical and physical writes. A logical write occurs when data is modified in a page in the buffer cache. A physical write occurs when the page is written from the buffer cache to disk.

When a page is modified in the buffer cache, it is not immediately written back to disk; instead, the page is marked as dirty. This means that a page can have more than one logical write made before it is physically written to disk. For each logical write, a transaction log record is inserted in the log cache that records the modification. The log records must be written to disk before the associated dirty page is removed from the buffer cache and written to disk.
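The ordering rule above (log records reach disk before the dirty page they describe) can be sketched in a few lines of Python. This is a toy model under stated assumptions: the class WriteAheadModel, its method names, and the use of a simple increasing integer as a log sequence number are all hypothetical, and the model flushes the log on demand rather than reproducing how the Database Engine actually schedules log writes.

```python
class WriteAheadModel:
    """Toy model of logical writes, log flushes, and physical page writes."""

    def __init__(self):
        self.next_lsn = 1
        self.log_cache = []   # LSNs of log records not yet written to disk
        self.flushed_lsn = 0  # highest LSN already written to the log on disk
        self.page_lsn = {}    # dirty page_id -> LSN of its latest logical write
        self.events = []      # ordered record of what reached disk

    def logical_write(self, page_id):
        """Modify a page in memory: add a log record and mark the page dirty."""
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_cache.append(lsn)
        self.page_lsn[page_id] = lsn
        return lsn

    def flush_log(self, up_to_lsn):
        """Write all cached log records with LSN <= up_to_lsn to disk."""
        flushed = [lsn for lsn in self.log_cache if lsn <= up_to_lsn]
        if flushed:
            self.log_cache = [lsn for lsn in self.log_cache if lsn > up_to_lsn]
            self.flushed_lsn = max(flushed)
            self.events.append(("log", self.flushed_lsn))

    def physical_write(self, page_id):
        """Write a dirty page, forcing the log to disk first if needed."""
        lsn = self.page_lsn.pop(page_id)
        if lsn > self.flushed_lsn:
            self.flush_log(lsn)
        self.events.append(("page", page_id))


model = WriteAheadModel()
model.logical_write("A")   # LSN 1
model.logical_write("A")   # LSN 2: same page modified again before any write
model.logical_write("B")   # LSN 3
model.physical_write("A")  # log through LSN 2 goes to disk first
model.physical_write("B")  # log through LSN 3 goes to disk first
```

In the resulting event list, every page write is preceded by a log write that covers the page's latest modification, and page A is written once even though it was modified twice.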
SQL Server uses a technique known as write-ahead logging that prevents writing a dirty page before the associated log record is written to disk. This is essential to the correct working of the recovery manager. For more information, see Write-Ahead Transaction Log.

The following illustration shows the process for writing a modified data page.

When the buffer manager writes a page, it searches for adjacent dirty pages that can be included in a single gather-write operation. Adjacent pages have consecutive page IDs and are from the same file; the pages do not have to be contiguous in memory. The search continues both forward and backward until one of the following events occurs:
- A clean page is found.
- 32 pages have been found.
- A dirty page is found whose log sequence number (LSN) has not yet been flushed in the log.
- A page is found that cannot be immediately latched.

In this way, the entire set of pages can be written to disk with a single gather-write operation.

Just before a page is written, the form of page protection specified in the database is added to the page. If torn page protection is added, the page must be latched EX(clusively) for the I/O. This is because the torn page protection modifies the page, making it unsuitable for any other thread to read. If checksum page protection is added, or the database uses no page protection, the page is latched with an UP(date) latch for the I/O. This latch prevents anyone else from modifying the page during the write, but still allows readers to use it. For more information about disk I/O page protection options, see Buffer Management.

A dirty page is written to disk in one of three ways:

Lazy writing: The lazy writer is a system process that keeps free buffers available by removing infrequently used pages from the buffer cache. Dirty pages are first written to disk.

Eager writing: The eager write process writes dirty data pages associated with nonlogged operations such as bulk insert and select into.
This process allows creating and writing new pages to take place in parallel. That is, the calling operation does not have to wait until the entire operation finishes before writing the pages to disk.

Checkpoint: The checkpoint process periodically scans the buffer cache for buffers with pages from a specified database and writes all dirty pages to disk. Checkpoints save time during a later recovery by creating a point at which all dirty pages are guaranteed to have been written to disk. The user may request a checkpoint operation by using the CHECKPOINT command, or the Database Engine may generate automatic checkpoints based on the amount of log space used and time elapsed since the last checkpoint. In addition, a checkpoint is generated when certain activities occur, for example, when a data or log file is added to or removed from a database, or when the instance of SQL Server is stopped. For more information, see Checkpoints and the Active Portion of the Log.

The lazy writing, eager writing, and checkpoint processes do not wait for the I/O operation to complete. They always use asynchronous (or overlapped) I/O and continue with other work, checking for I/O success later. This allows SQL Server to maximize both CPU and I/O resources for the appropriate tasks.

See Also
Pages and Extents Architecture Guide
Reading Pages

Pages and Extents Architecture Guide
5/3/2018

The page is the fundamental unit of data storage in SQL Server. An extent is a collection of eight physically contiguous pages. Extents help efficiently manage pages. This guide describes the data structures that are used to manage pages and extents in all versions of SQL Server. Understanding the architecture of pages and extents is important for designing and developing databases that perform efficiently.
Pages and Extents

The fundamental unit of data storage in SQL Server is the page. The disk space allocated to a data file (.mdf or .ndf) in a database is logically divided into pages numbered contiguously from 0 to n. Disk I/O operations are performed at the page level. That is, SQL Server reads or writes whole data pages.

Extents are a collection of eight physically contiguous pages and are used to efficiently manage the pages. All pages are stored in extents.

Pages

In SQL Server, the page size is 8 KB. This means SQL Server databases have 128 pages per megabyte. Each page begins with a 96-byte header that is used to store system information about the page. This information includes the page number, page type, the amount of free space on the page, and the allocation unit ID of the object that owns the page.

The following table shows the page types used in the data files of a SQL Server database.

PAGE TYPE | CONTENTS
Data | Data rows with all data, except text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data, when text in row is set to ON.
Index | Index entries.
Text/Image | Large object data types: (text, ntext, image, nvarchar(max), varchar(max), varbinary(max), and xml data). Variable length columns when the data row exceeds 8 KB: (varchar, nvarchar, varbinary, and sql_variant).
Global Allocation Map, Shared Global Allocation Map | Information about whether extents are allocated.
Page Free Space (PFS) | Information about page allocation and free space available on pages.
Index Allocation Map | Information about extents used by a table or index per allocation unit.
Bulk Changed Map | Information about extents modified by bulk operations since the last BACKUP LOG statement per allocation unit.
Differential Changed Map | Information about extents that have changed since the last BACKUP DATABASE statement per allocation unit.

NOTE
Log files do not contain pages; they contain a series of log records.
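The page arithmetic above (8-KB pages, 128 pages per megabyte, eight pages per extent) can be checked against a live database. The following is a minimal sketch, assuming it runs in the context of the database of interest; it relies on the sys.database_files catalog view, which reports file size in 8-KB pages. The column aliases are illustrative only.

```sql
-- Sketch: convert the page count of each data file into extents and megabytes.
-- sys.database_files.size is expressed in 8-KB pages.
SELECT  name,
        size                            AS size_in_pages,
        size / 8                        AS size_in_extents,  -- 8 pages per extent
        CAST(size AS bigint) * 8 / 1024 AS size_in_mb        -- 128 pages per MB
FROM    sys.database_files
WHERE   type_desc = N'ROWS';  -- data files only; log files are not organized into pages
```

Integer division is used on purpose here, so a file whose size is not a whole number of megabytes is rounded down.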
Data rows are put on the page serially, starting immediately after the header. A row offset table starts at the end of the page and contains one entry for each row on the page. Each entry records how far the first byte of the row is from the start of the page. The entries in the row offset table are in reverse sequence from the sequence of the rows on the page.

Large Row Support

Rows cannot span pages; however, portions of the row may be moved off the row's page so that the row can actually be very large. The maximum amount of data and overhead that is contained in a single row on a page is 8,060 bytes (8 KB). However, this does not include the data stored in the Text/Image page type.

This restriction is relaxed for tables that contain varchar, nvarchar, varbinary, or sql_variant columns. When the total row size of all fixed and variable columns in a table exceeds the 8,060-byte limit, SQL Server dynamically moves one or more variable length columns to pages in the ROW_OVERFLOW_DATA allocation unit, starting with the column with the largest width. This is done whenever an insert or update operation increases the total size of the row beyond the 8,060-byte limit. When a column is moved to a page in the ROW_OVERFLOW_DATA allocation unit, a 24-byte pointer on the original page in the IN_ROW_DATA allocation unit is maintained. If a subsequent operation reduces the row size, SQL Server dynamically moves the columns back to the original data page.

Extents

Extents are the basic unit in which space is managed. An extent is eight physically contiguous pages, or 64 KB. This means SQL Server databases have 16 extents per megabyte.

To make its space allocation efficient, SQL Server does not allocate whole extents to tables with small amounts of data. SQL Server has two types of extents:

- Uniform extents are owned by a single object; all eight pages in the extent can only be used by the owning object.
- Mixed extents are shared by up to eight objects.
Each of the eight pages in the extent can be owned by a different object.

A new table or index is generally allocated pages from mixed extents. When the table or index grows to the point that it has eight pages, it then switches to use uniform extents for subsequent allocations. If you create an index on an existing table that has enough rows to generate eight pages in the index, all allocations to the index are in uniform extents.

Managing Extent Allocations and Free Space

The SQL Server data structures that manage extent allocations and track free space have a relatively simple structure. This has the following benefits:

- The free space information is densely packed, so relatively few pages contain this information. This increases speed by reducing the number of disk reads that are required to retrieve allocation information. This also increases the chance that the allocation pages will remain in memory and not require more reads.
- Most of the allocation information is not chained together. This simplifies the maintenance of the allocation information. Each page allocation or deallocation can be performed quickly. This decreases the contention between concurrent tasks having to allocate or deallocate pages.

Managing Extent Allocations

SQL Server uses two types of allocation maps to record the allocation of extents:

Global Allocation Map (GAM)
GAM pages record what extents have been allocated. Each GAM covers 64,000 extents, or almost 4 GB of data. The GAM has one bit for each extent in the interval it covers. If the bit is 1, the extent is free; if the bit is 0, the extent is allocated.

Shared Global Allocation Map (SGAM)
SGAM pages record which extents are currently being used as mixed extents and also have at least one unused page. Each SGAM covers 64,000 extents, or almost 4 GB of data. The SGAM has one bit for each extent in the interval it covers. If the bit is 1, the extent is being used as a mixed extent and has a free page.
If the bit is 0, the extent is not used as a mixed extent, or it is a mixed extent and all its pages are being used.

Each extent has the following bit patterns set in the GAM and SGAM, based on its current use.

CURRENT USE OF EXTENT | GAM BIT SETTING | SGAM BIT SETTING
Free, not being used | 1 | 0
Uniform extent, or full mixed extent | 0 | 0
Mixed extent with free pages | 0 | 1

This results in simple extent management algorithms. To allocate a uniform extent, the SQL Server Database Engine searches the GAM for a 1 bit and sets it to 0. To find a mixed extent with free pages, the SQL Server Database Engine searches the SGAM for a 1 bit. To allocate a mixed extent, the SQL Server Database Engine searches the GAM for a 1 bit, sets it to 0, and then also sets the corresponding bit in the SGAM to 1. To deallocate an extent, the SQL Server Database Engine makes sure that the GAM bit is set to 1 and the SGAM bit is set to 0. The algorithms that are actually used internally by the SQL Server Database Engine are more sophisticated than what is described in this topic, because the SQL Server Database Engine distributes data evenly in a database. However, even the real algorithms are simplified by not having to manage chains of extent allocation information.

Tracking free space

Page Free Space (PFS) pages record the allocation status of each page, whether an individual page has been allocated, and the amount of free space on each page. The PFS has one byte for each page, recording whether the page is allocated, and if so, whether it is empty, 1 to 50 percent full, 51 to 80 percent full, 81 to 95 percent full, or 96 to 100 percent full.

After an extent has been allocated to an object, the Database Engine uses the PFS pages to record which pages in the extent are allocated or free. This information is used when the Database Engine has to allocate a new page. The amount of free space in a page is only maintained for heap and Text/Image pages.
It is used when the Database Engine has to find a page with free space available to hold a newly inserted row. Indexes do not require that the page free space be tracked, because the point at which to insert a new row is set by the index key values.

A PFS page is the first page after the file header page in a data file (page ID 1). This is followed by a GAM page (page ID 2), and then an SGAM page (page ID 3). Another PFS page occurs approximately 8,000 pages after the first PFS page. There is another GAM page 64,000 extents after the first GAM page on page 2, and another SGAM page 64,000 extents after the first SGAM page on page 3. The following illustration shows the sequence of pages used by the SQL Server Database Engine to allocate and manage extents.

Managing space used by objects

An Index Allocation Map (IAM) page maps the extents in a 4-gigabyte (GB) part of a database file used by an allocation unit. An allocation unit is one of three types:

IN_ROW_DATA
Holds a partition of a heap or index.

LOB_DATA
Holds large object (LOB) data types, such as xml, varbinary(max), and varchar(max).

ROW_OVERFLOW_DATA
Holds variable length data stored in varchar, nvarchar, varbinary, or sql_variant columns that exceed the 8,060-byte row size limit.

Each partition of a heap or index contains at least an IN_ROW_DATA allocation unit. It may also contain a LOB_DATA or ROW_OVERFLOW_DATA allocation unit, depending on the heap or index schema. For more information about allocation units, see Table and Index Organization.

An IAM page covers a 4-GB range in a file, which is the same coverage as a GAM or SGAM page. If the allocation unit contains extents from more than one file, or more than one 4-GB range of a file, there will be multiple IAM pages linked in an IAM chain. Therefore, each allocation unit has at least one IAM page for each file on which it has extents.
There may also be more than one IAM page on a file, if the range of the extents on the file allocated to the allocation unit exceeds the range that a single IAM page can record.

IAM pages are allocated as required for each allocation unit and are located randomly in the file. The system view, sys.system_internals_allocation_units, points to the first IAM page for an allocation unit. All the IAM pages for that allocation unit are linked in a chain.

IMPORTANT
The sys.system_internals_allocation_units system view is for internal use only and is subject to change. Compatibility is not guaranteed.

IAM pages linked in a chain per allocation unit

An IAM page has a header that indicates the starting extent of the range of extents mapped by the IAM page. The IAM page also has a large bitmap in which each bit represents one extent. The first bit in the map represents the first extent in the range, the second bit represents the second extent, and so on. If a bit is 0, the extent it represents is not allocated to the allocation unit owning the IAM page. If the bit is 1, the extent it represents is allocated to the allocation unit owning the IAM page.

When the SQL Server Database Engine has to insert a new row and no space is available in the current page, it uses the IAM and PFS pages to find a page to allocate, or, for a heap or a Text/Image page, a page with sufficient space to hold the row. The SQL Server Database Engine uses the IAM pages to find the extents allocated to the allocation unit. For each extent, the SQL Server Database Engine searches the PFS pages to see if there is a page that can be used. Each IAM and PFS page covers many data pages, so there are few IAM and PFS pages in a database. This means that the IAM and PFS pages are generally in memory in the SQL Server buffer pool, so they can be searched quickly. For indexes, the insertion point of a new row is set by the index key. In this case, the search process previously described does not occur.
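The three allocation unit types described above can be inspected without touching the internal-only view. The following is a minimal sketch that uses the documented sys.allocation_units and sys.partitions catalog views. The table name Person.Address is an assumption (the AdventureWorks sample database); substitute any table in the current database.

```sql
-- Sketch: list the allocation units that back each partition of one table.
-- Person.Address is an assumed sample table, not part of this guide's text.
SELECT  OBJECT_SCHEMA_NAME(p.object_id) AS schema_name,
        OBJECT_NAME(p.object_id)        AS object_name,
        p.index_id,
        p.partition_number,
        au.type_desc,      -- IN_ROW_DATA, LOB_DATA, or ROW_OVERFLOW_DATA
        au.total_pages,
        au.used_pages,
        au.data_pages
FROM    sys.allocation_units AS au
JOIN    sys.partitions AS p
        -- IN_ROW_DATA (1) and ROW_OVERFLOW_DATA (3) map to hobt_id;
        -- LOB_DATA (2) maps to partition_id.
        ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
        OR (au.type = 2 AND au.container_id = p.partition_id)
WHERE   p.object_id = OBJECT_ID(N'Person.Address')
ORDER BY p.index_id, p.partition_number, au.type;
```

Every partition should return at least an IN_ROW_DATA row; the other two types appear only when the schema calls for them.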
The SQL Server Database Engine allocates a new extent to an allocation unit only when it cannot quickly find a page in an existing extent with sufficient space to hold the row being inserted.

The SQL Server Database Engine allocates extents from those available in the filegroup using a proportional fill allocation algorithm. For example, if a filegroup has two files and one file has twice the free space of the other, two pages are allocated from the file with more free space for every one page allocated from the other file. This means that every file in a filegroup should have a similar percentage of space used.

Tracking Modified Extents

SQL Server uses two internal data structures to track extents modified by bulk copy operations and extents modified since the last full backup. These data structures greatly speed up differential backups. They also speed up the logging of bulk copy operations when a database is using the bulk-logged recovery model. Like the Global Allocation Map (GAM) and Shared Global Allocation Map (SGAM) pages, these structures are bitmaps in which each bit represents a single extent.

Differential Changed Map (DCM)
This tracks the extents that have changed since the last BACKUP DATABASE statement. If the bit for an extent is 1, the extent has been modified since the last BACKUP DATABASE statement. If the bit is 0, the extent has not been modified. Differential backups read just the DCM pages to determine which extents have been modified. This greatly reduces the number of pages that a differential backup must scan. The length of time that a differential backup runs is proportional to the number of extents modified since the last BACKUP DATABASE statement and not the overall size of the database.

Bulk Changed Map (BCM)
This tracks the extents that have been modified by bulk logged operations since the last BACKUP LOG statement.
If the bit for an extent is 1, the extent has been modified by a bulk logged operation after the last BACKUP LOG statement. If the bit is 0, the extent has not been modified by bulk logged operations. Although BCM pages appear in all databases, they are only relevant when the database is using the bulk-logged recovery model. In this recovery model, when a BACKUP LOG is performed, the backup process scans the BCMs for extents that have been modified. It then includes those extents in the log backup. This lets the bulk logged operations be recovered if the database is restored from a database backup and a sequence of transaction log backups. BCM pages are not relevant in a database that is using the simple recovery model, because no bulk logged operations are logged. They are not relevant in a database that is using the full recovery model, because that recovery model treats bulk logged operations as fully logged operations.

The interval between DCM pages and BCM pages is the same as the interval between GAM and SGAM pages, 64,000 extents. The DCM and BCM pages are located behind the GAM and SGAM pages in a physical file.

Post-migration Validation and Optimization Guide
5/3/2018

THIS TOPIC APPLIES TO: SQL Server Azure SQL Database Azure SQL Data Warehouse Parallel Data Warehouse

The SQL Server post-migration step is crucial for reconciling data accuracy and completeness, and for uncovering performance issues with the workload.

Common Performance Scenarios

Below are some of the common performance scenarios encountered after migrating to the SQL Server platform, and how to resolve them. These include scenarios that are specific to SQL Server to SQL Server migration (older versions to newer versions), as well as foreign platform (such as Oracle, DB2, MySQL and Sybase) to SQL Server migration.

Query regressions due to change in CE version

Applies to: SQL Server to SQL Server migration.
When migrating from an older version of SQL Server to SQL Server 2014 (12.x) or newer, and upgrading the database compatibility level to the latest one, a workload may be exposed to the risk of performance regression.

This is because starting with SQL Server 2014 (12.x), all Query Optimizer changes are tied to the latest database compatibility level, so plans are not changed right at the point of upgrade but rather when a user changes the COMPATIBILITY_LEVEL database option to the latest one. This capability, in combination with Query Store, gives you a great level of control over query performance in the upgrade process.

For more information on Query Optimizer changes introduced in SQL Server 2014 (12.x), see Optimizing Your Query Plans with the SQL Server 2014 Cardinality Estimator.

Steps to resolve

Change the database compatibility level to the source version, and follow the recommended upgrade workflow as shown in the following picture:

For more information on this topic, see Keep performance stability during the upgrade to newer SQL Server.

Sensitivity to parameter sniffing

Applies to: Foreign platform (such as Oracle, DB2, MySQL and Sybase) to SQL Server migration.

NOTE
For SQL Server to SQL Server migrations, if this issue existed in the source SQL Server, migrating to a newer version of SQL Server as-is will not address this scenario.

SQL Server compiles query plans for stored procedures by sniffing the input parameters at the first compile, generating a parameterized and reusable plan that is optimized for that input data distribution. Even outside stored procedures, most statements that generate trivial plans are parameterized. After a plan is first cached, any future execution maps to the previously cached plan.

A potential problem arises when that first compilation did not use the most common sets of parameters for the usual workload. For different parameters, the same execution plan becomes inefficient.
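The plan reuse described above can be observed directly. The following is a minimal sketch, assuming the AdventureWorks sample database; the procedure name dbo.uspGetAddressesByCity is hypothetical and introduced only for illustration. It creates a parameterized procedure, runs it with two different values, and then reads the plan cache to show that both executions shared one plan.

```sql
-- Hypothetical procedure for illustration; assumes AdventureWorks (Person.Address).
CREATE PROCEDURE dbo.uspGetAddressesByCity
    @City nvarchar(30)
AS
BEGIN
    SET NOCOUNT ON;

    SELECT AddressID, AddressLine1, PostalCode
    FROM   Person.Address
    WHERE  City = @City;
END;
GO

-- The first execution compiles a plan optimized for this parameter value.
EXEC dbo.uspGetAddressesByCity @City = N'Bothell';

-- Later executions reuse that cached plan, even when the new value
-- matches a very different number of rows.
EXEC dbo.uspGetAddressesByCity @City = N'London';
GO

-- usecounts greater than 1 indicates that the cached plan was reused.
SELECT  cp.usecounts, cp.objtype, st.text
FROM    sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE   st.dbid = DB_ID()
  AND   st.objectid = OBJECT_ID(N'dbo.uspGetAddressesByCity');
GO
```

Whether the reused plan is actually a poor fit depends on how skewed the data distribution is for the column being filtered, so treat this as a way to confirm reuse rather than proof of a regression.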
For more information on this topic, see Parameter Sniffing.

Steps to resolve

1. Use the RECOMPILE hint. A plan is compiled on every execution, adapted to each parameter value.
2. Rewrite the stored procedure to use the option (OPTIMIZE FOR( = )). Decide which value to use that suits most of the relevant workload, creating and maintaining one plan that becomes efficient for the parameterized value.
3. Rewrite the stored procedure using a local variable inside the procedure. The optimizer then uses the density vector for estimations, resulting in the same plan regardless of the parameter value.
4. Rewrite the stored procedure to use the option (OPTIMIZE FOR UNKNOWN). This has the same effect as the local variable technique.
5. Rewrite the query to use the hint DISABLE_PARAMETER_SNIFFING. This has the same effect as the local variable technique by totally disabling parameter sniffing, unless OPTION(RECOMPILE), WITH RECOMPILE or OPTIMIZE FOR is used.

TIP
Leverage the Management Studio Plan Analysis feature to quickly identify if this is an issue. More information available here.

Missing indexes

Applies to: Foreign platform (such as Oracle, DB2, MySQL and Sybase) and SQL Server to SQL Server migration.

Incorrect or missing indexes cause extra I/O, which leads to extra memory and CPU being wasted. This may be because the workload profile has changed, such as using different predicates, invalidating the existing index design. To find evidence of a poor indexing strategy or changes in workload profile:

- Look for duplicate, redundant, rarely used and completely unused indexes.
- Take special care with unused indexes that still receive updates.

Steps to resolve

1. Leverage the graphical execution plan for any Missing Index references.
2. Review indexing suggestions generated by Database Engine Tuning Advisor.
3. Leverage the Missing Indexes DMV, or use the SQL Server Performance Dashboard.
4.
Leverage pre-existing scripts that can use existing DMVs to provide insight into any missing, duplicate, redundant, rarely used and completely unused indexes, and also into whether any index reference is hinted or hard-coded into existing procedures and functions in your database.

TIP
Examples of such pre-existing scripts include Index Creation and Index Information.

Inability to use predicates to filter data

Applies to: Foreign platform (such as Oracle, DB2, MySQL and Sybase) and SQL Server to SQL Server migration.

NOTE
For SQL Server to SQL Server migrations, if this issue existed in the source SQL Server, migrating to a newer version of SQL Server as-is will not address this scenario.

The SQL Server Query Optimizer can only account for information that is known at compile time. If a workload relies on predicates that can only be known at execution time, then the potential for a poor plan choice increases. For a better-quality plan, predicates must be SARGable, or Search Argumentable.

Some examples of non-SARGable predicates:

- Implicit data conversions, like VARCHAR to NVARCHAR, or INT to VARCHAR. Look for runtime CONVERT_IMPLICIT warnings in the Actual Execution Plans. Converting from one type to another can also cause a loss of precision.
- Complex undetermined expressions such as WHERE UnitPrice + 1 < 3.975, but not WHERE UnitPrice < 320 * 200 * 32.
- Expressions using functions, such as WHERE ABS(ProductID) = 771 or WHERE UPPER(LastName) = 'Smith'.
- Strings with a leading wildcard character, such as WHERE LastName LIKE '%Smith', but not WHERE LastName LIKE 'Smith%'.

Steps to resolve

1. Always declare variables/parameters as the intended target data type. This may involve comparing any user-defined code construct that is stored in the database (such as stored procedures, user-defined functions or views) with system tables that hold information on data types used in underlying tables (such as sys.columns).
2.
If it is not feasible to review all code as described in the previous step, then for the same purpose, change the data type on the table to match any variable/parameter declaration.
3. Reason out the usefulness of the following constructs:
   - Functions being used as predicates;
   - Wildcard searches;
   - Complex expressions based on columnar data: evaluate the need to instead create persisted computed columns, which can be indexed.

NOTE
All of the above can be done programmatically.

Use of Table Valued Functions (Multi-Statement vs Inline)

Applies to: Foreign platform (such as Oracle, DB2, MySQL and Sybase) and SQL Server to SQL Server migration.

NOTE
For SQL Server to SQL Server migrations, if this issue existed in the source SQL Server, migrating to a newer version of SQL Server as-is will not address this scenario.

Table Valued Functions return a table data type that can be an alternative to views. While views are limited to a single SELECT statement, user-defined functions can contain additional statements that allow more logic than is possible in views.

IMPORTANT
Since the output table of an MSTVF (Multi-Statement Table Valued Function) is not created at compile time, the SQL Server Query Optimizer relies on heuristics, and not actual statistics, to determine row estimations. Adding indexes to the base table(s) does not help. For MSTVFs, SQL Server uses a fixed estimation of 1 for the number of rows expected to be returned by an MSTVF (starting with SQL Server 2014 (12.x), that fixed estimation is 100 rows).

Steps to resolve

1. If the Multi-Statement TVF is single statement only, convert to Inline TVF.
    CREATE FUNCTION dbo.tfnGetRecentAddress(@ID int)
    RETURNS @tblAddress TABLE ([Address] VARCHAR(60) NOT NULL)
    AS
    BEGIN
        INSERT INTO @tblAddress ([Address])
        SELECT TOP 1 [AddressLine1]
        FROM [Person].[Address]
        WHERE AddressID = @ID
        ORDER BY [ModifiedDate] DESC
        RETURN
    END

To

    CREATE FUNCTION dbo.tfnGetRecentAddress_inline(@ID int)
    RETURNS TABLE
    AS
    RETURN (
        SELECT TOP 1 [AddressLine1] AS [Address]
        FROM [Person].[Address]
        WHERE AddressID = @ID
        ORDER BY [ModifiedDate] DESC
    )

2. If more complex, consider using intermediate results stored in Memory-Optimized tables or temporary tables.

Additional Reading

Best Practice with the Query Store
Memory-Optimized Tables
User-Defined Functions
Table Variables and Row Estimations - Part 1
Table Variables and Row Estimations - Part 2
Execution Plan Caching and Reuse

Performance Center for SQL Server Database Engine and Azure SQL Database
5/3/2018

THIS TOPIC APPLIES TO: SQL Server Azure SQL Database Azure SQL Data Warehouse Parallel Data Warehouse

This page provides links to help you locate the information that you need about performance in the SQL Server Database Engine and Azure SQL Database.

Configuration Options for Performance

SQL Server provides the ability to affect database engine performance through a number of configuration options at the SQL Server Database Engine level. With Azure SQL Database, Microsoft performs most, but not all, of these optimizations for you.
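Before changing any instance-level option, it can help to see what is currently configured. The following is a minimal, read-only sketch for an on-premises SQL Server instance, using the sys.configurations catalog view; the particular set of option names is only a sample chosen for illustration.

```sql
-- Sketch: compare configured and running values for a few performance-related
-- server options. This query only reads settings; it changes nothing.
SELECT  name,
        value,          -- configured value
        value_in_use,   -- value currently in effect
        description
FROM    sys.configurations
WHERE   name IN (N'max degree of parallelism',
                 N'cost threshold for parallelism',
                 N'optimize for ad hoc workloads',
                 N'max worker threads',
                 N'fill factor (%)',
                 N'min memory per query (KB)')
ORDER BY name;
```

A difference between value and value_in_use usually means that a change has been made but not yet applied with RECONFIGURE or a restart.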
Disk configuration options
  Disk striping and RAID

Data and log file configuration options
  Place Data and Log Files on Separate Drives
  View or Change the Default Locations for Data and Log Files (SQL Server Management Studio)

TempDB configuration options
  Performance Improvements in TempDB
  Database Engine Configuration - TempDB
  Using SSDs in Azure VMs to store SQL Server TempDB and Buffer Pool Extensions
  Disk and performance best practices for temporary disk for SQL Server in Azure Virtual Machines

Server Configuration Options
  Processor configuration options
    affinity mask Server Configuration Option
    affinity Input-Output mask Server Configuration Option
    affinity64 mask Server Configuration Option
    affinity64 Input-Output mask Server Configuration Option
    Configure the max worker threads Server Configuration Option
  Memory configuration options
    Server Memory Server Configuration Options
  Index configuration options
    Configure the fill factor Server Configuration Option
  Query configuration options
    Configure the min memory per query Server Configuration Option
    Configure the query governor cost limit Server Configuration Option
    Configure the max degree of parallelism Server Configuration Option
    Configure the cost threshold for parallelism Server Configuration Option
    optimize for ad hoc workloads Server Configuration Option
  Backup configuration options
    View or Configure the backup compression default Server Configuration Option

Database configuration optimization options
  Data Compression
  View or Change the Compatibility Level of a Database
  ALTER DATABASE SCOPED CONFIGURATION (Transact-SQL)

Table configuration optimization
  Partitioned Tables and Indexes

Database Engine Performance in an Azure Virtual Machine
  Quick check list
  Virtual machine size and storage account considerations
  Disks and performance considerations
  I/O Performance Considerations
  Feature specific performance considerations

Query Performance Options

Indexes
  Reorganize and Rebuild Indexes
  Specify Fill Factor for an Index
  Configure Parallel Index Operations
  SORT_IN_TEMPDB Option For Indexes
  Improve the Performance of Full-Text Indexes
  Configure the min memory per query Server Configuration Option
  Configure the index create memory Server Configuration Option

Partitioned Tables and Indexes
  Benefits of Partitioning

Joins
  Join Fundamentals
  Nested Loops join
  Merge join
  Hash join

Subqueries
  Subquery Fundamentals
  Correlated subqueries
  Subquery types

Stored Procedures
  CREATE PROCEDURE (Transact-SQL)

User-Defined Functions
  CREATE FUNCTION (Transact-SQL)

Parallelism optimization
  Configure the max worker threads Server Configuration Option
  ALTER DATABASE SCOPED CONFIGURATION (Transact-SQL)

Query optimizer optimization
  ALTER DATABASE SCOPED CONFIGURATION (Transact-SQL)

Statistics
  When to Update Statistics
  Update Statistics

In-Memory OLTP (In-Memory Optimization)
  Memory-Optimized Tables
  Natively Compiled Stored Procedures
  Creating and Accessing Tables in TempDB from Natively Compiled Stored Procedures
  Troubleshooting Common Performance Problems with Memory-Optimized Hash Indexes
  Demonstration: Performance Improvement of In-Memory OLTP

See Also

Monitor and Tune for Performance
Monitoring Performance By Using the Query Store
Azure SQL Database performance guidance for single databases
Optimizing Azure SQL Database Performance using Elastic Pools
Azure Query Performance Insight
Index Design Guide
Memory Management Architecture Guide
Pages and Extents Architecture Guide
Post-migration Validation and Optimization Guide
Query Processing Architecture Guide
SQL Server Transaction Locking and Row Versioning Guide
SQL Server Transaction Log Architecture and Management Guide
Thread and Task Architecture Guide

Query Processing Architecture Guide
5/3/2018

THIS TOPIC APPLIES TO: SQL Server Azure SQL Database Azure SQL Data Warehouse Parallel Data Warehouse

The SQL Server Database Engine processes queries on various data storage architectures such as local tables, partitioned
tables, and tables distributed across multiple servers. The following topics cover how SQL Server processes queries and optimizes query reuse through execution plan caching.

SQL Statement Processing

Processing a single SQL statement is the most basic way that SQL Server executes SQL statements. The steps used to process a single SELECT statement that references only local base tables (no views or remote tables) illustrate the basic process.

Logical Operator Precedence

When more than one logical operator is used in a statement, NOT is evaluated first, then AND, and finally OR. Arithmetic and bitwise operators are handled before logical operators. For more information, see Operator Precedence.

In the following example, the color condition pertains to product model 21, and not to product model 20, because AND has precedence over OR.

    SELECT ProductID, ProductModelID
    FROM Production.Product
    WHERE ProductModelID = 20 OR ProductModelID = 21
      AND Color = 'Red';
    GO

You can change the meaning of the query by adding parentheses to force evaluation of the OR first. The following query finds only products under models 20 and 21 that are red.

    SELECT ProductID, ProductModelID
    FROM Production.Product
    WHERE (ProductModelID = 20 OR ProductModelID = 21)
      AND Color = 'Red';
    GO

Using parentheses, even when they are not required, can improve the readability of queries, and reduce the chance of making a subtle mistake because of operator precedence. There is no significant performance penalty in using parentheses. The following example is more readable than the original example, although the two are semantically equivalent.

    SELECT ProductID, ProductModelID
    FROM Production.Product
    WHERE ProductModelID = 20 OR (ProductModelID = 21
      AND Color = 'Red');
    GO

Optimizing SELECT statements

A SELECT statement is non-procedural; it does not state the exact steps that the database server should use to retrieve the requested data.
This means that the database server must analyze the statement to determine the most efficient way to extract the requested data. This is referred to as optimizing the SELECT statement. The component that does this is called the Query Optimizer. The input to the Query Optimizer consists of the query, the database schema (table and index definitions), and the database statistics. The output of the Query Optimizer is a query execution plan, sometimes referred to as a query plan or just a plan. The contents of a query plan are described in more detail later in this topic.

The inputs and outputs of the Query Optimizer during optimization of a single SELECT statement are illustrated in the following diagram:

A SELECT statement defines only the following:

- The format of the result set. This is specified mostly in the select list. However, other clauses such as ORDER BY and GROUP BY also affect the final form of the result set.
- The tables that contain the source data. This is specified in the FROM clause.
- How the tables are logically related for the purposes of the SELECT statement. This is defined in the join specifications, which may appear in the WHERE clause or in an ON clause following FROM.
- The conditions that the rows in the source tables must satisfy to qualify for the SELECT statement. These are specified in the WHERE and HAVING clauses.

A query execution plan is a definition of the following:

- The sequence in which the source tables are accessed. Typically, there are many sequences in which the database server can access the base tables to build the result set. For example, if the SELECT statement references three tables, the database server could first access TableA, use the data from TableA to extract matching rows from TableB, and then use the data from TableB to extract data from TableC. The other sequences in which the database server could access the tables include:
    TableC, TableB, TableA, or
    TableB, TableA, TableC, or
    TableB, TableC, TableA, or
    TableC, TableA, TableB
- The methods used to extract data from each table. Generally, there are different methods for accessing the data in each table. If only a few rows with specific key values are required, the database server can use an index. If all the rows in the table are required, the database server can ignore the indexes and perform a table scan. If all the rows in a table are required but there is an index whose key columns are in an ORDER BY, performing an index scan instead of a table scan may save a separate sort of the result set. If a table is very small, table scans may be the most efficient method for almost all access to the table.

The process of selecting one execution plan from potentially many possible plans is referred to as optimization. The Query Optimizer is one of the most important components of a SQL database system. While some overhead is used by the Query Optimizer to analyze the query and select a plan, this overhead is typically saved several-fold when the Query Optimizer picks an efficient execution plan. For example, two construction companies can be given identical blueprints for a house. If one company spends a few days at the beginning to plan how they will build the house, and the other company begins building without planning, the company that takes the time to plan their project will probably finish first.

The SQL Server Query Optimizer is a cost-based optimizer. Each possible execution plan has an associated cost in terms of the amount of computing resources used. The Query Optimizer must analyze the possible plans and choose the one with the lowest estimated cost. Some complex SELECT statements have thousands of possible execution plans. In these cases, the Query Optimizer does not analyze all possible combinations.
Instead, it uses complex algorithms to find an execution plan that has a cost reasonably close to the minimum possible cost.

The SQL Server Query Optimizer does not choose only the execution plan with the lowest resource cost; it chooses the plan that returns results to the user with a reasonable cost in resources and that returns the results the fastest. For example, processing a query in parallel typically uses more resources than processing it serially, but completes the query faster. The SQL Server Query Optimizer will use a parallel execution plan to return results if the load on the server will not be adversely affected.

The SQL Server Query Optimizer relies on distribution statistics when it estimates the resource costs of different methods for extracting information from a table or index. Distribution statistics are kept for columns and indexes. They indicate the selectivity of the values in a particular index or column. For example, in a table representing cars, many cars have the same manufacturer, but each car has a unique vehicle identification number (VIN). An index on the VIN is more selective than an index on the manufacturer. If the index statistics are not current, the Query Optimizer may not make the best choice for the current state of the table. For more information about keeping index statistics current, see Statistics.

The SQL Server Query Optimizer is important because it enables the database server to adjust dynamically to changing conditions in the database without requiring input from a programmer or database administrator. This enables programmers to focus on describing the final result of the query. They can trust that the SQL Server Query Optimizer will build an efficient execution plan for the state of the database every time the statement is run.

Processing a SELECT Statement

The basic steps that SQL Server uses to process a single SELECT statement include the following:

1. The parser scans the SELECT statement and breaks it into logical units such as keywords, expressions, operators, and identifiers.
2. A query tree, sometimes referred to as a sequence tree, is built describing the logical steps needed to transform the source data into the format required by the result set.
3. The Query Optimizer analyzes different ways the source tables can be accessed. It then selects the series of steps that returns the results fastest while using fewer resources. The query tree is updated to record this exact series of steps. The final, optimized version of the query tree is called the execution plan.
4. The relational engine starts executing the execution plan. As the steps that require data from the base tables are processed, the relational engine requests that the storage engine pass up data from the rowsets requested from the relational engine.
5. The relational engine processes the data returned from the storage engine into the format defined for the result set and returns the result set to the client.

Processing Other Statements

The basic steps described for processing a SELECT statement apply to other SQL statements such as INSERT, UPDATE, and DELETE. UPDATE and DELETE statements both have to target the set of rows to be modified or deleted. The process of identifying these rows is the same process used to identify the source rows that contribute to the result set of a SELECT statement. The UPDATE and INSERT statements may both contain embedded SELECT statements that provide the data values to be updated or inserted.

Even Data Definition Language (DDL) statements, such as CREATE PROCEDURE or ALTER TABLE, are ultimately resolved to a series of relational operations on the system catalog tables and sometimes (such as ALTER TABLE ADD COLUMN) against the data tables.

Worktables

The relational engine may need to build a worktable to perform a logical operation specified in an SQL statement.
Worktables are internal tables that are used to hold intermediate results. Worktables are generated for certain GROUP BY, ORDER BY, or UNION queries. For example, if an ORDER BY clause references columns that are not covered by any indexes, the relational engine may need to generate a worktable to sort the result set into the order requested. Worktables are also sometimes used as spools that temporarily hold the result of executing a part of a query plan. Worktables are built in tempdb and are dropped automatically when they are no longer needed.

View Resolution

The SQL Server query processor treats indexed and nonindexed views differently:

- The rows of an indexed view are stored in the database in the same format as a table. If the Query Optimizer decides to use an indexed view in a query plan, the indexed view is treated the same way as a base table.
- Only the definition of a nonindexed view is stored, not the rows of the view. The Query Optimizer incorporates the logic from the view definition into the execution plan it builds for the SQL statement that references the nonindexed view.

The logic used by the SQL Server Query Optimizer to decide when to use an indexed view is similar to the logic used to decide when to use an index on a table. If the data in the indexed view covers all or part of the SQL statement, and the Query Optimizer determines that an index on the view is the low-cost access path, the Query Optimizer will choose the index regardless of whether the view is referenced by name in the query.

When an SQL statement references a nonindexed view, the parser and Query Optimizer analyze the source of both the SQL statement and the view and then resolve them into a single execution plan. There is not one plan for the SQL statement and a separate plan for the view.
For example, consider the following view:

    USE AdventureWorks2014;
    GO
    CREATE VIEW EmployeeName AS
    SELECT h.BusinessEntityID, p.LastName, p.FirstName
    FROM HumanResources.Employee AS h
    JOIN Person.Person AS p
      ON h.BusinessEntityID = p.BusinessEntityID;
    GO

Based on this view, both of these SQL statements perform the same operations on the base tables and produce the same results:

    /* SELECT referencing the EmployeeName view. */
    SELECT LastName AS EmployeeLastName, SalesOrderID, OrderDate
    FROM AdventureWorks2014.Sales.SalesOrderHeader AS soh
    JOIN AdventureWorks2014.dbo.EmployeeName AS EmpN
      ON (soh.SalesPersonID = EmpN.BusinessEntityID)
    WHERE OrderDate > '20020531';

    /* SELECT referencing the Person and Employee tables directly. */
    SELECT LastName AS EmployeeLastName, SalesOrderID, OrderDate
    FROM AdventureWorks2014.HumanResources.Employee AS e
    JOIN AdventureWorks2014.Sales.SalesOrderHeader AS soh
      ON soh.SalesPersonID = e.BusinessEntityID
    JOIN AdventureWorks2014.Person.Person AS p
      ON e.BusinessEntityID = p.BusinessEntityID
    WHERE OrderDate > '20020531';

The SQL Server Management Studio Showplan feature shows that the relational engine builds the same execution plan for both of these SELECT statements.

Using Hints with Views

Hints that are placed on views in a query may conflict with other hints that are discovered when the view is expanded to access its base tables. When this occurs, the query returns an error.
For example, consider the following view that contains a table hint in its definition:

    USE AdventureWorks2014;
    GO
    CREATE VIEW Person.AddrState WITH SCHEMABINDING AS
    SELECT a.AddressID, a.AddressLine1,
           s.StateProvinceCode, s.CountryRegionCode
    FROM Person.Address a WITH (NOLOCK), Person.StateProvince s
    WHERE a.StateProvinceID = s.StateProvinceID;

Now suppose you enter this query:

    SELECT AddressID, AddressLine1, StateProvinceCode, CountryRegionCode
    FROM Person.AddrState WITH (SERIALIZABLE)
    WHERE StateProvinceCode = 'WA';

The query fails, because the hint SERIALIZABLE that is applied on view Person.AddrState in the query is propagated to both tables Person.Address and Person.StateProvince in the view when it is expanded. However, expanding the view also reveals the NOLOCK hint on Person.Address. Because the SERIALIZABLE and NOLOCK hints conflict, the resulting query is incorrect.

The PAGLOCK, NOLOCK, ROWLOCK, TABLOCK, and TABLOCKX table hints conflict with each other, as do the HOLDLOCK, NOLOCK, READCOMMITTED, REPEATABLEREAD, and SERIALIZABLE table hints.

Hints can propagate through levels of nested views. For example, suppose a query applies the HOLDLOCK hint on a view v1. When v1 is expanded, we find that view v2 is part of its definition. The definition of v2 includes a NOLOCK hint on one of its base tables. But this table also inherits the HOLDLOCK hint from the query on view v1. Because the NOLOCK and HOLDLOCK hints conflict, the query fails.

When the FORCE ORDER hint is used in a query that contains a view, the join order of the tables within the view is determined by the position of the view in the ordered construct.
For example, the following query selects from three tables and a view:

    SELECT * FROM Table1, Table2, View1, Table3
    WHERE Table1.Col1 = Table2.Col1
      AND Table2.Col1 = View1.Col1
      AND View1.Col2 = Table3.Col2
    OPTION (FORCE ORDER);

And View1 is defined as shown in the following:

    CREATE VIEW View1 AS
    SELECT Colx, Coly FROM TableA, TableB
    WHERE TableA.ColZ = TableB.Colz;

The join order in the query plan is Table1, Table2, TableA, TableB, Table3.

Resolving Indexes on Views

As with any index, SQL Server chooses to use an indexed view in its query plan only if the Query Optimizer determines it is beneficial to do so. Indexed views can be created in any edition of SQL Server. In some editions of some versions of SQL Server, the Query Optimizer automatically considers the indexed view. In some editions of some versions of SQL Server, to use an indexed view, the NOEXPAND table hint must be used. For clarification, see the documentation for each version.

The SQL Server Query Optimizer uses an indexed view when the following conditions are met:

- These session options are set to ON:
    ANSI_NULLS
    ANSI_PADDING
    ANSI_WARNINGS
    ARITHABORT
    CONCAT_NULL_YIELDS_NULL
    QUOTED_IDENTIFIER
- The NUMERIC_ROUNDABORT session option is set to OFF.
- The Query Optimizer finds a match between the view index columns and elements in the query, such as the following:
    Search condition predicates in the WHERE clause
    Join operations
    Aggregate functions
    GROUP BY clauses
    Table references
- Using the index has the lowest estimated cost of any access mechanism considered by the Query Optimizer.
- Every table referenced in the query (either directly, or by expanding a view to access its underlying tables) that corresponds to a table reference in the indexed view must have the same set of hints applied on it in the query.

NOTE
The READCOMMITTED and READCOMMITTEDLOCK hints are always considered different hints in this context, regardless of the current transaction isolation level.
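The session-option requirement listed above can be checked before relying on indexed view matching. The following is a minimal sketch rather than a prescribed procedure: it uses the built-in SESSIONPROPERTY function to report the seven options for the current session (1 means ON, 0 means OFF) and then sets them to the values the conditions call for. The column aliases are illustrative only.

```sql
-- Report the current session's values for the options that indexed view
-- matching depends on. SESSIONPROPERTY returns 1 for ON and 0 for OFF.
SELECT
    SESSIONPROPERTY('ANSI_NULLS')              AS ansi_nulls,               -- expect 1
    SESSIONPROPERTY('ANSI_PADDING')            AS ansi_padding,             -- expect 1
    SESSIONPROPERTY('ANSI_WARNINGS')           AS ansi_warnings,            -- expect 1
    SESSIONPROPERTY('ARITHABORT')              AS arithabort,               -- expect 1
    SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS concat_null_yields_null,  -- expect 1
    SESSIONPROPERTY('QUOTED_IDENTIFIER')       AS quoted_identifier,        -- expect 1
    SESSIONPROPERTY('NUMERIC_ROUNDABORT')      AS numeric_roundabort;       -- expect 0
GO

-- Set the options explicitly for this session if any value differs.
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT,
    CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
SET NUMERIC_ROUNDABORT OFF;
GO
```

Client libraries may apply different defaults when they connect, so the values seen in one tool are not necessarily those of an application's session.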
Other than the requirements for the SET options and table hints, these are the same rules that the Query Optimizer uses to determine whether a table index covers a query. Nothing else has to be specified in the query for an indexed view to be used.

A query does not have to explicitly reference an indexed view in the FROM clause for the Query Optimizer to use the indexed view. If the query contains references to columns in the base tables that are also present in the indexed view, and the Query Optimizer estimates that using the indexed view provides the lowest cost access mechanism, the Query Optimizer chooses the indexed view, similar to the way it chooses base table indexes when they are not directly referenced in a query. The Query Optimizer may choose the view when it contains columns that are not referenced by the query, as long as the view offers the lowest cost option for covering one or more of the columns specified in the query.

The Query Optimizer treats an indexed view referenced in the FROM clause as a standard view. The Query Optimizer expands the definition of the view into the query at the start of the optimization process. Then, indexed view matching is performed. The indexed view may be used in the final execution plan selected by the Query Optimizer, or instead, the plan may materialize necessary data from the view by accessing the base tables referenced by the view. The Query Optimizer chooses the lowest-cost alternative.

Using Hints with Indexed Views

You can prevent view indexes from being used for a query by using the EXPAND VIEWS query hint, or you can use the NOEXPAND table hint to force the use of an index for an indexed view specified in the FROM clause of a query. However, you should let the Query Optimizer dynamically determine the best access methods to use for each query. Limit your use of EXPAND VIEWS and NOEXPAND to specific cases where testing has shown that they improve performance significantly.
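As a rough illustration of where the two hints go, the following sketch assumes a hypothetical indexed view dbo.vOrderTotals with columns CustomerID and TotalDue and a hypothetical index IX_vOrderTotals_CustomerID; none of these names come from the sample database. NOEXPAND is a table hint placed on the view reference, while EXPAND VIEWS is a query hint placed in the OPTION clause.

```sql
-- dbo.vOrderTotals and IX_vOrderTotals_CustomerID are hypothetical names
-- used only for illustration.

-- Table hint: treat the view like a table and consider its indexes.
SELECT CustomerID, TotalDue
FROM dbo.vOrderTotals WITH (NOEXPAND)
WHERE CustomerID = 42;

-- Table hint with INDEX(): force a specific index on the view.
SELECT CustomerID, TotalDue
FROM dbo.vOrderTotals WITH (NOEXPAND, INDEX (IX_vOrderTotals_CustomerID))
WHERE CustomerID = 42;

-- Query hint: do not use any view index anywhere in this query.
SELECT CustomerID, TotalDue
FROM dbo.vOrderTotals
WHERE CustomerID = 42
OPTION (EXPAND VIEWS);
```

Comparing the plans and timings of the hinted and unhinted forms is one way to do the testing recommended above before keeping a hint.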
The EXPAND VIEWS option specifies that the Query Optimizer not use any view indexes for the whole query.

When NOEXPAND is specified for a view, the Query Optimizer considers using any indexes defined on the view. NOEXPAND specified with the optional INDEX() clause forces the Query Optimizer to use the specified indexes. NOEXPAND can be specified only for an indexed view and cannot be specified for a view that is not indexed.

When neither NOEXPAND nor EXPAND VIEWS is specified in a query that contains a view, the view is expanded to access underlying tables. If the query that makes up the view contains any table hints, these hints are propagated to the underlying tables. (This process is explained in more detail in View Resolution.) As long as the set of hints that exists on the underlying tables of the view are identical to each other, the query is eligible to be matched with an indexed view. Most of the time, these hints will match each other, because they are being inherited directly from the view. However, if the query references tables instead of views, and the hints applied directly on these tables are not identical, then such a query is not eligible for matching with an indexed view. If the INDEX, PAGLOCK, ROWLOCK, TABLOCKX, UPDLOCK, or XLOCK hints apply to the tables referenced in the query after view expansion, the query is not eligible for indexed view matching.

If a table hint in the form of INDEX (index_val[ ,...n] ) references a view in a query and you do not also specify the NOEXPAND hint, the index hint is ignored. To specify use of a particular index, use NOEXPAND.

Generally, when the Query Optimizer matches an indexed view to a query, any hints specified on the tables or views in the query are applied directly to the indexed view. If the Query Optimizer chooses not to use an indexed view, any hints are propagated directly to the tables referenced in the view. For more information, see View Resolution. This propagation does not apply to join hints.
They are applied only in their original position in the query. Join hints are not considered by the Query Optimizer when matching queries to indexed views. If a query plan uses an indexed view that matches part of a query that contains a join hint, the join hint is not used in the plan.

Hints are not allowed in the definitions of indexed views. In compatibility mode 80 and higher, SQL Server ignores hints inside indexed view definitions when maintaining them, or when executing queries that use indexed views. Although using hints in indexed view definitions will not produce a syntax error in 80 compatibility mode, they are ignored.

Resolving Distributed Partitioned Views

The SQL Server query processor optimizes the performance of distributed partitioned views. The most important aspect of distributed partitioned view performance is minimizing the amount of data transferred between member servers.

SQL Server builds intelligent, dynamic plans that make efficient use of distributed queries to access data from remote member tables:

- The query processor first uses OLE DB to retrieve the check constraint definitions from each member table. This allows the query processor to map the distribution of key values across the member tables.
- The query processor compares the key ranges specified in an SQL statement WHERE clause to the map that shows how the rows are distributed in the member tables. The query processor then builds a query execution plan that uses distributed queries to retrieve only those remote rows that are required to complete the SQL statement. The execution plan is also built in such a way that any access to remote member tables, for either data or metadata, is delayed until the information is required.

For example, consider a system where a customers table is partitioned across Server1 (CustomerID from 1 through 3299999), Server2 (CustomerID from 3300000 through 6599999), and Server3 (CustomerID from 6600000 through 9999999).
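A system like this could be set up roughly as follows. This is only a sketch under stated assumptions: the member table names (Customers_33, Customers_66, Customers_99), the CustomerName column, and the linked server names Server2 and Server3 are assumptions made for illustration. The part that matters for the discussion is the CHECK constraint on the partitioning column, which is what the query processor retrieves to map key ranges to member tables.

```sql
-- On Server1, in the CompanyData database: the member table that holds
-- the first key range. Table and column names are assumptions.
CREATE TABLE dbo.Customers_33
(
    CustomerID   INT           NOT NULL
        PRIMARY KEY
        CHECK (CustomerID BETWEEN 1 AND 3299999),
    CustomerName NVARCHAR(100) NOT NULL
);
GO

-- On Server1: the distributed partitioned view. Server2 and Server3 are
-- assumed to be linked servers that host the other two member tables,
-- each with its own CHECK constraint for its key range.
CREATE VIEW dbo.Customers
AS
SELECT CustomerID, CustomerName FROM CompanyData.dbo.Customers_33
UNION ALL
SELECT CustomerID, CustomerName FROM Server2.CompanyData.dbo.Customers_66
UNION ALL
SELECT CustomerID, CustomerName FROM Server3.CompanyData.dbo.Customers_99;
GO
```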
Consider the execution plan built for this query executed on Server1:

    SELECT *
    FROM CompanyData.dbo.Customers
    WHERE CustomerID BETWEEN 3200000 AND 3400000;

The execution plan for this query extracts the rows with CustomerID key values from 3200000 through 3299999 from the local member table, and issues a distributed query to retrieve the rows with key values from 3300000 through 3400000 from Server2.

The SQL Server query processor can also build dynamic logic into query execution plans for SQL statements in which the key values are not known when the plan must be built. For example, consider this stored procedure:

    CREATE PROCEDURE GetCustomer @CustomerIDParameter INT
    AS
    SELECT *
    FROM CompanyData.dbo.Customers
    WHERE CustomerID = @CustomerIDParameter;

SQL Server cannot predict what key value will be supplied by the @CustomerIDParameter parameter every time the procedure is executed. Because the key value cannot be predicted, the query processor also cannot predict which member table will have to be accessed. To handle this case, SQL Server builds an execution plan that has conditional logic, referred to as dynamic filters, to control which member table is accessed, based on the input parameter value. Assuming the GetCustomer stored procedure was executed on Server1, the execution plan logic can be represented as shown in the following:

    IF @CustomerIDParameter BETWEEN 1 and 3299999
       Retrieve row from local table CustomerData.dbo.Customer_33
    ELSE IF @CustomerIDParameter BETWEEN 3300000 and 6599999
       Retrieve row from linked table Server2.CustomerData.dbo.Customer_66
    ELSE IF @CustomerIDParameter BETWEEN 6600000 and 9999999
       Retrieve row from linked table Server3.CustomerData.dbo.Customer_99

SQL Server sometimes builds these types of dynamic execution plans even for queries that are not parameterized. The Query Optimizer may parameterize a query so that the execution plan can be reused.
If the Query Optimizer parameterizes a query referencing a partitioned view, the Query Optimizer can no longer assume the required rows will come from a specified base table. It will then have to use dynamic filters in the execution plan.

Stored Procedure and Trigger Execution

SQL Server stores only the source for stored procedures and triggers. When a stored procedure or trigger is first executed, the source is compiled into an execution plan. If the stored procedure or trigger is again executed before the execution plan is aged from memory, the relational engine detects the existing plan and reuses it. If the plan has aged out of memory, a new plan is built. This process is similar to the process SQL Server follows for all SQL statements. The main performance advantage that stored procedures and triggers have in SQL Server compared with batches of dynamic SQL is that their SQL statements are always the same. Therefore, the relational engine easily matches them with any existing execution plans. Stored procedure and trigger plans are easily reused.

The execution plan for stored procedures and triggers is executed separately from the execution plan for the batch calling the stored procedure or firing the trigger. This allows for greater reuse of the stored procedure and trigger execution plans.

Execution Plan Caching and Reuse

SQL Server has a pool of memory that is used to store both execution plans and data buffers. The percentage of the pool allocated to either execution plans or data buffers fluctuates dynamically, depending on the state of the system. The part of the memory pool that is used to store execution plans is referred to as the plan cache.

SQL Server execution plans have the following main components:

Query Execution Plan
The bulk of the execution plan is a re-entrant, read-only data structure used by any number of users. This is referred to as the query plan. No user context is stored in the query plan.
There are never more than one or two copies of the query plan in memory: one copy for all serial executions and another for all parallel executions. The parallel copy covers all parallel executions, regardless of their degree of parallelism.

Execution Context
Each user that is currently executing the query has a data structure that holds the data specific to their execution, such as parameter values. This data structure is referred to as the execution context. The execution context data structures are reused. If a user executes a query and one of the structures is not being used, it is reinitialized with the context for the new user.

When any SQL statement is executed in SQL Server, the relational engine first looks through the plan cache to check whether an execution plan for the same SQL statement already exists. SQL Server reuses any existing plan it finds, saving the overhead of recompiling the SQL statement. If no existing execution plan exists, SQL Server generates a new execution plan for the query.

SQL Server has an efficient algorithm to find any existing execution plans for any specific SQL statement. In most systems, the minimal resources that are used by this scan are less than the resources that are saved by being able to reuse existing plans instead of compiling every SQL statement.

The algorithms to match new SQL statements to existing, unused execution plans in the cache require that all object references be fully qualified. For example, the first of these SELECT statements is not matched with an existing plan, and the second is matched:

    SELECT * FROM Person;

    SELECT * FROM Person.Person;

Removing Execution Plans from the Plan Cache

Execution plans remain in the plan cache as long as there is enough memory to store them. When memory pressure exists, the SQL Server Database Engine uses a cost-based approach to determine which execution plans to remove from the plan cache.
To make a cost-based decision, the SQL Server Database Engine increases and decreases a current cost variable for each execution plan according to the following factors.

When a user process inserts an execution plan into the cache, the user process sets the current cost equal to the original query compile cost; for ad-hoc execution plans, the user process sets the current cost to zero. Thereafter, each time a user process references an execution plan, it resets the current cost to the original compile cost; for ad-hoc execution plans, the user process increases the current cost. For all plans, the maximum value for the current cost is the original compile cost.

When memory pressure exists, the SQL Server Database Engine responds by removing execution plans from the plan cache. To determine which plans to remove, the SQL Server Database Engine repeatedly examines the state of each execution plan and removes plans when their current cost is zero. An execution plan with zero current cost is not removed automatically when memory pressure exists; it is removed only when the SQL Server Database Engine examines the plan and the current cost is zero. When examining an execution plan, the SQL Server Database Engine pushes the current cost towards zero by decreasing the current cost if a query is not currently using the plan.

The SQL Server Database Engine repeatedly examines the execution plans until enough have been removed to satisfy memory requirements. While memory pressure exists, an execution plan may have its cost increased and decreased more than once. When memory pressure no longer exists, the SQL Server Database Engine stops decreasing the current cost of unused execution plans and all execution plans remain in the plan cache, even if their cost is zero.

The SQL Server Database Engine uses the resource monitor and user worker threads to free memory from the plan cache in response to memory pressure.
The resource monitor and user worker threads can examine plans concurrently to decrease the current cost for each unused execution plan. The resource monitor removes execution plans from the plan cache when global memory pressure exists. It frees memory to enforce policies for system memory, process memory, resource pool memory, and maximum size for all caches.

The maximum size for all caches is a function of the buffer pool size and cannot exceed the maximum server memory. For more information on configuring the maximum server memory, see the max server memory setting in sp_configure.

The user worker threads remove execution plans from the plan cache when single cache memory pressure exists. They enforce policies for maximum single cache size and maximum single cache entries.

The following examples illustrate which execution plans get removed from the plan cache:

- An execution plan is frequently referenced so that its cost never goes to zero. The plan remains in the plan cache and is not removed unless there is memory pressure and the current cost is zero.
- An ad-hoc execution plan is inserted and is not referenced again before memory pressure exists. Since ad-hoc plans are initialized with a current cost of zero, when the SQL Server Database Engine examines the execution plan, it will see the zero current cost and remove the plan from the plan cache. The ad-hoc execution plan remains in the plan cache with a zero current cost when memory pressure does not exist.

To manually remove a single plan or all plans from the cache, use DBCC FREEPROCCACHE.

Recompiling Execution Plans

Certain changes in a database can cause an execution plan to be either inefficient or invalid, based on the new state of the database. SQL Server detects the changes that invalidate an execution plan and marks the plan as not valid. A new plan must then be recompiled for the next connection that executes the query.
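Besides the automatic detection described above, a recompilation can be requested explicitly. The following sketch shows two common forms. It assumes the AdventureWorks2014 sample database for the first statement and a stored procedure named dbo.GetCustomer with an @CustomerIDParameter parameter for the second; substitute objects that exist in your own database.

```sql
-- Mark the stored procedures and triggers that reference this table so that
-- they are recompiled the next time they run.
USE AdventureWorks2014;
GO
EXEC sp_recompile N'Production.Product';
GO

-- Compile a fresh plan for this single execution of a procedure.
-- dbo.GetCustomer is an assumed procedure name used for illustration.
EXEC dbo.GetCustomer @CustomerIDParameter = 3250000 WITH RECOMPILE;
GO
```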
The conditions that invalidate a plan include the following:

- Changes made to a table or view referenced by the query (ALTER TABLE and ALTER VIEW).
- Changes made to a single procedure, which would drop all plans for that procedure from the cache (ALTER PROCEDURE).
- Changes to any indexes used by the execution plan.
- Updates on statistics used by the execution plan, generated either explicitly from a statement, such as UPDATE STATISTICS, or generated automatically.
- Dropping an index used by the execution plan.
- An explicit call to sp_recompile.
- Large numbers of changes to keys (generated by INSERT or DELETE statements from other users that modify a table referenced by the query).
- For tables with triggers, if the number of rows in the inserted or deleted tables grows significantly.
- Executing a stored procedure using the WITH RECOMPILE option.

Most recompilations are required either for statement correctness or to obtain potentially faster query execution plans.

In SQL Server 2000, whenever a statement within a batch causes recompilation, the whole batch, whether submitted through a stored procedure, trigger, ad-hoc batch, or prepared statement, is recompiled. Starting with SQL Server 2005, only the statement inside the batch that causes recompilation is recompiled. Because of this difference, recompilation counts in SQL Server 2000 and later releases are not comparable. Also, there are more types of recompilations in SQL Server 2005 and later because of its expanded feature set.

Statement-level recompilation benefits performance because, in most cases, a small number of statements causes recompilations and their associated penalties, in terms of CPU time and locks. These penalties are therefore avoided for the other statements in the batch that do not have to be recompiled.

The sql_statement_recompile extended event (xEvent) reports statement-level recompilations. This xEvent occurs when a statement-level recompilation is required by any kind of batch.
This includes stored procedures, triggers, ad hoc batches, and queries. Batches may be submitted through several interfaces, including sp_executesql, dynamic SQL, Prepare methods, or Execute methods.

The recompile_cause column of the sql_statement_recompile xEvent contains an integer code that indicates the reason for the recompilation. The following table contains the possible reasons:

    Schema changed
    Statistics changed
    Deferred compile
    SET option changed
    Temporary table changed
    Remote rowset changed
    FOR BROWSE permission changed
    Query notification environment changed
    Partitioned view changed
    Cursor options changed
    OPTION (RECOMPILE) requested
    Parameterized plan flushed
    Plan affecting database version changed
    Query Store plan forcing policy changed
    Query Store plan forcing failed
    Query Store missing the plan

NOTE
In SQL Server versions where xEvents are not available, the SQL Server Profiler SP:Recompile trace event can be used for the same purpose of reporting statement-level recompilations. The trace event SQL:StmtRecompile also reports statement-level recompilations, and this trace event can also be used to track and debug recompilations. Whereas SP:Recompile is generated only for stored procedures and triggers, SQL:StmtRecompile is generated for stored procedures, triggers, ad-hoc batches, batches that are executed by using sp_executesql, prepared queries, and dynamic SQL. The EventSubClass column of SP:Recompile and SQL:StmtRecompile contains an integer code that indicates the reason for the recompilation. The codes are described in the documentation for these event classes.

NOTE
When the AUTO_UPDATE_STATISTICS database option is set to ON, queries are recompiled when they target tables or indexed views whose statistics have been updated or whose cardinalities have changed significantly since the last execution. This behavior applies to standard user-defined tables, temporary tables, and the inserted and deleted tables created by DML triggers.
If query performance is affected by excessive recompilations, consider changing this setting to OFF. When the AUTO_UPDATE_STATISTICS database option is set to OFF, no recompilations occur based on statistics or cardinality changes, with the exception of the inserted and deleted tables that are created by DML INSTEAD OF triggers. Because these tables are created in tempdb, the recompilation of queries that access them depends on the setting of AUTO_UPDATE_STATISTICS in tempdb. Note that in SQL Server 2000, queries continue to recompile based on cardinality changes to the DML trigger inserted and deleted tables, even when this setting is OFF.

Parameters and Execution Plan Reuse

The use of parameters, including parameter markers in ADO, OLE DB, and ODBC applications, can increase the reuse of execution plans.

WARNING Using parameters or parameter markers to hold values that are typed by end users is more secure than concatenating the values into a string that is then executed by using either a data access API method, the EXECUTE statement, or the sp_executesql stored procedure.

The only difference between the following two SELECT statements is the values that are compared in the WHERE clause:

SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = 1;
SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = 4;

The only difference between the execution plans for these queries is the value stored for the comparison against the ProductSubcategoryID column. While the goal is for SQL Server to always recognize that the statements generate essentially the same plan and reuse the plans, SQL Server sometimes does not detect this in complex SQL statements. Separating constants from the SQL statement by using parameters helps the relational engine recognize duplicate plans.
You can use parameters in the following ways:

In Transact-SQL, use sp_executesql:

DECLARE @MyIntParm INT
SET @MyIntParm = 1
EXEC sp_executesql
    N'SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = @Parm',
    N'@Parm INT',
    @MyIntParm

This method is recommended for Transact-SQL scripts, stored procedures, or triggers that generate SQL statements dynamically.

ADO, OLE DB, and ODBC use parameter markers. Parameter markers are question marks (?) that replace a constant in an SQL statement and are bound to a program variable. For example, you would do the following in an ODBC application: Use SQLBindParameter to bind an integer variable to the first parameter marker in an SQL statement. Put the integer value in the variable. Execute the statement, specifying the parameter marker (?):

SQLExecDirect(hstmt, "SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = ?", SQL_NTS);

The SQL Server Native Client OLE DB Provider and the SQL Server Native Client ODBC driver included with SQL Server use sp_executesql to send statements to SQL Server when parameter markers are used in applications.

Design stored procedures, which use parameters by design.

If you do not explicitly build parameters into the design of your applications, you can also rely on the SQL Server Query Optimizer to automatically parameterize certain queries by using the default behavior of simple parameterization. Alternatively, you can force the Query Optimizer to consider parameterizing all queries in the database by setting the PARAMETERIZATION option of the ALTER DATABASE statement to FORCED. When forced parameterization is enabled, simple parameterization can still occur. For example, the following query cannot be parameterized according to the rules of forced parameterization:

SELECT * FROM Person.Address WHERE AddressID = 1 + 2;

However, it can be parameterized according to simple parameterization rules.
When forced parameterization is tried but fails, simple parameterization is still subsequently tried.

Simple Parameterization

In SQL Server, using parameters or parameter markers in Transact-SQL statements increases the ability of the relational engine to match new SQL statements with existing, previously-compiled execution plans.

WARNING Using parameters or parameter markers to hold values typed by end users is more secure than concatenating the values into a string that is then executed using either a data access API method, the EXECUTE statement, or the sp_executesql stored procedure.

If a SQL statement is executed without parameters, SQL Server parameterizes the statement internally to increase the possibility of matching it against an existing execution plan. This process is called simple parameterization. In SQL Server 2000, the process was referred to as auto-parameterization. Consider this statement:

SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = 1;

The value 1 at the end of the statement can be specified as a parameter. The relational engine builds the execution plan for this batch as if a parameter had been specified in place of the value 1. Because of this simple parameterization, SQL Server recognizes that the following two statements generate essentially the same execution plan and reuses the first plan for the second statement:

SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = 1;
SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductSubcategoryID = 4;

When processing complex SQL statements, the relational engine may have difficulty determining which expressions can be parameterized. To increase the ability of the relational engine to match complex SQL statements to existing, unused execution plans, explicitly specify the parameters using either sp_executesql or parameter markers.
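One way to check whether statements like the two above ended up sharing a plan is to inspect the plan cache after running them. The query below is a sketch under stated assumptions: it uses the sys.dm_exec_cached_plans and sys.dm_exec_sql_text dynamic management objects, requires VIEW SERVER STATE permission, and filters on the column name ProductSubcategoryID purely for illustration. Whether a given statement is actually parameterized depends on the optimizer, so the result should be read as an observation, not a guarantee.

```sql
-- List cached plans whose text mentions the column used in the examples.
-- A reused plan shows usecounts greater than 1.
SELECT cp.usecounts,
       cp.objtype,
       st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.text LIKE N'%ProductSubcategoryID%'
  -- Exclude this diagnostic query itself from the results.
  AND st.text NOT LIKE N'%dm_exec_cached_plans%'
ORDER BY cp.usecounts DESC;
```

When parameterization occurred, the cached text typically begins with a parameter list such as (@1 tinyint), followed by the statement with the literal replaced.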
NOTE When the +, -, *, /, or % arithmetic operators are used to perform implicit or explicit conversion of int, smallint, tinyint, or bigint constant values to the float, real, decimal or numeric data types, SQL Server applies specific rules to calculate the type and precision of the expression results. However, these rules differ, depending on whether the query is parameterized or not. Therefore, similar expressions in queries can, in some cases, produce differing results.

Under the default behavior of simple parameterization, SQL Server parameterizes a relatively small class of queries. However, you can specify that all queries in a database be parameterized, subject to certain limitations, by setting the PARAMETERIZATION option of the ALTER DATABASE command to FORCED. Doing so may improve the performance of databases that experience high volumes of concurrent queries by reducing the frequency of query compilations. Alternatively, you can specify that a single query, and any others that are syntactically equivalent but differ only in their parameter values, be parameterized.

Forced Parameterization

You can override the default simple parameterization behavior of SQL Server by specifying that all SELECT, INSERT, UPDATE, and DELETE statements in a database be parameterized, subject to certain limitations. Forced parameterization is enabled by setting the PARAMETERIZATION option to FORCED in the ALTER DATABASE statement. Forced parameterization may improve the performance of certain databases by reducing the frequency of query compilations and recompilations. Databases that may benefit from forced parameterization are generally those that experience high volumes of concurrent queries from sources such as point-of-sale applications. When the PARAMETERIZATION option is set to FORCED, any literal value that appears in a SELECT, INSERT, UPDATE, or DELETE statement, submitted in any form, is converted to a parameter during query compilation.
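As a brief illustration of the ALTER DATABASE setting described above, the following sketch switches one database to forced parameterization, confirms the setting, and then restores the default. The database name AdventureWorks2014 is taken from the earlier examples; running this against any real database is an assumption of the sketch and should follow the testing this guide recommends.

```sql
-- Enable forced parameterization for a single database.
ALTER DATABASE AdventureWorks2014 SET PARAMETERIZATION FORCED;
GO

-- Confirm the setting: 1 means FORCED, 0 means SIMPLE.
SELECT name, is_parameterization_forced
FROM sys.databases
WHERE name = N'AdventureWorks2014';
GO

-- Return to the default simple parameterization behavior.
ALTER DATABASE AdventureWorks2014 SET PARAMETERIZATION SIMPLE;
GO
```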
The exceptions to forced parameterization are literals that appear in the following query constructs: INSERT...EXECUTE statements. Statements inside the bodies of stored procedures, triggers, or user-defined functions. SQL Server already reuses query plans for these routines. Prepared statements that have already been parameterized on the client-side application. Statements that contain XQuery method calls, where the method appears in a context where its arguments would typically be parameterized, such as a WHERE clause. If the method appears in a context where its arguments would not be parameterized, the rest of the statement is parameterized. Statements inside a Transact-SQL cursor. (SELECT statements inside API cursors are parameterized.) Deprecated query constructs. Any statement that is run in the context of ANSI_PADDING or ANSI_NULLS set to OFF. Statements that contain more than 2,097 literals that are eligible for parameterization. Statements that reference variables, such as WHERE T.col2 >= @bb. Statements that contain the RECOMPILE query hint. Statements that contain a COMPUTE clause. Statements that contain a WHERE CURRENT OF clause.

Additionally, the following query clauses are not parameterized. Note that in these cases, only the clauses are not parameterized. Other clauses within the same query may be eligible for forced parameterization. The <select_list> of any SELECT statement. This includes SELECT lists of subqueries and SELECT lists inside INSERT statements. Subquery SELECT statements that appear inside an IF statement. The TOP, TABLESAMPLE, HAVING, GROUP BY, ORDER BY, OUTPUT...INTO, or FOR XML clauses of a query. Arguments, either direct or as subexpressions, to OPENROWSET, OPENQUERY, OPENDATASOURCE, OPENXML, or any FULLTEXT operator. The pattern and escape_character arguments of a LIKE clause. The style argument of a CONVERT clause. Integer constants inside an IDENTITY clause. Constants specified by using ODBC extension syntax.
Constant-foldable expressions that are arguments of the +, -, *, /, and % operators. When considering eligibility for forced parameterization, SQL Server considers an expression to be constant-foldable when either of the following conditions is true: No columns, variables, or subqueries appear in the expression. The expression contains a CASE clause. Arguments to query hint clauses. These include the number_of_rows argument of the FAST query hint, the number_of_processors argument of the MAXDOP query hint, and the number argument of the MAXRECURSION query hint.

Parameterization occurs at the level of individual Transact-SQL statements. In other words, individual statements in a batch are parameterized. After compiling, a parameterized query is executed in the context of the batch in which it was originally submitted. If an execution plan for a query is cached, you can determine whether the query was parameterized by referencing the sql column of the sys.syscacheobjects compatibility view. If a query is parameterized, the names and data types of parameters come before the text of the submitted batch in this column, such as (@1 tinyint).

NOTE Parameter names are arbitrary. Users or applications should not rely on a particular naming order. Also, the following can change between versions of SQL Server and Service Pack upgrades: Parameter names, the choice of literals that are parameterized, and the spacing in the parameterized text.

Data Types of Parameters

When SQL Server parameterizes literals, the parameters are converted to the following data types: Integer literals whose size would otherwise fit within the int data type parameterize to int. Larger integer literals that are parts of predicates that involve any comparison operator (includes <, <=, =, !=, >, >=, !<, !>, <>, ALL, ANY, SOME, BETWEEN, and IN) parameterize to numeric(38,0).
Larger literals that are not parts of predicates that involve comparison operators parameterize to numeric whose precision is just large enough to support its size and whose scale is 0. Fixed-point numeric literals that are parts of predicates that involve comparison operators parameterize to numeric whose precision is 38 and whose scale is just large enough to support its size. Fixed-point numeric literals that are not parts of predicates that involve comparison operators parameterize to numeric whose precision and scale are just large enough to support its size. Floating point numeric literals parameterize to float(53). Non-Unicode string literals parameterize to varchar(8000) if the literal fits within 8,000 characters, and to varchar(max) if it is larger than 8,000 characters. Unicode string literals parameterize to nvarchar(4000) if the literal fits within 4,000 Unicode characters, and to nvarchar(max) if the literal is larger than 4,000 characters. Binary literals parameterize to varbinary(8000) if the literal fits within 8,000 bytes. If it is larger than 8,000 bytes, it is converted to varbinary(max). Money type literals parameterize to money.

Guidelines for Using Forced Parameterization

Consider the following when you set the PARAMETERIZATION option to FORCED: Forced parameterization, in effect, changes the literal constants in a query to parameters when compiling a query. Therefore, the Query Optimizer might choose suboptimal plans for queries. In particular, the Query Optimizer is less likely to match the query to an indexed view or an index on a computed column. It may also choose suboptimal plans for queries posed on partitioned tables and distributed partitioned views. Forced parameterization should not be used for environments that rely heavily on indexed views and indexes on computed columns.
Generally, the PARAMETERIZATION FORCED option should only be used by experienced database administrators after determining that doing this does not adversely affect performance. Distributed queries that reference more than one database are eligible for forced parameterization as long as the PARAMETERIZATION option is set to FORCED in the database in whose context the query is running. Setting the PARAMETERIZATION option to FORCED flushes all query plans from the plan cache of a database, except those that currently are compiling, recompiling, or running. Plans for queries that are compiling or running during the setting change are parameterized the next time the query is executed. Setting the PARAMETERIZATION option is an online operation that requires no database-level exclusive locks. The current setting of the PARAMETERIZATION option is preserved when reattaching or restoring a database.

You can override the behavior of forced parameterization by specifying that simple parameterization be attempted on a single query, and any others that are syntactically equivalent but differ only in their parameter values. Conversely, you can specify that forced parameterization be attempted on only a set of syntactically equivalent queries, even if forced parameterization is disabled in the database. Plan guides are used for this purpose.

NOTE When the PARAMETERIZATION option is set to FORCED, the reporting of error messages may differ from when the PARAMETERIZATION option is set to SIMPLE: multiple error messages may be reported under forced parameterization, where fewer messages would be reported under simple parameterization, and the line numbers in which errors occur may be reported incorrectly.

Preparing SQL Statements

The SQL Server relational engine introduces full support for preparing SQL statements before they are executed. If an application has to execute an SQL statement several times, it can use the database API to do the following: Prepare the statement once.
This compiles the SQL statement into an execution plan. Execute the precompiled execution plan every time it has to execute the statement. This prevents having to recompile the SQL statement on each execution after the first time. Preparing and executing statements is controlled by API functions and methods. It is not part of the Transact-SQL language. The prepare/execute model of executing SQL statements is supported by the SQL Server Native Client OLE DB Provider and the SQL Server Native Client ODBC driver. On a prepare request, either the provider or the driver sends the statement to SQL Server with a request to prepare the statement. SQL Server compiles an execution plan and returns a handle for that plan to the provider or driver. On an execute request, either the provider or the driver sends the server a request to execute the plan that is associated with the handle.

Prepared statements cannot be used to create temporary objects on SQL Server. Prepared statements cannot reference system stored procedures that create temporary objects, such as temporary tables. These procedures must be executed directly.

Excess use of the prepare/execute model can degrade performance. If a statement is executed only once, a direct execution requires only one network round-trip to the server. Preparing and executing an SQL statement executed only one time requires an extra network round-trip: one trip to prepare the statement and one trip to execute it.

Preparing a statement is more effective if parameter markers are used. For example, assume that an application is occasionally asked to retrieve product information from the AdventureWorks sample database. There are two ways the application can do this. Using the first way, the application can execute a separate query for each product requested:

SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductID = 63;

Using the second way, the application does the following: 1.
Prepares a statement that contains a parameter marker (?): SELECT * FROM AdventureWorks2014.Production.Product WHERE ProductID = ?; 2. Binds a program variable to the parameter marker. 3. Each time product information is needed, fills the bound variable with the key value and executes the statement.

The second way is more efficient when the statement is executed more than three times. In SQL Server, the prepare/execute model has no significant performance advantage over direct execution, because of the way SQL Server reuses execution plans. SQL Server has efficient algorithms for matching current SQL statements with execution plans that are generated for prior executions of the same SQL statement. If an application executes a SQL statement with parameter markers multiple times, SQL Server will reuse the execution plan from the first execution for the second and subsequent executions (unless the plan ages from the plan cache). The prepare/execute model still has these benefits: Finding an execution plan by an identifying handle is more efficient than the algorithms used to match an SQL statement to existing execution plans. The application can control when the execution plan is created and when it is reused. The prepare/execute model is portable to other databases, including earlier versions of SQL Server.

Parameter Sniffing

"Parameter sniffing" refers to a process whereby SQL Server "sniffs" the current parameter values during compilation or recompilation, and passes them along to the Query Optimizer so that they can be used to generate potentially more efficient query execution plans. Parameter values are sniffed during compilation or recompilation for the following types of batches: Stored procedures. Queries submitted via sp_executesql. Prepared queries.

NOTE For queries using the RECOMPILE hint, both parameter values and current values of local variables are sniffed.
The values sniffed (of parameters and local variables) are those that exist at the place in the batch just before the statement with the RECOMPILE hint. In particular, for parameters, the values that came along with the batch invocation call are not sniffed.

Parallel Query Processing

SQL Server provides parallel queries to optimize query execution and index operations for computers that have more than one microprocessor (CPU). Because SQL Server can perform a query or index operation in parallel by using several operating system worker threads, the operation can be completed quickly and efficiently. During query optimization, SQL Server looks for queries or index operations that might benefit from parallel execution. For these queries, SQL Server inserts exchange operators into the query execution plan to prepare the query for parallel execution. An exchange operator is an operator in a query execution plan that provides process management, data redistribution, and flow control. The exchange operator includes the Distribute Streams, Repartition Streams, and Gather Streams logical operators as subtypes, one or more of which can appear in the Showplan output of a query plan for a parallel query. After exchange operators are inserted, the result is a parallel-query execution plan. A parallel-query execution plan can use more than one worker thread. A serial execution plan, used by a nonparallel query, uses only one worker thread for its execution. The actual number of worker threads used by a parallel query is determined at query plan execution initialization by the complexity of the plan and the degree of parallelism. Degree of parallelism determines the maximum number of CPUs that are being used; it does not mean the number of worker threads that are being used. The degree of parallelism value is set at the server level and can be modified by using the sp_configure system stored procedure.
You can override this value for individual query or index statements by specifying the MAXDOP query hint or MAXDOP index option. The SQL Server Query Optimizer does not use a parallel execution plan for a query if any one of the following conditions is true: The serial execution cost of the query is not high enough to consider an alternative, parallel execution plan. A serial execution plan is considered faster than any possible parallel execution plan for the particular query. The query contains scalar or relational operators that cannot be run in parallel. Certain operators can cause a section of the query plan to run in serial mode, or the whole plan to run in serial mode.

Degree of Parallelism

SQL Server automatically detects the best degree of parallelism for each instance of a parallel query execution or index data definition language (DDL) operation. It does this based on the following criteria:
1. Whether SQL Server is running on a computer that has more than one microprocessor or CPU, such as a symmetric multiprocessing computer (SMP). Only computers that have more than one CPU can use parallel queries.
2. Whether sufficient worker threads are available. Each query or index operation requires a certain number of worker threads to execute. Executing a parallel plan requires more worker threads than a serial plan, and the number of required worker threads increases with the degree of parallelism. When the worker thread requirement of the parallel plan for a specific degree of parallelism cannot be satisfied, the SQL Server Database Engine decreases the degree of parallelism automatically or completely abandons the parallel plan in the specified workload context. It then executes the serial plan (one worker thread).
3. The type of query or index operation executed. Index operations that create or rebuild an index, or drop a clustered index, and queries that use CPU cycles heavily are the best candidates for a parallel plan.
For example, joins of large tables, large aggregations, and sorting of large result sets are good candidates. For simple queries, frequently found in transaction processing applications, the additional coordination required to execute a query in parallel outweighs the potential performance boost. To distinguish between queries that benefit from parallelism and those that do not benefit, the SQL Server Database Engine compares the estimated cost of executing the query or index operation with the cost threshold for parallelism value. Users can change the default value of 5 using sp_configure if proper testing finds that a different value is better suited for the running workload.
4. Whether there are a sufficient number of rows to process. If the Query Optimizer determines that the number of rows is too low, it does not introduce exchange operators to distribute the rows. Consequently, the operators are executed serially. Executing the operators in a serial plan avoids scenarios when the startup, distribution, and coordination costs exceed the gains achieved by parallel operator execution.
5. Whether current distribution statistics are available. If the highest degree of parallelism is not possible, lower degrees are considered before the parallel plan is abandoned. For example, when you create a clustered index on a view, distribution statistics cannot be evaluated, because the clustered index does not yet exist. In this case, the SQL Server Database Engine cannot provide the highest degree of parallelism for the index operation. However, some operators, such as sorting and scanning, can still benefit from parallel execution.

NOTE Parallel index operations are only available in SQL Server Enterprise, Developer, and Evaluation editions.

At execution time, the SQL Server Database Engine determines whether the current system workload and configuration information previously described allow for parallel execution.
If parallel execution is warranted, the SQL Server Database Engine determines the optimal number of worker threads and spreads the execution of the parallel plan across those worker threads. When a query or index operation starts executing on multiple worker threads for parallel execution, the same number of worker threads is used until the operation is completed. The SQL Server Database Engine re-examines the optimal number of worker threads every time an execution plan is retrieved from the plan cache. For example, one execution of a query can result in the use of a serial plan, a later execution of the same query can result in a parallel plan using three worker threads, and a third execution can result in a parallel plan using four worker threads.

In a parallel query execution plan, the insert, update, and delete operators are executed serially. However, the WHERE clause of an UPDATE or a DELETE statement, or the SELECT part of an INSERT statement may be executed in parallel. The actual data changes are then serially applied to the database.

Static and keyset-driven cursors can be populated by parallel execution plans. However, the behavior of dynamic cursors can be provided only by serial execution. The Query Optimizer always generates a serial execution plan for a query that is part of a dynamic cursor.

Overriding Degrees of Parallelism

You can use the max degree of parallelism (MAXDOP) server configuration option (ALTER DATABASE SCOPED CONFIGURATION on SQL Database) to limit the number of processors to use in parallel plan execution. The max degree of parallelism option can be overridden for individual query and index operation statements by specifying the MAXDOP query hint or MAXDOP index option. MAXDOP provides more control over individual queries and index operations. For example, you can use the MAXDOP option to control, by increasing or reducing, the number of processors dedicated to an online index operation.
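The three levels of control described above can be sketched as follows. The values 8 and 2 are arbitrary illustrations, not recommendations, and the query and index statements reuse the AdventureWorks2014 Production.Product table from earlier examples as an assumption. Whether an index rebuild actually runs in parallel also depends on the edition, as noted elsewhere in this guide.

```sql
-- Server-wide limit. 'max degree of parallelism' is an advanced option,
-- so advanced options must be visible first. The value 8 is only an example.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max degree of parallelism', 8;
RECONFIGURE;
GO

-- Per-query override with the MAXDOP query hint.
SELECT ProductSubcategoryID, COUNT(*) AS ProductCount
FROM AdventureWorks2014.Production.Product
GROUP BY ProductSubcategoryID
OPTION (MAXDOP 2);
GO

-- Per-index-operation override with the MAXDOP index option.
USE AdventureWorks2014;
GO
ALTER INDEX ALL ON Production.Product
REBUILD WITH (MAXDOP = 2);
GO
```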
In this way, you can balance the resources used by an index operation with those of the concurrent users. Setting the max degree of parallelism option to 0 (default) enables SQL Server to use all available processors up to a maximum of 64 processors in a parallel plan execution. Although SQL Server sets a runtime target of 64 logical processors when the MAXDOP option is set to 0, a different value can be manually set if needed. Setting MAXDOP to 0 for queries and indexes allows SQL Server to use all available processors up to a maximum of 64 processors for the given queries or indexes in a parallel plan execution. MAXDOP is not an enforced value for all parallel queries, but rather a tentative target for all queries eligible for parallelism. This means that if not enough worker threads are available at runtime, a query may execute with a lower degree of parallelism than the MAXDOP server configuration option. Refer to this Microsoft Support Article for best practices on configuring MAXDOP.

Parallel Query Example

The following query counts the number of orders placed in a specific quarter, starting on April 1, 2000, and in which at least one line item of the order was received by the customer later than the committed date. This query lists the count of such orders grouped by each order priority and sorted in ascending priority order. This example uses theoretical table and column names.
SELECT o_orderpriority, COUNT(*) AS Order_Count
FROM orders
WHERE o_orderdate >= '2000/04/01'
   AND o_orderdate < DATEADD (mm, 3, '2000/04/01')
   AND EXISTS
       (
        SELECT *
        FROM lineitem
        WHERE l_orderkey = o_orderkey
           AND l_commitdate < l_receiptdate
       )
GROUP BY o_orderpriority
ORDER BY o_orderpriority

Assume the following indexes are defined on the lineitem and orders tables:

CREATE INDEX l_order_dates_idx
   ON lineitem (l_orderkey, l_receiptdate, l_commitdate, l_shipdate)

CREATE UNIQUE INDEX o_datkeyopr_idx
   ON ORDERS (o_orderdate, o_orderkey, o_custkey, o_orderpriority)

Here is one possible parallel plan generated for the query previously shown:

|--Stream Aggregate(GROUP BY:([ORDERS].[o_orderpriority]) DEFINE:([Expr1005]=COUNT(*)))
   |--Parallelism(Gather Streams, ORDER BY:([ORDERS].[o_orderpriority] ASC))
      |--Stream Aggregate(GROUP BY:([ORDERS].[o_orderpriority]) DEFINE:([Expr1005]=Count(*)))
         |--Sort(ORDER BY:([ORDERS].[o_orderpriority] ASC))
            |--Merge Join(Left Semi Join, MERGE:([ORDERS].[o_orderkey])=([LINEITEM].[l_orderkey]), RESIDUAL:([ORDERS].[o_orderkey]=[LINEITEM].[l_orderkey]))
               |--Sort(ORDER BY:([ORDERS].[o_orderkey] ASC))
               |  |--Parallelism(Repartition Streams, PARTITION COLUMNS:([ORDERS].[o_orderkey]))
               |     |--Index Seek(OBJECT:([tpcd1G].[dbo].[ORDERS].[O_DATKEYOPR_IDX]), SEEK:([ORDERS].[o_orderdate] >= Apr 1 2000 12:00AM AND [ORDERS].[o_orderdate] < Jul 1 2000 12:00AM) ORDERED)
               |--Parallelism(Repartition Streams, PARTITION COLUMNS:([LINEITEM].[l_orderkey]), ORDER BY:([LINEITEM].[l_orderkey] ASC))
                  |--Filter(WHERE:([LINEITEM].[l_commitdate]<[LINEITEM].[l_receiptdate]))
                     |--Index Scan(OBJECT:([tpcd1G].[dbo].[LINEITEM].[L_ORDER_DATES_IDX]), ORDERED)

The illustration below shows a query plan executed with a degree of parallelism equal to 4 and involving a two-table join. The parallel plan contains three parallelism operators. Both the Index Seek operator of the o_datkeyopr_idx index and the Index Scan operator of the l_order_dates_idx index are performed in parallel.
This produces several exclusive streams. This can be determined from the nearest Parallelism operators above the Index Scan and Index Seek operators, respectively. Both are of the Repartition Streams type of exchange. That is, they are just reshuffling data among the streams and producing the same number of streams on their output as they have on their input. This number of streams is equal to the degree of parallelism.

The parallelism operator above the l_order_dates_idx Index Scan operator is repartitioning its input streams using the value of L_ORDERKEY as a key. In this way, the same values of L_ORDERKEY end up in the same output stream. At the same time, output streams maintain the order on the L_ORDERKEY column to meet the input requirement of the Merge Join operator.

The parallelism operator above the Index Seek operator is repartitioning its input streams using the value of O_ORDERKEY. Because its input is not sorted on the O_ORDERKEY column values and this is the join column in the Merge Join operator, the Sort operator between the parallelism and Merge Join operators makes sure that the input is sorted for the Merge Join operator on the join columns. The Sort operator, like the Merge Join operator, is performed in parallel.

The topmost parallelism operator gathers results from several streams into a single stream. Partial aggregations performed by the Stream Aggregate operator below the parallelism operator are then accumulated into a single SUM value for each different value of the O_ORDERPRIORITY in the Stream Aggregate operator above the parallelism operator. Because this plan has two exchange segments, with degree of parallelism equal to 4, it uses eight worker threads.

For more information on the operators used in this example, refer to the Showplan Logical and Physical Operators Reference.
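The worker thread count in this example follows from multiplying the two exchange segments by the degree of parallelism: 2 x 4 = 8. To observe something similar on a running system, the tasks behind each active request can be counted. The query below is a rough sketch that assumes the sys.dm_os_tasks dynamic management view and VIEW SERVER STATE permission. The task count it reports may be one higher than the worker thread figure above, because a coordinating task is usually listed as well; treat that as an assumption to verify rather than a rule.

```sql
-- Count the tasks currently running for each request.
-- Requests showing more than one task are executing in parallel.
SELECT t.session_id,
       t.request_id,
       COUNT(*) AS task_count,
       COUNT(DISTINCT t.scheduler_id) AS schedulers_used
FROM sys.dm_os_tasks AS t
WHERE t.session_id IS NOT NULL
GROUP BY t.session_id, t.request_id
HAVING COUNT(*) > 1
ORDER BY task_count DESC;
```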
Parallel Index Operations

The query plans built for the index operations that create or rebuild an index, or drop a clustered index, allow for parallel, multi-worker threaded operations on computers that have multiple microprocessors.

NOTE: Parallel index operations are only available in Enterprise Edition, starting with SQL Server 2008.

SQL Server uses the same algorithms to determine the degree of parallelism (the total number of separate worker threads to run) for index operations as it does for other queries. The maximum degree of parallelism for an index operation is subject to the max degree of parallelism server configuration option. You can override the max degree of parallelism value for individual index operations by setting the MAXDOP index option in the CREATE INDEX, ALTER INDEX, DROP INDEX, and ALTER TABLE statements.

When the SQL Server Database Engine builds an index execution plan, the number of parallel operations is set to the lowest value from among the following:

- The number of microprocessors, or CPUs, in the computer.
- The number specified in the max degree of parallelism server configuration option.
- The number of CPUs not already over a threshold of work performed for SQL Server worker threads.

For example, on a computer that has eight CPUs, but where max degree of parallelism is set to 6, no more than six parallel worker threads are generated for an index operation. If five of the CPUs in the computer exceed the threshold of SQL Server work when an index execution plan is built, the execution plan specifies only three parallel worker threads.

The main phases of a parallel index operation include the following:

- A coordinating worker thread quickly and randomly scans the table to estimate the distribution of the index keys. The coordinating worker thread establishes the key boundaries that will create a number of key ranges equal to the degree of parallel operations, where each key range is estimated to cover similar numbers of rows.
  For example, if there are four million rows in the table and the degree of parallelism is 4, the coordinating worker thread will determine the key values that delimit four sets of rows with 1 million rows in each set. If enough key ranges cannot be established to use all CPUs, the degree of parallelism is reduced accordingly.
- The coordinating worker thread dispatches a number of worker threads equal to the degree of parallel operations and waits for these worker threads to complete their work. Each worker thread scans the base table using a filter that retrieves only rows with key values within the range assigned to the worker thread. Each worker thread builds an index structure for the rows in its key range. In the case of a partitioned index, each worker thread builds a specified number of partitions. Partitions are not shared among worker threads.
- After all the parallel worker threads have completed, the coordinating worker thread connects the index subunits into a single index. This phase applies only to offline index operations.

Individual CREATE TABLE or ALTER TABLE statements can have multiple constraints that require that an index be created. These multiple index creation operations are performed in series, although each individual index creation operation may be a parallel operation on a computer that has multiple CPUs.

Distributed Query Architecture

Microsoft SQL Server supports two methods for referencing heterogeneous OLE DB data sources in Transact-SQL statements:

- Linked server names
  The system stored procedures sp_addlinkedserver and sp_addlinkedsrvlogin are used to give a server name to an OLE DB data source. Objects in these linked servers can be referenced in Transact-SQL statements using four-part names.
  For example, if a linked server name of DeptSQLSrvr is defined against another instance of SQL Server, the following statement references a table on that server:

    SELECT JobTitle, HireDate
    FROM DeptSQLSrvr.AdventureWorks2014.HumanResources.Employee;

  The linked server name can also be specified in an OPENQUERY statement to open a rowset from the OLE DB data source. This rowset can then be referenced like a table in Transact-SQL statements.
- Ad hoc connector names
  For infrequent references to a data source, the OPENROWSET or OPENDATASOURCE functions are specified with the information needed to connect to the linked server. The rowset can then be referenced the same way a table is referenced in Transact-SQL statements:

    SELECT *
    FROM OPENROWSET('Microsoft.Jet.OLEDB.4.0',
         'c:\MSOffice\Access\Samples\Northwind.mdb';'Admin';'',
         Employees);

SQL Server uses OLE DB to communicate between the relational engine and the storage engine. The relational engine breaks down each Transact-SQL statement into a series of operations on simple OLE DB rowsets opened by the storage engine from the base tables. This means the relational engine can also open simple OLE DB rowsets on any OLE DB data source.

The relational engine uses the OLE DB application programming interface (API) to open the rowsets on linked servers, fetch the rows, and manage transactions.

For each OLE DB data source accessed as a linked server, an OLE DB provider must be present on the server running SQL Server. The set of Transact-SQL operations that can be used against a specific OLE DB data source depends on the capabilities of the OLE DB provider.

For each instance of SQL Server, members of the sysadmin fixed server role can enable or disable the use of ad-hoc connector names for an OLE DB provider using the SQL Server DisallowAdhocAccess property.
When ad-hoc access is enabled, any user logged on to that instance can execute SQL statements containing ad-hoc connector names, referencing any data source on the network that can be accessed using that OLE DB provider. To control access to data sources, members of the sysadmin role can disable ad-hoc access for that OLE DB provider, thereby limiting users to only those data sources referenced by linked server names defined by the administrators. By default, ad-hoc access is enabled for the SQL Server OLE DB provider, and disabled for all other OLE DB providers. Distributed queries can allow users to access another data source (for example, files, non-relational data sources such as Active Directory, and so on) using the security context of the Microsoft Windows account under which the SQL Server service is running. SQL Server impersonates the login appropriately for Windows logins; however, that is not possible for SQL Server logins. This can potentially allow a distributed query user to access another data source for which they do not have permissions, but the account under which the SQL Server service is running does have permissions. Use sp_addlinkedsrvlogin to define the specific logins that are authorized to access the corresponding linked server. This control is not available for ad-hoc names, so use caution in enabling an OLE DB provider for ad-hoc access. When possible, SQL Server pushes relational operations such as joins, restrictions, projections, sorts, and group by operations to the OLE DB data source. SQL Server does not default to scanning the base table into SQL Server and performing the relational operations itself. SQL Server queries the OLE DB provider to determine the level of SQL grammar it supports, and, based on that information, pushes as many relational operations as possible to the provider. 
SQL Server specifies a mechanism for an OLE DB provider to return statistics indicating how key values are distributed within the OLE DB data source. This lets the SQL Server Query Optimizer better analyze the pattern of data in the data source against the requirements of each SQL statement, increasing the ability of the Query Optimizer to generate optimal execution plans.

Query Processing Enhancements on Partitioned Tables and Indexes

SQL Server 2008 improved query processing performance on partitioned tables for many parallel plans, changed the way parallel and serial plans are represented, and enhanced the partitioning information provided in both compile-time and run-time execution plans. This topic describes these improvements, provides guidance on how to interpret the query execution plans of partitioned tables and indexes, and provides best practices for improving query performance on partitioned objects.

NOTE: Partitioned tables and indexes are supported only in the SQL Server Enterprise, Developer, and Evaluation editions.

New Partition-Aware Seek Operation

In SQL Server, the internal representation of a partitioned table is changed so that the table appears to the query processor to be a multicolumn index with PartitionID as the leading column. PartitionID is a hidden computed column used internally to represent the ID of the partition containing a specific row. For example, assume the table T, defined as T(a, b, c), is partitioned on column a, and has a clustered index on column b. In SQL Server, this partitioned table is treated internally as a nonpartitioned table with the schema T(PartitionID, a, b, c) and a clustered index on the composite key (PartitionID, b). This allows the Query Optimizer to perform seek operations based on PartitionID on any partitioned table or index.

Partition elimination is now done in this seek operation.
In addition, the Query Optimizer is extended so that a seek or scan operation with one condition can be done on PartitionID (as the logical leading column) and possibly other index key columns, and then a second-level seek, with a different condition, can be done on one or more additional columns, for each distinct value that meets the qualification for the first-level seek operation. That is, this operation, called a skip scan, allows the Query Optimizer to perform a seek or scan operation based on one condition to determine the partitions to be accessed and a second-level index seek operation within that operator to return rows from these partitions that meet a different condition. For example, consider the following query. SELECT * FROM T WHERE a < 10 and b = 2; For this example, assume that table T, defined as T(a, b, c) , is partitioned on column a, and has a clustered index on column b. The partition boundaries for table T are defined by the following partition function: CREATE PARTITION FUNCTION myRangePF1 (int) AS RANGE LEFT FOR VALUES (3, 7, 10); To solve the query, the query processor performs a first-level seek operation to find every partition that contains rows that meet the condition T.a < 10 . This identifies the partitions to be accessed. Within each partition identified, the processor then performs a second-level seek into the clustered index on column b to find the rows that meet the condition T.b = 2 and T.a < 10 . The following illustration is a logical representation of the skip scan operation. It shows table T with data in columns a and b . The partitions are numbered 1 through 4 with the partition boundaries shown by dashed vertical lines. A first-level seek operation to the partitions (not shown in the illustration) has determined that partitions 1, 2, and 3 meet the seek condition implied by the partitioning defined for the table and the predicate on column a . That is, T.a < 10 . 
The path traversed by the second-level seek portion of the skip scan operation is illustrated by the curved line. Essentially, the skip scan operation seeks into each of these partitions for rows that meet the condition b = 2. The total cost of the skip scan operation is the same as that of three separate index seeks.

Displaying Partitioning Information in Query Execution Plans

The execution plans of queries on partitioned tables and indexes can be examined by using the Transact-SQL SET statements SET SHOWPLAN_XML or SET STATISTICS XML, or by using the graphical execution plan output in SQL Server Management Studio. For example, you can display the compile-time execution plan by clicking Display Estimated Execution Plan on the Query Editor toolbar and the run-time plan by clicking Include Actual Execution Plan.

Using these tools, you can ascertain the following information:

- The operations such as scans, seeks, inserts, updates, merges, and deletes that access partitioned tables or indexes.
- The partitions accessed by the query. For example, the total count of partitions accessed and the ranges of contiguous partitions that are accessed are available in run-time execution plans.
- When the skip scan operation is used in a seek or scan operation to retrieve data from one or more partitions.

Partition Information Enhancements

SQL Server provides enhanced partitioning information for both compile-time and run-time execution plans. Execution plans now provide the following information:

- An optional Partitioned attribute that indicates that an operator, such as a seek, scan, insert, update, merge, or delete, is performed on a partitioned table.
- A new SeekPredicateNew element with a SeekKeys subelement that includes PartitionID as the leading index key column and filter conditions that specify range seeks on PartitionID. The presence of two SeekKeys subelements indicates that a skip scan operation on PartitionID is used.
- Summary information that provides a total count of the partitions accessed. This information is available only in run-time plans.

To demonstrate how this information is displayed in both the graphical execution plan output and the XML Showplan output, consider the following query on the partitioned table fact_sales. This query updates data in two partitions.

    UPDATE fact_sales
    SET quantity = quantity * 2
    WHERE date_id BETWEEN 20080802 AND 20080902;

The following illustration shows the properties of the Clustered Index Seek operator in the compile-time execution plan for this query. To view the definition of the fact_sales table and the partition definition, see "Example" in this topic.

Partitioned Attribute

When an operator such as an Index Seek is executed on a partitioned table or index, the Partitioned attribute appears in the compile-time and run-time plan and is set to True (1). The attribute does not display when it is set to False (0).

The Partitioned attribute can appear in the following physical and logical operators:

- Table Scan
- Index Scan
- Index Seek
- Insert
- Update
- Delete
- Merge

As shown in the previous illustration, this attribute is displayed in the properties of the operator in which it is defined. In the XML Showplan output, this attribute appears as Partitioned="1" in the RelOp node of the operator in which it is defined.

New Seek Predicate

In XML Showplan output, the SeekPredicateNew element appears in the operator in which it is defined. It can contain up to two occurrences of the SeekKeys sub-element. The first SeekKeys item specifies the first-level seek operation at the partition ID level of the logical index. That is, this seek determines the partitions that must be accessed to satisfy the conditions of the query. The second SeekKeys item specifies the second-level seek portion of the skip scan operation that occurs within each partition identified in the first-level seek.
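The two seek levels can be sketched in Python using the earlier example of table T, partitioned on column a with RANGE LEFT boundaries (3, 7, 10) and indexed on column b within each partition. This is an illustrative model only: the function names, the sample rows, and the dictionary-of-sorted-lists layout are assumptions made for the sketch and do not describe how SQL Server stores or seeks an index.

```python
from bisect import bisect_left

# Boundaries from: CREATE PARTITION FUNCTION myRangePF1 (int)
#                  AS RANGE LEFT FOR VALUES (3, 7, 10)
BOUNDARIES = [3, 7, 10]

def partition_id(a):
    """RANGE LEFT: a boundary value belongs to the partition on its left."""
    return bisect_left(BOUNDARIES, a) + 1

def build_index(rows):
    """Model of the logical index keyed on (PartitionID, b).
    Each partition keeps its (b, a) entries sorted by b."""
    index = {}
    for a, b in rows:
        index.setdefault(partition_id(a), []).append((b, a))
    for entries in index.values():
        entries.sort()
    return index

def skip_scan(index, a_limit, b_value):
    """Return rows with a < a_limit and b = b_value."""
    matches = []
    # First-level seek: only partitions that can hold values below a_limit.
    for pid in range(1, partition_id(a_limit) + 1):
        entries = index.get(pid, [])
        # Second-level seek: jump to the first entry with b >= b_value.
        i = bisect_left(entries, (b_value,))
        while i < len(entries) and entries[i][0] == b_value:
            b, a = entries[i]
            if a < a_limit:  # the predicate on a still applies inside the partition
                matches.append((a, b))
            i += 1
    return matches

# Hypothetical sample data as (a, b) pairs.
rows = [(1, 2), (2, 5), (3, 2), (5, 2), (6, 1), (8, 2), (9, 3), (10, 2), (12, 2)]
index = build_index(rows)
found = skip_scan(index, a_limit=10, b_value=2)
print(found)
```

For the predicate a < 10 the first-level seek selects partitions 1 through 3, so the row with a = 12 in partition 4 is never visited, and the row with a = 10 is rejected inside partition 3 by the condition on a.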
Partition Summary Information

In run-time execution plans, partition summary information provides a count of the partitions accessed and the identity of the actual partitions accessed. You can use this information to verify that the correct partitions are accessed in the query and that all other partitions are eliminated from consideration.

The following information is provided: Actual Partition Count and Partitions Accessed.

- Actual Partition Count is the total number of partitions accessed by the query.
- Partitions Accessed, in XML Showplan output, is the partition summary information that appears in the new RuntimePartitionSummary element in the RelOp node of the operator in which it is defined. The following example shows the contents of the RuntimePartitionSummary element, indicating that two total partitions are accessed (partitions 2 and 3).

Displaying Partition Information by Using Other Showplan Methods

The Showplan methods SHOWPLAN_ALL, SHOWPLAN_TEXT, and STATISTICS PROFILE do not report the partition information described in this topic, with the following exception. As part of the SEEK predicate, the partitions to be accessed are identified by a range predicate on the computed column representing the partition ID. The following example shows the SEEK predicate for a Clustered Index Seek operator. Partitions 2 and 3 are accessed, and the seek operator filters on the rows that meet the condition date_id BETWEEN 20080802 AND 20080902.

    |--Clustered Index Seek(OBJECT:([db_sales_test].[dbo].[fact_sales].[ci]),
          SEEK:([PtnId1000] >= (2) AND [PtnId1000] <= (3)
            AND [db_sales_test].[dbo].[fact_sales].[date_id] >= (20080802)
            AND [db_sales_test].[dbo].[fact_sales].[date_id] <= (20080902))
          ORDERED FORWARD)

Interpreting Execution Plans for Partitioned Heaps

A partitioned heap is treated as a logical index on the partition ID. Partition elimination on a partitioned heap is represented in an execution plan as a Table Scan operator with a SEEK predicate on partition ID.
The following example shows the Showplan information provided:

    |-- Table Scan (OBJECT: ([db].[dbo].[T]), SEEK: ([PtnId1001]=[Expr1011]) ORDERED FORWARD)

Interpreting Execution Plans for Collocated Joins

Join collocation can occur when two tables are partitioned using the same or equivalent partitioning function and the partitioning columns from both sides of the join are specified in the join condition of the query. The Query Optimizer can generate a plan where the partitions of each table that have equal partition IDs are joined separately. Collocated joins can be faster than non-collocated joins because they can require less memory and processing time. The Query Optimizer chooses a non-collocated plan or a collocated plan based on cost estimates.

In a collocated plan, the Nested Loops join reads one or more joined table or index partitions from the inner side. The numbers within the Constant Scan operators represent the partition numbers.

When parallel plans for collocated joins are generated for partitioned tables or indexes, a Parallelism operator appears between the Constant Scan and the Nested Loops join operators. In this case, multiple worker threads on the outer side of the join each read and work on a different partition.

The following illustration demonstrates a parallel query plan for a collocated join.

Parallel Query Execution Strategy for Partitioned Objects

The query processor uses a parallel execution strategy for queries that select from partitioned objects. As part of the execution strategy, the query processor determines the table partitions required for the query and the proportion of worker threads to allocate to each partition. In most cases, the query processor allocates an equal or almost equal number of worker threads to each partition, and then executes the query in parallel across the partitions. The following paragraphs explain worker thread allocation in greater detail.
- If the number of worker threads is less than the number of partitions, the query processor assigns each worker thread to a different partition, initially leaving one or more partitions without an assigned worker thread. When a worker thread finishes executing on a partition, the query processor assigns it to the next partition until each partition has been assigned a single worker thread. This is the only case in which the query processor reallocates worker threads to other partitions.
  (Illustration: a worker thread is reassigned after it finishes.)
- If the number of worker threads is equal to the number of partitions, the query processor assigns one worker thread to each partition. When a worker thread finishes, it is not reallocated to another partition.
- If the number of worker threads is greater than the number of partitions, the query processor allocates an equal number of worker threads to each partition. If the number of worker threads is not an exact multiple of the number of partitions, the query processor allocates one additional worker thread to some partitions in order to use all of the available worker threads. Note that if there is only one partition, all worker threads will be assigned to that partition. In the diagram below, there are four partitions and 14 worker threads. Each partition has 3 worker threads assigned, and two partitions have an additional worker thread, for a total of 14 worker thread assignments. When a worker thread finishes, it is not reassigned to another partition.

Although the above examples suggest a straightforward way to allocate worker threads, the actual strategy is more complex and accounts for other variables that occur during query execution. For example, if the table is partitioned and has a clustered index on column A and a query has the predicate clause WHERE A IN (13, 17, 25), the query processor will allocate one or more worker threads to each of these three seek values (A=13, A=17, and A=25) instead of each table partition.
It is only necessary to execute the query in the partitions that contain these values, and if all of these seek predicates happen to be in the same table partition, all of the worker threads will be assigned to the same table partition.

To take another example, suppose that the table has four partitions on column A with boundary points (10, 20, 30), an index on column B, and the query has a predicate clause WHERE B IN (50, 100, 150). Because the table partitions are based on the values of A, the values of B can occur in any of the table partitions. Thus, the query processor will seek for each of the three values of B (50, 100, 150) in each of the four table partitions. The query processor will assign worker threads proportionately so that it can execute each of these 12 query scans in parallel.

    Table partitions based on column A       Seeks for column B in each table partition
    Table Partition 1: A < 10                B=50, B=100, B=150
    Table Partition 2: A >= 10 AND A < 20    B=50, B=100, B=150
    Table Partition 3: A >= 20 AND A < 30    B=50, B=100, B=150
    Table Partition 4: A >= 30               B=50, B=100, B=150

Best Practices

To improve the performance of queries that access a large amount of data from large partitioned tables and indexes, we recommend the following best practices:

- Stripe each partition across many disks. This is especially relevant when using spinning disks.
- When possible, use a server with enough main memory to fit frequently accessed partitions or all partitions in memory to reduce I/O cost.
- If the data you query will not fit in memory, compress the tables and indexes. This will reduce I/O cost.
- Use a server with fast processors and as many processor cores as you can afford, to take advantage of parallel query processing capability.
- Ensure the server has sufficient I/O controller bandwidth.
- Create a clustered index on every large partitioned table to take advantage of B-tree scanning optimizations.
- Follow the best practice recommendations in the white paper, The Data Loading Performance Guide, when bulk loading data into partitioned tables.

Example

The following example creates a test database containing a single table with seven partitions. Use the tools described previously when executing the queries in this example to view partitioning information for both compile-time and run-time plans.

NOTE: This example inserts more than 1 million rows into the table. Running this example may take several minutes depending on your hardware. Before executing this example, verify that you have more than 1.5 GB of disk space available.

    USE master;
    GO
    IF DB_ID (N'db_sales_test') IS NOT NULL
        DROP DATABASE db_sales_test;
    GO
    CREATE DATABASE db_sales_test;
    GO
    USE db_sales_test;
    GO
    CREATE PARTITION FUNCTION [pf_range_fact](int) AS RANGE RIGHT FOR VALUES
    (20080801, 20080901, 20081001, 20081101, 20081201, 20090101);
    GO
    CREATE PARTITION SCHEME [ps_fact_sales] AS PARTITION [pf_range_fact]
    ALL TO ([PRIMARY]);
    GO
    CREATE TABLE fact_sales(date_id int, product_id int, store_id int,
        quantity int, unit_price numeric(7,2), other_data char(1000))
    ON ps_fact_sales(date_id);
    GO
    CREATE CLUSTERED INDEX ci ON fact_sales(date_id);
    GO
    PRINT 'Loading...';
    SET NOCOUNT ON;
    DECLARE @i int;
    SET @i = 1;
    WHILE (@i<1000000)
    BEGIN
        INSERT INTO fact_sales VALUES(20080800 + (@i%30) + 1, @i%10000, @i%200, RAND() * 25, (@i%3) + 1, '');
        SET @i += 1;
    END;
    GO
    DECLARE @i int;
    SET @i = 1;
    WHILE (@i<10000)
    BEGIN
        INSERT INTO fact_sales VALUES(20080900 + (@i%30) + 1, @i%10000, @i%200, RAND() * 25, (@i%3) + 1, '');
        SET @i += 1;
    END;
    PRINT 'Done.';
    GO
    -- Two-partition query.
    SET STATISTICS XML ON;
    GO
    SELECT date_id, SUM(quantity*unit_price) AS total_price
    FROM fact_sales
    WHERE date_id BETWEEN 20080802 AND 20080902
    GROUP BY date_id;
    GO
    SET STATISTICS XML OFF;
    GO
    -- Single-partition query.
    SET STATISTICS XML ON;
    GO
    SELECT date_id, SUM(quantity*unit_price) AS total_price
    FROM fact_sales
    WHERE date_id BETWEEN 20080801 AND 20080831
    GROUP BY date_id;
    GO
    SET STATISTICS XML OFF;
    GO

Additional Reading

- Showplan Logical and Physical Operators Reference
- Extended Events
- Best Practice with the Query Store
- Cardinality Estimation
- Adaptive query processing
- Operator Precedence

Thread and Task Architecture Guide

5/3/2018

THIS TOPIC APPLIES TO: SQL Server Azure SQL Database Azure SQL Data Warehouse Parallel Data Warehouse

Threads are an operating system feature that lets application logic be separated into several concurrent execution paths. This feature is useful when complex applications have many tasks that can be performed at the same time.

When an operating system executes an instance of an application, it creates a unit called a process to manage the instance. The process has a thread of execution. This is the series of programming instructions performed by the application code. For example, if a simple application has a single set of instructions that can be performed serially, there is just one execution path or thread through the application. More complex applications may have several tasks that can be performed in tandem, instead of serially. The application can do this by starting separate processes for each task. However, starting a process is a resource-intensive operation. Instead, an application can start separate threads. These are relatively less resource-intensive. Additionally, each thread can be scheduled for execution independently from the other threads associated with a process.

Threads allow complex applications to make more effective use of a CPU, even on computers that have a single CPU. With one CPU, only one thread can execute at a time.
If one thread executes a long-running operation that does not use the CPU, such as a disk read or write, another one of the threads can execute until the first operation is completed. By being able to execute threads while other threads are waiting for an operation to be completed, an application can maximize its use of the CPU. This is especially true for multi-user, disk I/O intensive applications such as a database server. Computers that have multiple microprocessors or CPUs can execute one thread per CPU at the same time. For example, if a computer has eight CPUs, it can execute eight threads at the same time.

SQL Server Batch or Task Scheduling

Allocating Threads to a CPU

By default, each instance of SQL Server starts each thread. If affinity has been enabled, the operating system assigns each thread to a specific CPU. The operating system distributes threads from instances of SQL Server among the microprocessors, or CPUs, on a computer based on load. Sometimes, the operating system can also move a thread from one CPU with heavy usage to another CPU. In contrast, the SQL Server Database Engine assigns worker threads to schedulers that distribute the threads evenly among the CPUs.

The affinity mask option is set by using ALTER SERVER CONFIGURATION. When the affinity mask is not set, the instance of SQL Server allocates worker threads evenly among the schedulers that have not been masked off.

Using the lightweight pooling Option

The overhead involved in switching thread contexts is not very large. Most instances of SQL Server will not see any performance differences between setting the lightweight pooling option to 0 or 1. The only instances of SQL Server that might benefit from lightweight pooling are those that run on a computer having the following characteristics:

- A large multi-CPU server.
- All the CPUs are running near maximum capacity.
- There is a high level of context switching.
These systems may see a small increase in performance if the lightweight pooling value is set to 1.

We do not recommend that you use fiber mode scheduling for routine operation. This is because it can decrease performance by inhibiting the regular benefits of context switching, and because some components of SQL Server cannot function correctly in fiber mode. For more information, see lightweight pooling.

Thread and Fiber Execution

Microsoft Windows uses a numeric priority system that ranges from 1 through 31 to schedule threads for execution. Zero is reserved for operating system use. When several threads are waiting to execute, Windows dispatches the thread with the highest priority.

By default, each instance of SQL Server runs at a priority of 7, which is referred to as the normal priority. This default gives SQL Server threads a high enough priority to obtain sufficient CPU resources without adversely affecting other applications.

The priority boost configuration option can be used to increase the priority of the threads from an instance of SQL Server to 13. This is referred to as high priority. This setting gives SQL Server threads a higher priority than most other applications. Thus, SQL Server threads will generally be dispatched whenever they are ready to run and will not be pre-empted by threads from other applications. This can improve performance when a server is running only instances of SQL Server and no other applications. However, if a memory-intensive operation occurs in SQL Server, other applications are not likely to have a high enough priority to pre-empt the SQL Server thread.

If you are running multiple instances of SQL Server on a computer, and turn on priority boost for only some of the instances, the performance of any instances running at normal priority can be adversely affected. Also, the performance of other applications and components on the server can decline if priority boost is turned on.
Therefore, it should only be used under tightly controlled conditions.

Hot Add CPU

Hot add CPU is the ability to dynamically add CPUs to a running system. Adding CPUs can occur physically by adding new hardware, logically by online hardware partitioning, or virtually through a virtualization layer. Starting with SQL Server 2008, SQL Server supports hot add CPU.

Requirements for hot add CPU:

- Requires hardware that supports hot add CPU.
- Requires the 64-bit edition of Windows Server 2008 Datacenter or the Windows Server 2008 Enterprise Edition for Itanium-Based Systems operating system.
- Requires SQL Server Enterprise.
- SQL Server cannot be configured to use soft NUMA. For more information about soft NUMA, see Soft-NUMA (SQL Server).

SQL Server does not automatically start to use CPUs after they are added. This prevents SQL Server from using CPUs that might be added for some other purpose. After adding CPUs, execute the RECONFIGURE statement, so that SQL Server will recognize the new CPUs as available resources.

NOTE: If the affinity64 mask is configured, the affinity64 mask must be modified to use the new CPUs.

Best Practices for Running SQL Server on Computers That Have More Than 64 CPUs

Assigning Hardware Threads with CPUs

Do not use the affinity mask and affinity64 mask server configuration options to bind processors to specific threads. These options are limited to 64 CPUs. Use the SET PROCESS AFFINITY option of ALTER SERVER CONFIGURATION instead.

Managing the Transaction Log File Size

Do not rely on autogrow to increase the size of the transaction log file. Increasing the transaction log must be a serial process. Extending the log can prevent transaction write operations from proceeding until the log extension is finished. Instead, preallocate space for the log files by setting the file size to a value large enough to support the typical workload in the environment.
Setting Max Degree of Parallelism for Index Operations

The performance of index operations such as creating or rebuilding indexes can be improved on computers that have many CPUs by temporarily setting the recovery model of the database to either the bulk-logged or simple recovery model. These index operations can generate significant log activity, and log contention can affect the best degree of parallelism (DOP) choice made by SQL Server. In addition, consider adjusting the max degree of parallelism (MAXDOP) server configuration option for these operations. The following guidelines are based on internal tests and are general recommendations. You should try several different MAXDOP settings to determine the optimal setting for your environment.
- For the full recovery model, limit the value of the max degree of parallelism option to eight or less.
- For the bulk-logged model or the simple recovery model, setting the value of the max degree of parallelism option to a value higher than eight should be considered.
- For servers that have NUMA configured, the maximum degree of parallelism should not exceed the number of CPUs that are assigned to each NUMA node. This is because the query is more likely to use local memory from one NUMA node, which can improve memory access time.
- For servers that have hyper-threading enabled and were manufactured in 2009 or earlier (before the hyper-threading feature was improved), the MAXDOP value should not exceed the number of physical processors, rather than logical processors.

For more information about the max degree of parallelism option, see Configure the max degree of parallelism Server Configuration Option.

Setting the Maximum Number of Worker Threads

Always set the maximum number of worker threads to be more than the setting for the maximum degree of parallelism. The number of worker threads must always be set to a value of at least seven times the number of CPUs that are present on the server.
For more information, see Configure the max worker threads Option.

Using SQL Trace and SQL Server Profiler

We recommend that you do not use SQL Trace and SQL Server Profiler in a production environment. The overhead for running these tools also increases as the number of CPUs increases. If you must use SQL Trace in a production environment, limit the number of trace events to a minimum. Carefully profile and test each trace event under load, and avoid using combinations of events that significantly affect performance.

Setting the Number of tempdb Data Files

Typically, the number of tempdb data files should match the number of CPUs. However, by carefully considering the concurrency needs of tempdb, you can reduce database management. For example, if a system has 64 CPUs and usually only 32 queries use tempdb, increasing the number of tempdb files to 64 will not improve performance.

SQL Server Components That Can Use More Than 64 CPUs

The following table lists SQL Server components and indicates whether they can use more than 64 CPUs.

PROCESS NAME                    EXECUTABLE PROGRAM    USE MORE THAN 64 CPUS
SQL Server Database Engine      Sqlserver.exe         Yes
Reporting Services              Rs.exe                No
Analysis Services               As.exe                No
Integration Services            Is.exe                No
Service Broker                  Sb.exe                No
Full-Text Search                Fts.exe               No
SQL Server Agent                Sqlagent.exe          No
SQL Server Management Studio    Ssms.exe              No
SQL Server Setup                Setup.exe             No

SQL Server Transaction Log Architecture and Management Guide

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

Every SQL Server database has a transaction log that records all transactions and the database modifications that are made by each transaction. The transaction log is a critical component of the database and, if there is a system failure, the transaction log might be required to bring your database back to a consistent state.
This guide provides information about the physical and logical architecture of the transaction log. Understanding the architecture can improve your effectiveness in managing transaction logs. Transaction Log Logical Architecture The SQL Server transaction log operates logically as if the transaction log is a string of log records. Each log record is identified by a log sequence number (LSN). Each new log record is written to the logical end of the log with an LSN that is higher than the LSN of the record before it. Log records are stored in a serial sequence as they are created. Each log record contains the ID of the transaction that it belongs to. For each transaction, all log records associated with the transaction are individually linked in a chain using backward pointers that speed the rollback of the transaction. Log records for data modifications record either the logical operation performed or they record the before and after images of the modified data. The before image is a copy of the data before the operation is performed; the after image is a copy of the data after the operation has been performed. The steps to recover an operation depend on the type of log record: Logical operation logged To roll the logical operation forward, the operation is performed again. To roll the logical operation back, the reverse logical operation is performed. Before and after image logged To roll the operation forward, the after image is applied. To roll the operation back, the before image is applied. Many types of operations are recorded in the transaction log. These operations include: The start and end of each transaction. Every data modification (insert, update, or delete). This includes changes by system stored procedures or data definition language (DDL) statements to any table, including system tables. Every extent and page allocation or deallocation. Creating or dropping a table or index. Rollback operations are also logged. 
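The logical structure described above (increasing LSNs, a backward-pointer chain per transaction, and before/after images) can be illustrated with a small Python sketch. This is a toy model, not SQL Server's record format: the names `LogRecord`, `ToyLog`, `write`, and `roll_back` are hypothetical, and the rollback here is not itself logged, although the real engine does log rollback operations.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class LogRecord:
    lsn: int                 # log sequence number, higher than the record before it
    txn_id: int              # ID of the transaction the record belongs to
    prev_lsn: Optional[int]  # backward pointer to this transaction's previous record
    key: str                 # hypothetical identifier of the modified item
    before: Optional[int]    # before image (None if the item did not exist)
    after: Optional[int]     # after image


class ToyLog:
    def __init__(self) -> None:
        self.records: List[LogRecord] = []
        self.last_lsn_of_txn: Dict[int, int] = {}

    def append(self, txn_id: int, key: str,
               before: Optional[int], after: Optional[int]) -> LogRecord:
        # New records always go to the logical end of the log.
        lsn = len(self.records) + 1
        rec = LogRecord(lsn, txn_id, self.last_lsn_of_txn.get(txn_id),
                        key, before, after)
        self.records.append(rec)
        self.last_lsn_of_txn[txn_id] = lsn
        return rec

    def roll_back(self, txn_id: int, data: Dict[str, int]) -> None:
        # Walk the transaction's chain backward, applying each before image.
        lsn = self.last_lsn_of_txn.get(txn_id)
        while lsn is not None:
            rec = self.records[lsn - 1]
            if rec.before is None:
                data.pop(rec.key, None)
            else:
                data[rec.key] = rec.before
            lsn = rec.prev_lsn


def write(log: ToyLog, data: Dict[str, int], txn_id: int, key: str, value: int) -> None:
    # Log the before and after images, then apply the change.
    log.append(txn_id, key, data.get(key), value)
    data[key] = value


data = {"a": 1}
log = ToyLog()
write(log, data, 1, "a", 10)   # transaction 1
write(log, data, 2, "b", 5)    # transaction 2, interleaved
write(log, data, 1, "a", 20)   # transaction 1 again
log.roll_back(1, data)         # undo only transaction 1
```

Because each record points back to the previous record of the same transaction, the rollback never has to scan records that belong to other transactions.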
Each transaction reserves space on the transaction log to make sure that enough log space exists to support a rollback that is caused by either an explicit rollback statement or an error. The amount of space reserved depends on the operations performed in the transaction, but generally it is equal to the amount of space used to log each operation. This reserved space is freed when the transaction is completed.

The section of the log file from the first log record that must be present for a successful database-wide rollback to the last-written log record is called the active part of the log, or the active log. This is the section of the log required to do a full recovery of the database. No part of the active log can ever be truncated. The log sequence number (LSN) of this first log record is known as the minimum recovery LSN (MinLSN).

Transaction Log Physical Architecture

The transaction log in a database maps over one or more physical files. Conceptually, the log file is a string of log records. Physically, the sequence of log records is stored efficiently in the set of physical files that implement the transaction log. There must be at least one log file for each database.

The SQL Server Database Engine divides each physical log file internally into a number of virtual log files (VLFs). Virtual log files have no fixed size, and there is no fixed number of virtual log files for a physical log file. The Database Engine chooses the size of the virtual log files dynamically while it is creating or extending log files. The Database Engine tries to maintain a small number of virtual files. The size of the virtual files after a log file has been extended is the sum of the size of the existing log and the size of the new file increment. The size or number of virtual log files cannot be configured or set by administrators.
NOTE Virtual log file (VLF) creation follows this method:
- If the next growth is less than 1/8 of the current log physical size, then create 1 VLF that covers the growth size (starting with SQL Server 2014 (12.x)).
- If the next growth is more than 1/8 of the current log size, then use the pre-2014 method:
  - If growth is less than 64MB, create 4 VLFs that cover the growth size (e.g. for 1 MB growth, create four 256KB VLFs).
  - If growth is from 64MB up to 1GB, create 8 VLFs that cover the growth size (e.g. for 512MB growth, create eight 64MB VLFs).
  - If growth is larger than 1GB, create 16 VLFs that cover the growth size (e.g. for 8 GB growth, create sixteen 512MB VLFs).

If the log files grow to a large size in many small increments, they will have many virtual log files. This can slow down database startup and also log backup and restore operations. Conversely, if the log files are set to a large size with few or just one increment, they will have few very large virtual log files. For more information on properly estimating the required size and autogrow setting of a transaction log, refer to the Recommendations section of Manage the size of the transaction log file.

We recommend that you assign log files a size value close to the final size required, using the required increments to achieve optimal VLF distribution, and also have a relatively large growth_increment value. See the tip below to determine the optimal VLF distribution for the current transaction log size. The size value, as set by the SIZE argument of ALTER DATABASE, is the initial size for the log file. The growth_increment value (also referred to as the autogrow value), as set by the FILEGROWTH argument of ALTER DATABASE, is the amount of space added to the file every time new space is required.
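The VLF creation rule in the note above can be expressed as a short Python sketch. The function names are hypothetical, and two details are assumptions because the note does not state them: a growth of exactly 1/8 of the current size is treated with the pre-2014 rule, and the initial creation of the log (current size 0) also uses the size-based rule.

```python
MB = 1024 * 1024
GB = 1024 * MB


def vlfs_for_growth(current_log_bytes: int, growth_bytes: int) -> int:
    """Number of VLFs created for one growth step, per the rule in the note."""
    # SQL Server 2014 (12.x) and later: a growth smaller than 1/8 of the
    # current log size is covered by a single VLF.
    if current_log_bytes > 0 and growth_bytes * 8 < current_log_bytes:
        return 1
    # Otherwise the pre-2014 method applies.
    if growth_bytes < 64 * MB:
        return 4
    if growth_bytes <= 1 * GB:
        return 8
    return 16


def count_vlfs_after_growth(initial_bytes: int, increment_bytes: int,
                            target_bytes: int) -> int:
    """Total VLFs after creating the log and growing it in fixed increments."""
    size = initial_bytes
    total = vlfs_for_growth(0, initial_bytes)   # initial creation (assumption)
    while size < target_bytes:
        total += vlfs_for_growth(size, increment_bytes)
        size += increment_bytes
    return total


# The three examples from the note: 1 MB, 512 MB, and 8 GB growth steps.
example_counts = [
    vlfs_for_growth(4 * MB, 1 * MB),
    vlfs_for_growth(1 * GB, 512 * MB),
    vlfs_for_growth(8 * GB, 8 * GB),
]

# Growing to 16 MB in 1 MB steps versus allocating 16 MB up front.
many_small = count_vlfs_after_growth(1 * MB, 1 * MB, 16 * MB)
one_large = count_vlfs_after_growth(16 * MB, 1 * MB, 16 * MB)
```

Comparing `many_small` with `one_large` shows why the guide recommends sizing the log close to its final size instead of relying on many small growth increments.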
For more information on FILEGROWTH and SIZE arguments of ALTER DATABASE, see ALTER DATABASE (Transact-SQL) File and Filegroup Options.

TIP To determine the optimal VLF distribution for the current transaction log size of all databases in a given instance, and the required growth increments to achieve the required size, see this script.

The transaction log is a wrap-around file. For example, consider a database with one physical log file divided into four VLFs. When the database is created, the logical log file begins at the start of the physical log file. New log records are added at the end of the logical log and expand toward the end of the physical log. Log truncation frees any virtual logs whose records all appear in front of the minimum recovery log sequence number (MinLSN). The MinLSN is the log sequence number of the oldest log record that is required for a successful database-wide rollback. The transaction log in the example database would look similar to the one in the following illustration.

When the end of the logical log reaches the end of the physical log file, the new log records wrap around to the start of the physical log file. This cycle repeats endlessly, as long as the end of the logical log never reaches the beginning of the logical log. If the old log records are truncated frequently enough to always leave sufficient room for all the new log records created through the next checkpoint, the log never fills. However, if the end of the logical log does reach the start of the logical log, one of two things occurs:

If the FILEGROWTH setting is enabled for the log and space is available on the disk, the file is extended by the amount specified in the growth_increment parameter and the new log records are added to the extension. For more information about the FILEGROWTH setting, see ALTER DATABASE File and Filegroup Options (Transact-SQL).
If the FILEGROWTH setting is not enabled, or the disk that is holding the log file has less free space than the amount specified in growth_increment, a 9002 error is generated. Refer to Troubleshoot a Full Transaction Log for more information.

If the log contains multiple physical log files, the logical log will move through all the physical log files before it wraps back to the start of the first physical log file.

IMPORTANT For more information about transaction log size management, see Manage the Size of the Transaction Log File.

Log Truncation

Log truncation is essential to keep the log from filling. Log truncation deletes inactive virtual log files from the logical transaction log of a SQL Server database, freeing space in the logical log for reuse by the physical transaction log. If a transaction log were never truncated, it would eventually fill all the disk space that is allocated to its physical log files. However, before the log can be truncated, a checkpoint operation must occur. A checkpoint writes the current in-memory modified pages (known as dirty pages) and transaction log information from memory to disk. When the checkpoint is performed, the inactive portion of the transaction log is marked as reusable. Thereafter, the inactive portion can be freed by log truncation. For more information about checkpoints, see Database Checkpoints (SQL Server).

The following illustrations show a transaction log before and after truncation. The first illustration shows a transaction log that has never been truncated. Currently, four virtual log files are in use by the logical log. The logical log starts at the front of the first virtual log file and ends at virtual log 4. The MinLSN record is in virtual log 3. Virtual log 1 and virtual log 2 contain only inactive log records. These records can be truncated. Virtual log 5 is still unused and is not part of the current logical log. The second illustration shows how the log appears after being truncated.
Virtual log 1 and virtual log 2 have been freed for reuse. The logical log now starts at the beginning of virtual log 3. Virtual log 5 is still unused, and it is not part of the current logical log. Log truncation occurs automatically after the following events, except when delayed for some reason: Under the simple recovery model, after a checkpoint. Under the full recovery model or bulk-logged recovery model, after a log backup, if a checkpoint has occurred since the previous backup. Log truncation can be delayed by a variety of factors. In the event of a long delay in log truncation, the transaction log can fill up. For information, see Factors that can delay log truncation and Troubleshoot a Full Transaction Log (SQL Server Error 9002). Write-Ahead Transaction Log This section describes the role of the write-ahead transaction log in recording data modifications to disk. SQL Server uses a write-ahead logging (WAL) algorithm, which guarantees that no data modifications are written to disk before the associated log record is written to disk. This maintains the ACID properties for a transaction. To understand how the write-ahead log works, it is important for you to know how modified data is written to disk. SQL Server maintains a buffer cache into which it reads data pages when data must be retrieved. When a page is modified in the buffer cache, it is not immediately written back to disk; instead, the page is marked as dirty. A data page can have more than one logical write made before it is physically written to disk. For each logical write, a transaction log record is inserted in the log cache that records the modification. The log records must be written to disk before the associated dirty page is removed from the buffer cache and written to disk. The checkpoint process periodically scans the buffer cache for buffers with pages from a specified database and writes all dirty pages to disk. 
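The write-ahead rule described in this section can be sketched in Python as a toy engine with a log cache, a buffer cache, and two simulated disks. All names (`ToyEngine`, `modify`, `flush_page`, and so on) are hypothetical, and the model ignores transactions, page structure, and concurrency; it only shows that a dirty page is never written before the log records that describe it.

```python
class ToyEngine:
    def __init__(self):
        self.log_cache = []      # log records not yet written to disk
        self.log_disk = []       # log records hardened on disk
        self.buffer_cache = {}   # page_id -> {"value", "dirty", "lsn"}
        self.data_disk = {}      # page_id -> value as stored on disk
        self.next_lsn = 1

    def modify(self, page_id, value):
        # A logical write: add a log record to the log cache and mark the
        # page dirty in the buffer cache. Nothing reaches disk yet.
        lsn = self.next_lsn
        self.next_lsn += 1
        self.log_cache.append((lsn, page_id, value))
        self.buffer_cache[page_id] = {"value": value, "dirty": True, "lsn": lsn}
        return lsn

    def flush_log(self, up_to_lsn):
        # Harden log records, in order, up to and including up_to_lsn.
        while self.log_cache and self.log_cache[0][0] <= up_to_lsn:
            self.log_disk.append(self.log_cache.pop(0))

    def flush_page(self, page_id):
        page = self.buffer_cache[page_id]
        if not page["dirty"]:
            return
        # Write-ahead rule: the log records for this page go to disk first.
        self.flush_log(page["lsn"])
        self.data_disk[page_id] = page["value"]
        page["dirty"] = False

    def checkpoint(self):
        # Write every dirty page in the buffer cache to disk.
        for page_id in list(self.buffer_cache):
            self.flush_page(page_id)


engine = ToyEngine()
engine.modify("p1", 1)
engine.modify("p1", 2)          # second logical write, still no physical write
engine.modify("p2", 7)

engine.flush_page("p1")
log_after_page_flush = list(engine.log_disk)
pending_after_page_flush = list(engine.log_cache)

engine.checkpoint()
```

After `flush_page("p1")`, both log records for `p1` are on the simulated log disk while the record for `p2` is still only in the log cache, which is the ordering the write-ahead rule requires.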
Checkpoints save time during a later recovery by creating a point at which all dirty pages are guaranteed to have been written to disk. Writing a modified data page from the buffer cache to disk is called flushing the page. SQL Server has logic that prevents a dirty page from being flushed before the associated log record is written. Log records are written to disk when the transactions are committed.

Transaction Log Backups

This section presents concepts about how to back up and restore (apply) transaction logs. Under the full and bulk-logged recovery models, taking routine backups of transaction logs (log backups) is necessary for recovering data. You can back up the log while any full backup is running. For more information about recovery models, see Back Up and Restore of SQL Server Databases.

Before you can create the first log backup, you must create a full backup, such as a database backup or the first in a set of file backups. Restoring a database by using only file backups can become complex. Therefore, we recommend that you start with a full database backup when you can. Thereafter, backing up the transaction log regularly is necessary. This not only minimizes work-loss exposure but also enables truncation of the transaction log. Typically, the transaction log is truncated after every conventional log backup.

IMPORTANT We recommend taking frequent enough log backups to support your business requirements, specifically your tolerance for work loss such as might be caused by damaged log storage. The appropriate frequency for taking log backups depends on your tolerance for work-loss exposure balanced by how many log backups you can store, manage, and, potentially, restore. Think about the required RTO and RPO when implementing your recovery strategy, and specifically the log backup cadence. Taking a log backup every 15 to 30 minutes might be enough. If your business requires that you minimize work-loss exposure, consider taking log backups more frequently.
More frequent log backups have the added advantage of increasing the frequency of log truncation, resulting in smaller log files. IMPORTANT To limit the number of log backups that you need to restore, it is essential to routinely back up your data. For example, you might schedule a weekly full database backup and daily differential database backups. Again, think about the required RTO and RPO when implementing your recovery strategy, and specifically the full and differential database backup cadence. For more information about transaction log backups, see Transaction Log Backups (SQL Server). The Log Chain A continuous sequence of log backups is called a log chain. A log chain starts with a full backup of the database. Usually, a new log chain is only started when the database is backed up for the first time or after the recovery model is switched from simple recovery to full or bulk-logged recovery. Unless you choose to overwrite existing backup sets when creating a full database backup, the existing log chain remains intact. With the log chain intact, you can restore your database from any full database backup in the media set, followed by all subsequent log backups up through your recovery point. The recovery point could be the end of the last log backup or a specific recovery point in any of the log backups. For more information, see Transaction Log Backups (SQL Server). To restore a database up to the point of failure, the log chain must be intact. That is, an unbroken sequence of transaction log backups must extend up to the point of failure. Where this sequence of log must start depends on the type of data backups you are restoring: database, partial, or file. For a database or partial backup, the sequence of log backups must extend from the end of a database or partial backup. For a set of file backups, the sequence of log backups must extend from the start of a full set of file backups. For more information, see Apply Transaction Log Backups (SQL Server). 
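The requirement that an unbroken sequence of log backups extends from the data backup to the recovery point can be sketched as a simple check in Python. This is a simplified model under stated assumptions: each log backup is described only by a hypothetical `first_lsn` and `last_lsn`, the first log backup must cover the end of the data backup, and each later log backup must start exactly where the previous one ended. It does not cover file backups or the rules of an actual restore.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LogBackup:
    first_lsn: int   # hypothetical: LSN at which this log backup starts
    last_lsn: int    # hypothetical: LSN at which this log backup ends


def chain_is_intact(data_backup_end_lsn: int,
                    log_backups: List[LogBackup],
                    recovery_lsn: int) -> bool:
    """Return True if the log backups reach recovery_lsn without a gap."""
    covered_to = data_backup_end_lsn
    for i, backup in enumerate(log_backups):
        if i == 0:
            # The first log backup must cover the end of the data backup.
            if not (backup.first_lsn <= covered_to <= backup.last_lsn):
                return False
        elif backup.first_lsn != covered_to:
            # A gap: some log backup is missing from the sequence.
            return False
        covered_to = backup.last_lsn
        if covered_to >= recovery_lsn:
            return True
    return covered_to >= recovery_lsn


full_chain = [LogBackup(90, 120), LogBackup(120, 150), LogBackup(150, 200)]
broken_chain = [LogBackup(90, 120), LogBackup(150, 200)]   # middle backup lost

intact = chain_is_intact(100, full_chain, recovery_lsn=180)
broken = chain_is_intact(100, broken_chain, recovery_lsn=180)
too_short = chain_is_intact(100, full_chain, recovery_lsn=250)
```

In this model, losing a single log backup from the middle of the sequence makes every later recovery point unreachable from that data backup.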
Restore Log Backups

Restoring a log backup rolls forward the changes that were recorded in the transaction log to re-create the exact state of the database at the time the log backup operation started. When you restore a database, you will have to restore the log backups that were created after the full database backup that you restore, or from the start of the first file backup that you restore. Typically, after you restore the most recent data or differential backup, you must restore a series of log backups until you reach your recovery point. Then, you recover the database. This rolls back all transactions that were incomplete when the recovery started and brings the database online. After the database has been recovered, you cannot restore any more backups. For more information, see Apply Transaction Log Backups (SQL Server).

Checkpoints and the Active Portion of the Log

Checkpoints flush dirty data pages from the buffer cache of the current database to disk. This minimizes the active portion of the log that must be processed during a full recovery of a database. During a full recovery, the following types of actions are performed:
- The log records of modifications not flushed to disk before the system stopped are rolled forward.
- All modifications associated with incomplete transactions, such as transactions for which there is no COMMIT or ROLLBACK log record, are rolled back.

Checkpoint Operation

A checkpoint performs the following processes in the database:
- Writes a record to the log file, marking the start of the checkpoint.
- Stores information recorded for the checkpoint in a chain of checkpoint log records. One piece of information recorded in the checkpoint is the log sequence number (LSN) of the first log record that must be present for a successful database-wide rollback. This LSN is called the Minimum Recovery LSN (MinLSN). The MinLSN is the minimum of the following:
  - LSN of the start of the checkpoint.
  - LSN of the start of the oldest active transaction.
  - LSN of the start of the oldest replication transaction that has not yet been delivered to the distribution database.
  The checkpoint records also contain a list of all the active transactions that have modified the database.
- If the database uses the simple recovery model, marks for reuse the space that precedes the MinLSN.
- Writes all dirty log and data pages to disk.
- Writes a record marking the end of the checkpoint to the log file.
- Writes the LSN of the start of this chain to the database boot page.

Activities that cause a Checkpoint

Checkpoints occur in the following situations:
- A CHECKPOINT statement is explicitly executed. A checkpoint occurs in the current database for the connection.
- A minimally logged operation is performed in the database; for example, a bulk-copy operation is performed on a database that is using the Bulk-Logged recovery model.
- Database files have been added or removed by using ALTER DATABASE.
- An instance of SQL Server is stopped by a SHUTDOWN statement or by stopping the SQL Server (MSSQLSERVER) service. Either action causes a checkpoint in each database in the instance of SQL Server.
- An instance of SQL Server periodically generates automatic checkpoints in each database to reduce the time that the instance would take to recover the database.
- A database backup is taken.
- An activity requiring a database shutdown is performed. For example, AUTO_CLOSE is ON and the last user connection to the database is closed, or a database option change is made that requires a restart of the database.

Automatic Checkpoints

The SQL Server Database Engine generates automatic checkpoints. The interval between automatic checkpoints is based on the amount of log space used and the time elapsed since the last checkpoint. The time interval between automatic checkpoints can be highly variable and long, if few modifications are made in the database. Automatic checkpoints can also occur frequently if lots of data is modified.
Use the recovery interval server configuration option to calculate the interval between automatic checkpoints for all the databases on a server instance. This option specifies the maximum time the Database Engine should use to recover a database during a system restart. The Database Engine estimates how many log records it can process in the recovery interval during a recovery operation. The interval between automatic checkpoints also depends on the recovery model: If the database is using either the full or bulk-logged recovery model, an automatic checkpoint is generated whenever the number of log records reaches the number the Database Engine estimates it can process during the time specified in the recovery interval option. If the database is using the simple recovery model, an automatic checkpoint is generated whenever the number of log records reaches the lesser of these two values: The log becomes 70 percent full. The number of log records reaches the number the Database Engine estimates it can process during the time specified in the recovery interval option. For information about setting the recovery interval, see Configure the recovery interval Server Configuration Option. TIP The -k SQL Server advanced setup option enables a database administrator to throttle checkpoint I/O behavior based on the throughput of the I/O subsystem for some types of checkpoints. The -k setup option applies to automatic checkpoints and any otherwise unthrottled checkpoints. Automatic checkpoints truncate the unused section of the transaction log if the database is using the simple recovery model. However, if the database is using the full or bulk-logged recovery models, the log is not truncated by automatic checkpoints. For more information, see The Transaction Log. The CHECKPOINT statement now provides an optional checkpoint_duration argument that specifies the requested period of time, in seconds, for checkpoints to finish. For more information, see CHECKPOINT. 
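The conditions that trigger an automatic checkpoint under each recovery model, and whether that checkpoint truncates the log, can be summarized in a short Python sketch. The function and parameter names are hypothetical, and the engine's estimate of how many log records it can process in the recovery interval is simply passed in as a number.

```python
def automatic_checkpoint_due(recovery_model: str,
                             log_records_since_checkpoint: int,
                             estimated_records_in_recovery_interval: int,
                             log_used_fraction: float) -> bool:
    """Decide whether an automatic checkpoint is due, per the rules above."""
    records_due = (log_records_since_checkpoint
                   >= estimated_records_in_recovery_interval)
    if recovery_model in ("full", "bulk_logged"):
        return records_due
    if recovery_model == "simple":
        # Whichever condition is reached first triggers the checkpoint:
        # the log becoming 70 percent full, or the record-count estimate.
        return records_due or log_used_fraction >= 0.70
    raise ValueError(f"unknown recovery model: {recovery_model}")


def automatic_checkpoint_truncates_log(recovery_model: str) -> bool:
    # Only under the simple recovery model does an automatic checkpoint
    # truncate the unused section of the transaction log.
    return recovery_model == "simple"


simple_by_fill = automatic_checkpoint_due("simple", 1_000, 50_000, 0.72)
full_by_fill = automatic_checkpoint_due("full", 1_000, 50_000, 0.72)
full_by_records = automatic_checkpoint_due("full", 50_000, 50_000, 0.10)
```

The same log state (few records, log 72 percent full) triggers a checkpoint under the simple recovery model but not under the full recovery model, where only the record-count estimate matters.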
Active Log

The section of the log file from the MinLSN to the last-written log record is called the active portion of the log, or the active log. This is the section of the log required to do a full recovery of the database. No part of the active log can ever be truncated. Truncation can remove log records only from the parts of the log that precede the MinLSN.

The following illustration shows a simplified version of the end of a transaction log with two active transactions. Checkpoint records have been compacted to a single record. LSN 148 is the last record in the transaction log. At the time that the recorded checkpoint at LSN 147 was processed, Tran 1 had been committed and Tran 2 was the only active transaction. That makes the first log record for Tran 2 the oldest log record for a transaction active at the time of the last checkpoint. This makes LSN 142, the Begin transaction record for Tran 2, the MinLSN.

Long-Running Transactions

The active log must include every part of all uncommitted transactions. An application that starts a transaction and does not commit it or roll it back prevents the Database Engine from advancing the MinLSN. This can cause two types of problems:
- If the system is shut down after the transaction has performed many uncommitted modifications, the recovery phase of the subsequent restart can take much longer than the time specified in the recovery interval option.
- The log might grow very large, because the log cannot be truncated past the MinLSN. This occurs even if the database is using the simple recovery model, in which the transaction log is generally truncated on each automatic checkpoint.

Replication Transactions

The Log Reader Agent monitors the transaction log of each database configured for transactional replication, and it copies the transactions marked for replication from the transaction log into the distribution database.
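The MinLSN rule, and its effect on which virtual log files can be truncated, can be sketched in Python. The function names and the VLF LSN ranges are hypothetical; only the checkpoint at LSN 147 and the Tran 2 start at LSN 142 come from the example in the text.

```python
from typing import Iterable, List, Tuple


def min_recovery_lsn(checkpoint_start_lsn: int,
                     active_txn_start_lsns: Iterable[int] = (),
                     undelivered_replication_start_lsns: Iterable[int] = ()) -> int:
    """MinLSN is the minimum of the checkpoint start, the start of the oldest
    active transaction, and the start of the oldest replication transaction
    not yet delivered to the distribution database."""
    candidates = [checkpoint_start_lsn,
                  *active_txn_start_lsns,
                  *undelivered_replication_start_lsns]
    return min(candidates)


def truncatable_vlfs(vlfs: List[Tuple[str, int, int]], min_lsn: int) -> List[str]:
    """A VLF can be freed only if all of its records precede the MinLSN.
    Each VLF is (name, first_lsn, last_lsn); the ranges are illustrative."""
    return [name for name, _first, last in vlfs if last < min_lsn]


# Example from the text: checkpoint at LSN 147, Tran 2 began at LSN 142.
min_lsn = min_recovery_lsn(147, active_txn_start_lsns=[142])

# Hypothetical VLF layout in which the MinLSN falls inside the third VLF.
vlfs = [("VLF1", 100, 119), ("VLF2", 120, 139),
        ("VLF3", 140, 159), ("VLF4", 160, 179)]
freed = truncatable_vlfs(vlfs, min_lsn)

# An old open transaction or undelivered replication work holds MinLSN back.
held_back = min_recovery_lsn(147, active_txn_start_lsns=[142],
                             undelivered_replication_start_lsns=[105])
```

With `held_back` at 105, no VLF in the hypothetical layout could be freed, which is how a long-running transaction or delayed replication causes the log to grow.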
The active log must contain all transactions that are marked for replication, but that have not yet been delivered to the distribution database. If these transactions are not replicated in a timely manner, they can prevent the truncation of the log. For more information, see Transactional Replication.

See also

We recommend the following articles and books for additional information about the transaction log and log management best practices.
- The Transaction Log (SQL Server)
- Manage the size of the transaction log file
- Transaction Log Backups (SQL Server)
- Database Checkpoints (SQL Server)
- Configure the recovery interval Server Configuration Option
- sys.dm_db_log_info (Transact-SQL)
- sys.dm_db_log_space_usage (Transact-SQL)
- Understanding Logging and Recovery in SQL Server by Paul Randal
- SQL Server Transaction Log Management by Tony Davis and Gail Shaw

Transaction Locking and Row Versioning Guide

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

In any database, mismanagement of transactions often leads to contention and performance problems in systems that have many users. As the number of users that access the data increases, it becomes important to have applications that use transactions efficiently. This guide describes the locking and row versioning mechanisms the SQL Server Database Engine uses to ensure the physical integrity of each transaction and provides information on how applications can control transactions efficiently.

Applies to: SQL Server 2005 through SQL Server 2017, unless noted otherwise.

Transaction Basics

A transaction is a sequence of operations performed as a single logical unit of work. A logical unit of work must exhibit four properties, called the atomicity, consistency, isolation, and durability (ACID) properties, to qualify as a transaction.
Atomicity
A transaction must be an atomic unit of work; either all of its data modifications are performed, or none of them are performed.

Consistency
When completed, a transaction must leave all data in a consistent state. In a relational database, all rules must be applied to the transaction's modifications to maintain all data integrity. All internal data structures, such as B-tree indexes or doubly-linked lists, must be correct at the end of the transaction.

Isolation
Modifications made by concurrent transactions must be isolated from the modifications made by any other concurrent transactions. A transaction either recognizes data in the state it was in before another concurrent transaction modified it, or it recognizes the data after the second transaction has completed, but it does not recognize an intermediate state. This is referred to as serializability because it results in the ability to reload the starting data and replay a series of transactions to end up with the data in the same state it was in after the original transactions were performed.

Durability
After a fully durable transaction has completed, its effects are permanently in place in the system. The modifications persist even in the event of a system failure. SQL Server 2014 (12.x) and later enable delayed durable transactions. Delayed durable transactions commit before the transaction log record is persisted to disk. For more information on delayed transaction durability see the topic Transaction Durability.

SQL programmers are responsible for starting and ending transactions at points that enforce the logical consistency of the data. The programmer must define the sequence of data modifications that leave the data in a consistent state relative to the organization's business rules. The programmer includes these modification statements in a single transaction so that the SQL Server Database Engine can enforce the physical integrity of the transaction.
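Atomicity, and the programmer's role in choosing transaction boundaries, can be demonstrated with a small runnable Python sketch. It uses the standard-library `sqlite3` module as a stand-in because it runs without a server; it is not SQL Server, and the `accounts` table, the `transfer` function, and the business rule (no negative balance) are hypothetical. The connection is opened with `isolation_level=None` so that the sketch issues BEGIN, COMMIT, and ROLLBACK explicitly.

```python
import sqlite3


def transfer(conn: sqlite3.Connection, src: str, dst: str, amount: int) -> bool:
    """Move amount from src to dst as one unit of work; return True on commit."""
    conn.execute("BEGIN")
    try:
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE name = ?",
                     (amount, src))
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE name = ?",
                     (amount, dst))
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)).fetchone()
        if balance < 0:
            # Hypothetical business rule: balances may not go negative.
            raise ValueError("insufficient funds")
        conn.execute("COMMIT")
        return True
    except ValueError:
        # Undo both updates; the data returns to its state before BEGIN.
        conn.execute("ROLLBACK")
        return False


def balances(conn: sqlite3.Connection) -> dict:
    return dict(conn.execute("SELECT name, balance FROM accounts"))


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])

ok = transfer(conn, "alice", "bob", 30)
after_commit = balances(conn)

rejected = transfer(conn, "alice", "bob", 500)
after_rollback = balances(conn)
```

The rejected transfer performs both updates before the rule check fails, yet `after_rollback` equals `after_commit`, because the rollback discards every modification made since the transaction started.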
It is the responsibility of an enterprise database system, such as an instance of the SQL Server Database Engine, to provide mechanisms ensuring the physical integrity of each transaction. The SQL Server Database Engine provides:
- Locking facilities that preserve transaction isolation.
- Logging facilities that ensure transaction durability. For fully durable transactions the log record is hardened to disk before the transaction commits. Thus, even if the server hardware, operating system, or the instance of the SQL Server Database Engine itself fails, the instance uses the transaction logs upon restart to automatically roll back any uncompleted transactions to the point of the system failure. Delayed durable transactions commit before the transaction log record is hardened to disk. Such transactions may be lost if there is a system failure before the log record is hardened to disk. For more information on delayed transaction durability see the topic Transaction Durability.
- Transaction management features that enforce transaction atomicity and consistency. After a transaction has started, it must be successfully completed (committed), or the SQL Server Database Engine undoes all of the data modifications made since the transaction started. This operation is referred to as rolling back a transaction because it returns the data to the state it was in prior to those changes.

Controlling Transactions

Applications control transactions mainly by specifying when a transaction starts and ends. This can be specified by using either Transact-SQL statements or database application programming interface (API) functions. The system must also be able to correctly handle errors that terminate a transaction before it completes. For more information, see Transactions, Transactions in ODBC and Transactions in SQL Server Native Client (OLEDB).

By default, transactions are managed at the connection level.
When a transaction is started on a connection, all Transact-SQL statements executed on that connection are part of the transaction until the transaction ends. However, under a multiple active result set (MARS) session, a Transact-SQL explicit or implicit transaction becomes a batch-scoped transaction that is managed at the batch level. When the batch completes, if the batch-scoped transaction is not committed or rolled back, it is automatically rolled back by SQL Server. For more information, see Using Multiple Active Result Sets (MARS).

Starting Transactions
Using API functions and Transact-SQL statements, you can start transactions in an instance of the SQL Server Database Engine as explicit, autocommit, or implicit transactions.

Explicit Transactions
An explicit transaction is one in which you explicitly define both the start and end of the transaction through an API function or by issuing the Transact-SQL BEGIN TRANSACTION, COMMIT TRANSACTION, COMMIT WORK, ROLLBACK TRANSACTION, or ROLLBACK WORK statements. When the transaction ends, the connection returns to the transaction mode it was in before the explicit transaction was started, either implicit or autocommit mode. You can use all Transact-SQL statements in an explicit transaction, except for the following statements:
- ALTER DATABASE
- CREATE DATABASE
- DROP FULLTEXT INDEX
- ALTER FULLTEXT CATALOG
- CREATE FULLTEXT CATALOG
- RECONFIGURE
- ALTER FULLTEXT INDEX
- CREATE FULLTEXT INDEX
- RESTORE
- BACKUP
- DROP DATABASE
- Full-text system stored procedures
- DROP FULLTEXT CATALOG
- sp_dboption to set database options or any system procedure that modifies the master database inside explicit or implicit transactions

NOTE
UPDATE STATISTICS can be used inside an explicit transaction. However, UPDATE STATISTICS commits independently of the enclosing transaction and cannot be rolled back.

Autocommit Transactions
Autocommit mode is the default transaction management mode of the SQL Server Database Engine.
Every Transact-SQL statement is committed or rolled back when it completes. If a statement completes successfully, it is committed; if it encounters any error, it is rolled back. A connection to an instance of the SQL Server Database Engine operates in autocommit mode whenever this default mode has not been overridden by either explicit or implicit transactions. Autocommit mode is also the default mode for ADO, OLE DB, ODBC, and DB-Library.

Implicit Transactions
When a connection is operating in implicit transaction mode, the instance of the SQL Server Database Engine automatically starts a new transaction after the current transaction is committed or rolled back. You do nothing to delineate the start of a transaction; you only commit or roll back each transaction. Implicit transaction mode generates a continuous chain of transactions. Set implicit transaction mode on through either an API function or the Transact-SQL SET IMPLICIT_TRANSACTIONS ON statement. After implicit transaction mode has been set on for a connection, the instance of the SQL Server Database Engine automatically starts a transaction when it first executes any of these statements:
- ALTER TABLE
- FETCH
- REVOKE
- CREATE
- GRANT
- SELECT
- DELETE
- INSERT
- TRUNCATE TABLE
- DROP
- OPEN
- UPDATE

Batch-scoped Transactions
Applicable only to multiple active result sets (MARS), a Transact-SQL explicit or implicit transaction that starts under a MARS session becomes a batch-scoped transaction. A batch-scoped transaction that is not committed or rolled back when a batch completes is automatically rolled back by SQL Server.

Distributed Transactions
Distributed transactions span two or more servers known as resource managers. The management of the transaction must be coordinated between the resource managers by a server component called a transaction manager.
Each instance of the SQL Server Database Engine can operate as a resource manager in distributed transactions coordinated by transaction managers, such as Microsoft Distributed Transaction Coordinator (MS DTC), or other transaction managers that support the Open Group XA specification for distributed transaction processing. For more information, see the MS DTC documentation.

A transaction within a single instance of the SQL Server Database Engine that spans two or more databases is actually a distributed transaction. The instance manages the distributed transaction internally; to the user, it operates as a local transaction.

At the application, a distributed transaction is managed much the same as a local transaction. At the end of the transaction, the application requests the transaction to be either committed or rolled back. A distributed commit must be managed differently by the transaction manager to minimize the risk that a network failure may result in some resource managers successfully committing while others roll back the transaction. This is achieved by managing the commit process in two phases (the prepare phase and the commit phase), which is known as a two-phase commit (2PC).

Prepare phase
When the transaction manager receives a commit request, it sends a prepare command to all of the resource managers involved in the transaction. Each resource manager then does everything required to make the transaction durable, and all buffers holding log images for the transaction are flushed to disk. As each resource manager completes the prepare phase, it returns success or failure of the prepare to the transaction manager. SQL Server 2014 (12.x) introduced delayed transaction durability. Delayed durable transactions commit before log images for the transaction are flushed to disk. For more information on delayed transaction durability see the topic Transaction Durability.

Commit phase
If the transaction manager receives successful prepares from all of the resource managers, it sends commit commands to each resource manager. The resource managers can then complete the commit. If all of the resource managers report a successful commit, the transaction manager then sends a success notification to the application. If any resource manager reported a failure to prepare, the transaction manager sends a rollback command to each resource manager and indicates the failure of the commit to the application.

SQL Server Database Engine applications can manage distributed transactions either through Transact-SQL or the database API. For more information, see BEGIN DISTRIBUTED TRANSACTION (Transact-SQL).

Ending Transactions
You can end transactions with either a COMMIT or ROLLBACK statement, or through a corresponding API function.

COMMIT
If a transaction is successful, commit it. A COMMIT statement guarantees all of the transaction's modifications are made a permanent part of the database. A COMMIT also frees resources, such as locks, used by the transaction.

ROLLBACK
If an error occurs in a transaction, or if the user decides to cancel the transaction, then roll the transaction back. A ROLLBACK statement backs out all modifications made in the transaction by returning the data to the state it was in at the start of the transaction. A ROLLBACK also frees resources held by the transaction.

NOTE
Under connections enabled to support multiple active result sets (MARS), an explicit transaction started through an API function cannot be committed while there are pending requests for execution. Any attempt to commit this type of transaction while there are outstanding operations running will result in an error.

Errors During Transaction Processing
If an error prevents the successful completion of a transaction, SQL Server automatically rolls back the transaction and frees all resources held by the transaction.
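The COMMIT and ROLLBACK behavior described above can be sketched as follows. This is an illustrative pattern rather than a prescribed one: the table dbo.OrderLine and its CHECK constraint are hypothetical, and the Transact-SQL TRY…CATCH construct is one of several ways to decide between committing and rolling back.

```sql
-- Hypothetical table (an assumption for illustration):
--   dbo.OrderLine (OrderID INT, Qty INT CHECK (Qty > 0))
BEGIN TRY
    BEGIN TRANSACTION;
        INSERT INTO dbo.OrderLine (OrderID, Qty) VALUES (1, 5);
        INSERT INTO dbo.OrderLine (OrderID, Qty) VALUES (1, 0);  -- violates the CHECK constraint
    COMMIT TRANSACTION;  -- not reached when a statement above raises an error
END TRY
BEGIN CATCH
    -- Roll back only if a transaction is still open on this connection.
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION;
    THROW;  -- re-raise the original error to the caller (SQL Server 2012 and later)
END CATCH;
```

After the CATCH block runs, neither row remains in the table, and the locks held by the transaction have been released.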
If the client's network connection to an instance of the SQL Server Database Engine is broken, any outstanding transactions for the connection are rolled back when the network notifies the instance of the break. If the client application fails or if the client computer goes down or is restarted, this also breaks the connection, and the instance of the SQL Server Database Engine rolls back any outstanding transactions when the network notifies it of the break. If the client logs off the application, any outstanding transactions are rolled back.

If a run-time statement error (such as a constraint violation) occurs in a batch, the default behavior in the SQL Server Database Engine is to roll back only the statement that generated the error. You can change this behavior using the SET XACT_ABORT statement. After SET XACT_ABORT ON is executed, any run-time statement error causes an automatic rollback of the current transaction. Compile errors, such as syntax errors, are not affected by SET XACT_ABORT. For more information, see SET XACT_ABORT (Transact-SQL).

When errors occur, corrective action (COMMIT or ROLLBACK) should be included in application code. One effective tool for handling errors, including those in transactions, is the Transact-SQL TRY…CATCH construct. For more information with examples that include transactions, see TRY...CATCH (Transact-SQL). Beginning with SQL Server 2012 (11.x), you can use the THROW statement to raise an exception and transfer execution to a CATCH block of a TRY…CATCH construct. For more information, see THROW (Transact-SQL).

Compile and Run-time Errors in Autocommit Mode
In autocommit mode, it sometimes appears as if an instance of the SQL Server Database Engine has rolled back an entire batch instead of just one SQL statement. This happens if the error encountered is a compile error, not a run-time error.
A compile error prevents the SQL Server Database Engine from building an execution plan, so nothing in the batch is executed. Although it appears that all of the statements before the one generating the error were rolled back, the error prevented anything in the batch from being executed. In the following example, none of the INSERT statements in the third batch are executed because of a compile error. It appears that the first two INSERT statements are rolled back when they are never executed.

    CREATE TABLE TestBatch (Cola INT PRIMARY KEY, Colb CHAR(3));
    GO
    INSERT INTO TestBatch VALUES (1, 'aaa');
    INSERT INTO TestBatch VALUES (2, 'bbb');
    INSERT INTO TestBatch VALUSE (3, 'ccc');  -- Syntax error.
    GO
    SELECT * FROM TestBatch;  -- Returns no rows.
    GO

In the following example, the third INSERT statement generates a run-time duplicate primary key error. The first two INSERT statements are successful and committed, so they remain after the run-time error.

    CREATE TABLE TestBatch (Cola INT PRIMARY KEY, Colb CHAR(3));
    GO
    INSERT INTO TestBatch VALUES (1, 'aaa');
    INSERT INTO TestBatch VALUES (2, 'bbb');
    INSERT INTO TestBatch VALUES (1, 'ccc');  -- Duplicate key error.
    GO
    SELECT * FROM TestBatch;  -- Returns rows 1 and 2.
    GO

The SQL Server Database Engine uses deferred name resolution, in which object names are not resolved until execution time. In the following example, the first two INSERT statements are executed and committed, and those two rows remain in the TestBatch table after the third INSERT statement generates a run-time error by referring to a table that does not exist.

    CREATE TABLE TestBatch (Cola INT PRIMARY KEY, Colb CHAR(3));
    GO
    INSERT INTO TestBatch VALUES (1, 'aaa');
    INSERT INTO TestBatch VALUES (2, 'bbb');
    INSERT INTO TestBch VALUES (3, 'ccc');  -- Table name error.
    GO
    SELECT * FROM TestBatch;  -- Returns rows 1 and 2.
    GO

Locking and Row Versioning Basics
The SQL Server Database Engine uses the following mechanisms to ensure the integrity of transactions and maintain the consistency of databases when multiple users are accessing data at the same time:

Locking
Each transaction requests locks of different types on the resources, such as rows, pages, or tables, on which the transaction is dependent. The locks block other transactions from modifying the resources in a way that would cause problems for the transaction requesting the lock. Each transaction frees its locks when it no longer has a dependency on the locked resources.

Row versioning
When a row versioning-based isolation level is enabled, the SQL Server Database Engine maintains versions of each row that is modified. Applications can specify that a transaction use the row versions to view data as it existed at the start of the transaction or query instead of protecting all reads with locks. By using row versioning, the chance that a read operation will block other transactions is greatly reduced.

Locking and row versioning prevent users from reading uncommitted data and prevent multiple users from attempting to change the same data at the same time. Without locking or row versioning, queries executed against that data could produce unexpected results by returning data that has not yet been committed in the database.

Applications can choose transaction isolation levels, which define the level of protection for the transaction from modifications made by other transactions. Table-level hints can be specified for individual Transact-SQL statements to further tailor behavior to fit the requirements of the application.

Managing Concurrent Data Access
Users who access a resource at the same time are said to be accessing the resource concurrently. Concurrent data access requires mechanisms to prevent adverse effects when multiple users try to modify resources that other users are actively using.
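To make the idea of a table-level hint concrete, the sketch below holds the shared locks taken by one SELECT until the transaction ends and then lists the locks the session currently holds. The table dbo.Product and its columns are hypothetical; sys.dm_tran_locks is a dynamic management view, and querying it typically requires the VIEW SERVER STATE permission.

```sql
-- Hypothetical table (an assumption for illustration):
--   dbo.Product (ProductID INT PRIMARY KEY, ListPrice MONEY)
BEGIN TRANSACTION;
    -- HOLDLOCK keeps the shared locks taken by this statement until the
    -- transaction ends instead of releasing them when the read completes.
    SELECT ProductID, ListPrice
    FROM dbo.Product WITH (HOLDLOCK)
    WHERE ProductID = 42;

    -- Inspect the locks held or requested by the current session.
    SELECT resource_type, request_mode, request_status
    FROM sys.dm_tran_locks
    WHERE request_session_id = @@SPID;
COMMIT TRANSACTION;  -- releases the locks
```

The hint affects only the statement that carries it; other statements in the same transaction keep using the session's isolation level.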

Concurrency Effects
Users modifying data can affect other users who are reading or modifying the same data at the same time. These users are said to be accessing the data concurrently. If a data storage system has no concurrency control, users could see the following side effects:

Lost updates
Lost updates occur when two or more transactions select the same row and then update the row based on the value originally selected. Each transaction is unaware of the other transactions. The last update overwrites updates made by the other transactions, which results in lost data.
For example, two editors make an electronic copy of the same document. Each editor changes the copy independently and then saves the changed copy, thereby overwriting the original document. The editor who saves the changed copy last overwrites the changes made by the other editor. This problem could be avoided if one editor could not access the file until the other editor had finished and committed the transaction.

Uncommitted dependency (dirty read)
Uncommitted dependency occurs when a second transaction selects a row that is being updated by another transaction. The second transaction is reading data that has not been committed yet and may be changed by the transaction updating the row.
For example, an editor is making changes to an electronic document. During the changes, a second editor takes a copy of the document that includes all the changes made so far, and distributes the document to the intended audience. The first editor then decides the changes made so far are wrong and removes the edits and saves the document. The distributed document contains edits that no longer exist and should be treated as if they never existed. This problem could be avoided if no one could read the changed document until the first editor does the final save of modifications and commits the transaction.

Inconsistent analysis (nonrepeatable read)
Inconsistent analysis occurs when a second transaction accesses the same row several times and reads different data each time. Inconsistent analysis is similar to uncommitted dependency in that another transaction is changing the data that a second transaction is reading. However, in inconsistent analysis, the data read by the second transaction was committed by the transaction that made the change. Also, inconsistent analysis involves multiple reads (two or more) of the same row, and each time the information is changed by another transaction; thus, the term nonrepeatable read.
For example, an editor reads the same document twice, but between each reading the writer rewrites the document. When the editor reads the document for the second time, it has changed. The original read was not repeatable. This problem could be avoided if the writer could not change the document until the editor has finished reading it for the last time.

Phantom reads
A phantom read is a situation that occurs when two identical queries are executed and the collection of rows returned by the second query is different. The example below shows how this may occur. Assume the two transactions below are executing at the same time. The two SELECT statements in the first transaction may return different results because the INSERT statement in the second transaction changes the data used by both.

    --Transaction 1
    BEGIN TRAN;
    SELECT ID FROM dbo.employee WHERE ID > 5 and ID < 10;
    --The INSERT statement from the second transaction occurs here.
    SELECT ID FROM dbo.employee WHERE ID > 5 and ID < 10;
    COMMIT;

    --Transaction 2
    BEGIN TRAN;
    --The new row's ID (6) falls inside the range read by Transaction 1.
    INSERT INTO dbo.employee (ID, name) VALUES (6, 'New');
    COMMIT;

Missing and double reads caused by row updates
- Missing an updated row or seeing an updated row multiple times
  Transactions that are running at the READ UNCOMMITTED level do not issue shared locks to prevent other transactions from modifying data read by the current transaction. Transactions that are running at the READ COMMITTED level do issue shared locks, but the row or page locks are released after the row is read. In either case, when you are scanning an index, if another user changes the index key column of the row during your read, the row might appear again if the key change moved the row to a position ahead of your scan. Similarly, the row might not appear if the key change moved the row to a position in the index that you had already read. To avoid this, use the SERIALIZABLE or HOLDLOCK hint, or row versioning. For more information, see Table Hints (Transact-SQL).
- Missing one or more rows that were not the target of update
  When you are using READ UNCOMMITTED, if your query reads rows using an allocation order scan (using IAM pages), you might miss rows if another transaction is causing a page split. This cannot occur when you are using read committed, because a table lock is held during a page split. It also does not happen if the table does not have a clustered index, because updates do not cause page splits.

Types of Concurrency
When many people attempt to modify data in a database at the same time, a system of controls must be implemented so that modifications made by one person do not adversely affect those of another person. This is called concurrency control.

Concurrency control theory has two classifications for the methods of instituting concurrency control:

Pessimistic concurrency control
A system of locks prevents users from modifying data in a way that affects other users.
After a user performs an action that causes a lock to be applied, other users cannot perform actions that would conflict with the lock until the owner releases it. This is called pessimistic control because it is mainly used in environments where there is high contention for data, where the cost of protecting data with locks is less than the cost of rolling back transactions if concurrency conflicts occur.

Optimistic concurrency control
In optimistic concurrency control, users do not lock data when they read it. When a user updates data, the system checks to see if another user changed the data after it was read. If another user updated the data, an error is raised. Typically, the user receiving the error rolls back the transaction and starts over. This is called optimistic because it is mainly used in environments where there is low contention for data, and where the cost of occasionally rolling back a transaction is lower than the cost of locking data when read.

SQL Server supports a range of concurrency control. Users specify the type of concurrency control by selecting transaction isolation levels for connections or concurrency options on cursors. These attributes can be defined using Transact-SQL statements, or through the properties and attributes of database application programming interfaces (APIs) such as ADO, ADO.NET, OLE DB, and ODBC.

Isolation Levels in the SQL Server Database Engine
Transactions specify an isolation level that defines the degree to which one transaction must be isolated from resource or data modifications made by other transactions. Isolation levels are described in terms of which concurrency side-effects, such as dirty reads or phantom reads, are allowed.

Transaction isolation levels control:
- Whether locks are taken when data is read, and what type of locks are requested.
- How long the read locks are held.
- Whether a read operation referencing rows modified by another transaction:
  - Blocks until the exclusive lock on the row is freed.
  - Retrieves the committed version of the row that existed at the time the statement or transaction started.
  - Reads the uncommitted data modification.

IMPORTANT
Choosing a transaction isolation level does not affect the locks acquired to protect data modifications. A transaction always gets an exclusive lock on any data it modifies, and holds that lock until the transaction completes, regardless of the isolation level set for that transaction. For read operations, transaction isolation levels primarily define the level of protection from the effects of modifications made by other transactions.

A lower isolation level increases the ability of many users to access data at the same time, but increases the number of concurrency effects (such as dirty reads or lost updates) users might encounter. Conversely, a higher isolation level reduces the types of concurrency effects that users may encounter, but requires more system resources and increases the chances that one transaction will block another. Choosing the appropriate isolation level depends on balancing the data integrity requirements of the application against the overhead of each isolation level. The highest isolation level, serializable, guarantees that a transaction will retrieve exactly the same data every time it repeats a read operation, but it does this by performing a level of locking that is likely to impact other users in multi-user systems. The lowest isolation level, read uncommitted, may retrieve data that has been modified but not committed by other transactions. All of the concurrency side effects can happen in read uncommitted, but there is no read locking or versioning, so overhead is minimized.
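The trade-off described above can be tried out with the SET TRANSACTION ISOLATION LEVEL statement. The following is a minimal sketch, assuming a hypothetical table dbo.Product; it raises the session to serializable for one transaction and then returns to the default level.

```sql
-- Hypothetical table (an assumption for illustration):
--   dbo.Product (ProductID INT PRIMARY KEY, ListPrice MONEY)

-- The setting applies to the current session until it is changed again.
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;

-- Confirm the session's level: 1 = read uncommitted, 2 = read committed,
-- 3 = repeatable read, 4 = serializable, 5 = snapshot.
SELECT transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;

BEGIN TRANSACTION;
    SELECT ProductID FROM dbo.Product WHERE ProductID BETWEEN 10 AND 20;
    -- Under serializable, repeating the read returns the same rows, because
    -- other transactions cannot change or insert rows in the range read.
    SELECT ProductID FROM dbo.Product WHERE ProductID BETWEEN 10 AND 20;
COMMIT TRANSACTION;

-- Return to the default level so later work is not over-protected.
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```

Keeping the higher level scoped to the transactions that need it limits the extra blocking imposed on other users.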

SQL Server Database Engine Isolation Levels
The ISO standard defines the following isolation levels, all of which are supported by the SQL Server Database Engine:

ISOLATION LEVEL | DEFINITION
Read uncommitted | The lowest isolation level where transactions are isolated only enough to ensure that physically corrupt data is not read. In this level, dirty reads are allowed, so one transaction may see not-yet-committed changes made by other transactions.
Read committed | Allows a transaction to read data previously read (not modified) by another transaction without waiting for the first transaction to complete. The SQL Server Database Engine keeps write locks (acquired on selected data) until the end of the transaction, but read locks are released as soon as the SELECT operation is performed. This is the SQL Server Database Engine default level.
Repeatable read | The SQL Server Database Engine keeps read and write locks that are acquired on selected data until the end of the transaction. However, because range-locks are not managed, phantom reads can occur.
Serializable | The highest level where transactions are completely isolated from one another. The SQL Server Database Engine keeps read and write locks acquired on selected data until the end of the transaction. Range-locks are acquired when a SELECT operation uses a ranged WHERE clause, especially to avoid phantom reads. Note: DDL operations and transactions on replicated tables may fail when serializable isolation level is requested. This is because replication queries use hints that may be incompatible with serializable isolation level.

SQL Server also supports two additional transaction isolation levels that use row versioning. One is an implementation of read committed isolation, and one is a transaction isolation level, snapshot.
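Both row versioning-based levels are controlled by database options. The sketch below shows one way they might be enabled and used; SalesDb and dbo.Product are hypothetical names, and the ALTER DATABASE statements are run outside any explicit transaction. Changing READ_COMMITTED_SNAPSHOT may have to wait for other connections to the database to finish, so this is best treated as a maintenance-time change.

```sql
-- SalesDb and dbo.Product are hypothetical names used for illustration.

-- Make read committed use row versioning for statement-level consistency.
ALTER DATABASE SalesDb SET READ_COMMITTED_SNAPSHOT ON;

-- Allow transactions in this database to request snapshot isolation.
ALTER DATABASE SalesDb SET ALLOW_SNAPSHOT_ISOLATION ON;
GO

USE SalesDb;
GO

SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRANSACTION;
    -- Reads use row versions rather than shared locks, so they do not
    -- block on rows that other transactions are modifying.
    SELECT ProductID, ListPrice FROM dbo.Product;
COMMIT TRANSACTION;
```

With only READ_COMMITTED_SNAPSHOT enabled, existing code running at the default read committed level picks up row versioning without any SET TRANSACTION ISOLATION LEVEL change.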

ROW VERSIONING ISOLATION LEVEL | DEFINITION
Read Committed Snapshot | When the READ_COMMITTED_SNAPSHOT database option is set ON, read committed isolation uses row versioning to provide statement-level read consistency. Read operations require only SCH-S table level locks and no page or row locks. That is, the SQL Server Database Engine uses row versioning to present each statement with a transactionally consistent snapshot of the data as it existed at the start of the statement. Locks are not used to protect the data from updates by other transactions. A user-defined function can return data that was committed after the time the statement containing the UDF began. When the READ_COMMITTED_SNAPSHOT database option is set OFF, which is the default setting, read committed isolation uses shared locks to prevent other transactions from modifying rows while the current transaction is running a read operation. The shared locks also block the statement from reading rows modified by other transactions until the other transaction is completed. Both implementations meet the ISO definition of read committed isolation.
Snapshot | The snapshot isolation level uses row versioning to provide transaction-level read consistency. Read operations acquire no page or row locks; only SCH-S table locks are acquired. When reading rows modified by another transaction, they retrieve the version of the row that existed when the transaction started. You can only use Snapshot isolation against a database when the ALLOW_SNAPSHOT_ISOLATION database option is set ON. By default, this option is set OFF for user databases. Note: SQL Server does not support versioning of metadata. For this reason, there are restrictions on what DDL operations can be performed in an explicit transaction that is running under snapshot isolation.
The following DDL statements are not permitted under snapshot isolation after a BEGIN TRANSACTION statement: ALTER TABLE, CREATE INDEX, CREATE XML INDEX, ALTER INDEX, DROP INDEX, DBCC REINDEX, ALTER PARTITION FUNCTION, ALTER PARTITION SCHEME, or any common language runtime (CLR) DDL statement. These statements are permitted when you are using snapshot isolation within implicit transactions. An implicit transaction, by definition, is a single statement that makes it possible to enforce the semantics of snapshot isolation, even with DDL statements. Violations of this principle can cause error 3961: Snapshot isolation transaction failed in database '%.*ls' because the object accessed by the statement has been modified by a DDL statement in another concurrent transaction since the start of this transaction. It is not allowed because the metadata is not versioned. A concurrent update to metadata could lead to inconsistency if mixed with snapshot isolation.

The following table shows the concurrency side effects enabled by the different isolation levels.

ISOLATION LEVEL | DIRTY READ | NONREPEATABLE READ | PHANTOM
Read uncommitted | Yes | Yes | Yes
Read committed | No | Yes | Yes
Repeatable read | No | No | Yes
Snapshot | No | No | No
Serializable | No | No | No

For more information about the specific types of locking or row versioning controlled by each transaction isolation level, see SET TRANSACTION ISOLATION LEVEL (Transact-SQL).

Transaction isolation levels can be set using Transact-SQL or through a database API.

Transact-SQL
Transact-SQL scripts use the SET TRANSACTION ISOLATION LEVEL statement.

ADO
ADO applications set the IsolationLevel property of the Connection object to adXactReadUncommitted, adXactReadCommitted, adXactRepeatableRead, or adXactReadSerializable.

ADO.NET
ADO.NET applications using the System.Data.SqlClient managed namespace can call the SqlConnection.BeginTransaction method and set the IsolationLevel option to Unspecified, Chaos, ReadUncommitted, ReadCommitted, RepeatableRead, Serializable, or Snapshot.

OLE DB
When starting a transaction, applications using OLE DB call ITransactionLocal::StartTransaction with isoLevel set to ISOLATIONLEVEL_READUNCOMMITTED, ISOLATIONLEVEL_READCOMMITTED, ISOLATIONLEVEL_REPEATABLEREAD, ISOLATIONLEVEL_SNAPSHOT, or ISOLATIONLEVEL_SERIALIZABLE.
When specifying the transaction isolation level in autocommit mode, OLE DB applications can set the DBPROPSET_SESSION property DBPROP_SESS_AUTOCOMMITISOLEVELS to DBPROPVAL_TI_CHAOS, DBPROPVAL_TI_READUNCOMMITTED, DBPROPVAL_TI_BROWSE, DBPROPVAL_TI_CURSORSTABILITY, DBPROPVAL_TI_READCOMMITTED, DBPROPVAL_TI_REPEATABLEREAD, DBPROPVAL_TI_SERIALIZABLE, DBPROPVAL_TI_ISOLATED, or DBPROPVAL_TI_SNAPSHOT.

ODBC
ODBC applications call SQLSetConnectAttr with Attribute set to SQL_ATTR_TXN_ISOLATION and ValuePtr set to SQL_TXN_READ_UNCOMMITTED, SQL_TXN_READ_COMMITTED, SQL_TXN_REPEATABLE_READ, or SQL_TXN_SERIALIZABLE.
For snapshot transactions, applications call SQLSetConnectAttr with Attribute set to SQL_COPT_SS_TXN_ISOLATION and ValuePtr set to SQL_TXN_SS_SNAPSHOT. A snapshot transaction can be retrieved using either SQL_COPT_SS_TXN_ISOLATION or SQL_ATTR_TXN_ISOLATION.

Locking in the SQL Server Database Engine
Locking is a mechanism used by the SQL Server Database Engine to synchronize access by multiple users to the same piece of data at the same time.

Before a transaction acquires a dependency on the current state of a piece of data, such as by reading or modifying the data, it must protect itself from the effects of another transaction modifying the same data.
The transaction does this by requesting a lock on the piece of data. Locks have different modes, such as shared or exclusive. The lock mode defines the level of dependency the transaction has on the data. No transaction can be granted a lock that would conflict with the mode of a lock already granted on that data to another transaction. If a transaction requests a lock mode that conflicts with a lock that has already been granted on the same data, the instance of the SQL Server Database Engine will pause the requesting transaction until the first lock is released.

When a transaction modifies a piece of data, it holds the lock protecting the modification until the end of the transaction. How long a transaction holds the locks acquired to protect read operations depends on the transaction isolation level setting. All locks held by a transaction are released when the transaction completes (either commits or rolls back).

Applications do not typically request locks directly. Locks are managed internally by a part of the SQL Server Database Engine called the lock manager. When an instance of the SQL Server Database Engine processes a Transact-SQL statement, the SQL Server Database Engine query processor determines which resources are to be accessed. The query processor determines what types of locks are required to protect each resource based on the type of access and the transaction isolation level setting. The query processor then requests the appropriate locks from the lock manager. The lock manager grants the locks if there are no conflicting locks held by other transactions.

Lock Granularity and Hierarchies
The SQL Server Database Engine has multigranular locking that allows different types of resources to be locked by a transaction. To minimize the cost of locking, the SQL Server Database Engine locks resources automatically at a level appropriate to the task.
Locking at a smaller granularity, such as rows, increases concurrency but has a higher overhead because more locks must be held if many rows are locked. Locking at a larger granularity, such as tables, is expensive in terms of concurrency because locking an entire table restricts access to any part of the table by other transactions. However, it has a lower overhead because fewer locks are being maintained.

The SQL Server Database Engine often has to acquire locks at multiple levels of granularity to fully protect a resource. This group of locks at multiple levels of granularity is called a lock hierarchy. For example, to fully protect a read of an index, an instance of the SQL Server Database Engine may have to acquire share locks on rows and intent share locks on the pages and table.

The following table shows the resources that the SQL Server Database Engine can lock.

RESOURCE | DESCRIPTION
RID | A row identifier used to lock a single row within a heap.
KEY | A row lock within an index used to protect key ranges in serializable transactions.
PAGE | An 8-kilobyte (KB) page in a database, such as data or index pages.
EXTENT | A contiguous group of eight pages, such as data or index pages.
HoBT | A heap or B-tree. A lock protecting a B-tree (index) or the heap data pages in a table that does not have a clustered index.
TABLE | The entire table, including all data and indexes.
FILE | A database file.
APPLICATION | An application-specified resource.
METADATA | Metadata locks.
ALLOCATION_UNIT | An allocation unit.
DATABASE | The entire database.

NOTE
HoBT and TABLE locks can be affected by the LOCK_ESCALATION option of ALTER TABLE.

Lock Modes
The SQL Server Database Engine locks resources using different lock modes that determine how the resources can be accessed by concurrent transactions.

The following table shows the resource lock modes that the SQL Server Database Engine uses.
LOCK MODE          DESCRIPTION
Shared (S)         Used for read operations that do not change or update data, such as a SELECT statement.
Update (U)         Used on resources that can be updated. Prevents a common form of deadlock that occurs when multiple sessions are reading, locking, and potentially updating resources later.
Exclusive (X)      Used for data-modification operations, such as INSERT, UPDATE, or DELETE. Ensures that multiple updates cannot be made to the same resource at the same time.
Intent             Used to establish a lock hierarchy. The types of intent locks are: intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).
Schema             Used when an operation dependent on the schema of a table is executing. The types of schema locks are: schema modification (Sch-M) and schema stability (Sch-S).
Bulk Update (BU)   Used when bulk copying data into a table and the TABLOCK hint is specified.
Key-range          Protects the range of rows read by a query when using the serializable transaction isolation level. Ensures that other transactions cannot insert rows that would qualify for the queries of the serializable transaction if the queries were run again.

Shared Locks

Shared (S) locks allow concurrent transactions to read (SELECT) a resource under pessimistic concurrency control. No other transactions can modify the data while shared (S) locks exist on the resource. Shared (S) locks on a resource are released as soon as the read operation completes, unless the transaction isolation level is set to repeatable read or higher, or a locking hint is used to retain the shared (S) locks for the duration of the transaction.

Update Locks

Update (U) locks prevent a common form of deadlock. In a repeatable read or serializable transaction, the transaction reads data, acquiring a shared (S) lock on the resource (page or row), and then modifies the data, which requires lock conversion to an exclusive (X) lock.
If two transactions acquire shared-mode locks on a resource and then attempt to update data concurrently, one transaction attempts the lock conversion to an exclusive (X) lock. The shared-mode-to-exclusive lock conversion must wait because the exclusive lock for one transaction is not compatible with the shared-mode lock of the other transaction; a lock wait occurs. The second transaction attempts to acquire an exclusive (X) lock for its update. Because both transactions are converting to exclusive (X) locks, and they are each waiting for the other transaction to release its shared-mode lock, a deadlock occurs.

To avoid this potential deadlock problem, update (U) locks are used. Only one transaction can obtain an update (U) lock to a resource at a time. If a transaction modifies a resource, the update (U) lock is converted to an exclusive (X) lock.

Exclusive Locks

Exclusive (X) locks prevent access to a resource by concurrent transactions. With an exclusive (X) lock, no other transactions can modify data; read operations can take place only with the use of the NOLOCK hint or read uncommitted isolation level.

Data modification statements, such as INSERT, UPDATE, and DELETE, combine both modification and read operations. The statement first performs read operations to acquire data before performing the required modification operations. Data modification statements, therefore, typically request both shared locks and exclusive locks. For example, an UPDATE statement might modify rows in one table based on a join with another table. In this case, the UPDATE statement requests shared locks on the rows read in the join table in addition to requesting exclusive locks on the updated rows.

Intent Locks

The SQL Server Database Engine uses intent locks to protect placing a shared (S) lock or exclusive (X) lock on a resource lower in the lock hierarchy.
Intent locks are so named because they are acquired before a lock at the lower level, and therefore signal intent to place locks at a lower level.

Intent locks serve two purposes:

- To prevent other transactions from modifying the higher-level resource in a way that would invalidate the lock at the lower level.
- To improve the efficiency of the SQL Server Database Engine in detecting lock conflicts at the higher level of granularity.

For example, a shared intent lock is requested at the table level before shared (S) locks are requested on pages or rows within that table. Setting an intent lock at the table level prevents another transaction from subsequently acquiring an exclusive (X) lock on the table containing that page. Intent locks improve performance because the SQL Server Database Engine examines intent locks only at the table level to determine if a transaction can safely acquire a lock on that table. This removes the requirement to examine every row or page lock on the table to determine if a transaction can lock the entire table.

Intent locks include intent shared (IS), intent exclusive (IX), and shared with intent exclusive (SIX).

LOCK MODE                            DESCRIPTION
Intent shared (IS)                   Protects requested or acquired shared locks on some (but not all) resources lower in the hierarchy.
Intent exclusive (IX)                Protects requested or acquired exclusive locks on some (but not all) resources lower in the hierarchy. IX is a superset of IS, and it also protects requesting shared locks on lower level resources.
Shared with intent exclusive (SIX)   Protects requested or acquired shared locks on all resources lower in the hierarchy and intent exclusive locks on some (but not all) of the lower level resources. Concurrent IS locks at the top-level resource are allowed. For example, acquiring a SIX lock on a table also acquires intent exclusive locks on the pages being modified and exclusive locks on the modified rows. There can be only one SIX lock per resource at one time, preventing updates to the resource made by other transactions, although other transactions can read resources lower in the hierarchy by obtaining IS locks at the table level.
Intent update (IU)                   Protects requested or acquired update locks on all resources lower in the hierarchy. IU locks are used only on page resources. IU locks are converted to IX locks if an update operation takes place.
Shared intent update (SIU)           A combination of S and IU locks, as a result of acquiring these locks separately and simultaneously holding both locks. For example, a transaction executes a query with the PAGLOCK hint and then executes an update operation. The query with the PAGLOCK hint acquires the S lock, and the update operation acquires the IU lock.
Update intent exclusive (UIX)        A combination of U and IX locks, as a result of acquiring these locks separately and simultaneously holding both locks.

Schema Locks

The SQL Server Database Engine uses schema modification (Sch-M) locks during a table data definition language (DDL) operation, such as adding a column or dropping a table. During the time that it is held, the Sch-M lock prevents concurrent access to the table. This means the Sch-M lock blocks all outside operations until the lock is released.

Some data manipulation language (DML) operations, such as table truncation, use Sch-M locks to prevent access to affected tables by concurrent operations.

The SQL Server Database Engine uses schema stability (Sch-S) locks when compiling and executing queries. Sch-S locks do not block any transactional locks, including exclusive (X) locks. Therefore, other transactions, including those with X locks on a table, continue to run while a query is being compiled. However, concurrent DDL operations, and concurrent DML operations that acquire Sch-M locks, cannot be performed on the table.
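As a rough illustration of the lock hierarchy described under Intent Locks, the following Python sketch lists the locks that would protect access to a single row: an intent lock on the table and on the page, followed by the requested lock on the row itself. The function name `locks_for_row_access` and the fixed three-level table/page/row hierarchy are assumptions made for this example only; the granularity the engine actually chooses depends on the query, the isolation level, and lock escalation.

```python
# Intent mode that protects a lower-level lock, taken from the intent lock
# table: IS protects shared locks, IX protects exclusive locks.
INTENT_FOR = {"S": "IS", "X": "IX"}


def locks_for_row_access(row_mode):
    """Return (resource, mode) pairs from the highest level down to the row.

    Hypothetical helper: it models only shared (S) and exclusive (X) row locks
    and assumes a simple TABLE -> PAGE -> ROW hierarchy.
    """
    if row_mode not in INTENT_FOR:
        raise ValueError("this sketch models only S and X row locks")
    intent = INTENT_FOR[row_mode]
    # Intent locks are acquired first, at the higher levels of the hierarchy.
    return [("TABLE", intent), ("PAGE", intent), ("ROW", row_mode)]


if __name__ == "__main__":
    for mode in ("S", "X"):
        print(mode, locks_for_row_access(mode))
```

Reading a row therefore appears as IS, IS, S in this model, and modifying a row as IX, IX, X, which matches the index-read example given under Lock Granularity and Hierarchies.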
Bulk Update Locks

Bulk update (BU) locks allow multiple threads to bulk load data concurrently into the same table while preventing other processes that are not bulk loading data from accessing the table. The SQL Server Database Engine uses bulk update (BU) locks when both of the following conditions are true.

- You use the Transact-SQL BULK INSERT statement, or the OPENROWSET(BULK) function, or you use one of the Bulk Insert API commands such as .NET SqlBulkCopy, OLEDB Fast Load APIs, or the ODBC Bulk Copy APIs to bulk copy data into a table.
- The TABLOCK hint is specified or the table lock on bulk load table option is set using sp_tableoption.

TIP
Unlike the BULK INSERT statement, which holds a less restrictive Bulk Update lock, INSERT INTO…SELECT with the TABLOCK hint holds an exclusive (X) lock on the table. This means that you cannot insert rows using parallel insert operations.

Key-Range Locks

Key-range locks protect a range of rows implicitly included in a record set being read by a Transact-SQL statement while using the serializable transaction isolation level. Key-range locking prevents phantom reads. By protecting the ranges of keys between rows, it also prevents phantom insertions or deletions into a record set accessed by a transaction.

Lock Compatibility

Lock compatibility controls whether multiple transactions can acquire locks on the same resource at the same time. If a resource is already locked by another transaction, a new lock request can be granted only if the mode of the requested lock is compatible with the mode of the existing lock. If the mode of the requested lock is not compatible with the existing lock, the transaction requesting the new lock waits for the existing lock to be released or for the lock timeout interval to expire. For example, no lock modes are compatible with exclusive locks.
While an exclusive (X) lock is held, no other transaction can acquire a lock of any kind (shared, update, or exclusive) on that resource until the exclusive (X) lock is released. Alternatively, if a shared (S) lock has been applied to a resource, other transactions can also acquire a shared lock or an update (U) lock on that item even if the first transaction has not completed. However, other transactions cannot acquire an exclusive lock until the shared lock has been released.

The following table shows the compatibility of the most commonly encountered lock modes.

                                     EXISTING GRANTED MODE
REQUESTED MODE                       IS    S     U     IX    SIX   X
Intent shared (IS)                   Yes   Yes   Yes   Yes   Yes   No
Shared (S)                           Yes   Yes   Yes   No    No    No
Update (U)                           Yes   Yes   No    No    No    No
Intent exclusive (IX)                Yes   No    No    Yes   No    No
Shared with intent exclusive (SIX)   Yes   No    No    No    No    No
Exclusive (X)                        No    No    No    No    No    No

NOTE
An intent exclusive (IX) lock is compatible with an IX lock mode because IX means the intention is to update only some of the rows rather than all of them. Other transactions that attempt to read or update some of the rows are also permitted as long as they are not the same rows being updated by other transactions. Further, if two transactions attempt to update the same row, both transactions will be granted an IX lock at table and page level. However, one transaction will be granted an X lock at row level. The other transaction must wait until the row-level lock is removed.

Use the following table to determine the compatibility of all the lock modes available in SQL Server.

Key-Range Locking

Key-range locks protect a range of rows implicitly included in a record set being read by a Transact-SQL statement while using the serializable transaction isolation level. The serializable isolation level requires that any query executed during a transaction must obtain the same set of rows every time it is executed during the transaction.
A key-range lock protects this requirement by preventing other transactions from inserting new rows whose keys would fall in the range of keys read by the serializable transaction.

Key-range locking prevents phantom reads. By protecting the ranges of keys between rows, it also prevents phantom insertions into a set of records accessed by a transaction.

A key-range lock is placed on an index, specifying a beginning and ending key value. This lock blocks any attempt to insert, update, or delete any row with a key value that falls in the range because those operations would first have to acquire a lock on the index. For example, a serializable transaction could issue a SELECT statement that reads all rows whose key values are between 'AAA' and 'CZZ'. A key-range lock on the key values in the range from 'AAA' to 'CZZ' prevents other transactions from inserting rows with key values anywhere in that range, such as 'ADG', 'BBD', or 'CAL'.

Key-Range Lock Modes

Key-range locks include both a range and a row component specified in range-row format:

- Range represents the lock mode protecting the range between two consecutive index entries.
- Row represents the lock mode protecting the index entry.
- Mode represents the combined lock mode used. Key-range lock modes consist of two parts. The first represents the type of lock used to lock the index range (RangeT) and the second represents the lock type used to lock a specific key (K). The two parts are connected with a hyphen (-), such as RangeT-K.

RANGE    ROW    MODE       DESCRIPTION
RangeS   S      RangeS-S   Shared range, shared resource lock; serializable range scan.
RangeS   U      RangeS-U   Shared range, update resource lock; serializable update scan.
RangeI   Null   RangeI-N   Insert range, null resource lock; used to test ranges before inserting a new key into an index.
RangeX   X      RangeX-X   Exclusive range, exclusive resource lock; used when updating a key in a range.
NOTE
The internal Null lock mode is compatible with all other lock modes.

Key-range lock modes have a compatibility matrix that shows which locks are compatible with other locks obtained on overlapping keys and ranges.

                  EXISTING GRANTED MODE
REQUESTED MODE    S     U     X     RangeS-S   RangeS-U   RangeI-N   RangeX-X
Shared (S)        Yes   Yes   No    Yes        Yes        Yes        No
Update (U)        Yes   No    No    Yes        No         Yes        No
Exclusive (X)     No    No    No    No         No         Yes        No
RangeS-S          Yes   Yes   No    Yes        Yes        No         No
RangeS-U          Yes   No    No    Yes        No         No         No
RangeI-N          Yes   Yes   Yes   No         No         Yes        No
RangeX-X          No    No    No    No         No         No         No

Conversion Locks

Conversion locks are created when a key-range lock overlaps another lock.

LOCK 1     LOCK 2     CONVERSION LOCK
S          RangeI-N   RangeI-S
U          RangeI-N   RangeI-U
X          RangeI-N   RangeI-X
RangeI-N   RangeS-S   RangeX-S
RangeI-N   RangeS-U   RangeX-U

Conversion locks can be observed for a short period of time under different complex circumstances, sometimes while running concurrent processes.

Serializable Range Scan, Singleton Fetch, Delete, and Insert

Key-range locking ensures that the following operations are serializable:

- Range scan query
- Singleton fetch of nonexistent row
- Delete operation
- Insert operation

Before key-range locking can occur, the following conditions must be satisfied:

- The transaction-isolation level must be set to SERIALIZABLE.
- The query processor must use an index to implement the range filter predicate. For example, the WHERE clause in a SELECT statement could establish a range condition with this predicate: ColumnX BETWEEN N'AAA' AND N'CZZ'. A key-range lock can only be acquired if ColumnX is covered by an index key.

Examples

The following table and index are used as a basis for the key-range locking examples that follow.

Range Scan Query

To ensure a range scan query is serializable, the same query should return the same results each time it is executed within the same transaction. New rows must not be inserted within the range scan query by other transactions; otherwise, these become phantom inserts.
For example, the following query uses the table and index in the previous illustration:

    SELECT name FROM mytable WHERE name BETWEEN 'A' AND 'C';

Key-range locks are placed on the index entries corresponding to the range of data rows where the name is between the values Adam and Dale, preventing new rows qualifying in the previous query from being added or deleted. Although the first name in this range is Adam, the RangeS-S mode key-range lock on this index entry ensures that no new names beginning with the letter A can be added before Adam, such as Abigail. Similarly, the RangeS-S key-range lock on the index entry for Dale ensures that no new names beginning with the letter C can be added after Carlos, such as Clive.

NOTE
The number of RangeS-S locks held is n+1, where n is the number of rows that satisfy the query.

Singleton Fetch of Nonexistent Data

If a query within a transaction attempts to select a row that does not exist, issuing the query at a later point within the same transaction has to return the same result. No other transaction can be allowed to insert that nonexistent row. For example, given this query:

    SELECT name FROM mytable WHERE name = 'Bill';

A key-range lock is placed on the index entry corresponding to the name range from Ben to Bing because the name Bill would be inserted between these two adjacent index entries. The RangeS-S mode key-range lock is placed on the index entry Bing. This prevents any other transaction from inserting values, such as Bill, between the index entries Ben and Bing.

Delete Operation

When deleting a value within a transaction, the range the value falls into does not have to be locked for the duration of the transaction performing the delete operation. Locking the deleted key value until the end of the transaction is sufficient to maintain serializability.
For example, given this DELETE statement:

    DELETE mytable WHERE name = 'Bob';

An exclusive (X) lock is placed on the index entry corresponding to the name Bob. Other transactions can insert or delete values before or after the deleted value Bob. However, any transaction that attempts to read, insert, or delete the value Bob will be blocked until the deleting transaction either commits or rolls back.

Range delete can be executed using three basic lock modes: row, page, or table lock. The row, page, or table locking strategy is decided by the query optimizer or can be specified by the user through optimizer hints such as ROWLOCK, PAGLOCK, or TABLOCK. When PAGLOCK or TABLOCK is used, the SQL Server Database Engine immediately deallocates an index page if all rows are deleted from this page. In contrast, when ROWLOCK is used, all deleted rows are marked only as deleted; they are removed from the index page later using a background task.

Insert Operation

When inserting a value within a transaction, the range the value falls into does not have to be locked for the duration of the transaction performing the insert operation. Locking the inserted key value until the end of the transaction is sufficient to maintain serializability. For example, given this INSERT statement:

    INSERT mytable VALUES ('Dan');

The RangeI-N mode key-range lock is placed on the index entry corresponding to the name David to test the range. If the lock is granted, Dan is inserted and an exclusive (X) lock is placed on the value Dan. The RangeI-N mode key-range lock is necessary only to test the range and is not held for the duration of the transaction performing the insert operation. Other transactions can insert or delete values before or after the inserted value Dan. However, any transaction attempting to read, insert, or delete the value Dan will be blocked until the inserting transaction either commits or rolls back.
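The key-range compatibility matrix shown earlier can be encoded as a small lookup, which makes it easier to reason about the insert example above. The following Python sketch is illustrative only: the names `KEY_RANGE_MODES`, `KEY_RANGE_COMPAT`, and `is_compatible` are assumptions introduced here, and the values are transcribed from the matrix in this guide rather than read from the engine.

```python
# Column order of the key-range compatibility matrix (existing granted mode).
KEY_RANGE_MODES = ["S", "U", "X", "RangeS-S", "RangeS-U", "RangeI-N", "RangeX-X"]

# One row per requested mode; "Y" means the request is compatible with the
# granted mode in the same column position, "N" means it must wait.
KEY_RANGE_COMPAT = {
    "S":        "YYNYYYN",
    "U":        "YNNYNYN",
    "X":        "NNNNNYN",
    "RangeS-S": "YYNYYNN",
    "RangeS-U": "YNNYNNN",
    "RangeI-N": "YYYNNYN",
    "RangeX-X": "NNNNNNN",
}


def is_compatible(requested, granted):
    """Return True if `requested` can be granted while `granted` is held."""
    column = KEY_RANGE_MODES.index(granted)
    return KEY_RANGE_COMPAT[requested][column] == "Y"


if __name__ == "__main__":
    # An insert tests its range with RangeI-N. It must wait if a serializable
    # range scan already holds RangeS-S on the same index entry.
    print(is_compatible("RangeI-N", "RangeS-S"))  # False
    # Two inserts testing the same range do not block each other.
    print(is_compatible("RangeI-N", "RangeI-N"))  # True
```

In this model, the INSERT of Dan is blocked only while another transaction holds a range lock (RangeS-S, RangeS-U, or RangeX-X) on the index entry David; a plain S, U, or X lock on that entry does not block the range test.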
Dynamic Locking

Using low-level locks, such as row locks, increases concurrency by decreasing the probability that two transactions will request locks on the same piece of data at the same time. Using low-level locks also increases the number of locks and the resources needed to manage them. Using high-level table or page locks lowers overhead, but at the expense of lowering concurrency.

The SQL Server Database Engine uses a dynamic locking strategy to determine the most cost-effective locks. The SQL Server Database Engine automatically determines what locks are most appropriate when the query is executed, based on the characteristics of the schema and query. For example, to reduce the overhead of locking, the optimizer may choose page-level locks in an index when performing an index scan.

Dynamic locking has the following advantages:

- Simplified database administration. Database administrators do not have to adjust lock escalation thresholds.
- Increased performance. The SQL Server Database Engine minimizes system overhead by using locks appropriate to the task.
- Application developers can concentrate on development. The SQL Server Database Engine adjusts locking automatically.

In SQL Server 2008 and later versions, the behavior of lock escalation has changed with the introduction of the LOCK_ESCALATION option. For more information, see the LOCK_ESCALATION option of ALTER TABLE.

Deadlocking

A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock. For example:

- Transaction A acquires a share lock on row 1.
- Transaction B acquires a share lock on row 2.
- Transaction A now requests an exclusive lock on row 2, and is blocked until transaction B finishes and releases the share lock it has on row 2.
- Transaction B now requests an exclusive lock on row 1, and is blocked until transaction A finishes and releases the share lock it has on row 1.
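The four steps above can be modeled as a wait-for graph. The following Python sketch is an illustration only, not the engine's implementation: it derives which transaction waits for which from the granted and pending locks, using the values in the lock compatibility table of this guide, and then searches the graph for a cycle. The names `compatible`, `wait_for_graph`, and `find_cycle`, and the `(transaction, resource, mode)` tuple layout, are assumptions introduced for this example.

```python
# Compatibility of the common lock modes (requested row vs. granted column),
# transcribed from the lock compatibility table in this guide.
MODES = ["IS", "S", "U", "IX", "SIX", "X"]
COMPAT = {
    "IS":  "YYYYYN",
    "S":   "YYYNNN",
    "U":   "YYNNNN",
    "IX":  "YNNYNN",
    "SIX": "YNNNNN",
    "X":   "NNNNNN",
}


def compatible(requested, granted):
    """Return True if `requested` can be granted while `granted` is held."""
    return COMPAT[requested][MODES.index(granted)] == "Y"


def wait_for_graph(granted, pending):
    """Map each waiting transaction to the set of transactions blocking it.

    Both arguments are lists of (transaction, resource, mode) tuples.
    """
    graph = {}
    for txn, resource, mode in pending:
        for owner, held_resource, held_mode in granted:
            if owner != txn and held_resource == resource \
                    and not compatible(mode, held_mode):
                graph.setdefault(txn, set()).add(owner)
    return graph


def find_cycle(graph):
    """Depth-first search; return the transactions on a cycle, or None."""
    def visit(node, path):
        if node in path:
            return path[path.index(node):]
        for blocker in sorted(graph.get(node, ())):
            cycle = visit(blocker, path + [node])
            if cycle:
                return cycle
        return None

    for start in sorted(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# The example from the text: A and B each hold a shared lock on one row and
# request an exclusive lock on the other row.
granted = [("A", "row1", "S"), ("B", "row2", "S")]
pending = [("A", "row2", "X"), ("B", "row1", "X")]
graph = wait_for_graph(granted, pending)

if __name__ == "__main__":
    print(graph)              # A waits for B, and B waits for A
    print(find_cycle(graph))  # ['A', 'B']
```

If transaction B had not requested the exclusive lock on row 1, the graph would contain only the edge from A to B and `find_cycle` would return `None`: that is ordinary blocking rather than a deadlock.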
Transaction A cannot complete until transaction B completes, but transaction B is blocked by transaction A. This condition is also called a cyclic dependency: Transaction A has a dependency on transaction B, and transaction B closes the circle by having a dependency on transaction A.

Both transactions in a deadlock will wait forever unless the deadlock is broken by an external process. The SQL Server Database Engine deadlock monitor periodically checks for tasks that are in a deadlock. If the monitor detects a cyclic dependency, it chooses one of the tasks as a victim and terminates its transaction with an error. This allows the other task to complete its transaction. The application with the transaction that terminated with an error can retry the transaction, which usually completes after the other deadlocked transaction has finished.

Deadlocking is often confused with normal blocking. When a transaction requests a lock on a resource locked by another transaction, the requesting transaction waits until the lock is released. By default, SQL Server transactions do not time out, unless LOCK_TIMEOUT is set. The requesting transaction is blocked, not deadlocked, because the requesting transaction has not done anything to block the transaction owning the lock. Eventually, the owning transaction will complete and release the lock, and then the requesting transaction will be granted the lock and proceed.

Deadlocks are sometimes called a deadly embrace.

Deadlock is a condition that can occur on any system with multiple threads, not just on a relational database management system, and can occur for resources other than locks on database objects. For example, a thread in a multithreaded operating system might acquire one or more resources, such as blocks of memory. If the resource being acquired is currently owned by another thread, the first thread may have to wait for the owning thread to release the target resource.
The waiting thread is said to have a dependency on the owning thread for that particular resource. In an instance of the SQL Server Database Engine, sessions can deadlock when acquiring nondatabase resources, such as memory or threads.

In the illustration, transaction T1 has a dependency on transaction T2 for the Part table lock resource. Similarly, transaction T2 has a dependency on transaction T1 for the Supplier table lock resource. Because these dependencies form a cycle, there is a deadlock between transactions T1 and T2.

Deadlocks can also occur when a table is partitioned and the LOCK_ESCALATION setting of ALTER TABLE is set to AUTO. When LOCK_ESCALATION is set to AUTO, concurrency increases by allowing the SQL Server Database Engine to lock table partitions at the HoBT level instead of at the table level. However, when separate transactions hold partition locks in a table and each wants a lock somewhere on the other transaction's partition, this causes a deadlock. This type of deadlock can be avoided by setting LOCK_ESCALATION to TABLE, although this setting will reduce concurrency by forcing large updates to a partition to wait for a table lock.

Detecting and Ending Deadlocks

A deadlock occurs when two or more tasks permanently block each other by each task having a lock on a resource which the other tasks are trying to lock. The following graph presents a high level view of a deadlock state where:

- Task T1 has a lock on resource R1 (indicated by the arrow from R1 to T1) and has requested a lock on resource R2 (indicated by the arrow from T1 to R2).
- Task T2 has a lock on resource R2 (indicated by the arrow from R2 to T2) and has requested a lock on resource R1 (indicated by the arrow from T2 to R1).
- Because neither task can continue until a resource is available and neither resource can be released until a task continues, a deadlock state exists.

The SQL Server Database Engine automatically detects deadlock cycles within SQL Server.
The SQL Server Database Engine chooses one of the sessions as a deadlock victim and the current transaction is terminated with an error to break the deadlock.

Resources that can Deadlock

Each user session might have one or more tasks running on its behalf where each task might acquire or wait to acquire a variety of resources. The following types of resources can cause blocking that could result in a deadlock.

- Locks. Waiting to acquire locks on resources, such as objects, pages, rows, metadata, and applications can cause deadlock. For example, transaction T1 has a shared (S) lock on row r1 and is waiting to get an exclusive (X) lock on r2. Transaction T2 has a shared (S) lock on r2 and is waiting to get an exclusive (X) lock on row r1. This results in a lock cycle in which T1 and T2 wait for each other to release the locked resources.
- Worker threads. A queued task waiting for an available worker thread can cause deadlock. If the queued task owns resources that are blocking all worker threads, a deadlock will result. For example, session S1 starts a transaction and acquires a shared (S) lock on row r1 and then goes to sleep. Active sessions running on all available worker threads are trying to acquire exclusive (X) locks on row r1. Because session S1 cannot acquire a worker thread, it cannot commit the transaction and release the lock on row r1. This results in a deadlock.
- Memory. When concurrent requests are waiting for memory grants that cannot be satisfied with the available memory, a deadlock can occur. For example, two concurrent queries, Q1 and Q2, execute as user-defined functions that acquire 10MB and 20MB of memory respectively. If each query needs 30MB and the total available memory is 20MB, then Q1 and Q2 must wait for each other to release memory, and this results in a deadlock.
- Parallel query execution-related resources. Coordinator, producer, or consumer threads associated with an exchange port can block each other and cause a deadlock, usually when at least one other process that is not part of the parallel query is also involved. Also, when a parallel query starts execution, SQL Server determines the degree of parallelism, or the number of worker threads, based upon the current workload. If the system workload unexpectedly changes, for example, where new queries start running on the server or the system runs out of worker threads, then a deadlock could occur.
- Multiple Active Result Sets (MARS) resources. These resources are used to control interleaving of multiple active requests under MARS. For more information, see Using Multiple Active Result Sets (MARS).
    - User resource. When a thread is waiting for a resource that is potentially controlled by a user application, the resource is considered to be an external or user resource and is treated like a lock.
    - Session mutex. The tasks running in one session are interleaved, meaning that only one task can run under the session at a given time. Before the task can run, it must have exclusive access to the session mutex.
    - Transaction mutex. All tasks running in one transaction are interleaved, meaning that only one task can run under the transaction at a given time. Before the task can run, it must have exclusive access to the transaction mutex.

In order for a task to run under MARS, it must acquire the session mutex. If the task is running under a transaction, it must then acquire the transaction mutex. This guarantees that only one task is active at one time in a given session and a given transaction. Once the required mutexes have been acquired, the task can execute. When the task finishes, or yields in the middle of the request, it will first release the transaction mutex followed by the session mutex, in reverse order of acquisition. However, deadlocks can occur with these resources.
In the following code example, two tasks, user request U1 and user request U2, are running in the same session.

    U1: Rs1=Command1.Execute("insert sometable EXEC usp_someproc");
    U2: Rs2=Command2.Execute("select colA from sometable");

The stored procedure executing from user request U1 has acquired the session mutex. If the stored procedure takes a long time to execute, it is assumed by the SQL Server Database Engine that the stored procedure is waiting for input from the user. User request U2 is waiting for the session mutex while the user is waiting for the result set from U2, and U1 is waiting for a user resource. This deadlock state is logically illustrated as:

Deadlock Detection

All of the resources listed in the section above participate in the SQL Server Database Engine deadlock detection scheme. Deadlock detection is performed by a lock monitor thread that periodically initiates a search through all of the tasks in an instance of the SQL Server Database Engine. The following points describe the search process:

- The default interval is 5 seconds.
- If the lock monitor thread finds deadlocks, the deadlock detection interval will drop from 5 seconds to as low as 100 milliseconds depending on the frequency of deadlocks.
- If the lock monitor thread stops finding deadlocks, the SQL Server Database Engine increases the intervals between searches to 5 seconds.
- If a deadlock has just been detected, it is assumed that the next threads that must wait for a lock are entering the deadlock cycle. The first couple of lock waits after a deadlock has been detected will immediately trigger a deadlock search rather than wait for the next deadlock detection interval. For example, if the current interval is 5 seconds, and a deadlock was just detected, the next lock wait will kick off the deadlock detector immediately. If this lock wait is part of a deadlock, it will be detected right away rather than during the next deadlock search.
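The adaptive interval described in the points above can be pictured with a very small model. This Python sketch is a guess at the general shape only: the text gives the 5-second default and the 100-millisecond floor, but it does not say how quickly the interval shrinks or grows, so the halving step and the one-step reset used here are assumptions, as is the function name `next_interval_ms`.

```python
DEFAULT_INTERVAL_MS = 5000  # default search interval stated in the text
MIN_INTERVAL_MS = 100       # lowest interval stated in the text


def next_interval_ms(current_ms, deadlock_found):
    """Return the delay before the next deadlock search (hypothetical model)."""
    if deadlock_found:
        # ASSUMPTION: halve the interval after each search that finds a
        # deadlock. The source only says it can drop to as low as 100 ms.
        return max(MIN_INTERVAL_MS, current_ms // 2)
    # ASSUMPTION: return to the default in a single step once searches stop
    # finding deadlocks. The source only says the interval increases to 5 s.
    return DEFAULT_INTERVAL_MS


if __name__ == "__main__":
    interval = DEFAULT_INTERVAL_MS
    for found in (True, True, True, False):
        interval = next_interval_ms(interval, found)
        print(interval)
```

The point of the model is the trade-off the text describes: searches run more often only while deadlocks are actually being found, and fall back to the cheaper 5-second cadence otherwise.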
The SQL Server Database Engine typically performs periodic deadlock detection only. Because the number of deadlocks encountered in the system is usually small, periodic deadlock detection helps to reduce the overhead of deadlock detection in the system.

When the lock monitor initiates deadlock search for a particular thread, it identifies the resource on which the thread is waiting. The lock monitor then finds the owner(s) for that particular resource and recursively continues the deadlock search for those threads until it finds a cycle. A cycle identified in this manner forms a deadlock.

After a deadlock is detected, the SQL Server Database Engine ends a deadlock by choosing one of the threads as a deadlock victim. The SQL Server Database Engine terminates the current batch being executed for the thread, rolls back the transaction of the deadlock victim, and returns a 1205 error to the application. Rolling back the transaction for the deadlock victim releases all locks held by the transaction. This allows the transactions of the other threads to become unblocked and continue. The 1205 deadlock victim error records information about the threads and resources involved in a deadlock in the error log.

By default, the SQL Server Database Engine chooses as the deadlock victim the session running the transaction that is least expensive to roll back. Alternatively, a user can specify the priority of sessions in a deadlock situation using the SET DEADLOCK_PRIORITY statement. DEADLOCK_PRIORITY can be set to LOW, NORMAL, or HIGH, or alternatively can be set to any integer value in the range (-10 to 10). The deadlock priority defaults to NORMAL. If two sessions have different deadlock priorities, the session with the lower priority is chosen as the deadlock victim. If both sessions have the same deadlock priority, the session with the transaction that is least expensive to roll back is chosen.
If sessions involved in the deadlock cycle have the same deadlock priority and the same cost, a victim is chosen randomly.

When working with CLR, the deadlock monitor automatically detects deadlock for synchronization resources (monitors, reader/writer lock and thread join) accessed inside managed procedures. However, the deadlock is resolved by throwing an exception in the procedure that was selected to be the deadlock victim. It is important to understand that the exception does not automatically release resources currently owned by the victim; the resources must be explicitly released. Consistent with exception behavior, the exception used to identify a deadlock victim can be caught and dismissed.

Deadlock Information Tools

To view deadlock information, the SQL Server Database Engine provides monitoring tools in the form of the system_health xEvent session, two trace flags, and the deadlock graph event in SQL Profiler.

Deadlock in system_health session

Starting with SQL Server 2012 (11.x), when deadlocks occur, the system_health session captures all xml_deadlock_report xEvents. The system_health session is enabled by default. The deadlock graph captured typically has three distinct nodes:

- victim-list. The deadlock victim process identifier.
- process-list. Information on all the processes involved in the deadlock.
- resource-list. Information about the resources involved in the deadlock.

When you open the system_health session file or ring buffer and an xml_deadlock_report xEvent is recorded, Management Studio presents a graphical depiction of the tasks and resources involved in the deadlock.

The following query can view all deadlock events captured by the system_health session ring buffer.
    SELECT xdr.value('@timestamp', 'datetime') AS [Date],
           xdr.query('.') AS [Event_Data]
    FROM (SELECT CAST([target_data] AS XML) AS Target_Data
          FROM sys.dm_xe_session_targets AS xt
          INNER JOIN sys.dm_xe_sessions AS xs
              ON xs.address = xt.event_session_address
          WHERE xs.name = N'system_health'
            AND xt.target_name = N'ring_buffer'
         ) AS XML_Data
    CROSS APPLY Target_Data.nodes('RingBufferTarget/event[@name="xml_deadlock_report"]') AS XEventData(xdr)
    ORDER BY [Date] DESC

The following example shows the output after clicking the first link in the result set:

    SELECT c2, c3 FROM t1 WHERE c2 BETWEEN @p1 AND @p1+
    unknown
    SET NOCOUNT ON WHILE (1=1) BEGIN EXEC p1 4 END
    UPDATE t1 SET c2 = c2+1 WHERE c1 = @p
    unknown
    SET NOCOUNT ON WHILE (1=1) BEGIN EXEC p2 4 END

For more information, see Use the system_health Session.

Trace Flag 1204 and Trace Flag 1222

When deadlocks occur, trace flag 1204 and trace flag 1222 return information that is captured in the SQL Server error log. Trace flag 1204 reports deadlock information formatted by each node involved in the deadlock. Trace flag 1222 formats deadlock information, first by processes and then by resources. It is possible to enable both trace flags to obtain two representations of the same deadlock event.

In addition to defining the properties of trace flag 1204 and 1222, the following summary also shows the similarities and differences.

PROPERTY: Output format

Trace flag 1204 and trace flag 1222:
Output is captured in the SQL Server error log.

Trace flag 1204 only:
Focused on the nodes involved in the deadlock. Each node has a dedicated section, and the final section describes the deadlock victim.

Trace flag 1222 only:
Returns information in an XML-like format that does not conform to an XML Schema Definition (XSD) schema. The format has three major sections. The first section declares the deadlock victim. The second section describes each process involved in the deadlock. The third section describes the resources that are synonymous with nodes in trace flag 1204.

PROPERTY: Identifying attributes

Trace flag 1204 and trace flag 1222:
- SPID:<x> ECID:<x>. Identifies the system process ID thread in cases of parallel processes. The entry SPID:<x> ECID:0, where <x> is replaced by the SPID value, represents the main thread. The entry SPID:<x> ECID:<y>, where <x> is replaced by the SPID value and <y> is greater than 0, represents the sub-threads for the same SPID.
- BatchID (sbid for trace flag 1222). Identifies the batch from which code execution is requesting or holding a lock. When Multiple Active Result Sets (MARS) is disabled, the BatchID value is 0. When MARS is enabled, the value for active batches is 1 to n. If there are no active batches in the session, BatchID is 0.
- Mode. Specifies the type of lock for a particular resource that is requested, granted, or waited on by a thread. Mode can be IS (Intent Shared), S (Shared), U (Update), IX (Intent Exclusive), SIX (Shared with Intent Exclusive), and X (Exclusive).
- Line # (line for trace flag 1222). Lists the line number in the current batch of statements that was being executed when the deadlock occurred.
- Input Buf (inputbuf for trace flag 1222). Lists all the statements in the current batch.

Trace flag 1204 only:
- Node. Represents the entry number in the deadlock chain.
- Lists. The lock owner can be part of these lists:
  - Grant List. Enumerates the current owners of the resource.
  - Convert List. Enumerates the current owners that are trying to convert their locks to a higher level.
  - Wait List. Enumerates current new lock requests for the resource.
- Statement Type. Describes the type of DML statement (SELECT, INSERT, UPDATE, or DELETE) on which the threads have permissions.
- Victim Resource Owner. Specifies the participating thread that SQL Server chooses as the victim to break the deadlock cycle. The chosen thread and all existing sub-threads are terminated.
- Next Branch. Represents the two or more sub-threads from the same SPID that are involved in the deadlock cycle.

Trace flag 1222 only:
- deadlock victim. Represents the physical memory address of the task (see sys.dm_os_tasks (Transact-SQL)) that was selected as a deadlock victim. It may be 0 (zero) in the case of an unresolved deadlock. A task that is rolling back cannot be chosen as a deadlock victim.
- executionstack. Represents Transact-SQL code that is being executed at the time the deadlock occurs.
- priority. Represents deadlock priority. In certain cases, the SQL Server Database Engine may opt to alter the deadlock priority for a short duration to achieve better concurrency.
- logused. Log space used by the task.
- owner id. The ID of the transaction that has control of the request.
- status. State of the task. It is one of the following values:
  - pending. Waiting for a worker thread.
  - runnable. Ready to run but waiting for a quantum.
  - running. Currently running on the scheduler.
  - suspended. Execution is suspended.
  - done. Task has completed.
  - spinloop. Waiting for a spinlock to become free.
- waitresource. The resource needed by the task.
- waittime. Time in milliseconds waiting for the resource.
- schedulerid. Scheduler associated with this task. See sys.dm_os_schedulers (Transact-SQL).
- hostname. The name of the workstation.
- isolationlevel. The current transaction isolation level.
- Xactid. The ID of the transaction that has control of the request.
- currentdb. The ID of the database.
- lastbatchstarted. The last time a client process started batch execution.
- lastbatchcompleted. The last time a client process completed batch execution.
- clientoption1 and clientoption2. Set options on this client connection. This is a bitmask that includes information about options usually controlled by SET statements such as SET NOCOUNT and SET XACT_ABORT.
- associatedObjectId. Represents the HoBT (heap or b-tree) ID.

PROPERTY: Resource attributes

Trace flag 1204 and trace flag 1222:
- RID. Identifies the single row within a table on which a lock is held or requested. RID is represented as RID: db_id:file_id:page_no:row_no. For example, RID: 6:1:20789:0.
- OBJECT. Identifies the table on which a lock is held or requested. OBJECT is represented as OBJECT: db_id:object_id. For example, TAB: 6:2009058193.
- KEY. Identifies the key range within an index on which a lock is held or requested. KEY is represented as KEY: db_id:hobt_id (index key hash value). For example, KEY: 6:72057594057457664 (350007a4d329).
- PAG. Identifies the page resource on which a lock is held or requested. PAG is represented as PAG: db_id:file_id:page_no. For example, PAG: 6:1:20789.
- EXT. Identifies the extent structure. EXT is represented as EXT: db_id:file_id:extent_no. For example, EXT: 6:1:9.
- DB. Identifies the database lock. DB is represented in one of the following ways:
  - DB: db_id
  - DB: db_id[BULK-OP-DB], which identifies the database lock taken by the backup database.
  - DB: db_id[BULK-OP-LOG], which identifies the lock taken by the backup log for that particular database.
- APP. Identifies the lock taken by an application resource. APP is represented as APP: lock_resource. For example, APP: Formf370f478.
- METADATA. Represents metadata resources involved in a deadlock. Because METADATA has many subresources, the value returned depends upon the subresource that has deadlocked. For example, METADATA.USER_TYPE returns user_type_id = <integer_value>. For more information about METADATA resources and subresources, see sys.dm_tran_locks (Transact-SQL).
- HOBT. Represents a heap or b-tree involved in a deadlock.

Trace flag 1204 only:
None exclusive to this trace flag.

Trace flag 1222 only:
None exclusive to this trace flag.

Trace Flag 1204 Example

The following example shows the output when trace flag 1204 is turned on. In this case, the table in Node 1 is a heap with no indexes, and the table in Node 2 is a heap with a nonclustered index. The index key in Node 2 is being updated when the deadlock occurs.

    Deadlock encountered ....
    Printing deadlock information
    Wait-for graph

    Node:1

    RID: 6:1:20789:0               CleanCnt:3 Mode:X Flags: 0x2
     Grant List 0:
       Owner:0x0315D6A0 Mode: X
         Flg:0x0 Ref:0 Life:02000000 SPID:55 ECID:0 XactLockInfo: 0x04D9E27C
       SPID: 55 ECID: 0 Statement Type: UPDATE Line #: 6
       Input Buf: Language Event:
    BEGIN TRANSACTION
       EXEC usp_p2
     Requested By:
       ResType:LockOwner Stype:'OR'Xdes:0x03A3DAD0
         Mode: U SPID:54 BatchID:0 ECID:0 TaskProxy:(0x04976374) Value:0x315d200 Cost:(0/868)

    Node:2

    KEY: 6:72057594057457664 (350007a4d329) CleanCnt:2 Mode:X Flags: 0x0
     Grant List 0:
       Owner:0x0315D140 Mode: X
         Flg:0x0 Ref:0 Life:02000000 SPID:54 ECID:0 XactLockInfo: 0x03A3DAF4
       SPID: 54 ECID: 0 Statement Type: UPDATE Line #: 6
       Input Buf: Language Event:
         BEGIN TRANSACTION
           EXEC usp_p1
     Requested By:
       ResType:LockOwner Stype:'OR'Xdes:0x04D9E258
         Mode: U SPID:55 BatchID:0 ECID:0 TaskProxy:(0x0475E374) Value:0x315d4a0 Cost:(0/380)

    Victim Resource Owner:
     ResType:LockOwner Stype:'OR'Xdes:0x04D9E258
         Mode: U SPID:55 BatchID:0 ECID:0 TaskProxy:(0x0475E374) Value:0x315d4a0 Cost:(0/380)

Trace Flag 1222 Example

The following example shows the output when trace flag 1222 is turned on. In this case, one table is a heap with no indexes, and the other table is a heap with a nonclustered index. In the second table, the index key is being updated when the deadlock occurs.

    deadlock-list
      deadlock victim=process689978
        process-list
          process id=process6891f8 taskpriority=0 logused=868
          waitresource=RID: 6:1:20789:0 waittime=1359 ownerId=310444
          transactionname=user_transaction
          lasttranstarted=2005-09-05T11:22:42.733 XDES=0x3a3dad0
          lockMode=U schedulerid=1 kpid=1952 status=suspended spid=54
          sbid=0 ecid=0 priority=0 transcount=2
          lastbatchstarted=2005-09-05T11:22:42.733
          lastbatchcompleted=2005-09-05T11:22:42.733
          clientapp=Microsoft SQL Server Management Studio - Query
          hostname=TEST_SERVER hostpid=2216 loginname=DOMAIN\user
          isolationlevel=read committed (2) xactid=310444 currentdb=6
          lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200
            executionStack
              frame procname=AdventureWorks2016.dbo.usp_p1 line=6 stmtstart=202
              sqlhandle=0x0300060013e6446b027cbb00c69600000100000000000000
              UPDATE T2 SET COL1 = 3 WHERE COL1 = 1;
              frame procname=adhoc line=3 stmtstart=44
              sqlhandle=0x01000600856aa70f503b8104000000000000000000000000
              EXEC usp_p1
            inputbuf
              BEGIN TRANSACTION
              EXEC usp_p1
          process id=process689978 taskpriority=0 logused=380
          waitresource=KEY: 6:72057594057457664 (350007a4d329)
          waittime=5015 ownerId=310462 transactionname=user_transaction
          lasttranstarted=2005-09-05T11:22:44.077 XDES=0x4d9e258 lockMode=U
          schedulerid=1 kpid=3024 status=suspended spid=55 sbid=0 ecid=0
          priority=0 transcount=2 lastbatchstarted=2005-09-05T11:22:44.077
          lastbatchcompleted=2005-09-05T11:22:44.077
          clientapp=Microsoft SQL Server Management Studio - Query
          hostname=TEST_SERVER hostpid=2216 loginname=DOMAIN\user
          isolationlevel=read committed (2) xactid=310462 currentdb=6
          lockTimeout=4294967295 clientoption1=671090784 clientoption2=390200
            executionStack
              frame procname=AdventureWorks2016.dbo.usp_p2 line=6 stmtstart=200
              sqlhandle=0x030006004c0a396c027cbb00c69600000100000000000000
              UPDATE T1 SET COL1 = 4 WHERE COL1 = 1;
              frame procname=adhoc line=3 stmtstart=44
              sqlhandle=0x01000600d688e709b85f8904000000000000000000000000
              EXEC usp_p2
            inputbuf
              BEGIN TRANSACTION
              EXEC usp_p2
        resource-list
          ridlock fileid=1 pageid=20789 dbid=6 objectname=AdventureWorks2016.dbo.T2
          id=lock3136940 mode=X associatedObjectId=72057594057392128
            owner-list
              owner id=process689978 mode=X
            waiter-list
              waiter id=process6891f8 mode=U requestType=wait
          keylock hobtid=72057594057457664 dbid=6 objectname=AdventureWorks2016.dbo.T1
          indexname=nci_T1_COL1 id=lock3136fc0 mode=X
          associatedObjectId=72057594057457664
            owner-list
              owner id=process6891f8 mode=X
            waiter-list
              waiter id=process689978 mode=U requestType=wait

Profiler Deadlock Graph Event

This is an event in SQL Profiler that presents a graphical depiction of the tasks and resources involved in a deadlock. SQL Profiler produces this output when the deadlock graph event is turned on.

For more information about the deadlock event, see Lock:Deadlock Event Class. For more information about running the SQL Profiler deadlock graph, see Save Deadlock Graphs (SQL Server Profiler).

Handling Deadlocks

When an instance of the SQL Server Database Engine chooses a transaction as a deadlock victim, it terminates the current batch, rolls back the transaction, and returns error message 1205 to the application.

    Your transaction (process ID #52) was deadlocked on {lock | communication buffer | thread} resources with another process and has been chosen as the deadlock victim. Rerun your transaction.

Because any application submitting Transact-SQL queries can be chosen as the deadlock victim, applications should have an error handler that can trap error message 1205. If an application does not trap the error, the application can proceed unaware that its transaction has been rolled back and errors can occur.
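One way to trap error 1205 is a TRY...CATCH retry loop written in Transact-SQL. The following is a minimal sketch rather than a prescribed implementation: the retry count of 3, the 200 millisecond pause, and the UPDATE against table T1 (borrowed from the trace flag examples) are assumptions for illustration, and many applications would implement the same logic in client code instead.

```sql
-- Sketch only: the retry count, the delay, and the UPDATE statement are
-- illustrative assumptions, not values recommended by this guide.
DECLARE @attempts_left int = 3;

WHILE @attempts_left > 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        UPDATE dbo.T1 SET COL1 = 4 WHERE COL1 = 1;
        COMMIT TRANSACTION;
        SET @attempts_left = 0;          -- success: leave the loop
    END TRY
    BEGIN CATCH
        -- Make sure no transaction is left open before retrying.
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        SET @attempts_left = @attempts_left - 1;

        IF ERROR_NUMBER() <> 1205 OR @attempts_left = 0
        BEGIN
            -- Not a deadlock, or out of retries: re-raise to the caller.
            THROW;
        END;

        -- Pause briefly so the other transaction can finish and release its locks.
        WAITFOR DELAY '00:00:00.200';
    END CATCH;
END;
```

Only error 1205 is retried here; any other error is re-raised unchanged so that it is not silently swallowed.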
Implementing an error handler that traps error message 1205 allows an application to handle the deadlock situation and take remedial action (for example, automatically resubmitting the query that was involved in the deadlock). By resubmitting the query automatically, the user does not need to know that a deadlock occurred.

The application should pause briefly before resubmitting its query. This gives the other transaction involved in the deadlock a chance to complete and release its locks that formed part of the deadlock cycle. This minimizes the likelihood of the deadlock reoccurring when the resubmitted query requests its locks.

Minimizing Deadlocks

Although deadlocks cannot be completely avoided, following certain coding conventions can minimize the chance of generating a deadlock. Minimizing deadlocks can increase transaction throughput and reduce system overhead because fewer transactions are:

- Rolled back, undoing all the work performed by the transaction.
- Resubmitted by applications because they were rolled back when deadlocked.

To help minimize deadlocks:

- Access objects in the same order.
- Avoid user interaction in transactions.
- Keep transactions short and in one batch.
- Use a lower isolation level.
- Use a row versioning-based isolation level.
  - Set READ_COMMITTED_SNAPSHOT database option ON to enable read-committed transactions to use row versioning.
  - Use snapshot isolation.
- Use bound connections.

Access objects in the same order

If all concurrent transactions access objects in the same order, deadlocks are less likely to occur. For example, if two concurrent transactions obtain a lock on the Supplier table and then on the Part table, one transaction is blocked on the Supplier table until the other transaction is completed. After the first transaction commits or rolls back, the second continues, and a deadlock does not occur. Using stored procedures for all data modifications can standardize the order of accessing objects.
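As a sketch of the ordering rule, the two procedures below both modify the Supplier table before the Part table, so concurrent callers queue on Supplier instead of each holding one table while waiting for the other. The column definitions and procedure names are hypothetical and exist only for this illustration.

```sql
-- Hypothetical schema for illustration only.
CREATE TABLE dbo.Supplier (SupplierID int PRIMARY KEY, Rating int NOT NULL);
CREATE TABLE dbo.Part (PartID int PRIMARY KEY, SupplierID int NOT NULL, Price money NOT NULL);
GO

-- Both procedures follow the same access order: Supplier first, then Part.
CREATE PROCEDURE dbo.usp_RaiseSupplierPrices @SupplierID int
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;
        UPDATE dbo.Supplier SET Rating = Rating + 1 WHERE SupplierID = @SupplierID;  -- 1st: Supplier
        UPDATE dbo.Part SET Price = Price * 1.05 WHERE SupplierID = @SupplierID;     -- 2nd: Part
    COMMIT TRANSACTION;
END;
GO

CREATE PROCEDURE dbo.usp_RetireSupplier @SupplierID int
AS
BEGIN
    SET NOCOUNT ON;
    BEGIN TRANSACTION;
        UPDATE dbo.Supplier SET Rating = 0 WHERE SupplierID = @SupplierID;           -- 1st: Supplier
        DELETE FROM dbo.Part WHERE SupplierID = @SupplierID;                         -- 2nd: Part
    COMMIT TRANSACTION;
END;
GO
```

If one of the procedures touched Part first, two concurrent calls could each hold a lock the other needs, which is the cycle this convention is meant to avoid.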
Avoid user interaction in Transactions

Avoid writing transactions that include user interaction, because the speed of batches running without user intervention is much faster than the speed at which a user must manually respond to queries, such as replying to a prompt for a parameter requested by an application. For example, if a transaction is waiting for user input and the user goes to lunch or even home for the weekend, the user delays the transaction from completing. This degrades system throughput because any locks held by the transaction are released only when the transaction is committed or rolled back. Even if a deadlock situation does not arise, other transactions accessing the same resources are blocked while waiting for the transaction to complete.

Keep Transactions short and in one batch

A deadlock typically occurs when several long-running transactions execute concurrently in the same database. The longer the transaction, the longer the exclusive or update locks are held, blocking other activity and leading to possible deadlock situations.

Keeping transactions in one batch minimizes network roundtrips during a transaction, reducing possible delays in completing the transaction and releasing locks.

Use a lower Isolation Level

Determine whether a transaction can run at a lower isolation level. Implementing read committed allows a transaction to read data previously read (not modified) by another transaction without waiting for the first transaction to complete. Using a lower isolation level, such as read committed, holds shared locks for a shorter duration than a higher isolation level, such as serializable. This reduces locking contention.
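For illustration, a session can request read committed explicitly and then confirm the level in effect. This is a small sketch; the check through sys.dm_exec_sessions is one possible way to verify the setting and is not taken from this guide.

```sql
-- Request read committed for the current session (this is also the default level).
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;

-- Confirm the level in effect for this session.
-- 1 = read uncommitted, 2 = read committed, 3 = repeatable read,
-- 4 = serializable, 5 = snapshot.
SELECT session_id, transaction_isolation_level
FROM sys.dm_exec_sessions
WHERE session_id = @@SPID;
```

The numeric value 2 is the same code that appears as isolationlevel=read committed (2) in the trace flag 1222 output.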
Use a Row Versioning-based Isolation Level

When the READ_COMMITTED_SNAPSHOT database option is set ON, a transaction running under read committed isolation level uses row versioning rather than shared locks during read operations.

NOTE
Some applications rely upon locking and blocking behavior of read committed isolation. For these applications, some change is required before this option can be enabled.

Snapshot isolation also uses row versioning, which does not use shared locks during read operations. Before a transaction can run under snapshot isolation, the ALLOW_SNAPSHOT_ISOLATION database option must be set ON.

Implement these isolation levels to minimize deadlocks that can occur between read and write operations.

Use bound connections

Using bound connections, two or more connections opened by the same application can cooperate with each other. Any locks acquired by the secondary connections are held as if they were acquired by the primary connection, and vice versa. Therefore they do not block each other.

Lock Partitioning

For large computer systems, locks on frequently referenced objects can become a performance bottleneck as acquiring and releasing locks place contention on internal locking resources. Lock partitioning enhances locking performance by splitting a single lock resource into multiple lock resources. This feature is only available for systems with 16 or more CPUs, and is automatically enabled and cannot be disabled. Only object locks can be partitioned. Object locks that have a subtype are not partitioned. For more information, see sys.dm_tran_locks (Transact-SQL).

Understanding Lock Partitioning

Locking tasks access several shared resources, two of which are optimized by lock partitioning:

- Spinlock. This controls access to a lock resource, such as a row or a table. Without lock partitioning, one spinlock manages all lock requests for a single lock resource. On systems that experience a large volume of activity, contention can occur as lock requests wait for the spinlock to become available. Under this situation, acquiring locks can become a bottleneck and can negatively impact performance. To reduce contention on a single lock resource, lock partitioning splits a single lock resource into multiple lock resources to distribute the load across multiple spinlocks.
- Memory. This is used to store the lock resource structures. Once the spinlock is acquired, lock structures are stored in memory and then accessed and possibly modified. Distributing lock access across multiple resources helps to eliminate the need to transfer memory blocks between CPUs, which will help to improve performance.

Implementing and Monitoring Lock Partitioning

Lock partitioning is turned on by default for systems with 16 or more CPUs. When lock partitioning is enabled, an informational message is recorded in the SQL Server error log.

When acquiring locks on a partitioned resource:

- Only NL, SCH-S, IS, IU, and IX lock modes are acquired on a single partition.
- Shared (S), exclusive (X), and other locks in modes other than NL, SCH-S, IS, IU, and IX must be acquired on all partitions starting with partition ID 0 and following in partition ID order. These locks on a partitioned resource will use more memory than locks in the same mode on a non-partitioned resource since each partition is effectively a separate lock. The memory increase is determined by the number of partitions. The SQL Server lock counters in the Windows Performance Monitor will display information about memory used by partitioned and non-partitioned locks.

A transaction is assigned to a partition when the transaction starts. For the transaction, all lock requests that can be partitioned use the partition assigned to that transaction. By this method, access to lock resources of the same object by different transactions is distributed across different partitions.
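The partition behavior described above can be observed with a query such as the following sketch, which lists object locks in the current database together with their partition IDs. The choice of columns and the filter on the current database are assumptions made for readability, not a required form.

```sql
-- List object-level locks in the current database with their lock partition IDs.
-- On a system without lock partitioning, resource_lock_partition is 0 for every row.
SELECT resource_type,
       resource_lock_partition,
       request_mode,
       request_status,
       request_session_id
FROM sys.dm_tran_locks
WHERE resource_type = N'OBJECT'
  AND resource_database_id = DB_ID()
ORDER BY request_session_id, resource_lock_partition;
```

On a partitioned system, an IS or IX request would be expected to appear as a single row on one partition, while an S or X request appears once per partition.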
The resource_lock_partition column in the sys.dm_tran_locks Dynamic Management View provides the lock partition ID for a lock partitioned resource. For more information, see sys.dm_tran_locks (Transact-SQL).

Working with Lock Partitioning

The following code examples illustrate lock partitioning. In the examples, two transactions are executed in two different sessions in order to show lock partitioning behavior on a computer system with 16 CPUs.

These Transact-SQL statements create test objects that are used in the examples that follow.

    -- Create a test table.
    CREATE TABLE TestTable (col1 int);
    GO

    -- Create a clustered index on the table.
    CREATE CLUSTERED INDEX ci_TestTable ON TestTable (col1);
    GO

    -- Populate the table.
    INSERT INTO TestTable VALUES (1);
    GO

Example A

Session 1:

A SELECT statement is executed under a transaction. Because of the HOLDLOCK lock hint, this statement will acquire and retain an Intent shared (IS) lock on the table (for this illustration, row and page locks are ignored). The IS lock will be acquired only on the partition assigned to the transaction. For this example, it is assumed that the IS lock is acquired on partition ID 7.

    -- Start a transaction.
    BEGIN TRANSACTION
    -- This SELECT statement will acquire an IS lock on the table.
    SELECT col1 FROM TestTable WITH (HOLDLOCK);

Session 2:

A transaction is started, and the SELECT statement running under this transaction will acquire and retain a shared (S) lock on the table. The S lock will be acquired on all partitions, which results in multiple table locks, one for each partition. For example, on a 16-CPU system, 16 S locks will be issued across lock partition IDs 0-15. Because the S lock is compatible with the IS lock being held on partition ID 7 by the transaction in session 1, there is no blocking between transactions.
    BEGIN TRANSACTION
    SELECT col1 FROM TestTable WITH (TABLOCK, HOLDLOCK);

Session 1:

The following SELECT statement is executed under the transaction that is still active under session 1. Because of the exclusive (X) table lock hint, the transaction will attempt to acquire an X lock on the table. However, the S lock that is being held by the transaction in session 2 will block the X lock at partition ID 0.

    SELECT col1 FROM TestTable WITH (TABLOCKX);

Example B

Session 1:

A SELECT statement is executed under a transaction. Because of the HOLDLOCK lock hint, this statement will acquire and retain an Intent shared (IS) lock on the table (for this illustration, row and page locks are ignored). The IS lock will be acquired only on the partition assigned to the transaction. For this example, it is assumed that the IS lock is acquired on partition ID 6.

    -- Start a transaction.
    BEGIN TRANSACTION
    -- This SELECT statement will acquire an IS lock on the table.
    SELECT col1 FROM TestTable WITH (HOLDLOCK);

Session 2:

A SELECT statement is executed under a transaction. Because of the TABLOCKX lock hint, the transaction tries to acquire an exclusive (X) lock on the table. Remember that the X lock must be acquired on all partitions starting with partition ID 0. The X lock will be acquired on partition IDs 0-5 but will be blocked by the IS lock that is acquired on partition ID 6.

On partition IDs 7-15 that the X lock has not yet reached, other transactions can continue to acquire locks.

    BEGIN TRANSACTION
    SELECT col1 FROM TestTable WITH (TABLOCKX, HOLDLOCK);

Row Versioning-based Isolation Levels in the SQL Server Database Engine

Starting with SQL Server 2005, the SQL Server Database Engine offers an implementation of an existing transaction isolation level, read committed, that provides a statement level snapshot using row versioning.
SQL Server Database Engine also offers a transaction isolation level, snapshot, that provides a transaction level snapshot also using row versioning.

Row versioning is a general framework in SQL Server that invokes a copy-on-write mechanism when a row is modified or deleted. This requires that while the transaction is running, the old version of the row must be available for transactions that require an earlier transactionally consistent state. Row versioning is used to do the following:

- Build the inserted and deleted tables in triggers. Any rows modified by the trigger are versioned. This includes the rows modified by the statement that launched the trigger, as well as any data modifications made by the trigger.
- Support Multiple Active Result Sets (MARS). If a MARS session issues a data modification statement (such as INSERT, UPDATE, or DELETE) at a time when there is an active result set, the rows affected by the modification statement are versioned.
- Support index operations that specify the ONLINE option.
- Support row versioning-based transaction isolation levels:
  - A new implementation of read committed isolation level that uses row versioning to provide statement-level read consistency.
  - A new isolation level, snapshot, to provide transaction-level read consistency.

The tempdb database must have enough space for the version store. When tempdb is full, update operations will stop generating versions and continue to succeed, but read operations might fail because a particular row version that is needed no longer exists. This affects operations like triggers, MARS, and online indexing.

Using row versioning for read-committed and snapshot transactions is a two-step process:

1. Set either or both the READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION database options ON.
2. Set the appropriate transaction isolation level in an application:
   - When the READ_COMMITTED_SNAPSHOT database option is ON, transactions setting the read committed isolation level use row versioning.
   - When the ALLOW_SNAPSHOT_ISOLATION database option is ON, transactions can set the snapshot isolation level.

When either READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION database option is set ON, the SQL Server Database Engine assigns a transaction sequence number (XSN) to each transaction that manipulates data using row versioning. Transactions start at the time a BEGIN TRANSACTION statement is executed. However, the transaction sequence number starts with the first read or write operation after the BEGIN TRANSACTION statement. The transaction sequence number is incremented by one each time it is assigned.

When either the READ_COMMITTED_SNAPSHOT or the ALLOW_SNAPSHOT_ISOLATION database option is ON, logical copies (versions) are maintained for all data modifications performed in the database. Every time a row is modified by a specific transaction, the instance of the SQL Server Database Engine stores a version of the previously committed image of the row in tempdb. Each version is marked with the transaction sequence number of the transaction that made the change. The versions of modified rows are chained using a linked list. The newest row value is always stored in the current database and chained to the versioned rows stored in tempdb.

NOTE
For modification of large objects (LOBs), only the changed fragment is copied to the version store in tempdb.

Row versions are held long enough to satisfy the requirements of transactions running under row versioning-based isolation levels. The SQL Server Database Engine tracks the earliest useful transaction sequence number and periodically deletes all row versions stamped with transaction sequence numbers that are lower than the earliest useful sequence number.

When both database options are set to OFF, only rows modified by triggers or MARS sessions, or read by ONLINE index operations, are versioned. Those row versions are released when no longer needed.
A background thread periodically executes to remove stale row versions.

NOTE
For short-running transactions, a version of a modified row may get cached in the buffer pool without getting written into the disk files of the tempdb database. If the need for the versioned row is short-lived, it will simply get dropped from the buffer pool and may not necessarily incur I/O overhead.

Behavior when reading data

When transactions running under row versioning-based isolation read data, the read operations do not acquire shared (S) locks on the data being read, and therefore do not block transactions that are modifying data. Also, the overhead of locking resources is minimized as the number of locks acquired is reduced. Read committed isolation using row versioning and snapshot isolation are designed to provide statement-level or transaction-level read consistencies of versioned data.

All queries, including transactions running under row versioning-based isolation levels, acquire Sch-S (schema stability) locks during compilation and execution. Because of this, queries are blocked when a concurrent transaction holds a Sch-M (schema modification) lock on the table. For example, a data definition language (DDL) operation acquires a Sch-M lock before it modifies the schema information of the table. Query transactions, including those running under a row versioning-based isolation level, are blocked when attempting to acquire a Sch-S lock. Conversely, a query holding a Sch-S lock blocks a concurrent transaction that attempts to acquire a Sch-M lock.

When a transaction using the snapshot isolation level starts, the instance of the SQL Server Database Engine records all of the currently active transactions. When the snapshot transaction reads a row that has a version chain, the SQL Server Database Engine follows the chain and retrieves the row where the transaction sequence number is:

- Closest to but lower than the sequence number of the snapshot transaction reading the row.
Not in the list of the transactions active when the snapshot transaction started. Read operations performed by a snapshot transaction retrieve the last version of each row that had been committed at the time the snapshot transaction started. This provides a transactionally consistent snapshot of the data as it existed at the start of the transaction. Read-committed transactions using row versioning operate in much the same way. The difference is that the read-committed transaction does not use its own transaction sequence number when choosing row versions. Each time a statement is started, the read-committed transaction reads the latest transaction sequence number issued for that instance of the SQL Server Database Engine. This is the transaction sequence number used to select the correct row versions for that statement. This allows read-committed transactions to see a snapshot of the data as it exists at the start of each statement. NOTE Even though read-committed transactions using row versioning provides a transactionally consistent view of the data at a statement level, row versions generated or accessed by this type of transaction are maintained until the transaction completes. Behavior when modifying data In a read-committed transaction using row versioning, the selection of rows to update is done using a blocking scan where an update (U) lock is taken on the data row as data values are read. This is the same as a read- committed transaction that does not use row versioning. If the data row does not meet the update criteria, the update lock is released on that row and the next row is locked and scanned. Transactions running under snapshot isolation take an optimistic approach to data modification by acquiring locks on data before performing the modification only to enforce constraints. Otherwise, locks are not acquired on data until the data is to be modified. 
When a data row meets the update criteria, the snapshot transaction verifies that the data row has not been modified by a concurrent transaction that committed after the snapshot transaction began. If the data row has been modified outside of the snapshot transaction, an update conflict occurs and the snapshot transaction is terminated. The update conflict is handled by the SQL Server Database Engine and there is no way to disable the update conflict detection.

NOTE: Update operations running under snapshot isolation internally execute under read committed isolation when the snapshot transaction accesses any of the following:

- A table with a FOREIGN KEY constraint.
- A table that is referenced in the FOREIGN KEY constraint of another table.
- An indexed view referencing more than one table.

However, even under these conditions the update operation will continue to verify that the data has not been modified by another transaction. If data has been modified by another transaction, the snapshot transaction encounters an update conflict and is terminated.

Behavior in summary

The following table summarizes the differences between snapshot isolation and read committed isolation using row versioning.

PROPERTY: The database option that must be set to ON to enable the required support.
  READ-COMMITTED ISOLATION LEVEL USING ROW VERSIONING: READ_COMMITTED_SNAPSHOT
  SNAPSHOT ISOLATION LEVEL: ALLOW_SNAPSHOT_ISOLATION

PROPERTY: How a session requests the specific type of row versioning.
  READ-COMMITTED ISOLATION LEVEL USING ROW VERSIONING: Use the default read-committed isolation level, or run the SET TRANSACTION ISOLATION LEVEL statement to specify the READ COMMITTED isolation level. This can be done after the transaction starts.
  SNAPSHOT ISOLATION LEVEL: Requires the execution of SET TRANSACTION ISOLATION LEVEL to specify the SNAPSHOT isolation level before the start of the transaction.

PROPERTY: The version of data read by statements.
  READ-COMMITTED ISOLATION LEVEL USING ROW VERSIONING: All data that was committed before the start of each statement.
  SNAPSHOT ISOLATION LEVEL: All data that was committed before the start of each transaction.

PROPERTY: How updates are handled.
  READ-COMMITTED ISOLATION LEVEL USING ROW VERSIONING: Reverts from row versions to actual data to select rows to update and uses update locks on the data rows selected. Acquires exclusive locks on actual data rows to be modified. No update conflict detection.
  SNAPSHOT ISOLATION LEVEL: Uses row versions to select rows to update. Tries to acquire an exclusive lock on the actual data row to be modified, and if the data has been modified by another transaction, an update conflict occurs and the snapshot transaction is terminated.

PROPERTY: Update conflict detection.
  READ-COMMITTED ISOLATION LEVEL USING ROW VERSIONING: None.
  SNAPSHOT ISOLATION LEVEL: Integrated support. Cannot be disabled.

Row Versioning resource usage

The row versioning framework supports the following features available in SQL Server:

- Triggers
- Multiple Active Result Sets (MARS)
- Online indexing

The row versioning framework also supports the following row versioning-based transaction isolation levels, which by default are not enabled:

- When the READ_COMMITTED_SNAPSHOT database option is ON, READ_COMMITTED transactions provide statement-level read consistency using row versioning.
- When the ALLOW_SNAPSHOT_ISOLATION database option is ON, SNAPSHOT transactions provide transaction-level read consistency using row versioning.

Row versioning-based isolation levels reduce the number of locks acquired by a transaction by eliminating the use of shared locks on read operations. This increases system performance by reducing the resources used to manage locks. Performance is also increased by reducing the number of times a transaction is blocked by locks acquired by other transactions.

Row versioning-based isolation levels increase the resources needed by data modifications. Enabling these options causes all data modifications for the database to be versioned. A copy of the data before modification is stored in tempdb even when there are no active transactions using row versioning-based isolation. The data after modification includes a pointer to the versioned data stored in tempdb.
For large objects, only the part of the object that changed is copied to tempdb.

Space used in tempdb

For each instance of the SQL Server Database Engine, tempdb must have enough space to hold the row versions generated for every database in the instance. The database administrator must ensure that tempdb has ample space to support the version store. There are two version stores in tempdb:

- The online index build version store is used for online index builds in all databases.
- The common version store is used for all other data modification operations in all databases.

Row versions must be stored for as long as an active transaction needs to access them. Once every minute, a background thread removes row versions that are no longer needed and frees up the version space in tempdb. A long-running transaction prevents space in the version store from being released if it meets any of the following conditions:

- It uses row versioning-based isolation.
- It uses triggers, MARS, or online index build operations.
- It generates row versions.

NOTE: When a trigger is invoked inside a transaction, the row versions created by the trigger are maintained until the end of the transaction, even though the row versions are no longer needed after the trigger completes. This also applies to read-committed transactions that use row versioning. With this type of transaction, a transactionally consistent view of the database is needed only for each statement in the transaction. This means that the row versions created for a statement in the transaction are no longer needed after the statement completes. However, row versions created by each statement in the transaction are maintained until the transaction completes.

When tempdb runs out of space, the SQL Server Database Engine forces the version stores to shrink. During the shrink process, the longest running transactions that have not yet generated row versions are marked as victims.
Message 3967 is generated in the error log for each victim transaction. If a transaction is marked as a victim, it can no longer read the row versions in the version store. When it attempts to read row versions, message 3966 is generated and the transaction is rolled back. If the shrinking process succeeds, space becomes available in tempdb. Otherwise, tempdb runs out of space and the following occurs:

- Write operations continue to execute but do not generate versions. An information message (3959) appears in the error log, but the transaction that writes data is not affected.
- Transactions that attempt to access row versions that were not generated because of a tempdb full rollback terminate with an error 3958.

Space used in data rows

Each database row may use up to 14 bytes at the end of the row for row versioning information. The row versioning information contains the transaction sequence number of the transaction that committed the version and the pointer to the versioned row. These 14 bytes are added the first time the row is modified, or when a new row is inserted, under any of these conditions:

- READ_COMMITTED_SNAPSHOT or ALLOW_SNAPSHOT_ISOLATION options are ON.
- The table has a trigger.
- Multiple Active Result Sets (MARS) is being used.
- Online index build operations are currently running on the table.

These 14 bytes are removed from the database row the first time the row is modified under all of these conditions:

- READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION options are OFF.
- The trigger no longer exists on the table.
- MARS is not being used.
- Online index build operations are not currently running.

If you use any of the row versioning features, you might need to allocate additional disk space for the database to accommodate the 14 bytes per database row. Adding the row versioning information can cause index page splits or the allocation of a new data page if there is not enough space available on the current page.
For example, if the average row length is 100 bytes, the additional 14 bytes cause an existing table to grow up to 14 percent.

Decreasing the fill factor might help to prevent or decrease fragmentation of index pages. To view fragmentation information for the data and indexes of a table or view, you can use sys.dm_db_index_physical_stats.

Space used in Large Objects

The SQL Server Database Engine supports six data types that can hold large strings up to 2 gigabytes (GB) in length: nvarchar(max), varchar(max), varbinary(max), ntext, text, and image. Large strings stored using these data types are stored in a series of data fragments that are linked to the data row. Row versioning information is stored in each fragment used to store these large strings. Data fragments are a collection of pages dedicated to large objects in a table.

As new large values are added to a database, they are allocated using a maximum of 8040 bytes of data per fragment. Earlier versions of the SQL Server Database Engine stored up to 8080 bytes of ntext, text, or image data per fragment.

Existing ntext, text, and image large object (LOB) data is not updated to make space for the row versioning information when a database is upgraded to SQL Server from an earlier version of SQL Server. However, the first time the LOB data is modified, it is dynamically upgraded to enable storage of versioning information. This will happen even if row versions are not generated. After the LOB data is upgraded, the maximum number of bytes stored per fragment is reduced from 8080 bytes to 8040 bytes. The upgrade process is equivalent to deleting the LOB value and reinserting the same value. The LOB data is upgraded even if only one byte is modified. This is a one-time operation for each ntext, text, or image column, but each operation may generate a large amount of page allocations and I/O activity depending upon the size of the LOB data.
It may also generate a large amount of logging activity if the modification is fully logged. WRITETEXT and UPDATETEXT operations are minimally logged if the database recovery model is not set to FULL.

The nvarchar(max), varchar(max), and varbinary(max) data types are not available in earlier versions of SQL Server. Therefore, they have no upgrade issues.

Enough disk space should be allocated to accommodate this requirement.

Monitoring Row Versioning and the Version Store

For monitoring row versioning, version store, and snapshot isolation processes for performance and problems, SQL Server provides tools in the form of Dynamic Management Views (DMVs) and performance counters in Windows System Monitor.

DMVs

The following DMVs provide information about the current system state of tempdb and the version store, as well as transactions using row versioning.

- sys.dm_db_file_space_usage. Returns space usage information for each file in the database. For more information, see sys.dm_db_file_space_usage (Transact-SQL).
- sys.dm_db_session_space_usage. Returns page allocation and deallocation activity by session for the database. For more information, see sys.dm_db_session_space_usage (Transact-SQL).
- sys.dm_db_task_space_usage. Returns page allocation and deallocation activity by task for the database. For more information, see sys.dm_db_task_space_usage (Transact-SQL).
- sys.dm_tran_top_version_generators. Returns a virtual table for the objects producing the most versions in the version store. It groups the top 256 aggregated record lengths by database_id and rowset_id. Use this function to find the largest consumers of the version store. For more information, see sys.dm_tran_top_version_generators (Transact-SQL).
- sys.dm_tran_version_store. Returns a virtual table that displays all version records in the common version store. For more information, see sys.dm_tran_version_store (Transact-SQL).
- sys.dm_tran_version_store_space_usage. Returns a virtual table that displays the total space in tempdb used by version store records for each database. For more information, see sys.dm_tran_version_store_space_usage (Transact-SQL).

NOTE: sys.dm_tran_top_version_generators and sys.dm_tran_version_store are potentially very expensive functions to run, since both query the entire version store, which could be very large. sys.dm_tran_version_store_space_usage is efficient and not expensive to run, as it does not navigate through individual version store records and returns aggregated version store space consumed in tempdb per database.

- sys.dm_tran_active_snapshot_database_transactions. Returns a virtual table for all active transactions in all databases within the SQL Server instance that use row versioning. System transactions do not appear in this DMV. For more information, see sys.dm_tran_active_snapshot_database_transactions (Transact-SQL).
- sys.dm_tran_transactions_snapshot. Returns a virtual table that displays snapshots taken by each transaction. The snapshot contains the sequence number of the active transactions that use row versioning. For more information, see sys.dm_tran_transactions_snapshot (Transact-SQL).
- sys.dm_tran_current_transaction. Returns a single row that displays row versioning-related state information of the transaction in the current session. For more information, see sys.dm_tran_current_transaction (Transact-SQL).
- sys.dm_tran_current_snapshot. Returns a virtual table that displays all active transactions at the time the current snapshot isolation transaction starts. If the current transaction is not using snapshot isolation, this function returns no rows. sys.dm_tran_current_snapshot is similar to sys.dm_tran_transactions_snapshot, except that it returns only the active transactions for the current snapshot. For more information, see sys.dm_tran_current_snapshot (Transact-SQL).
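As an illustration of how the DMVs listed above can be combined, the following sketch first reads the inexpensive per-database aggregate and then lists the longest-running transactions that use row versioning. It assumes an instance on which sys.dm_tran_version_store_space_usage is available and a login with permission to view server state; the TOP (10) cutoff and the column aliases are arbitrary choices for this example.

```sql
-- Version store space reserved in tempdb, aggregated per database.
-- This DMV does not walk individual version records, so it is cheap to run.
SELECT DB_NAME(database_id) AS database_name,
       reserved_page_count,
       reserved_space_kb
FROM sys.dm_tran_version_store_space_usage
ORDER BY reserved_space_kb DESC;

-- Active transactions that use row versioning, longest-running first.
-- Long-running entries here can keep version store space from being released.
SELECT TOP (10)
       session_id,
       transaction_id,
       transaction_sequence_num,
       is_snapshot,               -- 1 for snapshot isolation transactions
       elapsed_time_seconds
FROM sys.dm_tran_active_snapshot_database_transactions
ORDER BY elapsed_time_seconds DESC;
```

Starting with the aggregate view and drilling into sys.dm_tran_version_store or sys.dm_tran_top_version_generators only when needed avoids scanning the whole version store on a busy system.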
Performance Counters

SQL Server performance counters provide information about the system performance impacted by SQL Server processes. The following performance counters monitor tempdb and the version store, as well as transactions using row versioning. The performance counters are contained in the SQLServer:Transactions performance object.

- Free Space in tempdb (KB). Monitors the amount, in kilobytes (KB), of free space in the tempdb database. There must be enough free space in tempdb to handle the version store that supports snapshot isolation.

  The following formula provides a rough estimate of the size of the version store. For long-running transactions, it may be useful to monitor the generation and cleanup rate to estimate the maximum size of the version store.

  [size of common version store] = 2 * [version store data generated per minute] * [longest running time (minutes) of the transaction]

  The longest running time of transactions should not include online index builds. Because these operations may take a long time on very large tables, online index builds use a separate version store. The approximate size of the online index build version store equals the amount of data modified in the table, including all indexes, while the online index build is active.
- Version Store Size (KB). Monitors the size in KB of all version stores. This information helps determine the amount of space needed in the tempdb database for the version store. Monitoring this counter over a period of time provides a useful estimate of additional space needed for tempdb.
- Version Generation rate (KB/s). Monitors the version generation rate in KB per second in all version stores.
- Version Cleanup rate (KB/s). Monitors the version cleanup rate in KB per second in all version stores.

NOTE: Information from Version Generation rate (KB/s) and Version Cleanup rate (KB/s) can be used to predict tempdb space requirements.

- Version Store unit count. Monitors the count of version store units.
- Version Store unit creation. Monitors the total number of version store units created to store row versions since the instance was started.
- Version Store unit truncation. Monitors the total number of version store units truncated since the instance was started. A version store unit is truncated when SQL Server determines that none of the version rows stored in the version store unit are needed to run active transactions.
- Update conflict ratio. Monitors the ratio of update snapshot transactions that have update conflicts to the total number of update snapshot transactions.
- Longest Transaction Running Time. Monitors the longest running time in seconds of any transaction using row versioning. This can be used to determine if any transaction is running for an unreasonable amount of time.
- Transactions. Monitors the total number of active transactions. This does not include system transactions.
- Snapshot Transactions. Monitors the total number of active snapshot transactions.
- Update Snapshot Transactions. Monitors the total number of active snapshot transactions that perform update operations.
- NonSnapshot Version Transactions. Monitors the total number of active non-snapshot transactions that generate version records.

NOTE: The sum of Update Snapshot Transactions and NonSnapshot Version Transactions represents the total number of transactions that participate in version generation. The difference of Snapshot Transactions and Update Snapshot Transactions reports the number of read-only snapshot transactions.

Row Versioning-based Isolation Level Example

The following examples show the differences in behavior between snapshot isolation transactions and read-committed transactions that use row versioning.

A. Working with snapshot isolation

In this example, a transaction running under snapshot isolation reads data that is then modified by another transaction.
The snapshot transaction does not block the update operation executed by the other transaction, and it continues to read data from the versioned row, ignoring the data modification. However, when the snapshot transaction attempts to modify the data that has already been modified by the other transaction, the snapshot transaction generates an error and is terminated.

On session 1:

    USE AdventureWorks2016;
    GO

    -- Enable snapshot isolation on the database.
    ALTER DATABASE AdventureWorks2016
        SET ALLOW_SNAPSHOT_ISOLATION ON;
    GO

    -- Start a snapshot transaction
    SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
    GO

    BEGIN TRANSACTION;
        -- This SELECT statement will return
        -- 48 vacation hours for the employee.
        SELECT BusinessEntityID, VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

On session 2:

    USE AdventureWorks2016;
    GO

    -- Start a transaction.
    BEGIN TRANSACTION;
        -- Subtract a vacation day from employee 4.
        -- Update is not blocked by session 1 since
        -- under snapshot isolation shared locks are
        -- not requested.
        UPDATE HumanResources.Employee
            SET VacationHours = VacationHours - 8
            WHERE BusinessEntityID = 4;

        -- Verify that the employee now has 40 vacation hours.
        SELECT VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

On session 1:

        -- Reissue the SELECT statement - this shows
        -- the employee having 48 vacation hours. The
        -- snapshot transaction is still reading data from
        -- the versioned row.
        SELECT BusinessEntityID, VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

On session 2:

    -- Commit the transaction; this commits the data
    -- modification.
    COMMIT TRANSACTION;
    GO

On session 1:

        -- Reissue the SELECT statement - this still
        -- shows the employee having 48 vacation hours
        -- even after the other transaction has committed
        -- the data modification.
        SELECT BusinessEntityID, VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

        -- Because the data has been modified outside of the
        -- snapshot transaction, any further data changes to
        -- that data by the snapshot transaction will cause
        -- the snapshot transaction to fail. This statement
        -- will generate a 3960 error and the transaction will
        -- terminate.
        UPDATE HumanResources.Employee
            SET SickLeaveHours = SickLeaveHours - 8
            WHERE BusinessEntityID = 4;

    -- Undo the changes to the database from session 1.
    -- This will not undo the change from session 2.
    ROLLBACK TRANSACTION
    GO

B. Working with read-committed using row versioning

In this example, a read-committed transaction using row versioning runs concurrently with another transaction. The read-committed transaction behaves differently than a snapshot transaction. Like a snapshot transaction, the read-committed transaction will read versioned rows even after the other transaction has modified data. However, unlike a snapshot transaction, the read-committed transaction will:

- Read the modified data after the other transaction commits the data changes.
- Be able to update the data modified by the other transaction where the snapshot transaction could not.

On session 1:

    USE AdventureWorks2016; -- Or any earlier version of the AdventureWorks database.
    GO

    -- Enable READ_COMMITTED_SNAPSHOT on the database.
    -- For this statement to succeed, this session
    -- must be the only connection to the AdventureWorks2016
    -- database.
    ALTER DATABASE AdventureWorks2016
        SET READ_COMMITTED_SNAPSHOT ON;
    GO

    -- Start a read-committed transaction
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    GO

    BEGIN TRANSACTION;
        -- This SELECT statement will return
        -- 48 vacation hours for the employee.
        SELECT BusinessEntityID, VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

On session 2:

    USE AdventureWorks2016;
    GO

    -- Start a transaction.
    BEGIN TRANSACTION;
        -- Subtract a vacation day from employee 4.
        -- Update is not blocked by session 1 since
        -- under read-committed using row versioning shared locks are
        -- not requested.
        UPDATE HumanResources.Employee
            SET VacationHours = VacationHours - 8
            WHERE BusinessEntityID = 4;

        -- Verify that the employee now has 40 vacation hours.
        SELECT VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

On session 1:

        -- Reissue the SELECT statement - this still shows
        -- the employee having 48 vacation hours. The
        -- read-committed transaction is still reading data
        -- from the versioned row and the other transaction
        -- has not committed the data changes yet.
        SELECT BusinessEntityID, VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

On session 2:

    -- Commit the transaction.
    COMMIT TRANSACTION;
    GO

On session 1:

        -- Reissue the SELECT statement which now shows the
        -- employee having 40 vacation hours. Being
        -- read-committed, this transaction is reading the
        -- committed data. This is different from snapshot
        -- isolation which reads from the versioned row.
        SELECT BusinessEntityID, VacationHours
            FROM HumanResources.Employee
            WHERE BusinessEntityID = 4;

        -- This statement, which caused the snapshot transaction
        -- to fail, will succeed with read-committed using row versioning.
        UPDATE HumanResources.Employee
            SET SickLeaveHours = SickLeaveHours - 8
            WHERE BusinessEntityID = 4;

    -- Undo the changes to the database from session 1.
    -- This will not undo the change from session 2.
    ROLLBACK TRANSACTION;
    GO

Enabling Row Versioning-Based Isolation Levels

Database administrators control the database-level settings for row versioning by using the READ_COMMITTED_SNAPSHOT and ALLOW_SNAPSHOT_ISOLATION database options in the ALTER DATABASE statement.

When the READ_COMMITTED_SNAPSHOT database option is set ON, the mechanisms used to support the option are activated immediately. When setting the READ_COMMITTED_SNAPSHOT option, only the connection executing the ALTER DATABASE command is allowed in the database.
There must be no other open connection in the database until ALTER DATABASE is complete. The database does not have to be in single-user mode.

The following Transact-SQL statement enables READ_COMMITTED_SNAPSHOT:

    ALTER DATABASE AdventureWorks2016
        SET READ_COMMITTED_SNAPSHOT ON;

When the ALLOW_SNAPSHOT_ISOLATION database option is set ON, the instance of the SQL Server Database Engine does not generate row versions for modified data until all active transactions that have modified data in the database complete. If there are active modification transactions, SQL Server sets the state of the option to PENDING_ON. After all of the modification transactions complete, the state of the option is changed to ON. Users cannot start a snapshot transaction in that database until the option is fully ON. The database passes through a PENDING_OFF state when the database administrator sets the ALLOW_SNAPSHOT_ISOLATION option to OFF.

The following Transact-SQL statement enables ALLOW_SNAPSHOT_ISOLATION:

    ALTER DATABASE AdventureWorks2016
        SET ALLOW_SNAPSHOT_ISOLATION ON;

The following table lists and describes the states of the ALLOW_SNAPSHOT_ISOLATION option. Using ALTER DATABASE with the ALLOW_SNAPSHOT_ISOLATION option does not block users who are currently accessing the database data.

STATE OF SNAPSHOT ISOLATION FRAMEWORK FOR CURRENT DATABASE, followed by its DESCRIPTION:

- OFF: The support for snapshot isolation transactions is not activated. No snapshot isolation transactions are allowed.
- PENDING_ON: The support for snapshot isolation transactions is in transition state (from OFF to ON). Open transactions must complete. No snapshot isolation transactions are allowed.
- ON: The support for snapshot isolation transactions is activated. Snapshot transactions are allowed.
- PENDING_OFF: The support for snapshot isolation transactions is in transition state (from ON to OFF). Snapshot transactions started after this time cannot access this database. Update transactions still pay the cost of versioning in this database. Existing snapshot transactions can still access this database without a problem. The state PENDING_OFF does not become OFF until all snapshot transactions that were active when the database snapshot isolation state was ON finish.

Use the sys.databases catalog view to determine the state of both row versioning database options.

All updates to user tables and some system tables stored in master and msdb generate row versions.

The ALLOW_SNAPSHOT_ISOLATION option is automatically set ON in the master and msdb databases, and cannot be disabled.

Users cannot set the READ_COMMITTED_SNAPSHOT option ON in master, tempdb, or msdb.

Using Row Versioning-based Isolation Levels

The row versioning framework is always enabled in SQL Server, and is used by multiple features. Besides providing row versioning-based isolation levels, it is used to support modifications made in triggers and multiple active result sets (MARS) sessions, and to support data reads for ONLINE index operations.

Row versioning-based isolation levels are enabled at the database level. Any application accessing objects from enabled databases can run queries using the following isolation levels:

- Read-committed that uses row versioning by setting the READ_COMMITTED_SNAPSHOT database option to ON as shown in the following code example:

      ALTER DATABASE AdventureWorks2016
          SET READ_COMMITTED_SNAPSHOT ON;

  When the database is enabled for READ_COMMITTED_SNAPSHOT, all queries running under the read committed isolation level use row versioning, which means that read operations do not block update operations.
- Snapshot isolation by setting the ALLOW_SNAPSHOT_ISOLATION database option to ON as shown in the following code example:

      ALTER DATABASE AdventureWorks2016
          SET ALLOW_SNAPSHOT_ISOLATION ON;

  A transaction running under snapshot isolation can access tables in the database that have been enabled for snapshot. To access tables that have not been enabled for snapshot, the isolation level must be changed. For example, the following code example shows a SELECT statement that joins two tables while running under a snapshot transaction. One table belongs to a database in which snapshot isolation is not enabled. When the SELECT statement runs under snapshot isolation, it fails to execute successfully.

      SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
      BEGIN TRAN
          SELECT t1.col5, t2.col5
              FROM Table1 as t1
              INNER JOIN SecondDB.dbo.Table2 as t2
                  ON t1.col1 = t2.col2;

  The following code example shows the same SELECT statement that has been modified to change the transaction isolation level to read-committed. Because of this change, the SELECT statement executes successfully.

      SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
      BEGIN TRAN
          SELECT t1.col5, t2.col5
              FROM Table1 as t1 WITH (READCOMMITTED)
              INNER JOIN SecondDB.dbo.Table2 as t2
                  ON t1.col1 = t2.col2;

Limitations of Transactions Using Row Versioning-based Isolation Levels

Consider the following limitations when working with row versioning-based isolation levels:

- READ_COMMITTED_SNAPSHOT cannot be enabled in tempdb, msdb, or master.
- Global temp tables are stored in tempdb. When accessing global temp tables inside a snapshot transaction, one of the following must happen:
  - Set the ALLOW_SNAPSHOT_ISOLATION database option ON in tempdb.
  - Use an isolation hint to change the isolation level for the statement.
- Snapshot transactions fail when:
  - A database is made read-only after the snapshot transaction starts, but before the snapshot transaction accesses the database.
  - If accessing objects from multiple databases, a database state was changed in such a way that database recovery occurred after a snapshot transaction starts, but before the snapshot transaction accesses the database. For example: the database was set to OFFLINE and then to ONLINE, database autoclose and open, or database detach and attach.
- Distributed transactions, including queries in distributed partitioned databases, are not supported under snapshot isolation.
- SQL Server does not keep multiple versions of system metadata. Data definition language (DDL) statements on tables and other database objects (indexes, views, data types, stored procedures, and common language runtime functions) change metadata. If a DDL statement modifies an object, any concurrent reference to the object under snapshot isolation causes the snapshot transaction to fail. Read-committed transactions do not have this limitation when the READ_COMMITTED_SNAPSHOT database option is ON.

  For example, a database administrator executes the following ALTER INDEX statement.

      USE AdventureWorks2016;
      GO
      ALTER INDEX AK_Employee_LoginID
          ON HumanResources.Employee REBUILD;
      GO

  Any snapshot transaction that is active when the ALTER INDEX statement is executed receives an error if it attempts to reference the HumanResources.Employee table after the ALTER INDEX statement is executed. Read-committed transactions using row versioning are not affected.

NOTE: BULK INSERT operations may cause changes to target table metadata (for example, when disabling constraint checks). When this happens, concurrent snapshot isolation transactions accessing bulk inserted tables fail.

Customizing Locking and Row Versioning

Customizing the Lock Time-Out

When an instance of the Microsoft SQL Server Database Engine cannot grant a lock to a transaction because another transaction already owns a conflicting lock on the resource, the first transaction becomes blocked waiting for the existing lock to be released.
By default, there is no mandatory time-out period and no way to test whether a resource is locked before locking it, except to attempt to access the data (and potentially get blocked indefinitely).

NOTE: In SQL Server, use the sys.dm_os_waiting_tasks dynamic management view to determine whether a process is being blocked and who is blocking it. In earlier versions of SQL Server, use the sp_who system stored procedure.

The LOCK_TIMEOUT setting allows an application to set a maximum time that a statement waits on a blocked resource. When a statement has waited longer than the LOCK_TIMEOUT setting, the blocked statement is canceled automatically, and error message 1222 (Lock request time-out period exceeded) is returned to the application. Any transaction containing the statement, however, is not rolled back or canceled by SQL Server. Therefore, the application must have an error handler that can trap error message 1222. If an application does not trap the error, the application can proceed unaware that an individual statement within a transaction has been canceled, and errors can occur because statements later in the transaction might depend on the statement that was never executed.

Implementing an error handler that traps error message 1222 allows an application to handle the time-out situation and take remedial action, such as automatically resubmitting the statement that was blocked or rolling back the entire transaction.

To determine the current LOCK_TIMEOUT setting, execute the @@LOCK_TIMEOUT function:

    SELECT @@LOCK_TIMEOUT;
    GO

Customizing Transaction Isolation Level

READ COMMITTED is the default isolation level for the Microsoft SQL Server Database Engine. If an application must operate at a different isolation level, it can use the following methods to set the isolation level:

- Run the SET TRANSACTION ISOLATION LEVEL statement.
- ADO.NET applications that use the System.Data.SqlClient managed namespace can specify an IsolationLevel option by using the SqlConnection.BeginTransaction method.
- Applications that use ADO can set the Autocommit Isolation Levels property.
- When starting a transaction, applications using OLE DB can call ITransactionLocal::StartTransaction with isoLevel set to the desired transaction isolation level. When specifying the isolation level in autocommit mode, applications that use OLE DB can set the DBPROPSET_SESSION property DBPROP_SESS_AUTOCOMMITISOLEVELS to the desired transaction isolation level.
- Applications that use ODBC can set the SQL_COPT_SS_TXN_ISOLATION attribute by using SQLSetConnectAttr.

When the isolation level is specified, the locking behavior for all queries and data manipulation language (DML) statements in the SQL Server session operates at that isolation level. The isolation level remains in effect until the session terminates or until the isolation level is set to another level.

The following example sets the SERIALIZABLE isolation level:

    USE AdventureWorks2016;
    GO
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    GO
    BEGIN TRANSACTION;
    SELECT BusinessEntityID
    FROM HumanResources.Employee;
    GO

The isolation level can be overridden for individual query or DML statements, if necessary, by specifying a table-level hint. Specifying a table-level hint does not affect other statements in the session. We recommend that table-level hints be used to change the default behavior only when absolutely necessary.

The SQL Server Database Engine might have to acquire locks when reading metadata even when the isolation level is set to a level where share locks are not requested when reading data. For example, a transaction running at the read-uncommitted isolation level does not acquire share locks when reading data, but might sometimes request locks when reading a system catalog view.
This means it is possible for a read-uncommitted transaction to cause blocking when querying a table when a concurrent transaction is modifying the metadata of that table.

To determine the transaction isolation level currently set, use the DBCC USEROPTIONS statement as shown in the following example. The result set may vary from the result set on your system.

    USE AdventureWorks2016;
    GO
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    GO
    DBCC USEROPTIONS;
    GO

Here is the result set.

    Set Option                   Value
    ---------------------------- -------------------------------------------
    textsize                     2147483647
    language                     us_english
    dateformat                   mdy
    datefirst                    7
    ...                          ...
    Isolation level              repeatable read

    (14 row(s) affected)

    DBCC execution completed. If DBCC printed error messages, contact your system administrator.

Locking Hints

Locking hints can be specified for individual table references in the SELECT, INSERT, UPDATE, and DELETE statements. The hints specify the type of locking or row versioning the instance of the SQL Server Database Engine uses for the table data. Table-level locking hints can be used when a finer control of the types of locks acquired on an object is required. These locking hints override the current transaction isolation level for the session.

For more information about the specific locking hints and their behaviors, see Table Hints (Transact-SQL).

NOTE: The SQL Server Database Engine query optimizer almost always chooses the correct locking level. We recommend that table-level locking hints be used to change the default locking behavior only when necessary. Disallowing a locking level can adversely affect concurrency.

The SQL Server Database Engine might have to acquire locks when reading metadata, even when processing a select with a locking hint that prevents requests for share locks when reading data. For example, a SELECT using the NOLOCK hint does not acquire share locks when reading data, but might sometimes request locks when reading a system catalog view.
This means it is possible for a SELECT statement using NOLOCK to be blocked.

As shown in the following example, if the transaction isolation level is set to SERIALIZABLE, and the table-level locking hint NOLOCK is used with the SELECT statement, key-range locks typically used to maintain serializable transactions are not taken.

    USE AdventureWorks2016;
    GO
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
    GO
    BEGIN TRANSACTION;
    GO
    SELECT JobTitle
    FROM HumanResources.Employee WITH (NOLOCK);
    GO
    -- Get information about the locks held by
    -- the transaction.
    SELECT resource_type, resource_subtype, request_mode
    FROM sys.dm_tran_locks
    WHERE request_session_id = @@SPID;
    -- End the transaction.
    ROLLBACK;
    GO

The only lock taken that references HumanResources.Employee is a schema stability (Sch-S) lock. In this case, serializability is no longer guaranteed.

In SQL Server 2017, the LOCK_ESCALATION option of ALTER TABLE can disfavor table locks, and enable HoBT locks on partitioned tables. This option is not a locking hint, but can be used to reduce lock escalation. For more information, see ALTER TABLE (Transact-SQL).

Customizing Locking for an Index

The SQL Server Database Engine uses a dynamic locking strategy that automatically chooses the best locking granularity for queries in most cases. We recommend that you do not override the default locking levels, which have page and row locking on, unless table or index access patterns are well understood and consistent, and there is a resource contention problem to solve. Overriding a locking level can significantly impede concurrent access to a table or index. For example, specifying only table-level locks on a large table that users access heavily can cause bottlenecks because users must wait for the table-level lock to be released before accessing the table.

There are a few cases where disallowing page or row locking can be beneficial, if the access patterns are well understood and consistent.
For example, a database application uses a lookup table that is updated weekly in a batch process. Concurrent readers access the table with a shared (S) lock and the weekly batch update accesses the table with an exclusive (X) lock. Turning off page and row locking on the table reduces the locking overhead throughout the week by allowing readers to concurrently access the table through shared table locks. When the batch job runs, it can complete the update efficiently because it obtains an exclusive table lock.

Turning off page and row locking might or might not be acceptable because the weekly batch update will block the concurrent readers from accessing the table while the update runs. If the batch job only changes a few rows or pages, you can change the locking level to allow row or page level locking, which will enable other sessions to read from the table without blocking. If the batch job has a large number of updates, obtaining an exclusive lock on the table may be the best way to ensure the batch job finishes efficiently.

Occasionally a deadlock occurs when two concurrent operations acquire row locks on the same table and then block because they both need to lock the page. Disallowing row locks forces one of the operations to wait, avoiding the deadlock.

The granularity of locking used on an index can be set using the CREATE INDEX and ALTER INDEX statements. The lock settings apply to both the index pages and the table pages. In addition, the CREATE TABLE and ALTER TABLE statements can be used to set locking granularity on PRIMARY KEY and UNIQUE constraints. For backwards compatibility, the sp_indexoption system stored procedure can also set the granularity. To display the current locking option for a given index, use the INDEXPROPERTY function. Page-level locks, row-level locks, or a combination of page-level and row-level locks can be disallowed for a given index.
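As a rough illustration of the index-level settings described above, the following sketch disallows row locks on one index, reads the settings back with INDEXPROPERTY, and then restores the default. The index and table names are borrowed from the AdventureWorks examples elsewhere in this guide and are assumptions about your schema; substitute your own objects before running it.

```sql
USE AdventureWorks2016;
GO
-- Disallow row locks on a single index; page and table locks stay available.
ALTER INDEX AK_Employee_LoginID ON HumanResources.Employee
    SET (ALLOW_ROW_LOCKS = OFF, ALLOW_PAGE_LOCKS = ON);
GO
-- INDEXPROPERTY returns 1 when the lock type is disallowed and 0 when it is allowed.
SELECT
    INDEXPROPERTY(OBJECT_ID(N'HumanResources.Employee'),
                  N'AK_Employee_LoginID', 'IsRowLockDisallowed')  AS IsRowLockDisallowed,
    INDEXPROPERTY(OBJECT_ID(N'HumanResources.Employee'),
                  N'AK_Employee_LoginID', 'IsPageLockDisallowed') AS IsPageLockDisallowed;
GO
-- Restore the default (row locks allowed) once the experiment is done.
ALTER INDEX AK_Employee_LoginID ON HumanResources.Employee
    SET (ALLOW_ROW_LOCKS = ON);
GO
```

Changing the setting on a test copy first is the safer route, because disallowing a lock type affects every session that touches the index.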
DISALLOWED LOCKS            INDEX ACCESSED BY
Page level                  Row-level and table-level locks
Row level                   Page-level and table-level locks
Page level and row level    Table-level locks

Advanced Transaction Information

Nesting Transactions

Explicit transactions can be nested. This is primarily intended to support transactions in stored procedures that can be called either from a process already in a transaction or from processes that have no active transaction.

The following example shows the intended use of nested transactions. The procedure TransProc enforces its transaction regardless of the transaction mode of any process that executes it. If TransProc is called when a transaction is active, the nested transaction in TransProc is largely ignored, and its INSERT statements are committed or rolled back based on the final action taken for the outer transaction. If TransProc is executed by a process that does not have an outstanding transaction, the COMMIT TRANSACTION at the end of the procedure effectively commits the INSERT statements.

    SET QUOTED_IDENTIFIER OFF;
    GO
    SET NOCOUNT OFF;
    GO
    CREATE TABLE TestTrans(Cola INT PRIMARY KEY, Colb CHAR(3) NOT NULL);
    GO
    CREATE PROCEDURE TransProc @PriKey INT, @CharCol CHAR(3) AS
    BEGIN TRANSACTION InProc
    INSERT INTO TestTrans VALUES (@PriKey, @CharCol)
    INSERT INTO TestTrans VALUES (@PriKey + 1, @CharCol)
    COMMIT TRANSACTION InProc;
    GO
    /* Start a transaction and execute TransProc. */
    BEGIN TRANSACTION OutOfProc;
    GO
    EXEC TransProc 1, 'aaa';
    GO
    /* Roll back the outer transaction, this will
       roll back TransProc's nested transaction. */
    ROLLBACK TRANSACTION OutOfProc;
    GO
    EXECUTE TransProc 3, 'bbb';
    GO
    /* The following SELECT statement shows only rows 3 and 4 are
       still in the table. This indicates that the commit
       of the inner transaction from the first EXECUTE statement of
       TransProc was overridden by the subsequent rollback. */
    SELECT * FROM TestTrans;
    GO

Committing inner transactions is ignored by the SQL Server Database Engine.
The transaction is either committed or rolled back based on the action taken at the end of the outermost transaction. If the outer transaction is committed, the inner nested transactions are also committed. If the outer transaction is rolled back, then all inner transactions are also rolled back, regardless of whether or not the inner transactions were individually committed.

Each call to COMMIT TRANSACTION or COMMIT WORK applies to the last executed BEGIN TRANSACTION. If the BEGIN TRANSACTION statements are nested, then a COMMIT statement applies only to the last nested transaction, which is the innermost transaction. Even if a COMMIT TRANSACTION transaction_name statement within a nested transaction refers to the transaction name of the outer transaction, the commit applies only to the innermost transaction.

It is not legal for the transaction_name parameter of a ROLLBACK TRANSACTION statement to refer to the inner transactions of a set of named nested transactions. transaction_name can refer only to the transaction name of the outermost transaction. If a ROLLBACK TRANSACTION transaction_name statement using the name of the outer transaction is executed at any level of a set of nested transactions, all of the nested transactions are rolled back. If a ROLLBACK WORK or ROLLBACK TRANSACTION statement without a transaction_name parameter is executed at any level of a set of nested transactions, it rolls back all of the nested transactions, including the outermost transaction.

The @@TRANCOUNT function records the current transaction nesting level. Each BEGIN TRANSACTION statement increments @@TRANCOUNT by one. Each COMMIT TRANSACTION or COMMIT WORK statement decrements @@TRANCOUNT by one. A ROLLBACK WORK or a ROLLBACK TRANSACTION statement that does not have a transaction name rolls back all nested transactions and decrements @@TRANCOUNT to 0.
A ROLLBACK TRANSACTION that uses the transaction name of the outermost transaction in a set of nested transactions rolls back all of the nested transactions and decrements @@TRANCOUNT to 0. When you are unsure if you are already in a transaction, SELECT @@TRANCOUNT to determine if it is 1 or more. If @@TRANCOUNT is 0, you are not in a transaction.

Using Bound Sessions

Bound sessions ease the coordination of actions across multiple sessions on the same server. Bound sessions allow two or more sessions to share the same transaction and locks, and can work on the same data without lock conflicts. Bound sessions can be created from multiple sessions within the same application or from multiple applications with separate sessions.

To participate in a bound session, a session calls sp_getbindtoken or srv_getbindtoken (through Open Data Services) to get a bind token. A bind token is a character string that uniquely identifies each bound transaction. The bind token is then sent to the other sessions to be bound with the current session. The other sessions bind to the transaction by calling sp_bindsession, using the bind token received from the first session.

NOTE: A session must have an active user transaction in order for sp_getbindtoken or srv_getbindtoken to succeed.

Bind tokens must be transmitted from the application code that makes the first session to the application code that subsequently binds their sessions to the first session. There is no Transact-SQL statement or API function that an application can use to get the bind token for a transaction started by another process. Some of the methods that can be used to transmit a bind token include the following:

- If the sessions are all initiated from the same application process, bind tokens can be stored in global memory or passed into functions as a parameter.
- If the sessions are made from separate application processes, bind tokens can be transmitted using interprocess communication (IPC), such as a remote procedure call (RPC) or dynamic data exchange (DDE).
- Bind tokens can be stored in a table in an instance of the SQL Server Database Engine that can be read by processes wanting to bind to the first session.

Only one session in a set of bound sessions can be active at any time. If one session is executing a statement on the instance or has results pending from the instance, no other session bound to it can access the instance until the current session finishes processing or cancels the current statement. If the instance is busy processing a statement from another of the bound sessions, an error occurs indicating that the transaction space is in use and the session should retry later.

When you bind sessions, each session retains its isolation level setting. Using SET TRANSACTION ISOLATION LEVEL to change the isolation level setting of one session does not affect the setting of any other session bound to it.

Types of Bound Sessions

The two types of bound sessions are local and distributed.

Local bound session
Allows bound sessions to share the transaction space of a single transaction in a single instance of the SQL Server Database Engine.

Distributed bound session
Allows bound sessions to share the same transaction across two or more instances until the entire transaction is either committed or rolled back by using Microsoft Distributed Transaction Coordinator (MS DTC).

Distributed bound sessions are not identified by a character string bind token; they are identified by distributed transaction identification numbers. If a bound session is involved in a local transaction and executes an RPC on a remote server with SET REMOTE_PROC_TRANSACTIONS ON, the local bound transaction is automatically promoted to a distributed bound transaction by MS DTC and an MS DTC session is started.
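The token exchange for a local bound session can be sketched as follows. This is a minimal illustration, not a complete application: the two parts run on two separate connections, the variable name @bind_token is arbitrary, and the placeholder string in the second part stands for whatever token value the first session actually returned.

```sql
-- Session 1: a user transaction must be active before the token is requested.
BEGIN TRANSACTION;
DECLARE @bind_token VARCHAR(255);
EXECUTE sp_getbindtoken @bind_token OUTPUT;
-- Hand this value to the other session (parameter, IPC, or a table).
SELECT @bind_token AS Token;

-- Session 2 (a separate connection): join the transaction of session 1.
-- Replace the placeholder with the token value returned above.
EXECUTE sp_bindsession '<token returned by session 1>';
```

After the bind succeeds, both sessions share the same transaction and locks, so only one of them should be submitting work at any given moment.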
When to use Bound Sessions

In earlier versions of SQL Server, bound sessions were primarily used in developing extended stored procedures that must execute Transact-SQL statements on behalf of the process that calls them. Having the calling process pass in a bind token as one parameter of the extended stored procedure allows the procedure to join the transaction space of the calling process, thereby integrating the extended stored procedure with the calling process.

In the SQL Server Database Engine, stored procedures written using CLR are more secure, scalable, and stable than extended stored procedures. CLR-stored procedures use the SqlContext object to join the context of the calling session, not sp_bindsession.

Bound sessions can be used to develop three-tier applications in which business logic is incorporated into separate programs that work cooperatively on a single business transaction. These programs must be coded to carefully coordinate their access to a database. Because the two sessions share the same locks, the two programs must not try to modify the same data at the same time. At any point in time, only one session can be doing work as part of the transaction; there can be no parallel execution. The transaction can only be switched between sessions at well-defined yield points, such as when all DML statements have completed and their results have been retrieved.

Coding efficient transactions

It is important to keep transactions as short as possible. When a transaction is started, a database management system (DBMS) must hold many resources until the end of the transaction to protect the atomicity, consistency, isolation, and durability (ACID) properties of the transaction. If data is modified, the modified rows must be protected with exclusive locks that prevent any other transaction from reading the rows, and exclusive locks must be held until the transaction is committed or rolled back.
Depending on transaction isolation level settings, SELECT statements may acquire locks that must be held until the transaction is committed or rolled back. Especially in systems with many users, transactions must be kept as short as possible to reduce locking contention for resources between concurrent connections. Long-running, inefficient transactions may not be a problem with small numbers of users, but they are intolerable in a system with thousands of users. Beginning with SQL Server 2014 (12.x), SQL Server supports delayed durable transactions. Delayed durable transactions do not guarantee durability. See the topic Transaction Durability for more information.

Coding Guidelines

These are guidelines for coding efficient transactions:

- Do not require input from users during a transaction. Get all required input from users before a transaction is started. If additional user input is required during a transaction, roll back the current transaction and restart the transaction after the user input is supplied. Even if users respond immediately, human reaction times are vastly slower than computer speeds. All resources held by the transaction are held for an extremely long time, which has the potential to cause blocking problems. If users do not respond, the transaction remains active, locking critical resources until they respond, which may not happen for several minutes or even hours.
- Do not open a transaction while browsing through data, if at all possible. Transactions should not be started until all preliminary data analysis has been completed.
- Keep the transaction as short as possible. After you know the modifications that have to be made, start a transaction, execute the modification statements, and then immediately commit or roll back. Do not open the transaction before it is required.
- To reduce blocking, consider using a row versioning-based isolation level for read-only queries.
- Make intelligent use of lower transaction isolation levels.
Many applications can be readily coded to use a read-committed transaction isolation level. Not all transactions require the serializable transaction isolation level.
- Make intelligent use of lower cursor concurrency options, such as optimistic concurrency options. In a system with a low probability of concurrent updates, the overhead of dealing with an occasional "somebody else changed your data after you read it" error can be much lower than the overhead of always locking rows as they are read.
- Access the least amount of data possible while in a transaction. This lessens the number of locked rows, thereby reducing contention between transactions.

Avoiding concurrency and resource problems

To prevent concurrency and resource problems, manage implicit transactions carefully. When using implicit transactions, the next Transact-SQL statement after COMMIT or ROLLBACK automatically starts a new transaction. This can cause a new transaction to be opened while the application browses through data, or even when it requires input from the user. After completing the last transaction required to protect data modifications, turn off implicit transactions until a transaction is once again required to protect data modifications. This process lets the SQL Server Database Engine use autocommit mode while the application is browsing data and getting input from the user.

In addition, when the snapshot isolation level is enabled, although a new transaction will not hold locks, a long-running transaction will prevent the old versions from being removed from tempdb.

Managing long-running transactions

A long-running transaction is an active transaction that has not been committed or rolled back in a timely manner. For example, if the beginning and end of a transaction is controlled by the user, a typical cause of a long-running transaction is a user starting a transaction and then leaving while the transaction waits for a response from the user.
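Implicit-transaction mode is one way a transaction can stay open longer than intended, as noted under "Avoiding concurrency and resource problems". The following is a minimal sketch of the advice given there; the table dbo.Orders and its columns are hypothetical and exist only for illustration.

```sql
-- Protect a data modification with an implicit transaction.
SET IMPLICIT_TRANSACTIONS ON;

-- This UPDATE starts a transaction automatically.
UPDATE dbo.Orders            -- hypothetical table
SET Status = 'Shipped'
WHERE OrderID = 42;

COMMIT TRANSACTION;

-- Turn implicit transactions off before browsing data, so the
-- following SELECT runs in autocommit mode and leaves nothing open.
SET IMPLICIT_TRANSACTIONS OFF;

SELECT OrderID, Status
FROM dbo.Orders;

-- A value above 0 here would mean a transaction is still open.
SELECT @@TRANCOUNT AS OpenTransactionCount;
```

Had implicit transactions been left on, the browsing SELECT would itself have opened a new transaction that stays active until the application commits or rolls it back.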
A long-running transaction can cause serious problems for a database, as follows:

- If a server instance is shut down after an active transaction has performed many uncommitted modifications, the recovery phase of the subsequent restart can take much longer than the time specified by the recovery interval server configuration option or by the ALTER DATABASE … SET TARGET_RECOVERY_TIME option. These options control the frequency of active and indirect checkpoints, respectively. For more information about the types of checkpoints, see Database Checkpoints (SQL Server).
- More importantly, although a waiting transaction might generate very little log, it holds up log truncation indefinitely, causing the transaction log to grow and possibly fill up. If the transaction log fills up, the database cannot perform any more updates. For more information, see SQL Server Transaction Log Architecture and Management Guide, Troubleshoot a Full Transaction Log (SQL Server Error 9002), and The Transaction Log (SQL Server).

Discovering long-running transactions

To look for long-running transactions, use one of the following:

sys.dm_tran_database_transactions
This dynamic management view returns information about transactions at the database level. For a long-running transaction, columns of particular interest include the time of the first log record (database_transaction_begin_time), the current state of the transaction (database_transaction_state), and the log sequence number (LSN) of the begin record in the transaction log (database_transaction_begin_lsn). For more information, see sys.dm_tran_database_transactions (Transact-SQL).

DBCC OPENTRAN
This statement lets you identify the user ID of the owner of the transaction, so you can potentially track down the source of the transaction for a more orderly termination (committing it rather than rolling it back). For more information, see DBCC OPENTRAN (Transact-SQL).

Stopping a Transaction

You may have to use the KILL statement.
Use this statement very carefully, however, especially when critical processes are running. For more information, see KILL (Transact-SQL).

Additional Reading

Overhead of Row Versioning
Extended Events
sys.dm_tran_locks (Transact-SQL)
Dynamic Management Views and Functions (Transact-SQL)
Transaction Related Dynamic Management Views and Functions (Transact-SQL)

Back Up and Restore of SQL Server Databases

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database (Managed Instance only), Azure SQL Data Warehouse, Parallel Data Warehouse

This topic describes the benefits of backing up SQL Server databases, basic backup and restore terms, and introduces backup and restore strategies for SQL Server and security considerations for SQL Server backup and restore.

IMPORTANT: On Azure SQL Database Managed Instance, this T-SQL feature has certain behavior changes. See Azure SQL Database Managed Instance T-SQL differences from SQL Server for details for all T-SQL behavior changes.

Looking for step by step instructions? This topic does not provide any specific steps for how to do a backup! If you want to get right to actually backing up, scroll down this page to the links section, organized by backup tasks and whether you want to use SSMS or T-SQL.

The SQL Server backup and restore component provides an essential safeguard for protecting critical data stored in your SQL Server databases. To minimize the risk of catastrophic data loss, you need to back up your databases to preserve modifications to your data on a regular basis. A well-planned backup and restore strategy helps protect databases against data loss caused by a variety of failures. Test your strategy by restoring a set of backups and then recovering your database to prepare you to respond effectively to a disaster.

In addition to local storage for storing the backups, SQL Server also supports backup to and restore from the Windows Azure Blob Storage Service.
For more information, see SQL Server Backup and Restore with Microsoft Azure Blob Storage Service. For database files stored using the Microsoft Azure Blob storage service, SQL Server 2016 (13.x) provides the option to use Azure snapshots for nearly instantaneous backups and faster restores. For more information, see File-Snapshot Backups for Database Files in Azure.

Why back up?

Backing up your SQL Server databases, running test restore procedures on your backups, and storing copies of backups in a safe, off-site location protects you from potentially catastrophic data loss. Backing up is the only way to protect your data.

With valid backups of a database, you can recover your data from many failures, such as:

- Media failure.
- User errors, for example, dropping a table by mistake.
- Hardware failures, for example, a damaged disk drive or permanent loss of a server.
- Natural disasters. By using SQL Server Backup to Windows Azure Blob storage service, you can create an off-site backup in a different region than your on-premises location, to use in the event of a natural disaster affecting your on-premises location.

Additionally, backups of a database are useful for routine administrative purposes, such as copying a database from one server to another, setting up Always On availability groups or database mirroring, and archiving.

Glossary of backup terms

back up [verb]
Copies the data or log records from a SQL Server database or its transaction log to a backup device, such as a disk, to create a data backup or log backup.

backup [noun]
A copy of data that can be used to restore and recover the data after a failure. Backups of a database can also be used to restore a copy of the database to a new location.

backup device
A disk or tape device to which SQL Server backups are written and from which they can be restored.
SQL Server backups can also be written to a Windows Azure Blob storage service, and URL format is used to specify the destination and the name of the backup file. For more information, see SQL Server Backup and Restore with Microsoft Azure Blob Storage Service.

backup media
One or more tapes or disk files to which one or more backups have been written.

data backup
A backup of data in a complete database (a database backup), a partial database (a partial backup), or a set of data files or filegroups (a file backup).

database backup
A backup of a database. Full database backups represent the whole database at the time the backup finished. Differential database backups contain only changes made to the database since its most recent full database backup.

differential backup
A data backup that is based on the latest full backup of a complete or partial database or a set of data files or filegroups (the differential base) and that contains only the data that has changed since that base.

full backup
A data backup that contains all the data in a specific database or set of filegroups or files, and also enough log to allow for recovering that data.

log backup
A backup of transaction logs that includes all log records that were not backed up in a previous log backup. (full recovery model)

recover
To return a database to a stable and consistent state.

recovery
A phase of database startup or of a restore with recovery that brings the database into a transaction-consistent state.

recovery model
A database property that controls transaction log maintenance on a database. Three recovery models exist: simple, full, and bulk-logged. The recovery model of a database determines its backup and restore requirements.

restore
A multi-phase process that copies all the data and log pages from a specified SQL Server backup to a specified database, and then rolls forward all the transactions that are logged in the backup by applying logged changes to bring the data forward in time.
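To make the glossary terms concrete, the following sketch shows how a full backup, a differential backup, and a log backup might look in Transact-SQL. It is an illustration only: the X:\Backups path and the file names are hypothetical, and the database name is borrowed from the examples used elsewhere in this guide.

```sql
-- Full database backup; this becomes the differential base.
-- WITH INIT overwrites any existing backup sets in the target file.
BACKUP DATABASE AdventureWorks2016
    TO DISK = N'X:\Backups\AdventureWorks2016_full.bak'   -- hypothetical path
    WITH INIT, CHECKSUM;
GO
-- Differential backup: only the data changed since the last full backup.
BACKUP DATABASE AdventureWorks2016
    TO DISK = N'X:\Backups\AdventureWorks2016_diff.bak'   -- hypothetical path
    WITH DIFFERENTIAL, CHECKSUM;
GO
-- Log backup: requires the full or bulk-logged recovery model.
BACKUP LOG AdventureWorks2016
    TO DISK = N'X:\Backups\AdventureWorks2016_log.trn'    -- hypothetical path
    WITH CHECKSUM;
GO
```

The backup files are written to a different drive than the database in this sketch, in line with the advice to keep data and backups on separate devices.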
Backup and restore strategies

Backing up and restoring data must be customized to a particular environment and must work with the available resources. Therefore, a reliable use of backup and restore for recovery requires a backup and restore strategy. A well-designed backup and restore strategy maximizes data availability and minimizes data loss, while considering your particular business requirements.

Important! Place the database and backups on separate devices. Otherwise, if the device containing the database fails, your backups will be unavailable. Placing the data and backups on separate devices also enhances the I/O performance for both writing backups and the production use of the database.

A backup and restore strategy contains a backup portion and a restore portion. The backup part of the strategy defines the type and frequency of backups, the nature and speed of the hardware that is required for them, how backups are to be tested, and where and how backup media is to be stored (including security considerations). The restore part of the strategy defines who is responsible for performing restores and how restores should be performed to meet your goals for availability of the database and for minimizing data loss. We recommend that you document your backup and restore procedures and keep a copy of the documentation in your run book.

Designing an effective backup and restore strategy requires careful planning, implementation, and testing. Testing is required. You do not have a backup strategy until you have successfully restored backups in all the combinations that are included in your restore strategy. You must consider a variety of factors. These include the following:

- The production goals of your organization for the databases, especially the requirements for availability and protection of data from loss.
- The nature of each of your databases: its size, its usage patterns, the nature of its content, the requirements for its data, and so on.
Constraints on resources, such as hardware, personnel, space for storing backup media, the physical security of the stored media, and so on.

Impact of the recovery model on backup and restore

Backup and restore operations occur within the context of a recovery model. A recovery model is a database property that controls how the transaction log is managed. The recovery model of a database also determines what types of backups and what restore scenarios are supported for the database. Typically, a database uses either the simple recovery model or the full recovery model. The full recovery model can be supplemented by switching to the bulk-logged recovery model before bulk operations. For an introduction to these recovery models and how they affect transaction log management, see The Transaction Log (SQL Server).

The best choice of recovery model for the database depends on your business requirements. To avoid transaction log management and simplify backup and restore, use the simple recovery model. To minimize work-loss exposure, at the cost of administrative overhead, use the full recovery model. For information about the effect of recovery models on backup and restore, see Backup Overview (SQL Server).

Design your backup strategy

After you have selected a recovery model that meets your business requirements for a specific database, you have to plan and implement a corresponding backup strategy. The optimal backup strategy depends on a variety of factors, of which the following are especially significant:

How many hours a day do applications have to access the database? If there is a predictable off-peak period, we recommend that you schedule full database backups for that period.

How frequently are changes and updates likely to occur? If changes are frequent, consider the following: Under the simple recovery model, consider scheduling differential backups between full database backups.
A differential backup captures only the changes since the last full database backup. Under the full recovery model, you should schedule frequent log backups. Scheduling differential backups between full backups can reduce restore time by reducing the number of log backups you have to restore after restoring the data.

Are changes likely to occur in only a small part of the database or in a large part of the database? For a large database in which changes are concentrated in a part of the files or filegroups, partial backups or file backups (or both) can be useful. For more information, see Partial Backups (SQL Server) and Full File Backups (SQL Server).

How much disk space will a full database backup require?

Estimate the size of a full database backup

Before you implement a backup and restore strategy, you should estimate how much disk space a full database backup will use. The backup operation copies the data in the database to the backup file. The backup contains only the actual data in the database and not any unused space. Therefore, the backup is usually smaller than the database itself. You can estimate the size of a full database backup by using the sp_spaceused system stored procedure. For more information, see sp_spaceused (Transact-SQL).

Schedule backups

Performing a backup operation has minimal effect on transactions that are running; therefore, backup operations can be run during regular operations with minimal effect on production workloads. For information about concurrency restrictions during backup, see Backup Overview (SQL Server).

After you decide what types of backups you require and how frequently you have to perform each type, we recommend that you schedule regular backups as part of a database maintenance plan for the database. For information about maintenance plans and how to create them for database backups and log backups, see Use the Maintenance Plan Wizard.

Test your backups!
You do not have a restore strategy until you have tested your backups. It is very important to thoroughly test your backup strategy for each of your databases by restoring a copy of the database onto a test system. You must test restoring every type of backup that you intend to use.

We recommend that you maintain an operations manual for each database. This operations manual should document the location of the backups, backup device names (if any), and the amount of time that is required to restore the test backups.

More about backup tasks

Create a Maintenance Plan
Create a Job
Schedule a Job

Working with backup devices and backup media

Define a Logical Backup Device for a Disk File (SQL Server)
Define a Logical Backup Device for a Tape Drive (SQL Server)
Specify a Disk or Tape As a Backup Destination (SQL Server)
Delete a Backup Device (SQL Server)
Set the Expiration Date on a Backup (SQL Server)
View the Contents of a Backup Tape or File (SQL Server)
View the Data and Log Files in a Backup Set (SQL Server)
View the Properties and Contents of a Logical Backup Device (SQL Server)
Restore a Backup from a Device (SQL Server)

Creating backups

Note! For partial or copy-only backups, you must use the Transact-SQL BACKUP statement with the PARTIAL or COPY_ONLY option, respectively.
Using SSMS

Create a Full Database Backup (SQL Server)
Back Up a Transaction Log (SQL Server)
Back Up Files and Filegroups (SQL Server)
Create a Differential Database Backup (SQL Server)

Using T-SQL

Use Resource Governor to Limit CPU Usage by Backup Compression (Transact-SQL)
Back Up the Transaction Log When the Database Is Damaged (SQL Server)
Enable or Disable Backup Checksums During Backup or Restore (SQL Server)
Specify Whether a Backup or Restore Operation Continues or Stops After Encountering an Error (SQL Server)

Restore data backups

Using SSMS

Restore a Database Backup Using SSMS
Restore a Database to a New Location (SQL Server)
Restore a Differential Database Backup (SQL Server)
Restore Files and Filegroups (SQL Server)

Using T-SQL

Restore a Database Backup Under the Simple Recovery Model (Transact-SQL)
Restore a Database to the Point of Failure Under the Full Recovery Model (Transact-SQL)
Restore Files and Filegroups over Existing Files (SQL Server)
Restore Files to a New Location (SQL Server)
Restore the master Database (Transact-SQL)

Restore transaction logs (Full Recovery Model)

Using SSMS

Restore a Database to a Marked Transaction (SQL Server Management Studio)
Restore a Transaction Log Backup (SQL Server)
Restore a SQL Server Database to a Point in Time (Full Recovery Model)

Using T-SQL

Restore a SQL Server Database to a Point in Time (Full Recovery Model)
Restart an Interrupted Restore Operation (Transact-SQL)
Recover a Database Without Restoring Data (Transact-SQL)

More information and resources

Backup Overview (SQL Server)
Restore and Recovery Overview (SQL Server)
BACKUP (Transact-SQL)
RESTORE (Transact-SQL)
Backup and Restore of Analysis Services Databases
Back Up and Restore Full-Text Catalogs and Indexes
Back Up and Restore Replicated Databases
The Transaction Log (SQL Server)
Recovery Models (SQL Server)
Media Sets, Media Families, and Backup Sets (SQL Server)

Binary Large Object (Blob) Data (SQL Server)

5/3/2018
THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

SQL Server provides solutions for storing files and documents in the database or on remote storage devices.

Compare Options for Storing Blobs in SQL Server

Compare the advantages of FILESTREAM, FileTables, and Remote Blob Store. See Compare Options for Storing Blobs (SQL Server).

Options for Storing Blobs

FILESTREAM (SQL Server)
FILESTREAM enables SQL Server-based applications to store unstructured data, such as documents and images, on the file system. Applications can leverage the rich streaming APIs and performance of the file system and at the same time maintain transactional consistency between the unstructured data and corresponding structured data.

FileTables (SQL Server)
The FileTable feature brings support for the Windows file namespace and compatibility with Windows applications to the file data stored in SQL Server. FileTable lets an application integrate its storage and data management components, and provides integrated SQL Server services, including full-text search and semantic search, over unstructured data and metadata. In other words, you can store files and documents in special tables in SQL Server called FileTables, but access them from Windows applications as if they were stored in the file system, without making any changes to your client applications.

Remote Blob Store (RBS) (SQL Server)
Remote BLOB store (RBS) for SQL Server lets database administrators store binary large objects (BLOBs) in commodity storage solutions instead of directly on the server. This saves a significant amount of space and avoids wasting expensive server hardware resources. RBS provides a set of API libraries that define a standardized model for applications to access BLOB data. RBS also includes maintenance tools, such as garbage collection, to help manage remote BLOB data.
RBS is included on the SQL Server installation media, but is not installed by the SQL Server Setup program.

Collation and Unicode Support

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

Collations in SQL Server provide sorting rules, case, and accent sensitivity properties for your data. Collations that are used with character data types such as char and varchar dictate the code page and corresponding characters that can be represented for that data type. Whether you are installing a new instance of SQL Server, restoring a database backup, or connecting server to client databases, it is important that you understand the locale requirements, sorting order, and case and accent sensitivity of the data that you are working with. To list the collations available on your instance of SQL Server, see sys.fn_helpcollations (Transact-SQL).

When you select a collation for your server, database, column, or expression, you are assigning certain characteristics to your data that affect the results of many operations in the database. For example, when you construct a query by using ORDER BY, the sort order of your result set might depend on the collation that is applied to the database or dictated in a COLLATE clause at the expression level of the query.

To best use collation support in SQL Server, you must understand the terms that are defined in this topic, and how they relate to the characteristics of your data.

Collation Terms

Collation
Locale
Code page
Sort order

Collation

A collation specifies the bit patterns that represent each character in a data set. Collations also determine the rules that sort and compare data. SQL Server supports storing objects that have different collations in a single database. For non-Unicode columns, the collation setting specifies the code page for the data and which characters can be represented.
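As a rough illustration of how a code page limits which characters non-Unicode data can represent, the following Python sketch (not SQL Server code) checks whether a string can be encoded in a given Windows code page. The `representable` helper is a hypothetical name, and the choice of code pages 1252 (Western European) and 932 (Japanese) is an assumption made for the example.

```python
def representable(text: str, code_page: str) -> bool:
    """Return True if every character in text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False

print(representable("café", "cp1252"))    # True
print(representable("日本語", "cp1252"))  # False
print(representable("日本語", "cp932"))   # True
```

The same text that fits one code page cannot be stored under another, which is why moving data between non-Unicode columns with different collations involves a code page conversion.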
Data that is moved between non-Unicode columns must be converted from the source code page to the destination code page.

Transact-SQL statement results can vary when the statement is run in the context of different databases that have different collation settings. If it is possible, use a standardized collation for your organization. This way, you do not have to explicitly specify the collation in every character or Unicode expression. If you must work with objects that have different collation and code page settings, code your queries to consider the rules of collation precedence. For more information, see Collation Precedence (Transact-SQL).

The options associated with a collation are case sensitivity, accent sensitivity, Kana-sensitivity, width sensitivity, and variation-selector-sensitivity. These options are specified by appending them to the collation name. For example, the collation Japanese_Bushu_Kakusu_100_CS_AS_KS_WS is case-sensitive, accent-sensitive, Kana-sensitive, and width-sensitive. As another example, the collation Japanese_Bushu_Kakusu_140_CI_AI_KS_WS_VSS is case-insensitive, accent-insensitive, Kana-sensitive, width-sensitive, and variation-selector-sensitive. The following table describes the behavior associated with these various options.

Option | Description
Case-sensitive (_CS) | Distinguishes between uppercase and lowercase letters. If selected, lowercase letters sort ahead of their uppercase versions. If this option is not selected, the collation is case-insensitive. That is, SQL Server considers the uppercase and lowercase versions of letters to be identical for sorting purposes. You can explicitly select case insensitivity by specifying _CI.
Accent-sensitive (_AS) | Distinguishes between accented and unaccented characters. For example, 'a' is not equal to 'ấ'. If this option is not selected, the collation is accent-insensitive. That is, SQL Server considers the accented and unaccented versions of letters to be identical for sorting purposes. You can explicitly select accent insensitivity by specifying _AI.
Kana-sensitive (_KS) | Distinguishes between the two types of Japanese kana characters: Hiragana and Katakana. If this option is not selected, the collation is Kana-insensitive. That is, SQL Server considers Hiragana and Katakana characters to be equal for sorting purposes. Omitting this option is the only method of specifying Kana-insensitivity.
Width-sensitive (_WS) | Distinguishes between full-width and half-width characters. If this option is not selected, SQL Server considers the full-width and half-width representation of the same character to be identical for sorting purposes. Omitting this option is the only method of specifying width-insensitivity.
Variation-selector-sensitive (_VSS) | Distinguishes between various ideographic variation selectors in the Japanese collations Japanese_Bushu_Kakusu_140 and Japanese_XJIS_140, first introduced in SQL Server 2017 (14.x). A variation sequence consists of a base character plus an additional variation selector. If the _VSS option is not selected, the collation is variation-selector-insensitive, and the variation selector is not considered in the comparison. That is, SQL Server considers characters built upon the same base character with differing variation selectors to be identical for sorting purposes. See also Unicode Ideographic Variation Database. Variation-selector-sensitive (_VSS) collations are not supported in full-text search indexes. Full-text search indexes support only the accent-sensitive (_AS), Kana-sensitive (_KS), and width-sensitive (_WS) options. The SQL Server XML and CLR engines do not support (_VSS) variation selectors.

SQL Server supports the following collation sets:

Windows collations

Windows collations define rules for storing character data that are based on an associated Windows system locale.
For a Windows collation, comparison of non-Unicode data is implemented by using the same algorithm as Unicode data. The base Windows collation rules specify which alphabet or language is used when dictionary sorting is applied, and the code page that is used to store non-Unicode character data. Both Unicode and non-Unicode sorting are compatible with string comparisons in a particular version of Windows. This provides consistency across data types within SQL Server, and it also lets developers sort strings in their applications by using the same rules that are used by SQL Server. For more information, see Windows Collation Name (Transact-SQL).

Binary collations

Binary collations sort data based on the sequence of coded values that are defined by the locale and data type. They are case sensitive. A binary collation in SQL Server defines the locale and the ANSI code page that is used. This enforces a binary sort order. Because they are relatively simple, binary collations help improve application performance. For non-Unicode data types, data comparisons are based on the code points that are defined in the ANSI code page. For Unicode data types, data comparisons are based on the Unicode code points. For binary collations on Unicode data types, the locale is not considered in data sorts. For example, Latin1_General_BIN and Japanese_BIN yield identical sorting results when they are used on Unicode data.

There are two types of binary collations in SQL Server: the older BIN collations and the newer BIN2 collations. In a BIN2 collation, all characters are sorted according to their code points. In a BIN collation, only the first character is sorted according to the code point, and remaining characters are sorted according to their byte values. (Because the Intel platform is a little-endian architecture, Unicode characters are always stored byte-swapped.)
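The difference between the two binary sort orders can be modeled in a few lines of Python. This is only a sketch of the behavior as described above, restricted to characters that fit in one 16-bit word, and is not SQL Server's implementation; the function names `bin2_key` and `bin_key` are hypothetical.

```python
def bin2_key(s: str):
    """Model of BIN2 as described above: compare code point by code point."""
    return [ord(ch) for ch in s]

def bin_key(s: str):
    """Model of BIN as described above: the first character is compared as a
    16-bit value, the remainder as raw bytes in little-endian storage order."""
    data = s.encode("utf-16-le")
    return (int.from_bytes(data[:2], "little"), data[2:])

# U+00FF is stored as bytes FF 00, U+0100 as bytes 00 01.
words = ["a\u0100", "a\u00ff"]
print(sorted(words, key=bin2_key) == ["a\u00ff", "a\u0100"])  # True
print(sorted(words, key=bin_key) == ["a\u0100", "a\u00ff"])   # True
```

Because the second character is compared byte by byte in the BIN model, the low-order byte decides first, so U+0100 (bytes 00 01) sorts ahead of U+00FF (bytes FF 00) even though its code point is larger.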
SQL Server collations

SQL Server collations (SQL_*) provide sort order compatibility with earlier versions of SQL Server. The dictionary sorting rules for non-Unicode data are incompatible with any sorting routine that is provided by Windows operating systems. However, sorting Unicode data is compatible with a particular version of Windows sorting rules. Because SQL Server collations use different comparison rules for non-Unicode and Unicode data, you see different results for comparisons of the same data, depending on the underlying data type. For more information, see SQL Server Collation Name (Transact-SQL).

NOTE
When you upgrade an English-language instance of SQL Server, SQL Server collations (SQL_*) can be specified for compatibility with existing instances of SQL Server. Because the default collation for an instance of SQL Server is defined during setup, make sure that you specify collation settings carefully when the following are true:

Your application code depends on the behavior of previous SQL Server collations.

You must store character data that reflects multiple languages.

Setting collations is supported at the following levels of an instance of SQL Server:

Server-level collations

The default server collation is set during SQL Server setup, and also becomes the default collation of the system databases and all user databases. Note that Unicode-only collations cannot be selected during SQL Server setup because they are not supported as server-level collations.

After a collation has been assigned to the server, you cannot change the collation except by exporting all database objects and data, rebuilding the master database, and importing all database objects and data. Instead of changing the default collation of an instance of SQL Server, you can specify the desired collation at the time that you create a new database or database column.
Database-level collations

When a database is created or modified, you can use the COLLATE clause of the CREATE DATABASE or ALTER DATABASE statement to specify the default database collation. If no collation is specified, the database is assigned the server collation.

You cannot change the collation of system databases except by changing the collation for the server.

The database collation is used for all metadata in the database, and is the default for all string columns, temporary objects, variable names, and any other strings used in the database. When you change the collation of a user database, there can be collation conflicts when queries in the database access temporary tables. Temporary tables are always stored in the tempdb system database, which uses the collation for the instance. Queries that compare character data between the user database and tempdb may fail if the collations cause a conflict in evaluating the character data. You can resolve this by specifying the COLLATE clause in the query. For more information, see COLLATE (Transact-SQL).

Column-level collations

When you create or alter a table, you can specify collations for each character-string column by using the COLLATE clause. If no collation is specified, the column is assigned the default collation of the database.

Expression-level collations

Expression-level collations are set when a statement is run, and they affect the way a result set is returned. This enables ORDER BY sort results to be locale-specific. Use a COLLATE clause such as the following to implement expression-level collations:

SELECT name FROM customer ORDER BY name COLLATE Latin1_General_CS_AI;

Locale

A locale is a set of information that is associated with a location or a culture. This can include the name and identifier of the spoken language, the script that is used to write the language, and cultural conventions. Collations can be associated with one or more locales.
For more information, see Locale IDs Assigned by Microsoft.

Code Page

A code page is an ordered set of characters of a given script in which a numeric index, or code point value, is associated with each character. A Windows code page is typically referred to as a character set or charset. Code pages are used to provide support for the character sets and keyboard layouts that are used by different Windows system locales.

Sort Order

Sort order specifies how data values are sorted. This affects the results of data comparison. Data is sorted by using collations, and it can be optimized by using indexes.

Unicode Support

Unicode is a standard for mapping code points to characters. Because it is designed to cover all the characters of all the languages of the world, there is no need for different code pages to handle different sets of characters. If you store character data that reflects multiple languages, always use Unicode data types (nchar, nvarchar, and ntext) instead of the non-Unicode data types (char, varchar, and text).

Significant limitations are associated with non-Unicode data types. This is because a non-Unicode computer is limited to use of a single code page. You might see a performance gain by using Unicode because fewer code-page conversions are required. Unicode collations must be selected individually at the database, column, or expression level because they are not supported at the server level.

The code pages that a client uses are determined by the operating system settings. To set client code pages on the Windows operating system, use Regional Settings in Control Panel.

When you move data from a server to a client, your server collation might not be recognized by older client drivers. This can occur when you move data from a Unicode server to a non-Unicode client. Your best option might be to upgrade the client operating system so that the underlying system collations are updated.
If the client has database client software installed, you might consider applying a service update to the database client software.

You can also try to use a different collation for the data on the server. Choose a collation that maps to a code page on the client.

To use the UTF-16 collations available in SQL Server 2017 to improve searching and sorting of some Unicode characters (Windows collations only), you can select either one of the supplementary characters (_SC) collations or one of the version 140 collations.

To evaluate issues that are related to using Unicode or non-Unicode data types, test your scenario to measure performance differences in your environment. It is a good practice to standardize the collation that is used on systems across your organization, and deploy Unicode servers and clients wherever possible.

In many situations, SQL Server interacts with other servers or clients, and your organization might use multiple data access standards between applications and server instances. SQL Server clients are one of two main types:

Unicode clients that use OLE DB and Open Database Connectivity (ODBC) version 3.7 or a later version.

Non-Unicode clients that use DB-Library and ODBC version 3.6 or an earlier version.

The following table provides information about using multilingual data with various combinations of Unicode and non-Unicode servers.

Server | Client | Benefits or limitations
Unicode | Unicode | Because Unicode data is used throughout the system, this scenario provides the best performance and protection from corruption of retrieved data. This is the situation with ActiveX Data Objects (ADO), OLE DB, and ODBC version 3.7 or a later version.
Unicode | Non-Unicode | In this scenario, especially with connections between a server that is running a newer operating system and a client that is running an older version of SQL Server, or on an older operating system, there can be limitations or errors when you move data to a client computer. Unicode data on the server tries to map to a corresponding code page on the non-Unicode client to convert the data.
Non-Unicode | Unicode | This is not an ideal configuration for using multilingual data. You cannot write Unicode data to the non-Unicode server. Problems are likely to occur when data that is outside the server's code page is sent to the server.
Non-Unicode | Non-Unicode | This is a very limiting scenario for multilingual data. You can use only a single code page.

Supplementary Characters

SQL Server provides data types such as nchar and nvarchar to store Unicode data. These data types encode text in a format called UTF-16. The Unicode Consortium allocates each character a unique codepoint, which is a value in the range 0x0000 to 0x10FFFF. The most frequently used characters have codepoint values that fit into a 16-bit word in memory and on disk, but characters with codepoint values larger than 0xFFFF require two consecutive 16-bit words. These characters are called supplementary characters, and the two consecutive 16-bit words are called surrogate pairs.

Introduced in SQL Server 2012 (11.x), a new family of supplementary character (_SC) collations can be used with the data types nchar, nvarchar, and sql_variant. For example: Latin1_General_100_CI_AS_SC, or if using a Japanese collation, Japanese_Bushu_Kakusu_100_CI_AS_SC.

Starting in SQL Server 2014 (12.x), all new collations automatically support supplementary characters.

If you use supplementary characters:

Supplementary characters can be used in ordering and comparison operations in collation versions 90 or greater.

All version 100 collations support linguistic sorting with supplementary characters.

Supplementary characters are not supported for use in metadata, such as in names of database objects.

Databases that use collations with supplementary characters (_SC) cannot be enabled for SQL Server Replication.
This is because some of the system tables and stored procedures that are created for replication use the legacy ntext data type, which does not support supplementary characters.

The SC flag can be applied to:

Version 90 collations
Version 100 collations

The SC flag cannot be applied to:

Version 80 non-versioned Windows collations
The BIN or BIN2 binary collations
The SQL_* collations
Version 140 collations (these don't need the SC flag as they already support supplementary characters)

The following table compares the behavior of some string functions and string operators when they use supplementary characters with and without a supplementary character-aware (SCA) collation:

String function or operator | With a supplementary character-aware (SCA) collation | Without an SCA collation
CHARINDEX, LEN, PATINDEX | The UTF-16 surrogate pair is counted as a single codepoint. | The UTF-16 surrogate pair is counted as two codepoints.
LEFT, REPLACE, REVERSE, RIGHT, SUBSTRING, STUFF | These functions treat each surrogate pair as a single codepoint and work as expected. | These functions may split any surrogate pairs and lead to unexpected results.
NCHAR | Returns the character corresponding to the specified Unicode codepoint value in the range 0 to 0x10FFFF. If the value specified lies in the range 0 through 0xFFFF, one character is returned. For higher values, the corresponding surrogate is returned. | A value higher than 0xFFFF returns NULL instead of the corresponding surrogate.
UNICODE | Returns a UTF-16 codepoint in the range 0 through 0x10FFFF. | Returns a UCS-2 codepoint in the range 0 through 0xFFFF.
Match One Character Wildcard; Wildcard - Character(s) Not to Match | Supplementary characters are supported for all wildcard operations. | Supplementary characters are not supported for these wildcard operations. Other wildcard operators are supported.
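The surrogate pair arithmetic behind the table above can be checked with a short Python sketch. It follows the standard UTF-16 encoding rule and is independent of SQL Server; the helper names `to_surrogate_pair` and `utf16_units` are hypothetical, and U+20000 is simply a sample supplementary character.

```python
def to_surrogate_pair(codepoint: int):
    """Split a supplementary codepoint into its two 16-bit UTF-16 words."""
    if not 0x10000 <= codepoint <= 0x10FFFF:
        raise ValueError("not a supplementary character")
    offset = codepoint - 0x10000
    high = 0xD800 + (offset >> 10)    # upper 10 bits of the offset
    low = 0xDC00 + (offset & 0x3FF)   # lower 10 bits of the offset
    return high, low

def utf16_units(text: str) -> int:
    """Count the 16-bit words needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2

ch = "\U00020000"  # sample supplementary character (codepoint above 0xFFFF)
print([hex(u) for u in to_surrogate_pair(ord(ch))])  # ['0xd840', '0xdc00']
print(len(ch), utf16_units(ch))                      # 1 2
```

The last line shows the two ways of counting the same character: one codepoint, which corresponds to the behavior the table lists for LEN with an SCA collation, and two 16-bit words, which corresponds to the behavior without one.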
GB18030 Support

GB18030 is a separate standard used in the People's Republic of China for encoding Chinese characters. In GB18030, characters can be 1, 2, or 4 bytes in length. SQL Server provides support for GB18030-encoded characters by recognizing them when they enter the server from a client-side application and converting and storing them natively as Unicode characters. After they are stored in the server, they are treated as Unicode characters in any subsequent operations. You can use any Chinese collation, preferably the latest version 100 collation. All version 100 (_100) collations support linguistic sorting with GB18030 characters. If the data includes supplementary characters (surrogate pairs), you can use the SC collations available in SQL Server 2017 to improve searching and sorting.

Complex Script Support

SQL Server can support inputting, storing, changing, and displaying complex scripts. Complex scripts include the following types:

Scripts that include the combination of both right-to-left and left-to-right text, such as a combination of Arabic and English text.

Scripts whose characters change shape depending on their position, or when combined with other characters, such as Arabic, Indic, and Thai characters.

Languages such as Thai that require internal dictionaries to recognize words because there are no breaks between them.

Database applications that interact with SQL Server must use controls that support complex scripts. Standard Windows form controls that are created in managed code are complex script-enabled.

Japanese Collations added in SQL Server 2017 (14.x)

Starting in SQL Server 2017 (14.x), two new Japanese collation families are supported, with the permutations of various options (_CS, _AS, _KS, _WS, _VSS).
To list these collations, you can query the SQL Server Database Engine:

SELECT Name, Description FROM fn_helpcollations() WHERE Name LIKE 'Japanese_Bushu_Kakusu_140%' OR Name LIKE 'Japanese_XJIS_140%';

All of the new collations have built-in support for supplementary characters, so none of the new collations have (or need) the SC flag.

These collations are supported in Database Engine indexes, memory-optimized tables, columnstore indexes, and natively compiled modules.

Related Tasks

Task | Topic
Describes how to set or change the collation of the instance of SQL Server. | Set or Change the Server Collation
Describes how to set or change the collation of a user database. | Set or Change the Database Collation
Describes how to set or change the collation of a column in the database. | Set or Change the Column Collation
Describes how to return collation information at the server, database, or column level. | View Collation Information
Describes how to write Transact-SQL statements that are more portable from one language to another, or support multiple languages more easily. | Write International Transact-SQL Statements
Describes how to change the language of error messages and preferences for how date, time, and currency data are used and displayed. | Set a Session Language

Related Content

SQL Server Best Practices Collation Change
SQL Server Best Practices Migration to Unicode
Unicode Consortium Web site

See Also

Contained Database Collations
Choose a Language When Creating a Full-Text Index
sys.fn_helpcollations (Transact-SQL)

SQL Server Configuration Manager

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

For content related to previous versions of SQL Server, see SQL Server Configuration Manager.
SQL Server Configuration Manager is a tool to manage the services associated with SQL Server, to configure the network protocols used by SQL Server, and to manage the network connectivity configuration from SQL Server client computers. SQL Server Configuration Manager is a Microsoft Management Console snap-in that is available from the Start menu, or can be added to any other Microsoft Management Console display. Microsoft Management Console (mmc.exe) uses the SQLServerManager<version>.msc file (such as SQLServerManager13.msc for SQL Server 2016 (13.x)) to open Configuration Manager. Here are the paths to the last four versions when Windows is installed on the C drive.

SQL Server 2017 | C:\Windows\SysWOW64\SQLServerManager14.msc
SQL Server 2016 | C:\Windows\SysWOW64\SQLServerManager13.msc
SQL Server 2014 (12.x) | C:\Windows\SysWOW64\SQLServerManager12.msc
SQL Server 2012 (11.x) | C:\Windows\SysWOW64\SQLServerManager11.msc

NOTE
Because SQL Server Configuration Manager is a snap-in for the Microsoft Management Console program and not a stand-alone program, SQL Server Configuration Manager does not appear as an application in newer versions of Windows.

Windows 10: To open SQL Server Configuration Manager, on the Start Page, type SQLServerManager13.msc (for SQL Server 2016 (13.x)). For previous versions of SQL Server, replace 13 with a smaller number. Clicking SQLServerManager13.msc opens the Configuration Manager. To pin the Configuration Manager to the Start Page or Task Bar, right-click SQLServerManager13.msc, and then click Open file location. In Windows File Explorer, right-click SQLServerManager13.msc, and then click Pin to Start or Pin to taskbar.

Windows 8: To open SQL Server Configuration Manager, in the Search charm, under Apps, type SQLServerManager<version>.msc, such as SQLServerManager13.msc, and then press Enter.

SQL Server Configuration Manager and SQL Server Management Studio use Windows Management Instrumentation (WMI) to view and change some server settings.
WMI provides a unified way to interface with the API calls that manage the registry operations requested by the SQL Server tools, and it provides enhanced control over the SQL services selected in the SQL Server Configuration Manager snap-in. For information about configuring permissions related to WMI, see Configure WMI to Show Server Status in SQL Server Tools.

To start, stop, pause, resume, or configure services on another computer by using SQL Server Configuration Manager, see Connect to Another Computer (SQL Server Configuration Manager).

Managing Services

Use SQL Server Configuration Manager to start, pause, resume, or stop the services, to view service properties, or to change service properties. Use SQL Server Configuration Manager to start the Database Engine using startup parameters. For more information, see Configure Server Startup Options (SQL Server Configuration Manager).

Changing the Accounts Used by the Services

Manage the SQL Server services using SQL Server Configuration Manager.

IMPORTANT
Always use SQL Server tools such as SQL Server Configuration Manager to change the account used by the SQL Server or SQL Server Agent services, or to change the password for the account. In addition to changing the account name, SQL Server Configuration Manager performs additional configuration such as setting permissions in the Windows Registry so that the new account can read the SQL Server settings. Other tools such as the Windows Services Control Manager can change the account name but do not change associated settings. If the service cannot access the SQL Server portion of the registry, the service may not start properly. As an additional benefit, passwords changed using SQL Server Configuration Manager, SMO, or WMI take effect immediately without restarting the service.

Manage Server & Client Network Protocols

SQL Server Configuration Manager allows you to configure server and client network protocols, and connectivity options.
After the correct protocols are enabled, you usually do not need to change the server network connections. However, you can use SQL Server Configuration Manager if you need to reconfigure the server connections so SQL Server listens on a particular network protocol, port, or pipe. For more information about enabling protocols, see Enable or Disable a Server Network Protocol. For information about enabling access to protocols through a firewall, see Configure the Windows Firewall to Allow SQL Server Access.

SQL Server Configuration Manager allows you to manage server and client network protocols, including the ability to force protocol encryption, view alias properties, or enable/disable a protocol.

SQL Server Configuration Manager allows you to create or remove an alias, change the order in which protocols are used, or view properties for a server alias, including:
- Server Alias: The server alias used for the computer to which the client is connecting.
- Protocol: The network protocol used for the configuration entry.
- Connection Parameters: The parameters associated with the connection address for the network protocol configuration.

The SQL Server Configuration Manager also allows you to view information about failover cluster instances, though Cluster Administrator should be used for some actions such as starting and stopping the services.

Available Network Protocols

SQL Server supports Shared Memory, TCP/IP, and Named Pipes protocols. For information about choosing a network protocol, see Configure Client Protocols. SQL Server does not support the VIA, Banyan VINES Sequenced Packet Protocol (SPP), Multiprotocol, AppleTalk, or NWLink IPX/SPX network protocols. Clients previously connecting with these protocols must select a different protocol to connect to SQL Server. You cannot use SQL Server Configuration Manager to configure the WinSock proxy. To configure the WinSock proxy, see your ISA Server documentation.
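To confirm which of the supported protocols a given connection is actually using, you can ask the server itself. The following is a minimal sketch rather than a procedure from this guide: it assumes the login has the VIEW SERVER STATE permission and reads the sys.dm_exec_connections dynamic management view for the current session only.

```sql
-- Show the transport used by the current session.
-- net_transport typically reports Shared memory, TCP, or Named pipe.
SELECT session_id,
       net_transport,     -- physical transport protocol of this connection
       encrypt_option,    -- whether encryption is enabled for this connection
       local_tcp_port     -- NULL unless the connection uses TCP/IP
FROM sys.dm_exec_connections
WHERE session_id = @@SPID;
```

Running the query from a client on the server computer and again from a remote client is a simple way to see the effect of the protocol order configured in SQL Server Configuration Manager.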
Related Tasks

Managing Services How-to Topics (SQL Server Configuration Manager)
Start, Stop, Pause, Resume, Restart the Database Engine, SQL Server Agent, or SQL Server Browser Service
Start, Stop, or Pause the SQL Server Agent Service
Set an Instance of SQL Server to Start Automatically (SQL Server Configuration Manager)
Prevent Automatic Startup of an Instance of SQL Server (SQL Server Configuration Manager)

Cursors

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

Operations in a relational database act on a complete set of rows. For example, the set of rows returned by a SELECT statement consists of all the rows that satisfy the conditions in the WHERE clause of the statement. This complete set of rows returned by the statement is known as the result set. Applications, especially interactive online applications, cannot always work effectively with the entire result set as a unit. These applications need a mechanism to work with one row or a small block of rows at a time. Cursors are an extension to result sets that provide that mechanism.

Cursors extend result processing by:
- Allowing positioning at specific rows of the result set.
- Retrieving one row or block of rows from the current position in the result set.
- Supporting data modifications to the rows at the current position in the result set.
- Supporting different levels of visibility to changes made by other users to the database data that is presented in the result set.
- Providing Transact-SQL statements in scripts, stored procedures, and triggers access to the data in a result set.

Concepts

Cursor Implementations

SQL Server supports three cursor implementations.

Transact-SQL cursors
Are based on the DECLARE CURSOR syntax and are used mainly in Transact-SQL scripts, stored procedures, and triggers.
Transact-SQL cursors are implemented on the server and are managed by Transact-SQL statements sent from the client to the server. They may also be contained in batches, stored procedures, or triggers.

Application programming interface (API) server cursors
Support the API cursor functions in OLE DB and ODBC. API server cursors are implemented on the server. Each time a client application calls an API cursor function, the SQL Server Native Client OLE DB provider or ODBC driver transmits the request to the server for action against the API server cursor.

Client cursors
Are implemented internally by the SQL Server Native Client ODBC driver and by the DLL that implements the ADO API. Client cursors are implemented by caching all the result set rows on the client. Each time a client application calls an API cursor function, the SQL Server Native Client ODBC driver or the ADO DLL performs the cursor operation on the result set rows cached on the client.

Type of Cursors

Forward-only
A forward-only cursor does not support scrolling; it supports only fetching the rows serially from the start to the end of the cursor. The rows are not retrieved from the database until they are fetched. The effects of all INSERT, UPDATE, and DELETE statements made by the current user or committed by other users that affect rows in the result set are visible as the rows are fetched from the cursor.

Because the cursor cannot be scrolled backward, most changes made to rows in the database after the row was fetched are not visible through the cursor. In cases where a value used to determine the location of the row within the result set is modified, such as updating a column covered by a clustered index, the modified value is visible through the cursor.

Although the database API cursor models consider a forward-only cursor to be a distinct type of cursor, SQL Server does not.
SQL Server considers both forward-only and scroll as options that can be applied to static, keyset-driven, and dynamic cursors. Transact-SQL cursors support forward-only static, keyset-driven, and dynamic cursors. The database API cursor models assume that static, keyset-driven, and dynamic cursors are always scrollable. When a database API cursor attribute or property is set to forward-only, SQL Server implements this as a forward-only dynamic cursor.

Static
The complete result set of a static cursor is built in tempdb when the cursor is opened. A static cursor always displays the result set as it was when the cursor was opened. Static cursors detect few or no changes, but consume relatively few resources while scrolling.

The cursor does not reflect any changes made in the database that affect either the membership of the result set or changes to the values in the columns of the rows that make up the result set. A static cursor does not display new rows inserted in the database after the cursor was opened, even if they match the search conditions of the cursor SELECT statement. If rows making up the result set are updated by other users, the new data values are not displayed in the static cursor. The static cursor displays rows deleted from the database after the cursor was opened. No UPDATE, INSERT, or DELETE operations are reflected in a static cursor (unless the cursor is closed and reopened), not even modifications made using the same connection that opened the cursor.

SQL Server static cursors are always read-only.

Because the result set of a static cursor is stored in a work table in tempdb, the size of the rows in the result set cannot exceed the maximum row size for a SQL Server table.

Transact-SQL uses the term insensitive for static cursors. Some database APIs identify them as snapshot cursors.

Keyset
The membership and order of rows in a keyset-driven cursor are fixed when the cursor is opened.
Keyset-driven cursors are controlled by a set of unique identifiers, keys, known as the keyset. The keys are built from a set of columns that uniquely identify the rows in the result set. The keyset is the set of the key values from all the rows that qualified for the SELECT statement at the time the cursor was opened. The keyset for a keyset-driven cursor is built in tempdb when the cursor is opened.

Dynamic
Dynamic cursors are the opposite of static cursors. Dynamic cursors reflect all changes made to the rows in their result set when scrolling through the cursor. The data values, order, and membership of the rows in the result set can change on each fetch. All UPDATE, INSERT, and DELETE statements made by all users are visible through the cursor. Updates are visible immediately if they are made through the cursor using either an API function such as SQLSetPos or the Transact-SQL WHERE CURRENT OF clause. Updates made outside the cursor are not visible until they are committed, unless the cursor transaction isolation level is set to read uncommitted. Dynamic cursor plans never use spatial indexes.

Requesting a Cursor

SQL Server supports two methods for requesting a cursor:

Transact-SQL
The Transact-SQL language supports a syntax for using cursors modeled after the ISO cursor syntax.

Database application programming interface (API) cursor functions
SQL Server supports the cursor functionality of these database APIs:
- ADO (Microsoft ActiveX Data Object)
- OLE DB
- ODBC (Open Database Connectivity)

An application should never mix these two methods of requesting a cursor. An application that has used the API to specify cursor behaviors should not then execute a Transact-SQL DECLARE CURSOR statement to also request a Transact-SQL cursor. An application should only execute DECLARE CURSOR if it has set all the API cursor attributes back to their defaults.
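As an illustration of the Transact-SQL method of requesting a cursor, the following minimal sketch declares, opens, fetches from, and closes a forward-only static cursor. The cursor name, the variable, and the use of sys.databases as a row source are assumptions chosen only so that the batch can run on any instance; they are not taken from this guide.

```sql
-- Hypothetical example: walk through database names one row at a time.
DECLARE @name sysname;

-- Associate the cursor with a SELECT statement and define its characteristics.
DECLARE db_cursor CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT name FROM sys.databases ORDER BY name;

OPEN db_cursor;                              -- populate the cursor

FETCH NEXT FROM db_cursor INTO @name;        -- retrieve the first row
WHILE @@FETCH_STATUS = 0                     -- 0 means the last fetch succeeded
BEGIN
    PRINT @name;
    FETCH NEXT FROM db_cursor INTO @name;    -- advance to the next row
END;

CLOSE db_cursor;                             -- release the current result set
DEALLOCATE db_cursor;                        -- remove the cursor reference
```

Because the cursor is declared STATIC, its rows come from a copy built in tempdb when the cursor is opened, so databases created while the loop runs would not appear in the output.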
If neither a Transact-SQL nor API cursor has been requested, SQL Server defaults to returning a complete result set, known as a default result set, to the application.

Cursor Process

Transact-SQL cursors and API cursors have different syntax, but the following general process is used with all SQL Server cursors:
1. Associate a cursor with the result set of a Transact-SQL statement, and define characteristics of the cursor, such as whether the rows in the cursor can be updated.
2. Execute the Transact-SQL statement to populate the cursor.
3. Retrieve the rows in the cursor you want to see. The operation to retrieve one row or one block of rows from a cursor is called a fetch. Performing a series of fetches to retrieve rows in either a forward or backward direction is called scrolling.
4. Optionally, perform modification operations (update or delete) on the row at the current position in the cursor.
5. Close the cursor.

Related Content

Cursor Behaviors
How Cursors Are Implemented

See Also

DECLARE CURSOR (Transact-SQL)
Cursors (Transact-SQL)
Cursor Functions (Transact-SQL)
Cursor Stored Procedures (Transact-SQL)

Data Collection

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

The Data Collector is a component of SQL Server 2017 that collects different sets of data. Data collection either runs constantly or on a user-defined schedule. The data collector stores the collected data in a relational database known as the management data warehouse.

What is Data Collector?

The data collector is a core component of the data collection platform for SQL Server 2017 and the tools that are provided by SQL Server. The data collector provides one central point for data collection across your database servers and applications. This collection point can obtain data from a variety of sources and is not limited to performance data, unlike SQL Trace.
The data collector enables you to adjust the scope of data collection to suit your test and production environments. The data collector also uses a data warehouse, a relational database that enables you to manage the data that you collect by setting different retention periods for your data.

The data collector supports dynamic tuning for data collection and is extensible through its API. For more information, see Data Collector Programming.

The following illustration shows how the data collector fits in the overall strategy for data collection and data management in SQL Server 2017.

Concepts

The data collector is integrated with SQL Server Agent and Integration Services, and uses both extensively. Before you work with the data collector, you should therefore understand certain concepts related to each of these SQL Server components.

SQL Server Agent is used to schedule and run collection jobs. You should understand the following concepts:
- Job
- Job step
- Job schedule
- Subsystem
- Proxy accounts

For more information, see Automated Administration Tasks (SQL Server Agent).

Integration Services (SSIS) is used to execute packages that collect data from individual data providers. You should be familiar with the following SSIS tools and concepts:
- SSIS package
- SSIS package configuration

For more information, see Integration Services (SSIS) Packages.

Terminology

target
An instance of the Database Engine in an edition of SQL Server that supports Data Collection. For more information about supported editions, see the "Manageability" section of Features Supported by the Editions of SQL Server 2016.
A target root defines a subtree in the target hierarchy. A target set is the group of targets that results from applying a filter to a subtree defined by a target root. A target root can be a database, an instance of SQL Server, or a computer instance.

target type
The type of target, which has certain characteristics and behavior.
For example, a SQL Server instance target has different characteristics than a SQL Server database target.

data provider
A known data source, specific to a target type, that provides data to a collector type.

collector type
A logical wrapper around the SSIS packages that provide the actual mechanism for collecting data and uploading it to the management data warehouse.

collection item
An instance of a collector type. A collection item is created with a specific set of input properties and a collection frequency.

collection set
A group of collection items. A collection set is a unit of data collection that a user can interact with through the user interface.

collection mode
The manner in which the data is collected and stored. Collection mode can be cached or non-cached. Cached mode supports continuous collection, whereas non-cached mode is intended for on-demand collection or a collection snapshot.

management data warehouse
A relational database used to store collected data.

The following illustration shows the dependencies and relationships between data collector components.

As shown in the illustration, the data provider is external to the data collector and by definition has an implicit relationship with the target. The data provider is specific to a particular target (for example, a SQL Server service such as the relational engine) and provides data such as system views in SQL Server, Performance Monitor counters, and WMI providers, that can be consumed by the data collector.

The collector type is specific to a target type, based on the logical association of a data provider to a target type. The collector type defines how data is collected from a specific data provider (by using schematized parameters) and specifies the data storage schema. The data provider schema and storage schema are required in order to store the data that is collected.
The collector type also provides the location of the management data warehouse, which can reside on the computer running data collection or on a different computer.

A collection item, shown in the illustration, is an instance of a specific collector type, parameterized with input parameters, such as the XML schema for the collector type. All collection items must operate on the same target root or on an empty target root. This enables the data collector to combine collector types from the operating system or from a specific target root, but not from different target roots.

A collection item has a collection frequency defined that determines how often snapshots of values are taken. Although it is a building block for a collection set, a collection item cannot exist on its own.

Collection sets are defined and deployed on a server instance and can be run independently of each other. Each collection set can be applied to a target that matches the target types of all the collector types that are part of a collection set. The collection set is run by a SQL Server Agent job or jobs, and data is uploaded to the management data warehouse on a predefined schedule.

All the data collected by different instances within the collection set is uploaded to the management data warehouse on the same schedule. This schedule is defined as a shared SQL Server Agent schedule and can be used by more than one collection set. A collection set is turned on or turned off as a single entity; collection items cannot be turned on or turned off individually.

When you create or update a collection set, you can configure the collection mode for collecting data and uploading it to the management data warehouse. The type of scheduling is determined by the type of collection: cached or non-cached. If the collection is cached, data collection and upload each run on a separate job.
Collection runs on a schedule that starts when the SQL Server Agent starts and it runs on the frequency specified in the collection item. Upload runs according to the schedule specified by the user.

Under non-cached collection, data collection and upload both run on a single job, but in two steps. Step one is collection, step two is upload. No schedule is required for on-demand collection.

After a collection set is enabled, data collection can start, either according to a schedule or on demand. When data collection starts, SQL Server Agent spawns a process for the data collector, which in turn loads the Integration Services packages for the collection set. The collection items, which represent collection types, gather data from the appropriate data providers on the specified targets. When the collection cycle ends, this data is uploaded to the management data warehouse.

Things you can do

DESCRIPTION | TOPIC
Manage different aspects of data collection, such as enabling or disabling data collection, changing a collection set configuration, or viewing data in the management data warehouse. | Manage Data Collection
Use reports to obtain information for monitoring system capacity and troubleshooting system performance. | System Data Collection Set Reports
Use the Management Data Warehouse to collect data from a server that is a data collection target. | Management Data Warehouse
Exploit the server-side trace capabilities of SQL Server Profiler to export a trace definition that you can use to create a collection set that uses the Generic SQL Trace collector type. | Use SQL Server Profiler to Create a SQL Trace Collection Set (SQL Server Management Studio)

Data Compression

5/3/2018

THIS TOPIC APPLIES TO: SQL Server, Azure SQL Database, Azure SQL Data Warehouse, Parallel Data Warehouse

SQL Server 2017 and Azure SQL Database support row and page compression for rowstore tables and indexes, and support columnstore and columnstore archival compression for columnstore tables and indexes.

For rowstore tables and indexes, use the data compression feature to help reduce the size of the database. In addition to saving space, data compression can help improve performance of I/O intensive workloads because the data is stored in fewer pages and queries need to read fewer pages from disk. However, extra CPU resources are required on the database server to compress and decompress the data while data is exchanged with the application. You can configure row and page compression on the following database objects:
- A whole table that is stored as a heap.
- A whole table that is stored as a clustered index.
- A whole nonclustered index.
- A whole indexed view.
- For partitioned tables and indexes, you can configure the compression option for each partition, and the various partitions of an object do not have to have the same compression setting.

For columnstore tables and indexes, all columnstore tables and indexes always use columnstore compression and this is not user configurable. Use columnstore archival compression to further reduce the data size for situations when you can afford extra time and CPU resources to store and retrieve the data. You can configure columnstore archival compression on the following database objects:
- A whole columnstore table or a whole clustered columnstore index.
  Since a columnstore table is stored as a clustered columnstore index, both approaches have the same results.
- A whole nonclustered columnstore index.
- For partitioned columnstore tables and columnstore indexes, you can configure the archival compression option for each partition, and the various partitions do not have to have the same archival compression setting.

NOTE
Data can also be compressed using the GZIP algorithm format. This is an additional step and is most suitable for compressing portions of the data when archiving old data for long-term storage. Data compressed using the COMPRESS function cannot be indexed. For more information, see COMPRESS (Transact-SQL).

Considerations for When You Use Row and Page Compression

When you use row and page compression, be aware of the following considerations:
- The details of data compression are subject to change without notice in service packs or subsequent releases.
- Compression is available in Azure SQL Database.
- Compression is not available in every edition of SQL Server. For more information, see Features Supported by the Editions of SQL Server 2016.
- Compression is not available for system tables.
- Compression can allow more rows to be stored on a page, but does not change the maximum row size of a table or index.
- A table cannot be enabled for compression when the maximum row size plus the compression overhead exceeds the maximum row size of 8060 bytes. For example, a table that has the columns c1 char(8000) and c2 char(53) cannot be compressed because of the additional compression overhead. When the vardecimal storage format is used, the row-size check is performed when the format is enabled. For row and page compression, the row-size check is performed when the object is initially compressed, and then checked as each row is inserted or modified. Compression enforces the following two rules:
  - An update to a fixed-length type must always succeed.
  - Disabling data compression must always succeed. Even if the compressed row fits on the page, which means that it is less than 8060 bytes, SQL Server prevents updates that would not fit on the row when it is uncompressed.
- When a list of partitions is specified, the compression type can be set to ROW, PAGE, or NONE on individual partitions. If the list of partitions is not specified, all partitions are set with the data compression property that is specified in the statement. When a table or index is created, data compression is set to NONE unless otherwise specified. When a table is modified, the existing compression is preserved unless otherwise specified.
- If you specify a list of partitions or a partition that is out of range, an error is generated.
- Nonclustered indexes do not inherit the compression property of the table. To compress indexes, you must explicitly set the compression property of the indexes. By default, the compression setting for indexes is set to NONE when the index is created.
- When a clustered index is created on a heap, the clustered index inherits the compression state of the heap unless an alternative compression state is specified.
- When a heap is configured for page-level compression, pages receive page-level compression only in the following ways:
  - Data is bulk imported with bulk optimizations enabled.
  - Data is inserted using INSERT INTO ... WITH (TABLOCK) syntax and the table does not have a nonclustered index.
  - A table is rebuilt by executing the ALTER TABLE ... REBUILD statement with the PAGE compression option.
- New pages allocated in a heap as part of DML operations do not use PAGE compression until the heap is rebuilt. Rebuild the heap by removing and reapplying compression, or by creating and removing a clustered index.
- Changing the compression setting of a heap requires all nonclustered indexes on the table to be rebuilt so that they have pointers to the new row locations in the heap.
- You can enable or disable ROW or PAGE compression online or offline. Enabling compression on a heap is single threaded for an online operation.
- The disk space requirements for enabling or disabling row or page compression are the same as for creating or rebuilding an index. For partitioned data, you can reduce the space that is required by enabling or disabling compression for one partition at a time.
- To determine the compression state of partitions in a partitioned table, query the data_compression column of the sys.partitions catalog view.
- When you are compressing indexes, leaf-level pages can be compressed with both row and page compression. Non-leaf-level pages do not receive page compression.
- Because of their size, large-value data types are sometimes stored separately from the normal row data on special purpose pages. Data compression is not available for the data that is stored separately.
- Tables that implemented the vardecimal storage format in SQL Server 2005 retain that setting when upgraded. You can apply row compression to a table that has the vardecimal storage format. However, because row compression is a superset of the vardecimal storage format, there is no reason to retain the vardecimal storage format. Decimal values gain no additional compression when you combine the vardecimal storage format with row compression. You can apply page compression to a table that has the vardecimal storage format; however, the vardecimal storage format columns probably will not achieve additional compression.

NOTE
SQL Server 2017 supports the vardecimal storage format; however, because row-level compression achieves the same goals, the vardecimal storage format is deprecated. This feature will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature.

Using Columnstore and Columnstore Archive Compression

Applies to: SQL Server (SQL Server 2014 (12.x) through current version), Azure SQL Database.
Basics

Columnstore tables and indexes are always stored with columnstore compression. You can further reduce the size of columnstore data by configuring an additional compression called archival compression. To perform archival compression, SQL Server runs the Microsoft XPRESS compression algorithm on the data. Add or remove archival compression by using the following data compression types:
- Use COLUMNSTORE_ARCHIVE data compression to compress columnstore data with archival compression.
- Use COLUMNSTORE data compression to decompress archival compression. The resulting data continues to be compressed with columnstore compression.

To add archival compression, use ALTER TABLE (Transact-SQL) or ALTER INDEX (Transact-SQL) with the REBUILD option and DATA_COMPRESSION = COLUMNSTORE_ARCHIVE.

Examples:

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE) ;

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE) ;

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE ON PARTITIONS (2,4)) ;

To remove archival compression and restore the data to columnstore compression, use ALTER TABLE (Transact-SQL) or ALTER INDEX (Transact-SQL) with the REBUILD option and DATA_COMPRESSION = COLUMNSTORE.

Examples:

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = 1 WITH (DATA_COMPRESSION = COLUMNSTORE) ;

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE) ;

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = ALL WITH (DATA_COMPRESSION = COLUMNSTORE ON PARTITIONS (2,4)) ;

This next example sets the data compression to columnstore on some partitions, and to columnstore archival on other partitions.

    ALTER TABLE ColumnstoreTable1
    REBUILD PARTITION = ALL WITH (
        DATA_COMPRESSION = COLUMNSTORE ON PARTITIONS (4,5),
        DATA_COMPRESSION = COLUMNSTORE_ARCHIVE ON PARTITIONS (1,2,3)
    ) ;

Performance

Compressing columnstore indexes with archival compression causes the index to perform slower than columnstore indexes that do not have the archival compression. Use archival compression only when you can afford to use extra time and CPU resources to compress and retrieve the data.

The benefit of archival compression is reduced storage, which is useful for data that is not accessed frequently. For example, if you have a partition for each month of data, and most of your activity is for the most recent months, you could archive older months to reduce the storage requirements.

Metadata

The following system views contain information about data compression for clustered indexes:
- sys.indexes (Transact-SQL): The type and type_desc columns include CLUSTERED COLUMNSTORE and NONCLUSTERED COLUMNSTORE.
- sys.partitions (Transact-SQL): The data_compression and data_compression_desc columns include COLUMNSTORE and COLUMNSTORE_ARCHIVE.

The procedure sp_estimate_data_compression_savings (Transact-SQL) does not apply to columnstore indexes.

How Compression Affects Partitioned Tables and Indexes

When you use data compression with partitioned tables and indexes, be aware of the following considerations:
- When partitions are split by using the ALTER PARTITION FUNCTION statement, both partitions inherit the data compression attribute of the original partition.
- When two partitions are merged, the resultant partition inherits the data compression attribute of the destination partition.
- To switch a partition, the data compression property of the partition must match the compression property of the table.
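Before you split, merge, or switch partitions, it can help to check the compression setting that each partition currently has. The following is a minimal sketch that reads sys.partitions for that purpose. The table name is carried over from the ColumnstoreTable1 examples in this topic, and placing it in the dbo schema is an assumption, so substitute your own object name.

```sql
-- List the compression setting of every partition of one table.
SELECT i.name AS index_name,
       p.index_id,
       p.partition_number,
       p.data_compression_desc,   -- NONE, ROW, PAGE, COLUMNSTORE, or COLUMNSTORE_ARCHIVE
       p.rows
FROM sys.partitions AS p
JOIN sys.indexes AS i
    ON i.object_id = p.object_id
   AND i.index_id = p.index_id
WHERE p.object_id = OBJECT_ID(N'dbo.ColumnstoreTable1')   -- assumed table name
ORDER BY p.index_id, p.partition_number;
```

Running the same query against the source and the target of a planned partition switch shows whether their compression properties match.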
There are two syntax variations that you can use to modify the compression of a partitioned table or index:

The following syntax rebuilds only the referenced partition:

    ALTER TABLE REBUILD PARTITION = 1 WITH (DATA_COMPRESSION =
