Feb 272012
 

At the end of last post, I gave you a puzzle and also mentioned to give the answer of it.

use AdventureWorks2008R2

--set statistics io on -- this is the hint
select CustomerID, SalesOrderID
from Sales.SalesOrderHeader with (index=[IX_SalesOrderHeader_CustomerID])
where CustomerID = 29974 and SalesOrderID = 45785

select CustomerID, SalesOrderID
from Sales.SalesOrderHeader
where CustomerID = 29974 and SalesOrderID = 45785

Look at the query plan, the cost of both are the same. Does it mean the performance of those 2 are the same? If not, Which statement generates better plan?

I would say the first one has better plan in this particular case. If you uncomment “set statistics io on” statement and run it, some statistics will be returned.


(1 row(s) affected)
Table 'SalesOrderHeader'. Scan count 0, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table 'SalesOrderHeader'. Scan count 0, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The first SQL statement only does 2 logical IOs where the second one does 3. This means that the first SQL accesses 2 pages where the second one access 3. The reason the second statement access 3 is it’s clustered index that the row of the data is organized by clustered index keys. Key columns and non-key columns both exist in leaf level of the index in which it makes fewer rows in a data page. As we know that every row in the non-leaf level of an index will have one record for each page at leaf level. If a leaf level page contains less rows, more records with the clustered index keys will be escalated to the non-leaf level. While number of records is increasing, more records are escalated to the non-leaf level in which it will cause the tree of the clustered index growning, which mean whenever SQL Server need to access row(s) at leaf level, it will have to traverse more pages in the non-leaf level.

The size of none clustered index rows are usually smaller than records in the clustered index. Holding same amount of records, none clustered indexes usually need less non-leaf levels than clustered indexes. SQL Server will need less steps down to the data pages. But in the most of the time, non-leaf level pages are cached in the memory. The costs can be ignored. Since the clustered index is unique, which is very selective, and the clustered key value is provided in where clause, SQL Server chooses clustered index. It’s not absolutely optimal but it’s the plan that SQL Server “thinks” less to give it out. This is always the case that SQL Server does not always give you the best plan.

You may think that since the first query is using CustomerID index key, SQL Server will have to check the CustomerID, then find the SalesOrderID. That will make the execution slower. The fact is that there is no true none unique index in the SQL Server system. When you create a none unique none clustered index, SQL Server will automatically add the clustered index keys(or RowID if the table is in HEAP) to the end to make it unique. The made up key will be escalated to none-leave level until root level as well. Looking at the seek operator of the first plan, there is no “Predicates” property (only “Seek Predicates”). This mean that SQL Server performs seek based on both CustomerID and SalesOrderID columns. CustomerID is not selective but combination of CustomerID and SalesOrderID is.

 Leave a Reply

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong> <pre class="">

(required)

(required)

C# | HTML | Plain Text | SQL | XHTML | XML | XSLT |

This site uses Akismet to reduce spam. Learn how your comment data is processed.