Thursday, July 29, 2010

Finding Key Lookups In Cached Execution Plans

Ask anyone with performance tuning experience to list the most expensive operations in an execution plan and it's a safe bet you'll find key lookups near the top. The good news is they're usually easy to fix - Glenn Berry (Blog | Twitter) blogged about it recently and many others have written about the subject as well. For most people though, it's finding out when and where they're happening that's the tough part.

Background
A key lookup occurs when a nonclustered index is used to satisfy a query but doesn't include all the columns the query is asking for so they have to be retrieved from the corresponding clustered index (or heap if there's no clustered index) that the nonclustered index is based on. It's expensive because it requires fetching additional pages into the buffer which has a double whammy effect: if the pages aren't already in the buffer you have to wait for the disk subsystem to retrieve them and you're filling up the buffer with up to twice the number of data pages you'd need if you could just stick with using the nonclustered index to begin with.

Finding Key Lookups - There's a DMV For That
SQL Server keeps track of how many key lookups occur against every index in sys.dm_db_index_usage_stats (in the user_lookups column). Interesting, but to be useful we need to know what queries are causing the lookups. Enter the DMV sys.dm_exec_query_stats which keeps track of a wealth of information about how many times a query has been executed and the resources (CPU, disk, etc.) its used. Plug the sql_handle and plan_handle columns into the DMFs sys.dm_exec_sql_text and sys.dm_exec_query_plan, respectively, and we get the text and execution plan for the query. Because the execution plan is an XML document we can leverage SQL Server's native XML capabilities to find any key lookup operations that are occurring. Join them all together and - voila! - we can see every cached query that's got a key lookup, the additional columns being retrieved, and the execution plan, ordered by worst offender first - everything we need to know to work on eliminating the key lookups that are draining performance. Happy tuning!

NOTE: As the comments in the header suggest , exercise caution when running this against a production server…executing sys.dm_exec_query_plan can be resource intensive when your server contains a lot of cached plans. You have been warned!


/*********************************************************************************************
Find Key Lookups in Cached Plans v1.00 (2010-07-27)
(C) 2010, Kendal Van Dyke

Feedback: mailto:kendal.vandyke@gmail.com

License:
   This query is free to download and use for personal, educational, and internal
   corporate purposes, provided that this header is preserved. Redistribution or sale
   of this query, in whole or in part, is prohibited without the author's express
   written consent.
  
Note:
   Exercise caution when running this in production!

   The function sys.dm_exec_query_plan() is resource intensive and can put strain
   on a server when used to retrieve all cached query plans.

   Consider using TOP in the initial select statement (insert into @plans)
   to limit the impact of running this query or run during non-peak hours
*********************************************************************************************/
DECLARE @plans TABLE
   
(
     
query_text NVARCHAR(MAX) ,
     
o_name SYSNAME ,
     
execution_plan XML ,
     
last_execution_time DATETIME ,
     
execution_count BIGINT ,
     
total_worker_time BIGINT ,
     
total_physical_reads BIGINT ,
     
total_logical_reads BIGINT
   
) ;

DECLARE @lookups TABLE
   
(
     
table_name SYSNAME ,
     
index_name SYSNAME ,
     
index_cols NVARCHAR(MAX)
    ) ;

WITH    query_stats
         
AS ( SELECT   [sql_handle] ,
                       
[plan_handle] ,
                       
MAX(last_execution_time) AS last_execution_time ,
                       
SUM(execution_count) AS execution_count ,
                       
SUM(total_worker_time) AS total_worker_time ,
                       
SUM(total_physical_reads) AS total_physical_reads ,
                       
SUM(total_logical_reads) AS total_logical_reads
              
FROM     sys.dm_exec_query_stats
              
GROUP BY [sql_handle] ,
                       
[plan_handle]
            
)
   
INSERT  INTO @plans
           
( query_text ,
             
o_name ,
             
execution_plan ,
             
last_execution_time ,
             
execution_count ,
             
total_worker_time ,
             
total_physical_reads ,
             
total_logical_reads
           
)
           
SELECT /*TOP 50*/
                   
sql_text.[text] ,
                   
CASE WHEN sql_text.objectid IS NOT NULL
                        
THEN ISNULL(OBJECT_NAME(sql_text.objectid,
                                                
sql_text.[dbid]),
                                    
'Unresolved')
                        
ELSE CAST('Ad-hoc\Prepared' AS SYSNAME)
                   
END ,
                   
query_plan.query_plan ,
                   
query_stats.last_execution_time ,
                   
query_stats.execution_count ,
                   
query_stats.total_worker_time ,
                   
query_stats.total_physical_reads ,
                   
query_stats.total_logical_reads
           
FROM    query_stats
                   
CROSS APPLY sys.dm_exec_sql_text(query_stats.sql_handle)
                   
AS [sql_text]
                   
CROSS APPLY sys.dm_exec_query_plan(query_stats.plan_handle)
                   
AS [query_plan]
           
WHERE   query_plan.query_plan IS NOT NULL ;


;
WITH XMLNAMESPACES (
  
DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan'
),
lookups AS (
  
SELECT  DB_ID(REPLACE(REPLACE(keylookups.keylookup.value('(Object/@Database)[1]',
                                                           
'sysname'), '[', ''),
                        
']', '')) AS [database_id] ,
          
OBJECT_ID(keylookups.keylookup.value('(Object/@Database)[1]',
                                               
'sysname') + '.'
                    
+ keylookups.keylookup.value('(Object/@Schema)[1]',
                                                 
'sysname') + '.'
                    
+ keylookups.keylookup.value('(Object/@Table)[1]', 'sysname')) AS [object_id] ,
          
keylookups.keylookup.value('(Object/@Database)[1]', 'sysname') AS [database] ,
          
keylookups.keylookup.value('(Object/@Schema)[1]', 'sysname') AS [schema] ,
          
keylookups.keylookup.value('(Object/@Table)[1]', 'sysname') AS [table] ,
          
keylookups.keylookup.value('(Object/@Index)[1]', 'sysname') AS [index] ,
          
REPLACE(keylookups.keylookup.query('
for $column in DefinedValues/DefinedValue/ColumnReference
return string($column/@Column)
'
).value('.', 'varchar(max)'), ' ', ', ') AS [columns] ,
          
plans.query_text ,
          
plans.o_name,
          
plans.execution_plan ,
          
plans.last_execution_time ,
          
plans.execution_count ,
          
plans.total_worker_time ,
          
plans.total_physical_reads,
          
plans.total_logical_reads
  
FROM    @plans AS [plans]
          
CROSS APPLY execution_plan.nodes('//RelOp/IndexScan[@Lookup="1"]') AS keylookups ( keylookup )
)
SELECT  lookups.[database] ,
       
lookups.[schema] ,
       
lookups.[table] ,
       
lookups.[index] ,
       
lookups.[columns] ,
       
index_stats.user_lookups ,
       
index_stats.last_user_lookup ,
       
lookups.execution_count ,
       
lookups.total_worker_time ,
       
lookups.total_physical_reads ,
       
lookups.total_logical_reads,
       
lookups.last_execution_time ,
      
lookups.o_name AS [object_name],
       
lookups.query_text ,
       
lookups.execution_plan
FROM    lookups
       
INNER JOIN sys.dm_db_index_usage_stats AS [index_stats] ON lookups.database_id = index_stats.database_id
                                                             
AND lookups.[object_id] = index_stats.[object_id]
WHERE   index_stats.user_lookups > 0
       
AND lookups.[database] NOT IN ('[master]','[model]','[msdb]','[tempdb]')
ORDER BY lookups.execution_count DESC
--ORDER BY index_stats.user_lookups DESC
--ORDER BY lookups.total_logical_reads DESC

Thursday, July 22, 2010

SQLRally Logo Contest - Fame & Fortune Up For Grabs!

Now that the PASS SQLRally (formerly known as the spring event) has a name it's time to start building a brand…and you have a chance to help! We're running a logo design contest and you're invited to enter. At stake is fame, a $310 prize, and consideration for bidding on the SQLRally website design if you're picked as the winner. Remember we're targeting the cost of the SQLRally at $299 so if you win you could cover the cost of admission.

The contest is open now and runs until 1 PM EDT, July 29, though we reserve the right to extend it depending on the quality and number of entries we receive. So what are we looking for?

  • Logo to be used on website (we will be creating a new event site, ideally with the logo designer), conference name badges, simple onsite signage, possibly on bag or tshirt
  • Consideration to racetrack/car rally imagery
  • Require both one color (B&W) and color versions
  • Future requirements include website design. Preference will be given to the logo designer to bid on future design extensions.
  • Should give some consideration to the existing PASS organization color palette/font however not at all restricted to using existing:
    • Black and White PASS logo used primarily in print applications
    • Grey: 70% Black
    • PASS Red: Pantone 1795 C, CMYK 0, 94, 100, 0
    • PASS Blue: Pantone 294 C, CMYK 100, 58, 0, 21
    • Light Blue: CMYK, 100, 30, 0, 0
    • Font: Helvetica Neue

Think you've got the winning design? Head on over to 99designs to check out the full design brief and submit your entry!

Wednesday, July 21, 2010

Getting Started In Blogging And Technical Speaking - At The 2010 PASS Summit

In case you missed it the Program Session (60 min. long) selections are now available on the 2010 PASS Summit website. I've already mentioned that I've been selected for a Spotlight session (90 min.) on transactional replication and now I'm equally excited to be picked for a Program session. Here's the details:

Getting Started In Blogging And Technical Speaking

Are you thinking of starting a blog? Or are you interested in presenting at events like SQL Saturday and the PASS Summit but not sure how to get in the game? Don’t let uncertainty keep you from contributing – the SQL Server community needs you! This interactive session will explore reasons for blogging and speaking and offer advice on topic selection, improving writing and speaking skills, seeking out places to write and speak, and things to (and not to) do to once you get started.

Goals:

  • Grow the writer\speaker pool by encouraging would be bloggers & presenters to take the leap from being bystanders to participants in the SQL Server technical community.

Registration for the Summit is currently at $1,595 until September 30, then goes up to $1,795. I strongly encourage you to consider going, especially if you've never been before. The opportunity to meet peers & leaders in our industry and learn new skills is worth the price. If you're on the fence take a moment to read about why you'll benefit from going.

PASSSummitBanner950x75[1]

Wednesday, July 14, 2010

Introducing The PASS SQLRally

If you're a PASS member you may have noticed in this morning's Community Connector that the 2011 spring event I talked about recently has a name: The PASS SQLRally. We've also nailed down the dates and location - May 11-13 at the Marriott World Center in Orlando, FL.

Why am I saying "we"? Since my last post about the SQLRally I've joined the team helping to plan the event. Throughout the planning process I'll be providing updates on what we're doing - call it a behind the scenes look at what we're up to. I'll start with….

How We Decided On The Name
I mentioned that when Andy, Jack, and I put together a proposal to host the event in Orlando that we struggled with what to call it. We never found the right name so we just called it the "spring event" and figured we'd just worry about getting PASS to approve it first. They did, and fortunately for us we now had the power of the PASS marketing team to help figure out what to call it.

Let's all recognize the proverbial elephant in the room on this one: coming up with a good name is hard. There are a lot of things to consider:

  • Does it convey what the event is about?
  • Does it sound exciting\interesting?
  • Is it original? (think avoiding copyright\trademark violation)
  • Will it work as a Twitter hashtag?
  • Is it conducive to SEO?
  • Does it have "buzz"?

The list goes on but you get the point. We started with a list of around 150 variations on names and eventually narrowed it down to 3. After much debate we decided on SQLRally because it met all of the criteria above and lends itself really well to doing some fun branding. Looking at the definition of rally and you'll see it has some very appropriate meanings:

  • Beat up: gather; "drum up support"
  • A large gathering of people intended to arouse enthusiasm
  • An automobile race run over public roads

We're certainly all enthusiastic about SQL Server and the double meaning of a road race gives us the chance to have a little fun. Imagine speakers wearing checkered flag shirts and messaging like "rev up performance". Even topic centric "tracks" plays nicely into the rally theme. It wasn't easy, but in the end I think we've come up with a great name!

Now We Need A Logo
Now it's time to start building a brand around it all. The next thing we're going to tackle is coming up with a good logo that visually represents what this event is supposed to be about. Here's what I think is the cool part - you get to help. We're going to hold a contest to design the logo. Details will be announced soon so pay attention to the Connector and this blog!