Thursday 14 March 2013

Index Tuning

This week I've been carrying out some performance tuning of some of the more intensive procedures in one of our environments.  Using SQL Sentry Performance Advisor I highlighted four or five procedures to start with.  I was able to get a couple of quick wins which was a result, then analysing one of the execution plans I noticed a rather expensive index update.  A bit of digging using the help index scripts created by Kimberley Tripp of SQL Skills revealed the table in question had 24 indexes, one clustered index and 23 non clustered indexes, a little excessive considering the table has 29 columns don’t you think?  My first thought was that there must be some duplicate indexes, so I turned to yet more index scripts created by Kimberly Tripp, but to my surprise there were no duplicate indexes.  This got me thinking, indexes are great for reads but inherently bad for writes, there are no two ways about it, every index you create affects every insert, update and delete operation on the table.  There was no doubt in my mind that this number of indexes was impacting performance, it was crystal clear in the execution plan in the form of the red highlighted operator courtesy of SQL Sentry Plan Explorer (which is built in to Performance Advisor).


So what now?  I have identified the cause of the poorly performing procedure as being an index update, which in turn led to me finding 24 indexes on the table in question.  I can’t just blindly drop some of the indexes and hope the performance improves, what do I do?  I turn to my trusty DMV’s, when I say mine I mean mine, sure you have your DMV’s but the metadata they expose are for your environment not mine.  Right then too the DMV’s we turn, I have the below query which gets the index usage stats for all indexes on a particular table;

-- Set Database Context
USE master;
GO
-- Create table
IF NOT EXISTS ( SELECT  1
                FROM    sys.objects
                WHERE   [object_id] = OBJECT_ID('dbo.IndexUsageStats')
                        AND [type] = 'U' )
    CREATE TABLE master.dbo.IndexUsageStats
        (
          IndexName VARCHAR(128) ,
          UserReads BIGINT ,
          UserWrites BIGINT
        );
ELSE
      -- If table exists truncate it before population
    TRUNCATE TABLE  master.dbo.IndexUsageStats;     
GO

-- Set database context
USE DatabaseName;
GO
-- Set isolation level
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
DECLARE @TableName AS NVARCHAR(128) = 'TableName';

-- Insert index usage data
INSERT  INTO master.dbo.IndexUsageStats
        SELECT  i.[name] AS [index_name] ,
                ddius.[user_seeks] + ddius.[user_scans] + ddius.[user_lookups] AS [user_reads] ,
                ddius.[user_updates] AS [user_writes]
        FROM    sys.dm_db_index_usage_stats ddius
                INNER JOIN sys.indexes i ON ddius.[object_id] = i.[object_id]
                                            AND i.[index_id] = ddius.[index_id]
                INNER JOIN sys.partitions SP ON ddius.[object_id] = SP.[object_id]
                                                AND SP.[index_id] = ddius.[index_id]
                INNER JOIN sys.objects o ON ddius.[object_id] = o.[object_id]
                INNER JOIN sys.sysusers su ON o.[schema_id] = su.[UID]
        WHERE   ddius.[database_id] = DB_ID() -- current database only
                AND ddius.[object_id] = OBJECT_ID(@TableName)
        GROUP BY su.[name] ,
                o.[name] ,
                i.[name] ,
                ddius.[user_seeks] + ddius.[user_scans] + ddius.[user_lookups] ,
                ddius.[user_updates]
        ORDER BY ddius.[user_seeks] + ddius.[user_scans]
                + ddius.[user_lookups] DESC;
GO

-- Return index Usage Stats
SELECT  'Index Usage Stats' AS ResultType,
            IUS.IndexName,
            IUS.UserReads,
            IUS.UserWrites
FROM    master.dbo.IndexUsageStats AS IUS;
GO

From this I was able to identify that 21 of the 24 indexes have been used to satisfy a query signified by the UserReads column, great that’s three indexes that are being updated for every INSERT, UPDATE and DELETE statement and are being maintained by expensive index reorganise and index rebuild operations but are never read from.  I script the drop and create of these three indexes and save the script file.  But we are not done yet, now this is the really cool part.  It is thanks to another SQL Skills member that I was able to achieve what I wanted to do next, interrogate the plan cache.  I came across a post from Jonathan Kehayias which shows how to find what queries in the plan cache used a specific index.  I tweaked this slightly as I wanted to find what queries in the plan cache accessed the table and which indexes they used to satisfy the query.  Below is the modified query;

-- Set Database Context
USE master;
GO
-- Create table
IF NOT EXISTS ( SELECT  1
                FROM    sys.objects
                WHERE   [object_id] = OBJECT_ID('dbo.PlanCacheIndexes')
                        AND [type] = 'U' )
    CREATE TABLE master.dbo.PlanCacheIndexes
        (
          StatementText VARCHAR(4000) ,
          DatabaseName VARCHAR(128) ,
          SchemaName VARCHAR(128) ,
          TableName VARCHAR(128) ,
          IndexName VARCHAR(128) ,
          Indextype VARCHAR(128) ,
          QueryPlan XML ,
          UseCounts BIGINT
        );
ELSE
      -- If table exists truncate it before population
    TRUNCATE TABLE  master.dbo.PlanCacheIndexes;     
GO   

-- Set isolation level
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

-- Declare and set variable
DECLARE @TableName AS NVARCHAR(128) = 'TableName';

-- Make sure the name passed is appropriately quoted
IF ( LEFT(@TableName, 1) <> '['
     AND RIGHT(@TableName, 1) <> ']'
   )
    SET @TableName = QUOTENAME(@TableName);
-- Handle the case where the left or right was quoted manually but not the opposite side
IF LEFT(@TableName, 1) <> '['
    SET @TableName = '[' + @TableName;
IF RIGHT(@TableName, 1) <> ']'
    SET @TableName = @TableName + ']';

-- Dig into the plan cache and find all plans using Indexes
WITH XMLNAMESPACES
   (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')   
INSERT INTO master.dbo.PlanCacheIndexes
SELECT
stmt.value('(@StatementText)[1]', 'varchar(max)') AS SQL_Text,
obj.value('(@Database)[1]', 'varchar(128)') AS DatabaseName,
obj.value('(@Schema)[1]', 'varchar(128)') AS SchemaName,
obj.value('(@Table)[1]', 'varchar(128)') AS TableName,
obj.value('(@Index)[1]', 'varchar(128)') AS IndexName,
obj.value('(@IndexKind)[1]', 'varchar(128)') AS IndexKind,
qp.query_plan,
cp.usecounts
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS qp
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS batch(stmt)
CROSS APPLY stmt.nodes('.//IndexScan/Object[@Table=sql:variable("@TableName")]') AS TableName(obj)
OPTION(MAXDOP 1, RECOMPILE);
GO

-- Return plan information
SELECT  'Plan Cache Indexes' AS ResultType,
            REPLACE(REPLACE(IndexName, '[', ''), ']', '') AS IndexName ,
        IndexType ,
        COUNT(IndexName) AS IndexUsageInstances ,
        SUM(UseCounts) AS TimesIndexUsed
FROM    master.dbo.PlanCacheIndexes
GROUP BY IndexName ,
        IndexType
ORDER BY SUM(UseCounts) DESC;
GO

NOTE – As Jonathan states in his post the OPTION(MAXDOP 1, RECOMPILEis advisable as this query can be particularly intensive.

This showed me that in the plan cache there are 18 indexes that have been used to satisfy a query against the table so that gives us another 3 indexes we can potentially get rid of.  I ran the below query to show me the index usage stats for the indexes that are not in the plan cache.

-- Set Database Context
USE master;
GO
-- Return Index usage stats for indexes that are not used in the plan cache
SELECT  'Index Usage Not In Plan Cache' AS ResultType,
            IUS.IndexName,
            IUS.UserReads,
            IUS.UserWrites
FROM    master.dbo.IndexUsageStats AS IUS
WHERE   IUS.IndexName NOT IN (
        SELECT  REPLACE(REPLACE(IndexName, '[', ''), ']', '')
        FROM    master.dbo.PlanCacheIndexes );
GO

I script the drop and create of these 3 indexes again saving the script file.  What is important to point out here is that all three of these indexes have still been used to satisfy a query since the index usage stats where last reset.  You can use the below query to find out that last time the index stats where reset;

-- Set Database Context
USE master;
GO
-- How old are the index usage stats?
SELECT  'Index Stats Age' AS ResultType,
            DATEDIFF(DAY, sd.crdate, GETDATE()) AS days_history
FROM    sys.sysdatabases sd
WHERE   sd.[name] = 'tempdb';
GO

Take a note of the percentage of reads to writes, if this is low, which I expect it will be (in my case less than 0.02%) then chances are there was a one off query maybe for a report or extract.

So with this approach I identified 6 indexes, 3 that haven’t been used and 3 that have been used but do not have a query in the plan cache.  What you do next will depend on your environment, for me it meant running the drop index statements in a test environment and running a workload to prove that dropping the indexes improves the procedure and does not cause any unexpected performance issues.  Then I had to raise a change request to schedule in remove the indexes from production which I am pleased to say was approved and will be being run tonight outside of core business hours.

I am still not comfortable that the remaining 18 indexes are the best fit for our workload so have created a task for myself to review the indexes in place on this table and make improvements where possible.  I will post my results in a future post.

Enjoy!

Chris