Tuesday, 26 February 2013

Cost Threshold for Parallelism


Parallelism, where to start?  Well let’s start right at the beginning shall we with the question, what is parallelism?  Simply put it is a feature of SQL Server which allows queries to utilize more threads in order to complete quicker.  There are several ways to influence SQL Servers decision whether to use parallelism or not and if so how “much” which is referred to as the degree of parallelism.

Cost Threshold For Parallelism
This is a server wide setting which indicates what the minimum query cost has to be before Parallelism is considered by the optimizer (yes UK readers I spelt optimizer with a Z not an S).

Maximum Degree Of Parallelism
This is a server wide setting which indicates the maximum number of processors a query run in parallel can use.

OPTION (MAXDOP X)
This is a query hint which contrary to some beliefs will override the above two settings by reducing the degree of parallelism for queries greater than the cost threshold, but will not make a too small query run in parallel (Thanks to Brent Ozar for pointing this out).

So what settings should you be using, well as always “it depends”, sorry but this really isn’t a one size fits all solution.  Servers will have different workloads, number of cores, some will have hard NUMA some will have Soft NUMA some older servers won’t have NUMA at all.  What you really need to do is understand the impact parallelism has on your environment, today I will look specifically at the sp_configure option ‘cost threshold for parallelism‘.

SQL Server exposes an increasing wealth of information via DMV’s that allow us to make more informed decisions; one idea that I loved was using these DMV’s specifically sys.dm_exec_cached_plans and sys.dm_exec_query_plan to return information from the Plan Cache for queries run in parallel. This idea I took from Jonathan Kehayias and adapted to group the exposed information by StatementSubTreeCost.  This will give you the spread of query costs which you can use to make an educated decision as to if the default cost threshold for parallelism value of 5 is in fact the best for your environment.

/*
      -----------------------------------------------------------------
      Cost Threshold For Parallelism
      -----------------------------------------------------------------
   
      For more SQL resources, check out SQLServer365.blogspot.com

      -----------------------------------------------------------------

      You may alter this code for your own purposes.
      You may republish altered code as long as you give due credit.
      You must obtain prior permission before blogging this code.

      THIS CODE AND INFORMATION ARE PROVIDED "AS IS"
    
      -----------------------------------------------------------------
*/
-- Set Database Context
USE master;
GO
-- Create table
IF NOT EXISTS ( SELECT  1
                FROM    sys.objects
                WHERE   [object_id] = OBJECT_ID('dbo.PlanCacheForMaxDop')
                        AND [type] = 'U' )
    CREATE TABLE master.dbo.PlanCacheForMaxDop
        (
          CompleteQueryPlan XML ,
          StatementText VARCHAR(4000) ,
          StatementOptimizationLevel VARCHAR(25) ,
          StatementSubTreeCost FLOAT ,
          ParallelSubTreeXML XML ,
          UseCounts INT ,
          PlanSizeInBytes INT
        );
ELSE
      -- If table exists truncate it before population
    TRUNCATE TABLE  master.dbo.PlanCacheForMaxDop;     
GO   

-- Collect parallel plan information
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
WITH XMLNAMESPACES  
(DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan') 
INSERT INTO master.dbo.PlanCacheForMaxDop
SELECT 
     query_plan AS CompleteQueryPlan,
     n.value('(@StatementText)[1]', 'VARCHAR(4000)') AS StatementText,
     n.value('(@StatementOptmLevel)[1]', 'VARCHAR(25)') AS StatementOptimizationLevel,
     n.value('(@StatementSubTreeCost)[1]', 'VARCHAR(128)') AS StatementSubTreeCost,
     n.query('.') AS ParallelSubTreeXML, 
     ecp.usecounts,
     ecp.size_in_bytes 
FROM sys.dm_exec_cached_plans AS ecp
CROSS APPLY sys.dm_exec_query_plan(plan_handle) AS eqp
CROSS APPLY query_plan.nodes('/ShowPlanXML/BatchSequence/Batch/Statements/StmtSimple') AS qn(n)
WHERE  n.query('.').exist('//RelOp[@PhysicalOp="Parallelism"]') = 1;
GO

-- Return parallel plan information
SELECT  CompleteQueryPlan ,
        StatementText ,
        StatementOptimizationLevel ,
        StatementSubTreeCost ,
        ParallelSubTreeXML ,
        UseCounts ,
        PlanSizeInBytes
FROM    master.dbo.PlanCacheForMaxDop;
GO

-- Return grouped parallel plan information
SELECT  MAX(CASE WHEN StatementSubTreeCost BETWEEN 1 AND 5 THEN '1-5'
                 WHEN StatementSubTreeCost BETWEEN 5 AND 6 THEN '5-6'
                 WHEN StatementSubTreeCost BETWEEN 6 AND 7 THEN '6-7'
                 WHEN StatementSubTreeCost BETWEEN 7 AND 8 THEN '7-8'
                 WHEN StatementSubTreeCost BETWEEN 8 AND 9 THEN '8-9'
                 WHEN StatementSubTreeCost BETWEEN 9 AND 10 THEN '9-10'
                 WHEN StatementSubTreeCost BETWEEN 10 AND 11 THEN '10-11'
                 WHEN StatementSubTreeCost BETWEEN 11 AND 12 THEN '11-12'
                 WHEN StatementSubTreeCost BETWEEN 12 AND 13 THEN '12-13'
                 WHEN StatementSubTreeCost BETWEEN 13 AND 14 THEN '13-14'
                 WHEN StatementSubTreeCost BETWEEN 14 AND 15 THEN '14-15'
                 WHEN StatementSubTreeCost BETWEEN 15 AND 16 THEN '15-16'
                 WHEN StatementSubTreeCost BETWEEN 16 AND 17 THEN '16-17'
                 WHEN StatementSubTreeCost BETWEEN 17 AND 18 THEN '17-18'
                 WHEN StatementSubTreeCost BETWEEN 18 AND 19 THEN '18-19'
                 WHEN StatementSubTreeCost BETWEEN 19 AND 20 THEN '19-20'
                 WHEN StatementSubTreeCost BETWEEN 20 AND 25 THEN '20-25'
                 WHEN StatementSubTreeCost BETWEEN 25 AND 30 THEN '25-30'
                 WHEN StatementSubTreeCost BETWEEN 30 AND 35 THEN '30-35'
                 WHEN StatementSubTreeCost BETWEEN 35 AND 40 THEN '35-40'
                 WHEN StatementSubTreeCost BETWEEN 40 AND 45 THEN '40-45'
                 WHEN StatementSubTreeCost BETWEEN 45 AND 50 THEN '45-50'
                 WHEN StatementSubTreeCost > 50 THEN '>50'
                 ELSE CAST(StatementSubTreeCost AS VARCHAR(100))
            END) AS StatementSubTreeCost ,
        COUNT(*) AS countInstance
FROM    master.dbo.PlanCacheForMaxDop
GROUP BY CASE WHEN StatementSubTreeCost BETWEEN 1 AND 5 THEN 2.5
              WHEN StatementSubTreeCost BETWEEN 5 AND 6 THEN 5.5
              WHEN StatementSubTreeCost BETWEEN 6 AND 7 THEN 6.5
              WHEN StatementSubTreeCost BETWEEN 7 AND 8 THEN 7.5
              WHEN StatementSubTreeCost BETWEEN 8 AND 9 THEN 8.5
              WHEN StatementSubTreeCost BETWEEN 9 AND 10 THEN 9.5
              WHEN StatementSubTreeCost BETWEEN 10 AND 11 THEN 10.5
              WHEN StatementSubTreeCost BETWEEN 11 AND 12 THEN 11.5
              WHEN StatementSubTreeCost BETWEEN 12 AND 13 THEN 12.5
              WHEN StatementSubTreeCost BETWEEN 13 AND 14 THEN 13.5
              WHEN StatementSubTreeCost BETWEEN 14 AND 15 THEN 14.5
              WHEN StatementSubTreeCost BETWEEN 15 AND 16 THEN 15.5
              WHEN StatementSubTreeCost BETWEEN 16 AND 17 THEN 16.5
              WHEN StatementSubTreeCost BETWEEN 17 AND 18 THEN 17.5
              WHEN StatementSubTreeCost BETWEEN 18 AND 19 THEN 18.5
              WHEN StatementSubTreeCost BETWEEN 19 AND 20 THEN 19.5
              WHEN StatementSubTreeCost BETWEEN 10 AND 15 THEN 12.5
              WHEN StatementSubTreeCost BETWEEN 15 AND 20 THEN 17.5
              WHEN StatementSubTreeCost BETWEEN 20 AND 25 THEN 22.5
              WHEN StatementSubTreeCost BETWEEN 25 AND 30 THEN 27.5
              WHEN StatementSubTreeCost BETWEEN 30 AND 35 THEN 32.5
              WHEN StatementSubTreeCost BETWEEN 35 AND 40 THEN 37.5
              WHEN StatementSubTreeCost BETWEEN 40 AND 45 THEN 42.5
              WHEN StatementSubTreeCost BETWEEN 45 AND 50 THEN 47.5
              WHEN StatementSubTreeCost > 50 THEN 100
              ELSE StatementSubTreeCost
         END;
GO

Enjoy!

Chris         

Sunday, 17 February 2013

Fun with dates

Every year me and a group of friends attend the Royal International Air Tattoo (RIAT) at RAF Fairford, for us it is a long weekend where we camp close to the airfield and get to take in three full days (if you include Friday practice  of utterly breath-taking aerobatic displays from some of the greatest military aircraft in the world.  We have a running joke during the year long wait between events that once we get Christmas out of the way RIAT will be here before we know it.  


We all use dates as DBA’s in fact we use them all the time, every single day even without realising it.  I often use T-SQL to show me the difference between dates, but this particular script I like because it gives me a countdown to RIAT, so on a bad day or even on a good day for that matter I can see just how long, or little time is left until we set off once again J

Here is a script I use for fun;

/*
      -----------------------------------------------------------------
      Fun with dates
      -----------------------------------------------------------------
    
      For more SQL resources, check out SQLServer365.blogspot.com

      -----------------------------------------------------------------

      You may alter this code for your own purposes.
      You may republish altered code as long as you give due credit.
      You must obtain prior permission before blogging this code.

      THIS CODE AND INFORMATION ARE PROVIDED "AS IS"
    
      -----------------------------------------------------------------
*/
-- Set database context
USE master;
GO

-- Declare variables
DECLARE @startTime DATETIME
DECLARE @endTime DATETIME
DECLARE @Seconds INT

-- Set date range
SET @startTime = GETDATE()
SET @endTime = '20130719 04:30'

-- Get the difference in seconds
SELECT  @Seconds = DATEDIFF(ss, @startTime, @endTime);

-- Return how long left                                                       
SELECT  'T Minus ' + CONVERT(VARCHAR(10), ( @Seconds / 86400 )) + ' Days '
        + CONVERT(VARCHAR(10), ( ( @Seconds % 86400 ) / 3600 )) + ' Hours '
        + CONVERT(VARCHAR(10), ( ( ( @Seconds % 86400 ) % 3600 ) / 60 ))
        + ' Minutes ' + CONVERT(VARCHAR(10), ( ( ( @Seconds % 86400 ) % 3600 )
                                               % 60 )) + ' Seconds and counting!';
GO

Enjoy!

Chris

Friday, 15 February 2013

T-SQL Gotcha

I came across an interesting issue a few days ago that I thought I would blog about.  The issue is replacing NUL characters from values in SQL Server, more specifically using REPLACE().  Take the below TSQL which runs but will never complete;

DECLARE @var VARCHAR(MAX);
SET @var = 'SQL Server 365';
SELECT  REPLACE(@var, CHAR(0), '');
GO

NOTE – Do not run this on production it will never complete and will ramp up CPU!

Why?  Well, this is due to the fact that CHAR(0) (or 0x0000 in ASCII) is an undefined character in Windows collations and all undefined characters are ignored during comparison, sort, and pattern matching so the query effectively gets stuck in an infinite loop!

If we COLLATE the string to a SQL Collation where CHAR(0) is a defined character the query returns as expected, as below;

DECLARE @var VARCHAR(MAX);
SET @var = 'SQL Server 365';
SELECT  REPLACE(@var COLLATE SQL_Latin1_General_CP1_CI_AS, CHAR(0), '');
GO

Enjoy!

Chris

Thursday, 14 February 2013

Script Replication with PowerShell

I promised you some PowerShell scripts this year and this post is the first one.  I can see the value PowerShell can bring, but find it equally frustrating.  I don’t know the syntax very well or all the command lets and personally for 99.9% of my daily tasks I can achieve the desired results in T-SQL.  I would say that PowerShell has much more of an appeal to Domain Administrators, specifically those who manage large domains with tens or hundreds of thousands of objects.  Having said that, I have stuck with PowerShell and do try to have a dabble when the opportunity arises.

The script in this post will script your replication topology to a file for use in the event of DR.  Why? Well, having a solid DR plan is by no means an easy task, and without testing your DR plan I can say with some certainty that there will be something you have missed, whether it is a login, firewall port or out of date replication topology, that is where this post comes in.  I am a massive fan of replication, having used it in every single one of my DBA positions to achieve a multitude of requirements but that is not to say it hasn’t caused me any problems, I have spent many a late night reinitialising subscriptions and troubleshooting seemingly unknown problems.

Replication is one area that is often overlooked in a DR scenario, do you have up to date scripts to recreate your replication topology?  No?  You’re not telling me that you are going to recreate all those publications using the GUI during DR are you?  All those filtered articles, all those keep existing object unchanged settings and not to forget the articles with do not replicate delete statements set.

The PowerShell script is below, simply save it to a file called ScriptReplication.ps1;

#Load command-line parameters - if they exist
param ([string]$sqlserver, [string]$filename)

#Reference RMO Assembly
[reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Replication") | out-null
[reflection.assembly]::LoadWithPartialName("Microsoft.SqlServer.Rmo") | out-null

function errorhandler([string]$errormsg)
{
    writetofile ("-- Replication Script Generator run at: " + (date)) $filename 1
    writetofile ("-- [Replication Script ERROR] " + $errormsg) $filename 0
}

function writetofile([string]$text, [string]$myfilename, [int]$cr_prefix)
{
    if ($cr_prefix -eq 1) { "" >> $myfilename }
    $text >> $myfilename
}

function initializefile([string]$myfilename)
{
    "" > $myfilename
}

trap {errorhandler($_); Break}

#Deal with absent parameters
[string] $hostname=hostname
if ($sqlserver -eq "") {$sqlserver = read-host -prompt "Please enter the server name or leave blank for Hostname"}
if ($filename -eq "")  {$filename = read-host -prompt "Please enter the file name (eg 'c:\ReplicationBackupScript.sql')..."}
if ($sqlserver -eq "")   {$sqlserver = $hostname}
if ($filename -eq "")   {$filename = "c:\ReplicationBackupScript.sql"}

# Clear file contents
if (Test-Path  ($filename)) {Clear-Content $filename}

$repsvr=New-Object "Microsoft.SqlServer.Replication.ReplicationServer" $sqlserver

initializefile $filename

# if we don't have any replicated databases then there's no point in carrying on
if ($repsvr.ReplicationDatabases.Count -eq 0)
{
    writetofile ("-- Replication Script Generator run at: " + (date)) $filename 0
    writetofile "-- ZERO replicated databases on $sqlserver!!!" $filename 1
    EXIT
}

# similarly, if we don't have any publications then there's no point in carrying on
[int] $Count_Tran_Pub = 0
[int] $Count_Merge_Pub = 0

foreach($replicateddatabase in $repsvr.ReplicationDatabases)
{
        $Count_Tran_Pub = $Count_Tran_Pub + $replicateddatabase.TransPublications.Count
        $Count_Merge_Pub = $Count_Merge_Pub + $replicateddatabase.MergePublications.Count
}

if (($Count_Tran_Pub + $Count_Merge_Pub) -eq 0)
{
    writetofile ("-- Replication Script Generator run at: " + (date)) $filename 0
    writetofile "-- ZERO Publications on $sqlserver!!!" $filename 1
    EXIT
}

# if we got this far we know that there are some publications so we'll script them out
# the $scriptargs controls exactly what the script contains
# for a full list of the $scriptargs see the end of this script
$scriptargs = [Microsoft.SqlServer.Replication.scriptoptions]::Creation `
-bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeArticles `
-bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludePublisherSideSubscriptions `
-bor  [Microsoft.SqlServer.Replication.scriptoptions]::IncludeSubscriberSideSubscriptions

writetofile ("-- Replication Script Generator run at: " + (date)) $filename 0
writetofile "-- PUBLICATIONS ON $sqlserver" $filename 1
writetofile "-- TRANSACTIONAL PUBLICATIONS ($Count_Tran_Pub)" $filename 1

foreach($replicateddatabase in $repsvr.ReplicationDatabases)
{
    if ($replicateddatabase.TransPublications.Count -gt 0)
    {
        foreach($tranpub in $replicateddatabase.TransPublications)
        {
            writetofile "/********************************************************************************" $filename 0
            writetofile ("***** Writing to file script for publication: " + $tranpub.Name) $filename 0
            writetofile "********************************************************************************/" $filename 0
            [string] $myscript=$tranpub.script($scriptargs
            writetofile $myscript $filename 0
        }
    }
}

writetofile "-- MERGE PUBLICATIONS ($Count_Merge_Pub)" $filename 1

foreach($replicateddatabase in $repsvr.ReplicationDatabases)
{
    if ($replicateddatabase.MergePublications.Count -gt 0)
    {
        foreach($mergepub in $replicateddatabase.MergePublications)
        {
            writetofile "/********************************************************************************" $filename 0
            writetofile ("***** Writing to file script for publication: " + $mergepub.Name) $filename 0
            writetofile "********************************************************************************/" $filename 0
            [string] $myscript=$mergepub.script($scriptargs
            writetofile $myscript $filename 0
        }
    }
}

You can then copy this to a Server you want to script your replication topology from and create a SQL Agent job to execute the script.  I have just the script below, It will execute the PowerShell script and email an operator called DBA in the event of a failure.  You will need to;

·         Update the operator
·         Update ServerNameHere with the name of the server you are scripting replication from

NOTE - If this is a named instance then this is simply in the form of ServerName\InstanceName

·         Update the schedule accordingly
·         Update the path to the PowerShell script accordingly
·         Update the path to the CreateReplication.sql file accordingly

Note - I run this at 06:00 every Monday Wednesday and Friday, I also run this manually if a change to any of the publications has changed.  I also have a synchronisation process which copies this over to the DR Server.

USE [msdb]
GO

/****** Object:  Job [Maintenance - ScriptReplication]    Script Date: 02/14/2013 13:39:27 ******/
BEGIN TRANSACTION
DECLARE @ReturnCode INT
SELECT @ReturnCode = 0
/****** Object:  JobCategory [[Uncategorized (Local)]]]    Script Date: 02/14/2013 13:39:27 ******/
IF NOT EXISTS (SELECT name FROM msdb.dbo.syscategories WHERE name=N'[Uncategorized (Local)]' AND category_class=1)
BEGIN
EXEC @ReturnCode = msdb.dbo.sp_add_category @class=N'JOB', @type=N'LOCAL', @name=N'[Uncategorized (Local)]'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback

END

DECLARE @jobId BINARY(16)
EXEC @ReturnCode =  msdb.dbo.sp_add_job @job_name=N'Maintenance - ScriptReplication',
            @enabled=1,
            @notify_level_eventlog=0,
            @notify_level_email=2,
            @notify_level_netsend=0,
            @notify_level_page=0,
            @delete_level=0,
            @description=N'This job will script the replication topology to G:\Powershell\ScriptReplication\CreateReplication.sql for use in DR',
            @category_name=N'[Uncategorized (Local)]',
            @owner_login_name=N'sa',
            @notify_email_operator_name=N'DBA', @job_id = @jobId OUTPUT
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
/****** Object:  Step [Script Replication]    Script Date: 02/14/2013 13:39:28 ******/
EXEC @ReturnCode = msdb.dbo.sp_add_jobstep @job_id=@jobId, @step_name=N'Script Replication',
            @step_id=1,
            @cmdexec_success_code=0,
            @on_success_action=1,
            @on_success_step_id=0,
            @on_fail_action=2,
            @on_fail_step_id=0,
            @retry_attempts=0,
            @retry_interval=0,
            @os_run_priority=0, @subsystem=N'PowerShell',
            @command=N'G:\Powershell\ScriptReplication\ScriptReplication.ps1 -sqlserver ''ServerNameHere'' -FileName ''G:\Powershell\ScriptReplication\CreateReplication.sql''',
            @database_name=N'master',
            @flags=40
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_update_job @job_id = @jobId, @start_step_id = 1
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobschedule @job_id=@jobId, @name=N'MWF - 06:00',
            @enabled=1,
            @freq_type=8,
            @freq_interval=42,
            @freq_subday_type=1,
            @freq_subday_interval=0,
            @freq_relative_interval=0,
            @freq_recurrence_factor=1,
            @active_start_date=20121024,
            @active_end_date=99991231,
            @active_start_time=60000,
            @active_end_time=235959
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
EXEC @ReturnCode = msdb.dbo.sp_add_jobserver @job_id = @jobId, @server_name = N'(local)'
IF (@@ERROR <> 0 OR @ReturnCode <> 0) GOTO QuitWithRollback
COMMIT TRANSACTION
GOTO EndSave
QuitWithRollback:
    IF (@@TRANCOUNT > 0) ROLLBACK TRANSACTION
EndSave:

GO

Enjoy!

Chris