Monday, 13 August 2012

High Availability Lingo


You have got to love the way the SQL Server team change the lingo in all of their high availability (HA) / disaster recovery (DR) features.  To a novice and even to more seasoned DBA’s it can be confusing, this post will not go into the nitty gritty of how each of these features work but just simply explain the terms commonly used.  I will cover the most common terms used with;
  • Log Shipping
  • Replication
  • Database Mirroring
  • AlwaysOn Availability Groups

  
Log Shipping

Primary Server
The primary server in a log shipping configuration is the instance of the SQL Server Database Engine that is your production server.

Primary Database
                The primary database is the database on the primary server that you want to back up to another server.

Secondary Server
                The secondary server in a log shipping configuration is the server where you want to keep a warm standby copy of your primary database.

Secondary Database
                The secondary database is the database on the secondary server that exists as a warm standby of the primary database.

Monitor Server
                The optional monitor server tracks all of the details of log shipping.

Backup Job
                A backup job is created on the primary server instance for each primary database. It performs the backup operation, logs history to the local server and the monitor server, and deletes old backup files and history information.

Copy Job
A copy job is created on each secondary server instance in a log shipping configuration. This job copies the backup files from the primary server to a configurable destination on the secondary server and logs history on the secondary server and the monitor server.

Restore Job
A restore job is created on the secondary server instance for each log shipping configuration. This job restores the copied backup files to the secondary databases.

Alert Job
                If a monitor server is used, an alert job is created on the monitor server instance. This alert job is shared by the primary and secondary databases of all log shipping configurations using this monitor server instance.

Replication

Publisher
The Publisher is a database instance that makes data available to other locations through replication.

Publication
                A publication is a collection of one or more articles from one database.

Article
                An article identifies a database object that is included in a publication.

Distributor
                The Distributor is a database instance that acts as a store for replication specific data associated with one or more Publishers. Each Publisher is associated with a single database (known as a distribution database) at the Distributor.

Subscriber
                A Subscriber is a database instance that receives replicated data.

Subscription
                A subscription is a request for a copy of a publication to be delivered to a Subscriber.

Transactional Replication
                Transactional replication typically starts with a snapshot of the publication database objects and data. As soon as the initial snapshot is taken, subsequent data changes and schema modifications made at the Publisher are usually delivered to the Subscriber as they occur (in near real time). The data changes are applied to the Subscriber in the same order and within the same transaction boundaries as they occurred at the Publisher; therefore, within a publication, transactional consistency is guaranteed.

Merge Replication
                Merge replication, like transactional replication, typically starts with a snapshot of the publication database objects and data. Subsequent data changes and schema modifications made at the Publisher and Subscribers are tracked with triggers. The Subscriber synchronizes with the Publisher when connected to the network and exchanges all rows that have changed between the Publisher and Subscriber since the last time synchronization occurred.

Snapshot Replication
                Snapshot replication distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data. When synchronization occurs, the entire snapshot is generated and sent to Subscribers.
Peer to Peer Replication
                Peer-to-peer transactional replication lets you insert, update, or delete data at any node in a topology and have data changes propagated to the other nodes. Because you can change data at any node, data changes at different nodes could conflict with each other. If a row is modified at more than one node, it can cause a conflict or even a lost update when the row is propagated to other nodes.
Snapshot Agent
                The Snapshot Agent is typically used with all types of replication. It prepares schema and initial data files of published tables and other objects, stores the snapshot files, and records information about synchronization in the distribution database. The Snapshot Agent runs at the Distributor.

Log Reader Agent
                The Log Reader Agent is used with transactional replication. It moves transactions marked for replication from the transaction log on the Publisher to the distribution database. Each database published using transactional replication has its own Log Reader Agent that runs on the Distributor and connects to the Publisher.

Distribution Agent
                The Distribution Agent is used with snapshot replication and transactional replication. It applies the initial snapshot to the Subscriber and moves transactions held in the distribution database to Subscribers.

Merge Agent
                The Merge Agent is used with merge replication. It applies the initial snapshot to the Subscriber and moves and reconciles incremental data changes that occur. Each merge subscription has its own Merge Agent that connects to both the Publisher and the Subscriber and updates both.

Queue Reader Agent
                The Queue Reader Agent is used with transactional replication with the queued updating option. The agent runs at the Distributor and moves changes made at the Subscriber back to the Publisher. Unlike the Distribution Agent and the Merge Agent, only one instance of the Queue Reader Agent exists to service all Publishers and publications for a given distribution database.


Database Mirroring

Principal Server
The Principal Server instance serves the database to clients.

Mirror Server
                The Mirror Server instance acts as a hot or warm standby server.

Witness
                High-safety mode with automatic failover requires a third server instance, known as a witness. Unlike the two partners, the witness does not serve the database. The witness supports automatic failover by verifying whether the principal server is up and functioning.

Hot Standby
                Hot Standby is the term used when a database mirroring session is synchronised database mirroring provides a hot standby server that supports rapid failover without a loss of data from committed transactions.

Warm Standby
                Warm Standby is the term used when a database mirroring session is not synchronized; the mirror server is typically available as a warm standby server (with possible data loss).

Operating Modes
There are two mirroring operating modes. One of them, high-safety mode supports synchronous operation.  The second operating mode, high-performance mode, runs asynchronously.

High-Safety
                Under high-safety mode, when a session starts, the mirror server synchronizes the mirror database together with the principal database as quickly as possible.

High-Performance
                Under high-performance mode, the mirror server tries to keep up with the log records sent by the principal server. The mirror database might lag somewhat behind the principal database. However, typically, the gap between the databases is small.

Synchronous
                Under synchronous operation, a transaction is committed on both partners, but at the cost of increased transaction latency.

Asynchronous
Under asynchronous operation, the transactions commit without waiting for the mirror server to write the log to disk, which maximizes performance.

Transaction Safety
                If the SAFETY option is set to FULL, database mirroring operation is synchronous, after the initial synchronizing phase. If a witness is set in high-safety mode, the session supports automatic failover.  If the SAFETY option is set to OFF, database mirroring operation is asynchronous. The session runs in high-performance mode, and the WITNESS option should also be OFF.

Role Switching
                Within the context of a database mirroring session, the principal and mirror roles are typically interchangeable in a process known as role switching. Role switching involves transferring the principal role to the mirror server.

AlwaysOn Availability Groups

Availability Group
A container for a set of databases, availability databases, that fail over together.

Availability Database
A database that belongs to an availability group. For each availability database, the availability group maintains a single read-write copy (the primary database) and one to four read-only copies (secondary databases).

Primary Database
The read-write copy of an availability database.

Secondary Database
A read-only copy of an availability database.

Availability Replica
An instantiation of an availability group that is hosted by a specific instance of SQL Server and maintains a local copy of each availability database that belongs to the availability group. Two types of availability replicas exist: a single primary replica and one to four secondary replicas.

Primary Replica
The availability replica that makes the primary databases available for read-write connections from clients and, also, sends transaction log records for each primary database to every secondary replica.

Secondary Replica
An availability replica that maintains a secondary copy of each availability database, and serves as a potential failover targets for the availability group. Optionally, a secondary replica can support read-only access to secondary databases can support creating backups on secondary databases.

Availability Group Listener
A server name to which clients can connect in order to access a database in a primary or secondary replica of an AlwaysOn availability group. Availability group listeners direct incoming connections to the primary replica or to a read-only secondary replica.

Availability Modes
                In AlwaysOn Availability Groups, the availability mode is a replica property that determines whether a given availability replica can run in synchronous-commit mode. For each availability replica, the availability mode must be configured for either synchronous-commit mode or asynchronous-commit mode.

Asynchronous-commit
Is a disaster-recovery solution that works well when the availability replicas are distributed over considerable distances.

Synchronous-commit
Emphasizes high availability over performance, at the cost of increased transaction latency.

Automatic failover
A failover that occurs automatically on the loss of the primary replica. Automatic failover is supported only when the current primary and one secondary replica are both configured with failover mode set to AUTOMATIC and the secondary replica currently synchronized. If the failover mode of either the primary or secondary replica is MANUAL, automatic failover cannot occur.

Planned manual failover (without data loss)
Planned manual failover, or manual failover, is a failover that is initiated by a database administrator, typically, for administrative purposes. A planned manual failover is supported only if both the primary replica and secondary replica are configured for synchronous-commit mode and the secondary replica is currently synchronized (in the SYNCHRONIZED state). When the target secondary replica is synchronized, manual failover (without data loss) is possible even if the primary replica has crashed because the secondary databases are ready for failover. A database administrator manually initiates a manual failover.

Forced manual failover (with possible data loss)
A failover that can be initiated by a database administrator when a planned manual failover is not possible, because either no secondary replica is SYNCHRONIZED with the primary replica (that is, no secondary replica is ready for failover) or the primary replica is not running. Forced manual failover, or forced failover, risks possible data loss and is recommended strictly for disaster recovery. This is the only form of failover supported by in asynchronous-commit availability mode.

Automatic failover set
Within a given availability group, a pair of availability replicas (including the current primary replica) that are configured for synchronous-commit mode with automatic failover, if any. An automatic failover set takes effect only if the secondary replica is currently SYNCHRONIZED with the primary replica.

Synchronous-commit failover set
Within a given availability group, a set of two or three availability replicas (including the current primary replica) that are configured for synchronous-commit mode, if any. A synchronous-commit failover set takes effect only if the secondary replicas are configured for manual failover mode and at least one secondary replica is currently SYNCHRONIZED with the primary replica.

Entire failover set
Within a given availability group, the set of all availability replicas whose operational state is currently ONLINE, regardless of availability mode and of failover mode. The entire failover set becomes relevant when no secondary replica is currently SYNCHRONIZED with the primary replica.

Enjoy!

Chris

Wednesday, 8 August 2012

Cost Savings and Backup Compression


Let me set the scene, one of our internal IT SQL Servers which stores a whole host of performance metrics has over the last few months’ experienced tremendous growth as we have started to monitor more metrics on more servers.  A routine we have in place to collect and report on a multitude of server and database information including database sizes and growth highlighted this to me along with a significant decrease in free space on the backup volume.  We use SQL Server to back up our databases as opposed to a third party product as any benefits we may gain are outweighed by the cost.

When checking the backup routine on the server I noticed that we were not using backup compression.  Rather than allocating more expensive SAN storage to the volume I turned on backup compression, checking the server a day later I was pleased to report an 80% saving in cumulative backup size across the server, Happy Times! J

You can use the script below to calculate the backup compression percentage;

/*
      -----------------------------------------------------------------
      Calculate backup compression percentage per backup
      -----------------------------------------------------------------
    
      For more SQL resources, check out SQLServer365.blogspot.co.uk

      -----------------------------------------------------------------

      You may alter this code for your own purposes.
      You may republish altered code as long as you give due credit.
      You must obtain prior permission before blogging this code.

      THIS CODE AND INFORMATION ARE PROVIDED "AS IS"
     
      -----------------------------------------------------------------
*/
-- Set database context
USE msdb;
GO
-- Calculate compression ratio of all backups taken in the last 24 hours
SELECT (((backup_size - compressed_backup_size) / backup_size) * 100) AS CompressionPercentage, database_name, [type], backup_start_date
FROM msdb.dbo.backupset
WHERE backup_start_date > GETDATE()-1
ORDER BY backup_start_date DESC;
GO

Backup compression was introduced in SQL Server 2008 and although it may not have as many bells and whistles as some third party vendors it is still an extremely valuable addition to SQL Server and one which isn’t is used nearly as much as believe it should.  The saving alone in disk space makes it a no brainer surely?!  I mean 80% saving in storage across a single server is something we as DBA’s can ill afford to ignore, turning backup compression on, on an additional 4 servers this did however drop slightly to 71% but that is still a huge saving.  Let’s say that per 1TB of enterprise storage costs £5,000 even with a 50% saving using backup compression that is £2,500 saved that can be used elsewhere for training courses, conventions, books etc.  Add to this the fact that because a compressed backup is smaller than an uncompressed backup of the same data, compressing a backup typically requires less device I/O and therefore usually increases backup speed significantly.

The amount of compression you achieve will vary depending on the below factors;

·         The type of data.
Character data compresses more than other types of data.
·         The consistency of the data among rows on a page.
Typically, if a page contains several rows in which a field contains the same value, significant compression might occur for that value. In contrast, for a database that contains random data or that contains only one large row per page, a compressed backup would be almost as large as an uncompressed backup.
·         Whether the data is encrypted.
Encrypted data compresses significantly less than equivalent unencrypted data. If transparent data encryption is used to encrypt an entire database, compressing backups might not reduce their size by much, if at all.
·         Whether the database is compressed.
If the database is compressed, compressing backups might not reduce their size by much, if at all.

Before you blindly turn on backup compression there are a few things to be wary of though;

·         Compressed and uncompressed backups cannot co-exist in a media set.
·         Previous versions of SQL Server cannot read compressed backups.
·         NTbackups cannot share a tape with compressed SQL Server backups.
·         By default, compression significantly increases CPU usage.

As always, make sure any change you make has been thoroughly tested and any and all implications understood!

Enjoy!

Chris