SQL Server Backup Terminology Part 1: Media Sets & Backup Sets
When backing up a database or transaction log, SQL Server needs to know two basic things:
- What information the backup should contain
- Where it's going to be written
Many options exist to control the contents and behavior of a backup: whether it's full, differential, or a log, if it should be compressed, copy_only, encrypted, the list goes on and on. Most people will learn these options, understand their effects, and consider themselves to have mastered SQL Server backups. While they're not totally wrong, there's still a lot more to the story: where the backups are going. While it's incredibly simple to use the TO DISK
option and write the backup out to one or more files, there are a lot of possible controls over how and where a backup is written out to tape or disk. This post, along with a few others, will discuss those features.
Basics of Media Sets and Backup Sets
Whatever our backup contains, it needs to go somewhere. That somewhere is called a media set. The concept of a media set may be unfamiliar because their creation is handled automatically when taking a backup. A media set consists of one or more devices of a single type. Currently, that means:
- One or more files or
- One or more tapes or
- One or more Azure blobs (in SQL Server 2014 and later)
Media sets cannot contain more than one type of device, so creating a media set that utilizes both a file and a tape is not possible.
Backing up a database will create the necessary media set (if it doesn't already exist) and then write the backup to the media set. Technically the backup being written to the media set is known as a backup set.
But enough talk, let's try some examples and see how this all works:
First, let's clear out the backup history so the results of these examples are more apparent. The sp_delete_backuphistory procedure takes a date parameter and will delete all records older than that date. Passing in the current date means everything will be cleared out. As with all my code examples, this should only be executed on a test machine, and never in production environments!
1DECLARE @Oldest_Date DATETIME;
2SET @Oldest_Date = SYSDATETIME();
3EXEC msdb.dbo.sp_delete_backuphistory @Oldest_Date;
Now, let's create a user database and back it up to a single file.
1USE [tempdb];
2GO
3IF DB_ID('DemoDB') IS NOT NULL
4BEGIN
5 ALTER DATABASE [DemoDB] SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
6 DROP DATABASE [DemoDB];
7END
8GO
9CREATE DATABASE [DemoDB]
10GO
11ALTER DATABASE [DemoDB] SET RECOVERY SIMPLE
12GO
13BACKUP DATABASE [DemoDB]
14TO DISK = 'C:\Temp\DemoDB1.bak';
Creating this backup automatically created a media set. SQL Server maintains a table of media sets in the msdb database. To view it, run the following:
1SELECT * FROM msdb.dbo.backupmediaset;
Looking at the results, there's not a whole lot to a media set. It's mostly identifiers and a few properties. There's nothing in there at all about the backup itself; the media set is merely a container that holds backups. Information about the backup is stored as part of the backup set, which has its own table in msdb. The following query displays information about the backup set. Fill in the media_set_id
value from the media set query above before running it.
1SELECT b.* FROM msdb.dbo.backupset b
2INNER JOIN msdb.dbo.backupmediaset m ON b.media_set_id = m.media_set_id
3WHERE m.media_set_id = <enter media_set_id here>;
The backup set contains lots of information about the backup and state of the database when it was taken, including the backup's duration both in terms of time and the log sequence number. Much of this same information can be found in the backup header, which can be queried using the RESTORE HEADERONLY
command:
1RESTORE HEADERONLY
2FROM DISK = 'C:\Temp\DemoDB1.bak';
Multiple Backup Sets Per Media Set
A media set is effectively a container, and containers that can only hold one of something are pretty boring. To make sure media sets aren't boring, Microsoft gave them the ability to contain multiple backup sets. Now let's take another backup of the same database, and write it out to the same file as before.
1BACKUP DATABASE [DemoDB]
2TO DISK = 'C:\Temp\DemoDB1.bak';
But what happened to the first backup we took? Nothing at all, actually – it's still there. The media set (backup file) now has a second backup set containing the second backup.
This can be observed by either querying the backupset table or the backup header. Two entries will now be shown.
1SELECT b.* FROM msdb.dbo.backupset b
2INNER JOIN msdb.dbo.backupmediaset m ON b.media_set_id = m.media_set_id
3WHERE m.media_set_id = <enter media_set_id here>;
Backups of Multiple Databases Per Media Set
It's also completely possible for backup sets of different databases to be contained in the same media set. To prove this, let's add a backup of the master database to our existing media set.
1BACKUP DATABASE [master]
2TO DISK = 'C:\Temp\DemoDB1.bak';
The backup set query from above will now show the media set contains three backups: 2 of DemoDB and 1 of master.
With multiple backup sets being contained within a media set, things get a little bit trickier. It's now that much easier to lose multiple backups when only a single file needs to be deleted, lost, or damaged. Restores are also a bit more work, because if the media set contains multiple backup sets, you'll need to specify which one is being restored. This is done by using the RESTORE statement's FILE option (which adds to the confusion because in this case, "file" really refers to the backup set.)
To restore the 2nd backup taken of the DemoDB database into a new database called DemoDB_2, the syntax would be as follows:
1RESTORE DATABASE [DemoDB_2]
2FROM DISK = 'C:\Temp\DemoDB1.bak'
3WITH FILE = 2,
4MOVE 'DemoDB' TO 'C:\Temp\DemoDB_2.mdf',
5MOVE 'DemoDB_log' TO 'C:\Temp\DemoDB_2.ldf';
If you don't specify which FILE to restore, it defaults to FILE = 1
, the first backup set in the media set. The MOVE options tell SQL Server to restore the database files to a new location, rather than attempting to overwrite the existing DemoDB files (and failing).
Backup Compression
And then there's compression. If you look above to the query against msdb.dbo.backupmediaset
, you'll see that the result set contains a column called is_compressed
. SQL Server native backup compression occurs at the media set level. All backups in a media set must have the same level of compression, so the entire media set will contain backup sets that are all compressed, or all uncompressed. Try adding a compressed backup of DemoDB to our existing (uncompressed) media set and you'll be greeted with the following error.
1BACKUP DATABASE [DemoDB]
2TO DISK = 'C:\Temp\DemoDB1.bak'
3WITH COMPRESSION;
As this error leads us to believe, there are many backup options that can control the behavior of backup sets and media sets. My next post in this series will discuss and explain them.