This section covers some general strategies that MySQL administrators can use to keep their databases intact:
Take advantage of any auto-recovery capabilities that the server offers.
Have a policy of making regular database backups, so that you have something to fall back on if your databases are irretrievably lost.
Another step you can take is to schedule regular preventive maintenance. Techniques for this are discussed in the "Table Repair and Data Recovery" section later in this chapter because they use the table maintenance utilities described in that section.
Using the Server's Auto-Recovery Capabilities
One of your first lines of defense in maintaining database integrity is the MySQL server's ability to perform automatic recovery at startup time. When the server begins executing, it can perform certain types of table checking to help deal with problems resulting from an earlier server or machine crash. MySQL is designed to recover from a variety of problems, so if you do nothing more than restart the server normally, it will make the necessary corrections for you in most cases. The possible actions taken by the server include the following:
If the InnoDB table handler is enabled, it can check for a variety of problems automatically. Committed transactions that are present in the redo logs but not yet flushed to tables are rolled forward (redone). Uncommitted transactions in progress at the time of the crash are rolled back (discarded) using the undo logs. The result is to leave your InnoDB tables in a consistent state, so that their contents reflect all transactions that had been committed up to the point of the crash.
The BDB table handler, if enabled, also attempts auto-recovery based on the contents of its log files.
The MyISAM handler can perform automatic checking and repair operations. This is controlled by the server's --myisam-recover=level option, where level can be empty to disable checking or a comma-separated list of one or more of the following values: DEFAULT (same as specifying no option), BACKUP (create a backup of the table if it is changed), FORCE (force recovery even if more than a row of data will be lost), or QUICK (quick recovery). The --myisam-recover option is available as of MySQL 3.23.25.
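For instance, to have the server attempt MyISAM recovery automatically and keep a backup copy of any table it changes, you might place a line like the following in the server's option file (a sketch; choose the level values that match your own policy):

```ini
[mysqld]
myisam-recover=BACKUP,FORCE
```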
If InnoDB or BDB auto-recovery fails due to a non-recoverable problem, the server exits after writing a message to the error log. To force the server to start up anyway so that you can attempt a manual recovery procedure, see the "Recovering the InnoDB Tablespace or BDB Tables" section later in the chapter.
No automatic startup checking is available for ISAM tables. Nor is it likely there ever will be; ISAM support in MySQL is essentially frozen because MyISAM tables are preferable. I encourage you to consider converting your ISAM tables to MyISAM tables. To convert a table to MyISAM format, use an ALTER TABLE statement:
ALTER TABLE tbl_name TYPE = MYISAM;
You can also use the mysql_convert_table_format utility to convert all tables in a database with a single command. This script is written in Perl and requires that you have DBI installed. To see how to use it, invoke it with the --help option.
If you don't want to convert your ISAM tables, you can arrange to check them by invoking isamchk before the server starts up. Also, if your server is older than 3.23.25 (prior to the introduction of --myisam-recover), you can check your MyISAM tables by invoking myisamchk before the server starts up. The "Scheduling Preventive Maintenance" section later in this chapter discusses how to arrange for table checking with these utilities before the server starts up.
Backing Up and Copying Databases
It's important to back up your databases in case tables are lost or damaged. If a serious system crash occurs, you want to be able to restore your tables to the state they were in at the time of the crash with as little data loss as possible. Likewise, a user who issues an unwise DROP DATABASE or DROP TABLE will likely show up at your door requesting that you perform data recovery. Sometimes it's even the MySQL administrator who inadvertently causes the damage, for example, by trying to edit a table file directly using an editor, such as vi or emacs. This is certain to do bad things to the table!
The techniques that are used for creating backups are also useful for copying databases to another server. Most commonly, a database is transferred to a server running on another host, but you can also transfer data to another server running on the same host. You might do this if you're running a server for a new release of MySQL and want to test it with some real data from your production server.
Another use for backups is to set up a replication server, because one of the first steps in setting up a slave server is to take a snapshot of the master server at a specific point in time. The backup serves as this snapshot, and loading it into the slave server brings it up to date with respect to the master server. Thereafter, updates made on the master server are replicated to the slave server through the standard replication protocol. The procedure for setting up replication is discussed in Chapter 11, "General MySQL Administration."
The two main methods for backing up databases are to use the mysqldump program or to directly copy database files (for example, with mysqlhotcopy, cp, tar, or cpio). Each method has its own advantages and disadvantages:
mysqldump operates in cooperation with the MySQL server. Direct-copy methods involve file copy operations that are done external to the server, and you must take steps to ensure that the server does not modify the tables while you copy them. This is the same problem that occurs if you try to use file system backups to back up databases: If a database table is being updated during the file system backup, the table files that go into the backup may be in an inconsistent state momentarily and will be worthless for restoring the table later. However, whereas with file system backups you may have no control over the backup schedule, when you use direct-copy methods to copy tables, you can take steps to make sure the server leaves the tables alone.
mysqldump is slower than direct-copy techniques because the dump operation involves transferring the information over the network. Direct-copy backup methods operate at the file system level and require no network traffic.
mysqldump generates text files that are portable to other machines, even those with a different hardware architecture. Such files are therefore usable for copying databases. Files generated by direct-copy methods may or may not be portable to other machines. This depends on whether or not they correspond to tables that use a machine independent storage format. ISAM tables do not satisfy this constraint. For example, copying files from Solaris on SPARC to Solaris on SPARC will work, but copying files from Solaris on SPARC to Solaris on Intel or to Mac OS X will not work. MyISAM and InnoDB tables normally are machine independent. For those table types, directly copied files can be moved to a server running on a machine with a different hardware architecture. For further discussion of the characteristics of various table types, including whether they are portable, see Chapter 3, "MySQL SQL Syntax and Use."
Whichever backup method you choose, there are certain principles to which you should adhere to ensure the best results if you ever need to restore database contents:
Perform backups regularly. Set a schedule and stick to it.
Tell the server to perform binary update logging (see the "Maintaining Log Files" section in Chapter 11). The binary logs can help when you need to restore databases after a crash. After you use your backup files to restore the databases to the state they were in at the time of the backup, you can re-apply the changes that occurred after the backup was made by running the queries contained in the logs. This restores the tables in the databases to their state at the time the crash occurred.
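To illustrate the recovery sequence just described (a sketch; the binary log names are hypothetical and depend on your server's log settings):

```
% mysql sampdb < /archive/mysql/sampdb.2002-10-02
% mysqlbinlog binlog.001 binlog.002 | mysql
```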
Use a consistent and understandable naming scheme for your backup files. Names like backup1, backup2, and so forth are not particularly meaningful; when it comes time to perform a restore operation, you'll waste time figuring out what's in the files. You may find it useful to construct backup filenames using database names and dates—for example:
% mysqldump sampdb > /archive/mysql/sampdb.2002-10-02
% mysqldump menagerie > /archive/mysql/menagerie.2002-10-02
Expire your backup files periodically to keep them from filling your disk. One way to do this is to use the log file rotation techniques discussed in Chapter 11. You can apply the same principles to backup file expiration as well.
Back up your backup files using file system backups. If you have a complete crash that wipes out not only your data directory but also the disk drive containing your database backups, you'll be in real trouble. Back up your logs, too.
Put your backup files on a different file system than the one you use for your databases. This reduces the likelihood of filling up the file system containing the data directory as a result of generating backups. Also, if this file system is on another drive, you further reduce the extent of damage that can be caused by drive failure, because loss of any one drive cannot destroy both your data directory and your backups.
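The expiration step mentioned above can also be handled with find(1). The following sketch removes backups older than 30 days; the retention period and file names are assumptions, and a temporary directory stands in for the real archive so the example is safe to run anywhere:

```shell
# Expire backup files older than 30 days (retention and paths are assumptions).
backup_dir=$(mktemp -d)
touch -t 200001010000 "$backup_dir/sampdb.2002-01-01"  # stand-in for an old backup
touch "$backup_dir/sampdb.recent"                      # recent backup, should survive
find "$backup_dir" -type f -mtime +30 -exec rm {} \;
ls "$backup_dir"                                       # only the recent file remains
```

In a real installation, you would point find at your archive directory and run the command from cron on the same schedule as your backups.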
Using mysqldump to Back Up and Copy Databases
When you use the mysqldump program to generate database backup files, the file is written in SQL format by default, consisting of CREATE TABLE statements that create the tables being dumped and INSERT statements containing the data for the rows in the tables. To re-create the database later, you can take the mysqldump output file and use it as input to mysql to reload it into MySQL. (Note that you do not use mysqlimport to read SQL-format mysqldump output!)
You can dump an entire database into a single text file as follows:
% mysqldump sampdb > /archive/mysql/sampdb.2002-10-02
The beginning of the output file will look something like this:
-- MySQL dump 9.06
--
-- Host: localhost Database: sampdb
-- -------------------------------------------------------
-- Server version 4.0.3-beta-log
--
-- Table structure for table 'absence'
--
CREATE TABLE absence (
  student_id int(10) unsigned NOT NULL default '0',
  date date NOT NULL default '0000-00-00',
  PRIMARY KEY (student_id,date)
) TYPE=MyISAM;
--
-- Dumping data for table 'absence'
--
INSERT INTO absence VALUES (3,'2002-09-03');
INSERT INTO absence VALUES (5,'2002-09-03');
INSERT INTO absence VALUES (10,'2002-09-06');
...
The rest of the file consists of more CREATE TABLE and INSERT statements.
Backup files often are large, so you'll likely want to do what you can to make them smaller. One way to do this is to use the --opt option, which optimizes the dump process to generate a smaller file:
% mysqldump --opt sampdb > /archive/mysql/sampdb.2002-10-02
You can also compress the dump file. For example, to compress the backup as you generate it, use a command like the following:
% mysqldump --opt sampdb | gzip > /archive/mysql/sampdb.2002-10-02.gz
If you find large dump files difficult to manage, it's possible to dump the contents of individual tables by naming them after the database name on the mysqldump command line. Then mysqldump will dump just the named tables rather than all the tables in the database. This partitions the dump into smaller, more manageable files. The following example shows how to dump some of the sampdb tables into separate files:
% mysqldump --opt sampdb student score event absence > gradebook.sql
% mysqldump --opt sampdb member president > hist-league.sql
--opt is useful when you're generating backup files that are intended to be used to periodically refresh the contents of another database. That's because it automatically enables the --add-drop-table option, which tells mysqldump to precede each CREATE TABLE statement in the file with a DROP TABLE IF EXISTS statement for the same table. Then, when you take the backup file and load it into the second database, you won't get an error if the tables already exist. If you're running a second test server that's not a replication slave, you can use this technique to periodically reload it with a copy of the data from the databases on your production server.
If you want to transfer a database to another server, you may not even need to create backup files. Make sure that the database exists on the other host, and then dump the database over the network using a pipe so that mysql reads the output of mysqldump directly. For example, to copy the sampdb database from the local host to the server on boa.snake.net, do so like this:
% mysqladmin -h boa.snake.net create sampdb
% mysqldump --opt sampdb | mysql -h boa.snake.net sampdb
If you don't have a MySQL account on the local host that allows you to access the boa.snake.net server, but you do have such an account on boa.snake.net itself, use ssh to remotely invoke MySQL commands on that host:
% ssh boa.snake.net mysqladmin create sampdb
% mysqldump --opt sampdb | ssh boa.snake.net mysql sampdb
Later, if you want to refresh the sampdb database on boa.snake.net, repeat the mysqldump command.
Other mysqldump options you may find useful include the following:
The combination of --flush-logs and --lock-tables is helpful for checkpointing your database. --lock-tables locks all the tables that you're dumping, and --flush-logs closes and reopens the log files. If you're generating sequenced update or binary update logs, the new log will contain only those queries that modify databases subsequent to the checkpoint. This synchronizes your log to the time of the backup. (Locking all the tables is not so good for client access during the backups if you have clients that need to perform updates, however.)
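A checkpointed backup command might therefore look like this (a sketch; note that --opt already turns on --lock-tables):

```
% mysqldump --opt --flush-logs sampdb > /archive/mysql/sampdb.2002-10-02
```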
If you use --flush-logs to checkpoint the logs to the time of the backup, it's probably best to dump the entire database. During restore operations, it's common to extract log contents on a per-database basis. If you dump individual tables, it's much more difficult to synchronize log checkpoints against your backup files. (There is no option for extracting updates for individual tables, so you'll have to extract them yourself.)
By default, mysqldump reads the entire contents of a table into memory before writing it out. This isn't really necessary and, in fact, is almost a recipe for failure if you have really large tables. You can use the --quick option to tell mysqldump to write each row as soon as it has been retrieved. To further optimize the dump process, use --opt instead of --quick. The --opt option turns on other options that speed up dumping the data. In addition, the dump file is written in such a way that it can be processed more quickly later when loaded back into the server.
Performing backups using --opt is probably the most common method because of the benefits for backup speed. Be warned, however, that the --opt option does have a price; what --opt optimizes is your backup procedure, not access by other clients to the database. The --opt option prevents anyone from updating any of the tables that you're dumping by locking all the tables at once. You can easily see for yourself the effect of this on general database access. Just try making a backup at the time of day when your database is normally most heavily used. It won't take long for your phone to start ringing with people calling to find out what's going on. (I'd appreciate it if you would refrain from asking how it is that I happen to know this.)
An option that has something of the opposite effect of --opt is --delayed. This option causes mysqldump to write INSERT DELAYED statements rather than INSERT statements. If you are loading a dump file into another database and you want to minimize the impact of the operation on other queries that may be taking place in that database, --delayed is helpful for achieving that end.
Normally you name a database on the mysqldump command line, optionally followed by specific table names. To dump several databases at once, use the --databases option. Then mysqldump will interpret all names as database names and dump all the tables in each of them. To dump all of a server's databases, use --all-databases. In this case, you supply no database or table name arguments. Be careful with the --all-databases option if you intend to load the dump output into another server; the dump will include the grant tables in the mysql database, and you may not really want to replace the other server's grant tables.
The --compress option is helpful when copying a database to another machine because it reduces the number of bytes traveling over the network:
% mysqldump --opt sampdb | mysql --compress -h boa.snake.net sampdb
Notice that the --compress option is given for the program that communicates with the server on the remote host, not the one that communicates with the local host. Compression applies only to network traffic; it does not cause compressed tables to be created in the destination database.
By default, mysqldump dumps both table structure (the CREATE TABLE statements) and table contents (the INSERT statements). To dump just one or the other, use the --no-create-info or --no-data options.
mysqldump has many other options as well. Consult Appendix E for more information.
Using Direct-Copy Database Backup and Copying Methods
Another way to back up a database or tables that doesn't involve mysqldump is to copy table files directly. Typically, this is done using utilities such as cp, tar, or cpio. When you use a direct-copy backup method, you must make sure the tables aren't being used. If the server is changing a table while you're copying it, the copies will be worthless. The best way to ensure the integrity of your copies is to bring down the server, copy the files, and restart the server. If you don't want to bring down the server, use the read-access locking protocol described in the "Coordinating with the Server" section earlier in this chapter. That will prevent the server from changing the tables while you're copying them.
Assuming that the server is either down or that you've read-locked the tables you want to copy, the following example shows how to back up the entire sampdb database to a backup directory. If the data directory is /usr/local/mysql/data, the commands look like this:
% cd /usr/local/mysql/data
% cp -r sampdb /archive/mysql
Individual tables can be backed up as follows:
% cd /usr/local/mysql/data/sampdb
% cp member.* /archive/mysql/sampdb
% cp score.* /archive/mysql/sampdb
...
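If you prefer a single archive file, tar works just as well for direct copies. The following sketch demonstrates the idea on a temporary stand-in for the data directory (the file names are hypothetical), so it is safe to run anywhere:

```shell
# Direct-copy backup of a database directory into a single tar archive.
# A temporary directory stands in for /usr/local/mysql/data in this sketch.
datadir=$(mktemp -d)
mkdir "$datadir/sampdb"
echo dummy > "$datadir/sampdb/member.frm"   # stand-in for a table file
echo dummy > "$datadir/sampdb/member.MYD"
archive=$(mktemp -d)
(cd "$datadir" && tar cf "$archive/sampdb.tar" sampdb)
tar tf "$archive/sampdb.tar"                # list the archived table files
```

The same locking or shutdown precautions apply as for the cp examples above.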
When you're done backing up, you can restart the server if you brought it down. If you left the server running and locked the tables, you can release the locks.
Direct-copy methods apply to copying a database from one machine to another, too. For example, you can use scp rather than cp. If the data directory on boa.snake.net is /var/mysql/data, the following commands copy the sampdb database directory to that host:
% cd /usr/local/mysql/data
% scp -r sampdb boa.snake.net:/var/mysql/data
Note that copying databases to another host this way involves some additional constraints:
Both machines must have the same hardware architecture, or the tables you're copying must all be of a portable table type. The resulting tables on the second host may appear to have very strange contents otherwise.
You must prevent the servers on both hosts from attempting to change the files while you're copying them. The safest approach is to bring down both servers while you're working with the tables.
Direct-copy methods, such as those just described, work best for table types like MyISAM and ISAM that represent a given table with a unique set of files in the database directory. For types such as InnoDB, you must observe additional precautions. See the "Backing Up the InnoDB Tablespace or BDB Tables" section later in this chapter.
Making Backups with mysqlhotcopy
As of version 3.23.11, MySQL distributions include mysqlhotcopy, a Perl DBI script that helps you make database backups. The "hot" in the name refers to the fact that the backups are made while the server is running; you need not take it offline.
mysqlhotcopy has the following principal benefits:
It's faster than mysqldump because it directly copies the files rather than requesting them through the server the way mysqldump does. (This means that you must run it on the server host; it does not work with remote servers.)
It's convenient because it automatically manages the locking protocol necessary to keep the server from changing the tables while they're being copied. mysqlhotcopy does this using internal locking (described earlier in the "Coordinating with the Server" section).
It can flush the logs, which synchronizes the checkpoints for the backup files and the logs and makes it easier to use the backups for recovery, should that be necessary later.
There are several ways to invoke mysqlhotcopy. Suppose you want to copy the sampdb database. The following command will create a directory sampdb_copy in the server's data directory and copy the files in the sampdb database directory into it:
% mysqlhotcopy sampdb
To copy the database into a directory named sampdb under a directory of your choosing, specify that directory after the database name. For example, to copy the sampdb database to a directory /archive/2002-09-12/sampdb, use the following command:
% mysqlhotcopy sampdb /archive/2002-09-12
To find out what mysqlhotcopy will do for any given command, include the -n option when you invoke it. This runs mysqlhotcopy in "no execution" mode, so that it just prints commands rather than executing them.
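For example, to preview the commands for copying sampdb without actually performing them:

```
% mysqlhotcopy -n sampdb /archive/2002-09-12
```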
Making Backups with BACKUP TABLE
The BACKUP TABLE statement, available as of MySQL 3.23.25, provides a way to back up individual tables by having the server itself copy the table's files. It works for MyISAM tables only. To use this statement, name the tables you want to back up and a path to the directory on the server host where you want the files copied. For example:
BACKUP TABLE tbl1, tbl2, tbl3 TO '/archive/sampdb';
The directory must exist and be writable to the server, and you must have the FILE privilege as well as the SELECT privilege for the tables being backed up. BACKUP TABLE flushes each table to cause any pending changes to be written to disk and then copies each table's .frm description file and .MYD data file from the database directory. It does not copy the .MYI index file, because that can be rebuilt from the other two files.
BACKUP TABLE locks the tables one at a time, so for a multiple-table backup, it's possible for the backup files to differ from the actual state of the tables when the statement finishes. Suppose you're backing up tbl1 and tbl2. BACKUP TABLE locks tbl1 only while backing it up, so tbl1 might be modified while tbl2 is being backed up. In that case, when BACKUP TABLE finishes, the contents of tbl1 will differ from the contents of its backup files. To make sure that the backup files as a group match the contents of the corresponding tables, disable modifications to all of the tables for the duration of the BACKUP TABLE statement by read-locking them. Issue an UNLOCK TABLES statement afterward to release the locks—for example:
LOCK TABLES tbl1 READ, tbl2 READ;
BACKUP TABLE tbl1, tbl2 TO 'backup_dir_path';
UNLOCK TABLES;
A table that has been backed up with BACKUP TABLE can be reloaded into the server with RESTORE TABLE, as described in the "Table Repair and Data Recovery" section later in this chapter.
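For example, to reload the tables backed up in the preceding example, the statement might look like this (a sketch using the same hypothetical directory):

```sql
RESTORE TABLE tbl1, tbl2 FROM '/archive/sampdb';
```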
Backing Up the InnoDB Tablespace or BDB Tables
InnoDB and BDB tables can be dumped using mysqldump, just like any other kind of tables. You can also use direct-copy methods, but take care to observe the following special requirements:
InnoDB tables are not represented using separate files (except that each has a .frm description file). Instead, they are all represented together within the InnoDB tablespace, which is a set of one or more large files. To directly copy the InnoDB tablespace, copy all of its component files. You should do this after the server has been shut down to make sure that the InnoDB handler has committed any pending transactions. For completeness, you should also copy all the .frm files corresponding to InnoDB tables, the InnoDB log files, and the option file in which the tablespace configuration is specified. (Make a copy of the option file because you'll want it for reinitializing the tablespace should you suffer loss of the current option file.)
Another method is to use InnoDB Hot Backup, available from innodb.com. This is a commercial tool that allows you to make InnoDB backups with the server running.
To directly copy BDB tables, you should copy all BDB tables managed by the server, and you must also copy the BDB log files. Do this when the server has been shut down. The BDB handler requires the logs to be present when the server starts up, which means that should it be necessary to restore BDB tables, you'll need to provide the logs as well. BDB log files are created in the data directory by default and have names of the form log.nnnnnnnnnn with a 10-digit suffix.
Using Replication to Help Make Backups
Making backups is important, but it introduces a conflict of interest into your duties as a MySQL administrator. On the one hand, you want to maximize the availability of your server to the members of your user community, which includes allowing them to make database updates. On the other hand, for recovery purposes, backups are most useful if you make sure your backup file and log file checkpoints are synchronized. These goals conflict because the best way to synchronize backup and log checkpoints is by flushing the logs when you make the backup, combined with making sure no updates occur by either bringing the server down or locking all the tables at once (for example, with the --opt option to mysqldump). Unfortunately, disallowing updates reduces client access to the tables for the duration of the backup.
If you have a replication slave server set up, it can help you resolve this conflict. Rather than making backups on the master server, use the slave server instead. Then you need not bring down the master or otherwise make it unavailable to clients during the backup. Instead, suspend replication on the slave server with SLAVE STOP and make a backup from the slave. (If you are using a direct-copy backup method, issue a FLUSH TABLES statement as well.) Afterward, re-enable replication with SLAVE START and the slave will catch up on any updates made by the master server during the backup period. Depending on your backup method, you may not even need to suspend replication. For example, if you're backing up only a single database, you can use mysqlhotcopy or mysqldump with the appropriate options to lock all the tables at once. In that case, the slave server can stay up, but it won't attempt any updates to the locked tables during the backup. When the backup is done and the locks are released, the slave resumes update processing automatically.
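The backup-from-slave sequence might look like this (a sketch; issue the statements on the slave server):

```
mysql> SLAVE STOP;
mysql> FLUSH TABLES;
    (copy the slave's table files here, using cp, tar, or mysqlhotcopy)
mysql> SLAVE START;
```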
Using a Backup to Rename a Database
MySQL has no command for renaming a database, but you can still do so. Dump the database with mysqldump, create a new empty database with the new name, and then reload the dump file into the new database. After that you can drop the old database. The following example shows how to rename db1 to db2:
% mysqldump db1 > db1.sql
% mysqladmin create db2
% mysql db2 < db1.sql
% mysqladmin drop db1
An easier way to rename a database is to bring down the server, rename the database directory, and restart the server. However, this strategy can be used only if you have no BDB or InnoDB tables in the database. It doesn't work for BDB tables because the pathname to each table is encoded in its .db file. Nor does it work for InnoDB tables because the database name for each table is stored in the InnoDB tablespace, which is unaffected by renaming the database directory.
Whichever method you use for renaming a database, remember that access rights to it are controlled through the grant tables in the mysql database. If any of the grant tables have entries that refer specifically to the database that was renamed, you'll need to adjust the entries appropriately to refer to the new name. For a database renamed from db1 to db2, the statements to use look like this:
mysql> UPDATE db SET Db = 'db2' WHERE Db = 'db1';
mysql> UPDATE tables_priv SET Db = 'db2' WHERE Db = 'db1';
mysql> UPDATE columns_priv SET Db = 'db2' WHERE Db = 'db1';
mysql> UPDATE host SET Db = 'db2' WHERE Db = 'db1';
No such statement is needed for the user table, because it has no Db column.
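Because these statements modify the grant tables directly, also tell the server to reload its in-memory copies of them:

```
mysql> FLUSH PRIVILEGES;
```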