<?xml version="1.0" encoding="UTF-8"?>
<chapter version="5.0" xml:id="backup_and_recovery"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/1999/xhtml"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/2000/svg"
xmlns:ns="http://docbook.org/ns/docbook">
<?dbhtml stop-chunking?>
<title>Backup and Recovery</title>
<para>Standard backup best practices apply when creating your OpenStack
backup policy. For example, how often to back up your data is closely
related to how quickly you need to recover from data loss.<indexterm
class="singular">
<primary>backup/recovery</primary>
<secondary>considerations</secondary>
</indexterm></para>
<note>
<para>If you cannot tolerate any data loss, you should also focus on a
highly available deployment. The <emphasis><link
xlink:href="http://docs.openstack.org/ha-guide/index.html">OpenStack High Availability
Guide</link></emphasis> offers suggestions for eliminating single
points of failure that could cause system downtime. While it is not a
completely prescriptive document, it offers methods and techniques for
avoiding downtime and data loss.<indexterm class="singular">
<primary>data</primary>
<secondary>preventing loss of</secondary>
</indexterm></para>
</note>
<para>Other backup considerations include:</para>
<itemizedlist>
<listitem>
<para>How many backups to keep?</para>
</listitem>
<listitem>
<para>Should backups be kept off-site?</para>
</listitem>
<listitem>
<para>How often should backups be tested?</para>
</listitem>
</itemizedlist>
<para>Just as important as a backup policy is a recovery policy (or at least
recovery testing).</para>
<section xml:id="what_to_backup">
<title>What to Back Up</title>
<para>While OpenStack is composed of many components and moving parts,
backing up the critical data is quite simple.<indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>items included</secondary>
</indexterm></para>
<para>This chapter describes only how to back up configuration files and
databases that the various OpenStack components need to run. This chapter
does not describe how to back up objects inside Object Storage or data
contained inside Block Storage. Generally these areas are left for users
to back up on their own.</para>
</section>
<section xml:id="database_backups">
<title>Database Backups</title>
<para>The example OpenStack architecture designates the cloud controller
as the MySQL server. This MySQL server hosts the databases for nova,
glance, cinder, and keystone. With all of these databases in one place,
it's very easy to create a database backup:<indexterm class="singular">
<primary>databases</primary>
<secondary>backup/recovery of</secondary>
</indexterm><indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>databases</secondary>
</indexterm></para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases &gt; openstack.sql</programlisting>
<para>If you only want to back up a single database, you can instead
run:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova &gt; nova.sql</programlisting>
<para>where <code>nova</code> is the database you want to back up.</para>
<para>You can easily automate this process by creating a cron job that
runs the following script once per day:</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-$(hostname)-$(date +%Y%m%d).sql.gz"
# Dump all MySQL databases and compress the output
/usr/bin/mysqldump --opt --all-databases | gzip &gt; "$filename"
# Delete backups older than 7 days
find "$backup_dir" -ctime +7 -type f -delete</programlisting>
<para>This script dumps all databases on the MySQL server into a
compressed file and deletes any backups older than seven days.</para>
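<para>A backup is only useful if it can be read back, so the dump can be
followed by a quick integrity check. The following is a minimal sketch
rather than part of the script above: the function name
<code>verify_backup</code> is hypothetical, and the check confirms only
that the file is non-empty and a valid gzip archive, not that the SQL
inside it restores cleanly.</para>
<programlisting language="bash">#!/bin/bash
# verify_backup FILE: succeed only if FILE exists, is non-empty,
# and passes the gzip integrity test. (Hypothetical helper.)
verify_backup() {
    local file="$1"
    if [ ! -s "$file" ]; then
        echo "missing or empty backup: $file"
        return 1
    fi
    if ! gzip -t "$file"; then
        echo "corrupt backup: $file"
        return 1
    fi
    return 0
}</programlisting>
<para>Calling <code>verify_backup "$filename"</code> as the last command
of the backup script makes the script exit with a non-zero status when
the dump is unusable, which a monitoring system could then report.</para>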
</section>
<section xml:id="file_system_backups">
<title>File System Backups</title>
<para>This section discusses which files and directories should be backed
up regularly, organized by service.<indexterm class="singular">
<primary>file systems</primary>
<secondary>backup/recovery of</secondary>
</indexterm><indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>file systems</secondary>
</indexterm></para>
<section xml:id="compute">
<title>Compute</title>
<para>The <filename>/etc/nova</filename> directory on both the cloud
controller and compute nodes should be regularly backed up.<indexterm
class="singular">
<primary>cloud controllers</primary>
<secondary>file system backups and</secondary>
</indexterm><indexterm class="singular">
<primary>compute nodes</primary>
<secondary>backup/recovery of</secondary>
</indexterm></para>
<para><code>/var/log/nova</code> does not need to be backed up if you
have all logs going to a central area. It is highly recommended to use a
central logging server or back up the log directory.</para>
<para><code>/var/lib/nova</code> is another important directory to back
up. The exception to this is the <code>/var/lib/nova/instances</code>
subdirectory on compute nodes. This subdirectory contains the KVM images
of running instances. You would want to back up this directory only if
you need to maintain backup copies of all instances. Under most
circumstances, you do not need to do this, but this can vary from cloud
to cloud and your service levels. Also be aware that making a backup of
a live KVM instance can cause that instance to not boot properly if it
is ever restored from a backup.</para>
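<para>The directories discussed above can be captured in a single
archive with <code>tar</code>. The following is a minimal sketch under
stated assumptions: the function name <code>archive_nova</code> and the
<code>NOVA_ROOT</code> variable are hypothetical, and the destination
path is whatever your backup policy dictates. It skips the
<code>instances</code> subdirectory, in line with the guidance
above.</para>
<programlisting language="bash">#!/bin/bash
# archive_nova DEST: write a compressed archive of the nova
# configuration and state directories to DEST, skipping the
# instances subdirectory. (Hypothetical helper.)
# NOVA_ROOT is a hypothetical override (default /) that lets you
# try the function against a test tree first.
archive_nova() {
    local dest="$1"
    local root="${NOVA_ROOT:-/}"
    tar -C "$root" --exclude='var/lib/nova/instances' \
        -czf "$dest" etc/nova var/lib/nova
}</programlisting>
<para>For example, <code>archive_nova
/var/lib/backups/nova-$(hostname).tar.gz</code> archives the live
directories; the destination shown here is only an illustration.</para>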
</section>
<section xml:id="image_catalog_delivery">
<title>Image Catalog and Delivery</title>
<para><code>/etc/glance</code> and <code>/var/log/glance</code> follow
the same rules as their nova counterparts.<indexterm class="singular">
<primary>Image service</primary>
<secondary>backup/recovery of</secondary>
</indexterm></para>
<para><code>/var/lib/glance</code> should also be backed up. Take
special notice of <code>/var/lib/glance/images</code>. If you are using
a file-based back end for glance, <code>/var/lib/glance/images</code> is
where the images are stored, so losing this directory means losing the
images themselves.</para>
<para>There are two ways to protect this directory. The
first is to store it on a RAID array, so that the directory remains
available if a disk fails. The second is to use a tool such
as rsync to replicate the images to another server:</para>
<programlisting># rsync -az --progress /var/lib/glance/images/ \
backup-server:/var/lib/glance/images/</programlisting>
</section>
<section xml:id="identity">
<title>Identity</title>
<para><code>/etc/keystone</code> and <code>/var/log/keystone</code>
follow the same rules as other components.<indexterm class="singular">
<primary>Identity</primary>
<secondary>backup/recovery</secondary>
</indexterm></para>
<para><code>/var/lib/keystone</code>, although it should not contain any
data in active use, can also be backed up just in case.</para>
</section>
<section xml:id="ops_block_storage">
<title>Block Storage</title>
<para><code>/etc/cinder</code> and <code>/var/log/cinder</code> follow
the same rules as other components.<indexterm class="singular">
<primary>Block Storage</primary>
</indexterm><indexterm class="singular">
<primary>storage</primary>
<secondary>block storage</secondary>
</indexterm></para>
<para><code>/var/lib/cinder</code> should also be backed up.</para>
</section>
<section xml:id="ops_object_storage">
<title>Object Storage</title>
<para><code>/etc/swift</code> is very important to have backed up. This
directory contains the swift configuration files as well as the ring
files and ring <glossterm>builder file</glossterm>s, which if lost,
render the data on your cluster inaccessible. A best practice is to copy
the builder files to all storage nodes along with the ring files, so
that multiple backup copies are spread throughout your storage
cluster.<indexterm class="singular">
<primary>builder files</primary>
</indexterm><indexterm class="singular">
<primary>rings</primary>
<secondary>ring builders</secondary>
</indexterm><indexterm class="singular">
<primary>Object Storage</primary>
<secondary>backup/recovery of</secondary>
</indexterm></para>
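<para>Distributing the builder and ring files can be scripted. The
following is a minimal sketch, assuming the storage nodes accept
<code>scp</code> connections from the current user and keep their rings
in <code>/etc/swift</code>. The function name <code>copy_rings</code>,
the <code>SWIFT_DIR</code> and <code>COPY_CMD</code> variables, and the
host names in the example are all hypothetical.</para>
<programlisting language="bash">#!/bin/bash
# copy_rings NODE...: copy the builder and ring files to /etc/swift
# on each listed storage node. (Hypothetical helper.)
# SWIFT_DIR overrides the source directory; COPY_CMD overrides the
# copy program, so COPY_CMD=echo prints the commands as a dry run.
copy_rings() {
    local swift_dir="${SWIFT_DIR:-/etc/swift}"
    local copy_cmd="${COPY_CMD:-scp}"
    local node
    for node in "$@"; do
        "$copy_cmd" "$swift_dir"/*.builder "$swift_dir"/*.ring.gz \
            "$node:/etc/swift/" || return 1
    done
}</programlisting>
<para>For example, <code>COPY_CMD=echo copy_rings storage01
storage02</code> prints the two copy commands without running them, which
is a convenient way to check the file list before a real run.</para>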
</section>
</section>
<section xml:id="recovering_backups">
<title>Recovering Backups</title>
<para>Recovering backups is a fairly simple process. First,
ensure that the service you are recovering is not running. For example, to
do a full recovery of <literal>nova</literal> on the cloud controller,
stop all <code>nova</code> services:<indexterm class="singular">
<primary>recovery</primary>
<seealso>backup/recovery</seealso>
</indexterm><indexterm class="singular">
<primary>backup/recovery</primary>
<secondary>recovering backups</secondary>
</indexterm></para>
<?hard-pagebreak ?>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
# stop nova-cert
# stop nova-consoleauth
# stop nova-novncproxy
# stop nova-objectstore
# stop nova-scheduler</programlisting>
<para>Now you can import a previously backed-up database:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova &lt; nova.sql</programlisting>
<para>You can also restore backed-up nova directories:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
# cp -a /path/to/backup/nova /etc/</programlisting>
<para>Once the files are restored, start everything back up:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
# for i in nova-api nova-cert nova-consoleauth nova-novncproxy \
&gt; nova-objectstore nova-scheduler
&gt; do
&gt; start $i
&gt; done</programlisting>
<para>Other services follow the same process, with their respective
directories and <phrase role="keep-together">databases</phrase>.</para>
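<para>The daily script in the Database Backups section writes
gzip-compressed dumps, which must be decompressed before they are passed
to <code>mysql</code>. The following is a minimal sketch that assumes the
file naming used by that script and backups from a single host in the
directory; the function name <code>latest_backup</code> is
hypothetical.</para>
<programlisting language="bash">#!/bin/bash
# latest_backup [DIR]: print the newest mysql-*.sql.gz file in DIR
# (default /var/lib/backups/mysql). (Hypothetical helper.)
# The YYYYMMDD stamp in the file name makes the shell's sorted glob
# order chronological, so the last match is the most recent dump.
latest_backup() {
    local dir="${1:-/var/lib/backups/mysql}"
    local newest=""
    local f
    for f in "$dir"/mysql-*.sql.gz; do
        if [ -e "$f" ]; then
            newest="$f"
        fi
    done
    if [ -z "$newest" ]; then
        return 1
    fi
    echo "$newest"
}</programlisting>
<para>With the services stopped, <code>gunzip -c "$(latest_backup)" |
mysql</code> then replays the most recent dump. Because the script dumps
with <code>--all-databases</code>, the file already contains the
statements that select each database, so no database name is given on
the command line.</para>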
</section>
<section xml:id="ops-backup-recovery-summary">
<title>Summary</title>
<para>Backup and subsequent recovery is one of the first tasks system
administrators learn. However, each system has different items that need
attention. By taking care of your databases, Image service files, and
appropriate file system locations, you can be confident of handling most
events that require recovery.</para>
</section>
</chapter>