<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter [
<!-- Some useful entities borrowed from HTML -->
<!ENTITY ndash "&#x2013;">
<!ENTITY mdash "&#x2014;">
<!ENTITY hellip "&#x2026;">
<!ENTITY plusmn "&#xB1;">
]>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="backup_and_recovery">
<?dbhtml stop-chunking?>
<title>Backup and Recovery</title>
<para>Standard backup best practices apply when creating your
OpenStack backup policy. For example, how often you back up your
data is closely related to how quickly you need to recover
from data loss.</para>
<note>
<para>If you cannot have any data loss at all, you should also
focus on a highly available deployment. The
<citetitle><link
xlink:href="http://docs.openstack.org/high-availability-guide/content/"
>OpenStack High Availability Guide</link></citetitle> offers
suggestions for eliminating single points of
failure that could cause system downtime. While it
is not a completely prescriptive document, it
offers methods and techniques for avoiding
downtime and data loss.
</para></note>
<para>Other backup considerations include:</para>
<itemizedlist>
<listitem>
<para>How many backups to keep?</para>
</listitem>
<listitem>
<para>Should backups be kept off-site?</para>
</listitem>
<listitem>
<para>How often should backups be tested?</para>
</listitem>
</itemizedlist>
<para>Just as important as a backup policy is a recovery policy
(or at least recovery testing).</para>
<section xml:id="what_to_backup">
<title>What to Back Up</title>
<para>While OpenStack is composed of many components and
moving parts, backing up the critical data is quite
simple.</para>
<para>This chapter describes only how to back up configuration
files and databases that the various OpenStack components
need to run. This chapter does not describe how to back up
objects inside Object Storage or data contained inside
Block Storage. Generally these areas are left for users
to back up on their own.</para>
</section>
<section xml:id="database_backups">
<title>Database Backups</title>
<para>The example OpenStack architecture designates the cloud
controller as the MySQL server. This MySQL server hosts
the databases for nova, glance, cinder, and keystone. With
all of these databases in one place, it's very easy to
create a database backup:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt --all-databases &gt; openstack.sql</programlisting>
<para>If you want to back up only a single database, you can
instead run:</para>
<programlisting language="bash"><?db-font-size 75%?><prompt>#</prompt> mysqldump --opt nova &gt; nova.sql</programlisting>
<para>where <code>nova</code> is the database you want to back
up.</para>
<para>You can easily automate this process by creating a cron
job that runs the following script once per day:</para>
<programlisting language="bash"><?db-font-size 65%?>#!/bin/bash
backup_dir="/var/lib/backups/mysql"
filename="${backup_dir}/mysql-$(hostname)-$(date +%Y%m%d).sql.gz"
# Dump all MySQL databases and compress the output
/usr/bin/mysqldump --opt --all-databases | gzip &gt; "$filename"
# Delete backups older than 7 days
find "$backup_dir" -ctime +7 -type f -delete</programlisting>
<para>This script dumps all MySQL databases and deletes
any backups older than seven days.</para>
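<para>A dump that cannot be read back is of little use, so it can help
to check each archive after it is written. The following is a minimal
sketch, not part of OpenStack or MySQL: the function name
<code>verify_backup</code> is hypothetical, and the check covers only
the integrity of the gzip archive. It does not replace a real restore
test.</para>
<programlisting language="bash"><?db-font-size 65%?><![CDATA[#!/bin/bash
# Sketch: sanity-check a compressed dump file.
# verify_backup is an illustrative name, not an existing tool.

verify_backup() {
    local file="$1"
    # The file must exist and must not be empty.
    [ -s "$file" ] || { echo "missing or empty: $file" >&2; return 1; }
    # gzip -t reads the whole archive and fails if it is truncated or corrupt.
    gzip -t "$file" 2>/dev/null || { echo "corrupt archive: $file" >&2; return 1; }
    echo "ok: $file"
}]]></programlisting>
<para>Calling <code>verify_backup "$filename"</code> at the end of the
backup script, before old backups are deleted, is one possible way to
use it.</para>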
</section>
<section xml:id="file_system_backups">
<title>File System Backups</title>
<para>This section discusses which files and directories should be backed up regularly, organized by service.</para>
<section xml:id="compute">
<title>Compute</title>
<para>The <filename>/etc/nova</filename> directory on both the
cloud controller and compute nodes should be regularly
backed up.</para>
<para>
<code>/var/log/nova</code> does not need to be backed up if
all logs are sent to a central logging server. It is
highly recommended that you either use a central logging server or
back up the log directory.</para>
<para>
<code>/var/lib/nova</code> is another important
directory to back up. The exception to this is the
<code>/var/lib/nova/instances</code> subdirectory
on compute nodes. This subdirectory contains the KVM
images of running instances. You would want to
back up this directory only if you need to maintain backup
copies of all instances. Under most circumstances, you
do not need to do this, but this can vary from cloud
to cloud and with your service levels. Also be aware that
backing up a live KVM instance can prevent that
instance from booting properly if it is ever restored
from the backup.</para>
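<para>The directory selection described above can be sketched as a
single <command>tar</command> invocation. This is an illustration under
assumptions: the function name <code>backup_nova_files</code> and its
arguments are hypothetical, and the exclusion assumes you do not need
backup copies of instances.</para>
<programlisting language="bash"><?db-font-size 65%?><![CDATA[#!/bin/bash
# Sketch: archive the nova configuration and state directories,
# skipping the instance images. The function name and its arguments
# are illustrative assumptions.

backup_nova_files() {
    local root="$1"      # file system root, normally "/"
    local archive="$2"   # destination .tar.gz file
    # -C makes the member paths relative to $root; --exclude leaves out
    # the instances subdirectory and everything beneath it.
    tar -C "$root" --exclude='var/lib/nova/instances' \
        -czf "$archive" etc/nova var/lib/nova
}]]></programlisting>
<para>For example, <code>backup_nova_files / /var/lib/backups/nova.tar.gz</code>
would archive the live directories. The destination path here is only a
placeholder.</para>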
</section>
<section xml:id="image_catalog_delivery">
<title>Image Catalog and Delivery</title>
<para>
<code>/etc/glance</code> and
<code>/var/log/glance</code> follow the same rules
as their nova counterparts.</para>
<para>
<code>/var/lib/glance</code> should also be backed up.
Take special notice of
<code>/var/lib/glance/images</code>. If you are
using a file-based back end for glance,
<code>/var/lib/glance/images</code> is where the
images are stored, so take particular care to protect it.</para>
<para>There are two ways to protect this
directory. The first is to make sure it
resides on a RAID array, so that if a disk fails, the directory remains
available. The second is to use a tool such as
rsync to replicate the images to another
server:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> rsync -az --progress /var/lib/glance/images/ backup-server:/var/lib/glance/images/</programlisting>
</section>
<section xml:id="identity">
<title>Identity</title>
<para>
<code>/etc/keystone</code> and
<code>/var/log/keystone</code> follow the same
rules as other components.</para>
<para>
<code>/var/lib/keystone</code>, although it should not
contain any data in active use, can also be backed up
just in case.</para>
</section>
<section xml:id="ops_block_storage">
<title>Block Storage</title>
<para>
<code>/etc/cinder</code> and
<code>/var/log/cinder</code> follow the same rules
as other components.</para>
<para>
<code>/var/lib/cinder</code> should also be backed
up.</para>
</section>
<section xml:id="ops_object_storage">
<title>Object Storage</title>
<para>
<code>/etc/swift</code> is very important to have
backed up. This directory contains the swift
configuration files as well as the ring files and ring
<glossterm>builder file</glossterm>s, which, if
lost, render the data on your cluster inaccessible. A
best practice is to copy the builder files to all
storage nodes along with the ring files, so that multiple
backup copies are spread throughout your storage
cluster.</para>
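<para>Copying the builder and ring files to every storage node could be
scripted along the following lines. This is a sketch only: the function
name <code>distribute_rings</code>, the <code>RSYNC</code> override
variable, and the node names are all hypothetical, and it assumes that
rsync over SSH is already set up between the nodes.</para>
<programlisting language="bash"><?db-font-size 65%?><![CDATA[#!/bin/bash
# Sketch: copy ring and builder files to each storage node.
# Node names are placeholders; set RSYNC=echo for a dry run that
# prints the arguments instead of copying anything.

distribute_rings() {
    local src="$1"; shift
    local node
    for node in "$@"; do
        # Copy every *.builder and *.ring.gz file into the same
        # directory on the remote node.
        ${RSYNC:-rsync} -a "$src"/*.builder "$src"/*.ring.gz \
            "${node}:${src}/" || return 1
    done
}]]></programlisting>
<para>For example, <code>distribute_rings /etc/swift storage01 storage02</code>
would push the files to two nodes with those placeholder names.</para>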
</section>
</section>
<section xml:id="recovering_backups">
<title>Recovering Backups</title>
<para>Recovering backups is a fairly simple process. To begin,
first ensure that the service you are recovering is not
running. For example, to do a full recovery of nova on the
cloud controller, first stop all <code>nova</code>
services:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> stop nova-api
<prompt>#</prompt> stop nova-cert
<prompt>#</prompt> stop nova-consoleauth
<prompt>#</prompt> stop nova-novncproxy
<prompt>#</prompt> stop nova-objectstore
<prompt>#</prompt> stop nova-scheduler</programlisting>
<para>Now you can import a previously backed-up
database:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mysql nova &lt; nova.sql</programlisting>
<para>You can also restore backed-up nova directories:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> mv /etc/nova{,.orig}
<prompt>#</prompt> cp -a /path/to/backup/nova /etc/</programlisting>
<para>Once the files are restored, start everything back
up:</para>
<programlisting language="bash"><?db-font-size 65%?><prompt>#</prompt> start mysql
<prompt>#</prompt> for i in nova-api nova-cert nova-consoleauth nova-novncproxy nova-objectstore nova-scheduler
&gt; do
&gt; start $i
&gt; done
</programlisting>
<para>Other services follow the same process, with their
respective directories and databases.</para>
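<para>Because the directory step is the same for every service, it can
be wrapped in a small helper. The following is a sketch under
assumptions: the function name <code>restore_config_dir</code> is
hypothetical, and it sets the existing directory aside under a
time-stamped name rather than the plain <code>.orig</code> suffix used
above, so that repeated restores do not collide.</para>
<programlisting language="bash"><?db-font-size 65%?><![CDATA[#!/bin/bash
# Sketch: restore a backed-up directory, keeping the current one aside.
# restore_config_dir is an illustrative name, not an OpenStack tool.

restore_config_dir() {
    local target="$1"   # live directory, for example /etc/glance
    local backup="$2"   # backed-up copy of that directory
    [ -d "$backup" ] || { echo "backup not found: $backup" >&2; return 1; }
    # Move the current directory out of the way instead of overwriting it.
    if [ -e "$target" ]; then
        mv "$target" "${target}.orig.$(date +%Y%m%d%H%M%S)" || return 1
    fi
    # cp -a preserves ownership, permissions, and timestamps.
    cp -a "$backup" "$target"
}]]></programlisting>
<para>For example, with the services stopped,
<code>restore_config_dir /etc/glance /path/to/backup/glance</code>
would restore the glance configuration directory.</para>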
</section>
<section xml:id="ops-backup-recovery-summary">
<title>Summary</title>
<para>Backing up and subsequently recovering data is one of the first tasks system
administrators learn. However, each system has different
items that need attention. By taking care of your databases, image
service, and appropriate file system locations, you can be confident
that you can handle any event requiring recovery.</para>
</section>
</chapter>