<?xml version="1.0" encoding="UTF-8"?>
<chapter version="5.0" xml:id="ch_ops_upgrades"
xmlns="http://docbook.org/ns/docbook"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:ns5="http://www.w3.org/2000/svg"
xmlns:ns4="http://www.w3.org/1998/Math/MathML"
xmlns:ns3="http://www.w3.org/1999/xhtml"
xmlns:ns="http://docbook.org/ns/docbook">
<title>Upgrades</title>
<para>With the exception of Object Storage, upgrading from one
version of OpenStack to another can take a great deal of effort.
This chapter provides some guidance on the operational aspects
that you should consider for performing an upgrade for a basic
architecture.</para>
<section xml:id="ops_upgrades-pre-considerations">
<title>Pre-upgrade considerations</title>
<section xml:id="ops_upgrades-planning">
<title>Upgrade planning</title>
<itemizedlist>
<listitem>
<para>Thoroughly review the
<link xlink:href="http://wiki.openstack.org/wiki/ReleaseNotes/"
>release notes</link> to learn about new, updated, and deprecated features.
Find incompatibilities between versions.</para>
</listitem>
<listitem>
<para>Consider the impact of an upgrade on users. The upgrade process
interrupts management of your environment, including the dashboard.
If you properly prepare for the upgrade, existing instances, networking,
and storage should continue to operate. However, instances might experience
intermittent network interruptions.</para>
</listitem>
<listitem>
<para>Consider the approach to upgrading your environment. You can perform
an upgrade with operational instances, but this is a dangerous approach.
You might consider using live migration to temporarily relocate instances
to other compute nodes while performing upgrades (see the example after
this list). However, you must ensure database consistency throughout the
process; otherwise your environment might become unstable. Also, do not
forget to provide sufficient notice to your users, including giving them
plenty of time to perform their own backups.</para>
</listitem>
<listitem>
<para>Consider adopting structure and options from the service
configuration files and merging them with existing configuration
files. The
<link xlink:href="http://docs.openstack.org/kilo/config-reference/content/"
><citetitle>OpenStack Configuration Reference</citetitle></link>
contains new, updated, and deprecated options for most
services.</para>
</listitem>
<listitem>
<para>Like all major system upgrades, your upgrade could fail for
one or more reasons. You should prepare for this situation by
having the ability to roll back your environment to the previous
release, including databases, configuration files, and packages.
We provide an example process for rolling back your environment in
<xref linkend="ops_upgrades-roll-back"/>.<indexterm class="singular">
<primary>upgrading</primary>
<secondary>process overview</secondary>
</indexterm><indexterm class="singular">
<primary>rollbacks</primary>
<secondary>preparing for</secondary>
</indexterm><indexterm class="singular">
<primary>upgrading</primary>
<secondary>preparation for</secondary>
</indexterm></para>
</listitem>
<listitem>
<para>Develop an upgrade procedure and assess it thoroughly by
using a test environment similar to your production
environment.</para>
</listitem>
</itemizedlist>
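<para>For example, if your cloud supports live migration, you might drain
a compute node before upgrading it. The following is only a sketch: the
host names <code>compute01</code> and <code>compute02</code> are
hypothetical, and the instance UUID is a placeholder you replace with a
real value from your environment:</para>
<screen><prompt>#</prompt> <userinput>nova list --host compute01 --all-tenants</userinput>
<prompt>#</prompt> <userinput>nova live-migration <replaceable>INSTANCE_UUID</replaceable> compute02</userinput>
<prompt>#</prompt> <userinput>nova migration-list</userinput></screen>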
</section>
<section xml:id="ops_upgrades-pre-testing">
<title>Pre-upgrade testing environment</title>
<para>The most important step is the pre-upgrade testing. If you
are upgrading immediately after release of a new version,
undiscovered bugs might hinder your progress. Some deployers
prefer to wait until the first point release is announced.
However, if you have a significant deployment, you might follow
the development and testing of the release to ensure that bugs
for your use cases are fixed.<indexterm class="singular">
<primary>upgrading</primary>
<secondary>pre-upgrade testing</secondary>
</indexterm></para>
<para>Each OpenStack cloud is different even if you have a near-identical
architecture as described in this guide. As a result, you must still
test upgrades between versions in your environment using an
approximate clone of your environment.</para>
<para>However, the test environment does not need to be the same
size as, or use hardware identical to, the production environment.
It is important to consider the hardware and scale of the cloud that
you are upgrading. The following tips can help you minimize the cost:<indexterm
class="singular">
<primary>upgrading</primary>
<secondary>controlling cost of</secondary>
</indexterm></para>
<variablelist>
<varlistentry>
<term>Use your own cloud</term>
<listitem>
<para>The simplest place to start testing the next version
of OpenStack is by setting up a new environment inside
your own cloud. This might seem odd, especially the double
virtualization involved in running compute nodes inside virtual
machines. But it is a quick and reliable way to test your
configuration.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Use a public cloud</term>
<listitem>
<para>Consider using a public cloud to test the scalability
limits of your cloud controller configuration. Most public
clouds bill by the hour, which means it can be inexpensive
to perform even a test with many nodes.<indexterm class="singular">
<primary>cloud controllers</primary>
<secondary>scalability and</secondary>
</indexterm></para>
</listitem>
</varlistentry>
<varlistentry>
<term>Make another storage endpoint on the same system</term>
<listitem>
<para>If you use an external storage plug-in or shared file
system with your cloud, you can test whether it works by
creating a second share or endpoint. This allows you to
test the system before entrusting your production storage to the
new version.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Watch the network</term>
<listitem>
<para>Even at smaller-scale testing, look for excess network
packets to determine whether something is going horribly
wrong in inter-component communication.</para>
</listitem>
</varlistentry>
</variablelist>
<para>To set up the test environment, you can use one of several
methods:</para>
<itemizedlist>
<!-- the following link suffers from a toolchain problem, where in the
rendered PDF version, the title butts up against the link, which comes
before the title FIXME -->
<listitem>
<para>Do a full manual install by using the <link
xlink:href="http://docs.openstack.org/index.html#install-guides"
><citetitle>OpenStack Installation
Guide</citetitle></link> for your platform. Review the
final configuration files and installed packages.</para>
</listitem>
<listitem>
<para>Create a clone of your automated configuration
infrastructure with changed package repository URLs.</para>
<para>Alter the configuration until it works.</para>
</listitem>
</itemizedlist>
<para>Either approach is valid. Use the approach that matches your
experience.</para>
<para>An upgrade pre-testing system is excellent for getting the
configuration to work. However, it is important to note that the
historical use of the system and differences in user interaction
can affect the success of upgrades.</para>
<para>If possible, we highly recommend that you dump your
production database tables and test the upgrade in your development
environment using this data. Several MySQL bugs have been uncovered
during database migrations because of slight table differences between
a fresh installation and tables that migrated from one version to another.
Such problems tend to surface only with large, real datasets, which you
do not want to encounter for the first time during a production outage.</para>
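<para>A minimal sketch of this approach for the Compute database, assuming
MySQL and a test controller reachable as the hypothetical host
<code>test-controller</code>:</para>
<screen><prompt>#</prompt> <userinput>mysqldump -u root -p --opt nova &gt; nova-prod.sql</userinput>
<prompt>#</prompt> <userinput>scp nova-prod.sql test-controller:</userinput>
<prompt>#</prompt> <userinput>ssh test-controller "mysql -u root -p nova &lt; nova-prod.sql"</userinput>
<prompt>#</prompt> <userinput>ssh test-controller "su -s /bin/sh -c 'nova-manage db sync' nova"</userinput></screen>
<para>Running the schema migration (<command>nova-manage db sync</command>
in this sketch) against a copy of real data is what exposes the table-level
differences described above.</para>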
<para>Artificial scale testing can go only so far. After your
cloud is upgraded, you must pay careful attention to the
performance aspects of your cloud.</para>
</section>
<?hard-pagebreak?>
<section xml:id="ops_upgrades_upgrade_levels">
<title>Upgrade Levels</title>
<para>Upgrade levels are a feature added to OpenStack Compute in the
Grizzly release to provide version locking for the RPC
(message queue) communications between the various Compute services.
</para>
<para>This functionality is an important piece of the puzzle when
it comes to live upgrades and is conceptually similar to the
existing API versioning that allows OpenStack services of
different versions to communicate without issue.</para>
<para>Without upgrade levels, an X+1 version Compute service can
receive and understand X version RPC messages, but it can only
send out X+1 version RPC messages. For example, if a
<systemitem class="service">nova-conductor</systemitem>
process has been upgraded to X+1 version, then the conductor service
will be able to understand messages from X version
<systemitem class="service">nova-compute</systemitem>
processes, but those compute services will not be able to
understand messages sent by the conductor service.</para>
<para>During an upgrade, operators can add configuration options to
<filename>nova.conf</filename> which lock the version of RPC
messages and allow live upgrading of the services without
interruption caused by version mismatch. The configuration
options allow the specification of RPC version numbers if desired,
but release name aliases are also supported. For example:</para>
<programlisting language="ini">[upgrade_levels]
compute=X+1
conductor=X+1
scheduler=X+1</programlisting>
<para>will keep the RPC version locked across the specified services
to the RPC version used in X+1. As all instances of a particular
service are upgraded to the newer version, the corresponding line
can be removed from <filename>nova.conf</filename>.</para>
<para>Using this functionality, ideally one would lock the RPC version
to the OpenStack version being upgraded from on
<systemitem class="service">nova-compute</systemitem> nodes, to
ensure that, for example, X+1 version
<systemitem class="service">nova-compute</systemitem>
processes will continue to work with X version
<systemitem class="service">nova-conductor</systemitem>
processes while the upgrade completes. Once the upgrade of
<systemitem class="service">nova-compute</systemitem>
processes is complete, the operator can move on to upgrading
<systemitem class="service">nova-conductor</systemitem>
and remove the version locking for
<systemitem class="service">nova-compute</systemitem> in
<filename>nova.conf</filename>.
</para>
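<para>As a minimal sketch of that approach (exactly which services you pin
is an assumption here, because it depends on which services each node
sends messages to), on each
<systemitem class="service">nova-compute</systemitem> node you could pin
messages sent to <systemitem class="service">nova-conductor</systemitem>
to the release being upgraded from, using a release name alias or the X
placeholder used above, and remove the setting once every service runs
X+1:</para>
<programlisting language="ini">[upgrade_levels]
conductor=X</programlisting>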
</section>
</section>
<section xml:id="ops_upgrade-process">
<title>Upgrade process</title>
<?dbhtml stop-chunking?>
<para>This section describes the process to upgrade a basic
OpenStack deployment based on the basic two-node architecture in the
<link xlink:href="http://docs.openstack.org/index.html#install-guides">
<citetitle>OpenStack Installation Guide</citetitle></link>.
All nodes must run a supported distribution of Linux with a recent kernel
and the current release packages.</para>
<section xml:id="upgrade-considerations">
<title>Prerequisites</title>
<itemizedlist>
<listitem>
<para>Perform some cleaning of the environment prior to
starting the upgrade process to ensure a consistent state.
For example, instances not fully purged from the system
after deletion might cause indeterminate behavior (see the
example after this list).</para>
</listitem>
<listitem>
<para>For environments using the OpenStack Networking
service (neutron), verify the release version of the database. For example:</para>
<screen><prompt>#</prompt> <userinput>su -s /bin/sh -c "neutron-db-manage --config-file /etc/neutron/neutron.conf \
--config-file /etc/neutron/plugins/ml2/ml2_conf.ini current" neutron</userinput></screen>
</listitem>
</itemizedlist>
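<para>One way to reduce stale data before the upgrade (an optional cleanup
step, not a requirement) is to archive rows that are already marked as
deleted in the Compute database; you can run the command repeatedly to
archive rows in batches:</para>
<screen><prompt>#</prompt> <userinput>nova-manage db archive_deleted_rows --max_rows 1000</userinput></screen>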
</section>
<section xml:id="upgrades-backup">
<title>Perform a backup</title>
<procedure>
<step>
<para>Save the configuration files on all nodes. For example:</para>
<screen><prompt>#</prompt> <userinput>for i in keystone glance nova neutron openstack-dashboard cinder heat ceilometer; \
do mkdir $i-kilo; \
done</userinput>
<prompt>#</prompt> <userinput>for i in keystone glance nova neutron openstack-dashboard cinder heat ceilometer; \
do cp -r /etc/$i/* $i-kilo/; \
done</userinput></screen>
<note>
<para>You can modify this example script on each node to
handle different services.</para>
</note>
</step>
<step>
<para>Make a full database backup of your production data. As of
Kilo, database downgrades are not supported, and the only method
available to get back to a prior database version will be to restore
from backup.</para>
<screen><prompt>#</prompt> <userinput>mysqldump -u root -p --opt --add-drop-database --all-databases &gt; icehouse-db-backup.sql</userinput></screen>
<note>
<para>Consider updating your SQL server configuration as
described in the
<link xlink:href="http://docs.openstack.org/index.html#install-guides"
>OpenStack Installation Guide</link>.</para>
</note>
</step>
</procedure>
</section>
<section xml:id="upgrades-repos">
<title>Manage repositories</title>
<procedure>
<para>On all nodes:</para>
<step>
<para>Remove the repository for the previous release packages.</para>
</step>
<step>
<para>Add the repository for the new release packages.</para>
</step>
<step>
<para>Update the repository database.</para>
</step>
</procedure>
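<para>For example, on Ubuntu with the Ubuntu Cloud Archive, and assuming
for illustration an upgrade from Juno to Kilo, these three steps might
look like the following (the exact file name of the old repository
depends on how it was originally added):</para>
<screen><prompt>#</prompt> <userinput>rm /etc/apt/sources.list.d/cloudarchive-juno.list</userinput>
<prompt>#</prompt> <userinput>add-apt-repository cloud-archive:kilo</userinput>
<prompt>#</prompt> <userinput>apt-get update</userinput></screen>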
</section>
<section xml:id="upgrades-packages">
<title>Upgrade packages on each node</title>
<para>Depending on your specific configuration, upgrading all
packages might restart or break services supplemental to your
OpenStack environment. For example, if you use the TGT iSCSI
framework for Block Storage volumes and the upgrade includes
new packages for it, the package manager might restart the
TGT iSCSI services and impact connectivity to volumes.</para>
<para>If the package manager prompts you to update configuration
files, reject the changes. The package manager appends a
suffix to newer versions of configuration files. Consider
reviewing and adopting content from these files.</para>
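<para>On Ubuntu, for example, the package upgrade itself can be as simple
as the following; other distributions use their own package manager
commands:</para>
<screen><prompt>#</prompt> <userinput>apt-get update</userinput>
<prompt>#</prompt> <userinput>apt-get dist-upgrade</userinput></screen>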
<note>
<para>You may need to explicitly install the <literal>ipset</literal>
package if your distribution does not install it as a
dependency.</para></note>
</section>
<section xml:id="upgrades-services">
<title>Update services</title>
<para>To update a service on each node, you generally modify one or more
configuration files, stop the service, synchronize the
database schema, and start the service. Some services require
different steps. We recommend verifying operation of each
service before proceeding to the next service.</para>
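<para>For example, a minimal sketch of this general pattern for the
Identity service on Ubuntu (service names and commands vary by
distribution and release):</para>
<screen><prompt>#</prompt> <userinput>service keystone stop</userinput>
<prompt>#</prompt> <userinput>keystone-manage token_flush</userinput>
<prompt>#</prompt> <userinput>su -s /bin/sh -c "keystone-manage db_sync" keystone</userinput>
<prompt>#</prompt> <userinput>service keystone start</userinput></screen>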
<para>The order in which you should upgrade services, and any deviations
from the general upgrade process, are described below:</para>
<orderedlist>
<title>Controller node</title>
<listitem>
<para>OpenStack Identity - Clear any expired tokens before
synchronizing the database.</para>
</listitem>
<listitem>
<para>OpenStack Image service</para>
</listitem>
<listitem>
<para>OpenStack Compute, including networking
components.</para>
</listitem>
<listitem>
<para>OpenStack Networking</para>
</listitem>
<listitem>
<para>OpenStack Block Storage</para>
</listitem>
<listitem>
<para>OpenStack dashboard - In typical environments, updating the
dashboard only requires restarting the Apache HTTP service.</para>
</listitem>
<listitem>
<para>OpenStack Orchestration</para>
</listitem>
<listitem>
<para>OpenStack Telemetry - In typical environments, updating the
Telemetry service only requires restarting the service.</para>
</listitem>
<listitem>
<para>OpenStack Compute - Edit the configuration file and restart the
service.</para>
</listitem>
<listitem>
<para>OpenStack Networking - Edit the configuration file and restart
the service.</para>
</listitem>
</orderedlist>
<itemizedlist>
<title>Compute nodes</title>
<listitem>
<para>OpenStack Block Storage - Updating the Block Storage service
only requires restarting the service.</para>
</listitem>
</itemizedlist>
<itemizedlist>
<title>Storage nodes</title>
<listitem>
<para>OpenStack Networking - Edit the configuration file and restart
the service.</para>
</listitem>
</itemizedlist>
</section>
<section xml:id="upgrades-final-steps">
<title>Final steps</title>
<para>On all distributions, you must perform some final tasks to
complete the upgrade process.<indexterm class="singular">
<primary>upgrading</primary>
<secondary>final steps</secondary>
</indexterm></para>
<procedure>
<step><para>Decrease DHCP timeouts by modifying
<filename>/etc/nova/nova.conf</filename> on the compute nodes
back to the original value for your environment.</para></step>
<step><para>Update all <filename>.ini</filename> files to match
passwords and pipelines as required for the OpenStack release in your
environment.</para></step>
<step><para>After migration, users see different results from
<command>nova image-list</command> and <command>glance
image-list</command>. To ensure users see the same images in
the list commands, edit the <filename>/etc/glance/policy.json</filename>
and <filename>/etc/nova/policy.json</filename> files to contain
<code>"context_is_admin": "role:admin"</code>, which limits
access to private images for projects.</para></step>
<step><para>Verify proper operation of your environment. Then, notify your users
that their cloud is operating normally again.</para></step>
</procedure>
</section>
</section>
<section xml:id="ops_upgrades-roll-back">
<title>Rolling back a failed upgrade</title>
<para>Upgrades involve complex operations and can fail. Before
attempting any upgrade, you should make a full database backup
of your production data. As of Kilo, database downgrades are
not supported, and the only method available to get back to a
prior database version will be to restore from backup.</para>
<para>This section provides guidance for rolling back to a previous
release of OpenStack. All distributions follow a similar <phrase role="keep-together"
>procedure</phrase>.<indexterm class="singular">
<primary>rollbacks</primary>
<secondary>process for</secondary>
</indexterm><indexterm class="singular">
<primary>upgrading</primary>
<secondary>rolling back failures</secondary>
</indexterm></para>
<para>A common scenario is this: you take down the production management
services in preparation for an upgrade, complete part of the upgrade
process, and then discover one or more problems that you did not encounter
during testing. As a consequence, you must roll back your environment to
the original "known good" state. Make sure that you did not make any state
changes after attempting the upgrade process: no new instances, networks,
storage volumes, and so on. Any such new resources will be in a frozen
state after the databases are restored from backup.</para>
<para>Within this scope, you must complete these steps to
successfully roll back your environment:</para>
<orderedlist>
<listitem>
<para>Roll back configuration files.</para>
</listitem>
<listitem>
<para>Restore databases from backup.</para>
</listitem>
<listitem>
<para>Roll back packages.</para>
</listitem>
</orderedlist>
<para>You should verify that you
have the requisite backups to restore. Rolling back upgrades is
a tricky process because distributions tend to put much more
effort into testing upgrades than downgrades. Broken downgrades
take significantly more effort to troubleshoot and resolve than
broken upgrades. Only you can weigh the risks of trying to push
a failed upgrade forward versus rolling it back. Generally,
consider rolling back as the very last option.</para>
<para>The following steps described for Ubuntu have worked on at
least one production environment, but they might not work for
all environments.</para>
<procedure>
<title>To perform the rollback</title>
<step>
<para>Stop all OpenStack services.</para>
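<para>For example, on Ubuntu you might stop each service with the
<command>service</command> command; the exact list of service names
depends on your installation:</para>
<screen><prompt>#</prompt> <userinput>for s in nova-api nova-scheduler nova-conductor glance-api glance-registry \
keystone neutron-server cinder-api cinder-scheduler; \
do service $s stop; \
done</userinput></screen>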
</step>
<step>
<para>Copy the contents of the configuration backup directories that you
created during the upgrade process back to the corresponding
<filename>/etc/&lt;service&gt;</filename> directories.</para>
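<para>For example, reversing the backup loop used earlier in this chapter
(assuming the same <code>-kilo</code> directory names):</para>
<screen><prompt>#</prompt> <userinput>for i in keystone glance nova neutron openstack-dashboard cinder heat ceilometer; \
do cp -r $i-kilo/* /etc/$i/; \
done</userinput></screen>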
</step>
<step>
<para>Restore databases from the
<filename><replaceable>RELEASE_NAME</replaceable>-db-backup.sql</filename> backup file
that you created with the <command>mysqldump</command>
command during the upgrade process:</para>
<screen><prompt>#</prompt> <userinput>mysql -u root -p &lt; <replaceable>RELEASE_NAME</replaceable>-db-backup.sql</userinput></screen>
</step>
<step>
<para>Downgrade OpenStack packages.</para>
<warning>
<para>Downgrading packages is by far the most complicated
step; it is highly dependent on the distribution and the
overall administration of the system.</para>
</warning>
<substeps>
<step>
<para>Determine which OpenStack packages are installed on
your system. Use the <command>dpkg
--get-selections</command> command. Filter for
OpenStack packages, filter again to omit packages
explicitly marked in the <code>deinstall</code> state,
and save the final output to a file. For example, the
following command covers a controller node with
keystone, glance, nova, neutron, and cinder:</para>
<screen><prompt>#</prompt> <userinput>dpkg --get-selections | grep -e keystone -e glance -e nova -e neutron \
-e cinder | grep -v deinstall | tee openstack-selections</userinput>
<computeroutput>cinder-api install
cinder-common install
cinder-scheduler install
cinder-volume install
glance install
glance-api install
glance-common install
glance-registry install
neutron-common install
neutron-dhcp-agent install
neutron-l3-agent install
neutron-lbaas-agent install
neutron-metadata-agent install
neutron-plugin-openvswitch install
neutron-plugin-openvswitch-agent install
neutron-server install
nova-api install
nova-cert install
nova-common install
nova-conductor install
nova-consoleauth install
nova-novncproxy install
nova-objectstore install
nova-scheduler install
python-cinder install
python-cinderclient install
python-glance install
python-glanceclient install
python-keystone install
python-keystoneclient install
python-neutron install
python-neutronclient install
python-nova install
python-novaclient install
</computeroutput></screen>
<note>
<para>Depending on the type of server, the contents and
order of your package list might vary from this
example.</para>
</note>
</step>
<step>
<para>You can determine the package versions available for
reversion by using the <command>apt-cache
policy</command> command. If you removed the Grizzly
repositories, you must first reinstall them and run
<command>apt-get update</command>:</para>
<!-- FIXME - there was a query about whether this command and the output is
aligned correctly. In the PDF the # is directly above the n of nova common, and
everything is indented below the m of them in the previous sentence -->
<screen><prompt>#</prompt> <userinput>apt-cache policy nova-common</userinput>
<computeroutput>nova-common:
Installed: 1:2013.2-0ubuntu1~cloud0
Candidate: 1:2013.2-0ubuntu1~cloud0
Version table:
*** 1:2013.2-0ubuntu1~cloud0 0
500 http://ubuntu-cloud.archive.canonical.com/ubuntu/
precise-updates/havana/main amd64 Packages
100 /var/lib/dpkg/status
1:2013.1.4-0ubuntu1~cloud0 0
500 http://ubuntu-cloud.archive.canonical.com/ubuntu/
precise-updates/grizzly/main amd64 Packages
2012.1.3+stable-20130423-e52e6912-0ubuntu1.2 0
500 http://us.archive.ubuntu.com/ubuntu/
precise-updates/main amd64 Packages
500 http://security.ubuntu.com/ubuntu/
precise-security/main amd64 Packages
2012.1-0ubuntu2 0
500 http://us.archive.ubuntu.com/ubuntu/
precise/main amd64 Packages</computeroutput></screen>
<para>This output shows the currently installed version of the
package, the newest candidate version, and all available versions,
along with the repository that contains each version.
Look for the appropriate Grizzly
version—<code>1:2013.1.4-0ubuntu1~cloud0</code> in
this case. The process of manually picking through this
list of packages is rather tedious and prone to errors.
You should consider using the following script to help
with this process:</para>
<!-- FIXME - there was a query about whether this command and the output is
aligned correctly. -->
<screen><prompt>#</prompt> <userinput>for i in `cut -f 1 openstack-selections | sed 's/neutron/quantum/;'`;
do echo -n $i ;apt-cache policy $i | grep -B 1 grizzly |
grep -v Packages | awk '{print "="$1}';done | tr '\n' ' ' |
tee openstack-grizzly-versions</userinput>
<computeroutput>cinder-api=1:2013.1.4-0ubuntu1~cloud0
cinder-common=1:2013.1.4-0ubuntu1~cloud0
cinder-scheduler=1:2013.1.4-0ubuntu1~cloud0
cinder-volume=1:2013.1.4-0ubuntu1~cloud0
glance=1:2013.1.4-0ubuntu1~cloud0
glance-api=1:2013.1.4-0ubuntu1~cloud0
glance-common=1:2013.1.4-0ubuntu1~cloud0
glance-registry=1:2013.1.4-0ubuntu1~cloud0
quantum-common=1:2013.1.4-0ubuntu1~cloud0
quantum-dhcp-agent=1:2013.1.4-0ubuntu1~cloud0
quantum-l3-agent=1:2013.1.4-0ubuntu1~cloud0
quantum-lbaas-agent=1:2013.1.4-0ubuntu1~cloud0
quantum-metadata-agent=1:2013.1.4-0ubuntu1~cloud0
quantum-plugin-openvswitch=1:2013.1.4-0ubuntu1~cloud0
quantum-plugin-openvswitch-agent=1:2013.1.4-0ubuntu1~cloud0
quantum-server=1:2013.1.4-0ubuntu1~cloud0
nova-api=1:2013.1.4-0ubuntu1~cloud0
nova-cert=1:2013.1.4-0ubuntu1~cloud0
nova-common=1:2013.1.4-0ubuntu1~cloud0
nova-conductor=1:2013.1.4-0ubuntu1~cloud0
nova-consoleauth=1:2013.1.4-0ubuntu1~cloud0
nova-novncproxy=1:2013.1.4-0ubuntu1~cloud0
nova-objectstore=1:2013.1.4-0ubuntu1~cloud0
nova-scheduler=1:2013.1.4-0ubuntu1~cloud0
python-cinder=1:2013.1.4-0ubuntu1~cloud0
python-cinderclient=1:1.0.3-0ubuntu1~cloud0
python-glance=1:2013.1.4-0ubuntu1~cloud0
python-glanceclient=1:0.9.0-0ubuntu1.2~cloud0
python-quantum=1:2013.1.4-0ubuntu1~cloud0
python-quantumclient=1:2.2.0-0ubuntu1~cloud0
python-nova=1:2013.1.4-0ubuntu1~cloud0
python-novaclient=1:2.13.0-0ubuntu1~cloud0
</computeroutput></screen>
<note>
<para>If you decide to continue this step manually,
don't forget to change <code>neutron</code> to
<code>quantum</code> where applicable.</para>
</note>
</step>
<step>
<para>Use the <command>apt-get install</command> command
to install specific versions of each package by
specifying
<code>&lt;package-name&gt;=&lt;version&gt;</code>. The
script in the previous step conveniently created a list
of <code>package=version</code> pairs for you:</para>
<screen><prompt>#</prompt> <userinput>apt-get install `cat openstack-grizzly-versions`</userinput></screen>
<para>This step completes the rollback procedure. You
should remove the upgrade release repository and run
<command>apt-get update</command> to prevent
accidental upgrades until you solve whatever issue
caused you to roll back your environment.</para>
</step>
</substeps>
</step>
</procedure>
</section>
</chapter>