[HA Guide] Update for the current predominant architectures

Begin updating the guide to reflect current architecture best practices.
Remove the keepalived architecture, as its use is increasingly rare.

Change-Id: Id62d09707611f4706620b00f7800b80138afe98d
Andrew Beekhof 2016-09-20 12:31:35 +10:00
parent 5a45c9ce62
commit c5c825fd88
3 changed files with 27 additions and 107 deletions

Binary image file removed (52 KiB); preview not shown.


@@ -1,96 +0,0 @@
============================
The keepalived architecture
============================
High availability strategies
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The following diagram shows a very simplified view of the different
strategies used to achieve high availability for the OpenStack
services:
.. image:: /figures/keepalived-arch.jpg
:width: 100%
Depending on the method used to communicate with the service, one of the
following availability strategies is used:
- Keepalived, for the HAProxy instances.
- Access via an HAProxy virtual IP, for services such as HTTPd that
  are accessed over a TCP socket that can be load balanced (a minimal
  configuration sketch appears below).
- Built-in application clustering, when available from the application.
Galera is one example of this.
- Starting up one instance of the service on several controller nodes,
when they can coexist and coordinate by other means. RPC in
``nova-conductor`` is one example of this.
- No high availability, when the service can only work in
active/passive mode.
Known issues with cinder-volume make it advisable to run it
active/passive for now; see:
https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support
While there will be multiple neutron LBaaS agents running, each agent
manages a set of load balancers that cannot be failed over to
another node.
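To make the first two strategies concrete, the following is a minimal,
hypothetical sketch: keepalived holds a virtual IP on one controller, and
HAProxy, bound to that VIP, load balances a TCP service across the
controllers. The interface name, router ID, priority, service name,
addresses, and ports are placeholders, not values taken from this guide.

.. code-block:: none

   # keepalived.conf: a minimal VRRP instance holding the HAProxy VIP
   vrrp_instance haproxy_vip {
       state BACKUP
       interface eth0
       virtual_router_id 51
       priority 101
       virtual_ipaddress {
           192.0.2.10/24
       }
   }

   # haproxy.cfg: load balancing one TCP service behind that VIP
   listen dashboard_cluster
       bind 192.0.2.10:443
       balance source
       option tcpka
       server controller1 192.0.2.11:443 check inter 2000 rise 2 fall 5
       server controller2 192.0.2.12:443 check inter 2000 rise 2 fall 5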
Architecture limitations
~~~~~~~~~~~~~~~~~~~~~~~~
This architecture has some inherent limitations that should be kept in
mind during deployment and daily operations.
The following sections describe these limitations.
#. Keepalived and network partitions
   In the case of a network partition, two or more nodes running
   keepalived may claim to hold the same VIP, which can lead to
   undesired behaviour. Since keepalived uses VRRP over multicast to
   elect a master (VIP owner), a network partition in which keepalived
   nodes cannot communicate results in the VIP existing on two nodes.
   When the network partition is resolved, the duplicate VIPs should
   also be resolved. Note that this network partition problem with VRRP
   is a known limitation of this architecture.
#. Cinder-volume as a single point of failure
   There are currently concerns over the ability of the cinder-volume
   service to run as a fully active-active service. During the Mitaka
   timeframe, this is being worked on; see:
   https://blueprints.launchpad.net/cinder/+spec/cinder-volume-active-active-support
   Thus, cinder-volume runs on only one of the controller nodes, even
   though it is configured on all of them. If the node running
   cinder-volume fails, the service should be started on a surviving
   controller node (a sketch of this manual recovery follows the list).
#. Neutron-lbaas-agent as a single point of failure
The current design of the neutron LBaaS agent using the HAProxy
driver does not allow high availability for the project load
balancers. The neutron-lbaas-agent service will be enabled and
running on all controllers, allowing for load balancers to be
distributed across all nodes. However, a controller node failure
will stop all load balancers running on that node until the service
is recovered or the load balancer is manually removed and created
again.
#. Service monitoring and recovery required
   An external service monitoring infrastructure is required to check
   OpenStack service health and notify operators of any failure. This
   architecture does not provide any facility for that, so the
   OpenStack deployment must be integrated with an existing monitoring
   environment.
#. Manual recovery after a full cluster restart
   Some support services used by RDO or RHEL OSP use their own form of
   application clustering. Usually, these services maintain a cluster
   quorum that may be lost if all cluster nodes restart simultaneously,
   for example during a power outage. Each service then requires its
   own procedure to regain quorum.
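As a hedged illustration of the manual recovery described in the
cinder-volume limitation above, the following console sketch checks the
service status and then starts cinder-volume on a surviving controller.
The ``openstack-cinder-volume`` unit name is the one used by RDO packages
and may differ on other distributions.

.. code-block:: console

   # With admin credentials, check where cinder-volume is reported up:
   $ openstack volume service list
   # On a surviving controller node, start the service manually:
   $ sudo systemctl start openstack-cinder-volume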
If you find any or all of these limitations concerning, you are
encouraged to refer to the
:doc:`Pacemaker HA architecture<intro-ha-arch-pacemaker>` instead.


@@ -42,21 +42,37 @@ Networking for high availability.
Common deployment architectures
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are primarily two HA architectures in use today.
There are primarily two recommended architectures for making OpenStack
highly available.
One uses a cluster manager such as Pacemaker or Veritas to co-ordinate
the actions of the various services across a set of machines. Since
we are focused on FOSS, we will refer to this as the Pacemaker
architecture.
Both use a cluster manager such as Pacemaker or Veritas to
orchestrate the actions of the various services across a set of
machines. Since we are focused on FOSS, we will refer to these as
Pacemaker architectures.
The other is optimized for Active/Active services that do not require
any inter-machine coordination. In this setup, services are started by
your init system (systemd in most modern distributions) and a tool is
used to move IP addresses between the hosts. The most common package
for doing this is keepalived.
The architectures differ in the sets of services managed by the
cluster.
Traditionally, Pacemaker has been positioned as an all-encompassing
solution. However, as OpenStack services have matured, they are
increasingly able to run in an active/active configuration and
gracefully tolerate the disappearance of the APIs on which they
depend.
With this in mind, some vendors are restricting Pacemaker's use to
services that must operate in an active/passive mode (such as
cinder-volume), those with multiple states (for example, Galera) and
those with complex bootstrapping procedures (such as RabbitMQ).
The majority of services, needing no real orchestration, are handled
by systemd on each node. This approach avoids the need to coordinate
service upgrades or location changes with the cluster and has the
added advantage of more easily scaling beyond Corosync's 16-node
limit. However, it will generally require the addition of an
enterprise monitoring solution such as Nagios or Sensu for those
wanting centralized failure reporting.
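A rough sketch of this split, assuming RDO service names and the ``pcs``
command line (both of which vary between distributions), places only the
active/passive service under Pacemaker and leaves the remaining services
to systemd on every controller:

.. code-block:: console

   # Managed by Pacemaker because it must run active/passive:
   $ pcs resource create cinder-volume systemd:openstack-cinder-volume
   # Services that can run active/active are simply enabled under systemd:
   $ systemctl enable --now openstack-nova-conductor httpd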
.. toctree::
:maxdepth: 1
intro-ha-arch-pacemaker.rst
intro-ha-arch-keepalived.rst