Juju Charm - HACluster
Latest commit 8200221a30 (Trent Lloyd): Enforce no-quorum-policy=stop for all cluster sizes
Previously, quorum was only enforced on clusters with 3 or more nodes,
under the mistaken assumption that it is not possible to have quorum
with only 2 nodes. The corosync votequorum agent, as configured, allows
quorum in 2-node scenarios using the "two_node" option, which is
already configured by the charm.

In this new scenario, corosync requires that both nodes are present in
order to initially form cluster quorum, but it also allows a single
surviving node to keep quorum, or take over, once it has started while
in contact with the other node.

The net effect of this change is that nodes are unable to start up
independently (which is when split-brain situations are frequently seen,
due to network startup delays, etc.). There is no change to the runtime
behavior (there is still a risk that both nodes go active if the
network connection between them is interrupted; this is an inherent
risk of two-node clusters and requires a 3-node cluster to fix).

Thus we update the CRM configuration to always set no-quorum-policy=stop
regardless of whether the cluster has 2 or 3+ nodes.

In the event that you need to start up a cluster manually with only 1
node, first verify that the second node is definitely either powered off
or that corosync/pacemaker and all managed resources on it are stopped
(so it cannot cause a split brain, because it cannot start up again
until it is in contact with the other node). Then you can override
cluster startup with this command, which temporarily sets the expected
votes to 1 instead of 2:
$ corosync-quorumtool -e1

Once the second node comes back up and corosync reconnects, the expected
vote count is automatically reset to the configured value (it is also
reset if corosync is restarted).

Change-Id: Ica6a3ba387a4ab362400a25ff2ba0145e0218e1f
2018-02-09 14:10:46 +08:00

README.md

Overview

The hacluster subordinate charm provides corosync and pacemaker cluster configuration for principal charms which support the hacluster container-scoped relation.

The charm will only configure for HA once more than one service unit is present.

Usage

NOTE: The hacluster subordinate charm requires multicast network support, so this charm will NOT work in EC2 or in other clouds which block multicast traffic. It is intended for use in MAAS-managed environments of physical hardware.

To deploy the charm:

juju deploy hacluster mysql-hacluster

To enable HA clustering support (for mysql for example):

juju deploy -n 2 mysql
juju deploy -n 3 ceph
juju set mysql vip="192.168.21.1"
juju add-relation mysql ceph
juju add-relation mysql mysql-hacluster

The principal charm must have explicit support for the hacluster interface in order for clustering to occur; otherwise nothing actually gets configured.
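
As a quick check (a sketch, assuming the crm shell installed by the charm is available on the units and that Juju's run command is usable), the resulting pacemaker cluster can be inspected from any principal unit:

juju run --unit mysql/0 'sudo crm status'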

Settings

It is best practice to set cluster_count to the number of expected units in the cluster. The charm will build the cluster without this setting; however, race conditions may occur in which one node is not yet aware of the total number of relations to other hacluster units, causing the corosync and pacemaker services to fail to complete startup.

Setting cluster_count helps guarantee the hacluster charm waits until all expected peer relations are available before building the corosync cluster.
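
For example, for the two mysql units deployed above (cluster_count is this charm's configuration option; the juju set syntax matches the earlier examples):

juju set mysql-hacluster cluster_count=2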

HA/Clustering

There are two mutually exclusive high availability options: using virtual IP(s) or DNS.

To use virtual IP(s) the clustered nodes must be on the same subnet such that the VIP is a valid IP on the subnet for one of the node's interfaces and each node has an interface in said subnet. The VIP becomes a highly-available API endpoint.
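
For example, extending the mysql deployment above (vip_cidr and vip_iface are configuration options of the principal mysql charm, as referenced in the resource_params example later in this README; the values shown are illustrative):

juju set mysql vip="192.168.21.1" vip_cidr=24 vip_iface=eth0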

To use DNS high availability there are several prerequisites. However, DNS HA does not require the clustered nodes to be on the same subnet. Currently the DNS HA feature is only available for MAAS 2.0 or greater environments. MAAS 2.0 requires Juju 2.0 or greater. The MAAS 2.0 client requires Ubuntu 16.04 or greater. The clustered nodes must have static or "reserved" IP addresses registered in MAAS. The DNS hostname(s) must be pre-registered in MAAS before use with DNS HA.
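
A minimal sketch of the hacluster side of a DNS HA deployment (the maas_url and maas_credentials option names are assumptions here; confirm them against this charm's config.yaml, and note that the principal charm must also request DNS HA and supply the hostname(s) to register):

juju set mysql-hacluster maas_url="http://<maas-host>/MAAS" maas_credentials="<maas-api-key>"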

The charm will throw an exception in the following circumstance: if running on a version of Ubuntu older than Xenial 16.04.

Usage for Charm Authors

The hacluster interface supports a number of different cluster configuration options.

Mandatory Relation Data (deprecated)

Principal charms should provide basic corosync configuration:

corosync_bindiface: The network interface to use for cluster messaging.
corosync_mcastport: The multicast port to use for cluster messaging.

However, these can also be provided via configuration on the hacluster charm itself. If configuration is provided directly to the hacluster charm, it is preferred over the relation options from the principal charm.
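
For illustration, a principal charm's ha-relation-joined hook might hand these values to hacluster roughly as follows (a sketch using charm-helpers; the ha-bindiface and ha-mcastport option names are assumptions, not part of this interface):

from charmhelpers.core.hookenv import config, relation_set

# Sketch: pass corosync settings to the hacluster subordinate over the
# ha relation. If the hacluster charm's own configuration is set, it
# takes precedence over these values.
relation_set(
    corosync_bindiface=config('ha-bindiface'),
    corosync_mcastport=config('ha-mcastport'),
)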

Resource Configuration

The hacluster interface provides support for a number of different ways of configuring cluster resources. All examples are provided in Python.

NOTE: The hacluster charm interprets the data provided as Python dicts, so it is also possible to provide these as literal strings from charms written in other languages.
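
For example, the init_services value shown below could equally be passed as the literal string "{'res_mysqld': 'mysql'}" by a charm written in another language.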

init_services

Services which will be managed by pacemaker once the cluster is created:

init_services = {
    'res_mysqld':'mysql',
    }

These services will be stopped prior to configuring the cluster.

resources

Resources are the basic cluster resources that will be managed by pacemaker. In the mysql charm, this includes a block device, the filesystem, a virtual IP address and the mysql service itself:

resources = {
    'res_mysql_rbd':'ocf:ceph:rbd',
    'res_mysql_fs':'ocf:heartbeat:Filesystem',
    'res_mysql_vip':'ocf:heartbeat:IPaddr2',
    'res_mysqld':'upstart:mysql',
    }

resource_params

Parameters which should be used when configuring the resources specified:

resource_params = {
    'res_mysql_rbd':'params name="%s" pool="images" user="%s" secret="%s"' % \
                    (config['rbd-name'], SERVICE_NAME, KEYFILE),
    'res_mysql_fs':'params device="/dev/rbd/images/%s" directory="%s" '
                   'fstype="ext4" op start start-delay="10s"' % \
                    (config['rbd-name'], DATA_SRC_DST),
    'res_mysql_vip':'params ip="%s" cidr_netmask="%s" nic="%s"' %\
                    (config['vip'], config['vip_cidr'], config['vip_iface']),
    'res_mysqld':'op start start-delay="5s" op monitor interval="5s"',
    }

groups

Resources which should be managed as a single set of resources on the same service unit:

groups = {
    'grp_mysql':'res_mysql_rbd res_mysql_fs res_mysql_vip res_mysqld',
    }

clones

Resources which should run on every service unit participating in the cluster:

clones = {
    'cl_haproxy': 'res_haproxy_lsb'
    }
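
Putting it together, a principal charm typically publishes these settings on its ha relation so the hacluster subordinate can build the corosync/pacemaker configuration from them. A minimal sketch using charm-helpers (the dicts are sent as literal strings, as noted above; recent versions of the charm can also accept json-encoded relation data):

from charmhelpers.core.hookenv import relation_set

# Sketch: publish the cluster description on the ha relation. Each value
# is the literal string form of the corresponding Python dict defined above.
relation_set(
    init_services=repr(init_services),
    resources=repr(resources),
    resource_params=repr(resource_params),
    groups=repr(groups),
    clones=repr(clones),
)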