Move cold boot section to CDG

Also move the section up the page to reflect
its importance.

Depends-on: Ic7ad2622b6aa3c8ae8821a8f5f531534d8cfb3d6
Change-Id: I31ba8a968cd7f42d639e1843e66714dbd0a067aa
Peter Matulis 2020-11-11 15:40:40 -05:00
parent a197911d6c
commit bf6cecd936
1 changed file with 8 additions and 95 deletions

README.md

@@ -58,6 +58,13 @@ for mysql can be retrieved using the following command:
Root user DB access is only usable from within one of the deployed units
(access to root is restricted to localhost only).
## Cold boot
When the machines hosting the percona-cluster units are started, particular
steps must be taken for the application to assume a clustered and healthy
state. These steps are documented in the [OpenStack Charms Deployment
Guide][cdg-percona-startup].
## Limitations
Note that Percona XtraDB Cluster is not a 'scale-out' MySQL solution; reads
@@ -202,101 +209,6 @@ Upstream documentation is also available:
* [Percona XtraDB Cluster In-Place Upgrading Guide: From 5.5 to 5.6][upstream-upgrading-55-to-56]
* [Galera replication - how to recover a PXC cluster][upstream-recovering]
## Cold Boot
In the event of an unexpected power outage and cold boot, the cluster will be
unable to reestablish itself without manual intervention.
The cluster will be in scenario 3 or 6 from the upstream [Percona Cluster
documentation](https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/).
Please read the upstream documentation as it provides context to the steps
outlined here. In either scenario, it is necessary to choose a unit to become
the bootstrap node.
### Determine the node with the highest sequence number
This information can be found in the
`/var/lib/percona-xtradb-cluster/grastate.dat` file. The charm also displays
it in the `juju status` output.
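As an illustration of reading that file by hand, here is a minimal Python sketch that extracts the sequence number. It assumes the usual `key: value` layout of `grastate.dat`. The function name `parse_grastate` and the sample contents (including the placeholder uuid) are hypothetical and not part of the charm.

```python
def parse_grastate(text: str) -> dict:
    """Return the key/value pairs of a grastate.dat file as a dict of strings."""
    state = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the "# GALERA saved state" comment header.
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        state[key.strip()] = value.strip()
    return state


# Hypothetical file contents; the uuid is a placeholder.
sample = """\
# GALERA saved state
version: 2.1
uuid:    00000000-0000-0000-0000-000000000000
seqno:   355
safe_to_bootstrap: 0
"""

state = parse_grastate(sample)
print(int(state["seqno"]), state["safe_to_bootstrap"])
```

On a real unit the text would come from reading `/var/lib/percona-xtradb-cluster/grastate.dat`, which typically requires root privileges.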
Example `juju status` after a cold boot of `percona-cluster`
    Unit                 Workload  Agent  Machine  Public address  Ports     Message
    keystone/0*          active    idle   0        10.5.0.32       5000/tcp  Unit is ready
    percona-cluster/0    blocked   idle   1        10.5.0.20       3306/tcp  MySQL is down. Sequence Number: 355. Safe To Bootstrap: 0
      hacluster/0        active    idle            10.5.0.20                 Unit is ready and clustered
    percona-cluster/1    blocked   idle   2        10.5.0.17       3306/tcp  MySQL is down. Sequence Number: 355. Safe To Bootstrap: 0
      hacluster/1        active    idle            10.5.0.17                 Unit is ready and clustered
    percona-cluster/2*   blocked   idle   3        10.5.0.27       3306/tcp  MySQL is down. Sequence Number: 355. Safe To Bootstrap: 0
      hacluster/2*       active    idle            10.5.0.27                 Unit is ready and clustered
*Note*: An application leader is denoted by an asterisk in the Unit column.
In the above example all the sequence numbers match. This means we can
bootstrap from any unit we choose.
In the next example the percona-cluster/2 node has the highest sequence number
so we must choose that node to avoid data loss.
    Unit                 Workload  Agent  Machine  Public address  Ports     Message
    keystone/0*          active    idle   0        10.5.0.32       5000/tcp  Unit is ready
    percona-cluster/0*   blocked   idle   1        10.5.0.20       3306/tcp  MySQL is down. Sequence Number: 1318. Safe To Bootstrap: 0
      hacluster/0*       active    idle            10.5.0.20                 Unit is ready and clustered
    percona-cluster/1    blocked   idle   2        10.5.0.17       3306/tcp  MySQL is down. Sequence Number: 1318. Safe To Bootstrap: 0
      hacluster/1        active    idle            10.5.0.17                 Unit is ready and clustered
    percona-cluster/2    blocked   idle   3        10.5.0.27       3306/tcp  MySQL is down. Sequence Number: 1325. Safe To Bootstrap: 0
      hacluster/2        active    idle            10.5.0.27                 Unit is ready and clustered
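Selecting the unit with the highest sequence number can be sketched in Python as follows. This is only an illustration: it assumes the status messages have already been collected into a dict keyed by unit name, and that they follow the `Sequence Number: N` wording shown in the examples above. The names `MESSAGE_RE` and `pick_bootstrap_unit` are hypothetical.

```python
import re

# Matches the workload status message format shown in the examples above.
MESSAGE_RE = re.compile(r"Sequence Number: (-?\d+)\.")


def pick_bootstrap_unit(messages: dict) -> str:
    """Return the unit whose status message reports the highest sequence number."""
    seqnos = {}
    for unit, message in messages.items():
        match = MESSAGE_RE.search(message)
        if match:
            seqnos[unit] = int(match.group(1))
    if not seqnos:
        raise ValueError("no unit reported a sequence number")
    # When several units tie, the first one encountered is returned.
    return max(seqnos, key=seqnos.get)


messages = {
    "percona-cluster/0": "MySQL is down. Sequence Number: 1318. Safe To Bootstrap: 0",
    "percona-cluster/1": "MySQL is down. Sequence Number: 1318. Safe To Bootstrap: 0",
    "percona-cluster/2": "MySQL is down. Sequence Number: 1325. Safe To Bootstrap: 0",
}
print(pick_bootstrap_unit(messages))  # percona-cluster/2
```

When all sequence numbers match, any unit is a valid choice, so returning the first one is consistent with the guidance above.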
### Bootstrap the node with the highest sequence number
Run the `bootstrap-pxc` action on the node with the highest sequence number. In
this example, it is unit percona-cluster/2, which happens to be a non-leader.
    juju run-action --wait percona-cluster/2 bootstrap-pxc
### Notify the cluster of the new bootstrap UUID
In the vast majority of cases, once the `bootstrap-pxc` action has been run and
the model has settled, the output of the `juju status` command will look like
this:
    Unit                 Workload  Agent  Machine  Public address  Ports     Message
    keystone/0*          active    idle   0        10.5.0.32       5000/tcp  Unit is ready
    percona-cluster/0*   waiting   idle   1        10.5.0.20       3306/tcp  Unit waiting for cluster bootstrap
      hacluster/0*       active    idle            10.5.0.20                 Unit is ready and clustered
    percona-cluster/1    waiting   idle   2        10.5.0.17       3306/tcp  Unit waiting for cluster bootstrap
      hacluster/1        active    idle            10.5.0.17                 Unit is ready and clustered
    percona-cluster/2    waiting   idle   3        10.5.0.27       3306/tcp  Unit waiting for cluster bootstrap
      hacluster/2        active    idle            10.5.0.27                 Unit is ready and clustered
If you observe the above output ("Unit waiting for cluster bootstrap") then the
`notify-bootstrapped` action needs to be run on a unit. There are two
possibilities:
1. If the `bootstrap-pxc` action was run on a leader then run
`notify-bootstrapped` on a non-leader.
2. If the `bootstrap-pxc` action was run on a non-leader then run
`notify-bootstrapped` on the leader.
In the current example, the first action was run on a non-leader so we'll run
the second action on the leader, percona-cluster/0:
    juju run-action percona-cluster/0 notify-bootstrapped --wait
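The two-case rule for choosing where to run `notify-bootstrapped` can be expressed as a small Python sketch. The function name `notify_target` and its parameters are hypothetical and serve only to illustrate the rule. The sketch assumes the caller already knows which unit is the leader and which unit was bootstrapped.

```python
def notify_target(units: list, leader: str, bootstrapped: str) -> str:
    """Pick the unit on which to run notify-bootstrapped."""
    # Case 2: bootstrap-pxc ran on a non-leader, so notify from the leader.
    if bootstrapped != leader:
        return leader
    # Case 1: bootstrap-pxc ran on the leader, so notify from any non-leader.
    for unit in units:
        if unit != leader:
            return unit
    raise ValueError("no non-leader unit available")


units = ["percona-cluster/0", "percona-cluster/1", "percona-cluster/2"]
print(notify_target(units, leader="percona-cluster/0",
                    bootstrapped="percona-cluster/2"))  # percona-cluster/0
```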
After the model settles, the output should show all nodes in an active and
ready state:
    Unit                 Workload  Agent  Machine  Public address  Ports     Message
    keystone/0*          active    idle   0        10.5.0.32       5000/tcp  Unit is ready
    percona-cluster/0*   active    idle   1        10.5.0.20       3306/tcp  Unit is ready
      hacluster/0*       active    idle            10.5.0.20                 Unit is ready and clustered
    percona-cluster/1    active    idle   2        10.5.0.17       3306/tcp  Unit is ready
      hacluster/1        active    idle            10.5.0.17                 Unit is ready and clustered
    percona-cluster/2    active    idle   3        10.5.0.27       3306/tcp  Unit is ready
      hacluster/2        active    idle            10.5.0.27                 Unit is ready and clustered
The percona-cluster application is now back to a clustered and healthy state.
# Bugs
Please report bugs on [Launchpad][lp-bugs-charm-percona-cluster].
@@ -317,6 +229,7 @@ For general charm questions refer to the [OpenStack Charm Guide][cg].
[upstream-recovering]: https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/
[juju-docs-actions]: https://jaas.ai/docs/actions
[cdg-percona-migration-to-mysql8]: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-series-upgrade-specific-procedures.html#percona-cluster-charm-series-upgrade-to-focal
[cdg-percona-startup]: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-managing-power-events.html#id22
[mysql-router-charm]: https://jaas.ai/mysql-router
[mysql-innodb-cluster-charm]: https://jaas.ai/mysql-innodb-cluster
[cdg-procedures]: https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/app-series-upgrade-openstack.html#procedures