Work around ovn cluster failure during update when the schema changes.

During an update the ovndb server can have a schema change. The problem
is that an updated slave ovndb won't connect to a master which still
has the old db schema.  After a timeout (200000ms) pacemaker puts the
resource in a Timed Out error state.  Then it waits for the operator
to clean up the resource.  This means that the update can go like this:

 - Original state: nothing updated (M = Master, S = Slave, F = Failed)
   - ctl0-M-old
   - ctl1-S-old
   - ctl2-S-old
 - First state: after update of ctl0
   - ctl0-F-new
   - ctl1-M-old
   - ctl2-S-old
 - Second state: after update of ctl1
   - ctl0-F-new
   - ctl1-F-new
   - ctl2-M-old
 - Third and final state: after update of ctl2
   - ctl0-F-new
   - ctl1-F-new
   - ctl2-M-new

During the third state we have a cut in the control plane, as ctl2 is
the master and there is no slave to fall back to. Then we end up
losing HA, as only one node is active.  The error persists after a
reboot.  Only a pcs resource cleanup will bring the cluster back online.

The real solution will come from ovndb and the associated ocf agent,
but in the meantime, we work around it by running:
 - a cleanup of the resource;
 - a ban of the resource on the current node;
in step 1 and:
 - a cleanup of the resource;
 - an unban of the resource;
in step 5.
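
As a rough illustration only, the two steps can be sketched as a dry-run
shell script. The pcs subcommands are the ones used by the tasks in this
change; the PCS variable and the short_name, ovndb_step1 and ovndb_step5
helpers are hypothetical names made up for this sketch.

```shell
#!/bin/sh
# Dry-run sketch of the workaround; it only prints the commands.
# PCS is a hypothetical switch: set it to "pcs" on a real cluster node.
PCS="echo pcs"
RESOURCE="ovn-dbs-bundle"

# Short node name, computed like the ban task does (hostname | cut -d. -f1).
short_name() {
    printf '%s\n' "$1" | cut -d. -f1
}

# Step 1: clear the pacemaker error, then ban the resource on this node.
ovndb_step1() {
    $PCS resource cleanup "$RESOURCE"
    $PCS resource ban "$RESOURCE" "$(short_name "$1")"
}

# Step 5: clear any leftover error, then remove the ban.
ovndb_step5() {
    $PCS resource cleanup "$RESOURCE"
    $PCS resource clear "$RESOURCE"
}
```

Calling ovndb_step1 "$(hostname)" before the node is updated and
ovndb_step5 afterwards mirrors the order of the update tasks.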

This has the net effect of preventing the cut in the control plane for
the last node, as we move the master to an updated controller, which
then forms a cluster of one master and one slave (as two nodes are
updated).  The last one will happily join once it is updated.

That means:
 - we always have either 1 or 2 nodes working;
 - we end the update with the cluster converged back to a stable
   state.

The problems are:
 - we could hide a real ovndb cluster issue;
 - if the update breaks in between, we could have a leftover ban on one
   of the nodes.
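
Should an update break in between, a leftover ban could be spotted with
something like the sketch below. It assumes that ban constraints carry
an id of the form cli-ban-<resource>-on-<node> and that the constraint
listing is fed on stdin (for instance from "pcs constraint --full");
the leftover_bans helper is a hypothetical name, not part of this change.

```shell
#!/bin/sh
# Print the nodes that still carry a ban for the given resource.
# Reads a constraint listing on stdin. Assumption: ban constraint ids
# look like cli-ban-<resource>-on-<node>.
leftover_bans() {
    sed -n "s/.*cli-ban-$1-on-\([A-Za-z0-9._-]*\).*/\1/p" | sort -u
}
```

A possible use is "pcs constraint --full | leftover_bans ovn-dbs-bundle",
followed by "pcs resource clear ovn-dbs-bundle" to drop the ban.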

But, all things considered, this looks like the best compromise for
the time being.

Change-Id: I8f71bf83ddafca167deae1a38ca819f7d930fb80
Closes-Bug: #1847780
(cherry picked from commit 751b3fc096)
(cherry picked from commit d9c60ab05e)
Sofer Athlan-Guyot 2019-10-11 16:10:00 +02:00
parent 838ea794a9
commit 9ad87580ba
1 changed files with 26 additions and 0 deletions


@@ -199,6 +199,23 @@ outputs:
/var/log/containers/openvswitch.
ignore_errors: true
update_tasks:
# When a schema change happens, the newer slaves don't connect
# back to the older master and end up timing out. So we clean
# up the error here until we get a fix for
# https://bugzilla.redhat.com/show_bug.cgi?id=1759974
- name: Clear ovndb cluster pacemaker error
shell: "pcs resource cleanup ovn-dbs-bundle"
when:
- step|int == 1
# Then we ban the resource for this node. It has no effect on
# the first two controllers, but when we reach the last one,
# it avoids a cut in the control plane as the master gets chosen
# among the updated, Stopped ovn nodes. They are in error, which
# is why we need the cleanup just before.
- name: Ban ovndb resource on the current node.
shell: "pcs resource ban ovn-dbs-bundle $(hostname | cut -d. -f1)"
when:
- step|int == 1
- name: Get docker ovn-dbs image
set_fact:
ovn_dbs_docker_image: {get_param: DockerOvnDbsImage}
@@ -243,6 +260,15 @@ outputs:
shell: "docker tag {{ovn_dbs_docker_image}} {{ovn_dbs_docker_image_latest}}"
# Got to check that pacemaker_is_active is working fine with bundle.
# TODO: pacemaker_is_active resource doesn't support bundle.
# We remove any leftover error and remove the ban.
- name: Ensure the cluster converge back even in case of schema change
shell: "pcs resource cleanup ovn-dbs-bundle"
when:
- step|int == 5
- name: Remove the ban
shell: "pcs resource clear ovn-dbs-bundle"
when:
- step|int == 5
# When ovn-dbs-bundle support was added, we didn't tag the ovn-dbs image
# with pcmklatest. So, when update is run for the first time we need to
# update the ovn-dbs-bundle resource to use the 'pcmklatest' tagged image.