Prevent swacting to a 'Locking' controller

Locking a controller takes a finite amount of time, resulting in a
brief window between issuing a lock command toward the inactive
controller and the controller actually entering the locked state.

Typically, this window lasts only a few seconds. However, during
periods of high system activity, or while VM or other migrations are
in progress, it can extend to a minute or longer before the
controller enters the locked state.

In some cases, initiating a 'system host-swact' command while the
inactive controller is in this 'Locking but not yet Locked' state has
led to a switch of activity to a locked controller.

The current pre-swact semantic check does not prevent this race
condition, which can result in a locked active controller.

This update adds a precheck against a list of in-progress actions.
A swact request is now rejected if any of those actions is in
progress on the target controller.

Test Plan:

PASS: Verify sysinv package build.
PASS: Verify swact is rejected for any of the in-progress actions
      listed in the precheck.
PASS: Verify swact reject handling and output text.
PASS: Verify pep8 of changed lines.

Regression:

PASS: Verify swact handling when task is empty.
PASS: Verify swact handling when task is not empty and not Locking.
PASS: Verify swact soak (10x).

Closes-Bug: 2064347
Change-Id: I78238fa649c330d7b908dbcf50f654c004205ee6
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Eric MacDonald 2024-04-30 21:16:01 +00:00
parent 5ca9c6f78f
commit f29cc84ba3
1 changed files with 27 additions and 0 deletions

@@ -6496,6 +6496,33 @@ class HostController(rest.RestController):
                         ihost_ctr.subfunction_oper,
                         ihost_ctr.subfunctions))
        # Deny swact for specific in-progress actions.
        #
        # Need to handle the case where a lock action has been
        # issued against the peer controller that is shortly
        # followed by a swact before the lock is complete.
        #
        # Allowing the swact to continue can lead to activating
        # a locked standby controller.
        #
        # Note: The force_swact is not honored to prevent swact
        #       to a locked controller.
        #
        # The following is a list of host actions that if
        # any are in-progress should force a reject of the
        # swact request.
        if ihost_ctr.task:
            rejected_actions = [constants.LOCK_ACTION,
                                constants.FORCE_LOCK_ACTION,
                                constants.FORCE_UNSAFE_LOCK_ACTION]
            for reject_action in rejected_actions:
                task_str = hostupdate.get_task_from_action(reject_action)
                if ihost_ctr.task.startswith(task_str):
                    raise wsme.exc.ClientSideError(
                        _("%s is currently being locked. Cannot "
                          "Swact to a controller with an in-progress "
                          "lock operation.") % (ihost_ctr.hostname))
        # deny swact if a kube rootca update phase is in progress
        self._semantic_check_swact_kube_rootca_update(hostupdate.ihost_orig,
                                                      force_swact)
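
The precheck added above can be exercised in isolation with a minimal standalone sketch. It is not the sysinv implementation. The constant values, the TASK_PREFIX mapping (a stand-in for hostupdate.get_task_from_action), the SwactRejected exception (a stand-in for wsme.exc.ClientSideError), and the function name check_swact_allowed are all assumptions made for illustration.

```python
# Hypothetical stand-ins for the sysinv constants; the string values
# are assumptions, not taken from the sysinv source.
LOCK_ACTION = "lock"
FORCE_LOCK_ACTION = "force-lock"
FORCE_UNSAFE_LOCK_ACTION = "force-unsafe-lock"

# Assumed mapping from an action to the prefix of its in-progress task
# string. In the real change this comes from
# hostupdate.get_task_from_action().
TASK_PREFIX = {
    LOCK_ACTION: "Locking",
    FORCE_LOCK_ACTION: "Force Locking",
    FORCE_UNSAFE_LOCK_ACTION: "Force Locking",
}

REJECTED_ACTIONS = (LOCK_ACTION, FORCE_LOCK_ACTION, FORCE_UNSAFE_LOCK_ACTION)


class SwactRejected(Exception):
    """Hypothetical stand-in for wsme.exc.ClientSideError."""


def check_swact_allowed(hostname, task):
    """Raise SwactRejected if the target controller has a lock in progress.

    'task' is the target controller's current task string; None or an
    empty string means no action is in progress.
    """
    if not task:
        return
    for action in REJECTED_ACTIONS:
        # Prefix match, mirroring the startswith() test in the diff.
        if task.startswith(TASK_PREFIX[action]):
            raise SwactRejected(
                "%s is currently being locked. Cannot Swact to a "
                "controller with an in-progress lock operation." % hostname)


if __name__ == "__main__":
    check_swact_allowed("controller-1", "")  # no task: swact allowed
    try:
        check_swact_allowed("controller-1", "Locking")
    except SwactRejected as exc:
        print("rejected:", exc)
```

As in the diff, the sketch takes no force flag, so a forced swact cannot bypass the rejection.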