Fix the health monitor traffic test member ERROR

Neutron may be slow to start passing traffic after a health
monitor has been defined on a pool. We have seen "Layer4 timeout"
errors[1] in some of the gate job runs, where the health monitor
traffic only starts getting a response from the backend member
server a few seconds later.

This patch changes the waiter to allow an "ERROR" status for the
initial member check after the health monitor is added. The waiter
will still time out if the member does not become "ONLINE" as
expected.
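
The behaviour described above can be sketched as a small polling loop.
This is a minimal illustration, not the plugin's actual waiter: the
function name wait_for_status_sketch, the two exception classes, and the
fake "show" callable are all hypothetical stand-ins. Unlike the patched
code, the sketch checks the timeout in its own "if", so a member that
stays in ERROR is still bounded by check_timeout.

```python
import time


class UnexpectedStatus(Exception):
    """Hypothetical stand-in for the error the real waiter raises."""


class WaitTimeout(Exception):
    """Hypothetical stand-in for the real timeout exception."""


def wait_for_status_sketch(show, status_key, status, check_interval,
                           check_timeout, error_ok=False):
    """Poll show() until show()[status_key] equals status.

    An ERROR status raises immediately unless error_ok is true.
    The timeout is enforced in both cases.
    """
    start = time.monotonic()
    while True:
        current = show()[status_key]
        if current == status:
            return current
        if current == 'ERROR' and not error_ok:
            raise UnexpectedStatus(
                '{key} went to ERROR while waiting for {expected}'.format(
                    key=status_key, expected=status))
        if time.monotonic() - start >= check_timeout:
            raise WaitTimeout(
                '{key} did not reach {expected} within {timeout} '
                'seconds'.format(key=status_key, expected=status,
                                 timeout=check_timeout))
        time.sleep(check_interval)


# Usage: a fake member that reports ERROR twice before coming ONLINE,
# mimicking the slow start of health monitor traffic.
_states = iter(['ERROR', 'ERROR', 'ONLINE'])
result = wait_for_status_sketch(
    lambda: {'operating_status': next(_states)},
    'operating_status', 'ONLINE',
    check_interval=0, check_timeout=5, error_ok=True)
```

With error_ok left at its default of False, the same fake member would
raise on the first poll, which matches the gate failure this patch is
meant to avoid.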

Because the Zuul log viewer is broken, I can't link to the log line,
but it is at timestamp: Aug 30 01:20:40

[1] https://openstack.fortnebula.com:13808/v1/ \
    AUTH_e8fd161dc34c421a979a9e6421f823e9/logs_58/679358/2/check/ \
    octavia-v2-dsvm-scenario/1bcb675/controller/logs/ \
    octavia-amphora_log.txt.gz

Change-Id: Ic55fabe94627b21a6f347e5822893a8b63cd1afb
Michael Johnson 2019-08-30 11:11:42 -07:00
parent 7140479919
commit 8d420801d7
2 changed files with 5 additions and 2 deletions

@@ -338,6 +338,7 @@ class TrafficOperationsScenarioTest(test_base.LoadBalancerBaseTestWithCompute):
             const.ONLINE,
             CONF.load_balancer.build_interval,
             CONF.load_balancer.build_timeout,
+            error_ok=True,
             pool_id=self.pool_id)
         waiters.wait_for_status(
             self.mem_member_client.show_member,

@@ -28,7 +28,7 @@ LOG = logging.getLogger(__name__)
 def wait_for_status(show_client, id, status_key, status,
                     check_interval, check_timeout, root_tag=None,
-                    **kwargs):
+                    error_ok=False, **kwargs):
     """Waits for an object to reach a specific status.
     :param show_client: The tempest service client show method.
@@ -40,6 +40,7 @@ def wait_for_status(show_client, id, status_key, status,
     :check_interval: How often to check the status, in seconds.
     :check_timeout: The maximum time, in seconds, to check the status.
     :root_tag: The root tag on the response to remove, if any.
+    :error_ok: When true, ERROR status will not raise an exception.
     :raises CommandFailed: Raised if the object goes into ERROR and ERROR was
                            not the desired status.
     :raises TimeoutException: The object did not achieve the status or ERROR in
@@ -75,7 +76,8 @@ def wait_for_status(show_client, id, status_key, status,
             if caller:
                 message = '({caller}) {message}'.format(caller=caller,
                                                         message=message)
-            raise exceptions.UnexpectedResponseCode(message)
+            if not error_ok:
+                raise exceptions.UnexpectedResponseCode(message)
         elif int(time.time()) - start >= check_timeout:
             message = (
                 '{name} {field} failed to update to {expected_status} within '