From 2c7f377b157ea174e6a645cc8ac9e95a45f33727 Mon Sep 17 00:00:00 2001
From: Sean Dague
Date: Fri, 28 Jul 2017 11:29:18 +0000
Subject: [PATCH] Wait for compute service to check in

With cell v2, on initial bring up, discover hosts can't run unless all
the compute nodes have checked in. The documentation says that you
should run ``nova service-list --binary nova-compute`` and see all
your hosts before running discover hosts. This isn't really viable in
a multinode devstack because of how things are brought up in parts.

We can however know that stack.sh will not complete before the compute
node is up by waiting for the compute node to check in before
completing. This happens quite late in the stack.sh run, so shouldn't
add any extra time in most runs.

Cells v1 and Xenserver don't use real hostnames in the service table
(they encode complex data that is hostname like to provide more
topology information than just hostnames). They are exempted from this
check.

Related-Bug: #1708039
Change-Id: I32eb59b9d6c225a3e93992be3a3b9f4b251d7189
(cherry picked from commit c2fe916fc7c6c00cdfa0085e198eaf2ad4d915d1)
---
 functions | 20 ++++++++++++++++++++
 lib/nova  | 22 ++++++++++++++++++++++
 stack.sh  |  7 +++++++
 3 files changed, 49 insertions(+)

diff --git a/functions b/functions
index a9e1451755..5504ef5876 100644
--- a/functions
+++ b/functions
@@ -402,6 +402,26 @@ EOF
     return $rval
 }
 
+function wait_for_compute {
+    local timeout=$1
+    local rval=0
+    time_start "wait_for_service"
+    timeout $timeout bash -x < 30 seconds
+    # happen between here and the script ending. However, in multinode
+    # tests this can very often not be the case. So ensure that the
+    # compute is up before we move on.
+    if is_service_enabled n-cell; then
+        # cells v1 can't complete the check below because it munges
+        # hostnames with cell information (grumble grumble).
+        return
+    fi
+    # TODO(sdague): honestly, this probably should be a plug point for
+    # an external system.
+    if [[ "$VIRT_DRIVER" == 'xenserver' ]]; then
+        # xenserver encodes information in the hostname of the compute
+        # because of the dom0/domU split. Just ignore for now.
+        return
+    fi
+    wait_for_compute 60
+}
+
 function start_nova {
     start_nova_rest
     start_nova_compute
diff --git a/stack.sh b/stack.sh
index a369b84e52..ac2fc23c63 100755
--- a/stack.sh
+++ b/stack.sh
@@ -1361,6 +1361,13 @@ fi
 
 # Sanity checks
 # =============
 
+# Check that computes are all ready
+#
+# TODO(sdague): there should be some generic phase here.
+if is_service_enabled n-cpu; then
+    is_nova_ready
+fi
+
 # Check the status of running services
 service_check
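
Note (not part of the patch): the commit message describes blocking until
the compute service has checked in, giving up after a timeout. The sketch
below illustrates that general poll-until-ready pattern only. It is not
the body of wait_for_compute, and it tracks the deadline itself with
bash's SECONDS counter rather than wrapping the loop in the timeout
command as the patch appears to. The names wait_until and
compute_has_checked_in are hypothetical, and the check is a stand-in
that tests for a marker file; a real check might inspect the output of
the ``nova service-list --binary nova-compute`` command mentioned above.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of devstack: run a check command once per
# second until it succeeds or the timeout (in seconds) elapses.
function wait_until {
    local timeout=$1
    shift
    local deadline=$(( SECONDS + timeout ))
    until "$@"; do
        if (( SECONDS >= deadline )); then
            echo "Check did not pass after $timeout seconds: $*" >&2
            return 1
        fi
        sleep 1
    done
    return 0
}

# Stand-in for the real readiness check (an assumption for illustration):
# succeeds once the given marker file exists.
function compute_has_checked_in {
    [[ -e "$1" ]]
}

# Usage: mktemp creates the marker file, so the check passes at once.
marker=$(mktemp)
wait_until 5 compute_has_checked_in "$marker" && echo "compute checked in"
rm -f "$marker"
```

Returning non-zero on timeout lets the caller decide whether a missing
check-in should abort the run or only be reported.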