Fix booting instances after nova-compute upgrade

After upgrading from Rocky to Stein, nova-compute services fail to start
new instances with the following error message:

Failed to allocate the network(s), not rescheduling.

Looking in the nova-compute logs, we also see this:

Neutron Reported failure on event
network-vif-plugged-60c05a0d-8758-44c9-81e4-754551567be5 for instance
32c493c4-d88c-4f14-98db-c7af64bf3324: NovaException: In shutdown, no new
events can be scheduled

During the upgrade process, we send nova containers a SIGHUP to cause
them to reload their object version state. According to the nova team on
IRC, this is a known issue: oslo.service performs a full shutdown in
response to a SIGHUP, which breaks nova-compute. There is a patch [1] in
review to address this.

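The workaround employed here is to restart the nova-compute services
instead of sending them a SIGHUP. The proxy services are restarted too,
because they exit non-zero in response to a SIGHUP.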

[1] https://review.openstack.org/#/c/641907

Change-Id: Ia4fcc558a3f62ced2d629d7a22d0bc1eb6b879f1
Closes-Bug: #1821362
Mark Goddard 2019-03-22 14:59:41 +00:00
parent 98df4dd841
commit 192dcd1e1b
3 changed files with 45 additions and 27 deletions


@@ -357,6 +357,17 @@ nova_safety_upgrade: "no"
 nova_libvirt_port: "16509"
 nova_ssh_port: "8022"
 
+nova_services_require_nova_conf:
+  - nova-api
+  - nova-compute
+  - nova-compute-ironic
+  - nova-conductor
+  - nova-consoleauth
+  - nova-novncproxy
+  - nova-serialproxy
+  - nova-scheduler
+  - nova-spicehtml5proxy
+
 ####################
 # Notification
 ####################


@@ -81,16 +81,6 @@
 - name: Copying over nova.conf
   become: true
   vars:
-    services_require_nova_conf:
-      - nova-api
-      - nova-compute
-      - nova-compute-ironic
-      - nova-conductor
-      - nova-consoleauth
-      - nova-novncproxy
-      - nova-serialproxy
-      - nova-scheduler
-      - nova-spicehtml5proxy
     service_name: "{{ item.key }}"
   merge_configs:
     sources:
@@ -105,7 +95,7 @@
   when:
     - inventory_hostname in groups[item.value.group]
     - item.value.enabled | bool
-    - item.key in services_require_nova_conf
+    - item.key in nova_services_require_nova_conf
   with_dict: "{{ nova_services }}"
   notify:
     - "Restart {{ item.key }} container"


@@ -1,21 +1,38 @@
 ---
 # This play calls sighup on every service to refresh upgrade levels
-- name: Sighup nova-api
-  command: docker exec -t nova_api kill -1 1
-  when: inventory_hostname in groups['nova-api']
 
-- name: Sighup nova-conductor
-  command: docker exec -t nova_conductor kill -1 1
-  when: inventory_hostname in groups['nova-conductor']
+# NOTE(mgoddard): Currently (just prior to Stein release), sending SIGHUP to
+# nova compute services leaves them in a broken state in which they cannot
+# start new instances. The following error is seen in the logs:
+# "In shutdown, no new events can be scheduled"
+# To work around this we restart the nova-compute services.
+# Speaking to the nova team, this seems to be an issue in oslo.service,
+# with a fix proposed here: https://review.openstack.org/#/c/641907.
+# This issue also seems to affect the proxy services, which exit non-zero in
+# response to a SIGHUP, so restart those too.
+# TODO(mgoddard): Remove this workaround when this bug has been fixed.
 
-- name: Sighup nova-consoleauth
-  command: docker exec -t nova_consoleauth kill -1 1
-  when: inventory_hostname in groups['nova-consoleauth']
+- name: Send SIGHUP to nova services
+  become: true
+  command: docker exec -t {{ item.value.container_name }} kill -1 1
+  when:
+    - inventory_hostname in groups[item.value.group]
+    - item.value.enabled | bool
+    - item.key in nova_services_require_nova_conf
+    - not item.key.startswith('nova-compute')
+    - not item.key.endswith('proxy')
+  with_dict: "{{ nova_services }}"
 
-- name: Sighup nova-scheduler
-  command: docker exec -t nova_scheduler kill -1 1
-  when: inventory_hostname in groups['nova-scheduler']
-
-- name: Sighup nova-compute
-  command: docker exec -t nova_compute kill -1 1
-  when: inventory_hostname in groups['compute']
+- name: Restart nova compute and proxy services
+  become: true
+  kolla_docker:
+    action: restart_container
+    common_options: "{{ docker_common_options }}"
+    name: "{{ item.value.container_name }}"
+  when:
+    - inventory_hostname in groups[item.value.group]
+    - item.value.enabled | bool
+    - item.key in nova_services_require_nova_conf
+    - item.key.startswith('nova-compute')
+      or item.key.endswith('proxy')
+  with_dict: "{{ nova_services }}"
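
As a rough illustration of how the two name-based filters in the tasks
above divide the services, the following Python sketch applies the same
startswith/endswith tests to the nova_services_require_nova_conf list.
It is a simplification, not part of the change: the helper names
needs_restart and partition are hypothetical, and the group membership
and "enabled" conditions, which depend on the inventory, are ignored.

```python
# Service names copied from nova_services_require_nova_conf in the diff.
NOVA_SERVICES_REQUIRE_NOVA_CONF = [
    "nova-api",
    "nova-compute",
    "nova-compute-ironic",
    "nova-conductor",
    "nova-consoleauth",
    "nova-novncproxy",
    "nova-serialproxy",
    "nova-scheduler",
    "nova-spicehtml5proxy",
]


def needs_restart(service):
    """Mirror the name test in the restart task's 'when' clause (hypothetical helper)."""
    return service.startswith("nova-compute") or service.endswith("proxy")


def partition(services):
    """Split service names into (sighup, restart) lists, preserving order."""
    sighup = [s for s in services if not needs_restart(s)]
    restart = [s for s in services if needs_restart(s)]
    return sighup, restart


sighup, restart = partition(NOVA_SERVICES_REQUIRE_NOVA_CONF)
print("SIGHUP: ", sighup)
print("restart:", restart)
```

Under these assumptions the SIGHUP group is nova-api, nova-conductor,
nova-consoleauth and nova-scheduler, which are the same four non-compute
services that had individual Sighup tasks before this change. Every other
service in the list is restarted instead.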