nova/nova/compute
Matt Riedemann 6369f39244 Remove allocations before setting vm_status to SHELVED_OFFLOADED
Tempest is intermittently failing a test which does the
following:

1. Create a server.
2. Shelve offload it.
3. Unshelve it.

Tempest waits for the server status to be SHELVED_OFFLOADED
before unshelving the server, which goes through the
scheduler to pick a compute node and claim resources on it.

When a server is shelve offloaded, the resource allocations
held by the instance against the compute node it was on are
cleared, which also deletes the internal consumer record in the
placement service.

The race is that the allocations are removed during shelve
offload *after* the server status changes to SHELVED_OFFLOADED.
This leaves a window where unshelve is going through the
scheduler and gets the existing allocations for the instance,
which are non-empty and have a consumer generation. The
claim_resources method in the scheduler then uses that
consumer generation when PUTing the allocations. That PUT
fails because in between the GET and PUT of the allocations,
placement has deleted the internal consumer record. When
PUTing the new allocations with a non-null consumer generation,
placement returns a 409 conflict error because for a new
consumer it expects the "consumer_generation" parameter to be
None.
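
The GET/DELETE/PUT interleaving described above can be reproduced
with a toy model. This is only a sketch: FakePlacement, Conflict and
demonstrate_race are hypothetical names invented for illustration,
and the in-memory class only mimics the consumer generation check
described in this message, not the real placement service or its
REST API.

```python
class Conflict(Exception):
    """Stands in for placement's HTTP 409 response (illustrative only)."""


class FakePlacement:
    """Hypothetical in-memory stand-in for placement's consumer handling."""

    def __init__(self):
        # consumer id -> {"generation": int, "allocations": dict}
        self._consumers = {}

    def get_allocations(self, consumer):
        record = self._consumers.get(consumer)
        if record is None:
            # Unknown consumer: no allocations and no generation.
            return {"allocations": {}, "consumer_generation": None}
        return {
            "allocations": dict(record["allocations"]),
            "consumer_generation": record["generation"],
        }

    def put_allocations(self, consumer, allocations, consumer_generation):
        record = self._consumers.get(consumer)
        # A new consumer must be written with a generation of None;
        # an existing one must match its stored generation.
        expected = None if record is None else record["generation"]
        if consumer_generation != expected:
            raise Conflict("consumer generation mismatch")
        new_generation = 0 if record is None else record["generation"] + 1
        self._consumers[consumer] = {
            "generation": new_generation,
            "allocations": dict(allocations),
        }

    def delete_allocations(self, consumer):
        # Removing the allocations also drops the consumer record.
        self._consumers.pop(consumer, None)


def demonstrate_race():
    placement = FakePlacement()
    placement.put_allocations("instance-1", {"VCPU": 1}, None)

    # Unshelve (scheduler): GET sees allocations and a generation.
    current = placement.get_allocations("instance-1")

    # Shelve offload (compute): allocations removed after the status
    # change, so this lands between the scheduler's GET and PUT.
    placement.delete_allocations("instance-1")

    # Unshelve (scheduler): PUT with the now stale generation.
    try:
        placement.put_allocations(
            "instance-1", {"VCPU": 1}, current["consumer_generation"])
    except Conflict:
        return "409"
    return "ok"
```

In this model the final PUT fails because the consumer no longer
exists, so only a generation of None would be accepted.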

This change handles the race by simply making sure the allocations
are deleted (along with the related consumer record in placement)
*before* the instance.vm_status is changed.
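
The intended ordering can be sketched as follows. This is not the
code in manager.py: FakeInstance, shelve_offload, run_example and the
delete_allocations callback are hypothetical stand-ins, used only to
show that the allocation cleanup happens before the status change
becomes visible.

```python
SHELVED_OFFLOADED = "shelved_offloaded"  # illustrative constant


class FakeInstance:
    """Hypothetical instance object that records when it is saved."""

    def __init__(self, events):
        self.vm_status = "shelved"
        self._events = events

    def save(self):
        # Record the status that becomes visible to API callers.
        self._events.append("save:" + self.vm_status)


def shelve_offload(instance, delete_allocations):
    # 1. Remove the allocations (and with them the consumer record).
    delete_allocations(instance)
    # 2. Only then expose the new status, so a caller waiting for
    #    SHELVED_OFFLOADED never sees stale allocations.
    instance.vm_status = SHELVED_OFFLOADED
    instance.save()


def run_example():
    events = []
    instance = FakeInstance(events)
    shelve_offload(
        instance, lambda inst: events.append("delete_allocations"))
    return events
```

With this ordering, any unshelve request issued after the status is
observed starts from an instance that has no allocations and no
consumer record.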

Change-Id: I2a6ccaff904c1f0759d55feeeef0ec1da32c65df
Closes-Bug: #1798688
2018-12-07 17:27:16 -05:00
monitors Remove translation of log messages 2017-06-09 09:06:16 +00:00
__init__.py Switch to using oslo_* instead of oslo.* 2015-02-06 06:03:10 -05:00
api.py Minimal construct plumbing for nova service-list when a cell is down 2018-10-31 15:22:15 -04:00
build_results.py Compute Add build_instance hook in compute manager 2014-12-04 10:12:00 -05:00
cells_api.py Minimal construct plumbing for nova service-list when a cell is down 2018-10-31 15:22:15 -04:00
claims.py [Trivial] docstrings, typos, minor refactoring 2017-08-28 08:33:58 -05:00
flavors.py Remove unused flavor_delete_info() method 2018-08-03 12:44:52 -04:00
instance_actions.py Add instance action record for snapshot instances 2017-12-11 17:46:38 +08:00
instance_list.py Minimal construct plumbing for nova list when a cell is down 2018-10-31 15:12:18 -04:00
manager.py Remove allocations before setting vm_status to SHELVED_OFFLOADED 2018-12-07 17:27:16 -05:00
migration_list.py Refactor scatter-gather utility to return exception objects 2018-10-31 15:18:07 -04:00
multi_cell_list.py Refactor scatter-gather utility to return exception objects 2018-10-31 15:18:07 -04:00
power_state.py Removed enum duplication from nova.compute 2016-09-02 07:30:44 +00:00
provider_tree.py Add missing ws seperator between words 2018-11-26 23:42:18 +00:00
resource_tracker.py Merge "Change the default values of XXX_allocation_ratio" 2018-12-06 23:44:42 +00:00
rpcapi.py Fix up compute rpcapi version for pike release 2018-10-22 15:15:49 +11:00
stats.py Change consecutive build failure limit to a weigher 2018-06-06 15:18:50 -07:00
task_states.py Fix resource tracker updates during instance evacuation 2018-09-12 13:05:29 +03:00
utils.py Fix sloppy initialization of the new disk ops semaphore. 2018-12-03 10:19:22 +11:00
vm_states.py Removed enum duplication from nova.compute 2016-09-02 07:30:44 +00:00