Merge "Add PENDING vm state"
This commit is contained in:
commit
43879f24bd
|
@ -0,0 +1,263 @@
|
|||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License.
|
||||
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
==========================
|
||||
Introduce Pending VM state
|
||||
==========================
|
||||
https://blueprints.launchpad.net/nova/+spec/introduce-pending-vm-state
|
||||
|
||||
This feature adds support for the PENDING server state. When the scheduler
|
||||
determines there is no capacity available for the given request, and so the
|
||||
instance is about to be routed into cell0, the server should be set into the
|
||||
PENDING state instead of ERROR if the operator wishes so. This will allow the
|
||||
execution of subsequent actions transparently to the end user.
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
Use Cases
|
||||
---------
|
||||
|
||||
As an operator, I want to enable an external -to Nova- service, triggered as
|
||||
soon as a server's build request fails due to NoValidHosts, and try to free up
|
||||
the requested resources.
|
||||
|
||||
If the outcome of the follow up actions is:
|
||||
|
||||
#. success, the external service will try to rebuild the instance::
|
||||
|
||||
POST /servers/{server_id}/action
|
||||
{
|
||||
"rebuild": {
|
||||
"description": null,
|
||||
"imageRef": {image_id}
|
||||
}
|
||||
}
|
||||
|
||||
NOTE::
|
||||
The rebuild api needs to be adapted to take care of instances that fail
|
||||
while building and are mapped to cell0. This change is considered out
|
||||
of scope for this spec and is being addressed by another spec [1]
|
||||
|
||||
#. failure, the external service will set the state of the instance to ERROR
|
||||
(using reset-state)::
|
||||
|
||||
POST /servers/{server_id}/action
|
||||
{
|
||||
"os-resetState": {
|
||||
"state": "error"
|
||||
}
|
||||
}
|
||||
|
||||
In order to achieve that, transparently to the user, the instance should not
|
||||
be set to the ERROR state but to the new PENDING state.
|
||||
|
||||
We need to clarify here that, as for all the other VM states, the end user
|
||||
will be able to delete instances set to the PENDING state. Failures to the
|
||||
follow up actions, caused by the deletion of instances in the new state, have
|
||||
to be handled by the external service.
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
#. Add the PENDING state in the InstanceState object.
|
||||
|
||||
#. Add the PENDING state in compute vm_states.
|
||||
|
||||
#. Add the PENDING state in the server ViewBuilder as a new progress status.
|
||||
|
||||
#. Add a configuration option that defaults to ``False`` in the DEFAULT group
|
||||
to enable the use of PENDING vm_state on NoValidHost events::
|
||||
|
||||
CONF.use_pending_state
|
||||
|
||||
#. Add the following code in the conductor manager ``_bury_in_cell0`` method
|
||||
to make sure that the a vm is set to PENDING only when the operator has
|
||||
chosen so and the failure reported by the scheduler is a NoValidHost::
|
||||
|
||||
verify = isinstance(exc, exception.NoValidHost)
|
||||
|
||||
if CONF.use_pending_state and verify:
|
||||
vm_state = vm_states.PENDING
|
||||
else:
|
||||
vm_state = vm_states.ERROR
|
||||
|
||||
updates = {'vm_state': vm_state, 'task_state': None}
|
||||
|
||||
#. Add a new API microversion and Map the PENDING state to ERROR for requests
|
||||
to previous microversions. See `REST API impact`_.
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Follow the vendor data example and perform an asynchronous REST API call from
|
||||
the Nova Conductor to the external service when enabled by the operator. But
|
||||
having an asynchronous REST API call from the conductor would potentially have
|
||||
performance impact.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
None.
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
A new API microversion is needed for this change. For the older microversions
|
||||
the PENDING state will be mapped to the ERROR state.
|
||||
|
||||
Example responses for a server set to PENDING would be::
|
||||
|
||||
GET /servers/detail (new microversion)
|
||||
{
|
||||
"servers":[
|
||||
{
|
||||
...: ...,
|
||||
"name": "test",
|
||||
"id":"2dd26c1e-bc6f-45f6-83b3-2cb72ea026eb",
|
||||
"OS-EXT-STS:vm_state":"pending",
|
||||
"status":"PENDING",
|
||||
...: ...
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
GET /servers/detail (previous microversions)
|
||||
{
|
||||
"servers":[
|
||||
{
|
||||
...: ...,
|
||||
"name": "test",
|
||||
"id":"2dd26c1e-bc6f-45f6-83b3-2cb72ea026eb",
|
||||
"OS-EXT-STS:vm_state":"error",
|
||||
"status":"ERROR",
|
||||
...: ...
|
||||
}
|
||||
]
|
||||
}
|
||||
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
None.
|
||||
|
||||
Notifications impact
|
||||
--------------------
|
||||
|
||||
Firstly, the external third party service has to be notified when a server is
|
||||
set to PENDING state. For this, the already existing versioned notification
|
||||
instance.update.
|
||||
|
||||
For the second part, a notification is needed in order to inform the external
|
||||
service about a server's build procedure outcome. The plan is to use this
|
||||
notification in order to enable the external Reaper service, to know where the
|
||||
requested resources have to be freed up.
|
||||
|
||||
The transformation of the existing ``select_destinations`` notification from
|
||||
unversioned to a versioned one, is out of scope of this spec. There is an
|
||||
existing change that tries to address the transformation of the notification
|
||||
and can be picked up [2].
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
The end user will have servers in PENDING state if the operator chooses to
|
||||
enable the configuration option ``CONF.use_pending_state``.
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
None.
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
There will be a new config option specifying if the PENDING state will be used
|
||||
or not. It seems that the most appropriate place for this option is the DEFAULT
|
||||
section.
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
None.
|
||||
|
||||
Upgrade impact
|
||||
--------------
|
||||
|
||||
None.
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
<ttsiouts>
|
||||
|
||||
Other contributors:
|
||||
<johnthetubaguy>
|
||||
<strigazi>
|
||||
<belmoreira>
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
See `Proposed change`_.
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
None.
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
Updating existing unit and functional tests should be enough to verify the
|
||||
use of the new state.
|
||||
New unit and functional tests have to be added to verify the new notification.
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
#. The new configuration option as well as the meaning of the PENDING state
|
||||
should be documented.
|
||||
|
||||
#. Update the allowed state transitions documentation to include::
|
||||
|
||||
BUILD to PENDING
|
||||
PENDING to BUILD
|
||||
PENDING to ERROR
|
||||
|
||||
#. Document that the responsibility of managing the instance's lifecycle is
|
||||
transferred to the external service as soon as the instance is set to the
|
||||
PENDING state.
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[1] Enable rebuild for instances in cell0
|
||||
https://review.openstack.org/#/c/554218/
|
||||
|
||||
[2] WIP: Transform scheduler.select_destinations notification
|
||||
https://review.openstack.org/#/c/508506/
|
||||
|
||||
As discussed in the Dublin PTG:
|
||||
https://etherpad.openstack.org/p/nova-ptg-rocky L472
|
||||
|
||||
History
|
||||
=======
|
||||
|
||||
.. list-table:: Revisions
|
||||
:header-rows: 1
|
||||
|
||||
* - Release Name
|
||||
- Description
|
||||
* - Rocky
|
||||
- Introduced
|
||||
* - Stein
|
||||
- Re-proposed
|
Loading…
Reference in New Issue