Allow controlled shutdown of GuestOS for operations which power off the VM

Powering off a VM without giving the Guest Operating system a chance
to perform a controlled shutdown can lead to data corruption.

The proposed changes will make the default behavior for stop, rescue, resize,
and shelve to give the GuestOS a chance to perform a controlled shutdown
before the VM is powered off.

The change will encapsulate the complexity of signaling to and waiting for
the GuestOS in the hypervisor, and allow image owners the ability to tune
the associated timing via image metadata to take account of GuestOSs that
require an extended period to shutdown (such as Windows).

Users will be able to specify the shutdown behavior on a per operation basis
via a new shutdown_type parameter where, in keeping with the current reboot
operation, a "soft" shutdown will give the GuestOS a chance to perform a
clean shutdown, and a "hard" shutdown will cause an immediate power off.  The
default behavior will be a "soft" shutdown.

https://blueprints.launchpad.net/nova/+spec/user-defined-shutdown

Change-Id: Ie2c13c9173566c6d545fcaf6a3ab88df636a7d33
Phil Day 2014-04-22 17:07:10 +00:00
parent 5a7446ca79
commit b85125bb4d
1 changed files with 259 additions and 0 deletions

@ -0,0 +1,259 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.
 http://creativecommons.org/licenses/by/3.0/legalcode
==========================================================================
Allow controlled shutdown of GuestOS for operations which power off the VM
==========================================================================
https://blueprints.launchpad.net/nova/+spec/user-defined-shutdown
The current behavior of powering off a VM without giving the Guest Operating
system a chance to perform a controlled shutdown can lead to data corruption.
Problem description
===================
Currently, in the libvirt driver, operations that power off the VM (stop,
rescue, shelve, resize) do so without giving the GuestOS a chance to shut down
gracefully. Some GuestOSs (for example Windows) do not react well to this type
of virtual power failure, so it would be better if these operations followed
the same approach as soft_reboot and gave the GuestOS a chance to shut down
gracefully.
Proposed change
===============
The proposed changes will alter the default behavior of stop, rescue, resize,
and shelve so that the GuestOS is given a chance to perform a controlled
shutdown before the VM is powered off.
The change will encapsulate the complexity of signaling to and waiting for
the GuestOS in the hypervisor, and allow image owners to tune the associated
timing via image metadata to take account of GuestOSs that require an
extended period to shut down (such as Windows).
Users will be able to specify the shutdown behavior on a per operation basis
via a new shutdown_type parameter where, in keeping with the current reboot
operation, a "soft" shutdown will give the GuestOS a chance to perform a
clean shutdown, and a "hard" shutdown will cause an immediate power off. The
default behavior will be a "soft" shutdown.
An example of a user wanting to override the default behavior is Tempest
which does not generally care if a GuestOS becomes corrupted and may
prefer speed of execution over data integrity.
At the hypervisor layer the shutdown behavior will be controlled by two
values:
* A timeout value specifying, in seconds, how long the hypervisor should
  wait for the GuestOS to shut down. If the GuestOS does not shut down
  within this period then the VM will be powered off anyway. A value of 0
  will power off the VM without signaling the Guest to shut down.
* A retry interval specifying, in seconds, how frequently within that period
  the hypervisor should signal the guest to shut down. This is a protection
  against guests that may not be ready to process the shutdown signal
  when it is first issued, a common problem if an instance is deleted
  just after it has been created and the GuestOS is still booting.
For example if the overall timeout is set to 60 seconds and the retry interval
is set to 10 seconds then the guest will be signaled up to six times before
being powered off.
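The interaction between the two values can be sketched as follows. This is an
illustrative outline only, not the libvirt driver code: the three callables
(signal_shutdown, is_shut_down, hard_power_off) and the one second poll
period are assumptions introduced for the example.

```python
import time


def power_off_with_clean_shutdown(signal_shutdown, is_shut_down,
                                  hard_power_off, timeout=60,
                                  retry_interval=10, sleep=time.sleep):
    """Try a clean shutdown, falling back to a hard power off.

    The callables are hypothetical stand-ins for hypervisor operations.
    Returns (signals_sent, shut_down_cleanly).
    """
    signals_sent = 0
    waited = 0
    while waited < timeout:
        # Re-send the shutdown signal at the start of every retry interval,
        # in case the guest was not ready to process the earlier ones.
        if waited % retry_interval == 0:
            signal_shutdown()
            signals_sent += 1
        sleep(1)  # assumed poll period of one second
        waited += 1
        if is_shut_down():
            return signals_sent, True
    # Timeout expired (or timeout == 0): power off without further waiting.
    hard_power_off()
    return signals_sent, False
```

With a timeout of 60 and a retry interval of 10, a guest that never responds
is signaled at 0, 10, 20, 30, 40 and 50 seconds and is then powered off. With
a timeout of 0 the loop body never runs, so the guest is powered off without
being signaled.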
These values will be passed into the virt driver by the compute manager,
allowing the same values to be used for all hypervisors.
The timeout value will be a Nova configuration parameter as different
operators may want a different default. The retry value will be implemented
as a constant in the Nova code. The timeout value can be overridden
on a per image basis via image metadata settings.
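A minimal sketch of how the effective timeout might be resolved is shown
below. The image property key (os_shutdown_timeout), the default of 60
seconds, and the mapping of a "hard" shutdown onto a timeout of 0 are
assumptions made for illustration; this spec does not fix any of them.

```python
# Stand-in for the Nova configuration parameter (assumed default).
DEFAULT_SHUTDOWN_TIMEOUT = 60
# Stand-in for the retry interval constant in the Nova code (assumed value).
SHUTDOWN_RETRY_INTERVAL = 10
# Hypothetical image metadata key used to override the timeout.
IMAGE_TIMEOUT_KEY = "os_shutdown_timeout"


def effective_shutdown_timeout(image_properties, shutdown_type="SOFT",
                               default_timeout=DEFAULT_SHUTDOWN_TIMEOUT):
    """Return the timeout, in seconds, to pass to the virt driver."""
    # Assumption: a hard shutdown is expressed as a timeout of 0, which
    # powers off the VM without signaling the guest.
    if shutdown_type.upper() == "HARD":
        return 0
    raw = image_properties.get(IMAGE_TIMEOUT_KEY)
    if raw is None:
        return default_timeout
    try:
        value = int(raw)  # image metadata values are typically strings
    except (TypeError, ValueError):
        return default_timeout
    # Ignore nonsensical negative overrides.
    return value if value >= 0 else default_timeout
```

For example, an image carrying {"os_shutdown_timeout": "300"} would give its
GuestOS up to 300 seconds to shut down, while an image with no such property
would use the operator's configured default.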
Alternatives
------------
An alternative approach would be to expose a new operation that only shuts
down the GuestOS (with user defined timing parameters), expose the status of
that operation via the API, and rely on the client for all retry logic.
However we believe that a clean shutdown should be the default behavior in
Nova and not have to be managed as a separate activity (which would have to
be replicated in all API bindings).
An alternative using a simpler single parameter to specify how long the
hypervisor should wait was previously merged but had to be reverted
because it added around 25 minutes to the tempest runs:
https://review.openstack.org/#/c/35303/
This was due to Tempest frequently stopping an instance immediately after
it is created, in which case the ACPI signal is delivered before the GuestOS
is in a state to process it. This results in the shutdown waiting for the
full duration of the timeout.
The revised approach described above avoids this issue by periodically
resending the shutdown signal to the GuestOS.
Once this change has been merged Tempest could be optimized to avoid this delay
(for example by setting the timeout to zero via image metadata or nova.conf).
It could be argued that the delete operation should allow the same
controlled shutdown semantics so that instances using and/or booting
from volumes can also leave those file systems in a safe state. However
if the stop operation is modified to provide a controlled shutdown then
users can achieve the required sequence by performing a stop prior to
the delete. This also avoids the issue of an HTTP DELETE request not
normally accepting a body.
Data model impact
-----------------
None, the change is contained mainly within the interaction between the compute
manager and the virt driver.
REST API impact
---------------
The following API methods will be extended to accept an optional shutdown_type
parameter:
* Stop: POST servers/{server_id}/action
  {"os-stop": {"shutdown_type": "HARD|SOFT"}}
* Rescue: POST servers/{server_id}/action
  {"rescue": {"shutdown_type": "HARD|SOFT"}}
* Resize: POST servers/{server_id}/action
  {"resize": {"shutdown_type": "HARD|SOFT",
              "flavor_id": <id>}}
* Shelve: POST servers/{server_id}/action
  {"shelve": {"shutdown_type": "HARD|SOFT"}}
* Migrate: POST servers/{server_id}/action
  {"migrate": {"shutdown_type": "HARD|SOFT"}}
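As an illustration, the request bodies listed above could be built by a
client along the following lines. The helper name build_action_body is
hypothetical, and omitting the parameter entirely (leaving the server to
apply its default) is an assumption about how "optional" would be handled.

```python
import json

VALID_SHUTDOWN_TYPES = ("SOFT", "HARD")


def build_action_body(action, shutdown_type=None, **extra):
    """Build the JSON-serializable body for POST servers/{server_id}/action.

    Hypothetical client-side helper; 'extra' carries action-specific
    fields such as flavor_id for resize.
    """
    params = dict(extra)
    if shutdown_type is not None:
        normalized = shutdown_type.upper()
        if normalized not in VALID_SHUTDOWN_TYPES:
            raise ValueError("shutdown_type must be HARD or SOFT")
        params["shutdown_type"] = normalized
    return {action: params}


if __name__ == "__main__":
    # Stop without giving the GuestOS a chance to shut down.
    print(json.dumps(build_action_body("os-stop", "hard")))
    # Resize with the default (soft) behavior made explicit.
    print(json.dumps(build_action_body("resize", "soft", flavor_id=2)))
```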
Security impact
---------------
None, the change doesn't change the set of operations that a user can perform.
Notifications impact
--------------------
None.
Other end user impact
---------------------
Users will be able to provide an additional option to the stop, rescue,
resize, and shelve operations. This will be exposed in python-novaclient:
  nova stop [--hard-shutdown]
  nova rescue [--hard-shutdown]
  nova resize [--hard-shutdown]
  nova shelve [--hard-shutdown]
Note that "--hard-shutdown" is preferred here over the "--hard" option used
for reboot since a "soft resize" might be interpreted to mean a soft change
in allocated resources (such as disabling a CPU).
To make the novaclient CLI reboot command consistent it will also be modified
to accept --hard-shutdown as an alias for --hard.
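The option handling described here can be sketched with argparse. This is
not python-novaclient code: the parser layout, the single positional "server"
argument, and the helper shutdown_type_for are all simplifications assumed
for the example.

```python
import argparse


def build_parser():
    """Build a toy 'nova' parser covering only the commands discussed."""
    parser = argparse.ArgumentParser(prog="nova")
    sub = parser.add_subparsers(dest="command", required=True)
    for name in ("stop", "rescue", "resize", "shelve"):
        cmd = sub.add_parser(name)
        cmd.add_argument("server")
        cmd.add_argument("--hard-shutdown", action="store_true",
                         help="power off without a clean GuestOS shutdown")
    # reboot keeps --hard and gains --hard-shutdown as an alias.
    reboot = sub.add_parser("reboot")
    reboot.add_argument("server")
    reboot.add_argument("--hard", "--hard-shutdown", dest="hard",
                        action="store_true")
    return parser


def shutdown_type_for(args):
    """Map parsed flags to the HARD/SOFT value used in the API."""
    hard = getattr(args, "hard_shutdown", False) or getattr(args, "hard", False)
    return "HARD" if hard else "SOFT"
```

Because both spellings on reboot share one destination, the rest of the
client does not need to know which one the user typed.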
Performance Impact
------------------
The performance impact is limited to the changes in the processing path of the
stop, rescue, resize, and shelve operations. When performing a clean shutdown
these will take longer than before as the system waits for the GuestOS to
shut down. The overhead of polling to observe this change in state is
negligible and the calling thread will sleep (yield) between each poll.
Other deployer impact
---------------------
Once this set of changes has been merged the system will by default be
configured to wait for instances to shut down gracefully for stop, shelve,
rescue, and resize operations.
Deployers will need to consider whether they want to modify the default
timeout parameter, and/or to add override values to the metadata of existing
images.
The configuration parameters will be common to all hypervisors, but this
blueprint will only deliver a libvirt implementation.
Developer impact
----------------
Only the first stage of the implementation is hypervisor dependent. Once
that has merged, other hypervisor implementations can be added.
The remaining stages will apply to any hypervisor that implements the revised
power_off options.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
philip-day
Work Items
----------
* Add timeout parameters to the power_off method of the virt driver and
  provide the libvirt implementation. Implement clean_shutdown for stop()
  within the compute manager as an initial example.
* Add clean_shutdown option to compute manager Rescue, Resize, and Shelve
  operations
* Use image properties to override the timeout values
* Expose clean shutdown via rpcapi
* Expose clean shutdown via API
Dependencies
============
None
Testing
=======
The methods that are being modified are already extensively tested by Tempest
which will ensure no functional regression.
The default behavior will be to perform a clean shutdown, although it's not
easy to see how this can be verified by Tempest, since it needs specific
support within the Guest, and the behavior of any GuestOS is generally
considered outside the scope of Nova. Likewise the ability to stop without a
clean shutdown could be exercised from Tempest (it's possible that Tempest
would want to make this its normal case), but it's hard to see how that could
be verified. Input will be sought from the Tempest community to see what can
be done to address these issues.
Documentation Impact
====================
* The API specs will need to be updated.
* The change in default behavior for stop, rescue, resize, and shelve (to wait
  for the GuestOS to shut down) will need to be documented.
* The ability to override the shutdown timeouts on a per image basis will need
  to be documented.
References
==========
The code for the first work item is available for review:
https://review.openstack.org/#q,I432b0b0c09db82797f28deb5617f02ee45a4278c,n,z