Merge "Add volume reimage spec"

Zuul 2021-12-21 20:47:11 +00:00 committed by Gerrit Code Review
commit 84839d6e4d
1 changed files with 90 additions and 65 deletions

@@ -24,13 +24,13 @@ if the image-backed volume is already used as a boot volume which is already
 attached to a server, it makes the case more complicated, because changing the
 image in an attached root volume is `not supported`_.
-In addition, as mentioned in nova bp 'volume-backend-server-rebuild' [1]_,
+In addition, as mentioned in nova spec 'volume-backend-server-rebuild' [1]_,
 if users want to rebuild a volume-backed server, they can complete it by using
 re-image API, this way there is no new volume with new volume id, we don't
 have to worry about types, quota problem, etc.
-As the discussion result during Stein PTG [2]_, we would propose to implement
-a new API to support this.
+As the discussion result during Yoga PTG [2]_, the idea still makes sense and
+we would propose to implement a new API to support this.
 .. _not supported: https://review.openstack.org/#/c/520660/
@@ -48,33 +48,44 @@ Proposed change
 This spec proposes to add a new 'os-reimage' action to volume actions API.
 * Add a new 'os-reimage' action to volume actions API.
+* The request will be rejected if cinder microversion is not new enough to
+support 'os-reimage' action.
 As mentioned above, the main purpose of this spec is for Nova's rebuild use
-case. In this case, nova will perform the following steps:
-#. Create an empty (no connector) volume attachment for the volume and
-server. This ensures the volume remains ``reserved`` through the next
-step.
-#. Delete the existing volume attachment (the old one).
-#. Call the new ``os-reimage`` API.
-#. Poll the volume status for completion (either success or failure).
-#. Upon successful completion of the re-image operation, update the empty
-volume attchment in Cinder, and then do the attachment on the Nova host
-when spawning the (rebuilt) guest VM and "complete" the attachment
-which will make the volume ``in-use`` again.
-So, we propose to add a new microversion to support users to re-image a
-specific volume with new image id. Only ``reserved``, ``available`` and
-``error`` volume can be re-imaged. The ``reserved`` volume can only be
-re-imaged when the ``ignore_reserved`` parameter is ``True``. The volume
-state will be changed to ``downloading`` first.
-Two separate policies will be introduced, one for re-imaging an ``available``
-or ``error`` volume and another for re-imaging an ``reserved`` volume, the
-default policies for these new actions are ``RULE_ADMIN_OR_OWNER``.
-If the image status is not ``active``, image size or image min_disk size is
-larger than volume size, will raise InvalidInput (400) error.
+case. When Nova calls cinder for re-image then cinder will perform these following
+steps:
+#. See nova side operations in the nova spec here
+'volume-backend-server-rebuild' [1]_
+#. Receive the re-image request from nova and check if the volume is in
+``reserved`` state.
+#. Perform the re-image operation. During this, the volume will transition
+into ``downloading`` state which is similar to the state when creating a
+bootable volume from image (more detailed information is captured later in
+this section).
+#. Add a new 'volume-reimaged' external event and use it notify nova when the
+re-image operation is complete, like we use for volume-extend.
+See `perform_resize_volume_online`_ for details.
+#. After successful completion of the re-image operation, nova will call cinder
+to update the attachment with connector info.
+#. Cinder will update the attachment and return connection info to Nova.
+#. After Nova completes the connection with brick, Nova will make the attachment
+complete call marking the volume ``in-use``.
+.. _perform_resize_volume_online: https://review.opendev.org/c/openstack/nova/+/454322
+So, we propose to add a new microversion to support users to re-image a
+specific volume with new image id. Only ``reserved``, ``available`` and
+``error`` volume can be re-imaged. The ``reserved`` volume can only be
+re-imaged when the ``reimage_reserved`` parameter is ``True``. The volume
+state will be changed to ``downloading`` first.
+Two separate policies will be introduced, one for re-imaging an ``available``
+or ``error`` volume and another for re-imaging an ``reserved`` volume, the
+default policies for these new actions are ``SYSTEM_ADMIN_OR_PROJECT_MEMBER``.
+If the image status is not ``active``, image size or image min_disk size is
+larger than volume size, will raise InvalidInput (400) error.
 * Add a new ``reimage`` cast rpc. This rpc is used to cast an async message
 from cinder API to the cinder-volume service or cluster.
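Note on the hunk above: the acceptance rules it describes (allowed volume states, the ``reimage_reserved`` flag, and the image checks) can be sketched roughly as follows. This is an illustration only, not Cinder code. The class names, the function name ``check_reimage_allowed`` and the ``InvalidVolumeState`` exception are hypothetical, and the units (image size in bytes, ``min_disk`` and volume size in GiB) are assumptions. The spec names ``InvalidInput`` (400) only for the image checks, so the error used for a disallowed volume state is also an assumption.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # assumed unit conversion: image size in bytes, volume size in GiB


class InvalidInput(Exception):
    """Stand-in for the InvalidInput (400) error named in the spec."""


class InvalidVolumeState(Exception):
    """Hypothetical error for a volume state that cannot be re-imaged."""


@dataclass
class Image:
    status: str
    size: int      # bytes (assumed)
    min_disk: int  # GiB (assumed)


@dataclass
class Volume:
    status: str
    size: int      # GiB (assumed)


def check_reimage_allowed(volume, image, reimage_reserved=False):
    """Return the status the volume moves to if the request is accepted."""
    if volume.status == "reserved":
        # A reserved volume is only re-imaged when explicitly requested.
        if not reimage_reserved:
            raise InvalidVolumeState("reserved volume needs reimage_reserved=True")
    elif volume.status not in ("available", "error"):
        raise InvalidVolumeState(f"cannot re-image volume in {volume.status!r}")

    if image.status != "active":
        raise InvalidInput("image is not active")
    if image.size > volume.size * GIB or image.min_disk > volume.size:
        raise InvalidInput("image does not fit into the volume")

    return "downloading"
```

The function returns ``downloading`` because the spec states that the volume is moved to that state first, before the asynchronous work starts.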
@@ -83,20 +94,22 @@ This spec proposes to add a new 'os-reimage' action to volume actions API.
 manager which will be called 'reimage', this method will first download the
 image from glance and then perform the re-image operation.
-We propose to only support the basic implementation in this spec, the
-re-image method will call driver's 'copy_image_to_volume' to re-write the
-specific volume with new image. If the specific image is encrypted, then it
-will call driver's 'copy_image_to_encrypted_volume'. It's similar to what we
-have done when we create volume from image.
-After the re-image operation is complete, the volume will set back to
-original state, and all previously existing volume_glance_metadata of this
-volume will be replaced with the metadata from new image.
+* Add an external event ``volume-reimaged`` and call it to notify nova API
+about completion of reimage operation.
+We propose to only support the basic implementation in this spec, the
+re-image method will call driver's 'copy_image_to_volume' to re-write the
+specific volume with new image. If the specific image is encrypted, then it
+will call driver's 'copy_image_to_encrypted_volume'. It's similar to what we
+have done when we create volume from image.
+After the re-image operation is complete, the volume will set back to
+original state, and all previously existing volume_glance_metadata of this
+volume will be replaced with the metadata from new image.
 * If the re-image operation fails, we will set a user message to give hints to
 the user how they can recover from a potentially now corrupt boot volume and
-the volume status will changed to ``error``. The latter part is important for
-the caller to know when to stop polling for the operation to complete.
+the volume status will changed to ``error``.
 There are also some other optimized mechanisms that should be considered to
 support in future, but will not be implemented in this spec:
@@ -142,31 +155,31 @@ The rest API look like this in v3:
 POST /v3/{project_id}/volumes/{volume_id}/action
-.. code-block:: python
+.. code-block:: json
 {
-'os-reimage': {
-'image_id': "71543ced-a8af-45b6-a5c4-a46282108a90",
-'ignore_reserved': false
+"os-reimage": {
+"image_id": "71543ced-a8af-45b6-a5c4-a46282108a90",
+"reimage_reserved": true
 }
 }
 * The <string> 'image_id' refers to the id of image.
 No default value since this is a required parameter.
-* The <boolean> 'ignore_reserved' refers to re-image a volume and ignore its
+* The <boolean> 'reimage_reserved' refers to re-image a volume and ignore its
 'reserved' status. The 'available' and 'error' volume can be re-imaged
 directly, but the 'reserved' volume can only be re-imaged when this
 parameter is 'true'.
 Defaults to 'false', this is an optional parameter.
 The response body of it is like:
-.. code-block:: python
+.. code-block:: json
 {
 "volume": {
 "migration_status": null,
-"attachments": [ ],
+"attachments": [],
 "links": [
 {
 "href": "http://10.79.144.144/volume/v3/ffc60994a7274553905e5e5a8f890ab3/volumes/d90bfc0e-babf-4478-a591-23ca883ba2be",
@@ -209,17 +222,17 @@ The response body of it is like:
 "bootable": "true",
 "created_at": "2018-09-26T01:55:38.735749",
 "volume_type": "lvmdriver-1"
 }
 }
 * The <string> 'status' will be 'downloading'.
 * The <dict> 'volume_image_metadata' refers to the image metadata of the
 volume. It will include the original image until the re-image operation is
 complete in cinder-volume.
 - Normal response codes: 202
 - Error response codes: 400, 403, 404, 409
 Data model impact
 -----------------
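Note on the REST API hunks above: a client could assemble the new-side request and interpret the listed response codes roughly as sketched below. The helper names ``build_reimage_request`` and ``is_accepted`` are hypothetical. Only the URL path, the ``os-reimage`` key, its two parameters and the status codes come from the spec text. Authentication and the microversion header are left out on purpose, since the spec excerpt does not give the microversion number.

```python
import json

ACCEPTED = 202                     # normal response code listed in the spec
ERROR_CODES = {400, 403, 404, 409}  # error response codes listed in the spec


def build_reimage_request(project_id, volume_id, image_id, reimage_reserved=False):
    """Return (path, json_body) for the volume action call.

    image_id is required; reimage_reserved is optional and defaults to False,
    mirroring the parameter description in the spec.
    """
    path = f"/v3/{project_id}/volumes/{volume_id}/action"
    body = {
        "os-reimage": {
            "image_id": image_id,
            "reimage_reserved": reimage_reserved,
        }
    }
    return path, json.dumps(body)


def is_accepted(status_code):
    """True for 202, False for a documented error code, ValueError otherwise."""
    if status_code == ACCEPTED:
        return True
    if status_code in ERROR_CODES:
        return False
    raise ValueError(f"unexpected status code: {status_code}")
```

A 202 only means the request was accepted; the volume is still in ``downloading`` at that point and the image metadata in the response still describes the original image.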
@@ -246,6 +259,13 @@ function.
 Callers of the new API will need to poll the status of the volume until it
 goes back to its original status or ``error`` in case the operation failed.
+The only exception here is Nova which will be notified via external events
+API and doesn't need to poll for the re-image to be completed.
+Since this is a data path change, it will only modify the contents of volume and
+dependent resources like snapshots or backups won't be affected by it. Just
+to keep in mind that restoring to an earlier backup/snapshot will also revert
+the volume to the old image.
 Performance Impact
 ------------------
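Note on the hunk above: for callers other than Nova, the polling it describes might look like the following minimal sketch. ``wait_for_reimage`` and the ``get_status`` callable are hypothetical; in practice ``get_status`` would wrap whatever client call fetches the volume status. The interval and timeout defaults are arbitrary assumptions.

```python
import time


def wait_for_reimage(get_status, original_status, interval=2.0, timeout=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until the volume leaves the transient state.

    Returns True when the status is back to original_status, False when it
    is 'error', and raises TimeoutError if neither happens within timeout.
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        if status == original_status:
            return True
        if status == "error":
            return False
        if clock() >= deadline:
            raise TimeoutError("re-image did not finish in time")
        sleep(interval)


# Usage with a canned status sequence instead of a real API call.
statuses = iter(["downloading", "downloading", "available"])
done = wait_for_reimage(lambda: next(statuses), "available", sleep=lambda _: None)
```

One caveat of this sketch: if the volume was in ``error`` before the request, the original status and the failure status coincide, so a caller would need another signal (for example the user message mentioned in the spec) to tell success from failure.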
@@ -269,9 +289,7 @@ Assignee(s)
 -----------
 Primary assignee:
-Yikun Jiang <yikunkero@gmail.com>
-Other assignee:
-TommyLike <tommylikehu@gmail.com>
+Rajat Dhasmana <rajatdhasmana@gmail.com>
 Work Items
 ----------
@@ -287,6 +305,8 @@ By supporting re-image volumes, we need to do the following changes:
 * Set a user message in the event when the re-image fails.
+* Add a call to nova external events API with 'volume-reimaged' event
 Dependencies
 ============
@@ -303,11 +323,16 @@ Documentation Impact
 Need to document the new behavior of the volume re-image, as well
 as related client examples, etc.
+We also need to mention in the documentation that when the volume
+is re-imaged, all current content on the volume will be *destroyed*.
+This is important as cinder volumes are considered to be persistent,
+which is not the case with this operation.
 References
 ==========
-.. [1] http://review.openstack.org/#/c/532407
-.. [2] https://wiki.openstack.org/wiki/CinderSteinPTGSummary#Nova_Cross_Project_Time
+.. [1] https://review.opendev.org/c/openstack/nova-specs/+/809621
+.. [2] https://wiki.openstack.org/wiki/CinderYogaPTGSummary#Volume_re-image
 .. [3] http://lists.openstack.org/pipermail/openstack-operators/2018-March/014952.html