..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=================
New Location APIs
=================

https://blueprints.launchpad.net/glance/+spec/new-location-apis

Problem description
===================

Currently there are two security vulnerabilities associated with the
``show_multiple_locations`` config option: OSSN-0065 [1]_ and OSSN-0090 [2]_.
If ``show_multiple_locations`` is enabled and the policies for add/update
(``set_image_location``), get (``get_image_location``) and remove
(``delete_image_location``) locations allow non-admins, then non-admin users
can modify location data to corrupt an image that they own. Note that the
policies for add, get and remove locations are open to non-admins by default;
otherwise a non-admin user could not associate data with an image record,
retrieve image data, or delete image data.

When ``show_multiple_locations`` is ``False``, users cannot modify image
locations via the image-update API call, even if they have the
``{get,set,delete}_image_location`` permissions. However, there are some
popular use cases where other services bypass Glance and store or access
image data directly in the backend by writing or reading image locations
using the image owner's credentials, and this is why operators want to set
``show_multiple_locations`` to ``True``. What operators actually want is
optimized image data access; exposing image locations to non-admin
users is a side effect, not the goal. We currently recommend that operators
who want optimized data access run a specialized Glance instance for
services, and only expose glance-api to end users with
``show_multiple_locations`` set to ``False``. This is inconvenient for
certain users.

Proposed change
===============

The work will be done in 3 phases:

1. Introduce 2 new API calls that allow operations on image locations. They
   are described in detail in the `REST API impact`_ section.
   These calls will replace the image-update mechanism for consumers
   like cinder and nova.

2. Modify the consumer (cinder/nova) code to use the new location APIs.
   Also modify the HTTP store to use the new location APIs.

3. Remove the ``show_multiple_locations`` config option when it is no longer
   required by other services (cinder/nova) to perform operations on
   locations. This will most likely be done 1 or 2 cycles after the consumers
   have adopted the new location APIs, in order to handle the upgrade cases.

The config option ``show_multiple_locations`` has been deprecated since Newton,
but we will keep it until the consumers of glance locations
(nova, cinder, HTTP store etc.) start using the new location APIs. Since this
is a major effort spanning multiple services (nova, cinder, glance),
we will implement the work items in different cycles to give developers
enough time to implement this and operators enough time to move away from the
config option.

We will introduce 2 new policies, one for each of the new APIs (add and get):

1. The ``add policy`` can default to the ``project member`` or ``service``
   role (when it is implemented).
2. The ``get policy`` will default to the ``service`` role for authorization.

Along with the new ``add policy``, we will add a check in the location add API
code that verifies the status of the image and only adds the location if the
image is in the ``QUEUED`` state. Adding a location when the image is in any
other state will be disallowed. This prevents malicious users from modifying
the image location again and again, since the location added the first time
is the correct one as far as Glance is concerned.

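The status check described above can be sketched as follows. This is only a
minimal illustration under stated assumptions, not the Glance implementation:
the ``Image`` class, the ``add_location`` function and the ``Conflict``
exception are hypothetical names, and the hash related transitions are left
out for brevity.

.. code-block:: python

    from dataclasses import dataclass, field


    class Conflict(Exception):
        """Hypothetical error that an API layer could map to HTTP 409."""


    @dataclass
    class Image:
        # Hypothetical stand-in for an image record.
        status: str = "queued"
        locations: list = field(default_factory=list)


    def add_location(image, url):
        # Only an image that is still queued may receive a location, so the
        # first location that is registered cannot be replaced later on.
        if image.status != "queued":
            raise Conflict(
                "image is in %r state, expected 'queued'" % image.status)
        image.locations.append({"url": url, "metadata": {}})
        # Simplification: go straight to active and ignore hash handling.
        image.status = "active"
        return image

A second call on the same image raises ``Conflict``, because the image has
already left the ``queued`` state.
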
We will also introduce a new configuration parameter ``do_secure_hash`` on
the glance side, which tells the API whether to do the hash calculation
or not. This is useful when nova, cinder etc. add a location
in glance, since glance does not calculate the hash and checksum automatically
in these cases. The value of ``do_secure_hash`` will be ``True`` by default.

When nova or cinder sends a request to add a location (for the VM snapshot
or upload volume case respectively) and ``do_secure_hash`` is ``True``, glance
will start a background process that calculates the hash of the image.
Unless there is ``validation_data`` (in the request body) to be verified, the
image will be set to the ``active`` state after registering the location, even
if the hash calculation is still ongoing in the background. This is done so
that the image can be used to create instances and bootable volumes
immediately after the location is registered, without waiting for the hash
calculation, which is a long running task. After the hash calculation
completes, the image properties will be updated with the ``checksum``,
``os_hash_algo`` and ``os_hash_value`` values.

The image transitions for the different values of
``do_secure_hash`` and ``validation_data`` are as follows:

* ``do_secure_hash`` is ``True`` and ``validation_data`` is not None:

  Image transition: (queued, importing, active)

  In this case the consumer provides the hash values for validation and
  the hash is calculated by glance.
  An example of this case is providing ``validation_data`` for the HTTP store.
  Here the image hash will be calculated and verified before the image is set
  to the active state.

* ``do_secure_hash`` is ``True`` and ``validation_data`` is None:

  Image transition: (queued, active)

  In this case validation data is not provided by the consumer but
  the hash is calculated by glance.
  Examples of this case are nova snapshotting an instance or cinder
  uploading a volume to an image.
  Here the image hash will be calculated and updated after the image is set
  to the active state.
  This is a tricky case because the consumer has no way of knowing whether the
  ``active`` image will ever have a hash value, and therefore whether it
  should wait for the hash to be populated.
  To handle this, we will set the ``os_hash_algo`` value in the image
  properties so that the consumer knows that a hash calculation is ongoing
  for this image and that the hash will be populated.
  The possible cases are:

  * ``active`` image and no ``os_hash_algo``: This image will not have a hash
    value populated.
  * ``active`` image and has ``os_hash_algo``: Poll for ``active`` image
    status and ``os_hash_algo`` until you get ``os_hash_value``.
    Polling for ``active`` image status is optional, since the image becomes
    active right away when ``validation_data`` is not provided and the hash
    calculation is ongoing in the background, which is this case. The
    ``os_hash_algo`` value will be removed if the hash calculation fails.

* ``do_secure_hash`` is ``False`` and ``validation_data`` is not None:

  Image transition: (queued, active)

  In this case validation data is provided by the consumer and the hash
  is not calculated by glance.
  An example of this case is providing ``validation_data`` for the HTTP store.
  Here the image hash will not be calculated or verified; the values provided
  by the user are set on the image directly.

* ``do_secure_hash`` is ``False`` and ``validation_data`` is None:

  Image transition: (queued, active)

  In this case validation data is not provided by the consumer and
  the hash is not calculated by glance.
  This can happen in any of the use cases.
  Here the hash value won't be set in the image.

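The four cases above can be summarised as a small decision function. This is
only an illustrative sketch: ``expected_transition`` and the short labels it
returns are hypothetical and are not part of Glance.

.. code-block:: python

    def expected_transition(do_secure_hash, validation_data):
        """Return (status sequence, hash source) for an add-location call."""
        if do_secure_hash and validation_data is not None:
            # Hash is calculated and verified before the image goes active.
            return ("queued", "importing", "active"), "calculated and verified"
        if do_secure_hash:
            # Image goes active first; the hash is filled in afterwards.
            return ("queued", "active"), "calculated in background"
        if validation_data is not None:
            # Values from the request are stored without verification.
            return ("queued", "active"), "taken from validation_data"
        return ("queued", "active"), "not set"


    # Walk through all four combinations.
    sample = {"os_hash_algo": "sha512", "os_hash_value": "..."}
    for flag in (True, False):
        for data in (sample, None):
            print(flag, data is not None, expected_transition(flag, data))

Only the first combination passes through ``importing``; in the other three
the image becomes ``active`` directly.
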
If the hash calculation fails, a retry mechanism will
reinitiate the task. We will add a new configuration option ``http_retries``
with a default value of ``3``, i.e. by default the hash calculation will be
attempted at most 3 times (the third attempt happens only if the first and
second attempts fail).
If the hash calculation still fails after all the retries, we will not update
the hash and checksum values and the image will stay in the ``active`` state.

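A bounded retry of this kind could look like the sketch below. It is an
illustration only: ``calculate_hash`` and its ``open_data`` argument are
hypothetical, treating ``OSError`` as the failure signal is an assumption, and
the ``retries`` argument merely mirrors the proposed ``http_retries`` default.

.. code-block:: python

    import hashlib
    import io


    def calculate_hash(open_data, algo="sha512", retries=3, chunk_size=65536):
        """Try to hash the image data at most ``retries`` times.

        ``open_data`` is a hypothetical callable that returns a fresh
        readable binary stream on every call. Returns the hex digest, or
        None if every attempt fails.
        """
        for _ in range(retries):
            try:
                hasher = hashlib.new(algo)
                with open_data() as stream:
                    for chunk in iter(lambda: stream.read(chunk_size), b""):
                        hasher.update(chunk)
                return hasher.hexdigest()
            except OSError:
                # Read failed: start over with a new stream and hasher.
                continue
        # Every attempt failed: the caller leaves the hash fields unset
        # and the image stays active.
        return None


    print(calculate_hash(lambda: io.BytesIO(b"image data")))

Each attempt starts from a fresh hasher, so a partially read stream from a
failed attempt never leaks into the final digest.
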
End-user access to image locations via the Image API is no longer necessary.
Since Train, Glance has multiple stores support, and we have added API calls
that allow users to manipulate data locality with respect to stores.
Further, a store is an opaque identifier, whereas an image location
exposes backend details that users don't need to know.

Here are the current use cases for the direct manipulation of image
locations, along with an explanation of how they can be handled by the
new Location API.

1. When using a copy-on-write (COW) backend shared by Nova and Glance,
   Nova can create an image record in Glance, snapshot a server image
   directly in the backend, and set the location on the image record.
   This use case is covered by the new add-location call, with
   its default policy being project member (image owner) or service.

2. A user wants to have a single image record, but have image data
   stored in multiple locations for locality (i.e., to have image
   data as close as possible to where it's consumed).
   This use case is handled by the glance multiple stores feature
   plus image import, which since API v2.8 allows a 'stores' parameter
   specifying where the image data should be stored. This applies to both
   newly created images and existing images (via the copy-image import
   method).
   In this workflow, Glance itself manipulates the image locations; there
   is no need for the user to interact with locations directly.

3. An operator wants to introduce a new storage backend and decommission
   the current backend while keeping the same image catalog.
   Similar to #2, this can be handled by using the copy-image import
   method and the delete-image-from-store API call introduced in v2.10.
   Note that there are some exceptions to this:

   a. The HTTP store is read-only, so we can't use copy-image in this case.

   b. For the RBD store, launching a VM or creating a bootable volume from an
      image creates a dependency chain, hence we can't delete the source
      image until all of its children are flattened.

   c. For the cinder store, if the cinder backend uses COW cloning, the
      situation is similar to the RBD case mentioned in b); otherwise the
      image delete will succeed.

The following APIs are not being implemented:

``Update``: For service to service interaction, there is no value in updating
the metadata of a location. This would only be beneficial if we planned to
remove the existing location code from the image-update call and support the
use case of operators/end users doing location operations.

``Delete``: We already have the `Delete Image From Store`_ API for this
purpose. We don't require the `Delete Image From Store`_ API call for the
current use case, but if we plan to extend the location APIs in the future, we
can do so by updating the policy enforced by the `Delete Image From Store`_
operation from the default ``role:admin`` to ``role:admin or role:service``.

Alternatives
------------

* We can remove the ``show_multiple_locations`` config option and filter the
  images with the ``admin_or_service`` role. This will require the consumers
  to provide admin credentials during add or get of an image to get the
  location.
  This was the original proposal, but due to the disagreement here [3]_, we
  changed the design to the current proposal.

* Another alternative is to add this functionality to the import workflow.
  We can add a new import method ``direct-location`` which will allow end
  users to specify the ``location`` and ``metadata`` parameters and create a
  new image based on the given parameters. We can also update an existing
  image with ``location`` and ``metadata`` values, but this will require the
  image to be in the ``queued`` state.

For this, we will need to add a new import method ``direct-location`` and also
add ``--metadata`` and ``--location`` parameters to the following commands:

* ``glance image-create-via-import --import-method direct-location --location
  <location> --metadata <key1=value1, key2=value2 ...>``

* ``glance image-import --import-method direct-location --location
  <location> --metadata <key1=value1, key2=value2 ...>``

Data model impact
-----------------

None

REST API impact
---------------

We are going to add 2 new location APIs:

* Add Location

  This will add a new location to an existing image.
  The request body will contain the location URL and, optionally,
  ``validation_data`` [4]_. Including ``validation_data`` in the request body
  is useful when the consumer wants to validate the image hash or simply wants
  to add the hash values to the image directly. The cases of
  ``validation_data`` with ``do_secure_hash`` are described in the
  `Proposed change`_ section.
  An example where ``validation_data`` will be provided is the HTTP store case,
  where the user will provide the hash value for the HTTP image.

  Unlike the old location API, we will not support adding a location
  at a particular index. To get the benefit of indexes, we can
  use the old location APIs or set the location strategy to store_type [5]_.
  A new location strategy ``store_identifier`` is proposed [6]_ and should be
  useful for downloading an image from a specific store when multiple stores
  are configured.

  POST /v2/images/{image_id}/locations

  * JSON request body

    .. code-block:: json

        {
            "url": "cinder://lvmdriver-1/1a304872-b0ca-4992-b2c2-6874c6d5d5f9",
            "validation_data": {
                "os_hash_algo": "sha512",
                "os_hash_value": "6b813aa46bb90b4da216a4d19376593fa3f4fc7e617f03a92b7fe11e9a3981cbe8f0959dbebe36225e5f53dc4492341a4863cac4ed1ee0909f3fc78ef9c3e869"
            }
        }

  * JSON response body

    - Success - 200

      .. code-block:: json

          {
              "url": "cinder://lvmdriver-1/1a304872-b0ca-4992-b2c2-6874c6d5d5f9",
              "metadata": "{'store': 'lvmdriver-1'}",
              "validation_data": {
                  "os_hash_algo": "sha512",
                  "os_hash_value": "6b813aa46bb90b4da216a4d19376593fa3f4fc7e617f03a92b7fe11e9a3981cbe8f0959dbebe36225e5f53dc4492341a4863cac4ed1ee0909f3fc78ef9c3e869"
              }
          }

    - Error - 409 (Location already exists or image is not in QUEUED
      state), 403 (Forbidden for users that are not the owner), 400
      (BadRequest if hash validation fails)

* Get Location(s)

  This will show all the locations associated with an existing image. It
  returns an empty list if the image has no locations.

  GET /v2/images/{image_id}/locations

  * JSON response body

    .. code-block:: json

        [
            {
                "url": "cinder://lvmdriver-1/0f031ed1-5872-43d5-a638-4b0d07c10ab5",
                "metadata": "{'store': 'lvmdriver-1'}"
            },
            {
                "url": "cinder://cephdriver-1/11b4fa9f-a44b-46c9-950c-0026c467252c",
                "metadata": "{'store': 'cephdriver-1'}"
            }
        ]

    - Error - 404 (Image ID does not exist), 403 (Forbidden for normal users)

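As an illustration of the Add Location request body, a client could build it
as sketched below. ``build_add_location_body`` is a hypothetical helper and
the ``example.com`` URL is a placeholder; sending the request (endpoint, token
and HTTP client) is deliberately left out.

.. code-block:: python

    import hashlib
    import json


    def build_add_location_body(url, data=None, algo="sha512"):
        """Build the JSON body for POST /v2/images/{image_id}/locations.

        When ``data`` (the image bytes) is given, ``validation_data`` is
        included; otherwise only the URL is sent, since ``validation_data``
        is optional.
        """
        body = {"url": url}
        if data is not None:
            body["validation_data"] = {
                "os_hash_algo": algo,
                "os_hash_value": hashlib.new(algo, data).hexdigest(),
            }
        return json.dumps(body)


    print(build_add_location_body("http://example.com/image.qcow2",
                                  b"image bytes"))

Serialising with ``json.dumps`` also guarantees that the body is valid JSON,
without trailing commas.
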
The transition of image state during the image create operation will be as
follows.
Image upload (PUT), image stage (PUT) and location add (POST) will transition
the image from queued to the next state, which can be any of the following:

1. ``saving``
2. ``uploading``
3. ``importing``
4. ``active``

Below are the valid transitions for an image from the queued state.

``'queued': ('saving', 'uploading', 'importing', 'active', 'deleted')``

Security impact
---------------

No worse than it is now, and possibly better.

1. The get-locations policy is restricted to the 'service' role,
   so users will not be able to see image locations. Thus with
   'show_multiple_locations' and 'show_direct_url' set to False,
   the new get-locations API will not expose location information
   to users.
2. The add-location policy is restricted by default to the image owner.
   This allows end users to add a location to an image, to address
   current uses of this functionality that we aren't aware of.
   Even with this allowed, the data-substitution attack is blocked because
   the API call will only be allowed for an image in 'queued' status.
   The add-location API cannot be used to add a location to an image in
   other states and then delete the original location, so the OSSN-0065
   attack is not possible under this scenario.
   Further, the add-location call (unlike the current method of
   updating locations via PATCH) does not require the locations to
   be visible to succeed. Thus operators will be able to configure
   Glance with 'show_multiple_locations' and 'show_direct_url' set
   to False, even when other services are sharing a COW backend with
   Glance and the operator wants an optimized workflow.

Notifications impact
--------------------

None

Other end user impact
---------------------

Since the new APIs are mainly for service to service interaction (except for
the HTTP store case), we will only expose the location add API via the CLI.
However, we will need to add methods for all the APIs to openstacksdk (which
will call the new location APIs), to be used by consumer services like
cinder and nova.
End users can still use the existing commands (which internally call the
image-update API) to perform operations on locations:

* ``glance location-add``: Add a location (and related metadata) to an image.
* ``glance location-delete``: Remove locations (and related metadata) from an
  image.
* ``glance location-update``: Update metadata of an image's location.

We will also add a new command to glanceclient and OSC that will allow end
users to add the location ``url`` and ``validation-data`` for the HTTP store
case.

* ``glance add-location-properties --url <location> --validation-data
  <os_hash_algo=value1, os_hash_value=value2>``
* ``openstack image add location properties --url <location> --validation-data
  <os_hash_algo=value1, os_hash_value=value2>``

Performance Impact
------------------

With the old location API, the consumers (nova, cinder) registered
the location in glance and the checksum, hash etc. values weren't
calculated. Once the consumers adopt the new location API
and the ``do_secure_hash`` config parameter is ``True`` (the default),
glance will read the image and calculate the hash in the background.
The hash calculation is a long running task, so it will consume
resources. However, this won't affect the operation requested by
nova or cinder, as the image will transition to the ``active`` state even
while the hash calculation is ongoing.

The performance cost results in more secure
images, and this impact needs to be conveyed to operators and end users
through documentation and release notes. Since ``do_secure_hash`` will be a
configurable parameter on the glance side, we will add suitable help text
to convey the performance and security impact of enabling or disabling this
option.

Other deployer impact
---------------------

None

Developer impact
----------------

Consumers like Cinder, Nova and the HTTP store need to modify their code to
call the new client functions to access the API.
Some of the key things to consider while implementing the consumer side changes
are:

* We will use the SDK to make the API calls. The changes to call the new
  location APIs will be in the SDK, and also in OSC/glanceclient for location
  ADD in the HTTP store case.
* Keep backward compatibility with the old behavior. Glance should support
  the legacy behavior as well as the new way to add/get locations. This is
  useful in upgrade cases where one compute node is running 2023.1 (Antelope)
  code and another compute node has been upgraded to the 2024.1 (Caracal)
  release.
* Testing should be done to see if the existing functionality supported
  by the legacy location APIs works as expected with the new APIs.

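On the consumer side, the polling cases described in the `Proposed change`_
section could be handled with a small classifier like the sketch below. The
``hash_state`` function and its return labels are hypothetical; the image is
assumed to be available as a plain dict of image properties.

.. code-block:: python

    def hash_state(image):
        """Classify an image record (a dict) by the state of its hash fields."""
        if image.get("status") != "active":
            return "not active"
        if image.get("os_hash_value"):
            return "hash available"
        if image.get("os_hash_algo"):
            # The algorithm is set but the value is not: glance is still
            # calculating, so the consumer may poll again later.
            return "hash pending"
        # No algorithm recorded: no hash will be populated for this image.
        # This is also what a consumer sees after a failed calculation,
        # once the os_hash_algo value has been removed.
        return "no hash expected"

A consumer that needs the hash would keep polling only while the result is
``"hash pending"``.
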
Implementation
==============

Assignee(s)
-----------

Primary assignee:
  pdeore

Other contributors:
  whoami-rajat

Work Items
----------

* Add 2 new Location APIs for add and get operations.

* Modify consumers like cinder, nova and the HTTP store to use the new
  location APIs.

* Add a new configuration parameter ``do_secure_hash`` in glance and document
  its impact.

* Add a new configuration parameter ``http_retries`` in glance and document
  its usage.

* Add SDK support to call the new APIs.

* Add a release note mentioning that we will remove the config option
  ``show_multiple_locations`` when the consumers (nova/cinder/HTTP store)
  shift to using the new location APIs.

* Tempest tests for the new add-location and get-location APIs.

Dependencies
============

None

Testing
=======

* Unit Tests
* Functional Tests
* Integration Tests
* Tempest Tests

Documentation Impact
====================

Need to document the new location APIs.

References
==========

.. [1] https://wiki.openstack.org/wiki/OSSN/OSSN-0065

.. [2] https://wiki.openstack.org/wiki/OSSN/OSSN-0090

.. [3] https://review.opendev.org/c/openstack/glance-specs/+/840882/2..15/specs/zed/approved/glance/new-location-info-apis.rst#b199

.. [4] https://specs.openstack.org/openstack/glance-specs/specs/stein/implemented/glance/spec-lite-locations-with-validation-data.html

.. [5] https://docs.openstack.org/glance/latest/contributor/api/glance.common.location_strategy.store_type.html

.. [6] https://review.opendev.org/c/openstack/glance-specs/+/881951

.. _Delete Image From Store: https://docs.openstack.org/api-ref/image/v2/index.html?expanded=delete-image-from-store-detail#delete-image-from-store

* Deprecate ``show_multiple_locations`` option: https://review.opendev.org/c/openstack/glance/+/313936

* Update deprecated show_multiple_locations helptext: https://review.opendev.org/c/openstack/glance/+/426283

* Update show_multiple_locations deprecation note: https://review.opendev.org/c/openstack/glance/+/625702

* Original security bug: https://bugs.launchpad.net/ossn/+bug/1549483

* New security bug: https://bugs.launchpad.net/ossn/+bug/1990157