In Xena we added the use_quota DB field to volumes and snapshots to
unify the tracking of temporary resources, but we still had to keep
compatibility code for the old mechanisms (due to rolling
upgrades).
This patch removes the compatibility code for the old mechanisms and
adds cleanup code to remove the tracking in the volume metadata.
Change-Id: I3f9ed65b0fe58f7b7a0867c0e5ebc0ac3c703b05
When rebuilding a volume-backed instance, while copying the new
image to the existing volume, we preserve sparseness.
This is problematic because we don't write the zero blocks of
the new image, so data from the old image can persist,
leading to a data leak scenario.
To prevent this, we use the `-S 0`[1][2] option with the `qemu-img convert`
command to write all the zero bytes into the volume.
In the testing done, this doesn't seem to be a problem with known 'raw'
images, but it is good to handle the case anyway.
Following is the testing performed with 3 images:
1. CIRROS QCOW2 to RAW
======================
Volume size: 1 GiB
Image size (raw): 112 MiB
CREATE VOLUME FROM IMAGE (without -S 0)
LVS (10.94% allocated)
volume-91ea43ef-684c-402f-896e-63e45e5f4fff stack-volumes-lvmdriver-1 Vwi-a-tz-- 1.00g stack-volumes-lvmdriver-1-pool 10.94
REBUILD (with -S 0)
LVS (10.94% allocated)
volume-91ea43ef-684c-402f-896e-63e45e5f4fff stack-volumes-lvmdriver-1 Vwi-aotz-- 1.00g stack-volumes-lvmdriver-1-pool 10.94
Conclusion:
Same space is consumed on the disk with and without preserving sparseness.
2. DEBIAN QCOW2 to RAW
======================
Volume size: 3 GiB
Image size (raw): 2 GiB
CREATE VOLUME FROM IMAGE (without -S 0)
LVS (66.67% allocated)
volume-edc42b6a-df5d-420e-85d3-b3e52bcb735e stack-volumes-lvmdriver-1 Vwi-a-tz-- 3.00g stack-volumes-lvmdriver-1-pool 66.67
REBUILD (with -S 0)
LVS (66.67% allocated)
volume-edc42b6a-df5d-420e-85d3-b3e52bcb735e stack-volumes-lvmdriver-1 Vwi-aotz-- 3.00g stack-volumes-lvmdriver-1-pool 66.67
Conclusion:
Same space is consumed on the disk with and without preserving sparseness.
3. FEDORA QCOW2 to RAW
======================
Volume size: 6 GiB
Image size (raw): 5 GiB
CREATE VOLUME FROM IMAGE (without -S 0)
LVS (83.33% allocated)
volume-efa1a227-a30d-4385-867a-db22a3e80ad7 stack-volumes-lvmdriver-1 Vwi-a-tz-- 6.00g stack-volumes-lvmdriver-1-pool 83.33
REBUILD (with -S 0)
LVS (83.33% allocated)
volume-efa1a227-a30d-4385-867a-db22a3e80ad7 stack-volumes-lvmdriver-1 Vwi-aotz-- 6.00g stack-volumes-lvmdriver-1-pool 83.33
Conclusion:
Same space is consumed on the disk with and without preserving sparseness.
Additional testing was done to check whether the `-S 0` option actually
works in an OpenStack setup.
Note that we are converting a qcow2 image to qcow2, which won't
happen in a real-world deployment and is done only for test purposes.
DEBIAN QCOW2 TO QCOW2
=====================
CREATE VOLUME FROM IMAGE (without -S 0)
LVS (52.61% allocated)
volume-de581f84-e722-4f4a-94fb-10f767069f50 stack-volumes-lvmdriver-1 Vwi-a-tz-- 3.00g stack-volumes-lvmdriver-1-pool 52.61
REBUILD (with -S 0)
LVS (66.68% allocated)
volume-de581f84-e722-4f4a-94fb-10f767069f50 stack-volumes-lvmdriver-1 Vwi-aotz-- 3.00g stack-volumes-lvmdriver-1-pool 66.68
Conclusion:
We can see that the space allocation increased, hence sparseness is not preserved when using the `-S 0` option.
[1] https://qemu-project.gitlab.io/qemu/tools/qemu-img.html#cmdoption-qemu-img-common-opts-S
[2] abf635ddfe/qemu-img.c (L182-L186)
Closes-Bug: #2045431
Change-Id: I5be7eaba68a5b8e1c43f0d95486b5c79c14e1b95
Unfortunately, commit b75c29c7d8
did not update all the places where BackupAPI.restore_backup()
was called. One of them was in the flow manager.
Although it is disappointing that this regression went undetected,
we are not changing how unit tests are performed in any
fundamental way in this patch. An outstanding patch using
mypy is already in review, which would catch this case.
Closes-bug: #2025277
Related-Change-Id: I54b81a568a01af44e3f74bcac55e823cdae9bfbf
Change-Id: Iabfebacfea44916f89584ffd019d848e53302eaf
Add a new config option, `image_conversion_disable`: when it is set to
`True`, the image disk_format and the volume format must be the same;
otherwise an ImageUnacceptable exception is raised.
`image_conversion_disable` is a boolean option and defaults to `False`.
The idea behind this was that in certain high scale environments,
it is possible that a cloud allows both qcow2 and raw image uploads.
However, uploading a qcow2 image and creating a large number of
volumes can cause a tremendous amount of conversions that will kill
cinder-volume. It may be undesirable to have this, so a cloud
operator can opt to disallow conversions and enforce that the user
uploads the correct image type if they want to have volumes (aka
raw in rbd case).
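A minimal sketch of the enforcement described above; the exception name follows the description, but the helper signature is illustrative rather than the actual Cinder internals:

```python
class ImageUnacceptable(Exception):
    """Raised when an image cannot be used as-is for the volume."""


def check_image_conversion(image_disk_format, volume_format,
                           image_conversion_disable=False):
    """Reject images that would require a format conversion.

    When the operator enables image_conversion_disable, the image's
    disk_format must already match the volume format, so cinder-volume
    never has to convert (e.g. qcow2 -> raw for rbd).
    """
    if image_conversion_disable and image_disk_format != volume_format:
        raise ImageUnacceptable(
            'Image disk_format %s does not match volume format %s and '
            'image conversion is disabled.'
            % (image_disk_format, volume_format))
```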
Closes-Bug: #1970115
Change-Id: Ic481d68639d9460d1fd14225bc17a0d8287d5fd9
This change replaces invalid UUIDs used in unit tests which are causing
deprecation warnings like:
FutureWarning: b"'fake_volume_id'" is an invalid UUID. Using UUIDFields
with invalid UUIDs is no longer supported, and will be removed in
a future release. Please update your code to input valid UUIDs or
accept ValueErrors for invalid UUIDs. ...
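The fix is simply to use values that parse as UUIDs; a quick illustration (the constant name is hypothetical, Cinder keeps such values as shared test constants):

```python
import uuid

# A literal like 'fake_volume_id' triggers the FutureWarning because it
# is not a parseable UUID; a valid UUID string round-trips cleanly.
FAKE_VOLUME_ID = 'd8cd1feb-2dcc-404d-9b9b-51df0f8504e2'

# Does not raise: the replacement value is a valid UUID.
uuid.UUID(FAKE_VOLUME_ID)
```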
Change-Id: I18b0a7c81efd8fe981bdd7a3c41a7044147a96a0
In change-id Iabf9c3fab56ffef50695ce45745f193273822b39 we left the
`volume_attachment` out of the expected attributes of the Volume OVO
(even though the DB layer provides that information) because it was
breaking some unit tests.
This means that in some cases we are unnecessarily loading the
attachments again (manually or via lazy loading) once we have the volume
OVO because that field is not set.
In this patch we populate the `volume_attachment` when we load the
Volume OVO from the DB.
Change-Id: I6576832b2c2722ab749cfe70bbc2058ead816c36
Cinder creates temporary resources, volumes and snapshots, during some
of its operations, and these resources aren't counted towards quota
usage.
Cinder currently has a problem tracking quota usage when deleting
temporary resources.
Determining which volumes are temporary is a bit inconvenient because we
have to check the migration status as well as the admin metadata, so
they have been the source of several bugs, though they should be
properly tracked now.
For snapshots we don't have any way to track which ones are temporary,
which creates some issues:
- Quota sync mechanism will count them as normal snapshots.
- Manually deleting temporary snapshots after an operation fails will
mess the quota.
- If we are using snapshots instead of clones for backups of in-use
volumes the quota will be messed on completion.
This patch proposes the introduction of a new field for those database
resource tables where we create temporary resources: volumes and
snapshots.
The field will be called "use_quota" and will be set to False for
temporary resources to indicate that we don't want them to be counted
towards quota on deletion.
Instead of using "temporary" as the field name, "use_quota" was chosen
to allow covering other cases that should not count towards quota in the
future.
Moving from our current mechanism to the new one is a multi-release
process because we need to have backward compatibility code for rolling
upgrades.
This patch adds everything needed to complete the multi-release process
so that anybody can submit next release patches. To do so the patch
adds backward compatible code adding the feature in this release and
TODO comments with the exact changes that need to be done for the next
2 releases.
The removal of the compatibility code will be done in the next release,
and in the one after that we'll remove the temporary metadata rows that
may still exist in the database.
With this new field we'll be able to make our DB queries more efficient
for quota usage calculations, reduce the chances of introducing new
quota usage bugs in the future, and allow users to filter in/out
temporary volumes on listings.
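As a rough sketch of how the new field simplifies quota handling on deletion; the field name comes from this patch, but the delta shape and helper are illustrative, not the actual Cinder quota code:

```python
def quota_delta_on_delete(resource):
    """Return quota usage deltas to apply when deleting a resource.

    With the use_quota field, temporary resources simply report no
    deltas, replacing the old heuristics based on migration_status
    and admin metadata.
    """
    # Temporary resources carry use_quota=False and are never counted.
    if not resource.get('use_quota', True):
        return {}
    return {'volumes': -1, 'gigabytes': -resource['size']}
```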
Closes-Bug: #1923828
Closes-Bug: #1923829
Closes-Bug: #1923830
Implements: blueprint temp-resources
Change-Id: I98bd4d7a54906b613daaf14233d749da1e1531d5
Volume usage notifications during Cinder volume migrations are
incorrect.
Part of a volume migration is creating a new volume on the destination
backend and during this process Cinder will issue volume usage
notifications "create.start" and "create.end" even though this is not a
user volume creation and is inconsistent with other temporary volume and
snapshot creation cases.
Also, one of the last steps during the volume migration is to delete one
of the two volumes (the source or the destination), and in that case
Cinder will only issue a "delete.end" notification without its
corresponding "delete.start".
Since temporary volumes (for backups or migrations) are not counted
towards quota usage they should also not issue volume usage
notifications.
This patch makes sure that we don't do notifications when creating or
deleting temporary migration volumes.
In both cases it checks the migration_status field to see if it starts
with 'target:'. For creation the migration_status is set in
_migrate_volume_generic method before making the RPC call, so the data
will be there from the start, before the manager flow starts.
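A sketch of that check, under the field names described above:

```python
def is_migration_target(volume):
    """True for the temporary destination volume of a migration.

    The manager skips create.start/create.end and
    delete.start/delete.end usage notifications for such volumes.
    """
    # migration_status is set to 'target:<source-volume-id>' in
    # _migrate_volume_generic before the RPC call is made.
    migration_status = volume.get('migration_status') or ''
    return migration_status.startswith('target:')
```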
Closes-Bug: #1922920
Change-Id: I7164d700ef56a29e5d4f707fd2340e621bd6f351
Glance has changed the format of the cinder URIs in image locations
so that they can look like
cinder://glance-store-name/volume_id
in addition to the legacy format
cinder://volume_id
Change the cinder code so that it can handle both formats for
reading. (We only need to write the legacy format.)
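A minimal sketch of handling both formats when reading (illustrative, not the actual cinder/glance store code):

```python
from urllib.parse import urlparse


def parse_cinder_uri(uri):
    """Extract (store_name, volume_id) from a cinder:// image location.

    Handles both the legacy form cinder://<volume_id> and the newer
    cinder://<glance-store-name>/<volume_id>.
    """
    parsed = urlparse(uri)
    if parsed.path:
        # New format: the netloc is the glance store name and the
        # path component carries the volume id.
        return parsed.netloc, parsed.path.lstrip('/')
    # Legacy format: the netloc itself is the volume id.
    return None, parsed.netloc
```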
Change-Id: I8c176bf4c875061591bb6c94654a2cef643a4dcb
Closes-bug: #1898075
After a clone of an encrypted volume is created, an attach is attempted.
The PowerMax driver requires the provider_location to be populated in
order to find the volume to attach. For this, the volume object needs to
be updated.
Closes-Bug: #1913054
Change-Id: Idf5b3783ddc333d6d60f28a3d08e5fd28e5c1fa8
If a volume_type is not specified in a volume-create request, change
I4da0c13b5b3f8174a30b8557f968d6b9e641b091 (introduced in Train) sets a
default volume_type in the REST API layer. This prevents the
selection logic in cinder.volume.flows.api.create_volume.
ExtractVolumeRequestTask from being able to infer the appropriate
volume_type from the source volume, snapshot, or image metadata, and
has caused a regression where the created volume is of the default
type instead of the inferred type.
This patch removes setting the default volume_type in the REST API
and modifies the selection code in ExtractVolumeRequestTask slightly
to make sure a volume_type is always assigned in that function, and
adds and revises some tests.
Change-Id: I05915f2e32b1229ad320cd1c5748de3d63183b91
Closes-bug: #1879578
test.py was previously not in the tests
directory. This means that downstream packagers
of Cinder have to specifically exclude it from
the main Cinder package (which does not typically
include unit tests).
Move it under the cinder/tests/unit/ dir where it
should be, to clean this up.
Change-Id: I65c50722f5990f540d84fa361b997302bbc935c5
When we reach xtremio_volumes_per_glance_cache, the driver is not able
to create more clones for the image-volume and therefore needs to
create a new cache entry.
Keeping in mind the case in [1], we can get CinderException for
various reasons from different drivers during the clone operation.
So we define a new exception for the xtremio driver to force creating
a new cache entry when the limit is reached, without enforcing the same
for other drivers.
[1] https://bugs.launchpad.net/cinder/+bug/1552734
Closes-Bug: #1858169
Change-Id: I2bf964d5a7b2048db9be1ea3eb97cd517e112c5b
This adds usage of the flake8-import-order extension to our flake8
checks to enforce consistency on our import ordering to follow the
overall OpenStack code guidelines.
Since we have now dropped Python 2, this also cleans up a few cases for
things that were third party libs but became part of the standard
library such as mock, which is now a standard part of unittest.
Some questions, in order of importance:
Q: Are you insane?
A: Potentially.
Q: Why should we touch all of these files?
A: This adds consistency to our imports. The extension makes sure that
all imports follow our published guidelines of having imports ordered
by standard lib, third party, and local. This will be a one time
churn, then we can ensure consistency over time.
Q: Why bother? This doesn't really matter.
A: I agree - but...
We have the issue that we have less people actively involved and less
time to perform thorough code reviews. This will make it objective and
automated to catch these kinds of issues.
But part of this, even though it may seem a little annoying, is about
making it easier for contributors. Right now, we may or may not notice
if something is following the guidelines or not. And we may or may not
comment in a review to ask for a contributor to make adjustments to
follow the guidelines.
But then further along into the review process, someone decides to be
thorough, and after the contributor feels like they've had to deal with
other change requests and things are in really good shape, they get a -1
on something mostly meaningless as far as the functionality of their
code. It can be a frustrating and disheartening thing.
I believe this actually helps avoid that by making it an objective thing
that they find out right away up front - either the code is following
the guidelines and everything is happy, or it's not and running local
jobs or the pep8 CI job will let them know right away and they can fix
it. No guessing on whether or not someone is going to take a stand on
following the guidelines or not.
This will also make it easier on the code reviewers. The more we can
automate, the more time we can spend in code reviews making sure the
logic of the change is correct and less time looking at trivial coding
and style things.
Q: Should we use our hacking extensions for this?
A: Hacking has had to keep back linter requirements for a long time now.
Current versions of the linters actually don't work with the way
we've been hooking into them for our hacking checks. We will likely
need to do away with those at some point so we can move on to the
current linter releases. This will help ensure we have something in
place when that time comes to make sure some checks are automated.
Q: Didn't you spend more time on this than the benefit we'll get from
it?
A: Yeah, probably.
Change-Id: Ic13ba238a4a45c6219f4de131cfe0366219d722f
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
We've kept hacking capped for a long time now. This raises the hacking
package version to the latest release and fixes the issues that it
found.
Change-Id: I933d541d9198f9742c95494bae6030cb3e4f2499
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
Much of our code renames this at import already --
just name it "volume_utils" for consistency, and
to make code that imports other modules named "utils"
less confusing.
Change-Id: I3cdf445ac9ab89b3b4c221ed2723835e09d48a53
This patch fixes the image volume cache so that a new cache entry is
created whenever cloning an existing entry fails (which can happen, for
example, when a cached volume reaches its snapshot limit). This restores
the original behavior, which was broken by [1].
[1] I547fb4bcdd4783225b8ca96d157c61ca3bcf4ef4
Closes-Bug: #1801595
Change-Id: Ib5947e2c7300730adb851ad58e898a29f2b88525
Fix an issue with the 'resource_backend' included in the scheduler spec
for creating a volume associated with another volume, snapshot, or
group/cg. When running A/A, the 'resource_backend' must reference the
cluster, not the host.
Enhance the unit tests that cover this area. This includes fixing the
'expected_spec' so it copies a dictionary rather than referencing it,
so that external changes to the dictionary don't inadvertently update
the unit test's expected results.
Closes-Bug: #1808343
Change-Id: I7d414844d094945b55a094a8426687595f22de28
The "_copy_image_to_volume" method is a common function for copying an
image to a volume.
It can also be used by the re-image interface, so this patch
extracts it into volume utils.
part of blueprint: add-volume-re-image-api
Change-Id: I89471fd2737d0b21ce029a6ad7ed43a6e8bf4810
This patch logs a user message when volume creation fails.
This is only for _create_raw_volume; other patches will be
submitted for _create_from_snapshot etc.
Change-Id: I9ba87863623a9c5806e93b69e1992cabce2f13b9
Partial-Bug: #1799159
This patch considers two situations to add test cases:
1. Create volume from bootable source volume.
2. Create volume from non-bootable source volume.
Change-Id: Icf250ece974a1794e9c8ce33ac71accc3652665b
When creating a volume from a snapshot of a bootable volume the
bootable setting is ignored. This changes the creation logic so
when snapshot_id is not None, bootable can be set to be the same
as the snapshot's source_volume.
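The fixed selection logic can be sketched as follows (function and parameter names are illustrative, not the actual flow code):

```python
def resolve_bootable(snapshot_id, source_volume_bootable,
                     requested_bootable=False):
    """Decide the bootable flag for a new volume.

    Previously the flag was ignored for snapshot-based creation; with
    the fix, a volume created from a snapshot inherits bootable from
    the snapshot's source volume.
    """
    if snapshot_id is not None:
        # Creating from a snapshot: follow the source volume.
        return source_volume_bootable
    return requested_bootable
```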
Change-Id: Ifca9ca04dfa8eca987dc44ec23cda5e0d431cba2
A volume type's reserved key 'availability_zones' will not be
recognized if the type is specified via the source volume or snapshot.
This patch fixes this issue while refactoring some code regarding
collecting the volume type from a resource.
This patch fixes two issues related to multiattach extra specs:
1. Volume's multiattach attribute will always be False if it's
created by a multiattach type while type is specified in source
volume, snapshot, image metadata or config file, related patch [1].
2. 'MULTIATTACH_POLICY' will not be validated if multiattach flag is
specified in the volume type that from source volume, snapshot,
image metadata or config file [2].
[1]: a32e24ff3e
[2]: ff30971d1f
Closes-Bug: #1770671
Change-Id: Ibec0656d9ead726ca12e62091fda74c04ec99eda
Add image signature verification support when
creating from image.
Change-Id: I37b7a795da18e3ddb18e9f293a9c795e207e7b7e
Partial-Implements: bp cinder-support-image-signing
The Cinder code that processes Glance image metadata
is a bit confused about whether this particular field
is a Glance property or metadata.
Since it isn't a defined Glance property and is stored
in image metadata, ensure that Cinder also tracks it as
metadata and not as a property.
Prior to this fix, this mismatch caused Cinder to create
volumes with the wrong encryption key when creating a
volume from an encrypted image, which results in an
unreadable volume.
Closes-Bug: #1764125
Change-Id: Ie5af3703eaa82d23b50127f611235d86e4104369
Now that availability zone is highly integrated into the
volume type's extra specs, it is recognized
when creating and retyping, and we can also filter
volume types by extra spec.
Change-Id: I4e6aa7af707bd063e7edf2b0bf28e3071ad5c67a
Partial-Implements: bp support-az-in-volumetype
Refactor the code that creates a volume from a downloaded glance image
to minimize the scope of the lock that prevents multiple entries in the
volume image cache. Now the lock serializes only the portion of the
code that causes the cache entry to be created. Locking is minimized
when the volume is already cached, or when the volume won't be cached.
Closes-Bug: #1758414
Change-Id: I547fb4bcdd4783225b8ca96d157c61ca3bcf4ef4
A few unit tests hit the path in create volume that now checks for
and creates the image conversion directory. For unit test runs,
this results in a 'conversion' dir being created in the root of
the repo. This adds mocking to prevent local filesystem modification.
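The mocking pattern can be sketched as follows; the patched target and directory name illustrate the idea, not the exact test code:

```python
import os
from unittest import mock

# Patching os.makedirs keeps the code under test from creating a real
# 'conversion' directory in the repository root during unit test runs.
with mock.patch('os.makedirs') as mock_makedirs:
    os.makedirs('conversion')  # stands in for the create-volume path
# The call was recorded by the mock; no directory was created on disk.
mock_makedirs.assert_called_once_with('conversion')
```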
Change-Id: I9d3d15411089fc020257b4021a5263a18433eece
At the moment, the check for disk space when booting from an image
happens regardless of whether the backend supports cloning.
This means that even if a backend can clone the image directly, the
control plane must have enough disk to download the entire image,
which can be unreasonable with backends such as RBD.
This patch moves the code which checks for enough disk space to be
in the same function that effectively downloads the image, ensuring
that it only runs when an image has to be downloaded and avoiding
the check if the backend successfully cloned the image.
Closes-Bug: #1744383
Change-Id: Ibfd6f40e8b8ab88d4ec76e9ac27617a0f97b6c29
A regression was introduced in
I970c10f9b50092b659fa2d88bd6a02f6c69899f2
on backends supporting storage pools. By extracting the
'host@backend' out of the host attr from a parent volume,
the scheduler may attempt to put volumes into other pools
belonging to the backend.
Many backends cannot clone across storage pools,
or create storage volumes from snapshots from other
storage pools.
Change-Id: Ic4c8f29bef2c82550d6d6f03f8fa1dc80696f56e
Closes-Bug: #1732557
The create_from_image() function checks for free space in the conversion
path (by default /var/lib/cinder/conversion) even if cinder is creating
the volume from an image snapshot.
If /var/lib/cinder/conversion has less free space than the size of the
image the volume is created from, an exception will be thrown by the
check_available_space() function in the image_utils module.
This patch checks if cinder is configured with
allowed_direct_url_schemes = cinder
If so, cinder should not check free space in the conversion path because
the volume is created from an image snapshot.
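A sketch of the resulting decision (function and parameter names are illustrative):

```python
def should_check_conversion_space(allowed_direct_url_schemes,
                                  image_location_scheme):
    """Decide whether to verify free space in the conversion directory.

    When direct 'cinder' URLs are allowed and the image location is a
    cinder volume, the volume is created from the image snapshot with
    no download/conversion, so the free-space check is skipped.
    """
    return not ('cinder' in allowed_direct_url_schemes
                and image_location_scheme == 'cinder')
```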
Closes-Bug: #1683228
Change-Id: I6330ed31dc9336c9d3d09c2d43c8f4913744e9a2
This patch implements the spec of creating volume from backup.
Change-Id: Icdc6c7606c43243a9e12d7a42df293b729f589e5
Partial-Implements: blueprint support-create-volume-from-backup
This patch is a minor reorg of existing code to facilitate subsequent
changes for migrating legacy keys to a modern key manager such as
Barbican.
volume/utils.py now supports these functions that manage the lifecycle
of an encryption key:
- create_encryption_key()
- delete_encryption_key()
- clone_encryption_key()
create_encryption_key() is an existing function, but it was not used
everywhere a key was created. All code that needs to create a key now
does so by calling this function.
delete_encryption_key() is primarily a small wrapper, but it will play
a larger role in subsequent patches related to key migration. One
functional improvement in this patch is that it no longer propagates an
exception when deleting an unknown key ID. This avoids needlessly
blocking attempts to delete an encrypted volume for fear that the key
could not be deleted.
clone_encryption_key() is a small wrapper that improves the transparency
of code that needs to create a new encryption key ID from an existing
key ID.
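A sketch of the wrapper's error handling, with a stand-in key manager and exception type (the real wrapper lives in Cinder's volume/utils.py and the real exceptions come from the key manager library):

```python
class KeyManagerError(Exception):
    """Stand-in for the key manager's unknown-key failure."""


class FakeKeyManager:
    """Fake manager that always fails, as if the key ID were unknown."""
    def __init__(self):
        self.deleted = []

    def delete(self, context, key_id):
        self.deleted.append(key_id)
        raise KeyManagerError(key_id)


def delete_encryption_key(key_manager, context, encryption_key_id):
    """Delete a key without propagating an unknown-key failure.

    Swallowing the error keeps deletion of an encrypted volume from
    being blocked by a key that cannot be found.
    """
    try:
        key_manager.delete(context, encryption_key_id)
    except KeyManagerError:
        # Key already gone or unknown; let the volume delete proceed.
        pass
```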
Implements: blueprint migrate-fixed-key-to-barbican
Change-Id: I2108e77a8d07dddfb9ec284b3930a197854bd884
Currently we increment the gigabytes quota twice, once while
creating the snapshot and again while creating the individual volumes;
this is fixed in this change. Also, the snapshot quota update was placed
inside a loop over volumes, because of which it got incremented once
per volume.
Change-Id: I9ef79a21c7438e69221a5ed2a1c1bfb59f3f9a32
Closes-Bug: 1728834
Pass the request to the scheduler rather than the volume service in
order to check the backend's capacity.
Change-Id: I970c10f9b50092b659fa2d88bd6a02f6c69899f2
Partial-Implements: blueprint inspection-mechanism-for-capacity-limited-host
When creating a volume from an image with the image cache enabled,
cinder forgets to clean up the consistencygroup field in the volume
object before creating the image cache volume entry, so the volume
object cannot be updated.
Change-Id: Ib9bff05d2d5d7e8c7e0919fa9be0776d9cdc0d12
Closes-Bug: #1701998
When creating a volume, Cinder uses an az cache to validate the az. This
leads to a problem: if a new cinder-volume node with a different
az is added to the OpenStack environment, or an admin changes an
existing cinder-volume's az, creating a volume with that az will fail
until the cache is refreshed.
This patch refreshes the cached az if the previous create
task failed to find the target.
Change-Id: I3e884af1499ea2ddea1a3603d3d09c31a1f62811
Closes-Bug: #1693084