This change enables ephemeral encryption support to convert:
* encrypted source image to unencrypted destination image
* unencrypted source image to encrypted destination image
* encrypted source image to encrypted destination image
This also makes necessary changes for mypy checks to pass.
Related to blueprint ephemeral-storage-encryption
Change-Id: I9edc87006b1f7de69bc52f916f45c2cbb66abe23
oslo.utils is planning to make JSON the default output format parsed
when creating QemuImgInfo objects. As such this change makes JSON the
default output_format requested when calling qemu-img info.
The majority of this change is actually test removal from
nova.tests.unit.virt.libvirt.test_utils as these human readable qemu-img
based tests now duplicate tests found in oslo.utils itself.
Change-Id: I56676713571e79f05ee3f0bffc5da8386e02c5d4
The updated minimum required libvirt (4.0.0) and QEMU (2.11)
for "Ussuri" satisfy the version requirements; this was done
in Change-Id: Ia18e9be4d (22c1916b49 — libvirt: Bump
MIN_{LIBVIRT,QEMU}_VERSION for "Ussuri", 2019-11-19).
Drop the version constant QEMU_VERSION_REQ_SHARED and now-needless
compatibility code; adjust/remove tests.
Change-Id: If878a023c69f25a9ea45b7de2ff9eb1976aaeb8c
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
This change addresses an old TODO in the images module by dropping the
use of a Libvirt specific configurable from the qemu_img_info function.
We can identify RBD based volumes by checking for 'rbd:' at the start of
the path provided to the function instead of using the configurable.
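A minimal sketch of the prefix check described above. The helper name is_rbd_path is hypothetical and is not taken from nova.virt.images:

```python
def is_rbd_path(path: str) -> bool:
    """Return True when the path names an RBD volume rather than a local file."""
    # RBD volumes are identified purely by the 'rbd:' prefix, so no
    # driver-specific configuration needs to be consulted.
    return path.startswith("rbd:")
```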
Change-Id: Ife9e67d5c71f4cca825dff713f54ec955508f6e6
This change is a follow up to I0c3f14100a18107f7e416293f3d4fcc641ce5e55
and replaces the direct call to nova.privsep.qemu with one to the images
API that now returns an oslo_utils.imageutils.QemuImgInfo object.
Version 4.1.0 of oslo.utils introduced support for the format-specific
data returned by qemu-img info for LUKSv1 based images.
Change-Id: I573396116e10cf87f80f1ded55f2cd8f498859e4
This will allow for the use of the JSON output format that is easier to
parse within QemuImgInfo and should allow additional information to be
extracted from qemu-img calls in the future.
Change-Id: I0b6d1a98726ffa1ebc78fb3c4563a2e4b40ddeff
This is mostly code motion from the nova.virt.images module into privsep
to allow for both privileged and unprivileged calls to be made.
A privileged_qemu_img_info function is introduced allowing QEMU to
access devices requiring root privileges, such as host block devices.
Change-Id: I5ac03f923d9d181d22d44d8ec8fbc31eb0c3999e
This doesn't exist for 'nova.volume' and no longer exists for
'nova.network'. There's only one image backend we support, so do like
we've done elsewhere and just use 'nova.image.glance'.
Change-Id: I7ca7d8a92dfbc7c8d0ee2f9e660eabaa7e220e2a
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Use virt.images.convert_image instead of invoking 'qemu-img convert'
directly so that cache option can be set properly. i.e. cache=none
when O_DIRECT is supported by the filesystem.
The big advantage of this is that it will bypass the host page cache
so that converting large images won't evict guest data from the cache.
Refactor images.convert_image so that the compression option can be
controlled by the caller.
Change-Id: I4b7be98b5832ca8c580339fcfb7b9203264b5ff8
Signed-off-by: Jack Ding <jack.ding@windriver.com>
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Introduce an I/O semaphore to limit the number of concurrent
disk-IO-intensive operations. This could reduce disk contention from
image operations like image download, image format conversion, snapshot
extraction, etc.
The new config option max_concurrent_disk_ops can be set in nova.conf
per compute host and would be virt-driver-agnostic. It defaults to 0,
which means no limit.
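One way such a limit could be sketched, assuming a plain threading semaphore. The helper name make_disk_ops_guard is hypothetical, and the real change may use a different concurrency primitive:

```python
import contextlib
import threading


def make_disk_ops_guard(max_concurrent_disk_ops: int):
    """Return a context manager that bounds concurrent disk operations.

    A value of 0 means "no limit", matching the default described above.
    """
    if max_concurrent_disk_ops <= 0:
        # Unlimited: hand back a reusable no-op context manager.
        return contextlib.nullcontext()
    return threading.BoundedSemaphore(max_concurrent_disk_ops)
```

Callers would then wrap each disk-intensive step in "with guard:" so that at most max_concurrent_disk_ops of them run at once.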
blueprint: io-semaphore-for-concurrent-disk-ops
Change-Id: I897999e8a4601694213f068367eae9608cdc7bbb
Signed-off-by: Jack Ding <jack.ding@windriver.com>
Implement the certificate_utils module. The module's verify_certificate
method can be applied to the creation or rebuild of an instance. It is
triggered in one of two ways:
1) The enable_certificate_validation configuration option is set to
True in Nova's glance configuration (alongside the
verify_glance_signatures option also set to True)
2) A list of trusted certificate IDs is provided
Change-Id: I0ae2dbf66241207a425bf7d0fc02a4d2e2dea409
Implements: blueprint nova-validate-certificates
Following a similar pattern to previous changes, move calls to
qemu-img to convert between image formats to use privsep.
Change-Id: I2c3df909a783e1480d3ab4ca10b34b84ac9e4b5f
blueprint: hurrah-for-privsep
The update_available_resource periodic task in the compute manager
eventually calls through to the resource tracker and virt driver
get_available_resource method, which gets the guests running on
the hypervisor, and builds up a set of information about the host.
This includes disk information for the active domains.
However, the periodic task can race with instances being deleted
concurrently and the hypervisor can report the domain but the driver
has already deleted the backing files as part of deleting the
instance, and this leads to failures when running "qemu-img info"
on the disk path which is now gone.
When that happens, the entire periodic update fails.
This change simply tries to detect the specific failure from
'qemu-img info' and translate it into a DiskNotFound exception which
the driver can handle. In this case, if the associated instance is
undergoing a task state transition such as moving to another host or
being deleted, we log a message and continue. If the instance is in
steady state (task_state is not set), then we consider it a failure
and re-raise it up.
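The handling described above can be sketched as follows. DiskNotFound here is a local stand-in class, and disk_size_or_skip and its get_size callback are hypothetical names used only for illustration:

```python
class DiskNotFound(Exception):
    """Stand-in for the exception raised when a disk path has vanished."""


def disk_size_or_skip(get_size, task_state):
    """Return the disk size, or None if the disk vanished mid-transition.

    get_size is any callable that may raise DiskNotFound. A task_state that
    is set means the instance is being deleted, migrated, and so on, so a
    missing disk is expected and can be skipped.
    """
    try:
        return get_size()
    except DiskNotFound:
        if task_state is not None:
            # Instance is in transition: skip it and carry on.
            return None
        # Steady state: a missing disk is a real failure.
        raise
```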
Note that we could add the deleted=False filter to the instance query
in _get_disk_over_committed_size_total but that doesn't help us in
this case because the hypervisor says the domain is still active
and the instance is not actually considered deleted in the DB yet.
Change-Id: Icec2769bf42455853cbe686fb30fda73df791b25
Closes-Bug: #1662867
If /var/lib/nova/instances is mounted on a filesystem like tmpfs that
doesn't have support for O_DIRECT, "qemu-img convert" currently crashes
because it's unconditionally using the "-t none" flag.
This patch therefore:
- moves the _supports_direct_io() function out of the libvirt driver,
from nova/virt/libvirt/driver.py to nova/utils.py and makes it public.
- uses that function to decide to use -t none or -t writethrough when
converting images with qemu-img.
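A simplified sketch of the two steps listed above. The probe below only tries to open a file with O_DIRECT, which is an assumption about how such a check can be done and is less thorough than the function moved into nova/utils.py. The name convert_cache_mode is hypothetical:

```python
import os
import tempfile


def supports_direct_io(dirpath: str) -> bool:
    """Best-effort probe: can a file in dirpath be opened with O_DIRECT?"""
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # The platform has no O_DIRECT flag at all.
        return False
    fd, path = tempfile.mkstemp(dir=dirpath)
    os.close(fd)
    try:
        os.close(os.open(path, os.O_WRONLY | o_direct))
        return True
    except OSError:
        # Some filesystems reject O_DIRECT at open time.
        return False
    finally:
        os.unlink(path)


def convert_cache_mode(direct_io_supported: bool) -> str:
    """Pick the qemu-img -t value: none with O_DIRECT, else writethrough."""
    return "none" if direct_io_supported else "writethrough"
```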
Closes-Bug: #1734784
Co-Authored-By: melanie witt <melwittt@gmail.com>
Change-Id: Ifb47de00abf3f83442ca5264fbc24885df924a19
Qemu 2.10 added the requirement of a --force-share flag to qemu-img
info when reading information about a disk that is in use by a
guest. We do this a lot in Nova for operations like gathering
information before live migration.
Up until this point all qemu/libvirt version matching has been solely
inside the libvirt driver, however all the image manip code was moved
out to nova.virt.images. We need the version of QEMU available there.
This does it by initializing that version on driver init host. The net
effect is also that broken libvirt connections are figured out
earlier, as there is an active probe for this value.
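The version gate can be sketched roughly as below. Representing the QEMU version as a tuple and the helper name qemu_img_info_cmd are assumptions made for illustration; the driver may store the version differently:

```python
def qemu_img_info_cmd(path, qemu_version):
    """Build a qemu-img info argv, adding --force-share on QEMU >= 2.10."""
    cmd = ["qemu-img", "info"]
    if qemu_version >= (2, 10, 0):
        # Needed to inspect a disk that is in use by a running guest.
        cmd.append("--force-share")
    cmd.append(path)
    return cmd
```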
Co-Authored-By: Sean Dague <sean@dague.net>
Change-Id: Iae2962bb86100f03fd3ad9aac3767da876291e74
Closes-Bug: #1718295
Apparently the current 8 second timeout on qemu-img info may not be
sufficient if snapshot images are > 120G in size.
This bumps that to 30s instead, to provide a backstop without hurting
people with large snapshots.
Change-Id: I877b9401a671904a13bb07bae3636b72d7d20df8
Closes-Bug: #1705340
qemu-img convert defaults to cache=unsafe, which means that when it
completes the output data may still only be in the host kernel's
cache rather than on persistent storage. A host crash at this point
will leave a file with the correct metadata (name, size, ownership,
permissions), but no contents. This will prevent qcow2 instances on
that compute host which use that image from restarting, and requires
manual intervention from an operator to fix.
See also change Id9905a87, which fixes this issue for downloads
without a conversion.
Closes-Bug: #1669844
Change-Id: I33bd99b0752111ff7057f9bd40e58dcde77c7d95
We've got user-reported bugs when operating with slow NFS backends
with large (30+ GB) disk files. The prlimit of cpu_time 2 is suspected
to be the cause, because if folks hot patch in a qemu-img call that runs
before the prlimited one, the prlimited one succeeds.
This increases the allowed cpu timeout, as well as tweaking the error
message so that we return something more prescriptive when the
qemu-img command fails with prlimit abort.
In the original bug (#1449062), the main mitigation concern was a
carefully crafted image that gets qemu-img to generate > 1G of json,
and hence could be a node attack vector. cpu_time was never mentioned,
and I think was added originally as a belt and suspenders addition. As
such, bumping it to 8 seconds shouldn't impact our protection in any
real way.
Change-Id: I1f4549b787fd3b458e2c48a90bf80025987f08c4
Closes-Bug: #1646181
Alpine Linux runs a qemu-img version in which the "-f qcow2"
parameter for convert cannot be placed at the end of the command line,
because a location is expected there. The parameter must be
given before the source and destination.
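A small sketch of an argument order that keeps every option ahead of the two positional paths. The helper name qemu_img_convert_cmd is hypothetical and the exact option set used by Nova may differ:

```python
def qemu_img_convert_cmd(source, dest, in_format, out_format):
    """Build a qemu-img convert argv with options before the positionals."""
    cmd = ["qemu-img", "convert", "-O", out_format]
    if in_format is not None:
        # The input format goes before the source and destination paths.
        cmd += ["-f", in_format]
    cmd += [source, dest]
    return cmd
```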
Change-Id: I7f75eb62c19190c0d523b49dd371d603cafd753f
Closes-Bug: 1634156
images.fetch was passed a max_size argument, but did not use it.
images.fetch_to_raw used the max_size argument to check that the
image being downloaded is not larger than target instance's root disk.
However, this check does not make sense in this context. fetch_to_raw
is used to download directly to the image cache, which means when
booting multiple instances on the same compute it only executes once.
However, the check obviously needs to happen against every instance,
not just the first to use a particular image. Consequently every image
backend duplicates this check, making the check in fetch_to_raw both
confusing and redundant.
There are a couple of callers outside the libvirt driver. These do not
pass the max_size argument, and are therefore unaffected.
Implements: blueprint libvirt-instance-storage
Change-Id: I70a559f3dc9b59097ff6923920f4377cca00d1b2
This is done by detecting the ploop format as a directory
containing a DiskDescriptor.xml file.
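The detection rule can be sketched as below. The function name is_ploop is hypothetical:

```python
import os


def is_ploop(path: str) -> bool:
    """Return True if path is a directory holding a DiskDescriptor.xml file."""
    return os.path.isdir(path) and os.path.exists(
        os.path.join(path, "DiskDescriptor.xml"))
```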
Change-Id: Icde0152ba1e735293fdaebde592a43a9242a6c3f
Partial-Bug: #1528638
This uses the new 'prlimit' parameter for oslo.concurrency execute
method, to set an address space limit of 1GB and CPU time limit
of 2 seconds, when running qemu-img.
This is a re-implementation of the previously reverted commit
commit da217205f5
Author: Tristan Cacqueray <tdecacqu@redhat.com>
Date: Wed Aug 5 17:17:04 2015 +0000
virt: Use preexec_fn to ulimit qemu-img info call
Closes-Bug: #1449062
Change-Id: I135b5242af1bfdcb0ea09a6fcda21fc03a6fbe7d
Callers were passing user_id and project_id to these functions,
but they were not being used. This change allows a subsequent patch to
drop an instance object as a function argument which has no purpose
other than to provide these unused values.
Change-Id: I844b97523b28b327e76e01ef7f16b57a415418ec
The config options of the "nova.conf" section "libvirt" got
moved to the new central location "nova/conf/libvirt.py".
Subsequent patches will then move other options in the libvirt section.
This is the second patch in a long chain of patches.
Change-Id: I3e452172e366b87b373eff33c454472c6be8f1f2
Co-Authored-by: Markus Zoeller <mzoeller@de.ibm.com>
Implements: blueprint centralize-config-options-newton
Skip creating the formatted log message
if the message is not going to be emitted
because of the log level.
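With the standard logging module this is typically done by passing the arguments separately instead of formatting the string up front. The function name and message text below are illustrative only:

```python
import logging

LOG = logging.getLogger(__name__)


def log_image_info(image_id, info):
    """Log image details without formatting them unless DEBUG is enabled."""
    # The arguments are interpolated only if the record is actually emitted,
    # unlike LOG.debug("..." % (image_id, info)), which always formats.
    LOG.debug("Image %s has info %s", image_id, info)
```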
TrivialFix
Change-Id: Iba9f47163a0ac3aca612818272db6d536b238975
Add options from 'virt.images'. These options are part of the 'DEFAULT'
group but are included in the "nova.conf.virt" file in hope that they
can eventually be moved to their own group.
Change-Id: I34d44e3d840a92c9271957ffb5a3b9da88652caa
Implements: bp centralize-config-options
We cleaned up oslo-incubator in 90ae25e38915cc502d9e9c52d59e8fb668a72ae1,
and imageutils was synced to oslo.utils with unit tests in version 3.1.
openstack/common/imageutils.py exists without unit tests. Let's switch
to oslo.utils' imageutils.
Change-Id: Iac3b221d7ad16c16866dc6b11f08afc473da9bc9
ImageCacheManager deletes the base image while the image backend is
copying it to the instance path, causing the instance to go into the
error state. Acquire a lock before removing an image from the cache. If
libvirt is copying the image to the instance path, the image cache
manager won't be able to remove it until libvirt finishes copying it.
Closes-Bug: 1256838
Closes-Bug: 1470437
Co-Authored-By: Michael Still <mikal@stillhq.com>
Depends-On: I337ce28e2fc516c91bec61ca3639ebff0029ad49
Change-Id: I376cc951922c338669fdf3f83da83e0d3cea1532
When doing a live snapshot, the libvirt driver creates an intermediate
qcow2 file with the same backing file as the original disk. However,
it calls qemu-img info without specifying the input format explicitly.
An authenticated user can write data to a raw disk which will cause
this code to misinterpret the disk as a qcow2 file with a
user-specified backing file on the host, and return an arbitrary host
file as the backing file.
This bug does not appear to result in a data leak in this case, but
this is hard to verify. It certainly results in corrupt output.
Closes-Bug: #1524274
Change-Id: I11485f077d28f4e97529a691e55e3e3c0bea8872
When qemu-img is called with oslo_concurrency.processutils.execute,
a ProcessExecutionError is raised when qemu-img either fails to
execute or has a non-zero exit code. This error did not propagate
up to the compute manager with any meaningful information, meaning
that if an instance build failed the error message was the generic
"There are not enough hosts available".
This change captures ProcessExecutionError and re-raises the
exception as either InvalidDiskInfo (in qemu_img_info) or
ImageUnacceptable (in convert_image and fetch_to_raw) and makes the
manager accept this as a cause for a BuildAbortException on the
logic that if the image is bad, things are dire, let's bail.
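The catch-and-re-raise pattern can be sketched with the standard library alone. Here subprocess stands in for oslo.concurrency, InvalidDiskInfo is a local stand-in class, and run_info_command is a hypothetical helper name:

```python
import subprocess


class InvalidDiskInfo(Exception):
    """Stand-in for the exception named above."""


def run_info_command(argv):
    """Run a command, re-raising any failure as InvalidDiskInfo with details."""
    try:
        result = subprocess.run(
            argv, capture_output=True, text=True, check=True)
    except (OSError, subprocess.CalledProcessError) as exc:
        # Keep the underlying reason so callers can report something useful.
        stderr = getattr(exc, "stderr", "") or ""
        raise InvalidDiskInfo(
            f"command {argv!r} failed: {exc} {stderr}".strip()) from exc
    return result.stdout
```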
Based on the code in qemu_img_info it appears there was a
misunderstanding of how processutils.execute behaves so it seems
likely this problem is present elsewhere in the code. This change
attempts to only address the issue as it shows up on the new
instance path described in the related bug.
Change-Id: I4fa1c258db58c70dfbf0178b7bb13978fda3a11f
Closes-Bug: #1436166
The libvirt driver was calling images.convert_image during snapshot to
convert snapshots to the intended output format. However, this
function does not take the input format as an argument, meaning it
implicitly does format detection. This opened an exploit for setups
using raw storage on the backend, including raw on filesystem, LVM,
and RBD (Ceph). An authenticated user could write a qcow2 header to
their instance's disk which specified an arbitrary backing file on the
host. When convert_image ran during snapshot, this would then write
the contents of the backing file to glance, which is then available to
the user. If the setup uses an LVM backend this conversion runs as
root, meaning the user can exfiltrate any file on the host, including
raw disks.
This change adds an input format to convert_image.
Partial-Bug: #1524274
Change-Id: If73e73718ecd5db262ed9904091024238f98dbc0
This reverts commit da217205f5.
The patch caused nova-compute to segfault, among other things
resulting in instances not being spawned successfully.
Closes-Bug: #1506012
Change-Id: I0065dd194a7c910cc9f9d9b468e5d43bf5c9b7c0
This uses the preexec_fn oslo.concurrency execute parameter to set user limits
before calling qemu-img info. The process is limited to 2 seconds of CPU time
and 1GB of virtual memory.
Change-Id: Ib47f1116c94c8f76d2a4a525af24439a4aa15854
Closes-Bug: #1449062
Having read several user reports of FlavorDiskTooSmall, it is apparent
that this error is not meaningful to users. We create 3 new subclasses
of FlavorDiskTooSmall: FlavorDiskSmallerThanImage,
FlavorDiskSmallerThanMinDisk, and VolumeSmallerThanMinDisk, because
these exceptions require different actions by the user. The new
subclasses get a more detailed error message to help the user.
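The resulting hierarchy can be sketched as below. The class names come from the text above, while the message strings are illustrative placeholders and not the wording used in Nova:

```python
class FlavorDiskTooSmall(Exception):
    message = "The flavor's disk is too small for the requested image."


class FlavorDiskSmallerThanImage(FlavorDiskTooSmall):
    message = "The flavor's disk is smaller than the image itself."


class FlavorDiskSmallerThanMinDisk(FlavorDiskTooSmall):
    message = "The flavor's disk is smaller than the image's minimum disk size."


class VolumeSmallerThanMinDisk(FlavorDiskTooSmall):
    message = "The volume is smaller than the image's minimum disk size."
```

Because each new class derives from FlavorDiskTooSmall, existing handlers that catch the parent exception keep working.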
Change-Id: I2234c4f4f9b5ac0780b3ec2e6a168380c07ed8c1
fileutils has graduated to the oslo.utils library.
Remove the old implementation of the read_cached_file function
from nova.utils, and move the read_cached_file and delete_cached_file
functions, along with their unit tests, from fileutils to the
nova.utils module.
Implements: blueprint graduate-fileutils[1]
[1] https://blueprints.launchpad.net/oslo-incubator/+spec/graduate-fileutils
Depends-On: I51ba9076e1fbc16145ee2311f47b7768c16dcb20 (requirements)
Change-Id: I849f1c74ec811dbe82ba6270569a008a49eab465
Convert the use of the incubated version of the log module
to the new oslo.log library.
Sync oslo-incubator modules to update their imports as well.
Co-Authored-By: Doug Hellmann <doug@doughellmann.com>
Change-Id: Ic4932e3f58191869c30bd07a010a6e9fdcb2a12c
The oslo team is recommending that everyone switch to the
non-namespaced versions of libraries. Update the hacking
rule to include a check that prevents oslo.* imports from
creeping back in.
This commit includes:
- using oslo_utils instead of oslo.utils
- using oslo_serialization instead of oslo.serialization
- using oslo_db instead of oslo.db
- using oslo_i18n instead of oslo.i18n
- using oslo_middleware instead of oslo.middleware
- using oslo_config instead of oslo.config
- using oslo_messaging instead of "from oslo import messaging"
- using oslo_vmware instead of oslo.vmware
Change-Id: I3e2eb147b321ce3e928817b62abcb7d023c5f13f
The virt/images.py code uses `CONF.libvirt.images_type` config which is
defined in `nova/virt/libvirt/imagebackend.py`. The problem is that if
`nova/tests/unit/virt/test_images.py` is run before
`nova/tests/unit/virt/libvirt/test_imagebackend.py` then the file containing
the definition won't have been imported yet.
The tests pass at the gate because this order never occurs. But running
locally it can--especially using nosetests.
The solution is to use the `import_opt` construct with the nuance that we
can't use it at a module global level because of a circular dependency issue.
Instead we `import_opt` at call time.
Change-Id: Ifd11e6c2cab62ec6473e129a66c3e12b1928c3f2
Introduces Hyper-V generation 2 VMs support in the Nova Hyper-V
compute driver.
Hyper-V Server 2012 R2 introduces a new feature for virtual machines
named "generation 2", consisting mainly in a new virtual firmware
and better support for synthetic devices.
Operating systems supporting generation 2:
* Windows Server 2012 / Windows 8 and above
* Newer Linux kernels
Co-Authored-By: Simona Iuliana Toader <itoader@cloudbasesolutions.com>
Partially implements: blueprint hyper-v-generation-2-vms
Change-Id: I3a56be74fd49ac845ef8f05d2a3dac93edf8ac78
oslo.i18n uses different marker functions to separate the
translatable messages into different catalogs, which the translation
teams can prioritize translating. For details, please refer to:
http://docs.openstack.org/developer/oslo.i18n/guidelines.html#guidelines-for-use-in-openstack
Marker functions were missing in some places in the network directory.
This commit makes changes:
* Add missing marker functions
* Use ',' instead of '%' while adding variables to log messages
Change-Id: I5a8f381b6f8fdb4e8febe9e6a901f7cdc6846646
Raise an exception if the qemu image doesn't
exist or qemu-img fails.
Co-Authored-By: Chuck Short <chuck.short@canonical.com>
Closes-Bug: #1168318
Change-Id: I6b4123590e7d2934de0bc6add900d708d5986039
oslo.i18n provides the i18n functions that were provided by
oslo-incubator's gettextutils module. Some tests that were
using internal details of the library were removed.
Change-Id: I44cfd5552e0dd86af21073419d31622f5fdb28e0
Brings the wacky download method into the new nova.image.API object and
removes calls to nova.image.glance throughout the virt drivers that were
using various call signatures of glance.ImageService.download. Since
some of the wrapper objects that fetched image bits from Glance were
using the get and update methods of the former
nova.image.glance.GlanceImageService object, those calls were updated to
instead call the new, standardized nova.image.API methods of the same
name and call signature.
The final patch in this series will refactor the remaining direct use of
the nova.image.glance module, which are calls to
glance.generate_image_url() and glance.generate_glance_url().
Change-Id: Ib3a85d0875c06c4e54e4211bc2fb8ce3fe16161d
Partially-implements blueprint: standardize-nova-image
This reverts commit a55bbbfa19.
The series of patches involved with adding this feature introduced
an unexpected dependency on glance's v2 API, which we do not
currently support. Because this quickly triggered a user-facing bug and
left some uncertainty about what else might surface later, we decided
to revert this code given the short time to -rc1.
Closes-bug: 1291014
Change-Id: I2ed6a861e583b9513b0984ff9801d4b9f7536798
Images now support multiple locations within their metadata and may be
stored on more than one storage backend. Nova should add a layer that
transparently handles preparing and removing images for an instance using
the best approach/location, and it should allow administrators to configure
the image handler pipeline in their preferred order.
Also, based on this structure we could in future implement particular
subclasses in the relevant hypervisor layer with more advanced functions,
such as CoW creation, snapshot capture, etc.
Implement bp: image-multiple-location
DocImpact
Change-Id: Idce8d21ae37bfdbb28a2567120a83d1061061904
Signed-off-by: Zhi Yan Liu <zhiyanl@cn.ibm.com>