When deleting an instance, always send VIR_DOMAIN_UNDEFINE_NVRAM to
delete the NVRAM file, regardless of whether the image is of type UEFI.
This prevents a bug when rebuilding an instance from a UEFI image to a
non-UEFI image.
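A minimal sketch of the flag handling described above. The constants are defined locally, with values that are assumed to mirror libvirt's virDomainUndefineFlagsValues, so the snippet runs without the libvirt bindings. The helper name undefine_flags is hypothetical and is not the nova implementation.

```python
# Assumed to mirror libvirt's virDomainUndefineFlagsValues; defined locally
# so the sketch does not need the libvirt bindings.
VIR_DOMAIN_UNDEFINE_MANAGED_SAVE = 1
VIR_DOMAIN_UNDEFINE_NVRAM = 4


def undefine_flags(support_managed_save=True):
    """Return the flags to pass when undefining a domain (hypothetical)."""
    # The NVRAM flag is always set, whether or not the image is UEFI.
    flags = VIR_DOMAIN_UNDEFINE_NVRAM
    if support_managed_save:
        flags |= VIR_DOMAIN_UNDEFINE_MANAGED_SAVE
    return flags
```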
Closes-Bug: #1997352
Change-Id: I24648f5b7895bf5d093f222b6c6e364becbb531f
Signed-off-by: Simon Hensel <simon.hensel@inovex.de>
This makes us attempt to first look up a disk device by alias using
the volume_uuid, before falling back to the old method of using the
guest target device name.
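As an illustration only, the lookup order could be sketched like this against a domain XML document. The "ua-" alias prefix, the sample XML and the find_disk helper are assumptions made for this example, not the code in this change.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML with one aliased and one unaliased disk.
DOMAIN_XML = """
<domain>
  <devices>
    <disk type="block" device="disk">
      <source dev="/dev/sdx"/>
      <target dev="vdb" bus="virtio"/>
      <alias name="ua-0f1a2b3c"/>
    </disk>
    <disk type="file" device="disk">
      <source file="/var/lib/nova/disk"/>
      <target dev="vda" bus="virtio"/>
    </disk>
  </devices>
</domain>
"""


def find_disk(domain_xml, volume_uuid=None, target_dev=None):
    """Look a disk up by alias first, then by guest target device name."""
    disks = ET.fromstring(domain_xml).findall("./devices/disk")
    if volume_uuid is not None:
        wanted = "ua-" + volume_uuid  # assumed alias format
        for disk in disks:
            alias = disk.find("alias")
            if alias is not None and alias.get("name") == wanted:
                return disk
    if target_dev is not None:
        # Fall back to the old method: match on the target device name.
        for disk in disks:
            target = disk.find("target")
            if target is not None and target.get("dev") == target_dev:
                return disk
    return None
```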
Related to blueprint libvirt-dev-alias
Change-Id: I1dfe4ad3df81bc810835af9b09cfc6c06e9a5388
This handles the case where the live migration monitoring thread may
race and call jobStats() after the migration has completed, resulting in
the following error:
libvirt.libvirtError: internal error: migration was active, but no
RAM info was set
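A rough sketch of how such a race can be tolerated. FakeLibvirtError stands in for libvirt.libvirtError, and the job_status helper and its return value are hypothetical; only the error message is taken from the report above.

```python
class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError, for illustration only."""


RACE_MESSAGE = "migration was active, but no RAM info was set"


def job_status(job_stats):
    """Call job_stats() and treat the known race as a completed job."""
    try:
        return job_stats()
    except FakeLibvirtError as exc:
        if RACE_MESSAGE in str(exc):
            # The migration finished between the monitor's checks.
            return {"type": "completed"}
        raise  # any other error is still a real failure
```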
Closes-Bug: #1982284
Change-Id: I77fdfa9cffbd44b2889f49f266b2582bcc6a4267
This change extends the guest xml parsing such that
the source device path can be extracted from interface
elements of type vdpa.
This is required to identify the interface to remove when
detaching a vdpa port from a domain.
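For illustration, the extraction can be sketched with the standard library XML parser. The sample interface XML and the vdpa_source_dev helper are assumptions for this example; nova's own config classes are not used here.

```python
import xml.etree.ElementTree as ET

# Hypothetical vdpa interface element as it might appear in a domain XML.
INTERFACE_XML = """
<interface type="vdpa">
  <mac address="52:54:00:aa:bb:cc"/>
  <source dev="/dev/vhost-vdpa-0"/>
  <model type="virtio"/>
</interface>
"""


def vdpa_source_dev(interface_xml):
    """Return the source device path of a vdpa interface, else None."""
    iface = ET.fromstring(interface_xml)
    if iface.get("type") != "vdpa":
        return None
    source = iface.find("source")
    return source.get("dev") if source is not None else None
```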
This change also fixes a latent bug in the libvirt fixture
related to the domain xml generation for vdpa interfaces.
Change-Id: I5f41170e7038f4b872066de4b1ad509113034960
Now that we no longer support py27, we can use the standard library
unittest.mock module instead of the third party mock lib. Most of this
is autogenerated, as described below, but there is one manual change
necessary:
nova/tests/functional/regressions/test_bug_1781286.py
We need to avoid using 'fixtures.MockPatch' since fixtures is using
'mock' (the library) under the hood and a call to 'mock.patch.stop'
found in that test will now "stop" mocks from the wrong library. We
have discussed making this configurable but the option proposed isn't
that pretty [1] so this is better.
The remainder was auto-generated with the following (hacky) script, with
one or two manual tweaks after the fact:
import glob

for path in glob.glob('nova/tests/**/*.py', recursive=True):
    with open(path) as fh:
        lines = fh.readlines()
    if 'import mock\n' not in lines:
        continue
    import_group_found = False
    create_first_party_group = False
    for num, line in enumerate(lines):
        line = line.strip()
        if line.startswith('import ') or line.startswith('from '):
            tokens = line.split()
            for lib in (
                'ddt', 'six', 'webob', 'fixtures', 'testtools'
                'neutron', 'cinder', 'ironic', 'keystone', 'oslo',
            ):
                if lib in tokens[1]:
                    create_first_party_group = True
                    break
            if create_first_party_group:
                break
            import_group_found = True
        if not import_group_found:
            continue
        if line.startswith('import ') or line.startswith('from '):
            tokens = line.split()
            if tokens[1] > 'unittest':
                break
            elif tokens[1] == 'unittest' and (
                len(tokens) == 2 or tokens[4] > 'mock'
            ):
                break
        elif not line:
            break
    if create_first_party_group:
        lines.insert(num, 'from unittest import mock\n\n')
    else:
        lines.insert(num, 'from unittest import mock\n')
    del lines[lines.index('import mock\n')]
    with open(path, 'w+') as fh:
        fh.writelines(lines)
Note that we cannot remove mock from our requirements files yet due to
importing pypowervm unit test code in nova unit tests. This library
still uses the mock lib, and since we are importing test code and that
lib (correctly) only declares mock in its test-requirements.txt, mock
would not otherwise be installed and would cause errors while loading
nova unit test code.
[1] https://github.com/testing-cabal/fixtures/pull/49
Change-Id: Id5b04cf2f6ca24af8e366d23f15cf0e5cac8e1cc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
These were left to last since there's a bit of cleanup necessary to move
everything across.
Change-Id: I921c812ac03f7d32eec31200772020c17f292851
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
We support Python 3.6 as a minimum now, making these checks no-ops.
Change-Id: I5ca2439c948687022f8d88df978bc7ee77199fcc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
As a follow up of I86153d31b02e6b74b42d53a6800297cbd0e5cbb4 the two
get_disk tests that were mistakenly added to test_driver are now moved to
test_guest, where they belong.
Change-Id: I17bd591ffb96b9b296bea04c87e286a83d40570e
Related-Bug: #1882521
Until now, Nova applied a retry loop that periodically tried to detach the
device from libvirt while the device was visible in the domain xml. This
could lead to an issue where an already progressing detach on the
libvirt side is interrupted by nova re-sending the detach request for
the same device. See bug #1882521 for more information.
Also, if there was both a persistent and a live domain, nova tried to
detach from both in the same call. This led to confusion about the
result when such a call failed: did the detach fail only partially?
We can do better, at least for the live detach case. According to the
libvirt developers, detaching from the persistent domain always
succeeds and is a synchronous process. Detaching from the live
domain can be either synchronous or asynchronous depending on the guest
OS and the load on the hypervisor. But for live detach libvirt always
sends an event [1] that nova can wait for.
So this patch does two things.
1) Separates the detach from the persistent domain from the detach from
the live domain to make the error cases clearer.
2) Changes the retry mechanism.
Detaching from the persistent domain is not retried. If libvirt
reports device not found while both persistent and live detach
are needed, the error is ignored and the process continues with
the live detach. In any other case the error is considered fatal.
Detaching from the live domain is changed to always wait for the
libvirt event. In case of a timeout, the live detach is retried.
But a failure event from libvirt is considered fatal, based on the
information from the libvirt developers, so in this case the
detach is not retried.
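The two steps above can be sketched roughly as follows. This is a simplified model and not the nova implementation: the backend object, its detach_persistent(), detach_live() and wait_for_event() methods, the event names and the exception classes are all hypothetical.

```python
class DeviceNotFound(Exception):
    """The device is not present in the targeted domain config."""


class DetachFailed(Exception):
    """The live detach failed or never completed."""


def detach_device(backend, device, persistent, live, attempts=3, timeout=20):
    """Detach from the persistent config first, then from the live domain."""
    if persistent:
        try:
            backend.detach_persistent(device)  # synchronous, never retried
        except DeviceNotFound:
            if not live:
                raise  # fatal when only the persistent detach was requested
            # Otherwise ignore it and carry on with the live detach.
    if not live:
        return
    for _ in range(attempts):
        backend.detach_live(device)
        event = backend.wait_for_event(device, timeout)
        if event == "removed":
            return
        if event == "failed":
            raise DetachFailed("libvirt reported a failed detach of %s" % device)
        # event is None: timed out waiting, so send the request again.
    raise DetachFailed("timed out detaching %s" % device)


class FakeBackend:
    """Scripted backend used only to exercise the sketch."""

    def __init__(self, events, persistent_missing=False):
        self.events = list(events)
        self.persistent_missing = persistent_missing
        self.live_calls = 0

    def detach_persistent(self, device):
        if self.persistent_missing:
            raise DeviceNotFound(device)

    def detach_live(self, device):
        self.live_calls += 1

    def wait_for_event(self, device, timeout):
        return self.events.pop(0) if self.events else None
```

With FakeBackend([None, "removed"]) the live detach is sent twice, once for the timeout and once for the attempt that succeeds, while FakeBackend(["failed"]) raises DetachFailed after a single attempt.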
Related-Bug: #1882521
[1] https://libvirt.org/html/libvirt-libvirt-domain.html#virConnectDomainEventDeviceRemovedCallback
Change-Id: I7f2b6330decb92e2838aa7cee47fb228f00f47da
At present QEMU will raise an error to libvirt when a device_del request
is made for a device that has already partially detached through a
previous request. This is outlined in more detail in the following
downstream Red Hat QEMU bug report:
Get libvirtError "Device XX is already in the process of unplug" [..]
https://bugzilla.redhat.com/show_bug.cgi?id=1878659
Within Nova we can actually ignore this error and allow our existing
retry logic to attempt again after a short wait, hopefully allowing the
original request to complete removing the device from the domain.
This change does this and should result in one of the following
device_del requests raising a VIR_ERR_DEVICE_MISSING error from libvirt.
_try_detach_device should then translate that libvirt error into a
DeviceNotFound exception which is itself then ignored by all
detach_device_with_retry callers and taken to mean that the device has
detached successfully.
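A simplified sketch of that flow. FakeLibvirtError stands in for libvirt.libvirtError, the error code value is defined locally for illustration, and both helpers are hypothetical reductions of _try_detach_device and detach_device_with_retry. The real retry loop also waits between attempts and checks whether the device is still present.

```python
VIR_ERR_DEVICE_MISSING = 99  # illustrative local value


class FakeLibvirtError(Exception):
    """Stand-in for libvirt.libvirtError, for illustration only."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code

    def get_error_code(self):
        return self.code


class DeviceNotFound(Exception):
    pass


def try_detach(device_del, device):
    try:
        device_del(device)
    except FakeLibvirtError as exc:
        if exc.get_error_code() == VIR_ERR_DEVICE_MISSING:
            raise DeviceNotFound(device)
        if "is already in the process of unplug" in str(exc):
            return  # ignore it and let the retry loop try again later
        raise


def detach_with_retry(device_del, device, attempts=5):
    """Return True once the device is reported missing, i.e. detached."""
    for _ in range(attempts):
        try:
            try_detach(device_del, device)
        except DeviceNotFound:
            return True
    return False
```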
Closes-Bug: #1923206
Change-Id: I0e068043d8267ab91535413d950a3e154c2234f7
This patch adds a from_persistent_config kwarg to get_interface_by_cfg()
and get_disk() so that the caller can specify which domain config the
device is read from. Currently, if there is both a live domain and a
persistent domain, nova only reads from the live domain. In a later
patch, these calls will be used during device detach to detach from the
persistent domain separately from the live domain.
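A minimal sketch of how such a kwarg can select the config to read, assuming it maps onto libvirt's VIR_DOMAIN_XML_INACTIVE flag for XMLDesc(). The flag value is defined locally, and FakeDomain and get_xml_desc are hypothetical stand-ins.

```python
VIR_DOMAIN_XML_INACTIVE = 2  # assumed to mirror libvirt's flag value


class FakeDomain:
    """Stand-in for a virDomain with distinct live and persistent XML."""

    def __init__(self, live_xml, persistent_xml):
        self.live_xml = live_xml
        self.persistent_xml = persistent_xml

    def XMLDesc(self, flags=0):
        if flags & VIR_DOMAIN_XML_INACTIVE:
            return self.persistent_xml
        return self.live_xml


def get_xml_desc(domain, from_persistent_config=False):
    """Read the live XML by default, the persistent XML on request."""
    flags = VIR_DOMAIN_XML_INACTIVE if from_persistent_config else 0
    return domain.XMLDesc(flags)
```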
Change-Id: I86153d31b02e6b74b42d53a6800297cbd0e5cbb4
Related-Bug: #1882521
Libvirt XML contains useful configuration information such as instance names,
flavors and images as metadata. This change extends this metadata to include
the IP addresses of the instances.
Example:
<metadata>
<nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.1">
...
<nova:ports>
<nova:port uuid="567a4527-b0e4-4d0a-bcc2-71fda37897f7">
<nova:ip type="fixed" address="192.168.1.1" ipVersion="4"/>
<nova:ip type="fixed" address="fe80::f95c:b030:7094" ipVersion="6"/>
<nova:ip type="floating" address="11.22.33.44" ipVersion="4"/>
</nova:port>
</nova:ports>
...
</nova:instance>
</metadata>
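As a usage illustration, a consumer could read the addresses back out of this metadata with the standard library XML parser. The port_addresses helper and the trimmed sample document are assumptions for this example.

```python
import xml.etree.ElementTree as ET

NOVA_NS = {"nova": "http://openstack.org/xmlns/libvirt/nova/1.1"}

# Trimmed copy of the example metadata, without the elided sections.
METADATA_XML = """
<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.1">
    <nova:ports>
      <nova:port uuid="567a4527-b0e4-4d0a-bcc2-71fda37897f7">
        <nova:ip type="fixed" address="192.168.1.1" ipVersion="4"/>
        <nova:ip type="floating" address="11.22.33.44" ipVersion="4"/>
      </nova:port>
    </nova:ports>
  </nova:instance>
</metadata>
"""


def port_addresses(metadata_xml):
    """Map each port uuid to a list of (type, address) tuples."""
    root = ET.fromstring(metadata_xml)
    result = {}
    for port in root.findall(".//nova:port", NOVA_NS):
        result[port.get("uuid")] = [
            (ip.get("type"), ip.get("address"))
            for ip in port.findall("nova:ip", NOVA_NS)
        ]
    return result
```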
Change-Id: I45f1df4935905170957c2ea2496c8a698a7464a2
blueprint: libvirt-driver-ip-metadata
Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
Introduced by I32908b77c18f8ec08211dd67be49bbf903611c34 this was then
missed by the clean up of I8e349849db0b1a540d295c903f1470917b82fd97 that
actually bumped MIN_LIBVIRT_VERSION to 5.0.0 past the required 4.1.0 for
the original change. It's safe to remove this now ahead of another bump.
Change-Id: I181b3bf433b0fcef92ef4d430e9858506f24153c
Replace six.text_type with str.
This patch completes six removal.
Change-Id: I779bd1446dc1f070fa5100ccccda7881fa508d79
Implements: blueprint six-removal
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
For attach:
* Generates InstancePciRequest for SRIOV interfaces attach requests
* Claims and allocates a PciDevice for such request
For detach:
* Frees PciDevice and deletes the InstancePciRequests
On the libvirt driver side the following small fixes were necessary:
* Fixes PCI address generation to avoid double 0x prefixes in LibvirtConfigGuestHostdevPCI
* Adds support for comparing LibvirtConfigGuestHostdevPCI objects
* Extends the comparison of LibvirtConfigGuestInterface to support
macvtap interfaces where target_dev is only known by libvirt but not
nova
* Generalizes guest.get_interface_by_cfg() to work with both
LibvirtConfigGuest[Interface|HostdevPCI] objects
Implements: blueprint sriov-interface-attach-detach
Change-Id: I67504a37b0fe2ae5da3cba2f3122d9d0e18b9481
The VIR_MIGRATE_PARAM_PERSIST_XML parameter was introduced in libvirt
v1.3.4 and is used to provide the new persistent configuration for the
destination during a live migration:
https://libvirt.org/html/libvirt-libvirt-domain.html#VIR_MIGRATE_PARAM_PERSIST_XML
Without this parameter the persistent configuration on the destination
will be the same as the original persistent configuration on the source
when the VIR_MIGRATE_PERSIST_DEST flag is provided.
As Nova does not currently provide the VIR_MIGRATE_PARAM_PERSIST_XML
param but does provide the VIR_MIGRATE_PERSIST_DEST flag this means that
a soft reboot by Nova of the instance after a live migration can revert
the domain back to the original persistent configuration from the
source.
Note that this is only possible in Nova as a soft reboot actually
results in the virDomainShutdown and virDomainCreate libvirt APIs being
called that recreate the domain using the persistent configuration.
virDomainReboot does not result in this but is not called at this time.
The impact of this on the instance after the soft reboot is pretty
severe: host devices referenced in the original persistent configuration
on the source may not exist or could even be used by other users on the
destination. CPU and NUMA affinity could also differ drastically between
the two hosts, resulting in the instance being unable to start, etc.
As MIN_LIBVIRT_VERSION is now > v1.3.4 this change simply includes the
VIR_MIGRATE_PARAM_PERSIST_XML param using the same updated XML for the
destination as is already provided to VIR_MIGRATE_PARAM_DEST_XML.
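A small sketch of the resulting parameter dict. The parameter name strings are defined locally and are assumed to match the libvirt typed parameter names, and build_migrate_params is a hypothetical helper rather than the nova code.

```python
# Assumed to match the libvirt parameter names; defined locally so the
# sketch does not need the libvirt bindings.
VIR_MIGRATE_PARAM_DEST_XML = "destination_xml"
VIR_MIGRATE_PARAM_PERSIST_XML = "persistent_xml"


def build_migrate_params(destination_xml=None):
    """Build the params passed to migrateToURI3() (hypothetical helper)."""
    params = {}
    if destination_xml:
        params[VIR_MIGRATE_PARAM_DEST_XML] = destination_xml
        # Reuse the same updated XML as the new persistent configuration so
        # a later soft reboot does not revert to the source's definition.
        params[VIR_MIGRATE_PARAM_PERSIST_XML] = destination_xml
    return params
```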
Co-authored-by: Tadayoshi Hosoya <tad-hosoya@wr.jp.nec.com>
Closes-Bug: #1890501
Change-Id: Ia3f1d8e83cbc574ce5cb440032e12bbcb1e10e98
I7eb86edc130d186a66c04b229d46347ec5c0b625 introduced
VIR_ERR_DEVICE_MISSING into the hot unplug libvirt error code list
within detach_device_with_retry. While the change correctly referenced
that the error code was introduced in v4.1.0 it made no attempt to
handle versions prior to this. With MIN_LIBVIRT_VERSION currently pinned
to v4.0.0 we need to handle libvirt < v4.1.0 to avoid referencing the
non-existent error code within the libvirt module.
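One way to guard against the missing attribute is a getattr() lookup with a default, sketched below. The fake modules and their numeric values are placeholders for illustration, and not_found_codes is a hypothetical helper.

```python
import types


def not_found_codes(libvirt_module):
    """Collect the hot unplug 'not found' codes the module provides."""
    codes = [
        libvirt_module.VIR_ERR_OPERATION_FAILED,
        libvirt_module.VIR_ERR_INTERNAL_ERROR,
    ]
    # Only reference the newer code when the bindings actually define it.
    missing = getattr(libvirt_module, "VIR_ERR_DEVICE_MISSING", None)
    if missing is not None:
        codes.append(missing)
    return codes


# Placeholder modules standing in for older and newer libvirt bindings.
old_libvirt = types.SimpleNamespace(
    VIR_ERR_OPERATION_FAILED=9, VIR_ERR_INTERNAL_ERROR=1)
new_libvirt = types.SimpleNamespace(
    VIR_ERR_OPERATION_FAILED=9, VIR_ERR_INTERNAL_ERROR=1,
    VIR_ERR_DEVICE_MISSING=99)
```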
Closes-Bug: #1891547
Change-Id: I32908b77c18f8ec08211dd67be49bbf903611c34
Remove six.PY2 and six.PY3.
Subsequent patches will replace other six usages.
Change-Id: Iccce0ab50eee515e533ab36c8e7adc10cb3f7019
Implements: blueprint six-removal
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
Introduced in libvirt v4.1.0 [1] this error code replaces the previously
raised VIR_ERR_OPERATION_FAILED, VIR_ERR_INTERNAL_ERROR and
VIR_ERR_INVALID_ARG codes [2][3].
VIR_ERR_OPERATION_FAILED was introduced and tested as an
active/live/hot unplug config device detach error code in
I131aaf28d2f5d5d964d4045e3d7d62207079cfb0.
VIR_ERR_INTERNAL_ERROR was introduced and tested as an
active/live/hot unplug config device detach error code in
I3055cd7641de92ab188de73733ca9288a9ca730a.
VIR_ERR_INVALID_ARG was introduced and tested as an
inactive/persistent/cold unplug config device detach error code in
I09230fc47b0950aa5a3db839a070613c9c817576.
This change introduces support for the new VIR_ERR_DEVICE_MISSING error
code while also retaining coverage for these codes until
MIN_LIBVIRT_VERSION is bumped past v4.1.0.
The majority of this change is test code motion with the existing tests
being modified to run against either the active or inactive versions of
the above error codes for the time being.
test_detach_device_with_retry_operation_internal and
test_detach_device_with_retry_invalid_argument_no_live have been removed
as they duplicate the logic within the now refactored
_test_detach_device_with_retry_second_detach_failure.
[1] https://libvirt.org/git/?p=libvirt.git;a=commit;h=bb189c8e8c93f115c13fa3bfffdf64498f3f0ce1
[2] https://libvirt.org/git/?p=libvirt.git;a=commit;h=126db34a81bc9f9f9710408f88cceaa1e34bbbd7
[3] https://libvirt.org/git/?p=libvirt.git;a=commit;h=2f54eab7c7c618811de23c60a51e910274cf30de
Closes-Bug: #1887946
Change-Id: I7eb86edc130d186a66c04b229d46347ec5c0b625
This is actually two functions, one of which takes an XML string and
another that takes an existing domain. Neither returns a domain but
rather a Guest object, and while the first actually creates a new domain
along with the new Guest, the latter simply creates the new Guest and
resumes the existing domain. Simplify the function by ensuring it only
focuses on the former case, creation of new domains as part of the new
Guest object. The revised function is renamed, along with a related
function, '_create_domain_and_network', to reflect what it actually
returns. The other cases are replaced by a simple call to
'Guest.resume()', which is all the latter case was really doing.
While we're here, we also update the function signature of the latter to
make the 'block_device_info' argument mandatory, since all callers were
passing this anyway. All of this has some test impact but nothing
serious.
We also move some of the tests from 'test_driver' to 'test_guest', since
they were actually testing things from 'virt.libvirt.guest' but clearly
just weren't moved when that module was introduced many cycles ago.
Finally, we rename the '_prepare_domain_for_snapshot' and
'_snapshot_domain' functions to reflect what they're actually doing,
namely suspending and then resuming the domain before and after a
(cold) snapshot, respectively.
Part of blueprint add-emulated-virtual-tpm
Change-Id: I2489bf16dabc8a83b2044139247f4245ae29adb1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Since 0.9.11 virDomainBlockResize has accepted the size argument in
bytes when the VIR_DOMAIN_BLOCK_RESIZE_BYTES flag is provided.
This change switches all callers over to using bytes to simplify the
required call, avoiding the need to divide by units.Ki etc.
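A minimal sketch of the simplified call. The flag value is defined locally and is assumed to mirror libvirt's, and FakeDomain only records the arguments it receives so the snippet runs without libvirt.

```python
VIR_DOMAIN_BLOCK_RESIZE_BYTES = 1  # assumed to mirror libvirt's flag value


class FakeDomain:
    """Stand-in for a virDomain that records blockResize() calls."""

    def __init__(self):
        self.calls = []

    def blockResize(self, disk, size, flags=0):
        self.calls.append((disk, size, flags))


def resize(domain, disk_path, size_bytes):
    """Resize a block device, passing the size in bytes unmodified."""
    # Without the flag the size is interpreted in KiB, which is what forced
    # callers to divide by units.Ki before.
    domain.blockResize(disk_path, size_bytes,
                       flags=VIR_DOMAIN_BLOCK_RESIZE_BYTES)
```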
Change-Id: Ib8d9318596186acd86a738ceea187420698645e6
We use the oslo.utils save_and_reraise_exception context manager in our
detach device code and catch specific exceptions that mean 'not found'
and raise DeviceNotFound instead. When we do that, the
save_and_reraise_exception context manager logs an ERROR traceback of
the original exception, for informational purposes. This is misleading
when trying to debug other issues, as it makes it look like the caught
exception caused a problem.
This passes the reraise=False keyword arg to the context manager and
sets the 'reraise' attribute to True only if we are not going to raise
a different exception.
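The pattern can be sketched as below. To keep the snippet self-contained, save_and_reraise is a heavily simplified stand-in written for this example, not the oslo.utils class, and RuntimeError stands in for libvirt.libvirtError.

```python
import logging
import sys

LOG = logging.getLogger(__name__)


class save_and_reraise:
    """Simplified stand-in for oslo.utils' save_and_reraise_exception."""

    def __init__(self, reraise=True):
        self.reraise = reraise

    def __enter__(self):
        self.exc_info = sys.exc_info()  # the exception being handled
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is not None:
            # A different exception was raised inside the block.
            if self.reraise:
                LOG.error("Original exception being dropped",
                          exc_info=self.exc_info)
            return False
        if self.reraise:
            raise self.exc_info[1]
        return False


class DeviceNotFound(Exception):
    pass


def detach(do_detach):
    try:
        do_detach()
    except RuntimeError as exc:
        with save_and_reraise(reraise=False) as ctx:
            if "not found" in str(exc):
                # No ERROR traceback is logged for the original exception.
                raise DeviceNotFound(str(exc))
            # Not a 'not found' case: re-raise the original exception.
            ctx.reraise = True
```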
Related-Bug: #1836212
Change-Id: Icce1e31fe3ebcbf9e4897bbfa57b7f3d1fba67a3
It turns out that when detaching a device libvirt can raise a
libvirt.VIR_ERR_INTERNAL_ERROR exception with an error log of
"unable to execute QEMU command 'device_del': Device <foo> not found".
Add this exception to the existing "not found" case which currently
handles only libvirt.VIR_ERR_OPERATION_FAILED.
Change-Id: I3055cd7641de92ab188de73733ca9288a9ca730a
Closes-Bug: #1815949
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Nova skips detaching OVS DPDK interfaces
because it thinks they are already detached:
get_interface_by_cfg() returns no interface.
This is because _set_config_VIFVHostUser()
does not set target_dev in the configuration, while
LibvirtConfigGuestInterface sets target_dev
if a "target" tag is found in the interface.
As target_dev is not a valid value for a
vhostuser interface, it is no longer checked
for the vhostuser type.
Change-Id: Iaf185b98c236df47e44cda0732ee0aed1fd6323d
Closes-Bug: #1807340
Recently, the _ThreadingEvent class in oslo.service was removed [1] and
our unit test patching is preventing us from moving to a newer version
of oslo.service [2].
We have patching of the _ThreadingEvent.wait method to bypass the sleep
time in the looping call of RetryDecorator, which adds several seconds
to the run time of unit tests.
This changes things to use the new SleepFixture from oslo.service
instead.
Depends-On: https://review.openstack.org/616371
[1] I62e9f1a7cde8846be368fbec58b8e0825ce02079
[2] https://review.openstack.org/615676
Change-Id: I45dd7602068eb0ce1331cfefd5a0cf6418bc8e88
We need to completely remove the mocks for oslo.service private members
in order to decouple nova from the requirements updates and be able to
update the constraint for oslo.service to a new version that has a
fixture for managing the sleep call being mocked (applied in the
following patch).
Change-Id: I0bbd2d7f9d6eb13d97587d867ef4d651809a7dd4
Signed-off-by: Doug Hellmann <doug@doughellmann.com>
This reverts commit 23446a9552.
With change Ibf2b5eeafd962e93ae4ab6290015d58c33024132 there
is nothing using the migrate_configure_max_speed method any
longer, so it can be removed. An additional mock, added after
the change being reverted, is also removed.
Change-Id: I90d6e14bf9383bf71d65d2180474ba228db2feab
Related-Bug: #1786346
There can be unicode characters in the params for live migration, for
example, the guest domain name in the destination XML. We need to
convert those to bytes when we call migrateToURI3 under python2.
The existing code was just calling str() for this, but that will fail
with the error:
UnicodeEncodeError: 'ascii' codec can't encode characters...
We need to encode the unicode characters to do the conversion.
The existing unit test wasn't using any unicode characters in its test
data, so this scenario wasn't covered.
Closes-Bug: #1768807
Change-Id: I4b34139a3c5e3e2b7cf7cbe50bdf3da3131b9b1c
That's "grease" in the sense of "lubricate", or "make faster". Not in
the sense of "kill with an M3 submachine gun".
Similar optimizations to [1] in libvirt/test_guest (RetryDecorator in
detach_device_with_retry) and disk/mount/test_api (RetryDecorator in
Mount.map_dev).
[1] I60d11fc9a9e8569b1663c7319a5c25b921c5de1a
Change-Id: Ic7ad60ce89c93c0f03c040e244f8191c963c08f3
We should use nova.virt.libvirt.Guest instead of calling methods on
a virDomain object directly.
Change-Id: Ifa8fe1b19980cc9e986d26b284d2fb093466d30c
Signed-off-by: Chen Hanxiao <chenhx@certusnet.com.cn>
The recently updated minimum required libvirt version (1.3.1; in commit
403320b -- libvirt: Bump MIN_{LIBVIRT,QEMU}_VERSION for "Rocky") brings
in the newer libvirt migration API, migrateToURI3(). The newer API was
explicitly designed[*] to be backward compatible with the older variant.
So remove the usage of the older variants:
migrateToURI()
migrateToURI2()
And just stick to the newer API -- migrateToURI3().
Clean up the following:
- Add the 'migrate_disks' and 'destination_xml' parameters, and remove
the no longer needed 'domain_xml' from the Nova migrate() method.
- Remove or fix various unit tests to use migrateToURI3().
- Stub nova.virt.libvirt.guest.Guest.migrate() correctly in
nova/tests/unit/virt/test_virt_drivers.py.
[*] https://libvirt.org/git/?p=libvirt.git;a=commit;h=4bf62f4 --
Extensible migration APIs
Change-Id: Id9ee1feeadf612fa79c3d280cee3a614a74a00a7
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
When detaching a device from a domain we first attempt to remove the
device from both the persistent and live configs before looping to
ensure the device has really been detached from the running live config.
Previously when this failed we logged an error message that suggested
that this was due to issues detaching the device from a transient
domain, however this is not the case as the domain is persistent.
This change simply updates the error and associated comments to only
reference the live config of the domain.
Additionally, a DEBUG line claiming that a device has been successfully
detached is now only logged once the device is removed from the live
config, hopefully avoiding any confusion from this line being logged
each time an attempt is made to detach the device.
Change-Id: If869470216600c303d47cf79f12c4fc88abcf813
This commit enhances the guest object so that it can control the maximum
bandwidth used to perform a migration.
Related-Bug: #1414559
Change-Id: I35470773b8c467449ed71217fdb4b6c82f455e33
Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
etree.tostring() returns bytes by default in python3, and a string by
default in python2. For sanity, we make it explicitly emit unicode
everywhere. This requires a python2 fix in guest, as the libvirt
bindings for python2 don't accept a unicode string to migrateToURI3()'s
'param' argument, whereas the python3 bindings do.
Change-Id: I85cd9a903fba310b5ae7bedeed118ca4ea98dff6
The bandwidth param, currently set outside the guest object's "migrate"
method, has to be set inside it to avoid duplicating that option.
Change-Id: I8a37753dea8eca7b26466f17dfbdc184c48c24c5
Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
In a past attempt to fix a bug [1], we started raising DeviceNotFound
if a device wasn't found on the persistent domain. This was to address
a scenario where the guest ignored the detach from the live domain
because it was busy and we wanted to avoid failing a later detach
request to the user (compute handles DeviceNotFound).
Unfortunately, in the above case, a later detach request won't fail to
the user but it also won't detach from the live domain. It sees the
device already detached from the persistent domain and doesn't attempt
to detach from the live domain.
This is a serious problem because it's possible for a volume to be
attached to two live domains and data corruption can occur.
This adds an attempt to detach from the live domain even if we had
already detached from the persistent domain in the past.
Closes-Bug: #1707238
[1] https://review.openstack.org/386257
Change-Id: I8cd056fa17184a98c31547add0e9fb2d363d0908