Right now we'll fail to calculate the boot order of a set of BDMs if
one of them is a device_type=lun. This fixes that and teaches us
that it's just a "hd" from qemu's perspective.
Closes-Bug: #2065084
Change-Id: Ic1340918738d503fc797c9373fe2e1dd16b27a09
Libvirt now enforces that device="lun" (i.e. raw device passthrough)
disks must not have the <serial> property set. We recently enabled
the ability to manage devices by alias instead of serial, but to
fully enable this use-case we need to avoid putting serial in the
XML to appease libvirt.
Related-Bug: #2065084
Change-Id: Ifa2df89f27e58e1e64ce046edeaf6e49a7c89490
Live migrating to a host with cpu_shared_set configured will now
update the VM's configuration accordingly.
Example: live migrating a VM from a source host with cpu_shared_set=0,1
to a destination host with cpu_shared_set=2,3 will now update the
VM configuration accordingly.
(<vcpu cpuset="0-1"> will be updated to <vcpu cpuset="2-3">).
Related-Bug: #1869804
Change-Id: I7c717503eba58088094fac05cb99b276af9a3460
When working on a fix for bug #1811870, it was noted that the check to
ensure pinned instances do not overcommit was not pagesize aware. This
means if an instance without hugepages boots on a host with a large
number of hugepages allocated, it may not get all of the memory
allocated to it. Put in concrete terms, consider a host with 1 NUMA
cell, 2 CPUs, 1G of 4k pages, and a single 1G page. If you boot a first
instance with 1 CPU, CPU pinning, 1G of RAM, and no specific page size,
the instance should boot successfully. An attempt to boot a second
instance with the same configuration should fail because there is only
the single 1G page available, however, this is not currently the case.
The reason this happens is because we currently have two tests: a first
that checks total (not free!) host pages and a second that checks free
memory but with no consideration for page size. The first check passes
because we have 1G worth of 4K pages configured and the second check
passes because we have the single 1G page.
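Put in code, a pagesize-aware free-page check along these lines closes
the gap (a minimal sketch with illustrative names, not Nova's actual
API):

```python
# Hypothetical sketch: a pagesize-aware check must verify free pages of
# the *requested* page size, not total host memory across all sizes.

def can_fit_instance(requested_mb, pagesize_kb, free_pages_by_size):
    """Return True if enough *free* pages of pagesize_kb remain.

    free_pages_by_size maps page size (KiB) -> number of free pages.
    """
    needed_pages = (requested_mb * 1024) // pagesize_kb
    return free_pages_by_size.get(pagesize_kb, 0) >= needed_pages

# Host from the example: 1G worth of free 4K pages and one free 1G page.
free = {4: 262144, 1048576: 1}

# The first 1G instance on 4K pages fits; booting it drains the 4K pool.
assert can_fit_instance(1024, 4, free)
free[4] -= (1024 * 1024) // 4

# A second identical request must now fail the 4K check, even though the
# single 1G page is still free.
assert not can_fit_instance(1024, 4, free)
```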
Close this gap.
Change-Id: I74861a67827dda1ab2b8451967f5cf0ae93a4ad3
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
Closes-Bug: #1811886
Previously, live migrations completely ignored CPU power management.
This patch makes sure that we correctly:
* Power up the cores on the destination during pre_live_migration, as
we need them powered up before the instance starts on the
destination.
* If the live migration is successful, power down the vacated cores on
the source.
* In case of a rollback, power down the cores previously powered up on
pre_live_migration.
Closes-bug: 2056613
Change-Id: I787bd7807950370cd865f29b95989d489d4826d0
We want to test power management in our functional tests in multinode
scenarios (ex: live migration).
This was previously impossible because all the methods in
nova.virt.libvirt.cpu.api were at the module level, meaning both
source and destination libvirt drivers would call the same method to
online and offline cores. This made it impossible to maintain distinct
core power state between source and destination.
This patch inserts a nova.virt.libvirt.cpu.api.API class, and gives
the libvirt driver a cpu_api attribute with an instance of that
class. Along with the tiny API.core() helper, this allows new
functional tests in the subsequent patches to stub out the core
"model" code with distinct objects on the source and destination
libvirt drivers, and enables a whole bunch of testing (and fixes!)
around live migration.
Related-bug: 2056613
Change-Id: I052535249b9a3e144bb68b8c588b5995eb345b97
Previously, with the `isolate` emulator threads policy and libvirt cpu
power management enabled, we did not power on the cores to which the
emulator threads were pinned. Start doing that, and don't forget to
power them down when the instance is stopped.
Closes-bug: 2056612
Change-Id: I6e5383d8a0bf3f0ed8c870754cddae4e9163b4fd
Sometimes, a GPU may have a long list of PCI addresses (say, an SR-IOV
GPU) or operators may have a long list of GPUs. In order to make their
lives easier, let's allow device_addresses to be optional.
This means that a valid configuration could be:
[devices]
enabled_mdev_types = nvidia-35, nvidia-36
[mdev_nvidia-35]
[mdev_nvidia-36]
NOTE(sbauza): we have a slight coverage gap for testing what happens
if the groups aren't set, but I'll add it in a subsequent patch.
Related-Bug: #2041519
Change-Id: I73762a0295212ee003db2149d6a9cf701023464f
We want to cap the maximum number of mdevs we can create.
If some type has enough capacity, then other GPUs won't be used and
their existing ResourceProviders will be deleted.
Closes-Bug: #2041519
Change-Id: I069879a333152bb849c248b3dcb56357a11d0324
With the change from ml2/ovs DHCP agents towards OVN implementation
in neutron there is no port with device_owner network:dhcp anymore.
Instead, DHCP is provided by a network:distributed port.
Closes-Bug: 2055245
Change-Id: Ibb569b9db1475b8bbd8f8722d49228182cd47f85
Now that the destination returns the list of mdevs needed for the
migration, we can change the XML accordingly.
Note: this is the last patch of the feature branch.
I'll work on adding mtty support in the next patches in the series
but that's not a feature usage.
Change-Id: Ib448444be09df50c3db5ccda8a49bfd882c18edf
Implements: blueprint libvirt-mdev-live-migrate
For encryption of local ephemeral disks, the <encryption> XML should be
a sub element of the <source> XML element [1][2] in order for more
involved operations like live migration to work properly.
This adds generation of ephemeral <encryption> XML as a sub element of
the <source> XML.
This also renames the internal LibvirtConfigGuestDisk attribute for
volume encryption from "encryption" to "volume_encryption" in an effort
to clearly differentiate between volume encryption and ephemeral disk
encryption.
[1] https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1371022#c13
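For illustration, the resulting disk XML looks roughly like this (the
file path and secret UUID are placeholders):

```xml
<!-- Illustrative fragment: <encryption> nested under <source>,
     rather than directly under <disk>. -->
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2'/>
  <source file='/var/lib/nova/instances/UUID/disk.eph0'>
    <encryption format='luks'>
      <secret type='passphrase' uuid='00000000-0000-0000-0000-000000000000'/>
    </encryption>
  </source>
  <target dev='vdb' bus='virtio'/>
</disk>
```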
Related to blueprint ephemeral-encryption-libvirt
Change-Id: Ie4e5f2b27f7ef05f5c45b9adc1df2966e7f05e62
This change enables ephemeral encryption support to convert:
* encrypted source image to unencrypted destination image
* unencrypted source image to encrypted destination image
* encrypted source image to encrypted destination image
This also makes necessary changes for mypy checks to pass.
Related to blueprint ephemeral-storage-encryption
Change-Id: I9edc87006b1f7de69bc52f916f45c2cbb66abe23
This adds configuration of the default ephemeral encryption format and
sets default encryption attributes in the driver block device mapping
when needed. This includes generation of a secret passphrase when one
has not been provided.
Co-Authored-By: melanie witt <melwittt@gmail.com>
Related to blueprint ephemeral-encryption-libvirt
Change-Id: I052441076c677c0fe76a8d9421af70b0ffa1d400
With Libvirt v8.7.0+, the <maxphysaddr> sub-element
of the <cpu> element specifies the number of vCPU
physical address bits [1].
[1] https://libvirt.org/news.html#v8-7-0-2022-09-01
New flavor extra_specs and image properties are added to
control the physical address bits of vCPUs in Libvirt guests.
The nova-scheduler requests COMPUTE_ADDRESS_SPACE_* traits
based on them. The traits are already defined in os-traits
v2.10.0. Numerical comparisons are also performed in both the
compute capabilities filter and the image properties filter.
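As an illustration, the resulting guest XML carries a <maxphysaddr>
sub-element along these lines (the mode and bit count here are example
values, driven by the flavor/image settings):

```xml
<cpu mode='host-passthrough'>
  <!-- Cap the guest's physical address space to 42 bits. -->
  <maxphysaddr mode='emulate' bits='42'/>
</cpu>
```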
blueprint: libvirt-maxphysaddr-support-caracal
Change-Id: I98968f6ef1621c9fb4f682c119038e26d62ce381
Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
Many bugs around nova-compute rebalancing are focused around
problems when the compute node and placement resources are
deleted, and sometimes they never get re-created.
To limit this class of bugs, we add a check to ensure a compute
node is only ever deleted when it is known to have been deleted
in Ironic.
There is a risk this might leave orphaned compute nodes and
resource providers that need manual clean up because users
do not want to delete the node in Ironic, but are removing it
from nova management. But on balance, it seems safer to leave
these cases up to the operator to resolve manually, and collect
feedback on how to better help those users.
blueprint ironic-shards
Change-Id: I2bc77cbb77c2dd5584368563dc4250d71913906b
Ironic in API 1.82 added the option for nodes to be associated with
a specific shard key. This can be used to partition up the nodes within
a single ironic conductor group into smaller sets of nodes that can
each be managed by their own nova-compute ironic service.
We add a new [ironic]shard config option to allow operators to say
which shard each nova-compute process should target.
As such, when the shard is set we ignore the peer_list setting
and always have a hash ring of one.
This also corrects an issue where [ironic]conductor_group was
considered a mutable configuration; it is not mutable, and neither is
the new shard option. In any situation where an operator changes the
scope of nodes managed by a nova-compute process, a restart is
required.
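A minimal nova.conf fragment for one such nova-compute process might
look like this (the shard name is a placeholder):

```ini
[ironic]
# Only manage Ironic nodes assigned to this shard key.
shard = shard-a
```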
blueprint ironic-shards
Co-Authored-By: Jay Faulkner <jay@jvf.cc>
Change-Id: Ie0c71f7bc5a62d607ffd3134837299fee952a947
Right now we're logging this as an unexpected failure, complete with
full traceback. However, this is expected if an image is deleted
mid-snapshot, as is the case in one of our tempest tests.
Change-Id: I6eb76c0500c7940778b7a15ac5202659ef92a82a
The destination looks up the source mdev types and returns its own
mdevs of the same types. We also reserve them in an internal dict and
make sure we clean up this dict if the live migration aborts.
Partially-Implements: blueprint libvirt-mdev-live-migrate
Change-Id: I4a7e5292dd3df63943bd9f01803fa933e0466014
This enables use of ephemeral encryption for swap disks.
The 'encrypted' attribute is intended to be read-only, so we will raise
an error if something attempts to save it later.
DriverImageBlockDevice and DriverEphemeralBlockDevice are already not
saving updates to their 'encrypted' attributes, so the same error
checking is added to them as well for consistency.
Related to blueprint ephemeral-encryption-libvirt
Change-Id: Id30577699928180eff6a5b78390ce9e3efa28b16
The RDP console was only used by the HyperV driver, so the API is
being removed. As the API URL stays the same (because it is shared
with other console type APIs), the RDP console API will now return 400.
This also cleans up the related config options and moves the API
reference to the obsolete section.
The RPC method is kept to avoid errors when an old controller is used
with a new compute. It can be removed in the next RPC version bump.
Change-Id: I8f5755009da4af0d12bda096d7a8e85fd41e1a8c
Libvirt requires both swtpm and swtpm_setup to launch instances with
a vTPM emulated by swtpm. However, the driver currently checks whether
"any" of these two binaries exists.
This fixes the logic to ensure "both" binaries exist, meeting
libvirt's requirement correctly.
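The fix amounts to switching any() to all(); a minimal sketch, with
shutil.which standing in for Nova's actual binary lookup:

```python
import shutil

def vtpm_supported(which=shutil.which):
    """libvirt needs both swtpm and swtpm_setup, so use all(), not any()."""
    return all(which(b) for b in ("swtpm", "swtpm_setup"))

# With only one binary present, any() would wrongly report support:
fake_which = {"swtpm": "/usr/bin/swtpm"}.get
assert any(fake_which(b) for b in ("swtpm", "swtpm_setup"))  # old, buggy
assert not vtpm_supported(which=fake_which)                  # fixed
```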
Closes-Bug: #2052760
Change-Id: I44453e69c88115868cda192c9ca17b92ba7b6556