Commit Graph

11913 Commits

Author SHA1 Message Date
Zuul 1bca24aeb0 Merge "Always delete NVRAM files when deleting instances" 2024-04-04 22:14:14 +00:00
Zuul 3e358bc37c Merge "vgpu: Allow device_addresses to not be set" 2024-03-18 16:58:28 +00:00
Zuul e255323f46 Merge "libvirt: Cap with max_instances GPU types" 2024-03-18 12:31:30 +00:00
Artom Lifshitz c1ccc1a316 pwr mgmt: handle live migrations correctly
Previously, live migrations completely ignored CPU power management.
This patch makes sure that we correctly:

* Power up the cores on the destination during pre_live_migration, as
  we need them powered up before the instance starts on the
  destination.
* If the live migration is successful, power down the vacated cores on
  the source.
* In case of a rollback, power down the cores previously powered up on
  pre_live_migration.

Closes-bug: 2056613
Change-Id: I787bd7807950370cd865f29b95989d489d4826d0
2024-03-11 14:21:27 -04:00
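The three cases listed in "pwr mgmt: handle live migrations correctly" can be illustrated with a minimal sketch. The `Host` class and the three function names below are hypothetical stand-ins chosen for illustration; they are not Nova's actual classes or driver methods, and real core power state lives in the host's sysfs rather than in a Python set.

```python
class Host:
    """Hypothetical stand-in for a compute host; tracks powered-on cores."""

    def __init__(self, name):
        self.name = name
        self.online = set()

    def power_up(self, cores):
        self.online |= set(cores)

    def power_down(self, cores):
        self.online -= set(cores)


def pre_live_migration(dest, dest_cores):
    # Cores must be up before the instance starts on the destination.
    dest.power_up(dest_cores)


def post_live_migration(source, source_cores):
    # Migration succeeded: the cores on the source are now vacated.
    source.power_down(source_cores)


def rollback_live_migration(dest, dest_cores):
    # Migration failed: undo what pre_live_migration did on the destination.
    dest.power_down(dest_cores)
```

In this sketch a successful migration calls `pre_live_migration` and then `post_live_migration`, while a failed one calls `pre_live_migration` and then `rollback_live_migration`, leaving the source cores powered on.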
Artom Lifshitz 29dc044a7a pwr mgmt: make API into a per-driver object
We want to test power management in our functional tests in multinode
scenarios (ex: live migration).

This was previously impossible because all the methods in
nova.virt.libvirt.cpu.api were at the module level, meaning both
source and destination libvirt drivers would call the same method to
online and offline cores. This made it impossible to maintain distinct
core power state between source and destination.

This patch inserts a nova.virt.libvirt.cpu.api.API class, and gives
the libvirt driver a cpu_api attribute with an instance of that
class. Along with the tiny API.core() helper, this allows new
functional tests in the subsequent patches to stub out the core
"model" code with distinct objects on the source and destination
libvirt drivers, and enables a whole bunch of testing (and fixes!)
around live migration.

Related-bug: 2056613
Change-Id: I052535249b9a3e144bb68b8c588b5995eb345b97
2024-03-08 20:31:42 -05:00
Artom Lifshitz 0986d2bbe8 Power on cores for isolated emulator threads
Previously, with the `isolate` emulator threads policy and libvirt cpu
power management enabled, we did not power on the cores to which the
emulator threads were pinned. Start doing that, and don't forget to power
them down when the instance is stopped.

Closes-bug: 2056612
Change-Id: I6e5383d8a0bf3f0ed8c870754cddae4e9163b4fd
2024-03-08 20:31:34 -05:00
Sylvain Bauza d445eaf9dd vgpu: Allow device_addresses to not be set
Sometimes a GPU may have a long list of PCI addresses (say, an SR-IOV
GPU), or operators may have a long list of GPUs. To make their lives
easier, let's allow device_addresses to be optional.

This means that a valid configuration could be:

    [devices]
    enabled_mdev_types = nvidia-35, nvidia-36

    [mdev_nvidia-35]

    [mdev_nvidia-36]

NOTE(sbauza): we have a slight coverage gap for testing what happens
if the groups aren't set, but I'll add it in a later patch

Related-Bug: #2041519
Change-Id: I73762a0295212ee003db2149d6a9cf701023464f
2024-03-05 11:48:25 +01:00
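The optional `device_addresses` setting from "vgpu: Allow device_addresses to not be set" can be illustrated with a small, hedged sketch. It uses Python's standard `configparser` purely for illustration (Nova itself reads its configuration through oslo.config), the `load_mdev_types` helper is hypothetical, and the PCI address in the sample is made up to show the contrast with an unset group.

```python
import configparser

# Sample based on the commit message; the device_addresses value is a
# made-up address added only to contrast with the unset case.
SAMPLE = """
[devices]
enabled_mdev_types = nvidia-35, nvidia-36

[mdev_nvidia-35]

[mdev_nvidia-36]
device_addresses = 0000:84:00.0
"""


def load_mdev_types(text):
    """Map each enabled mdev type to its device addresses, or None if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    enabled = parser["devices"]["enabled_mdev_types"].split(",")
    result = {}
    for mdev_type in (t.strip() for t in enabled):
        # fallback=None covers both an empty group and a missing group.
        raw = parser.get("mdev_" + mdev_type, "device_addresses", fallback=None)
        result[mdev_type] = [a.strip() for a in raw.split(",")] if raw else None
    return result
```

Here `load_mdev_types(SAMPLE)` maps `nvidia-35` to `None`, meaning no address restriction was configured for that type, and `nvidia-36` to a one-element list.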
Sylvain Bauza 60851e4464 libvirt: Cap with max_instances GPU types
We want to cap the maximum number of mdevs we can create.
If some type has enough capacity, then other GPUs won't be used and
existing ResourceProviders would be deleted.

Closes-Bug: #2041519
Change-Id: I069879a333152bb849c248b3dcb56357a11d0324
2024-03-05 11:48:19 +01:00
Zuul dac8bd2493 Merge "libvirt: make <encryption> a sub element of <source>" 2024-03-01 20:05:16 +00:00
Zuul 1c903ccc8d Merge "Fix nova-metadata-api for ovn dhcp native networks" 2024-03-01 12:34:52 +00:00
Zuul 815fcbfa6b Merge "Add encryption support to convert_image" 2024-03-01 11:22:23 +00:00
Zuul 7275e6088e Merge "imagebackend: Add support to libvirt_info for LUKS based encryption" 2024-03-01 11:22:11 +00:00
Zuul d29a9b64ee Merge "Make compute node rebalance safer" 2024-02-29 18:48:26 +00:00
Zuul 163f682362 Merge "Limit nodes by ironic shard key" 2024-02-29 18:46:22 +00:00
Steven Blatzheim 135af5230e Fix nova-metadata-api for ovn dhcp native networks
With the change from ml2/ovs DHCP agents to the OVN implementation
in neutron, there is no longer a port with device_owner network:dhcp.
Instead, DHCP is provided by a network:distributed port.

Closes-Bug: 2055245
Change-Id: Ibb569b9db1475b8bbd8f8722d49228182cd47f85
2024-02-29 13:12:41 +01:00
Zuul 149585bca1 Merge "libvirt: Configure and teardown ephemeral encryption secrets" 2024-02-29 11:56:10 +00:00
Zuul 060445aa2f Merge "Modify the mdevs in the migrate XML" 2024-02-29 06:58:40 +00:00
Sylvain Bauza 8abc7b47fd Modify the mdevs in the migrate XML
Now that the destination returns the list of mdevs needed for the
migration, we can change the XML.

Note: this is the last patch of the feature branch.
I'll work on adding mtty support in the next patches in the series
but that's not a feature usage.

Change-Id: Ib448444be09df50c3db5ccda8a49bfd882c18edf
Implements: blueprint libvirt-mdev-live-migrate
2024-02-28 15:53:49 +01:00
melanie witt e91aaaf551 libvirt: make <encryption> a sub element of <source>
For encryption of local ephemeral disks, the <encryption> XML should be
a sub element of the <source> XML element [1][2] in order for more
involved operations like live migration to work properly.

This adds generation of ephemeral <encryption> XML as a sub element of
the <source> XML.

This also renames the internal LibvirtConfigGuestDisk attribute for
volume encryption from "encryption" to "volume_encryption" in an effort
to clearly differentiate between volume encryption and ephemeral disk
encryption.

[1] https://libvirt.org/formatdomain.html#hard-drives-floppy-disks-cdroms
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1371022#c13

Related to blueprint ephemeral-encryption-libvirt

Change-Id: Ie4e5f2b27f7ef05f5c45b9adc1df2966e7f05e62
2024-02-28 08:46:20 +00:00
melanie witt 9f7a6732f9 Add encryption support to convert_image
This change enables ephemeral encryption support to convert:

  * encrypted source image to unencrypted destination image
  * unencrypted source image to encrypted destination image
  * encrypted source image to encrypted destination image

This also makes necessary changes for mypy checks to pass.

Related to blueprint ephemeral-storage-encryption

Change-Id: I9edc87006b1f7de69bc52f916f45c2cbb66abe23
2024-02-28 07:56:42 +00:00
Lee Yarwood 3391ac2656 imagebackend: Add support to libvirt_info for LUKS based encryption
Related to blueprint ephemeral-encryption-libvirt

Change-Id: I909c86ab722179efcb673b66f1f81121ab8b5f66
2024-02-28 07:56:42 +00:00
Lee Yarwood 177c184e40 libvirt: Configure and teardown ephemeral encryption secrets
This adds configuration of the default ephemeral encryption format and
sets default encryption attributes in the driver block device mapping
when needed. This includes generation of a secret passphrase when one
has not been provided.

Co-Authored-By: melanie witt <melwittt@gmail.com>

Related to blueprint ephemeral-encryption-libvirt

Change-Id: I052441076c677c0fe76a8d9421af70b0ffa1d400
2024-02-28 07:56:42 +00:00
Zuul 7fa1859576 Merge "libvirt: Support maxphysaddr." 2024-02-28 06:18:08 +00:00
Zuul 4c3640599a Merge "block_device: Add encryption attributes to swap disks" 2024-02-28 00:41:44 +00:00
Nobuhiro MIKI 1038a63387 libvirt: Support maxphysaddr.
With Libvirt v8.7.0+, the <maxphysaddr> sub-element
of the <cpu> element specifies the number of vCPU
physical address bits [1].

[1] https://libvirt.org/news.html#v8-7-0-2022-09-01

New flavor extra_specs and image properties are added to
control the physical address bits of vCPUs in Libvirt guests.
The nova-scheduler requests COMPUTE_ADDRESS_SPACE_* traits
based on them. The traits are already defined in os-traits
v2.10.0. Also numerical comparisons are performed at
both compute capabilities filter and image props filter.

blueprint: libvirt-maxphysaddr-support-caracal
Change-Id: I98968f6ef1621c9fb4f682c119038e26d62ce381
Signed-off-by: Nobuhiro MIKI <nmiki@yahoo-corp.jp>
2024-02-27 14:16:25 +09:00
Zuul 326a41b3b3 Merge "Catch ImageNotFound on snapshot failure" 2024-02-26 20:59:58 +00:00
John Garbutt 947bb5f641 Make compute node rebalance safer
Many bugs around nova-compute rebalancing are focused around
problems when the compute node and placement resources are
deleted, and sometimes they never get re-created.

To limit this class of bugs, we add a check to ensure a compute
node is only ever deleted when it is known to have been deleted
in Ironic.

There is a risk this might leave orphaned compute nodes and
resource providers that need manual clean up because users
do not want to delete the node in Ironic, but are removing it
from nova management. But on balance, it seems safer to leave
these cases up to the operator to resolve manually, and collect
feedback on how to better help those users.

blueprint ironic-shards

Change-Id: I2bc77cbb77c2dd5584368563dc4250d71913906b
2024-02-25 13:25:27 -08:00
John Garbutt f1a4857d61 Limit nodes by ironic shard key
Ironic in API 1.82 added the option for nodes to be associated with
a specific shard key. This can be used to partition up the nodes within
a single ironic conductor group into smaller sets of nodes that can
each be managed by their own nova-compute ironic service.

We add a new [ironic]/shard config option to allow operators to say
which shard each nova-compute process should target.
As such, when the shard is set we ignore the peer_list setting
and always have a hash ring of one.

Also corrects an issue where [ironic]/conductor_group was considered
a mutable configuration; it is not mutable, nor is shard. In any
situation where an operator changes the scope of nodes managed by a
nova compute process, a restart is required.

blueprint ironic-shards
Co-Authored-By: Jay Faulkner <jay@jvf.cc>

Change-Id: Ie0c71f7bc5a62d607ffd3134837299fee952a947
2024-02-25 13:25:27 -08:00
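The peer_list behaviour described in "Limit nodes by ironic shard key" can be sketched roughly as follows. The function name and its arguments are hypothetical and only model the rule stated in the commit message (a configured shard means a hash ring of one); this is not the ironic driver's actual code.

```python
def hash_ring_members(own_host, peer_list, shard=None):
    """Return the hosts that take part in this service's hash ring.

    Hypothetical helper: with a shard configured, peer_list is ignored
    and the ring contains only this host.
    """
    if shard:
        # The shard already determines which nodes this service manages.
        return {own_host}
    # Without a shard, nodes are spread across this host and its peers.
    return set(peer_list) | {own_host}
```

For example, `hash_ring_members("compute-1", ["compute-2"], shard="rack-a")` yields a ring containing only `compute-1`, whereas omitting `shard` yields both hosts. The host and shard names are illustrative.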
Zuul 9a9ab2128b Merge "Reserve mdevs to return to the source" 2024-02-23 15:46:08 +00:00
Zuul bdd7daffbb Merge "Check if destination can support the src mdev types" 2024-02-23 15:46:01 +00:00
Zuul f15b8b7204 Merge "check both source and dest compute libvirt versions for mdev lv" 2024-02-23 15:40:00 +00:00
Dan Smith 40a56ce05b Catch ImageNotFound on snapshot failure
Right now we're logging this as an unexpected failure, complete with
full traceback. However, this is expected if an image is deleted
mid-snapshot, as is the case in one of our tempest tests.

Change-Id: I6eb76c0500c7940778b7a15ac5202659ef92a82a
2024-02-22 11:21:49 -08:00
Zuul 3209f65516 Merge "HyperV: Remove RDP console API" 2024-02-20 07:01:10 +00:00
Sylvain Bauza 2e1e12cd62 Reserve mdevs to return to the source
The destination looks up the src mdev types and returns its own
mdevs of the same types. We also reserve them in an internal dict,
and we make sure we can clean up this dict if the live migration aborts.

Partially-Implements: blueprint libvirt-mdev-live-migrate
Change-Id: I4a7e5292dd3df63943bd9f01803fa933e0466014
2024-02-16 16:05:48 +01:00
Zuul 84a3e254e8 Merge "libvirt: stop enabling hyperv feature reenlightenment" 2024-02-15 13:21:27 +00:00
Zuul f315c5658e Merge "libvirt: Stop unconditionally enabling evmcs" 2024-02-15 10:22:41 +00:00
melanie witt eb4e0faf2d block_device: Add encryption attributes to swap disks
This enables use of ephemeral encryption for swap disks.

The 'encrypted' attribute is intended to be read-only, so we will raise
an error if something attempts to save it later.

DriverImageBlockDevice and DriverEphemeralBlockDevice are already not
saving updates to their 'encrypted' attributes, so the same error
checking is added to them as well for consistency.

Related to blueprint ephemeral-encryption-libvirt

Change-Id: Id30577699928180eff6a5b78390ce9e3efa28b16
2024-02-14 08:03:34 +00:00
Ghanshyam Mann 0c1e1ccf03 HyperV: Remove RDP console API
The RDP console was only for the HyperV driver, so remove the
API. As the API URL stays the same (because it is also used for the
other console type APIs), the RDP console API will return 400.

Clean up the related config options and move the
API ref to the obsolete section.

Keep the RPC method to avoid errors when an old controller is used
with a new compute. It can be removed in the next RPC version bump.

Change-Id: I8f5755009da4af0d12bda096d7a8e85fd41e1a8c
2024-02-13 12:24:38 -08:00
Zuul 35af4b345d Merge "Remove the Hyper-V driver" 2024-02-08 16:14:23 +00:00
Ghanshyam Mann b068b04372 Remove the Hyper-V driver
The Nova Hyper-V driver is not tested in OpenStack upstream and has no maintainers.
This driver was marked as deprecated in the Antelope release. It depends
on the OpenStack Winstacker project, which has been retired[1].

As discussed at the vPTG[2], remove the HyperV driver, its tests, and its config.

[1] https://review.opendev.org/c/openstack/governance/+/886880
[2] https://etherpad.opendev.org/p/nova-caracal-ptg#L301

Change-Id: I568c79bae9b9736a20c367096d748c730ed59f0e
2024-02-05 12:06:58 -08:00
Sylvain Bauza 489aab934c Check if destination can support the src mdev types
Now that the source knows that both computes support the right
libvirt version, it passes the list of mdevs it has for the instance
to the destination. With this change, we verify that the types of those
mdevs are actually supported by the destination.
In the next change, we'll pass the destination mdevs back to the
source.

Partially-Implements: blueprint libvirt-mdev-live-migrate
Change-Id: Icb52fa5eb0adc0aa6106a90d87149456b39e79c2
2024-02-05 15:10:55 +01:00
Zuul 681f6872fb Merge "testing: Use inspect.isfunction() to check signatures" 2024-02-02 06:51:34 +00:00
Zuul da918d4b95 Merge "Revert "[pwmgmt]ignore missin governor when cpu_state used"" 2024-01-31 16:37:02 +00:00
Sylvain Bauza baa78326dd check both source and dest compute libvirt versions for mdev lv
Since only qemu 8.1 and libvirt 8.6.0 support mdev live-migration,
we need to verify the hypervisor versions of both the source
and the destination.

If either of them is older, the conductor raises an exception that will
eventually cause the API to return an HTTP 500.

Change-Id: I17f170143c58401b8b0a5a93e83355b1f7178ab5
Partially-Implements: blueprint libvirt-mdev-live-migrate
2024-01-30 20:45:13 +01:00
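The two-sided version check from "check both source and dest compute libvirt versions for mdev lv" can be sketched as below. The minimum versions come from the commit message; the exception class, the function names, and the dict layout for a host are assumptions made for illustration and do not reflect Nova's conductor code.

```python
MIN_LIBVIRT = (8, 6, 0)  # from the commit message
MIN_QEMU = (8, 1, 0)     # "qemu 8.1" in the commit message


class MdevLiveMigrationUnsupported(Exception):
    """Hypothetical error raised when a host is too old."""


def parse_version(text):
    """Turn '8.6' or '8.6.0' into a three-part tuple such as (8, 6, 0)."""
    parts = [int(p) for p in text.split(".")]
    parts += [0] * (3 - len(parts))  # pad short versions with zeros
    return tuple(parts[:3])


def check_mdev_live_migration(source, dest):
    """Raise unless both hosts meet the minimum libvirt and qemu versions."""
    for role, host in (("source", source), ("destination", dest)):
        too_old = (parse_version(host["libvirt"]) < MIN_LIBVIRT
                   or parse_version(host["qemu"]) < MIN_QEMU)
        if too_old:
            raise MdevLiveMigrationUnsupported(
                f"{role} hypervisor is too old for mdev live migration")
```

Checking both hosts in one place means a mismatch is detected before any migration work starts, whichever side is outdated.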
Simon Hensel 406d590a36 Always delete NVRAM files when deleting instances
When deleting an instance, always send VIR_DOMAIN_UNDEFINE_NVRAM to
delete the NVRAM file, regardless of whether the image is of type UEFI.
This prevents a bug when rebuilding an instance from a UEFI image to a
non-UEFI image.

Closes-Bug: #1997352

Change-Id: I24648f5b7895bf5d093f222b6c6e364becbb531f
Signed-off-by: Simon Hensel <simon.hensel@inovex.de>
2024-01-23 16:16:17 +01:00
Zuul 087c372a8e Merge "[ironic] Partition & use cache for list_instance*" 2024-01-19 14:48:41 +00:00
Zuul 650301b09e Merge "Allow config to support virtiofs (driver)" 2024-01-18 21:51:49 +00:00
Balazs Gibizer 03ef4d6f53 Revert "[pwmgmt]ignore missin governor when cpu_state used"
This reverts commit ed2ac71a46.

This is unnecessary as 2c4421568e already
fixed this bug on master

Change-Id: Id2c253a6e223bd5ba22512d9e5a40a9d12680da2
2024-01-16 11:45:08 +01:00
songjie e618e78edc libvirt: stop enabling hyperv feature reenlightenment
The 'reenlightenment' hyperv enlightenment causes
instance live migration to fail (KVM currently doesn’t
fully support reenlightenment notifications, see
www.qemu.org/docs/master/system/i386/hyperv.html),
so stop enabling it for now.

Change-Id: I6821819450bc96e4304125ea3b76a0e462e6e33f
Closes-Bug: #2046549
Related-Bug: #2009280
2023-12-29 09:29:01 +08:00
Zuul a43728d33b Merge "Resolve mypy error" 2023-12-20 15:03:38 +00:00