Libvirt 8.0.0 implemented the capability to expose the maximum number
of SEV guests and SEV-ES guests[1][2]. This allows nova to detect the
maximum number of memory-encrypted guests using that feature.
The detection is not used if the [libvirt] num_memory_encrypted_guests
option is set, which preserves the current behavior.
Note that nova currently supports only SEV, not SEV-ES, so this
implementation uses only the maximum number of SEV guests. The maximum
number of SEV-ES guests will be used once support for SEV-ES is
implemented.
[1] 34cb8f6fcd
[2] 7826148a72
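A minimal config sketch of the resulting behavior (the value shown is illustrative):

```ini
[libvirt]
# Leaving this unset (the new default) lets nova detect the limit from
# libvirt >= 8.0.0. Setting it, as below, pins an operator-supplied cap
# and disables the detection.
num_memory_encrypted_guests = 15
```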
Implements: blueprint libvirt-detect-sev-max-guests
Change-Id: I502e1713add7e6a1eb11ecce0cc2b5eb6a14527a
For live migration the libvirt driver already supports generating the
migration URL based on the compute host's hostname if so configured.
However, for non-live move operations the driver always used the IP
address of the compute host based on [DEFAULT]my_ip.
Some deployments rely on DNS to abstract IP address management. In
these environments it is beneficial if nova allows connections between
compute hosts based on the hostname (or FQDN) of the host instead of
requiring [DEFAULT]my_ip to be configured to an IP address.
This patch introduces a new config option,
[libvirt]migration_inbound_addr, that is used to determine the address
for incoming move operations (cold migrate, resize, evacuate). This
option defaults to [DEFAULT]my_ip to keep the configuration backward
compatible. However, it allows an explicit hostname or FQDN to be
specified, or '%s', which is then resolved to the hostname of the
compute host.
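A sketch of the new option (hostnames are illustrative):

```ini
[libvirt]
# '%s' resolves to this compute host's hostname; an explicit FQDN
# such as compute-1.example.com also works. Defaults to [DEFAULT]my_ip.
migration_inbound_addr = %s
```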
blueprint: libvirt-migrate-with-hostname-instead-of-ip
Change-Id: I6a80b5620f32770a04c751143c4ad07882e9f812
QEMU >= 5.0.0 bumped the default tb-cache size to 1GiB (from 32MiB)[1],
which made it difficult to run multiple guest VMs on systems with
less memory. With libvirt >= 8.0.0 it is possible to configure a lower
tb-cache size[2].
The config option below is introduced to allow configuring the TB cache
size per the environment's needs; it only applies to 'virt_type=qemu':
[libvirt]tb_cache_size
Also enable this option in the nova-next job.
[1] https://github.com/qemu/qemu/commit/600e17b26
[2] https://gitlab.com/libvirt/libvirt/-/commit/58bf03f85
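A hedged example of the new option (the value is illustrative, not a recommendation):

```ini
[libvirt]
virt_type = qemu
# TB cache size per guest, in MiB; only honoured with virt_type=qemu
# and libvirt >= 8.0.0.
tb_cache_size = 128
```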
Closes-Bug: #1949606
Implements: blueprint libvirt-tb-cache-size
Change-Id: I49d2276ff3d3cc5d560a1bd96f13408e798b256a
The following options have minimum values defined, and values below the
minimum are rejected (not rounded up) by oslo.config.
This change updates the descriptions to explain the actual behavior.
Closes-Bug: #2007532
Change-Id: I8d1533ae4b44d4e8f811dce554196f270e25da3e
With this patch, we now automatically power down or up cores
when an instance is stopped or started.
Also, by default, we now powersave or offline dedicated cores when
starting the compute service.
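A sketch of the expected configuration, assuming the option names introduced by this blueprint (treat them as illustrative):

```ini
[libvirt]
# Enable power management of dedicated cores
cpu_power_management = true
# 'cpu_state' offlines idle cores; 'governor' switches them to powersave
cpu_power_management_strategy = cpu_state
```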
Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: Id645fd1ba909683af903f3b8f11c7f06db3401cb
Before going further, we need to be able to return the list of CPUs,
including offline ones, if they are power-managed by Nova.
Co-Authored-By: Sean Mooney <smooney@redhat.com>
Partially-Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: I5dca10acde0eff554ed139587aefaf2f5fad2ca5
This is the first stage of the power management series.
In order to be able to switch the CPU state or change the
governor, we need a framework to access sysfs.
As some bits can be reused, let's create a nova.filesystem helper module
that will define read-write mechanisms for accessing sysfs-specific commands.
Partially-Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: Icb913ed9be8d508de35e755a9c650ba25e45aca2
The options list in the 'Related Options:' section isn't rendered
as a bulleted list for some params because of a missing blank line.
This change adds the missing blank line wherever needed in [1].
[1] https://docs.openstack.org/nova/latest/configuration/config.html
Change-Id: I7077aea2abcf3cab67592879ebd1fde066bfcac5
Before, the description of live_migration_downtime didn't explain
whether any exception or timeout occurs if the migration exceeds the
value. The value is just used as a reference by nova; if any problem
happens while the VM is paused, there will be no abort or
force-complete.
Closes-Bug: #1960345
Signed-off-by: Pedro Almeida <pedro.monteiroazevedodemouraalmeida@windriver.com>
Change-Id: I336481d1801a367b5628fedcd2aa5f5cf763355a
Use the new oslo.config type HostDomainOpt to support underscores in
the name. See the bugzilla[1] for more information.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1868940
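For illustration, a hostname containing an underscore that HostDomainOpt now accepts (the host value is made up):

```ini
[libvirt]
live_migration_inbound_addr = compute_01.example.com
```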
Related-Bug: #1892044
Change-Id: Ib6c8fc1e3d90b79b10066c429670fcb957bddd23
Nova so far applied a retry loop that tried to periodically detach the
device from libvirt while the device was visible in the domain xml. This
could lead to an issue where an already progressing detach on the
libvirt side is interrupted by nova re-sending the detach request for
the same device. See bug #1882521 for more information.
Also, if there were both a persistent and a live domain, nova tried the
detach from both in the same call. This led to confusion about the
result when such a call failed: did the detach fail partially?
We can do better, at least for the live detach case. According to the
libvirt developers, detaching from the persistent domain always
succeeds and is a synchronous process. Detaching from the live
domain can be either synchronous or asynchronous depending on the guest
OS and the load on the hypervisor. But for live detach libvirt always
sends an event [1] that nova can wait for.
So this patch does two things:
1) Separates the detach from the persistent domain from the detach from
the live domain to make the error cases clearer.
2) Changes the retry mechanism.
Detaching from the persistent domain is not retried. If libvirt
reports the device not found while both a persistent and a live detach
are needed, the error is ignored and the process continues with
the live detach. In any other case the error is considered fatal.
Detaching from the live domain is changed to always wait for the
libvirt event. In case of timeout, the live detach is retried.
But a failure event from libvirt is considered fatal, based on the
information from the libvirt developers, so in this case the
detach is not retried.
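The new live-detach flow can be sketched as follows (the guest object and event helper are stand-ins, not the actual nova internals):

```python
def live_detach(guest, device, max_attempts, wait_for_event):
    """Retry a live detach only on timeout; a 'failed' event from
    libvirt is fatal and must not be retried."""
    for _ in range(max_attempts):
        guest.detach_device(device, live=True)
        # wait_for_event blocks until the libvirt device-removed event,
        # a failure event, or a timeout; it returns one of
        # 'removed', 'failed' or 'timeout'.
        outcome = wait_for_event(device)
        if outcome == 'removed':
            return True
        if outcome == 'failed':
            # the guest OS rejected the detach; retrying won't help
            raise RuntimeError('libvirt reported the live detach failed')
        # 'timeout': resend the detach request and wait again
    return False
```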
Related-Bug: #1882521
[1]https://libvirt.org/html/libvirt-libvirt-domain.html#virConnectDomainEventDeviceRemovedCallback
Change-Id: I7f2b6330decb92e2838aa7cee47fb228f00f47da
We are well above the required MIN_LIBVIRT_VERSION and MIN_QEMU_VERSION
(4.4.0 and 2.11.0, respectively) to get QEMU-native TLS[1] support by
default.
So we can now deprecate (and later remove) the support for "tunnelled
live migration", which has two inherent limitations: (a) it cannot
handle live migration of disks in a non-shared storage setup (a.k.a.
"block migration"); and (b) it has a huge performance overhead and
latency, because it burns more CPU and memory bandwidth due to an
increased number of data copies on both source and destination hosts.
Both the above limitations are addressed by the QEMU-native TLS support
`live_migration_with_native_tls`, which is the recommended approach for
securing all live migration streams (guest RAM, device state, and
disks).
[1] https://docs.openstack.org/nova/latest/admin/secure-live-migration-with-qemu-native-tls.html
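A hedged example of the recommended replacement configuration (TLS certificates for QEMU/libvirt must already be deployed on both hosts):

```ini
[libvirt]
live_migration_with_native_tls = true
# The deprecated tunnelled transport should be left disabled
live_migration_tunnelled = false
```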
Change-Id: I34fd5a4788a2ad4380d9a57b84512fa94a6f9c37
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Parse a comma-separated list of CPU flags from
`[libvirt]/cpu_model_extra_flags`. If a CPU flag starts with '+',
enable the feature in the Nova guest CPU XML; if it starts with
'-', disable the feature. If neither '+' nor '-' is specified, enable
the flag. For example, on a compute node that is running hardware (e.g.
an Intel server that supports TSX) and virtualization software that
supports the given CPU flags, if a user provides this config:
[libvirt]
cpu_mode = custom
cpu_models = Cascadelake-Server
cpu_model_extra_flags = -hle, -rtm, +ssbd, mtrr
Then Nova should generate this CPU for the guest:
<cpu match='exact'>
  <model fallback='forbid'>Cascadelake-Server</model>
  <vendor>Intel</vendor>
  <feature policy='require' name='ssbd'/>
  <feature policy='require' name='mtrr'/>
  <feature policy='disable' name='hle'/>
  <feature policy='disable' name='rtm'/>
</cpu>
This ability to selectively disable CPU flags lets you avoid any CPU
flags that need to be disabled for any number of reasons. E.g. disable
a CPU flag that is a potential security risk, or disable one that causes
a performance penalty.
blueprint: allow-disabling-cpu-flags
Change-Id: I2ef7c5bef87bd64c087f3b136c2faac9a3865f10
Signed-off-by: Patrick Uiterwijk <patrick@puiterwijk.org>
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
This is a security concern, as mounting filesystems on the host has
had previous CVEs around executing code on the host. libguestfs is
much safer, and is the only way we should allow this.
Some caveats came up during the discussion of the bug and this change
which are documented in the release note.
Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>
Closes-Bug: #1552042
Change-Id: Iac8496065c8b6212d7edac320659444ab341b513
This hasn't been validated upstream and there doesn't appear to be
anyone using it. It's time to drop support for this. This is mostly test
and documentation damage, though there is some other cleanup going on,
like the removal of the essentially noop 'pick_disk_driver_name' helper.
Change-Id: I73305e82da5d8da548961b801a8e75fb0e8c4cf1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This has not been tested in the gate for a long time and was only added
to enable CI in the early days of OpenStack. Time to bid adieu.
Change-Id: I7a157f37d2a67e1174a1725fd579c761d81a09b1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
This was replaced by the '[DEFAULT] pointer_model' config option
back in the 14.0.0 (Newton) release.
Change-Id: Ia39c0bad4c1c03b3ffb4a162c2afddb44ebaf6a1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
When using emulated TPM, libvirt will store the persistent TPM data
under '/var/lib/libvirt/swtpm/<instance_uuid>', which is owned by the
"tss" or "root" user depending on how libvirt is configured (the parent
directory, '/var/lib/libvirt/swtpm', is always owned by root). When doing
a resize or a cold migration between nodes, this data needs to be copied
to the other node to ensure that the TPM data is not lost. Libvirt
won't do this automatically for us since cold migrations, or offline
migrations in libvirt lingo, do not currently support "copying
non-shared storage or other file based storages", which includes the
vTPM device [1].
To complicate things further, even if migration/resize is supported,
only the user that nova-compute runs as is guaranteed to be able to have
SSH keys set up for passwordless access, and it's only guaranteed to be
able to copy files to the instance directory on the dest node.
The solution is to have nova (via privsep) copy the TPM files into the
local instance directory on the source and change their ownership. This
is handled through an additional call in 'migrate_disk_and_power_off'.
Since the files are now part of the instance directory, nova copies
them to the dest along with the rest of it. Nova then (once again, via
privsep) changes the ownership back and moves the files to where
libvirt expects to find them. This second step is handled by
'finish_migration'. Confirming the resize will result in the original
TPM data at '/var/lib/libvirt/swtpm' being deleted by libvirt and the
copied TPM data in the instance directory being cleaned up by nova (via
'confirm_migration'), while reverting it will result in the same
happening on the destination host.
Part of blueprint add-emulated-virtual-tpm
[1] https://libvirt.org/migration.html#offline
Change-Id: I9b053919bb499c308912c8c9bff4c1fc396c1193
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Co-authored-by: Stephen Finucane <stephenfin@redhat.com>
The 4.14 kernel is sufficiently long in the tooth (Ubuntu 18.04 uses
4.15, and RHEL 7.x has likely backported the fixes) that there are
likely not a great number of users who could still use this broken
feature if they wanted to. Drop support for it almost entirely,
retaining only a warning to prevent accidental use.
Change-Id: Iad76bce128574dc2f86998ccf2a9c5e799c71313
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Previous patches added support for parsing the vTPM-related flavor extra
specs and image metadata properties, the necessary integrations with the
Castellan key manager API etc. This change adds the ability to enable
support in the libvirt driver and create guests with vTPM functionality
enabled. Cold migration and resize are not yet supported. These will be
addressed in follow-on changes.
Functional tests are included. These require expansion of the
fakelibvirt stubs to implement basic secret management.
Part of blueprint add-emulated-virtual-tpm
[1] https://review.opendev.org/686804
Change-Id: I1ff51f608b85dbb621814e70079ecfdd3d1a1d22
Co-Authored-By: Eric Fried <openstack@fried.cc>
Co-Authored-By: Stephen Finucane <stephenfin@redhat.com>
Testing on our own deployment which has 'live_migration_tunnelled'
enabled has indicated that 'live_migration_inbound_addr' still
works in this case, despite the documentation suggesting otherwise.
Looking at the code, I am led to believe that
'live_migration_inbound_addr' is set on the migration target and
is then used by the migration source to replace the '%s' in the
migration URI. While 'live_migration_uri' is ignored if both
'live_migration_tunnelled' is enabled and
'live_migration_inbound_addr' is set, the docs suggest further
impact on the inbound_addr which doesn't appear to be true.
This patch attempts to clarify the use of the live migration
parameters.
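To illustrate the clarified semantics (host names made up): the option is read on the target and substituted into the URI template by the source:

```ini
# On the migration target:
[libvirt]
live_migration_inbound_addr = dest01.example.com

# On the source, a URI template such as
#   qemu+tcp://%s/system
# then becomes qemu+tcp://dest01.example.com/system
```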
Change-Id: I5a25786413ede23c72f8ccee1ad12497da7f751c
This teaches libvirt's RBD image backend about the outside world, that
other ceph clusters may exist, and how to use Glance's multi-store image
import-via-copy mechanism.
The basic theory is that when we go to do the normal CoW clone for RBD,
we do the "does this image have a location that matches my RBD backend?"
check. If that check does not pass, if configured, we avoid failing
and ask Glance to copy it to our store instead. After that has completed,
we just recurse (once) and re-try our existing logic to see if the image
is now in a reachable location. If so, we pass like we would have
originally, and if not, we fail in the same way we would have.
The copy-to-store logic sets up a looping poll to check for copy completion
every N seconds according to a tunable, with a total timeout value in
case it never completes. If the timeout expires or Glance reports failure,
we will treat that the same as unreachable-due-to-location.
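A hedged config sketch (the store name and timings are illustrative):

```ini
[libvirt]
images_type = rbd
# Glance store name that maps to this compute's RBD backend; enables
# the copy-to-store fallback described above.
images_rbd_glance_store_name = ceph-east
# Poll for copy completion every N seconds, with an overall timeout.
images_rbd_glance_copy_poll_interval = 15
images_rbd_glance_copy_timeout = 600
```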
Related to blueprint rbd-glance-multistore
Change-Id: Ia839ad418b0f2887cb8e8f5ee3e660a0751db9ce
If the RBD backend is used for Nova ephemeral storage, Nova tries to
remove the ephemeral storage volume from Ceph in a retry loop: 10
attempts at 1-second intervals, totaling 10 seconds overall, which, due
to a thirty-second Ceph watcher timeout, might result in intermittent
volume removal failures on the Ceph side.
This patch adds the params rbd_destroy_volume_retries, defaulting to
12, and rbd_destroy_volume_retry_interval, defaulting to 5, which,
multiplied, give Ceph a reasonable amount of time to complete the
operation successfully.
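With the defaults, the retry budget works out as follows:

```ini
[libvirt]
# 12 attempts x 5 s interval = 60 s total, comfortably above the
# thirty-second Ceph watcher timeout.
rbd_destroy_volume_retries = 12
rbd_destroy_volume_retry_interval = 5
```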
Closes-Bug: #1856845
Change-Id: Icfd55617f0126f79d9610f8a2fc6b4c817d1a2bd
This change adds a max_queues config option to allow
operators to set the maximum number of virtio queue
pairs that can be allocated to a virtio network
interface.
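For example (the value shown is illustrative; leaving the option unset keeps the previous behavior):

```ini
[libvirt]
# Upper bound on virtio net queue pairs per interface
max_queues = 8
```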
Change-Id: I9abe783a9a9443c799e7c74a57cc30835f679a01
Closes-Bug: #1847367
Blueprint image-precache-support added a conf section called
[image_cache], so it makes sense to move all the existing image
cache-related conf options into it.
Old:
[DEFAULT]image_cache_manager_interval
[DEFAULT]image_cache_subdirectory_name
[DEFAULT]remove_unused_base_images
[DEFAULT]remove_unused_original_minimum_age_seconds
[libvirt]remove_unused_resized_minimum_age_seconds
New:
[image_cache]manager_interval
[image_cache]subdirectory_name
[image_cache]remove_unused_base_images
[image_cache]remove_unused_original_minimum_age_seconds
[image_cache]remove_unused_resized_minimum_age_seconds
Change-Id: I3c49825ac0d70152b6c8ee4c8ca01546265f4b80
Partial-Bug: #1847302
Add one configuration option CONF.libvirt.pmem_namespaces:
"$LABEL:$NSNAME[|$NSNAME][,$LABEL:$NSNAME[|$NSNAME]]"
e.g. "128G:ns0|ns1|ns2|ns3,262144MB:ns4|ns5,MEDIUM:ns6|ns7"
Change-Id: I98e5ddbd7a9f2211a16221b5049bc36452a49a75
Partially-Implements: blueprint virtual-persistent-memory
Co-Authored-By: He Jie Xu <hejie.xu@intel.com>
This is a follow-up to the previous SEV commit which enables booting
SEV guests (I659cb77f12a3), making some minor improvements based on
nits highlighted during review:
- Clarify in the hypervisor-kvm.rst documentation that the
num_memory_encrypted_guests option is optional, by rewording and
moving it to the list of optional steps.
- Make things a bit more concise and avoid duplication of information
between the above page and the documentation for the option
num_memory_encrypted_guests, instead relying on appropriate
hyperlinking between them.
- Clarify that virtio-blk can be used for boot disks in newer kernels.
- Hyperlink to a page explaining vhost-user.
- Remove an unneeded mocking of a LOG object.
- A few other grammar / spelling tweaks.
blueprint: amd-sev-libvirt-support
Change-Id: I75b7ec3a45cac25f6ebf77c6ed013de86c6ac947
Track compute node inventory for the new MEM_ENCRYPTION_CONTEXT
resource class (added in os-resource-classes 0.4.0) which represents
the number of guests a compute node can host concurrently with memory
encrypted at the hardware level.
This serves as a "master switch" for enabling SEV functionality, since
all the code which takes advantage of the presence of this inventory
in order to boot SEV-enabled guests is already in place, but none of
it gets used until the inventory is non-zero.
A discrete inventory is required because on AMD SEV-capable hardware,
the memory controller has a fixed number of slots for holding
encryption keys, one per guest. Typical early hardware only has 15
slots, thereby limiting the number of SEV guests which can be run
concurrently to 15. nova needs to track how many slots are available
and used in order to avoid attempting to exceed that limit in the
hardware.
Work is in progress to allow QEMU and libvirt to expose the number of
slots available on SEV hardware; however until this is finished and
released, it will not be possible for nova to programmatically detect
the correct value with which to populate the MEM_ENCRYPTION_CONTEXT
inventory. So as a stop-gap, populate the inventory using the value
manually provided by the cloud operator in a new configuration option
CONF.libvirt.num_memory_encrypted_guests.
Since this commit effectively enables SEV, also add all the relevant
documentation as planned in the AMD SEV spec[0]:
- Add operation.boot-encrypted-vm to the KVM hypervisor feature matrix.
- Update the KVM section of the Configuration Guide.
- Update the flavors section of the User Guide.
- Add a release note.
[0] http://specs.openstack.org/openstack/nova-specs/specs/train/approved/amd-sev-libvirt-support.html#documentation-impact
blueprint: amd-sev-libvirt-support
Change-Id: I659cb77f12a38a4d2fb118530ebb9de88d2ed30d
Rename the existing config attribute [libvirt]/cpu_model to
[libvirt]/cpu_models, which is an ordered list of CPU models the host
supports. The values in the list are case-insensitive.
Change the logic of the '_get_guest_cpu_model_config' method: if
cpu_mode is custom and cpu_models is set, it will parse the required
traits associated with the CPU flags from the flavor extra_specs and
select the most appropriate CPU model.
Add a new method 'get_cpu_model_names' to host.py. It returns a list
of the CPU models that the host CPU architecture can support.
Update the hypervisor-kvm docs.
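An illustrative configuration (the model names are examples):

```ini
[libvirt]
cpu_mode = custom
# Ordered list; nova selects the most appropriate model that satisfies
# the traits required by the flavor extra_specs.
cpu_models = SandyBridge,IvyBridge,Haswell,Cascadelake-Server
```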
Change-Id: I06e1f7429c056c4ce8506b10359762e457dbb2a0
Implements: blueprint cpu-model-selection
This patch removes the legacy code for image checksumming
as well as configuration values that are no longer being
used.
Change-Id: I9c552e33456bb862688beaabe69f2b72bb8ebcce
libvirt has split the CPU feature flags file 'cpu_map.xml' into
individual flag files for each CPU model, which are stored under
the directory 'src/cpu_map/'.
Update this change accordingly.
Change-Id: Id45587adb6ecd8e0bdef344c90979eaea61e61b8
Following the https://review.openstack.org/#/c/510776/ fix, we should
make a corresponding change to the nova docs.
Change-Id: Id3f03d260a305150013cfe5578c1da438a12e6f0
Related-Bug: #1722432
Previously the initial call to connect to a RBD cluster via the RADOS
API could hang indefinitely if network or other environmental related
issues were encountered.
When encountered during a call to update_available_resource, this can
result in the local n-cpu service reporting as UP while never being
able to break out of a subsequent RPC timeout loop, as documented in
the bug.
This change adds a simple configurable timeout to be used when
initially connecting to the cluster [1][2][3]. The default timeout of 5
seconds is small enough to ensure that, if encountered, the n-cpu
service will be marked as DOWN before an RPC timeout is seen.
[1] http://docs.ceph.com/docs/luminous/rados/api/python/#rados.Rados.connect
[2] http://docs.ceph.com/docs/mimic/rados/api/python/#rados.Rados.connect
[3] http://docs.ceph.com/docs/nautilus/rados/api/python/#rados.Rados.connect
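The resulting option, with the default introduced by this change:

```ini
[libvirt]
# Seconds to wait for the initial RADOS connection (default: 5)
rados_connect_timeout = 5
```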
Closes-bug: #1834048
Change-Id: I67f341bf895d6cc5d503da274c089d443295199e
Ceph doesn't support QCOW2 for hosting a virtual machine
disk:
http://docs.ceph.com/docs/master/rbd/rbd-openstack/
When we set image_type to rbd and force_raw_images to
False and we don't launch an instance with boot-from-volume,
the instance is spawned using qcow2 as the root disk but
fails to boot because the data is accessed as raw.
To fix this, we raise an error and refuse to start the
nova-compute service when force_raw_images and
image_type are incompatible.
When we import an image into rbd, check the format of the cached
image. If the format is not raw, remove it first and
fetch it again; it will then be in raw format.
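The now-rejected combination looks like this (assuming the option referred to above is the [libvirt]images_type one the driver reads); with it, nova-compute refuses to start:

```ini
[DEFAULT]
force_raw_images = False

[libvirt]
images_type = rbd
```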
Change-Id: I1aa471e8df69fbb6f5d9aeb35651bd32c7123d78
Closes-Bug: 1816686
Change I408baef12358a83921c4693b847a692f6c19e36f in Stein
bumped the minimum required version of libvirt to 3.0.0
so we can drop the minimum version check for post-copy
support along with the related unit tests.
While in here, this fixes a typo in the help text for the
live_migration_permit_post_copy config option.
Change-Id: Id55fbb44eec67cba18293deb25ba4d54fbfd83bc
When configuring QEMU cache modes for Nova instances, we use
'writethrough' when 'none' is not available. But that's not correct,
because of our misunderstanding of how cache modes work. E.g. the
function disk_cachemode() in the libvirt driver assumes that
'writethrough' and 'none' cache modes have the same behaviour with
respect to host crash safety, which is not at all true.
The misunderstanding and complexity stems from not realizing that each
QEMU cache mode is a shorthand to toggle *three* booleans. Refer to the
convenient cache mode table in the code comment (in
nova/virt/libvirt/driver.py).
As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):
The thing that makes 'writethrough' so safe against host crashes is
that it never keeps data in a "write cache", but it calls fsync()
after _every_ write. This is also what makes it horribly slow. But
'cache=none' doesn't do this and therefore doesn't provide this kind
of safety. The guest OS must explicitly flush the cache in the
right places to make sure data is safe on the disk. And OSes do
that.
So if 'cache=none' is safe enough for you, then 'cache=writeback'
should be safe enough for you, too -- because both of them have the
boolean 'cache.writeback=on'. The difference is only in
'cache.direct', but 'cache.direct=on' only bypasses the host kernel
page cache and data could still sit in other caches that could be
present between QEMU and the disk (such as commonly a volatile write
cache on the disk itself).
So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.
Do the minimum required update to the `disk_cachemodes` config help
text. (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)
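An illustrative setting (the fallback change itself needs no configuration; this simply shows the option's format):

```ini
[libvirt]
# Per-disk-type cache modes; 'none' where O_DIRECT is supported, with
# nova now falling back to 'writeback' instead of 'writethrough'.
disk_cachemodes = file=none,block=none
```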
Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>
The file virt.py was removed by the patch
https://review.openstack.org/#/c/392566/,
so the use_cow_images config option is now in compute.py.
Change the reference from virt to compute for it.
Change-Id: Id332f6dc3e0aaaab4ff94b810e4a5bf6b7e01874
Added a new ``unique`` choice to the ``[libvirt]/sysinfo_serial``
configuration which, if set, will result in the guest serial number
being set to ``instance.uuid`` on this host. It is also made
the default value of the ``[libvirt]/sysinfo_serial`` config option.
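For example:

```ini
[libvirt]
# 'unique' (the new default) sets the guest serial to instance.uuid
sysinfo_serial = unique
```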
Implements: blueprint per-instance-libvirt-sysinfo-serial
Change-Id: I001beb2840496f7950988acc69018244847aa888
- Add a one-line summary of the config attribute
`live_migration_with_native_tls` and a note about its version
requirements. Thus allowing the diligent operators, who carefully
read the documentation, to make more informed choices.
- Remove the superfluous "_migrateToURI3" suffix in the unit test. In
Nova commit 4b3e8772, we ripped out support for the older
migrateToURI{2} APIs and just stuck to migrateToURI3(). So no need
to spell out which migration API variant we are using.
- Add a TODO item in the libvirt driver to deprecate and remove
support for VIR_MIGRATE_TUNNELLED (and related config attribute)
once the MIN_{LIBVIRT,QEMU}_VERSION supports "native TLS" by
default.
Blueprint: support-qemu-native-tls-for-live-migration
Change-Id: Ic1419e443cecf94eb4f2c48894abb1a0eb9b73cb
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
The encryption offered by Nova (via `live_migration_tunnelled`, i.e.
"tunnelling via libvirtd") today secures only two migration streams:
guest RAM and device state; but it does _not_ encrypt the NBD (Network
Block Device) transport—which is used to migrate disks that are on
non-shared storage setup (also called: "block migration"). Further, the
"tunnelling via libvirtd" has a huge performance penalty and latency,
because it burns more CPU and memory bandwidth due to increased number
of data copies on both source and destination hosts.
To solve this existing limitation, introduce a new config option
`live_migration_with_native_tls`, which will take advantage of "native
TLS" (i.e. TLS built into QEMU, and relevant support in libvirt). The
native TLS transport will encrypt all migration streams, *including*
disks that are not on shared storage — all of this without incurring the
limitations of the "tunnelled via libvirtd" transport.
Closes-Bug: #1798796
Blueprint: support-qemu-native-tls-for-live-migration
Change-Id: I78f5fef41b6fbf118880cc8aa4036d904626b342
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>