Commit Graph

161 Commits

Author SHA1 Message Date
Zuul 33ab9c5d0e Merge "[libvirt]Add migration_inbound_addr" 2023-12-05 11:43:35 +00:00
Zuul 1738b52c30 Merge "Detect maximum number of SEV guests automatically" 2023-11-23 18:13:25 +00:00
Takashi Kajinami 03055de176 Detect maximum number of SEV guests automatically
Libvirt 8.0.0 implemented the capability to expose the maximum number
of SEV guests and SEV-ES guests[1][2]. This allows nova to detect the
maximum number of memory-encrypted guests using that feature.
To preserve the current behavior, the detection is not used if the
[libvirt] num_memory_encrypted_guests option is set.

Note that nova currently supports only SEV and does not support SEV-ES,
so this implementation only uses the maximum number of SEV guests.
The maximum number of SEV-ES guests will be used once support for
SEV-ES is implemented.
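
For example, relying on the automatic detection simply means leaving the
option unset (illustrative snippet):

    [libvirt]
    # leave num_memory_encrypted_guests unset so nova detects the limit
    # from libvirt >= 8.0.0; set it explicitly to override the detection
    #num_memory_encrypted_guests = 15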

[1] 34cb8f6fcd
[2] 7826148a72

Implements: blueprint libvirt-detect-sev-max-guests
Change-Id: I502e1713add7e6a1eb11ecce0cc2b5eb6a14527a
2023-11-23 07:58:54 +00:00
Zuul 3405cd45dd Merge "Fix wrong description about minimum values" 2023-11-19 13:49:21 +00:00
Balazs Gibizer 6bca37e904 [libvirt]Add migration_inbound_addr
For live migration the libvirt driver already supports generating the
migration URL from the compute host's hostname, if so configured.
However, for non-live move operations the driver always used the IP
address of the compute host from [DEFAULT]my_ip.

Some deployments rely on DNS to abstract the IP address management. In
these environments it is beneficial if nova allows connection between
compute hosts based on the hostname (or FQDN) of the host instead of
trying to configure [DEFAULT]my_ip to an IP address.

This patch introduces a new config option
[libvirt]migration_inbound_addr that is used to determine the address
for incoming move operations (cold migrate, resize, evacuate). This
option defaults to [DEFAULT]my_ip to keep the configuration backward
compatible. However, it allows an explicit hostname or FQDN to be
specified, or '%s', which is then resolved to the hostname of the
compute host.
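
For example, a DNS-based deployment could use (values illustrative):

    [libvirt]
    # an explicit FQDN, or '%s' to resolve to this compute's hostname
    migration_inbound_addr = %s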

blueprint: libvirt-migrate-with-hostname-instead-of-ip

Change-Id: I6a80b5620f32770a04c751143c4ad07882e9f812
2023-11-12 10:27:51 +01:00
yatinkarel 3f7cc63d94 Add config option to configure TB cache size
Qemu>=5.0.0 bumped the default tb-cache size to 1GiB (from 32MiB),
which made it difficult to run multiple guest VMs on systems
with lower memory. With Libvirt>=8.0.0 it is possible to
configure a lower tb-cache size.

The config option below is introduced to allow configuring the
TB cache size to suit the environment's needs; it only
applies to 'virt_type=qemu':

[libvirt]tb_cache_size
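
For example, to get back close to the old QEMU default on a
memory-constrained host (value illustrative, assuming the option takes
MiB):

    [libvirt]
    virt_type = qemu
    # TB cache size in MiB; only applies to virt_type=qemu
    tb_cache_size = 32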

Also enable this flag in nova-next job.

[1] https://github.com/qemu/qemu/commit/600e17b26
[2] https://gitlab.com/libvirt/libvirt/-/commit/58bf03f85

Closes-Bug: #1949606
Implements: blueprint libvirt-tb-cache-size
Change-Id: I49d2276ff3d3cc5d560a1bd96f13408e798b256a
2023-07-13 19:35:52 +05:30
Takashi Kajinami 009ffe4127 Fix wrong description about minimum values
The following options have minimum values defined; values that are too
small are not rounded up but are rejected by oslo.config.

This change updates the descriptions to explain the actual behavior.
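
A minimal sketch of the rejection behavior, using oslo.config directly
(the option name is reused purely for illustration):

    from oslo_config import cfg

    conf = cfg.ConfigOpts()
    conf.register_opts(
        [cfg.IntOpt('live_migration_downtime', min=100, default=500)],
        group='libvirt')

    try:
        # a value below the minimum is rejected with a ValueError,
        # not rounded up to the minimum
        conf.set_override('live_migration_downtime', 50, group='libvirt')
    except ValueError as exc:
        print(exc)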

Closes-Bug: #2007532
Change-Id: I8d1533ae4b44d4e8f811dce554196f270e25da3e
2023-02-16 17:04:58 +09:00
Sylvain Bauza 0807b7ae9a Enable cpus when an instance is spawning
With this patch, we now automatically power cores down or up
when an instance is stopped or started.

Also, by default, we now powersave or offline dedicated cores when
starting the compute service.

Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: Id645fd1ba909683af903f3b8f11c7f06db3401cb
2023-02-10 13:03:39 +01:00
Sylvain Bauza 96f9518096 libvirt: let CPUs be power managed
Before going further, we need a way to return the list of CPUs, even
offline ones, if they are power managed by Nova.

Co-Authored-By: Sean Mooney <smooney@redhat.com>
Partially-Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: I5dca10acde0eff554ed139587aefaf2f5fad2ca5
2023-02-10 12:16:57 +01:00
Sylvain Bauza ddf96bcd31 cpu: interfaces for managing state and governor
This is the first stage of the power management series.
In order to be able to switch the CPU state or change the
governor, we need a framework to access sysfs.

As some bits can be reused, let's create a nova.filesystem helper
module that defines read-write mechanisms for accessing sysfs files.
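
A rough sketch of what such a helper could look like (function names are
assumptions for illustration, not the actual nova.filesystem API; real
code would also need privsep escalation, since most sysfs files are only
writable by root):

    SYSFS_ROOT = '/sys'

    def read_sys(path: str) -> str:
        """Read and return the contents of a file under /sys."""
        with open(f'{SYSFS_ROOT}/{path}') as f:
            return f.read()

    def write_sys(path: str, data: str) -> None:
        """Write data to a file under /sys."""
        with open(f'{SYSFS_ROOT}/{path}', 'w') as f:
            f.write(data)

    # e.g. read_sys('devices/system/cpu/cpu1/online') -> '1\n'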

Partially-Implements: blueprint libvirt-cpu-state-mgmt
Change-Id: Icb913ed9be8d508de35e755a9c650ba25e45aca2
2023-02-09 07:04:02 +01:00
Rajesh Tailor ac42c43e43 Correct config help message related options
The options list in the 'Related Options:' section is not rendered as a
bulleted list for some params because of a missing blank line.

This change adds the missing blank line wherever needed in [1].
[1] https://docs.openstack.org/nova/latest/configuration/config.html

Change-Id: I7077aea2abcf3cab67592879ebd1fde066bfcac5
2022-11-11 16:03:53 +05:30
Rajesh Tailor aa1e7a6933 Fix typos in help messages
This change fixes typos in conf parameter help messages
and in an error log message.

Change-Id: Iedc268072d77771b208603e663b0ce9b94215eb8
2022-05-30 17:28:29 +05:30
Pedro Almeida de110b042d Update live_migration_downtime definition
Previously, the definition of live_migration_downtime didn't explain
whether any exception or timeout occurs if the migration exceeds the
value. The value is just used as a reference by nova: if any problem
happens when the VM gets paused, there will be no abort or
force-complete.
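
In other words, a setting such as the following (value shown is the
documented default) only steers libvirt's downtime tuning; nova does not
abort the migration when it is exceeded:

    [libvirt]
    # maximum permitted downtime in milliseconds, used as a target
    # by libvirt, not enforced by nova
    live_migration_downtime = 500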

Closes-Bug: #1960345
Signed-off-by: Pedro Almeida <pedro.monteiroazevedodemouraalmeida@windriver.com>
Change-Id: I336481d1801a367b5628fedcd2aa5f5cf763355a
2022-02-23 13:21:03 -03:00
Daniel Bengtsson 0d84833e96 Use the new type HostDomainOpt.
Use the new oslo.config type HostDomainOpt to support underscores in
the name. See the bugzilla[1] for more information.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1868940

Related-Bug: #1892044
Change-Id: Ib6c8fc1e3d90b79b10066c429670fcb957bddd23
2021-05-19 10:29:56 +02:00
Balazs Gibizer e56cc4f439 Replace blind retry with libvirt event waiting in detach
Nova so far applied a retry loop that periodically tried to detach the
device from libvirt while the device was still visible in the domain
xml. This could lead to an issue where an already progressing detach on
the libvirt side was interrupted by nova re-sending the detach request
for the same device. See bug #1882521 for more information.

Also, if there were both a persistent and a live domain, nova tried the
detach from both in the same call. This led to confusion about the
result when such a call failed: had the detach failed only partially?

We can do better, at least for the live detach case. According to the
libvirt developers, detaching from the persistent domain always
succeeds and is a synchronous process. Detaching from the live domain
can be either synchronous or asynchronous depending on the guest OS and
the load on the hypervisor. But for live detach libvirt always sends an
event [1] nova can wait for.

So this patch does two things.

1) Separates the detach from the persistent domain from the detach from
   the live domain to make the error cases clearer.

2) Changes the retry mechanism (see the sketch below).

   Detaching from the persistent domain is not retried. If libvirt
   reports the device as not found while both persistent and live
   detach are needed, the error is ignored and the process continues
   with the live detach. In any other case the error is considered
   fatal.

   Detaching from the live domain is changed to always wait for the
   libvirt event. In case of timeout, the live detach is retried.
   But a failure event from libvirt is considered fatal, based on the
   information from the libvirt developers, so in this case the
   detach is not retried.
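
A minimal sketch of the live-detach loop described above (all names are
illustrative, not the actual driver code):

    # illustrative only: retry on event timeout, treat a libvirt
    # failure event as fatal
    class DeviceDetachFailed(Exception):
        pass

    class DeviceDetachTimeout(Exception):
        pass

    def live_detach(send_detach, wait_for_event, max_attempts=8):
        """send_detach re-sends the detach request to libvirt;
        wait_for_event returns 'removed', 'failed' or 'timeout'."""
        for _ in range(max_attempts):
            send_detach()
            event = wait_for_event()
            if event == 'removed':
                return                      # detach completed
            if event == 'failed':
                raise DeviceDetachFailed()  # fatal, never retried
            # 'timeout': retry the detach request
        raise DeviceDetachTimeout()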

Related-Bug: #1882521

[1]https://libvirt.org/html/libvirt-libvirt-domain.html#virConnectDomainEventDeviceRemovedCallback

Change-Id: I7f2b6330decb92e2838aa7cee47fb228f00f47da
2021-04-18 10:24:08 +02:00
Kashyap Chamarthy 14071dfb11 libvirt: Deprecate `live_migration_tunnelled`
We are well above the MIN_LIBVIRT_VERSION and MIN_QEMU_VERSION
(4.4.0 and 2.11.0, respectively) required to get QEMU-native TLS[1]
support by default.

So we can now deprecate (and later remove) the support for "tunnelled
live migration", which has two inherent limitations: (a) it cannot
handle live migration of disks in a non-shared storage setup (a.k.a.
"block migration"); and (b) it has a huge performance overhead and
latency, because it burns more CPU and memory bandwidth due to an
increased number of data copies on both source and destination hosts.

Both the above limitations are addressed by the QEMU-native TLS support
`live_migration_with_native_tls`, which is the recommended approach for
securing all live migration streams (guest RAM, device state, and
disks).

[1] https://docs.openstack.org/nova/latest/admin/secure-live-migration-with-qemu-native-tls.html

Change-Id: I34fd5a4788a2ad4380d9a57b84512fa94a6f9c37
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
2021-03-16 16:50:32 +01:00
Zuul bcb78e5a02 Merge "Remove non-libguestfs file injection for libvirt" 2021-03-15 11:07:45 +00:00
Kashyap Chamarthy bcd6b42047 libvirt: Allow disabling CPU flags via `cpu_model_extra_flags`
Parse a comma-separated list of CPU flags from
`[libvirt]/cpu_model_extra_flags`.  If a CPU flag starts with '+',
enable the feature in the Nova guest CPU XML; if it starts with
'-', disable the feature.  If neither '+' nor '-' is specified, enable
the flag.  For example, on a compute node whose hardware (e.g.
an Intel server that supports TSX) and virtualization software
support the given CPU flags, if a user provides this config:

    [libvirt]
    cpu_mode = custom
    cpu_models = Cascadelake-Server
    cpu_model_extra_flags = -hle, -rtm, +ssbd, mtrr

Then Nova should generate this CPU for the guest:

     <cpu match='exact'>
       <model fallback='forbid'>Cascadelake-Server</model>
       <vendor>Intel</vendor>
       <feature policy='require' name='ssbd'/>
       <feature policy='require' name='mtrr'/>
       <feature policy='disable' name='hle'/>
       <feature policy='disable' name='rtm'/>
     </cpu>

This ability to selectively disable CPU flags lets you turn off any
flag that needs to be disabled, for any number of reasons: e.g. disable
a CPU flag that is a potential security risk, or disable one that
causes a performance penalty.

blueprint: allow-disabling-cpu-flags

Change-Id: I2ef7c5bef87bd64c087f3b136c2faac9a3865f10
Signed-off-by: Patrick Uiterwijk <patrick@puiterwijk.org>
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
2021-03-04 17:40:06 +01:00
Sean Dague d06a10f096 Remove non-libguestfs file injection for libvirt
This is a security concern, as mounting filesystems on the host has
had previous CVEs around executing code on the host. libguestfs is
much safer, and is the only way we should allow this.

Some caveats came up during the discussion of the bug and this change
which are documented in the release note.

Co-Authored-By: Matt Riedemann <mriedem.os@gmail.com>

Closes-Bug: #1552042

Change-Id: Iac8496065c8b6212d7edac320659444ab341b513
2021-03-03 17:54:57 +01:00
Zuul bc19a33586 Merge "Fix misleading documentation for live_migration_inbound_addr" 2021-02-02 00:49:21 +00:00
Stephen Finucane 3a390c2c82 libvirt: Drop support for Xen
This hasn't been validated upstream and there doesn't appear to be
anyone using it. It's time to drop support for this. This is mostly test
and documentation damage, though there is some other cleanup going on,
like the removal of the essentially noop 'pick_disk_driver_name' helper.

Change-Id: I73305e82da5d8da548961b801a8e75fb0e8c4cf1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-01-22 10:06:40 +00:00
Stephen Finucane d02ce3c4f0 libvirt: Drop support for UML
This has not been tested in the gate for a long time and was only added
to enable CI in the early days of OpenStack. Time to bid adieu.

Change-Id: I7a157f37d2a67e1174a1725fd579c761d81a09b1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-01-22 10:05:57 +00:00
Zuul 561183f47f Merge "libvirt: Remove support for '[libvirt] use_usb_tablet'" 2021-01-20 17:53:13 +00:00
Zuul f27a11feb7 Merge "Update supported transports for iscsi connector" 2020-12-05 17:01:40 +00:00
Stephen Finucane 5b0343d3e1 libvirt: Remove support for '[libvirt] use_usb_tablet'
This was replaced by the '[DEFAULT] pointer_model' config option
back in the 14.0.0 (Newton) release.

Change-Id: Ia39c0bad4c1c03b3ffb4a162c2afddb44ebaf6a1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2020-10-06 13:16:11 +01:00
Stephen Finucane eb819c8c63 Add support for resize and cold migration of emulated TPM files
When using emulated TPM, libvirt will store the persistent TPM data
under '/var/lib/libvirt/swtpm/<instance_uuid>' which is owned by the
"tss" or "root" user depending how libvirt is configured (the parent
directory, '/var/lib/libvirt/swtpm' is always owned by root). When doing
a resize or a cold migration between nodes, this data needs to be copied
to the other node to ensure that the TPM data is not lost.  Libvirt
won't do this automatically for us since cold migrations, or offline
migrations in libvirt lingo, do not currently support "copying
non-shared storage or other file based storages", which includes the
vTPM device [1].

To complicate things further, even if migration/resize is supported,
only the user that nova-compute runs as is guaranteed to have SSH keys
set up for passwordless access, and it is only guaranteed to be able to
copy files to the instance directory on the dest node.

The solution is to have nova (via privsep) copy the TPM files into the
local instance directory on the source and change their ownership.
Running as itself, nova then copies them into the instance directory on
the dest. Nova then (once again, via privsep) changes the ownership back
and moves the files to where libvirt expects to find them. This second
step is handled by 'finish_migration'. Confirming the resize will result
in the original TPM data at '/var/lib/libvirt/swtpm' being deleted by
libvirt and the copied TPM data in the instance directory being cleaned
up by nova (via 'confirm_migration'), while reverting will result in the
same cleanup on the destination host.

Part of blueprint add-emulated-virtual-tpm

[1] https://libvirt.org/migration.html#offline

Change-Id: I9b053919bb499c308912c8c9bff4c1fc396c1193
Signed-off-by: Chris Friesen <chris.friesen@windriver.com>
Co-authored-by: Stephen Finucane <stephenfin@redhat.com>
2020-09-08 09:58:21 +01:00
Stephen Finucane e45f3b5d71 Remove support for Intel CMT events
The 4.14 kernel is sufficiently long in the tooth (Ubuntu 18.04 uses
4.15, and RHEL 7.x has likely backported the fixes) that there are
likely not many users left who could still use this broken feature if
they wanted to. Drop support for it almost entirely, retaining only a
warning to prevent accidental use.

Change-Id: Iad76bce128574dc2f86998ccf2a9c5e799c71313
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2020-09-01 17:36:31 +01:00
Chris Friesen e0ca2652ed libvirt: Add emulated TPM support to Nova
Previous patches added support for parsing the vTPM-related flavor extra
specs and image metadata properties, the necessary integrations with the
Castellan key manager API etc. This change adds the ability to enable
support in the libvirt driver and create guests with vTPM functionality
enabled. Cold migration and resize are not yet supported. These will be
addressed in follow-on changes.
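
As a sketch, enabling the feature on a compute node might look like this
(the 'swtpm_enabled' option name is an assumption here, not confirmed by
this commit message):

    [libvirt]
    # assumed option name: opt this compute node in to emulated TPM
    swtpm_enabled = true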

Functional tests are included. These require expansion of the
fakelibvirt stubs to implement basic secret management.

Part of blueprint add-emulated-virtual-tpm

[1] https://review.opendev.org/686804

Change-Id: I1ff51f608b85dbb621814e70079ecfdd3d1a1d22
Co-Authored-By: Eric Fried <openstack@fried.cc>
Co-Authored-By: Stephen Finucane <stephenfin@redhat.com>
2020-08-25 17:55:33 +01:00
Andrew Bonney b6e9023751 Fix misleading documentation for live_migration_inbound_addr
Testing on our own deployment which has 'live_migration_tunnelled'
enabled has indicated that 'live_migration_inbound_addr' still
works in this case, despite the documentation suggesting otherwise.

Looking at the code, I am led to believe that
'live_migration_inbound_addr' is set on the migration target, and
is then used by the migration source to replace the '%s' in the
migration URI. Whilst 'live_migration_uri' is ignored if both
'live_migration_tunnelled' is enabled and
'live_migration_inbound_addr' is set, the docs suggest a further
impact on the inbound_addr which does not appear to be true.

This patch attempts to clarify the use of the live migration
parameters.

Change-Id: I5a25786413ede23c72f8ccee1ad12497da7f751c
2020-07-31 15:34:21 +01:00
Dan Smith 07025abf72 Make libvirt able to trigger a backend image copy when needed
This teaches libvirt's RBD image backend about the outside world, that
other ceph clusters may exist, and how to use Glance's multi-store image
import-via-copy mechanism.

The basic theory is that when we go to do the normal CoW clone for RBD,
we do the "does this image have a location that matches my RBD backend?"
check. If that check does not pass and the feature is configured, we
avoid failing and instead ask Glance to copy the image to our store.
After that has completed, we just recurse (once) and re-try our existing
logic to see if the image is now in a reachable location. If so, the
check passes as it would have originally, and if not, we fail in the
same way we would have.

The copy-to-store logic sets up a looping poll to check for copy completion
every N seconds according to a tunable, with a total timeout value in
case it never completes. If the timeout expires or Glance reports failure,
we will treat that the same as unreachable-due-to-location.
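
Assuming the option names for this feature (they are not spelled out in
this message), the tuning could look like:

    [libvirt]
    # assumed names: the Glance store backed by this Ceph cluster;
    # setting it enables the copy-to-store fallback
    images_rbd_glance_store_name = ceph-east
    # assumed names: poll interval and overall timeout (seconds)
    images_rbd_glance_copy_poll_interval = 15
    images_rbd_glance_copy_timeout = 600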

Related to blueprint rbd-glance-multistore

Change-Id: Ia839ad418b0f2887cb8e8f5ee3e660a0751db9ce
2020-06-24 07:37:51 -07:00
Sasha Andonov 6458c3dba5 rbd_utils: increase _destroy_volume timeout
If the RBD backend is used for Nova ephemeral storage, Nova tries to
remove the ephemeral storage volume from Ceph in a retry loop: 10
attempts at 1-second intervals, totaling 10 seconds overall - which, due
to the thirty-second Ceph watcher timeout, might result in intermittent
volume removal failures on the Ceph side.
This patch adds the params rbd_destroy_volume_retries, defaulting to 12,
and rbd_destroy_volume_retry_interval, defaulting to 5, which,
multiplied together, give Ceph a reasonable amount of time to complete
the operation successfully.
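
With the defaults, that is 12 retries x 5 seconds = 60 seconds, double
the thirty-second Ceph watcher timeout (assuming the new params live in
the [libvirt] group alongside the other RBD options):

    [libvirt]
    rbd_destroy_volume_retries = 12
    rbd_destroy_volume_retry_interval = 5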

Closes-Bug: #1856845
Change-Id: Icfd55617f0126f79d9610f8a2fc6b4c817d1a2bd
2020-05-20 00:08:54 +00:00
Sean Mooney 0e6aac3c2d add [libvirt]/max_queues config option
This change adds a max_queues config option to allow
operators to set the maximum number of virtio queue
pairs that can be allocated to a virtio network
interface.
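
For example (value illustrative):

    [libvirt]
    # cap the number of virtio queue pairs for a network interface
    max_queues = 8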

Change-Id: I9abe783a9a9443c799e7c74a57cc30835f679a01
Closes-Bug: #1847367
2019-12-02 15:46:50 +00:00
Eric Fried 828e8047e5 Consolidate [image_cache] conf options
Blueprint image-precache-support added a conf section called
[image_cache], so it makes sense to move all the existing image
cache-related conf options into it.

Old:
[DEFAULT]image_cache_manager_interval
[DEFAULT]image_cache_subdirectory_name
[DEFAULT]remove_unused_base_images
[DEFAULT]remove_unused_original_minimum_age_seconds
[libvirt]remove_unused_resized_minimum_age_seconds

New:
[image_cache]manager_interval
[image_cache]subdirectory_name
[image_cache]remove_unused_base_images
[image_cache]remove_unused_original_minimum_age_seconds
[image_cache]remove_unused_resized_minimum_age_seconds

Change-Id: I3c49825ac0d70152b6c8ee4c8ca01546265f4b80
Partial-Bug: #1847302
2019-11-13 11:09:03 -06:00
LuyaoZhong 0e4ca43311 libvirt: Enable driver configuring PMEM namespaces
Add one configuration option CONF.libvirt.pmem_namespaces:
"$LABEL:$NSNAME[|$NSNAME][,$LABEL:$NSNAME[|$NSNAME]]"

e.g. "128G:ns0|ns1|ns2|ns3,262144MB:ns4|ns5,MEDIUM:ns6|ns7"

Change-Id: I98e5ddbd7a9f2211a16221b5049bc36452a49a75
Partially-Implements: blueprint virtual-persistent-memory
Co-Authored-By: He Jie Xu <hejie.xu@intel.com>
2019-09-19 23:15:39 +00:00
Zuul 22a440e0ed Merge "vCPU model selection" 2019-09-11 22:59:42 +00:00
Adam Spiers 922d8bf811 Improve SEV documentation and other minor tweaks
This is a follow-up to the previous SEV commit which enables booting
SEV guests (I659cb77f12a3), making some minor improvements based on
nits highlighted during review:

- Clarify in the hypervisor-kvm.rst documentation that the
  num_memory_encrypted_guests option is optional, by rewording and
  moving it to the list of optional steps.

- Make things a bit more concise and avoid duplication of information
  between the above page and the documentation for the option
  num_memory_encrypted_guests, instead relying on appropriate
  hyperlinking between them.

- Clarify that virtio-blk can be used for boot disks in newer kernels.

- Hyperlink to a page explaining vhost-user

- Remove an unneeded mocking of a LOG object.

- A few other grammar / spelling tweaks.

blueprint: amd-sev-libvirt-support
Change-Id: I75b7ec3a45cac25f6ebf77c6ed013de86c6ac947
2019-09-10 14:48:32 +01:00
Adam Spiers 8e5d6767bb Enable booting of libvirt guests with AMD SEV memory encryption
Track compute node inventory for the new MEM_ENCRYPTION_CONTEXT
resource class (added in os-resource-classes 0.4.0) which represents
the number of guests a compute node can host concurrently with memory
encrypted at the hardware level.

This serves as a "master switch" for enabling SEV functionality, since
all the code which takes advantage of the presence of this inventory
in order to boot SEV-enabled guests is already in place, but none of
it gets used until the inventory is non-zero.

A discrete inventory is required because on AMD SEV-capable hardware,
the memory controller has a fixed number of slots for holding
encryption keys, one per guest.  Typical early hardware only has 15
slots, thereby limiting the number of SEV guests which can be run
concurrently to 15.  nova needs to track how many slots are available
and used in order to avoid attempting to exceed that limit in the
hardware.

Work is in progress to allow QEMU and libvirt to expose the number of
slots available on SEV hardware; however until this is finished and
released, it will not be possible for nova to programmatically detect
the correct value with which to populate the MEM_ENCRYPTION_CONTEXT
inventory.  So as a stop-gap, populate the inventory using the value
manually provided by the cloud operator in a new configuration option
CONF.libvirt.num_memory_encrypted_guests.
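
For example, on typical early SEV hardware with 15 key slots:

    [libvirt]
    num_memory_encrypted_guests = 15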

Since this commit effectively enables SEV, also add all the relevant
documentation as planned in the AMD SEV spec[0]:

- Add operation.boot-encrypted-vm to the KVM hypervisor feature matrix.

- Update the KVM section of the Configuration Guide.

- Update the flavors section of the User Guide.

- Add a release note.

[0] http://specs.openstack.org/openstack/nova-specs/specs/train/approved/amd-sev-libvirt-support.html#documentation-impact

blueprint: amd-sev-libvirt-support
Change-Id: I659cb77f12a38a4d2fb118530ebb9de88d2ed30d
2019-09-10 13:59:02 +01:00
ya.wang f80e5f989d vCPU model selection
Rename the existing config attribute [libvirt]/cpu_model to
[libvirt]/cpu_models, which is an ordered list of the CPU models the
host supports. The values in the list can be case-insensitive.
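
For example (model names illustrative; nova then picks the most
appropriate model from the list that satisfies the requested traits):

    [libvirt]
    cpu_mode = custom
    # ordered list of models the host supports
    cpu_models = Penryn,IvyBridge,Haswell,Cascadelake-Server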

Change the logic of the method '_get_guest_cpu_model_config': if
cpu_mode is custom and cpu_models is set, it will parse the required
traits associated with the CPU flags from the flavor extra_specs and
select the most appropriate CPU model.

Add new method 'get_cpu_model_names' to host.py. It will return a list
of the cpu models that the CPU arch can support.

Update the docs of hypervisor-kvm.

Change-Id: I06e1f7429c056c4ce8506b10359762e457dbb2a0
Implements: blueprint cpu-model-selection
2019-09-06 14:01:35 +08:00
Mohammed Naser e141bcdb25 config: remove deprecated checksum options
This patch removes the legacy code for image checksumming
as well as configuration values that are no longer being
used.

Change-Id: I9c552e33456bb862688beaabe69f2b72bb8ebcce
2019-08-15 11:47:51 -04:00
Wang Huaqiang 4945e6dbd4 doc: correct the information of 'cpu_map'
libvirt has split the CPU feature flags file 'cpu_map.xml' into
a bunch of flag files for each CPU model, which are stored under
directory 'src/cpu_map/'.

Update the documentation accordingly.

Change-Id: Id45587adb6ecd8e0bdef344c90979eaea61e61b8
2019-08-02 20:19:18 +08:00
Jack Lu 76c60c25bc Update supported transports for iscsi connector
Following the fix in https://review.openstack.org/#/c/510776/, we
should make a corresponding change to the nova docs.

Change-Id: Id3f03d260a305150013cfe5578c1da438a12e6f0
Related-Bug: #1722432
2019-07-15 16:51:59 +01:00
Lee Yarwood 03f7dc29b7 libvirt: Add a rbd_connect_timeout configurable
Previously the initial call to connect to a RBD cluster via the RADOS
API could hang indefinitely if network or other environmental related
issues were encountered.

When encountered during a call to update_available_resource this can
result in the local n-cpu service reporting as UP while never being able
to break out of a subsequent RPC timeout loop as documented in bug

This change adds a simple timeout configurable to be used when initially
connecting to the cluster [1][2][3]. The default timeout of 5 seconds
being sufficiently small enough to ensure that if encountered the n-cpu
service will be able to be marked as DOWN before a RPC timeout is seen.
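
The new option with its default:

    [libvirt]
    # seconds to wait when first connecting to the RADOS cluster
    rbd_connect_timeout = 5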

[1] http://docs.ceph.com/docs/luminous/rados/api/python/#rados.Rados.connect
[2] http://docs.ceph.com/docs/mimic/rados/api/python/#rados.Rados.connect
[3] http://docs.ceph.com/docs/nautilus/rados/api/python/#rados.Rados.connect

Closes-bug: #1834048
Change-Id: I67f341bf895d6cc5d503da274c089d443295199e
2019-07-02 19:36:16 +01:00
zhu.boxiang 6c6ffc0476 Fix failure to boot instances with qcow2 format images
Ceph doesn't support QCOW2 for hosting a virtual machine
disk:

  http://docs.ceph.com/docs/master/rbd/rbd-openstack/

When we set images_type to rbd and force_raw_images to
False, and we don't launch an instance with boot-from-volume,
the instance is spawned using qcow2 as the root disk but
fails to boot because the data is accessed as raw.

To fix this, we raise an error and refuse to start the
nova-compute service when force_raw_images and
images_type are incompatible.
When we import an image into rbd, we check the format of the
cached image. If the format is not raw, we remove it first and
fetch it again; it will then be in raw format.
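
In other words, a combination such as the following now causes
nova-compute to refuse to start:

    [DEFAULT]
    force_raw_images = False

    [libvirt]
    images_type = rbd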

Change-Id: I1aa471e8df69fbb6f5d9aeb35651bd32c7123d78
Closes-Bug: 1816686
2019-05-20 19:10:31 +08:00
Matt Riedemann 3bcfb15a89 libvirt: drop MIN_LIBVIRT_POSTCOPY_VERSION
Change I408baef12358a83921c4693b847a692f6c19e36f in Stein
bumped the minimum required version of libvirt to 3.0.0
so we can drop the minimum version check for post-copy
support along with the related unit tests.

While in here, this fixes a typo in the help text for the
live_migration_permit_post_copy config option.

Change-Id: Id55fbb44eec67cba18293deb25ba4d54fbfd83bc
2019-04-03 14:27:03 -04:00
Kashyap Chamarthy b9dc86d8d6 libvirt: Use 'writeback' QEMU cache mode when 'none' is not viable
When configuring QEMU cache modes for Nova instances, we use
'writethrough' when 'none' is not available.  But that's not correct,
because of our misunderstanding of how cache modes work.  E.g. the
function disk_cachemode() in the libvirt driver assumes that
'writethrough' and 'none' cache modes have the same behaviour with
respect to host crash safety, which is not at all true.

The misunderstanding and complexity stem from not realizing that each
QEMU cache mode is a shorthand to toggle *three* booleans.  Refer to the
convenient cache mode table in the code comment (in
nova/virt/libvirt/driver.py).
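
For reference, that table maps each cache mode to the three booleans
roughly as follows (indicative summary, not a verbatim copy of the code
comment):

    cache mode   | cache.writeback | cache.direct | cache.no-flush
    -------------+-----------------+--------------+---------------
    writeback    | on              | off          | off
    none         | on              | on           | off
    writethrough | off             | off          | off
    directsync   | off             | on           | off
    unsafe       | on              | off          | on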

As Kevin Wolf (thanks!), QEMU Block Layer maintainer, explains (I made
a couple of micro edits for clarity):

    The thing that makes 'writethrough' so safe against host crashes is
    that it never keeps data in a "write cache", but it calls fsync()
    after _every_ write.  This is also what makes it horribly slow.  But
    'cache=none' doesn't do this and therefore doesn't provide this kind
    of safety.  The guest OS must explicitly flush the cache in the
    right places to make sure data is safe on the disk.  And OSes do
    that.

    So if 'cache=none' is safe enough for you, then 'cache=writeback'
    should be safe enough for you, too -- because both of them have the
    boolean 'cache.writeback=on'.  The difference is only in
    'cache.direct', but 'cache.direct=on' only bypasses the host kernel
    page cache and data could still sit in other caches that could be
    present between QEMU and the disk (such as commonly a volatile write
    cache on the disk itself).

So use 'writeback' mode instead of the debilitatingly slow
'writethrough' for cases where the O_DIRECT-based 'none' is unsupported.

Do the minimum required update to the `disk_cachemodes` config help
text.  (In a future patch, rewrite the cache modes documentation to fix
confusing fragments and outdated information.)

Closes-Bug: #1818847
Change-Id: Ibe236988af24a3b43508eec4efbe52a4ed05d45f
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Looks-good-to-me'd-by: Kevin Wolf <kwolf@redhat.com>
2019-03-21 14:17:22 +01:00
zhu.boxiang 8fcf36eb21 Trivialfix for help description of images_type
The file virt.py was removed by the patch
https://review.openstack.org/#/c/392566/,
so the use_cow_images config is now in the file compute.py.
Change the reference from virt to compute accordingly.

Change-Id: Id332f6dc3e0aaaab4ff94b810e4a5bf6b7e01874
2019-03-13 10:27:06 +08:00
Matt Riedemann b29158149d Follow up for per-instance serial number change
This is a follow up change to address review nits
from I001beb2840496f7950988acc69018244847aa888.

Change-Id: I1fbfa46b52b32039ff3d6703a27306b56314c1d5
2019-02-04 11:53:15 -05:00
Kevin_Zheng dec5dd9286 Per-instance serial number
Added a new ``unique`` choice to the ``[libvirt]/sysinfo_serial``
configuration option which, if set, results in the guest serial number
being set to ``instance.uuid`` on this host. It is also made the
default value of the ``[libvirt]/sysinfo_serial`` config option.
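
The new default, written out explicitly:

    [libvirt]
    # 'unique' sets the guest serial number to the instance UUID
    sysinfo_serial = unique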

Implements: blueprint per-instance-libvirt-sysinfo-serial

Change-Id: I001beb2840496f7950988acc69018244847aa888
2019-02-03 10:59:21 +08:00
Kashyap Chamarthy f59140ed7a libvirt: A few miscellaneous items related to "native TLS"
- Add a one-line summary of the config attribute
    `live_migration_with_native_tls` and a note about its version
    requirements.  Thus allowing the diligent operators, who carefully
    read the documentation, to make more informed choices.

  - Remove the superfluous "_migrateToURI3" suffix in the unit test.  In
    Nova commit 4b3e8772, we ripped out support for the older
    migrateToURI{2} APIs and just stuck to migrateToURI3().  So no need
    to spell out which migration API variant we are using.

  - Add a TODO item in the libvirt driver to deprecate and remove
    support for VIR_MIGRATE_TUNNELLED (and related config attribute)
    once the MIN_{LIBVIRT,QEMU}_VERSION supports "native TLS" by
    default.

Blueprint: support-qemu-native-tls-for-live-migration

Change-Id: Ic1419e443cecf94eb4f2c48894abb1a0eb9b73cb
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
2019-01-22 11:09:09 +01:00
Kashyap Chamarthy 9160fe5098 libvirt: Support native TLS for migration and disks over NBD
The encryption offered by Nova (via `live_migration_tunnelled`, i.e.
"tunnelling via libvirtd") today secures only two migration streams:
guest RAM and device state; but it does _not_ encrypt the NBD (Network
Block Device) transport—which is used to migrate disks on a
non-shared storage setup (also called "block migration").  Further, the
"tunnelling via libvirtd" has a huge performance penalty and latency,
because it burns more CPU and memory bandwidth due to an increased
number of data copies on both source and destination hosts.

To solve this existing limitation, introduce a new config option
`live_migration_with_native_tls`, which will take advantage of "native
TLS" (i.e. TLS built into QEMU, and relevant support in libvirt).  The
native TLS transport will encrypt all migration streams, *including*
disks that are not on shared storage — all of this without incurring the
limitations of the "tunnelled via libvirtd" transport.
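
Assuming the TLS environment (certificates for libvirtd and QEMU) is
already set up on both hosts, enabling it is a single option:

    [libvirt]
    # encrypt all migration streams, including the NBD transport
    live_migration_with_native_tls = true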

Closes-Bug: #1798796
Blueprint: support-qemu-native-tls-for-live-migration

Change-Id: I78f5fef41b6fbf118880cc8aa4036d904626b342
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
2019-01-09 11:00:35 +01:00