Commit Graph

2644 Commits

Author SHA1 Message Date
zhong.zhou f3eb76e57b Validate flavor image min ram when resize volume-backed instance
When resizing an instance, the flavors offered may not meet the image's
minimum memory requirement. The resize ignores the minimum memory limit
of the image, so the resize may succeed while the instance then fails
to start because the memory is too small to run the system.

Related-Bug: 2007968
Change-Id: I132e444eedc10b950a2fc9ed259cd6d9aa9bed65
2024-04-18 10:53:04 +08:00
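
The check described in f3eb76e57b can be sketched roughly as below. This is
an illustration under stated assumptions, not Nova's code: `Flavor`, `Image`,
`FlavorMemoryTooSmall`, and `check_flavor_meets_image_min_ram` are
hypothetical names, and both memory values are assumed to be in MiB.

```python
from dataclasses import dataclass


@dataclass
class Flavor:
    # Hypothetical stand-in for a flavor record.
    name: str
    memory_mb: int


@dataclass
class Image:
    # Hypothetical stand-in for image metadata; 0 means "no minimum".
    name: str
    min_ram: int


class FlavorMemoryTooSmall(Exception):
    """Raised when the target flavor has less RAM than the image requires."""


def check_flavor_meets_image_min_ram(flavor, image):
    """Reject a resize whose flavor cannot satisfy the image's min_ram."""
    if image.min_ram and flavor.memory_mb < image.min_ram:
        raise FlavorMemoryTooSmall(
            f"flavor {flavor.name} has {flavor.memory_mb} MiB, "
            f"image {image.name} needs {image.min_ram} MiB"
        )
```

Running the check before the resize is accepted means the request fails
early instead of producing an instance that cannot boot.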
Ghanshyam Mann 0c1e1ccf03 HyperV: Remove RDP console API
The RDP console was only for the HyperV driver, so this removes the
API. As the API URL stays the same (it is shared with the APIs for
other console types), the RDP console API will return 400.

This also cleans up the related config options and moves the
API ref to the obsolete section.

The RPC method is kept to avoid errors when an old controller is used
with a new compute. It can be removed in the next RPC version bump.

Change-Id: I8f5755009da4af0d12bda096d7a8e85fd41e1a8c
2024-02-13 12:24:38 -08:00
Zuul 5e914c27a0 Merge "Bump hacking version" 2023-12-18 21:20:36 +00:00
Sean Mooney f4852f4c81 [codespell] fix final typos and enable ci
This change adds the pre-commit config and
tox targets to run codespell both independently
and via the pep8 target.

This change corrects all the final typos in the
codebase as detected by codespell.

Change-Id: Ic4fb5b3a5559bc3c43aca0a39edc0885da58eaa2
2023-12-15 12:32:42 +00:00
Stephen Finucane 3973fc393c Bump hacking version
This bumps the version of flake8 and resolves some erroneous failures in
f-strings. A number of new E721 (do not compare types) class errors are
picked up, which are all addressed.

Change-Id: I7a1937b107ff3af8d1e5fe23fc32b120ef4697f7
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2023-12-14 10:54:26 +00:00
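
For readers unfamiliar with E721, the pattern it flags and the usual fix
look roughly like this. The functions and the `Name` class are hypothetical
examples, not code taken from the Nova change.

```python
class Name(str):
    """A str subclass, used to show where the two checks disagree."""


def is_text_flagged(value):
    # E721 flags this: comparing type objects directly rejects subclasses.
    return type(value) == str


def is_text_fixed(value):
    # Preferred form: isinstance() also accepts subclasses of str.
    return isinstance(value, str)
```

The two functions agree for plain `str` values and differ only for
subclasses such as `Name`.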
Zuul 17b7aa3926 Merge "[codespell] fix typos in tests" 2023-12-13 23:46:08 +00:00
Danylo Vodopianov eb8519d811 Packed virtqueue support was added.
1) Extend flavor/image extra spec.
2) New xml parameter for qemu command was added.
3) New request filter added for scheduler.
4) Unit and functional tests were updated.
5) Requirements were updated (os-traits = 3.0.0).
6) Release note was added.

Nova spec: https://review.opendev.org/c/openstack/nova-specs/+/868377

Depends-On: https://review.opendev.org/c/openstack/os-traits/+/876069
Change-Id: I789eeae86947e9a3cbd7d5fcc58d2aabe3b8b84c
2023-11-29 16:06:33 +02:00
Dan Smith 190ecc6b8b Clean up service_get_all()
When we added the all_cells flag to this we just kinda hacked it
into place, leaving a big chunk of the method nested inside a
conditional. This refactors out that chunk into a helper, and also
corrects a naming error that was very confusing when reading the code
(a variable named "service" which was a list of services).

Change-Id: I41ff076864dce9ed826922f6609536ea4545a181
2023-10-09 07:45:08 -07:00
Dan Smith 86889b9182 Warn if we find compute services in cell0
While debugging a field issue recently, we determined that computes
had been pointed at cell0 and created service and node records there.
This makes us warn during service list if we find compute services
in cell0 to tip off operators that they have a configuration problem.

Change-Id: Id95c0d02cc34348623b01997fcd1930628d48ccc
2023-10-09 07:03:48 -07:00
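
The warning described in 86889b9182 could look something like the sketch
below. The data shape (a dict mapping cell names to lists of service dicts)
and the function name `warn_on_cell0_computes` are assumptions made for
illustration; Nova's real service objects and cell lookups differ.

```python
import logging

LOG = logging.getLogger(__name__)


def warn_on_cell0_computes(services_by_cell, cell0_name="cell0"):
    """Log a warning if any nova-compute service is registered in cell0.

    Returns the sorted list of offending hosts so callers can inspect it.
    """
    offenders = sorted(
        svc["host"]
        for svc in services_by_cell.get(cell0_name, [])
        if svc.get("binary") == "nova-compute"
    )
    if offenders:
        LOG.warning(
            "Found compute services in %s on hosts: %s. "
            "This usually indicates a configuration problem.",
            cell0_name, ", ".join(offenders),
        )
    return offenders
```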
Sean Mooney 2232ca95f2 [codespell] fix typos in tests
This mainly fixes typos in the tests and
one typo in an exception message.
Some additional items are added to the dict based on
our usage of vars in tests, but we could remove them later
by doing minor test updates. They are intentionally not
fixed in this commit to limit scope creep.

Change-Id: Iacfbb0a5dc8ffb0857219c8d7c7a7d6e188f5980
2023-10-03 11:08:55 +01:00
melanie witt 6f79d6321e Enforce quota usage from placement when unshelving
When [quota]count_usage_from_placement = true or
[quota]driver = nova.quota.UnifiedLimitsDriver, cores and ram quota
usage are counted from placement. When an instance is SHELVED_OFFLOADED,
it will not have allocations in placement, so its cores and ram should
not count against quota during that time.

This means however that when an instance is unshelved, there is a
possibility of going over quota if the cores and ram it needs were
allocated by some other instance(s) while it was SHELVED_OFFLOADED.

This fixes a bug where quota was not being properly enforced during
unshelve of a SHELVED_OFFLOADED instance when quota usage is counted
from placement. Test coverage is also added for the "recheck" quota
cases.

Closes-Bug: #2003991

Change-Id: I4ab97626c10052c7af9934a80ff8db9ddab82738
2023-05-23 01:02:05 +00:00
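
The quota situation described in 6f79d6321e can be illustrated with a small
model. This is a sketch under assumptions: instances are plain dicts, the
`OverQuota` class and both function names are hypothetical, and real usage
would come from placement rather than from an in-memory list.

```python
class OverQuota(Exception):
    """Raised when a request would exceed a quota limit."""


def usage_from_placement(instances):
    """Sum cores and RAM for instances that hold allocations."""
    # SHELVED_OFFLOADED instances have no allocations, so they are skipped.
    counted = [i for i in instances if i["vm_state"] != "shelved_offloaded"]
    return {
        "cores": sum(i["vcpus"] for i in counted),
        "ram": sum(i["memory_mb"] for i in counted),
    }


def check_quota_for_unshelve(instances, instance, limits):
    """Fail the unshelve if re-adding the instance would exceed a limit."""
    usage = usage_from_placement(instances)
    needed = {"cores": instance["vcpus"], "ram": instance["memory_mb"]}
    for resource, amount in needed.items():
        if usage[resource] + amount > limits[resource]:
            raise OverQuota(f"{resource} quota exceeded")
```

The point of the check is that resources freed while the instance was
offloaded may have been consumed by other instances in the meantime.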
Sahid Orentino Ferdjaoui 8c2e765989 compute: enhance compute evacuate instance to support target state
Related to the bp/allowing-target-state-for-evacuate. This change
extends the compute API to accept a new argument, targetState.

The targetState argument, when set, will force the state of an
evacuated instance on the destination host.

Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>
Change-Id: I9660d42937ad62d647afc6be965f166cc5631392
2023-01-31 11:29:01 +01:00
Zuul e3b2910023 Merge "Fix rescue volume-based instance" 2023-01-30 11:16:59 +00:00
alexc20 c97507dfcd record action log when deleting shelved instance
Closes-Bug: #1993736

Change-Id: I9ce18cbba5083c55d15d9b7c2a89133d227754ea
2022-11-03 17:46:48 -03:00
Dan Smith 45c5b80fd0 Add API support for rebuilding BFV instances
This adds a microversion and API support for triggering a rebuild
of volume-backed instances by leveraging cinder functionality to
do so.

Implements: blueprint volume-backed-server-rebuild
Closes-Bug: #1482040

Co-Authored-By: Rajat Dhasmana <rajatdhasmana@gmail.com>

Change-Id: I211ad6b8aa7856eb94bfd40e4fdb7376a7f5c358
2022-08-31 18:05:03 +05:30
Rajesh Tailor 6eed55bf55 Fix rescue volume-based instance
As of now, when attempting to rescue a volume-based instance
using an image without the hw_rescue_device and/or hw_rescue_bus
properties set, the rescue API call fails (as non-stable rescue
for volume-based instances is not supported), leaving the instance
in an error state.

This change checks for the hw_rescue_device/hw_rescue_bus image
properties before attempting to rescue and, if the property
is not set, fails with a proper error message without changing
the instance state.

Related-Bug: #1978958
Closes-Bug: #1926601
Change-Id: Id4c8c5f3b32985ac7d3d7c833b82e0876f7367c1
2022-08-26 12:28:00 +05:30
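
A pre-flight check in the spirit of 6eed55bf55 might look like the sketch
below. The names `RescueNotAllowed` and `check_rescue_image` are
hypothetical, and the rule that at least one of the two properties must be
set is an assumption; the commit message does not spell out the exact
condition.

```python
RESCUE_PROPERTIES = ("hw_rescue_device", "hw_rescue_bus")


class RescueNotAllowed(Exception):
    """Raised before any state change when a rescue cannot be performed."""


def check_rescue_image(image_properties, is_volume_backed):
    """Validate the rescue image before the instance state is touched."""
    if not is_volume_backed:
        return  # Only volume-backed instances need the stable rescue path.
    # Assumption: one of the two properties is enough to request it.
    if not any(image_properties.get(name) for name in RESCUE_PROPERTIES):
        raise RescueNotAllowed(
            "rescue image must set hw_rescue_device or hw_rescue_bus "
            "for volume-backed instances"
        )
```

Because the check raises before the rescue starts, the instance keeps its
current state when the image is unsuitable.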
Sean Mooney 0aad338b1c Add VDPA support for suspend and livemigrate
This change appends vnic-type vdpa to the list
of passthrough vnic types and removes the API blocks.

This should enable the existing suspend and live migrate
code to properly manage vdpa interfaces enabling
"hot plug" live migrations similar to direct sr-iov.

Implements: blueprint vdpa-suspend-detach-and-live-migrate
Change-Id: I878a9609ce0d84f7e3c2fef99e369b34d627a0df
2022-08-23 09:32:00 +01:00
Sean Mooney 6f1c7ab2e7 Add source dev parsing for vdpa interfaces
This change extends the guest xml parsing such that
the source device path can be extracted from interface
elements of type vdpa.

This is required to identify the interface to remove when
detaching a vdpa port from a domain.

This change fixes a latent bug in the libvirt fixture
related to the domain xml generation for vdpa interfaces.

Change-Id: I5f41170e7038f4b872066de4b1ad509113034960
2022-08-22 14:57:21 +01:00
Zuul ddcc286ee1 Merge "enable blocked VDPA move operations" 2022-08-20 15:37:54 +00:00
Zuul 4d130cb9c5 Merge "Unify placement client singleton implementations" 2022-08-19 02:48:33 +00:00
Dan Smith c178d93606 Unify placement client singleton implementations
We have many places where we implement singleton behavior for the
placement client. This unifies them into a single place and
implementation. Not only does this DRY things up, but may cause us
to initialize it fewer times and also allows for emitting a common
set of error messages about expected failures for better
troubleshooting.

Change-Id: Iab8a791f64323f996e1d6e6d5a7e7a7c34eb4fb3
Related-Bug: #1846820
2022-08-18 07:22:37 -07:00
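
The idea of a single shared accessor, as described in c178d93606, can be
sketched as follows. `PlacementClient` here is a hypothetical stand-in
class and `get_placement_client` is an assumed name; the real client and
its error reporting are more involved.

```python
import threading


class PlacementClient:
    """Hypothetical stand-in for an expensive-to-build client."""

    instances_created = 0

    def __init__(self):
        type(self).instances_created += 1


_CLIENT = None
_CLIENT_LOCK = threading.Lock()


def get_placement_client():
    """Return the process-wide client, creating it on first use."""
    global _CLIENT
    with _CLIENT_LOCK:
        if _CLIENT is None:
            # One place to construct the client also means one place
            # to report construction failures consistently.
            _CLIENT = PlacementClient()
        return _CLIENT
```

Every caller goes through the same function, so the client is built at
most once per process.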
Zuul 3af84811c8 Merge "compute: Update bdms with ephemeral encryption details when requested" 2022-08-18 12:31:45 +00:00
Sean Mooney 95f96ed3aa enable blocked VDPA move operations
This change adds functional tests for operations on servers with VDPA
devices that are expected to work but are currently blocked due to lack
of testing or qemu bugs.

cold-migrate, resize, evacuate, and shelve are enabled
and tested by this patch.

Closes-Bug: #1970467
Change-Id: I6e220cf3231670d156632e075fcf7701df744773
2022-08-16 14:04:19 +01:00
Balazs Gibizer a93092e0d5 Update RequestSpec.pci_request for resize
Nova uses the RequestSpec.pci_request in the PciPassthroughFilter to
decide if the PCI devices, requested via the pci_alias in the flavor
extra_spec, are available on a potential target host. During resize the
new flavor might contain a different pci_alias request than the old
flavor of the instance. In this case Nova should use the pci_alias from
the new flavor to schedule the destination host of the resize. However,
this logic was missing and Nova used the old pci_request value based on
the old flavor. This patch adds the missing logic.

Closes-Bug: #1983753
Closes-Bug: #1941005
Change-Id: I73c9ae27e9c42ee211a53bed3d849650b65f08be
2022-08-10 17:08:34 +02:00
Zuul 7f5279edc9 Merge "For evacuation, ignore if task_state is not None" 2022-08-04 14:02:38 +00:00
Amit Uniyal db919aa15f For evacuation, ignore if task_state is not None
Ignore the instance task state and continue with the VM evacuation.

Closes-Bug: #1978983
Change-Id: I5540df6c7497956219c06cff6f15b51c2c8bc29d
2022-08-03 04:52:10 +00:00
Lee Yarwood 2f97ca2cdc compute: Update bdms with ephemeral encryption details when requested
This change starts the process of wiring up the new ephemeral encryption
control mechanisms in the compute layer. The initial step is to
ensure the BlockDeviceMapping objects are correctly updated with the
required ephemeral encryption details when requested through the
instance flavor extra specs or image metadata properties.

Change-Id: Id49cb238f7bbf2b97f018ddbe090ebdc08d762dc
2022-08-02 21:25:47 +00:00
Sylvain Bauza a755e5d9f2 api: Drop generating a keypair and add special chars to naming
As agreed in the spec, we drop support for generating a keypair and
also accept @ (at) and . (dot) chars in the keyname, both in
the same API microversion.

Rebased the work from I5de15935e83823afa545a250cf84f6a7a37036b4

APIImpact

Implements: blueprint keypair-generation-removal
Co-Authored-By: Nicolas Parquet <nicolas.parquet@gandi.net>

Change-Id: I6a7c71fb4385348c87067543d0454f302907395e
2022-07-28 11:05:50 +02:00
René Ribaud 09239fc2ea Allow unshelve to a specific host (REST API part)
This adds support to the REST API, in a new microversion, for specifying
a destination host for the unshelve server action when the server
is shelved offloaded.
This patch also supports the ability to unpin the availability_zone of an
instance that is bound to it.

Note that the functional test changes are due to those tests using the
"latest" microversion 2.91.

Implements: blueprint unshelve-to-host
Change-Id: I9e95428c208582741e6cd99bd3260d6742fcc6b7
2022-07-22 10:22:34 +02:00
René Ribaud a263fa46f8 Allow unshelve to a specific host (Compute API part)
This patch introduces changes to the compute API that will allow
PROJECT_ADMIN to unshelve a shelved offloaded server to a specific host.
This patch also supports the ability to unpin the availability_zone of an
instance that is bound to it.

Implements: blueprint unshelve-to-host
Change-Id: Ieb4766fdd88c469574fad823e05fe401537cdc30
2022-07-22 10:22:24 +02:00
melanie witt 1d4dbfd468 Log the exception returned from a cell during API.get()
When getting an instance using the compute.API we call
scatter_gather_single_cell() to be able to capture details when we fail
to retrieve a result from a cell such as timeouts and exceptions.

Currently however, we aren't logging the content of an exception if
scatter_gather_single_cell() returns an exception as the result. The
scatter gather method itself logs exceptions that are not of type
NovaException, as these represent definite unexpected errors such as
database errors, but handling of NovaException is left to the caller,
which decides whether to log it, re-raise it, and so on.

It can be difficult to debug a situation where a cell is returning a
NovaException result so this adds logging of the exception content in
the compute API when we encounter an unexpected NovaException.

The existing log message has been updated to more accurately reflect
what has happened (did not respond vs exception). The assignment of the
exception object in scatter gather has also been updated to not
unnecessarily construct a new exception object because it (a) wasn't
necessary and (b) made asserting the LOG.exception() call argument in
the unit test difficult.

Related-Bug: #1970087

Change-Id: Iae1c61c72be5b6017b934293e3dc079a24eeb0e7
2022-05-03 02:03:26 +00:00
John Garbutt 140b3b81f9 Enforce resource limits using oslo.limit
We now enforce limits on resources requested in the flavor.
This includes: instances, ram, cores. It also works for any resource
class being requested via the flavor chosen, such as custom resource
classes relating to Ironic resources.

Note because disk resources can be limited, we need to know if the
instance is boot from volume or not. This has meant adding extra code to
make sure we know that when enforcing the limits.

Follow on patches will update the APIs to accurately report the limits
being applied to instances, ram and cores.

blueprint unified-limits-nova

Change-Id: If1df93400dcbcb1d3aac0ade80ae5ecf6ce38d11
2022-02-24 16:21:03 +00:00
John Garbutt 4207493829 Enforce api and db limits
When using unified limits, we add enforcement of those limits on all
related API calls. Note: we do not yet correctly report the configured
limits to users via the quota APIs, that is in a future patch.

Note the unified limits calls are made alongside the existing legacy
quota calls. The old quota calls will be handled by the quota engine
driver, which is basically a no-op. This is to make it easier to remove
the legacy code paths in the future.

Note, over quota exceptions raised with unified limits use the same
standard (improved) exception message as those raised by oslo.limit.
They do, however, use the existing exception code to ease integration.
The user of the API will see the same return codes, no matter which
code is enabled to enforce the limits.

Finally, this also adds test coverage where it was missing. Coverage
for "quota recheck" behavior in KeypairAPI is added where all other
KeypairAPI testing is located. Duplicate coverage is removed from
nova/api/openstack/compute/test_keypairs.py at the same time.

blueprint unified-limits-nova

Change-Id: I36e82a17579158063396d7e55b495ccff4959ceb
2022-02-24 16:21:02 +00:00
Zuul b5029890c1 Merge "Move 'hw:pmu', 'hw_pmu' parsing to nova.virt.hardware" 2022-02-15 21:41:35 +00:00
Zuul 3a14c1a427 Merge "Gracefull recovery when attaching volume fails" 2022-02-14 12:37:58 +00:00
Zuul 232f8275ec Merge "Join quota exception family trees" 2022-02-10 19:43:54 +00:00
Felix Huettner 9eb116b99c Gracefull recovery when attaching volume fails
When trying to attach a volume to an already running instance the nova-api
requests the nova-compute service to create a BlockDeviceMapping. If the
nova-api does not receive a response within `rpc_response_timeout` it will
treat the request as failed and raise an exception.

There are multiple cases where nova-compute actually already processed the
request and just the reply did not reach the nova-api in time (see bug report).
After the failed request the database will contain a BlockDeviceMapping entry
for the volume + instance combination that will never be cleaned up again.
This entry also causes the nova-api to reject all future attachments of this
volume to this instance (as it assumes it is already attached).

To work around this we check if a BlockDeviceMapping has already been created
when we see a messaging timeout. If this is the case we can safely delete it
as the compute node has already finished processing and we will no longer pick
it up.
This allows users to try the request again.

A previous fix was abandoned but without a clear reason ([1]).

[1]: https://review.opendev.org/c/openstack/nova/+/731804

Closes-Bug: 1960401
Change-Id: I17f4d7d2cb129c4ec1479cc4e5d723da75d3a527
2022-02-09 14:02:31 +01:00
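
The recovery path described in 9eb116b99c can be modelled in a few lines.
Everything here is a simplified assumption: `MessagingTimeout` is a local
stand-in exception, the dict `bdm_table` stands in for the database, and
`reserve_on_compute` is a hypothetical callable representing the RPC call.

```python
class MessagingTimeout(Exception):
    """Local stand-in for an RPC reply timeout."""


def attach_volume(bdm_table, instance_id, volume_id, reserve_on_compute):
    """Ask compute to create a mapping; clean up if the reply times out."""
    try:
        reserve_on_compute(instance_id, volume_id)
    except MessagingTimeout:
        # Compute may have created the mapping even though the reply was
        # lost. Remove any leftover entry so a retry is not rejected as
        # "already attached", then surface the original failure.
        bdm_table.pop((instance_id, volume_id), None)
        raise
```

The caller still sees the timeout, but a second attempt starts from a
clean state.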
Dmitrii Shcherbakov 0620678344 [yoga] Add support for VNIC_REMOTE_MANAGED
Allow instances to be created with VNIC_TYPE_REMOTE_MANAGED ports.
Those ports are assumed to require remote-managed PCI devices which
means that operators need to tag those as "remote_managed" in the PCI
whitelist if this is the case (there is no meta information or standard
means of querying this information).

The following changes are introduced:

* Handling for VNIC_TYPE_REMOTE_MANAGED ports during allocation of
  resources for instance creation (remote_managed == true in
  InstancePciRequests);

* Usage of the noop os-vif plugin for VNIC_TYPE_REMOTE_MANAGED ports
  in order to avoid the invocation of the local representor plugging
  logic since a networking backend is responsible for that in this
  case;

* Expectation of bind time events for ports of VNIC_TYPE_REMOTE_MANAGED.
  Events for those arrive early from Neutron after a port update (before
  Nova begins to wait in the virt driver code); therefore, Nova is set
  to avoid waiting for plug events for VNIC_TYPE_REMOTE_MANAGED ports;

* Making sure the service version is high enough on all compute services
  before creating instances with ports that have VNIC type
  VNIC_TYPE_REMOTE_MANAGED. Network requests are examined for the presence
  of port ids to determine the VNIC type via Neutron API. If
  remote-managed ports are requested, a compute service version check
  is performed across all cells.

Change-Id: Ica09376951d49bc60ce6e33147477e4fa38b9482
Implements: blueprint integration-with-off-path-network-backends
2022-02-09 01:23:27 +03:00
Dan Smith 72058b7a40 Join quota exception family trees
For some reason, we have two lineages of quota-related exceptions in
Nova. We have QuotaError (which sounds like an actual error), from
which all of our case-specific "over quota" exceptions inherit, such
as KeypairLimitExceeded, etc. In contrast, we have OverQuota which
lives outside that hierarchy and is unrelated. In a number of places,
we raise one and translate to the other, or raise the generic
QuotaError to signal an overquota situation, instead of OverQuota.
This leads to places where we have to catch both, signaling the same
over quota situation, but looking like there could be two different
causes (i.e. an error and being over quota).

This joins the two cases, by putting OverQuota at the top of the
hierarchy of specific exceptions and removing QuotaError. The latter
was only used in a few situations, so this isn't actually much change.
Cleaning this up will help with the unified limits work, reducing the
number of potential exceptions that mean the same thing.

Related to blueprint bp/unified-limits-nova

Change-Id: I17a3e20b8be98f9fb1a04b91fcf1237d67165871
2022-02-08 07:52:01 -08:00
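
The resulting hierarchy from 72058b7a40 can be pictured with a minimal
sketch. `OverQuota` and `KeypairLimitExceeded` are named in the commit
message; `create_keypair` and its parameters are hypothetical and exist
only to show a caller catching the single base class.

```python
class OverQuota(Exception):
    """Base class for every "over quota" condition."""


class KeypairLimitExceeded(OverQuota):
    """Case-specific exception that now inherits from OverQuota."""


def create_keypair(existing_count, limit):
    # Hypothetical operation that enforces a per-user keypair limit.
    if existing_count >= limit:
        raise KeypairLimitExceeded("maximum number of key pairs exceeded")
    return existing_count + 1


def try_create(existing_count, limit):
    """Callers only need one except clause for all quota failures."""
    try:
        return create_keypair(existing_count, limit)
    except OverQuota as exc:
        return f"rejected: {exc}"
```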
Stephen Finucane eacecc2433 Move 'hw:pmu', 'hw_pmu' parsing to nova.virt.hardware
Virtually all of the code for parsing 'hw:'-prefixed extra specs and
'hw_'-prefix image metadata properties lives in the 'nova.virt.hardware'
module. It makes sense for these to be included there. Do that.

Change-Id: I1fabdf1827af597f9e5fdb40d5aef244024dd015
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2022-02-01 17:57:27 +00:00
Stephen Finucane f42fb1241b Add 'hw:vif_multiqueue_enabled' flavor extra spec
This mirrors the 'hw_vif_multiqueue_enabled' image metadata property.
Providing a way to set this via flavor extra specs allows admins to
enable this by default and easily enable it for existing instances
without the need to rebuild (a destructive operation).

Note that, in theory at least, the image import workflow provided by
glance should allow admins to enable this by default, but the legacy
image create workflow does not allow this and admins cannot really
control which API end users use when uploading their own images.

Also note that we could provide this behavior using a host-level
configuration option. This would be similar to what we do for other
attributes such as machine type ('hw_machine_type' image meta prop or
'[libvirt] hw_machine_type' config option) or pointer model
('hw_pointer_model' image meta prop or '[compute] pointer_model' config
option) and would be well suited to things that we don't expect to
change, such as enabling multiqueue (it's a sensible default). However,
we would need to start storing this information in system_metadata, like
we do for machine type (since Wallaby) to prevent things changing over
live migration. We have also started avoiding host-level config options
for things like this since one must ensure that the configured values
are consistent across the deployment to avoid behavior that varies
depending on the host the guest is initially created on.

Change-Id: I405d0324abe32b31a434105cf2c104876fe9c127
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-11-16 19:12:49 +00:00
Zuul ff4b396abf Merge "Avoid unbound instance_uuid var during delete" 2021-11-03 19:51:32 +00:00
Balazs Gibizer 14e43f385e Avoid unbound instance_uuid var during delete
The patch I03cf285ad83e09d88cdb702a88dfed53c01610f8 fixed most of the
possible cases for this to happen but missed one. An early enough
exception during _delete() can mean that the instance_uuid never gets
defined, but then we try to use it during the finally block. This patch
moves the saving of the instance_uuid to the top of the try block to
avoid the issue.

Change-Id: Ib3073d7f595c8927532b7c49fc7e5ffe80d508b9
Closes-Bug: #1940812
Related-Bug: #1914777
2021-10-20 09:48:07 +00:00
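
The failure mode fixed in 14e43f385e is a general Python pitfall and can be
reproduced in isolation. The functions below are hypothetical
simplifications, not Nova's `_delete()`; `do_delete` is a callable that may
raise and `cleaned` is a list that records what the finally block saw.

```python
def delete_buggy(instance, do_delete, cleaned):
    try:
        do_delete(instance)                # may raise early...
        instance_uuid = instance["uuid"]   # ...so this is never bound
    finally:
        # Raises UnboundLocalError if do_delete() failed above, which
        # hides the original exception from the caller.
        cleaned.append(instance_uuid)


def delete_fixed(instance, do_delete, cleaned):
    try:
        # Save the uuid first, at the top of the try block.
        instance_uuid = instance["uuid"]
        do_delete(instance)
    finally:
        cleaned.append(instance_uuid)
```

With the assignment moved up, the finally block can always use the saved
value and the original error propagates unchanged.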
Balazs Gibizer 49b481ec98 Query ports with admin client to get resource_request
The port.resource_request field is admin only. Nova depends on the
value of this field to do a proper scheduling and resource allocation
and deallocation for ports with resource request as well as to update
the port.binding:profile.allocation field with the resource providers
the requested resources are fulfilled from. However in some cases nova
does not use a neutron admin client / elevated context to read the
port. In this case neutron returns None for the port.resource_request
field and nova thinks that the port has no resource request.

This patch fixes all three places where previous testing showed that
context elevation was missing.

Change-Id: Icb35e20179572fb713a397b4605312cf3294b41b
Closes-Bug: #1945310
2021-10-20 11:39:23 +02:00
Balazs Gibizer 44309c419f Support interface attach / detach with new resource request format
The interface attach and detach logic is now fully adapted to the new
extended resource request format, and supports more than one request
group in a single port.

blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I73e6acf5adfffa9203efa3374671ec18f4ea79eb
2021-09-01 15:51:47 +02:00
Zuul e81211318a Merge "Support move ops with extended resource request" 2021-08-31 21:38:24 +00:00
Zuul 9abcb3825a Merge "Support boot with extended resource request" 2021-08-31 21:38:15 +00:00
Balazs Gibizer 191bdf2069 Support move ops with extended resource request
Nova re-generates the resource request of an instance for each server
move operation (migrate, resize, evacuate, live-migrate, unshelve) to
find (or validate) a target host for the instance move. This patch
extends this logic to support the extended resource request from
neutron.

As the changes in the neutron interface code are called from the
nova-compute service during the port binding, the compute service
version is bumped. A check is also added to the compute-api to reject
the move operations with ports having extended resource request if
there are old computes in the cluster.

blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: Ibcf703e254e720b9a6de17527325758676628d48
2021-08-27 17:59:18 +02:00
Balazs Gibizer c3886c3ca7 Support boot with extended resource request
This adds the final missing pieces to support creating servers with
ports having extended resource request. As the changes in the neutron
interface code are called from the nova-compute service during the port
binding, the compute service version is bumped. A check is also added to
the compute-api to reject such server create requests if there are old
computes in the cluster.

Note that some of the negative and SRIOV related interface attach
tests also start to pass as they are not dependent on any of the
interface attach specific implementation. Interface attach is still
broken here, as the failing positive tests show.

blueprint: qos-minimum-guaranteed-packet-rate

Change-Id: I9060cc9cb9e0d5de641ade78c5fd7e1cc77ade46
2021-08-27 15:51:12 +02:00
Balazs Gibizer 94f47471e0 Transfer RequestLevelParams from ports to scheduling
The new format of the resource_request field of the Neutron port allows
expressing not just request groups but also request global parameters
for the allocation candidate query. This patch adapts the neutron client
in nova to parse such parameters. It then transfers this information to
the scheduler to include it in the allocation candidate request.

It relies on previous patches that already extended the
RequestLevelParams ovo and the allocation candidate query generation.

Change-Id: Icb91f6429050a161f577d0ed94d4cd906d3da461
blueprint: qos-minimum-guaranteed-packet-rate
2021-08-22 11:45:17 +02:00