Commit Graph

321 Commits

Author SHA1 Message Date
melanie witt 6f79d6321e Enforce quota usage from placement when unshelving
When [quota]count_usage_from_placement = true or
[quota]driver = nova.quota.UnifiedLimitsDriver, cores and ram quota
usage are counted from placement. When an instance is SHELVED_OFFLOADED,
it will not have allocations in placement, so its cores and ram should
not count against quota during that time.

This means however that when an instance is unshelved, there is a
possibility of going over quota if the cores and ram it needs were
allocated by some other instance(s) while it was SHELVED_OFFLOADED.

This fixes a bug where quota was not being properly enforced during
unshelve of a SHELVED_OFFLOADED instance when quota usage is counted
from placement. Test coverage is also added for the "recheck" quota
cases.
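
The check described above can be sketched as follows. This is only an illustration under stated assumptions: `Usage`, `OverQuota` and `check_unshelve_quota` are hypothetical names, not nova's quota code, and the usage figures stand in for what would be counted from placement.

```python
from dataclasses import dataclass


class OverQuota(Exception):
    """Hypothetical error: unshelving would exceed the project's limits."""


@dataclass
class Usage:
    cores: int
    ram_mb: int


def check_unshelve_quota(current: Usage, limit: Usage, flavor: Usage) -> None:
    # A SHELVED_OFFLOADED instance holds no allocations in placement, so its
    # flavor is absent from `current` and must be added back before comparing.
    needed = Usage(current.cores + flavor.cores, current.ram_mb + flavor.ram_mb)
    over = [
        name for name in ("cores", "ram_mb")
        if getattr(needed, name) > getattr(limit, name)
    ]
    if over:
        raise OverQuota("unshelve would exceed quota for: " + ", ".join(over))
```

The point of the sketch is that the comparison happens at unshelve time, because other instances may have consumed the cores and ram while this one was offloaded.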

Closes-Bug: #2003991

Change-Id: I4ab97626c10052c7af9934a80ff8db9ddab82738
2023-05-23 01:02:05 +00:00
Sahid Orentino Ferdjaoui 8c2e765989 compute: enhance compute evacuate instance to support target state
Related to the bp/allowing-target-state-for-evacuate. This change
extends the compute API to accept a new argument, targetState.

When set, the targetState argument forces the state of an evacuated
instance on the destination host.

Signed-off-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@industrialdiscipline.com>
Change-Id: I9660d42937ad62d647afc6be965f166cc5631392
2023-01-31 11:29:01 +01:00
whoami-rajat 6919db5612 Add conductor RPC interface for rebuild
This patch adds support for passing the ``reimage_boot_volume``
flag from the API layer through the conductor layer to the
compute layer, and also includes the necessary RPC version bump.

Related blueprint volume-backed-server-rebuild

Change-Id: I8daf177eb67d08112a16fe788910644abf338fa6
2022-08-31 16:38:50 +05:30
Rajat Dhasmana 30aab9c234 Add support for volume backed server rebuild
This patch adds the plumbing for rebuilding a volume backed
instance in compute code. This functionality will be enabled
in a subsequent patch which adds a new microversion and the
external support for requesting it.

The flow of the operation is as follows:

1) Create an empty attachment
2) Detach the volume
3) Request cinder to reimage the volume
4) Wait for cinder to notify success to nova (via external events)
5) Update and complete the attachment

Related blueprint volume-backed-server-rebuild

Change-Id: I0d889691de1af6875603a9f0f174590229e7be18
2022-08-31 16:38:37 +05:30
Dan Smith 232684b440 Avoid n-cond startup abort for keystone failures
Conductor creates a placement client for the potential case where
it needs to make a call for certain operations. A transient network
or keystone failure will currently cause it to abort startup, which
means it is not available for other unrelated activities, such as
DB proxying for compute.

This makes conductor test the placement client on startup, but only
abort startup on errors that are highly likely to be permanent
configuration errors, and only warn about things like being unable
to contact keystone/placement during initialization. If a non-fatal
error is encountered at startup, later operations needing the
placement client will retry initialization.
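
The startup behaviour described above can be sketched roughly as below. All names here (`LazyPlacementClient`, `FatalConfigError`, `TransientError`, the `factory` callable) are assumptions made for the example and are not the conductor's real classes.

```python
import logging

LOG = logging.getLogger(__name__)


class FatalConfigError(Exception):
    """Stand-in for errors that are almost certainly permanent."""


class TransientError(Exception):
    """Stand-in for keystone/placement being temporarily unreachable."""


class LazyPlacementClient:
    def __init__(self, factory):
        self._factory = factory  # callable that builds the real client
        self._client = None

    def try_init(self):
        """Called at startup: abort only on likely-permanent errors."""
        try:
            self._client = self._factory()
        except FatalConfigError:
            raise  # misconfiguration, so startup should abort
        except TransientError as exc:
            LOG.warning("placement unavailable at startup: %s", exc)

    def get(self):
        """Called by later operations: retry initialization if needed."""
        if self._client is None:
            self._client = self._factory()
        return self._client
```

With this shape, a transient failure only costs a warning at startup, and the first operation that really needs placement pays for the retry.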

Closes-Bug: #1846820
Change-Id: Idb7fcbce0c9562e7b9bd3e80f2a6d4b9bc286830
2022-08-18 07:37:42 -07:00
Stephen Finucane 89ef050b8c Use unittest.mock instead of third party mock
Now that we no longer support py27, we can use the standard library
unittest.mock module instead of the third party mock lib. Most of this
is autogenerated, as described below, but there is one manual change
necessary:

nova/tests/functional/regressions/test_bug_1781286.py
  We need to avoid using 'fixtures.MockPatch' since fixtures is using
  'mock' (the library) under the hood and a call to 'mock.patch.stop'
  found in that test will now "stop" mocks from the wrong library. We
  have discussed making this configurable but the option proposed isn't
  that pretty [1] so this is better.

The remainder was auto-generated with the following (hacky) script, with
one or two manual tweaks after the fact:

  import glob

  for path in glob.glob('nova/tests/**/*.py', recursive=True):
      with open(path) as fh:
          lines = fh.readlines()
      if 'import mock\n' not in lines:
          continue
      import_group_found = False
      create_first_party_group = False
      for num, line in enumerate(lines):
          line = line.strip()
          if line.startswith('import ') or line.startswith('from '):
              tokens = line.split()
              for lib in (
                  'ddt', 'six', 'webob', 'fixtures', 'testtools',
                  'neutron', 'cinder', 'ironic', 'keystone', 'oslo',
              ):
                  if lib in tokens[1]:
                      create_first_party_group = True
                      break
              if create_first_party_group:
                  break
              import_group_found = True
          if not import_group_found:
              continue
          if line.startswith('import ') or line.startswith('from '):
              tokens = line.split()
              if tokens[1] > 'unittest':
                  break
              elif tokens[1] == 'unittest' and (
                  len(tokens) == 2 or tokens[3] > 'mock'
              ):
                  break
          elif not line:
              break
      if create_first_party_group:
          lines.insert(num, 'from unittest import mock\n\n')
      else:
          lines.insert(num, 'from unittest import mock\n')
      del lines[lines.index('import mock\n')]
      with open(path, 'w+') as fh:
          fh.writelines(lines)

Note that we cannot remove mock from our requirements files yet due to
importing pypowervm unit test code in nova unit tests. This library
still uses the mock lib, and since we are importing test code and that
lib (correctly) only declares mock in its test-requirements.txt, mock
would not otherwise be installed and would cause errors while loading
nova unit test code.
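
For illustration, a test written against the standard library module might look like the sketch below. The patched target (`os.getcwd`) and the helper `current_dir_name` are chosen only for this example and do not come from the nova tree.

```python
import os
from unittest import mock  # replaces the third party 'import mock'


def current_dir_name():
    return os.path.basename(os.getcwd())


# Patch the attribute for the duration of the block only.
with mock.patch("os.getcwd", return_value="/tmp/nova") as fake_getcwd:
    name = current_dir_name()

fake_getcwd.assert_called_once_with()
```

The patching API is the same as in the third party library, which is why the bulk of the change could be generated by rewriting imports.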

[1] https://github.com/testing-cabal/fixtures/pull/49

Change-Id: Id5b04cf2f6ca24af8e366d23f15cf0e5cac8e1cc
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2022-08-01 17:46:26 +02:00
Rajesh Tailor 2521810e55 Fix typos
This change fixes some of the typos in unit tests as well
as in nova code-base.

Change-Id: I209bbb270baf889fcb2b9a4d1ce0ab4a962d0d0e
2022-05-30 17:40:00 +05:30
John Garbutt 140b3b81f9 Enforce resource limits using oslo.limit
We now enforce limits on resources requested in the flavor.
This includes: instances, ram, cores. It also works for any resource
class being requested via the flavor chosen, such as custom resource
classes relating to Ironic resources.

Note that because disk resources can be limited, we need to know whether
the instance is boot from volume. This has meant adding extra code to
make sure we know that when enforcing the limits.

Follow on patches will update the APIs to accurately report the limits
being applied to instances, ram and cores.

blueprint unified-limits-nova

Change-Id: If1df93400dcbcb1d3aac0ade80ae5ecf6ce38d11
2022-02-24 16:21:03 +00:00
Sean Mooney f3d48000b1 Add autopep8 to tox and pre-commit
autopep8 is a code formatting tool that makes python code pep8
compliant without changing everything. Unlike black it will
not radically change all code and the primary change to the
existing codebase is adding a new line after class level doc strings.

This change adds a new tox autopep8 env to manually run it on your
code before you submit a patch, it also adds autopep8 to pre-commit
so if you use pre-commit it will do it for you automatically.

This change runs autopep8 in diff mode with --exit-code in the pep8
tox env so it will fail if autopep8 would modify your code if run
in in-place mode. This allows us to gate on autopep8 not modifying
patches that are submitted. This will ensure authorship of patches is
maintained.

The intent of this change is to save the large amount of time we spend
on ensuring style guidelines are followed automatically to make it
simpler for both new and old contributors to work on nova and save
time and effort for all involved.

Change-Id: Idd618d634cc70ae8d58fab32f322e75bfabefb9d
2021-11-08 12:37:27 +00:00
Balazs Gibizer 191bdf2069 Support move ops with extended resource request
Nova re-generates the resource request of an instance for each server
move operation (migrate, resize, evacuate, live-migrate, unshelve) to
find (or validate) a target host for the instance move. This patch
extends this logic to support the extended resource request from
neutron.

As the changes in the neutron interface code are called from the
nova-compute service during port binding, the compute service version
is bumped.
And a check is added to the compute-api to reject the move operations
with ports having extended resource request if there are old computes
in the cluster.

blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: Ibcf703e254e720b9a6de17527325758676628d48
2021-08-27 17:59:18 +02:00
Zuul 745c8ee274 Merge "smartnic support - functional tests" 2021-08-19 18:13:30 +00:00
Zuul 529e883ac0 Merge "smartnic support - reject server move and suspend" 2021-08-19 18:13:22 +00:00
Zuul 1c348ac7b8 Merge "smartnic support - cleanup arqs" 2021-08-19 18:02:52 +00:00
Zuul c7dd853945 Merge "smartnic support - create arqs" 2021-08-19 10:04:28 +00:00
Stephen Finucane 43b253cd60 db: Post reshuffle cleanup
Introduce a new 'nova.db.api.api' module to hold API database-specific
helpers, plus a generic 'nova.db.utils' module to hold code suitable for
both main and API databases. This highlights a level of complexity
around connection management that is present for the main database but
not for the API database. This is because we need to handle the
complexity of cells for the former but not the latter.

Change-Id: Ia5304c552ce552ae3c5223a2bfb3a9cd543ec57c
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-08-09 15:34:40 +01:00
Stephen Finucane bf8b5fc7d0 db: Move remaining 'nova.db.sqlalchemy' modules
The two remaining modules, 'api_models' and 'api_migrations', are
moved to the new 'nova.db.api' module.

Change-Id: I138670fe36b07546db5518f78c657197780c5040
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-08-09 15:34:40 +01:00
Stephen Finucane 100b9dc62c db: Unify 'nova.db.api', 'nova.db.sqlalchemy.api'
Merge these, removing an unnecessary layer of abstraction, and place
them in the new 'nova.db.main' directory. The resulting change is huge,
but it's mainly the result of 's/sqlalchemy import api/main import api/'
and 's/nova.db.api/nova.db.main.api/' with some necessary cleanup. We
also need to rework how we do the blocking of API calls since we no
longer have a 'DBAPI' object that we can monkey patch as we were doing
before. This is now done via a global variable that is set by the 'main'
function of 'nova.cmd.compute'.

The main impact of this change is that it's no longer possible to set
'[database] use_db_reconnect' and have all APIs automatically wrapped in
a DB retry. Seeing as this behavior is experimental, isn't applied to
any of the API DB methods (which don't use oslo.db's 'DBAPI' helper),
and is used explicitly in what would appear to be the critical cases
(via the explicit 'oslo_db.api.wrap_db_retry' decorator), this doesn't
seem like a huge loss.

Change-Id: Iad2e4da4546b80a016e477577d23accb2606a6e4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-08-09 15:34:40 +01:00
Yongli He c3245098e3 smartnic support - functional tests
Functional tests cover:

    - boot/soft reboot/hard reboot
    - delete
    - rebuild
    - un/pause
    - stop/start
    - lock/unlock
    - rescue/unrescue
    - reject: resize/suspend/migrate/shelve/evacuate

Implements: blueprint sriov-smartnic-support
Co-Authored-By: Xinran Wang <xin-ran.wang@intel.com>
Change-Id: I1d25a3a00380cac07547f53e75259ac0711c949c
2021-08-05 15:58:43 +08:00
Yongli He 1f53176d2f smartnic support - reject server move and suspend
Servers with an ARQ in the port do not support move and suspend
operations, so reject these operations at the API stage:

    - resize
    - shelve
    - live_migrate
    - evacuate
    - suspend
    - attach/detach a smartnic port

Reject server create with a smartnic in the port if the minimum compute
service version is less than 57.

Reject server create with a port whose device profile is malformed and
requests multiple devices, like:
  {
      "resources:CUSTOM_ACCELERATOR_FPGA": "2",
      "trait:CUSTOM_INTEL_PAC_ARRIA10": "required",
  }
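
A check of this kind could be sketched as below. It assumes that "multiple devices" means the `resources:` entries of one device profile group add up to more than one; the function name `requests_multiple_devices` is hypothetical.

```python
def requests_multiple_devices(group: dict) -> bool:
    """Return True if a device profile group asks for more than one device."""
    total = 0
    for key, value in group.items():
        # Only 'resources:<CLASS>' keys carry a device count; traits do not.
        if key.startswith("resources:"):
            total += int(value)
    return total > 1


malformed = {
    "resources:CUSTOM_ACCELERATOR_FPGA": "2",
    "trait:CUSTOM_INTEL_PAC_ARRIA10": "required",
}
is_malformed = requests_multiple_devices(malformed)
```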

Implements: blueprint sriov-smartnic-support
Change-Id: Ia705a0341fb067e746a3b91ec4fc6d149bcaffb8
2021-08-05 15:58:41 +08:00
Yongli He e19fa1a199 smartnic support - cleanup arqs
Delete ARQs when:

    - a port is unbound
    - create ops failed and the ARQs did not bind to the instance
    - an ARQ is bound to the instance but not bound to the port

Implements: blueprint sriov-smartnic-support
Change-Id: Idab0ee38750d018de409699a0dbdff106d9e11fb
2021-08-05 15:58:34 +08:00
Yongli He b90c828d70 smartnic support - create arqs
Create ARQs for ports with a device profile:

    - At the API stage, the device profile is used to get scheduling
      information.

    - After scheduling places the instance on a host, the conductor
      creates the ARQ and updates the ARQ binding info in Cyborg.

Implements: blueprint sriov-smartnic-support

Depends-On: https://review.opendev.org/c/openstack/neutron-lib/+/768324
Depends-On: https://review.opendev.org/q/topic:%22bug%252F1906602%22+
Depends-On: https://review.opendev.org/c/openstack/cyborg/+/758942

Change-Id: Idaf92c54df0f39d177d7acaabbfcf254ff5a4d0f
Co-Authored-By: Shaohe Feng <shaohe.feng@intel.com>
Co-Authored-By: Xinran Wang <xin-ran.wang@intel.com>
2021-08-05 15:58:29 +08:00
Zuul 052cf96358 Merge "Remove (almost) all references to 'instance_type'" 2021-06-13 05:57:49 +00:00
Balazs Gibizer b14f6ba62e Use NotificationFixture for legacy notifications too
Change-Id: Ic16c575c8f36e8a3c50b6e302b9fdf961cb3ed22
2021-05-24 11:00:59 +01:00
Balazs Gibizer f1f599d098 Create a fixture around fake_notifier
The fake_notifier uses module globals and also needs careful stub and
reset calls to work properly. This patch wraps the fake_notifier into a
proper Fixture that automates the complexity.

This is a fairly large patch, but it does not change any logic; it just
redirects calls from the fake_notifier to the new NotificationFixture.

Change-Id: I456f685f480b8de71014cf232a8f08c731605ad8
2021-05-24 11:00:59 +01:00
Stephen Finucane 212f89a61e tests: Split external service fixtures out
There's no need to throw these into one giant file.

Change-Id: I8478449d15edb40f98d25d3940343cae9ab2fde8
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-05-13 14:28:33 +01:00
Stephen Finucane c269285568 tests: Move remaining non-libvirt fixtures
Move these to the central place. There's a large amount of test damage
but it's pretty trivial.

Change-Id: If581eb7aa463c9dde13714f34f0f1b41549a7130
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-05-12 16:32:43 +01:00
Stephen Finucane 1bf45c4720 Remove (almost) all references to 'instance_type'
This continues on from I81fec10535034f3a81d46713a6eda813f90561cf and
removes all other references to 'instance_type' where it's possible to
do so. The only things left are DB columns, o.vo fields, some
unversioned objects, and RPC API methods. If we want to remove these, we
can but it's a lot more work.

Change-Id: I264d6df1809d7283415e69a66a9153829b8df537
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
2021-03-29 12:24:15 +01:00
Sylvain Bauza 9e96f64126 Rename ensure_network_metadata to amend requested_networks
As we don't persist (fortunately) the requested networks when booting an
instance, we need a way to implement the value of the RequestSpec field
during any create or move operation so we would know in a later change
which port or network was asked.

Partially-Implements: blueprint routed-networks-scheduling

Change-Id: I0c7e32f6088a8fc1625a0655af824dee2df4a12c
2021-02-03 18:21:34 +01:00
zhangbailin 7fbd787b1b Cyborg shelve/unshelve support
This change extends the conductor manager to append the cyborg
resource request to the request spec when performing an unshelve.

On shelve offload, the instance's ARQ binding info is deleted to free
up the bound ARQs in the Cyborg service. This change also passes the
ARQs to spawn when unshelving an instance.

This change extends the ``shelve_instance``, ``shelve_offload_instance``
and ``unshelve_instance`` rpcapi functions to carry the arq_uuids.

Co-Authored-By: Wenping Song <songwenping@inspur.com>

Implements: blueprint cyborg-shelve-and-unshelve
Change-Id: I258df4d77f6d86df1d867a8fe27360731c21d237
2021-01-15 03:21:17 +00:00
Takashi Natsume 1cf2431f4b Remove six.text_type (2/2)
Replace six.text_type with str.
This patch completes six removal.

Change-Id: I779bd1446dc1f070fa5100ccccda7881fa508d79
Implements: blueprint six-removal
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
2020-12-13 11:26:35 +00:00
Sean Mooney 1356ef5b57 Cyborg evacuate support
This change extends the conductor manager
to append the cyborg resource request to the
request spec when performing an evacuate.

This change passes the ARQs to spawn during rebuild
and evacuate. On evacuate the existing ARQs will be deleted
and new ARQs will be created and bound, during rebuild the
existing ARQs are reused.

This change extends the rebuild_instance compute rpcapi
function to carry the arq_uuids. This eliminates the
need to look up the uuids associated with the arqs assigned
to the instance by querying cyborg.

Co-Authored-By: Wenping Song <songwenping@inspur.com>
Co-Authored-By: Brin Zhang <zhangbailin@inspur.com>

Implements: blueprint cyborg-rebuild-and-evacuate
Change-Id: I147bf4d95e6d86ff1f967a8ce37260730f21d236
2020-09-01 08:41:45 +00:00
Sundar Nadathur d94ea23d3d Delete ARQs by UUID if Cyborg ARQ bind fails.
During the review of the cyborg series it was noted that
in some cases ARQs could be leaked during binding.
See https://review.opendev.org/#/c/673735/46/nova/conductor/manager.py@1632

This change adds a delete_arqs_by_uuid function that can delete
unbound ARQs by instance uuid.

This change modifies build_instances and schedule_and_build_instances
to handle the AcceleratorRequestBindingFailed exception raised when
binding fails and clean up instance arqs.

Co-Authored-By: Wenping Song <songwenping@inspur.com>

Closes-Bug: #1872730
Change-Id: I86c2f00e2368fe02211175e7328b2cd9c0ebf41b
Blueprint: nova-cyborg-interaction
2020-07-23 15:26:07 +08:00
Stephen Finucane 125df26bf9 Use 'Exception.__traceback__' for versioned notifications
The 'inspect.trace()' function is expected to be called within the
context of an exception handler. The 'from_exc_and_traceback' class
method of the 'nova.notification.objects.exception.ExceptionPayload'
class uses this to get information about a provided exception, however,
there are cases where this is called from outside of an exception
handler. In these cases, we see an 'IndexError' since we can't get the
last frame of a non-existent stacktrace. The solution to this is to
fall back to using the traceback embedded in the exception. This is a bit
lossy when decorators are involved but for all other cases this will
give us the same information. This also allows us to avoid passing a
traceback argument to the function since we have it to hand already.
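
The idea can be shown with the standard library alone. In this sketch `last_frame_info` is a hypothetical helper, not the `ExceptionPayload` code; it reads the last frame from the traceback carried by the exception instead of relying on 'inspect.trace()'.

```python
import traceback


def last_frame_info(exc: BaseException):
    """Return (function, filename, lineno) from the exception's own traceback."""
    frames = traceback.extract_tb(exc.__traceback__)
    if not frames:
        return None  # the exception was never raised, so there is no traceback
    last = frames[-1]
    return last.name, last.filename, last.lineno


def fail():
    raise ValueError("boom")


try:
    fail()
except ValueError as exc:
    saved = exc  # keep a reference; 'exc' is unbound after the handler

# Works outside the handler because the traceback travels with the exception.
info = last_frame_info(saved)
```

Because the frames come from `exc.__traceback__`, the caller does not need to be inside an exception handler and does not need to pass a traceback separately.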

Change-Id: I404ca316b1bf2a963106cd34e927934befbd9b12
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Closes-Bug: #1881455
2020-06-08 14:38:33 +01:00
Zuul ccd9cb4e6c Merge "func tests: move _run_periodics() into base class" 2020-04-02 18:39:18 +00:00
Sundar Nadathur c433b1df42 Bump compute rpcapi version and reduce Cyborg calls.
The _get_bound_arq_resources() in the compute manager [1] calls Cyborg
up to 3 times: once to get the accelerator request (ARQ) UUIDs for the
instance, and then once or twice to get all ARQs with completed bindings.

The first call can be eliminated by passing the ARQs from the conductor
to the compute manager as an additional parameter in
build_and_run_instance(). This requires a bump in compute rpcapi version.

[1] https://review.opendev.org/#/c/631244/54/nova/compute/manager.py@2652

Blueprint: nova-cyborg-interaction

Change-Id: I26395d57bd4ba55276b7514baa808f9888639e11
2020-03-31 00:24:00 -07:00
Sundar Nadathur a20aca7f5e Delete ARQs for an instance when the instance is deleted.
This patch series now works for many VM operations with libvirt:
* Creation, deletion of VM instances.
* Pause/unpause

The following works but is a no-op:
* Lock/unlock

Hard reboots are taken up in a later patch in this series.
Soft reboots work for accelerators unless some unrelated failure
forces a hard reboot in the libvirt driver.

Suspend is not supported yet. It would fail with this error:
   libvirtError: Requested operation is not valid:
   domain has assigned non-USB host devices

Shelve is not supported yet.
Live migration is not intended to be supported with accelerators now.

Change-Id: Icb95890d8f16cad1f7dc18487a48def2f7c9aec2
Blueprint: nova-cyborg-interaction
2020-03-24 22:44:18 -07:00
Artom Lifshitz ee05cd8b9e func tests: move _run_periodics() into base class
There are two almost identical implementations of the _run_periodics()
helper - and a third one would have joined them in a subsequent patch,
if not for this patch. This patch moves the _run_periodics() to the
base test class. In addition, _run_periodics() depends on the
self.computes dict used for compute service tracking. The method that
populates that dict, _start_compute(), is therefore also moved to the
base class.

This enables some light refactoring of existing tests that need
either the _run_periodics() helper, or the compute service tracking.

In addition, a needless override of _start_compute() in
test_aggregates that provided no added value is removed. This is done
to avoid any potential confusion around _start_compute()'s role.

Change-Id: I36dd64dc272ea1743995b3b696323a9431666489

Change-Id: I33d8ac0a1cae0b2d275a21287d5e44c008a68122
2020-03-24 10:10:53 -04:00
Sundar Nadathur cc630b4eb6 Create and bind Cyborg ARQs.
* Call Cyborg with device profile name to get ARQs (Accelerator Requests).
  Each ARQ corresponds to a single device profile group, which
  corresponds to a single request group in request spec.
* Match each ARQ to associated request group, and thereby obtain the
  corresponding RP for that ARQ.
* Call Cyborg to bind the ARQ to that host/device-RP.
* When Cyborg sends the ARQ bind notification events, wait for those
  events with a timeout.
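
The last step, waiting for bind notification events with a timeout, could be sketched like this. `ArqBindWaiter` is a hypothetical helper built on `threading.Event`; it is not nova's external event mechanism.

```python
import threading


class ArqBindWaiter:
    """Hypothetical helper: wait until every expected ARQ reports a bind event."""

    def __init__(self, arq_uuids):
        self._pending = set(arq_uuids)
        self._lock = threading.Lock()
        self._done = threading.Event()
        if not self._pending:
            self._done.set()  # nothing to wait for

    def notify_bound(self, arq_uuid):
        """Record a bind event; safe to call from another thread."""
        with self._lock:
            self._pending.discard(arq_uuid)
            if not self._pending:
                self._done.set()

    def wait(self, timeout):
        """Return True if all ARQs were bound before the timeout expired."""
        return self._done.wait(timeout)
```

A caller would create the waiter before asking Cyborg to bind, so that an event arriving early is not missed, and treat a False result from `wait()` as a bind failure.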

Change-Id: I0f8b6bf2b4f4510da6c84fede532533602b6af7f
Blueprint: nova-cyborg-interaction
2020-03-21 12:03:38 -07:00
Balazs Gibizer 94c7e7ad43 Support unshelve with qos ports
This patch adds support for unshelving an offloaded server with qos ports.
To do that this patch:
* collects the port resource requests from neutron before the scheduler
  is called to select the target of the unshelve.
* calculates the request group - provider mapping after the scheduler
  selected the target host
* updates the InstancePCIRequest to drive the pci_claim to allocate VFs
  from the same PF as the bandwidth is allocated from by the scheduler
* updates the binding profile of the qos ports so that the allocation
  key of the binding profile points to the RPs the port is allocated
  from.

As this was the last move operation to be supported the compute service
version is bumped to indicate such support. This will be used in
later patches to implement a global service level check in the API.

Note that unshelve does not have a re-schedule loop and all the RPC
changes were committed in Queens.

Two error cases need special care by rolling back allocations before
putting the instance back to SHELVED_OFFLOADED state:

* if the InstancePCIRequest cannot be updated according to the new target
  host of unshelve
* if updating port binding fails in neutron during unshelve

Change-Id: I678722b3cf295c89110967d5ad8c0c964df4cb42
blueprint: support-move-ops-with-qos-ports-ussuri
2020-03-18 17:24:56 +01:00
Stephen Finucane 5fc3b81fdf Remove 'nova.image.api' module
This doesn't exist for 'nova.volume' and no longer exists for
'nova.network'. There's only one image backend we support, so do like
we've done elsewhere and just use 'nova.image.glance'.

Change-Id: I7ca7d8a92dfbc7c8d0ee2f9e660eabaa7e220e2a
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-02-18 11:45:39 +00:00
Stephen Finucane fadeedcdea nova-net: Remove layer of indirection in 'nova.network'
At some point in the past, there was only nova-network and its code
could be found in 'nova.network'. Neutron was added and eventually found
itself (mostly!) in the 'nova.network.neutronv2' submodule. With
nova-network now gone, we can remove one layer of indirection and move
the code from 'nova.network.neutronv2' back up to 'nova.network',
mirroring what we did with the old nova-volume code way back in 2012
[1]. To ensure people don't get nova-network and 'nova.network'
confused, 'neutron' is retained in filenames.

[1] https://review.opendev.org/#/c/14731/

Change-Id: I329f0fd589a4b2e0426485f09f6782f94275cc07
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
2020-01-15 14:57:49 +00:00
Matt Riedemann 26d695876a Use graceful_exit=True in ComputeTaskManager.revert_snapshot_based_resize
This passes graceful_exit=True to the wrap_instance_event decorator
in ComputeTaskManager.revert_snapshot_based_resize so that upon successful
completion of the RevertResizeTask, when the instance is hard destroyed
from the target cell DB (used to create the action/event), a traceback
is not logged for the InstanceActionNotFound exception.

The same event is also finished in the source cell DB upon successful
completion of the RevertResizeTask. Note that there are other ways we
could have done this, e.g. moving the contents of the _execute() method
to another method and then putting that in an EventReporter context with
the source cell context/instance, but this was simpler.

Part of blueprint cross-cell-resize

Change-Id: Ibb32f7c19f5f2ec4811b165b8df748d1b7b0f9e4
2019-12-23 10:10:57 -05:00
Matt Riedemann 74d18c412f Add revert_snapshot_based_resize conductor RPC method
This adds the conductor ComputeTaskManager method
revert_snapshot_based_resize along with the related conductor
RPC API client method which will be an RPC cast from the API
for a revertResize server action.

Part of blueprint cross-cell-resize

Change-Id: Ia6b6b25238963a5f60349267da6d07cb740982f4
2019-12-12 12:00:33 -05:00
Matt Riedemann 6f74bc1e98 Add confirm_snapshot_based_resize conductor RPC method
This adds the conductor ComputeTaskManager method
confirm_snapshot_based_resize along with the related conductor
RPC API client method which by default will be an RPC cast
from the API for a confirmResize server action but can also
be RPC called in the case of deleting a server in VERIFY_RESIZE
status.

Part of blueprint cross-cell-resize

Change-Id: If4c4b23891bfc340deb18a2f500510a472a869c9
2019-12-12 11:13:52 -05:00
Eric Fried 7daa3f59e2 Use provider mappings from Placement (mostly)
fill_provider_mapping is used from *most* code paths where it's
necessary to associate RequestSpec.request_groups with the resource
providers that are satisfying them. (Specifically, all the code paths
where we have a Selection object available. More about that below.)

Prior to Placement microversion 1.34, the only way to do this mapping
was by reproducing much of the logic from GET /allocation_candidates
locally to reverse engineer the associations. This was incomplete,
imperfect, inefficient, and ugly. That workaround was nested in the call
from fill_provider_mapping to fill_provider_mapping_based_on_allocation.

Placement microversion 1.34 enhanced GET /allocation_candidates to
return these mappings [1], and Nova started using 1.34 as of [2], so
this commit makes fill_provider_mapping bypass
fill_provider_mapping_based_on_allocation completely.

We would love to get rid of the entire hack, but
fill_provider_mapping_based_on_allocation is still used from
finish_revert_resize to restore port bindings on a reverted migration.
And when reverting a migration, we don't have allocation candidates with
mappings, only the original source allocations. It is left to a future
patch to figure out how to get around this, conceivably by saving the
original mappings in the migration context.

[1] https://docs.openstack.org/placement/train/specs/train/implemented/placement-resource-provider-request-group-mapping-in-allocation-candidates.html
[2] I52499ff6639c1a5815a8557b22dd33106dcc386b

Related to blueprint: placement-resource-provider-request-group-mapping-in-allocation-candidates
Change-Id: I45e0b2b73f88b86a20bc70ddf4f9bb97c8ea8312
2019-12-06 11:04:55 -06:00
Zuul f1382651dc Merge "Sanity check instance mapping during scheduling" 2019-11-28 23:25:35 +00:00
Zuul 691db5b99b Merge "Restrict RequestSpec to cell when evacuating" 2019-11-14 01:10:33 +00:00
Zuul ee16ae1b39 Merge "Plumb allow_cross_cell_resize into compute API resize()" 2019-11-12 02:42:44 +00:00
Zuul 485d2894d2 Merge "Refresh instance in MigrationTask.execute Exception handler" 2019-11-07 20:54:21 +00:00
Zuul 5d251cf654 Merge "Execute CrossCellMigrationTask from MigrationTask" 2019-11-07 20:54:15 +00:00