openstack/nova - nova - OpenDev: Free Software Needs Free Tools

Commit Graph

Author	SHA1	Message	Date
Balazs Gibizer	48229b46b4	Retry /reshape at provider generation conflict During a normal update_available_resources run, if the local provider tree cache is invalid (i.e. because the scheduler made an allocation, bumping the generation of the RPs) and the virt driver tries to update the inventory of an RP based on the cache, Placement will report a conflict, the report client will invalidate the caches, and the retry decorator on ResourceTracker._update_to_placement will re-drive the update on top of the fresh RP data. However, the same thing can happen during reshape as well, but the retry mechanism is missing in that code path, so the stale caches can cause reshape failures. This patch adds specific error handling in the reshape code path to implement the same retry mechanism that exists for inventory update. blueprint: pci-device-tracking-in-placement Change-Id: Ieb954a04e6aba827611765f7f401124a1fe298f3	2022-08-25 10:00:10 +02:00
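The retry-on-conflict idea described in this commit can be illustrated with a minimal sketch. Everything here is hypothetical and is not nova's implementation: `ProviderGenerationConflict`, `FakePlacement`, `Client`, and `retry_on_conflict` are illustrative stand-ins. The sketch assumes only that a conflict invalidates the cached generation and that the operation is then re-driven.

```python
import functools


class ProviderGenerationConflict(Exception):
    """Hypothetical stand-in for a Placement 409 generation conflict."""


def retry_on_conflict(max_attempts=4):
    """Re-run the wrapped callable when it reports a generation conflict."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except ProviderGenerationConflict:
                    # Give up only after the last allowed attempt.
                    if attempt == max_attempts:
                        raise
        return wrapper
    return decorator


class FakePlacement:
    """In-memory stand-in that tracks a single provider generation."""

    def __init__(self):
        self.generation = 5

    def reshape(self, generation):
        if generation != self.generation:
            raise ProviderGenerationConflict()
        self.generation += 1
        return self.generation


class Client:
    """Hypothetical client holding a (possibly stale) cached generation."""

    def __init__(self, placement):
        self.placement = placement
        self.cached_generation = 3  # stale: something bumped it meanwhile
        self.attempts = 0

    @retry_on_conflict()
    def reshape(self):
        self.attempts += 1
        try:
            return self.placement.reshape(self.cached_generation)
        except ProviderGenerationConflict:
            # Invalidate the cache by refreshing it, then let the
            # decorator re-drive the call on top of fresh data.
            self.cached_generation = self.placement.generation
            raise


client = Client(FakePlacement())
new_generation = client.reshape()
```

In this sketch the first attempt fails because of the stale cache and the second succeeds after the refresh.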
Stephen Finucane	89ef050b8c	Use unittest.mock instead of third party mock Now that we no longer support py27, we can use the standard library unittest.mock module instead of the third party mock lib. Most of this is autogenerated, as described below, but there is one manual change necessary: nova/tests/functional/regressions/test_bug_1781286.py We need to avoid using 'fixtures.MockPatch' since fixtures is using 'mock' (the library) under the hood and a call to 'mock.patch.stop' found in that test will now "stop" mocks from the wrong library. We have discussed making this configurable but the option proposed isn't that pretty [1] so this is better. The remainder was auto-generated with the following (hacky) script, with one or two manual tweaks after the fact:

    import glob

    for path in glob.glob('nova/tests/**/*.py', recursive=True):
        with open(path) as fh:
            lines = fh.readlines()
        if 'import mock\n' not in lines:
            continue
        import_group_found = False
        create_first_party_group = False
        for num, line in enumerate(lines):
            line = line.strip()
            if line.startswith('import ') or line.startswith('from '):
                tokens = line.split()
                for lib in (
                    'ddt', 'six', 'webob', 'fixtures', 'testtools'
                    'neutron', 'cinder', 'ironic', 'keystone', 'oslo',
                ):
                    if lib in tokens[1]:
                        create_first_party_group = True
                        break
                if create_first_party_group:
                    break
                import_group_found = True
            if not import_group_found:
                continue
            if line.startswith('import ') or line.startswith('from '):
                tokens = line.split()
                if tokens[1] > 'unittest':
                    break
                elif tokens[1] == 'unittest' and (
                    len(tokens) == 2 or tokens[4] > 'mock'
                ):
                    break
            elif not line:
                break
        if create_first_party_group:
            lines.insert(num, 'from unittest import mock\n\n')
        else:
            lines.insert(num, 'from unittest import mock\n')
        del lines[lines.index('import mock\n')]
        with open(path, 'w+') as fh:
            fh.writelines(lines)

Note that we cannot remove mock from our requirements files yet due to importing pypowervm unit test code in nova unit tests. This library still uses the mock lib, and since we are importing test code and that lib (correctly) only declares mock in its test-requirements.txt, mock would not otherwise be installed and would cause errors while loading nova unit test code. [1] https://github.com/testing-cabal/fixtures/pull/49 Change-Id: Id5b04cf2f6ca24af8e366d23f15cf0e5cac8e1cc Signed-off-by: Stephen Finucane <stephenfin@redhat.com>	2022-08-01 17:46:26 +02:00
Stephen Finucane	8133092907	Remove use of pkg_resources Use of this library has significant performance implications. While we're probably not too badly affected, we don't actually need to use it here. The 'parse_version' utility it exposes is intended to parse PEP440-compliant version identifiers, not the simple microversions placement uses, which the 'microversion_parse' library can competently parse for us. Change-Id: I9b7281caec6fa53600dea316492d052787cf799b Signed-off-by: Stephen Finucane <stephenfin@redhat.com>	2022-07-14 15:20:55 +01:00
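To illustrate why simple microversions do not need a PEP 440 parser, here is a minimal sketch. `parse_microversion` is a hypothetical helper written for this example and is not the API of the 'microversion_parse' library. It assumes microversions always have the form "MAJOR.MINOR".

```python
def parse_microversion(value):
    """Split a 'MAJOR.MINOR' microversion into a tuple of ints.

    Hypothetical helper for illustration only.
    """
    major, minor = value.split('.')
    return int(major), int(minor)


# Plain string comparison orders versions lexically, which is wrong
# for microversions: '1.36' sorts before '1.9' as a string.
lexical_is_wrong = '1.36' < '1.9'

# Comparing integer tuples gives the intended ordering.
numeric_is_right = parse_microversion('1.36') > parse_microversion('1.9')
```

Integer tuples are enough here because microversions carry no pre-release, post-release, or local segments of the kind PEP 440 defines.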
Zuul	77c8f91a5b	Merge "Bump min placement microversion to 1.36"	2021-08-31 01:35:22 +00:00
Matt Riedemann	c09d98dadb	Add force kwarg to delete_allocation_for_instance This adds a force kwarg to delete_allocation_for_instance which defaults to True because that was found to be the most common use case by a significant margin during implementation of this patch. In most cases, this method is called when we want to delete the allocations because they should be gone, e.g. server delete, failed build, or shelve offload. The alternative in these cases is the caller could trap the conflict error and retry but we might as well just force the delete in that case (it's cleaner). When force=True, it will DELETE the consumer allocations rather than GET and PUT with an empty allocations dict and the consumer generation which can result in a 409 conflict from Placement. For example, bug 1836754 shows that in one tempest test that creates a server and then immediately deletes it, we can hit a very tight window where the method GETs the allocations and before it PUTs the empty allocations to remove them, something changes which results in a conflict and the server delete fails with a 409 error. It's worth noting that delete_allocation_for_instance used to just DELETE the allocations before Stein [1] when we started taking consumer generations into account. There was also a related mailing list thread [2]. Closes-Bug: #1836754 [1] I77f34788dd7ab8fdf60d668a4f76452e03cf9888 [2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133374.html Change-Id: Ife3c7a5a95c5d707983ab33fd2fbfc1cfb72f676	2021-08-30 06:11:25 +00:00
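The difference between the two deletion strategies can be sketched against an in-memory stand-in. `FakePlacement`, `RacyPlacement`, and `Conflict` are hypothetical names for this example. Only the overall shape comes from the commit message: force means a plain DELETE, otherwise GET followed by PUT with the consumer generation. The function body is not nova's code.

```python
class Conflict(Exception):
    """Hypothetical stand-in for a 409 response from Placement."""


class FakePlacement:
    """In-memory store of consumer allocations with generations."""

    def __init__(self):
        self.consumers = {}

    def get(self, consumer):
        record = self.consumers[consumer]
        return {'allocations': dict(record['allocations']),
                'consumer_generation': record['generation']}

    def put(self, consumer, allocations, consumer_generation):
        record = self.consumers[consumer]
        if consumer_generation != record['generation']:
            raise Conflict(consumer)
        record['allocations'] = allocations
        record['generation'] += 1

    def delete(self, consumer):
        self.consumers.pop(consumer, None)


class RacyPlacement(FakePlacement):
    """Simulates another writer changing the consumer right after GET."""

    def get(self, consumer):
        payload = super().get(consumer)
        self.consumers[consumer]['generation'] += 1
        return payload


def delete_allocation_for_instance(placement, consumer, force=True):
    if force:
        # Unconditional DELETE: no generation check, so no conflict.
        placement.delete(consumer)
        return
    # GET then PUT empty allocations with the generation we just read;
    # this can conflict if anything changed in between.
    current = placement.get(consumer)
    placement.put(consumer, {}, current['consumer_generation'])
```

With `RacyPlacement`, the non-forced path raises `Conflict` while the forced path removes the allocations, which mirrors the tight window described for bug 1836754.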
Balazs Gibizer	f6e8c512fb	Bump min placement microversion to 1.36 To implement the usage of the same_subtree query parameter in the allocation candidate request, the minimum required placement microversion first needs to be bumped from 1.35 to 1.36. This patch makes that bump and updates the related nova upgrade check. Later patches will modify the query generation to include the same_subtree param in the request. Change-Id: I5bfec9b9ec49e60c454d71f6fc645038504ef9ef blueprint: qos-minimum-guaranteed-packet-rate	2021-08-21 10:00:51 +02:00
Balazs Gibizer	c3804efd42	Refactor ResourceRequest constructor This refactor changes ResourceRequest __init__ to make only an empty request and moves the ResourceRequest creation from a RequestSpec to a static factory method. This is a preparation to introduce another factory method later that will generate a ResourceRequest from a single ResourceGroup instead of a full RequestSpec. Blueprint: support-interface-attach-with-qos-ports Change-Id: Idd58298a6b01775f962b9bf0a0835f762c8e0ed2	2021-01-18 15:40:42 +01:00
Eric Fried	f2d088b04e	Stop using PlacementDirect PlacementDirect was integrated into a functional test suite when it was first created as a way to prove that it worked [1] and demonstrate how to use it. However, it was a pain then, because the interceptor needs to be created every time you want to use it; and since extracted placement started diverging from in-tree placement, other problems started cropping up (see the associated bug). So this commit removes the use of PlacementDirect from nova. Details: - test_report_client now uses PlacementFixture. So all the `with interceptor` context management is gone. This accounts for the vast majority of the apparent change, which is just outdenting those contexts. - SchedulerReportClientTestBase, which was doing some hocus pocus to wrap the SchedulerReportClient such that we could do some microversion checks, is removed. The test suite simply instantiates the microversion-checking wrapper class directly as the client used by the test cases. - We were taking advantage of a PlacementDirect feature allowing us to default to the latest microversion if not explicitly specified in the request. Without this, we had to add the `version` kwarg to some of the calls we were making to SchedulerReportClient primitives (get/put/post/delete). - A piece of test_update_from_provider_tree was using a deliberately-broken interceptor to prove that the code in question wasn't hitting the API. We replace this with a non-callable mock on the Adapter's request method. - test_global_request_id was taking advantage of the interceptor to validate that the global request ID was making it to the "other side" of the API boundary. This was fun, but overkill. We now simply assert that the correct HTTP header is making it into the ksa Adapter's request method. - Functional test suite test_resource_tracker.IronicResourceTrackerTest was inheriting from the SchedulerReportClientTestBase class, but not using the interceptor anywhere. 
Can't tell you why that was done. So now it just uses the plain old test.TestCase like everyone else. [1] This commit does remove all of nova's testing of PlacementDirect. However, it is still tested in the placement repository itself: `69b9659a45/placement/tests/functional/test_direct.py` Change-Id: Icb889c09a69e7c5cbf9330e5d9917d6ab3ac3dc5 Related-Bug: #1818560	2020-03-05 07:36:37 -06:00
Eric Fried	bcc893a2b0	Use Placement 1.35 (root_required) Placement microversion 1.35 gives us the root_required queryparam to GET /allocation_candidates, allowing us to filter out candidates where the root provider has/lacks certain traits, independent of traits specified in any of the individual request groups. Use it. And add affordance for specifying such traits to the RequestSpec. Which allows us to fix up the couple of request filters that were hacking traits into the RequestSpec.flavor. Change-Id: I44f02044ce178e84c23d178e5a23a3aa1208e502	2020-01-07 16:46:56 -06:00
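As a rough illustration of the root_required query parameter, the sketch below builds a GET /allocation_candidates query string. The helper name `allocation_candidates_query` and the trait names in the usage line are assumptions for the example. It is not how nova assembles the request, and it assumes that forbidden traits are expressed with a leading `!`.

```python
from urllib.parse import urlencode


def allocation_candidates_query(resources, required=(), forbidden=()):
    """Build a query string with an optional root_required filter.

    Hypothetical helper for illustration only.
    """
    params = [
        ('resources', ','.join(
            f'{rc}:{amount}' for rc, amount in sorted(resources.items()))),
    ]
    # Traits the root provider must have, plus '!'-prefixed traits it
    # must lack, independent of any individual request group.
    root_traits = list(required) + ['!' + trait for trait in forbidden]
    if root_traits:
        params.append(('root_required', ','.join(root_traits)))
    return '/allocation_candidates?' + urlencode(params, safe=':,!')


url = allocation_candidates_query(
    {'VCPU': 1, 'MEMORY_MB': 512},
    required=['CUSTOM_FOO'],
    forbidden=['COMPUTE_STATUS_DISABLED'],
)
```

A real request would also need to be sent with microversion 1.35 or later for the parameter to be accepted.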
Eric Fried	54195a1bd9	Use Placement 1.34 (string suffixes & mappings) This commit cuts us over to using placement microversion 1.34 for GET /allocation_candidates, thereby supporting string request group suffixes (added in 1.33) when specified in flavor extra_specs. The mappings (added in 1.34) are not used in code yet, but a future patch will tie the group suffixes to the RequestGroup.requester_id so that it can be correlated after GET /a_c. This will allow us to get rid of map_requested_resources_to_providers, which was a hack to bridge the gap until we had mappings from placement. Change-Id: I52499ff6639c1a5815a8557b22dd33106dcc386b	2019-12-05 17:02:46 -06:00
Matt Riedemann	cea4f391f3	Move compute_node_to_inventory_dict to test-only code Since [1] the only thing still using this utility method is some functional report client test code so this change moves it to the test class that needs it. [1] Ib62ac0b692eb92a2ed364ec9f486ded05def39ad Change-Id: I016765112b4d7a811a855da5e503a8cb870afbbe	2019-11-07 17:34:33 -05:00
Zuul	5345f9acb7	Merge "Remove @safe_connect from _delete_provider"	2019-10-09 22:49:12 +00:00
Chris Dent	5f553f4b1a	Use microversion in put allocations in test_report_client In some places test_report_client uses a raw client.put when writing allocations. This defaults to the latest microversion. This means that if the allocations format changes in the latest microversion in a patch in placement, the nova functional tests fail. In this change we pin the microversion to the minimum one providing the features used in the PUT. Change-Id: Ia0fee6bb931770792b552ae32ef31f0a4cc466ee	2019-09-02 11:40:35 +01:00
Stephen Finucane	7abe83f646	scheduler: Flatten 'ResourceRequest.from_extra_specs', 'from_image_props' The 'ResourceRequest' object sources information from three different attributes of an instance: the instance's image metadata properties, the instance's flavor, this flavor's extra specs. It's possible for a user to override resources requested via the flavor using flavor extra specs (e.g. using the 'resources:VCPU=N' extra spec), and it's possible to override traits requested via the flavor extra specs using image metadata (e.g. using the 'traits_required=foo' metadata property). This means there's an implicit hierarchy present: - Traits: image metadata > flavor extra specs - Resources: flavor extra specs > flavor Previously, we pulled information from the flavor extra specs and image metadata using two classmethods, 'from_extra_specs' and 'from_image_props', but this required a lot of glue code in between to ensure this hierarchy was maintained. Stop doing this, preferring to centralize everything in one location. This results in fewer LoC and a more grokable implementation, and will make things much easier when we start handling 'PCPU's here. Part of blueprint cpu-resources Change-Id: Ic0e6bc47b79711b38b2d4dabaeb5ae1dbaf2b18a Signed-off-by: Stephen Finucane <sfinucan@redhat.com>	2019-08-27 17:00:03 +01:00
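The implicit hierarchy described above (image metadata over flavor extra specs for traits, flavor extra specs over the flavor for resources) can be sketched as a single merge function. This is not the ResourceRequest implementation: `merge_request` is hypothetical, and the `resources:NAME` and `trait:NAME` key formats are assumptions chosen for illustration.

```python
def merge_request(flavor, extra_specs, image_props):
    """Merge the three sources in one place, lowest precedence first.

    Hypothetical helper; key formats are assumptions for illustration.
    """
    # Lowest precedence for resources: the flavor itself.
    resources = {'VCPU': flavor['vcpus'], 'MEMORY_MB': flavor['memory_mb']}
    traits = {}

    # Flavor extra specs override flavor resources and supply traits.
    for key, value in extra_specs.items():
        if key.startswith('resources:'):
            resources[key[len('resources:'):]] = int(value)
        elif key.startswith('trait:'):
            traits[key[len('trait:'):]] = value

    # Image metadata has the final say on traits.
    for key, value in image_props.items():
        if key.startswith('trait:'):
            traits[key[len('trait:'):]] = value

    return resources, traits


resources, traits = merge_request(
    {'vcpus': 4, 'memory_mb': 1024},
    {'resources:VCPU': '2', 'trait:HW_CPU_X86_AVX': 'required'},
    {'trait:HW_CPU_X86_AVX': 'forbidden'},
)
```

Applying the sources in precedence order inside one function is what removes the need for glue code between separate per-source constructors.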
Eric Fried	9981d06b4c	Remove @safe_connect from _delete_provider This commit removes the @safe_connect decorator from SchedulerReportClient._delete_provider and makes its callers deal with resulting keystoneauth1.exceptions.ClientExceptionZ sanely: - ServiceController.delete (via delete_resource_provider) logs an error and continues (backward compatible behavior, best effort to delete each provider). - ComputeManager.update_available_resource (ditto) ditto (ditto). - SchedulerReportClient.update_from_provider_tree raises the exception through, per the contract described in the catch_all internal helper. Change-Id: I8403a841f21a624a546ae5f26bb9ba19318ece6a	2019-07-19 16:41:08 -05:00
Eric Fried	0652a4c7a5	Un-safe_connect and publicize get_providers_in_tree In the continuing saga to wipe @safe_connect from the annals of history, and in preparation for its use outside of SchedulerReportClient, this commit does two things to _get_providers_in_tree: - Removes @safe_connect from it. Callers now need to be aware that they can get ClientExceptionZ from ksa. (The two existing callers were vetted and needed no additional handling - it's way more appropriate for them to raise ClientException than a mysterious NoneType error somewhere down the line as they would have been doing previously.) - Renames it to get_providers_in_tree. Change-Id: I2b284d69d345d15287f04a7ca4cd422155768525	2019-06-27 17:00:24 -05:00
Zuul	42df3eaf1f	Merge "Prepare _heal_allocations_for_instance for nested allocations"	2019-06-27 21:33:43 +00:00
Balazs Gibizer	307999c581	Prepare _heal_allocations_for_instance for nested allocations When no allocations exist for an instance the current heal code uses a report client call that can only handle allocations from a single RP. This call is now replaced with a more generic one so in a later patch port allocations can be added to this code path too. Related-Bug: #1819923 Change-Id: Ide343c1c922dac576b1944827dc24caefab59b74	2019-06-27 10:33:14 +02:00
Stephen Finucane	dc6fc82c14	hacking: Resolve W605 (invalid escape sequence) This one's actually important since it will be an error in future versions of Python. Change-Id: Ib9f735216773224f91ac7f49fbe2eee119670872 Signed-off-by: Stephen Finucane <sfinucan@redhat.com>	2019-06-24 14:24:06 -05:00
Eric Fried	c43f7e664d	Use aggregate_add_host in nova-manage When nova-manage placement sync_aggregates was added [1], it duplicated some report client logic (aggregate_add_host) to do provider aggregate retrieval and update so as not to duplicate a call to retrieve the host's resource provider record. It also left a TODO to handle generation conflicts. Here we change the signature of aggregate_add_host to accept either the host name or RP UUID, and refactor the nova-manage placement sync_aggregates code to use it. The behavior in terms of exit codes and messaging should be largely unchanged, though there may be some subtle differences in corner cases. [1] Iac67b6bf7e46fbac02b9d3cb59efc3c59b9e56c8 Change-Id: Iaa4ddf786ce7d31d2cee660d5196e5e530ec4bd3	2019-03-26 17:38:48 -05:00
Chris Dent	09090c8277	Use a placement conf when testing report client It turns out that the independent wsgi interceptors in test_report_client were using nova's global configuration when creating the intercepts using code from placement. This was working because until [1] placement's set of conf options had not diverged from nova's and nova still has placement_database config settings. This change takes advantage of new functionality in the PlacementFixture to allow the fixture to manage config and database, but _not_ run the interceptor. This means it can set up a config that is later used by the independent interceptors that are used in the report client tests. [1] Ie43a69be8b75250d9deca6a911eda7b722ef8648 Change-Id: I05326e0f917ca1b9a6ef8d3bd463f68bd00e217e Closes-Bug: #1818560 Depends-On: I8c36f35dbe85b0c0db1a5b6b5389b160b68ca488	2019-03-04 20:43:48 +00:00
Zuul	c7f0d160e4	Merge "Use placement.inventory.inuse in report client"	2019-02-25 22:47:34 +00:00
Matt Riedemann	39ec15f58c	Follow up for I0c764e441993e32aafef0b18049a425c3c832a50 This is a follow up for change I0c764e441993e32aafef0b18049a425c3c832a50 to address review comments. The most important part is the early exit from _fill_provider_mapping if request_spec.maps_requested_resources returns False. That is needed to avoid the performance impact of getting allocations and resource provider traits per instance and provider. Since this code is currently only going to be exercised with ports that have resource requests, we want to avoid the extra work for all other server create requests. Part of blueprint bandwidth-resource-provider Change-Id: I90845461b2b98c176c7b3b97dd3f47ed604a9bef	2019-02-22 10:57:11 +01:00
Eric Fried	efa22cd985	Use placement.inventory.inuse in report client Since I9a833aa35d474caa35e640bbad6c436a3b16ac5e we've had the framework for placement to return specific error codes allowing us to differentiate among error conditions for oft-repeated status codes. That change also included as its proof-of-concept a specific code for the placement side of InventoryInUse - i.e. an attempt to delete an inventory record for which there are existing allocations. SchedulerReportClient was previously identifying this error condition by parsing the text of the 409 response. With this change, it instead uses the provided error code. Change-Id: Ic621adcadf10cc607455eba48c4cb1882bde23fa	2019-02-11 17:10:59 -06:00
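Checking a structured error code instead of parsing response text might look like the sketch below. The function `is_inventory_in_use` is hypothetical. The code string `placement.inventory.inuse` comes from the commit title, while the exact response layout (an `errors` list of objects carrying a `code` field) is an assumption here.

```python
import json

INVENTORY_IN_USE = 'placement.inventory.inuse'


def is_inventory_in_use(status_code, body):
    """Return True if a 409 response carries the inventory-in-use code.

    Hypothetical helper; assumes a JSON body of the form
    {"errors": [{"code": "...", ...}, ...]}.
    """
    if status_code != 409:
        return False
    try:
        errors = json.loads(body)['errors']
    except (ValueError, KeyError, TypeError):
        # Not JSON, or not shaped the way this sketch assumes.
        return False
    return any(
        isinstance(err, dict) and err.get('code') == INVENTORY_IN_USE
        for err in errors
    )


in_use_body = json.dumps(
    {'errors': [{'status': 409, 'code': INVENTORY_IN_USE, 'detail': '...'}]})
other_body = json.dumps(
    {'errors': [{'status': 409, 'code': 'placement.concurrent_update'}]})
```

Matching on a code keeps the client working even if the human-readable detail text of the 409 changes.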
Chris Dent	27617ee193	Switch to using os-resource-classes With the extraction of placement we ended up with resource class names being duplicated between nova and placement. To address that, the os-resource-classes library [1] was created to provide a single authority for standard resource classes and the format of custom classes. This patch changes nova to use it, removing the use of the rc_fields module which used to have the information. A method left in it (normalize_name) has been moved to utils.py, renamed as normalize_rc_name, and callers and tests updated accordingly. Because the placement code is being kept in nova for the time being, that code's use of rc_fields is maintained, and the module too. A note is added in the module explaining that. Backporting the changes from extracted-placement to placement-in-nova was considered but because we no longer have placement tests in nova, that didn't seem like the right thing to do. requirements and lower-constraints have been updated. os-resource-classes is already in global requirements. For reference the related placement change is at [2]. [1] https://docs.openstack.org/os-resource-classes [2] https://review.openstack.org/#/c/623556/ Change-Id: I8e579920c0eaca81b563a87429c930b21b3d4dc5	2019-02-07 11:11:09 +00:00
Eric Fried	570ad36992	Commonize _update code path There were a bunch of report client methods around updating inventory to placement which were only being used in the non-update_provider_tree code paths of the resource tracker's update routine. Those code paths had already been retrofitted to produce a placement-shaped inventory object. update_from_provider_tree gives us another way to flush these inventory changes. This patch simply takes the inventory object produced by the get_inventory() and update_compute_node() code paths and updates the provider tree object in the same fashion as update_provider_tree does. So now all three code paths can commonly invoke update_from_provider_tree. And we can get rid of a ton of redundant code in the report client. This includes the former incarnation of set_inventory_for_provider; so we rename the artist formerly known as _set_inventory_for_provider to match its brethren, set_traits_for_provider and set_aggregates_for_provider. Change-Id: I1a305847f0310c8d4babd5a625e4cc7bffe5b086	2019-01-16 18:34:39 +00:00
Eric Fried	2f77e7ad90	Consolidate inventory refresh get_provider_tree_and_ensure_root now refreshes inventories via the _ensure_resource_provider code path, so the call to _refresh_and_get_inventories in get_provider_tree_and_ensure_root is no longer necessary. Change-Id: Iece924e85409bd4d9cd38ce6ced7883ffc905310	2019-01-16 18:34:39 +00:00
Eric Fried	deef31729b	Reduce calls to placement from _ensure Prior to this patch, the report client's update_from_provider_tree method would, upon failure of any placement API call, invalidate the cache just for the failing provider (and any descendants) and attempt to continue operating on any other providers in the tree. With this patch, we instead invalidate the tree around the failing provider and fail right away. In real life, since we don't yet have any implementations of nested, this would have been effectively a null change. Except: this allows us to resolve a TODO whereby we would always _ensure_resource_provider (including a call to GET /resource_providers?in_tree=$compute_rp) on every periodic. Now we can optimize that out. This should reduce the number of calls to placement per RT periodic to zero in steady state when [compute]resource_provider_association_refresh is zero. Closes-Bug: #1742467 Change-Id: Ieeaad9783e0ff93377fbc6c7932618d2fac8946a	2019-01-16 18:34:34 +00:00
Chris Dent	787bb33606	Use external placement in functional tests Adjust the fixtures used by the functional tests so they use placement database and web fixtures defined by placement code. To avoid making redundant changes, the solely placement-related unit and functional tests are removed, but the placement code itself is not (yet). openstack-placement is required by the functional tests. It is not added to test-requirements as we do not want unit tests to depend on placement in any way, and we enforce this by not having placement in the test env. The concept of tox-siblings is used to ensure that the placement requirement will be satisfied correctly if there is a depends-on. To make this happen, the functional jobs defined in .zuul.yaml are updated to require openstack/placement. tox.ini has to be updated to use an envdir that has the same name as the job. Otherwise the tox siblings role in ansible cannot work. The handling of the placement fixtures is moved out of nova/test.py into the functional tests that actually use it because we do not want unit tests (which get the base test class out of test.py) to have anything to do with placement. This requires adjusting some test files to use absolute import. Similarly, a test of the comparison function for the api samples tests is moved into functional, because it depends on placement functionality. TestUpgradeCheckResourceProviders in unit.cmd.test_status is moved into a new test file: nova/tests/functional/test_nova_status.py. This is done because it requires the PlacementFixture, which is only available to functional tests. A MonkeyPatch is required in the test to make sure that the right context managers are used at the right time in the command itself (otherwise some tables do not exist). In the test itself, to avoid speaking directly to the placement database, which would require manipulating the RequestContext objects, resource providers are now created over the API. 
Co-Authored-By: Balazs Gibizer <balazs.gibizer@ericsson.com> Change-Id: Idaed39629095f86d24a54334c699a26c218c6593	2018-12-12 18:46:49 +00:00
Balazs Gibizer	dfa2e6f221	Consumer gen support for put allocations The placement API version 1.28 introduced consumer generation as a way to make updating allocation safe even if it is done from multiple places. This patch changes the scheduler report client put_allocations function to raise AllocationUpdateFailed in case of generation conflict. The only direct user of this call is the nova-manage heal_allocations CLI which will simply fail to heal the allocation for this instance. Blueprint: use-nested-allocation-candidates Change-Id: Iba230201803ef3d33bccaaf83eb10453eea43f20	2018-09-25 13:02:02 +02:00
Eric Fried	73d7ef4288	Nix update_instance_allocation, _allocate_for_instance A previous change [1] removed the only usage of SchedulerReportClient method update_instance_allocation, which itself was the only user of the _allocate_for_instance method. Remove both of these methods and their test artifacts. [1] If272365e58a583e2831a15a5c2abad2d77921729 Change-Id: Iec02942d384620608e7c705f15f895105d90c882	2018-09-20 18:53:03 +00:00
Eric Fried	8e1ca5bf34	Use uuidsentinel from oslo.utils oslo.utils release 3.37.0 [1] introduced uuidsentinel [2]. This change rips out nova's uuidsentinel and replaces it with the one from oslo.utils. [1] https://review.openstack.org/#/c/599754/ [2] https://review.openstack.org/#/c/594179/ Change-Id: I7f5f08691ca3f73073c66c29dddb996fb2c2b266 Depends-On: https://review.openstack.org/600041	2018-09-05 09:08:54 -05:00
Eric Fried	4f3c063aab	Fix reshaper report client functional test nits Follow-on to address minor issues (mostly doc/comment) from reshaper functional tests over the report client changes [1]. [1] https://review.openstack.org/#/c/585049/19/nova/tests/functional/test_report_client.py Change-Id: I413cfd54cb8e5df444810874ebe2954844bbf863	2018-08-30 14:18:49 -05:00
Eric Fried	b23bf6d6ab	Report client: update_from_provider_tree w/reshape The update_from_provider_tree method now takes an `allocations` kwarg which, if not None, signals that we need to do a reshape (inventory/allocation data migration). If the reshape section fails for any reason, we raise ReshapeFailed. Change-Id: I3fc2d5538cfe3ac1fd330f10d0376627f34a8b94 blueprint: reshape-provider-tree	2018-08-24 15:57:10 -05:00
Eric Fried	2833785f59	Report client: _reshape helper, placement min bump Add a thin wrapper to invoke the POST /reshaper placement API with appropriate error checking. This bumps the placement minimum to the reshaper microversion, 1.30. Change-Id: Idf8997d5efdfdfca6967899a0882ffb9ecf96915 blueprint: reshape-provider-tree	2018-08-24 15:39:18 -05:00
Eric Fried	25b852efd7	Report client: get_allocations_for_provider_tree The reshaper path needs to pass all the allocations related to the compute node's provider tree to update_provider_tree so it can shuffle those allocations appropriately. This patch adds a new get_allocations_for_provider_tree method to the report client for this purpose. Blueprint: reshape-provider-tree Change-Id: I73811f3e3bf19dec3a240e1f1f8c69f4c98d677c	2018-08-24 15:36:49 -05:00
Eric Fried	176d1d90fd	Report client: Real get_allocs_for_consumer In preparation for reshaper work, implement a superior method to retrieve allocations for a consumer. The new get_allocs_for_consumer: - Uses the microversion that returns consumer generations (1.28). - Doesn't hide error conditions: - If the request returns non-200, instead of returning {}, it raises a new ConsumerAllocationRetrievalFailed exception. - If we fail to communicate with the placement API, instead of returning None, it raises (a subclass of) ksa ClientException. - Returns the entire payload rather than just the 'allocations' dict. The existing get_allocations_for_consumer is refactored to behave compatibly (except it logs warnings for the previously-silently-hidden error conditions). In a subsequent patch, we should rework all callers of this method to use the new one, and get rid of the old one. Change-Id: I0e9a804ae7717252175f7fe409223f5eb8f50013 blueprint: reshape-provider-tree	2018-08-24 15:31:04 -05:00
Vladyslav Drok	55fb7efe31	Use placement microversion 1.26 in update_from_provider_tree Recent change I1fd85860c96e8690fbcf93c8a2f02178168bfd5a changed the microversion for updating the inventory only in the _update_inventory_attempt, missing _set_inventory_for_provider which is called from update_from_provider_tree. It causes failures with ironic virt driver. Closes-Bug: 1787910 Change-Id: Ibdebd02ce6f52ca87559e9d2d5c068f37bf4b6db	2018-08-20 11:29:10 -04:00
Matt Riedemann	660e328a25	Use consumer generation in _heal_allocations_for_instance If we're updating existing allocations for an instance due to the project_id/user_id not matching the instance, we should use the consumer_generation parameter, new in placement 1.28, to ensure we don't overwrite the allocations while another process is updating them. As a result, the include_project_user kwarg to method get_allocations_for_consumer is removed since nothing else is using it now, and the minimum required version of placement checked by nova-status is updated to 1.28. Change-Id: I4d5f26061594fa9863c1110e6152069e44168cc3	2018-07-23 14:09:55 -04:00
Eric Fried	3518ccb665	Check provider generation and retry on conflict Update aggregate-related scheduler report client methods to use placement microversion 1.19, which returns provider generation in GET /rps/{u}/aggregates and handles generation conflicts in PUT /rps/{u}/aggregates. Helper methods previously returning aggregates and traits now also return the generation, which is fed through appropriately to subsequent calls. As a result, the generation kwarg is no longer needed in _refresh_associations, so it is removed. Doing this exposes the race described in the cited bug, so we add a retry decorator to the resource tracker's _update and the report client's aggregate_{add|remove}_host methods. Related to blueprint placement-aggregate-generation Closes-Bug: #1779931 Change-Id: I3c5fbb18297db71e682fcddb5bf4536595d92383	2018-07-20 10:09:44 -05:00
Eric Fried	814bc9d2d9	Enforce placement minimum in nova.cmd.status We keep forgetting to bump the minimum required placement version in nova.cmd.status (and all the related bits and pieces) whenever we change the report client to require a new version. This patch interposes a check in the test_report_client functional suite any time get/put/post/delete is called from the report client. If we see a microversion higher than the minimum specified in nova.cmd.status, we raise an exception, which will blow up the test. This should force the author of a new patch on SchedulerReportClient to do the necessary paperwork in that patch. ...assuming said author happens to write a test in test_report_client. This pattern can and should be copied into other test suites where report client tests are likely to be written, to broaden the scope of this enforcement. Change-Id: I5482b92f941261ab6ee6b7cd532ce268c31fe793	2018-06-15 21:04:50 +00:00
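The enforcement pattern described in this commit can be sketched as a thin wrapper that checks the requested microversion before delegating. All names here (`CheckingClient`, `FakeClient`, `MicroversionTooHigh`) are hypothetical and this is not the test code from nova. Only `get` is shown, on the assumption that put, post, and delete would be wrapped the same way.

```python
def _as_tuple(version):
    """Turn 'MAJOR.MINOR' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split('.'))


class MicroversionTooHigh(Exception):
    """Raised when a call asks for more than the declared minimum."""


class FakeClient:
    """Hypothetical stand-in for the real client's HTTP primitive."""

    def get(self, url, version=None):
        return (url, version)


class CheckingClient:
    """Wraps a client and rejects microversions above a declared minimum."""

    def __init__(self, client, minimum):
        self._client = client
        self._minimum = _as_tuple(minimum)

    def _check(self, version):
        if version is not None and _as_tuple(version) > self._minimum:
            raise MicroversionTooHigh(
                f'{version} exceeds the declared minimum; bump the '
                'minimum where it is declared')

    def get(self, url, version=None):
        self._check(version)
        return self._client.get(url, version=version)


client = CheckingClient(FakeClient(), minimum='1.30')
ok = client.get('/resource_providers', version='1.28')
```

A test that exercises a call at, say, 1.31 through this wrapper would fail until the declared minimum is raised, which is the paperwork the commit aims to force.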
Chris Dent	43cc59abe2	Provide a direct interface to placement This is a method of using wsgi-intercept to provide a context manager that allows talking to placement over requests, but without a network. It is a quick and dirty way to talk to and make changes in the placement database where the only network traffic is with the placement database. This is expected to be useful in the creation of tools for performing fast forward upgrades where each compute node may need to "migrate" its resource providers, inventory and allocations in the face of changing representations of hardware (for example pre-existing VGPUs being represented as nested providers) but would like to do so when all non-database services are stopped. A system like this would allow code on the compute node to update the placement database, using well known HTTP interactions, without the placement service being up. The basic idea is that we spin up the WSGI stack with no auth, configured using whatever already loaded CONF we happen to have available. That CONF points to the placement database and all the usual stuff. The context manager provides a keystoneauth1 Adapter class that operates as a client for accessing placement. The full WSGI stack is brought up because we need various bits of middleware to help ensure that policy calls don't explode and so JSON validation is in place. In this model everything else is left up to the caller: constructing the JSON, choosing which URIs to call with what methods (see test_direct for minimal examples that ought to give an idea of what real callers could expect). To make things friendly in the nova context and ease creation of fast forward upgrade tools, SchedulerReportClient is tweaked to take an optional adapter kwarg on construction. If specified, this is used instead of creating one with get_ksa_adapter(), using settings from [placement] conf. 
Doing things in this way draws a clear line between the placement parts and the nova parts while keeping the nova parts straightforward. NoAuthReportClient is replaced with a base test class, test_report_client.SchedulerReportClientTestBase. This provides an _interceptor() context manager which is a wrapper around PlacementDirect, but instead of producing an Adapter, it produces a SchedulerReportClient (which has been passed the Adapter provided by PlacementDirect). test_resource_tracker and test_report_client are updated accordingly. Caveats to be aware of: * This is (intentionally) set up to circumvent authentication and authorization. If you have access to the necessary database connection string, then you are good to go. That's what we want, right? * CONF construction being left up to the caller is on purpose because right now placement itself is not super flexible in this area and flexibility is desired here. This is not (by a long shot) the only way to do this. Other options include: * Constructing a WSGI environ that has all the necessary bits to allow calling the methods in the handlers directly (as python commands). This would duplicate a fair bit of the middleware and seems error prone, because it's hard to discern what parts of the environ need to be filled. It's also weird for data input: we need to use a BytesIO to pass in data on PUTs and POSTs. * Using either the WSGI environ or wsgi-intercept models but wrap it with a pythonic library that exposes a "pretty" interface to callers. 
Something like: placement.direct.allocations.update(consumer_uuid, {data}) * Creating a python library that assembles the necessary data for calling the methods in the resource provider objects and exposing that to: a) the callers who want this direct stuff b) the existing handlers in placement (which remain responsible for json manipulation and validation and microversion handling, and marshal data appropriately for the python lib) I've chosen the simplest thing as a starting point because it gives us something to talk over and could solve the immediate problem. If we were to eventually pursue the 4th option, I would hope that we had some significant discussion before doing so as I think it is a) harder than it might seem at first glance, b) likely to lead to many asking "why bother with the http interface at all?". Both require thought. Partially implements blueprint reshape-provider-tree Co-Authored-By: Eric Fried <efried@us.ibm.com> Change-Id: I075785abcd4f4a8e180959daeadf215b9cd175c8	2018-06-12 11:04:50 -05:00
Jay Pipes	5eda1fab85	mirror nova host aggregate members to placement This patch is the first step in syncing the nova host aggregate information with the placement service. The scheduler report client gets a couple new public methods -- aggregate_add_host() and aggregate_remove_host(). Both of these methods do NOT impact the provider tree cache that the scheduler reportclient keeps when instantiated inside the compute resource tracker. Instead, these two new reportclient methods look up a resource provider by name (not UUID) since that is what is supplied by the os-aggregates Compute API when adding or removing a "host" to/from a nova host aggregate. Change-Id: Ibd7aa4f8c4ea787774becece324d9051521c44b6 blueprint: placement-mirror-host-aggregates	2018-05-30 12:45:20 -04:00
Chris Dent	ce2840539e	Move test_report_client out of placement namespace test_report_client provides functional tests of the report client using a fully operating placement service (via wsgi-intercept) but it is not, in itself, testing placement. Therefore this change moves the test into nova/tests/functional where it can sit beside other general-purpose nova-related functional tests. As noted in the moved file, in a future where placement is extracted, nova could choose to import a fixture that placement (installed as a test dependency) provides so that this test and ones like it can continue to run as desired. compute/test_resource_tracker.py is updated to reflect the new location of the module as it makes use of it. partially implements blueprint placement-extract Change-Id: I433700e833f97c0fec946dafc2cdda9d49e1100b	2018-04-06 22:56:03 +01:00
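
The microversion enforcement described in commit 814bc9d2d9 can be sketched roughly as follows. This is not the code from that patch: the names DECLARED_MINIMUM, MicroversionTooHigh, enforce_minimum and fake_get are hypothetical, and the version value is an illustrative placeholder rather than the real minimum in nova.cmd.status. The sketch only shows the idea of wrapping each client call so that a microversion above the declared minimum raises.

```python
# Illustrative placeholder only; the real value lives in nova.cmd.status.
DECLARED_MINIMUM = "1.17"


class MicroversionTooHigh(Exception):
    """Raised when a call uses a microversion above the declared minimum."""


def parse_microversion(version):
    # "1.9" and "1.17" must compare numerically, not as strings.
    major, minor = version.split(".")
    return int(major), int(minor)


def enforce_minimum(method, declared_minimum):
    """Wrap a client method so that over-the-minimum calls blow up."""
    limit = parse_microversion(declared_minimum)

    def wrapper(url, *args, version=None, **kwargs):
        if version is not None and parse_microversion(version) > limit:
            raise MicroversionTooHigh(
                "call to %s uses microversion %s, but the declared minimum "
                "is %s; bump the declared minimum" %
                (url, version, declared_minimum))
        return method(url, *args, version=version, **kwargs)

    return wrapper


def fake_get(url, version=None):
    # Hypothetical stand-in for a report client get() call.
    return {"url": url, "version": version}


checked_get = enforce_minimum(fake_get, DECLARED_MINIMUM)
```

A test suite would apply such a wrapper to get/put/post/delete so that any test exercising a newer microversion fails until the declared minimum is raised.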

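The idea behind commit 43cc59abe2, a context manager that hands the caller a client for a WSGI application without any network, can be illustrated with a small standard-library sketch. Note that this does not use wsgi-intercept or keystoneauth1 as the patch does; it calls the WSGI callable directly with a hand-built environ, which is closer to one of the alternatives the message lists (including the BytesIO input it mentions). toy_app, InProcessClient and direct_client are hypothetical names, and toy_app merely echoes requests in place of the placement service.

```python
import contextlib
import io
import json
from wsgiref.util import setup_testing_defaults


def toy_app(environ, start_response):
    """Hypothetical WSGI app standing in for the placement service."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length)
    payload = {
        "method": environ["REQUEST_METHOD"],
        "path": environ["PATH_INFO"],
        "received": json.loads(body) if body else None,
    }
    data = json.dumps(payload).encode("utf-8")
    start_response("200 OK", [("Content-Type", "application/json"),
                              ("Content-Length", str(len(data)))])
    return [data]


class InProcessClient:
    """Sends requests to a WSGI app in-process, with no sockets involved."""

    def __init__(self, app):
        self.app = app

    def request(self, method, path, json_body=None):
        raw = b""
        if json_body is not None:
            raw = json.dumps(json_body).encode("utf-8")
        environ = {}
        setup_testing_defaults(environ)  # fills in the required WSGI keys
        environ.update({
            "REQUEST_METHOD": method,
            "PATH_INFO": path,
            "CONTENT_LENGTH": str(len(raw)),
            "CONTENT_TYPE": "application/json",
            "wsgi.input": io.BytesIO(raw),  # request body for PUT/POST
        })
        captured = {}

        def start_response(status, headers, exc_info=None):
            captured["status"] = status
            captured["headers"] = headers

        body = b"".join(self.app(environ, start_response))
        return int(captured["status"].split()[0]), json.loads(body)


@contextlib.contextmanager
def direct_client(app):
    # Real code would set up and tear down the interception here.
    yield InProcessClient(app)
```

As in the commit message, everything else is left to the caller: building the JSON and choosing which URIs to call with which methods, for example `client.request("PUT", "/allocations/abc", {"allocations": {}})` inside a `with direct_client(toy_app) as client:` block.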
44 Commits