Avoid unnecessary joins in HostManager._get_instances_by_host

While getting a HostState object for a given compute node during
scheduling, if the HostState does not have its instance info
set, either because it's out of date or because the config option
"track_instance_changes" is False, the HostManager still pulls
the list of instances for that host from the database and stores
it in HostState.instances.

This is *only* used (in-tree) by the affinity filters and even
then the only thing those filters use from HostState.instances
is the set of keys from the dict, which is the list of instance
UUIDs on a given host. The actual Instance objects aren't used
at all. See blueprint put-host-manager-instance-info-on-a-diet
for more details on that.
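
To illustrate the point above, here is a minimal sketch of an
affinity-style check. It is not the in-tree filter code:
FakeHostState and same_host_passes are hypothetical names, and the
only thing assumed from this message is that HostState.instances is
a dict keyed by instance UUID.

```python
class FakeHostState:
    """Hypothetical stand-in for HostState; only 'instances' matters."""

    def __init__(self, instances):
        # Maps instance UUID -> instance object (values never inspected).
        self.instances = instances


def same_host_passes(host_state, affinity_uuids):
    """Return True if any requested UUID already lives on the host."""
    # Only the dict keys are consulted; the Instance objects are untouched.
    return bool(set(affinity_uuids) & set(host_state.instances))


host = FakeHostState({"uuid-1": object(), "uuid-2": object()})
```

Because the check never touches the dict values, it behaves the same
whether or not the instances were loaded with joined columns.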

The point of this change is that when we go to pull the set
of instances from the database for a given host, we don't need
to join on the default columns (info_cache and security_groups)
defined in the _instance_get_all_query() method in the DB API.

This should be at least a minor optimization in scheduling
for hosts that have several instances on them in a large cloud.
As noted in the comment in the code, any out-of-tree filters
that rely on using the info_cache or security_groups from the
instance are now going to be hit with a lazy-load penalty
per instance, but we have no contract on out-of-tree filters,
so if this happens, the people maintaining said filters can
(1) live with it, (2) fork the HostManager code, or (3) upstream
their filter so it's in-tree.
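
The lazy-load penalty can be pictured with a small simulation. This
is only a sketch: FakeInstance and count_lazy_loads are hypothetical
and do not reflect how the real Instance object loads fields. They
model only the idea that a field not requested up front costs one
extra query per instance on first access.

```python
class FakeInstance:
    """Hypothetical instance whose info_cache loads on first access."""

    db_queries = 0  # class-wide counter of simulated DB round trips

    def __init__(self, uuid, expected_attrs=None):
        self.uuid = uuid
        self._info_cache = None
        if expected_attrs and "info_cache" in expected_attrs:
            # Simulates the eager join done as part of the list query.
            self._info_cache = {"instance_uuid": uuid}

    @property
    def info_cache(self):
        if self._info_cache is None:
            # Simulates a lazy-load: one extra query for this instance.
            FakeInstance.db_queries += 1
            self._info_cache = {"instance_uuid": self.uuid}
        return self._info_cache


def count_lazy_loads(instances):
    """Touch info_cache on each instance; return the extra query count."""
    FakeInstance.db_queries = 0
    for inst in instances:
        inst.info_cache
    return FakeInstance.db_queries


lean = [FakeInstance(f"uuid-{i}", expected_attrs=[]) for i in range(3)]
joined = [FakeInstance(f"uuid-{i}", expected_attrs=["info_cache"])
          for i in range(3)]
```

In this model a filter that reads info_cache pays one simulated query
per instance on the lean list and none on the joined list.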

A more impactful change would be to refactor
HostManager._get_host_states to bulk query the instances on
the given set of compute nodes in a single query per cell. But
that is left for a later change.
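
As a rough sketch of what that later refactor might look like (an
assumption, not a design from this change), a single query per cell
could return (host, uuid) rows that are then grouped in memory. The
helper group_uuids_by_host and the sample rows are hypothetical.

```python
def group_uuids_by_host(rows, host_names):
    """Group (host, uuid) rows from one bulk query into a per-host dict."""
    # Start every requested host with an empty set so hosts without
    # instances still appear in the result.
    by_host = {name: set() for name in host_names}
    for host, uuid in rows:
        if host in by_host:
            by_host[host].add(uuid)
    return by_host


# Simulated result of a single "WHERE host IN (...)" query for one cell.
rows = [("node1", "uuid-a"), ("node2", "uuid-b"), ("node1", "uuid-c")]
grouped = group_uuids_by_host(rows, ["node1", "node2", "node3"])
```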

Change-Id: Iccefbfdfa578515a004ef6ac718bac1a49d5c5fd
Partial-Bug: #1737465
Matt Riedemann 2018-05-17 14:36:23 -04:00 committed by Eric Fried
parent 6cfcce0559
commit a1a335f19a
2 changed files with 9 additions and 2 deletions

@@ -750,7 +750,13 @@ class HostManager(object):
                      'instance info for this host.', host_name)
             return {}
         with context_module.target_cell(context, hm.cell_mapping) as cctxt:
-            inst_list = objects.InstanceList.get_by_host(cctxt, host_name)
+            # NOTE(mriedem): We pass expected_attrs=[] to avoid a default
+            # join on info_cache and security_groups, which at present none
+            # of the in-tree filters/weighers rely on that information. Any
+            # out of tree filters which rely on it will end up lazy-loading
+            # the field but we don't have a contract on out of tree filters.
+            inst_list = objects.InstanceList.get_by_host(
+                cctxt, host_name, expected_attrs=[])
             return {inst.uuid: inst for inst in inst_list}
 
     def _get_instance_info(self, context, compute):

@@ -750,7 +750,8 @@ class HostManagerTestCase(test.NoDBTestCase):
         mock_get_by_host.return_value = objects.InstanceList(objects=[inst1])
         host_state.update(
                 inst_dict=hm._get_instance_info(context, cn1))
-        mock_get_by_host.assert_called_once_with(context, cn1.host)
+        mock_get_by_host.assert_called_once_with(
+            context, cn1.host, expected_attrs=[])
         self.assertTrue(host_state.instances)
         self.assertEqual(host_state.instances[uuids.instance], inst1)