nova/nova
Dan Smith 43a84dbc1e Change consecutive build failure limit to a weigher
There is concern over the ability of compute nodes to reasonably
determine which events should count against their consecutive build
failures. Since the compute may erroneously disable itself in
response to mundane or otherwise intentional user-triggered events,
this patch adds a scheduler weigher that considers the build failure
counter and can negatively weigh hosts with recent failures. This
avoids taking computes fully out of rotation, instead treating them as
less likely to be picked for a subsequent scheduling operation.

This introduces a new conf option to control this weight. The default
is set high to maintain the existing behavior of picking nodes that
are not experiencing high failure rates, and resetting the counter as
soon as a single successful build occurs. This is a minimal visible
change from the existing behavior with the default configuration.

The rationale behind the default value for this weigher comes from the
values likely to be generated by its peer weighers. The RAM and Disk
weighers will increase the score by the number of available megabytes of
memory (range in thousands) and disk (range in millions). The default
value of 1000000 for the build failure weigher will cause competing
nodes with similar amounts of available disk and a small (less than ten)
number of failures to become less desirable than those without, even
with many terabytes of available disk.
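
The arithmetic behind that default can be sketched as follows. This is a
simplified model that follows the description above, not nova's scheduler
code: the names host_score, rank_hosts and FAILURE_WEIGHT are hypothetical,
the additive scoring is an assumption, and any normalization the scheduler
may apply to weigher outputs is ignored.

```python
# Illustrative model only: the names and the additive scoring are
# assumptions for this sketch, not nova's actual scheduler code.
FAILURE_WEIGHT = 1000000  # default value quoted in the commit message


def host_score(free_ram_mb, free_disk_mb, failed_builds,
               failure_weight=FAILURE_WEIGHT):
    """Add the RAM and disk contributions, then subtract the failure penalty."""
    return free_ram_mb + free_disk_mb - failure_weight * failed_builds


def rank_hosts(hosts, failure_weight=FAILURE_WEIGHT):
    """Return host names ordered from most to least desirable."""
    return sorted(
        hosts,
        key=lambda name: host_score(*hosts[name], failure_weight=failure_weight),
        reverse=True,
    )


# (free RAM in MB, free disk in MB, consecutive failed builds)
hosts = {
    "healthy": (8192, 2000000, 0),
    "one-failure": (8192, 2000000, 1),
    "big-but-failing": (16384, 4000000, 3),
}

print(rank_hosts(hosts))
```

In this model a single failure costs as much as roughly a terabyte of free
disk, so the host with no failures ranks first even though another host has
twice the disk. Passing failure_weight=0 removes the penalty and lets
free resources alone decide the order.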

Change-Id: I71c56fe770f8c3f66db97fa542fdfdf2b9865fb8
Related-Bug: #1742102
(cherry picked from commit 91e29079a0)
2018-06-07 07:17:51 -07:00
..
CA
api Merge "Cleanup RP and HM records while deleting a compute service." into stable/queens 2018-05-22 08:31:24 +00:00
cells Add instance action record for snapshot instances 2017-12-11 17:46:38 +08:00
cmd Metadata-API fails to retrieve avz for instances created before Pike 2018-05-30 17:40:25 -04:00
common
compute Change consecutive build failure limit to a weigher 2018-06-07 07:17:51 -07:00
conductor [placement] Add sending global request ID in get 2018-03-26 06:24:09 +00:00
conf Change consecutive build failure limit to a weigher 2018-06-07 07:17:51 -07:00
console Fix accumulated nits 2018-01-16 14:54:04 +00:00
consoleauth Merge "Remove translation of log messages" 2017-08-10 11:39:03 +00:00
db Merge "Add index(instance_uuid, updated_at) on instance_actions table" 2018-02-08 15:23:14 +00:00
hacking trivial: Rename 'policy_check' -> 'policy' 2017-10-25 17:56:40 +01:00
image Workaround glanceclient bug when CONF.glance.api_servers not set 2018-02-08 09:06:48 -05:00
ipv6
keymgr Remove deprecated keymgr code 2017-09-11 15:48:30 -04:00
locale Imported Translations from Zanata 2018-03-01 06:16:22 +00:00
network Handle PortNotFoundClient exception when getting ports 2018-05-09 12:30:38 +00:00
notifications Handle EndpointNotFound when building image_ref_url in notifications 2018-03-21 15:52:18 +00:00
objects Metadata-API fails to retrieve avz for instances created before Pike 2018-05-30 17:40:25 -04:00
pci Address nits in I46d483f9de6776db1b025f925890624e5e682ada 2018-01-02 15:57:50 +00:00
policies Update os_compute_api:os-flavor-extra-specs:index docs for 2.47 2018-04-20 19:34:29 +00:00
privsep Update plugs Contrail methods to work with privsep 2018-02-21 15:48:04 -05:00
scheduler Change consecutive build failure limit to a weigher 2018-06-07 07:17:51 -07:00
servicegroup iso8601.is8601.Utc No Longer Exists 2017-08-29 19:26:55 -04:00
tests Change consecutive build failure limit to a weigher 2018-06-07 07:17:51 -07:00
virt libvirt: Skip fetching the virtual size of block devices 2018-05-31 10:49:19 +01:00
vnc
volume Use ksa session for cinder microversion check 2018-03-29 21:55:40 +00:00
__init__.py
availability_zones.py
baserpc.py
block_device.py Add uuid column to BlockDeviceMapping 2017-12-17 14:28:35 +00:00
cache_utils.py
config.py
context.py Allow cinderv2 endpoints within the request context catalog 2018-06-05 10:04:06 +01:00
crypto.py
debugger.py
exception.py Add __repr__ for NovaException 2018-04-05 19:44:32 +00:00
exception_wrapper.py rename binary to source in versioned notifications 2017-07-25 17:36:04 +02:00
filters.py
hooks.py
i18n.py correct referenced url in comments 2018-01-18 09:16:37 +08:00
loadables.py
manager.py
policy.py Add policy granularity to the Flavors API 2017-07-19 15:56:47 -04:00
profiler.py
quota.py Follow up on removing old-style quotas code 2017-12-08 22:11:24 +00:00
rpc.py Remove dead code of api.fault notification sending 2017-10-09 17:29:40 +02:00
safe_utils.py Allow wrapping of closures 2017-07-20 10:07:52 +01:00
service.py Enhance doc for nova services 2017-08-31 08:30:48 +08:00
service_auth.py Fix NoneType error when [service_user] is misconfigured 2017-11-28 12:22:30 -06:00
test.py Change consecutive build failure limit to a weigher 2018-06-07 07:17:51 -07:00
utils.py Merge "Handle TZ change in iso8601 >=0.1.12" 2018-01-31 00:36:50 +00:00
version.py
weights.py
wsgi.py