The above exception [1] occurs, for example [2], when elasticsearch returns
data in which zuul_executor is a list of more than one executor.
This is input that l#58 is able to sort:
[(12.5, '5'), (12.5, '4'), (12.5, '3'), (25.0, '6'), (18.75, '2'), (6.25, '13'), (12.5, '1')]
This is input for which it throws the exception:
[(8.13953488372093, 'ze06.opendev.org'),
(12.790697674418604, 'ze10.opendev.org'),
(5.813953488372093, 'ze05.opendev.org'),
(8.13953488372093, 'ze01.opendev.org'),
(16.27906976744186, 'ze04.opendev.org'),
(4.651162790697675, 'ze03.opendev.org'),
(3.488372093023256, 'ze02.opendev.org'),
(4.651162790697675, 'ze08.opendev.org'),
(12.790697674418604, 'ze09.opendev.org'),
(20.930232558139537, 'ze12.opendev.org'),
(1.1627906976744187, 'ze11.opendev.org'),
(1.1627906976744187, ['ze12.opendev.org', 'ze11.opendev.org'])]
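A minimal sketch of why the second input fails and one way to make it sortable. The last two tuples tie on the percentage, so Python falls back to comparing a str with a list and raises TypeError. The helper name `sort_key` and the choice to join list values with a comma are assumptions for illustration, not necessarily the fix applied in this change.

```python
def sort_key(item):
    """Return a key that compares cleanly even if the executor is a list."""
    percent, executor = item
    if isinstance(executor, (list, tuple)):
        # Hypothetical normalization: flatten several executors into one string.
        executor = ",".join(sorted(executor))
    return (percent, executor)


data = [
    (1.1627906976744187, "ze11.opendev.org"),
    (1.1627906976744187, ["ze12.opendev.org", "ze11.opendev.org"]),
    (20.930232558139537, "ze12.opendev.org"),
]

try:
    sorted(data)
    plain_sort_failed = False
except TypeError:
    # The two equal percentages force a str-versus-list comparison.
    plain_sort_failed = True

# With the key function every second element is a str, so sorting works.
ordered = sorted(data, key=sort_key, reverse=True)
```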
[1] https://0050cb9fd8118437e3e0-3c2a18acb5109e625907972e3aa6a592.ssl.cf5.rackcdn.com/790065/7/check/openstack-tox-py38/4968a73/tox/test_results/1449136.yaml.log
[2] https://review.opendev.org/c/openstack/tripleo-ci-health-queries/+/787569/6/output/elastic-recheck/1449136.yaml
Change-Id: Ie559d5764d9f68420119a7f9608389f0745a9c02
This should make the wrappers unnecessary, as all
file writing operations are now atomic, ensuring that we either
write the entire file or fail.
That is important as we do not want to end up serving partial files
with the web server.
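One common way to get this behaviour is to write to a temporary file in the same directory and then rename it over the target. The sketch below assumes that approach; the function name `atomic_write` is hypothetical and this is not necessarily how the change itself implements it.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write data so readers see the old file or the new one, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(data)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace overwrites the target in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file and leave any existing target untouched.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Note that `tempfile.mkstemp` creates the file readable only by its owner, so a deployment where the web server runs as another user would likely need to adjust permissions before the rename.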
Change-Id: I696e2474b557e6b5fea707a198f32cea721cc150
The json file outputs of e-r are loaded by web browsers in order to
render our graphs. These json files are actually quite large, partly
because we pretty print them with 4 space indents and they are deeply
nested. Stop pretty printing (humans can pass the files
through a filter if necessary) in order to reduce the size of these
files and make browsers happier (less time spent downloading).
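A small sketch of the size difference, using made-up sample data rather than real e-r output. Passing `separators=(",", ":")` is an extra step assumed here to drop the remaining spaces; the change itself may simply stop passing `indent`.

```python
import json

# Hypothetical nested payload standing in for the graph data.
data = {"buckets": [{"bug": 1, "hits": [[0, 1], [1, 0]]} for _ in range(3)]}

pretty = json.dumps(data, indent=4)
compact = json.dumps(data, separators=(",", ":"))

# Both forms decode to the same structure; only the whitespace differs.
same_content = json.loads(pretty) == json.loads(compact)
saved_bytes = len(pretty) - len(compact)
```

A human who wants the indented form back can pipe a file through `python -m json.tool`.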
Change-Id: I19dedc2994169932eb0e90b6cdea3856637f5ef0
Getting elasticsearch data for bug 1708704 is failing
in the check queue with:
pyelasticsearch.exceptions.ElasticHttpError: \
(500, 'ArrayIndexOutOfBoundsException[null]')
This might have to do with the size of the resulting
messages from the hits on the tripleo and kolla jobs,
I'm not sure.
What's clear though is the graph generation is blowing
up in the check queue on that bug but not the gate queue,
maybe due to a smaller result set, so this adds some
error handling in the graph generation for when a specific
bug query fails so it does not halt the entire build of the
graph.
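The idea can be sketched as a loop that catches the failure for one bug, logs it and moves on. `build_graph_data` and the `fetch_hits` callable are hypothetical names; the real graph code is structured differently.

```python
import logging

LOG = logging.getLogger(__name__)


def build_graph_data(bugs, fetch_hits):
    """Collect hits per bug, skipping any bug whose query raises."""
    results = []
    for bug in bugs:
        try:
            hits = fetch_hits(bug)
        except Exception:
            # One failing query must not halt the whole graph build.
            LOG.exception("Failed to query data for bug %s, skipping", bug)
            continue
        results.append({"bug": bug, "hits": hits})
    return results
```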
Change-Id: Ibe18c9cccc421a6549a18148f1a2ce3c1e4339d4
If a bug is invalid in a project then we should probably
consider its query for removal in the cleanup command.
For example, bug 1663529 and bug 1828244 were both marked
Invalid and had no hits but weren't processed by the
cleanup command.
Change-Id: I7bac9fc169601c86a26565e9fa5b3d72c362a8fc
This automates the process to remove old queries
for fixed bugs. It's a bit conservative to start
so it doesn't check for open reviews nor does it
filter out affected projects with non-Fix* status
on the bug. It can be made more robust once we've
tried it on the open queries and are confident in
how it works.
Change-Id: Iaaf17892804453b99a846be27457c88e5a8f8a55
As of the great renaming of 2019 we need to update the
openstack gerrit URL default to review.opendev.org.
Change-Id: I2e3f7e7fb03be0deba0c95995265376dbce3c5b6
Story: #2005498
Task: #30599
The chances that a group has no failures at all, or
a 100% categorization rate, are close to zero. So if
we don't get any failures, the default
ALL_FAILS_QUERY is probably broken, which can easily
happen:
I208675c2258b6c635925c7b9ea9fae5afd000565
This logs a warning if a group yields no failures
based on the default ALL_FAILS_QUERY.
Change-Id: Ib2c12b1fc276389297cf4ac15775e6b2da828fdd
This commit ensures elastic-recheck is able to support zuul v2 and v3
simultaneously:
- Add message queries based on v3 completion messages
- Include job-output.txt where console.html was expected
Change-Id: If3d7990d892a9698a112d9ff1dd5160998a4efe6
Depends-On: I7e34206d7968bf128e140468b9a222ecbce3a8f1
When gerrit is running slowly we get 502 responses
back, which kills the graph builder. We can retry
these requests from the client to keep going. Generally
a single retry fixes it.
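A minimal sketch of a client-side retry. It does not use the real gerrit client: `BadGateway` is a hypothetical exception that the caller-supplied `request` function is assumed to raise on a 502, and `call_with_retry` is an illustrative name.

```python
import time


class BadGateway(Exception):
    """Hypothetical error raised by the request function on HTTP 502."""


def call_with_retry(request, retries=1, delay=0.0):
    """Call request(), retrying up to `retries` times on BadGateway."""
    attempt = 0
    while True:
        try:
            return request()
        except BadGateway:
            if attempt >= retries:
                # Out of retries: let the caller see the failure.
                raise
            attempt += 1
            time.sleep(delay)
```

The default of a single retry mirrors the observation that one retry is generally enough.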
Change-Id: I745d7c9b80ab8861972193d82c037df76af69e06
String interpolation should be delayed to be handled by the logging code,
rather than being done at the point of the logging call.
Ref: http://docs.openstack.org/developer/oslo.i18n/guidelines.html#log-translation
For example:
# WRONG
LOG.info(_LI('some message: variable=%s') % variable)
# RIGHT
LOG.info(_LI('some message: variable=%s'), variable)
Change-Id: I44b85cbf9f4b27d1fee2c1465029fca8cde4f87e
This commit adds logging to actually figure out what is going on when
running the uncategorized_fails command. As part of this the log format
is changed slightly to make it a bit more clear. It also adds a new CLI
option, --verbose, to enable the debug log level.
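A sketch of how such a flag is commonly wired up with argparse and the logging module. The function names and the INFO default are assumptions; only the `--verbose` option name comes from this commit.

```python
import argparse
import logging


def parse_args(argv=None):
    """Parse a hypothetical command line that only knows --verbose."""
    parser = argparse.ArgumentParser(description="illustrative CLI sketch")
    parser.add_argument("--verbose", action="store_true",
                        help="enable the debug log level")
    return parser.parse_args(argv)


def log_level(args):
    """Pick DEBUG when --verbose was given, INFO otherwise (assumed default)."""
    return logging.DEBUG if args.verbose else logging.INFO
```

The result would typically be passed to `logging.basicConfig(level=log_level(args))`.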
Co-Authored-By: Matthew Treinish <mtreinish@kortar.org>
Change-Id: I6ad3b4e77b15de1e31899510850f2852f301c543
Add the ability to append a custom elastic-search
query suffix.
This is useful to better support use cases that limit
searches to e.g. specific branches.
Change-Id: Iaf28003d7a2a09e3134ed5f75c602b694efb51c6
There are a few hardcoded strings and numbers in uncategorized_fails.
Make these configurable so that it is easier to reuse this tool.
Also add some debug logging.
Change-Id: Ie62ce83bb43dcc8d9b382fe6719fe57eacc5727b
Refactor to use a config class to hold all the
params needed so that they can be more easily
overridden and reused across all the
elastic-recheck tools.
In addition, use the new class to make the
jobs_regex and ci_username configurable.
Change-Id: Ic6f115a6882494bf4c087ded4d7cafa557765c28
In the recent change to split the uncategorized page we were
inconsistent in our use of other and others for the second page.
To make everything consistent and actually work this commit updates
the last references of other and other.html to others and others.html.
Change-Id: If6a4331f5cb7d0a362a89c87fdeebf4891f603b9
This commit adds grenade to the list of projects in the integrated gate
list. While it technically doesn't have the job tag set, this is only
because it doesn't run regular tempest jobs. It's still a part of the
integrated gate for the same reasons devstack and tempest are so we
should include it in the list.
Change-Id: If8ee91d4c9c8e519b045617fe2e3c7a09fad231d
When stackforge was compacted into the openstack/ git namespace this
broke our filtering on just integrated gate projects. This commit does
2 things to handle this. First it takes a list of the projects using
the integrated gate job template from zuul's layout.yaml and uses that
as the filter. This way we actually have a view of the integrated gate
again. It then makes a second page for all the other projects' failures,
which those projects can use to track uncategorized failures. A future
next step will be to add a config file to make the functional split of
these views configurable so that we can more easily adjust the split
as the interactions between projects change.
Change-Id: I41c8ae1e75e8a3d8893f6af5af7c283b5f5c1bcf
Change days to an integer argument, as the search method of SearchEngine
uses range with `days`, which only accepts int values.
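A short sketch of the underlying problem: argparse yields strings unless a type is given, and `range` rejects anything that is not an integer. The option name `--days` and the default of 14 are assumptions for illustration.

```python
import argparse


def parse_days(argv):
    """Parse a hypothetical --days option as an int."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--days", type=int, default=14)  # assumed default
    return parser.parse_args(argv).days


days = parse_days(["--days", "10"])
offsets = list(range(days))  # works because days is an int

try:
    range("10")  # what happens when the value stays a string
    string_rejected = False
except TypeError:
    string_rejected = True
```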
Change-Id: I5801b38091dd4db475580a3570ef47dc9a3a0cbb
This commit adds the output from zuul launcher for job status to the
query we use for finding failures.
Change-Id: I6c12c247c1b65fd10d2c6ecd98cc3e26bbe6d124
The puppet-openstack project doesn't seem to use elastic-recheck
and drives down the percentage in the uncategorized bugs output
so let's blacklist the puppet jobs to turn down the noise it
generates.
Change-Id: I1668ec6b5c17fcc567982d7b8d395c17a17ed2ae
It's the opposite of fun to read a single line of json output
when generating the graph results, so if using the -o option
to write to a file, indent the output for readability.
Change-Id: I091ee2ade51def2d59cdeadb792e56746f8882a2
This uses the ES health API to get the cluster health status
and pretty it up in our index/gate graph pages.
The cluster health API is documented here:
https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html
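The cluster health endpoint returns a JSON document whose `status` field is `green`, `yellow` or `red`. The sketch below only parses such a response; the function name, the "unknown" fallback and the sample payload are assumptions, and no request is made to a real cluster.

```python
import json

KNOWN_STATUSES = ("green", "yellow", "red")


def health_status(raw_response):
    """Extract the cluster status from a cluster health JSON response."""
    payload = json.loads(raw_response)
    status = payload.get("status")
    # Fall back to a placeholder if the field is missing or unexpected.
    return status if status in KNOWN_STATUSES else "unknown"


# Made-up sample response, trimmed to a few fields.
sample = '{"cluster_name": "example", "status": "yellow", "number_of_nodes": 6}'
current = health_status(sample)
```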
Also add a note to the readme on how to view the openstack
ES cluster health.
Change-Id: I3df833cf5024af7282e2602e7dfa5db9c3384b6a
Graph counts were looking at all history instead of just 10 days
as intended. Update the search to only look at the most recent 10
days.
Change-Id: I9495888a818986b3ac187bac7fd65fbcad6135a3
Commit e456a7afca added the
'allow-nonvoting' key to query yaml files so that we can
show non-voting job failures in the graph.
This change builds on that: for queries that set that key,
the graph now shows a simple "Voting: False" line when a
job is non-voting.
Since the default behavior is to filter out non-voting
job results in all of the queries, we don't show anything
special in the graph for voting jobs since it would just
clutter up the output (non-voting is the exceptional case
we want to display).
Change-Id: Ibd75c6244abd10ad7cc491b4453339ad326a11ed
This code doesn't work at all. Bring it back to life.
Also accept inputs from a config file.
Closes-Bug: #1526921
Change-Id: I8f45dc9d42f7547f9d849686739b9a641c176814
In the case where multiple gate queues exist and
are named gate-<name>, a regex such as gate* could be provided,
without the quotes. Update the code to allow specifying a regex.
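A sketch of matching queue names against a user-supplied regex. `matching_queues` and the queue names are hypothetical, and the real code may apply the pattern elsewhere, for example inside the elasticsearch query.

```python
import re


def matching_queues(pattern, queues):
    """Return the queue names that the pattern matches from the start."""
    regex = re.compile(pattern)
    return [name for name in queues if regex.match(name)]


queues = ["gate", "gate-nova", "gate-tripleo", "check"]
selected = matching_queues("gate*", queues)
```

As a regular expression, `gate*` means "gat" followed by zero or more "e", so it works here only because `re.match` accepts a prefix match; `gate.*` states the intent more precisely.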
Change-Id: I24f43f64e71a87989a8dad8570e453a0bbd11402
If launchpad or elastic search time out for a bug,
handle the exception and keep going for other bugs.
Otherwise intermittent issues for a single bug can cause
lack of data for all other bug queries.
Change-Id: Ie32a73774674a329d467606e6e99d7804044ce8c
Change I1f3c2a65104db39fdd7d786d421cded1b436a5f6 added the 'voting'
field which has been tracking results for a couple of weeks now, so
let's use it to filter the initial set of results for the uncategorized
bugs page.
Change-Id: If66c9f5f2f0dea344f941a6a072ff0c30a86e7f2
Make the currently hardcoded database, elastic search,
and logstash uris configurable using a conf file.
Use the configured logstash url in the web gate.html
and index.html.
Change-Id: I282745796a40f10955e0c9893e817779b2d4d55a
Currently it is not possible to point to a different database or
elastic search engine. Make these configurable by using the
same configuration file used by bot.py.
Also add a logstash url so that it can be configured separately
from elastic search url.
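A sketch of reading such settings with configparser and falling back to defaults. The section name `data_source`, the option names and every URL below are placeholders assumed for illustration; check the actual configuration file for the real names.

```python
import configparser

# Placeholder defaults, not the project's real endpoints.
DEFAULTS = {
    "es_url": "http://localhost:9200",
    "ls_url": "http://localhost:5601",
    "db_uri": "sqlite:///example.db",
}


def load_settings(text):
    """Merge options from a hypothetical [data_source] section over the defaults."""
    # interpolation=None keeps literal % characters in URLs intact.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    settings = dict(DEFAULTS)
    if parser.has_section("data_source"):
        settings.update(parser.items("data_source"))
    return settings


settings = load_settings("[data_source]\nls_url = http://logstash.example.org\n")
```

Keeping the logstash URL as its own option is what lets it differ from the elastic search URL.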
Change-Id: I77e4215765e32c34b67c38e37e5764c6c0e45c84