Classify tempest-devstack failures using ElasticSearch
Sean Dague 932986a876 move queries.yaml into a queries subdir
this handles the piece of work we've been talking about for a while
in moving the queries.yaml file into a directory with a bunch of
files. These remain yaml so that they can be tagged with additional
metadata. This would support the concept of soft deleting as well
as other useful meta data to gauge our evolution of the bugs we
track over time.

This should see some real review as it's extensive enough of a
change that the existing tests might not be sufficient. However it
should be enough to move this forward quite a bit.

This also makes future looking statements about doing soft deletes
with a resolved_at keyword in the future. That implementation will
come later.

Change-Id: I86317fcf6f1886ab5b6c0ee154b29e71865c52b7
2013-12-02 11:43:00 -05:00
doc/source Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
elastic_recheck move queries.yaml into a queries subdir 2013-12-02 11:43:00 -05:00
queries move queries.yaml into a queries subdir 2013-12-02 11:43:00 -05:00
.coveragerc Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
.gitignore Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
.gitreview Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
.testr.conf Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
CONTRIBUTING.rst Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
LICENSE Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
MANIFEST.in Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
README.rst move queries.yaml into a queries subdir 2013-12-02 11:43:00 -05:00
babel.cfg Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
elasticRecheck.conf.sample move queries.yaml into a queries subdir 2013-12-02 11:43:00 -05:00
recheckwatchbot.yaml Make bot.py behave like a daemon 2013-09-18 17:45:12 -04:00
requirements.txt Make pid file configurable 2013-09-30 10:29:32 -07:00
setup.cfg Add graph script 2013-10-02 14:56:49 -07:00
setup.py Apply Cookiecutter to the repo. 2013-09-23 15:27:39 -07:00
test-requirements.txt Add mox fixture to base TestCase 2013-10-01 18:05:33 -04:00
tox.ini Reorganize tests into unit and functional tests 2013-10-09 13:52:25 -04:00

README.rst

elastic-recheck

"Classify tempest-devstack failures using ElasticSearch"

Idea

When monitoring Gerrit (using gerritlib) detects a tempest job failure, a collection of logstash queries is run against the failed job to identify which bug caused it.

Eventually this can be tied into the rechecker tool and Launchpad.
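
The idea above can be sketched as a small function. This is an illustration under stated assumptions, not the project's actual code: `classify` and the `job_has_hits` callable are hypothetical names, and the callable stands in for whatever runs a logstash query restricted to the logs of one failed job.

```python
def classify(queries, job_has_hits):
    """Return the bug numbers whose query matches the failed job.

    queries: dict mapping bug number -> query text.
    job_has_hits: hypothetical callable(query_text) -> bool that would run
    the query against the logs of the failed job only.
    """
    return sorted(bug for bug, query in queries.items() if job_has_hits(query))
```

In tests, `job_has_hits` can be a plain function or lambda, so the matching logic can be exercised without a running search cluster.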

queries/

All queries are stored in separate YAML files in a queries directory at the top of the elastic_recheck code base. Each file is named ######.yaml, where ###### is the bug number. The YAML should have a query keyword whose value is the query text for Elasticsearch.
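
A minimal sketch of reading this layout follows. It is not the project's loader: the function names are hypothetical, the file name pattern assumes the bug number is all digits, and `load_query` assumes PyYAML is installed.

```python
import os
import re

# Assumed naming rule: the bug number (digits only) followed by ".yaml".
QUERY_FILE_RE = re.compile(r"^(\d+)\.yaml$")


def discover_query_files(directory):
    """Map bug number -> path for every <bug number>.yaml file in directory."""
    found = {}
    for name in sorted(os.listdir(directory)):
        match = QUERY_FILE_RE.match(name)
        if match:  # files that do not follow the naming rule are skipped
            found[match.group(1)] = os.path.join(directory, name)
    return found


def load_query(path):
    """Return the text stored under the 'query' keyword of one query file."""
    import yaml  # PyYAML (assumed); imported here so discovery works without it

    with open(path) as handle:
        data = yaml.safe_load(handle)
    return data["query"]
```

Keeping discovery separate from parsing means a misnamed file is ignored before anything tries to read it.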

Guidelines for good queries

  • After a bug is resolved and has no more hits in Elasticsearch, we should flag it with a resolved_at keyword. This will let us keep some memory of past bugs and see if they come back. (Note: this is a forward-looking statement; sorting out resolved_at will come in the future.)
  • Queries should get as close as possible to fingerprinting the root cause.
  • Queries should not return any hits for successful jobs; such hits are a sign that the query isn't specific enough.
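
These guidelines can be turned into a quick sanity check. The sketch below makes several assumptions: `check_query` and `count_hits` are hypothetical names, `count_hits` stands in for a function that returns the number of hits for a query string, and `build_status` is an assumed field name for the job result in the indexed logs.

```python
def check_query(query, count_hits):
    """Return a list of guideline problems found for one query.

    count_hits: hypothetical callable(query_text) -> int number of hits.
    build_status is an assumed field name, not confirmed by this README.
    """
    problems = []
    # Guideline: a good query never matches a successful job.
    if count_hits('(%s) AND build_status:"SUCCESS"' % query) > 0:
        problems.append("matches successful jobs; not specific enough")
    # Guideline: a query with no remaining hits may belong to a resolved bug.
    if count_hits('(%s) AND build_status:"FAILURE"' % query) == 0:
        problems.append("no hits on failed jobs; candidate for resolved_at")
    return problems
```

An empty list means the query passed both checks.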

Future Work

  • Move config files into a separate directory
  • Make unit tests robust
  • Merge both binaries
  • Add debug mode flag
  • Split out queries repo
  • Expand gating testing
  • Clean up and document the code better
  • Sort out resolved_at stamping to remove active bugs
  • Move away from polling ElasticSearch to discover whether it's ready
  • Add a nightly job that proposes a patch to remove bug queries that return no hits (the bug must not have been seen in 2 weeks and must be closed)
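
The selection rule for the nightly job in the last item could look like the sketch below. Since that job is future work, everything here is an assumption: the function name, the inputs, and the reading that a query is removable only when its bug is closed and has had no hits for two weeks.

```python
from datetime import datetime, timedelta


def removable_bugs(last_hit, closed, now, window=timedelta(weeks=2)):
    """Return bug numbers whose queries could be proposed for removal.

    last_hit: dict mapping bug number -> datetime of the most recent hit,
              or None if the query has never matched.
    closed:   set of bug numbers that are closed in the bug tracker.
    """
    cutoff = now - window
    return sorted(
        bug
        for bug, seen in last_hit.items()
        if bug in closed and (seen is None or seen < cutoff)
    )
```

Passing `now` in explicitly, rather than calling `datetime.now()` inside the function, keeps the rule easy to test.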

Main Dependencies

  • gerritlib
  • pyelasticsearch