Refactors configuration loading in order to simplify it and to
allow overriding defaults using environment variables.
This behavior is similar to other tools like pip or ansible, which
can load any configurable option from env.
This step eases migration towards containerized use, where we do not
want to keep any secrets inside containers and we may want to
avoid volume mounting, especially when testing.
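
A minimal sketch of the lookup order this implies. The `ER_` prefix, the option names, and the precedence of environment over config file are assumptions made for illustration, not taken from this change:

```python
import os

# Illustrative defaults; these option names and values are assumptions.
DEFAULTS = {
    "es_url": "http://localhost:9200",
    "db_uri": "sqlite://",
}


def load_option(name, file_values=None, environ=None, prefix="ER_"):
    """Return an option, preferring environment, then config file, then default."""
    environ = os.environ if environ is None else environ
    file_values = file_values or {}
    env_key = prefix + name.upper()  # e.g. es_url -> ER_ES_URL (hypothetical)
    if env_key in environ:
        return environ[env_key]
    return file_values.get(name, DEFAULTS[name])
```

With something like this, a container only needs the variable set at start-up (for example via `docker run -e`), and no config file has to be mounted.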
Change-Id: I0d3a9f19b0ba8d1604d0ca63db01296a3219fb47
- Upgraded hacking (flake8)
- Added a more modern tox linters environment (pep8 alias)
- Temporarily added skips for broken newer rules
- Fixed a few basic rule violations
- Moved flake8 config to setup.cfg (tox.ini is not recommended)
Change-Id: I75b3ce5d2ce965a9dc5bdfaa49b2aacd8f0195ad
Refactor to use a config class to hold all the
params needed so that they can be more easily
overridden and reused across all the
elastic-recheck tools.
In addition, use the new class to make the
jobs_regex and ci_username configurable.
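
Roughly, the shape of such a class might look like the sketch below. The class name and both default values are hypothetical; only the jobs_regex and ci_username parameter names come from this change:

```python
class Config:
    """Holds the shared parameters so every tool can reuse or override them."""

    # Hypothetical defaults, for illustration only.
    DEFAULT_JOBS_REGEX = r"^gate-.*$"
    DEFAULT_CI_USERNAME = "jenkins"

    def __init__(self, jobs_regex=None, ci_username=None):
        # Any argument left unset falls back to the class-level default.
        self.jobs_regex = jobs_regex or self.DEFAULT_JOBS_REGEX
        self.ci_username = ci_username or self.DEFAULT_CI_USERNAME
```

A tool that needs a different CI account would then build `Config(ci_username="zuul")` and pass that one object around instead of separate parameters.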
Change-Id: Ic6f115a6882494bf4c087ded4d7cafa557765c28
pyelasticsearch>1.0 defaults the port to 9200 but logstash.o.o/es
is on port 80, so update the defaults in code and config samples.
Change-Id: Ibb85cd29e1cbc3ff448aa8470854fe0f8bede260
These need to use kwargs in order to be correctly used, otherwise
they default to the hard-coded values.
Change-Id: Ic297b9a5d38651425abed561b0cb5cbab6838bc7
This commit adds options to the elastic recheck bot configuration
file. This enables users to specify how to connect
to an elastic recheck server and a subunit2sql database, but things
will still default to using the openstack-infra servers to prevent
breaking the running service.
Change-Id: I10db1a568cc01e137e5f4d8a8814b17201c4c438
While we were storing the correct self.data for nested channel
configs, we weren't actually using self.data when we attempted to
connect to irc channels. Fix up the data variable early enough that IRC
channel connection in the ER bot should work again.
Change-Id: I2758fb79e3b424def53415d6fdc41fbc99d0ca9f
The logging setup for the bot includes very useful and specific
information on the kinds of things we want to get rid of when we
enable logging around elastic search code. Extract it so it can be
used in CLI tools.
Change-Id: I2eff77cf37d481c6518c95642e308af196b61d50
MessageConfig is a dictionary, but __init__ was updating __dict__ and
not self. This bug was causing elastic-recheck to stacktrace when it
tries to comment on a gerrit patch.
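
The difference can be reproduced in isolation. The two class names below are hypothetical stand-ins for the before and after states of MessageConfig, not the actual code:

```python
class BrokenMessageConfig(dict):
    def __init__(self, data):
        super().__init__()
        # Bug: this stores the entries as instance attributes,
        # so the dictionary itself stays empty.
        self.__dict__.update(data)


class FixedMessageConfig(dict):
    def __init__(self, data):
        super().__init__()
        # Fix: update the mapping itself so key lookups succeed.
        self.update(data)
```

With the broken variant, `config["some_key"]` raises KeyError even though `config.some_key` works, which matches a stacktrace at the point where the message is looked up.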
Change-Id: Ic4304336692226acfe60d0845edaf2acb383acc6
Instead of having the messages inline, we should do them in the
yaml file so that changing the UX for the bot reporting isn't a
code change.
Depends-On: I9208123a4cb3be02c272cd8a6eba460f4130a960
Change-Id: I8fdb07f9964f616addba6e8f25e5bd9de27d077a
This adds support for the channel config being nested, and provides
the basis for moving message catalog out of the code and into the
yaml in a new nested section.
Change-Id: I7353af4c3f141d4bd617d6fd388d7957e0586ba8
Elastic Recheck is really two things: real-time searching and bulk
offline categorization. While the bulk categorization needs to look
over the entire dataset, the real-time portion is really deadline
oriented, so it only cares about the last hour's worth of data. As such
we really don't need to search *all* the indexes in ES, but only
the most recent one (and possibly the one before that if we are near
rotation).
Implement this via a recent= parameter for our search feature. If set
to true then we specify the most recent logstash index. If it turns
out that we're within an hour of rotation, also search the one before
that.
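
A sketch of the index selection, assuming daily indexes named `logstash-YYYY.MM.DD` that rotate at midnight UTC. The function name and its parameters are hypothetical:

```python
from datetime import datetime, timedelta


def recent_indexes(now, window=timedelta(hours=1), fmt="logstash-%Y.%m.%d"):
    """Return the index names covering the window that ends at `now` (UTC)."""
    indexes = [now.strftime(fmt)]
    earlier = (now - window).strftime(fmt)
    # Within an hour after rotation the window reaches into the previous index.
    if earlier != indexes[0]:
        indexes.append(earlier)
    return indexes
```

For most of the day this yields a single index, so a recent search touches one index rather than all of them.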
Adjust all the queries the bot uses to be recent=True. This will
hopefully reduce the load generated by the bot on the ES cluster.
Change-Id: I0dfc295dd9b381acb67f192174edd6fdde06f24c
er had hung in the gate on a launchpad network issue. Set the
timeout for launchpadlib in an attempt to prevent this in the
future.
Change-Id: Ib3585f0a2b502e306c42711815c40d17fd6520a9
Recently the elasticsearch schema was updated to include a
build_short_uuid field which has indexed the first 7 chars of the
build_uuid. This field is useful because it allows e-r to filter on that
field instead of searching on build_uuid.
Update e-r to filter on build_short_uuid which should make queries much
more performant. As part of this change replace variables named
short_build_uuid with build_short_uuid for consistency with the
elasticsearch schema.
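
As an illustration, a term clause on the new field could be built as below. The helper name is hypothetical, and how the clause is attached to the search request depends on the elasticsearch version and is not shown:

```python
def short_uuid_filter(build_uuid):
    """Build a term clause matching the first 7 chars of a build_uuid."""
    build_short_uuid = build_uuid[:7]
    return {"term": {"build_short_uuid": build_short_uuid}}
```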
Change-Id: Iae5323f3f5d2fd01f2c69f78b9403baf5ebafe85
The check queue fails all the time on totally crappy patches, and we
really don't want to get spam about terrible patches failing for
terrible reasons. Only report on things which should never fail, which
is the gate.
Change-Id: I4e8c5b338c4a1d042acc0e274ac4e78497372a83
Instead of saying you failed x jobs because of y bugs, map the two.
Some refactoring was needed to add unit tests.
Change-Id: I1c49c8cd4df24c7fb4c152e188f74caa13dfed9c
IRC has a 512 byte maximum message size that includes the privmsg and
destination data. Use textwrap at 400 characters to approximate proper
splitting of long IRC messages to avoid exceptions that occur when
sending more than 512 bytes.
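
The splitting can be sketched as follows. The function name is hypothetical; `textwrap.wrap` is the standard library call:

```python
import textwrap


def split_irc_message(msg, width=400):
    """Split msg on whitespace into chunks of at most `width` characters."""
    return textwrap.wrap(msg, width)
```

Each chunk is then sent as its own message. Note that the width counts characters, not bytes, so this is only an approximation when the text contains multi-byte characters.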
Change-Id: I575ce3694f4b399a3adf5432f6f6971307d9e202
Exception objects do not have msg attributes; they have message
attributes. Use message instead of msg. Also use logger.warning()
instead of logger.warn() to be consistent with python documentation.
Change-Id: I49be960202f3fa1add19f3068ff824ae9d7f8314
This reverts commit e75b996e60.
Change is being reverted because we can't actually use a static LOG
object if we expect setup_logging to do the right thing at runtime.
Otherwise, Python logging will load logging objects at import time
using the static LOG object, before setup_logging can run.
Conflicts:
elastic_recheck/bot.py
elastic_recheck/elasticRecheck.py
Change-Id: I582c7e9c9b3c2ccab6a695bfba00a61f7c0a04a9
We are starting to track a decent amount of data per zuul/jenkins job,
so track data in an object instead of assorted variables and
dictionaries. For example, bugs are now tracked by job and not
gerrit event. Now, we can support reporting which bug caused which
specific job to fail. This also does some assorted object related
cleanups. This consists of internal changes only; a future patch will
make the gerrit and irc comments take advantage of this.
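
One possible shape for such objects is sketched below. All class, field, and method names here are hypothetical, not taken from the code:

```python
from dataclasses import dataclass, field


@dataclass
class FailedJob:
    """A single failed job and the bugs matched against it."""
    name: str
    bugs: list = field(default_factory=list)


@dataclass
class FailureEvent:
    """A gerrit event carrying its failed jobs."""
    change: int
    rev: int
    jobs: list = field(default_factory=list)

    def bugs_by_job(self):
        # Per-job view: which bug caused which job to fail.
        return {job.name: list(job.bugs) for job in self.jobs}

    def all_bugs(self):
        # Event-wide view, deduplicated, for callers that want the old summary.
        return sorted({bug for job in self.jobs for bug in job.bugs})
```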
Change-Id: I2116cd0e10b45617a8d572b27f1672f695fa91d0
Always log the gerrit comment, and when running in nocomment just don't
send it to gerrit. This helps make testing changes to the gerrit comment
easier.
Change-Id: Ie26b86ed374d284154389b4bd5a86b9d2f365800
Now that we are running this on all jobs (not just tempest) we are
getting significantly more IRC messages. Add failed job name to logs to
provide more context of what job is failing. For unclassified failures
also include the queue (as an unclassified unit test failure in the check queue is
much less important than one in the gate).
Change-Id: I485bf06721fa5afd102b99b26e38f12449deec7b
This commit adds multi-project/channel IRC support to the bot.
To do this you list multiple channels in the bot yaml and specify which
projects get reported to that channel. When a bug is encountered that targets
the listed project the message will be reported. If 'all' is specified as
a field under projects for a channel then all projects will be reported on
that channel.
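
The routing rule can be sketched like this, assuming the yaml is loaded into a dict that maps each channel name to a `projects` list. That layout and the function name are assumptions:

```python
def channels_for(project, channel_config):
    """Return the channels that should be told about bugs in `project`."""
    return sorted(
        channel
        for channel, settings in channel_config.items()
        # 'all' subscribes a channel to every project.
        if "all" in settings["projects"] or project in settings["projects"]
    )
```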
Change-Id: Ic3dd76bad94213c7152c29a99c00ed23a2c01a31
In addition to searching by change and patch, search by the short build_uuid.
This prevents us from accidentally classifying multiple builds when we classify
a failure on gerrit. This can happen in the gate queue if there is a
gate reset, or if there are multiple 'recheck bug x' on a single patch
revision in the check queue.
Change-Id: I6356a971ca250ddf5f01a9734f13d0b080a62c89
In moving to the event object, we could fail to pass an event;
make that ok for now. We can clean up and separate this later.
Change-Id: I33359cf437fb4617390ea8cd43d5b3e57aef3ce5
Instead of passing around complex data structures, create an
event object for our purposes so that we can pass around the
payload relevant to us. This simplifies some things, and will make
adding build_uuid tighter.
Change-Id: I8172b25ae3c60e38d63cf7f4d8a0f6c854bae766
For local testing we want to run the bot without writing to the outside
world. We already have a nocomment option; this adds a no irc option as
well.
Change-Id: I23cc2e3d05a85a414487b2cdac2f95b977f4e3eb
We have been timing out on logs a lot, and not noticing. Redo this
logic to be exception based so we can tell the IRC channel when we
time out on logs, to get to the bottom of reliability issues with
indexing logstash data.
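
A sketch of the exception-based approach. The exception name, function name, and timing values are hypothetical:

```python
import time


class ResultTimedOut(Exception):
    """Raised when the logs do not show up before the deadline."""


def wait_for_logs(is_ready, timeout=5.0, interval=0.5,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or raise ResultTimedOut."""
    deadline = clock() + timeout
    while not is_ready():
        if clock() >= deadline:
            raise ResultTimedOut("logs not indexed after %ss" % timeout)
        sleep(interval)
```

The caller catches the exception and reports it to the IRC channel, so a timeout is no longer silently treated like an ordinary miss.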
Change-Id: Ia63d801235c6959eb7b97c334291a6d2f06411b6
This makes the er bot work at a more sane set of default logs,
and also tells us how often we end up timing out.
It also makes the logs actually include timestamps.
Change-Id: I29877c4158a84bd46b0a437a12c14450a049b49d
This makes the bot based on argparse, and provides a tox job that
makes running the bot in test mode a little more sane. It also
provides a new '-n' nocomment option so you can run the bot
without worrying that it will report its findings to gerrit.
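
A minimal sketch of such an option. The `-n` flag is from this change, while the long `--nocomment` spelling, the help text, and the function name are assumptions:

```python
import argparse


def parse_args(argv=None):
    """Parse the bot's command line; argv=None means use sys.argv."""
    parser = argparse.ArgumentParser(description="Run the bot.")
    parser.add_argument(
        "-n", "--nocomment", action="store_true",
        help="log findings but do not report them to gerrit")
    return parser.parse_args(argv)
```

The bot would then check `args.nocomment` before leaving any gerrit comment.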
Change-Id: If9d6a7e72dd8d9338f2dd3283cf9a761488122de
This changes the interface to move the readiness check out of
the classifier and into the stream object. This massively
simplifies the logic connecting these pieces, as classifier is
now just a thin wrapper to elastic search.
This also adds unit testing for the stream processing through the
creation of a fake_gerrit mock class. That lets us run gerrit
event interactions in a sane way.
It also drops all the unit testing for the classifier which is now
largely useless, because all it tests is that we can execute a for loop.
Change-Id: I1971c121276412e31f01eb5680b9c41fc7e442d3