124 Commits

Author SHA1 Message Date
Ghanshyam Mann d58ee7fb08 Retire openstack-health
openstack-health is a QA repo and service for tracking the
job/test success/failure rate. But this code has been broken for a
long time (more than a year), and we do not have any maintainer
in QA to fix and maintain it. As it is broken, the QA and infra teams
agreed to stop the service:

- http://status.openstack.org/openstack-health/#/

In the QA Zed PTG, we also decided to retire the repo:

- https://etherpad.opendev.org/p/qa-zed-ptg

Needed-By: https://review.opendev.org/c/openstack/governance/+/836706
Change-Id: Ie15aa8e469b0bb3aff47dcad422c0676fea640d2
2022-04-05 18:47:38 -05:00
Zuul f489fbe8ae Merge "Replace assertItemsEqual with assertCountEqual" 2021-01-22 15:39:31 +00:00
gugug 29fe21adc4 Replace assertItemsEqual with assertCountEqual
assertItemsEqual was removed from Python's unittest.TestCase in
Python 3.3 [1][2]. We have been able to use it since then, because
testtools required unittest2, which still included it. With testtools
removing Python 2.7 support [3][4], we will lose support for
assertItemsEqual, so we should switch to use assertCountEqual.

[1] - https://bugs.python.org/issue17866
[2] - https://hg.python.org/cpython/rev/d9921cb6e3cd
[3] - testing-cabal/testtools#286
[4] - testing-cabal/testtools#277

Change-Id: I0247031614ae75c1fd9f93898b7fb57838eb593f
2020-07-12 11:15:31 +08:00
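
The replacement described in the commit above can be illustrated with a minimal sketch that uses only the standard library. The test class and its data are made up for illustration and are not taken from the openstack-health test suite.

```python
import unittest


class ExampleTest(unittest.TestCase):
    # Hypothetical test case, not part of openstack-health.
    def test_same_elements_any_order(self):
        expected = ['gate', 'check', 'gate']
        actual = ['gate', 'gate', 'check']
        # Passes when both sequences hold the same elements the same
        # number of times, regardless of order.
        self.assertCountEqual(expected, actual)


def run_example():
    # Run the test case programmatically and return the TestResult.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

assertCountEqual checks the same thing assertItemsEqual did on Python 2.7, so the switch is normally a pure rename.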
Sean McGinnis 711c267bca
Use unittest.mock instead of third party mock
Now that we no longer support py27, we can use the standard library
unittest.mock modules instead of the third party mock lib.

Change-Id: I328fd430e61b666147095f860d02a2badcfd3072
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
2020-03-13 12:03:07 -05:00
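
As a rough illustration of the import change in the commit above, the sketch below uses unittest.mock from the standard library. The fetch_status helper and the fake client are hypothetical and exist only for this example.

```python
from unittest import mock  # in the standard library since Python 3.3


def fetch_status(client):
    # Hypothetical helper: asks a client object for its status.
    return client.get_status()


# Build a stand-in client whose get_status() returns a canned value.
fake_client = mock.Mock()
fake_client.get_status.return_value = {'status': 'ok'}

result = fetch_status(fake_client)

# Verify the helper called the client exactly once with no arguments.
fake_client.get_status.assert_called_once_with()
```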
Manik Bindlish 12d66bea82 dict_object.keys() is not required for *in* operator
Remove the redundant .keys() call from dict_object.keys() *in* checks.

Change-Id: I6dfa5b591d6c049c543cacee2a2db907f41670ec
2018-12-20 10:45:35 +00:00
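
A small sketch of the cleanup in the commit above; the dictionary contents are invented for illustration.

```python
run_metadata = {'build_name': 'tempest-full', 'project': 'openstack/nova'}

# Both expressions test key membership; the second avoids the extra
# .keys() call and reads more directly.
with_keys = 'build_name' in run_metadata.keys()
without_keys = 'build_name' in run_metadata
```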
Trevor McCasland a9bfd78f8d fix prepare for numeric data
There is a case where there can be more than one build_name
with the same run_at[1]. Without this change, the request in [1]
will result in a KeyError[2].

The run_aggregator method is now updated to work with runs with
more than one build_name so that the expected_responses from
the unit tests make more sense and it makes it more ready to be
used for displaying run times in the grouped runs page.

[1] http://paste.openstack.org/show/726408/
[2] http://paste.openstack.org/show/726409/

Closes-Bug: #1785736

Change-Id: I66194d9fd753f2b094158103c37708148315629e
2018-08-09 12:45:48 -05:00
mccasland, trevor (tm2086) 1febf0c179 fix get_numeric_data
The change[1] introduced a failure for the grouped-runs page: if no
data is provided to get_numeric_data, it will error
with[2].

This change aims to fix that error by returning from get_numeric_data
early if no data is provided.

[1] https://review.openstack.org/#/c/547866/
[2] http://paste.openstack.org/show/725854/

Change-Id: I02e9f40f4dcd44aefb47f5c8c96d9a318e034189
2018-07-13 15:01:17 -05:00
mccasland, trevor (tm2086) 741a457f72 Add run time graph for jobs
Motivation for this change:
  By adding the run time graph to the job's page it will raise
  awareness to users of changes in run times among jobs.
  By adding the scatter graph to the job's page it can raise awareness
  of the run time deviation.

New behavior:
  A linear and scatter chart called Job Run Time is available in the jobs
  view. It gathers run_time data from the job_data objects provided by
  the timedelta in the response of /runs/key/<key>/<value> limited
  by the values in the resolution dropdown.
  The original response of this API call is wrapped in a 'data' property
  and the additional information to support drawing the scatter chart on
  the canvas is added to a new dict property called 'numeric'.

NOTE:
  * The methods in run_aggregator were derived from [1] and [2]
  * An experiment was done to see which implementation would
    result in the lowest load time and the results for the job
    'tempest-full' for 2 weeks and 1 month periods are as follows:
    - latest patch with 1 db call
        2.21s - 2 weeks
        3.40s - 1 month

    - older patch with 2 db calls
        7.86s - 2 weeks
        10.16s - 1 month

completes queens priority "Job duration graph in o-h" from:
  https://etherpad.openstack.org/p/qa-queens-priorities

[1] https://review.openstack.org/#/c/370913/4
[2] 4db9a61471/openstack_health/test_run_aggregator.py (L70-L90)
Change-Id: Ib5196d86b6b5efa0083d4aa4dd28f1fac3493560
2018-06-29 14:57:05 -05:00
Matthew Treinish 26f188a102
Handle missing groupby_key in _group_runs_by_key
The _group_runs_by_key() helper function is there to re-organize the
data grouped by a metadata value. However, it was not fault tolerant
if the metadata key used for grouping was missing from any of the list
of runs. In the course of normal operation this should not occur,
because the metadata should be the same for all jobs in the OpenStack
subunit2sql db. But, due to a worker bug and/or a service restart we're
encountering a run that's missing a few metadata values. To make this
error non-fatal this commit augments the function to just ignore runs
where the groupby_key is not in the metadata.

Change-Id: I87087c8e3a985d883db78e008fd14f4fb9ac4e24
2018-02-09 12:17:43 -05:00
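
The fault-tolerant grouping described in the commit above might look roughly like the sketch below. The function name, the shape of the run dictionaries, and the sample data are assumptions made for illustration; the real _group_runs_by_key() helper operates on a different structure.

```python
from collections import defaultdict


def group_runs_by_key(runs, groupby_key):
    """Group runs by a metadata value, ignoring runs without that key.

    Hypothetical simplification: each run is a dict with a 'metadata' dict.
    """
    grouped = defaultdict(list)
    for run in runs:
        metadata = run.get('metadata', {})
        if groupby_key not in metadata:
            # Skip the run instead of raising KeyError on incomplete metadata.
            continue
        grouped[metadata[groupby_key]].append(run)
    return dict(grouped)


runs = [
    {'id': 'a', 'metadata': {'project': 'openstack/nova'}},
    {'id': 'b', 'metadata': {}},  # metadata lost, e.g. after a restart
    {'id': 'c', 'metadata': {'project': 'openstack/nova'}},
]
by_project = group_runs_by_key(runs, 'project')
```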
Zuul 7e51dd3a08 Merge "Add job name to recent failed test results" 2017-11-17 02:51:42 +00:00
Matthew Treinish bec8d9c9c6
Add job name to recent failed test results
In the post zuul v3 world the job name in the url doesn't always match
the job name in the database. Because of this we need to be returning
the job name from the database in the recent failed tests list so we can
use that for the table.

Change-Id: Ic1ba76c853ffd9ebb87683e56c82fea711b6139b
2017-11-15 17:09:54 -05:00
Masayuki Igawa 91b87a9578
Fix config file path treatment when using uwsgi
This commit fixes the treatment of the config file path when we use the
Flask app directly. Originally, when the Flask app was used directly, such
as with uwsgi (which the README mentions as one of the production
settings), the parameter '--pyargv config_file' didn't work because we
read the '/etc/openstack-health.conf' file directly. It should work;
otherwise, users can't specify the config file.

Change-Id: I1beb927200880d42e40e49db3aab47453f83169f
2017-10-31 19:57:25 +09:00
Zuul f7143734f8 Merge "Handle '/' in build_name for getting test_runs" 2017-10-02 10:36:30 +00:00
Trevor McCasland c2ed46cf5a Handle '/' in build_name for getting test_runs
According to this snippet:
http://flask.pocoo.org/snippets/76/

we can change the URL converter from string to path. The path type has
the same behavior as string but also allows the '/' character.

Change-Id: I7c19c7366b698d66a1cef13a822955c0da840a5a
Closes-Bug: #1720255
2017-09-29 11:14:31 -05:00
Masayuki Igawa ce23c005f8
Add links to o-h test pages for Failed tests on RSS
This commit adds links to o-h test pages for Failed tests on RSS.
These links make it easy to check a test's trends.

Change-Id: I60af6963f14ff57e8515d9b5f1959698082abf71
2017-09-11 15:20:43 -06:00
Matthew Treinish 769f9d3cd7
Add version string to status string
Sometimes debugging puppet deployment issues is difficult without
knowing exactly which version of the code is actually running. To help
debug automated deployment issues being able to query the exact version
running is useful. This commit does this by adding the version string to
the /status response so we can easily tell which version is running.

Change-Id: I4a26cf097cd1efc9962911e7b5824eb6b8b2f5d9
2017-03-27 15:49:59 -04:00
Masayuki Igawa 82c298518b
Add failed tests to rss
This commit adds failed tests to RSS. It could be useful to understand
the reason for failures.

Change-Id: Ib1acb636bad352da649b9cca1294462e41bd62be
2017-03-13 14:43:25 +09:00
gengchc2 5434c8fd86 Replace six.iteritems() with .items()
1. As mentioned in [1], we should avoid using
six.iteritems to obtain iterators. We can
use dict.items instead, as it will return
iterators in PY3 as well, and dict.items/keys
is more readable.
2. In py2, the performance cost of building
a list should be negligible, see the link [2].
[1] https://wiki.openstack.org/wiki/Python3
[2] http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html

Change-Id: Ia082e1a64efbd5f3e9dc6830f14181cb30a11fd2
2016-12-09 10:17:25 +08:00
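
A minimal sketch of the change in the commit above, with invented data. The commented-out line shows the six-based form being replaced.

```python
counts = {'pass': 10, 'fail': 2}

# Before: for status, count in six.iteritems(counts): ...
# After: plain dict.items(), which needs no third-party helper.
total = 0
for status, count in counts.items():
    total += count

pairs = sorted(counts.items())
```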
Matthew Treinish fa9c804e45
Fix elasticsearch querying with recent e-r changes
A recent elastic-recheck commit, change
Ic6f115a6882494bf4c087ded4d7cafa557765c28 made backwards
incompatible changes to the api for the classifier object to use a
common config object. This was a good cleanup but failed to account
for things reusing elastic-recheck as a lib interface outside of the
repo (like openstack-health). This change refactors the
openstack-health usage to take into account these recent changes and
unbreak things.

Change-Id: I3afe7f0a4a4475084be54f8df69083105fb4a4fe
2016-10-03 15:07:10 -04:00
Tim Buckley 776c0a5683 Fix bad use of jsonify() in _get_data
When no test runs are found in the subunit2sql database we attempt to
return an empty result set in get_test_runs_for_test. However, in the
cached function _get_data() the empty response was wrapped in a
call to jsonify() causing the function to return a Response rather
than a serializable dict.

This patch removes the bad jsonify() call to avoid any serialization
errors for pages with no data.

Change-Id: I49d47bb998593f61fc4b1431d366c4cdd9bb3d2c
2016-08-30 14:36:29 -06:00
Jenkins 361f82df8d Merge "Clarify error message for unavailable RSS feeds" 2016-08-23 07:26:36 +00:00
Matthew Treinish 3a098f6250
Cache individual elasticsearch calls
When we introduced elastic-recheck querying to openstack-health we
started caching the entire api call whenever es was being queried on
the backend. At the time this made sense because there was a single
api call using elastic search so this was easier. But now that we have
multiple apis using elastic search data we should individually cache
the calls. There are 2 reasons for doing this:

1. We can share the cache of queries across api calls. We only use ES
to run e-r queries on failed runs. Even if looking at failures from a
different view there will be a lot of overlap in what we're searching
for.

2. With the addition of failure lists that depend on a date range our
cache hit rate goes way down for the api call. That's because our
caching strategy is a bit naive still and doesn't understand ranges so
we always have a cache miss if one of the bounds on the range misses.
To mitigate this, we cache the ES queries so the slow part of the query
will always be cached except for anything new.

Change-Id: Ibf1589e43598dba92e6b6e36f481c38ecc5116ee
2016-08-18 16:34:06 -04:00
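
The idea of caching each query instead of the whole API response can be sketched as follows. This is a toy in-memory illustration, not the dogpile.cache setup used by the project; every name here (classify_run, cached_classify, the two view functions, the run ids) is hypothetical.

```python
es_cache = {}   # run id -> cached classification result
es_calls = []   # records every simulated (slow) ES query


def classify_run(run_id):
    # Stand-in for a slow elastic-recheck/Elasticsearch query.
    es_calls.append(run_id)
    return 'classification-for-%s' % run_id


def cached_classify(run_id):
    # Cache per query, so any API call can reuse earlier results.
    if run_id not in es_cache:
        es_cache[run_id] = classify_run(run_id)
    return es_cache[run_id]


def recent_failures_view(failed_run_ids):
    return {run_id: cached_classify(run_id) for run_id in failed_run_ids}


def failures_for_test_page(failed_run_ids):
    return [cached_classify(run_id) for run_id in failed_run_ids]


recent = recent_failures_view(['run-1', 'run-2'])
per_test = failures_for_test_page(['run-2', 'run-3'])
```

Here 'run-2' appears in both views but is only queried once, and a request with a different date range would miss only on the runs it has not seen before.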
Jenkins 3c2d401a9b Merge "Add failed runs list to test page" 2016-08-16 15:16:47 +00:00
Tim Buckley b5699ea693 Clarify error message for unavailable RSS feeds
Now that RSS feeds give errors properly when no failures are
found vs. when invalid parameters are given (see also:
I48659140314a2d8737611dd0a86097f10c1f3ac8) the error message and
status code could use some clarification.

This changes the status code to 404 and specifies that no runs of
any kind could be found with a given metadata key/value pair.

Change-Id: I14c42e2d64203204197cf9a708fb26ecd9961898
2016-08-12 14:35:58 -06:00
Masayuki Igawa 8d9eb1da5d Fix RSS feed unavailable when no failure
This commit fixes the "RSS feed unavailable" error so that a zero-length
entry list is returned when the specified run_metadata_key and value
have no failures.

Closes-Bug: #1573630
Change-Id: I48659140314a2d8737611dd0a86097f10c1f3ac8
2016-08-11 15:09:47 -04:00
Matthew Treinish e86564d6a0
Add failed runs list to test page
This commit adds a new table to the per test page which shows all the
runs which have failed in the current view. This is useful for
debugging spikes in failures on the pass fail graph.

Change-Id: Idc8c5d41467ac0302a4e8a256e64af1bb73cae24
2016-08-10 17:52:25 -04:00
Jenkins f56b991a01 Merge "Add html tags to rss links" 2016-07-13 06:16:20 +00:00
Masayuki Igawa dcec891e5f Fix MemcacheIllegalInputError: Key contains spaces
This commit fixes MemcacheIllegalInputError: Key contains spaces by
replacing spaces with underscores. It seems like memcached doesn't
accept a key that contains spaces.

Change-Id: I8dfbcb33b7f8603c8d8f4411c02c84a2eb75937e
2016-07-01 14:20:23 +09:00
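
The fix in the commit above amounts to normalizing cache keys before they reach memcached. A minimal sketch, assuming a hypothetical make_cache_key helper and a '|' separator that are not taken from the project:

```python
def make_cache_key(*parts):
    # Hypothetical key generator: join the parts, then replace spaces
    # with underscores so the key contains no spaces.
    raw = '|'.join(str(part) for part in parts)
    return raw.replace(' ', '_')


key = make_cache_key('get_test_runs', 'tempest full', '2016-07-01 00:00')
```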
Masayuki Igawa 4db9a61471 Fix pandas functions warnings
This commit fixes pandas functions warnings. Some pandas functions, such as
rolling_mean() and resample(how=xx), are deprecated as of v0.18.0.

Change-Id: I1465e50821af2aaa77d5458205469c4eec1dab58
Closes-Bug: #1580447
2016-06-30 17:06:50 +09:00
Matthew Treinish 3ed878f6b8
Add caching on get_test_runs_by_build_name
This commit enables simple caching on the results from the
get_test_runs_by_build_name. This API call involves one of the slowest
subunit2sql DB calls we make and is a good candidate for caching.
Right now, we just cache on name and the request datestamps which will
result in a lot of potential cache misses as people request varied
dates and/or build_names. So there is probably room for additional
optimizations in the future.

Change-Id: I7582188f4f027bc86375056e5cf60cca3a760ce8
2016-06-29 16:00:49 -04:00
Masayuki Igawa 4176c79a16 Add html tags to rss links
This commit adds html tags to rss links. With this change, users can
follow links on their rss readers easily.

Change-Id: Ib3f0ac1e6e15ecca2aa823b11bbcc903230112c8
2016-06-17 15:33:52 +09:00
Jenkins 5f5935ded2 Merge "Remove failover logic outside of flask context" 2016-06-13 18:39:58 +00:00
Jenkins 284f97f661 Merge "Fix changed imports in dogpile.cache 0.6.0" 2016-06-12 22:16:00 +00:00
Jenkins 6401a45852 Merge "Turn off interpolation on raw test run numeric data" 2016-06-10 16:04:08 +00:00
Matthew Treinish 2a447a8c5d
Fix changed imports in dogpile.cache 0.6.0
In the latest release of dogpile.cache, 0.6.0, a couple of backwards
incompatible import rejiggering broke our usage in openstack-health's
distributed dbm backend proxy. This commit updates the imports to work
with 0.6.0 and newer versions of dogpile.cache.

Change-Id: Iad50ab66c88a2164146fea98edf435a445e0ee6c
2016-06-07 11:52:08 -04:00
Matthew Treinish 40744826b7
Turn off interpolation on raw test run numeric data
This commit removes the interpolation stage from the test_run numeric
data processing. The interpolation is useful when we're dealing with
a line plot since it can fill the gaps for missing data. But, since we
moved to graphing the numeric data as a scatter plot the interpolation
does more harm than good. It basically forms a line artificially in
the data wherever there is missing data. This is both confusing and
misleading so let's just not do it. However, the interpolation is
still useful for generating the mean and std dev data. Those are
essentially line plots (although because of nvd3 limitations they are
still displayed as scatter points) and having the gaps filled makes
it more useful, so we want to still use the interpolated data for
those values.

Change-Id: I55c40c6ca7145848c0da73c934851c20994df8d1
2016-06-07 11:24:43 -04:00
Matthew Treinish 62491309e6
Stop using deprecated jsonpify import
This commit updates the flask jsonpify extension import to stop using
the deprecated version. Starting in Flask 0.11, importing
jsonpify via flask.ext.jsonpify emits a deprecation warning asking to
update the import.

Change-Id: I266d0dcb0b71d97f4fb7edd0d5d7bf1cc8b36205
2016-06-06 16:24:51 -04:00
Matthew Treinish 83e0e1e755
Remove failover logic outside of flask context
This commit removes some failover logic to make things in the get
recent test failure list api run without a flask context. This was
originally added in an earlier version of the async cache worker
implementation to handle directly calling the method outside of an
api request (to refresh the cache). However, recent changes to that
have made this unnecessary as the inner function is all that's called
by the async thread.

Change-Id: If0e7e621fec6c20f52a4dca162272b7296681ffc
2016-06-02 20:08:56 -04:00
Matthew Treinish d5a542cc6e
Add DBM cache backend that uses memcached locking
This commit adds a new dogpile.cache backend that uses the memcached
distributed locking mechanism to enable async workers but still uses
the dbm file storage. This should enable us to cache large JSON blobs
without worrying about the memcached size constraints, but at the
same time reap the benefits of having an async worker update the
cache in the background. The tradeoff here is configuration complexity
because you still need to install memcached to leverage this.

Change-Id: Ied241ca1762c62a047bd366d7bd105028a884f30
2016-06-02 13:34:16 -04:00
Jenkins 75f5954a9e Merge "Add build_name to subjects of RSS feed" 2016-06-01 22:58:20 +00:00
Jenkins 8d430de448 Merge "Change Content-Type of RSS feeds to application/xml" 2016-06-01 22:53:07 +00:00
Masayuki Igawa 8d38d58d99 Add build_name to subjects of RSS feed
This commit adds a build_name to subjects of RSS feed, because it's
better than just a UUID of the job.

Change-Id: Ib8f2c0ac5d619c0b2f1f5d0729cbaf358676a50c
2016-05-31 15:30:21 +09:00
Masayuki Igawa f91cc92052 Change Content-Type of RSS feeds to application/xml
This commit changes the Content-Type of RSS feeds to 'application/xml'
for interoperability with the widest range of feed readers. Most
web browsers can also render an o-h feed as plain XML with it.

Change-Id: I6b3d8ce5abd43cfc2a96eab048fed7b56f337623
Closes-Bug: #1587251
2016-05-31 15:11:34 +09:00
Matthew Treinish 8b4ca90abc
Use _config_get() for the remaining config options
This commit switches the remaining config options to use the
_config_get() method to get a value with a default. This cleans
up the code a bit and also makes the interface for getting
configuration options consistent.

Change-Id: I99bfefe7d5395bc29179520eb242015ecd0b1400
2016-05-27 17:12:44 -04:00
Matthew Treinish fac99f7827
Enable setting None as a default value in _config_get()
This commit fixes a bug in _config_get(): when trying to set a default
value of None, the default path wouldn't be triggered. This fixes the
issue by switching the kwarg to use False instead of None and checking
specifically against that.

Change-Id: I3d8a881fca4d76c0f70ee2e04ea9328eb855bb46
2016-05-27 15:12:03 -04:00
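
The sentinel approach described in the commit above can be sketched with the standard configparser module. The function below is an illustrative reconstruction, not the project's _config_get(); its signature, the 'app' section, and the option names are assumptions.

```python
import configparser


def config_get(getter, section, option, default=False):
    """Return a config value, or `default` when one was supplied.

    False is the "no default given" sentinel, so None is a usable default.
    """
    if default is False:
        # No default supplied: let lookup errors propagate to the caller.
        return getter(section, option)
    try:
        return getter(section, option)
    except configparser.Error:
        return default


config = configparser.ConfigParser()
config.read_string('[app]\ndb_uri = sqlite://\n')

db_uri = config_get(config.get, 'app', 'db_uri')
cache_url = config_get(config.get, 'app', 'cache_url', default=None)
```

Checking `default is False` rather than truthiness is what lets None, 0, or an empty string act as real defaults.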
Jenkins c92913fb2e Merge "Update dogpile cache with an async worker" 2016-05-27 18:42:11 +00:00
Matthew Treinish a839a02be1
Update dogpile cache with an async worker
This commit changes how we refresh our cache from doing it directly on
stale requests to do it async in the background with a worker thread.
This leverages dogpile.cache's async_creation_runner mechanism to start
a background thread when the cache goes stale that will update the cache
while still returning the cached copy in the meantime.

To enable this you need to use a dogpile.cache backend that supports
using a distributed_lock. This does not include the default dbm backend;
using a memcached based backend, like dogpile.cache.pylibmc, is
recommended for enabling this functionality.

Change-Id: I0fd29839c72ca2fdfb4c4724bb3da7e283e3d27d
2016-05-25 16:50:43 -04:00
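
The refresh pattern in the commit above (serve the stale copy while a background thread regenerates it) can be shown with a toy cache built on the standard library. This is not dogpile.cache and does not use its async_creation_runner hook or a distributed lock; the class, its methods, and the ttl parameter are hypothetical.

```python
import threading
import time


class StaleWhileRefreshCache(object):
    """Toy cache: return the stale value while a thread refreshes it."""

    def __init__(self, ttl):
        self.ttl = ttl
        self._values = {}      # key -> (value, created_at)
        self._refreshing = {}  # key -> running refresh thread
        self._lock = threading.Lock()

    def _store(self, key, creator):
        value = creator()  # the slow part runs outside the lock
        with self._lock:
            self._values[key] = (value, time.monotonic())
            self._refreshing.pop(key, None)

    def get(self, key, creator):
        with self._lock:
            entry = self._values.get(key)
            if entry is not None:
                value, created_at = entry
                stale = time.monotonic() - created_at >= self.ttl
                if stale and key not in self._refreshing:
                    # Refresh in the background; only one worker per key.
                    worker = threading.Thread(
                        target=self._store, args=(key, creator))
                    self._refreshing[key] = worker
                    worker.start()
                return value  # cached copy, even if stale
        # Nothing cached yet: the first caller has to wait.
        self._store(key, creator)
        return self._values[key][0]

    def wait(self, key):
        # Block until any in-flight refresh for `key` has finished.
        with self._lock:
            worker = self._refreshing.get(key)
        if worker is not None:
            worker.join()
```

Only the first request for a key pays the full cost; later requests get an answer immediately and see the refreshed value once the worker finishes.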
Matthew Treinish 32ab136cf7
Skip ES lookups on runs with missing metadata
This commit fixes a bug when the data returned from subunit2sql is
missing the required fields, like in the case of periodic jobs, to
use the elastic-recheck classifier. Previously it was assumed all
runs being returned from subunit2sql would have an associated change
and the metadata associated with that, but this wasn't always true.
So now it will just skip trying to do an ES lookup if all the required
fields aren't present.

Change-Id: I395656c94499f658df1dd20cafa0783e394bc382
2016-05-25 11:49:14 -04:00
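
The guard described in the commit above could be sketched as below. The required field names and the classify callable are assumptions chosen for illustration; they are not confirmed by the commit message.

```python
# Hypothetical list of metadata fields needed before querying ES.
REQUIRED_FIELDS = ('build_change', 'build_patchset')


def classify_failed_runs(runs, classify, required=REQUIRED_FIELDS):
    """Call `classify` only for runs that carry every required field."""
    results = {}
    for run in runs:
        if not all(field in run for field in required):
            # e.g. a periodic job with no associated change: skip lookup.
            continue
        results[run['id']] = classify(run)
    return results


runs = [
    {'id': 'gate-run', 'build_change': '12345', 'build_patchset': '2'},
    {'id': 'periodic-run'},  # no change metadata
]
bugs = classify_failed_runs(runs, lambda run: 'bug-for-' + run['build_change'])
```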
Matthew Treinish 3076768c6b
Add elastic-recheck to status response
We recently added elastic-recheck/elastic-search as an additional data
source for openstack-health. However, adding the current configuration
was neglected in that patch. This commit fixes this oversight by
including a field for elastic-recheck in the status response. It will
let API users know if elastic-recheck is installed, configured, and if
so what the health of the elastic-search cluster is.

Change-Id: Ia76a26de930b13a4a7cd90dc0ef45bbcecc714f6
2016-05-23 10:57:36 -04:00
Jenkins f6f12919af Merge "Add elastic-recheck data querying" 2016-05-20 21:35:11 +00:00