Commit Graph

68 Commits

Author SHA1 Message Date
Andreas Jaeger 16f5f4cf44 Retire repository
Fuel (from openstack namespace) and fuel-ccp (in x namespace)
repositories are unused and ready to retire.

This change removes all content from the repository and adds the usual
README file to point out that the repository is retired following the
process from
https://docs.openstack.org/infra/manual/drivers.html#retiring-a-project

See also
http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011647.html

Depends-On: https://review.opendev.org/699362
Change-Id: I6b38110f2d230006cd9cce1da5d2cf76cf470d35
2019-12-18 09:54:55 +01:00
Peter Razumovsky bd90fc114a Remove elasticsearch code from stacklight repo
Elasticsearch moved to the fuel-ccp-elasticsearch repo. Because
of collisions between similar Docker image names, remove the
Elasticsearch code from here.

Change-Id: Ia8c74a335ffe9355e4c033d0998080ce56fb1d8f
Depends-On: Ic39eb474f42b25e55772cb95edd362e4be5623c3
2017-03-16 11:20:01 +00:00
Sergey Reshetnyak f602a189a3 Install openjdk from backports repo
Change-Id: I85c88f38fe8a5b74c943324884b9d61b33596879
2017-02-02 13:21:31 +03:00
Bartosz Kupidura 0d1024c807 Start using grafana from fuel-ccp-grafana
Change-Id: Ic8b059b95c0956a92abfc9fc754071a60c66649e
Depends-on: I89540f8dd656dd930c8b4d999aa195a295592ba4
2017-01-11 16:08:56 +01:00
Yuriy Taraday a7faada27b Switch hindsight to 9.0-latest snapshot repo
9.0-2016-09-02-140322 is already gone, so let's use 9.0-latest instead.
It should always point to the latest snapshot for the foreseeable future.

Change-Id: I3c20c935b15904bdde8caeadd14532477a9420c1
2016-11-07 16:34:03 +03:00
Yuriy Taraday 53c573e146 Convert parent image specification to image_spec calls
Change-Id: I619255d0efed4ba85bb622fb5c78690f68afddfb
Depends-On: I18281bdb41e91cd5c9160055f1617d7ee9d3b548
2016-10-18 12:53:32 +03:00
Éric Lemoine 9915c8ad1b Fix typo in Grafana graph
This fixes a typo in a Grafana graph.

Change-Id: Icdb1287d2b7f6acc99680de447d8c28eff63c645
2016-09-26 09:01:42 +02:00
Éric Lemoine 967f210a43 Fix network graphs in Grafana
This commit fixes the network-related graphs in Grafana. The
network statistics provided by Snap are cumulative values.
For example, bytes_recv is the number of bytes received
since the machine boot. For that reason, we need to
configure the Grafana network graphs to use the derivative()
function in SELECT queries.
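
The effect of derivative() on a cumulative counter can be sketched in
Python as below. This is an illustration only, not the Grafana or InfluxDB
implementation; the sample values are made up, and skipping counter resets
is an added assumption (closer in spirit to a non-negative derivative).

```python
def per_second_rates(samples):
    """Turn cumulative (timestamp, counter) samples into per-second rates."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        elapsed = t1 - t0
        if elapsed <= 0 or v1 < v0:
            # Assumption: skip out-of-order points and counter resets
            # (for example after a reboot) instead of emitting a negative rate.
            continue
        rates.append((t1, (v1 - v0) / elapsed))
    return rates


# Hypothetical bytes_recv readings (seconds, bytes since boot).
samples = [(0, 1000), (10, 6000), (20, 6000), (30, 16000)]
rates = per_second_rates(samples)
```

Plotting the raw counter would show an ever-growing line; the rates are
what a network throughput graph is expected to show.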

Change-Id: Icbf5a60c7bc3a34972a4045d454e8ab40917e391
2016-09-23 13:05:18 +00:00
Éric Lemoine 6941106f40 Use base-tools as the base for alarm-manager
In I7b2b98f379c49bdbf23177a038bdca9433d1c6e5 reviewers
suggested using "base" rather than "base-tools" as the base
image. This commit reverts that. Using "base" forces us to add
dependencies such as "dumb-init", which should really not be
the responsibility of authors of non-base images.

This commit also changes the way we specify versions of
Python dependencies in requirements.txt. Fixing versions
using `==` allows for more repeatable builds, but it also
means that we will quickly rely on old versions and will not
be able to guarantee that our code works with the latest
versions of our dependencies. Using `>=` is also consistent
with what we currently use in base-tools/requirements.txt.
requirements.txt was updated using openstack_requirement's
update-requirements tool.

Change-Id: I60d0d8e761da717e2f73485bae53f1e11e3aebba
2016-09-22 09:27:44 +00:00
Olivier Bourdon 4b0617cd24 Implement alarm-manager
alarm-manager is responsible for watching a configurable
location within the filesystem where the user can put
a YAML file which defines alarms. When a change or creation is
detected, the YAML file is checked for proper contents
and, if verification is successful, Lua code is generated
along with Lua configuration files. Hindsight will pick
up those changes after a certain period of time and
provide proper alarming to the platform.
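
The validate-then-generate flow described above could look roughly like the
Python sketch below. The alarm schema (name, metric, operator, threshold),
the operator set and the shape of the generated Lua table are all
hypothetical; the real alarm-manager format is not shown in this log. The
alarms are assumed to be already parsed from YAML into dictionaries.

```python
# Hypothetical schema, for illustration only.
ALLOWED_OPERATORS = {"<", "<=", ">", ">=", "=="}
REQUIRED_KEYS = ("name", "metric", "operator", "threshold")


def validate_alarm(alarm):
    """Return a list of problems found in one alarm definition."""
    errors = []
    for key in REQUIRED_KEYS:
        if key not in alarm:
            errors.append("missing key: " + key)
    if "operator" in alarm and alarm["operator"] not in ALLOWED_OPERATORS:
        errors.append("unknown operator: " + str(alarm["operator"]))
    if "threshold" in alarm and not isinstance(alarm["threshold"], (int, float)):
        errors.append("threshold must be a number")
    return errors


def render_lua(alarms):
    """Render validated alarms as a Lua table (no string escaping is done)."""
    lines = ["local alarms = {"]
    for alarm in alarms:
        lines.append("  {")
        lines.append('    name = "%s",' % alarm["name"])
        lines.append('    metric = "%s",' % alarm["metric"])
        lines.append('    operator = "%s",' % alarm["operator"])
        lines.append("    threshold = %s," % alarm["threshold"])
        lines.append("  },")
    lines.append("}")
    lines.append("return alarms")
    return "\n".join(lines)


alarms = [{"name": "cpu_high", "metric": "cpu_idle",
           "operator": "<", "threshold": 20}]
problems = [err for alarm in alarms for err in validate_alarm(alarm)]
lua_source = render_lua(alarms) if not problems else None
```

Generating nothing when validation fails keeps a broken YAML file from
replacing a working set of alarm definitions.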

Change-Id: I7b2b98f379c49bdbf23177a038bdca9433d1c6e5
2016-09-21 19:54:39 +02:00
Jenkins e1e524dfc5 Merge "Add a "container_name" field to OS services in ES" 2016-09-21 12:43:06 +00:00
Jenkins 92a334d84f Merge "Fixed inconsistency with SeverityLabel in ovs" 2016-09-21 08:23:37 +00:00
Jenkins 664014e099 Merge "Add cluster name to afd annotations" 2016-09-20 15:12:01 +00:00
Proskurin Kirill 6811aad561 Add a "container_name" field to OS services in ES
This allows us to search all logs from a specific container.

Change-Id: I6cf358e813d90d9fc23c5dfc29b36c48f79daca5
2016-09-20 12:06:30 +00:00
Proskurin Kirill eb36e6d794 Fixed inconsistency with SeverityLabel in ovs
Change-Id: I53182a4178095d2337fb8ebf795c209c7ec9668e
Closes-Bug: 1625224
2016-09-20 12:03:51 +00:00
Jenkins e411000414 Merge "Remove unneeded ENV, we already have it in base image" 2016-09-19 16:30:54 +00:00
Proskurin Kirill 58f2d78edb Remove unneeded ENV, we already have it in base image
Change-Id: Ibced400715943fb46f573dd50b01794bf03e375b
2016-09-19 14:00:50 +00:00
Éric Lemoine 8f98b80f21 Add cluster name to afd annotations
This change does two things:

* rename variables in Lua code
* add a "cluster" dimension to afd annotations. This is required
  to display specific annotations in Grafana dashboards

Change-Id: Iebafb5a63034bec937afedd2697b4a4cce964321
2016-09-19 12:43:12 +00:00
Jenkins 078796c566 Merge "Add alarm annotations in Grafana dashboard" 2016-09-16 10:25:41 +00:00
Jenkins 710e0c01bb Merge "Add Lua code for alarming" 2016-09-16 10:25:27 +00:00
Jenkins c13be0d39d Merge "Add cronjob for elasticsearch indices cleanup" 2016-09-15 22:46:37 +00:00
Proskurin Kirill 89567d8d14 Add cronjob for elasticsearch indices cleanup
Elasticsearch pod now has a cron container to run
some jobs for it.

Change-Id: Ia299e662681174512b0ad70b932caf66448b4cea
2016-09-15 14:06:53 +00:00
Jenkins ba90f98fbf Merge "Unhardcode elasticsearch and kibana version" 2016-09-15 12:42:28 +00:00
Éric Lemoine 86cfb80382 Add alarm annotations in Grafana dashboard
This commit adds alarm annotations to the Grafana System
dashboard. Annotations are displayed when the node status
changes (e.g. OKAY -> WARN).

Change-Id: I1eca90eb574ba3e985566709cc8cc353209da7d2
2016-09-15 12:14:21 +00:00
Éric Lemoine ed5934cd36 Add Lua code for alarming
This commit adds Lua code for generating AFD (Anomaly and
Fault Detection) metrics based on the evaluation of alarms.
The Lua code was copied from the lma_collector Fuel plugin
[*], with changes to accommodate Hindsight and the versions
of lua_sandbox and lua_sandbox_extensions we rely on.

In the future we plan to move this Lua code into its own Git
repository, and the Hindsight Dockerfile will install the
Lua code in the image using Debian packages.

The afd_node_default_cpu_alarms.lua and
hindsight_afd_node_default_cpu_alarms.cfg.j2 files will be
removed. Instead the operator will configure alarms through
a YAML file, and we will use a sidecar container for
generating Lua tables including alarm definitions and
corresponding plugin configuration files.

[*] https://github.com/openstack/fuel-plugin-lma-collector/

Change-Id: If182c3a6453f7bf8b72f03af56a14ace109eaa68
2016-09-15 12:13:27 +00:00
Proskurin Kirill d865e94bb1 Unhardcode elasticsearch and kibana version
Plus update them to the recent ones.

Change-Id: Ieee08bd4c261f17b3cf30782861b840124a9575f
2016-09-15 10:46:43 +00:00
Éric Lemoine 8149af754f Fix typo in Hindsight InfluxDB Lua code
Change-Id: If3d28328b2803d998bde877c008758b095876096
2016-09-14 17:37:13 +02:00
Éric Lemoine a6f365dfa5 Add support for multi-value metrics
This commit adds support for metrics with multiple values.
Multi-value metrics will for example be needed for alarming.

Change-Id: I496fa1925c389f2638cf9b99243fbf45d7d2dad7
2016-09-13 13:21:31 +00:00
Olivier Bourdon aa02be1d5c Add SMART data to Grafana System dashboard
Removed decimals part for number of hours in power on state
Ran format-dashboards.py script on updated JSON file

Change-Id: Ib2b8bcde7c3d01908d39b5bdb55b6a6062005f1a
2016-09-13 06:33:40 +02:00
Olivier Bourdon 0e7c59917e Add SMART monitoring
Note that we cannot gather SMART information from the host
within VMs.
See https://www.smartmontools.org/wiki/FAQ
and more particularly the
DosmartctlandsmartdrunonavirtualmachineguestOS
sub-entry

Change-Id: Idee7d48e45a5a388061d196d1e07c55404780085
2016-09-13 06:33:40 +02:00
Éric Lemoine 7b9a836eb6 Update Snap
This updates the Snap Dockerfile to use a new version of
Snap. This version includes the following fixes:

* https://github.com/intelsdi-x/snap-plugin-collector-smart/issues/27
* https://github.com/intelsdi-x/snap/issues/1164

Change-Id: I2d914ac0b075e11d5b609a5bd69084905a9485c4
2016-09-13 06:33:21 +02:00
Jenkins b968bc39b0 Merge "Use date_time lib for lpeg grammar parsing" 2016-09-12 07:59:44 +00:00
Éric Lemoine 70a6ce6a57 Make the Heka build less fragile
Currently the Heka image build sometimes fails with the
following error:

	2016-09-07 18:08:09.219 2868 DEBUG fuel_ccp.build [-] heka: CMake Error at /tmp/heka/cmake/message_proto.cmake:2 (message):
    Google protocol buffers 'protoc' must be installed, message.proto has been modified and needs to be regenerated.

This error is related to the "make" process trying to
re-generate "message.pb.go" using "protoc" which is not
present. And re-generating "message.pb.go" is not necessary
and shouldn't be done.

This commit attempts to fix the issue by touch'ing the
"message.pb.go" file to make sure the "make" process will
not attempt to re-generate it.
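
Why touching the file helps can be illustrated with the timestamp rule that
make-style builds generally follow: a target is rebuilt when it is missing or
older than its source. The Python sketch below only demonstrates that rule
with temporary files; it assumes message.proto is the prerequisite of
message.pb.go and is not taken from the Heka build scripts.

```python
import os
import tempfile


def needs_rebuild(target, source):
    """Make-style check: rebuild if target is missing or older than source."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(source)


with tempfile.TemporaryDirectory() as workdir:
    source = os.path.join(workdir, "message.proto")
    target = os.path.join(workdir, "message.pb.go")
    for path in (source, target):
        open(path, "w").close()

    # Simulate a checkout where the generated file looks older than its source.
    os.utime(target, (1000, 1000))
    os.utime(source, (2000, 2000))
    stale_before = needs_rebuild(target, source)

    # Equivalent of "touch message.pb.go": set its times to now.
    os.utime(target, None)
    stale_after = needs_rebuild(target, source)
```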

Change-Id: I91f12a99c813be99ba5a24ff65ca786eb97fea0c
2016-09-08 16:07:48 +02:00
Olivier Bourdon b21cc701a0 Use snap binaries to build image
Until we find a better official and publicly available
location for hosting these binaries, they will be hosted on
bintray.com

This should change when/if Intel provides access
to nightly binary builds. Please note that the binaries
have been produced using Intel's build scripts

The snap task file has been updated to take into account
the fact that cpu metrics are now dynamic and that due to
snap framework issue #1144 you can not request a specific
instance of dynamic metrics

The Grafana system dashboard has been updated to comply with
the snap task change above

Change-Id: I76a2eac0497c8e2024234aab5e117d173e136049
2016-09-07 16:00:05 +02:00
Éric Lemoine 78b13b7209 Set correct host name in metrics
This commit fixes a bug where hostnames were not correct in
metrics collected by Snap and Hindsight. It relies on
Kubernetes' downward API and the spec.nodeName field [1].
The latter is only supported by Kubernetes 1.4 and higher,
and the deployment of stacklight-collector pods will fail if
Kubernetes 1.3 or lower is used.

[1] <https://github.com/kubernetes/kubernetes/pull/27880>

Change-Id: I73cd35803a2201a09144bf925753156e47489cff
Depends-On: I293bb3aa113883c02f2e738f9d74291bf2f23d95
Partial-Bug: #1614484
2016-09-07 11:48:03 +02:00
Jenkins 2d5bb26ebe Merge "Add Kubernetes dashboard to Grafana" 2016-09-07 08:14:21 +00:00
Jenkins bcda9ee395 Merge "Collect kubelet stats" 2016-09-07 08:14:14 +00:00
Éric Lemoine 8fd1321dd6 Add Kubernetes dashboard to Grafana
Like any other Grafana dashboard this dashboard is manually
created using the Grafana UI itself. It is then exported to
JSON using Grafana's "export" functionality. Finally, the
format-dashboards.py script is used on the JSON file to
format all the dashboards in the same way, and increment the
version number each time a dashboard is updated.

Change-Id: Ib2abcb53b5b116633490730186d4a5157c91c8cd
2016-09-06 17:27:33 +02:00
Éric Lemoine 1b12557fe7 Collect kubelet stats
This commit adds an input Hindsight plugin that scrapes the
Kubelet stats API at a regular time interval. This is to
collect system metrics (CPU usage, etc.) relative to pods
running on the cluster. The metrics created by the plugin
are injected into the Hindsight pipeline, and then read by
the InfluxDB plugin which sends them to InfluxDB for
storage.
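
The scrape-and-convert step can be sketched in Python as follows. The actual
plugin is a Hindsight (Lua) input plugin and is not reproduced here. The
sketch assumes the response resembles the Kubelet summary format
(pods, podRef, containers, cpu.usageNanoCores); the metric name and the
sample payload are made up, and the HTTP request itself is left out.

```python
import json


def container_cpu_metrics(summary):
    """Flatten a stats summary into one metric dict per container."""
    metrics = []
    for pod in summary.get("pods", []):
        ref = pod.get("podRef", {})
        for container in pod.get("containers", []):
            cpu = container.get("cpu", {})
            if "usageNanoCores" not in cpu:
                continue  # no CPU sample for this container yet
            metrics.append({
                "name": "container_cpu_usage_cores",  # hypothetical name
                "value": cpu["usageNanoCores"] / 1e9,
                "tags": {
                    "namespace": ref.get("namespace", ""),
                    "pod": ref.get("name", ""),
                    "container": container.get("name", ""),
                },
            })
    return metrics


# Made-up payload standing in for one scrape of the stats API.
SAMPLE = json.loads("""
{"pods": [{"podRef": {"name": "web-1", "namespace": "default"},
           "containers": [{"name": "nginx",
                           "cpu": {"usageNanoCores": 250000000}},
                          {"name": "sidecar", "cpu": {}}]}]}
""")
metrics = container_cpu_metrics(SAMPLE)
```

Each resulting dict carries a value plus tags, which is the general shape an
output plugin needs to write a point to a time-series database.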

Change-Id: I0b39d416ebc4e8090a959267d6fc813ddab2674a
2016-09-06 17:19:05 +02:00
Jenkins adbf3947ae Merge "Add programname to mysql log messages" 2016-09-06 13:44:27 +00:00
Proskurin Kirill 8f9c00d8ad Use date_time lib for lpeg grammar parsing
Change-Id: Ia64cfd2d4a90a9e0fb3360b15fd80a391bcfc32f
2016-09-05 13:28:49 +00:00
Éric Lemoine 0fe1681f10 Use packages for Hindsight
Change-Id: I44594235ff723ee5b50cbe212530e57aaf704180
2016-09-02 16:17:25 +02:00
Jenkins 02af8cfcf6 Merge "Remove spurious build output and group Docker RUNs" 2016-09-02 10:01:19 +00:00
Jenkins 273d91b84c Merge "Read Heka openstack_log config only once" 2016-09-02 06:53:42 +00:00
Jenkins 55a3fb9834 Merge "Do not recreate Fields on each process_message" 2016-09-02 06:53:34 +00:00
Éric Lemoine 43c4bf8839 Read Heka openstack_log config only once
This optimization avoids reading the
heka_service_pattern configuration parameter on each call to
process_message.

Change-Id: I1d6299b263706661017920f28b4ebbc6a4cb30c3
2016-09-01 15:31:31 +02:00
Olivier Bourdon 36c37a3b10 Remove spurious build output and group Docker RUNs
Group RUN commands in Dockerfile template
Run git checkout in quiet mode so that we do not
get any spurious output looking like errors
(Like: ... You are in 'detached HEAD' state ...)

Change-Id: I00edbdc36110f2e75ef9a3044bfd6077edc1916a
2016-09-01 15:27:31 +02:00
Éric Lemoine 75b87abb01 Add programname to mysql log messages
Change-Id: Ida39edee36c502ec30015a3aec4de135866a78bb
Closes-Bug: #1618990
2016-09-01 15:13:01 +02:00
Éric Lemoine 105b0ca5f4 Do not recreate Fields on each process_message
This commit provides an optimization to the RabbitMQ logs
decoder. The optimization involves not creating a Fields
table on each call to process_message.
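
The pattern is to allocate the container once and refill it on every call.
The decoder itself is Lua; the Python analogue below only illustrates the
idea, and the "key=value" line format and the names used are hypothetical.

```python
# Allocated once at module load, then reused for every message.
_fields = {}


def process_message(line):
    """Decode one 'key=value key=value' line into the shared dict."""
    _fields.clear()
    for pair in line.split():
        key, _, value = pair.partition("=")
        _fields[key] = value
    return _fields


first = process_message("severity=INFO node=rabbit@host1")
first_copy = dict(first)  # copy if the values must outlive the next call
second = process_message("severity=ERROR")
```

The trade-off is that the returned object is overwritten by the next call, so
a caller has to consume or copy it before processing another message.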

Change-Id: I6ec7109ca5a8409824b4e52d371877651a29752d
2016-09-01 15:07:49 +02:00
Éric Lemoine be833f0e2a Use standard way for building Heka Snap plugin
This commit changes the way we build the Heka publisher
plugin [1] in the Snap Docker image. The process of building
this plugin has changed in Snap, and is now much simpler.

This commit also fixes the build of the Snap image, which
has been failing in the 3rd-party CI periodic build job [2].

[1] <https://github.com/intelsdi-x/snap-plugin-publisher-heka>
[2] <https://jenkins-tp.ng.mirantis.net/view/build/job/mcp-build-images-build-number/24/console>

Change-Id: I1f63bfc52d52d5101ae4610c357b9e061935628c
2016-08-31 13:15:36 +02:00