Elasticsearch has moved to the fuel-ccp-elasticsearch repo.
Remove the Elasticsearch code from here to avoid collisions
between similar Docker image names.
Change-Id: Ia8c74a335ffe9355e4c033d0998080ce56fb1d8f
Depends-On: Ic39eb474f42b25e55772cb95edd362e4be5623c3
9.0-2016-09-02-140322 is already gone, so let's use 9.0-latest instead.
It should keep pointing to the latest version for the foreseeable future.
Change-Id: I3c20c935b15904bdde8caeadd14532477a9420c1
This commit fixes the network-related graphs in Grafana. The
network statistics provided by Snap are cumulative values.
For example, bytes_recv is the number of bytes received
since the machine booted. For that reason, we need to
configure the Grafana network graphs to use the derivative()
function in SELECT queries.
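To illustrate why a derivative is needed, here is a minimal
Python sketch that turns cumulative counter samples into
per-second rates. The function name and the sample values are
hypothetical, and this only approximates what a derivative
over raw points computes; it is not the query used in the
dashboards.

```python
def derivative(points):
    """Convert cumulative samples into per-second rates.

    points: list of (timestamp_in_seconds, cumulative_value),
    sorted by timestamp. Returns a list of (timestamp, rate).
    """
    rates = []
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        if t1 <= t0:
            # Skip duplicate or out-of-order samples.
            continue
        rates.append((t1, (v1 - v0) / (t1 - t0)))
    return rates


# Hypothetical bytes_recv samples taken every 10 seconds.
samples = [(0, 1000), (10, 3000), (20, 3500)]
print(derivative(samples))
```

A counter reset (for example after a reboot) yields a negative
rate with this plain derivative.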
Change-Id: Icbf5a60c7bc3a34972a4045d454e8ab40917e391
In I7b2b98f379c49bdbf23177a038bdca9433d1c6e5, reviewers
suggested using "base" rather than "base-tools" as the base
image. This commit reverts that. Using "base" forces us to add
dependencies such as "dumb-init", which should really not be
the responsibility of authors of non-base images.
This commit also changes the way we specify versions of
Python dependencies in requirements.txt. Fixing versions
using `==` allows for more repeatable builds, but it also
means that we will quickly rely on old versions and will not
be able to guarantee that our code works with the latest
versions of our dependencies. Using `>=` is also consistent
with what we currently use in base-tools/requirements.txt.
requirements.txt was updated using openstack_requirement's
update-requirements tool.
Change-Id: I60d0d8e761da717e2f73485bae53f1e11e3aebba
alarm-manager is responsible for watching a configurable
location within the filesystem where the user can put
a YAML file which defines alarms. When a change or creation is
detected, the YAML file is checked for proper contents,
and if verification is successful, Lua code is generated
as well as Lua configuration files. Hindsight will pick
up those changes after a certain period of time and
provide proper alarming to the platform.
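The validate-then-generate flow can be sketched as follows.
This is only an illustration: the alarm schema (name, metric,
operator, threshold), the function names and the layout of the
generated Lua are all assumptions, not the actual
alarm-manager format. The input is assumed to be already
parsed from YAML into Python objects.

```python
# Hypothetical schema: the real alarm definition format may differ.
REQUIRED_KEYS = {"name", "metric", "operator", "threshold"}
VALID_OPERATORS = {"<", "<=", ">", ">=", "=="}


def validate_alarm(alarm):
    """Return a list of error strings; an empty list means valid."""
    if not isinstance(alarm, dict):
        return ["alarm must be a mapping"]
    errors = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - alarm.keys())]
    if "operator" in alarm and alarm["operator"] not in VALID_OPERATORS:
        errors.append(f"invalid operator: {alarm['operator']}")
    threshold = alarm.get("threshold", 0)
    if isinstance(threshold, bool) or not isinstance(threshold, (int, float)):
        errors.append("threshold must be a number")
    return errors


def lua_string(text):
    """Quote a Python string as a Lua string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_lua(value, depth=0):
    """Serialize nested dicts, lists, strings and numbers as Lua."""
    pad = "  " * (depth + 1)
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return lua_string(value)
    if isinstance(value, dict):
        rows = [f"{pad}[{lua_string(str(k))}] = {to_lua(v, depth + 1)}," for k, v in value.items()]
    elif isinstance(value, list):
        rows = [f"{pad}{to_lua(v, depth + 1)}," for v in value]
    else:
        raise TypeError(f"unsupported type: {type(value).__name__}")
    return "\n".join(["{"] + rows + ["  " * depth + "}"])


def generate_lua(alarms):
    """Validate every alarm, then render them as a Lua module."""
    for alarm in alarms:
        errors = validate_alarm(alarm)
        if errors:
            raise ValueError("; ".join(errors))
    return "local alarms = " + to_lua(alarms) + "\nreturn alarms\n"


print(generate_lua([{"name": "cpu-critical", "metric": "cpu_idle",
                     "operator": "<=", "threshold": 5}]))
```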
Change-Id: I7b2b98f379c49bdbf23177a038bdca9433d1c6e5
This change does two things:
* rename variables in Lua code
* add a "cluster" dimension to AFD annotations. This is required
  to display specific annotations in Grafana dashboards.
Change-Id: Iebafb5a63034bec937afedd2697b4a4cce964321
This commit adds alarm annotations to the Grafana System
dashboard. Annotations are displayed when the node status
changes (e.g. OKAY -> WARN).
Change-Id: I1eca90eb574ba3e985566709cc8cc353209da7d2
This commit adds Lua code for generating AFD (Anomaly and
Fault Detection) metrics based on the evaluation of alarms.
The Lua code was copied from the lma_collector Fuel plugin
[*], with changes to accommodate Hindsight and the versions
of lua_sandbox and lua_sandbox_extensions we rely on.
In the future we plan to move this Lua code into its own Git
repository, and the Hindsight Dockerfile will install the
Lua code in the image using Debian packages.
The afd_node_default_cpu_alarms.lua and
hindsight_afd_node_default_cpu_alarms.cfg.j2 files will be
removed. Instead the operator will configure alarms through
a YAML file, and we will use a sidecar container for
generating Lua tables including alarm definitions and
corresponding plugin configuration files.
[*] https://github.com/openstack/fuel-plugin-lma-collector/
Change-Id: If182c3a6453f7bf8b72f03af56a14ace109eaa68
This commit adds support for metrics with multiple values.
Multi-value metrics will for example be needed for alarming.
Change-Id: I496fa1925c389f2638cf9b99243fbf45d7d2dad7
Removed the decimal part of the number of hours in the
power-on state.
Ran the format-dashboards.py script on the updated JSON file.
Change-Id: Ib2b8bcde7c3d01908d39b5bdb55b6a6062005f1a
Note that we cannot gather SMART information from the host
within VMs.
See https://www.smartmontools.org/wiki/FAQ
and more particularly the
DosmartctlandsmartdrunonavirtualmachineguestOS
sub-entry.
Change-Id: Idee7d48e45a5a388061d196d1e07c55404780085
Currently the Heka image build sometimes fails with the
following error:
2016-09-07 18:08:09.219 2868 DEBUG fuel_ccp.build [-] heka: CMake Error at /tmp/heka/cmake/message_proto.cmake:2 (message):
Google protocol buffers 'protoc' must be installed, message.proto has been modified and needs to be regenerated.
This error is caused by the "make" process trying to
re-generate "message.pb.go" using "protoc", which is not
present. Re-generating "message.pb.go" is not necessary
and shouldn't be done.
This commit attempts to fix the issue by touching the
"message.pb.go" file to make sure the "make" process will
not attempt to re-generate it.
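The reasoning behind the fix can be sketched in Python. This
is a simplified model of a timestamp-based rebuild rule, not
the actual Heka build logic; the helper names are
hypothetical.

```python
import os


def needs_regeneration(target, source):
    """Simplified make-style rule: rebuild when the target is
    missing or older than the file it is generated from."""
    if not os.path.exists(target):
        return True
    return os.path.getmtime(target) < os.path.getmtime(source)


def touch(path):
    """Set the modification time to now, creating the file if needed."""
    with open(path, "a"):
        pass
    os.utime(path, None)
```

If "message.proto" has a newer timestamp than "message.pb.go",
needs_regeneration("message.pb.go", "message.proto") returns
True. After touch("message.pb.go") it returns False, so no
call to "protoc" is needed.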
Change-Id: I91f12a99c813be99ba5a24ff65ca786eb97fea0c
Until we find a better official and publicly available
location for hosting these binaries, they will be hosted on
bintray.com.
This should change when/if Intel provides access
to nightly binary builds. Please note that the binaries
have been produced using Intel's build scripts.
The Snap task file has been updated to take into account
the fact that CPU metrics are now dynamic and that, due to
Snap framework issue #1144, you cannot request a specific
instance of dynamic metrics.
The Grafana system dashboard has been updated to comply with
the Snap task change above.
Change-Id: I76a2eac0497c8e2024234aab5e117d173e136049
This commit fixes a bug where hostnames were not correct in
metrics collected by Snap and Hindsight. It relies on
Kubernetes' downward API and the spec.nodeName field [1].
The latter is only supported by Kubernetes 1.4 and higher,
and the deployment of stacklight-collector pods will fail if
Kubernetes 1.3 or lower is used.
[1] <https://github.com/kubernetes/kubernetes/pull/27880>
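As an illustration of how a collector process could consume
the node name, here is a minimal Python sketch. It assumes the
pod spec exposes spec.nodeName through the downward API as an
environment variable; the variable name NODE_NAME and the
function name are hypothetical and not taken from this change.

```python
import os
import socket


def resolve_hostname(environ=None, variable="NODE_NAME"):
    """Prefer the node name injected into the environment and
    fall back to the container's own hostname."""
    environ = os.environ if environ is None else environ
    # NODE_NAME is a hypothetical variable name (assumption).
    return environ.get(variable) or socket.gethostname()


print(resolve_hostname({"NODE_NAME": "node-1"}))
```

Inside a pod, the fallback value is typically the pod name
rather than the node name, which is why the injected value is
preferred in this sketch.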
Change-Id: I73cd35803a2201a09144bf925753156e47489cff
Depends-On: I293bb3aa113883c02f2e738f9d74291bf2f23d95
Partial-Bug: #1614484
Like any other Grafana dashboard, this dashboard is manually
created using the Grafana UI itself. It is then exported to
JSON using Grafana's "export" functionality. Finally, the
format-dashboards.py script is run on the JSON file to
format all the dashboards in the same way, and to increment the
version number each time a dashboard is updated.
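The formatting step could look roughly like the sketch below.
It is not the actual format-dashboards.py script: the choice
of sorted keys and two-space indentation is an assumption, and
the function name is hypothetical. It only relies on the
top-level "version" field of an exported dashboard.

```python
import json


def format_dashboard(raw_json, bump_version=True):
    """Normalize a dashboard's JSON layout and optionally
    increment its top-level "version" field."""
    dashboard = json.loads(raw_json)
    if bump_version:
        dashboard["version"] = int(dashboard.get("version", 0)) + 1
    # Sorted keys and fixed indentation keep diffs between
    # exports small (assumed conventions).
    return json.dumps(dashboard, indent=2, sort_keys=True) + "\n"


print(format_dashboard('{"version": 3, "title": "System"}'))
```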
Change-Id: Ib2abcb53b5b116633490730186d4a5157c91c8cd
This commit adds an input Hindsight plugin that scrapes the
Kubelet stats API at a regular time interval. This is to
collect system metrics (CPU usage, etc.) for the pods
running on the cluster. The metrics created by the plugin
are injected into the Hindsight pipeline, and then read by
the InfluxDB plugin which sends them to InfluxDB for
storage.
Change-Id: I0b39d416ebc4e8090a959267d6fc813ddab2674a
This optimization avoids reading the
heka_service_pattern configuration parameter on each call to
process_message.
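The pattern behind this optimization, sketched in Python
rather than Lua: the parameter is read once when the plugin is
loaded and reused on every call. The class name, the
read_config callable and the sample pattern are hypothetical
stand-ins for the sandbox's configuration lookup.

```python
import re


class ServiceDecoder:
    def __init__(self, read_config):
        # Read and compile the pattern once, at load time,
        # instead of on every process_message call.
        self._pattern = re.compile(read_config("heka_service_pattern"))

    def process_message(self, program_name):
        """Return the first captured group, or None if no match."""
        match = self._pattern.match(program_name)
        return match.group(1) if match else None


def sample_read_config(name):
    # Hypothetical configuration source with a made-up pattern.
    return {"heka_service_pattern": r"^(\w+)-\d+$"}[name]


decoder = ServiceDecoder(sample_read_config)
print(decoder.process_message("nova-1"))
```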
Change-Id: I1d6299b263706661017920f28b4ebbc6a4cb30c3
Group RUN commands in the Dockerfile template.
Run git checkout in quiet mode so that we do not
get any spurious output that looks like an error
(such as: ... You are in 'detached HEAD' state ...).
Change-Id: I00edbdc36110f2e75ef9a3044bfd6077edc1916a
This commit provides an optimization to the RabbitMQ logs
decoder. The optimization involves not creating a Fields
table on each call to process_message.
Change-Id: I6ec7109ca5a8409824b4e52d371877651a29752d