* Split out object/workers stats collection for Nova, Cinder and Neutron plugins
* Use the common interface exposed by collectd_base.Base
Change-Id: I59f698b8f09fd0d3ce375327d9e4d81d767d961c
This feature was broken and not stable enough for production deployment.
Related-bug: #1606831
Related-bug: #1643542
Change-Id: I0ce52ec01838d891c43d6e797617d3044a02d10f
puppet-lint installed from the master branch breaks the CI. This change
uses the official gem instead because the latest version now includes
the bug fix that wasn't released before.
Change-Id: I646176e30494cf1e8fac97c6ecebb3899ade8107
This patch uses the generic AFD filter with new alarms to replace
the custom AFD for workers.
Blueprint: allow-all-alarms-to-be-specified-in-alarming-file
Change-Id: I6c432e60a16da5bb3c8d0ecd0bd22a1246fe6f82
This patch uses the generic AFD filter with new alarms to replace the
custom AFD for API backends.
Blueprint: allow-all-alarms-to-be-specified-in-alarming-file
Change-Id: Id139e45a9942a9c86a2d35d1966b083d9c75af89
On controller nodes, the growing number of AFD filters puts too much
load on the Heka pipeline and can generate "idle packs" errors.
It was observed that a poolsize value of 200 solves the issue.
Change-Id: I1d5f9fea352e16e15b37828bc525906a06fadd0e
The collector services are managed by Pacemaker for the controller,
detached RabbitMQ and detached MySQL nodes. This change ensures that for
all these roles, the OCF script is created before the collector services
are configured.
Change-Id: I555b13f0433cccaa1297cd286dbb41d88de1d369
Closes-Bug: #1627968
This patch moves the installation of the OCF script to the beginning of
the deploy_start stage to ensure that it is available when Pacemaker
starts the collector resources. Because the installation requires a
configured Hiera, the hiera task is moved as well.
Change-Id: I90b4fa2a9038eaed0f1dcadb0f00713a1b2487b0
Closes-bug: #1575039
This patch adds a new plugin, check_local_endpoint.py, that checks
OpenStack services locally and emits a new metric,
openstack_check_local_api.
Change-Id: I58290dd685b97354137ad5c0b91aece79fd91695
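The local endpoint check described in the commit above could look roughly
like the following. This is only a minimal sketch, not the code of
check_local_endpoint.py: the function names, the 1/0 value convention, the
"status below 500 means healthy" rule and the injectable opener are all
assumptions made for illustration. Only the metric name
openstack_check_local_api comes from the commit message.

```python
import urllib.error
import urllib.request

# Assumed value convention: 1 when the local endpoint answers, 0 otherwise.
OK, FAIL = 1, 0


def check_local_endpoint(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return OK if the endpoint answers with an HTTP status below 500."""
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # The service answered, but with an error status (4xx or 5xx).
        status = exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure and similar.
        return FAIL
    return OK if status < 500 else FAIL


def build_metric(service, url, **kwargs):
    """Build a hypothetical metric record for one local service."""
    return {
        "name": "openstack_check_local_api",
        "service": service,
        "value": check_local_endpoint(url, **kwargs),
    }
```

The opener parameter exists only so that the check can be exercised
without a running service.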
This avoids polluting the Hiera namespaces after a rolling upgrade of
the plugin.
blueprint: alarming-refactoring
Change-Id: I28039f1688583af39d089d96a5ecd7683f55642d
This change adds a filter plugin that monitors the kernel log messages
for hard drive errors and reports the number of errors per second
as 'hdd_errors_rate'. The filter is configured for all nodes,
irrespective of their roles. An alarm is also added that triggers
a CRITICAL alert when the metric value is greater than 0.
DocImpact
Change-Id: I485f5692a3e5facf0f7ea019ccdbd70683a7dd4e
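The hard drive error rate described in the commit above can be illustrated
with a short sketch. It is written in Python for readability and is not
the filter plugin itself. The regular expression, the function names and
the "OKAY" status label are assumptions; only the metric name
hdd_errors_rate and the "greater than 0 is CRITICAL" rule come from the
commit message.

```python
import re

# Assumed patterns: the real filter may match different kernel messages.
HDD_ERROR_RE = re.compile(r"I/O error, dev \w+|Buffer I/O error on dev")


def hdd_errors_rate(lines, interval_seconds):
    """Return the number of matching kernel log lines per second."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    errors = sum(1 for line in lines if HDD_ERROR_RE.search(line))
    return errors / interval_seconds


def alarm_state(rate):
    """CRITICAL as soon as any error was seen during the interval."""
    return "CRITICAL" if rate > 0 else "OKAY"
```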
In some environments (especially those using slow HDD drives), the
Elasticsearch backends may fail to ingest logs fast enough. As a result,
the log_collector services running on the controller nodes are blocked.
To alleviate this issue, this change increases the bulk size for nodes
that generate lots of logs:
- controllers which run OpenStack API services in addition to Pacemaker.
- all nodes when the environment's log level is set to debug.
In such cases, the flush_count parameter is increased to 100 (instead of
10 by default).
Change-Id: Ifdfbcb8ff0292f695dee4deab45560f126bde242
Closes-Bug: #1617211
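The selection rule described in the commit above amounts to the following
decision. This is a sketch only: the function and parameter names are
hypothetical, and the real change is presumably expressed in the plugin's
deployment manifests rather than in Python. The values 10 and 100 come
from the commit message.

```python
DEFAULT_FLUSH_COUNT = 10       # default bulk size
HIGH_VOLUME_FLUSH_COUNT = 100  # bulk size for nodes that log heavily


def flush_count(is_controller, debug_logging):
    """Pick the flush_count value for a node.

    Controllers and any node in an environment whose log level is
    debug get the larger bulk size.
    """
    if is_controller or debug_logging:
        return HIGH_VOLUME_FLUSH_COUNT
    return DEFAULT_FLUSH_COUNT
```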
This removes duplication of code and limitations we had to deal with
because the collectd Puppet resources don't play well when they are
created at different times from several manifests.
Change-Id: I52fabb1fb5795a33f552168553a148b1520fc496
The latest version of puppetlabs_spec_helper (1.2.0) depends on
rubocop-rspec which itself requires at least Ruby 2.2.
Change-Id: Ica4b71296912a66a98b223c002d1e8bdd04111d6
This change adds a collectd plugin that gets metrics from the Pacemaker
cluster:
- cluster's metrics
- node's metrics
- resource's metrics
Most of the metrics are only collected from the node that is the
designated controller except pacemaker_resource_local_active and
pacemaker_dc_local_active.
The plugin also replaces the 'pacemaker_resource' plugin by providing the
exact same metrics and notifications for the other collectd plugins.
Finally the plugin is also installed on the standalone-rabbitmq and
standalone-database nodes if they are present.
Change-Id: I8b5b987704f69c6a60b13e8ea982f27924f488d1
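The collection rule described in the commit above (most metrics come only
from the designated controller, two are reported by every node) can be
sketched as follows. The function name and its arguments are hypothetical;
only the two metric names are taken from the commit message.

```python
# Metrics that every node reports, whether or not it is the designated
# controller (names taken from the commit message).
LOCAL_METRICS = (
    "pacemaker_resource_local_active",
    "pacemaker_dc_local_active",
)


def metrics_to_emit(all_metrics, is_designated_controller):
    """Return the subset of metrics that this node should report."""
    if is_designated_controller:
        return list(all_metrics)
    return [name for name in all_metrics if name in LOCAL_METRICS]
```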
This change uses the information that is already available in the
collector's Hiera data to decide whether the RabbitMQ collectd
plugin should be deployed or not.
Change-Id: Ib1df231d6bf99ee6f34ee199fd5241d6b264fc00
This patch adds a new collectd plugin to test the availability of libvirt
and configures AFD filters for all compute nodes.
These AFD filters are part of the nova global cluster.
Change-Id: I0944f7da69caf32ed6ac9c908d4241bc8c396994