Commit Graph

96 Commits

Author SHA1 Message Date
Andreas Jaeger c929899400 Retire repository
Fuel repositories are all retired in the openstack namespace; retire the
remaining Fuel repos in the x namespace since they are now unused.

This change removes all content from the repository and adds the usual
README file to point out that the repository is retired following the
process from
https://docs.openstack.org/infra/manual/drivers.html#retiring-a-project

See also
http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011675.html

A related change is: https://review.opendev.org/699752 .

Change-Id: I8aded54f1b9f3b79f3a4bf8f607d3695b92f528b
2019-12-18 19:39:39 +01:00
Swann Croiset a5c154d6ef Configure pagination for OpenStack collectd plugins
Change-Id: I32e83368b7d0d2e8b68d7f7a2df0d1b61653fa72
2017-02-09 12:54:36 +00:00
Swann Croiset b7c7e7bdc2 Remove the SMTP standalone alerting_mode
This feature was broken and not stable enough for production deployment.

Related-bug: #1606831
Related-bug: #1643542

Change-Id: I0ce52ec01838d891c43d6e797617d3044a02d10f
2017-01-09 13:23:14 +01:00
Swann Croiset 8bc378c486 Rename GSE alerting attribute
Change-Id: I30b16d7ef159242f9984b54a8ae344fbf6560314
2016-10-10 16:24:06 +02:00
Guillaume Thouvenin 9dbf48dbfe Replace the workers AFD filter
This patch uses the generic AFD filter with new alarms to replace
the custom AFD for workers.

Blueprint: allow-all-alarms-to-be-specified-in-alarming-file
Change-Id: I6c432e60a16da5bb3c8d0ecd0bd22a1246fe6f82
2016-10-06 09:05:30 +02:00
Guillaume Thouvenin 215f693307 Replace the API backends AFD filter
This patch uses the generic AFD filter with new alarms to replace the
custom AFD for API backends.

Blueprint: allow-all-alarms-to-be-specified-in-alarming-file
Change-Id: Id139e45a9942a9c86a2d35d1966b083d9c75af89
2016-10-05 15:41:55 +00:00
Simon Pasquier 2cc44ddba0 Increase the number of points per InfluxDB batch
This change improves InfluxDB write performance by increasing the maximum
number of points sent per InfluxDB request to 500.
InfluxDB recommends a batch size of 5,000, but this cannot be the default
configuration value because of the fixed size of Heka messages (currently
256K), which would lead to metrics being silently discarded.
Note that the InfluxDB accumulator will flush the data either when
it holds 500 points or when it has received no data for at least 5 seconds.

Co-Authored-By: Swann Croiset <scroiset@mirantis.com>

Change-Id: I7d238375dc0c231782983fc4901c9a32936fb08a
Partial-Bug: #1581369
2016-09-30 14:37:40 +02:00
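
The flush rule described in commit 2cc44ddba0 above (flush at 500 points, or after 5 seconds without data) can be sketched as follows. This is a minimal illustration in Python, not the plugin's actual code: the class name, method names, and the injectable clock are all hypothetical, and it assumes "hasn't data for 5 seconds" means no point was received during that time.

```python
import time


class PointAccumulator:
    """Hypothetical accumulator that flushes on batch size or on idle time."""

    def __init__(self, flush_count=500, flush_interval=5.0, clock=time.monotonic):
        self.flush_count = flush_count        # max points per batch (500 in the commit)
        self.flush_interval = flush_interval  # idle seconds before a forced flush
        self.clock = clock                    # injectable clock, an assumption for testing
        self.points = []
        self.last_received = clock()

    def add(self, point):
        """Store a point and return a full batch once the size limit is reached."""
        self.points.append(point)
        self.last_received = self.clock()
        if len(self.points) >= self.flush_count:
            return self._flush()
        return None

    def tick(self):
        """Call periodically; return pending points if no data arrived recently."""
        idle = self.clock() - self.last_received
        if self.points and idle >= self.flush_interval:
            return self._flush()
        return None

    def _flush(self):
        # Hand over the current batch and start a new, empty one.
        batch, self.points = self.points, []
        return batch
```

A caller would send each returned batch in a single request, so the batch size bounds the size of the message carrying it.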
Swann Croiset cc5aadb474 Do not send GSE to Nagios when activate_alerting=false
DocImpact
blueprint: alarming-refactoring

Change-Id: Ie343672afd4a222a3d7f920182a0b2e90e1fd6de
2016-09-20 09:41:54 +02:00
Swann Croiset 83db24f549 Increase the Elasticsearch bulk size when required
In some environments (especially using slow HDD drives), the
Elasticsearch backends may fail to ingest logs fast enough. As a result,
the log_collector service running on the controller nodes is blocked.

To alleviate this issue, this change increases the bulk size for nodes
that generate lots of logs:
- controllers which run OpenStack API services in addition to Pacemaker.
- all nodes when the environment's log level is set to debug.

In such cases, the flush_count parameter is increased to 100 (instead of
10 by default).

Change-Id: Ifdfbcb8ff0292f695dee4deab45560f126bde242
Closes-Bug: #1617211
2016-08-29 15:17:44 +00:00
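
The selection rule from commit 83db24f549 above can be sketched as a small function. This is only an illustration in Python: the function name, its arguments, and the constant names are hypothetical, and the real change lives in the deployment manifests rather than in code like this.

```python
DEFAULT_FLUSH_COUNT = 10       # default stated in the commit message
HIGH_VOLUME_FLUSH_COUNT = 100  # value used for nodes that generate lots of logs


def elasticsearch_flush_count(is_controller, log_level):
    """Return the bulk size for a node (hypothetical helper).

    Controllers, and every node when the log level is debug, get the
    larger bulk size; all other nodes keep the default.
    """
    if is_controller or log_level.lower() == "debug":
        return HIGH_VOLUME_FLUSH_COUNT
    return DEFAULT_FLUSH_COUNT
```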
Swann Croiset 7f1f3bd59f Configure AFD alarms against 'mysql_check' metric
Change-Id: Ib15fea4ab041243e44a61c9d54d1f154b02d34af
2016-08-26 15:23:07 +02:00
Swann Croiset 26c5788684 Check memcached service on controller nodes
The patch replaces the service_heartbeat mechanism.

Change-Id: I060e10320cf6f8b874a39037b1f9257ed1996342
2016-08-26 10:56:06 +02:00
Swann Croiset 5c4b3eb2e6 Add Python collectd plugin to check memcached availability
This plugin emits check metrics for memcached.

Change-Id: I5b0fba60d076080503e34f751fccaae801ca327a
2016-08-26 10:54:57 +02:00
Simon Pasquier 3a3ef6f2e3 Add Pacemaker collectd plugin
This change adds a collectd plugin that gets metrics from the Pacemaker
cluster:

  - cluster's metrics
  - node's metrics
  - resource's metrics

Most of the metrics are only collected from the node that is the
designated controller except pacemaker_resource_local_active and
pacemaker_dc_local_active.

This change also removes the 'pacemaker_resource' plugin, since the new
plugin provides the exact same metrics and notifications for the other
collectd plugins.

Finally, the plugin is also installed on the standalone-rabbitmq and
standalone-database nodes if they are present.

Change-Id: I8b5b987704f69c6a60b13e8ea982f27924f488d1
2016-08-11 14:53:43 +02:00
Éric Lemoine 27174d2196 Fix broken links in doc
This commit fixes links in the doc by removing the /user/
path element in doc URLs.

Change-Id: I753fd10a42cb4024b619aa1e6123cd2ac8526f68
2016-06-15 16:55:57 +02:00
Swann Croiset c679b05be7 Install explicit package version of Heka
Change-Id: Ica6a6936cfd8f959758988f97af29d6489734484
Fixes-bug: #1590013
2016-06-08 07:51:28 +00:00
Swann Croiset b2bb3f3ea9 Remove some default lma_collector::params
This patch removes default parameters for InfluxDB/Elasticsearch HTTP port
and address. These parameters are always provided by callers, and that is
the way to go.

Change-Id: I5e346b71a7d639475f2fba92126f8d191f8cd5fd
2016-06-01 09:42:28 +02:00
Simon Pasquier 5cda99d8c6 Revert "Increase the number of points per InfluxDB batch"
This reverts commit 567562faaf.

Change-Id: I14549b8cad02058c3352c71bad80ff2ad0dcd970
2016-05-26 14:25:07 +00:00
Simon Pasquier 567562faaf Increase the number of points per InfluxDB batch
This change increases the maximum number of points that are sent in a
single request. InfluxDB recommends a batch size of 5,000, so
this is now the default configuration value. Note that the InfluxDB
accumulator will flush the data either when it holds 5,000 points or
when it has received no data for at least 5 seconds.

Change-Id: If07b7d285d216855997254952ca6d7511cff65ec
Partial-Bug: #1581369
2016-05-24 12:01:22 +02:00
Swann Croiset 13d1801c65 Prevent using init script to start Heka on controller nodes
Change-Id: I3b01ac021f9e89ef74fbd82d7abc103a2f34399d
Fixes-bug: #1570839
2016-05-04 14:34:39 +02:00
Swann Croiset 391ca132b3 Emit aggregated HTTP metrics
HTTP metrics are now statistics aggregated every 10 seconds.
A new metric, openstack_<service>_response_times, is emitted with these
values:
- min
- max
- sum
- count
- percentile

Hence, the previous metric (openstack_<service>_responses) disappears.

Implements-blueprint: aggregated-http-metrics

Change-Id: I48e92df6f4baa7be942ad138b7f23c3d15f5a24e
2016-05-04 14:34:39 +02:00
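
The aggregation described in commit 391ca132b3 above can be sketched as a function that reduces one 10-second window of response times to the five listed values. This is a hedged illustration in Python, not the plugin's implementation: the function name is hypothetical, the commit does not say which percentile is computed (90 is an assumed default), and the nearest-rank method is one choice among several.

```python
def aggregate_response_times(samples, percentile=90):
    """Summarize one window of response times (hypothetical helper).

    `percentile` is an integer between 1 and 100; the value is computed
    with the nearest-rank method on the sorted samples.
    """
    if not samples:
        return None  # nothing to emit for an empty window
    ordered = sorted(samples)
    # Nearest rank: ceil(percentile * n / 100), using integer arithmetic.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "sum": sum(ordered),
        "count": len(ordered),
        "percentile": ordered[rank - 1],
    }
```

Emitting `sum` and `count` rather than a precomputed mean lets a consumer combine several windows correctly later on.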
Swann Croiset ebac150f8a Separate the (L)og of the LMA collector
This change separates the processing of the logs/notifications and
metric/alerting into 2 dedicated hekad processes. These services are
named 'log_collector' and 'metric_collector'.

Both services are managed by Pacemaker on controller nodes and by Upstart on
other nodes.

All metrics computed by log_collector (HTTP response times and creation time
for instances and volumes) are sent directly to the metric_collector via TCP.
Elasticsearch output (log_collector) uses full_action='block' and the
TCP output uses full_action='drop'.

All outputs of metric_collector (InfluxDB, HTTP and TCP) use
full_action='drop'.

The buffer size configurations are:
* metric_collector:
  - influxdb-output buffer size is increased to 1Gb.
  - aggregator-output (tcp) buffer size is decreased to 256Mb (vs 1Gb).
  - nagios outputs (x3) buffer sizes are decreased to 1Mb.
* log_collector:
  - elasticsearch-output buffer size is decreased to 256Mb (vs 1Gb).
  - tcp-output buffer size is set to 256Mb.

Implements: blueprint separate-lma-collector-pipelines
Fixes-bug: #1566748

Change-Id: Ieadb93b89f81e944e21cf8e5a65f4d683fd0ffb8
2016-05-04 14:34:14 +02:00
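
The difference between full_action='block' and full_action='drop' mentioned in commit ebac150f8a above can be illustrated with a toy bounded buffer. This Python sketch is not Heka code: the class, its method, and the returned status strings are hypothetical, and real output buffers are disk-backed queues rather than in-memory lists.

```python
class BoundedBuffer:
    """Toy output buffer with a size limit and a policy for when it is full."""

    def __init__(self, max_bytes, full_action):
        if full_action not in ("block", "drop"):
            raise ValueError("full_action must be 'block' or 'drop'")
        self.max_bytes = max_bytes
        self.full_action = full_action
        self.queue = []
        self.used = 0

    def offer(self, message):
        """Try to queue a message; report what happened to it."""
        if self.used + len(message) <= self.max_bytes:
            self.queue.append(message)
            self.used += len(message)
            return "queued"
        if self.full_action == "drop":
            return "dropped"  # the message is lost, the pipeline keeps running
        return "blocked"      # the caller must wait and retry; nothing is lost
```

With this model, an output using 'block' never loses messages but can stall the process feeding it, while an output using 'drop' keeps the process running at the cost of losing data.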
Swann Croiset 96df47af73 Increase the Heka poolsize on controllers
On controller nodes, the Heka poolsize must be increased to handle the load
generated by metrics derived from logs; otherwise, a deadlock
can happen in the filter plugins and block Heka.

Fixes-bug: #1557388

Change-Id: I74362011d32d413f244c6cdb6e4625ed96759df0
2016-04-05 18:34:17 +02:00
Swann Croiset 9cb06879fe Increase timeout to 20s for OpenStack collectd plugins
Also decrease max_retries from 3 to 2 to stay within the 50-second window.
This change allows retrieving a large number of objects and also avoids
overloading the system by performing 3 'zombie' requests every 50 seconds
without any metrics collected.

Partial-bug: #1554502
Change-Id: I60a7611bc82598831538da01245b87fb29a15c44
2016-03-09 18:25:30 +01:00
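
The arithmetic behind commit 9cb06879fe above can be checked with a short calculation. This Python sketch assumes that max_retries counts the total number of attempts, that each attempt can last up to the full timeout, and that there is no delay between attempts; the function name is hypothetical.

```python
WINDOW_SECONDS = 50  # the window mentioned in the commit message


def worst_case_seconds(timeout, max_retries):
    """Upper bound on the time spent if every attempt times out."""
    return timeout * max_retries


# With the new settings, all attempts fit inside the window.
fits_with_two = worst_case_seconds(20, 2) <= WINDOW_SECONDS
# Keeping 3 attempts with a 20s timeout would overrun the window.
fits_with_three = worst_case_seconds(20, 3) <= WINDOW_SECONDS
```

Under these assumptions, two attempts take at most 40 seconds, while three would take up to 60 seconds and exceed the 50-second window.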
Swann Croiset e427882db2 Add parameter to lma_collector::collectd::rabbitmq class
The new parameter 'queue' configures the 'Queue' option of the Python collectd
plugin.

Change-Id: I5f5b1a21dd777469c7ab56688946d169ae3d917b
Related-bug: #1549721
2016-03-03 11:06:34 +01:00
Simon Pasquier 55ba8c48c6 Document the lma_collector::afd::* classes
Change-Id: Ie395c439fbc5c05f8fc396d33c07f01989b43a01
Implements: blueprint lma-without-fuel
2016-02-23 14:25:00 +00:00
Simon Pasquier 4fa6843e6d Document the lma_collector::smtp_alert class
Change-Id: I366799f6be60674fa8dd930b713ae8a3070ad699
Implements: blueprint lma-without-fuel
2016-02-23 14:19:47 +01:00
Simon Pasquier 10c0ddc505 Document the lma_collector::notifications::metrics class
Change-Id: I0a914bef498f292ff3ba0d43a9a4acb15085d15b
Implements: blueprint lma-without-fuel
2016-02-23 10:56:31 +00:00
Jenkins 8c857cf6d7 Merge "Document the lma_collector::metrics::service_heartbeat class" 2016-02-23 10:55:53 +00:00
Jenkins 12216a5662 Merge "Document the lma_collector::metrics::heka_monitoring class" 2016-02-23 10:55:31 +00:00
Jenkins cc60eedfd2 Merge "Document the lma_collector::gse_policies class" 2016-02-23 10:55:21 +00:00
Jenkins 108f54b16a Merge "Document the lma_collector::gse_cluster_filter define" 2016-02-23 10:55:02 +00:00
Jenkins 818d32c5f3 Merge "Document the lma_collector::gse_nagios define" 2016-02-23 10:54:48 +00:00
Jenkins 61c722992b Merge "Document the lma_collector::aggregator classes" 2016-02-23 10:54:30 +00:00
Jenkins 2bd9220996 Merge "Document the lma_collector::afd_nagios define" 2016-02-23 10:54:09 +00:00
Jenkins d3255f2262 Merge "Document the lma_collector::afd_filter define" 2016-02-23 10:53:53 +00:00
Jenkins 1059ae2e99 Merge "Document the lma_collector::influxdb class" 2016-02-23 10:53:40 +00:00
Simon Pasquier f9da20a24c Document the lma_collector::metrics::service_heartbeat class
Change-Id: I04beea034476af546becfea73ba4393ace659b80
Implements: blueprint lma-without-fuel
2016-02-23 10:36:38 +00:00
Swann Croiset 59d8fd8604 Specify explicitly Neutron log file names parsed by Hekad
Fixes-bug: #1546424
Change-Id: Icacfb8f7c6b81817856df468aeb592978a8d26e8
2016-02-19 15:44:18 +00:00
Simon Pasquier 27f452fb12 Document the lma_collector::metrics::heka_monitoring class
Change-Id: Iab7af305aae408f63b51ec60eba3d39aab1c62ee
Implements: blueprint lma-without-fuel
2016-02-19 09:59:11 +01:00
Simon Pasquier 1bcf3fdd3a Document the lma_collector::gse_policies class
Change-Id: Icb65b93f3a6b737d818dc64f165d282d08943119
Implements: blueprint lma-without-fuel
2016-02-19 09:59:11 +01:00
Simon Pasquier 0170939e50 Document the lma_collector::gse_cluster_filter define
Change-Id: I08908d55844d301ec2e91156feb9849316a19646
Implements: blueprint lma-without-fuel
2016-02-19 09:59:10 +01:00
Simon Pasquier 495d9908f4 Document the lma_collector::gse_nagios define
Change-Id: I22088b4503541a372b0af91a38f167c82e3ca059
Implements: blueprint lma-without-fuel
2016-02-19 09:59:10 +01:00
Simon Pasquier a22d44b9ad Document the lma_collector::aggregator classes
Change-Id: Id4b3664e54319fa9b1192ade47d9e41f025aa6ae
Implements: blueprint lma-without-fuel
2016-02-19 09:59:10 +01:00
Simon Pasquier d59e3a6187 Document the lma_collector::afd_nagios define
Change-Id: I36be2de71d3de60668fdfc8b8397ff6c118ec4c2
Implements: blueprint lma-without-fuel
2016-02-19 09:59:10 +01:00
Simon Pasquier cc5ecd9f52 Document the lma_collector::afd_filter define
Change-Id: I634b028b1ef90c0197e04ed09bb9c92247400df3
Implements: blueprint lma-without-fuel
2016-02-19 09:59:09 +01:00
Simon Pasquier 6f213e1816 Document the lma_collector::influxdb class
Change-Id: I27531ce0ae9403449f417c2ee6ee77bceb191aa3
Implements: blueprint lma-without-fuel
2016-02-19 09:59:09 +01:00
Swann Croiset 98441edea0 Do not purge the collectd package configuration by default
This prevents the last applied manifest from purging the collectd configuration.

Fixes-bug: #1546091

Change-Id: Ib6c22910f4c9259920bb9ce079a0135deff31544
2016-02-17 09:45:41 +01:00
Éric Lemoine 54a2e4b4b9 Make changes to lma_collector::collectd::mysql
This commit is related to the usage and documentation of the
lma_collector::collectd::mysql class.

The following changes are made:

1. Make the "username" and "password" parameters required. Today
   they default to the empty string, which doesn't make much sense.
2. Change the internal resource name from "nova" to "config". The
   name "nova" was confusing as the collection of MySQL statistics
   is unrelated to Nova. With this change the generated collectd
   configuration file is named "mysql-config.conf", which makes
   more sense than "mysql-nova.conf" and is consistent with other
   collectd config file names we have (e.g. "python-config.conf").
3. Add a unit test for the class.
4. Adjust the documentation.

Change-Id: I281c28d9f4da7ae728615041e175845ad5829b34
2016-02-09 07:50:31 -08:00
Simon Pasquier 3f0cdd5061 Clean-up resources dealing with notifications
This change refactors the lma_collector Puppet module regarding the
processing of the OpenStack notifications to get rid of the coupling with
Fuel. In particular, the configuration of the OpenStack services is
done in the external manifests since there was no point in having it in
lma_collector.

The change also removes workarounds that were necessary with older
versions of the plugin:

  - heat-engine is now managed as a regular service.
  - the can_exit flag is reverted back to false for the AMQP plugins

Finally, it properly restarts the Keystone service if necessary:
Keystone is executed as a WSGI application in Apache so we need to
restart Apache if the Keystone configuration changes.

Change-Id: I39a2d25695449271b946ddcbca00cd8911dbdbb4
Implements: blueprint lma-without-fuel
2016-02-04 09:36:02 +01:00
Jenkins 06b6eb5e47 Merge "Add doc for lma_collector::collectd::mysql" 2016-02-02 15:21:49 +00:00