This feature was broken and not stable enough for production deployment.
Related-bug: #1606831
Related-bug: #1643542
Change-Id: I0ce52ec01838d891c43d6e797617d3044a02d10f
This patch uses the generic AFD filter with new alarms to replace
the custom AFD for workers.
Blueprint: allow-all-alarms-to-be-specified-in-alarming-file
Change-Id: I6c432e60a16da5bb3c8d0ecd0bd22a1246fe6f82
This patch uses the generic AFD filter with new alarms to replace the
custom AFD for API backends.
Blueprint: allow-all-alarms-to-be-specified-in-alarming-file
Change-Id: Id139e45a9942a9c86a2d35d1966b083d9c75af89
This change improves InfluxDB write performance by increasing the maximum
number of points sent per InfluxDB request to 500.
InfluxDB recommends a batch size of 5,000, but that cannot be the default
configuration value because of the fixed size of Heka messages (currently
256K), which would lead to metrics being silently discarded.
Note that the InfluxDB accumulator flushes the data either when it holds
500 points or when it has received no data for at least 5 seconds.
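The flush behavior described above can be sketched as follows. This is a minimal Python illustration, not the actual Heka Lua filter; the class and method names are hypothetical, and only the size-or-idle batching logic is modeled:

```python
import time


class PointAccumulator:
    """Batch points and flush either when the batch reaches flush_count
    or when no data has arrived for flush_interval seconds (sketch of
    the accumulator behavior described in the commit, not real code)."""

    def __init__(self, flush_count=500, flush_interval=5.0, now=time.time):
        self.flush_count = flush_count
        self.flush_interval = flush_interval
        self.now = now  # injectable clock, useful for testing
        self.points = []
        self.last_activity = self.now()
        self.flushed_batches = []

    def add(self, point):
        self.points.append(point)
        self.last_activity = self.now()
        if len(self.points) >= self.flush_count:
            self.flush()

    def tick(self):
        # Called periodically (e.g. from a timer event): flush pending
        # points if nothing has been received for flush_interval seconds.
        if self.points and self.now() - self.last_activity >= self.flush_interval:
            self.flush()

    def flush(self):
        # In the real collector this would encode the points and send one
        # HTTP request to InfluxDB; here we only record the batch.
        self.flushed_batches.append(list(self.points))
        self.points = []
```

With flush_count=500 this bounds each InfluxDB request to at most 500 points while the idle timer guarantees data is never held for more than a few seconds.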
Co-Authored-By: Swann Croiset <scroiset@mirantis.com>
Change-Id: I7d238375dc0c231782983fc4901c9a32936fb08a
Partial-Bug: #1581369
In some environments (especially those using slow HDD drives), the
Elasticsearch backends may fail to ingest logs fast enough. As a result,
the log_collector service running on the controller nodes gets blocked.
To alleviate this issue, this change increases the bulk size for nodes
that generate lots of logs:
- controllers, which run the OpenStack API services in addition to Pacemaker.
- all nodes when the environment's log level is set to debug.
In such cases, the flush_count parameter is increased to 100 (instead of
the default of 10).
Change-Id: Ifdfbcb8ff0292f695dee4deab45560f126bde242
Closes-Bug: #1617211
This change adds a collectd plugin that gets metrics from the Pacemaker
cluster:
- cluster metrics
- node metrics
- resource metrics
Most of the metrics are collected only from the node that is the
designated controller, except for pacemaker_resource_local_active and
pacemaker_dc_local_active.
This change also removes the old 'pacemaker_resource' plugin, since the
new plugin provides the exact same metrics and notifications for the
other collectd plugins.
Finally, the plugin is also installed on the standalone-rabbitmq and
standalone-database nodes if they are present.
Change-Id: I8b5b987704f69c6a60b13e8ea982f27924f488d1
This patch removes the default parameters for the InfluxDB/Elasticsearch
HTTP port and address. These parameters are always provided by the
callers, and that is the preferred approach.
Change-Id: I5e346b71a7d639475f2fba92126f8d191f8cd5fd
This change increases the maximum number of points that are sent in a
single request. InfluxDB recommends a batch size of 5,000, so this is now
the default configuration value. Note that the InfluxDB accumulator
flushes the data either when it holds 5,000 points or when it has
received no data for at least 5 seconds.
Change-Id: If07b7d285d216855997254952ca6d7511cff65ec
Partial-Bug: #1581369
HTTP metrics are now statistics aggregated every 10 seconds.
A new metric, openstack_<service>_response_times, is emitted with these
values:
- min
- max
- sum
- count
- percentile
As a consequence, the previous metric (openstack_<service>_responses) is
removed.
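The aggregation can be illustrated with a short Python sketch. The function name is hypothetical, the real aggregation happens inside a Heka filter, and the percentile rank is an assumption for illustration (a nearest-rank 90th percentile is used here):

```python
def aggregate_response_times(samples, percentile=90):
    """Aggregate raw response times into the statistics carried by an
    openstack_<service>_response_times metric: min, max, sum, count
    and a percentile. Illustrative sketch only; the percentile rank
    and method are assumptions, not taken from the collector code."""
    if not samples:
        return None
    ordered = sorted(samples)
    # Nearest-rank percentile: index of the value at the given rank.
    rank = max(0, int(len(ordered) * percentile / 100.0 + 0.5) - 1)
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "sum": sum(ordered),
        "count": len(ordered),
        "percentile": ordered[rank],
    }
```

Emitting one such aggregate every 10 seconds replaces the per-request openstack_<service>_responses metric with a fixed, small number of values per window.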
Implements-blueprint: aggregated-http-metrics
Change-Id: I48e92df6f4baa7be942ad138b7f23c3d15f5a24e
This change separates the processing of logs/notifications and of
metrics/alerting into two dedicated hekad processes, named
'log_collector' and 'metric_collector'.
Both services are managed by Pacemaker on the controller nodes and by
Upstart on the other nodes.
All metrics computed by log_collector (HTTP response times and creation time
for instances and volumes) are sent directly to the metric_collector via TCP.
The Elasticsearch output of the log_collector uses full_action='block'
while its TCP output uses full_action='drop'.
All outputs of metric_collector (InfluxDB, HTTP and TCP) use
full_action='drop'.
The buffer size configurations are:
* metric_collector:
- influxdb-output buffer size is increased to 1Gb.
- aggregator-output (tcp) buffer size is decreased to 256Mb (vs 1Gb).
- nagios outputs (x3) buffer sizes are decreased to 1Mb.
* log_collector:
- elasticsearch-output buffer size is decreased to 256Mb (vs 1Gb).
- tcp-output buffer size is set to 256Mb.
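The full_action semantics above can be sketched as follows. This is a simplified, hypothetical model: real Heka buffers are sized in bytes (as listed above), and the actual behavior lives in the output plugins:

```python
from collections import deque


class OutputBuffer:
    """Bounded output buffer with Heka-like full_action semantics.
    Simplified sketch: items instead of bytes, names hypothetical."""

    def __init__(self, max_items, full_action="drop"):
        assert full_action in ("drop", "block")
        self.queue = deque()
        self.max_items = max_items
        self.full_action = full_action
        self.dropped = 0

    def push(self, msg):
        if len(self.queue) < self.max_items:
            self.queue.append(msg)
            return True
        if self.full_action == "drop":
            self.dropped += 1   # lose the message, keep the pipeline moving
            return True
        return False            # 'block': caller must retry, upstream stalls
```

This illustrates the trade-off behind the configuration: 'block' preserves logs at the cost of stalling upstream readers (acceptable for Elasticsearch output), while 'drop' sacrifices data to keep the metric pipeline live.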
Implements: blueprint separate-lma-collector-pipelines
Fixes-bug: #1566748
Change-Id: Ieadb93b89f81e944e21cf8e5a65f4d683fd0ffb8
On controller nodes, the Heka poolsize must be increased to handle the
load generated by the metrics derived from logs; otherwise a deadlock can
happen in the filter plugins and block Heka.
Fixes-bug: #1557388
Change-Id: I74362011d32d413f244c6cdb6e4625ed96759df0
This change also decreases max_retries from 3 to 2 to stay within the
50-second window. It makes it possible to retrieve a large number of
objects and avoids overloading the system, which previously performed 3
'zombie' requests every 50 seconds without collecting any metrics.
Partial-bug: #1554502
Change-Id: I60a7611bc82598831538da01245b87fb29a15c44
The new parameter 'queue' configures the 'Queue' option of the Python collectd
plugin.
Change-Id: I5f5b1a21dd777469c7ab56688946d169ae3d917b
Related-bug: #1549721
This commit is related to the usage and documentation of the
lma_collector::collectd::mysql class.
The following changes are made:
1. Make the "username" and "password" parameters required. Today
they default to the empty string, which doesn't make much sense.
2. Change the internal resource name from "nova" to "config". The
name "nova" was confusing as the collection of MySQL statistics
is unrelated to Nova. With this change the generated collectd
configuration file is named "mysql-config.conf", which makes
more sense than "mysql-nova.conf" and is consistent with other
collectd config file names we have (e.g. "python-config.conf").
3. Add a unit test for the class.
4. Adjust the documentation.
Change-Id: I281c28d9f4da7ae728615041e175845ad5829b34
This change refactors the lma_collector Puppet module with regard to the
processing of the OpenStack notifications, to get rid of the coupling
with Fuel. In particular, the configuration of the OpenStack services is
now done in the external manifests, since there was no point in having it
in lma_collector.
The change also removes workarounds that were necessary with older
versions of the plugin:
- heat-engine is now managed as a regular service.
- the can_exit flag is reverted to false for the AMQP plugins.
Finally, it properly restarts the Keystone service when necessary:
Keystone is executed as a WSGI application in Apache, so Apache needs to
be restarted if the Keystone configuration changes.
Change-Id: I39a2d25695449271b946ddcbca00cd8911dbdbb4
Implements: blueprint lma-without-fuel