fuel-plugin-lma-collector

Commit Graph

Author	SHA1	Message	Date
Andreas Jaeger	c929899400	Retire repository Fuel repositories are all retired in openstack namespace, retire remaining fuel repos in x namespace since they are unused now. This change removes all content from the repository and adds the usual README file to point out that the repository is retired following the process from https://docs.openstack.org/infra/manual/drivers.html#retiring-a-project See also http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011675.html A related change is: https://review.opendev.org/699752 . Change-Id: I8aded54f1b9f3b79f3a4bf8f607d3695b92f528b	2019-12-18 19:39:39 +01:00
Simon Pasquier	5c369ee965	Fix lint errors for the Heka module Change-Id: I336dba6c1913bb2af09538a183e679fbbbde2321	2016-06-22 16:33:20 +02:00
Swann Croiset	c679b05be7	Install explicit package version of Heka Change-Id: Ica6a6936cfd8f959758988f97af29d6489734484 Fixes-bug: #1590013	2016-06-08 07:51:28 +00:00
Swann Croiset	89882ae849	Remove the init script shipped by the heka package This prevents all collectors from being restarted as a side effect when '/etc/init.d/heka stop' is run. Change-Id: I4f743e765e5895f3c97505a166140bcd80f7ce34 Partial-bug: #1570850	2016-05-09 08:03:52 +00:00
Swann Croiset	13d1801c65	Prevent using init script to start Heka on controller nodes Change-Id: I3b01ac021f9e89ef74fbd82d7abc103a2f34399d Fixes-bug: #1570839	2016-05-04 14:34:39 +02:00
Swann Croiset	ebac150f8a	Separate the (L)og of the LMA collector This change separates the processing of the logs/notifications and metric/alerting into 2 dedicated hekad processes, these services are named 'log_collector' and 'metric_collector'. Both services are managed by Pacemaker on controller nodes and by Upstart on other nodes. All metrics computed by log_collector (HTTP response times and creation time for instances and volumes) are sent directly to the metric_collector via TCP. Elasticsearch output (log_collector) uses full_action='block' and the TCP output uses full_action='drop'. All outputs of metric_collector (InfluxDB, HTTP and TCP) use full_action='drop'. The buffer size configurations are: * metric_collector: - influxdb-output buffer size is increased to 1Gb. - aggregator-output (tcp) buffer size is decreased to 256Mb (vs 1Gb). - nagios outputs (x3) buffer size are decreased to 1Mb. * log_collector: - elasticsearch-output buffer size is decreased to 256Mb (vs 1Gb). - tcp-output buffer size is set to 256Mb. Implements: blueprint separate-lma-collector-pipelines Fixes-bug: #1566748 Change-Id: Ieadb93b89f81e944e21cf8e5a65f4d683fd0ffb8	2016-05-04 14:34:14 +02:00
Swann Croiset	728efcc745	Remove dashboard configuration by the heka module Because the dashboard is configured by heka::output::dashboard Change-Id: I8f41becf9733699167ce7192fcb225d4b0c48f0a	2016-04-26 13:38:53 +02:00
Swann Croiset	96df47af73	Increase the Heka poolsize on controllers On controller nodes, the Heka poolsize must be increased to handle the load generated by derived metrics from logs otherwise a deadlock can happen in the filter plugins and block heka. Fixes-bug: #1557388 Change-Id: I74362011d32d413f244c6cdb6e4625ed96759df0	2016-04-05 18:34:17 +02:00
Swann Croiset	b46fcb0417	Rotate hekad logs every 30 minutes if necessary This change rotates the hekad logs more frequently. It also rotates the log file when it reaches a certain size. Fixes-bug: #1561603 Change-Id: Ic08831b8abadd0e1f846e0f401dc74b15dd46b3c	2016-03-30 14:31:57 +02:00
Simon Pasquier	949d7a2fc6	Always install the latest version of Heka This is required to deal with plugin upgrades. Change-Id: I092943123635617cc8d1e733e2ce2abbc4045518	2016-03-23 08:09:36 +00:00
Éric Lemoine	ccdba23158	Move Pacemaker/Corosync code out of lma_collector This commit moves the Pacemaker/Corosync Puppet code from the lma_collector module to the Fuel-specific base.pp manifest. This involves the following changes: * Fuel's "pacemaker_wrappers::service" define is now used in base.pp to configure the LMA service resource to use the "pacemaker" provider. * To configure "pacemaker_wrappers::service" we need to know the Heka user. To avoid hacks that rely on private variables from the lma_collector and heka modules to determine the Heka user, both modules are changed to make the Heka user configurable. For this the "heka" class "run_as_root" parameter is removed in favor of a "user" parameter. * In other manifests we use a resource collector to make sure that the LMA service resource is not re-configured with the default provider. This part is a bit hackish, but we haven't been able to come up with a better way to address the issue. Change-Id: I0ed0bddb245dc3a65b034e5caec14a65cfa908cb Implements: blueprint lma-without-fuel	2016-01-29 12:50:57 +01:00
Simon Pasquier	6bd8565bd0	Fix nologin path Change-Id: I62e5287fb8292ba89c24f03c3e49cf2318f6a2cd Closes-Bug: #1523579	2015-12-23 09:14:00 +01:00
Simon Pasquier	2d1d6e6936	Add AFD plugins for OpenStack services This change introduces the first Anomaly and Fault Detection (AFD) filter plugins. These plugins return AFD events on the availability of the API endpoints, the API backends (as reported by HAProxy), and the service workers (eg nova-scheduler, nova-conductor, ...). Change-Id: I75bfb433e4e174659900f885040a1c2032efd470 Implements: blueprint alerting-lma-collector	2015-09-17 16:27:49 +02:00
Swann Croiset	b48edc132d	Change heka home directory before process starts The system user heka is now created before the package installation. Change-Id: I5d5a481f134f4c9f75a794c6955dbd2456b83e95 Fixes-bug: #1493937	2015-09-11 14:17:03 +02:00
Swann Croiset	16bc34148d	Fix issue on CentOS with Heka user The deployment on CentOS has been broken since `254eda4`. This patch always creates the 'heka' user defined in heka::params:user even if the hekad process runs as 'root'. This should work for both MOS 6.1 and 7. Change-Id: I9ec690735b10f149d4477f0b8a7ca3a7d0cc54c1	2015-08-27 15:24:32 +02:00
Simon Pasquier	254eda470b	Fix typo in heka/manifests/init.pp Change-Id: I26c0bc01166a456f208cc1f68dccd45ec6796ea2	2015-08-14 21:26:01 +02:00
Simon Pasquier	60b2ffac25	Configure Pacemaker to manage the LMA collector This change configures Pacemaker to manage the LMA collector service with proper ordering regarding the local RabbitMQ service. This also means that I removed the wrapper script that took care of checking the RabbitMQ availability before launching the hekad process on the controllers. Change-Id: I4e747083fb9876f06fde9914b626970e37d0b429 Implements: blueprint lma-aggregator-in-ha-mode	2015-08-14 20:44:17 +02:00
Simon Pasquier	35ded01e64	Update Heka to version 0.10 This change installs the latest version of Heka (0.10.0b0). This version of Heka is required because it comes with updated Lua plugins and modules for InfluxDB. Change-Id: I4cbb65603cc8e49679c1a89c5a3792c977e44b7a Implements: blueprint upgrade-influxdb-grafana	2015-07-31 10:04:13 +02:00
Simon Pasquier	b7e3ffb7fc	Revert "Update Heka to version 0.10" This reverts commit `2f32d2af5f`. The Process Input plugin has some issues with Heka 0.10.0b0. See issue [1] https://github.com/mozilla-services/heka/issues/1620 Change-Id: Id25e91e780952fdfd85e38947b04e935a785f65d	2015-07-30 11:54:26 +02:00
Simon Pasquier	2f32d2af5f	Update Heka to version 0.10 This change installs the latest version of Heka (0.10.0b0). This version of Heka is required because it comes with updated Lua plugins and modules for InfluxDB. Change-Id: Ibcb51909658d908979c9f13bdec6a754e2698df2 Implements: blueprint upgrade-influxdb-grafana	2015-07-27 14:56:19 +02:00
Simon Pasquier	efbda489c7	Remove non-ASCII characters Change-Id: If0edb8b20cb0f0ad68b172fe05eeede390890b44	2015-07-17 16:40:28 +02:00
Swann Croiset	bb6c74b246	Add Heka params to set maximum message injection This adds these Heka configuration options to global.toml. If not provided, the Heka default values are used, which are currently: * max_process_inject = 1 * max_timer_inject = 10 Change-Id: If1995fa505aec6ff3000af33c548730dd06d1046	2015-07-03 17:40:24 +02:00
Swann Croiset	3f52ddeac6	Allow larger Hekad messages The maximum size observed during a load test with 50 nodes is 158Kb, the default size is 64Kb. This is required by the elasticsearch buffered output, which can hit the size limit and eventually lose messages. The Heka log: Plugin 'elasticsearch_output' error: Message too big, requires 161024 (MAX_MESSAGE_SIZE = 65536) Change-Id: I8970435e2f710889e4b5d2c55a53572c042ef647	2015-05-29 15:34:51 +00:00
Simon Pasquier	067834e466	Dump Heka statistics periodically This change adds a cron job that sends SIGUSR1 to the hekad process every hour. Heka will then dump an internal report, which eventually becomes available in /var/log/lma_collector.log. Change-Id: I7e164a85a8222f60e7a625d1277528b819a17661	2015-05-12 09:47:33 +02:00
Guillaume Thouvenin	fb953f8af3	Wait for rabbitmq before starting lma_collector If we start lma_collector before the availability of rabbitmq cluster it will fail to connect to the lma queues and then, it will fail to start. It may take several long minutes before pacemaker starts the service. So we need to be sure that rabbitmq cluster is up and running before starting lma_collector. Change-Id: Ia254b744f4173f64ee3ab8200b2896ecc412d06f	2015-04-22 14:36:51 +00:00
Simon Pasquier	3d71e776b4	Add Apache license headers to Puppet manifests This change fixes the text of the LICENSE file too. Change-Id: Iaebc5a8fc174b4bfe12fa0fb917c6de79ebba334	2015-04-20 15:21:17 +02:00
Simon Pasquier	6354833881	Fix errors reported by puppet-lint This is required to enable the Fuel Plugins CI. Change-Id: I2220775503e5bfc21d63f0c7686f5376cac4e4ff	2015-03-23 15:00:14 +00:00
Simon Pasquier	8517b26293	Split into smaller tasks This change moves away from the big monolithic Puppet manifest. Instead we introduce separate tasks for each role that the plugin supports. Change-Id: I370c9e8267f86da742f5cca48f1fec8bc3d9c4a9	2015-03-05 15:20:04 +01:00
Simon Pasquier	c9ee4d30d9	Initial import of the LMA collector plugin This is an import of the initial LMA PoC code. For now, it only covers the collection of logs (notifications will be added in a subsequent commit). There's been a bit of rewrite to: - decouple the Heka configuration from the LMA collector. - run the Heka service as non-root when possible (Ubuntu only for now due to file permission issues on CentOS [1]). - adapt to version 0.9 of Heka. [1] https://bugs.launchpad.net/fuel/+bug/1425954 Change-Id: I4472b49a25e18e06984b5b29bdce18f917137bc8	2015-02-27 14:16:49 +01:00

29 Commits