277 Commits

Author SHA1 Message Date
Witek Bedyk c404f419ad Fix libvirt ping_checks documentation
Change-Id: I05f2df64d13e25edcd8e30e96919915dcf4fc0f4
2020-05-14 11:08:02 +00:00
Zuul 5148493fb7 Merge "Do not copy /sbin/ip to /usr/bin/monasa-agent-ip" 2020-04-27 16:43:40 +00:00
Joseph Davis 8d4bd979d5 Add remove configuration for matching arguments
The remove_config() function only removes an exact match for
the configured instance.
This change allows removing plugin configuration when not all of
the configuration is known.

Use case: A compute node has been removed, so any host_alive
ping checks that are configured for it should be removed.  But at
the time of removal the list of target_hostnames to match is
not known.

Change-Id: I8050e1eed68d7b64f7a968b061afa69fe2e86d72
Story: 2004539
Task: 28287
2020-04-23 15:56:21 -07:00
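
The partial matching described in 8d4bd979d5 can be sketched as follows. This is a minimal illustration only, not the agent's code: the helper name `remove_matching_instances`, the instance keys, and the rule that an instance matches when every supplied key/value pair is present are all assumptions.

```python
def remove_matching_instances(instances, match_args):
    """Return the instances that do NOT match every key/value in match_args.

    Hypothetical helper: an instance is dropped when all supplied
    arguments are present in it with equal values, even if the instance
    carries further keys that the caller does not know about.
    """
    if not match_args:
        # Nothing to match on, so remove nothing.
        return list(instances)

    def matches(instance):
        return all(instance.get(key) == value
                   for key, value in match_args.items())

    return [inst for inst in instances if not matches(inst)]


# Illustrative data; the key names are assumptions.
instances = [
    {"name": "compute-1 ping", "alive_test": "ping", "host_name": "compute-1"},
    {"name": "compute-2 ping", "alive_test": "ping", "host_name": "compute-2"},
]
remaining = remove_matching_instances(instances, {"host_name": "compute-1"})
```

Only the `host_name` is supplied here, yet the whole `compute-1` instance is removed, which is the case an exact-match removal cannot handle.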
KeithMnemonic 17ba1346b7 Do not copy /sbin/ip to /usr/bin/monasa-agent-ip
This patch removes the code that copies /sbin/ip to
/usr/bin/monasca-agent-ip. /sbin/ip has a limitation: it does not
work when copied to a new name that is longer than
2 characters. The error is:

./monasca-agent-ip a
Object "nasca-agent-ip" is unknown, try "ip help".

As this does not work on RHEL, SLES, or Ubuntu, this code
should be removed.

Change-Id: I439be00070eb1cf16416325f23a86fc7cd518acc
Story: 2001593
Task: 6543
2020-04-23 13:56:29 -04:00
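
The error quoted in 17ba1346b7 is consistent with the binary treating everything after the first two characters of its own file name as the object to operate on. The sketch below only emulates that observed behaviour; the function name `object_from_program_name` is hypothetical and the logic is inferred from the error message, not taken from the ip source.

```python
import os


def object_from_program_name(argv0):
    """Emulate the behaviour suggested by the error message.

    Assumption: when the program's base name is longer than two
    characters, the remainder is interpreted as the object name.
    """
    basename = os.path.basename(argv0)
    if len(basename) > 2:
        return basename[2:]
    return None


renamed = object_from_program_name("./monasca-agent-ip")
original = object_from_program_name("/sbin/ip")
```

Under this assumption the renamed copy yields `nasca-agent-ip`, matching the object named in the error, while the original two-character name yields nothing.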
Zuul ebd42eb5de Merge "Add Infiniband metrics plugin" 2020-01-16 08:25:26 +00:00
Doug Szumski e4aba27933 Add Infiniband metrics plugin
This plugin adds initial support for gathering Infiniband counters.

Story: 2007044
Task: 37859
Change-Id: Id04c34bd9aabd61ccd4ce22b30e515e4ca627561
2020-01-14 11:33:03 +00:00
Zuul 8af052695b Merge "Add support for gathering Slab memory usage" 2020-01-13 18:00:27 +00:00
Doug Szumski 0a4abd532e Add support for gathering Slab memory usage
This is useful, for example, when monitoring for Slab memory leaks.
Gathering this metric requires psutil 5.4.4 or later
(released on Apr 13th 2018).

Story: 2006815
Task: 37375
Change-Id: Ibe8def9e2a7c967a34236889aa03b287065abcdc
2019-12-19 15:23:24 +00:00
Stig Telfer 3406f8243c Ceph plugin updates for Luminous
Since the Luminous release of Ceph, the plugin no longer exports metrics
such as object storage daemon stats, placement groups and pool stats.
Check for the installed version of the Ceph command and parse results
according to version.

Include test data for Jewel and Luminous Ceph clusters.

Story: 2005032
Task: 29515

Change-Id: I0aef0db25f49545c715b07880edd57135e3beafe
Co-Authored-By: Bharat Kunwar <bharat@stackhpc.com>
Co-Authored-By: Doug Szumski <doug@stackhpc.com>
2019-12-18 10:59:08 +00:00
Witek Bedyk e62dcef4c4 Fix TOC section reference
Change-Id: Ic964802c63816adaf7808024d93fff9fd8028fed
Story: 2006753
Task: 37448
2019-11-14 17:53:36 +01:00
Zuul 300f9418de Merge "add X.509 certificate check plugin" 2019-11-14 16:41:55 +00:00
Guang Yee e1d73c4b5d add X.509 certificate check plugin
Currently we don't have any capability to monitor the internal TLS/SSL
certificates, e.g. SSL certificates used by MySQL for replication, RabbitMQ for
distribution, etc. The cert_check plugin is not adequate for this purpose
because it can only check certificates over HTTPS endpoints. Furthermore,
checking these internal certificates over the network is cumbersome
because the agent plugin would have to speak specific protocols.

This patch adds a cert_file_check plugin to detect the certificate expiry
(in days from now) for the given X.509 certificate file in PEM format.
Similar to the cert_check plugin, this plugin will emit a metric
'cert_file.cert_expire_days' which contains the number of days from now until
the given certificate expires. If the certificate has already expired,
this will be a negative number.

Change-Id: Id95cc7115823f972e234417223ab5906b57447cc
Story: 2006753
2019-11-13 08:35:54 -08:00
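
The value of 'cert_file.cert_expire_days' described in e1d73c4b5d can be illustrated with a small calculation. This is a sketch under stated assumptions, not the plugin's code: the function name `cert_expire_days` is hypothetical, the certificate's expiry timestamp is taken as already parsed, and rounding down to whole days is an assumption.

```python
from datetime import datetime, timezone


def cert_expire_days(not_after, now):
    """Whole days from `now` until `not_after` (negative once expired).

    Assumption: partial days are rounded down, which is what
    timedelta.days does for both positive and negative differences.
    """
    return (not_after - now).days


now = datetime(2020, 1, 1, tzinfo=timezone.utc)
valid = cert_expire_days(datetime(2020, 1, 11, tzinfo=timezone.utc), now)
expired = cert_expire_days(datetime(2019, 12, 30, tzinfo=timezone.utc), now)
```

An alarm on this metric would typically compare it against a threshold number of days, with negative values indicating a certificate that has already expired.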
Witek Bedyk 96afbc6b9c Update agent architecture diagram
Change-Id: I4c53c8ab6c22c9460734b4194a2dcba6b58b8c79
2019-10-28 16:58:48 +01:00
Matthew Oliver 833e5946fe Add swift_handoffs check plugin to monasca
A powerful metric to watch for a swift cluster is the
number of handoff partitions on a drive on a storage node.

A build up of handoff nodes on a particular server could
indicate a disk problem or a bottleneck somewhere in the cluster.
Better still, it can show when would be a good time to rebalance
the ring (as you'd want to do it when existing backend data
movement is at a minimum).

So it turns out to be a great visualisation of the health of
a cluster.

That's what this check plugin does. Each instance check takes
the following values:

  ring: <path to a Swift ring file>
  devices: <path to the directory of mountpoints>
  granularity: <either server or device>

To be able to determine primary vs handoff partitions on a drive
the swift ring needs to be consulted. If a storage node stores
more than one ring, an instance would be defined for each.

You give swift a bunch of disks. These disks are placed in what
swift calls the 'devices' location. That is a directory where a
mount point for each mounted swift drive is located.

Finally, you can decide on the granularity, which defaults to
`server` if not defined. Only 2 metrics are created from this
check:

  swift.partitions.primary_count
  swift.partitions.handoff_count

Along with the hostname dimension, a ring dimension will also be set,
allowing the graphing of handoff vs primary partitions for each ring.
When the granularity is set to device, an additional
dimension is added to the metric: the device name (the name of
the device's mount point). This allows the graphing and monitoring
of each device in a server if a finer granularity is required.

Because we need to consult the Swift ring there is a runtime
requirement on the Python Swift module being installed. But
this isn't required for the unit tests. Making it a runtime
dependency means when the check is loaded it'll log an error
and then exit if it can't import the swift module.

This is the second of two Swift check plugins I've been working on.
For more details see my blog post[1]

[1] - https://oliver.net.au/?p=358

Change-Id: Ie91add9af39f2ab0e5b575390c0c6355563c0bfc
2019-10-18 17:16:14 +02:00
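
How the granularity setting in 833e5946fe affects the dimensions attached to each metric can be sketched as below. This is an illustration only: the function name `build_dimensions` and the dimension keys `hostname`, `ring`, and `device` are assumptions based on the commit text, not the plugin's actual identifiers.

```python
def build_dimensions(hostname, ring, granularity="server", device=None):
    """Build the dimensions for one swift.partitions.* measurement.

    Hypothetical helper: `granularity` defaults to "server"; with
    "device" the mount point name is added as an extra dimension.
    """
    if granularity not in ("server", "device"):
        raise ValueError("granularity must be 'server' or 'device'")

    dimensions = {"hostname": hostname, "ring": ring}
    if granularity == "device":
        if device is None:
            raise ValueError("device granularity needs a device name")
        dimensions["device"] = device
    return dimensions


per_server = build_dimensions("storage-1", "object")
per_device = build_dimensions("storage-1", "object", "device", device="sdb")
```

With server granularity one pair of primary and handoff counts is reported per ring, while device granularity yields one pair per mounted drive.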
Matthew Oliver 0d929d01a8 Add swift_recon check plugin to monasca
Swift outputs a lot of statsd metrics that you can point directly
at monasca-agents. However there is another swift endpoint,
recon, that is used to gather more metrics.

The Swift recon (or reconnaissance) API is an endpoint each of the
storage node servers make available via a REST API. This API can
either be hit manually or via the swift-recon tool.

This patch adds a check plugin that hits the recon REST API
and sends metrics to monasca.

This is the first of two Swift check plugins I'm working on.
For more details see my blog post[1]

[1] - https://oliver.net.au/?p=358

Change-Id: I503d74936f6f37fb261c1592845968319695475a
2019-10-09 19:06:08 +02:00
Zuul 602636a6f9 Merge "trivial fix: Correct some spelling errors" 2019-07-30 13:58:33 +00:00
Nguyen Hai Truong f9b34043bf trivial fix: Correct some spelling errors
Small modification to correct spelling mistake.

Change-Id: Id2a98d9a4b3d1f36ed47b8cb36931e4267b69dcf
2019-07-15 13:45:29 +02:00
Witek Bedyk 9854fbc93f Add information about Prometheus plugin to README
Prometheus plugin support is emphasized in README.rst. Also formatting of
the first page is updated to improve readability. Prometheus section in
Plugins.md is fixed to be correctly referenced from table of contents.

Story: 2005625
Task: 30878
Change-Id: Icbf305435d1bacdeabd1654af5e14b58a3248282
2019-05-27 07:36:20 +00:00
Mariusz Karpiarz 7ad32d6eca [docs] Add cross-tenant role using the openstack CLI
Change-Id: Ib827e73c5ee3ff6fb741b936f9a13a83e766eca8
2019-04-12 15:53:59 +01:00
Zuul a4abb94996 Merge "Support setting statsd metric aggregation interval" 2019-02-28 13:43:41 +00:00
Doug Szumski 0aa392a527 Support setting statsd metric aggregation interval
This adds support for setting the statsd metrics aggregation interval
as part of Monasca setup. Setting this interval is useful for users
calculating rates from statsd metrics.

Story: 2005063
Task: 29607
Change-Id: I22f5f1700c438245fd7e98deb40d706358349b6c
2019-02-21 14:23:21 +00:00
Thomas Bechtold e68c8391ea docs: Be consistent with config options in the init_config paragraph
Currently this paragraph is not very readable due to a lot of variable
names which are sometimes written with ", and sometimes without.
Mark these variables now as inline code (with backticks) to make the
paragraph more readable.

Change-Id: I6c06d50404e570db8e791da89a0f95f98597798e
2019-02-06 11:08:22 +01:00
Dobroslaw Zybort 9ee96263e9 Add missing libvirt system dependency installation
Story: 2003347
Task: 24386

Change-Id: I0af9c8825795b9b8f315b62cbf8934ba5056ece5
2018-11-23 09:28:21 +00:00
Zuul ec6d387d90 Merge "Update Libvirt.md with Installation steps" 2018-11-19 15:09:02 +00:00
Pandiyan ab00021018 Update Libvirt.md with Installation steps
Story: 2003347
Task: 24386
Change-Id: I2f0807f73802a32e0afef2a308760455437d953c
2018-11-19 15:13:33 +01:00
Zuul 5756ed9b64 Merge "Improvements to formatting" 2018-10-01 17:27:38 +00:00
Lukasz Zajaczkowski f23ceda96b Enable the system metrics in Docker environment
Story: 2003093
Task: 23185
Change-Id: I9700a6fcb650fbcf983f2a4f145b430876f12429
2018-08-22 10:12:38 +02:00
Joseph Davis 9dc138cdbe Improvements to formatting
Just some improvements to the formatting to help make the markdown
more readable.  Did not try to address correctness of any of the
content.

Change-Id: I323ae1a942ca48e63416421407345b32aa2da121
2018-08-08 16:11:15 -07:00
Lukasz Zajaczkowski 029127aa42 Enable the process plugin in Docker environment
Change-Id: I860a960f900fcb89ce4c7ddb096f9c37cba9c4c9
Story: 2002783
Task: 22667
2018-07-09 10:57:33 +02:00
Zuul 89e563194e Merge "Removed dependency on supervisor" 2018-06-29 14:20:42 +00:00
Stefano Canepa a8a2bb845b Removed dependency on supervisor
To support python3 in the near future this was done:

* Removed dependency on supervisor.
* Added template configuration for systemd target that includes all
  services.
* Added templates configuration for systemd service for every single
  service.
* Changed monasca_setup to use the new templates.

In the meantime, the code was formatted to comply with pep8 settings, and
some other small changes were made to comply with pycodestyle and
pydocstring.

Task: 4126
Story: 2000975

Depends-On: https://review.openstack.org/#/c/566475/

Change-Id: I0d0c4ea41a830581d6b9f247fad6a2dda1f96cbe
2018-06-27 11:28:43 +02:00
weezhard 8198225294 Add Cloudkitty detection plugin (Cloudkitty - Rating Service)
Change-Id: I82415b88d131f8239e062bb73561cc45e26b3887
2018-05-26 00:40:55 +02:00
Lukasz Zajaczkowski 40ab64e0b8 Fix sample for ZooKeeper plugin
Story: 2002081
Task: 19750

Change-Id: Ib9128eea00e4a8e611decf0c8190ffeae14dc27a
2018-05-22 12:48:05 +02:00
pangliye b4d6d5b49b fix misspelling of 'monasca'
Change-Id: I6babe4c683310257a700b997765a2a9985edfa02
2018-05-17 11:44:01 +08:00
Zuul 818fcad4e4 Merge "Allow Keystone config in init_config for http check" 2018-04-24 12:05:54 +00:00
Dobroslaw Zybort e13de8083e Allow Keystone config in init_config for http check
Now the user will be able to configure one default Keystone config for all
services in `http_check.yml` instead of providing it one by one for
every instance.

Setting Keystone config in every instance is marked as deprecated.

Story: 2001843
Task: 12610

Change-Id: If52b52efab6cc14a7df583b1dc2596b04e6813bc
2018-04-23 11:11:02 +02:00
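
The fallback described in e13de8083e can be sketched as a lookup that prefers a per-instance setting and otherwise uses the shared one. This is a minimal sketch, not the plugin's code: the function name `effective_keystone_config`, the key name `keystone_config`, and the precedence of the instance value over the init_config value are all assumptions.

```python
def effective_keystone_config(init_config, instance):
    """Pick the Keystone settings to use for one http_check instance.

    Assumptions: both sections store the settings under the key
    "keystone_config", and a per-instance value (the deprecated style)
    wins over the shared default from init_config.
    """
    per_instance = instance.get("keystone_config")
    if per_instance:
        return per_instance
    return (init_config or {}).get("keystone_config")


# Illustrative values only.
init_config = {"keystone_config": {"keystone_url": "http://keystone:5000/v3"}}
shared = effective_keystone_config(init_config, {"name": "api", "url": "http://api"})
override = effective_keystone_config(
    init_config,
    {"name": "old", "keystone_config": {"keystone_url": "http://other:5000/v3"}},
)
```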
Zuul cdff90a106 Merge "Add support for k8s StatefulSet" 2018-04-11 17:08:45 +00:00
Zuul da8a8051b7 Merge "Avoid overwriting sys.path "ip" command" 2018-04-11 15:26:28 +00:00
Tobias Johansson 52643b7a7c Add support for k8s StatefulSet
Currently the collector doesn't recognize k8s StatefulSet.

Change-Id: If111279dc204704ac5e73d9f880eb4ecdd925297
Story: 2001757
Task: 12155
2018-04-11 13:43:21 +02:00
Lukasz Zajaczkowski b8bf71425b Add prerequisite for enabling all untunable metrics
Change-Id: Ie198ec9f0d97c41462ab3de8e4ae92426f8102a9
2018-03-20 14:17:36 +01:00
Zuul ceca989b2a Merge "Add LXC swap metric collector and fix lxc bug" 2018-03-13 14:13:55 +00:00
ShangXiao a0bbf784ef [Trivialfix]Modify a grammatical error
Modify a grammatical error by deleting "the" in docs/Ovs.md.

Change-Id: Ifb0e4cc1d1cd38fc0850d468b8e41ec028298726
2018-03-08 09:18:04 +00:00
Eduardo 1133a0a04f Add LXC swap metric collector and fix lxc bug
The LXC plugin throws an exception when trying to collect cpu metrics. This
patch fixes it (tests are passing) and adds a swap collector.

Change-Id: I3b12ac6ce199006bc1e024d2b2626657519e4f0b
Story: 2001563
Task: 6507
2018-03-07 16:54:46 -03:00
Dirk Mueller 5c9b2b1acc Avoid overwriting sys.path "ip" command
This is disastrous for the rest of the system when
run outside a venv, as it overwrites the system ip,
which then loses the capabilities needed to run for everyone else

Story: 2001593
Task: 6542

Change-Id: Ie0b7ef25b0f2cf6aca61adda4de5767ac2300cae
2018-02-23 14:02:03 +01:00
James Gu ae8771f414 Add monasca agent plugin to monitor cassandra health
Adds detection plugin to monitor cassandra service, including
the process, and the data directory through the args.

Change-Id: Ic2c20bc878527f607c0eb871e98a79c1521c0507
Story: 2001499
task: 6289
2018-02-01 11:51:57 +01:00
bandorf 44e6cadd12 Add new metrics for Cadvisor plugin
improvements for monasca self-monitoring
add metrics:
- number of cores
- memory used (percentage)
- file system used (percentage)

story: 2001407
task: 6099

Change-Id: I11dd367543b6c17b9935aa4826345dd5df721445
2018-01-18 14:16:57 +01:00
Zuul 5a68eb6c59 Merge "Update congestion agent plugin" 2018-01-11 21:35:13 +00:00
Fouad Benamrane 4e4faac80c Update congestion agent plugin
This commit corrects some mistakes in the documentation
and fixes small issues.

Change-Id: I0a1bda7353c7e742ceb36f182011b3b92dc18da1
2018-01-10 11:14:47 +01:00
Craig Bryant 50824d1170 Add max batch size for writing to API
Add configurable maximum size of batches of measurements to write
to Monasca API. Prevents killing Monasca API in memory limited
configurations. The default is no limit.

Change-Id: I2bf84501cc51c24843d7c3befd8f9dd42f010f0c
Story: 2001434
2018-01-03 22:53:53 -07:00
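
The batching described in 50824d1170 can be illustrated by splitting a list of measurements into chunks no larger than a configured maximum. This is a sketch under stated assumptions, not the agent's code: the function name `split_into_batches` is hypothetical, and treating 0 as "no limit" is an assumed way of modelling the stated default.

```python
def split_into_batches(measurements, max_batch_size=0):
    """Split measurements into lists of at most max_batch_size items.

    Assumption: a max_batch_size of 0 (or less) means no limit, so
    everything is sent as a single batch.
    """
    if max_batch_size <= 0:
        return [list(measurements)] if measurements else []
    return [measurements[start:start + max_batch_size]
            for start in range(0, len(measurements), max_batch_size)]


measurements = [{"name": "cpu.idle_perc", "value": v} for v in range(5)]
limited = split_into_batches(measurements, max_batch_size=2)
unlimited = split_into_batches(measurements)
```

Each resulting batch would then be written to the API in its own request, which bounds the size of any single payload the API has to hold in memory.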
Zuul 47bade9e09 Merge "Add network congestion detection to monasca" 2017-12-30 21:55:30 +00:00