masakari-monitors


Author	SHA1	Message	Date
Vasyl Saienko	f4769d3177	Honor libvirt.connection_uri in introspectivemonitor Do not use default uri, pick up from parameters. Change-Id: I8620aeab224ab37096656d20c4bcc3fe7e7f3f18	2024-02-13 16:36:14 +00:00
Vasyl Saienko	b89aeeea88	[introspectivemonitor] Fix syntax for python3 Change-Id: I4810e852d7fbf6c0140ab64414eed63f93d3f956	2024-02-13 16:36:06 +00:00
Vasyl Saienko	185003f7dc	Fix pep8 for new hacking Change-Id: I0bf55e6289fdbfcb6c564976f9040278fc8f9d71	2023-11-29 14:19:17 +02:00
suzhengwei	cf646dfce2	not retry to send notification for specific http exception When the host not added in the failover segment, it will raise 400(host with name *** could not be found). It should not retry to send notification in this case. Change-Id: I24a6aba97b834ae92dabe85196f01d27bb518b3c	2023-01-04 11:41:33 +08:00
Takashi Natsume	afcf34decc	Use daemon property instead of setDaemon method The setDaemon method of the threading.Thread was deprecated in Python 3.10 [1]. Replace the setDaemon method with the daemon property. [1]: https://docs.python.org/3.10/library/threading.html#threading.Thread.setDaemon Change-Id: I643251c0394b8e8ede8198f580549ef6f260a9de Signed-off-by: Takashi Natsume <takanattie@gmail.com>	2022-08-24 23:30:42 +09:00
Maksim Malchuk	7a44244f25	Libvirt auth support Related-Bug: #1965754 Change-Id: I46f63de4b8ca8e5acd5db9cb8b0d2e13393d666c Signed-off-by: Maksim Malchuk <maksim.malchuk@gmail.com>	2022-05-13 15:15:20 +00:00
Zuul	a89511e088	Merge "host monitor by consul"	2022-03-03 11:26:36 +00:00
Zuul	89c067a4d8	Merge "connection too much when large scale failure"	2022-02-28 08:39:59 +00:00
Takashi Kajinami	741cffdfe7	Use LOG.warning instead of deprecated LOG.warn The LOG.warn method is deprecated[1] and the LOG.warning method should be used instead. [1] https://docs.python.org/3/library/logging.html#logging.warning Change-Id: I7e3dc5d1897cd10b94e0a5a5a06db667cba7d443	2022-01-28 09:05:39 +09:00
sue	4ecfb34a09	connection too much when large scale failure When large scale failure, there would be too many host or instance failure notifications in a very short time. Each time when one notification to be sent to masakari, it needs to make client, which brings great pressure to keystone. This patch keep the client reusable when it is made. Until exception it will be made again. Change-Id: I39795bc796d3e2402881b8116cdc241aa2d60a9f	2022-01-26 15:19:25 +08:00
sue	7c476d07aa	host monitor by consul This is a new host monitor by consul. It can monitor host connectivity via management, tenant and storage interfaces. Implements: bp host-monitor-by-consul Change-Id: I384ad70dfd9116c6e253e0562b762593a3379d0c	2021-12-23 14:39:09 +08:00
zhaoleilc	50677c2ca8	Fix a typo This patch fixes a typo. Change-Id: Ib72b08823e6a086ef0efbb956a3c2cdb4366c217	2021-12-10 17:28:43 +08:00
zhaoyixin	5198a3c910	Fix some typos This patch fixes some typos. Closes-Bug: #1949540 Change-Id: Iddb519719f3aca80286e9ca1666040ac202e037e	2021-11-03 10:47:30 +08:00
Radosław Piliszek	c2d9a4f9cb	Fix hostmonitor to respect quorum Both cibadmin-based and crm_mon-based host status queryings were affected, allowing partitioned cluster to tell Masakari to evacuate hosts from the other partition (which nota bene include all remotes if applicable). Closes-Bug: #1878548 Change-Id: I0b1ca8a011ee4da162a2c3a986c1dab9a3d38190	2021-09-13 19:27:52 +00:00
Zuul	981cda7e83	Merge "Remove conditionals for an ancient openstacksdk"	2021-08-17 06:41:46 +00:00
sue	339218d0af	move DriverBase to the base dir Move the driver.py one level up, so that new monitor driver can reuse it. Change-Id: I3ec98682056d07c777d592e01d290af9d06d7ff1	2021-08-16 15:03:08 +08:00
Radosław Piliszek	fe4e09408c	Remove conditionals for an ancient openstacksdk Change-Id: I4b035925b212236e5a31cd3752d08d281501c26d	2021-07-28 19:41:38 +00:00
Zuul	6e84289a53	Merge "Fix hostmonitor hanging forever after certain exceptions"	2021-07-27 06:23:37 +00:00
Zuul	1244861394	Merge "Fix typos"	2021-07-23 18:44:42 +00:00
sue	e7154f3d77	Fix hostmonitor hanging forever after certain exceptions The hostmonitor, like other Masakari monitors, starts as an Oslo service (based on eventlet). The main thread is supposed to run a loop that has an internal wait mechanism (instead of reusing periodic_tasks from oslo_service). However, the loop could be broken, if an unexpected exception appeared, and it never ran again but the process was still alive (due to oslo_service not stopping). The example mentioned in the bug report is about unavailability of the Masakari API (and/or Keystone API) before notification sending. This exception is not caught early because SendNotification._make_client is called outside of the try block (unlike the actual notification sending). The exception bubbles up and stops the main loop, leaving a useless hostmonitor process. The user is unaware unless they notice the logs are no longer growing. While the general design begs for a revamp (we might get away with that by using Consul in the first place), the easy fix is to prevent exceptions breaking the loop completely so that the hostmonitor can continue to work and try to regain health. At the very least it will keep posting ERROR messages in the log which is more likely to be spotted in comparison to lack of logs (which is, unfortunately, less commonly considered an alerting situation). This change also fixes, adapts and robustifies the two relevant unit tests. Closes-Bug: #1930361 Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com> Change-Id: I7e3447dcddc7998e3e3c30f4f0019d91a99c79ce	2021-07-23 14:30:23 +00:00
ericxiett	31f5b47d38	Fix typos This patch fixes some typos. Closes-Bug: #1934845 Change-Id: I93e48dfa896f7b7112d42f7681469f3d3e55c50a	2021-07-07 14:08:49 +08:00
YeHaiyang	76d9e894c7	Fix several code comment errors in process monitor. Change-Id: I21617a0952c2d6704e7adb8397d032a21ba38d8e	2021-07-05 19:40:55 +08:00
YeHaiyang	6a496e7871	Replace "split(' ')" with "split()" in masakari-monitors By default, if split's step is None, runs of consecutive whitespace are regarded as a single separator. So there is no need to use split(' '). Change-Id: Idcda8dfcaf5fd5abfab106238f91acdd3166883f	2021-06-30 01:23:19 +00:00
Zuul	bc624feecd	Merge "Replaces yaml.load() with yaml.safe_load()"	2021-04-27 01:01:58 +00:00
Zuul	12d67aede4	Merge "Drop CAP_NET_ADMIN"	2021-03-25 17:04:47 +00:00
Zuul	22059afc26	Merge "Repeated check to determine host status"	2021-03-24 01:33:26 +00:00
suzhengwei	987584a1c6	Repeated check to determine host status The original test is adapted because the code now no longer overwrites the same status. Change-Id: Ic77f932f56974a66a092b15b0d211efd73b9fc9c Implements: bp retry-check-when-host-failure Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>	2021-03-22 20:36:40 +00:00
Mark Goddard	0cdbb23587	[hostmonitor] Add pacemaker_node_type option When running in a container, it might not be possible to use systemd to verify the status of Corosync and Pacemaker. In such case, allow the user to choose the stack being used. Change-Id: I44ce3be6b6fda3834f6df63861b0dcf546da46a1 Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>	2021-03-22 16:43:25 +00:00
Radosław Piliszek	07bd41f0b4	Drop CAP_NET_ADMIN It is not required for performing monitors' duties. Change-Id: Ib1297ce6e4fca0bfcb82d32b3669475d2011fbe1	2021-03-21 19:10:19 +00:00
suzhengwei	2932977f02	remove unused configration Change-Id: I0cd8dc2fa96001622b3d89534c43b84e6103e369	2021-01-07 17:54:56 +08:00
suzhengwei	9458a9f449	remove unused script Change-Id: I53ed72291abe997bf78fe93938cda7721bbb9f8c	2021-01-05 10:16:07 +08:00
Zuul	abba984abd	Merge "Remove unnecessary continue statement"	2020-11-13 10:45:03 +00:00
Zuul	a34ddd216e	Merge "Use keystoneauth1 config option loading for masakari client"	2020-09-17 10:19:38 +00:00
Zuul	649c583e71	Merge "Remove six"	2020-08-29 08:46:12 +00:00
Zuul	97da0fe7ec	Merge "Replace assertRaisesRegexp with assertRaisesRegex"	2020-08-29 08:42:35 +00:00
Zuul	3a3a04576d	Merge "repeated parsing"	2020-08-26 18:02:58 +00:00
Mark Goddard	e704045880	Use keystoneauth1 config option loading for masakari client If a custom CA file is configured via [api] cafile, currently communication with Keystone will fail, since the session is not created using this CA file. The [api] insecure option is also ignored. This change fixes the issue by using keystoneauth loading for the auth and session, to ensure all standard configuration options are supported. Change-Id: Idd58b72f7f5242e8135fec71b42adf5dd1852417 Closes-Bug: #1873736	2020-07-23 12:19:00 +01:00
Chuck Short	886543d9e7	Replace assertRaisesRegexp with assertRaisesRegex This replaces the deprecated (in python 3.2) unittest.TestCase method assertRaisesRegexp() with assertRaisesRegex(). Also add associated hacking check. Change-Id: I62d5b4c0259c6e2e0fee361542d4b1234ab0ea57 Signed-off-by: Chuck Short <chucks@redhat.com>	2020-06-24 01:12:05 +00:00
Hervé Beraud	8c27817b50	Remove elementtree deprecated methods All our supported runtimes [1] are compatible with the recommended alternatives. `Element.getchildren` [2] is deprecated since python 3.2 and will be removed in python 3.9, these changes switch usages to `list(elem)`. [1] https://governance.openstack.org/tc/reference/runtimes/victoria.html#python-runtimes-for-train [2] https://docs.python.org/3/library/xml.etree.elementtree.html#xml.etree.ElementTree.Element.getchildren Change-Id: Ib34b57e2a660b09bde7c158b4aa7d68e39e5cf36	2020-06-23 11:34:55 +02:00
Tushar Patil	5e05de3255	Remove unnecessary continue statement Removed unncessary continue statement. Change-Id: Idd953793488fc1407084988905251dbaa6336392	2020-06-23 01:31:02 +00:00
suzhengwei	425bb1d663	repeated parsing It repeatedly uses 'node_state_tag.get('uname')' to parse the hostname. But the hostname doesn't change in the loop. Change-Id: Icac10015698378a1901c664f37f11b4529bf03e1	2020-06-04 11:27:06 +08:00
jacky06	33d25e3cc3	Remove six We don't need this in a Python 3-only world. Change-Id: I56dca98d06458174af0f3d4dc585a754e8692f89	2020-05-13 23:12:26 +08:00
suzhengwei	f317a24a25	reset nova-compute process name In masakari-engine, it enable/disable nova-compute service by the process name. So process-monitor need to reset the nova process name to "nova-compute" when trigger process notification. Change-Id: Ia6f22bd1d183093bb0345f323a80268fb62df388 Close-Bug: #1858757	2020-04-21 09:27:16 +08:00
Sean McGinnis	92c934a8e3	Use unittest.mock instead of third party mock Now that we no longer support py27, we can use the standard library unittest.mock module instead of the third party mock lib. Change-Id: Ie5ee60235bafc1e7b3461dee29b83fb62125178e Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>	2020-04-18 11:54:18 -05:00
Zuul	e225e6d1bd	Merge "Check config file for hostname"	2020-04-02 02:29:28 +00:00
Liam Young	85dda1b0eb	Check config file for hostname When sending an alert from the instancemonitor check the monitors config file for the hostname before sending the alert. Change-Id: If11aa1abb1142941d6dcd00c46063d9015644978 Closes-Bug: #1866638	2020-04-02 00:23:46 +00:00
Andreas Jaeger	64e2b887db	Update hacking for Python3 The repo is Python 3 now, so update hacking to version 3.0 which supports Python 3. Fix problems found by updated hacking version. Update local hacking checks to work with newer flake8. Remove hacking and friends from lower-constraints, they're not needed for installation. Change-Id: Ic7903c61bde999685ca26b5a10d070c8d8d206a3	2020-04-01 21:07:40 +02:00
Liam Young	8cb4de9e65	Use hostname to avoid clash with section Switch to using 'hostname' rather than 'host' to specify the Hostname, FQDN or IP address of this host. This is to avoid a clash with a section of the same name [1]. [1] https://github.com/openstack/masakari-monitors/blob/master/masakarimonitors/conf/host.py#L87 Change-Id: I7d95b063c2eabbd8893857b5e1e7d342db0aebec Closes-Bug: #1866660	2020-03-30 09:21:03 +01:00
Liam Young	dc9b777724	Use crm_mon for pacemaker-remote deployments As described in bug #1728527 cibadmin does not expose the state of the pacemaker-remote nodes which means hostmonitor cannot track them. This change switches to use crm_mon to check the status of remote nodes if the new config option host.restrict_to_remotes to set to True. This will trigger host monitor to use crm_mon to monitor nodes and will only monitor nodes that are marked as remotes (not members). Change-Id: I3f2026805413504c875ea5f39eb036d44b26dd43 Depends-On: Iaa2251708616e9c69817bf5b346d795ea7a4d21b Closes-Bug: #1728527	2019-08-27 17:00:22 +00:00
jayashri bidwe	ae3ab24f9a	Remove deprecated shell scripts As per deprecation notes [1], removed all unused shell scripts from hostmonitor and processmonitor. [1]: https://docs.openstack.org/releasenotes/masakari-monitors/ocata.html Change-Id: I93761ce4685c258058cb2d6b2ccb2323636f33ff	2019-06-19 10:27:22 +05:30
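
Several commits in the log above describe small Python-level migrations (6a496e7871 on `split()`, afcf34decc on the `daemon` property, 8c27817b50 on `list(elem)`). The sketch below illustrates those three idioms using only the standard library. It is not code from masakari-monitors: the helper names `split_fields`, `make_daemon_thread` and `child_tags` and the sample inputs are hypothetical, chosen only for illustration.

```python
import threading
import xml.etree.ElementTree as ET


def split_fields(line):
    # With no separator argument, str.split() treats runs of consecutive
    # whitespace as one separator, so no empty strings are produced.
    # line.split(' ') would instead yield '' for each extra space.
    return line.split()


def make_daemon_thread(target):
    # Set the daemon attribute directly instead of calling the
    # deprecated Thread.setDaemon() method.
    thread = threading.Thread(target=target)
    thread.daemon = True
    return thread


def child_tags(xml_text):
    # list(elem) is the documented replacement for the removed
    # Element.getchildren() method.
    root = ET.fromstring(xml_text)
    return [child.tag for child in list(root)]


if __name__ == "__main__":
    print(split_fields("node1   online  standby"))
    print(make_daemon_thread(lambda: None).daemon)
    print(child_tags("<nodes><node/><remote/></nodes>"))
```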

134 Commits