This patch adds the service user rotation feature, which provides two
actions:
- list-service-usernames
- rotate-service-user-password
The first lists the service usernames whose passwords can be rotated.
The second rotates the given service user's password, and is tested via
the func-test-pr.
Change-Id: Ia94ab3d54cd8a59e9ba5005b88d3ec1ff87019b1
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1029
It removes the necessity to run the cron task as the root user
and ensures that content created in /var/lib/rabbitmq belongs
solely to the rabbitmq user and group.
Access for the nrpe user is then granted by adding it to the
rabbitmq group.
This is also implemented in the upgrade-charm hook for existing
deployments.
Closes-Bug: #1879524
Change-Id: I19e3d675ace7c669451ca40a20d21cef1aec6a95
The beam.smp process won't start if more than 1024 threads are
configured, and the charm could hit this by default on large systems
(e.g. more than 42 CPUs). This change makes
RabbitMQEnvContext.calculate_threads() never return more than 1024
(MAX_NUM_THREADS).
Change-Id: I92879445210bac6ee7d96a704cdf428ca738e3b6
Closes-Bug: #1768986
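A minimal sketch of the capping behaviour described above. The multiplier and function shape are assumptions for illustration; the real charm derives the value from its context class.

```python
import multiprocessing

# Upper bound from the commit: beam.smp fails to start with > 1024 threads.
MAX_NUM_THREADS = 1024

def calculate_threads(multiplier=24):
    """Derive a thread count from CPU cores, never exceeding 1024.

    The multiplier is illustrative; the point is the min() cap.
    """
    return min(multiprocessing.cpu_count() * multiplier, MAX_NUM_THREADS)
```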
This is a fix/workaround to the package upgrade bug that affects the
charm. The post-inst package script updates the .erlang.cookie if it is
insecure during the upgrade of rabbit from 3.8 to 3.9. This breaks the
series-upgrade, resulting in the charm erroring in the
post-series-upgrade hook.
This fix works by checking if the .erlang.cookie has changed during the
post-series-upgrade hook and either updating the cookie in peer storage
(if it is insecure) or ensuring that the cookie from peer storage is
written to the .erlang.cookie if it isn't the leader. This ensures that
the cluster continues to work and that the series-upgrade can be
completed across the cluster.
Change-Id: I540ea8da85b3b4326ccb8194f1d8b1050b04eae9
Closes-Bug: #2006485
Due to the @cache decorator in the code, it was possible to get the
charm into a state where RMQ is clustered, but the charm doesn't record
it. The charm 'thinks' it is clustered when it has set the 'clustered'
key on the 'cluster' relation. Unfortunately, due to the @cached
decorator, it's possible in the 'cluster-relation-changed' hook for the
RMQ instance to cluster during the hook execution and then, later, when
the charm is supposed to write the 'clustered' key, to read the
previous cached value from when it wasn't clustered and therefore not
set the 'clustered' key. This is just about the only opportunity to set
it, and so the charm ends up being locked.
The fix was to clear the @cache values so that the nodes would be
re-read, and this allows the charm to then write the 'clustered' key.
Change-Id: I12be41a83323d150ba1cbaeef64041f0bb5e32ce
Closes-Bug: #1975605
Units deployed before the implementation of the
cluster-partition-handling strategy won't have that key set in the
leader, making the charm believe there are pending tasks. This change
seeds the key, when it is not set, with the value present in the
charm's configuration.
Change-Id: Ifdae35ffee1ad7a8f4e5248c817cca14b69d9566
Closes-Bug: #1979092
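A minimal sketch of the seeding pattern. leader_get/leader_set are stubbed with a dict here; the real charm uses Juju leader-storage helpers, and the function name is an assumption for illustration.

```python
# Stub for Juju leader storage (illustrative only).
_leader_storage = {}

def leader_get(key):
    return _leader_storage.get(key)

def leader_set(**kwargs):
    _leader_storage.update(kwargs)

def seed_partition_handling(config_value):
    """Seed the leader key from charm config only if it was never set."""
    if leader_get("cluster-partition-handling") is None:
        leader_set(**{"cluster-partition-handling": config_value})

seed_partition_handling("ignore")
assert leader_get("cluster-partition-handling") == "ignore"
seed_partition_handling("autoheal")  # already seeded: not overwritten
assert leader_get("cluster-partition-handling") == "ignore"
```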
RabbitMQ server sometimes creates non-uniform output that nrpe
can't parse. Instead of breaking the check, this commit outputs
the error messages and continues the check.
This problem is most likely caused by queue state being
"down" [1]. However, because the current charm doesn't show such
information and the bug is hard to manually reproduce, this
commit adds the state attribute when creating queue_state file
for future debugging.
[1] https://www.rabbitmq.com/rabbitmqctl.8.html#state_2
Closes-Bug: #1850948
Change-Id: Iaa493c8270f344cde8ad7c89bd2bb548f0ad71bd
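A sketch of the tolerant-parsing approach described above, assuming tab-separated "vhost name messages state" lines; the column layout and function name are illustrative, not the charm's actual code.

```python
def parse_queue_stats(raw):
    """Parse 'vhost<TAB>name<TAB>messages<TAB>state' lines.

    Malformed lines are collected as error messages (reported, not
    raised) so the check can continue, mirroring the commit's approach.
    """
    rows, errors = [], []
    for line in raw.splitlines():
        parts = line.split("\t")
        if len(parts) != 4:
            errors.append("malformed line: %r" % line)
            continue
        vhost, name, messages, state = parts
        try:
            rows.append((vhost, name, int(messages), state))
        except ValueError:
            errors.append("bad message count: %r" % line)
    return rows, errors

rows, errors = parse_queue_stats("/\tmyqueue\t3\trunning\ngarbage line")
assert rows == [("/", "myqueue", 3, "running")]
assert len(errors) == 1
```

Recording the state column ("running", "down", ...) in the queue_state file is what gives future debugging a foothold when the output is non-uniform.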
Remove legacy checks from set_ha_mode in rabbit_utils.py, as they check
for versions of rabbitmq older than 3.0.0, which is not available in
the archives for any supported release.
Change-Id: Ib21f6ae3f30eabaaa8d677c20a555ded4e6851d6
Use the coordination module when cluster join events are called.
The `cluster_wait` method has been removed as it is no longer used
and `cluster_with` has been broken up into three new methods (
`clustered_with_leader`, `update_peer_cluster_status` and
`join_leader`) which can be called separately. The `modulo-nodes`
and `known-wait` charm options have been removed as they are no
longer needed.
Closes-Bug: #1902793
Change-Id: I136f5dcc855da329071e119b67df25d9045e86cc
Use the coordination module to manage package upgrades across the
cluster. To achieve this, some of the setup was moved into a new
configure_rabbit_install method which handles setup that is normally
run after an upgrade.
Change-Id: I8d244d96c83a5da164322faff873a72530ec9def
Use the coordination module to manage restarting the rabbitmq
services. This is to ensure that restarts are only
performed on one unit at a time, which helps prevent
situations that can cause the cluster to become
split-brained (e.g. if two or more nodes are restarted at
the same time).
* Manually run the _run_atstart & _run_atexit methods when actions
are run, as this does not happen automatically and is needed by
the coordination layer.
* Replace restart_on_change decorator with
coordinated_restart_on_change. coordinated_restart_on_change
includes logic for requesting restart locks from the coordination
module.
* The coordination module works via the leader and cluster events so
the hooks now include calls to check_coordinated_functions
which will run any function that is waiting for a lock.
* Logic has been added to check for the situation where a hook is
being run via the run_deferred_hooks actions. If this is the
case then restarts are immediate as the action should only be run
on one unit at a time.
Change-Id: Ia133c90a610793d4da96d3400a3906b801b52b73
Enabled the rabbitmq_prometheus plugin so prometheus can scrape
rabbitmq metrics and alert if a rabbitmq split-brain is
detected.
Integrated rabbitmq dashboards in grafana via the dashboards
relation.
Added new unit test cases.
Closes-Bug: 1899183
Change-Id: I88942dd0b246c498d0ab40b00d586d4349b0f100
Check that setting update is needed before applying a config
update to the cluster. This is mainly applicable to
rabbitmq-server > 3.8.2 which supports json output. If a
parser is not available to extract the existing settings
then the old behaviour of blindly applying the change
is used.
Closes-Bug: #1909031
Change-Id: I9599f69cc11ea8d1a4e9d618aecdab4afe488d96
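A sketch of the "check before applying" guard, assuming the current settings arrive as JSON (as with rabbitmq-server > 3.8.2); the function name and fallback behaviour are illustrative.

```python
import json

def needs_update(current_json, desired):
    """Return True when desired settings differ from what is already set.

    current_json mimics `rabbitmqctl ... --formatter json` output; when
    it cannot be parsed (older releases, no JSON support) we fall back
    to True, i.e. the old behaviour of blindly applying the change.
    """
    try:
        current = json.loads(current_json)
    except (ValueError, TypeError):
        return True
    return current != desired

desired = {"ha-mode": "all"}
assert needs_update('{"ha-mode": "all"}', desired) is False
assert needs_update('{"ha-mode": "exactly"}', desired) is True
assert needs_update("not json", desired) is True
```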
The mock third party library was needed for mock support in py2
runtimes. Since we now only support py36 and later, we can use the
standard lib unittest.mock module instead.
Note that https://github.com/openstack/charms.openstack is used during tests
and it needs `mock`; unfortunately it doesn't declare `mock` in its
requirements, so it retrieves mock from another charm project (a cross
dependency). So we depend on charms.openstack first, and when
Ib1ed5b598a52375e29e247db9ab4786df5b6d142 is merged, CI will pass
without errors.
Depends-On: Ib1ed5b598a52375e29e247db9ab4786df5b6d142
Change-Id: I98f432a771b5f6c966328d30629410a0a180dbee
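The switch described above is mechanical; a minimal example of the standard-library module replacing the third-party package:

```python
# unittest.mock (py3.3+) provides the same API as the third-party
# `mock` package, so `import mock` becomes `from unittest import mock`.
from unittest import mock
import os

with mock.patch("os.path.exists", return_value=True) as patched:
    assert os.path.exists("/no/such/path") is True
patched.assert_called_once_with("/no/such/path")
```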
Make the cluster-status action output user friendly (rabbitmq).
"rabbitmqctl cluster_status" uses escape codes to color/highlight the
output, and it does not have a way to suppress this. This makes the
output of the command "juju run-action rabbitmq-server/leader
cluster-status" not user friendly and difficult to read.
Add the json formatting option to the rabbitmqctl command and use
the json.dumps method to get a user friendly output.
Add unit test.
Closes-Bug: #1943198
Change-Id: I24380e24ff1edbede9c2db1671a4fc05d5a7cc63
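A sketch of the formatting step: parse the JSON that rabbitmqctl emits and re-dump it for readability. The sample payload is illustrative, not real rabbitmqctl output, and the function name is an assumption.

```python
import json

def format_cluster_status(raw_json):
    """Pretty-print `rabbitmqctl cluster_status --formatter json` output."""
    return json.dumps(json.loads(raw_json), indent=4, sort_keys=True)

# Illustrative sample payload; the real output carries many more keys.
sample = '{"running_nodes": ["rabbit@host-0", "rabbit@host-1"]}'
print(format_cluster_status(sample))
```

Because json.dumps emits plain text, the escape-code colouring problem disappears entirely rather than needing to be suppressed.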
Over time the management plugin has become a core part of managing
a rabbit deployment. This includes allowing tools such as nrpe to
be able to query the api and alert for situations such as orphaned
queues.
Change-Id: Icbf760610ce83b9d95f48e99f6607ddf23963c97
Partial-Bug: 1930547
Use json module to dump json in set_ha_mode rather than trying
to generate json using string interpolation. This fixes a bug
when using the 'nodes' mode which was generating invalid json.
The new function test is taken from Id7ef45b7001d26ede3fd61f97626b5e9e8b81196
Change-Id: Ieb49036389221f6fbf2db93fbe4aebe6e986ea21
Co-Authored-By: Trent Lloyd <trent.lloyd@canonical.com>
Use cluster-partition-handling strategy 'ignore' during charm
installation regardless of the charm config setting. Once the
leader has checked it is clustered with peers then it sets the
cluster-partition-handling strategy to be whatever the user set
in charm config.
Partial-Bug: 1802315
Change-Id: Ic03bbe55ea8aab8b285977a5c0f9410b5bbf35c8
TLS < 1.2 is considered insecure; where possible limit the versions
of TLS to 1.2 or higher, enabling support for TLS 1.3 when the
required erlang and rabbitmq versions are installed.
Change-Id: Iec5ab60488986f8e332ff0e9a11895822a61c1ee
Closes-Bug: 1892450
Func-Test-PR: https://github.com/openstack-charmers/zaza-openstack-tests/pull/668
Refactor methods which query rabbit to remove the duplication
around checking if json output is supported.
Change-Id: Id4e3dbd85748e41bb4b1c8db282495cfffaa823d
For newer RabbitMQ versions, switch to using the new ini style
configuration file format (rabbitmq.conf vs rabbitmq.config).
This allows the charm to configure a wider set of options and
is needed to support limiting the TLS versions used for
on-the-wire encryption.
Upgrades at RabbitMQ 3.7.0 should switch from old to new format
and file name.
Change-Id: I6deda5ecf5990d527e22373540074d2a4b7bad38
Func-Test-PR: https://github.com/openstack-charmers/zaza-openstack-tests/pull/668
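For illustration, limiting the on-the-wire TLS versions in the new ini-style rabbitmq.conf might look like the fragment below (key names are from the upstream configuration docs; the listener port and chosen versions are illustrative):

```ini
# new-style rabbitmq.conf (3.7.0+), sysctl/ini format
listeners.ssl.default = 5671
ssl_options.versions.1 = tlsv1.3
ssl_options.versions.2 = tlsv1.2
```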
When invoking the check_rabbitmq_queues script with wildcards for vhost
and/or queue parameters, the script output does not reflect precisely
which queues have a high number of outstanding messages, as the
information is consolidated under the wildcard.
This change fixes this behaviour by adding a new charm configuration
parameter which allows the user to specify the number of busiest queues,
n, to display should the check_rabbitmq_queues script report any
warnings or errors. The default, n=0, keeps the current script output.
This option is applicable regardless of the vhost:queue combination but
is specifically relevant when wildcards are passed as arguments.
Implementation displays the first n items in the stats list re-organized
in decreasing message count order.
Closes-Bug: #1939084
Change-Id: I5a32cb6bf37bd2a0f30861eace3c0e6cb5c2559d
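The re-ordering described above can be sketched as follows; the stats tuple shape and function name are assumptions for illustration.

```python
def busiest_queues(stats, n):
    """Return the n queues with the highest message counts (n=0: none).

    stats is assumed to be a list of (vhost, queue, message_count)
    tuples; sorting is by decreasing message count, per the commit.
    """
    if n <= 0:
        return []
    return sorted(stats, key=lambda s: s[2], reverse=True)[:n]

stats = [("/", "a", 5), ("/", "b", 50), ("/", "c", 20)]
assert busiest_queues(stats, 2) == [("/", "b", 50), ("/", "c", 20)]
assert busiest_queues(stats, 0) == []
```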
The check_rabbitmq_queues nrpe check accesses the cron file created
for running collect stats job. This is done in order to determine if
the stats are too old and an alert should be raised. The nagios user
does not have access to read the cron job when running in a hardened
environment where /etc/cron.d is not readable.
This change refactors this logic to move the calculation of maximum
age for a stats file from the check_rabbitmq_queues script and into
the rabbit_utils code where it is generating the nrpe configuration.
A new (optional) parameter is added to the check_rabbitmq_queues
script to accept the maximum age in seconds since the stats file
was last modified.
This change also removes the trusty support in hooks/install and
hooks/upgrade-charm as the rabbit_utils.py file needs to import a
dependency which is installed by the scripts. It is cleaned up to make
sure the croniter package is always installed on install or upgrade.
Change-Id: If948fc921ee0b63682946c7cc879ac50e971e588
Closes-Bug: #1940495
Co-authored-by: Aurelien Lourot <aurelien.lourot@canonical.com>
Improve the parsing of the cron schedule for /etc/cron.d/rabbitmq-stats.
The code makes assumptions that the user in the cron entry will be the
root user, which is generally safe as that's what the charm applied.
However, the parsing is brittle in that it depends on the 'root' string
in the entry. This changes the code so that the cron timer spec is
stripped out based on the column entries in the file.
Change-Id: I2d573e8942e840e0e5376f1537a2a3373fea3db8
Fixes-Bug: #1939702
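A sketch of the column-based extraction: /etc/cron.d entries have five timer fields, then the user, then the command, so splitting on whitespace avoids depending on the literal 'root' string. The function name is illustrative.

```python
def cron_timer_spec(entry):
    """Extract the 5-column timer spec from an /etc/cron.d entry.

    Works regardless of which user the entry runs as, since it keys
    off column positions rather than the 'root' string.
    """
    return " ".join(entry.split()[:5])

entry = "*/5 * * * * root /usr/local/bin/collect_rabbitmq_stats.sh"
assert cron_timer_spec(entry) == "*/5 * * * *"
```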
When a RabbitMQ cluster is restarted, the mnesia settings determine
how long and how often each broker will try to connect to the cluster
before giving up. It might be useful for an operator to be able to
tune these parameters. This change adds two settings,
`mnesia-table-loading-retry-timeout` and
`mnesia-table-loading-retry-limit`, which set these parameters in the
rabbitmq.config file [1].
[1] https://www.rabbitmq.com/configure.html#config-items
Change-Id: I96aa8c4061aed47eb2e844d1bec44fafd379ac25
Partial-Bug: #1828988
Related-Bug: #1874075
Co-authored-by: Nicolas Bock <nicolas.bock@canonical.com>
Co-authored-by: Aurelien Lourot <aurelien.lourot@canonical.com>
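A hedged sketch of how these settings might appear in the classic Erlang-term rabbitmq.config; the key names match the upstream configuration docs, while the values shown are illustrative, not the charm's defaults.

```erlang
[
 {rabbit, [
   %% how long each table-load attempt waits (ms), and how many
   %% attempts are made before the broker gives up on startup
   {mnesia_table_loading_retry_timeout, 30000},
   {mnesia_table_loading_retry_limit, 10}
 ]}
].
```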
`rabbitmqctl wait`'s default behavior changed recently
and a short timeout was introduced upstream. This
patch adapts our code in order to stay on the old,
intended behavior.
Change-Id: I020e3e9e4976e21da08316ac58642b2058564b02
Set a TTL on the topic queues engine_worker and heat-engine-listener
to stop them growing indefinitely after heat-engine restarts.
This is the rabbitmq-server part; e.g. we can set the heat TTL with:
juju config heat ttl=3600000
Closes-Bug: 1925436
Change-Id: I7b826fe965a200da29020a8f2c6148f76d10a2b0
If the rabbit cluster is partitioned, show that in status. This check
only works on focal+; prior to that the check is ignored.
Change-Id: Id45c969d37f8cb1c26d0f9834f4a79e7555dd03c
Closes-Bug: 1930417
Option '-e <vhost> <queue>' was added to the 'check_rabbitmq_queues.py'
nrpe script to allow excluding selected queues when checking queue
sizes. Corresponding option 'exclude_queues' was added to the
charm config.
By default, the following queues are excluded:
* event.sample
* notifications_designate.info
* notifications_designate.error
* versioned_notifications.info
* versioned_notifications.error
Closes-Bug: #1811433
Change-Id: I57e297bb4323a3ab98da020bfcb1630889aac6d7
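A sketch of the exclusion filter, assuming queues are carried as (vhost, queue, size) tuples and exclusions as (vhost, queue) pairs mirroring '-e <vhost> <queue>'; names and shapes are illustrative.

```python
def filter_excluded(queues, excluded):
    """Drop queues listed in excluded before size checking.

    queues:   list of (vhost, queue, message_count) tuples (assumed)
    excluded: set of (vhost, queue) pairs from '-e <vhost> <queue>'
    """
    return [q for q in queues if (q[0], q[1]) not in excluded]

queues = [("/", "event.sample", 10), ("/", "important", 2)]
excluded = {("/", "event.sample")}
assert filter_excluded(queues, excluded) == [("/", "important", 2)]
```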
In change I60141397f39e3b1b0274230db8d984934c98a08d the charmhelpers
library started being used in the rabbitmq queue nrpe check. This is
problematic as the check does not actually run in a charm context and
therefore does not have access to the charm environment, such as the
current config. Additionally, an issue in collating check results had
been introduced.
This change aims to fix these issues. Instead of using the charmhelpers
library, the cronspec is read out from the cron job definition
itself, and the series is probed from /etc/lsb-release.
Change-Id: I952aeda31e997ccadb6cff62e3b0d46349650979
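Probing the series outside a charm context can be sketched as a small /etc/lsb-release parser; the sample content and function name are illustrative.

```python
def parse_lsb_release(text):
    """Parse /etc/lsb-release KEY=value content into a dict."""
    info = {}
    for line in text.splitlines():
        if "=" in line:
            key, _, value = line.partition("=")
            info[key.strip()] = value.strip().strip('"')
    return info

# Illustrative file content, as found on an Ubuntu focal host.
sample = "DISTRIB_ID=Ubuntu\nDISTRIB_CODENAME=focal\n"
assert parse_lsb_release(sample)["DISTRIB_CODENAME"] == "focal"
```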
queue-master-locator is a configuration option supported by
rabbitmq-server since 3.6; it allows control over where the
master queue will be created.
Change-Id: I38cc019b73d062572e19bd532b6bccdaf88638ba
Func-Test-PR: https://github.com/openstack-charmers/zaza-openstack-tests/pull/382
Closes-Bug: #1890759
Signed-off-by: Nicolas Bock <nicolas.bock@canonical.com>
The function `update_nrpe_checks` has been changed to remove redundant
checks and scripts based on the rabbitmq configuration, but the main
logic is unchanged.
The function logic is based on these three functions:
1) copy all the custom NRPE scripts and create cron file
2) add NRPE checks and remove redundant ones
2.a) update the NRPE vhost check for TLS and non-TLS
2.b) update the NRPE queues check
2.c) update the NRPE cluster check
3) remove redundant scripts - this must be done after removing
the relevant check
Closes-Bug: #1779171
Change-Id: Ice83133c2c73532720f33298713267f69e8b4c3a
When checking queues, display not only queue names but also their
size (number of messages). Return sizes as integers.
Also update parsing to account for a rabbitmqctl output change in
focal.
Closes-Bug: #1838964
Change-Id: I2014f065393a1ad4b594363ade6c01ccec4fb71a
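A sketch of parsing names plus integer sizes while tolerating the header lines newer rabbitmqctl releases print; the sample output and function name are illustrative.

```python
def parse_queue_sizes(output):
    """Parse `rabbitmqctl list_queues` output into {queue: int_size}.

    Non-data lines (e.g. informational headers that changed in focal)
    are skipped rather than breaking the parse.
    """
    sizes = {}
    for line in output.splitlines():
        parts = line.split("\t")
        if len(parts) == 2 and parts[1].isdigit():
            sizes[parts[0]] = int(parts[1])
    return sizes

sample = "Listing queues ...\nnotifications.info\t12\nheat-engine-listener\t0\n"
assert parse_queue_sizes(sample) == {"notifications.info": 12,
                                     "heat-engine-listener": 0}
```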
Make the rabbitmq queue check also check if its input data file was
recently updated. This input data is created via cronjob; if that gets
stuck we might not actually be getting meaningful data.
The charm supports configuring the check interval via a full cron time
specification, so technically one could have that updated only once a
year, even if this doesn't make much sense in a monitoring scenario.
Also fix a buglet in the nrpe update hook function: only deploy a
queue check if the cron job hasn't been deconfigured by setting it to
the empty string.
Change-Id: I60141397f39e3b1b0274230db8d984934c98a08d
Closes-Bug: #1898523
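The freshness check can be sketched as an mtime comparison; the function name is an assumption, and a stale file is taken to mean the collection cron job is stuck.

```python
import os
import tempfile
import time

def stats_file_is_fresh(path, max_age_seconds):
    """True if the stats file was modified within max_age_seconds."""
    return (time.time() - os.path.getmtime(path)) <= max_age_seconds

with tempfile.NamedTemporaryFile() as f:
    assert stats_file_is_fresh(f.name, 3600) is True
    old = time.time() - 7200
    os.utime(f.name, (old, old))  # simulate a cron job stuck for 2 hours
    assert stats_file_is_fresh(f.name, 3600) is False
```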