* Update charm-ceph-mon from branch 'master'
to 945c958ff40a01fef5b4857adc912e0bd0eb1879
- Merge "Don't expect a static job name"
- Don't expect a static job name
A job name passed via the prometheus_scrape library doesn't end up as a
static job name in the prometheus configuration file in the COS world
even though COS expects a fixed string. Practically we cannot have a
static job name like job=ceph in any of the alert rules in COS since the
charms will convert the string "ceph" into:
> juju_MODELNAME_ID_APPNAME_prometheus_scrape_JOBNAME(ceph)-N
Let's give up the possibility of the static job name and use "up{}" so
it will be annotated with the model name/ID, etc. without any specific
job-related condition. It will break the alert rules when one unit has
more than one scraping endpoint because there will be no way to
distinguish multiple scraping jobs. Ceph MON only has one prometheus
endpoint for the time being so this change shouldn't cause an immediate
issue. Overall, it's not ideal but at least better than the current
status, which is an alert error out of the box.
The following alert rule:
> up{} == 0
will be converted and annotated as:
> up{juju_application="ceph-mon",juju_model="ceph",juju_model_uuid="UUID"} == 0
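The conversion described above can be illustrated with a minimal sketch. It is not how COS performs the annotation internally; `inject_topology` is a hypothetical helper that naively adds label matchers to every `{...}` selector in an expression.

```python
import re


def inject_topology(expr: str, labels: dict[str, str]) -> str:
    """Add label matchers to every `{...}` selector (naive, illustrative only)."""
    rendered = ",".join(f'{key}="{value}"' for key, value in labels.items())

    def _merge(match: re.Match) -> str:
        # Keep any matchers that were already present, then append ours.
        existing = match.group(1).strip()
        merged = f"{existing},{rendered}" if existing else rendered
        return "{" + merged + "}"

    return re.sub(r"\{([^}]*)\}", _merge, expr)


# Label values taken from the example in the commit message.
topology = {
    "juju_application": "ceph-mon",
    "juju_model": "ceph",
    "juju_model_uuid": "UUID",
}
print(inject_topology("up{} == 0", topology))
# prints: up{juju_application="ceph-mon",juju_model="ceph",juju_model_uuid="UUID"} == 0
```

Because the rule carries no job matcher, the injected topology labels are the only thing that scopes the alert.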
Closes-Bug: #2044062
Change-Id: I0df8bc0238349b5f03179dfb8f4da95da48140c7
* Update charm-ceph-mon from branch 'master'
to 762ad83c19cb5b699a1bdbe8c28e2d8dbef10e2b
- Fix: defer cos-prometheus for bootstrap
If a COS prometheus changed event is processed but bootstrap hasn't
completed yet, we need to retry the event at a later time.
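A rough sketch of the defer pattern follows. Operator framework events expose a `defer()` method; everything else here (`FakeEvent`, `on_prometheus_changed`, the `is_bootstrapped` and `configure` callables) is a hypothetical stand-in and not the charm's actual code.

```python
class FakeEvent:
    """Stand-in for a framework event; only defer() is modeled."""

    def __init__(self):
        self.deferred = False

    def defer(self):
        # The real framework would re-emit the event on a later hook run.
        self.deferred = True


def on_prometheus_changed(event, is_bootstrapped, configure):
    """Configure prometheus only once the cluster has bootstrapped."""
    if not is_bootstrapped():
        event.defer()  # try again later instead of failing now
        return False
    configure()
    return True


event = FakeEvent()
handled = on_prometheus_changed(event, lambda: False, lambda: None)
print(handled, event.deferred)
```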
Closes-bug: #2042891
Change-Id: I3d274c09522f9d7ef56bc66f68d8488150c125d8
* Update charm-ceph-mon from branch 'master'
to 6ae78a6e6cf97c25e28a7899f540dce4970a8aaa
- Merge "Add alerting rules for RGW multisite deployments"
- Add alerting rules for RGW multisite deployments
Add default prometheus alerting rules for RadosGW multisite deployments based
on the built-in Ceph RGW multisite metrics.
Note that the included prometheus_alerts.yml.default rule file
is included for reference only. The ceph-mon charm will utilize the
resource file from https://charmhub.io/ceph-mon/resources/alert-rules
for deployment so that operators can easily customize these rules.
Change-Id: I5a12162d73686963132a952bddd85ec205964de4
* Update charm-ceph-mon from branch 'master'
to 1c9f3b210d8bf8904143647443133cf35f48d8b7
- Don't error out on missing OSDs
Ceph reef has a behaviour change where it doesn't always return
version keys for all components. In
I12a1bcd32be2ed8a8e5ee0e304f716f5a190bd57 an attempt was made to fix
this by retrying, however this code path can also be hit when a
component such as OSDs is absent. While a cluster without OSDs
wouldn't be functional it still should not cause the charm to error.
As a fix, just make the OSD component optional when querying for a
version instead of retrying.
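One way to picture the optional-component approach is sketched below. It assumes the version report is JSON with one object per daemon type mapping version strings to counts; the function name and the choice of which components are required are assumptions for illustration.

```python
import json


def component_versions(raw_json, required=("mon",), optional=("osd",)):
    """Collect version strings per component; optional ones may be missing."""
    report = json.loads(raw_json)
    versions = {}
    for name in required:
        if not report.get(name):
            raise ValueError(f"no version reported for {name}")
        versions[name] = sorted(report[name])
    for name in optional:
        # An absent or empty entry is tolerated rather than treated as an error.
        if report.get(name):
            versions[name] = sorted(report[name])
    return versions


# Made-up sample: a cluster with monitors but no OSDs yet.
sample = '{"mon": {"ceph version 18.2.0": 3}, "osd": {}}'
print(component_versions(sample))
```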
Change-Id: I5524896c7ad944f6f22fb1498ab0069397b52418
* Update charm-ceph-mon from branch 'master'
to 7223f2634f54fc7a61679c20de56731cb0d6c8bb
- Merge "Retry setting rbd_stats_pools prometheus config"
- Retry setting rbd_stats_pools prometheus config
Setting the 'mgr/prometheus/rbd_stats_pools' option can fail
if we arrive too early, even if the cluster is bootstrapped. This is
particularly seen in ceph-radosgw test runs. This patchset thus
adds a retry decorator to work around this issue.
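For illustration, a generic retry decorator could look like the sketch below. This is not necessarily the decorator used by the patch; `retry` and `set_rbd_stats_pools` are hypothetical, and the simulated failure stands in for the option being set too early.

```python
import functools
import time


def retry(attempts=3, delay=0.0, exceptions=(Exception,)):
    """Retry the wrapped call up to `attempts` times before giving up."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise  # out of attempts, surface the error
                    time.sleep(delay)

        return wrapper

    return decorator


calls = []


@retry(attempts=3, delay=0.0, exceptions=(RuntimeError,))
def set_rbd_stats_pools(pools):
    # Simulate the option being rejected on the first two attempts.
    calls.append(pools)
    if len(calls) < 3:
        raise RuntimeError("mgr not ready")
    return f"mgr/prometheus/rbd_stats_pools={pools}"
```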
Change-Id: Id9b7b903e67154e7d2bb6fecbeef7fac126804a8
* Update charm-ceph-mon from branch 'master'
to 0a03288a726ee7a6bfb6c544cd074bfcb9f469a2
- Merge "Add nagios check for radosgw-admin sync status"
- Add nagios check for radosgw-admin sync status
This duplicates the check performed for ceph status and specialises it for
radosgw-admin sync status instead.
The config options available are:
- nagios_rgw_zones: this is which zones are expected to be connected
- nagios_rgw_additional_checks: this is equivalent to nagios_additional_checks
and allows for a configurable set of strings to grep for as critical alerts.
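The two options can be pictured with the sketch below. It is not the actual check script: `check_rgw_sync` is hypothetical, the status text is made up, and only the Nagios exit codes (0 for OK, 2 for CRITICAL) are standard.

```python
def check_rgw_sync(status_text, expected_zones, additional_checks=()):
    """Return a Nagios-style (exit_code, message) pair for a sync status text."""
    # Every expected zone must be mentioned in the status output.
    problems = [
        f"zone {zone} not found"
        for zone in expected_zones
        if zone not in status_text
    ]
    # Any configured string that appears in the output is treated as critical.
    problems += [
        f"matched '{pattern}'"
        for pattern in additional_checks
        if pattern in status_text
    ]
    if problems:
        return 2, "CRITICAL: " + "; ".join(problems)
    return 0, "OK"


# Made-up status text, for illustration only.
status = "zone-a: data is caught up\nzone-b: behind shards"
print(check_rgw_sync(status, ["zone-a", "zone-b"], ["behind shards"]))
```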
Change-Id: Ideb35587693feaf1cc0736e981005332e91ca861
* Update charm-ceph-mon from branch 'master'
to 03868b2c9f9070499d079a9e64b45ebd3c4fb189
- Revert default source to 'bobcat'
The OpenStack libs don't recognize Ceph releases when specifying
the charm source. Instead, we have to use an OpenStack release.
Since it was set to quincy, reset it to bobcat.
Closes-Bug: #2026651
Change-Id: Ibac09d2bf77eeba69789434eaa6112c2028fbf64
* Update charm-ceph-mon from branch 'master'
to 324679f061d2129bfef73b2ed001251f7ffc2432
- Tox: add Python 3.11 section to tox.ini
Also improve mocking unit tests
Change-Id: Ie4356c23e97cec48f5731323bc90d63335ecc753
* Update charm-ceph-mon from branch 'master'
to bc7a0fb6c3f994b91556d55ebbeb220ceeb315fc
- Fix: increase timeout for get versions
Change-Id: Iee13e9a88f047f5835aee8e5a308ce2035d28891
* Update charm-ceph-mon from branch 'master'
to 55beb2504d3ea6d7f522d8d9a46bef7d741f1edc
- Fix version retrieval
During cluster deployment a situation can arise where there are
already osd relations but osds are not yet fully added to the cluster.
This can make version retrieval fail for osds. Retry version retrieval
to give the cluster a chance to settle.
Also update tests to install OpenStack from latest/edge
Change-Id: I12a1bcd32be2ed8a8e5ee0e304f716f5a190bd57
* Update charm-ceph-mon from branch 'master'
to 3567a0589cf196ec9b417eb2ac71da6ceac3bad3
- Prune CI test jobs and test bundles
Change-Id: I1be06ec2901ac414388f4875c95631e4ed50145e
* Update charm-ceph-mon from branch 'master'
to 84cdcf3cd50ad5149ef1d2a5248bb6b0806bb20e
- Return previous result of processed broker requests
Instead of returning an empty dict for already processed
broker requests, store the result and return it. This works
around issues in charms like ceph-fs that spin indefinitely
waiting for the response to a request that never arrives.
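The idea of storing and replaying results can be sketched as follows. `BrokerRequestCache` and its in-memory dict are hypothetical; the charm's real storage and request format are not shown here.

```python
class BrokerRequestCache:
    """Remember the result of each processed request, keyed by request id."""

    def __init__(self):
        self._results = {}

    def process(self, request_id, handler):
        if request_id in self._results:
            # Already processed: return the stored result, not an empty dict.
            return self._results[request_id]
        result = handler()
        self._results[request_id] = result
        return result


cache = BrokerRequestCache()
first = cache.process("req-1", lambda: {"exit-code": 0})
again = cache.process("req-1", lambda: {"exit-code": 1})
print(first, again)
```

A client that asks twice for the same request therefore sees the same response both times instead of waiting on an answer that never comes.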
Closes-Bug: #2031414
Change-Id: Ie86f007d76fe75cc07cf7a973eff3f535a11dbe7
* Update charm-ceph-mon from branch 'master'
to 3eb5898a655cf9b6221bd3ec74d0a98947b08092
- Add docs key and point at Discourse
Add the 'docs' key and point it at a Discourse topic
previously populated with the charm's README contents.
When the new charm revision is released to the Charmhub,
this Discourse-based content will be displayed there. In
the absence of this new key, the Charmhub's default
behaviour is to display the value of the charm's
'description' key.
Change-Id: I173cadb5a8208283883e1119dbfc5d661809cc5f
* Update charm-ceph-mon from branch 'master'
to ab84214805e130848db3c45884223619904b3734
- Set consistent source
Avoid the unintuitive situation where users deploy from
channel=quincy but get an older Ceph because they deploy with
series=focal. The fix is to explicitly set source=quincy, which is what
most users want anyway; those who do not can still set source explicitly.
Change-Id: I9428e93ba6107ba5e2ebcc667995b3d88eb03d27
* Update charm-ceph-mon from branch 'master'
to 1a41aa24ce82c411da936ef3f21c61a57c059155
- Fix ceph-mon upgrade path
This PR makes some small changes in the upgrade path logic by
providing a fallback method of fetching the current ceph-mon
version and adding additional checks to see if the upgrade can
be done in a sane way.
Closes-Bug: #2024253
Change-Id: I1ca4316aaf4f0b855a12aa582a8188c88e926fa6
* Update charm-ceph-mon from branch 'master'
to 88d37461dc6ce296af1a71ef5811844a4aee7073
- Ensure broker requests are re-processed on upgrade-charm
When broker-request caching was added, it broke functionality
that ensured that clients were updated on charm-upgrade. This
change enables a bypass of that cache functionality and uses
it to re-process broker requests in the upgrade-charm hook.
Depends-On: https://review.opendev.org/c/openstack/charms.ceph/+/848311
Func-Test-Pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1066
Closes-Bug: #1968369
Change-Id: Ibdad1fd5976fdf2d5f3384f1b120b0d5dda34947
* Update charm-ceph-mon from branch 'master'
to b31de1f027b32fa2ebe8ab62756819de466bbbf2
- Don't clear osd_memory_target unconditionally
The charm can now set osd_memory_target, but it's not per device class
or type because of how the charm works. Always resetting
osd_memory_target when it is not passed over the relation is risky,
since operators may have set osd_memory_target by hand with the
`ceph config` command outside of the charm. Let's be less disruptive
on charm upgrade.
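A minimal sketch of the less disruptive behaviour: only build a `ceph config set` command when the relation actually provides a value, and otherwise leave the existing setting untouched. The helper name is hypothetical and the command is returned rather than executed.

```python
def osd_memory_target_command(relation_value):
    """Return the argv to apply a target, or None to leave the setting alone."""
    if not relation_value:
        # Nothing was passed over the relation: do not reset what an
        # operator may have configured by hand.
        return None
    return [
        "ceph", "config", "set", "osd",
        "osd_memory_target", str(relation_value),
    ]


print(osd_memory_target_command(None))
print(osd_memory_target_command("4G"))
```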
Closes-Bug: #1934143
Change-Id: I34dd33e54193a9ebdbc9571d153aa6206c85a067
* Update charm-ceph-mon from branch 'master'
to 58469bc4596d50a0a97351b8fb8d0a7debc5a526
- Merge "Configure ceph with osd-memory-target from ceph-osd charm"
- Configure ceph with osd-memory-target from ceph-osd charm
Change-Id: Id3f21f8ab68fb88529b6cbd78217e27772c2739c
* Update charm-ceph-mon from branch 'master'
to af2323c4578ff7f4b352acfe350279eddd1ce0aa
- rbd mirror relation: be persistent in getting pool info
Auth for getting pool details can fail initially if we set up an rbd
mirror relation at cloud bootstrap. Add some retries to give it another
chance.
Change-Id: I2f5ac561120b1abe52ea0621bb472bc78495fa97
Partial-Bug: #2021967
* Update charm-ceph-mon from branch 'master'
to e99c38ae4cc6c89aa7818d38990cc9b2dfb5060d
- Fix persistent config file not update bug
When Ceph does a version upgrade, it checks the previous Ceph version
from the `source` config variable, which is stored in a persistent file.
But the persistent file update is broken. This is because we use
hookenv.Config under the ops framework, where hookenv._run_atexit, which
saves the changes to the file, is never called.
Partial-Bug: #2007976
Change-Id: Ibf12a2b87736cb1d32788672fb390e027f15b936
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1047
* Update charm-ceph-mon from branch 'master'
to 357421f39103e31e82651d0905caf6e84fe5c8ba
- Testing: use mysql and rabbitmq from LTS
For better stability use LTS series for rabbitmq and mysql when
testing instead of interim releases.
Also remove xena (non-lts) from tests and yoga as a source default
Change-Id: Ie443c55dc4cc1b7f63eacfee79b28f210f1277e4
* Update charm-ceph-mon from branch 'master'
to f172b8cd1ebd7fe388464ef47b35eff69c1f1fb6
- Fix: testing bundles for jammy and lunar were off
Change-Id: I314fef8551e896ab35678bc78f0233cb42030413
* Update charm-ceph-mon from branch 'master'
to f23d9e3d3eda7dda1775a3b0a600856bec42a1c4
- Remove relation test
The CephRelationTest class wasn't of much use and the test was
rather flaky, since it compared public IP addresses.
Change-Id: Iba5aad1d895ba8b28ce364899a1e41275dc3003b
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1034
* Update charm-ceph-mon from branch 'master'
to 1a81c4416c2bef7b91aff31b019c6730ed89a7dd
- Add support for interim Ubuntu releases
- update bundles to include UCA pocket tests
- update test configuration
- update metadata to include kinetic and lunar
- update snapcraft to allow run-on for kinetic and lunar
Change-Id: I6b229b502dd4ee9f1d219240b86f7826abf0c25d
* Update charm-ceph-mon from branch 'master'
to 7ea8cf6a3588d8528d4caf22f7d7d185a1c45310
- Merge "Use a different name for the local key/value store"
- Use a different name for the local key/value store
The operator framework and charmhelpers use the same path for the
local K/V store, which causes problems when running certain hooks
like 'pre-series-upgrade'. In order to work around this issue, this
patchset makes the charmhelpers lib use a different path, while
migrating the DB file before doing so.
Closes-Bug: #2005137
Change-Id: Ic2e024371ff431888731753d29fff8538232009a
* Update charm-ceph-mon from branch 'master'
to b9f78052035ab7e8a58b6b77e8d6c4fbe8abef0c
- Fix Nagios additional checks functionality
Commit 40b22e3d on juju/charm-helpers repo introduced shell quoting of
each argument passed to the check, making the extra double quotes
added here not only unnecessary but also damaging to the final command.
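The effect of quoting twice can be reproduced with Python's `shlex` module. This is only a sketch of the failure mode: the `--additional_check` flag and the `render` helper are hypothetical, and charm-helpers' own quoting code is not shown.

```python
import shlex


def render(args):
    """Join arguments into a shell command line, quoting each one."""
    return shlex.join(args)


check = "osd nearfull"
pre_quoted = f'"{check}"'  # the old, manual double quoting

# Round-trip through the shell-style parser to see what the check receives.
print(shlex.split(render(["--additional_check", pre_quoted])))
print(shlex.split(render(["--additional_check", check])))
```

With the manual quotes in place, the literal `"` characters survive into the argument, so the check would search for a string that includes them.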
Closes-Bug: #2008784
Change-Id: Ifedd5875d27e72a857b01a48afcd058476734695
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1022
* Update charm-ceph-mon from branch 'master'
to 3a774be96132e7802d1278cd6812d2fb982c740e
- Fix issue with ceph-client relation handling
A bug was introduced when changing ceph-client to
an operator framework library that caused the
fallback application_name handling to present
a class name rather than a remote application name.
This change updates the handling to get at an
`app.name` rather than an `app`.
As a drive-by, this also allow-lists the fully-
qualified rename.sh.
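The `app` versus `app.name` difference can be shown with a small stand-in. `RemoteApp` and `fallback_application_name` are hypothetical and only model the `name` attribute.

```python
class RemoteApp:
    """Stand-in for a framework application object; only `name` is modeled."""

    def __init__(self, name):
        self.name = name


def fallback_application_name(app):
    # Use the name attribute; formatting the object itself would yield its
    # representation (which mentions the class), not the application name.
    return app.name


app = RemoteApp("ceph-fs")
print(str(app))
print(fallback_application_name(app))
```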
Closes-Bug: #1995086
Change-Id: I57b685cb78ba5c4930eb0fa73d7ef09d39d73743
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1022
* Update charm-ceph-mon from branch 'master'
to c9389a8cd0165aa8718e6f8aac61a21af99c9144
- Revert "Create NRPE check to verify ceph daemons versions"
This reverts commit dfbda68e1add1e8a31ef0e14c043b584532fcd03.
Reason for revert:
The Ceph version check does not seem to account for the user that
executes the NRPE check. It fails to get keyrings to execute the
command because it is run by a non-root user.
$ juju run-action --wait nrpe/0 run-nrpe-check name=check-ceph-daemons-versions
unit-nrpe-0:
UnitId: nrpe/0
id: "20"
results:
Stderr: |
2023-02-01T03:03:09.556+0000 7f4677361700 -1 auth: unable to find
a keyring on
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory
2023-02-01T03:03:09.556+0000 7f4677361700 -1
AuthRegistry(0x7f467005f540) no keyring found at
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx
2023-02-01T03:03:09.556+0000 7f4677361700 -1 auth: unable to find
a keyring on
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory
2023-02-01T03:03:09.556+0000 7f4677361700 -1
AuthRegistry(0x7f4670064d88) no keyring found at
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx
2023-02-01T03:03:09.560+0000 7f4677361700 -1 auth: unable to find
a keyring on
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin:
(2) No such file or directory
2023-02-01T03:03:09.560+0000 7f4677361700 -1
AuthRegistry(0x7f4677360000) no keyring found at
/etc/ceph/ceph.client.admin.keyring,/etc/ceph/ceph.keyring,/etc/ceph/keyring,/etc/ceph/keyring.bin,
disabling cephx
[errno 2] RADOS object not found (error connecting to the cluster)
check-output: 'UNKNOWN: could not determine OSDs versions, error: Command ''[''ceph'',
''versions'']'' returned non-zero exit status 1.'
status: completed
timing:
completed: 2023-02-01 03:03:10 +0000 UTC
enqueued: 2023-02-01 03:03:09 +0000 UTC
started: 2023-02-01 03:03:09 +0000 UTC
Related-Bug: #1943628
Change-Id: I84b306e84661e6664e8a69fa93dfdb02fa4f1e7e
* Update charm-ceph-mon from branch 'master'
to 87600a9c313ed2ca18f55911e082ad1f11afe3db
- Merge "Create a key for ceph-osd for crash module auth"
- Create a key for ceph-osd for crash module auth
This will be set on the osd relation,
so the ceph-osd charm can use this key for auth
by the crash reporting module.
ref. https://docs.ceph.com/en/latest/mgr/crash/
See https://review.opendev.org/c/openstack/charm-ceph-osd/+/869139
for how this key is used by ceph-osd.
Closes-Bug: #2000630
Change-Id: Ic95aae6b5981a6df1e0b3c310bcef8018c494a24
* Update charm-ceph-mon from branch 'master'
to 58fc48ebbe8945ef28806da7b8a6cac7be9a3a09
- Ensure crushtool --test called correctly
Later Ceph releases require that the --test function of crushtool
is called with replica information for validation.
Pass in "--num-rep 3" as a basic check plus "--show-statistics"
to silence a non-fatal warning message.
This can be cleanly cherry-picked back at least as far as
Ceph 12.2.x.
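As a sketch, the resulting invocation could be assembled like this. The helper name and the crush map path are hypothetical; the command is only built, not run.

```python
def crushtool_test_command(crushmap_path, num_rep=3):
    """Build the argv for validating a compiled crush map with crushtool."""
    return [
        "crushtool",
        "-i", crushmap_path,       # compiled crush map to test
        "--test",
        "--num-rep", str(num_rep),  # replica count required for validation
        "--show-statistics",        # silences the non-fatal warning
    ]


print(" ".join(crushtool_test_command("/tmp/crushmap.bin")))
```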
Change-Id: I76d21ddd9da79535f68490b4231ae13705e27edb
Closes-Bug: 2003690
* Update charm-ceph-mon from branch 'master'
to df676a097f780679f73c441d7fb67cdacde634c6
- Make sure lockfile-progs package is installed
Also, drop python-dbus for simplicity since "check_upstart_job" in nrpe
is not enabled any longer. And the python-dbus package is no longer
available on jammy either.
[on focal with systemd]
$ ls -1 /etc/nagios/nrpe.d/
check_ceph.cfg
check_conntrack.cfg
check_reboot.cfg
check_systemd_scopes.cfg
Closes-Bug: #1998163
Change-Id: I30bc22ae8509367207004b90eb2c38ad0fae9ffe
* Update charm-ceph-mon from branch 'master'
to af9143503ca4b58945d9728377a3cce339fcabf2
- Unpin tox version
This unpinning is meant to solve the issues with tox 4.x breaking
all the virtualenv dependencies.
Change-Id: Ifc3381b2f2e4e41ebf6676080bf1831baffb0d42
* Update charm-ceph-mon from branch 'master'
to d76dda4f5586b752451d5a9256097a620f1aade0
- Work around config initialisation behaviour change
The previous (classic) version of the charm initialised a Config
object in the install hook and let it go out of scope. Initialise
a config object explicitly in the install and upgrade charm hooks.
Change-Id: Ic389c840cc4253adaddcaa50d184db6ca66cb397
* Update charm-ceph-mon from branch 'master'
to 9debe750649b65459219e1cfc63b60cac931c78b
- Rewrite the get-erasure-profile action with the ops framework
Change-Id: I07cb5838c446ba08469e1d0f22d75d74c40ef29c
* Update charm-ceph-mon from branch 'master'
to 4ac36718f378d43b35603efcb437756c266ebebd
- Merge "Rewrite get_health action with the Operator framework"
- Rewrite get_health action with the Operator framework
Change-Id: I68645a3d00c0622c7701c8177bcd510c3092afe4
* Update charm-ceph-mon from branch 'master'
to 4d9d7a5b90912aefeaa7e946585706161a0bbf3a
- Merge "Rewrite the create-crush-rule action with the ops framework"
- Rewrite the create-crush-rule action with the ops framework
Change-Id: Ifaccd20ba4a0f148a38d14edf0c26bd4a4d5d655
* Update charm-ceph-mon from branch 'master'
to 71e1225f0d31a79b57967f20ba5152fd8f639f89
- Add ceph-fs to test bundles to ensure relation works
Change-Id: Ifc5e382e44f3dfcddfda3c526e07e9bb5892fbc3
* Update charm-ceph-mon from branch 'master'
to 11b7a7340b2f4aeef3f81b09f126cbbe669419f3
- Merge "Rewrite update status machinery with the ops framework"
- Rewrite update status machinery with the ops framework
Add a new module ceph_status for checking ceph-mon status.
Provide the ceph_shared helpers for querying current status of
ceph-mon units. Also add some initial testing for the charm module.
Change-Id: I5079023ca692f0a2b7bfda96bb1834b8e9b1f0cc
* Update charm-ceph-mon from branch 'master'
to 23fe7b11fd0cde9604649aeffbf57042d72236bf
- Merge "Add nagios check for expected number of OSDs"
- Add nagios check for expected number of OSDs
This check does not require manually setting the number of expected
OSDs.
Initially, the charm sets the count (per-host) to match what's
present in the OSD tree. The count will be updated (on a per-host
basis) when the number of OSDs grows, but not when it shrinks. There
is a charm action to reset the expected count using information from
the OSD tree.
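The grow-but-never-shrink rule can be sketched as below. Both function names are hypothetical, and plain dicts of host to OSD count stand in for whatever the charm actually stores.

```python
def update_expected_osds(expected, observed):
    """Raise per-host expected counts when more OSDs appear; never lower them."""
    updated = dict(expected)
    for host, count in observed.items():
        updated[host] = max(updated.get(host, 0), count)
    return updated


def reset_expected_osds(observed):
    """What a reset action would do: take the OSD tree counts as-is."""
    return dict(observed)


expected = update_expected_osds({}, {"host-a": 3, "host-b": 3})
expected = update_expected_osds(expected, {"host-a": 4, "host-b": 2})
print(expected)
print(reset_expected_osds({"host-a": 4, "host-b": 2}))
```

A host that loses an OSD keeps its higher expected count, so the check can flag the shortfall until an operator resets it.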
Closes-Bug: #1952985
Change-Id: Ia6a060bf151908c1d4159e6bdffa7bfe1f0a7988
* Update charm-ceph-mon from branch 'master'
to 15a9dca1b75f97c1a7e213db27b8b941f74413e7
- Merge "Make check_ceph_status.py a bit more "noisy" by default."
- Make check_ceph_status.py a bit more "noisy" by default.
Closes-Bug: #1989154
Change-Id: Ie0d73f14698e4f3ba4e7231920a622f587b4330f