This patchset implements key rotation for OSD units. The monitor
on which this action is called will set the 'pending_key' field
in the relation databag, which specifies the OSD id and new key.
On their side, OSD units check this field and compare it against
the OSD ids that they manage to determine whether they need to
rotate the key.
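The OSD-side check can be sketched roughly as follows (a minimal sketch, assuming 'pending_key' carries a JSON mapping of OSD id to new key; the function name and databag shape are illustrative, not the charm's actual API):

```python
import json


def pending_rotations(databag, local_osd_ids):
    """Return the subset of 'pending_key' entries that apply to the
    OSD ids managed by this unit.

    Illustrative sketch: assumes 'pending_key' is a JSON mapping of
    OSD id -> new key, as set by the monitor side of the relation.
    """
    raw = databag.get('pending_key')
    if not raw:
        return {}
    pending = json.loads(raw)
    return {osd_id: key for osd_id, key in pending.items()
            if osd_id in local_osd_ids}
```

Each OSD unit would then rotate only the keys it is responsible for and ignore the rest.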
Change-Id: Ief5afdea2b8449adbe14c7e838330e2f2be1cfd2
Instead of adding a new field that would also need to be deleted
once it is no longer required, simply reset the already-passed key and
let ceph-fs handle the rest.
Change-Id: I5a9adff9777ab1441ea50eb881a5334a69b087d2
This patchset implements key rotation for MDS daemons, which
essentially involves the ceph-fs charm. It works in a very
similar fashion to RGW units.
Change-Id: I06570d9602137b804af56e358cabf552d6f1e9fd
Following https://tracker.ceph.com/issues/52867 we need to tell Ceph
which address family to use via the ms_bind_ipv4/6 config flags.
I added them to the ceph.conf template and updated the config hook.
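The flags end up in the [global] section of the rendered ceph.conf; a minimal sketch of what the template might produce for an IPv4-only deployment (values are illustrative, the hook derives them from the detected address family):

```ini
[global]
ms_bind_ipv4 = true
ms_bind_ipv6 = false
```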
Closes-Bug: #2056337
Change-Id: Ib735bd4876b6909762288b97857bccaa597c2b80
This patchset implements key rotation for managers only. The user
can specify either the full entity name (i.e. 'mgr.XXXX') or
simply 'mgr', which stands for the local manager.
After the entity's directory is located, a new pending key is
generated, the keyring file is mutated to include the new key and
then replaced in situ. Lastly, the manager service is restarted.
Note that Ceph only has one active manager at any given time,
so it only makes sense to call this action on _every_ mon unit.
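The keyring-mutation step could look roughly like this (an illustrative sketch, not the charm's actual code; `replace_keyring` and the file layout are assumptions based on the standard Ceph keyring format):

```python
import os
import tempfile


def replace_keyring(keyring_path, entity, new_key):
    """Rewrite a keyring file with a fresh key for `entity`,
    replacing the file in situ via an atomic rename. The real
    action also generates the pending key via the mons and
    restarts the manager service afterwards.
    """
    with open(keyring_path) as f:
        lines = f.readlines()

    out = []
    in_section = False
    for line in lines:
        stripped = line.strip()
        if stripped.startswith('['):
            # Track whether we are inside the target entity's section.
            in_section = (stripped == '[%s]' % entity)
        if in_section and stripped.startswith('key'):
            line = '\tkey = %s\n' % new_key
        out.append(line)

    # Write to a temp file in the same directory, then atomically
    # swap it into place so readers never see a half-written keyring.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(keyring_path) or '.')
    with os.fdopen(fd, 'w') as f:
        f.writelines(out)
    os.replace(tmp, keyring_path)
```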
Change-Id: Ie24b3f30922fa5be6641e37635440891614539d5
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1195
A job name passed via the prometheus_scrape library doesn't end up as a
static job name in the prometheus configuration file in the COS world
even though COS expects a fixed string. Practically we cannot have a
static job name like job=ceph in any of the alert rules in COS since the
charms will convert the string "ceph" into:
> juju_MODELNAME_ID_APPNAME_prometheus_scrape_JOBNAME(ceph)-N
Let's give up the possibility of the static job name and use "up{}" so
it will be annotated with the model name/ID, etc. without any specific
job-related condition. This will break the alert rules when one unit has
more than one scraping endpoint because there will be no way to
distinguish multiple scraping jobs. Ceph MON only has one prometheus
endpoint for the time being so this change shouldn't cause an immediate
issue. Overall, it's not ideal but at least better than the current
status, which is an alert error out of the box.
The following alert rule:
> up{} == 0
will be converted and annotated as:
> up{juju_application="ceph-mon",juju_model="ceph",juju_model_uuid="UUID"} == 0
Closes-Bug: #2044062
Change-Id: I0df8bc0238349b5f03179dfb8f4da95da48140c7
If a COS prometheus changed event is processed but bootstrap hasn't
completed yet, we need to retry the event at a later time.
Closes-Bug: #2042891
Change-Id: I3d274c09522f9d7ef56bc66f68d8488150c125d8
Add default prometheus alerting rules for RadosGW multisite deployments based
on the built-in Ceph RGW multisite metrics.
Note that the included prometheus_alerts.yml.default rule file
is included for reference only. The ceph-mon charm will utilize the
resource file from https://charmhub.io/ceph-mon/resources/alert-rules
for deployment so that operators can easily customize these rules.
Change-Id: I5a12162d73686963132a952bddd85ec205964de4
Ceph Reef has a behaviour change where it doesn't always return
version keys for all components. In
I12a1bcd32be2ed8a8e5ee0e304f716f5a190bd57 an attempt was made to fix
this by retrying; however, this code path can also be hit when a
component such as the OSDs is absent. While a cluster without OSDs
wouldn't be functional, it still should not cause the charm to error.
As a fix, just make the OSD component optional when querying for a
version instead of retrying.
Change-Id: I5524896c7ad944f6f22fb1498ab0069397b52418
This duplicates the check performed for ceph status and specialises it for
radosgw-admin sync status instead.
The config options available are:
- nagios_rgw_zones: this is which zones are expected to be connected
- nagios_rgw_additional_checks: this is equivalent to nagios_additional_checks
and allows for a configurable set of strings to grep for as critical alerts.
Change-Id: Ideb35587693feaf1cc0736e981005332e91ca861
Setting the 'mgr/prometheus/rbd_stats_pools' option can fail
if we arrive too early, even if the cluster is bootstrapped. This is
particularly seen in ceph-radosgw test runs. This patchset thus
adds a retry decorator to work around this issue.
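The pattern is roughly the following (a simplified, stdlib-only stand-in for the tenacity-style retry decorator the charm uses; the names, delays, and the `_run` parameter are illustrative):

```python
import functools
import subprocess
import time


def retry_on_error(attempts=5, delay=0.5, backoff=2.0,
                   exc=subprocess.CalledProcessError):
    """Retry the wrapped function with exponential backoff when it
    raises `exc`, re-raising on the final attempt."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, attempts + 1):
                try:
                    return fn(*args, **kwargs)
                except exc:
                    if attempt == attempts:
                        raise
                    time.sleep(wait)
                    wait *= backoff
        return wrapper
    return decorator


@retry_on_error()
def set_rbd_stats_pools(pools, _run=subprocess.check_call):
    # May fail transiently right after bootstrap; retried with backoff.
    _run(['ceph', 'config', 'set', 'mgr',
          'mgr/prometheus/rbd_stats_pools', pools])
```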
Change-Id: Id9b7b903e67154e7d2bb6fecbeef7fac126804a8
The OpenStack libs don't recognize Ceph releases when specifying
the charm source; instead, we have to use an OpenStack release.
Since it was set to the Ceph release quincy, change it to bobcat.
Closes-Bug: #2026651
Change-Id: Ibac09d2bf77eeba69789434eaa6112c2028fbf64
During cluster deployment a situation can arise where there are
already osd relations but osds are not yet fully added to the cluster.
This can make version retrieval fail for osds. Retry version retrieval
to give the cluster a chance to settle.
Also update tests to install OpenStack from latest/edge
Change-Id: I12a1bcd32be2ed8a8e5ee0e304f716f5a190bd57
Instead of returning an empty dict for already processed
broker requests, store the result and return it. This works
around issues in charms like ceph-fs that spin indefinitely
waiting for the response to a request that never arrives.
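The idea can be sketched as follows (hypothetical names; the real charm persists results in its stored state rather than a module-level dict):

```python
# Cache of previously computed responses, keyed by broker request id.
_processed = {}


def handle_broker_request(request, process):
    """Return the stored response for an already-seen broker request
    instead of an empty dict; otherwise process the request and
    remember the result for future duplicates.
    """
    rid = request['request-id']
    if rid in _processed:
        return _processed[rid]
    response = process(request)
    _processed[rid] = response
    return response
```

A client such as ceph-fs that re-sends the same request then gets the original response back instead of an empty dict it can never act on.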
Closes-Bug: #2031414
Change-Id: Ie86f007d76fe75cc07cf7a973eff3f535a11dbe7
Add the 'docs' key and point it at a Discourse topic
previously populated with the charm's README contents.
When the new charm revision is released to the Charmhub,
this Discourse-based content will be displayed there. In
the absence of this new key, the Charmhub's default
behaviour is to display the value of the charm's
'description' key.
Change-Id: I173cadb5a8208283883e1119dbfc5d661809cc5f
Avoid the unintuitive situation where users are deploying from
channel=quincy but get an older ceph due to deploying series=focal by
explicitly setting source=quincy, which is what most users want anyway;
those that do not can still explicitly set source.
Change-Id: I9428e93ba6107ba5e2ebcc667995b3d88eb03d27
This patchset makes some small changes in the upgrade path logic by
providing a fallback method of fetching the current ceph-mon
version and adding additional checks to see if the upgrade can
be done in a sane way.
Closes-Bug: #2024253
Change-Id: I1ca4316aaf4f0b855a12aa582a8188c88e926fa6
The charm can now set osd_memory_target, but by the nature of how the
charm works it's not per device class or type. Always resetting
osd_memory_target when it is not passed over the
relation is risky behaviour, since operators may have set
osd_memory_target by hand with the `ceph config` command outside of the
charm. Let's be less disruptive on charm upgrade.
Closes-Bug: #1934143
Change-Id: I34dd33e54193a9ebdbc9571d153aa6206c85a067
Auth for getting pool details can fail initially if we set up an
rbd-mirror relation at cloud bootstrap. Add some retries to give it
another chance.
Change-Id: I2f5ac561120b1abe52ea0621bb472bc78495fa97
Partial-Bug: #2021967
When Ceph does a version upgrade, the charm checks the previous Ceph
release via the `source` config variable, which is stored in a
persistent file. But updating that persistent file is broken: we use
hookenv.Config from within the ops framework, but hookenv._run_atexit,
which saves the change to the file, is never called.
Partial-Bug: #2007976
Change-Id: Ibf12a2b87736cb1d32788672fb390e027f15b936
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1047
For better stability use LTS series for rabbitmq and mysql when
testing instead of interim releases.
Also remove xena (non-lts) from tests and yoga as a source default
Change-Id: Ie443c55dc4cc1b7f63eacfee79b28f210f1277e4
- update bundles to include UCA pocket tests
- update test configuration
- update metadata to include kinetic and lunar
- update snapcraft to allow run-on for kinetic and lunar
Change-Id: I6b229b502dd4ee9f1d219240b86f7826abf0c25d
The operator framework and charmhelpers use the same path for the
local K/V store, which causes problems when running certain hooks
like 'pre-series-upgrade'. In order to work around this issue, this
patchset makes the charmhelpers lib use a different path, while
migrating the DB file before doing so.
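The migration itself boils down to a one-shot copy of the old DB file to the new charmhelpers-specific path (an illustrative sketch; the actual paths and ordering in the patch may differ):

```python
import os
import shutil


def migrate_kv_store(old_path, new_path):
    """Move the charmhelpers unitdata DB to its own path so it no
    longer collides with the operator framework's K/V store.

    Runs once: if the new file already exists, the migration has
    already happened and the file is left alone.
    """
    if os.path.exists(old_path) and not os.path.exists(new_path):
        shutil.copy2(old_path, new_path)
```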
Closes-Bug: #2005137
Change-Id: Ic2e024371ff431888731753d29fff8538232009a
Commit 40b22e3d in the juju/charm-helpers repo introduced shell quoting of
each argument passed to the check, making the quoting of the double quotes
done here not only unnecessary but also damaging to the final command.
Closes-Bug: #2008784
Change-Id: Ifedd5875d27e72a857b01a48afcd058476734695
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1022
A bug was introduced when changing ceph-client to
an operator framework library that caused the
fallback application_name handling to present
a class name rather than a remote application name.
This change updates the handling to use
`app.name` rather than `app`.
As a drive-by, this also allow-lists the fully-
qualified rename.sh.
Closes-Bug: #1995086
Change-Id: I57b685cb78ba5c4930eb0fa73d7ef09d39d73743
func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/1022