Commit Graph

201 Commits

Stephen Taylor 2fd438b4b1 Update Ceph images to patched 18.2.2 and restore debian-reef repo
This change updates the Ceph images to 18.2.2 images patched with a
fix for https://tracker.ceph.com/issues/63684. It also reverts the
package repository in the deployment scripts to use the debian-reef
directory on download.ceph.com instead of debian-18.2.1. The issue
with the repo that prompted the previous change to debian-18.2.1
has been resolved and the more generic debian-reef directory may
now be used again.

Change-Id: I85be0cfa73f752019fc3689887dbfd36cec3f6b2
2024-03-12 13:45:42 -06:00
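
For reference, pointing the deployment scripts back at the generic directory amounts to an apt source along the lines of the sketch below; the file path and distro codename are illustrative assumptions, not the scripts' actual contents.

    # Sketch: use the generic debian-reef directory on download.ceph.com
    echo "deb https://download.ceph.com/debian-reef/ $(lsb_release -sc) main" \
      | sudo tee /etc/apt/sources.list.d/ceph.list
    sudo apt-get update
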
Stephen Taylor 1e05f3151d [ceph-osd] Allow lvcreate to wipe existing LV metadata
In some cases when OSD metadata disks are reused and redeployed,
lvcreate can fail to create a DB or WAL volume because it overlaps
an old, deleted volume on the same disk whose signature still
exists at the offsets that trigger detection. When that signature
is detected, lvcreate asks the user whether or not to wipe the old
signature and aborts the LV creation process. Adding a --yes
argument to the lvcreate command automatically answers yes to the
wipe question and allows lvcreate to wipe the old signature.

Change-Id: I0d69bd920c8e62915853ecc3b22825fa98f7edf3
2024-03-04 21:19:50 +00:00
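
A minimal sketch of the call in question (the size and the volume group/LV names are hypothetical, not the chart's actual values):

    # --yes answers the wipe prompt automatically when lvcreate detects an
    # old signature from a deleted LV at the same offsets
    lvcreate --yes -L 10G -n ceph-db-example ceph-db-vg-example
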
Stephen Taylor f641f34b00 [ceph] Update Ceph images to Jammy and Reef 18.2.1
This change updates all Ceph images in openstack-helm-infra to
ubuntu_jammy_18.2.1-1-20240130.

Change-Id: I16d9897bc5f8ca410059a5f53cc637eb8033ba47
2024-01-30 07:58:03 -07:00
Stephen Taylor 5e5a52cc04 Update Rook to 1.12.5 and Ceph to 18.2.0
This change updates Rook to the 1.12.5 release and Ceph to the
18.2.0 (Reef) release.

Change-Id: I546780ce33b6965aa699f1578d1db9790dc4e002
2023-10-13 12:58:56 -06:00
Stephen Taylor d29efccdbb [ceph-osd] Add disk zap to OSD init forced repair case
There exists a case for bluestore OSDs where the OSD init process
detects that an OSD has already been initialized in the deployed
Ceph cluster, but the cluster osdmap does not have an entry for it.
This change corrects this case to zap and reinitialize the disk
when OSD_FORCE_REPAIR is set to 1. It also clarifies a log message
in this case when OSD_FORCE_REPAIR is 0 to state that a manual
repair is necessary.

Change-Id: I2f00fa655bf5359dcc80c36d6c2ce33e3ce33166
2023-08-31 08:05:50 -06:00
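
A hedged sketch of the decision this commit describes; the real init script differs, and OSD_ID/OSD_DEVICE discovery is assumed to happen earlier:

    # The disk claims to belong to an OSD that the cluster's osdmap no
    # longer knows about
    if ! ceph osd ls | grep -qx "${OSD_ID}"; then
      if [ "${OSD_FORCE_REPAIR}" = "1" ]; then
        ceph-volume lvm zap --destroy "${OSD_DEVICE}"   # wipe and reinitialize
      else
        echo "OSD ${OSD_ID} is not in the osdmap; manual repair is required"
      fi
    fi
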
Stephen Taylor 443ff3e3e3 [ceph] Use Helm toolkit functions for Ceph probes
This change converts the readiness and liveness probes in the Ceph
charts to use the functions from the Helm toolkit rather than
having hard-coded probe definitions. This allows probe configs to
be overridden in values.yaml without rebuilding charts.

Change-Id: I68a01b518f12d33fe4f87f86494a5f4e19be982e
2023-08-22 19:16:37 +00:00
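
With probes rendered through the Helm toolkit snippet, probe timings can be overridden at deploy time without rebuilding the chart. The values path below is an assumption for illustration; check the chart's values.yaml for the exact layout:

    # Hypothetical override of the OSD liveness probe parameters
    helm upgrade ceph-osd ./ceph-osd --reuse-values \
      --set pod.probes.osd.osd.liveness.params.initialDelaySeconds=120 \
      --set pod.probes.osd.osd.liveness.params.periodSeconds=90
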
Stephen Taylor 8d6cc364b7 [ceph-osd] Extend the ceph-osd post-apply job PG wait
In some cases, especially for disruptive OSD restarts on upgrade,
PGs can take longer than the allowed ~30 seconds to get into a
peering state. In these cases, the post-apply job fails prematurely
instead of allowing time for the OSDs and PGs to recover. This
change extends that timeout to ~10 minutes instead to allow the PGs
plenty of recovery time.

The only negative effect of this change is that a legitimate
failure where the PGs can't recover will take 10 minutes to fail
the post-apply job instead of 30 seconds.

Change-Id: I9c22bb692385dbb7bc2816233c83c7472e071dd4
2023-07-07 08:42:30 -06:00
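
The general shape of such a wait, as a sketch (the actual post-apply script's states and variable names may differ):

    # Allow up to ~10 minutes for PGs to leave transient states
    timeout=600
    start=$(date +%s)
    while ceph pg ls 2>/dev/null | grep -qE 'peering|activating|unknown'; do
      if [ $(( $(date +%s) - start )) -ge "${timeout}" ]; then
        echo "PGs failed to settle within ${timeout}s" >&2
        exit 1
      fi
      sleep 5
    done
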
Stephen Taylor 45b492bcf7 [ceph] Update Ceph to 17.2.6
This change updates the openstack-helm-infra charts to use 17.2.6
Quincy images based on Focal.

See https://review.opendev.org/c/openstack/openstack-helm-images/+/881217

Change-Id: Ibb89435ae22f6d634846755e8121facd13d5d331
2023-05-09 12:25:07 +00:00
Stephen Taylor fc92933346 [ceph] Update all Ceph images to Focal
This change updates all Ceph image references to use Focal images
for all charts in openstack-helm-infra.

Change-Id: I759d3bdcf1ff332413e14e367d702c3b4ec0de44
2023-03-16 16:39:37 -06:00
Brian Haley f31cfb2ef9 support image registries with authentication
Based on spec in openstack-helm repo,
support-OCI-image-registry-with-authentication-turned-on.rst

Each Helm chart can configure an OCI image registry and
credentials to use. A Kubernetes Secret is then created with this
information. ServiceAccounts then specify an imagePullSecret
referencing the Secret with credentials for the registry, so any
pod using one of these ServiceAccounts may pull images from the
authenticated container registry.

Change-Id: Iebda4c7a861aa13db921328776b20c14ba346269
2022-07-20 14:28:47 -05:00
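
The mechanics map onto standard Kubernetes objects; a sketch of the equivalent manual steps (the namespace, secret name, registry, and ServiceAccount are illustrative):

    # Create a docker-registry Secret holding the registry credentials
    kubectl -n openstack create secret docker-registry my-registry-creds \
      --docker-server=registry.example.com \
      --docker-username=deployer --docker-password='s3cr3t'

    # Reference it from the ServiceAccount used by the chart's pods
    kubectl -n openstack patch serviceaccount ceph-osd \
      -p '{"imagePullSecrets": [{"name": "my-registry-creds"}]}'
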
Stephen Taylor 9a37183b26 [ceph-osd] Remove ceph-mon dependency in ceph-osd liveness probe
It is possible for misbehaving ceph-mon pods to cause the ceph-osd
liveness probe to fail for healthy ceph-osd pods, which can cause
healthy pods to get restarted unnecessarily. This change removes
the ceph-mon query from the ceph-osd liveness probe so the probe
is only dependent on ceph-osd state.

Change-Id: I9e1846cfdc5783dbb261583e04ea19df81d143f4
2022-05-06 10:15:45 -06:00
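
A probe that depends only on local ceph-osd state can query the daemon's admin socket instead of the monitors; a sketch (the socket path, OSD id, and accepted state are assumptions, and jq is assumed to be available):

    sock=/run/ceph/ceph-osd.0.asok
    state=$(ceph --admin-daemon "${sock}" status 2>/dev/null | jq -r .state)
    # Treat an 'active' OSD as healthy; no ceph-mon query is involved
    [ "${state}" = "active" ] || exit 1
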
Stephen Taylor e02dc3da44 [ceph-osd] Remove udev interactions from osd-init
There are bugs with containerizing certain udev operations in some
udev versions. The osd-init container can hang in these
circumstances, so the osd-init scripts are modified not to use
these problematic operations.

Change-Id: I6b39321b849f5fbf1b6f2097c6c57ffaebe68121
2022-04-29 14:44:32 -06:00
Stephen Taylor 76fb2562c6 [ceph-osd] Allow for unconditional OSD restart
This change allows OSDs to be restarted unconditionally by the
ceph-osd chart. This can be useful in upgrade scenarios where
ceph-osd pods are unhealthy during the upgrade.

Change-Id: I6de98db2b4eb1d76411e1dbffa65c263de3aecee
2022-04-05 10:40:28 -06:00
Stephen Taylor 3b0d3cac44 [ceph-osd] Skip pod wait in post-apply job when disruptive
The new, disruptive post-apply logic to restart ceph-osd pods more
efficiently on upgrade still waits for pods to be in a non-
disruptive state before restarting them disruptively. This change
skips that wait if a disruptive restart is in progress.

Change-Id: I484a3b899c61066aab6be43c4077fff2db6f54bc
2022-04-02 08:58:29 -06:00
Stephen Taylor 2fa26b2821 [ceph-osd] Add a disruptive OSD restart to the post-apply job
Currently the ceph-osd post-apply job always restarts OSDs without
disruption. This requires waiting for a healthy cluster state in
between failure domain restarts, which isn't possible in some
upgrade scenarios. In those scenarios where disruption is
acceptable and a simultaneous restart of all OSDs is required,
the disruptive_osd_restart value now provides this option.

Change-Id: I64bfc30382e86c22b0f577d85fceef0d5c106d94
2022-03-30 15:06:50 -06:00
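
Conceptually the post-apply job branches on that value; a simplified sketch (the label selector and helper are illustrative, not the chart's actual code):

    if [ "${DISRUPTIVE_OSD_RESTART}" = "true" ]; then
      # Restart all OSD pods at once, with no per-failure-domain health waits
      kubectl -n ceph delete pod -l application=ceph,component=osd --wait=false
    else
      # Non-disruptive path: restart one failure domain at a time and wait
      # for the cluster to recover in between (hypothetical helper)
      restart_by_failure_domain
    fi
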
Ruslan Aliev 109c629838 Add pre-check of storage locations
Since we are about to use wildcards in storage locations,
it is possible to have multiple matches, so we need to add a
pre-check before using the $STORAGE_LOCATION, $BLOCK_DB and
$BLOCK_WAL variables to ensure that the stored strings resolve to
one and only one block location.

Signed-off-by: Ruslan Aliev <raliev@mirantis.com>
Change-Id: I60180f988e90473e200e886b69788cc263359ad2
2022-03-29 04:30:34 +00:00
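
A sketch of such a pre-check (the function name and error handling are illustrative):

    # Ensure a possibly wildcarded location resolves to exactly one block device
    ensure_single_device() {
      local pattern="$1"
      local matches=( $(ls ${pattern} 2>/dev/null) )
      if [ "${#matches[@]}" -ne 1 ]; then
        echo "ERROR: '${pattern}' matched ${#matches[@]} devices, expected 1" >&2
        exit 1
      fi
      echo "${matches[0]}"
    }

    STORAGE_LOCATION=$(ensure_single_device "${STORAGE_LOCATION}")
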
Sigunov, Vladimir (vs422h) 728c340dc0 [CEPH] Discovering ceph-mon endpoints
This is a code improvement to reuse the ceph-mon endpoint discovery
function in different templates. Calling the above-mentioned function
from a single place (helm-infra snippets) allows less code maintenance
and simplifies further development.

Rev. 0.1 Charts version bump for ceph-client, ceph-mon, ceph-osd,
ceph-provisioners and helm-toolkit
Rev. 0.2 Mon endpoint discovery functionality added for
the rados gateway. ClusterRole and ClusterRoleBinding added.
Rev. 0.3 checkdns is allowed to correct ceph.conf for RGW deployment.
Rev. 0.4 Added RoleBinding to the deployment-rgw.
Rev. 0.5 Remove _namespace-client-ceph-config-manager.sh.tpl and
         the appropriate job, because of duplicated functionality.
         Related configuration has been removed.
Rev. 0.6 RoleBinding logic has been changed to meet rules:
    checkdns namespace - HAS ACCESS -> RGW namespace(s)

Change-Id: Ie0af212bdcbbc3aa53335689deed9b226e5d4d89
2022-02-11 14:30:43 -07:00
Stephen Taylor cb73c61b4e [ceph-osd] Remove wait for misplaced objects during OSD restarts
The wait for misplaced objects during the ceph-osd post-apply job
was added to prevent I/O disruption in the case where misplaced
objects cause multiple replicas in common failure domains. This
concern is only valid before OSD restarts begin because OSD
failures during the restart process won't cause replicas that
violate replication rules to appear elsewhere.

This change keeps the wait for misplaced objects prior to beginning
OSD restarts and removes it during those restarts. The wait during
OSD restarts now only waits for degraded objects to be recovered
before proceeding to the next failure domain.

Change-Id: Ic82c67b43089c7a2b45995d1fd9c285d5c0e7cbc
2021-11-23 12:46:49 -07:00
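
A sketch of the degraded-objects-only wait used during the restart loop (the real script's check may differ):

    # Misplaced objects are tolerated while OSDs restart; only degraded
    # objects block progress to the next failure domain
    while ceph pg stat | grep -q degraded; do
      echo "waiting for degraded objects to recover"
      sleep 10
    done
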
Parsons, Cliff (cp769u) cc793f2144 [ceph-osd] Update log-runner container for MAC
The log-runner previously was not included in the mandatory access
control (MAC) annotation for the OSD pods, which means it could not
have any AppArmor profile applied to it. This patchset adds that
capability for that container.

Change-Id: I11036789de45c0f8f66b51e15f2cc253e6cb230c
2021-10-26 18:50:28 +00:00
Gage Hugo 22e50a5569 Update htk requirements
This change updates the helm-toolkit path in each chart as part
of the move to Helm v3, which no longer provides helm serve.

Change-Id: I011e282616bf0b5a5c72c1db185c70d8c721695e
2021-10-06 01:02:28 +00:00
Sean Eagan b1a247e7f5 Helm 3 - Fix Job labels
If labels are not specified on a Job, kubernetes defaults them
to include the labels of their underlying Pod template. Helm 3
injects metadata into all resources [0], including an
`app.kubernetes.io/managed-by: Helm` label. Thus when Kubernetes
sees a Job's labels they are no longer empty and do not get
defaulted to the underlying Pod template's labels. This is a
problem since Job labels are depended on by
- Armada pre-upgrade delete hooks
- Armada wait logic configurations
- kubernetes-entrypoint dependencies

Thus for each Job template this adds labels matching the
underlying Pod template to retain the same labels that were
present with Helm 2.

[0]: https://github.com/helm/helm/pull/7649

Change-Id: I3b6b25fcc6a1af4d56f3e2b335615074e2f04b6d
2021-09-30 16:01:31 -05:00
Parsons, Cliff (cp769u) b704b9ad02 Ceph OSD log-runner container should run as ceph user
This PS changes the log-runner user ID to run as the ceph user
so that it has the appropriate permissions to write to /var/log/ceph
files.

Change-Id: I4dfd956130eb3a19ca49a21145b67faf88750d6f
2021-08-27 21:04:15 +00:00
Parsons, Cliff (cp769u) a0aec27ebc Fix Ceph checkDNS script
The checkDNS script which is run inside the ceph-mon pods has had
a bug for a while now. If a value of "up" is passed in, it adds
brackets around it, but then doesn't check for the brackets when
checking for a value of "up". This causes a value of "{up}" to be
written into the ceph.conf for the mon_host line and that causes
the mon_host to not be able to respond to ceph/rbd commands. Its
normally not a problem if DNS is working, but if DNS stops working
this can happen.

This patch changes the comparison to look for "{up}" instead of
"up" in three different files, which should fix the problem.

Change-Id: I89cf07b28ad8e0e529646977a0a36dd2df48966d
2021-08-25 14:17:54 +00:00
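
In shell terms the fix amounts to comparing against the bracketed placeholder that was actually written; a simplified sketch (the variable and helper names are hypothetical):

    # The script writes "{up}" (with brackets) when mon endpoints can't be
    # resolved, so the comparison must look for the bracketed form too
    if [ "${MON_HOST}" = "{up}" ]; then
      resolve_mon_endpoints   # hypothetical fallback instead of writing {up}
    fi
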
Chinasubbareddy Mallavarapu 7117c93772 [ceph-osd] Change var crash mount propagation to HostToContainer
- Since it is a security violation to mount anything under the /var
partition into pods, the mount propagation is changed to HostToContainer

Change-Id: If7a27304507a9d1bcb9efcef4fc1146f77080a4f
2021-08-05 14:33:06 +00:00
Parsons, Cliff (cp769u) b55143dec2 Limit Ceph OSD Container Security Contexts
Wherever possible, the ceph-osd containers need to run with the
least amount of privilege required. In some cases privileges
are granted that are not needed. This patchset modifies those
containers' security contexts to reduce them to only what
is needed.

Change-Id: I0d6633efae7452fee4ce98d3e7088a55123f0a78
2021-07-29 20:24:37 +00:00
Stephen Taylor c2ca599923 [ceph-osd] Mount /var/crash inside ceph-osd pods
This change adds /var/crash as a host-path volume mount for
ceph-osd pods in order to facilitate core dump capture when
ceph-osd daemons crash.

Change-Id: Ie517c64e08b11504f71d7d570394fbdb2ac8e54e
2021-07-20 15:30:19 -06:00
Stephen Taylor 07ceecd8d7 Export crash dumps when Ceph daemons crash
This change configures Ceph daemon pods so that
/var/lib/ceph/crash maps to a hostPath location that persists
when the pod restarts. This will allow for post-mortem examination
of crash dumps to attempt to understand why daemons have crashed.

Change-Id: I53277848f79a405b0809e0e3f19d90bbb80f3df8
2021-06-30 14:24:15 -06:00
Parsons, Cliff (cp769u) b3ebb46ce2 Ceph OSD Init Improvements
Some minor improvements are made in this patchset:
1) Move osd_disk_prechecks to the very beginning to make sure the
   required variables are set before running the bulk of the script.
2) Specify variables in a more consistent manner for readability.
3) Remove variables from CLI commands that are not used/set.

Change-Id: I6167b277e111ed59ccf4415e7f7d178fe4338cbd
2021-06-24 17:12:34 +00:00
Thiago Brito 5a0ba49d50 Prepending library/ to docker official images
This will ease mirroring capabilities for the docker official images.

Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I0f9177b0b83e4fad599ae0c3f3820202bf1d450d
2021-06-02 15:04:38 -03:00
Parsons, Cliff (cp769u) 17d9fe4de9 Refactor Ceph OSD Init Scripts - Second PS
1) Removed some remaining unsupported ceph-disk related code.
2) Refactored the code that determines when a disk should be
   zapped. Now there will be only one place where disk_zap is
   called.
3) Refactored the code that determines when LVM prepare should
   be called.
4) Improved the logging within the OSD init files

Change-Id: I194c82985f1f71b30d172f9e41438fa814500601
2021-05-27 22:34:54 +00:00
Parsons, Cliff (cp769u) aaa85e3fc5 Refactor Ceph OSD Init Scripts - First PS
This is the first of multiple updates to ceph-osd where the OSD
init code will be refactored for better sustainability.

This patchset makes 2 changes:

1) Removes "ceph-disk" support, as ceph-disk was removed from the
   ceph image since nautilus.
2) Separates the initialization code for the bluestore, filestore,
   and directory backend configuration options.

Change-Id: I116ce9cc8d3bac870adba8b84677ec652bbb0dd4
2021-04-12 19:36:32 +00:00
Stephen Taylor 131ea21512 [ceph-osd] Update directory-based OSD deployment for image changes
Directory-based OSDs are failing to deploy because 'python' has
been replaced with 'python3' in the image. This change updates the
python commands to use python3 instead.

There is also a dependency on forego, which has been removed from
the image. This change also modifies the deployment so that it
doesn't depend on forego.

Ownership of the OSD keyring file has also been changed so that it
is owned by the 'ceph' user, and the ceph-osd process now uses
--setuser and --setgroup to run as the same user.

Change-Id: If825df283bca0b9f54406084ac4b8f958a69eab7
2021-03-29 14:40:28 +00:00
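
A sketch of the resulting launch sequence for a directory-backed OSD (the paths are illustrative; the chart's actual script differs):

    # Keyring and data dir owned by ceph, and the daemon drops to that user
    chown -R ceph:ceph "/var/lib/ceph/osd/ceph-${OSD_ID}"
    exec /usr/bin/ceph-osd -f -i "${OSD_ID}" --setuser ceph --setgroup ceph
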
bw6938 2594e71488 [ceph-osd] update rbac api version
When deploying with Helm 3, the chart fails because Helm 3
no longer supports rbac.authorization.k8s.io/v1beta1,
while v1 works with both Helm 2 and Helm 3 (liujinyuan@inspur.com).

Change-Id: I40a5863c80489db8ea40028ffb6d89c43f6771d6
2021-02-21 04:49:58 +00:00
Chinasubbareddy Mallavarapu da289c78cb [CEPH] Uplift from Nautilus to Octopus release
This is to uplift the Ceph charts from the 14.X release to 15.X.

Change-Id: I4f7913967185dd52d4301c218450cfad9d0e2b2b
2021-02-03 22:34:53 +00:00
Stephen Taylor b2c0028349 [ceph-osd] Fix a bug with DB orphan volume removal
The volume naming convention prefixes logical volume names with
ceph-lv-, ceph-db-, or ceph-wal-. The code that was added recently
to remove orphaned DB and WAL volumes does a string replacement of
"db" or "wal" with "lv" when searching for corresponding data
volumes. This causes DB volumes to get identified incorrectly as
orphans and removed when "db" appears in the PV UUID portion of
the volume name.

Change-Id: I0c9477483b70c9ec844b37a6de10a50c0f2e1df8
2021-01-19 10:10:38 -07:00
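
The pitfall and the fix can be illustrated in plain bash (the volume name is hypothetical):

    db_volume="ceph-db-0bdb41c2-aaaa-bbbb-cccc-ddddeeeeffff"

    # Replacing every "db" also rewrites the "db" inside the PV UUID, so the
    # derived data-volume name never matches and the DB volume looks orphaned
    echo "${db_volume//db/lv}"               # ceph-lv-0blv41c2-... (wrong)

    # Anchoring the replacement to the prefix avoids the false positive
    echo "${db_volume/#ceph-db-/ceph-lv-}"   # ceph-lv-0bdb41c2-... (correct)
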
Stephen Taylor 4c097b0300 [ceph-osd] dmsetup remove logical devices using correct device names
Found another issue in disk_zap() where a needed update was missed when
https://review.opendev.org/c/openstack/openstack-helm-infra/+/745166
changed the logical volume naming convention.

The above patch set renamed volumes that followed the old convention,
so this logic will never be correct and must be updated.

Also added logic to clean up orphaned DB/WAL volumes if they are
encountered and removed some cases where a data disk is marked as in use
when it isn't set up correctly.

Change-Id: I8deeecfdb69df1f855f287caab8385ee3d6869e0
2021-01-11 14:49:43 -07:00
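
For context, device-mapper names are derived from the VG and LV names with internal dashes doubled, so the removal has to be built from the volume names actually in use; a hedged sketch (the device variable and lookup commands are illustrative):

    vg=$(pvs --noheadings -o vg_name "${OSD_DEVICE}" | tr -d ' ')
    for lv in $(lvs --noheadings -o lv_name "${vg}" | tr -d ' '); do
      # dm names double the dashes inside each VG/LV name
      dm_name="$(echo "${vg}" | sed 's/-/--/g')-$(echo "${lv}" | sed 's/-/--/g')"
      dmsetup remove "${dm_name}" || true
    done
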
Stephen Taylor 213596d71c [ceph-osd] Correct naming convention for logical volumes in disk_zap()
OSD logical volume names used to be based on the logical disk path,
i.e. /dev/sdb, but that has changed. The lvremove logic in disk_zap()
is still using the old naming convention. This change fixes that.

Change-Id: If32ab354670166a3c844991de1744de63a508303
2020-12-17 09:29:51 -07:00
Stephen Taylor 885285139e [ceph-osd] Alias synchronized commands and fix descriptor leak
There are many race conditions possible when multiple ceph-osd
pods are initialized on the same host at the same time using
shared metadata disks. The locked() function was introduced a
while back to address these, but some commands weren't locked,
locked() was being called all over the place, and there was a file
descriptor leak in locked(). This change cleans that up by
maintaining a single, global file descriptor for the lock file
that is only opened and closed once, and also by aliasing all of
the commands that need to use locked() and removing explicit calls
to locked() everywhere.

The global_locked() function has also been removed as it isn't
needed when individual commands that interact with disks use
locked() properly.

Change-Id: I0018cf0b3a25bced44c57c40e33043579c42de7a
2020-12-16 07:22:15 -07:00
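
The pattern described (one lock file, one global descriptor, and aliased commands) looks roughly like this sketch; the descriptor number, lock path, and timeout are illustrative:

    shopt -s expand_aliases                    # aliases need this in scripts
    exec 300>/var/lib/ceph/tmp/init-osd.lock   # opened once for the whole run

    locked() {
      flock -w 600 300    # serialize against other osd-init pods on this host
      "$@"
      local rc=$?
      flock -u 300        # release the lock without closing the descriptor
      return ${rc}
    }

    # Route disk-touching commands through the lock instead of calling
    # locked() explicitly everywhere
    alias pvcreate='locked pvcreate' vgcreate='locked vgcreate' lvcreate='locked lvcreate'
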
Singh, Jasvinder (js581j) ae96308ef1 [ceph-osd] Remove default OSD configuration
The default, directory-based OSD configuration doesn't appear to work
correctly and isn't really being used by anyone. It has been commented
out and the comments have been enhanced to document the OSD config
better. With this change there is no default configuration anymore, so
the user must configure OSDs properly in their environment in
values.yaml in order to deploy OSDs using this chart.

Change-Id: I8caecf847ffc1fefe9cb1817d1d2b6d58b297f72
2020-12-01 10:44:21 -07:00
Taylor, Stephen (st053q) e37d1fc2ab [ceph-osd] Add a check for misplaced objects to the post-apply job
OSD failures during an update can cause degraded and misplaced
objects. The post-apply job restarts OSDs in failure domain
batches in order to accomplish the restarts efficiently. There is
already a wait for degraded objects to ensure that OSDs are not
restarted on degraded PGs, but misplaced objects could mean that
multiple object replicas exist in the same failure domain, so the
job should wait for those to recover as well before restarting
OSDs in order to avoid potential disruption under these failure
conditions.

Change-Id: I39606e388a9a1d3a4e9c547de56aac4fc5606ea2
2020-11-30 10:17:40 -07:00
Taylor, Stephen (st053q) 791b0de5ee [ceph-osd] Fix post-apply job failure related to fault tolerance
A recent change to wait_for_pods() to allow for fault tolerance
appears to be causing wait_for_pgs() to fail and exit the post-
apply script prematurely in some cases. The existing
wait_for_degraded_objects() logic won't pass until pods and PGs
have recovered while the noout flag is set, so the pod and PG
waits can simply be removed.

Change-Id: I5fd7f422d710c18dee237c0ae97ae1a770606605
2020-11-24 06:30:37 -07:00
Andrii Ostapenko ca372bfea6 Fix typo in check inactive PGs logic
Issue introduced in https://review.opendev.org/761031

Change-Id: I154f91e17b5d9a84282197ae843c5aab2ce1d0be
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-11-09 17:53:41 +00:00
Kabanov, Dmitrii 011e5876c0 [ceph-osd] Check inactive PGs multiple times
The PS updates the post-apply job to allow checking multiple times
for inactive PGs that are not peering. The wait_for_pgs() function
fails only after 10 sequential positive checks.

Change-Id: I98359894477c8e3556450b60b25d62773666b034
2020-11-03 00:50:42 +00:00
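
A sketch of a check that only gives up after several consecutive positive results (the threshold and the stuck-PG query are illustrative):

    consecutive=0
    while true; do
      stuck=$(ceph pg dump_stuck inactive 2>/dev/null | awk 'NR>1 && $2 !~ /peering/' | wc -l)
      if [ "${stuck}" -eq 0 ]; then
        break                     # nothing stuck inactive; continue the wait
      fi
      consecutive=$((consecutive + 1))
      if [ "${consecutive}" -ge 10 ]; then
        echo "inactive PGs are still not peering after ${consecutive} checks" >&2
        exit 1
      fi
      sleep 3
    done
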
Kabanov, Dmitrii d39abfe0f0 [ceph-osd] Update post apply job
The PS updates the wait_for_pods() function in the post-apply script.
The changes allow wait_for_pods() to pass once the required percentage
of OSDs (REQUIRED_PERCENT_OF_OSDS) has been reached. A piece of code
that is no longer needed has also been removed.

Change-Id: I56f1292682cf2aa933c913df162d6f615cf1a133
2020-10-23 19:00:58 +00:00
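
A sketch of waiting for a required percentage of OSDs to be up (REQUIRED_PERCENT_OF_OSDS comes from the commit; the query, default value, and use of jq are assumptions):

    REQUIRED_PERCENT_OF_OSDS=${REQUIRED_PERCENT_OF_OSDS:-75}

    while true; do
      total=$(ceph osd stat -f json | jq -r '.num_osds')
      up=$(ceph osd stat -f json | jq -r '.num_up_osds')
      if [ "${total}" -gt 0 ] && \
         [ $(( up * 100 / total )) -ge "${REQUIRED_PERCENT_OF_OSDS}" ]; then
        break
      fi
      echo "waiting: ${up}/${total} OSDs up"
      sleep 10
    done
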
Stephen Taylor 16b72c1e22 [ceph-osd] Synchronization audit for the ceph-volume osd-init script
There are race conditions in the ceph-volume osd-init script that
occasionally cause deployment and OSD restart issues. This change
attempts to resolve those and stabilize the script when multiple
instances run simultaneously on the same host.

Change-Id: I79407059fa20fb51c6840717a083a8dc616ba410
2020-10-16 18:30:57 +00:00
Chinasubbareddy Mallavarapu 321b8cb7e3 [ceph-osd] Logic improvement for used osd disk detection
This improves the logic that detects used OSD disks so that the
scripts will not zap OSD disks aggressively.

It also adds a debugging mode for pvdisplay commands to capture more
logs during failure scenarios, along with reading the OSD force repair
flag from values.

Change-Id: Id2996211dd92ac963ad531f8671a7cc8f7b7d2d5
2020-10-15 13:13:28 +00:00
Chinasubbareddy Mallavarapu 6a0feecaef [ceph-osd] Fix the sync issue between osds when using shared disk for metadata
This fixes synchronization between Ceph OSDs when they use a shared
disk for metadata, since they conflict while preparing the metadata disk.

A lock is added when the first OSD prepares the shared metadata disk so
that the other OSDs wait for the lock. A udev settle is also added in a
few places to pick up the latest tags on LVM devices.

Change-Id: I018bd12a3f02cf8cd3486b9c97e14b138b5dac76
2020-10-11 04:04:53 +00:00
Stephen Taylor 38d9f35c05 [ceph-osd] Don't try to prepare OSD disks that are already deployed
This addresses an issue that can prevent some OSDs from being able
to restart properly after they have been deployed. Some OSDs try to
prepare their disks again on restart and end up crash looping. This
change fixes that.

Change-Id: I9edc1326c3544d9f3e8b6e3ff83529930a28dfc6
2020-10-05 18:40:48 -05:00
Taylor, Stephen (st053q) 173bf928df [ceph-osd] Search for complete logical volume name for OSD data volumes
The existing search for logical volumes to determine whether an OSD data
volume is already in use is incomplete and can yield false positives in
some cases. This change makes the search more correct and specific in
order to avoid those.

Change-Id: Ic2d06f7539567f0948efef563c1942b71e0293ff
2020-09-30 04:25:30 +00:00
Chinasubbareddy Mallavarapu 67c905cae8 [ceph-osd] wait for only osd pods from post apply job
This waits only for OSD pods during the ceph-osd chart install/upgrade
process.

Change-Id: I99bc7c1548f7b13c93059ac832b9f0589b049fc7
2020-09-25 08:45:51 -05:00