This change updates the Ceph images to 18.2.2 images patched with a
fix for https://tracker.ceph.com/issues/63684. It also reverts the
package repository in the deployment scripts to use the debian-reef
directory on download.ceph.com instead of debian-18.2.1. The issue
with the repo that prompted the previous change to debian-18.2.1
has been resolved and the more generic debian-reef directory may
now be used again.
Change-Id: I85be0cfa73f752019fc3689887dbfd36cec3f6b2
This change adds an openstack-support-rook zuul job to test
deploying Ceph using the upstream Rook helm charts found in the
https://charts.rook.io/release repository. Minor changes to the
storage keyring manager job and the mon discovery service in the
ceph-mon chart are also included to allow the ceph-mon chart to be
used to generate auth keys and deploy the mon discovery service
necessary for OpenStack.
Change-Id: Iee4174dc54b6a7aac6520c448a54adb1325cccab
This change converts the readiness and liveness probes in the Ceph
charts to use the functions from the Helm toolkit rather than
having hard-coded probe definitions. This allows probe configs to
be overridden in values.yaml without rebuilding charts.
Change-Id: I68a01b518f12d33fe4f87f86494a5f4e19be982e
This is simply to document the fact that mon_allow_pooL_size_one
must be configured via cluster_commands in the ceph-client chart.
Adding it to ceph.conf via the conf values in the ceph-mon chart
doesn't seem to configure the mons effectively.
Change-Id: Ic7e9a0eade9c0b4028ec232ff7ad574b8574615d
This change updates all Ceph image references to use Focal images
for all charts in openstack-helm-infra.
Change-Id: I759d3bdcf1ff332413e14e367d702c3b4ec0de44
Based on spec in openstack-helm repo,
support-OCI-image-registry-with-authentication-turned-on.rst
Each Helm chart can configure an OCI image registry and
credentials to use. A Kubernetes secret is then created with these
info. Service Accounts then specify an imagePullSecret specifying
the Secret with creds for the registry. Then any pod using one
of these ServiceAccounts may pull images from an authenticated
container registry.
Change-Id: Iebda4c7a861aa13db921328776b20c14ba346269
This change allows mons to be restarted unconditionally by the
ceph-mon chart. This can be useful in upgrade scenarios where
ceph-mon pods need to be forcibly restarted for any reason.
Change-Id: I93a1426c2ca02b060f7a606495893feb2813c142
Under some circumstances, armada job attempts to recreate an existing
Service Account for ceph-mgr. This patchset aims to remediate the issue.
Change-Id: I69bb9045c0e2f24dc2fa9e94ab6a09a58221e1f5
This change corrects the ceph-templates configmap name to be
release-specific like the other configmaps in the chart. This
allows for more robustness in downstream implementations.
Change-Id: I1d09d14f9ba94dbbe11d8a80776f57b9cdf41210
The recent name changes to the ceph-mon configmaps did not get
propagated to all resources in the chart. The hard-coded names in
the unchanged cases were correct and resources deployed
successfully, but this change corrects those configmap names across
all resources for the sake of robustness.
Change-Id: I3195e5ba2726892a7b6e0c31c0fac43bae4aa399
This change makes the ceph-mon configmap names dynamic based on
release name to match how the ceph-osd chart is naming configmaps.
The new ceph-mon post-apply job needs this in some cases in order
not to have conflicting configmap names in separate releases.
Change-Id: Id26d0a8310ccff80a608e25d2b0a74a41f9e6a55
This is a code improvement to reuse ceph monitor doscovering function
in different templates. Calling the mentioned above function from
a single place (helm-infra snippets) allows less code maintenance
and simlifies further development.
Rev. 0.1 Charts version bump for ceph-client, ceph-mon, ceph-osd,
ceph-provisioners and helm-toolkit
Rev. 0.2 Mon endpoint discovery functionality added for
the rados gateway. ClusterRole and ClusterRoleBinding added.
Rev. 0.3 checkdns is allowed to correct ceph.conf for RGW deployment.
Rev. 0.4 Added RoleBinding to the deployment-rgw.
Rev. 0.5 Remove _namespace-client-ceph-config-manager.sh.tpl and
the appropriate job, because of duplicated functionality.
Related configuration has been removed.
Rev. 0.6 RoleBinding logic has been changed to meet rules:
checkdns namespace - HAS ACCESS -> RGW namespace(s)
Change-Id: Ie0af212bdcbbc3aa53335689deed9b226e5d4d89
If the OnDelete pod restart strategy is used for the ceph-mon
daemonset, run a post-apply job to restart the ceph-mon pods one
at a time. Otherwise the mons could restart before the mgrs, which
can be problematic in some upgrade scenarios.
Change-Id: I57f87130e95088217c3cfe73512caaae41d3ef22
This change moves the ceph-mgr deployment from the ceph-client
chart to the ceph-mon chart. Its purpose is to facilitate the
proper Ceph upgrade procedure, which prescribes restarting mgr
daemons before mon daemons.
There will be additional work required to implement the correct
daemon restart procedure for upgrades. This change only addresses
the move of the ceph-mgr deployment.
Change-Id: I3ac4a75f776760425c88a0ba1edae5fb339f128d
This change adds a condition to ensure that an IP address was
obtained for a ceph-mon kubernetes endpoint before building the
expected endpoint string and checking it against the monmap. If an
IP address isn't available, the check is skipped for that mon.
Change-Id: I45a2e2987b5ef0c27b0bb765f7967fcce1af62e4
The ceph-mon-check pod only knew about the v1 port before, and didn't
have the proper mon_host configuration in its ceph.conf file. This
patchset adds knowledge about the v2 port also and correctly configures
the ceph.conf file. Also fixes a namespace hardcoding that was found
in the last ceph-mon-check fix.
Change-Id: I460e43864a2d4b0683b67ae13bf6429d846173fc
A race condition exists that can cause the mon-check pod to delete
mons from the monmap that are only down temporarily. This sometimes
causes issues with the monmap when those mons come back up. This
change adds a check to see if the list of mons in the monmap is
larger than expected before removing anything. If not, the monmap
is left alone.
Change-Id: I43b186bf80741fc178c6806d24c179417d7f2406
This change updates the helm-toolkit path in each chart as part
of the move to helm v3. This is due to a lack of helm serve.
Change-Id: I011e282616bf0b5a5c72c1db185c70d8c721695e
If labels are not specified on a Job, kubernetes defaults them
to include the labels of their underlying Pod template. Helm 3
injects metadata into all resources [0] including a
`app.kubernetes.io/managed-by: Helm` label. Thus when kubernetes
sees a Job's labels they are no longer empty and thus do not get
defaulted to the underlying Pod template's labels. This is a
problem since Job labels are depended on by
- Armada pre-upgrade delete hooks
- Armada wait logic configurations
- kubernetes-entrypoint dependencies
Thus for each Job template this adds labels matching the
underlying Pod template to retain the same labels that were
present with Helm 2.
[0]: https://github.com/helm/helm/pull/7649
Change-Id: I3b6b25fcc6a1af4d56f3e2b335615074e2f04b6d
The checkDNS script which is run inside the ceph-mon pods has had
a bug for a while now. If a value of "up" is passed in, it adds
brackets around it, but then doesn't check for the brackets when
checking for a value of "up". This causes a value of "{up}" to be
written into the ceph.conf for the mon_host line and that causes
the mon_host to not be able to respond to ceph/rbd commands. Its
normally not a problem if DNS is working, but if DNS stops working
this can happen.
This patch changes the comparison to look for "{up}" instead of
"up" in three different files, which should fix the problem.
Change-Id: I89cf07b28ad8e0e529646977a0a36dd2df48966d
This PS updates the mon-check reap-zombies python script to consider
the more recent Ceph changes, including the fact that there is now
a v1 and v2 backend. In addition, it executes the reap-zombies script
with the python3 binary, as the basic 'python' binary does not exist
in the container.
Change-Id: Id079671f03cc5ddbe694f2aa8c9d2480dc573983
This change configures Ceph daemon pods so that
/var/lib/ceph/crash maps to a hostPath location that persists
when the pod restarts. This will allow for post-mortem examination
of crash dumps to attempt to understand why daemons have crashed.
Change-Id: I53277848f79a405b0809e0e3f19d90bbb80f3df8
The current implementation of the Ceph CSI provisioner is tied too
closely with the older Ceph RBD provisioner, which doesn't let the
deployer deploy Ceph CSI provisioner without the old RBD provisioner.
This patchset will decouple them such that they can be deployed
independently from one another.
A few other changes are needed as well:
1) The deployment/gate scripts are updated so that the old RBD and
CSI RBD provisioners are separately enabled/disabled as needed.
The original RBD provisioner is now deprecated.
2) Ceph-mon chart is updated because it had some RBD storageclass
data in values.yaml that is not needed for ceph-mon deployment.
3) Fixed a couple of bugs in job-cephfs-client-key.yaml where RBD
parameters were being used instead of cephfs parameters.
Change-Id: Icb5f78dcefa51990baf1b6d92411eb641c2ea9e2
This will ease mirroring capabilities for the docker official images.
Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I0f9177b0b83e4fad599ae0c3f3820202bf1d450d
Since k8s v1.11+, the annotation `service.alpha.kubernetes.io/tolerate-unready-endpoints` is deprecated. we should use Service.spec.publishNotReadyAddresses instead.
Change-Id: Ic4f82b8e78770ff29637937c4bcb9af71b53f8d3
This is to update python3 for checkObjectReplication.py script
since python2 got removed from ceph images.
Change-Id: I006a4becaeefb2a0cbef6f5d1fb56c7fc40b0170
This PS is to address security best practices concerning running
containers as a non-privileged user and disallowing privilege
escalation.
Change-Id: If4c0e9fe446091ba75d1a9818ffd3a0933285af4
This is to address zombie processes found in ceph-mon containers due
to the mon-check.sh monitoring script. With shareProcessNamespace the
/pause container will properly handle the defunct processes.
Change-Id: Ic111fd28b517f4c9b59ab23626753e9c73db1b1b
Since we introduced chart version check in gates, requirements are not
satisfied with strict check of 0.1.0
Change-Id: I15950b735b4f8566bc0018fe4f4ea9ba729235fc
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
Added chart lint in zuul CI to enhance the stability for charts.
Fixed some lint errors in the current charts.
Change-Id: I9df4024c7ccf8b3510e665fc07ba0f38871fcbdb
1) Changed the pod name and container name to pick name dynamically for
osd,mon,mgr and mds.
2) Added Init container for ceph-provisioners.
Change-Id: I3e27d51c055010cff982ddb0951d01ea8adac234
Signed-off-by: diwakar thyagaraj <diwakar.chitoor.thyagaraj@att.com>
Fix issues introduced by https://review.opendev.org/#/c/735648
with extra 'ceph-' in service_account and security context not
rendered for keyring generator containers.
Change-Id: Ie53b3407dbd7345d37c92c60a04f3badf735f6a6
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
Unrestrict octal values rule since benefits of file modes readability
exceed possible issues with yaml 1.2 adoption in future k8s versions.
These issues will be addressed when/if they occur.
Also ensure osh-infra is a required project for lint job, that matters
when running job against another project.
Change-Id: Ic5e327cf40c4b09c90738baff56419a6cef132da
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
This updates the ceph-mon chart to include the pod
security context on the pod template
This also adds the container security context to set
readOnlyRootFilesystem flag to true
Change-Id: I4c9e292eaf3d76ee80f50553d1cbc8cdc6f57cac
This commit rewrites lint job to make template linting available.
Currently yamllint is run in warning mode against all templates
rendered with default values. Duplicates detected and issues will be
addressed in subsequent commits.
Also all y*ml files are added for linting and corresponding code changes
are made. For non-templates warning rules are disabled to improve
readability. Chart and requirements yamls are also modified in the name
of consistency.
Change-Id: Ife6727c5721a00c65902340d95b7edb0a9c77365
The PS adds kubernetes tolerations for deployments from ceph-client,
ceph-mon, ceph-provisioners and ceph-rgw charts.
Change-Id: If96f5f2058fca6e145e537e95af39089f441ccbb
The current copyright refers to a non-existent group
"openstack helm authors" with often out-of-date references that
are confusing when adding a new file to the repo.
This change removes all references to this copyright by the
non-existent group and any blank lines underneath.
Change-Id: I1882738cf9757c5350a8533876fd37b5920b5235
Cephfs tests were disabled in order to merge
https://review.opendev.org/695568 due to gate failures that were
blocking it. CephFS isn't used in openstack-helm-infra, so it
wasn't required for that work. This change re-enables the cephfs
tests so we can work through any issues that are causing further
failures.
Since the the issue got fixed in 14.2.8 , upgrading all daemons to 14.2.8.
(https://tracker.ceph.com/issues/43770)
Change-Id: I376d39b7ee00ccb1ab8046b58f92b19a822272e1