Only wait for essential pods in cert recovery

The certificate recovery role will trigger a restart of every pod
in the k8s cluster so that they can be updated with the latest
certificate information.

After the pods restart, the procedure waits for every pod to recover
and become READY. This change modifies that behaviour to wait only for
essential pods, namely those in the core namespaces armada,
cert-manager, flux-helm and kube-system.

Test case:

PASS: Run certificate recovery with crashing pods in a custom namespace

Closes-Bug: 2058751

Signed-off-by: Rei Oliveira <Reinildes.JoseMateusOliveira@windriver.com>
Change-Id: I3ea403a3e324ecbb5f2c1f56d6ce1c8bd80fabee
Rei Oliveira 2024-03-15 11:40:26 -03:00 committed by Reinildes Oliveira
parent 3ac6db5973
commit 5a304af6e1
1 changed file with 2 additions and 1 deletion

@@ -81,8 +81,9 @@
 - name: Wait pods to restart (become READY) on controller
   shell: >-
     kubectl get po -l '!job-name' -A --no-headers -o
-    'custom-columns=NAME:.metadata.name,
+    'custom-columns=NAME:.metadata.name, NAMESPACE:.metadata.namespace,
     READY:.status.containerStatuses[*].ready,NODE:.spec.nodeName'
+    | grep "armada\|cert-manager\|flux-helm\|kube-system"
     | grep -v calico-node
     | grep $(hostname)
     | grep -cv true
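
Note: the filter above can be tried offline without a cluster. The sketch below is not part of this change. It is a hypothetical helper (the name count_not_ready and the sample pod names are assumptions) that reads "NAME NAMESPACE READY NODE" lines on stdin and counts pods that are not ready in the four essential namespaces on one node. It deliberately differs from the grep pipeline in two ways: it matches the NAMESPACE column exactly instead of matching the whole line, and it treats a pod as ready only when every container reports true, whereas `grep -cv true` accepts any line containing "true".

```shell
#!/bin/sh
# Hypothetical helper, not taken from the role: count pods on one node,
# in the essential namespaces, that are not fully ready.
# Input on stdin: whitespace-separated "NAME NAMESPACE READY NODE" lines.
count_not_ready() {
    node="$1"
    awk -v node="$node" '
        # Keep only the essential namespaces (exact match on column 2).
        $2 ~ /^(armada|cert-manager|flux-helm|kube-system)$/ &&
        # Skip calico-node pods, as the original pipeline does.
        $1 !~ /^calico-node/ &&
        # Keep only pods scheduled on the requested node.
        $4 == node &&
        # Stricter than "grep -cv true": every container must be ready.
        $3 !~ /^true(,true)*$/ { n++ }
        END { print n + 0 }
    '
}

# Sample data (made up): only the cert-manager pod should be counted.
printf '%s\n' \
    'coredns-abc kube-system true controller-0' \
    'my-app-xyz custom false controller-0' \
    'cm-webhook-1 cert-manager true,false controller-0' \
    | count_not_ready controller-0
```

Against a live cluster the same function could be fed from the kubectl command in the task, for example `kubectl get po -l '!job-name' -A --no-headers -o 'custom-columns=NAME:.metadata.name,NAMESPACE:.metadata.namespace,READY:.status.containerStatuses[*].ready,NODE:.spec.nodeName' | count_not_ready "$(hostname)"`. The crashing pod in the custom namespace is ignored, which matches the test case in the commit message.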