diff --git a/doc/source/testing/ceph-node-resiliency.rst b/doc/source/testing/ceph-node-resiliency.rst
new file mode 100644
index 0000000000..5209761645
--- /dev/null
+++ b/doc/source/testing/ceph-node-resiliency.rst
@@ -0,0 +1,1239 @@
+==================================================
+Ceph - Node Reduction, Expansion and Ceph Recovery
+==================================================
+
+This document captures the steps and results from node reduction and node
+expansion, as well as Ceph recovery.
+
+Test Scenarios:
+===============
+1) Node reduction: Shut down 1 of 3 nodes to simulate a node failure. Capture
+the effect of the node failure on Ceph as well as on the OpenStack services
+that use Ceph.
+
+2) Node expansion: Apply the Ceph and OpenStack related labels to another,
+unused k8s node. Node expansion provides additional resources on which k8s
+can schedule PODs for Ceph and OpenStack services.
+
+3) Fix Ceph cluster: After node expansion, perform maintenance on the Ceph
+cluster to ensure quorum is reached and Ceph is HEALTH_OK.
+
+Setup:
+======
+- 6 node (VM based) env
+- Only 3 nodes will have the Ceph and OpenStack related labels. Each of these
+  3 nodes will have one MON and one OSD running on it.
+- Followed the OSH multinode guide steps to set up the nodes and install the
+  k8s cluster
+- Followed the OSH multinode guide steps to install the Ceph and OpenStack
+  charts up to Cinder.
+
+Steps:
+======
+1) Initial Ceph and OpenStack deployment:
+Install the Ceph and OpenStack charts on 3 nodes (mnode1, mnode2 and mnode3).
+Capture the Ceph cluster status as well as the k8s POD status.
+
+2) Node reduction (failure):
+Shut down 1 of the 3 nodes (mnode3) to test node failure. This should cause
+the Ceph cluster to go into the HEALTH_WARN state as it has lost 1 MON and
+1 OSD. Capture the Ceph cluster status as well as the k8s POD status.
+
+3) Node expansion:
+Add the Ceph and OpenStack related labels to a 4th node (mnode4) for
+expansion. The Ceph cluster shows a new MON and OSD being added to the
+cluster. However, the Ceph cluster continues to show HEALTH_WARN because
+1 MON and 1 OSD are still missing.
+
+4) Ceph cluster recovery:
+Perform Ceph maintenance to bring the Ceph cluster back to HEALTH_OK. Remove
+the lost MON and OSD from the Ceph cluster.
+
+
+Step 1: Initial Ceph and OpenStack deployment
+=============================================
+
+.. note::
+  Make sure only 3 nodes (mnode1, mnode2, mnode3) have the Ceph and OpenStack
+  related labels. k8s will only schedule PODs on these 3 nodes.
+
+``Ceph status:``
+
+.. code-block:: console
+
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-5qn68 -- ceph -s
+    cluster:
+      id:     54d9af7e-da6d-4980-9075-96bb145db65c
+      health: HEALTH_OK
+
+    services:
+      mon: 3 daemons, quorum mnode1,mnode2,mnode3
+      mgr: mnode2(active), standbys: mnode3
+      mds: cephfs-1/1/1 up {0=mds-ceph-mds-6f66956547-c25cx=up:active}, 1 up:standby
+      osd: 3 osds: 3 up, 3 in
+      rgw: 2 daemons active
+
+    data:
+      pools:   19 pools, 101 pgs
+      objects: 354 objects, 260 MB
+      usage:   77807 MB used, 70106 MB / 144 GB avail
+      pgs:     101 active+clean
+
+    io:
+      client: 48769 B/s wr, 0 op/s rd, 12 op/s wr
+
+``Ceph MON Status:``
+
+.. code-block:: console
+
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-5qn68 -- ceph mon_status -f json-pretty
+
+.. 
code-block:: json + + { + "name": "mnode2", + "rank": 1, + "state": "peon", + "election_epoch": 92, + "quorum": [ + 0, + 1, + 2 + ], + "features": { + "required_con": "153140804152475648", + "required_mon": [ + "kraken", + "luminous" + ], + "quorum_con": "2305244844532236283", + "quorum_mon": [ + "kraken", + "luminous" + ] + }, + "outside_quorum": [], + "extra_probe_peers": [], + "sync_provider": [], + "monmap": { + "epoch": 1, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 21:02:24.330403", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode3", + "addr": "192.168.10.248:6789/0", + "public_addr": "192.168.10.248:6789/0" + } + ] + }, + "feature_map": { + "mon": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "mds": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "osd": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "client": { + "group": { + "features": "0x7010fb86aa42ada", + "release": "jewel", + "num": 1 + }, + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + } + } + } + + +``Ceph quorum status:`` + +.. code-block:: console + + ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-5qn68 -- ceph quorum_status -f json-pretty + +.. code-block:: json + + { + "election_epoch": 92, + "quorum": [ + 0, + 1, + 2 + ], + "quorum_names": [ + "mnode1", + "mnode2", + "mnode3" + ], + "quorum_leader_name": "mnode1", + "monmap": { + "epoch": 1, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 21:02:24.330403", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode3", + "addr": "192.168.10.248:6789/0", + "public_addr": "192.168.10.248:6789/0" + } + ] + } + } + + +``Ceph PODs:`` + +.. 
code-block:: console + + ubuntu@mnode1:/opt/openstack-helm$ kubectl get pods -n ceph --show-all=false -o wide + NAME READY STATUS RESTARTS AGE IP NODE + ceph-cephfs-provisioner-784c6f9d59-ndsgn 1/1 Running 0 1h 192.168.4.15 mnode2 + ceph-cephfs-provisioner-784c6f9d59-vgzzx 1/1 Running 0 1h 192.168.3.17 mnode3 + ceph-mds-6f66956547-5x4ng 1/1 Running 0 1h 192.168.4.14 mnode2 + ceph-mds-6f66956547-c25cx 1/1 Running 0 1h 192.168.3.14 mnode3 + ceph-mgr-5746dd89db-9dbmv 1/1 Running 0 1h 192.168.10.248 mnode3 + ceph-mgr-5746dd89db-qq4nl 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-5qn68 1/1 Running 0 1h 192.168.10.248 mnode3 + ceph-mon-check-d85994946-4g5xc 1/1 Running 0 1h 192.168.4.8 mnode2 + ceph-mon-mwkj9 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-ql9zp 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-c7gdd 1/1 Running 0 1h 192.168.10.248 mnode3 + ceph-osd-default-83945928-s6gs6 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-vsc5b 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-rbd-provisioner-5bfb577ffd-j6hlx 1/1 Running 0 1h 192.168.4.16 mnode2 + ceph-rbd-provisioner-5bfb577ffd-zdx2d 1/1 Running 0 1h 192.168.3.16 mnode3 + ceph-rgw-6c64b444d7-7bgqs 1/1 Running 0 1h 192.168.3.12 mnode3 + ceph-rgw-6c64b444d7-hv6vn 1/1 Running 0 1h 192.168.4.13 mnode2 + ingress-796d8cf8d6-4txkq 1/1 Running 0 1h 192.168.2.6 mnode5 + ingress-796d8cf8d6-9t7m8 1/1 Running 0 1h 192.168.5.4 mnode4 + ingress-error-pages-54454dc79b-hhb4f 1/1 Running 0 1h 192.168.2.5 mnode5 + ingress-error-pages-54454dc79b-twpgc 1/1 Running 0 1h 192.168.4.4 mnode2 + + +``OpenStack PODs:`` + +.. code-block:: console + + ubuntu@mnode1:/opt/openstack-helm$ kubectl get pods -n openstack --show-all=false -o wide + NAME READY STATUS RESTARTS AGE IP NODE + cinder-api-66f4f9678-2lgwk 1/1 Running 0 12m 192.168.3.41 mnode3 + cinder-api-66f4f9678-flvr5 1/1 Running 0 12m 192.168.0.202 mnode1 + cinder-backup-659b68b474-582kr 1/1 Running 0 12m 192.168.4.39 mnode2 + cinder-scheduler-6778f6f88c-mm9mt 1/1 Running 0 12m 192.168.0.201 mnode1 + cinder-volume-79b9bd8bb9-qsxdk 1/1 Running 0 12m 192.168.4.40 mnode2 + glance-api-676fd49d4d-j4bdb 1/1 Running 0 16m 192.168.3.37 mnode3 + glance-api-676fd49d4d-wtxqt 1/1 Running 0 16m 192.168.4.31 mnode2 + glance-registry-6f45f5bcf7-lhnrs 1/1 Running 0 16m 192.168.3.34 mnode3 + glance-registry-6f45f5bcf7-pbsnl 1/1 Running 0 16m 192.168.0.196 mnode1 + ingress-7b4bc84cdd-9fs78 1/1 Running 0 1h 192.168.5.3 mnode4 + ingress-7b4bc84cdd-wztz7 1/1 Running 0 1h 192.168.1.4 mnode6 + ingress-error-pages-586c7f86d6-2jl5q 1/1 Running 0 1h 192.168.2.4 mnode5 + ingress-error-pages-586c7f86d6-455j5 1/1 Running 0 1h 192.168.3.3 mnode3 + keystone-api-5bcc7cb698-dzm8q 1/1 Running 0 25m 192.168.4.24 mnode2 + keystone-api-5bcc7cb698-vvwwr 1/1 Running 0 25m 192.168.3.25 mnode3 + mariadb-ingress-84894687fd-dfnkm 1/1 Running 2 1h 192.168.3.20 mnode3 + mariadb-ingress-error-pages-78fb865f84-p8lpg 1/1 Running 0 1h 192.168.4.17 mnode2 + mariadb-server-0 1/1 Running 0 1h 192.168.4.18 mnode2 + memcached-memcached-5db74ddfd5-wfr9q 1/1 Running 0 29m 192.168.3.23 mnode3 + rabbitmq-rabbitmq-0 1/1 Running 0 1h 192.168.3.21 mnode3 + rabbitmq-rabbitmq-1 1/1 Running 0 1h 192.168.4.19 mnode2 + rabbitmq-rabbitmq-2 1/1 Running 0 1h 192.168.0.195 mnode1 + +``Result/Observation:`` + +- Ceph cluster is in HEALTH_OK state with 3 MONs and 3 OSDs. +- All PODs are in running state. 
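+
+As a sanity check after Step 1, the node labels can be listed to confirm that
+only mnode1, mnode2 and mnode3 carry the Ceph and OpenStack related labels.
+The selectors below are illustrative assumptions based on the OSH multinode
+guide defaults (``ceph-mon=enabled``, ``ceph-osd=enabled``,
+``openstack-control-plane=enabled``); adjust them to whatever label keys your
+chart overrides actually select on.
+
+.. code-block:: console
+
+  # Show all labels per node.
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl get nodes --show-labels
+
+  # Assumed OSH-style selectors; only mnode1, mnode2 and mnode3 should match.
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl get nodes -l ceph-mon=enabled
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl get nodes -l ceph-osd=enabled
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl get nodes -l openstack-control-plane=enabled
+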
+
+
+Step 2: Node reduction (failure):
+=================================
+
+Shut down 1 of the 3 nodes (mnode1, mnode2, mnode3) to simulate a node
+failure/loss.
+
+In this test env, let's shut down the ``mnode3`` node.
+
+``Following are PODs scheduled on mnode3 before shutdown:``
+
+.. code-block:: console
+
+  ceph          ceph-cephfs-provisioner-784c6f9d59-vgzzx     0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  ceph          ceph-mds-6f66956547-c25cx                    0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  ceph          ceph-mgr-5746dd89db-9dbmv                    0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  ceph          ceph-mon-5qn68                               0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  ceph          ceph-osd-default-83945928-c7gdd              0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  ceph          ceph-rbd-provisioner-5bfb577ffd-zdx2d        0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  ceph          ceph-rgw-6c64b444d7-7bgqs                    0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  kube-system   ingress-ggckm                                0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  kube-system   kube-flannel-ds-hs29q                        0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  kube-system   kube-proxy-gqpz5                             0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     cinder-api-66f4f9678-2lgwk                   0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     glance-api-676fd49d4d-j4bdb                  0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     glance-registry-6f45f5bcf7-lhnrs             0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     ingress-error-pages-586c7f86d6-455j5         0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     keystone-api-5bcc7cb698-vvwwr                0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     mariadb-ingress-84894687fd-dfnkm             0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     memcached-memcached-5db74ddfd5-wfr9q         0 (0%)   0 (0%)   0 (0%)   0 (0%)
+  openstack     rabbitmq-rabbitmq-0                          0 (0%)   0 (0%)   0 (0%)   0 (0%)
+
+.. note::
+  In this test env, the MariaDB chart is deployed with only 1 replica. In order
+  to test properly, the node with the MariaDB server POD (mnode2) should not be
+  shut down.
+
+.. note::
+  In this test env, each node has Ceph and OpenStack related PODs. Due to this,
+  shutting down a node will cause issues with Ceph as well as with OpenStack
+  services. These POD-level failures are captured in the subsequent output.
+
+``Check node status:``
+
+.. code-block:: console
+
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl get nodes
+  NAME      STATUS     ROLES     AGE       VERSION
+  mnode1    Ready                1h        v1.10.6
+  mnode2    Ready                1h        v1.10.6
+  mnode3    NotReady             1h        v1.10.6
+  mnode4    Ready                1h        v1.10.6
+  mnode5    Ready                1h        v1.10.6
+  mnode6    Ready                1h        v1.10.6
+
+``Ceph status:``
+
+.. code-block:: console
+
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph -s
+    cluster:
+      id:     54d9af7e-da6d-4980-9075-96bb145db65c
+      health: HEALTH_WARN
+              insufficient standby MDS daemons available
+              1 osds down
+              1 host (1 osds) down
+              Degraded data redundancy: 354/1062 objects degraded (33.333%), 46 pgs degraded, 101 pgs undersized
+              1/3 mons down, quorum mnode1,mnode2
+
+    services:
+      mon: 3 daemons, quorum mnode1,mnode2, out of quorum: mnode3
+      mgr: mnode2(active)
+      mds: cephfs-1/1/1 up {0=mds-ceph-mds-6f66956547-5x4ng=up:active}
+      osd: 3 osds: 2 up, 3 in
+      rgw: 1 daemon active
+
+    data:
+      pools:   19 pools, 101 pgs
+      objects: 354 objects, 260 MB
+      usage:   77845 MB used, 70068 MB / 144 GB avail
+      pgs:     354/1062 objects degraded (33.333%)
+               55 active+undersized
+               46 active+undersized+degraded
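+
+Before looking at quorum, the individual warnings can optionally be expanded
+with the standard Ceph CLI. The commands below are an illustrative, optional
+check and their output is omitted here.
+
+.. code-block:: console
+
+  # Optional: show the detailed health checks behind HEALTH_WARN.
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph health detail
+
+  # Optional: confirm which OSD and host are reported down.
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph osd tree
+
+``Ceph quorum status:``
+
+.. code-block:: console
+
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph quorum_status -f json-pretty
+
+.. 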
code-block:: json + + { + "election_epoch": 96, + "quorum": [ + 0, + 1 + ], + "quorum_names": [ + "mnode1", + "mnode2" + ], + "quorum_leader_name": "mnode1", + "monmap": { + "epoch": 1, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 21:02:24.330403", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode3", + "addr": "192.168.10.248:6789/0", + "public_addr": "192.168.10.248:6789/0" + } + ] + } + } + + +``Ceph MON Status:`` + +.. code-block:: console + + ubuntu@mnode1:/opt/openstack-helm$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph mon_status -f json-pretty + +.. code-block:: json + + { + "name": "mnode1", + "rank": 0, + "state": "leader", + "election_epoch": 96, + "quorum": [ + 0, + 1 + ], + "features": { + "required_con": "153140804152475648", + "required_mon": [ + "kraken", + "luminous" + ], + "quorum_con": "2305244844532236283", + "quorum_mon": [ + "kraken", + "luminous" + ] + }, + "outside_quorum": [], + "extra_probe_peers": [], + "sync_provider": [], + "monmap": { + "epoch": 1, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 21:02:24.330403", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode3", + "addr": "192.168.10.248:6789/0", + "public_addr": "192.168.10.248:6789/0" + } + ] + }, + "feature_map": { + "mon": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "mds": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "osd": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "client": { + "group": { + "features": "0x7010fb86aa42ada", + "release": "jewel", + "num": 1 + }, + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 5 + } + } + } + } + +``Ceph PODs:`` + +.. 
code-block:: console + + ubuntu@mnode1:/opt/openstack-helm$ kubectl get pods -n ceph --show-all=false -o wide + NAME READY STATUS RESTARTS AGE IP NODE + ceph-cephfs-provisioner-784c6f9d59-n92dx 1/1 Running 0 1m 192.168.0.206 mnode1 + ceph-cephfs-provisioner-784c6f9d59-ndsgn 1/1 Running 0 1h 192.168.4.15 mnode2 + ceph-cephfs-provisioner-784c6f9d59-vgzzx 1/1 Unknown 0 1h 192.168.3.17 mnode3 + ceph-mds-6f66956547-57tf9 1/1 Running 0 1m 192.168.0.207 mnode1 + ceph-mds-6f66956547-5x4ng 1/1 Running 0 1h 192.168.4.14 mnode2 + ceph-mds-6f66956547-c25cx 1/1 Unknown 0 1h 192.168.3.14 mnode3 + ceph-mgr-5746dd89db-9dbmv 1/1 Unknown 0 1h 192.168.10.248 mnode3 + ceph-mgr-5746dd89db-d5fcw 1/1 Running 0 1m 192.168.10.246 mnode1 + ceph-mgr-5746dd89db-qq4nl 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-5qn68 1/1 NodeLost 0 1h 192.168.10.248 mnode3 + ceph-mon-check-d85994946-4g5xc 1/1 Running 0 1h 192.168.4.8 mnode2 + ceph-mon-mwkj9 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-ql9zp 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-c7gdd 1/1 NodeLost 0 1h 192.168.10.248 mnode3 + ceph-osd-default-83945928-s6gs6 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-vsc5b 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-rbd-provisioner-5bfb577ffd-j6hlx 1/1 Running 0 1h 192.168.4.16 mnode2 + ceph-rbd-provisioner-5bfb577ffd-kdmrv 1/1 Running 0 1m 192.168.0.209 mnode1 + ceph-rbd-provisioner-5bfb577ffd-zdx2d 1/1 Unknown 0 1h 192.168.3.16 mnode3 + ceph-rgw-6c64b444d7-4qgkw 1/1 Running 0 1m 192.168.0.210 mnode1 + ceph-rgw-6c64b444d7-7bgqs 1/1 Unknown 0 1h 192.168.3.12 mnode3 + ceph-rgw-6c64b444d7-hv6vn 1/1 Running 0 1h 192.168.4.13 mnode2 + ingress-796d8cf8d6-4txkq 1/1 Running 0 1h 192.168.2.6 mnode5 + ingress-796d8cf8d6-9t7m8 1/1 Running 0 1h 192.168.5.4 mnode4 + ingress-error-pages-54454dc79b-hhb4f 1/1 Running 0 1h 192.168.2.5 mnode5 + ingress-error-pages-54454dc79b-twpgc 1/1 Running 0 1h 192.168.4.4 mnode2 + +``OpenStack PODs:`` + +.. 
code-block:: console
+
+  ubuntu@mnode1:/opt/openstack-helm$ kubectl get pods -n openstack --show-all=false -o wide
+  NAME                                           READY   STATUS    RESTARTS   AGE   IP              NODE
+  cinder-api-66f4f9678-2lgwk                     1/1     Unknown   0          22m   192.168.3.41    mnode3
+  cinder-api-66f4f9678-flvr5                     1/1     Running   0          22m   192.168.0.202   mnode1
+  cinder-api-66f4f9678-w5xhd                     1/1     Running   0          1m    192.168.4.45    mnode2
+  cinder-backup-659b68b474-582kr                 1/1     Running   0          22m   192.168.4.39    mnode2
+  cinder-scheduler-6778f6f88c-mm9mt              1/1     Running   0          22m   192.168.0.201   mnode1
+  cinder-volume-79b9bd8bb9-qsxdk                 1/1     Running   0          22m   192.168.4.40    mnode2
+  cinder-volume-usage-audit-1534286100-mm8r7     1/1     Running   0          4m    192.168.4.44    mnode2
+  glance-api-676fd49d4d-4tnm6                    1/1     Running   0          1m    192.168.0.212   mnode1
+  glance-api-676fd49d4d-j4bdb                    1/1     Unknown   0          26m   192.168.3.37    mnode3
+  glance-api-676fd49d4d-wtxqt                    1/1     Running   0          26m   192.168.4.31    mnode2
+  glance-registry-6f45f5bcf7-7s8dn               1/1     Running   0          1m    192.168.4.46    mnode2
+  glance-registry-6f45f5bcf7-lhnrs               1/1     Unknown   0          26m   192.168.3.34    mnode3
+  glance-registry-6f45f5bcf7-pbsnl               1/1     Running   0          26m   192.168.0.196   mnode1
+  ingress-7b4bc84cdd-9fs78                       1/1     Running   0          1h    192.168.5.3     mnode4
+  ingress-7b4bc84cdd-wztz7                       1/1     Running   0          1h    192.168.1.4     mnode6
+  ingress-error-pages-586c7f86d6-2jl5q           1/1     Running   0          1h    192.168.2.4     mnode5
+  ingress-error-pages-586c7f86d6-455j5           1/1     Unknown   0          1h    192.168.3.3     mnode3
+  ingress-error-pages-586c7f86d6-55j4x           1/1     Running   0          1m    192.168.4.47    mnode2
+  keystone-api-5bcc7cb698-dzm8q                  1/1     Running   0          35m   192.168.4.24    mnode2
+  keystone-api-5bcc7cb698-vvwwr                  1/1     Unknown   0          35m   192.168.3.25    mnode3
+  keystone-api-5bcc7cb698-wx5l6                  1/1     Running   0          1m    192.168.0.213   mnode1
+  mariadb-ingress-84894687fd-9lmpx               1/1     Running   0          1m    192.168.4.48    mnode2
+  mariadb-ingress-84894687fd-dfnkm               1/1     Unknown   2          1h    192.168.3.20    mnode3
+  mariadb-ingress-error-pages-78fb865f84-p8lpg   1/1     Running   0          1h    192.168.4.17    mnode2
+  mariadb-server-0                               1/1     Running   0          1h    192.168.4.18    mnode2
+  memcached-memcached-5db74ddfd5-926ln           1/1     Running   0          1m    192.168.4.49    mnode2
+  memcached-memcached-5db74ddfd5-wfr9q           1/1     Unknown   0          38m   192.168.3.23    mnode3
+  rabbitmq-rabbitmq-0                            1/1     Unknown   0          1h    192.168.3.21    mnode3
+  rabbitmq-rabbitmq-1                            1/1     Running   0          1h    192.168.4.19    mnode2
+  rabbitmq-rabbitmq-2                            1/1     Running   0          1h    192.168.0.195   mnode1
+
+
+``Result/Observation:``
+
+- PODs that were scheduled on the mnode3 node have a status of NodeLost/Unknown.
+- Ceph status shows HEALTH_WARN, as expected.
+- Ceph status shows 1 Ceph MON and 1 Ceph OSD missing.
+- OpenStack PODs that were scheduled on mnode3 also show NodeLost/Unknown.
+
+Step 3: Node Expansion
+======================
+
+Let's add more resources for k8s to schedule PODs on.
+
+In this test env, let's use ``mnode4`` and apply the Ceph and OpenStack
+related labels (a sketch of the label commands is shown after the note below).
+
+.. note::
+  Since the node that was shut down earlier had both Ceph and OpenStack PODs,
+  mnode4 should get the Ceph and OpenStack related labels as well.
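+
+The following is a minimal sketch of the label application. The label names
+are assumptions taken from the OSH multinode guide defaults (``ceph-mon``,
+``ceph-mgr``, ``ceph-osd``, ``ceph-mds``, ``ceph-rgw`` and
+``openstack-control-plane``, all set to ``enabled``); use whichever label keys
+your chart overrides actually select on, and mirror any additional labels
+carried by mnode1 through mnode3.
+
+.. code-block:: console
+
+  # Assumed OSH-style labels; adjust to match the overrides used in this env.
+  ubuntu@mnode1:~$ kubectl label node mnode4 ceph-mon=enabled ceph-mgr=enabled \
+      ceph-osd=enabled ceph-mds=enabled ceph-rgw=enabled
+  ubuntu@mnode1:~$ kubectl label node mnode4 openstack-control-plane=enabled
+
+After applying the labels, let's check the status.
+
+``Ceph status:``
+
+.. 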
code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph -s + cluster: + id: 54d9af7e-da6d-4980-9075-96bb145db65c + health: HEALTH_WARN + 1/4 mons down, quorum mnode1,mnode2,mnode4 + + services: + mon: 4 daemons, quorum mnode1,mnode2,mnode4, out of quorum: mnode3 + mgr: mnode2(active), standbys: mnode1 + mds: cephfs-1/1/1 up {0=mds-ceph-mds-6f66956547-5x4ng=up:active}, 1 up:standby + osd: 4 osds: 3 up, 3 in + rgw: 2 daemons active + + data: + pools: 19 pools, 101 pgs + objects: 354 objects, 260 MB + usage: 74684 MB used, 73229 MB / 144 GB avail + pgs: 101 active+clean + + +``Ceph MON Status`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph mon_status -f json-pretty + +.. code-block:: json + + { + "name": "mnode2", + "rank": 1, + "state": "peon", + "election_epoch": 100, + "quorum": [ + 0, + 1, + 3 + ], + "features": { + "required_con": "153140804152475648", + "required_mon": [ + "kraken", + "luminous" + ], + "quorum_con": "2305244844532236283", + "quorum_mon": [ + "kraken", + "luminous" + ] + }, + "outside_quorum": [], + "extra_probe_peers": [ + "192.168.10.249:6789/0" + ], + "sync_provider": [], + "monmap": { + "epoch": 2, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 22:43:31.517568", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode3", + "addr": "192.168.10.248:6789/0", + "public_addr": "192.168.10.248:6789/0" + }, + { + "rank": 3, + "name": "mnode4", + "addr": "192.168.10.249:6789/0", + "public_addr": "192.168.10.249:6789/0" + } + ] + }, + "feature_map": { + "mon": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "mds": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "osd": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 2 + } + }, + "client": { + "group": { + "features": "0x7010fb86aa42ada", + "release": "jewel", + "num": 1 + }, + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + } + } + } + + +``Ceph quorum status:`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph quorum_status -f json-pretty + +.. 
code-block:: json + + { + "election_epoch": 100, + "quorum": [ + 0, + 1, + 3 + ], + "quorum_names": [ + "mnode1", + "mnode2", + "mnode4" + ], + "quorum_leader_name": "mnode1", + "monmap": { + "epoch": 2, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 22:43:31.517568", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode3", + "addr": "192.168.10.248:6789/0", + "public_addr": "192.168.10.248:6789/0" + }, + { + "rank": 3, + "name": "mnode4", + "addr": "192.168.10.249:6789/0", + "public_addr": "192.168.10.249:6789/0" + } + ] + } + } + + +``Ceph PODs:`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl get pods -n ceph --show-all=false -o wide + Flag --show-all has been deprecated, will be removed in an upcoming release + NAME READY STATUS RESTARTS AGE IP NODE + ceph-cephfs-provisioner-784c6f9d59-n92dx 1/1 Running 0 10m 192.168.0.206 mnode1 + ceph-cephfs-provisioner-784c6f9d59-ndsgn 1/1 Running 0 1h 192.168.4.15 mnode2 + ceph-cephfs-provisioner-784c6f9d59-vgzzx 1/1 Unknown 0 1h 192.168.3.17 mnode3 + ceph-mds-6f66956547-57tf9 1/1 Running 0 10m 192.168.0.207 mnode1 + ceph-mds-6f66956547-5x4ng 1/1 Running 0 1h 192.168.4.14 mnode2 + ceph-mds-6f66956547-c25cx 1/1 Unknown 0 1h 192.168.3.14 mnode3 + ceph-mgr-5746dd89db-9dbmv 1/1 Unknown 0 1h 192.168.10.248 mnode3 + ceph-mgr-5746dd89db-d5fcw 1/1 Running 0 10m 192.168.10.246 mnode1 + ceph-mgr-5746dd89db-qq4nl 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-5krkd 1/1 Running 0 4m 192.168.10.249 mnode4 + ceph-mon-5qn68 1/1 NodeLost 0 1h 192.168.10.248 mnode3 + ceph-mon-check-d85994946-4g5xc 1/1 Running 0 1h 192.168.4.8 mnode2 + ceph-mon-mwkj9 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-ql9zp 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-c7gdd 1/1 NodeLost 0 1h 192.168.10.248 mnode3 + ceph-osd-default-83945928-kf5tj 1/1 Running 0 4m 192.168.10.249 mnode4 + ceph-osd-default-83945928-s6gs6 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-vsc5b 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-rbd-provisioner-5bfb577ffd-j6hlx 1/1 Running 0 1h 192.168.4.16 mnode2 + ceph-rbd-provisioner-5bfb577ffd-kdmrv 1/1 Running 0 10m 192.168.0.209 mnode1 + ceph-rbd-provisioner-5bfb577ffd-zdx2d 1/1 Unknown 0 1h 192.168.3.16 mnode3 + ceph-rgw-6c64b444d7-4qgkw 1/1 Running 0 10m 192.168.0.210 mnode1 + ceph-rgw-6c64b444d7-7bgqs 1/1 Unknown 0 1h 192.168.3.12 mnode3 + ceph-rgw-6c64b444d7-hv6vn 1/1 Running 0 1h 192.168.4.13 mnode2 + ingress-796d8cf8d6-4txkq 1/1 Running 0 1h 192.168.2.6 mnode5 + ingress-796d8cf8d6-9t7m8 1/1 Running 0 1h 192.168.5.4 mnode4 + ingress-error-pages-54454dc79b-hhb4f 1/1 Running 0 1h 192.168.2.5 mnode5 + ingress-error-pages-54454dc79b-twpgc 1/1 Running 0 1h 192.168.4.4 mnode2 + +``OpenStack PODs:`` + +.. 
code-block:: console
+
+  ubuntu@mnode1:~$ kubectl get pods -n openstack --show-all=false -o wide
+  Flag --show-all has been deprecated, will be removed in an upcoming release
+  NAME                                           READY   STATUS    RESTARTS   AGE   IP              NODE
+  cinder-api-66f4f9678-2lgwk                     1/1     Unknown   0          32m   192.168.3.41    mnode3
+  cinder-api-66f4f9678-flvr5                     1/1     Running   0          32m   192.168.0.202   mnode1
+  cinder-api-66f4f9678-w5xhd                     1/1     Running   0          11m   192.168.4.45    mnode2
+  cinder-backup-659b68b474-582kr                 1/1     Running   0          32m   192.168.4.39    mnode2
+  cinder-scheduler-6778f6f88c-mm9mt              1/1     Running   0          32m   192.168.0.201   mnode1
+  cinder-volume-79b9bd8bb9-qsxdk                 1/1     Running   0          32m   192.168.4.40    mnode2
+  glance-api-676fd49d4d-4tnm6                    1/1     Running   0          11m   192.168.0.212   mnode1
+  glance-api-676fd49d4d-j4bdb                    1/1     Unknown   0          36m   192.168.3.37    mnode3
+  glance-api-676fd49d4d-wtxqt                    1/1     Running   0          36m   192.168.4.31    mnode2
+  glance-registry-6f45f5bcf7-7s8dn               1/1     Running   0          11m   192.168.4.46    mnode2
+  glance-registry-6f45f5bcf7-lhnrs               1/1     Unknown   0          36m   192.168.3.34    mnode3
+  glance-registry-6f45f5bcf7-pbsnl               1/1     Running   0          36m   192.168.0.196   mnode1
+  ingress-7b4bc84cdd-9fs78                       1/1     Running   0          1h    192.168.5.3     mnode4
+  ingress-7b4bc84cdd-wztz7                       1/1     Running   0          1h    192.168.1.4     mnode6
+  ingress-error-pages-586c7f86d6-2jl5q           1/1     Running   0          1h    192.168.2.4     mnode5
+  ingress-error-pages-586c7f86d6-455j5           1/1     Unknown   0          1h    192.168.3.3     mnode3
+  ingress-error-pages-586c7f86d6-55j4x           1/1     Running   0          11m   192.168.4.47    mnode2
+  keystone-api-5bcc7cb698-dzm8q                  1/1     Running   0          45m   192.168.4.24    mnode2
+  keystone-api-5bcc7cb698-vvwwr                  1/1     Unknown   0          45m   192.168.3.25    mnode3
+  keystone-api-5bcc7cb698-wx5l6                  1/1     Running   0          11m   192.168.0.213   mnode1
+  mariadb-ingress-84894687fd-9lmpx               1/1     Running   0          11m   192.168.4.48    mnode2
+  mariadb-ingress-84894687fd-dfnkm               1/1     Unknown   2          1h    192.168.3.20    mnode3
+  mariadb-ingress-error-pages-78fb865f84-p8lpg   1/1     Running   0          1h    192.168.4.17    mnode2
+  mariadb-server-0                               1/1     Running   0          1h    192.168.4.18    mnode2
+  memcached-memcached-5db74ddfd5-926ln           1/1     Running   0          11m   192.168.4.49    mnode2
+  memcached-memcached-5db74ddfd5-wfr9q           1/1     Unknown   0          48m   192.168.3.23    mnode3
+  rabbitmq-rabbitmq-0                            1/1     Unknown   0          1h    192.168.3.21    mnode3
+  rabbitmq-rabbitmq-1                            1/1     Running   0          1h    192.168.4.19    mnode2
+  rabbitmq-rabbitmq-2                            1/1     Running   0          1h    192.168.0.195   mnode1
+
+
+``Result/Observation:``
+
+- Ceph MON and OSD PODs got scheduled on the mnode4 node.
+- Ceph status shows that the MON and OSD counts have increased.
+- Ceph status still shows HEALTH_WARN, as one MON and one OSD are still down.
+
+Step 4: Ceph cluster recovery
+=============================
+
+Now that we have added a new node for the Ceph and OpenStack PODs, let's
+perform maintenance on the Ceph cluster.
+
+1) Remove out of quorum MON:
+----------------------------
+
+Using the ``ceph mon_status`` and ``ceph -s`` commands, confirm the ID of the
+MON that is out of quorum.
+
+In this test env, ``mnode3`` is out of quorum.
+
+.. note::
+  In this test env, since the out-of-quorum MON is no longer available due to
+  the node failure, we can proceed with removing it from the Ceph cluster.
+
+``Remove MON from Ceph cluster``
+
+.. code-block:: console
+
+  ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph mon remove mnode3
+  removing mon.mnode3 at 192.168.10.248:6789/0, there will be 3 monitors
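+
+Optionally, the monmap can be dumped right away to confirm the removal took
+effect; ``ceph mon dump`` is a standard Ceph command and is shown here without
+its output as an illustrative check only.
+
+.. code-block:: console
+
+  # Optional: the monmap should now list only mnode1, mnode2 and mnode4.
+  ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph mon dump
+
+``Ceph Status:``
+
+.. 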
code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph -s + cluster: + id: 54d9af7e-da6d-4980-9075-96bb145db65c + health: HEALTH_OK + + services: + mon: 3 daemons, quorum mnode1,mnode2,mnode4 + mgr: mnode2(active), standbys: mnode1 + mds: cephfs-1/1/1 up {0=mds-ceph-mds-6f66956547-5x4ng=up:active}, 1 up:standby + osd: 4 osds: 3 up, 3 in + rgw: 2 daemons active + + data: + pools: 19 pools, 101 pgs + objects: 354 objects, 260 MB + usage: 74705 MB used, 73208 MB / 144 GB avail + pgs: 101 active+clean + + io: + client: 132 kB/s wr, 0 op/s rd, 23 op/s wr + +As shown above, Ceph status is now HEALTH_OK and and shows 3 MONs available. + +``Ceph MON Status`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph mon_status -f json-pretty + +.. code-block:: json + + { + "name": "mnode4", + "rank": 2, + "state": "peon", + "election_epoch": 106, + "quorum": [ + 0, + 1, + 2 + ], + "features": { + "required_con": "153140804152475648", + "required_mon": [ + "kraken", + "luminous" + ], + "quorum_con": "2305244844532236283", + "quorum_mon": [ + "kraken", + "luminous" + ] + }, + "outside_quorum": [], + "extra_probe_peers": [], + "sync_provider": [], + "monmap": { + "epoch": 3, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 22:55:41.256612", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode4", + "addr": "192.168.10.249:6789/0", + "public_addr": "192.168.10.249:6789/0" + } + ] + }, + "feature_map": { + "mon": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + }, + "client": { + "group": { + "features": "0x1ffddff8eea4fffb", + "release": "luminous", + "num": 1 + } + } + } + } + +``Ceph quorum status`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph quorum_status -f json-pretty + +.. code-block:: json + + + { + "election_epoch": 106, + "quorum": [ + 0, + 1, + 2 + ], + "quorum_names": [ + "mnode1", + "mnode2", + "mnode4" + ], + "quorum_leader_name": "mnode1", + "monmap": { + "epoch": 3, + "fsid": "54d9af7e-da6d-4980-9075-96bb145db65c", + "modified": "2018-08-14 22:55:41.256612", + "created": "2018-08-14 21:02:24.330403", + "features": { + "persistent": [ + "kraken", + "luminous" + ], + "optional": [] + }, + "mons": [ + { + "rank": 0, + "name": "mnode1", + "addr": "192.168.10.246:6789/0", + "public_addr": "192.168.10.246:6789/0" + }, + { + "rank": 1, + "name": "mnode2", + "addr": "192.168.10.247:6789/0", + "public_addr": "192.168.10.247:6789/0" + }, + { + "rank": 2, + "name": "mnode4", + "addr": "192.168.10.249:6789/0", + "public_addr": "192.168.10.249:6789/0" + } + ] + } + } + + +2) Remove down OSD from Ceph cluster: +------------------------------------- + +As shown in Ceph status above, ``osd: 4 osds: 3 up, 3 in`` 1 of 4 OSDs is still +down. Let's remove that OSD. + +First run ``ceph osd tree`` command to get list of OSDs. + +.. 
code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph osd tree + ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF + -1 0.19995 root default + -7 0.04999 host mnode1 + 2 hdd 0.04999 osd.2 up 1.00000 1.00000 + -2 0.04999 host mnode2 + 0 hdd 0.04999 osd.0 up 1.00000 1.00000 + -3 0.04999 host mnode3 + 1 hdd 0.04999 osd.1 down 0 1.00000 + -9 0.04999 host mnode4 + 3 hdd 0.04999 osd.3 up 1.00000 1.00000 + +Above output shows that ``osd.1`` is down. + +Run ``ceph osd purge`` command to remove OSD from ceph cluster. + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph osd purge osd.1 --yes-i-really-mean-it + purged osd.1 + +``Ceph status`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl exec -n ceph ceph-mon-ql9zp -- ceph -s + cluster: + id: 54d9af7e-da6d-4980-9075-96bb145db65c + health: HEALTH_OK + + services: + mon: 3 daemons, quorum mnode1,mnode2,mnode4 + mgr: mnode2(active), standbys: mnode1 + mds: cephfs-1/1/1 up {0=mds-ceph-mds-6f66956547-5x4ng=up:active}, 1 up:standby + osd: 3 osds: 3 up, 3 in + rgw: 2 daemons active + + data: + pools: 19 pools, 101 pgs + objects: 354 objects, 260 MB + usage: 74681 MB used, 73232 MB / 144 GB avail + pgs: 101 active+clean + + io: + client: 57936 B/s wr, 0 op/s rd, 14 op/s wr + +Above output shows Ceph cluster in HEALTH_OK with all OSDs and MONs up and running. + +``Ceph PODs`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl get pods -n ceph --show-all=false -o wide + Flag --show-all has been deprecated, will be removed in an upcoming release + NAME READY STATUS RESTARTS AGE IP NODE + ceph-cephfs-provisioner-784c6f9d59-n92dx 1/1 Running 0 25m 192.168.0.206 mnode1 + ceph-cephfs-provisioner-784c6f9d59-ndsgn 1/1 Running 0 1h 192.168.4.15 mnode2 + ceph-cephfs-provisioner-784c6f9d59-vgzzx 1/1 Unknown 0 1h 192.168.3.17 mnode3 + ceph-mds-6f66956547-57tf9 1/1 Running 0 25m 192.168.0.207 mnode1 + ceph-mds-6f66956547-5x4ng 1/1 Running 0 1h 192.168.4.14 mnode2 + ceph-mds-6f66956547-c25cx 1/1 Unknown 0 1h 192.168.3.14 mnode3 + ceph-mgr-5746dd89db-9dbmv 1/1 Unknown 0 1h 192.168.10.248 mnode3 + ceph-mgr-5746dd89db-d5fcw 1/1 Running 0 25m 192.168.10.246 mnode1 + ceph-mgr-5746dd89db-qq4nl 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-mon-5krkd 1/1 Running 0 19m 192.168.10.249 mnode4 + ceph-mon-5qn68 1/1 NodeLost 0 2h 192.168.10.248 mnode3 + ceph-mon-check-d85994946-4g5xc 1/1 Running 0 2h 192.168.4.8 mnode2 + ceph-mon-mwkj9 1/1 Running 0 2h 192.168.10.247 mnode2 + ceph-mon-ql9zp 1/1 Running 0 2h 192.168.10.246 mnode1 + ceph-osd-default-83945928-c7gdd 1/1 NodeLost 0 1h 192.168.10.248 mnode3 + ceph-osd-default-83945928-kf5tj 1/1 Running 0 19m 192.168.10.249 mnode4 + ceph-osd-default-83945928-s6gs6 1/1 Running 0 1h 192.168.10.246 mnode1 + ceph-osd-default-83945928-vsc5b 1/1 Running 0 1h 192.168.10.247 mnode2 + ceph-rbd-provisioner-5bfb577ffd-j6hlx 1/1 Running 0 1h 192.168.4.16 mnode2 + ceph-rbd-provisioner-5bfb577ffd-kdmrv 1/1 Running 0 25m 192.168.0.209 mnode1 + ceph-rbd-provisioner-5bfb577ffd-zdx2d 1/1 Unknown 0 1h 192.168.3.16 mnode3 + ceph-rgw-6c64b444d7-4qgkw 1/1 Running 0 25m 192.168.0.210 mnode1 + ceph-rgw-6c64b444d7-7bgqs 1/1 Unknown 0 1h 192.168.3.12 mnode3 + ceph-rgw-6c64b444d7-hv6vn 1/1 Running 0 1h 192.168.4.13 mnode2 + ingress-796d8cf8d6-4txkq 1/1 Running 0 2h 192.168.2.6 mnode5 + ingress-796d8cf8d6-9t7m8 1/1 Running 0 2h 192.168.5.4 mnode4 + ingress-error-pages-54454dc79b-hhb4f 1/1 Running 0 2h 192.168.2.5 mnode5 + ingress-error-pages-54454dc79b-twpgc 1/1 Running 0 2h 192.168.4.4 
mnode2 + +``OpenStack PODs`` + +.. code-block:: console + + ubuntu@mnode1:~$ kubectl get pods -n openstack --show-all=false -o wide + Flag --show-all has been deprecated, will be removed in an upcoming release + NAME READY STATUS RESTARTS AGE IP NODE + cinder-api-66f4f9678-2lgwk 1/1 Unknown 0 47m 192.168.3.41 mnode3 + cinder-api-66f4f9678-flvr5 1/1 Running 0 47m 192.168.0.202 mnode1 + cinder-api-66f4f9678-w5xhd 1/1 Running 0 26m 192.168.4.45 mnode2 + cinder-backup-659b68b474-582kr 1/1 Running 0 47m 192.168.4.39 mnode2 + cinder-scheduler-6778f6f88c-mm9mt 1/1 Running 0 47m 192.168.0.201 mnode1 + cinder-volume-79b9bd8bb9-qsxdk 1/1 Running 0 47m 192.168.4.40 mnode2 + glance-api-676fd49d4d-4tnm6 1/1 Running 0 26m 192.168.0.212 mnode1 + glance-api-676fd49d4d-j4bdb 1/1 Unknown 0 51m 192.168.3.37 mnode3 + glance-api-676fd49d4d-wtxqt 1/1 Running 0 51m 192.168.4.31 mnode2 + glance-registry-6f45f5bcf7-7s8dn 1/1 Running 0 26m 192.168.4.46 mnode2 + glance-registry-6f45f5bcf7-lhnrs 1/1 Unknown 0 51m 192.168.3.34 mnode3 + glance-registry-6f45f5bcf7-pbsnl 1/1 Running 0 51m 192.168.0.196 mnode1 + ingress-7b4bc84cdd-9fs78 1/1 Running 0 2h 192.168.5.3 mnode4 + ingress-7b4bc84cdd-wztz7 1/1 Running 0 2h 192.168.1.4 mnode6 + ingress-error-pages-586c7f86d6-2jl5q 1/1 Running 0 2h 192.168.2.4 mnode5 + ingress-error-pages-586c7f86d6-455j5 1/1 Unknown 0 2h 192.168.3.3 mnode3 + ingress-error-pages-586c7f86d6-55j4x 1/1 Running 0 26m 192.168.4.47 mnode2 + keystone-api-5bcc7cb698-dzm8q 1/1 Running 0 1h 192.168.4.24 mnode2 + keystone-api-5bcc7cb698-vvwwr 1/1 Unknown 0 1h 192.168.3.25 mnode3 + keystone-api-5bcc7cb698-wx5l6 1/1 Running 0 26m 192.168.0.213 mnode1 + mariadb-ingress-84894687fd-9lmpx 1/1 Running 0 26m 192.168.4.48 mnode2 + mariadb-ingress-84894687fd-dfnkm 1/1 Unknown 2 1h 192.168.3.20 mnode3 + mariadb-ingress-error-pages-78fb865f84-p8lpg 1/1 Running 0 1h 192.168.4.17 mnode2 + mariadb-server-0 1/1 Running 0 1h 192.168.4.18 mnode2 + memcached-memcached-5db74ddfd5-926ln 1/1 Running 0 26m 192.168.4.49 mnode2 + memcached-memcached-5db74ddfd5-wfr9q 1/1 Unknown 0 1h 192.168.3.23 mnode3 + rabbitmq-rabbitmq-0 1/1 Unknown 0 1h 192.168.3.21 mnode3 + rabbitmq-rabbitmq-1 1/1 Running 0 1h 192.168.4.19 mnode2 + rabbitmq-rabbitmq-2 1/1 Running 0 1h 192.168.0.195 mnode1 diff --git a/doc/source/testing/index.rst b/doc/source/testing/index.rst index 79e3f575ac..9540e312d1 100644 --- a/doc/source/testing/index.rst +++ b/doc/source/testing/index.rst @@ -8,3 +8,4 @@ Testing helm-tests ceph-resiliency/index ceph-upgrade + ceph-node-resiliency