[doc] Update dist upgrade guide for 2023.1 / Ubuntu Jammy

Change-Id: Ic708385ce2a75e42e9a70fa53dbd6030ba318a7c
Andrew Bonney 2024-02-12 08:39:46 +00:00
parent 4abddeaf30
commit ded73432b8
2 changed files with 86 additions and 11 deletions

doc/source/admin/upgrades/distribution-upgrades.rst

@@ -7,8 +7,9 @@ release to the next.
.. note::
-  This guide was written when upgrading from Ubuntu Bionic to Focal during the
-  Victoria release cycle.
+  This guide was last updated when upgrading from Ubuntu Focal to Jammy during
+  the Antelope (2023.1) release. For earlier releases please see other
+  versions of the guide.
Introduction
============
@@ -36,6 +37,10 @@ upgrade any API hosts/containers. The last 'repo' host to be upgraded should be
the 'primary', and should not be carried out until after the final service
which does not support '--limit' is upgraded.
If you have a multi-architecture deployment, then at least one 'repo' host of
each architecture will need to be upgraded before upgrading any other hosts
which use that architecture.
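
A quick way to see which architecture each 'repo' host reports before planning
this order is an ad-hoc Ansible run from the usual playbooks directory (a
minimal sketch, assuming the standard 'repo_all' inventory group):

.. code:: console

   # Show the machine architecture (e.g. x86_64, aarch64) of every repo host
   ansible repo_all -m command -a "uname -m"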
If this order is adapted, it will be necessary to restore some files to the
'repo' host from a backup part-way through the process. This will be necessary
if no 'repo' hosts remain which run the older operating system version, which
@@ -94,6 +99,15 @@ Pre-Requisites
and can visit https://admin:password@external_lb_vip_address:1936/ and read
'Statistics Report for pid # on infrastructure_host'
* Ensure RabbitMQ is running with all feature flags enabled to avoid conflicts
when re-installing nodes. If any are listed as disabled, then enable them via
the console on one of the nodes:
.. code:: console
rabbitmqctl list_feature_flags
rabbitmqctl enable_feature_flag all
Warnings
========
@@ -134,7 +148,7 @@ Deploying Infrastructure Hosts
.. code:: console
-  openstack-ansible set-haproxy-backends-state.yml -e hostname=<infrahost> -e backend_state=disabled
+  openstack-ansible set-haproxy-backends-state.yml -e hostname=reinstalled_host -e backend_state=disabled
Or if you've enabled haproxy_stats as described above, you can visit
https://admin:password@external_lb_vip_address:1936/ and select them and
@@ -164,6 +178,19 @@ Deploying Infrastructure Hosts
rabbitmqctl cluster_status
rabbitmqctl forget_cluster_node rabbit@removed_host_rabbitmq_container
#. If GlusterFS was running on this host (repo nodes)
We remove it from the Gluster cluster by running these commands on another
repo host. Note that we have to tell Gluster we are intentionally reducing
the number of replicas. 'N' should be set to the number of repo servers
minus 1.
Existing gluster peer names can be found using the 'gluster peer status'
command.
.. code:: console
gluster volume remove-brick gfs-repo replica N removed_host_gluster_peer:/gluster/bricks/1 force
gluster peer detach removed_host_gluster_peer
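
Afterwards, the remaining repo hosts can be checked to confirm the brick and
peer are gone (a quick sanity check, assuming the volume is still named
'gfs-repo' as above):

.. code:: console

   gluster volume info gfs-repo
   gluster peer status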
#. Do generic preparation of reinstalled host
.. code:: console
@@ -198,7 +225,7 @@ Deploying Infrastructure Hosts
.. code:: console
-  openstack-ansible set-haproxy-backends-state.yml -e hostname=<infrahost> -e backend_state=disabled --limit reinstalled_host
+  openstack-ansible set-haproxy-backends-state.yml -e hostname=reinstalled_host -e backend_state=disabled --limit reinstalled_host
#. If it is NOT a 'primary', install everything on the new host
@@ -271,7 +298,7 @@ Deploying Infrastructure Hosts
.. code:: console
-  openstack-ansible set-haproxy-backends-state.yml -e hostname=<infrahost> -e backend_state=enabled
+  openstack-ansible set-haproxy-backends-state.yml -e hostname=reinstalled_host -e backend_state=enabled
Deploying Compute & Network Hosts
@@ -300,12 +327,16 @@ Deploying Compute & Network Hosts
(* because we need to include containers in the limit)
-  .. note::
-  During this upgrade cycle it was noted that network nodes required a restart
-  to bring some tenant interfaces online after running setup-openstack.
-  Additionally, BGP speakers (used for IPv6) had to be re-initialised from the
-  command line. These steps were necessary before reinstalling further network
-  nodes to prevent HA Router interruptions.

+  #. Re-instate compute node hypervisor UUIDs

+     Compute nodes should have their UUID stored in the file
+     '/var/lib/nova/compute_id' and the 'nova-compute' service restarted. UUIDs
+     can be found from the command line using 'openstack hypervisor list'.

+     Alternatively, the following Ansible can be used to automate these actions:

+     .. code:: console

+        openstack-ansible ../scripts/upgrade-utilities/nova-restore-compute-id.yml --limit reinstalled_host
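
For reference, the manual route described above might look roughly like the
following (a sketch only; the UUID shown is purely illustrative and must be
replaced with the value reported for the reinstalled hypervisor):

.. code:: console

   # On a utility host/container: find the UUID of the reinstalled hypervisor
   openstack hypervisor list

   # On the reinstalled compute node: store the UUID and restart nova-compute
   echo "d9302c2e-0000-4000-8000-000000000000" > /var/lib/nova/compute_id
   chown nova:nova /var/lib/nova/compute_id
   chmod 0640 /var/lib/nova/compute_id
   systemctl restart nova-compute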
.. _OPS repository: https://opendev.org/openstack/openstack-ansible-ops/src/branch/master/ansible_tools/playbooks/set-haproxy-backends-state.yml

scripts/upgrade-utilities/nova-restore-compute-id.yml (new file)

@@ -0,0 +1,44 @@
---
- name: Ensuring that compute node has UUID state defined
  hosts: nova_compute
  vars:
    nova_compute_id_file: /var/lib/nova/compute_id
  handlers:
    - name: Restart nova
      ansible.builtin.service:
        name: nova-compute
        state: restarted
  tasks:
    - name: Checking if compute file exist
      ansible.builtin.stat:
        path: "{{ nova_compute_id_file }}"
      register: _compute_id_status

    - name: Get list of existing hypervisors  # noqa: run-once[task]
      ansible.builtin.command: openstack --os-cloud default hypervisor list -f json -c ID -c "Hypervisor Hostname"
      run_once: true
      delegate_to: "{{ groups['utility_all'][0] }}"
      register: nova_hypervisors
      changed_when: false

    - name: Get node UUID if needed
      when: not _compute_id_status.stat.exists
      block:
        - name: Register hypervisors fact
          ansible.builtin.set_fact:
            nova_hv: "{{ nova_hypervisors.stdout | from_json | selectattr('Hypervisor Hostname', 'eq', ansible_facts['nodename']) }}"

        - name: Place node UUID to the expected location
          ansible.builtin.copy:
            dest: "{{ nova_compute_id_file }}"
            content: >
              {{ nova_hv[0]['ID'] }}
            owner: nova
            group: nova
            mode: "0640"
          when: nova_hv
          notify: Restart nova