Adding Backup and Restore Operations

This is a set of tasks to help facilitate backup and restore operations.
This includes preparing an external backup host, configuring SSH for
rsync, and some initial tasks for backing up and restoring a containerized
galera cluster. This should lay the foundation for further backup and
restore tasks.

Change-Id: If5c11956291205cd04a6ef9bdb3d221fcd970f24
Dan Macpherson 2018-09-22 03:05:16 +10:00
parent 4fab777295
commit da38af7d2d
12 changed files with 572 additions and 0 deletions

README-backup-ops.md Normal file
# Backup and Restore Operations #
The `openstack-operations` role includes some foundational backup and restore Ansible tasks to help with automatically backing up and restoring OpenStack services. The current services available to back up and restore include:
* MySQL on a galera cluster
* More coming soon...
Scenarios tested:
* TripleO, 1 Controller, 1 Compute, backup to the undercloud
* TripleO, 1 Controller, 1 Compute, backup to remote server
* TripleO, 3 Controllers, 1 Compute, backup to the undercloud
* TripleO, 3 Controllers, 1 Compute, backup to remote server
## Architecture ##
The architecture uses three main host types:
* Target Hosts - The OpenStack nodes with data to back up. For example, this would be any nodes running database servers.
* Backup Host - The destination to store the backup.
* Control Host - The host that executes the playbook. For example, this would be the undercloud on TripleO.
You can also unify the Backup Host and Control Host onto a single host. For example, a host that runs playbooks AND stores the backup data.
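As a rough sketch, these host types typically map to inventory groups similar to the following (the group names and addresses are only illustrative, and the Control Host is simply the machine where you run `ansible-playbook`):
~~~~
[backup]
192.0.2.250 ansible_user=backup

[mysql]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin
~~~~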
## Requirements ##
General Requirements:
* Backup Host needs access to the `rsync` package. A task in `initialize_backup_host.yml` will attempt to install it.
MySQL/Galera
* Target Hosts need access to the `mysql` package. Tasks in the backup and restore files will attempt to install it.
* When restoring to Galera, the Control Host requires the `pacemaker_resource` module. You can obtain this module from the `ansible-pacemaker` RPM. If your operating system does not have access to this package, you can clone the [ansible-pacemaker git repo](https://github.com/redhat-openstack/ansible-pacemaker). When running a restore playbook, include the `ansible-pacemaker` module using the `-M` option (e.g. `ansible-playbook -M /usr/share/ansible-modules ...`)
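For example, with the `ansible-pacemaker` RPM installed, a restore playbook run might look like the following (`inventory.ini` and `restore.yml` are placeholder names for your own inventory and playbook):
~~~~
ansible-playbook -M /usr/share/ansible-modules -i inventory.ini restore.yml
~~~~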
## Task Files ##
The following is a list of the task files used in the backup and restore process.
Initialization Tasks:
* `initialize_backup_host.yml` - Makes sure the Backup Host (destination) has an SSH key pair and rsync installed.
* `enable_ssh.yml` - Enables SSH access from the Backup Host to the Target Hosts. This is so rsync can pull the backup data during a backup and push the data during a restore.
* `disable_ssh.yml` - Disables SSH access from the Backup Host to the Target Hosts. This ensures that access is granted only for the duration of the backup or restore.
* `set_bootstrap.yml` - In situations with high availability, some restore tasks (such as Pacemaker functions) only need to be carried out by one of the Target Hosts. The tasks in `set_bootstrap.yml` set a "bootstrap" node to help execute single tasks on only one Target Host. This is usually the first node in your list of targets.
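As a minimal sketch of how the bootstrap flag is consumed, a task that must run on only one Target Host can be guarded with a condition like the following (the task itself is only illustrative):
~~~~
- name: Run a one-off restore step on the bootstrap node only
  command: /usr/bin/true
  when: bootstrap_node == true
~~~~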
Backup Tasks:
* `backup_mysql.yml` - Performs a backup of the OpenStack MySQL data and grants, archives them, and sends them to the desired backup host.
Restore Tasks:
* `restore_galera.yml` - Performs a restore of the OpenStack MySQL data and grants on a containerized galera cluster. This involves shutting down the current galera cluster, creating a brand new MySQL database, then importing the data and grants from the archive. In addition, the playbook saves a copy of the old data in case the restore process fails.
Validation Tasks:
* `validate_galera.yml` - Performs the equivalent of `clustercheck`, i.e. it checks that `wsrep_local_state` is 4 ("Synced").
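Conceptually, the check boils down to running the following query on each node and retrying until the cluster reports a synced state (the password shown is a placeholder):
~~~~
mysql -u clustercheck -p<clustercheck_password> -nNE -e "SHOW STATUS LIKE 'wsrep_local_state';" | tail -1
~~~~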
## Variables ##
Use the following variables to customize how you want to run these tasks.
Variables for all backup tasks:
* `backup_directory` - The location on the backup host to rsync archives. If unset, defaults to the home directory of the chosen inventory user for the Backup Host. If you aim to have recurring backup jobs and store multiple iterations of the backup, you should set this to a dynamic value such as a timestamp or UUID (see the example after this variable list).
* `backup_server_hostgroup` - The name of the host group containing the backup server. Ideally, this host group only contains the Backup Host. If more than one host exists in this group, the tasks pick the first host in the group. Note the following:
* The chosen Backup Host Group must be in your inventory.
* The Backup Host must be initialized using the `initialize_backup_host.yml` tasks. You can do this by placing the Backup Host in a host group called `backup` and referring to it with `hosts: backup[0]` in a play that runs the `initialize_backup_host` tasks.
* You can only use one Backup Host. This is because the delegation for the `synchronize` module allows only one host.
MySQL and galera backup and restore variables:
* `kolla_path` - The location of the configuration for Kolla containers. Defaults to `/var/lib/config-data/puppet-generated`.
* `mysql_bind_host` - The IP address for database server access. The tasks place a temporary firewall block on this IP address to prevent services writing to the database during the restore.
* `mysql_root_password` - The original root password to access the database. If unset, it checks the Puppet hieradata for the password.
* `mysql_clustercheck_password` - The original password for the clustercheck user. If unset, it checks the Puppet hieradata for the password.
* `galera_container_image` - The image to use for the temporary container to restore the galera database. If unset, it tries to determine the image from the existing galera container.
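For example, assuming your backup playbook is saved as `backup.yml` (a placeholder name), you could point `backup_directory` at a timestamped path by passing it as an extra variable:
~~~~
ansible-playbook -i inventory.ini backup.yml \
  -e "backup_directory=/home/backup/openstack-backup-$(date +%Y%m%d%H%M%S)"
~~~~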
## Inventory and Playbooks ##
You ultimately define how to use the tasks with your own playbooks and inventory. The inventory should include the host groups and users to access each host type. For example:
~~~~
[my_backup_host]
192.0.2.200 ansible_user=backup
[my_target_host]
192.0.2.101 ansible_user=openstack
192.0.2.102 ansible_user=openstack
192.0.2.103 ansible_user=openstack
[all:vars]
backup_directory="/home/backup/my-backup-folder/"
~~~~
The process for your playbook depends largely on whether you want to back up or restore. However, the general process usually follows:
1. Initialize the backup host
2. Ensure SSH access from the backup host to your OpenStack nodes
3. Perform the backup or restore. If necessary, set a bootstrap node so that certain tasks run on only a single Target Host.
4. (Optional) If using a separate Backup Host (i.e. not the Control Host), disable SSH access from the backup host to your OpenStack nodes.
## Examples ##
The following examples show how to use the backup and restore tasks.
### Backup and restore galera to a remote backup server ###
This example shows how to backup data to the `root` user on a remote backup server, and then restore it. The inventory file for both functions is the same:
~~~~
[backup]
192.0.2.250 ansible_user=root
[mysql]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin
[all:vars]
backup_directory="/root/backup-test/"
~~~~
Backup Playbook:
~~~~
---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Backup MySQL database
  hosts: "{{ target_hosts | default('mysql') }}[0]"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_mysql
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: disable_ssh
~~~~
We do not need to include the bootstrap tasks with the backup since all tasks are performed by one of the Target Hosts.
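Assuming the playbook above is saved as `backup.yml` and the inventory as `inventory.ini` (both are placeholder names), you can run the backup from the Control Host with:
~~~~
ansible-playbook -i inventory.ini backup.yml
~~~~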
Restore Playbook:
~~~~
---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Restore MySQL database on galera cluster
  hosts: "{{ target_hosts | default('mysql') }}"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_galera
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: disable_ssh
~~~~
We include the bootstrap tasks with the restore since all Target Hosts are required for the restore but certain operations must be performed on only one of the hosts.
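Assuming the restore playbook is saved as `restore.yml` (a placeholder name), remember to make the `pacemaker_resource` module available when you run it, for example:
~~~~
ansible-playbook -M /usr/share/ansible-modules -i inventory.ini restore.yml
~~~~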
### Backup and restore galera to a combined control/backup host ###
This example shows how to back up to a directory on the Control Host using the same user for both Ansible and rsync operations. In this case, we use the `stack` user. We also use the `heat-admin` user to access the OpenStack nodes. Both the backup and restore operations use the same inventory file:
~~~~
[backup]
localhost ansible_user=stack
[mysql]
192.0.2.101 ansible_user=heat-admin
192.0.2.102 ansible_user=heat-admin
192.0.2.103 ansible_user=heat-admin
[all:vars]
backup_directory="/home/stack/backup-test/"
~~~~
Backup Playbook:
~~~~
---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Backup MySQL database
  hosts: "{{ target_hosts | default('mysql') }}[0]"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: backup_mysql
~~~~
Restore Playbook:
~~~~
---
- name: Initialize backup host
  hosts: "{{ backup_hosts | default('backup') }}[0]"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: initialize_backup_host

- name: Restore MySQL database on galera cluster
  hosts: "{{ target_hosts | default('mysql') }}"
  vars:
    backup_server_hostgroup: "{{ backup_hosts | default('backup') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: set_bootstrap
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: enable_ssh
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: restore_galera
~~~~
In this situation, we do not include the `disable_ssh` tasks since this would disable access from the Control Host to the OpenStack nodes for future Ansible operations.
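Although not shown in the examples above, you can verify the cluster after a restore with a play that runs the `validate_galera` tasks. A minimal sketch might look like this:
~~~~
- name: Validate the galera cluster after the restore
  hosts: "{{ target_hosts | default('mysql') }}"
  tasks:
    - import_role:
        name: ansible-role-openstack-operations
        tasks_from: validate_galera
~~~~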

README.md
If using Docker, see these guides for [images](https://docs.docker.com/engine/reference/commandline/images/#filtering), [containers](https://docs.docker.com/engine/reference/commandline/ps/#filtering), and [volumes](https://docs.docker.com/engine/reference/commandline/volume_ls/#filtering) for filter options.
## Backup and Restore Operations ##
See [Backup and Restore Operations](README-backup-ops.md) for more details.
## Requirements ##

tasks/backup_mysql.yml Normal file
---
# Tasks for dumping a MySQL backup on a single host and pulling it to the
# Backup Server.
- name: Make sure mysql client is installed on the Target Hosts
  yum:
    name: mariadb
    state: installed

- name: Remove any existing database backup directory
  file:
    path: /var/tmp/openstack-backup/mysql
    state: absent

- name: Create a new MySQL database backup directory
  file:
    path: /var/tmp/openstack-backup/mysql
    state: directory

- name: Get the database root password
  script: |
    /bin/hiera -c /etc/puppet/hiera.yaml mysql::server::root_password
  when: mysql_root_password is undefined
  register: mysql_root_password_cmd_output
  become: true

- name: Convert the database root password if unknown
  set_fact:
    mysql_root_password: "{{ mysql_root_password_cmd_output.stdout_lines[0] }}"
  when: mysql_root_password is undefined

# Originally used the script module for this but it had issues with
# command piping. Using a script to perform the MySQL dumps.
- name: Create MySQL backup script
  template:
    src: backup_mysql.sh.j2
    dest: /var/tmp/openstack-backup/mysql/backup_mysql.sh
    mode: u+rwx

- name: Run the MySQL backup script
  command: /var/tmp/openstack-backup/mysql/backup_mysql.sh

# The archive module is pretty limited. Using a script instead.
- name: Archive the OpenStack databases
  script: |
    /bin/tar --ignore-failed-read --xattrs \
      -zcf /var/tmp/openstack-backup/mysql/openstack-backup-mysql.tar \
      /var/tmp/openstack-backup/mysql/*.sql

- name: Copy the archive to the backup server
  synchronize:
    mode: pull
    src: "/var/tmp/openstack-backup/mysql/openstack-backup-mysql.tar"
    dest: "{{ backup_directory | default('~/.') }}"
    set_remote_user: false
    ssh_args: "-F /var/tmp/{{ ansible_hostname }}_config"
  delegate_to: "{{ hostvars[groups[backup_server_hostgroup][0]]['inventory_hostname'] }}"

- name: Remove the database backup directory
  file:
    path: /var/tmp/openstack-backup/mysql
    state: absent

tasks/disable_ssh.yml Normal file
---
# Removes the key that enable_ssh.yml added to the inventory user's
# authorized_keys on each OpenStack node.
- name: Remove Backup Host authorized key on the OpenStack nodes
  authorized_key:
    user: "{{ ansible_user }}"
    state: absent
    key: "{{ hostvars[groups[backup_server_hostgroup][0]]['backup_ssh_key']['content'] | b64decode }}"

- name: Remove temporary SSH config for each OpenStack node on Backup Host
  file:
    path: /var/tmp/{{ ansible_hostname }}_config
    state: absent
  delegate_to: "{{ hostvars[groups[backup_server_hostgroup][0]]['inventory_hostname'] }}"

tasks/enable_ssh.yml Normal file
---
- name: Allow SSH access from Backup Host to OpenStack nodes
  authorized_key:
    user: "{{ ansible_user }}"
    state: present
    key: "{{ hostvars[groups[backup_server_hostgroup][0]]['backup_ssh_key']['content'] | b64decode }}"

# The synchronize module has issues with delegation and remote users. This
# task creates SSH config to set the SSH user for each host.
- name: Add temporary SSH config for each OpenStack node on Backup Host
  template:
    src: backup_ssh_config.j2
    dest: /var/tmp/{{ ansible_hostname }}_config
  delegate_to: "{{ hostvars[groups[backup_server_hostgroup][0]]['inventory_hostname'] }}"

tasks/initialize_backup_host.yml Normal file
---
- name: Make sure the Backup Host has an SSH key
  user:
    name: "{{ ansible_user }}"
    generate_ssh_key: yes

- name: Get the contents of the Backup Host's public key
  slurp:
    src: "{{ ansible_user_dir }}/.ssh/id_rsa.pub"
  register: backup_ssh_key

- name: Install rsync on the Backup Host
  yum:
    name: rsync
    state: installed

- name: Make sure the backup directory exists
  file:
    path: "{{ backup_directory }}"
    state: directory

tasks/restore_galera.yml Normal file
---
# Tasks for restoring a MySQL backup on a galera cluster
- name: Make sure mysql client is installed on the Target Hosts
  yum:
    name: mariadb
    state: installed

- name: Get the galera container image if not user-defined
  command: "docker ps --filter name=.*galera.* --format='{{ '{{' }} .Image {{ '}}' }}'"
  when: galera_container_image is undefined
  register: galera_container_image_cmd_output
  become: true

- name: Convert the galera container image variable if unknown
  set_fact:
    galera_container_image: "{{ galera_container_image_cmd_output.stdout_lines[0] }}"
  when: galera_container_image is undefined

- name: Get the database root password
  script: |
    /bin/hiera -c /etc/puppet/hiera.yaml mysql::server::root_password
  when: mysql_root_password is undefined
  register: mysql_root_password_cmd_output
  become: true

- name: Convert the database root password variable if unknown
  set_fact:
    mysql_root_password: "{{ mysql_root_password_cmd_output.stdout_lines[0] }}"
  when: mysql_root_password is undefined

- name: Get the database clustercheck password
  script: |
    /bin/hiera -c /etc/puppet/hiera.yaml mysql_clustercheck_password
  when: mysql_clustercheck_password is undefined
  register: mysql_clustercheck_password_cmd_output
  become: true

- name: Convert the database clustercheck password variable if unknown
  set_fact:
    mysql_clustercheck_password: "{{ mysql_clustercheck_password_cmd_output.stdout_lines[0] }}"
  when: mysql_clustercheck_password is undefined

- name: Remove any existing database backup directory
  file:
    path: /var/tmp/openstack-backup/mysql
    state: absent
  when: bootstrap_node == true

- name: Create a new mysql database backup directory
  file:
    path: /var/tmp/openstack-backup/mysql
    state: directory
  when: bootstrap_node == true

- name: Copy MySQL backup archive from the backup server
  synchronize:
    mode: push
    src: "{{ backup_directory | default('~/.') }}/openstack-backup-mysql.tar"
    dest: /var/tmp/openstack-backup/mysql/
    set_remote_user: false
    ssh_args: "-F /var/tmp/{{ ansible_hostname }}_config"
  delegate_to: "{{ hostvars[groups[backup_server_hostgroup][0]]['inventory_hostname'] }}"
  when: bootstrap_node == true

- name: Unarchive the database archive
  script: |
    /bin/tar --xattrs \
      -zxf /var/tmp/openstack-backup/mysql/openstack-backup-mysql.tar \
      -C /
  when: bootstrap_node == true

- name: Get the database bind host IP on each node
  script: |
    /bin/hiera -c /etc/puppet/hiera.yaml mysql_bind_host
  when: mysql_bind_host is undefined
  register: mysql_bind_host_cmd_output
  become: true

- name: Convert the database bind host variable if unknown
  set_fact:
    mysql_bind_host: "{{ mysql_bind_host_cmd_output.stdout | trim }}"
  when: mysql_bind_host is undefined

- name: Temporarily block the database port from external access on each node
  iptables:
    chain: 'INPUT'
    destination: "{{ mysql_bind_host }}"
    destination_port: 3306
    protocol: tcp
    jump: DROP
  become: true

- name: Disable galera-bundle
  pacemaker_resource:
    resource: galera-bundle
    state: disable
    wait_for_resource: true
  become: true
  when: bootstrap_node == true

- name: Get a timestamp
  set_fact:
    timestamp: "{{ ansible_date_time.iso8601_basic_short }}"

- name: Create directory for the old MySQL database
  file:
    path: /var/tmp/openstack-backup/mysql-old-{{ timestamp }}
    state: directory

- name: Copy old MySQL database
  synchronize:
    src: "/var/lib/mysql/"
    dest: "/var/tmp/openstack-backup/mysql-old-{{ timestamp }}/"
  delegate_to: "{{ inventory_hostname }}"
  become: true

- name: Create a temporary directory for database creation script
  file:
    path: /var/tmp/galera-restore
    state: directory

- name: Create MySQL database creation script
  template:
    src: create_new_db.sh.j2
    dest: /var/tmp/galera-restore/create_new_db.sh
    mode: u+rwx

- name: Create a galera restore container, remove the old database, and create a new empty database
  docker_container:
    name: galera_restore
    detach: false
    command: "/var/tmp/galera-restore/create_new_db.sh"
    image: "{{ galera_container_image }}"
    volumes:
      - /var/lib/mysql:/var/lib/mysql:rw
      - /var/tmp/galera-restore:/var/tmp/galera-restore:ro
  become: true

- name: Remove galera restore container
  docker_container:
    name: galera_restore
    state: absent
  become: true

- name: Enable galera
  pacemaker_resource:
    resource: galera-bundle
    state: enable
    wait_for_resource: true
  become: true
  when: bootstrap_node == true

- name: Perform a local database port check
  wait_for:
    port: 3306
    host: "{{ mysql_bind_host }}"

- name: Import OpenStack MySQL data
  script: |
    /bin/mysql -u root -p{{ mysql_root_password }} < /var/tmp/openstack-backup/mysql/openstack-backup-mysql.sql
  when: bootstrap_node == true

- name: Import OpenStack MySQL grants data
  script: |
    /bin/mysql -u root -p{{ mysql_root_password }} < /var/tmp/openstack-backup/mysql/openstack-backup-mysql-grants.sql
  when: bootstrap_node == true

- name: Re-enable the database port for external access
  iptables:
    chain: 'INPUT'
    destination: "{{ mysql_bind_host }}"
    destination_port: 3306
    protocol: tcp
    jump: DROP
    state: absent
  become: true

tasks/set_bootstrap.yml Normal file
---
- name: Set bootstrap status to false on all nodes
  set_fact:
    bootstrap_node: false

- name: Set the bootstrap status on the first node
  set_fact:
    bootstrap_node: true
  when: inventory_hostname == ansible_play_hosts[0]

tasks/validate_galera.yml Normal file
---
- name: Get the database clustercheck password
  script: |
    /bin/hiera -c /etc/puppet/hiera.yaml mysql_clustercheck_password
  when: mysql_clustercheck_password is undefined
  register: mysql_clustercheck_password_cmd_output
  become: true

- name: Convert the database clustercheck password if unknown
  set_fact:
    mysql_clustercheck_password: "{{ mysql_clustercheck_password_cmd_output.stdout_lines[0] }}"
  when: mysql_clustercheck_password is undefined

- name: Check the Galera cluster is Synced
  script: |
    /bin/mysql -u clustercheck -p{{ mysql_clustercheck_password }} -nNE -e "SHOW STATUS LIKE 'wsrep_local_state';" | tail -1
  register: clustercheck_state
  until: clustercheck_state.stdout | trim | int == 4
  retries: 10
  delay: 5

templates/backup_mysql.sh.j2 Normal file
#!/bin/bash
mysql -uroot -p{{ mysql_root_password }} -s -N -e "select distinct table_schema from information_schema.tables where engine='innodb' and table_schema != 'mysql';" | xargs mysqldump -uroot -p{{ mysql_root_password }} --single-transaction --databases > /var/tmp/openstack-backup/mysql/openstack-backup-mysql.sql
mysql -uroot -p{{ mysql_root_password }} -s -N -e "SELECT CONCAT('\"SHOW GRANTS FOR ''',user,'''@''',host,''';\"') FROM mysql.user where (length(user) > 0 and user NOT LIKE 'root')" | xargs -n1 mysql -uroot -p{{ mysql_root_password }} -s -N -e | sed 's/$/;/' > /var/tmp/openstack-backup/mysql/openstack-backup-mysql-grants.sql

templates/backup_ssh_config.j2 Normal file
Host {{ inventory_hostname }}
User {{ ansible_user }}

templates/create_new_db.sh.j2 Normal file
#!/bin/bash
rm -rf /var/lib/mysql/*
mysql_install_db --datadir=/var/lib/mysql --user=mysql
chown -R mysql:mysql /var/lib/mysql/
restorecon -R /var/lib/mysql
/usr/bin/mysqld_safe --datadir='/var/lib/mysql' &
while ! mysql -u root -e ";" ; do
echo "Waiting for database to become active..."
sleep 1
done
echo "Database active!"
/usr/bin/mysql -u root -e "CREATE USER 'clustercheck'@'localhost';"
/usr/bin/mysql -u root -e "GRANT PROCESS ON *.* TO 'clustercheck'@'localhost' IDENTIFIED BY '{{ mysql_clustercheck_password }}';"
/usr/bin/mysqladmin -u root password {{ mysql_root_password }}
mysqladmin -u root -p{{ mysql_root_password }} shutdown