Resolve MariaDB/Galera cluster startup/logging issues

This patch gives MariaDB adequate time to start on a
resource-constrained system (180s instead of the default 30s),
ensures that the error log is populated, and provides a fallback
restart for the case where the sst directory may be corrupt.

In the handler changes:
 - the environment variable "MYSQLD_STARTUP_TIMEOUT" is now passed
   to the init script because the defaults are not sourced when the
   init script runs.
 - the temporary "sst" directory is removed if the handler restart
   fails. A leftover sst directory on disk can prevent a node from
   joining a cluster or bootstrapping, so removing it puts the node
   in a clean state before the restart is retried.
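
The handler flow described above can be sketched as follows. This is a minimal illustration based on the handlers in this commit, not a replacement for them: the indentation is an assumption (the diff view does not preserve it), and the bootstrap "args" and the retry settings are left out for brevity.

```yaml
# Sketch only: indentation is assumed; bootstrap args and retry settings omitted.
- name: Restart mysql
  service:
    name: mysql
    state: restarted
  environment:
    # Passed explicitly because the init script does not source the defaults.
    MYSQLD_STARTUP_TIMEOUT: 180
  register: galera_restart
  # Notifies only fire on "changed", so a failed restart is reported as a
  # change instead of a failure.
  changed_when: galera_restart | failed
  failed_when: false
  notify:
    - "remove stale .sst"
    - "Restart mysql fall back"

- name: remove stale .sst
  file:
    path: "/var/lib/mysql/.sst"
    state: absent
  when: galera_restart | failed

- name: Restart mysql fall back
  service:
    name: mysql
    state: restarted
  environment:
    MYSQLD_STARTUP_TIMEOUT: 180
  when: galera_restart | failed
```

The "| failed" filter form matches the syntax used in this patch; later Ansible releases express the same check as a test ("is failed").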

In the task changes, a new configuration file that ships with the
mariadb package is now removed because it contains unforeseen options
that prevent any logs from being created.
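
As a sketch, the removal task could look like the following. It assumes that the file referred to is the "mysqld_safe_syslog.cnf" path that appears in the task diff of this commit; the indentation is likewise an assumption.

```yaml
# Sketch only: path taken from the task diff in this commit.
- name: remove default mysql_safe_syslog
  file:
    path: "/etc/mysql/conf.d/mysqld_safe_syslog.cnf"
    state: absent
  tags:
    - galera-config
```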

The default option "galera_innodb_additional_mem_pool_size" was removed
because it is no longer valid in MariaDB 10; the resulting error message
had gone unnoticed until now.

This patch is based on:
 - https://review.openstack.org/256016
 - https://review.openstack.org/266265

Closes-Bug: #1532761
Closes-Bug: #1533126
Change-Id: I16af30c660790656fc2d59f9943c172b88098905
Jesse Pretorius 2016-01-11 16:36:49 +00:00
parent e3126d84e8
commit d839e2e4f9
4 changed files with 35 additions and 4 deletions


@@ -45,7 +45,6 @@ galera_wait_timeout: 28800
## innodb options
galera_innodb_buffer_pool_size: 4096M
galera_innodb_additional_mem_pool_size: 24M
galera_innodb_log_file_size: 1024M
galera_innodb_log_buffer_size: 128M


@@ -18,8 +18,33 @@
name: mysql
state: restarted
args: "{{ (not galera_existing_cluster | bool and inventory_hostname == galera_server_bootstrap_node) | ternary('--wsrep-new-cluster', '') }}"
environment:
MYSQLD_STARTUP_TIMEOUT: 180
when: not galera_running_and_bootstrapped | bool
register: galera_restart
until: galera_restart|success
# notifies are only fired when status is "changed"
changed_when: galera_restart | failed
failed_when: false
notify:
- "remove stale .sst"
- "Restart mysql fall back"
- name: remove stale .sst
file:
path: "/var/lib/mysql/.sst"
state: absent
when: galera_restart | failed
- name: Restart mysql fall back
service:
name: mysql
state: restarted
args: "{{ (not galera_existing_cluster | bool and inventory_hostname == galera_server_bootstrap_node) | ternary('--wsrep-new-cluster', '') }}"
environment:
MYSQLD_STARTUP_TIMEOUT: 180
when: galera_restart | failed
register: galera_restart_fall_back
until: galera_restart_fall_back | success
retries: 3
delay: 2
delay: 5


@@ -66,6 +66,13 @@
tags:
- galera-config
- name: remove default mysql_safe_syslog
file:
path: "/etc/mysql/conf.d/mysqld_safe_syslog.cnf"
state: absent
tags:
- galera-config
- name: Remove policy-rc
file:
path: "/usr/sbin/policy-rc.d"


@@ -15,6 +15,7 @@ socket = /var/run/mysqld/mysqld.sock
[mysqld_safe]
socket = /var/run/mysqld/mysqld.sock
nice = 0
log_error = /var/log/mysql_logs/galera_server_error.log
[mysql]
@@ -66,7 +67,6 @@ table-open-cache = 10240
# INNODB #
innodb-flush-method = O_DIRECT
innodb-additional-mem-pool-size = {{ galera_innodb_additional_mem_pool_size }}
innodb-log-file-size = {{ galera_innodb_log_file_size }}
innodb-flush-log-at-trx-commit = 1
innodb-file-per-table = 1