Commit Graph

97 Commits

Author SHA1 Message Date
Rafael Lopez a86390aeab Additional check to replace missing seeded file
The additional check is based on the cluster being bootstrapped and the last
backup being an SST.

The change includes a new function for checking that the last backup was an SST, and unit tests to verify that function as well as the main charm_check_func, where the check is used and the seeded file is replaced.

Closes-Bug: #2000107
Signed-off-by: Rafael Lopez <rafael.lopez@canonical.com>
Change-Id: I8e516059da5299cc0e0ce8ef0802d3a46abb1a54
2023-01-27 08:34:44 +00:00
Guilherme Maluf Balzana a3f65978a7 Fix log message when catching `get_wsrep_value` exception
`get_wsrep_value` can fail in `cursor.execute()`, in which case no log
message would be written.

This commit fixes that and ensures the right message is logged.

Closes-Bug: #1942936
Change-Id: Idbb1170bbf43fdecec233137d6581bf8f799baa9
2022-07-20 11:51:21 +02:00
Chi Wai, Chan 66f004c2ba Add an option to allow MySQL service to log to syslog.
This patch provides an alternative solution to the problem of the
error log growing too large.

Closes-bug: #1812950
Change-Id: If7c0c71492eb30f24cbcc03ca05a67e6ea571f4e
2022-06-28 13:53:16 +08:00
Hervé Beraud 6ff8acbd6a Use unittest.mock instead of mock
The mock third party library was needed for mock support in py2
runtimes. Since we now only support py36 and later, we can use the
standard lib unittest.mock module instead.

Note that https://github.com/openstack/charms.openstack is used during tests
and needs `mock`. Unfortunately, it doesn't declare `mock` in its
requirements, so it retrieves mock from other charm projects (cross dependency).
We therefore depend on charms.openstack first; once
Ib1ed5b598a52375e29e247db9ab4786df5b6d142 is merged, CI
will pass without errors.

Depends-On: Ib1ed5b598a52375e29e247db9ab4786df5b6d142
Change-Id: Ie96a81d19be4f14efc7067ddb9c47827f5255ccf
2021-12-15 11:11:14 +00:00
Nicholas Njihia 75fcf19f33 Add max_connect_errors configuration option.
This change adds the config option max_connect_errors for MySQL to this
charm and sets a default of 100.
The commit also includes this (default) config setting in the
unit tests.
Closes-Bug: #1776908

Change-Id: I33b9e29bd64ad8a1fec0edc3dfd657a87648b537
2021-09-02 18:15:22 +03:00
Billy Olsen fc9b9dbcb9 Don't configure databases to replicate w/out async
Check that the asynchronous replication relations exist prior
to providing the databases_to_replicate to the templating engine
code. Specifying the databases-to-replicate config option is not
supposed to apply when the asynchronous relation is not established.
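
The guard described above can be sketched roughly as follows. This is a minimal illustration only: `build_replication_context` and its `async_relation_ids` argument are hypothetical names, not the charm's actual functions.

```python
def build_replication_context(config, async_relation_ids):
    """Return template context for replication settings.

    Hypothetical helper: databases_to_replicate is only passed to the
    templating engine when an asynchronous replication relation exists.
    """
    context = {}
    databases = config.get("databases-to-replicate")
    # Ignore the config option unless the async relation is established.
    if databases and async_relation_ids:
        context["databases_to_replicate"] = databases
    return context
```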

Closes-Bug: #1934680
Change-Id: Iedd0532268f0bf533305412105590c5afdd7c0ec
2021-08-16 16:46:08 -07:00
Alex Kavanagh 432c3f0be4 Ensure that nagios user gets created with a password
The associated bug is due to a change introduced in commit d55dcde
which was to ensure that the correct password update is used for
different versions of mysql (pre and post 5.7.5).  However, this change
has broken the nagios user creation due to not setting the password.

This patch creates the nagios user and password at the same time.  The
updating of the password is only done if the account already exists.

The change also corrects the nagios password store in leader-settings to
the 'mysql-nagios.passwd' key instead of 'nagios-password'.  This was an
unfortunate error when the nagios change password was introduced.  The
charm detects if the 'nagios-password' key is used on charm-upgrade and
moves it to 'mysql-nagios.passwd'.  This enables the key to work with
the standard MySQLHelper functions.

Finally, the ALTER command (on percona) doesn't update non-InnoDB tables
and thus needs to be run for each unit when the nagios password is
changed via the action.  The changes in percona_utils.py enable this to
happen.

Whilst the change looks large it ONLY affects the nagios password parts
of the charm.

The related bug is a tracking bug to serve as a reminder to fix this in
charm-helpers and this charm (i.e. make the charm-helpers code work to
change a password for any user other than root, and then enable this
charm to use that code).

Change-Id: Ibc751bef7b4654ebffdf843c556b193373e6e80c
Related-Bug: #1925377
Closes-Bug: #1925042
2021-04-21 18:17:45 +01:00
Alex Kavanagh d55dcdebc9 Ensure that correct password update is used for mysql version
On mysql versions before 5.7.5 (e.g. xenial) a different password update
SQL statement is needed.  Detect the ubuntu version (assuming the
packages are used) and use the statement appropriate for that version.

Change-Id: Ifd73f20c2523de164fb23e8f5a5e757393d489e3
Closes-Bug: #1916672
2021-02-27 15:42:55 +00:00
Zuul eac7bceb0c Merge "Raise file limits to allow max-connections >4190" 2020-12-17 20:11:22 +00:00
Trent Lloyd 5952b3de01 Raise file limits to allow max-connections >4190
Install a systemd unit override with LimitNOFILE raised to the required
number based on the same calculation the server performs to ensure both
the requested max_connections and table_open_cache values can be met.

The MySQL server does make some attempt to do this itself when started
as root which worked under Xenial, however in Bionic the systemd service
is started as the mysql user with LimitNOFILE=5000. As a result, the
server cannot raise the limit itself and caps max-connections to 4190,
table-open-cache to 200 and logs a warning to that effect:
[Warning] Changed limits: max_connections: 4190 (requested 12000)

More background:
https://www.percona.com/blog/2017/10/12/open_files_limit-mystery/
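
As a rough sketch of the calculation mentioned above: the formula below follows the MySQL 5.7 documentation for `open_files_limit` as best recalled, and the function names and the override text are illustrative assumptions rather than the charm's actual code.

```python
def required_open_files(max_connections, table_open_cache):
    """Estimate the file descriptor limit the server needs.

    Assumption: mirrors the documented MySQL 5.7 sizing rule, taking the
    larger of the two estimates below.
    """
    return max(
        10 + max_connections + 2 * table_open_cache,
        5 * max_connections,
    )


def render_override(limit):
    """Render a systemd drop-in fragment (hypothetical layout)."""
    return "[Service]\nLimitNOFILE={}\n".format(limit)
```

For example, `render_override(required_open_files(12000, 2000))` would produce a drop-in requesting a limit large enough for the 12000 connections from the warning above.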

Change-Id: I25b429cc9b4970e3d7ef39bb9e6d738fe943686f
Closes-Bug: #1905366
2020-11-30 19:50:16 +08:00
Robert Gildein 11ee1ef563 NRPE: Monitor threads connected to MySQL.
Add a NRPE check to monitor the number of threads connected
to the MySQL database, in proportion to the maximum number of connections.
For this check, a nagios user will be created. This user does not have any
permissions set, does not have access to any database and can only connect
from localhost.
Warning and Critical thresholds (in percentage) can be configured.

Add an action to reset nagios's password. This action can only be run
on the leader unit.

Closes-Bug: #1816759
Change-Id: Id35b0331322c2744a9f839b3eb153eed1bc53aac
2020-11-27 14:50:02 +00:00
Alex Kavanagh 2962cc9a9a Ensure that c.c.unitdata.kv is properly mocked out
As unit tests run concurrently, it's important that all tests use a
mocked out version of the kv() store.  Otherwise, unit tests can race
and fail due to SQLite lock errors.  See linked bug for details.

Change-Id: I7e16a566531a7faf9d3a960c3df524fd46976a2a
Closes-Bug: #1905760
2020-11-27 10:20:31 +00:00
Robert Gildein b5726aa07c React to `source` or `key` changes
After changing the `source` or `key` in the configuration,
the `config-changed` hook was activated, but did not call `add_source`
and `apt_update`.
This logic has been added to the `config-changed` hook, and
the state is now set to `blocked` if the addition of a new source
or key fails.

Closes-Bug: #1893034
Change-Id: I0e7afd5a06c4945341329dd96e8eef3016367386
2020-11-25 17:33:33 +01:00
Alex Kavanagh a4635c13a8 Sync libraries & common files prior to freeze
* charm-helpers sync for classic charms
* charms.ceph sync for ceph charms
* rebuild for reactive charms
* sync tox.ini files as needed
* sync requirements.txt files to sync to standard

Change-Id: I1bf6a834b3678f631f6335af9f0e8b779f863c66
2020-09-27 19:39:08 +01:00
David Ames 7eddd6074d Guarantee executable OCF mysql_monitor file
Closes-Bug: #1890470
Change-Id: Iee0aa8b1ae0f364d01d30a61381af222a264b090
2020-08-07 06:22:12 +00:00
Felipe Reyes 4978615782 Check seeded file in update-status
Change-Id: I7174c24c36ea878d3c4e509beff688871937ecd4
Closes-Bug: #1867458
Func-Test-Pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/203
2020-04-15 16:35:18 -04:00
Felipe Reyes 1f56c36b9f Mark seeded on cluster-relation-changed
When a node is bootstrapped and the others join, they will receive a copy of
the database via SST with the help of XtraBackup, during this process the
/var/lib/percona-xtradb-cluster will be emptied so after this the seeded
file will be missing. Since the action bootstrap-pxc will trigger the
cluster-relation-changed hook it's a good place to call mark_seeded().

Change-Id: I8510bb81435a3096d6a005610fce88ff2b7ffeab
Closes-Bug: #1868326
Func-Test-Pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/203
2020-03-24 18:12:59 -03:00
Liam Young 8443c165f9 Notify clients of series upgrade
When the percona cluster is undergoing a series upgrade, clients
should suspend db activity in their hooks (like db migrations).

This change sends a notification of upgrade down the shared-db
relation which clients can then react to.

Change-Id: I5d8ed7d3935db5568c50f8d585e37a4d0cc6914f
2020-01-29 12:56:23 +00:00
David Ames d8b13606e4 Guard yaml load of grastate on cold boot
Occasionally after cold boot yaml load of grastate will throw an
exception. Do not error out in this instance.
Update percona to use TEST_ variables
Fix HA overlays

func-test-pr: https://github.com/openstack-charmers/zaza-openstack-tests/pull/116
Change-Id: I6e40970423acb6f70dcc3b91f8b5109de6f46bfc
2019-11-07 15:56:52 -08:00
David Ames 475b889020 Do not add former leader to hosts
After a cold boot, the leader has likely changed. Do not add this node
as a former leader to the hosts list. This may cause changes to
mysqld.cnf and an unwanted attempted restart of mysql.

Change-Id: I5fc4b7822a4550e53e97655771938a903f92fcb1
Close-Bug: #1838648
2019-08-01 07:56:41 -07:00
Liam Young e27006e98b Switch to using DDL for root password update
Using DDL for the root password update triggers the cluster to
replicate the password rather than relying on the charm to apply it
to each unit. This means that only the leader should update the
password.

To avoid a password update via config being missed the implicit save
of config data is turned off on update-status hooks.

Change-Id: I00fe3c7c32a000c6b697263a9ad32273515952a7
Closes-Bug: #1838124
Closes-Bug: #1838125
2019-07-30 12:31:27 -07:00
Liam Young 9844f2fd3e Never give up on res_mysql_monitor
Configure pacemaker to never give up on the res_mysql_monitor
resource and to recheck 5 seconds after a failure. This is
achieved using migration-threshold and failure-timeout
options *1.

*1 https://clusterlabs.org/pacemaker/doc/en-US/Pacemaker/1.1/html/Pacemaker_Explained/_failure_response.html

Change-Id: If19bea77eb5dee9e9eeff105ab98dce1b2de9f74
Closes-Bug: #1837401
2019-07-24 16:58:58 -05:00
David Ames b8c2213dfb Notify bootstrapped action
It turns out a subsequent required step after a cold boot bootstrap is
notifying the cluster of the new bootstrap UUID.

The notify-bootstrapped action should be run on a different node than
the one which ran the bootstrap-pxc action.

This action will ensure the cluster converges on the correct bootstrap
UUID.

A subsequent patch stacked on this one will include tests for the new
cold boot actions.

Change-Id: Idee12d5f7e28498c5ab6ccb9605f751c6427ac30
Partial-Bug: #1744393
2019-07-17 07:58:15 -07:00
David Ames b97a0971c2 Bootstrap action after a cold boot
After a cold boot, percona-cluster will require administrative
intervention. One node will need to bootstrap per upstream
Percona Cluster documentation:
https://www.percona.com/blog/2014/09/01/galera-replication-how-to-recover-a-pxc-cluster/

This change adds an action to bootstrap a single node. On the other
nodes systemd will be attempting to start percona. Once the bootstrapped
node is up the others will join automatically.

Change-Id: Id9a860edc343ee5dbd7fc8c5ce3b4420ec6e134e
Partial-Bug: #1744393
2019-07-10 14:51:25 -07:00
David Ames 910449f6de Display bootstrap grastate after cold boot
To assist in DB cluster recovery after a cold boot, display the
grastate sequence number and the safe to boot status of the instance
in workload status.

Change-Id: Ia4e0e86e7d10b2b22148237688ff77cac4ebee7d
2019-07-09 15:00:25 -07:00
David Ames 681cdf8e45 Convert to python3
Convert the percona cluster charm to python3.

Remove Trusty testing.

Change-Id: Ia5ae43f16caffb5c4356d3f5616e0383e23b5f50
2019-07-08 07:41:48 -07:00
Trent Lloyd 0697559b51 wsrep_slave_threads: default to 48 on bionic
This improves performance significantly for environments constrained by
calls to sync() such as HDDs or lower-end SSDs (or just very busy
environments running many queries)

By default the queries from other nodes are only processed with
1 thread, which means they will always run slower than on the master and
any long running query will hold up all other queries behind it.

Additionally, when multiple queries commit at once the server can
combine them together into a single on-disk sync ('group commit') which
is not possible otherwise. This optimisation appears to only occur on
Bionic (Percona 5.7) and not Xenial (Percona 5.6).

On Bionic, default to 48 threads which experimentally is a good number
for OpenStack environments without being too crazy high. Galera ensures
that queries that are dependent on each other are still executed
sequentially and generally it is not expected to cause replication
inconsistencies.

However Percona Cluster 5.6 on Xenial appears to have a bug handling
foreign key constraints that causes them to be violated (LP #1823850).
The result is that the slave node crashes out and has to do a full SST
to recover. The same issue is not present on the master. Thus we leave
the default wsrep_slave_threads=1 on Xenial to avoid this issue for now
particularly since Xenial does not appear to be able to use Group Commit
to optimise the number of sync requests generated by the queries - so
this option does not really improve performance there anyway.
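
The per-series default described above could be expressed along these lines. This is only a sketch: the function name, the explicit-override parameter, and the fallback of 1 for series not mentioned in this message are assumptions.

```python
# Defaults taken from the message above; other series are not covered there.
SERIES_DEFAULTS = {
    "xenial": 1,   # Percona 5.6: avoid the foreign key issue (LP #1823850)
    "bionic": 48,  # Percona 5.7: benefits from group commit
}


def default_wsrep_slave_threads(series, configured=None):
    """Pick wsrep_slave_threads for a series (hypothetical helper)."""
    # An explicitly configured value always wins over the default.
    if configured is not None:
        return configured
    # Assumption: fall back to a single thread for unknown series.
    return SERIES_DEFAULTS.get(series, 1)
```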

Partial-Bug: #1822903
Change-Id: Ic9cdd6562f30a3e52aa3d26fea53ba7c2bbdc771
2019-04-09 15:55:19 +08:00
Alex Kavanagh (tinwood) 79d8a0f7b7 Revert "Convert the charm to Python 3"
It's broken at trusty and needs to be re-worked due to a lack of python3-mysqldb at trusty

This reverts commit 03f93dbc76.

Change-Id: I2b722014fc1ed5823635a6b45b3307326fd901af
2019-03-14 15:12:50 +00:00
Alex Kavanagh 03f93dbc76 Convert the charm to Python 3
Change-Id: I2bb250a4abbe58fe3953357332fa0fe16b432e1c
2019-03-12 15:12:59 +00:00
Zuul e84615037f Merge "Add wsrep-slave-threads and gcs-fc-limit" 2019-01-10 09:28:55 +00:00
Xav Paice 7b7188b610 Add wsrep-slave-threads and gcs-fc-limit
These options allow tuning the Galera replication to avoid flow control from
slowing down the primary.

Change-Id: Ib275cae0db02e4c8c0a85fcc8cb138b26eb26982
Closes-Bug: 1799622
2018-12-17 18:08:24 +13:00
David Ames 11864d4f2a Add unit tests for dbs to sync
Change-Id: I1ed8ecc4f08b0145564c03472df64ee4bf93e6bb
2018-12-12 16:25:47 -08:00
Zuul d7ee8bfd1b Merge "MySQL asynchronous replication" 2018-12-13 00:22:31 +00:00
Tytus Kurek e116b1ef86 MySQL asynchronous replication
This patchset implements new relations: "master" and "slave" based
on a common "mysql-async-replication" interface which are used for
the purpose of enabling MySQL asynchronous replication between
multiple Percona XtraDB Clusters.

Change-Id: I94710bff17091516875c81ca769d8078ef5efd10
Closes-Bug: 1776171
2018-12-12 23:30:41 +01:00
Liam Young 29b586c61b Use chelper to configure vip and dnsha settings
Use helpers from charmhelpers to generate the data to send down the
relation to the hacluster charm.

This results in a few changes in behaviour:

1) The charm will no longer specify a nic name to bind the vip. This
   is because Pacemaker VIP resources are able to automatically
   detect and configure correct iface and netmask parameters based
   on local configuration of the unit.
2) The original iface named VIP resource will be stopped and deleted
   prior to the creation of the new short hash named VIP resource.

Change-Id: Id3804fb7913662b8c573f59d84e663561a687b1f
2018-12-12 09:55:45 +00:00
Ryan Beisner 8ba9abb45b Fix lint in unit tests re: py3-first and py2 compat
Use local helper for now due to the transitional state of this charm
from py2 to py3, where cmp does not exist.

Change-Id: I86161b8886da69833c4cc074c0f7ede15f456c9d
2018-11-01 21:48:22 -05:00
Ryan Beisner 52bea295b5 Fix lint in unit test
Change-Id: Ibee4f0c45d83e6b0fd5f92cb10158bf3545d2be2
2018-11-01 14:58:39 -05:00
David Ames fe131a0aa6 Series Upgrade
Implement the series-upgrade feature allowing to move between Ubuntu
series.

Change-Id: If38bf1767c8e0c9242071140535b44e12c9f9759
2018-09-17 15:38:00 +02:00
David Ames 055d2bb17f Cluster peers notify readiness
The percona-cluster charm was rendering, and therefore, restarting
after each peer joined. This causes race conditions with the leader
handing out client credentials.

This is due to the parallel but conflicting goals of building the
cluster as quickly as possible but also delaying client relations until
the full cluster is completely ready.

The charm relied on the bootstrap_uuid being set among peers. However,
the bootstrap_uuid is set intentionally early in pursuit of the first
goal: building the cluster quickly. It did not signify true readiness.

This change adds another peer relation setting, "ready", that each node
sets when it has bootstrapped, its hacluster is complete, and it has a
sufficient number of peers.

The cluster is considered ready when (min-cluster-size) peers have
indicated readiness. This fulfils goal number two: delaying client
relations until the full cluster is ready.
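
A minimal sketch of the readiness rule described above. The function name and the shape of `peer_settings` (a mapping of unit name to that unit's relation data, including the local unit) are assumptions made for illustration.

```python
def is_cluster_ready(peer_settings, min_cluster_size):
    """Return True once enough units have set the "ready" flag.

    peer_settings: hypothetical mapping of unit name to its relation
    settings, e.g. {"percona/0": {"ready": "True"}}.
    """
    ready_units = sum(
        1 for settings in peer_settings.values() if settings.get("ready")
    )
    return ready_units >= min_cluster_size
```

With three units and a min-cluster-size of 3, the check only passes once all three have set "ready", regardless of whether bootstrap_uuid is already shared.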

Change-Id: I0998407fcb5efbdb0f7734ac39363e8d41088c79
Closes-Bug: #1775682
2018-06-15 10:24:30 -07:00
Frode Nordahl fce8db8756 Ensure we call application_version_set with valid argument
Change-Id: If5ddff7eaae5e23df7d705959cd8bffd1b9c6afe
Closes-Bug: #1767060
2018-04-26 11:43:00 +02:00
David Ames 801c2e7829 Redesign cluster buildup process
In order to fix bug#1756928 the whole cluster buildup process needed to
be redesigned. The assumptions about what is_bootstrapped and clustered
meant and when to restart on configuration changes needed to be
re-evaluated.

The timing of restarts needed to be protected to avoid collisions.
Only bootstrapped hosts should go in to the
wsrep_cluster_address=gcomm:// setting. Adding or removing units should
be handled gracefully. Starting with a single unit and expanding to a
cluster must work.

This change guarantees mysqld is restarted when the configuration file
changes and meets all the above requirements. As a consequence of the redesign,
the workload status now more accurately reflects the state of the unit.

Charm-helpers sync to bring in distributed_wait fix.

Closes-Bug: #1756308
Closes-Bug: #1756928
Change-Id: I0742e6889b32201806cec6a0b5835e11a8027567
2018-03-29 09:24:05 -07:00
Zuul 9cac8b8521 Merge "Add support for PXC 5.7 and xtrabackup 2.4" 2018-03-19 12:46:40 +00:00
David Ames 5bce1985e1 Ensure leader settings on charm upgrade
Currently bootstrapping is gated by is_leader_bootstrapped which
checks a handful of leader settings. When upgrading from older
versions of the charm, these settings are missing leading to an
attempt to bootstrap an already bootstrapped cluster.

This change makes sure the leader settings is_leader_bootstrapped is
checking for are all set by the leader on upgrade-charm.

Closes-Bug: #1755507

Change-Id: I172f10221b9447ca3e0c5403feaa49acccfa9e42
2018-03-14 21:43:11 +00:00
Corey Bryant 7d835b8674 Add support for PXC 5.7 and xtrabackup 2.4
Bionic will ship with Percona XtraDB Cluster 5.7 and a newer
version of Percona XtraBackup; the majority of charm changes
are associated with the use of native mysql{@} units for
bootstrap and startup of mysqld.

Co-Authored-By: James Page <james.page@ubuntu.com>
Change-Id: I50c5642e11393da3bc03de0ef0b9af4c32e9a0c9
2018-03-09 11:55:43 +00:00
David Ames c82ca2be29 Allow setting gmcast.peer_timeout value
In some resource constrained environments, particularly during deploy
time, percona-cluster nodes can experience timeouts during inter-node
communication.

This change makes the gmcast.peer_timeout configurable based on the
Galera cluster documentation:
http://galeracluster.com/documentation-webpages/galeraparameters.html

Warning: Changing this value from the default may have unintended
consequences. This should only be used when constraints call for it.

Closes-Bug: #1742683
Change-Id: If93d6ba9d0e99b6af59358a7f3ea40e3aa2a6dbc
2018-02-06 16:05:07 -08:00
David Ames fac45afc60 Gate db{-admin} relations until cluster is ready
Percona-cluster was responding to db and db-admin relations before it
was ready. This led to the error: "WSREP has not yet prepared node for
application use."

This change applies the same gating shared-db relation already has to db
and db-admin relations. It also condenses code used in both instances.

This change guarantees the rendered configuration will not
auto-bootstrap for non-leaders. This addresses Bug 1738896.

Closes-Bug: #1742683
Closes-Bug: #1738896
Change-Id: If525595fd109e6a738071a3f016b9c2eabec529e
2018-02-01 10:01:46 -08:00
David Ames bd5474ce2f Avoid simultaneous restarts
It is possible for two or more percona-cluster units to simultaneously
attempt to restart and join the cluster. When this race condition
occurs one unit may error with:
"Failed to start mysql (max retries reached)"

We already have the control mechanism distributed_wait used in other
charms. This change implements this mechanism for percona-cluster.

Configuration options allow for fine tuning. The balance is time vs
tolerance for collision errors. CI systems may tolerate the occasional
false positive for time saved, whereas production deployments can
sacrifice a bit of time for a guaranteed deploy.

Change-Id: I52e7f8e410ecd77a7a142d44b43414e33eff3a6e
Closes-Bug: #1745432
2018-01-25 15:20:29 -08:00
David Ames dc19ecb4a3 Guarantee timing of installation and render
Previous attempts to solve Bug #1738896 missed the root cause. The
root cause is that the configuration file is rendered before
percona is installed. The rendering includes clustering configuration
which causes percona-cluster to automatically do a single bootstrap
when percona-cluster packages are installed leading to the UUID
mismatch.

The timing and ordering of installation, rendering of the
configuration and restart of mysql is critical across all hook
executions.

This change is a partial reversion of Change ID
I95e56bd28152c934f413025a22dd6821b2ad8e94. The change primarily
guarantees percona-cluster is not installed on non-leader nodes
before the leader is bootstrapped and makes sure the configuration
does not get rendered prior to installation.

is_leader_bootstrapped is introduced and guarantees all data expected
from the leader is available to guard on various tasks.

Closes-Bug: #1744961
Closes-Bug: #1738896
Change-Id: Ifeb1520dba3b14fc1b51a586141905a385f2b2c1
2018-01-24 22:13:19 +00:00
Liam Young f8adca19e2 Retrieve root password from leaderdb if needed
Within update_root_password if there is no root password in the
config() object then cfg['root-password'] will return None.
set_mysql_root_password is then called with None which triggers a
TypeError. The desired root password is in the leaderdb so fallback
to retrieving it from there if it is not in config().
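
The fallback described above might look roughly like this. It is a sketch under stated assumptions: `resolve_root_password` is a hypothetical name, `leader_get` is passed in as a callable, and the leader settings key used here is illustrative.

```python
def resolve_root_password(cfg, leader_get):
    """Return the root password from config, else from leader settings.

    cfg: mapping-like charm config.
    leader_get: callable returning a leader setting for a key
    (the "root-password" key below is an assumption).
    """
    password = cfg.get("root-password")
    if password is None:
        # Config has no value, so fall back to the leader's stored copy
        # rather than passing None on and triggering a TypeError.
        password = leader_get("root-password")
    return password
```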

unit_tests/test_utils.py was also updated to fix the behaviour of
config.changed to mimic that of the real method in charm helpers.

Change-Id: I8d22a66f335c0c8e893cda699a55056476e6d9d5
2018-01-24 11:40:35 +00:00
Edward Hope-Morley db18bc2f60 Ensure that we resolve AAAA if prefer-ipv6 is True
We currently resolve A records regardless of whether
prefer-ipv6 is True or False. This patch ensures
that we request AAAA in the case where prefer-ipv6
is True.
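
The selection logic can be sketched as below. The helper names are hypothetical and the charm may resolve names through a different library; this only illustrates choosing the record type and address family from prefer-ipv6.

```python
import socket


def dns_record_type(prefer_ipv6):
    """Return the DNS record type to request."""
    return "AAAA" if prefer_ipv6 else "A"


def address_family(prefer_ipv6):
    """Return the matching socket address family."""
    return socket.AF_INET6 if prefer_ipv6 else socket.AF_INET


def resolve(hostname, prefer_ipv6):
    """Resolve hostname using only the preferred address family."""
    infos = socket.getaddrinfo(
        hostname, None, family=address_family(prefer_ipv6)
    )
    # Each entry's sockaddr starts with the address; de-duplicate them.
    return sorted({info[4][0] for info in infos})
```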

Change-Id: I0749cd01e504fb4010addcd91bda3a1f32a03fab
Closes-Bug: 1729572
2017-11-07 10:27:38 +00:00