Increase default TimeoutStopSec value

The charm installs systemd overrides of the TimeoutStopSec and
TimeoutStartSec parameters for the corosync and pacemaker services.
The charm currently overrides the stop timeout to 60s, which is a
significant change from the package level default of 30 minutes. The
pacemaker systemd default is 30 minutes to allow time for resources
to safely move off the node before shutting down. How long services
take to migrate away depends on circumstances such as node usage and
the resource involved.

This change increases the timeout to 10 minutes by default, which should
prevent things like unattended-upgrades from causing outages due to
services not starting because systemd timed out (and an instance was
already running).

Change-Id: Ie88982fe987b742082a978ff2488693d0154123b
Closes-Bug: #1903745
Billy Olsen 2020-11-17 12:25:20 -07:00
parent a482fd2263
commit 9645aefdec
2 changed files with 7 additions and 2 deletions


@@ -81,10 +81,13 @@ options:
       in seconds. Set value to -1 turn off timeout for the services.
   service_stop_timeout:
     type: int
-    default: 60
+    default: 600
     description: |
       Systemd override value for corosync and pacemaker service stop timeout
-      seconds. Set value to -1 turn off timeout for the services.
+      seconds. The default value will cause systemd to timeout a service stop
+      after 10 minutes. This should provide for sufficient time for resources
+      to migrate away from the current node as part of the stop sequence in
+      most cases. Set value to -1 turn off timeout for the services.
   stonith_enabled:
     type: string
     default: 'False'


@@ -120,6 +120,7 @@ from utils import (
     get_hostname,
     disable_stonith,
     is_stonith_configured,
+    emit_systemd_overrides_file,
 )
 from charmhelpers.contrib.charmsupport import nrpe
@@ -252,6 +253,7 @@ def migrate_maas_dns():
 def upgrade_charm():
     install()
     migrate_maas_dns()
+    emit_systemd_overrides_file()
     update_nrpe_config()
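
For reviewers unfamiliar with systemd drop-ins, the following is a minimal sketch of the kind of override content the commit message describes. It is not the charm's actual emit_systemd_overrides_file() implementation: the function name render_timeout_override, its parameters, and the mapping of -1 to systemd's "infinity" value are assumptions made for illustration only.

```python
def render_timeout_override(stop_timeout, start_timeout):
    """Return the text of a systemd drop-in that overrides service timeouts.

    Hypothetical helper for illustration; the charm's real template and
    function names may differ.
    """
    def fmt(seconds):
        # Assumption: a config value of -1 means "no timeout", which
        # systemd expresses as the literal value "infinity".
        return "infinity" if seconds == -1 else "{}s".format(seconds)

    return "\n".join([
        "[Service]",
        "TimeoutStartSec={}".format(fmt(start_timeout)),
        "TimeoutStopSec={}".format(fmt(stop_timeout)),
        "",  # trailing newline
    ])


if __name__ == "__main__":
    # 600 matches the new service_stop_timeout default from this change;
    # the start timeout of 180 is an arbitrary example value.
    print(render_timeout_override(stop_timeout=600, start_timeout=180))
```

Text like this would typically be written to a drop-in file under the unit's override directory for each of corosync and pacemaker, followed by a systemd daemon reload. Calling the emit step from upgrade_charm(), as the diff does, is what lets already-deployed units pick up the new default.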