
..
  Copyright 2017 Canonical LTD

  This work is licensed under a Creative Commons Attribution 3.0
  Unported License.
  http://creativecommons.org/licenses/by/3.0/legalcode

..
  This template should be in ReSTructured text. Please do not delete
  any of the sections in this template.  If you have nothing to say
  for a whole section, just write: "None". For help with syntax, see
  http://sphinx-doc.org/rest.html To test out your formatting, see
  http://www.tele3.cz/jbar/rest/rest.html

=================================
Service Restart Control In Charms
=================================

OpenStack charms continuously respond to hook events from their peers and
related applications, which frequently result in configuration
changes and subsequent service restarts. This is fine until these
applications are deployed at large scale, where having these services restart
simultaneously can cause (a) service outages and (b) excessive load on
external applications, e.g. databases or rabbitmq servers. To
mitigate these effects we would like to introduce the ability for charms
to apply controllable patterns to how they restart their services.

Problem Description
===================

An example scenario where this sort of behaviour becomes a problem is where
we have a large number, say 1000, of nova-compute units all connected to the
same rabbitmq server. If we make a config change, e.g. enable debug logging
on that application, this will result in a restart of all nova-* services on
every compute host at the same time. This will in turn generate a large spike
of load on the rabbit server and cause all compute operations to block
until these services are back up. It could also have other
knock-on effects such as impacting other applications that depend on
rabbitmq.

There are a number of ways that we could approach solving this problem but
for this proposal we choose simplicity by attempting to use all information
already available to an application unit combined with some user config to
allow units to decide how best to perform these actions.

Every unit of an application already has access to some information that
describes itself with respect to its environment e.g. every unit has a unique
id and some applications have a peer relation that gives them information
about their neighbours. Using this information coupled with some extra
config options on the charm to vary timing we could provide the operator
the ability to control service restarts across units using nothing more
than basic mathematics and no juju api calls.

For example, let's say an application unit knows it has id 215 and the user
has provided two options via config: a modulo value of 2 and an offset of
10. We could then do the following:

.. code:: python

    time.sleep((215 % 2) * 10)

which, when applied to all units, would result in roughly 50% of the cluster
(the units with odd ids) restarting its services 10 seconds after the rest.
This should alleviate some of the pressure resulting from cluster-wide
synchronous restarts, ensure that part of the cluster is always responsive
and, with less contention on shared services, allow restarts to complete
more quickly.
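
The effect of this calculation across a whole application can be sketched as
follows. This is only an illustration: the function name ``restart_delay`` and
the sample unit ids 0 to 9 are assumptions and not part of any existing
charm-helpers API.

.. code:: python

    from collections import Counter


    def restart_delay(unit_id, modulo, offset):
        """Return the number of seconds a unit waits before restarting."""
        # A modulo of 1 always yields 0, i.e. no delay (current behaviour).
        return (unit_id % modulo) * offset


    # Group ten hypothetical units (ids 0..9) by their computed delay.
    delays = Counter(restart_delay(unit_id, modulo=2, offset=10)
                     for unit_id in range(10))
    print(dict(delays))  # {0: 5, 10: 5}

With a modulo of 3 and the same offset, the units would instead fall into
three groups delayed by 0, 10 and 20 seconds.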

As mentioned above, we will add two new config options to any charm for
which this logic is supported:

* service-restart-offset (defaults to 10)
* service-restart-modulo (defaults to 1 so that the default behaviour is the
  same as before)

The restart logic will be skipped for any charm that does not implement these
options.
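
A minimal sketch of how the two options might be read is shown below. It
assumes the charm config is available as a plain dictionary. The function name
``delay_from_config`` is hypothetical, and treating a modulo below 1 as "no
delay" is an assumption of this sketch rather than something this spec
defines.

.. code:: python

    def delay_from_config(config, unit_id):
        """Return the restart delay in seconds, or 0 if it does not apply."""
        modulo = config.get('service-restart-modulo')
        offset = config.get('service-restart-offset')
        if modulo is None or offset is None:
            # The charm does not implement the options: skip the delay logic.
            return 0
        if modulo < 1:
            # Assumption: an invalid modulo disables the delay instead of
            # raising ZeroDivisionError.
            return 0
        return (unit_id % modulo) * offset


    print(delay_from_config({}, 215))  # 0
    print(delay_from_config({'service-restart-modulo': 2,
                             'service-restart-offset': 10}, 215))  # 10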

Over time some units may be deleted from and added back to the cluster,
resulting in non-contiguous unit ids. For applications deployed at
large scale this is unlikely to have a significant impact, since subsequent
adds and deletes will tend to cancel each other out. It could nevertheless be
a problem, so we will check for the existence of a peer relation on the
application we are running and, if one exists, use the info in that relation
to normalise unit ids prior to calculating delays.
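
One possible way to normalise unit ids is to use each unit's position in the
sorted list of all known units, as sketched below. The function name
``normalised_unit_id`` and the sample unit names are assumptions made for
illustration. How the list of peers is read from the peer relation is left
out.

.. code:: python

    def normalised_unit_id(local_unit, peer_units):
        """Return the position of local_unit among all units, ordered by id."""
        def unit_number(unit_name):
            # Juju unit names have the form "<application>/<number>".
            return int(unit_name.split('/')[1])

        all_units = sorted(set(peer_units) | {local_unit}, key=unit_number)
        return all_units.index(local_unit)


    # Hypothetical peers with gaps left by removed units.
    peers = ['nova-compute/3', 'nova-compute/7', 'nova-compute/215']
    print(normalised_unit_id('nova-compute/8', peers))  # 2

The normalised id, rather than the raw unit number, would then be fed into
the modulo calculation so that the delay groups stay evenly sized.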

Lastly, we must consider how to behave when the charm is being used to upgrade
OpenStack services, whether directly using config ("big bang") or using
actions defined on a charm. For the case where all services are upgraded at
once, we will leave it to the operator to set/unset the offset parameters. For
the case where actions are being used, and likely only a subset of units are
being upgraded at once, we will ignore the control settings, i.e. delays will
not be used.

Proposed Change
===============

To implement this change we will extend the restart_on_change() decorator
implemented across the OpenStack charms so that when services are
stopped/started or restarted they will include a time.sleep(delay), where
delay is calculated from the unit id combined with two new config options:
service-restart-offset and service-restart-modulo. This calculation will be
done in a new function that will be implemented in contrib.openstack, the
output of which will be passed into the restart_on_change() decorator.
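
The general shape of a delay-aware decorator is sketched below. This is not
the restart_on_change() implementation: the name ``restart_with_delay`` and
its parameters are hypothetical, and the detection of changed config files
that the real decorator performs is omitted. The ``sleep`` parameter exists
only so that the example can run without actually waiting.

.. code:: python

    import functools
    import time


    def restart_with_delay(restart_services, delay, sleep=time.sleep):
        """Decorate a hook so services restart `delay` seconds after it runs."""
        def decorator(hook):
            @functools.wraps(hook)
            def wrapper(*args, **kwargs):
                result = hook(*args, **kwargs)
                if delay:
                    # Stagger this unit's restart relative to its peers.
                    sleep(delay)
                restart_services()
                return result
            return wrapper
        return decorator


    # Record the order of events instead of sleeping or restarting anything.
    calls = []


    @restart_with_delay(lambda: calls.append('restart'), delay=10,
                        sleep=lambda secs: calls.append('sleep %d' % secs))
    def config_changed():
        calls.append('hook')


    config_changed()
    print(calls)  # ['hook', 'sleep 10', 'restart']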

Since a decorator is used we do not need to worry about multiple restarts of
the same service. We do, however, need to consider how to apply offsets when
stop/start and restarts are performed manually, as is the case in the
action-managed upgrades handler.

Alternatives
------------

None

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  hopem

Gerrit Topic
------------

Use Gerrit topic "controlled-service-restarts" for all patches related to
this spec.

.. code-block:: bash

    git-review -t controlled-service-restarts

Work Items
----------

* implement changes to charmhelpers
* sync into openstack charms and add new config opts

Repositories
------------

None

Documentation
-------------

These new settings will be properly documented in the charm config.yaml as
well as in the charm deployment guide.

Security
--------

None

Testing
-------

Unit tests will be provided in charm-helpers and functional tests will be
updated to include config that enables this feature. Scale testing to prove
effectiveness and determine optimal defaults will also be required.

Dependencies
============

None