Tools and automation to achieve Disaster Recovery of OpenStack Cloud Platforms


Freezer Disaster Recovery

freezer-dr provides high availability for OpenStack compute nodes. It monitors all compute nodes running in a cloud deployment; if one of them fails, freezer-dr fences the failed node, then tries to evacuate all instances that were running on it, and finally notifies both the users who have workloads/instances on that node and the cloud administrators.

freezer-dr has a pluggable architecture so it can be used with:

  1. Any monitoring system to monitor the compute nodes (currently only the native OpenStack service status is supported)
  2. Any fencing driver (currently IPMI, libvirt, ...)
  3. Any evacuation driver (currently the Nova evacuate API call; migration may be supported in the future)
  4. Any notification system (currently email-based notifications, ...)

All of this works just by adding a simple plugin and adjusting the configuration file to use it, or in the future a combination of plugins if required.
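
For illustration, a minimal monitoring plugin could look like the sketch below. The base class name and method signatures here are assumptions made for this example, not freezer-dr's actual plugin API; consult the freezer_dr source tree for the real driver interfaces::

    # Illustrative sketch only: the interface below is an assumption,
    # not the real freezer-dr plugin contract.
    from abc import ABC, abstractmethod


    class MonitoringDriver(ABC):
        """Interface a hypothetical monitoring plugin would implement."""

        def __init__(self, conf):
            self.conf = conf  # options read from the freezer-dr config file

        @abstractmethod
        def get_failed_nodes(self):
            """Return the hostnames this monitor considers to be down."""


    class NovaServiceMonitor(MonitoringDriver):
        """Example plugin: trust the native OpenStack service status."""

        def get_failed_nodes(self):
            # A real driver would call the Nova API here, e.g. with
            # python-novaclient: nova.services.list(binary='nova-compute'),
            # and collect the hosts whose service state is 'down'.
            services = self._list_compute_services()
            return [svc.host for svc in services if svc.state == 'down']

        def _list_compute_services(self):
            # Hypothetical helper; wire this up to an authenticated
            # novaclient instance in a real deployment.
            raise NotImplementedError

The configuration file then only has to name the driver to load; no other code changes are needed.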

freezer-dr should run in the control plane, although the architecture supports other scenarios as well. To make freezer-dr itself highly available, run it in active/passive mode.
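
The README does not prescribe how to implement the active/passive setup. One common OpenStack pattern, shown here purely as an assumption for illustration, is a distributed lock via the tooz coordination library; a real deployment may instead rely on Pacemaker, systemd, or a cron job on a single control node::

    # Sketch of active/passive behaviour with a tooz distributed lock:
    # only the instance that wins the lock does any work, the passive
    # stand-by simply returns and retries on its next run.
    from tooz import coordination


    def run_if_active(do_work, member_id=b'freezer-dr-1',
                      backend_url='etcd3+http://127.0.0.1:2379'):
        coordinator = coordination.get_coordinator(backend_url, member_id)
        coordinator.start()
        lock = coordinator.get_lock(b'freezer-dr-leader')
        try:
            if lock.acquire(blocking=False):
                try:
                    do_work()
                finally:
                    lock.release()
        finally:
            coordinator.stop()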

How it works

Starting freezer-dr:

  1. The monitoring manager loads the required monitoring driver according to the configuration
  2. freezer-dr queries the monitoring system to check whether it considers any compute nodes to be down
  3. If not, freezer-dr exits, reporting that there are no failed nodes
  4. If so, freezer-dr calls the fencing manager to fence the failed compute node
  5. The fencing manager loads the correct fencer according to the configuration
  6. Once the compute node is fenced and powered off, the evacuation process starts
  7. freezer-dr loads the correct evacuation driver
  8. freezer-dr evacuates all instances to other compute nodes
  9. Once the evacuation process has completed, freezer-dr calls the notification manager
  10. The notification manager loads the correct driver based on the configuration
  11. freezer-dr starts the notification process
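
Put together, the steps above amount to a short pipeline. The following is a minimal sketch: the monitor, fencer and notifier objects stand in for the drivers freezer-dr loads from its configuration (their method names are assumptions for this example), while the evacuation step is shown with python-novaclient, whose server manager does expose list() and evacuate()::

    # Minimal sketch of the control flow described above.
    def run_cycle(monitor, fencer, notifier, nova):
        failed_nodes = monitor.get_failed_nodes()            # steps 1-2
        if not failed_nodes:
            print('No failed nodes')                         # step 3
            return

        for host in failed_nodes:
            fencer.fence(host)                               # steps 4-6

            # Steps 7-8: evacuate everything that was running on the
            # host; leaving the target unset lets the Nova scheduler
            # pick a destination for each instance.
            servers = nova.servers.list(
                search_opts={'host': host, 'all_tenants': 1})
            for server in servers:
                nova.servers.evacuate(server)

            notifier.notify(host, servers)                   # steps 9-11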