mistral/mistral
Renat Akhmerov 7b71f096b9 New experimental scheduler: the first working version
This patch delivers the first working version of a distributed
scheduler implementation based on local and persistent job
queues. The idea is inspired by the parallel computing pattern
known as "Work stealing", although it doesn't fully follow it
due to the nature of Mistral.
See https://en.wikipedia.org/wiki/Work_stealing for details.

Advantages of this scheduler implementation:

* It doesn't have the job processing delays caused by DB polling
  intervals when the cluster topology is stable. A job gets
  scheduled in memory and also saved into the persistent storage
  for reliability. A persistent job can be picked up only after a
  configured allowed period of time, so in practice that happens
  only after the node responsible for local processing has crashed.
* Low DB load. DB polling still exists, but it is no longer the
  primary scheduling mechanism; it is rather a protection against
  node crashes. That means the polling interval can now be made
  large, e.g. 30 seconds instead of 1-2 seconds. Less DB load
  leads to fewer DB deadlocks between scheduler instances and
  fewer retries on MySQL.
* Lower DB load gives better scalability. A larger number of
  engines no longer leads to much higher contention because DB
  polling intervals are now large.
* Protection from having jobs forever hanging in processing state.
  In the existing implementation, if a scheduler captured a job
  for processing (set its "processing" flag to True) and then
  crashed, the job stays in processing state forever in the DB.
  Instead of a boolean "processing" flag, the new implementation
  uses a timestamp showing when a job was captured. That gives us
  the opportunity to make such jobs eligible for recapturing and
  further processing after a certain configured timeout.
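
The capture rules described in the list above can be sketched roughly
as follows. This is only an illustration, not the code from this
patch: the Job class, the field names (execute_at, captured_at) and
the parameters pickup_delay and capture_timeout are all hypothetical
stand-ins for the scheduled jobs table and the configured timeouts.

```python
import datetime as dt
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Job:
    """Hypothetical stand-in for a row in the scheduled jobs table."""
    id: int
    execute_at: dt.datetime
    # A timestamp instead of a boolean "processing" flag (assumed name).
    captured_at: Optional[dt.datetime] = None


def is_eligible(job: Job, now: dt.datetime,
                pickup_delay: dt.timedelta,
                capture_timeout: dt.timedelta) -> bool:
    """Return True if a DB poller may capture this persistent job."""
    if job.captured_at is not None:
        # Already captured: recapture only once the capture has timed
        # out, which suggests the capturing scheduler has crashed.
        return now - job.captured_at >= capture_timeout
    # Never captured: wait for the allowed period so that the node that
    # scheduled the job in memory can process it locally first.
    return now >= job.execute_at + pickup_delay


def capture_eligible(jobs: List[Job], now: dt.datetime,
                     pickup_delay: dt.timedelta,
                     capture_timeout: dt.timedelta) -> List[Job]:
    """Stamp every eligible job with the capture time and return them."""
    captured = []
    for job in jobs:
        if is_eligible(job, now, pickup_delay, capture_timeout):
            job.captured_at = now
            captured.append(job)
    return captured
```

In a real deployment the capture step would have to be an atomic DB
update so that two scheduler instances cannot capture the same job;
the in-memory loop above ignores that on purpose.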

TODO:

* More testing
* DB migration for the new scheduled jobs table
* Benchmarks and testing under load
* Standardize the scheduler interface and write an adapter for the
  existing scheduler so that we could choose between scheduler
  implementations. It's highly desirable to make the transition to
  the new scheduler smooth in production: we always need to be
  able to roll back to the existing scheduler.
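
One possible shape for the last TODO item is sketched below. Nothing
here is taken from the patch: the Scheduler base class, the
LegacySchedulerAdapter, the get_scheduler() factory, the "legacy" and
"local" names and the job dictionary keys are all hypothetical, and
the legacy scheduler is represented by a plain callable.

```python
import abc
from typing import Any, Callable, Dict, List


class Scheduler(abc.ABC):
    """Hypothetical common interface for scheduler implementations."""

    @abc.abstractmethod
    def schedule(self, job: Dict[str, Any]) -> None:
        """Accept a job for delayed execution."""


class LegacySchedulerAdapter(Scheduler):
    """Adapts an existing function-style scheduler to the interface."""

    def __init__(self, legacy_call: Callable[..., None]) -> None:
        # legacy_call stands in for the existing scheduler entry point.
        self._legacy_call = legacy_call

    def schedule(self, job: Dict[str, Any]) -> None:
        # Translate the job dictionary into the legacy call's arguments.
        self._legacy_call(job["func_name"], job["run_after"],
                          **job.get("func_args", {}))


class LocalQueueScheduler(Scheduler):
    """Toy stand-in for the new scheduler: keeps jobs in a local list."""

    def __init__(self) -> None:
        self.jobs: List[Dict[str, Any]] = []

    def schedule(self, job: Dict[str, Any]) -> None:
        self.jobs.append(job)


def get_scheduler(name: str,
                  legacy_call: Callable[..., None]) -> Scheduler:
    """Pick an implementation by name, e.g. from a config option."""
    if name == "legacy":
        return LegacySchedulerAdapter(legacy_call)
    if name == "local":
        return LocalQueueScheduler()
    raise ValueError("Unknown scheduler type: %s" % name)
```

With a selector like this, rolling back to the existing scheduler
would be a matter of changing one setting rather than changing code.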

Partial blueprint: mistral-redesign-scheduler
Partial blueprint: mistral-eliminate-scheduler-delays

Change-Id: If7d06b64ac14d01e80d31242e1640cb93f2aa6fe
2018-08-14 14:02:19 +07:00
actions Merge "Remove hardcoded usage of v2 authentication in Barbican actions" 2018-08-03 11:03:26 +00:00
api Merge "Add namespace parameter to Workbook API doc" 2018-08-01 06:45:16 +00:00
auth expose the user info url as a configuration 2018-08-02 15:57:41 +03:00
cmd Enable mutable config in mistral 2018-07-25 03:40:34 +05:30
db New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
engine New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
event_engine Amend the spelling error of a word 2018-06-18 15:47:30 +08:00
executors Migrate mistral to using the serialization code in mistral-lib 2018-07-23 12:55:41 +01:00
expressions Add YAQL engine options 2018-06-01 17:06:57 +07:00
ext Use the Mistral syntax highlighting on the dsl v2 page 2017-04-06 10:20:34 +01:00
hacking Fix the pep8 commands failed 2017-07-27 22:15:12 +08:00
lang Merge "Remove extra a specification validation" 2018-07-31 08:27:18 +00:00
notifiers Implement notification of execution events 2018-02-24 07:25:55 +00:00
policies Add a policy to control the right to publish resources 2018-07-05 11:46:52 +02:00
resources Fix for YaqlEvaluationException in std.create_instance workflow. 2016-07-12 00:29:23 -04:00
rpc Migrate mistral to using the serialization code in mistral-lib 2018-07-23 12:55:41 +01:00
scheduler New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
service Optimize API layer: using from_db_model() instead of from_dict() 2017-05-22 12:03:17 +07:00
services Remove extra a specification validation 2018-07-30 11:55:35 +04:00
tests New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
utils New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
workflow Allow engine commands as task name 2018-07-19 14:23:18 +00:00
__init__.py Remove eventlet monkey patch in mistral __init__ 2015-02-20 07:49:56 +00:00
_i18n.py Update and optimize documentation links 2017-07-19 17:10:49 +08:00
config.py New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
context.py New experimental scheduler: the first working version 2018-08-14 14:02:19 +07:00
exceptions.py Create Base class for Mistral Exceptions and Errors 2018-05-31 08:47:04 +00:00
messaging.py [Event-engine] Make listener pool name configurable 2017-10-13 10:47:34 +03:00
version.py Removed unnecessary utf-8 encoding 2017-01-11 02:58:04 +00:00