* There were references to Mistral-specific constants mistakenly
placed into mistral-lib. This patch adds these constants into
the module where task language specifications are declared and
changes the corresponding references.
Change-Id: I8c7c6896f01894ac76cf9365abfdce17e7df7662
* The purpose of this patch is to improve encapsulation of task
execution state management. We already have the class Task
(engine.tasks.Task) that represents an engine task and it is
supposed to be responsible for everything related to managing
persistent state of the corresponding task execution object.
However, we break this encapsulation in many places and various
modules manipulate task execution state directly. This leads
to what is called "spaghetti code" because important things are
often spread out across the system and it's hard to maintain.
It also leads to lots of duplication. So this patch refactors
policies so that they manipulate a task execution through an
instance of Task which hides low level aspects.
Change-Id: Ie728bf950c4244db3fec0f3dadd5e195ad42081d
* When YAQL output data conversion is disabled there's still
an issue caused by the presence of non-JSON-compatible types within
a YAQL result. The internal Mistral code is already able to deal
with that (due to the previous changes) by checking that and
converting them to what's needed. However, JSON serialization
may still not work if it's done via the standard "json" library.
The library simply doesn't handle those non-standard types and
raises an exception. We have a sanitizing function that all YAQL
results go through, however, it doesn't make sense to do recursive
sanitizing for performance reasons. It does make sense to convert
data as late as possible to avoid redundant data manipulations. So
the sanitizing function handles only the root object in the object
graph. The solution for this problem is to use our own utility
function based on the "oslo_serialization.jsonutils" that is able
to deal with at least part of the mentioned types, specifically
FrozenDict and iterators. Generators are still a problem and this
new function takes care of that separately, assuming that any
generator is just a special iterator and hence represents a
collection, i.e. a list in JSON terms. It works for all the cases
we've encountered so far working with YAQL.
* Used the new function "utils.to_json_str()" everywhere for JSON
serialization, including the action "std.http".
* Added necessary unit tests.
Closes-Bug: #1869168
Depends-On: I1081a44a6f305eb1dfe68a5bad30110385130725
Change-Id: I9e73ea7cbba215c3e1d174b5189be27c640c4d42
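The late-conversion idea above can be sketched with the standard
library alone. This is only an illustration, not the actual
"utils.to_json_str()" (which the message says is based on
"oslo_serialization.jsonutils"); the function name below and the use of
a `default` hook are assumptions made for the example.

```python
import json
from collections.abc import Iterator, Mapping


def to_json_str_sketch(obj):
    """Serialize obj to JSON, converting non-standard types lazily.

    Hypothetical stand-in for a utility like utils.to_json_str().
    """

    def _default(value):
        # json calls this hook only for objects it cannot serialize
        # natively, so nested values are converted as late as possible.
        if isinstance(value, Mapping):
            # Read-only mappings (e.g. a frozen dict) become JSON objects.
            return dict(value)
        if isinstance(value, Iterator):
            # Generators are iterators; treat them as JSON lists.
            return list(value)
        if isinstance(value, (set, frozenset)):
            return list(value)
        raise TypeError("Not JSON serializable: %r" % type(value))

    return json.dumps(obj, default=_default)


if __name__ == "__main__":
    print(to_json_str_sketch({"items": (i * i for i in range(3))}))
```

Only the objects the encoder actually reaches are converted, which
matches the goal of avoiding a separate recursive sanitizing pass.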
This patch delivers the first working version of a distributed
scheduler implementation based on local and persistent job
queues. The idea is inspired by the parallel computing pattern
known as "Work stealing" although it doesn't fully repeat it
due to the nature of Mistral.
See https://en.wikipedia.org/wiki/Work_stealing for details.
Advantages of this scheduler implementation:
* It doesn't have job processing delays caused by DB polling
intervals when a cluster topology is stable. A job gets scheduled
in memory and also saved into the persistent storage for
reliability. A persistent job can be picked up only after a
configured allowed period of time so that it happens effectively
after a node responsible for local processing crashed.
* Low DB load. DB polling still exists but it's not a primary
scheduling mechanism now but rather a protection from node crash
situations. That means that a polling interval can now be made
large, like 30 seconds instead of 1-2 seconds. Less DB load
leads to fewer DB deadlocks between scheduler instances and fewer
retries on MySQL.
* Since DB load is now lower, scalability is better. A bigger
number of engines won't now lead to much bigger contention
because of the large DB polling intervals.
* Protection from having jobs forever hanging in processing state.
In the existing implementation, if a scheduler captured a job
for processing (set its "processing" flag to True) and then
crashed, the job would stay in processing state forever in the DB.
Instead of a boolean "processing" flag, the new implementation
uses a timestamp showing when a job was captured. That gives us
the opportunity to make such jobs eligible for recapturing and
further processing after a certain configured timeout.
TODO:
* More testing
* DB migration for the new scheduled jobs table
* Benchmarks and testing under load
* Standardize the scheduler interface and write an adapter for the
existing scheduler so that we could choose between scheduler
implementations. It's highly desired to make transition to the
new scheduler smooth in production: we always need to be able
to roll back to the existing scheduler.
Partial blueprint: mistral-redesign-scheduler
Partial blueprint: mistral-eliminate-scheduler-delays
Change-Id: If7d06b64ac14d01e80d31242e1640cb93f2aa6fe
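The eligibility rule described above (a delayed pickup of persistent
jobs plus a capture timestamp instead of a boolean flag) can be
sketched as follows. All names here (ScheduledJob, pickup_delay,
capture_timeout, is_eligible) are hypothetical and are not taken from
the patch itself.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ScheduledJob:
    # Hypothetical model of a persistent scheduled job.
    job_id: str
    execute_at: datetime
    captured_at: Optional[datetime] = None  # replaces a boolean flag


def is_eligible(job, now, pickup_delay, capture_timeout):
    """Return True if a poller may capture the persistent job."""
    if job.captured_at is not None:
        # Already captured: recapture only once the capture is stale,
        # i.e. the capturing scheduler has probably crashed.
        return now - job.captured_at >= capture_timeout
    # Not captured yet: give the node that scheduled the job in memory
    # a chance to process it locally before anyone else picks it up.
    return now >= job.execute_at + pickup_delay
```

With a timestamp, a job captured by a crashed scheduler becomes
eligible again after the timeout instead of hanging forever.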
Now, if the length parameter passed to the cut functions is negative,
there is no restriction on length.
Change-Id: I116d0bcb5663666ba4d280237a03d687de71f549
Closes-bug: #1768450
Signed-off-by: Vitalii Solodilov <mcdkr@yandex.ru>
This patch updates cut_list() to ensure that it does not return
more than the number of characters specified in the limit
parameter.
Change-Id: I9dbd061a5ba1976aae9d1acf1e501a3c98782972
Closes-Bug: 1761246
This patch updates the cut_dict() method to ensure that it only
returns strings up to the length specified.
Change-Id: Iecc1bd4f4c67606eed209a762e4d692691a37161
Closes-Bug: 1760134
* Moved `created_at` and `updated_at` to MistralModelBase to generate a
default value without microseconds
* Increased time delay in scheduler tests. This change doesn't affect
test duration
* Removed the PostgreSQL database generation commands because they
are already in the `tools/test-setup.sh` script
* Downloaded the python MySQL driver
Change-Id: I50c924ee94619c6622fc553f05a929607646f1fe
Signed-off-by: Vitalii Solodilov <mcdkr@yandex.ru>
* verified the length of task name while creating a workflow
* fixed bug when join task has a long name
* added the length limitation to the spec docs
Change-Id: I233e6b0f30d42b939757e9c1caf4965ecc375aee
Signed-off-by: Vitalii Solodilov <mcdkr@yandex.ru>
Using the package name to identify the top module name can cause issues
if the package is renamed for any reason. Hard coding the top module
is more resilient to these types of accidental failures
and provides an obvious mapping for the file names.
Change-Id: I9542ba8f01f57b1b3473185c42d95d99ad1e8435
While printing the output of an execution, the
size of output_on_error is not checked
against field_size_limit_kb.
This patch updates the function to include it
in the calculation of the total size limit.
Closes-Bug: #1690316
Change-Id: Icab63d147f580b8d72d8c02b3d9261fd64d3438a
The oslo.log (logging) configuration library provides standardized
configuration for all OpenStack projects. It also provides custom
formatters, handlers and support for context-specific logging (like
resource IDs etc).
I think it's better to use the common logging module.
Change-Id: I3baefd043a557417e8317a80d57cc5a4a48ccc08
* Before this patch, the "state_info" field of execution objects
wasn't truncated properly before saving into the DB, which led
to a DB error like:
DBDataError: (_mysql_exceptions.DataError) (1406, "Data too
long for column 'state_info' at row 1")
It was happening because the method utils.cut() didn't work
accurately enough when it was passed a dictionary.
This patch doesn't fix the utils.cut() method; it just reserves
space for a possible difference between the method result and
the expected length so that the truncated string
representation is always less than 65536 bytes (i.e. the size
of MySQL TEXT type). The total difference is not critical
anyway.
Closes-Bug: #1698125
Change-Id: I18449710ace6276224aaea564588c53a3e2c6adc
* Using method from_db_model() of REST resources where possible
which is more efficient than from_dict() that requires one more
object in memory (a dict)
* Minor style changes
Change-Id: Ie1f3137bee94328f2af676a0831e30c1cf212f47
* This is a preparation for optimizing the API layer. We need to
get rid of redundant data conversions like "db model -> dict ->
REST resource". We can go directly "db model -> REST resource".
To do that, db models need to have a method that returns
column values w/o creating a dictionary.
Change-Id: I89c78fdce256249286903c4e2c8bef2a5bf63af7
A new config item 'modules-support-region' is introduced for use by
cloud operators. Mistral decides whether to add the 'action_region'
param to openstack service action inputs according to that config.
Fixed an action definition for tempest tests.
TODO: Add release note.
Implements: blueprint mistral-multi-region-support
Change-Id: I0b582e9f81ab72cd05f4fae592c568f38dec6e00
* The previous versions of these methods were too specific for the
general utils module. Now they are more generic and therefore less
confusing.
Change-Id: Ifa8bbf0cc8a63bf1de1142dc8c52b1f8ad3958c5
* Validation algorithm for workflow and action input is now
semantically more generic and it doesn't make any changes
in the dict of input parameters.
Change-Id: Ic3e73722a2628228e81b2c3a24a714f4d309342f
* If we pass a big data structure into utils.cut(), like a dict
with thousands of keys, then this function is very inefficient
because it converts the whole data structure into a string first
and then retains only the first N symbols (100 by default).
This patch changes the function utils.cut() so that a string
representation is built more efficiently. In case of a list or
dict it is built item by item, taking the given maximum length
into account.
* Additional profiler decorators
Change-Id: I51ae743e7439e9b68220d47837b2da05ecafb6ef
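A minimal sketch of the item-by-item approach for lists, assuming a
hypothetical helper name (cut_list_sketch) and a "..." suffix for
truncated output; the real utils.cut() may differ in both respects.

```python
def cut_list_sketch(items, length=100):
    """Build a truncated string form of a list item by item.

    Rendering stops as soon as the limit is reached, so the remaining
    items are never converted to strings.
    """
    out = "["

    for idx, item in enumerate(items):
        if idx:
            out += ", "

        out += repr(item)

        if len(out) >= length:
            # The suffix is added on top of the limit in this sketch.
            return out[:length] + "..."

    return out + "]"


if __name__ == "__main__":
    print(cut_list_sketch(range(100000), length=10))
```

The cost is bounded by the limit rather than by the size of the input.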
* The 'retry_on_deadlock' decorator allows retrying a method if a DB
deadlock occurs as a result of some DB operation. Deadlocks
occur sometimes when we use named locks, at least when using
MySQL as a DB backend.
* Minor cosmetic changes
Partial-Bug: #1640379
Change-Id: I4fb342a3314e4d23a1f787e2cb6f1d70196663c8
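For illustration only, a decorator of this kind might look like the
sketch below. DeadlockError is a hypothetical stand-in for whatever
exception the DB layer raises on a deadlock, and the max_attempts
parameter is likewise an assumption, not the signature of the real
'retry_on_deadlock'.

```python
import functools


class DeadlockError(Exception):
    """Hypothetical stand-in for a DB deadlock exception."""


def retry_on_deadlock_sketch(max_attempts=3):
    """Retry the wrapped callable when a deadlock is reported."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    # Give up and re-raise after the last attempt.
                    if attempt == max_attempts:
                        raise

        return wrapper

    return decorator
```

A production version would typically also wait between attempts.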
1. As mentioned in [1], we should avoid using
six.iteritems to obtain iterators. We can
use dict.items instead, as it returns a lazy
view rather than a list in PY3. And dict.items/keys
is more readable.
2. In PY2, the performance impact of building a
list should be negligible, see the link [2].
[1] https://wiki.openstack.org/wiki/Python3
[2] http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html
Change-Id: Iff88f55dc51981ce502d7d3e67c467242980f20c
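A small example of the replacement, using a made-up dictionary; the
variable names are illustrative only.

```python
workflow_input = {"vm_name": "demo", "count": 2}

# Before: for key, value in six.iteritems(workflow_input): ...
# After: plain dict.items(), which works on both PY2 and PY3.
pairs = sorted("%s=%s" % (key, value) for key, value in workflow_input.items())

print(pairs)  # ['count=2', 'vm_name=demo']
```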
MySQL is rounding the microseconds to one second,
which leads to time inconsistency between what
is returned to the client and what is stored in the DB.
This patch changes the behaviour so that no microseconds
get generated.
Closes-bug: 1644881
Change-Id: I514c1d5154b3c658ec74c88b800d2a3ded1fdad9
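One way to produce such timestamps, shown as a sketch; the helper name
is hypothetical and the patch may generate the value differently.

```python
from datetime import datetime, timezone


def utc_now_no_microseconds():
    """Return the current UTC time truncated to whole seconds."""
    return datetime.now(timezone.utc).replace(microsecond=0)


if __name__ == "__main__":
    print(utc_now_no_microseconds().isoformat())
```

Because the value has no sub-second part, the DB has nothing to round,
so the stored value and the returned value stay identical.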
This new function allows a user to get a list of tasks matching a certain
filter, for example only tasks in state ERROR from the current execution.
It is very useful for debugging, but also very expensive, since it might
require multiple DB queries. In addition, it is important to remember that
this function can return a lot of data, so it should be used carefully.
Change-Id: I452175bfb60636ed8de9b2b1ceab615359765964
Implements: blueprint yaql-tasks-function
Implements: blueprint yaql-errors-function
As of now, UUID is being generated using either uuid.uuid4()
or uuidutils.generate_uuid(). In order to maintain consistency,
we propose to use uuidutils.generate_uuid() from oslo_utils.
Change-Id: I620cb1f396ce011b9846ff2dad2c9811bc5d0652
Closes-Bug: #1082248
Allows using Jinja instead of or along with YAQL for expression
evaluation.
* Improved error reporting on API endpoints. Previously, Mistral API
tended to mute important logs related to errors during YAML parsing
or expression evaluation. The messages were shown in the HTTP
response, but would not appear in logs.
* Renamed yaql_utils to evaluation_utils and added a few more tests to
ensure evaluation functions can be safely reused between Jinja and
YAQL evaluators.
* Updated action_v2 example to reflect similarities between YAQL and
Jinja syntax.
Change-Id: Ie3cf8b4a6c068948d6dc051b12a02474689cf8a8
Implements: blueprint mistral-jinga-templates
Use dict update instead of recursive merge when published
task output is added to the workflow context.
Change-Id: I44f99c6d08c6647e4240367be534e9dc1747b857
Closes-Bug: 1616090
Signed-off-by: Istvan Imre <istvan.imre@nokia.com>
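The difference between the two strategies can be seen in this small
sketch. The merge helper and the sample data are made up for the
example and do not reproduce Mistral's own utility.

```python
import copy


def recursive_merge(left, right):
    """Merge right into left, descending into nested dicts."""
    for key, value in right.items():
        if isinstance(left.get(key), dict) and isinstance(value, dict):
            recursive_merge(left[key], value)
        else:
            left[key] = value
    return left


context = {"result": {"a": 1, "b": 2}}
published = {"result": {"a": 10}}

# Recursive merge keeps the stale nested key "b".
merged = recursive_merge(copy.deepcopy(context), published)

# Plain update replaces the published variable as a whole.
updated = dict(context)
updated.update(published)

print(merged)   # {'result': {'a': 10, 'b': 2}}
print(updated)  # {'result': {'a': 10}}
```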
* Introduced class hierarchies Task and Action used by Mistral engine.
Note: Action here is different from the executor Action and rather
represents actions of different types: regular python action, ad-hoc
action and workflow action (since for a task, action and workflow
are polymorphic)
* Refactored task_handler.py and action_handler.py with Task and Action
hierarchies
* Rebuilt a chain call so that the entire action processing would look
like a chain of calls Action -> Task -> Workflow where each level
knows only about the next level and can influence it (e.g. if adhoc
action has failed due to YAQL error in 'output' transformer action
itself fails its task)
* Refactored policies according to new object model
* Fixed some of the tests to match the idea of having two types of
exceptions, MistralException and MistralError, where the latter
is considered either a harsh environmental problem or a logical
issue in the system itself so that it must not be handled anywhere
in the code
TODO(in subsequent patches):
* Refactor WithItemsTask w/o using with_items.py
* Remove DB transaction in Scheduler when making a delayed call,
helper policy methods like 'continue_workflow'
* Refactor policies test so that workflow definitions live right
in test methods
* Refactor workflow_handler with Workflow abstraction
* Get rid of RunExistingTask workflow command, it should be just
one command with various properties
* Refactor resume and rerun with Task abstraction (same way as
other methods, e.g. on_action_complete())
* Add error handling to all required places such as
task_handler.continue_task()
* More tests for error handling
P.S. This patch is very big but it was nearly impossible to split
it into multiple smaller patches because of how entangled everything
was in Mistral Engine.
Partially implements: blueprint mistral-engine-error-handling
Implements: blueprint mistral-action-result-processing-pipeline
Implements: blueprint mistral-refactor-task-handler
Closes-Bug: #1568909
Change-Id: I0668e695c60dde31efc690563fc891387d44d6ba
Improved code style, fixed all H405 (Multi line docstring
summary not separated with an empty line) errors.
Change-Id: I6639a2e1a9dc5d3802cb1bda05c5bf9b302bc82f
* SSH action uses private key names instead of
private keys themselves. All private keys now
should be in <user-home>/.ssh/, e.g. the key
'my_key.pem' is looked up at <user-home>/.ssh/my_key.pem
* Fixed functional tests
* Removed sudo while running tests
Closes-Bug: #1507600
Depends-On: Idc4340cda80d02f4ee88c91600b72d0f914c4084
Change-Id: I503d78ff541183ce850476229842fa6f20061a00
* Added 2 tests: for SSHAction and SSHProxiedAction
* 2 VMs are created for test purposes
* SSHProxiedAction test checks that ssh command was
executed on certain VM (which is in guest network)
Closes-Bug: #1505175
Change-Id: I664885b5743c0915f5c42aeecdd3c6c538894453
Add query params for workflow list REST API:
* limit: return a maximum number of items at a time. The default is
None, in which case the query result includes all the resource
items, which is backward compatible.
* marker: the ID of the last item in the previous list.
* sort_keys: columns to sort results by. Default: created_at.
* sort_dirs: directions to sort corresponding to sort_keys, "asc" or
"desc" can be chosen. Default: asc. The length of sort_dirs can
be equal to or less than that of sort_keys.
Change-Id: Ie73d4457193999555ce9886d4de1297b4d0bc51d
Partially-Implements: blueprint mistral-query-enhancement
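How the four parameters interact can be sketched in memory as below.
This is not the DB-backed implementation; the function name is
hypothetical, and padding missing sort_dirs entries with "asc" is an
assumption based on the stated default.

```python
def paginate_sketch(items, limit=None, marker=None,
                    sort_keys=("created_at",), sort_dirs=("asc",)):
    """Sort a list of dicts and return one page after the marker."""
    if len(sort_dirs) > len(sort_keys):
        raise ValueError("sort_dirs must not be longer than sort_keys")

    # Assumption: directions not given fall back to the default "asc".
    dirs = list(sort_dirs) + ["asc"] * (len(sort_keys) - len(sort_dirs))

    ordered = list(items)

    # Stable sorts applied from the least to the most significant key.
    for key, direction in reversed(list(zip(sort_keys, dirs))):
        ordered.sort(key=lambda item: item[key],
                     reverse=(direction == "desc"))

    if marker is not None:
        # Continue right after the last item of the previous page.
        ids = [item["id"] for item in ordered]
        ordered = ordered[ids.index(marker) + 1:]

    return ordered if limit is None else ordered[:limit]
```

A client would pass the ID of the last item it received as the marker
of the next request.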
Tooz is a Python library that provides a coordination API. Its primary
goal is to handle groups and membership of these groups in distributed
systems.
This patch adds a coordination util to Mistral, which makes use of
the Tooz library.
Change-Id: Icbf3086d01649af813727f0972d6d5b0631d6afb
Partially-Implements: blueprint mistral-service-api
* new engine method (symmetrically to start_workflow) -
start_action;
* possibility to run action without saving the result
to the DB;
* adjusted model_base: updated_at indeed can be None
in a number of cases;
* improved engine.utils.validate_input for checking
action_input (also adhoc action input); for this,
a new util method for getting an input dict from
an input string is introduced;
* executor client can call rpc method synchronously
for immediately returning result from action and
without callback to engine;
* fixed uploading actions from workbook;
* improved action_handler;
* improved inspect_utils for input validation needs.
TODO (next commit):
- Implementing run action API side
Partially implements blueprint mistral-run-individual-action
Change-Id: I2fc1f3bb4382b72d6de7d7508c82d64e64ca656c
With the introduction of action execution, for supported action classes,
the action_context passed as input args to the action class during action
execution should include the action execution ID.
Change-Id: I49d88c57aceef92f548f3680980040f78a4cdfb4
Closes-Bug: #1444155
If the workflow input default value is defined in the spec, the input
param does not need to be provided for workflow execution.
Partially Implements: blueprint mistral-default-input-values
Change-Id: I03f8032a621d49469d14600e6344524ebdfdbe37
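The rule above can be sketched as follows, assuming a hypothetical
representation where declared inputs map a name either to a default
value or to a "required" marker; this is not the actual spec format.

```python
# Hypothetical marker for inputs declared without a default value.
REQUIRED = object()


def resolve_input(declared, provided):
    """Combine provided input with defaults from the declaration."""
    resolved = {}
    missing = []

    for name, default in declared.items():
        if name in provided:
            resolved[name] = provided[name]
        elif default is REQUIRED:
            missing.append(name)
        else:
            # The default from the spec is used when nothing is passed.
            resolved[name] = default

    if missing:
        raise ValueError("Missing required input: %s" % ", ".join(missing))

    return resolved
```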
* Creating tools for catching concurrency issues
* Adding a test to learn about SQLite transactions
Change-Id: I3c34f3d5baa06209bfd4beb4ce3c493c51bf6d27
In part 1 of the patch series (https://review.openstack.org/#/c/155171/),
two new taskspec definitions were added: DirectWorkflowTaskSpec and
ReverseWorkflowTaskSpec, corresponding to the 'direct' and 'reverse'
workflow types. This patch retrieves the taskspecs in a more generic
way, rather than hardcoding the workflow type.
Partially Implements: blueprint mistral-specifications-design-improvements
Change-Id: I69b71eeef9fb4d55865b48a017acfdcebbc4f196
Fixes a bug where a second task cannot update a published variable.
Modifies the merge_dicts utility to have an overwrite flag. Modifies the
evaluate_task_outbound_context method to allow overwrite when merging
dicts.
Closes bug: 1415668
Change-Id: I579f932d422228233ad2d22f21999b48a9b92b9f
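A sketch of a merge helper with an overwrite flag, written for
illustration; the name, the default value of the flag and the exact
semantics are assumptions and may not match the real merge_dicts
utility.

```python
def merge_dicts_sketch(left, right, overwrite=True):
    """Merge right into left in place and return left.

    Nested dicts are merged recursively. For other conflicting keys
    the value from right wins only when overwrite is True.
    """
    for key, value in right.items():
        if key not in left:
            left[key] = value
        elif isinstance(left[key], dict) and isinstance(value, dict):
            merge_dicts_sketch(left[key], value, overwrite)
        elif overwrite:
            left[key] = value

    return left
```

With overwrite enabled, a second task publishing the same variable
replaces the value published by the first one.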
Ensure better defaults so that Mistral works adequately with less configuration.
* Enable INFO output by default so that the console shows workflow executions, etc.
Our current INFO level is not too verbose and is the most adequate default.
* Supply a working default for sqlite; this way Mistral at least starts without a config file.
Change-Id: If365516c2cf1c67c5c0396539a6c85afc12c2c47