After this patch, users can update the logging format to include root_execution_id in logs, which helps to find and debug logs related to a specific workflow execution.
- Logs about creation and status changes of Mistral entities (execution,
task, action execution, etc.) are changed to the INFO log level.
- Users can update logging_context_format_string to include root_execution_id in logs.
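As a rough illustration of what a format string carrying root_execution_id gives you, here is a minimal sketch that uses only the standard "logging" module. It is not Mistral's or oslo.log's actual mechanism: the filter class, the logger name and the format string below are assumptions made for the example.

```python
import io
import logging


class RootExecutionFilter(logging.Filter):
    """Hypothetical helper: attach root_execution_id to every record."""

    def __init__(self, root_execution_id):
        super().__init__()
        self.root_execution_id = root_execution_id

    def filter(self, record):
        # Make the attribute available to the format string.
        record.root_execution_id = self.root_execution_id
        return True


def build_logger(stream, root_execution_id):
    logger = logging.getLogger("demo.root_execution")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False

    handler = logging.StreamHandler(stream)
    # Assumed format; a real deployment would put a similar placeholder
    # into logging_context_format_string.
    handler.setFormatter(logging.Formatter(
        "%(levelname)s [root_execution_id=%(root_execution_id)s] %(message)s"
    ))
    handler.addFilter(RootExecutionFilter(root_execution_id))
    logger.addHandler(handler)
    return logger


stream = io.StringIO()
log = build_logger(stream, "6c07a453-c5e1-4bbe-97ed-3cb77b4f55ff")
log.info("Task execution created")
output = stream.getvalue()
```

With such a format, all log lines of one workflow tree can be found by searching for a single id.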
Implements: blueprint improve-mistral-loggers
Change-Id: I54fe058e5451abba6ea7f69d03d498d78a90993e
Adds the ability to hide sensitive data from HTTP action logs, such as:
* Request headers
* Request body
* Response body
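A minimal sketch of the idea, assuming one boolean switch per hidden part. The function name, the flag names and the "<hidden>" placeholder are hypothetical and are not taken from the Mistral code.

```python
MASK = "<hidden>"  # hypothetical placeholder, not taken from Mistral


def loggable_http_parts(request_headers, request_body, response_body,
                        hide_request_headers=False,
                        hide_request_body=False,
                        hide_response_body=False):
    """Return a dict that is safe to pass to a logger."""
    return {
        "request_headers": (
            MASK if hide_request_headers else dict(request_headers)
        ),
        "request_body": MASK if hide_request_body else request_body,
        "response_body": MASK if hide_response_body else response_body,
    }


parts = loggable_http_parts(
    {"Authorization": "Bearer secret-token"},
    '{"password": "s3cret"}',
    '{"status": "ok"}',
    hide_request_headers=True,
    hide_request_body=True,
)
```

Only the logged representation is masked here; the values sent over the wire stay untouched.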
Change-Id: I6d1b1844898343b8fa30f704761096e3d2936c4d
Implements: blueprint mistral-hide-sensitive-data-from-http-actions-logs
Signed-off-by: Oleg Ovcharuk <vgvoleg@gmail.com>
* This patch refactors Mistral with the action provider concept
that is responsible for delivering actions to the system. So
it takes all the burden of managing action definitions w/o
having to spread that across multiple subsystems like Engine
and API and w/o having to assume that action definitions are
always stored in DB.
* Added LegacyActionProvider that represents the old way of
delivering action definitions to the system. It pretty much just
analyses what entries are configured in the entry point
"mistral.actions" in setup.cfg and builds a collection of
corresponding Python action classes in memory accessible by names.
* The module mistral/services/actions.py is now renamed to
adhoc_actions.py because it's effectively responsible only for
ad-hoc actions (those defined in YAML).
* Added the new entry point in setup.cfg "mistral.action.providers"
to register action provider classes.
* Added the module mistral/services/actions.py that will be a facade
for action providers. Engine and other subsystems will need to
work with it.
* Other small code changes.
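The provider/facade split described above can be sketched roughly as follows. All class and method names here are hypothetical and only illustrate the concept; the real interfaces live in the Mistral code base and are loaded through the "mistral.action.providers" entry point.

```python
import abc


class ActionProvider(abc.ABC):
    """Hypothetical interface: something that can deliver actions by name."""

    @abc.abstractmethod
    def find(self, name):
        """Return an action for the given name or None."""


class InMemoryActionProvider(ActionProvider):
    """Keeps a name -> action class mapping in memory."""

    def __init__(self, actions):
        self._actions = dict(actions)

    def find(self, name):
        return self._actions.get(name)


class ActionFacade:
    """Single entry point that hides where actions come from."""

    def __init__(self, providers):
        self._providers = list(providers)

    def find(self, name):
        # Ask providers in order and return the first match.
        for provider in self._providers:
            action = provider.find(name)
            if action is not None:
                return action
        return None


class EchoAction:
    pass


facade = ActionFacade([
    InMemoryActionProvider({}),
    InMemoryActionProvider({"std.echo": EchoAction}),
])
```

Callers such as the engine only talk to the facade, so they don't need to know whether an action is stored in a DB, in memory or elsewhere.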
Depends-On: I13033253d5098655a001135c8702d1b1d13e76d4
Depends-On: Ic9108c9293731b3576081c75f2786e1156ba0ccd
Change-Id: I8e826657acb12bbd705668180f7a3305e1e597e2
str.encode() is now not only unnecessary but it actually breaks the
action by converting the input str to bytes. Since Python 2 support has
been dropped, the call to encode() can simply be removed.
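The effect is easy to reproduce in plain Python 3 (the sample string is made up for illustration):

```python
text = "echo hello"

# In Python 3, str.encode() returns bytes, not str.
encoded = text.encode()

# Code that expects a str now receives a different type.
same_type = type(encoded) is type(text)
```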
Change-Id: I8f5ee8ae9542d9a15fb0937b4a9de1db9915a3a7
* While fetching a list of objects in the REST layer we now check
if the object has already been deleted in parallel (by catching
the corresponding exception caused by lazy loading) and don't
include it into the result set.
* Added the corresponding test that uses mocking and two synchronized
threads to reproduce this corner case.
* Minor style changes.
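A minimal sketch of the "skip what was deleted in parallel" approach. The exception class and the Row class are hypothetical stand-ins for the ORM objects and for the lazy-loading error; they are not the names used in Mistral.

```python
class ObjectDeleted(Exception):
    """Hypothetical stand-in for the lazy-loading error raised by the ORM."""


class Row:
    """Fake DB object whose attribute access fails once it is deleted."""

    def __init__(self, name, deleted=False):
        self._name = name
        self._deleted = deleted

    @property
    def name(self):
        if self._deleted:
            raise ObjectDeleted(self._name)
        return self._name


def to_resources(rows):
    resources = []
    for row in rows:
        try:
            resources.append({"name": row.name})
        except ObjectDeleted:
            # The object disappeared between the query and the conversion,
            # so it is simply left out of the result set.
            continue
    return resources


resources = to_resources([Row("wf1"), Row("wf2", deleted=True), Row("wf3")])
```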
Change-Id: Ia92d799a421e07542f270223c1add2aae7b32aab
Closes-Bug: #1887357
* There were references to Mistral specific constants mistakenly
placed into mistral-lib. This patch adds these constants into
the module where task language specifications are declared and
changes the corresponding references.
Change-Id: I8c7c6896f01894ac76cf9365abfdce17e7df7662
* The purpose of this patch is to improve encapsulation of task
execution state management. We already have the class Task
(engine.tasks.Task) that represents an engine task and it is
supposed to be responsible for everything related to managing
persistent state of the corresponding task execution object.
However, we break this encapsulation in many places and various
modules manipulate task execution state directly. This fact
leads to what is called "spaghetti code" because important
things are often spread out across the system and it's hard to
maintain. It also leads to lots of duplication. So this patch
refactors policies so that they manipulate a task execution
through an instance of Task which hides low-level aspects.
Change-Id: Ie728bf950c4244db3fec0f3dadd5e195ad42081d
* When YAQL output data conversion is disabled there's still
an issue caused by presence of not JSON-compatible types within
a YAQL result. The internal Mistral code is already able to deal
with that (due to the previous changes) by checking that and
converting them to what's needed. However, JSON serialization
may still not work if it's done via the standard "json" library.
The library simply doesn't handle those non-standard types and
raises an exception. We have a sanitizing function that all YAQL
results go through, however, it doesn't make sense to do recursive
sanitizing for performance reasons. It does make sense to convert
data as late as possible to avoid redundant data manipulations. So
the sanitizing function handles only the root object in the object
graph. The solution for this problem is to use our own utility
function based on the "oslo_serialization.jsonutils" that is able
to deal with at least part of the mentioned types, specifically
FrozenDict and iterators. Generators are still a problem and this
new function takes care of that separately, assuming that any
generator is just a special iterator and hence represents a
collection, i.e. a list in JSON terms. It works for all the cases
we've encountered so far working with YAQL.
* Used the new function "utils.to_json_str()" everywhere for JSON
serialization, including the action "std.http".
* Added necessary unit tests.
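The following sketch shows the general technique of serializing non-standard types lazily, at JSON encoding time. It uses the standard "json" library with a "default" hook rather than "oslo_serialization.jsonutils", and the function name is hypothetical, so it should not be read as the actual "utils.to_json_str()". MappingProxyType is used only as a stand-in for a frozen dict.

```python
import json
import types
from collections.abc import Iterator, Mapping


def to_json_str_sketch(obj):
    """Serialize obj, converting unsupported types as late as possible."""

    def default(value):
        # Called by json.dumps only for objects it can't serialize itself.
        if isinstance(value, Mapping):
            return dict(value)
        if isinstance(value, (Iterator, set, frozenset)):
            # Generators are iterators too, so they become JSON lists.
            return list(value)
        raise TypeError(
            "Not JSON serializable: %s" % type(value).__name__
        )

    return json.dumps(obj, default=default)


frozen = types.MappingProxyType({"k": 1})  # stand-in for a frozen dict
generated = to_json_str_sketch({"a": (x for x in range(3))})
frozen_json = to_json_str_sketch({"ctx": frozen})
```

Because the hook is invoked only where the encoder gets stuck, no recursive sanitizing pass over the whole object graph is needed beforehand.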
Closes-Bug: #1869168
Depends-On: I1081a44a6f305eb1dfe68a5bad30110385130725
Change-Id: I9e73ea7cbba215c3e1d174b5189be27c640c4d42
* With disabled YAQL data output conversion, YAQL may return
instances of ContextView which can't be properly saved into
DB. This happens because Mistral serialization code doesn't
turn on JSON conversion of custom objects, and they are just
ignored by the "json" lib when it encounters them.
* Fixed how Mistral serializes context for JavaScript evaluation
to address the same problem.
* Implemented __repr__ method of ContextView.
* Removed logging of "data_context" from YAQL evaluation because
previously it was always empty (because the string representation
of ContextView was always "{}") and now it may be very big, like
megabytes, and the log gets populated too fast. It makes sense to
log YAQL data context only when an error happened. In this case
it helps to investigate an issue.
* Added all required unit tests.
* Fixed the tests for disabled YAQL conversion. In fact, they
didn't test it properly because data conversion wasn't disabled.
Closes-Bug: #1867899
Change-Id: I12b4d0c5f1f49990d8ae09b72f73c0da96254a86
* This patch moves code related to YAQL and Jinja into their
specific modules so that there isn't any module that works with
both. It makes it easier to understand how code related to one
of these technologies works.
* Custom built-in functions for YAQL and Jinja are now in a
separate module. It's now easier to see what belongs to the
expression framework and what belongs to the integration part,
i.e. the functions themselves.
* Renamed the base module of expressions similar to other packages.
* Other style changes.
Change-Id: I94f57a6534b9c10e202205dfae4d039296c26407
* It turns out that osprofiler wasn't initialized properly for
threads in which scheduler runs its jobs. So profiling simply
didn't work for these threads and lots of important info didn't
get to the profiler log. This patch fixes it.
* When evaluating a YAQL expression, sometimes we get a huge result
value that we always put into the debug log. In practice, it doesn't
make much sense and, moreover, it utilizes lots of CPU and disk
space. It's better to shrink it to some reasonable size that would
still allow the necessary analysis, if needed.
* Other minor style fixes according to the Mistral coding guidelines.
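Shrinking a value before it goes into the debug log could look roughly like this. The function name and the default length are assumptions made for the sketch, not the helper used by Mistral.

```python
def cut_for_log(value, length=100):
    """Return a string representation no longer than `length` characters."""
    text = str(value)

    if len(text) <= length:
        return text

    # Keep the beginning of the value and mark that it was truncated.
    return text[:max(length - 3, 0)] + "..."


short = cut_for_log("abc", length=10)
long_value = cut_for_log("a" * 20, length=10)
```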
Change-Id: I3df3ab96342c456429e20a905615b90bcb94818d
* Fields of type UUID with a filter prefix are denied because they
don't match the UUID string format. E.g.
"eq:6c07a453-c5e1-4bbe-97ed-3cb77b4f55ff" will throw an
InputException. This patch allows the prefixes
"eq:" "neq:" "gt:" "gte:" "lt:" "lte:" "has:" "in:" "nin:"
in a value of such fields.
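A minimal sketch of validating a UUID value that may carry one of these prefixes. The function names are hypothetical, and the sketch only handles a single UUID per value (it does not parse lists as "in:" or "nin:" might accept).

```python
import uuid

FILTER_PREFIXES = (
    "eq:", "neq:", "gt:", "gte:", "lt:", "lte:", "has:", "in:", "nin:"
)


def split_filter(value):
    """Split 'eq:<uuid>' into ('eq:', '<uuid>'); the prefix may be absent."""
    for prefix in FILTER_PREFIXES:
        if value.startswith(prefix):
            return prefix, value[len(prefix):]
    return "", value


def is_valid_uuid_filter(value):
    # Validate only the part after the filter prefix.
    _, raw = split_filter(value)
    try:
        uuid.UUID(raw)
    except ValueError:
        return False
    return True
```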
Closes-Bug: #1792875
Change-Id: I26667a82ec768c858f0282124864e377d8cf39f4
Signed-off-by: ali <ali.abdelal@nokia.com>
* Added a list argument to the get_all() method of the executions
API controller class. In the query string, it may contain names of
columns that a client wants to match against the null value (None
internally in the code). This gives us the ability to filter API
entities (currently only workflow executions) fetched with
get_all() by fields whose value is None (null from the API
perspective), which typically means that the field value is not
defined.
Closes-Bug: #1702421
Change-Id: I78fbf993519beb63ee9aef7058bdcb40f0a12ec3
Check the private_key_filename parameter only if it is not None
Depends-On: https://review.opendev.org/#/c/690040/
Change-Id: If133cf472d1e0ecadd96bc07e7fd0d1ae4068740
Closes-Bug: #1849280
* Module designateclient.v1 doesn't exist anymore after
python-designateclient 3.0.0 is out. The new client
requires a keystone session so all other parameters
were dropped. Since this service now requires a
session, the generator test now mocks the method
_get_fake_client() for this action.
* Minor style changes.
Change-Id: Ida722828e3f1481e08f52257405ddfa2175733fa
A trust client can't validate a token when a cron trigger runs.
This patch fixes that.
Change-Id: I793fbfec03032d9ff7137c11becb6d1c18ec54bc
Closes-Bug: #1843175
When I used Mistral with a cron trigger and the cron was triggered,
the trigger used a trust to do something in my workflow, but the
mistral-executor service emitted this error log:
Forbidden: You are not authorized to perform the requested
action: identity:validate_token
I resolved this problem by changing the parameter from "trust_id"
to "trusts" in the function _admin_client:
https://github.com/openstack/mistral/blob/master/mistral/utils/openstack/keystone.py#L95
Change-Id: Idde5b0ed4a4b9f30e4258cd792acb270d5586087
This makes getting a root_execution_id possible without having to go
through filtering and querying the executions search.
Change-Id: Ia6c954e688589f69a7463f1b8e02244d029e8b7a
Changed from importing utils and calling it via utils.filter_utils
to a proper import. utils.__init__.py didn't export it, so the way it
was called probably worked for Python 2 only.
This is a more universal way of calling it.
Change-Id: Ie6a073dc286f0d8704c67f8295dfd76bce7897bb
* The custom profiler notifier defined by Mistral now also prints
the total elapsed time of a profiler trace into the log file.
This makes it much easier to search for bottlenecks.
Change-Id: I1db7e66e1b3756cccfde73feee83baef9edf283f
This sets up the HTTPProxyToWSGI middleware in front of Mistral API. The
purpose of this middleware is to set up the request URL correctly in
the case there is a proxy (for instance, a load balancer such as HAProxy)
in front of the Mistral API.
The HTTPProxyToWSGI is off by default and needs to be enabled via a
configuration value.
It can be enabled with the option in mistral.conf:
[oslo_middleware]
enable_proxy_headers_parsing=True
Closes-Bug: #1590608
Closes-Bug: #1816364
Change-Id: I04ba85488b27cb05c3b81ad8c973c3cc3fe56d36
keystone_authtoken/auth_uri is deprecated [1]. Use www_authenticate_uri
instead.
keystonemiddleware in requirements and lower constraints should be increased
because www_authenticate_uri was introduced in keystonemiddleware 4.18.0.
[1] https://review.openstack.org/#/c/508522/
Change-Id: I99b0ee941d702a28fb4f392d9747d0e2257a42c8
Closes-Bug: #1788174
* If we use the built-in YAQL function 'str' in a workflow then it
doesn't represent lists as '[item1, item2, ...]' but instead
creates '(item1, item2, ...)'. This is because the standard YAQL
function 'yaql_utils.convert_input_data', which is needed to
convert initial user data into an internal YAQL format,
converts all sequences (except strings) into tuples.
This patch overrides this behavior for sequences that are not
strings and tuples so that they now get converted into lists.
YAQL uses tuples because it needs to obtain a safe immutable
structure to make calculations upon. But in Mistral a list is
more suitable because lots of users care about string
representations. Immutability is not so important because
Mistral code base guarantees that the initial data context
for an expression won't be changed while an expression is
being evaluated by YAQL.
* "str" YAQL function used to work well but it was broken in
https://review.openstack.org/#/c/477816/ that added additional
context preparation in order to fix the issue
https://bugs.launchpad.net/mistral/+bug/1772864
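The difference in string representation, and the idea of keeping non-tuple sequences as lists, can be sketched in plain Python. This is not the YAQL function itself: the converter below is a simplified, hypothetical stand-in that, for example, keeps mappings as ordinary dicts.

```python
from collections.abc import Mapping, Sequence


def convert_input_sketch(data):
    """Recursively convert data, turning non-tuple sequences into lists."""
    if isinstance(data, (str, bytes)):
        return data
    if isinstance(data, tuple):
        # Tuples stay tuples.
        return tuple(convert_input_sketch(item) for item in data)
    if isinstance(data, Mapping):
        return {
            key: convert_input_sketch(value) for key, value in data.items()
        }
    if isinstance(data, Sequence):
        # Other sequences become lists, so str() prints square brackets.
        return [convert_input_sketch(item) for item in data]
    return data


as_tuple = str((1, 2, 3))
as_list = str(convert_input_sketch({"items": [1, 2, 3]})["items"])
```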
Change-Id: I69d32f8772418d586d6c414842bb54aada217481
Closes-Bug: #1815710
This method allows specifying a private key and avoids storing it
in the filesystem of the executors. This can be used later in
combination with a secrets_retrieve to use keys stored in barbican.
Change-Id: Ide438a7f6d24c8bdc9eb2c82e935fd39a6acc2c6
Closes-Bug: #1806703
A number of configuration options provided by oslo.messaging were
deprecated in Ocata and have now been removed. See
https://docs.openstack.org/releasenotes/oslo.messaging/unreleased.html#upgrade-notes
for more details.
* Because of the removal of a number of options from
[oslo_messaging_rabbit] some code related to them and the
corresponding tests for the Kombu RPC now don't make sense
and so they've been removed by this patch.
* Style/formatting changes in the Kombu RPC tests.
Change-Id: I37c71dbe4bb270367f5434b0b8c2557e29a9b1df
This patch fixes the bug in string handling in ssh_utils which
prevented SSHAction execution in a Python 3 setup.
In Python 3, all strings are unicode by default and byte streams
no longer implicitly convert to str.
Py2: paramiko Channel.recv returns type `str`
Py3: paramiko Channel.recv returns type `bytes`
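A minimal sketch of the kind of normalization this requires, without touching paramiko itself. The helper name and the default encoding are assumptions made for the example.

```python
def to_text(chunk, encoding="utf-8"):
    """Return str for both bytes (Python 3 recv) and str input."""
    if isinstance(chunk, bytes):
        return chunk.decode(encoding)
    return chunk


# Simulated chunks as they might come from a channel read.
stdout = "".join(to_text(chunk) for chunk in [b"hello ", b"world\n"])
```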
Change-Id: I24971858039f287df24d39c19eccc44916ecf580
Closes-Bug: #1781548
This patch delivers the first working version of a distributed
scheduler implementation based on local and persistent job
queues. The idea is inspired by the parallel computing pattern
known as "Work stealing" although it doesn't fully repeat it
due to the nature of Mistral.
See https://en.wikipedia.org/wiki/Work_stealing for details.
Advantages of this scheduler implementation:
* It doesn't have job processing delays caused by DB polling
intervals when a cluster topology is stable. A job gets scheduled
in memory and also saved into the persistent storage for
reliability. A persistent job can be picked up only after a
configured allowed period of time so that it happens effectively
after a node responsible for local processing crashed.
* Low DB load. DB polling still exists but it's not a primary
scheduling mechanism now but rather a protection from node crash
situations. That means that a polling interval can now be made
large like 30 seconds, instead of 1-2 seconds. Less DB load
leads to less DB deadlocks between scheduler instances and less
retries on MySQL.
* Since DB load is now lower, it gives better scalability properties.
A bigger number of engines won't now lead to much bigger
contention because of the big DB polling intervals.
* Protection from having jobs forever hanging in processing state.
In the existing implementation, if a scheduler captured a job
for processing (set its "processing" flag to True) and then
crashed then a job will be in processing state forever in the DB.
Instead of a boolean "processing" flag, the new implementation
uses a timestamp showing when a job was captured. That gives us
the opportunity to make such jobs eligible for recapturing and
further processing after a certain configured timeout.
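The capture-timestamp idea from the last point can be sketched as follows. The function and parameter names are hypothetical and don't correspond to the actual scheduler code or DB schema.

```python
import datetime


def is_capturable(captured_at, now, captured_timeout):
    """Decide whether a persistent job may be (re)captured.

    captured_at is None for a job that nobody has captured yet.
    """
    if captured_at is None:
        return True

    # A job captured too long ago is assumed to belong to a crashed
    # scheduler, so it becomes eligible for recapturing.
    return now - captured_at >= captured_timeout


now = datetime.datetime(2018, 7, 1, 12, 0, 0)
timeout = datetime.timedelta(seconds=30)

fresh = is_capturable(now - datetime.timedelta(seconds=5), now, timeout)
stale = is_capturable(now - datetime.timedelta(seconds=60), now, timeout)
never_captured = is_capturable(None, now, timeout)
```

With a boolean flag the stale case can't be told apart from the fresh one, which is why the timestamp is needed.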
TODO:
* More testing
* DB migration for the new scheduled jobs table
* Benchmarks and testing under load
* Standardize the scheduler interface and write an adapter for the
existing scheduler so that we could choose between scheduler
implementations. It's highly desired to make transition to the
new scheduler smooth in production: we always need to be able
to roll back to the existing scheduler.
Partial blueprint: mistral-redesign-scheduler
Partial blueprint: mistral-eliminate-scheduler-delays
Change-Id: If7d06b64ac14d01e80d31242e1640cb93f2aa6fe
There is still some hardcoded v2 authentication in barbican actions.
This API has been deprecated and removed, so we can change it to use
v3 instead. This patch also removes the version number from some
helper methods.
Change-Id: I0390daf841463d11cb7c61653897949989b6e6eb
Closes-bug: #1783316