hacking 3.0.x is too old.
Also remove the note about pip's behavior which was already fixed in
recent versions.
Change-Id: I65d350943649c3346ed5741631c01724ddd256ef
This feature introduces an enhanced error-handling
mechanism for workflows, allowing them to gracefully
handle issues within individual tasks without
causing a complete workflow failure. Previously,
when using subworkflow and passing an incomplete set
of parameters, the entire workflow would terminate.
With this feature, the workflow continues execution,
isolating errors at the task level. Consequently,
partial issues in one task no longer impact other
branches of the workflow execution.
Implements blueprint partial-workflow-failure-handling
Change-Id: Id6a910c85c1d6953408682a2a724c4826333422f
In error cases a join task could lose the context of some branches.
Change-Id: I58a94c4ebc5d860473c9b48df326f6ea29cba9fa
Closes-Bug: #2020370
Signed-off-by: Oleg Ovcharuk <vgvoleg@gmail.com>
After this patch, Mistral will run tasks using RPC, which distributes
tasks amongst the available engine threads. This improves performance
when executing huge workflow executions containing many tasks.
Implements: blueprint distribute-mistral-operations
Change-Id: I0b7202589eee68ba5560bf2aa60fbbd6118f3719
After this patch, a user can update the logging format to include root_execution_id in logs, which helps to find and debug logs related to a specific workflow execution.
- Logs about the creation and status changes of Mistral entities (execution,
task, action execution, etc.) are now emitted at the INFO log level.
- User can update logging_context_format_string to include root_execution_id in logs.
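As an illustration, the oslo.log context format string could be extended like this (the surrounding format placeholders are a simplified example, not Mistral's default):

```ini
[DEFAULT]
# Example only: add root_execution_id to the per-request log context.
logging_context_format_string = %(asctime)s %(process)d %(levelname)s %(name)s [%(request_id)s %(root_execution_id)s] %(message)s
```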
Implements: blueprint improve-mistral-loggers
Change-Id: I54fe058e5451abba6ea7f69d03d498d78a90993e
This patch adds the ability to rerun a failed workflow by
skipping failed tasks. Workflow behavior in the skip case can
be configured via new fields in the task definition:
* on-skip
* publish-on-skip
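A minimal sketch of how these fields might be used in a workflow definition (the workflow, task, and published variable names are hypothetical; only on-skip and publish-on-skip come from this patch):

```yaml
---
version: '2.0'

example_wf:
  tasks:
    create_resource:
      action: std.noop   # hypothetical task that may be skipped on rerun
      on-skip:
        - notify_skip
      publish-on-skip:
        resource_id: null

    notify_skip:
      action: std.echo output="create_resource was skipped"
```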
Change-Id: Ib802a1b54e69c29b4d0361f048c2b9c076a4c176
Implements: blueprint mistral-task-skipping-feature
Signed-off-by: Oleg Ovcharuk <vgvoleg@gmail.com>
ABCs in collections should be imported from collections.abc; direct
import from collections has been deprecated since Python 3.3.
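For example, the deprecated form and its replacement:

```python
# Deprecated since Python 3.3 (the aliases were removed in Python 3.10):
#   from collections import Mapping
# Correct import path for the abstract base classes:
from collections.abc import Mapping

# dict is registered as a virtual subclass of Mapping either way.
print(issubclass(dict, Mapping))  # → True
```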
Closes-Bug: #1936667
Change-Id: Ide8aa0323d9713c1c2ea0abf3b671ca4dab95ef0
* It's clear now that we don't have to store an action specification
as part of the corresponding action execution object because
the notion of an action specification itself is specific to a certain
type of action. In our case, ad-hoc actions.
All changes recently made in the Mistral layers above the engine
prove the correctness of this thought. The comment can be safely
deleted.
Change-Id: I45b97b08184c8d5a88bcc537fb5b1e538f105554
* This patch refactors Mistral with the action provider concept
that is responsible for delivering actions to the system. So
it takes all the burden of managing action definitions w/o
having to spread that across multiple subsystems like Engine
and API and w/o having to assume that action definitions are
always stored in DB.
* Added LegacyActionProvider that represents the old way of
delivering action definitions to the system. It pretty much just
analyses which entries are configured under the entry point
"mistral.actions" in setup.cfg and builds a collection of
corresponding Python action classes in memory, accessible by name.
* The module mistral/services/actions.py is now renamed to
adhoc_actions.py because it's effectively responsible only for
ad-hoc actions (those defined in YAML).
* Added the new entry point "mistral.action.providers" in setup.cfg
to register action provider classes.
* Added the module mistral/services/actions.py that will be a facade
for action providers. Engine and other subsystems will need to
work with it.
* Other small code changes.
Depends-On: I13033253d5098655a001135c8702d1b1d13e76d4
Depends-On: Ic9108c9293731b3576081c75f2786e1156ba0ccd
Change-Id: I8e826657acb12bbd705668180f7a3305e1e597e2
Remove six.moves. Replace the following items with Python 3 style code.
- six.moves.urllib
- six.moves.queue
- six.moves.range
- six.moves.http_client
Subsequent patches will replace other six usages.
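The listed aliases map directly onto the standard library, for example:

```python
# Python 3 replacements for the removed six.moves aliases:
# six.moves.urllib.parse -> urllib.parse
# six.moves.queue        -> queue
# six.moves.range        -> range (built-in)
# six.moves.http_client  -> http.client
import http.client
import queue
from urllib import parse

print(parse.quote("a b"))          # → a%20b
print(list(range(3)))              # → [0, 1, 2]
print(http.client.responses[200])  # → OK
```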
Change-Id: I80c713546fcc97391c64e95ef708830632e1ef32
With Python 3.x, classes can use the metaclass= syntax
and no longer require the six library.
Subsequent patches will replace other six usages.
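The pattern being replaced, sketched with a hypothetical metaclass (the class names here are illustrative, not Mistral's):

```python
# Before (six):
#   import six
#   class Resource(six.with_metaclass(ResourceMeta, object)):
#       pass
# After (native Python 3 syntax):
class ResourceMeta(type):
    """Hypothetical metaclass that records every class it creates."""
    created = []

    def __new__(mcs, name, bases, namespace):
        cls = super().__new__(mcs, name, bases, namespace)
        mcs.created.append(name)
        return cls


class Resource(metaclass=ResourceMeta):
    pass


print(type(Resource) is ResourceMeta)  # → True
```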
Change-Id: Iefdc99c338c7aaea18d535426c4676dbedb44f32
* Getting rid of another self.notify() call in the Task class
that's not inside of the set_state() method. That also gave
an opportunity to stop managing the started_at timestamp
outside of the method set_state().
Change-Id: Ib8f61481a606fe4fc9f37112ef625b8e3c6d5cd3
* Moved all notification management for workflows into the method
Workflow.set_state(). It's now in one place. Workflow events are
now also identified in one method similar to how it works for
tasks based on state transitions.
* Other style changes.
Change-Id: I40941ecca3eb4b46a06a2f7dc2fd5d909d5d087a
* All calls to a notifier within the Task class have now been
moved into the method set_state() so that the relation between
a state change and a notification is now straightforward and the
notification calls don't have to be spread out across different
modules.
Change-Id: I9c0647235e1439049d3e7db13f19bef542f10508
* Moving the responsibility to manage values of these timestamps
into the method Task.set_state() so that this logic is now
fully associated with how task execution state changes.
Change-Id: I13a5a5921dea06cee7f3efd53af5c327fe89a180
* The logic of calculating a task result in case of "with-items" was
overcomplicated and broke encapsulation of a "with-items" task.
This patch makes it simpler, so that the method doesn't need to
peek into the internals of a "with-items" task (e.g. runtime_context).
Change-Id: I036193cbae15d7f3c3414b123525ceafa91fdeb1
* The purpose of this patch is to improve encapsulation of task
execution state management. We already have the class Task
(engine.tasks.Task) that represents an engine task and it is
supposed to be responsible for everything related to managing
persistent state of the corresponding task execution object.
However, we break this encapsulation in many places and various
modules manipulate task execution state directly. This
leads to what is called "spaghetti code": important
things are often spread out across the system and hard to
maintain. It also leads to a lot of duplication. So this patch
refactors policies so that they manipulate a task execution
through an instance of Task, which hides low-level aspects.
Change-Id: Ie728bf950c4244db3fec0f3dadd5e195ad42081d
* This patch moves code related to YAQL and Jinja into their
specific modules so that there isn't any module that works with
both. It makes it easier to understand how code related to one
of these technologies works.
* Custom built-in functions for YAQL and Jinja are now in a
separate module. It's now easier to see what belongs to the
expression framework and what belongs to the integration part,
i.e. the functions themselves.
* Renamed the base module of expressions similar to other packages.
* Other style changes.
Change-Id: I94f57a6534b9c10e202205dfae4d039296c26407
* Method _create_action_execution() for AdHocAction didn't have
the right signature. It was missing the argument "namespace" and
failed under some conditions. This patch does some refactoring
to preserve the target namespace at action init time. For
regular Python actions it's just taken from the action definition
object. For ad-hoc actions it is also taken from the definition,
but this has to be done separately because AdHocAction extends the
class PythonAction, passing a base action definition into it as a
parameter of the initializer (which extracts the namespace of the
base action).
The benefit of preserving a namespace value during init time is that
it becomes available for the entire instance life-span, not only for
the method _create_action_execution().
* Style changes (blank lines, indentation, formatting).
Change-Id: I84d1cd0fb4a746197ad890276f654cd12455603e
* Adding the "convert_output_data" config property gives an
opportunity to increase overall performance. If YAQL always
converts an expression
result, it often takes significant CPU time and overall workflow
execution time increases. It is especially important when a workflow
publishes lots of data into the context and uses big workflow
environments. It's been tested on a very big workflow (~7k tasks)
with a big workflow environment (~2.5 MB) that often uses the YAQL
function "<% env() %>". This function basically just returns the
workflow environment.
* Created all necessary unit tests.
* Other style fixes.
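A sketch of the corresponding configuration (the group name is an assumption based on the text; verify against the generated sample config):

```ini
[yaql]
# Example only: skip the potentially expensive conversion of YAQL
# expression results to save CPU time on large published data.
convert_output_data = False
```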
Change-Id: Ie3169ec884ec9a0e7e50327dd03cd78dcda0a39b
* Initialization of the profiler was also missing for a thread
spawned within post_tx_queue.py, so we were losing important
profiling info.
* Changed the profiler test since its logic was already obsolete.
Now we initialize profiler in every thread so the only reason to
not get any profiler traces when a workflow completed is
"enabled = False" in the "profiler" group in the configuration.
* Added more profiler traces
* Small readability changes in the workflow language spec
Change-Id: I35e6711f8e10bb08d7e842f4bca8753b929328fd
Added namespaces for actions. Actions can have the same name if they
are not in the same namespace. When executing an action, if an action
with that name is not found in the workflow namespace or the given
namespace, Mistral will look for that action in the default namespace.
* An action base can only be in the same namespace or in the
default namespace.
* Namespaces are not part of the Mistral DSL.
* The default namespace is an empty string ''.
* All actions will be in a namespace; if not specified, they will be
under the default namespace.
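The lookup order described above can be sketched as follows (the registry structure and names are illustrative, not Mistral's actual API):

```python
DEFAULT_NAMESPACE = ''

# Illustrative registry: (namespace, action name) -> action.
ACTIONS = {
    ('', 'std.echo'): 'default echo action',
    ('tenant_a', 'std.echo'): 'tenant_a echo action',
}


def find_action(name, namespace=DEFAULT_NAMESPACE):
    """Look up an action in the given namespace, falling back to default."""
    if (namespace, name) in ACTIONS:
        return ACTIONS[(namespace, name)]
    # Not found in the given namespace: fall back to the default one.
    return ACTIONS.get((DEFAULT_NAMESPACE, name))


print(find_action('std.echo', 'tenant_a'))  # → tenant_a echo action
print(find_action('std.echo', 'tenant_b'))  # → default echo action
```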
Depends-On: I61acaed1658d291798e10229e81136259fcdb627
Change-Id: I07862e30adf28404ec70a473571a9213e53d8a08
Partially-Implements: blueprint create-and-run-workflows-within-a-namespace
Signed-off-by: ali <ali.abdelal@nokia.com>
* The functionality of graceful engine shutdown is now possible
due to correct calculation of the "graceful" flag in the engine
server's stop() method. Unfortunately, the Oslo Service framework
doesn't pass it correctly, it simply ignores it in the call chain.
So the only way to understand if the shutdown is graceful is to
peek at the configuration property "graceful_shutdown_timeout"
provided by Oslo Service. If it's greater than zero then we can
treat it as graceful.
* Oslo Service handles only four OS signals: SIGTERM, SIGINT,
SIGHUP and SIGALRM. Only sending SIGTERM to the process leads
to a graceful shutdown. For example, SIGINT (which is equal to
ctrl + C in a unix shell) interrupts the process immediately.
So the only way to do a graceful shutdown of an engine instance
using a unix shell is to run the "kill <PID>" command. This
needs to be taken into account when using it.
* The patch also changes the order in which the engine server
stops its inner services so that the underlying RPC server
(currently Oslo Messaging based or Kombu based) stops first.
This is needed to make sure that, first of all, no new RPC
calls can arrive, and thereby, let all active DB transactions
finish normally w/o starting new ones. Stopping the RPC server
may be a heavy operation if there are already lots of RPC
messages waiting for processing that are polled from the queue.
So to a great extent the entire functionality of graceful
shutdown will depend on whether an underlying RPC server
implements the corresponding functionality in the proper way,
i.e. after calling stop(graceful=True) it will stop receiving
new calls and wait till all buffered RPC messages are processed
normally.
* The maximum time given to graceful shutdown is controlled via
the "graceful_shutdown_timeout" configuration option, which is
60 seconds, by default.
* Minor refactoring
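The relevant oslo.service option, with the default mentioned above:

```ini
[DEFAULT]
# Maximum time (seconds) the engine gets to finish in-flight work
# after receiving SIGTERM. 60 is the default.
graceful_shutdown_timeout = 60
```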
Implements blueprint: mistral-graceful-scale-in
Change-Id: I6d1234dfa21b1e3420ec9ca2c5235dee973748ee
Workflow and task executions will inherit tags from their
definition. Filtering executions by tag is included.
Change-Id: Id5d615b829901258af2be7ca99178ad92b60d1fb
Closes-Bug: #1853457
Signed-off-by: Oleg Ovcharuk <vgvoleg@gmail.com>
When passwords or other sensitive data are returned by an action,
they can be logged by Mistral. This change uses the password-masking
functionality used in mistral-lib and provided by oslo.utils.
This function uses the standard cut_repr method in mistral-lib, which
also means the output is more standardised.
Related-Bug: #1850843
Change-Id: I01bf47f7a83102a1a16b15bf0bbb4021707e11fe
* Previously action heartbeats didn't work in case of using local
executors because the component responsible for sending heartbeats
was started by the executor RPC server which doesn't make sense to
initialize for a local executor. This patch refactors the code
so that now heartbeats get sent for any type of executors. For
local executors it is also useful because a cluster node that
runs an engine and a local executor may also crash. With this
change, remaining cluster nodes will be able to understand that
the action will never complete and one of them will time it out.
If all is fine with the node where the local executor is running
then heartbeats will be sent normally and the action won't time
out. Before this change, in case of local executors a long running
action would always time out after a configured amount of time
(by default, 60 mins) just because local executors never sent
heartbeats.
* Did a lot of renaming to make it clear what each component is
responsible for.
* Wrote the tests that check the heartbeat sender, both positive
and negative scenarios for local and remote executor types.
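The core idea can be sketched as a sender that periodically reports liveness, decoupled from any RPC server (class and callback names here are illustrative, not Mistral's):

```python
import threading
import time


class HeartbeatSender:
    """Illustrative heartbeat sender, started for any executor type."""

    def __init__(self, report, interval=1.0):
        self._report = report        # callback that persists the heartbeat
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # wait() returns False on timeout, True once stop() is called.
        while not self._stop.wait(self._interval):
            self._report()


beats = []
sender = HeartbeatSender(lambda: beats.append(time.time()), interval=0.05)
sender.start()
time.sleep(0.2)
sender.stop()
print(len(beats) > 0)  # → True
```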
Closes-Bug: #1852722
Change-Id: I4d0fdff54de9bee70aeaf10a4ef483ad7000840b
* This patch moves logic that schedules a task state refreshing
periodic job in case of rerun from the Task class to
task_handler.run_task() so that Task doesn't have to know any
language-specific details and call the task handler back. This is
architecturally cleaner.
Change-Id: If7a054bbf77f9ed761d8f3ac36b6d329544f5ff5
* When Mistral prepares a context view to evaluate a YAQL/Jinja
expression it needs to put "task_ex.in_context" before
"wf_ex.context" because the first one should take higher priority.
For example, if a workflow declares a variable (via the "var"
keyword) and then this variable is updated by one of the workflow
branches then it should shadow the initial value of the variable
when evaluating an expression (e.g. in the action input).
* We also don't need to use "ctx or self.ctx" in the modified
_evaluate_expression() function because "self.ctx" always becomes
"task_ex.in_context" when a task execution is created.
* Added one more test to check data flow correctness.
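The priority rule can be illustrated with collections.ChainMap, standing in for Mistral's actual ContextView (the variable names are hypothetical):

```python
from collections import ChainMap

# A workflow declares a variable via the "var" keyword...
wf_ex_context = {'my_var': 'initial'}
# ...and a branch later updates it in the task's inbound context.
task_ex_in_context = {'my_var': 'updated'}

# task_ex.in_context must come first so it shadows wf_ex.context.
view = ChainMap(task_ex_in_context, wf_ex_context)

print(view['my_var'])  # → updated
```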
Change-Id: Ib9a0e2b3f5cc686cbc53d9e6c049ad7fdc12c76d
The workflow in the test fails because ContextView
does not evaluate in_context.
Closes-Bug: #1850315
Change-Id: I54a4cd38e962d363fd2626476bcae9ec0aa8dad6
* There's an issue with a lazily loaded field of the WorkflowExecution
model occurring on GET /v2/execution/<id> because the logic
that calculates "published_global" of the execution rest resource
hits "root_execution" field out of transaction scope indirectly
within the "data_flow.get_workflow_environment_dict" method.
This patch refactors this logic and calculates
globally published variables of the workflow execution simply
as its context that doesn't contain all internal data like
"__execution" and "openstack".
* Other style change.
Closes-Bug: #1846152
Change-Id: Ic8609e55930e2ed13653e79e8ca7a31c951d9030
* Method get_workflow_execution() raises an exception if the workflow
execution does not exist. Since this is a valid case (the method
may be called via scheduler after the execution is deleted)
we need to smoothly handle it.
Change-Id: Ibd6099f1e0fd07c71130f11457b355a367229977
The task_execution_id is required to be able to restore the hierarchy
of tasks and workflows on the notification receiver side. Including
the event in the notification is also very useful.
Also fix the documentation as multiline strings are not supported in
ini files.
Change-Id: I714fd5c32b0f31f85ac5a4d22d161e662bf18687
* There was a bug left after the recent refactoring. While
evaluating 'with-items' expression we didn't construct a context
view properly, it didn't include a workflow environment. This
patch fixes it.
Closes-Bug: #1839840
Change-Id: I3df711ef2484374418085fe0117fe8b37ce5ba3f
* Changed method get_scheduled_jobs_count() in the Scheduler
interface to has_scheduled_jobs(). In fact, the callers
always need to compare the number of jobs with 0, i.e.
to see if there are any jobs or not. But more importantly,
this semantics (returning just a boolean) allows a
good optimisation for DefaultScheduler and avoids DB calls
in a number of cases. Practically, for example, it saves
several seconds (5-6) if we run a workflow with 500 parallel
no-op tasks that are all merged with one "join" task. Tested
on 1 and 2 engines.
* Added test assertions for has_scheduled_jobs()
* Other minor changes
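The optimisation opportunity can be sketched like this (the class and attribute names are illustrative, not Mistral's actual scheduler code): a boolean answer lets the scheduler short-circuit on local state instead of always counting rows in the DB.

```python
class SchedulerSketch:
    """Illustrative only: why returning a boolean enables an optimisation."""

    def __init__(self):
        self._local_job_count = 0  # jobs scheduled by this process
        self.db_calls = 0

    def schedule(self, job):
        self._local_job_count += 1

    def _db_has_jobs(self):
        self.db_calls += 1  # stands in for a real DB query
        return self._local_job_count > 0

    def has_scheduled_jobs(self):
        # If this process scheduled jobs itself, answer True
        # without touching the DB at all.
        if self._local_job_count > 0:
            return True
        return self._db_has_jobs()


s = SchedulerSketch()
s.schedule('refresh_task_state')
print(s.has_scheduled_jobs())  # → True
print(s.db_calls)              # → 0
```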
Change-Id: Ife48d9e464114fd60a08707d8f32f847a6f623c9