Adding workflow global context spec

* A specification for the blueprint: https://blueprints.launchpad.net/mistral/+spec/mistral-global-wf-context Change-Id: I94f57d901d0ecae9ae5129d9799f3b0ba33a69b3
2017-01-16 18:17:19 +07:00 · 2017-01-16 18:17:19 +07:00 · 67c02d7dfc
parent feaa16a819
commit 67c02d7dfc
1 changed files with 260 additions and 0 deletions
--- a/specs/ocata/approved/workflow-global-context.rst
+++ b/specs/ocata/approved/workflow-global-context.rst
@ -0,0 +1,260 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=======================
+Workflow Global Context
+=======================
+
+Launchpad blueprint:
+
+https://blueprints.launchpad.net/mistral/+spec/mistral-global-wf-context
+
+Workflow global context will allow to store variables not associated with
+particular workflow branches.
+
+Problem description
+===================
+
+Currently 'publish' keyword in Mistral saves variables into a storage
+(context) which is associated only with a branch.
+
+Example:
+
+ ::
+
+   ---
+   version: '2.0'
+
+   wf:
+     tasks:
+       A:
+         action: std.noop
+         publish:
+           my_var: 1
+           on-success: A1
+
+       A1:
+         action: my_action param1=<% $.my_var %>
+
+       B:
+         action: std.noop
+         publish:
+           my_var: 2
+         on-success: B1
+
+       B1:
+         action: my_action param1=<% $.my_var %>
+
+
+The expression "$.my_var" in the declaration of A1 will always evaluate to 1,
+for B1 it will always evaluate to 2. This doesn't depend on the order in which
+A and B will run. This is because we have two branches (A -> A1 and B -> B1)
+for which the variable "my_var" has its own different version.
+
+Sometimes though we need to be able to share data across branches which is now
+impossible due to aforementioned semantics.
+The concept of workflow global context can help solve this problem. The word
+"global" here means "accessible from any workflow branch".
+
+We also need an ability to make atomic updates of global workflow context.
+It's necessary when we, for example, want to create a global counter (e.g.
+counter of network calls to external systems performed by a workflow).
+
+Use Cases
+---------
+
+* Building conditions based on events happened in parallel workflow branches.
+  Example: one branch needs to notify the other one that it should stop.
+* Passing data between branches. Example: one branch needs to wait till the
+  other one produces some expected result. This is, essentially, creating
+  a cross-branch mutex.
+* Counters that need to decrement or increment atomically.
+
+Proposed change
+===============
+
+In order to achieve this goal the proposal is:
+
+* Add the new keyword "publish-global" which is similar to "publish"
+  with the difference that it publishes variables into workflow global
+  context instead of branch workflow context. It's important to note
+  that this is an unprotected way of modifying data because race
+  conditions are possible when writing different values for same
+  variables in the global context from parallel branches. In other
+  words, if we have branches A and B and there are tasks in these
+  branches writing different values to the variable X in the global
+  context Mistral won't provide any guarantees as far as what value
+  is going to be assigned to X and what value will be lost. Users need
+  to understand possible consequences.
+  For instance, using this keyword it's impossible to create an atomic
+  counter since it doesn't assume acquiring a lock under which we can
+  safely perform multiple operations (e.g. read and then write).
+  However, for many scenarios even this model can be useful. For example,
+  if there's only one branch writing values and others are only readers.
+* Add the new YAQL/Jinja function "global()" to explicitly access
+  variables in workflow global context.
+* Make global variables also accessible using "$." in YAQL and "_." in
+  Jinja in a way that branch variables can shadow them if they are
+  published in the current branch.
+* Add the new keyword "publish-global-atomic" which is similar to
+  "publish-global" but allows to atomically read and write variables
+  in workflow global context by acquiring a temporary lock on it.
+  Unlike 'publish-global' this will allow to create atomic counters
+  when we need to perform multiple operations against the storage
+  atomically.
+
+Example #1 (writing and reading global variables):
+
+ ::
+
+   ---
+   version: '2.0'
+
+   wf:
+     tasks:
+       A:
+         action: std.noop
+         publish:
+           my_var: "branch value"
+         publish-global:
+           my_var: "global value"
+         on-success: A1
+
+       A1:
+         # $.my_var will always evaluate to "branch value" because A1 belongs
+         # to the same branch as A and runs after A. When using "$" to access
+         # context variables branch values have higher priority.
+         # In order to access global context reliably we need to use YAQL/Jinja
+         # function 'global'. So global(my_var) will always evaluate to
+         # 'global value'.
+         action: my_action1 param1=<% $.my_var %> param2=<% global(my_var) %>
+
+       B:
+         # $.my_var will evaluate to "global value" if task A completes
+         # before task B and "null", if not. It's because A and B are
+         # parallel and 'publish' in A doesn't apply to B, only
+         # 'publish-global' does. In this example global(my_var) has the same
+         # meaning as $.my_var because there's no ambiguity from what context
+         # we should take variable 'my_var'.
+         action: my_action2 param1=<% $.my_var %> param2=<% global(my_var) %>
+
+
+Example #2 (writing global variables atomically):
+
+ ::
+
+   ---
+   version: '2.0'
+
+   vars:
+     - my_global_var: 0
+
+   wf:
+     tasks:
+       task1:
+         action: std.noop
+         publish-global-atomic:
+           counter: <% global(my_global_var) + 1 %>
+
+       task2:
+         action: std.noop
+         publish-global-atomic:
+           counter: <% global(my_global_var) + 1 %>
+
+
+Alternatives
+------------
+
+None.
+
+Data model impact
+-----------------
+
+Workflow execution object already has the field "context" which is now
+immutable and initialized with openstack specific data, execution id and
+environment variables. In order to get the full context for evaluating a
+YAQL/Jinja expression in a task declaration we always build a context view
+merged from workflow input, workflow execution "context" field and branch
+specific context (e.g. task inbound context when evaluating action
+parameters). The field "context" can play the role of workflow global
+context. However, the idea to reuse this field can be revisited during
+the implementation phase.
+
+REST API impact
+---------------
+
+None.
+
+End user impact
+---------------
+
+New workflow language feature that allows to store global variables into
+workflow context.
+
+Performance Impact
+------------------
+
+When using "publish-global-atomic" we'll need to use locking in order
+to prevent concurrent modifications of global workflow context while
+reading and modifying it when processing a certain task. In fact, this is
+equal to locking the whole execution object and hence will have a serious
+performance impact in case of many parallel tasks. For this reason,
+"publish-global-atomic" needs to be well documented and used with
+precaution.
+
+Deployer impact
+---------------
+
+None.
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  rakhmerov
+
+Other contributors:
+  melisha
+
+Work Items
+----------
+
+* Add 'publish-global' and 'publish-global-atomic' into the direct workflow
+  specification.
+* Make changes in Mistral engine to publish variables into global context
+  (preliminarily it will be the field 'context' of workflow execution object).
+* Implement YAQL/Jinja function 'global' to explicitly read variables from
+  workflow global context.
+* Add locking workflow global context (i.e. workflow execution) in case of
+  using 'publish-global-atomic'. A thread that acquires a lock must first
+  refresh state of workflow execution and then proceed with publishing etc.
+
+Dependencies
+============
+
+None.
+
+
+Testing
+=======
+
+* Unit tests for 'publish-global' keyword and 'global' function in different
+  cases: parallel branches, sequential branches.
+* Unit tests to check that branch-local variables take precedence when
+  reading variables using '$.' in YAQL and '_.' in Jinja.
+* Unit tests for 'publish-global-atomic' that checks atomicity of reads and
+  writes of global variables. Although unit tests can't fully test this
+  feature. In order to fully test it we need to have a test with multiple
+  Mistral engines to make sure we have concurrent access to workflow execution.
+
+References
+==========
+
+None.