Merge "Dependency-driven multi-step resource deallocation"
This commit is contained in:
commit
bc9d5d8042
|
@ -0,0 +1,544 @@
|
|||
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
==================================================
|
||||
Dependency-driven multi-step resource deallocation
|
||||
==================================================
|
||||
|
||||
https://blueprints.launchpad.net/murano/+spec/dependency-driven-resource-deallocation
|
||||
|
||||
Murano components may allocate various kinds of resources: virtual machines,
networks, volumes, etc. When these components are removed from the deployment,
the corresponding resources have to be deallocated. The current implementation
of this process has significant limitations and flaws.

This specification aims to address these issues and to provide a design for a
better resource deallocation / garbage collection system for MuranoPL.
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
In Murano the deallocation of resources is managed by a garbage collection
system (GC). Its present implementation is based on the execution of special
methods called ``.destroy()``, which may be defined in each MuranoPL class and
are intended to contain the custom code that deallocates the resources
allocated by the objects of those classes.
These methods are executed when their objects leave the object graph; however,
the exact order of these executions is currently undefined.
|
||||
|
||||
There are two different scenarios in which objects may leave the object graph,
thus causing their ``.destroy`` methods to be called:

* "Offline" changes of the object graph, i.e. the changes introduced in the
  serialized version of the object graph via the API. These changes are
  detected by comparing the incoming object graph (the one passed from the
  API for deployment) with the "snapshot" of the current environment made
  after the previous deployment completed. If an object exists in the
  "snapshot" but is missing from the input graph, it is considered to be
  removed. Such objects are deserialized from the snapshot and their
  ``.destroy`` methods are called in order from the most deeply nested objects
  towards the topmost ones.

* "Runtime" changes. Some objects may be removed from the object graph during
  deployment: they may become unreferenced or be assigned to Runtime
  properties or local variables only. As a result, after the deployment
  completes these objects are serialized neither into the output version of
  the object graph nor into its "snapshot". Subsequent deployments will know
  nothing about their existence, so the objects will be lost forever. To
  recover the resources allocated by such unreferenced objects, Murano
  analyzes its ObjectStore after the execution is complete. Each object which
  is present in the store but is not present in the output version of the
  object graph is considered to be an "orphan", and thus its ``.destroy``
  method is called. The order of these calls is currently undefined: the
  objects get destroyed based on their position in the object store, which is
  hardly predictable.
|
||||
|
||||
Such a design is insufficient for production-grade applications, which often
have the following requirements:

* If some object is going to be deleted, another object (either owning or just
  referencing it) may need to execute some actions before or after the object
  is deleted.

* When a group of nested or interconnected objects is about to be deleted, the
  order in which their destructors should be executed may differ from case to
  case.

* Sometimes the actions executed during the destruction of an object may
  depend on whether some other object is about to be deleted or not.
|
||||
|
||||
|
||||
*Example*
|
||||
|
||||
*Consider an Application which consists of a Network component and several
VM components. All the components are owned by the Application but there
are no ownership relationships between them. When the Application is going
to be deleted (i.e. its whole subgraph leaves the environment) all of its
components are about to be removed as well, and so their ``.destroy``
methods will be called. Since the Network does not own the VMs, the order
of these calls is undefined. Due to various implementation details it may
be impossible to remove the Network before the VMs which are connected to
it (e.g. when a VM is required to always be connected to at least one
network). In this case the ``.destroy`` method of the Network component
should always be called after all the VMs have been destroyed.*
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
Several improvements have to be made in Murano engine to address the problems
|
||||
described above.
|
||||
|
||||
General concepts
|
||||
----------------
|
||||
|
||||
Destruction dependencies
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
There should be a way to establish directional `destruction dependencies`
between Murano objects. If object `Foo` establishes such a dependency on
object `Bar` then:

* `Foo` will be notified when `Bar` is about to be destroyed. These
  notifications are covered in detail in the "Multi-step destruction" section
  below.

* If both `Foo` and `Bar` are going to be destroyed in the same garbage
  collection execution, `Bar` will be destroyed before `Foo`.
|
||||
|
||||
These dependencies are not related to object ownership relationships or
property-based cross-references: the owner may have a destruction dependency on
its nested object or vice versa; objects referencing each other may have
a destruction dependency established. Even entirely unrelated objects may
have a destruction dependency between them.

Since the destruction dependencies are directional, circular dependencies are
theoretically possible. If two or more objects form such a circle they will
still be notified about the pending destruction of their dependencies; however,
the order of these notifications, and of the destruction itself, is undefined
in this case.
|
||||
|
||||
For now it is proposed to use destruction dependencies only for runtime garbage
collection. Support for the other garbage collection scenarios will probably be
implemented once a solution is found for the issue described in
`Known issues`_ below.
|
||||
|
||||
|
||||
Multi-step destruction
|
||||
^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
Instead of just iterating through all the objects to be destroyed and
calling their ``.destroy`` methods, Murano should perform a multi-step garbage
collection according to the following algorithm:

1. Detect all the objects to be destroyed, using either a comparison of the
   current object graph with its snapshot or reference detection (see the
   "Offline" and "Runtime" changes in the "Problem description" section).
|
||||
|
||||
2. Sort the list of detected objects using the following comparator: for
   any two objects A and B in the list:

   .. code::

     IF (A has-a-destruction-dependency-on B)
         AND (NOT B has-a-destruction-dependency-on A)
       THEN A>B
     ELSE IF (A owns B) THEN A>B
     ELSE A==B

   where `has-a-destruction-dependency-on` means that the left operand object
   has a destruction dependency (possibly transitive) on the right operand
   object, and `owns` means that the left operand object owns (possibly
   transitively) the right operand object. Objects are destroyed in ascending
   order, so ``A>B`` means that A is destroyed after B.

   .. note::

     The destruction order of objects which are considered equal by
     the algorithm above is undefined. Moreover, future implementations may
     destroy such objects in parallel.
|
||||
|
||||
|
||||
3. For each object in the list:
|
||||
|
||||
3.1. Notify all the objects having a destruction dependency on it that the
|
||||
target object will be destroyed.
|
||||
|
||||
3.2. Call the ``.destroy()`` method of the object if it is present.
|
||||
|
||||
3.3. Change the object's status to "Destroyed" (see below).
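The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the engine's implementation: ``Obj``, ``depends``, ``destruction_order`` and ``run_gc`` are hypothetical names, the ordering is produced by a depth-first traversal instead of a comparator-based sort, and a destruction dependency of a nested object on its owner is assumed to override ownership (as the test cases in the "Testing" section suggest).

```python
class Obj:
    """Hypothetical stand-in for a Murano object (not the real MuranoObject)."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner        # ownership link, None for top-level objects
        self.depends_on = []      # objects this one has a destruction dependency on
        self.destroyed = False


def depends(a, b, seen=None):
    """True if a has a (possibly transitive) destruction dependency on b."""
    seen = set() if seen is None else seen
    for dep in a.depends_on:
        if dep is b:
            return True
        if id(dep) not in seen:
            seen.add(id(dep))
            if depends(dep, b, seen):
                return True
    return False


def destruction_order(doomed):
    """Step 2: put dependencies and nested objects before their dependents."""
    doomed = list(doomed)
    in_scope = {id(o) for o in doomed}
    order, visited = [], set()

    def predecessors(obj):
        # Objects that have to be destroyed before obj.
        preds = [d for d in obj.depends_on if id(d) in in_scope]
        for other in doomed:
            # Assumption: a dependency of a nested object on its owner wins
            # over the ownership relationship.
            if other.owner is obj and not depends(other, obj):
                preds.append(other)
        return preds

    def visit(obj):
        if id(obj) in visited:    # also breaks cycles; their order is undefined
            return
        visited.add(id(obj))
        for pred in predecessors(obj):
            visit(pred)
        order.append(obj)

    for obj in doomed:
        visit(obj)
    return order


def run_gc(doomed, all_objects):
    """Step 3: notify dependents, destroy, mark as destroyed; returns a log."""
    log = []
    for obj in destruction_order(doomed):
        for other in all_objects:
            if any(dep is obj for dep in other.depends_on):
                log.append(f"notify {other.name} about {obj.name}")
        log.append(f"destroy {obj.name}")   # stands in for calling .destroy()
        obj.destroyed = True
    return log


# The Application / Network / VMs example from "Problem description".
app = Obj("app")
network = Obj("network", owner=app)
vm1 = Obj("vm1", owner=app)
vm2 = Obj("vm2", owner=app)
network.depends_on = [vm1, vm2]
everything = [app, network, vm1, vm2]
log = run_gc(everything, everything)
```

With the dependency of the Network on the VMs in place, both VMs are destroyed first, then the Network, and the owning Application last.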
|
||||
|
||||
Destroyed objects
|
||||
^^^^^^^^^^^^^^^^^
|
||||
|
||||
When an object is being processed by the garbage collector, there are no live
references to it from the objects of the environment. However, the code which
handles either the pre-destroy notification (step 3.1 above) or the actual
``.destroy`` method may re-establish references to the object being destroyed,
so that the object remains in the object graph after the GC is completed.
Since its resources may already be deallocated at this time, regular usage of
the object is not possible. However, if it is assigned to a property of some
other object in the graph, it may not always be possible to just nullify that
property, since doing so may cause a contract violation.
|
||||
|
||||
To resolve such collisions it is proposed to explicitly mark such objects as
"destroyed". The MuranoPL executor will not allow any methods to be executed
on such objects; however, their properties remain accessible (i.e. readable),
so any runtime information associated with them may be recovered. Destroyed
objects will be serialized with the rest of the object graph, but the JSON
representation of each such object will have a special flag in its class
header (the "?" section) to indicate its special status. When deserialized
from JSON such objects will retain their "destroyed" status, so method
execution will still be impossible even in subsequent deployments.
|
||||
|
||||
When the destroyed objects are unreferenced from the object graph they go away
|
||||
without additional actions: garbage collector ignores them since their
|
||||
resources have already been released.
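A minimal Python sketch of the "destroyed" status described above. The class ``Obj``, the exception ``DestroyedObjectError``, the ``invoke`` helper and the ``destroyed`` key in the "?" header are all hypothetical names chosen for illustration; this specification does not fix the name of the flag.

```python
import json


class DestroyedObjectError(Exception):
    """Raised when a method is invoked on a destroyed object."""


class Obj:
    """Hypothetical stand-in for a Murano object."""

    def __init__(self, type_name, properties=None):
        self.type_name = type_name
        self.properties = dict(properties or {})
        self.destroyed = False

    def invoke(self, method, *args):
        """Run a method on behalf of the executor unless the object is destroyed."""
        if self.destroyed:
            raise DestroyedObjectError(self.type_name)
        return method(self, *args)

    def serialize(self):
        """Build the JSON-compatible form with the class header under "?"."""
        header = {"type": self.type_name}
        if self.destroyed:
            header["destroyed"] = True   # hypothetical flag name
        return {"?": header, **self.properties}

    @classmethod
    def deserialize(cls, data):
        """Restore an object, keeping its "destroyed" status."""
        header = data["?"]
        obj = cls(header["type"], {k: v for k, v in data.items() if k != "?"})
        obj.destroyed = header.get("destroyed", False)
        return obj


vm = Obj("example.Vm", {"name": "vm-1"})
vm.destroyed = True
# Round trip through JSON, as would happen between two deployments.
restored = Obj.deserialize(json.loads(json.dumps(vm.serialize())))
```

After the round trip ``restored.properties`` is still readable, while ``restored.invoke(...)`` raises ``DestroyedObjectError``.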
|
||||
|
||||
Garbage collection executions
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
|
||||
The multi-step object destruction described above should take place in three
different scenarios:

1. *(currently existing)* Before the deployment: destroys the objects which
   were present in the object graph after the previous deployment finished but
   were not found in the incoming object graph of the new deployment, i.e. the
   ones explicitly removed using the API.

2. *(currently existing)* After the deployment: destroys the objects existing
   in the Object Store but not being a part of the **persistable** object graph
   of the current environment, i.e. having no references to them from the
   persistable (In, Out, InOut) properties of the environment or its transitive
   children.

3. *(proposed)* During the deployment, explicitly initiated from MuranoPL code:
   destroys the objects which are not part of the **complete** object graph,
   i.e. having no references to them from any properties of the environment
   (including runtime and private properties) AND not being referenced by local
   variables in any frame of any green thread of the current deployment.
|
||||
|
||||
To implement item 3 above a new algorithm is needed. It should analyze all the
active contexts of all the running green threads of the current deployment and
retrieve all the data variables from those contexts, traversing through the
parent contexts as well. All the objects of ``MuranoObject`` type collected
this way should be added to the "queue of active roots" to be used for further
processing.
For each object in this queue the algorithm should save the id of the object
into the "result set" and then find the objects which are reachable from the
current one (i.e. the objects of ``MuranoObject`` type contained in properties
of any kind). For each such object the algorithm should check whether its id is
already present in the "result set". If not, the object is added to the end of
the "queue of active roots". The algorithm runs until it has processed all the
objects in the queue.
The "result set" which the algorithm produces at the end of this process
contains the ids of the alive objects. All other objects of the object store
should be considered candidates for garbage collection.
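The traversal described above is a breadth-first reachability search. A minimal Python sketch, assuming a hypothetical ``Obj`` class with an ``object_id`` and a ``properties`` dictionary in place of the real ``MuranoObject``, and assuming the roots have already been extracted from the contexts:

```python
from collections import deque


class Obj:
    """Hypothetical stand-in for a MuranoObject: an id plus its properties."""

    def __init__(self, object_id, **properties):
        self.object_id = object_id
        self.properties = dict(properties)


def referenced_objects(value):
    """Yield the objects found in a property value, looking inside containers."""
    if isinstance(value, Obj):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from referenced_objects(item)
    elif isinstance(value, (list, tuple, set)):
        for item in value:
            yield from referenced_objects(item)


def find_alive_ids(roots):
    """Return the "result set": ids of all objects reachable from the roots."""
    queue = deque(roots)              # the "queue of active roots"
    alive = set()                     # the "result set"
    while queue:
        obj = queue.popleft()
        if obj.object_id in alive:    # may have been queued more than once
            continue
        alive.add(obj.object_id)
        for value in obj.properties.values():
            for child in referenced_objects(value):
                if child.object_id not in alive:
                    queue.append(child)
    return alive


# a and b reference each other, b also holds d in a list, c is unreferenced.
a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.properties["peer"] = b
b.properties["peer"] = a
b.properties["volumes"] = [d]
store = [a, b, c, d]
alive = find_alive_ids([a])
candidates = [o.object_id for o in store if o.object_id not in alive]
```

Because ids are recorded in the result set before the children are queued, circular references between objects do not cause the traversal to loop.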
|
||||
|
||||
One additional check should be made before the actual destruction: some of
the objects may have no valid access paths from MuranoPL objects, but could
still be referenced from python-backed objects. This happens when an object
is passed to a method backed by Python code. In this case the executor
creates an object of type ``MuranoObjectInterface``, a wrapper that simplifies
working with a Murano object from Python. This wrapper contains a reference to
the actual MuranoPL object. If the wrapper is alive, then its corresponding
MuranoPL object should not be garbage collected even if there are no
references to it from the active roots of MuranoPL.
To keep track of such situations the garbage collector should contain a special
dictionary mapping the ids of objects to weak proxies pointing to the
``MuranoObjectInterface`` objects passed to Python code. For every garbage
collection candidate the algorithm should check whether there is a map entry
for this object and whether the weak proxy at that entry is alive. If so,
the object is excluded from the list of GC candidates.
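The weak-reference bookkeeping can be sketched with the standard ``weakref`` module. ``MuranoObjectInterfaceStub`` and ``PythonRefTracker`` are hypothetical names used only for this illustration, and the sketch keeps a single wrapper per object id, which a real implementation would probably need to relax.

```python
import gc
import weakref


class MuranoObjectInterfaceStub:
    """Hypothetical stand-in for the wrapper handed to Python code."""

    def __init__(self, object_id):
        self.object_id = object_id


class PythonRefTracker:
    """Maps object ids to weak references to their Python-side wrappers."""

    def __init__(self):
        self._wrappers = {}

    def wrap(self, object_id):
        """Create a wrapper and remember it without keeping it alive."""
        wrapper = MuranoObjectInterfaceStub(object_id)
        self._wrappers[object_id] = weakref.ref(wrapper)
        return wrapper

    def is_held_by_python(self, object_id):
        """True if a wrapper for this id exists and is still alive."""
        ref = self._wrappers.get(object_id)
        return ref is not None and ref() is not None

    def filter_candidates(self, candidate_ids):
        """Drop the candidates which Python code still holds on to."""
        return [i for i in candidate_ids if not self.is_held_by_python(i)]


tracker = PythonRefTracker()
wrapper = tracker.wrap("x")                      # Python code holds "x"
before = tracker.filter_candidates(["x", "y"])   # "x" is protected
del wrapper                                      # Python code lets go of it
gc.collect()                                     # make the release explicit
after = tracker.filter_candidates(["x", "y"])    # "x" is a candidate again
```

Weak references are used so that the tracking dictionary itself never keeps a wrapper, and therefore the wrapped object, alive.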
|
||||
|
||||
The resulting list of GC candidates is then destroyed as described in
|
||||
"Multi-step destruction" section above.
|
||||
|
||||
Known issues
|
||||
^^^^^^^^^^^^
|
||||
|
||||
Murano uses the ``Objects`` and ``ObjectsCopy`` objects to transfer data between
deployments. Once destruction dependencies are implemented, a handler will make
its changes (if any) to the objects in ``ObjectsCopy``, so these changes will
not be applied during the next deployment.

It is proposed to change the way the object model is generated, so that the
objects of the new model are updated if they have changed since the last
deployment. However, solving this issue is not an aim of this specification,
so the details are omitted here.
|
||||
|
||||
Code changes
|
||||
------------
|
||||
|
||||
GC class
|
||||
^^^^^^^^
|
||||
|
||||
A new python-backed Murano class called ``GC`` should be added to the core
|
||||
library. It should have the following static methods:
|
||||
|
||||
* ``collect()`` - initiates garbage collection of unreferenced objects of
|
||||
current deployment (see p.3 in "Garbage collection executions" section
|
||||
above).
|
||||
|
||||
* ``isDestroyed(object)`` - checks if the ``object`` was already destroyed
|
||||
during some GC session and thus its methods cannot be called.
|
||||
|
||||
* ``subscribe(target, handler=null)`` - establishes a destruction dependency
  from the caller to the object passed as ``target``. The method may be called
  several times; in this case only a single destruction dependency will be
  established, but the same number of calls to ``unsubscribe`` will be
  required to remove it.

  The ``handler`` argument is optional. If passed, it should be the name of an
  instance method defined by the caller's class to handle the notification of
  ``target``'s destruction (see the "Multi-step destruction" section above:
  this handler is executed at step 3.1).

  The following arguments will be passed to the handler method:

  * ``sender`` - an instance of the ``GC`` class describing the current
    garbage collection session.

  * ``object`` - the target object which is going to be destroyed. It is not
    recommended to persist a reference to this object anywhere. Doing so will
    not prevent the object from being garbage collected, but the object will
    be moved to the "destroyed" state, which is almost always undesirable.
    Persisting such a reference is considered an advanced technique which
    should not be used unless absolutely necessary.

* ``unsubscribe(target)`` - removes the destruction dependency from the caller
  to the object passed as ``target``. The method may be called several times
  without any side effects. If ``subscribe`` was called more than once, the
  same (or a greater) number of calls to ``unsubscribe`` is needed to remove
  the dependency.
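The counting behaviour of ``subscribe`` and ``unsubscribe`` can be illustrated with a small Python sketch. ``DependencyRegistry`` is a hypothetical helper, not part of the proposed API; objects are represented by plain ids, and the rule that a later non-null handler replaces an earlier one is an assumption which this specification does not state.

```python
from collections import Counter


class DependencyRegistry:
    """Hypothetical bookkeeping behind GC.subscribe / GC.unsubscribe."""

    def __init__(self):
        self._counts = Counter()   # (subscriber, target) -> subscribe calls
        self._handlers = {}        # (subscriber, target) -> handler name

    def subscribe(self, subscriber, target, handler=None):
        key = (subscriber, target)
        self._counts[key] += 1
        if handler is not None:
            # Assumption: the most recent non-null handler wins.
            self._handlers[key] = handler

    def unsubscribe(self, subscriber, target):
        key = (subscriber, target)
        if self._counts[key] > 0:          # extra calls are harmless
            self._counts[key] -= 1
            if self._counts[key] == 0:
                del self._counts[key]
                self._handlers.pop(key, None)

    def subscribers_of(self, target):
        """Each subscriber is listed once, however many times it subscribed."""
        return [s for (s, t) in self._counts if t == target]

    def handler_for(self, subscriber, target):
        return self._handlers.get((subscriber, target))


registry = DependencyRegistry()
registry.subscribe("A", "B", handler="onTargetDestruction")
registry.subscribe("A", "B")
after_two_subscribes = registry.subscribers_of("B")
registry.unsubscribe("A", "B")
after_one_unsubscribe = registry.subscribers_of("B")
registry.unsubscribe("A", "B")
after_two_unsubscribes = registry.subscribers_of("B")
```

Two calls to ``subscribe`` still produce a single dependency, and it disappears only after the second call to ``unsubscribe``.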
|
||||
|
||||
An instance of ``GC`` class will be created during the garbage collection
|
||||
session to encapsulate runtime information about this session. It defines a
|
||||
single method which may be of use during a GC session:
|
||||
|
||||
* ``isDoomed(object)`` - checks if the ``object`` is marked for destruction
|
||||
during this GC session.
|
||||
|
||||
Pythonic back-end of the ``GC`` class should be able to establish destruction
|
||||
dependencies by storing the back-refs to the dependent object in attributes of
|
||||
the python's instance of MuranoObject representing the dependency.
|
||||
|
||||
.. note::

  This is the opposite of how the destruction dependencies are stored
  when the model is serialized: in serialized form it is the dependent
  object that owns the reference to the dependency object. At runtime it
  is the dependency object that owns the reference to the dependent object.
|
||||
|
||||
|
||||
When garbage collection is needed, the class will be instantiated and a list
of objects to delete will be created based on the current state of the object
graph, the object store and the execution context. The garbage collector will
use this instance for all the steps of the workflow.
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Application developers may try to implement their own event-based notification
|
||||
logic to notify about pending and completed object destructions. However it
|
||||
will solve only part of the problem: notifications will work properly, but they
|
||||
will not affect the order in which the objects are destroyed, so the workflows
|
||||
will be too complicated. Also this alternative will not have the advanced
|
||||
features proposed in this spec, such as ability to check if some object is
|
||||
going to be destroyed.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
None
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
None
|
||||
|
||||
Versioning impact
|
||||
-----------------
|
||||
|
||||
The proposed change is completely backwards compatible: without explicit
|
||||
destruction dependencies objects will be collected based on their ownership
|
||||
relationships, i.e. as it is done in the current implementation.
|
||||
|
||||
The packages containing classes which explicitly call the methods of ``GC``
|
||||
should have package format of at least 1.4 to prevent their execution on older
|
||||
versions of Murano which do not have this feature.
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
None
|
||||
|
||||
Deployer impact
|
||||
---------------
|
||||
|
||||
None
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
Developers will get the new MuranoPL-based API to manage resource deallocation
|
||||
lifecycle. If they do not want to use it they don't need to do anything.
|
||||
|
||||
|
||||
Murano-dashboard / Horizon impact
|
||||
---------------------------------
|
||||
|
||||
None
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
ativelkov
|
||||
|
||||
Other contributors:
|
||||
starodubcevna
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
* Implement a system to define and use destruction dependencies in runtime.
|
||||
|
||||
* Introduce changes to MuranoObject class to keep track of "destroyed"
|
||||
object status.
|
||||
|
||||
* Modify the serializer / deserializer to properly persist the value of the
|
||||
"destroyed object" flag.
|
||||
|
||||
* Modify the code which instantiates yaql contexts for MuranoPL so all the
|
||||
created contexts are tracked by the execution session.
|
||||
|
||||
* Implement sorting algorithms to arrange objects-to-be-destroyed based on
|
||||
criteria defined in p.2 of "Multi-step destruction" section above.
|
||||
|
||||
* Modify the algorithm that collects alive object roots so that it also uses
  the runtime and private properties and the local variables of executing
  threads.
|
||||
|
||||
* Implement multi-step destruction workflow.
|
||||
|
||||
* Implement ``GC`` class to bind all the above.
|
||||
|
||||
* Create test-runner-based tests to cover all the test scenarios.
|
||||
|
||||
* Document the new features.
|
||||
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
The development of this feature will enable Application Development Framework
|
||||
[1] to address resource deallocation problems during application uninstall.
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
Tests should be written for test-runner to cover various scenarios of resource
|
||||
deallocation.
|
||||
|
||||
Runtime garbage collection
|
||||
--------------------------
|
||||
|
||||
There should be test cases covering that:
|
||||
|
||||
* objects assigned to persistent (Input, Output, InputOutput) properties (both
|
||||
locally-declared and inherited) of objects reachable from the current roots
|
||||
are NOT garbage collected;
|
||||
|
||||
* objects assigned to transient (Runtime and undeclared) properties (both
  locally-declared and inherited) of objects reachable from the current roots
  are NOT garbage collected;
|
||||
|
||||
* objects assigned to static properties of various classes are NOT garbage
|
||||
collected;
|
||||
|
||||
* objects passed to python-backed objects and unreferenced in MuranoPL are NOT
|
||||
garbage collected unless their MuranoObjectInterface proxies are unreferenced
|
||||
/ GC'ed in python;
|
||||
|
||||
* objects assigned to local variables of the current execution frame (i.e.
|
||||
variables of the current method and all the caller methods in call stack)
|
||||
including method arguments are NOT garbage collected;
|
||||
|
||||
* single unreferenced objects ARE garbage collected;
|
||||
|
||||
* graphs of interconnected objects having no references from non-collected
|
||||
objects ARE garbage collected;
|
||||
|
||||
* objects passed to python-backed objects and unreferenced in both MuranoPL and
|
||||
python ARE garbage collected;
|
||||
|
||||
* garbage collector correctly processes stack-frame objects from green-threads
|
||||
other than the one it is executed from;
|
||||
|
||||
Destruction dependency resolution order
|
||||
---------------------------------------
|
||||
|
||||
There should be test cases covering that:
|
||||
|
||||
* if some child object has a destruction dependency on its parent, the parent
|
||||
gets destroyed before the child;
|
||||
|
||||
* if some parent object has a destruction dependency on its child, the child
|
||||
gets destroyed before the parent;
|
||||
|
||||
* if some objects not being the part of some ownership hierarchy have some
|
||||
destruction dependency, the dependency-object is destroyed before the
|
||||
dependent one;
|
||||
|
||||
* if some objects have circular destruction dependency they are all destroyed
|
||||
(the order is not enforced by the test);
|
||||
|
||||
Destruction events
|
||||
------------------
|
||||
|
||||
Given the base scenario of object A having a destruction dependency on object B
|
||||
and B being GC'ed, there should be tests covering that:
|
||||
|
||||
* the right order of events occurs (A gets warned about possible B's
|
||||
destruction -> A is notified about inevitable B's destruction -> B is
|
||||
destroyed -> A is notified that B was destroyed);
|
||||
|
||||
* A may prevent B's destruction by establishing a reference on B in the warning
|
||||
handler;
|
||||
|
||||
* A may cancel GC in both warning and pre-destroy notification handlers;
|
||||
|
||||
* A may establish more than one destruction dependency on B and still be
  notified just once;
|
||||
|
||||
* A may remove the destruction dependency and not get notified on B's
|
||||
destruction;
|
||||
|
||||
* If A established N destruction dependencies and then removed them M times,
|
||||
(N>M) then notifications are still delivered;
|
||||
|
||||
* If A established N destruction dependencies and then removed them M times,
|
||||
(N<=M) then notifications are not delivered;
|
||||
|
||||
* B may establish a destruction dependency on itself thus subscribing to
|
||||
appropriate notifications;
|
||||
|
||||
* ``phase`` property of GC instance is correct in appropriate event handlers;
|
||||
|
||||
* ``isDoomed`` and ``isDestroyed`` methods return appropriate values when
|
||||
called by A for B in appropriate event handlers.
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
Developer documentation should be updated to describe the new ``GC`` class and
its static and instance methods, as well as the design guidelines that
application developers should follow to use the new capability.
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[1] https://github.com/openstack/murano-specs/blob/master/specs/newton/approved/application-development-framework.rst
|