Add spec for handling unknown schemas
blueprint unknown-schemas Change-Id: I1d92140d1e0c98f513a422e429d34e77cd554383
This commit is contained in:
parent
c70fe1a11c
commit
9354843f96
|
@ -0,0 +1,208 @@
|
|||
..
|
||||
|
||||
==========================================
|
||||
Policy Support for Unknown Table Schemas
|
||||
==========================================
|
||||
|
||||
The new distributed architecture requires the policy engine to
|
||||
handle the case when the schema for some datasource drivers are
|
||||
unknown. Today the policy engine assumes the schemas for all
|
||||
datasource drivers are known at load-time. This spec outlines
|
||||
a mechanism for supporting unknown schemas.
|
||||
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
For the new distributed architecture, the policy engine will not know
|
||||
the schema for all the datasources at the time rules are loaded from the
|
||||
database. This is currently problematic because column-references are
|
||||
compiled away at the time rules are loaded from the database, and that
|
||||
compilation procedure requires the schema. Thus in the new architecture,
|
||||
the policy engine will crash when it tries to load policy rules that
|
||||
contain column references.
|
||||
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
The fix to this problem is to enable the policy engine to load rules
|
||||
that include column references without compiling away those column references.
|
||||
That is, the main reason to compile away column references is that the
|
||||
internal datastructures for representing rules cannot represent those
|
||||
column references natively, and hence, the column references must be
|
||||
removed at load-time. The first task is to extend the internal
|
||||
datastructures in compile.py so they can represent named-columns.
|
||||
|
||||
The second reason column references get compiled away is that they cannot
|
||||
be evaluated (even semantically) without the schema. The second task
|
||||
is to extend the run-time capabilities of the policy engine so that
|
||||
rules can be disabled without being deleted. A disabled rule will
|
||||
be completely hidden from the evaluation engine when answering queries
|
||||
yet will still be visible but marked as "disabled" when users view
|
||||
the rules.
|
||||
|
||||
Every time a new rule is inserted, if its schema is unknown, that
|
||||
rule must be disabled. Moreover, every rule using a table dependent
|
||||
on the table in the head of that rule must be disabled. Similarly
|
||||
for deletion except that deletion can cause other rules to be enabled.
|
||||
|
||||
Every time the schema changes, all rules impacted by that schema
|
||||
change should be checked for consistency, and disabled rules
|
||||
must be enabled once all schemas are known. Once a rule
|
||||
is enabled, the column references are compiled away.
|
||||
|
||||
If a 2nd schema arrives (unequal to the first), the policy engine
|
||||
must check for consistency and recompile any rules whose schema
|
||||
may have changed.
|
||||
|
||||
For example, if the following rule is inserted before the schemas
|
||||
for nova-servers and neutron-networks is known, the rule will
|
||||
be disabled since it has column references.
|
||||
|
||||
p(x, z) :- nova:servers(id=x, network=y), neutron:networks(id=y, status=z)
|
||||
|
||||
Then when the nova schema becomes known this rule is validated
|
||||
against that schema but is not enabled because the neutron schema
|
||||
is unknown.
|
||||
|
||||
Finally when the neutron schema becomes known, the column references
|
||||
are compiled away and the rule is officially enabled.
|
||||
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Instead of disabling rules, another option is to modify the
|
||||
evaluation engine to do a best-effort query evaluation. The evaluation
|
||||
algorithms themselves would know about column-references, and would
|
||||
attempt to operate even if the schema was unknown.
|
||||
|
||||
The downside to this alternative is that the rules are actually
|
||||
semantically ambiguous, and hence the result of evaluation has
|
||||
unknown semantic value.
|
||||
|
||||
|
||||
Policy
|
||||
------
|
||||
|
||||
N/A
|
||||
|
||||
Policy actions
|
||||
--------------
|
||||
|
||||
N/A
|
||||
|
||||
Data sources
|
||||
------------
|
||||
|
||||
N/A
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
No database modifications are required.
|
||||
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
The Rule object will have an additional boolean field representing whether
|
||||
or not the rule is disabled.
|
||||
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
N/A
|
||||
|
||||
|
||||
Notifications impact
|
||||
--------------------
|
||||
|
||||
N/A
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
N/A
|
||||
|
||||
Performance impact
|
||||
------------------
|
||||
|
||||
Rule inserts could now be slower since if the rule inserted gets disabled,
|
||||
that could cause many other rules to be disabled.
|
||||
|
||||
Rule deletions likewise could cause many policy rules to be enabled.
|
||||
|
||||
Schema updates are expensive because the policy engine must do consistency
|
||||
checks on all rules that are relevant, and potentially re-compile rules.
|
||||
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
N/A
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
N/A
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
thinrichs
|
||||
|
||||
Other contributors:
|
||||
<launchpad-id or None>
|
||||
|
||||
Work items
|
||||
----------
|
||||
|
||||
|
||||
1. Modify compile.py datastructures to natively represent column
|
||||
references. Include a 'disabled' flag.
|
||||
2. Modify query evaluation engine to ignore disabled rules
|
||||
3. Modify triggers to ignore disabled tables
|
||||
4. Enable/disable rules on insert/delete/set-schema
|
||||
- Write dependency analysis routine to compute the rules/tables
|
||||
that are disabled once a given table is disabled.
|
||||
Likely to need a datastructure that tracks disabled tables.
|
||||
- Modify update routine to do schema check and enable/disable rules
|
||||
as appropriate using the dependency analysis.
|
||||
- Modify set-schema to appropriately enable/disable rules
|
||||
May want to add field to rules that say which tables the compilation
|
||||
was dependent on.
|
||||
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
N/A
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
Unit test coverage should be mostly adequate.
|
||||
|
||||
Only real need for tempest tests would be testing the startup of Congress,
|
||||
but that is not supported with tempest.
|
||||
|
||||
|
||||
Documentation impact
|
||||
====================
|
||||
|
||||
N/A
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
The need for this spec was discussed at the Liberty Midcycle Sprint.
|
||||
https://etherpad.openstack.org/p/congress-liberty-sprint
|
||||
|
||||
|
Loading…
Reference in New Issue