Add conductor/node locality spec

This proposes a method to support hard affinity of nodes to conductors
based on locality tags.

Change-Id: I043c4615394b6dae873fb46a6597e2e8076f21bb
Story: 2001795
Jim Rollenhagen 2018-04-06 16:52:58 -04:00
parent 2338b37360
commit 838e42dba4
2 changed files with 258 additions and 0 deletions


@ -0,0 +1,257 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

================================
Conductor/node grouping affinity
================================
https://storyboard.openstack.org/#!/story/2001795
This spec proposes adding a ``conductor_group`` property to nodes, which can be
matched to one or more conductors configured with a matching
``conductor_group`` configuration option, to restrict control of those nodes to
these conductors.
Problem description
===================
Today, there is no way to control the conductor-to-node mapping. Such control
is desirable for a number of reasons:

* An operator may have an Ironic deployment that spans multiple sites. Without
  control of this mapping, images may be pulled over WAN links. This causes
  slower deployments and may be less secure.

* Similarly, an operator may want to map nodes to conductors that are
  physically closer to the nodes in the same site, to reduce the number of
  network hops between the node and the conductor. A prime example of this
  would be to place a conductor in each rack, reducing the path to only go
  through the top-of-rack switch.

* A deployer may have multiple networks for out-of-band control that must be
  completely isolated. This feature would allow isolating a conductor to a
  single out-of-band network.

* A deployer may have multiple physical networks that not all conductors are
  connected to. By configuring the mapping correctly, conductors can manage
  only the nodes which they can communicate with. This is described further
  in another RFE [0].

Proposed change
===============
We propose adding a ``conductor_group`` configuration option for the conductor,
which is a single arbitrary string specifying some grouping characteristic of
the conductor.
We also propose a ``conductor_group`` field in the node record, which will be
used to map a node to a conductor. This matching will be case-insensitive,
to make things a bit easier for operators.
A blank ``conductor_group`` field or config is the default. A conductor without
a group can only manage nodes without a group, and a node without a group can
only be managed by a conductor without a group.
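
The matching rule above can be sketched in a few lines. This is only an illustration, not the proposed implementation: the function names are hypothetical, and lower-casing (and treating ``None`` as blank) is one assumed way to get case-insensitive comparison.

```python
def normalize_group(group):
    """Return a canonical form of a group name.

    Assumption: lower-casing implements the case-insensitive match,
    and None is treated the same as the blank default.
    """
    return (group or "").lower()


def conductor_can_manage(conductor_group, node_group):
    """Hard affinity: the groups must be equal after normalization.

    A blank group only ever matches another blank group.
    """
    return normalize_group(conductor_group) == normalize_group(node_group)
```

Under this sketch, a conductor configured with ``Rack1`` can manage a node tagged ``rack1``, while an ungrouped conductor and a grouped node never match in either direction.
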
The hash ring will need to be modified to take grouping into account, as
described below in the `RPC API Impact`_ section.
Alternatives
------------
Another RFE [1] proposes a system of hard and soft affinity, affinity and
anti-affinity, and scoring of placement to a conductor with multiple tags.
This is quite complex, and I don't believe we'll get it done in the short term.
Completing the basic work proposed here doesn't block that more complex work,
so we should take it one step at a time.
Data model impact
-----------------
A ``conductor_group`` field will be added to the nodes table, as a
``VARCHAR(255)``. This will have a default of ``""``, or the empty string.
This string will be used in the hash ring calculation, so there's no sense in
defaulting to ``NULL``.
A ``conductor_group`` field will also be added to the conductors table, also as
a ``VARCHAR(255)``. This will also have a default of ``""``, or the empty
string. This will be used to build the hash ring to look up where nodes should
land.
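
As a rough illustration of the schema change, the sketch below adds the column to both tables in an in-memory SQLite database. It is not the actual migration: the simplified table layouts and the function name are assumptions made for the example, and SQLite does not enforce the ``VARCHAR(255)`` length.

```python
import sqlite3


def add_conductor_group_columns(conn):
    """Add conductor_group to both tables, defaulting to the empty string."""
    for table in ("nodes", "conductors"):
        conn.execute(
            f"ALTER TABLE {table} "
            "ADD COLUMN conductor_group VARCHAR(255) DEFAULT ''"
        )


# Simplified stand-ins for the real tables (hypothetical layouts).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nodes (id INTEGER PRIMARY KEY, uuid TEXT)")
conn.execute("CREATE TABLE conductors (id INTEGER PRIMARY KEY, hostname TEXT)")
conn.execute("INSERT INTO nodes (uuid) VALUES ('node-1')")

add_conductor_group_columns(conn)

# Rows that existed before the change read back the blank default, not NULL.
existing = conn.execute("SELECT conductor_group FROM nodes").fetchone()[0]
```
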
State Machine Impact
--------------------
None.
REST API impact
---------------
The ``conductor_group`` field of the node will be added to the node object in
the REST API, with a microversion as usual. It will be allowed in POST and
PATCH requests. As with the database, it will be restricted to 255 characters.
A conductor in that group must be available when the field is set, because
node creation and updates are serviced by a conductor, which is selected via
the hash ring.
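
A minimal sketch of the request-side handling follows, assuming the PATCH body uses the JSON Patch format (RFC 6902). Both helper names are hypothetical, and the microversion header is omitted because this spec does not assign a number.

```python
MAX_CONDUCTOR_GROUP_LENGTH = 255  # mirrors the VARCHAR(255) database column


def validate_conductor_group(value):
    """Hypothetical API-side check for a submitted conductor_group."""
    if not isinstance(value, str):
        raise TypeError("conductor_group must be a string")
    if len(value) > MAX_CONDUCTOR_GROUP_LENGTH:
        raise ValueError("conductor_group is limited to 255 characters")
    return value


def conductor_group_patch(value):
    """Build a JSON Patch body that sets conductor_group on a node."""
    return [
        {
            "op": "add",
            "path": "/conductor_group",
            "value": validate_conductor_group(value),
        }
    ]
```
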
It's worth noting that we would like to expose the grouping of conductors
via the REST API eventually. However, the best way to do this isn't
immediately clear, so we leave it outside the scope of this spec for now.
Another RFE[3] proposes a service management API that may be a good fit.
Client (CLI) impact
-------------------
"ironic" CLI
~~~~~~~~~~~~
None, it's deprecated.
"openstack baremetal" CLI
~~~~~~~~~~~~~~~~~~~~~~~~~
The ``conductor_group`` field for a node will be exposed in the client output,
and added to the ``node create`` and ``node set`` commands.
RPC API impact
--------------
This will affect which conductor is the destination for RPC calls corresponding
to a given node, but it won't have a direct effect on the RPC API itself.
The hash ring will change such that the internal keys for the hash ring will
now be of the structure ``"$conductor_group:$drivername"``. A colon (``:``) is
used as the separator between the two, to eliminate conflicts between
conductor groups and drivers or hardware types. For example, an ``agent_ilo``
key with no separator could mean a node with no group and the ``agent_ilo``
driver, or it could mean a node with group ``agent_`` using the ``ilo``
hardware type. To handle upgrades, hash ring keys will be built without
the conductor group while the service is pinned to an RPC version that
predates this feature, and built with the conductor group when the service is
unpinned or pinned to a version that includes this feature.
In other words, grouping is ignored for any service whose RPC version pin is
older than the release with this feature. Once everything is upgraded and
unpinned, we begin using the grouping tags configured.
Until nodes can be configured with a grouping tag, operators should leave
enough conductors without a grouping tag configured to run the cluster. Any
nodes without a grouping tag will only be managed by conductors without a
grouping tag.
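
The key construction and the upgrade behaviour described above might look roughly like the following. This is a sketch under stated assumptions, not the real hash ring code: the function names, the ``use_groups`` flag (standing in for the RPC pin check), and the dict layout for conductors are all hypothetical, and lower-casing the group is an assumed way to keep matching case-insensitive.

```python
from collections import defaultdict


def hash_ring_key(driver_name, conductor_group="", use_groups=True):
    """Build the internal hash ring key for a driver or hardware type.

    use_groups=False models a service pinned to an RPC version that
    predates this feature, where grouping is ignored.
    """
    if not use_groups:
        return driver_name
    return f"{conductor_group.lower()}:{driver_name}"


def build_ring_members(conductors, use_groups=True):
    """Map each hash ring key to the hostnames of conductors serving it."""
    members = defaultdict(set)
    for conductor in conductors:
        for driver in conductor["drivers"]:
            key = hash_ring_key(driver, conductor["conductor_group"], use_groups)
            members[key].add(conductor["hostname"])
    return dict(members)
```

With the separator, an ungrouped ``agent_ilo`` driver yields the key ``:agent_ilo`` while group ``agent_`` with the ``ilo`` hardware type yields ``agent_:ilo``, so the two cases from the example above no longer collide.
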
Driver API impact
-----------------
Hash ring generation and lookup will include the grouping tag, as specified
above in the `RPC API Impact`_ section.
Nova driver impact
------------------
This change is transparent to Nova.
Ramdisk impact
--------------
None.
Security impact
---------------
No direct impact; however, this provides another mechanism for securing a
deployment by enabling logical infrastructure segregation.
Other end user impact
---------------------
None.
Scalability impact
------------------
None.
Performance Impact
------------------
None.
Other deployer impact
---------------------
Deployers that wish to use this feature will need to manage the process of
labeling conductors and nodes to enable it, which may be a non-trivial task.
Developer impact
----------------
None.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
  jroll

Other contributors:
  dtantsur

Work Items
----------
* Add database fields.
* Add conductor config and populate conductor DB field.
* Change the hash ring calculation, and bump the RPC API so that we can pin
  during upgrades.
* Add fields to the node and conductor objects.
* Make the REST API changes.
* Update the client library/CLI.
* Document the feature.
Dependencies
============
None.
Testing
=======
Unit tests should be sufficient, as that's how we test our hash ring now.
It's difficult to test this with Tempest without exposing conductor grouping
via the REST API.
Upgrades and Backwards Compatibility
====================================
This is described in the `RPC API Impact`_ section.
Documentation Impact
====================
This should be documented in the install guide and admin guide.
References
==========
[0] https://storyboard.openstack.org/#!/story/1734876
[1] https://storyboard.openstack.org/#!/story/1739426
[2] Notes from the Rocky PTG session:
https://etherpad.openstack.org/p/ironic-rocky-ptg-location-awareness
[3] https://storyboard.openstack.org/#!/story/1526759


@ -0,0 +1 @@
../approved/conductor-node-locality-affinity.rst