diff --git a/specs/approved/conductor-node-locality-affinity.rst b/specs/approved/conductor-node-locality-affinity.rst new file mode 100644 index 00000000..8ea39a3d --- /dev/null +++ b/specs/approved/conductor-node-locality-affinity.rst @@ -0,0 +1,257 @@ +.. + This work is licensed under a Creative Commons Attribution 3.0 Unported + License. + + http://creativecommons.org/licenses/by/3.0/legalcode + +================================ +Conductor/node grouping affinity +================================ + +https://storyboard.openstack.org/#!/story/2001795 + +This spec proposes adding a ``conductor_group`` property to nodes, which can be +matched to one or more conductors configured with a matching +``conductor_group`` configuration option, to restrict control of those nodes to +these conductors. + + +Problem description +=================== + +Today, there is no way to control the conductor-to-node mapping. This is +desirable for a number of reasons: + +* An operator may have an Ironic deployment that spans multiple sites. Without + control of this mapping, images may be pulled over WAN links. This causes + slower deployments and may be less secure. + +* Similarly, an operator may want to map nodes to conductors that are + physically closer to the nodes in the same site, to reduce the number of + network hops between the node and the conductor. A prime example of this + would be to place a conductor in each rack, reducing the path to only go + through the top-of-rack switch. + +* A deployer may have multiple networks for out-of-band control, that must be + completely isolated. This feature would allow isolating a conductor to a + single out-of-band network. + +* A deployer may have multiple physical networks that not all conductors are + connected to. By configuring the mapping correctly, conductors can manage + only the nodes which they can communicate with. This is described further + in another RFE.[0] + + +Proposed change +=============== + +We propose adding a ``conductor_group`` configuration option for the conductor, +which is a single arbitrary string specifying some grouping characteristic of +the conductor. + +We also propose a ``conductor_group`` field in the node record, which will be +used to map a node to a conductor. This matching will be done case-insensitive, +to make things a bit easier for operators. + +A blank ``conductor_group`` field or config is the default. A conductor without +a group can only manage nodes without a group, and a node without a group can +only be managed by a conductor without a group. + +The hash ring will need to be modified to take grouping into account, as +described below in the `RPC API Impact`_ section. + +Alternatives +------------ + +Another RFE[1] proposes a complex system of hard and soft affinity, affinity +and anti-affinity, and scoring of placement to a conductor with multiple tags. +This is quite complex, and I don't believe we'll get it done in the short term. +Completing this more basic work doesn't block this more complex work, and so +we should take it one step at a time. + +Data model impact +----------------- + +A ``conductor_group`` field will be added to the nodes table, as a +``VARCHAR(255)``. This will have a default of ``""``, or the empty string. +This string will be used in the hash ring calculation, so there's no sense in +defaulting to ``NULL``. + +A ``conductor_group`` field will also be added to the conductors table, also as +a ``VARCHAR(255)``. This will also have a default of ``""``, or the empty +string. This will be used to build the hash ring to look up where nodes should +land. + +State Machine Impact +-------------------- + +None. + +REST API impact +--------------- + +The ``conductor_group`` field of the node will be added to the node object in +the REST API, with a microversion as usual. It will be allowed in POST and +PATCH requests. As with the database, it will be restricted to 255 characters. +There must be a conductor in that group available, as the conductor services +node creation and updates, and is selected via the hash ring. + +It's worth noting that we would like to expose the grouping of conductors +via the REST API eventually. However, the best way to do this isn't +immediately clear, so we leave it outside the scope of this spec for now. +Another RFE[3] proposes a service management API that may be a good fit. + +Client (CLI) impact +------------------- + +"ironic" CLI +~~~~~~~~~~~~ +None, it's deprecated. + +"openstack baremetal" CLI +~~~~~~~~~~~~~~~~~~~~~~~~~ +The ``conductor_group`` field for a node will be exposed in the client output, +and added to the ``node create`` and ``node set`` commands. + +RPC API impact +-------------- + +This will affect which conductor is the destination for RPC calls corresponding +to a given node, however won't have a direct effect on the RPC API itself. + +The hash ring will change such that the internal keys for the hash ring will +now be of the structure ``"$conductor_group:$drivername"``. A colon (``:``) is +used as the separator between the two, to eliminate conflicts between +conductor groups and drivers or hardware types. For example, an ``agent_ilo`` +key with no separator could mean a node with no group and the ``agent_ilo`` +driver, or it could mean a node with group ``agent_`` using the ``ilo`` +hardware type. To handle upgrades, hash ring keys will be built without +the conductor group while the service is pinned to a version before this +feature, and built with the conductor group when the service is unpinned or +pinned to a version after this feature is implemented. + +We handle upgrades by ignoring grouping for services which have a pin in the +RPC version that is less than the release with this feature. Once everything +is upgraded and unpinned, we begin using the grouping tags configured. + +Operators should leave a sufficient number of conductors available without a +grouping tag configured to run the cluster, until nodes can be configured +with the grouping tag. Any nodes without a grouping tag will only be +managed by conductors without a grouping tag. + +Driver API impact +----------------- + +Hash ring generation and lookup will include the grouping tag, as specified +above in the `RPC API Impact`_ section. + +Nova driver impact +------------------ + +This change is transparent to Nova. + +Ramdisk impact +-------------- + +None. + +Security impact +--------------- + +No direct impact; however this provides another mechanism for securing a +deployment by enabling logical infrastructure segregation. + +Other end user impact +--------------------- + +None. + +Scalability impact +------------------ + +None. + +Performance Impact +------------------ + +None. + +Other deployer impact +--------------------- + +Deployers that wish to use this feature will need to manage the process of +labeling conductors and nodes to enable it, which may be a non-trivial task. + +Developer impact +---------------- + +None. + + +Implementation +============== + +Assignee(s) +----------- + +Primary assignee: + jroll + +Other contributors: + dtantsur + +Work Items +---------- + +* Add database fields. + +* Add conductor config and populate conductor DB field. + +* Change the hash ring calculation, and bump the RPC API so that we can pin + during upgrades. + +* Add fields to the node and conductor objects. + +* Make the REST API changes. + +* Update the client library/CLI. + +* Document the feature. + + +Dependencies +============ + +None. + + +Testing +======= + +Unit tests should be sufficient, as that's how we test our hash ring now. +It's difficult to test this with Tempest without exposing conductor grouping +via the REST API. + +Upgrades and Backwards Compatibility +==================================== + +This is described in the `RPC API Impact`_ section. + + +Documentation Impact +==================== + +This should be documented in the install guide and admin guide. + + +References +========== + +[0] https://storyboard.openstack.org/#!/story/1734876 + +[1] https://storyboard.openstack.org/#!/story/1739426 + +[2] Notes from the Rocky PTG session: + https://etherpad.openstack.org/p/ironic-rocky-ptg-location-awareness + +[3] https://storyboard.openstack.org/#!/story/1526759 diff --git a/specs/not-implemented/conductor-node-locality-affinity.rst b/specs/not-implemented/conductor-node-locality-affinity.rst new file mode 120000 index 00000000..ceafa311 --- /dev/null +++ b/specs/not-implemented/conductor-node-locality-affinity.rst @@ -0,0 +1 @@ +../approved/conductor-node-locality-affinity.rst \ No newline at end of file