Propose control plane resize feature.

Spec to move resize logic into the driver so that CAPI drivers can
optionally implement it.

Change-Id: I3d25c010b38134dabf64185bd38dbb3acb8b1dd8
Dale Smith 2024-01-11 15:00:12 +13:00
parent 042d55a322
commit 4dde4d5af1
1 changed files with 129 additions and 0 deletions

@ -0,0 +1,129 @@
Magnum Control Plane resize
===========================
This is a proposal to implement support for resizing the number
of control plane nodes.
Problem Description
-------------------
Currently, no driver in Magnum supports resizing the number of control plane nodes.
Cluster API supports changing the control plane node count to any odd number.
An odd count lets etcd's Raft consensus keep a majority without wasting nodes:
growing from an odd to an even count raises the quorum size but does not increase
the number of node failures that can be tolerated.
In practice, this means 1, 3, 5 and 7 nodes are supported, though any odd number
may be acceptable[1].
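The arithmetic behind the odd-number rule can be sketched as follows. This is an illustration of majority quorum only, not code from Magnum or Cluster API, and the function names are made up for the example.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def failure_tolerance(members: int) -> int:
    """Members that can fail while a majority is still reachable."""
    return members - quorum(members)


# Print quorum size and failure tolerance for control plane sizes 1 to 7.
for size in range(1, 8):
    print(
        f"{size} members: quorum {quorum(size)}, "
        f"tolerates {failure_tolerance(size)} failure(s)"
    )
```

A 4-member cluster needs a quorum of 3 and so tolerates one failure, the same as a 3-member cluster, which is why even sizes add cost without adding availability.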
Use Cases
---------
Below are some of the use cases that this change addresses:
1. User creates a cluster with a single Control Plane node, and then
updates their requirement to have an HA control plane.
2. User has an HA cluster and wants to save resources by reducing to a single
control plane node.
3. Moving between 3 and 5 control plane nodes may be a good choice for users
as their availability needs change.
All of these actions currently require re-creation of the cluster.
Proposed Change
---------------
The proposed change includes:
1. Move the logic of `default-master` resize away from the Conductor and into the
driver.
2. The driver may choose to validate the new size, and only support odd-numbered
sizes such as 1, 3, 5 and 7.
3. No existing drivers in Magnum will support this resize action. The first drivers
to use this will be out of tree Cluster API drivers.
Check sections 'Data Model Impact' and 'REST API Impact' for more details.
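A minimal sketch of how the driver-side split might look is shown here. Every name in it (`BaseDriver`, `ExampleCapiDriver`, `validate_master_size`, `ResizeNotSupported`, `SUPPORTED_SIZES`) is hypothetical and chosen for illustration; the spec does not define the actual method names or exception types.

```python
class ResizeNotSupported(Exception):
    """Hypothetical error for drivers that do not implement the resize."""


class BaseDriver:
    """Hypothetical stand-in for a Magnum driver base class."""

    def validate_master_size(self, node_count: int) -> None:
        # Default behaviour: reject any control plane resize, which matches
        # what existing drivers do today.
        raise ResizeNotSupported(
            "Resizing the default-master nodegroup is not supported "
            "by this driver."
        )


class ExampleCapiDriver(BaseDriver):
    """Hypothetical out-of-tree driver that opts in to the resize."""

    # Assumed policy: only the odd sizes named in the proposal.
    SUPPORTED_SIZES = (1, 3, 5, 7)

    def validate_master_size(self, node_count: int) -> None:
        if node_count not in self.SUPPORTED_SIZES:
            raise ValueError(
                f"Control plane size {node_count} is not supported; "
                f"choose one of {self.SUPPORTED_SIZES}."
            )
```

With this shape, the Conductor would only need to call the driver's validation hook, and each driver decides for itself whether a requested size is acceptable.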
Alternatives
------------
The alternative is to continue requiring users to re-create clusters when
they have a need for differing control plane sizes.
Data Model Impact
-----------------
There is no data model impact.
The database already stores the number of control plane nodes.
REST API Impact
---------------
The REST API does not need to change.
This feature will be supported and implemented using the existing resize API
under `POST /v1/clusters/{cluster_ident}/actions/resize` by providing a
`nodegroup` parameter of `default-master` and the requested size.
This resize API currently does not permit any changes to the `default-master`
nodegroup.
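As an illustration, a request for a three-node control plane might be built as below. The `nodegroup` parameter and the URL come from the text above; the `node_count` field name, the helper function and the cluster name `my-cluster` are assumptions made for the example, and the code only constructs the request without sending it.

```python
import json


def build_resize_request(cluster_ident, node_count):
    """Return the URL path and JSON body for a control plane resize."""
    path = f"/v1/clusters/{cluster_ident}/actions/resize"
    body = json.dumps(
        {
            # Assumed field name for the requested size.
            "node_count": node_count,
            # Target the control plane nodegroup rather than the workers.
            "nodegroup": "default-master",
        }
    )
    return path, body


path, body = build_resize_request("my-cluster", 3)
print("POST", path)
print(body)
```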
Security Impact
---------------
None
Notifications Impact
--------------------
None
Developer Impact
----------------
Driver developers will have the option of supporting this resize action.
The default is backwards compatible: a driver that does not opt in will continue
to reject the resize as not supported.
Implementation
--------------
Assignee(s)
-----------
dalees
Milestones
----------
Target Milestone for completion:
Caracal
Work Items
----------
* Move resize validation logic from the Conductor into the driver.
* Define a default implementation for all drivers. The default will be Not Supported, which
is consistent with the current behaviour.
* Implement the resize action in one or more drivers. The assignee intends to implement this in
the CAPI Helm driver, and inform other out-of-tree developers of this optional feature.
Dependencies
------------
None
References
----------
[1] https://etcd.io/docs/v3.5/faq/#what-is-failure-tolerance