Add Cluster API Kubernetes COE driver

This spec proposes a new driver class that implements Kubernetes
cluster orchestration using Cluster API. The intention of this
proposal is to create a simplified Kubernetes driver that takes
advantage of a broader cross-infrastructure community using Cluster
API.

story: 2009780
Change-Id: I0750ec7440c1fa329104a524cf6779203e842c7c
Stig Telfer 2022-01-12 21:48:00 +00:00 committed by John Garbutt
parent 188a3338f7
commit d4f010eee2
2 changed files with 296 additions and 0 deletions


@@ -6,6 +6,14 @@
OpenStack Magnum Design Specifications
==================================================

Antelope approved specs:

.. toctree::
   :glob:
   :maxdepth: 2

   specs/antelope/*

Ussuri approved specs:

.. toctree::

@@ -0,0 +1,288 @@
..
   This work is licensed under a Creative Commons Attribution 3.0 Unported
   License.

   http://creativecommons.org/licenses/by/3.0/legalcode

..
   This template should be in ReSTructured text. The filename in the git
   repository should match the launchpad URL, for example a URL of
   https://blueprints.launchpad.net/magnum/+spec/awesome-thing should be named
   awesome-thing.rst . Please do not delete any of the sections in this
   template. If you have nothing to say for a whole section, just write: None

   For help with syntax, see http://sphinx-doc.org/rest.html

   To test out your formatting, see http://www.tele3.cz/jbar/rest/rest.html

==================
Cluster API driver
==================

https://storyboard.openstack.org/#!/story/2009780

Problem description
===================

The existing Magnum support for Kubernetes has proven hard to
maintain, particularly around upgrade, auto healing, auto scaling,
and keeping track of host operating system changes.

At the same time, Kubernetes Cluster API has gained a lot of
traction as a way to create, update and upgrade Kubernetes clusters.
In particular, there is an active community keeping the OpenStack
provider working:
https://github.com/kubernetes-sigs/cluster-api-provider-openstack

In addition, the community maintains a set of image build pipelines
to create kubeadm powered clusters across a range of operating
systems. These have been shown to work well with OpenStack:
https://image-builder.sigs.k8s.io/capi/providers/openstack.html

There is a strong case for a native OpenStack API that
gives you a multi-tenant cluster as a service that is well
integrated with Keystone.

Currently that isn't expected to be Cluster API CRDs directly,
although that may change. See the Cluster API book:
https://cluster-api.sigs.k8s.io/user/personas.html#service-provider-kubernetes-as-a-service

In this spec, we propose adding a new Magnum driver that allows
operators to offer Cluster API managed Kubernetes clusters, via
the existing Magnum API.

Proposed change
===============

For details on Cluster API terminology please see:
https://cluster-api.sigs.k8s.io/user/concepts.html

The new Cluster API driver will not make use of Heat; instead it
will create clusters by interacting with a Kubernetes cluster that
has been configured as a Cluster API management cluster.

For CI jobs, the expectation is that a Magnum tempest plugin will do
something like creating a minikube cluster with all the required
Cluster API components installed, then configuring the new Magnum
driver to have access to that cluster.

The intention for the new driver is to make it compatible with the
existing Magnum REST API. No changes to the REST API are anticipated.

Bootstrapping and managing the external management cluster and the
Cluster API service is out of scope. This driver would depend on a
Cluster API service being installed, in a similar way to how existing
drivers expect the Heat API to be installed.
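
To make the shape of the proposal concrete, the following is a minimal
sketch only, assuming the existing
``magnum.drivers.common.driver.Driver`` interface; the class name, the
``provides`` entry and the method bodies are illustrative, not final:

.. code-block:: python

    from magnum.drivers.common import driver


    class ClusterAPIDriver(driver.Driver):
        """Manage clusters via a Cluster API management cluster, not Heat."""

        @property
        def provides(self):
            # Advertise which (server_type, os, coe) combinations this
            # driver handles; the os value here is a placeholder.
            return [{'server_type': 'vm',
                     'os': 'ubuntu',
                     'coe': 'kubernetes'}]

        def create_cluster(self, context, cluster, cluster_create_timeout):
            # Apply the Cluster API resources for this cluster to the
            # management cluster, e.g. via a Helm release (see
            # Communicating with K8s below).
            raise NotImplementedError()

        def delete_cluster(self, context, cluster):
            # Remove the corresponding resources from the management
            # cluster and wait for teardown.
            raise NotImplementedError()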

Feature comparison
------------------

A feature comparison shows that Cluster API has much in common with Magnum,
but introduces some innovations that would be beneficial to Magnum.

+--------------------------+----------------------+---------------------------+
| Feature | Magnum | Cluster-API |
+==========================+======================+===========================+
| Cloud Provider OpenStack | Installed by default | Installed via Helm charts |
+--------------------------+----------------------+---------------------------+
| Host OS support | FCOS 33 supported, | Typically Ubuntu 20.04 |
| | 34 and beyond WIP | LTS, various choices |
| | (due to cgroups v2) | supported by the image |
| | | builder [#]_. |
| | *NOTE: no security | |
| | updates for FCOS 33 | |
| | any more?* | |
+--------------------------+----------------------+---------------------------+
| External dependencies | Heat, Keystone, | External Kubernetes, |
| | others. | Keystone, others. |
+--------------------------+----------------------+---------------------------+
| Supported CNIs | Flannel, Calico. | Further options available,|
| | | eg Cilium. |
+--------------------------+----------------------+---------------------------+
| Cinder CSI | Default from Victoria| Installed via Helm charts.|
+--------------------------+----------------------+---------------------------+
| Prometheus monitoring | Installed by default.| Installed via Helm charts.|
+--------------------------+----------------------+---------------------------+
| Ingress controllers | Octavia, Traefik, | Nginx installed via Helm. |
| | Nginx. | |
+--------------------------+----------------------+---------------------------+
| Horizon dashboard | Supported. | Not supported. |
+--------------------------+----------------------+---------------------------+
| Delegated authorisation | Keystone trust. | Application credential. |
+--------------------------+----------------------+---------------------------+
| Kubernetes CRD API | None. | Helm, flux, argo, etc. |
+--------------------------+----------------------+---------------------------+
| In-place upgrades        | Partial - depends on | Various supported         |
|                          | driver.              | strategies, built in      |
|                          |                      | infrastructure-agnostic   |
|                          |                      | code. Defaults to rolling |
|                          |                      | upgrade (like Magnum).    |
+--------------------------+----------------------+---------------------------+
| Self-healing             | Partial / uncertain. | Supported with            |
|                          |                      | infrastructure-agnostic   |
|                          |                      | code via reconciliation   |
|                          |                      | loop.                     |
+--------------------------+----------------------+---------------------------+
| Auto-scaling             | Supported for        | Supported with            |
|                          | default node group.  | infrastructure-agnostic   |
|                          |                      | code.                     |
+--------------------------+----------------------+---------------------------+
| Multiple node groups | Supported. | Supported, with no default|
| | | group. |
+--------------------------+----------------------+---------------------------+
| Additional networks | Supported (review | Supported |
| | pending) | |
+--------------------------+----------------------+---------------------------+
| New Kubernetes versions | Test burden on Magnum| Test burden split between |
| | entirely. | Cluster API for |
| | | Kubernetes, Magnum for |
| | | infrastructure provision. |
+--------------------------+----------------------+---------------------------+

Alternatives
------------

We could attempt to maintain and fix the existing k8s support, but
it is getting harder and harder to do so.

Implementation
==============

The initial target for the driver is to support the following operations,
likely with a new patch adding each operation, in roughly the following
order:

* define templates that map to different k8s versions
  (i.e. image_id from template)
* create a cluster and delete a cluster
  (flavor_id and node counts from default node group)
* coe credentials to help users create a valid kubeconfig
  (see the sketch after this list)
* Devstack install (manually) passing Sonobuoy conformance tests
* support for resizing the default node group
* upgrade by moving to a newer template (with updated image)
* add/remove/resize node groups
* customize internal and external network uuids
* Sonobuoy conformance tests passing in Zuul CI
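
For the coe credentials step, Cluster API already generates a
kubeconfig for each workload cluster and stores it in a Secret named
``<cluster-name>-kubeconfig`` on the management cluster. A minimal
sketch of retrieving it, assuming the standard Kubernetes Python
client and a hypothetical ``magnum`` namespace:

.. code-block:: python

    import base64

    from kubernetes import client, config


    def workload_kubeconfig(cluster_uuid, namespace="magnum"):
        """Fetch the kubeconfig Cluster API created for a workload cluster."""
        # Load credentials for the *management* cluster.
        config.load_kube_config()
        core = client.CoreV1Api()
        secret = core.read_namespaced_secret(
            f"{cluster_uuid}-kubeconfig", namespace)
        # Cluster API stores the kubeconfig under the "value" key.
        return base64.b64decode(secret.data["value"]).decode()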

Initial POC
-----------

An initial POC making this set of assumptions can be found here:
https://review.opendev.org/c/openstack/magnum/+/851076

The POC established a few ground rules for cluster templates:

* image_id in the template is the key part of the template
* kube_tag, i.e. the k8s version, is fixed for the image_id;
  ideally we should validate that link
* the uuid is used for the name of templates and clusters in k8s,
  as some Magnum names are not valid k8s resource names, and this
  reduces the need to sanitise user inputs (see the naming
  illustration at the end of this section)

For clusters we found:

* the uuid is used for the name in k8s, as with templates
* the default node group uses cluster.flavor_id and cluster.node_count
* other node groups map in a similar way, using the uuid as the name
* the control plane size can come from cluster.master_flavor_id
  (although there might be good reason to ignore this and
  have it defined only in the template)

For an initial version of the driver, we can leave everything
else pretty much hardcoded. Eventually we could respect some tags
added to both the template and the cluster, but they are unlikely
to be in a similar format to the existing drivers.

Note: all network specifications would be ignored in this first
version, as the external network can be specified by the default
helm chart variables.
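
To illustrate the naming rule above: Kubernetes object names must be
valid DNS-1123 labels, which arbitrary Magnum names often are not,
while uuids always are. A small self-contained example:

.. code-block:: python

    import re

    # Lowercase alphanumerics and '-', starting and ending with an
    # alphanumeric, at most 63 characters (the DNS-1123 label rules).
    DNS1123_LABEL = re.compile(r"^[a-z0-9]([-a-z0-9]*[a-z0-9])?$")


    def valid_k8s_name(name):
        return len(name) <= 63 and bool(DNS1123_LABEL.match(name))


    valid_k8s_name("My Production Cluster!")                # False
    valid_k8s_name("6ea4f1c7-431b-4d5e-9a3c-2b8f0d7e4a11")  # True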

Communicating with K8s
----------------------

The current POC uses Jinja templates in the driver to
template out the required K8s CRDs. While that is great if
you are used to Ansible, likely everyone else considers
that strange.

We propose moving towards using Helm, whereby a template
will map to a specific Helm chart and release.

A cluster will take the image_id from the template and the
flavor_ids and node counts from the cluster, to create a
set of Helm values to be used with the above chart.

Operators can then customise a set of "standard" Helm
charts as needed for their specific cloud.

For an example of what we might use as a starting point,
please see:
https://github.com/stackhpc/capi-helm-charts
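
A sketch of what that mapping could look like, assuming value names
similar to the capi-helm-charts linked above; the chart name, the
value layout and the ``magnum`` namespace are all assumptions of this
sketch, not a fixed schema:

.. code-block:: python

    import json
    import subprocess
    import tempfile


    def build_helm_values(cluster):
        template = cluster.cluster_template
        return {
            # image_id is the key part of the template; kube_tag is
            # fixed for that image.
            "machineImageId": template.image_id,
            "kubernetesVersion": template.labels.get("kube_tag"),
            "controlPlane": {
                # Possibly pinned in the template instead, as noted above.
                "machineFlavor": cluster.master_flavor_id,
                "machineCount": cluster.master_count,
            },
            "nodeGroups": [
                # uuid as the k8s name, as described in the POC notes.
                {"name": group.uuid,
                 "machineFlavor": group.flavor_id,
                 "machineCount": group.node_count}
                for group in cluster.nodegroups
                if group.role != "master"
            ],
        }


    def apply_cluster_release(cluster, chart="capi-cluster"):
        # One Helm release per Magnum cluster, named by uuid.
        with tempfile.NamedTemporaryFile("w", suffix=".json") as values:
            json.dump(build_helm_values(cluster), values)
            values.flush()
            subprocess.run(
                ["helm", "upgrade", "--install", cluster.uuid, chart,
                 "--namespace", "magnum", "--create-namespace",
                 "--values", values.name],
                check=True)

Using ``helm upgrade --install`` keeps the operation idempotent: the
same call path covers cluster creation, resize and upgrade, with the
reconciliation handled by Cluster API.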

OpenStack creds
---------------

The current POC found it hard to use the existing cluster
trusts with cluster-api-provider-openstack, largely
due to limitations in gophercloud's parsing of clouds.yaml
files. It can work with the OpenStack cloud provider, so
perhaps this might be fixable upstream.

Instead we considered creating per-cluster application
credentials that are registered with each cluster on creation,
and can be deleted when the cluster is removed. However, the
current POC code assumes pre-created per-project credentials,
to make it simpler.

We need to check if these credentials can be rotated when
different users manage the cluster, without re-creating all
nodes in the cluster. In particular, when a user has their
role assignments changed, this can invalidate all of their
application credentials, causing problems similar to those
seen when the user who created the Magnum cluster is deleted.

More work is needed to determine the best way forward; likely
it will involve the driver creating application credentials
for each cluster, and storing those in Kubernetes.
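
A sketch of that direction, assuming openstacksdk for the Keystone
calls; the credential naming and the clouds.yaml layout expected by
the chart are assumptions of this sketch:

.. code-block:: python

    import openstack


    def create_cluster_app_cred(conn, cluster, auth_url):
        """Create a per-cluster application credential.

        conn is an openstacksdk Connection authenticated as the user
        (or trustee) the cluster should act as.
        """
        app_cred = conn.identity.create_application_credential(
            user=conn.current_user_id,
            name=f"magnum-{cluster.uuid}",
        )
        # cluster-api-provider-openstack consumes credentials as a
        # clouds.yaml held in a Kubernetes Secret; the exact layout
        # depends on the chart in use.
        return {
            "clouds": {
                "openstack": {
                    "auth_type": "v3applicationcredential",
                    "auth": {
                        "auth_url": auth_url,
                        "application_credential_id": app_cred.id,
                        "application_credential_secret": app_cred.secret,
                    },
                },
            },
        }

On cluster delete, the matching credential could then be removed with
``conn.identity.delete_application_credential``.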

Assignee(s)
-----------

Primary assignee:

* Matt Pryor (StackHPC)

With support from:

* John Garbutt (StackHPC)

Milestones
----------

* initial driver with create and delete
* full support for all the operations listed above
* CI functional tests passing
* conformance tests passing on the Helm chart repo

Work Items
----------

Complete all the above milestones.

Dependencies
============

None

Security Impact
===============

A new driver built upon Cluster API has the potential to improve
security for Magnum, due to wider scrutiny of the open source
implementation, a smaller code base for the Magnum team to maintain,
and a larger community focusing on the security of Cluster API
managed clusters.

References
==========

.. [#] https://docs.openstack.org/magnum/latest/user/#rolling-upgrade
.. [#] https://cluster-api.sigs.k8s.io
.. [#] https://github.com/kubernetes-sigs/image-builder/tree/master/images/capi/packer/qemu
.. [#] https://review.opendev.org/c/openstack/magnum/+/815521