Add Cluster API Kubernetes COE driver

This spec proposes a new driver class that implements Kubernetes cluster
orchestration using Cluster API. The intention of this proposal is to
create a simplified Kubernetes driver that takes advantage of a broader
cross-infrastructure community using Cluster API.

story: 2009780
Change-Id: I0750ec7440c1fa329104a524cf6779203e842c7c

parent 188a3338f7
commit d4f010eee2

@ -6,6 +6,14 @@
OpenStack Magnum Design Specifications
==================================================

Antelope approved specs:

.. toctree::
   :glob:
   :maxdepth: 2

   specs/antelope/*

Ussuri approved specs:

.. toctree::

@ -0,0 +1,288 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

..
 This template should be in ReSTructured text. The filename in the git
 repository should match the launchpad URL, for example a URL of
 https://blueprints.launchpad.net/magnum/+spec/awesome-thing should be named
 awesome-thing.rst . Please do not delete any of the sections in this
 template. If you have nothing to say for a whole section, just write: None
 For help with syntax, see http://sphinx-doc.org/rest.html
 To test out your formatting, see http://www.tele3.cz/jbar/rest/rest.html

==================
Cluster API driver
==================

https://storyboard.openstack.org/#!/story/2009780

Problem description
===================

The existing Magnum support for Kubernetes has proven hard to
maintain, particularly around upgrades, auto healing, auto scaling,
and keeping track of host operating system changes.

At the same time, Kubernetes Cluster API has gained a lot of
traction as a way to create, update and upgrade Kubernetes clusters.
In particular, there is an active community keeping the OpenStack
connector working:
https://github.com/kubernetes-sigs/cluster-api-provider-openstack

In addition, the community maintains a set of image build pipelines
to create kubeadm powered clusters across a range of operating
systems. These have been shown to work well with OpenStack:
https://image-builder.sigs.k8s.io/capi/providers/openstack.html

There is a strong case for a native OpenStack API that
gives you a multi-tenant cluster as a service that is well
integrated with Keystone.
Currently that isn't expected to be Cluster API CRDs directly,
although that may change. See the Cluster API book:
https://cluster-api.sigs.k8s.io/user/personas.html#service-provider-kubernetes-as-a-service

In this spec, we propose adding a new Magnum driver that allows
operators to offer Cluster API managed Kubernetes clusters via
the existing Magnum API.

Proposed change
===============

For details on Cluster API terminology please see:
https://cluster-api.sigs.k8s.io/user/concepts.html

The new Cluster API driver will not make use of Heat. Instead, it
will create clusters by interacting with a Kubernetes cluster that
has been configured as a Cluster API Management Cluster.

For CI jobs, the expectation is that a tempest Magnum plugin does
something like creating a minikube cluster with all the required
Cluster API components installed, then configures the new Magnum
driver to have access to that cluster.

The intention is for the new driver to be compatible with the
Magnum REST API. No changes to the REST API are anticipated.

Bootstrapping and managing the external management cluster and the
Cluster API service is out of scope. This driver would depend on
a Cluster API service being installed, in the same way that the
existing drivers expect the Heat API to be installed.

Feature comparison
------------------

A feature comparison shows that Cluster API has much in common with Magnum,
but introduces some innovations that would be beneficial to Magnum.

+--------------------------+----------------------+---------------------------+
| Feature                  | Magnum               | Cluster-API               |
+==========================+======================+===========================+
| Cloud Provider OpenStack | Installed by default | Installed via Helm charts |
+--------------------------+----------------------+---------------------------+
| Host OS support          | FCOS 33 supported,   | Typically Ubuntu 20.04    |
|                          | 34 and beyond WIP    | LTS, various choices      |
|                          | (due to cgroups v2)  | supported by the image    |
|                          |                      | builder [#]_.             |
|                          | *NOTE: no security   |                           |
|                          | updates for FCOS 33  |                           |
|                          | any more?*           |                           |
+--------------------------+----------------------+---------------------------+
| External dependencies    | Heat, Keystone,      | External Kubernetes,      |
|                          | others.              | Keystone, others.         |
+--------------------------+----------------------+---------------------------+
| Supported CNIs           | Flannel, Calico.     | Further options available,|
|                          |                      | e.g. Cilium.              |
+--------------------------+----------------------+---------------------------+
| Cinder CSI               | Default from Victoria| Installed via Helm charts.|
+--------------------------+----------------------+---------------------------+
| Prometheus monitoring    | Installed by default.| Installed via Helm charts.|
+--------------------------+----------------------+---------------------------+
| Ingress controllers      | Octavia, Traefik,    | Nginx installed via Helm. |
|                          | Nginx.               |                           |
+--------------------------+----------------------+---------------------------+
| Horizon dashboard        | Supported.           | Not supported.            |
+--------------------------+----------------------+---------------------------+
| Delegated authorisation  | Keystone trust.      | Application credential.   |
+--------------------------+----------------------+---------------------------+
| Kubernetes CRD API       | None.                | Helm, flux, argo, etc.    |
+--------------------------+----------------------+---------------------------+
| In-place upgrades        | Partial - depends on | Various supported         |
|                          | driver.              | strategies, built in      |
|                          |                      | infrastructure agnostic   |
|                          |                      | code. Defaults to rolling |
|                          |                      | upgrade (like Magnum).    |
+--------------------------+----------------------+---------------------------+
| Self-healing             | Partial / uncertain. | Supported with            |
|                          |                      | infrastructure agnostic   |
|                          |                      | code via reconciliation   |
|                          |                      | loop.                     |
+--------------------------+----------------------+---------------------------+
| Auto-scaling             | Supported for        | Supported with            |
|                          | default node group.  | infrastructure agnostic   |
|                          |                      | code.                     |
+--------------------------+----------------------+---------------------------+
| Multiple node groups     | Supported.           | Supported, with no default|
|                          |                      | group.                    |
+--------------------------+----------------------+---------------------------+
| Additional networks      | Supported (review    | Supported                 |
|                          | pending)             |                           |
+--------------------------+----------------------+---------------------------+
| New Kubernetes versions  | Test burden on Magnum| Test burden split between |
|                          | entirely.            | Cluster API for           |
|                          |                      | Kubernetes, Magnum for    |
|                          |                      | infrastructure provision. |
+--------------------------+----------------------+---------------------------+

Alternatives
------------

We could attempt to maintain and fix the existing k8s support, but
it's getting harder and harder to do so.

Implementation
==============

The initial target for the driver is to support the following operations,
likely with a new patch adding each operation, in roughly the following
order:

* define templates that map to different k8s versions
  (i.e. image_id from template)
* create a cluster and delete a cluster
  (flavor_id and node counts from default node group)
* coe credentials to help users create a valid kubeconfig
* Devstack install (manually) passing Sonobuoy conformance tests
* support for resizing the default node group
* upgrade by moving to a newer template (with updated image)
* add/remove/resize node groups
* customize internal and external network uuids
* Sonobuoy conformance tests passing in Zuul CI

Initial POC
-----------

An initial POC making this set of assumptions can be found here:
https://review.opendev.org/c/openstack/magnum/+/851076

The POC established a few ground rules for cluster templates:

* image_id is the key part of the template
* kube_tag, i.e. the k8s version, is fixed for the image_id;
  ideally we should validate that link
* the uuid is used for the name of templates and clusters in k8s,
  as some Magnum names are not valid k8s resource names,
  and this reduces the need to sanitise user inputs

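To illustrate the naming rule above, here is a minimal sketch, assuming the
k8s resources in question accept RFC 1123 label names (lowercase
alphanumerics and ``-``, at most 63 characters). The helper names
``is_valid_k8s_name`` and ``k8s_name_for`` are hypothetical and are not taken
from the POC.

```python
import re
import uuid

# RFC 1123 label: lowercase alphanumerics and '-', starting and ending
# with an alphanumeric character, at most 63 characters long.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def is_valid_k8s_name(name: str) -> bool:
    """Return True if name is a valid RFC 1123 label."""
    return len(name) <= 63 and DNS_LABEL.fullmatch(name) is not None


def k8s_name_for(magnum_uuid: str) -> str:
    """Hypothetical helper: derive a k8s resource name from a Magnum uuid."""
    # str(UUID) always yields the canonical lowercase, hyphenated form.
    return str(uuid.UUID(magnum_uuid))
```

A user supplied name such as ``My Cluster_01`` fails this check, while any
canonical uuid passes it. Resource kinds that require a name starting with a
letter would additionally need a fixed prefix, which this sketch does not add.
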
For clusters we found:

* uuid used for the name in k8s, as with templates
* default node group uses: cluster.flavor_id, cluster.node_count
* Other node groups map in a similar way, using uuid as the name
* control plane size can come from cluster.master_flavor_id
  (although there might be good reason to ignore this and
  have it defined only in the template)

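The cluster mapping above could be sketched as follows. This is only an
illustration under stated assumptions: the ``Cluster`` and ``NodeGroup``
dataclasses are hypothetical stand-ins for the Magnum objects, and the output
keys (``controlPlane``, ``nodeGroups``, ``machineFlavor``, ``machineCount``)
are invented for the example rather than taken from the POC.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class NodeGroup:
    """Hypothetical stand-in for a non-default Magnum node group."""
    uuid: str
    flavor_id: str
    node_count: int


@dataclass
class Cluster:
    """Hypothetical stand-in for a Magnum cluster record."""
    uuid: str
    default_nodegroup_uuid: str
    flavor_id: str
    node_count: int
    master_flavor_id: str
    extra_nodegroups: List[NodeGroup] = field(default_factory=list)


def _group(name: str, flavor_id: str, count: int) -> Dict[str, Any]:
    # Key names are illustrative only.
    return {"name": name, "machineFlavor": flavor_id, "machineCount": count}


def cluster_topology(cluster: Cluster) -> Dict[str, Any]:
    """Map a cluster onto uuid-named control plane and node groups."""
    # The default node group takes its size from the cluster itself.
    groups = [_group(cluster.default_nodegroup_uuid,
                     cluster.flavor_id, cluster.node_count)]
    # Other node groups map the same way, again named by uuid.
    groups += [_group(g.uuid, g.flavor_id, g.node_count)
               for g in cluster.extra_nodegroups]
    return {
        "clusterName": cluster.uuid,
        "controlPlane": {"machineFlavor": cluster.master_flavor_id},
        "nodeGroups": groups,
    }
```

If the control plane flavor ends up being defined only in the template, the
``controlPlane`` entry would simply be dropped from this mapping.
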
For an initial version of the driver, we can leave everything
else pretty much hardcoded. Eventually we could respect some tags
added to both the template and the cluster, but they are unlikely
to use a format similar to the existing drivers.

Note: all network specifications would be ignored in this first
version, as the external network can be specified by the default
helm chart variables.

Communicating with K8s
----------------------

The current POC uses Jinja templates in the driver to
template out the required K8s CRDs. While that is great if
you are used to Ansible, everyone else likely considers
that strange.

We propose that we move towards using helm, whereby a template
will map to a specific helm chart and release.

A cluster will take the image_uuid from the template and
flavor_ids and node counts from the cluster, to create a
set of helm values to be used with the above chart.

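As a rough sketch of that flow, assuming the driver shells out to the
``helm`` CLI: the values keys (``machineImageId``, ``nodeGroups`` and so on),
the function names and the file path are hypothetical and would depend on the
chart an operator chooses. The values are written as JSON only to keep the
example dependency free; JSON is valid YAML, so helm accepts it as a values
file.

```python
import json
from typing import Any, Dict, List


def build_helm_values(image_uuid: str,
                      master_flavor_id: str,
                      node_groups: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Combine template and cluster fields into chart values.

    Key names are illustrative, not those of any real chart.
    """
    return {
        "machineImageId": image_uuid,  # from the cluster template
        "controlPlane": {"machineFlavor": master_flavor_id},
        "nodeGroups": node_groups,  # flavor_ids and node counts
    }


def render_values_file(values: Dict[str, Any]) -> str:
    """Serialise values; sorted keys keep repeated renders identical."""
    return json.dumps(values, indent=2, sort_keys=True)


def build_helm_command(release: str, chart: str,
                       namespace: str, values_path: str) -> List[str]:
    """Build the argv for an idempotent install-or-upgrade of a release."""
    return [
        "helm", "upgrade", release, chart,
        "--install",
        "--namespace", namespace,
        "--values", values_path,
    ]
```

Using ``helm upgrade --install`` means that create, resize and upgrade can
all go through the same call with different values, which fits the idea of a
template mapping to one chart and release.
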
Operators can then customise a set of "standard" helm
charts as needed for their specific cloud.

For an example of what we might use as a starting point,
please see:
https://github.com/stackhpc/capi-helm-charts

OpenStack creds
---------------

The current POC found it hard to use the existing cluster
trusts with Cluster API Provider OpenStack, largely
due to limitations in gophercloud's parsing of clouds.yaml
files. Trusts do work with the OpenStack cloud provider, so
perhaps this is fixable upstream.

Instead we considered creating per cluster application
credentials that are registered with each cluster on creation,
and can be deleted when the cluster is removed. The current
POC code, however, assumes pre-created per project credentials,
to keep it simple.

We need to check if these credentials can be rotated when
different users manage the cluster, without re-creating all
nodes in the cluster. In particular, when a user has their
role assignments changed, this can invalidate all their
application credentials, and cause problems similar to those
seen when the user who created a Magnum cluster is deleted.

More work is needed to determine the best way forward. It will
likely involve the driver creating application credentials
for each cluster, and storing those in Kubernetes.

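One possible shape for the last option, as a sketch only: the driver renders
a clouds.yaml that uses application credential auth and wraps it in a
Kubernetes Secret manifest. The secret name suffix, the ``openstack`` cloud
name and both function names are assumptions made for this example. Creating
the application credential in Keystone and applying the manifest are left
out.

```python
import base64
import json
from typing import Any, Dict


def build_clouds_config(auth_url: str, app_cred_id: str,
                        app_cred_secret: str,
                        region_name: str) -> Dict[str, Any]:
    """Render clouds.yaml content for application credential auth."""
    return {
        "clouds": {
            # The cloud name "openstack" is an arbitrary choice here.
            "openstack": {
                "auth_type": "v3applicationcredential",
                "auth": {
                    "auth_url": auth_url,
                    "application_credential_id": app_cred_id,
                    "application_credential_secret": app_cred_secret,
                },
                "region_name": region_name,
            }
        }
    }


def build_secret_manifest(cluster_uuid: str, namespace: str,
                          clouds: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap the clouds config in a per cluster Secret manifest."""
    # JSON is valid YAML, so the payload can be read as clouds.yaml.
    payload = json.dumps(clouds).encode("utf-8")
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "type": "Opaque",
        "metadata": {
            # Hypothetical naming scheme based on the cluster uuid.
            "name": f"{cluster_uuid}-cloud-credentials",
            "namespace": namespace,
        },
        # Secret "data" values must be base64 encoded.
        "data": {"clouds.yaml": base64.b64encode(payload).decode("ascii")},
    }
```

Because the secret is named after the cluster uuid, deleting the cluster can
remove both the secret and the matching application credential without
touching credentials that belong to other clusters.
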
Assignee(s)
-----------

Primary assignee:

* Matt Pryor (StackHPC)

With support from:

* John Garbutt (StackHPC)

Milestones
----------

* initial driver with create and delete
* full support for all expected operations
* CI functional tests passing
* Conformance tests passing on helm chart repo

Work Items
----------

Complete all the above milestones.

Dependencies
============

None

Security Impact
===============

A new driver built upon Cluster API has the potential to improve
security for Magnum, due to wider scrutiny of the open source
implementation, a smaller code base for the Magnum team to maintain
and a larger community focussing on the security of Cluster API's
managed clusters.

References
==========

.. [#] https://docs.openstack.org/magnum/latest/user/#rolling-upgrade
.. [#] https://cluster-api.sigs.k8s.io
.. [#] https://github.com/kubernetes-sigs/image-builder/tree/master/images/capi/packer/qemu
.. [#] https://review.opendev.org/c/openstack/magnum/+/815521