
==================
Cluster API driver
==================

https://storyboard.openstack.org/#!/story/2009780

Problem description
===================

The existing Magnum support for Kubernetes has proven hard to maintain, particularly around upgrades, auto healing, auto scaling, and keeping track of host operating system changes.

At the same time, Kubernetes Cluster API has gained a lot of traction as a way to create, update and upgrade Kubernetes clusters. In particular, there is an active community keeping the OpenStack connector working: https://github.com/kubernetes-sigs/cluster-api-provider-openstack

In addition, the community maintains a set of image build pipelines to create kubeadm-powered clusters across a range of operating systems. These have been shown to work well with OpenStack: https://image-builder.sigs.k8s.io/capi/providers/openstack.html

There is a strong case for a native OpenStack API that gives you multi-tenant clusters as a service, well integrated with Keystone. Currently that isn't expected to be the Cluster API CRDs directly, although that may change. See the Cluster API book: https://cluster-api.sigs.k8s.io/user/personas.html#service-provider-kubernetes-as-a-service

In this spec, we propose adding a new Magnum driver that allows operators to offer Cluster API managed Kubernetes clusters, via the existing Magnum API.

Proposed change
===============

For details on Cluster API terminology please see: https://cluster-api.sigs.k8s.io/user/concepts.html

The new Cluster API driver will not make use of Heat. Instead, it will create clusters by interacting with a Kubernetes cluster that has been configured as a Cluster API management cluster.
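
For illustration only, the sort of interaction involved is sketched below using the official Python kubernetes client; the kubeconfig path, the namespace and the object body are assumptions, not part of this spec:

.. code-block:: python

    # Sketch: submitting a Cluster API object to the management cluster.
    # The kubeconfig path and the "magnum" namespace are hypothetical.
    from kubernetes import client, config


    def create_capi_cluster(cluster):
        # Operator-supplied kubeconfig for the CAPI management cluster.
        config.load_kube_config(
            config_file="/etc/magnum/management-cluster.conf")
        client.CustomObjectsApi().create_namespaced_custom_object(
            group="cluster.x-k8s.io",
            version="v1beta1",
            namespace="magnum",
            plural="clusters",
            body={
                "apiVersion": "cluster.x-k8s.io/v1beta1",
                "kind": "Cluster",
                # The Magnum cluster uuid as the resource name (see below).
                "metadata": {"name": cluster.uuid},
                # infrastructureRef, controlPlaneRef, etc. omitted here; in
                # practice the whole object tree would come from a Helm chart.
                "spec": {},
            },
        )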

For CI jobs, the expectation is that a Magnum tempest plugin will do something like creating a minikube cluster with all the required Cluster API components installed, then configure the new Magnum driver with access to that cluster.

The intention is for the new driver to be compatible with the existing Magnum REST API; no changes to the REST API are anticipated.

Bootstrapping and managing the external management cluster and the Cluster API service are out of scope. This driver would depend on a Cluster API service being installed, much as the existing drivers expect the Heat API to be installed.

Feature comparison
------------------

A feature comparison shows that Cluster API has much in common with Magnum, but introduces some innovations that would be beneficial to Magnum.

.. list-table::
   :header-rows: 1

   * - Feature
     - Magnum
     - Cluster API
   * - Cloud Provider OpenStack
     - Installed by default.
     - Installed via Helm charts.
   * - Host OS support
     - FCOS 33 supported; 34 and beyond WIP (due to cgroups v2).
       NOTE: no security updates for FCOS 33 any more?
     - Typically Ubuntu 20.04 LTS; various choices supported by the
       image builder [1]_.
   * - External dependencies
     - Heat, Keystone, others.
     - External Kubernetes, Keystone, others.
   * - Supported CNIs
     - Flannel, Calico.
     - Further options available, e.g. Cilium.
   * - Cinder CSI
     - Default from Victoria.
     - Installed via Helm charts.
   * - Prometheus monitoring
     - Installed by default.
     - Installed via Helm charts.
   * - Ingress controllers
     - Octavia, Traefik, Nginx.
     - Nginx installed via Helm.
   * - Horizon dashboard
     - Supported.
     - Not supported.
   * - Delegated authorisation
     - Keystone trust.
     - Application credential.
   * - Kubernetes CRD API
     - None.
     - Helm, Flux, Argo, etc.
   * - In-place upgrades
     - Partial; depends on driver.
     - Various strategies supported, built in infrastructure-agnostic
       code. Defaults to rolling upgrade (like Magnum).
   * - Self-healing
     - Partial / uncertain.
     - Supported with infrastructure-agnostic code via the
       reconciliation loop.
   * - Auto-scaling
     - Supported for the default node group.
     - Supported with infrastructure-agnostic code.
   * - Multiple node groups
     - Supported.
     - Supported, with no default group.
   * - Additional networks
     - Supported (review pending).
     - Supported.
   * - New Kubernetes versions
     - Test burden entirely on Magnum.
     - Test burden split between Cluster API (Kubernetes) and Magnum
       (infrastructure provisioning).

Alternatives
------------

We could attempt to maintain and fix the existing Kubernetes support, but that is getting increasingly difficult.

Implementation
==============

The initial target for the driver is to support the following operations, likely with a new patch adding each operation, in roughly the following order (a sketch of the resulting driver skeleton follows the list):

* define templates that map to different Kubernetes versions (i.e. image_id from the template)
* create a cluster and delete a cluster (flavor_id and node counts from the default node group)
* COE credentials to help users create a valid kubeconfig
* DevStack install (manual) passing Sonobuoy conformance tests
* support for resizing the default node group
* upgrade by moving to a newer template (with an updated image)
* add/remove/resize node groups
* customise internal and external network uuids
* Sonobuoy conformance tests passing in Zuul CI
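
A rough sketch of what this implies for the driver class is shown below. The method signatures follow magnum.drivers.common.driver.Driver at the time of writing, while the advertised (server_type, os, coe) tuple and all method bodies are placeholder assumptions:

.. code-block:: python

    # Skeleton only: signatures follow the existing Magnum driver interface;
    # the "provides" tuple below is an assumption, not a settled choice.
    from magnum.drivers.common import driver


    class ClusterAPIDriver(driver.Driver):

        @property
        def provides(self):
            return [{"server_type": "vm", "os": "ubuntu", "coe": "kubernetes"}]

        def create_cluster(self, context, cluster, cluster_create_timeout):
            # Render Helm values from the template and cluster, then install
            # a release on the management cluster (see "Communicating with
            # K8s" below).
            raise NotImplementedError()

        def delete_cluster(self, context, cluster):
            # Uninstall the release; Cluster API tears down the OpenStack
            # resources via its reconciliation loop.
            raise NotImplementedError()

        def resize_cluster(self, context, cluster, resize_manager,
                           node_count, nodes_to_remove, nodegroup=None):
            raise NotImplementedError()

        def upgrade_cluster(self, context, cluster, cluster_template,
                            max_batch_size, nodegroup, scale_manager=None,
                            rollback=False):
            # Upgrade means pointing the release at the new template's
            # image_id (and hence kube_tag).
            raise NotImplementedError()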

Initial POC
-----------

An initial POC making this set of assumptions can be found here: https://review.opendev.org/c/openstack/magnum/+/851076

The POC established a few ground rules for cluster templates:

* image_id in the template is the key part of the template
* kube_tag, i.e. the Kubernetes version, is fixed for a given image_id; ideally we should validate that link
* the uuid is used as the name for templates and clusters in Kubernetes, since some Magnum names are not valid Kubernetes resource names; this also reduces the need to sanitise user input (see the sketch after this list)
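
As a trivial illustration of that naming rule (the helper name is hypothetical):

.. code-block:: python

    # Magnum allows display names such as "My Cluster!" that are not valid
    # Kubernetes resource names, whereas a lowercase uuid is always a valid
    # DNS-1123 label, so no sanitisation is needed.
    def k8s_name(magnum_object):
        return magnum_object.uuid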

For clusters we found:

* the uuid is used for the name in Kubernetes, as with templates
* the default node group uses: cluster.flavor_id, cluster.node_count
* other node groups map in a similar way, using their uuid as the name
* the control plane size can come from cluster.master_flavor_id (although there might be good reason to ignore this and have it defined only in the template)

For an initial version of the driver, we can leave everything else pretty much hardcoded. Eventually we could respect some tags added to both the template and the cluster, although these are unlikely to follow the same format as the existing drivers.

Note: all network specifications will be ignored in this first version, as the external network can be specified via the default Helm chart variables.

Communicating with K8s
----------------------

The current POC uses Jinja templates in the driver to render the required Kubernetes CRDs. While that is familiar if you are used to Ansible, most others will likely consider it strange.

We propose moving towards Helm, whereby a template maps to a specific Helm chart and release.

A cluster will take the image_id from the template, and the flavor_ids and node counts from the cluster, to create a set of Helm values to be used with the above chart (see the sketch below).

Operators can then customise a set of "standard" Helm charts as needed for their specific cloud.

For an example of what we might use as a starting point, please see: https://github.com/stackhpc/capi-helm-charts
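
As a concrete sketch of that mapping, the driver might build values like the following and hand them to Helm. The chart name and value keys are illustrative assumptions, loosely modelled on the charts linked above, not a settled interface:

.. code-block:: python

    import subprocess

    import yaml


    def install_cluster_release(cluster, template):
        # Value keys and the "openstack-cluster" chart name are illustrative.
        values = {
            "machineImageId": template.image_id,
            "kubernetesVersion": template.labels.get("kube_tag"),
            "controlPlane": {"machineFlavor": cluster.master_flavor_id},
            "nodeGroups": [
                {
                    "name": ng.uuid,  # uuid as the Kubernetes-safe name
                    "machineFlavor": ng.flavor_id,
                    "machineCount": ng.node_count,
                }
                for ng in cluster.nodegroups
            ],
        }
        # Release name is the Magnum cluster uuid; values piped via stdin.
        subprocess.run(
            ["helm", "upgrade", "--install", cluster.uuid,
             "openstack-cluster", "--values", "-"],
            input=yaml.safe_dump(values).encode(),
            check=True,
        )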

OpenStack creds
---------------

The current POC found it hard to use the existing cluster trusts with cluster-api-provider-openstack, largely due to limitations in gophercloud's parsing of clouds.yaml files. Trusts do work with the OpenStack cloud provider, so perhaps this is fixable upstream.

Instead, we considered creating per-cluster application credentials that are registered with each cluster on creation and can be deleted when the cluster is removed. To keep things simple, however, the current POC code assumes pre-created per-project credentials.

We need to check whether these credentials can be rotated when different users manage the cluster, without re-creating all of the cluster's nodes. In particular, when a user's role assignments change, all of their application credentials can be invalidated, causing problems similar to those seen when the user who created a Magnum cluster is deleted.

More work is needed to determine the best way forward; most likely the driver will create application credentials for each cluster and store them in Kubernetes.
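
One possible shape for this is sketched below, using openstacksdk to mint the credential and the kubernetes client to store it in a secret of the kind cluster-api-provider-openstack consumes. The names, namespace and exact secret layout are all assumptions:

.. code-block:: python

    # Sketch: per-cluster application credential stored on the management
    # cluster. "conn" is an authenticated openstack.connection.Connection,
    # and the kubernetes client is assumed to be configured already.
    import yaml
    from kubernetes import client


    def make_cluster_credential(conn, cluster, auth_url):
        app_cred = conn.identity.create_application_credential(
            user=conn.current_user_id,
            name="magnum-" + cluster.uuid,  # deleted again on cluster delete
        )
        clouds_yaml = yaml.safe_dump({
            "clouds": {
                "openstack": {
                    "auth_type": "v3applicationcredential",
                    "auth": {
                        "auth_url": auth_url,
                        "application_credential_id": app_cred.id,
                        "application_credential_secret": app_cred.secret,
                    },
                },
            },
        })
        client.CoreV1Api().create_namespaced_secret(
            namespace="magnum",  # hypothetical namespace
            body=client.V1Secret(
                metadata=client.V1ObjectMeta(
                    name=cluster.uuid + "-cloud-credentials"),
                string_data={"clouds.yaml": clouds_yaml},
            ),
        )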

Assignee(s)
-----------

Primary assignee:

* Matt Pryor (StackHPC)

With support from:

* John Garbutt (StackHPC)

Milestones
----------

* initial driver with create and delete
* full support for all the operations listed above
* CI functional tests passing
* conformance tests passing on the Helm chart repo

Work Items
----------

Complete all the above milestones.

Dependencies
============

None

Security Impact
===============

A new driver built upon Cluster API has the potential to improve security for Magnum, due to wider scrutiny of the open source implementation, a smaller code base for the Magnum team to maintain, and a larger community focussing on the security of Cluster API managed clusters.

References
==========


.. [1] https://docs.openstack.org/magnum/latest/user/#rolling-upgrade