WIP - Add collection to deploy magnum cluster-api with vexxhost driver

Change-Id: I121f5e97453354bb5c0227b296462805e269a7f5
Jonathan Rosser 2023-11-20 10:26:31 +00:00
parent 1c0bd2ae11
commit bb60e94e23
7 changed files with 687 additions and 0 deletions

mcapi_vexxhost/README.md Normal file

@@ -0,0 +1,3 @@
# Ansible Collection - osa_ops.mcapi_vexxhost
Documentation for the collection.


@@ -0,0 +1,326 @@
Install vexxhost magnum-cluster-api driver
##########################################
About this repository
---------------------
This repository includes playbooks and roles to deploy the Vexxhost
magnum-cluster-api driver for the OpenStack Magnum service.
The playbooks create a complete deployment, including the control plane
k8s cluster, and should result in a ready-to-go experience for operators.
The following architectural features are present:
* The control plane k8s cluster is an integral part of the openstack-ansible
  deployment, and forms part of the foundational components alongside mariadb
  and rabbitmq.
* The control plane k8s cluster is deployed on the infra hosts and integrated
  with the haproxy loadbalancer and OpenStack internal API endpoint, and is
  not exposed outside of the deployment.
* SSL is supported between all components, and different certificate
  authorities can be used on the internal and external loadbalancer endpoints.
* Control plane traffic can stay entirely within the management network
  if required.
* The magnum-cluster-api-proxy service is deployed to allow communication
  between the control plane and workload clusters when a floating IP is not
  attached to the workload cluster.
* It is possible to do a completely offline install for airgapped environments.
The magnum-cluster-api driver for magnum can be found at
https://github.com/vexxhost/magnum-cluster-api

Documentation for the Vexxhost magnum-cluster-api driver is at
https://vexxhost.github.io/magnum-cluster-api/

The ansible collection used to deploy the control plane k8s cluster is at
https://github.com/vexxhost/ansible-collection-kubernetes

The ansible collection used to deploy the container runtime for the control
plane k8s cluster is at https://github.com/vexxhost/ansible-collection-containers

**These playbooks require OpenStack-Ansible Antelope or later.**
A high-level overview of the Magnum infrastructure that these playbooks build
and operate against is shown below.
.. image:: mcapi-architecture.png
   :scale: 100 %
   :alt: OSA Magnum Cluster API Architecture
   :align: center
Pre-requisites
--------------
* An existing openstack-ansible deployment
* Control plane deployed in LXC containers (bare metal deployment is not tested)
* Core openstack services plus Octavia
OpenStack-Ansible Integration
-----------------------------
The playbooks are distributed as an ansible collection, and integrate with
OpenStack-Ansible. Add the collection to the deployment host by adding the
following to `/etc/openstack_deploy/user-collection-requirements.yml`
under the collections key.
.. code-block:: yaml

  collections:
    - name: vexxhost.kubernetes
      source: https://github.com/vexxhost/ansible-collection-kubernetes
      type: git
      version: main
    - name: osa_ops.mcapi_vexxhost
      type: git
      version: master
      source: https://opendev.org/openstack/openstack-ansible-ops#/mcapi_vexxhost
The collections can then be installed with the following command:
.. code-block:: bash

  cd /opt/openstack-ansible
  openstack-ansible scripts/get-ansible-collection-requirements.yml
OpenStack-Ansible configuration for magnum-cluster-api driver
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Specify the deployment of the control plane k8s cluster in
`/etc/openstack_deploy/env.d/k8s.yml`
.. code-block:: yaml

  ---
  component_skel:
    k8s_capi:
      belongs_to:
        - k8s_all

  container_skel:
    k8s_container:
      belongs_to:
        - cluster-api_containers
      contains:
        - k8s_capi

  physical_skel:
    cluster-api_containers:
      belongs_to:
        - all_containers
    cluster-api_hosts:
      belongs_to:
        - hosts
Define the physical hosts that will host the control plane k8s cluster.
This example is for an all-in-one deployment and should be adjusted to
match a real deployment with multiple hosts if high availability is
required.
.. code-block:: yaml

  cluster-api_hosts:
    aio1:
      ip: 172.29.236.100
Integrate the control plane k8s cluster with the haproxy loadbalancer
in `/etc/openstack_deploy/group_vars/k8s_all/haproxy_service.yml`
.. code-block:: yaml

  ---
  haproxy_k8s_service:
    haproxy_service_name: k8s
    haproxy_backend_nodes: "{{ groups['k8s_all'] | default([]) }}"
    haproxy_ssl: false
    haproxy_ssl_all_vips: false
    haproxy_port: 6443
    haproxy_balance_type: tcp
    haproxy_balance_alg: leastconn
    haproxy_interval: '15000'
    haproxy_backend_port: 16443
    haproxy_backend_rise: 2
    haproxy_backend_fall: 2
    haproxy_timeout_server: '15m'
    haproxy_timeout_client: '5m'
    haproxy_backend_options:
      - tcplog
      - ssl-hello-chk
      - log-health-checks
      - httpchk GET /healthz
    haproxy_backend_httpcheck_options:
      - 'expect status 200'
    haproxy_backend_server_options:
      - check-ssl
      - verify none
    haproxy_accept_both_protocols: "{{ k8s_accept_both_protocols | default(openstack_service_accept_both_protocols) }}"
    haproxy_service_enabled: "{{ groups['k8s_all'] is defined and groups['k8s_all'] | length > 0 }}"

  k8s_haproxy_services:
    - "{{ haproxy_k8s_service | combine(haproxy_k8s_service_overrides | default({})) }}"
Configure the LXC container that will host the control plane k8s cluster to
be suitable for running nested containers in `/etc/openstack_deploy/group_vars/k8s_all/main.yml`
.. code-block:: yaml

  ---
  lxc_container_config_list:
    - "lxc.apparmor.profile=unconfined"

  lxc_container_mount_auto:
    - "proc:rw"
    - "sys:rw"
Set up config-overrides for the magnum service in `/etc/openstack_deploy/user_variables_magnum.yml`.
Adjust the images and flavors here as necessary; these are just for demonstration. Upload as many
images as you need for the different workload cluster kubernetes versions.
.. code-block:: yaml

  # list the images to upload to glance here, or set to an empty list
  # to handle image uploading by some other means
  magnum_glance_images:
    - disk_format: qcow2
      distro: ubuntu
      file: https://object-storage.public.mtl1.vexxhost.net/swift/v1/a91f106f55e64246babde7402c21b87a/magnum-capi/ubuntu-2204-kube-v1.23.17.qcow2
      image_format: bare
      name: ubuntu-2204-kube-v1.23.17
      public: true

  # the cluster templates cannot be created during the magnum installation
  # as the control plane k8s credentials must be in place first
  magnum_cluster_templates: []

  # any flavors specified in the cluster template must already exist
  # the magnum playbook can create flavors, or set to an empty list
  # to handle flavor creation by some other means
  magnum_flavors:
    - cloud: default
      disk: 40
      name: m1.medium
      ram: 4096
      vcpus: 2
Set up config-overrides for the control plane k8s cluster in `/etc/openstack_deploy/user_variables_k8s.yml`.
Attention must be given to the SSL configuration. Users and workload clusters will
interact with the external endpoint and must trust the SSL certificate. The magnum
service and cluster-api can be configured to interact with either the external or
internal endpoint and must trust the SSL certificate. Depending on the environment,
these may be derived from different certificate authorities.
.. code-block:: yaml

  # connect ansible group, host and network addresses into control plane k8s deployment
  kubernetes_control_plane_group: k8s_all
  kubelet_hostname: "{{ ansible_facts['hostname'] }}"
  kubelet_node_ip: "{{ management_address }}"
  kubernetes_hostname: "{{ internal_lb_vip_address }}"
  kubernetes_non_init_namespace: true

  # install the vexxhost magnum-cluster-api plugin into the magnum venv
  magnum_user_pip_packages:
    - git+https://github.com/vexxhost/magnum-cluster-api@main#egg=magnum-cluster-api

  # make the required settings in magnum.conf
  magnum_config_overrides:
    drivers:
      # ensure that the external VIP CA is trusted by the workload cluster
      openstack_ca_file: '/usr/local/share/ca-certificates/ExampleCorpRoot.crt'
    capi_client:
      # ensure that the internal VIP CA is trusted by the CAPI driver
      ca_file: '/usr/local/share/ca-certificates/ExampleCorpRoot.crt'
      endpoint: 'internalURL'
    cluster_template:
      # the only permitted workload network driver is calico
      kubernetes_allowed_network_drivers: 'calico'
      kubernetes_default_network_driver: 'calico'
    certificates:
      # store certificates in the magnum database instead of barbican
      cert_manager_type: x509keypair

  # Pick a range of addresses for the control plane k8s cluster cilium
  # network that do not collide with anything else in the deployment
  cilium_ipv4_cidr: 172.29.200.0/22

  # Set this manually, or kube-proxy will try to do this - not possible
  # in a non-init namespace and will fail in LXC
  openstack_host_nf_conntrack_max: 1572864

  # OSA containers do not run ssh so cannot use the ansible synchronize module
  upload_helm_chart_method: copy
TODO: docker-image-py is needed in /opt/openstack-ansible/requirements.txt
Run the deployment
------------------
For a new deployment
^^^^^^^^^^^^^^^^^^^^
Run the OSA setup playbooks (setup-hosts.yml, setup-infrastructure.yml and
setup-openstack.yml) as usual, following the normal deployment guide.
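A sketch of the usual sequence, assuming the standard playbook layout; refer
to the OpenStack-Ansible deployment guide for the authoritative steps:

.. code-block:: bash

  cd /opt/openstack-ansible/playbooks
  openstack-ansible setup-hosts.yml
  openstack-ansible setup-infrastructure.yml
  openstack-ansible setup-openstack.yml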
Run the magnum-cluster-api deployment
.. code-block:: bash

  openstack-ansible osa_ops.mcapi_vexxhost.k8s_install
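The control plane cluster can then be checked using the kubeadm admin
kubeconfig from one of the k8s hosts; this assumes kubectl is available
there, which the vexxhost kubernetes role normally installs:

.. code-block:: bash

  # run on one of the k8s_all hosts/containers
  kubectl --kubeconfig /etc/kubernetes/admin.conf get nodes
  kubectl --kubeconfig /etc/kubernetes/admin.conf get pods --all-namespaces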
For an existing deployment
^^^^^^^^^^^^^^^^^^^^^^^^^^
Create the k8s control plane containers

.. code-block:: bash

  openstack-ansible playbooks/lxc-containers-create.yml --limit k8s_all
Run the magnum-cluster-api deployment
.. code-block:: bash

  openstack-ansible osa_ops.mcapi_vexxhost.k8s_install
Use Magnum to create a workload cluster
---------------------------------------
Magnum cluster-api should now be ready to use. The remaining steps are carried
out against the OpenStack APIs:

* Upload one or more workload cluster images to glance
* Create a cluster template referencing an uploaded image
* Create a workload cluster from the template
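A minimal sketch using the example image and flavor from the configuration
above; the template name, label values and network names are illustrative and
must be adapted to the deployment:

.. code-block:: bash

  # create a cluster template for the magnum-cluster-api driver
  openstack coe cluster template create k8s-v1.23.17 \
      --coe kubernetes \
      --image ubuntu-2204-kube-v1.23.17 \
      --external-network public \
      --master-flavor m1.medium \
      --flavor m1.medium \
      --network-driver calico \
      --label kube_tag=v1.23.17

  # create a workload cluster from the template
  openstack coe cluster create test-cluster \
      --cluster-template k8s-v1.23.17 \
      --master-count 1 \
      --node-count 2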
Optional Components
-------------------
Deploy the workload clusters with a local registry
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TODO - describe how to do this
Deploy the control plane cluster from a local registry
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TODO - describe how to do this
Use of magnum-cluster-api-proxy
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TODO - describe what this is for
Troubleshooting
---------------
Local testing
-------------
An OpenStack-Ansible all-in-one configured with Magnum and Octavia is
capable of running a functioning magnum-cluster-api deployment.
Sufficient memory should be available beyond the minimum 8G usually required
for an all-in-one. A multinode workload cluster may require nova to boot several
Ubuntu images in addition to an Octavia loadbalancer instance. 64G would
be an appropriate amount of system RAM.
There must also be sufficient disk space in `/var/lib/nova/instances` to
support the required number of instances - the normal minimum of 60G
required for an all-in-one deployment will be insufficient; 500G would
be plenty.
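A sketch of bootstrapping such an all-in-one; the exact scenario string is an
assumption and should be checked against the bootstrap scripts of the OSA
release in use:

.. code-block:: bash

  git clone https://opendev.org/openstack/openstack-ansible /opt/openstack-ansible
  cd /opt/openstack-ansible
  export SCENARIO='aio_lxc_magnum_octavia'
  scripts/bootstrap-ansible.sh
  scripts/bootstrap-aio.sh

After bootstrapping, run the OSA setup playbooks and the
osa_ops.mcapi_vexxhost.k8s_install playbook as described above.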

Binary image file not shown (208 KiB).

mcapi_vexxhost/galaxy.yml Normal file

@@ -0,0 +1,62 @@
### REQUIRED
# The namespace of the collection. This can be a company/brand/organization or product namespace under which all
# content lives. May only contain alphanumeric lowercase characters and underscores. Namespaces cannot start with
# underscores or numbers and cannot contain consecutive underscores
namespace: osa_ops
# The name of the collection. Has the same character restrictions as 'namespace'
name: mcapi_vexxhost
# The version of the collection. Must be compatible with semantic versioning
version: 1.0.0
# The path to the Markdown (.md) readme file. This path is relative to the root of the collection
readme: README.md
# A list of the collection's content authors. Can be just the name or in the format 'Full Name <email> (url)
# @nicks:irc/im.site#channel'
authors:
- your name <example@domain.com>
### OPTIONAL but strongly recommended
# A short summary description of the collection
description: your collection description
# Either a single license or a list of licenses for content inside of a collection. Ansible Galaxy currently only
# accepts L(SPDX,https://spdx.org/licenses/) licenses. This key is mutually exclusive with 'license_file'
license:
- GPL-2.0-or-later
# The path to the license file for the collection. This path is relative to the root of the collection. This key is
# mutually exclusive with 'license'
license_file: ''
# A list of tags you want to associate with the collection for indexing/searching. A tag name has the same character
# requirements as 'namespace' and 'name'
tags: []
# Collections that this collection requires to be installed for it to be usable. The key of the dict is the
# collection label 'namespace.name'. The value is a version range
# L(specifiers,https://python-semanticversion.readthedocs.io/en/latest/#requirement-specification). Multiple version
# range specifiers can be set and are separated by ','
dependencies: {}
# The URL of the originating SCM repository
repository: http://example.com/repository
# The URL to any online docs
documentation: http://docs.example.com
# The URL to the homepage of the collection/project
homepage: http://example.com
# The URL to the collection issue tracker
issues: http://example.com/issue/tracker
# A list of file glob-like patterns used to filter any files or directories that should not be included in the build
# artifact. A pattern is matched from the relative path of the file or directory of the collection directory. This
# uses 'fnmatch' to match the files or directories. Some directories and files like 'galaxy.yml', '*.pyc', '*.retry',
# and '.git' are always filtered
build_ignore: []


@@ -0,0 +1,189 @@
---
# Copyright 2023, BBC R&D
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
- name: Gather k8s facts
  hosts: k8s_all
  gather_facts: false
  tags:
    - always
  tasks:
    - name: Gather minimal facts for k8s
      setup:
        gather_subset:
          - "!all"
          - min
      when: osa_gather_facts | default(True)

- name: Create and configure k8s container
  hosts: k8s_all
  serial: "{{ k8s_serial | default('20%') }}"
  gather_facts: true
  user: root
  pre_tasks:
    - import_role:
        name: openstack.osa.lxc_container_setup
      vars:
        list_of_bind_mounts:
          - bind_dir_path: '/usr/lib/modules'
            mount_path: '/usr/lib/modules'
          - bind_dir_path: '/usr/src'
            mount_path: '/usr/src'
          - bind_dir_path: '/dev/kmsg'
            mount_path: '/dev/kmsg'
            create: file
        extra_container_config:
          - 'security.privileged=true'
          - 'security.nested=true'
          - 'raw.lxc="lxc.apparmor.profile=unconfined"'
          - 'lxc.cap.drop='
          - 'lxc.cgroup.devices.allow=a'
          - 'lxc.cgroup2.devices.allow=a'
      when: not is_metal

    - include_tasks: common-tasks/unbound-clients.yml
      when:
        - hostvars['localhost']['resolvconf_enabled'] | bool

    - name: ensure kernel headers are installed on host
      package:
        name: "linux-headers-{{ ansible_facts['kernel'] }}"
        state: present
      delegate_to: "{{ physical_host }}"
      when: not is_metal
  roles:
    - role: "openstack.osa.system_crontab_coordination"
    - role: "systemd_service"
      systemd_services:
        - service_name: bpf-mount
          execstarts: /usr/bin/bash -c '/usr/bin/mount bpffs -t bpf /sys/fs/bpf && /usr/bin/mount --make-shared /sys/fs/bpf'
        - service_name: cilium-cgroup2-mount
          execstarts: /usr/bin/bash -c 'mkdir -p /run/cilium/cgroupv2 && /usr/bin/mount -t cgroup2 none /run/cilium/cgroupv2 && /usr/bin/mount --make-shared /run/cilium/cgroupv2'
  # environment: "{{ deployment_environment_variables | default({}) }}"
  tags:
    - k8s-container
    - k8s
- name: Configure haproxy services
  import_playbook: openstack.osa.haproxy_service_config.yml
  vars:
    service_group: k8s_all
    service_variable: "k8s_haproxy_services"
  when: groups[service_group] | length > 0
  tags:
    - haproxy-service-config

- name: Install kubernetes
  hosts: k8s_all
  gather_facts: true
  serial: "{{ k8s_serial | default('20%') }}"
  user: root
  roles:
    - role: "vexxhost.containers.containerd"
    - role: "vexxhost.kubernetes.kubernetes"
    - role: "vexxhost.kubernetes.helm"
    - role: "vexxhost.kubernetes.cilium"
  environment: "{{ deployment_environment_variables | default({}) }}"
  tags:
    - k8s
    - k8s-install

- name: Install cluster_api
  hosts: k8s_all
  gather_facts: true
  user: root
  roles:
    - role: "vexxhost.kubernetes.cert_manager"
    - role: "vexxhost.kubernetes.cluster_api"
  # environment: "{{ deployment_environment_variables | default({}) }}"
  tags:
    - cluster-api
- name: Set up helm and k8s credentials in magnum hosts
  hosts: magnum_all
  gather_facts: true
  user: root
  vars:
    k8s_admin_conf_src: "/etc/kubernetes/admin.conf"
    k8s_admin_conf_dest: "/var/lib/magnum/.kube/config"
  tasks:
    - name: Collect admin config from k8s cluster
      slurp:
        src: "{{ k8s_admin_conf_src }}"
      register: k8s_admin_conf_slurp
      delegate_to: "{{ groups['k8s_all'][0] }}"
      run_once: true

    - name: Ensure target directory exists
      file:
        state: directory
        path: "{{ k8s_admin_conf_dest | dirname }}"
        owner: magnum
        group: magnum

    - name: Write k8s admin config to magnum home dir
      copy:
        content: "{{ k8s_admin_conf_slurp.content | b64decode }}"
        dest: "{{ k8s_admin_conf_dest }}"
        owner: magnum
        group: magnum
        mode: '0600'

    - name: Install helm
      include_role:
        name: "vexxhost.kubernetes.helm"
  # environment: "{{ deployment_environment_variables | default({}) }}"
  tags:
    - magnum_k8s_conf

# deploy the proxy service to communicate directly between magnum coe
# clusters and the capi control plane without going via a public floating
# IP
- name: Install magnum-cluster-api-proxy
  hosts: network_hosts
  vars:
    _venv_tag: "{{ venv_tag | default('untagged') }}"
    _bin: "/openstack/venvs/magnum-cluster-api-proxy-{{ _venv_tag }}/bin"
    magnum_cluster_api_proxy_system_group_name: 'capi_proxy'
    magnum_cluster_api_proxy_system_user_name: 'capi_proxy'
    magnum_cluster_api_proxy_system_user_comment: 'Magnum Cluster API Proxy System User'
    magnum_cluster_api_proxy_system_user_home: '/var/lib/{{ magnum_cluster_api_proxy_system_user_name }}'
    magnum_cluster_api_proxy_system_user_shell: '/bin/false'
    magnum_cluster_api_proxy_etc_directory: '/etc/capi_proxy'
    k8s_admin_conf_src: "/etc/kubernetes/admin.conf"
    k8s_admin_conf_dest: "{{ magnum_cluster_api_proxy_system_user_home }}/.kube/config"
  environment: "{{ deployment_environment_variables | default({}) }}"
  roles:
    - openstack.osa.source_install_vars
  tasks:
    - setup:
        gather_subset:
          - "!all"
          - min
      when: osa_gather_facts | default(True)
      tags:
        - always

    - name: Install proxy service
      include_role:
        name: osa_ops.mcapi_vexxhost.proxy
      tags:
        - magnum-cluster-api-proxy


@@ -0,0 +1,101 @@
---
# Copyright 2023, BBC R&D
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# create virtualenv
- name: Install the python venv
  import_role:
    name: "python_venv_build"
  vars:
    _upper_constraints_url: "{{ requirements_git_url | default('https://releases.openstack.org/constraints/upper/' ~ requirements_git_install_branch | default('master')) }}"
    _git_constraints:
      - "--constraint {{ _upper_constraints_url }}"
    venv_python_executable: "{{ openstack_venv_python_executable | default('python3') }}"
    venv_build_constraints: "{{ _git_constraints }}"
    venv_install_destination_path: "{{ _bin | dirname }}"
    # venv_install_distro_package_list:
    #   - haproxy # this will be there for free on the host in an AIO
    venv_pip_install_args: "{{ pip_install_options | default('') }}"
    venv_pip_packages:
      - git+https://github.com/vexxhost/magnum-cluster-api@main#egg=magnum-cluster-api

# create user and group
- name: Create the magnum_cluster_api_proxy system group
  group:
    name: "{{ magnum_cluster_api_proxy_system_group_name }}"
    state: "present"
    system: "yes"

- name: Create the magnum_cluster_api_proxy system user
  user:
    name: "{{ magnum_cluster_api_proxy_system_user_name }}"
    group: "{{ magnum_cluster_api_proxy_system_group_name }}"
    comment: "{{ magnum_cluster_api_proxy_system_user_comment }}"
    shell: "{{ magnum_cluster_api_proxy_system_user_shell }}"
    system: "yes"
    createhome: "yes"
    home: "{{ magnum_cluster_api_proxy_system_user_home }}"

- name: Create magnum_cluster_api_proxy directories
  file:
    path: "{{ item.path }}"
    state: "directory"
    owner: "{{ item.owner | default(magnum_cluster_api_proxy_system_user_name) }}"
    group: "{{ item.group | default(magnum_cluster_api_proxy_system_group_name) }}"
    mode: "{{ item.mode | default('0750') }}"
  with_items:
    - path: "{{ magnum_cluster_api_proxy_etc_directory }}"
    - path: "{{ magnum_cluster_api_proxy_system_user_home }}"
    - path: "{{ magnum_cluster_api_proxy_system_user_home }}/.kube"

- name: Collect admin config from k8s cluster
  slurp:
    src: "{{ k8s_admin_conf_src }}"
  register: k8s_admin_conf_slurp
  delegate_to: "{{ groups['k8s_all'][0] }}"
  run_once: true

- name: Write k8s admin config to capi_proxy home dir
  copy:
    content: "{{ k8s_admin_conf_slurp.content | b64decode }}"
    dest: "{{ k8s_admin_conf_dest }}"
    owner: "{{ magnum_cluster_api_proxy_system_user_name }}"
    group: "{{ magnum_cluster_api_proxy_system_group_name }}"
    mode: '0600'

- name: Write capi_proxy sudoers config
  template:
    src: capi_sudoers.j2
    dest: /etc/sudoers.d/capi_proxy_sudoers

# create service
- name: Run the systemd service role
  import_role:
    name: systemd_service
  vars:
    systemd_user_name: "{{ magnum_cluster_api_proxy_system_user_name }}"
    systemd_group_name: "{{ magnum_cluster_api_proxy_system_group_name }}"
    systemd_service_restart_changed: true
    systemd_tempd_prefix: openstack
    systemd_slice_name: magnum-cluster-api-proxy
    systemd_lock_path: /var/lock/magnum-cluster-api-proxy
    systemd_service_cpu_accounting: true
    systemd_service_block_io_accounting: true
    systemd_service_memory_accounting: true
    systemd_service_tasks_accounting: true
    systemd_services:
      - service_name: magnum-cluster-api-proxy
        execstarts:
          - "{{ _bin ~ '/magnum-cluster-api-proxy' }}"
        start_order: 1


@@ -0,0 +1,6 @@
# {{ ansible_managed }}
Defaults:{{ magnum_cluster_api_proxy_system_user_name }} !requiretty
Defaults:{{ magnum_cluster_api_proxy_system_user_name }} secure_path="{{ _bin }}:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
{{ magnum_cluster_api_proxy_system_user_name }} ALL = (root) NOPASSWD: {{ _bin }}/privsep-helper