cloud-provider-openstack has changed its image repository [1]. To use
the plugins matching later versions of k8s, the repository reference
needs to be updated. Also update the tags for the CI test to match
the version being tested.
[1] https://github.com/kubernetes/cloud-provider-openstack/pull/2169
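For reference, a container image line of the kind this change updates;
the registry path follows [1], while the component and tag here are
illustrative only:

    # assumed new registry path per [1]; the tag is an example
    image: registry.k8s.io/provider-os/openstack-cloud-controller-manager:v1.27.1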
Change-Id: I9390db5e1aa357c17a39a7c208d837befafd3820
PodSecurityPolicy has been removed in Kubernetes v1.25 [1]. To allow
Magnum to support Kubernetes v1.25 and above, the PodSecurityPolicy
Admission Controller has been removed.
[1] https://kubernetes.io/docs/concepts/security/pod-security-policy/
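Clusters that still need policy enforcement can use the built-in Pod
Security admission via namespace labels; a minimal sketch (the
namespace name is hypothetical):

    apiVersion: v1
    kind: Namespace
    metadata:
      name: example-workloads  # hypothetical namespace
      labels:
        # Pod Security admission is the in-tree replacement for PSP
        pod-security.kubernetes.io/enforce: restricted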
Change-Id: I0fb0c372b484275b0677114193289469ee788b84
Set resource requests for system pods to guarantee at least some
amount of resources. This prevents them from being starved of
CPU/memory when running alongside resource-intensive workloads in the
cluster and gives them a higher quality of service class (see the
example stanza after the list below).
metrics-server:
100m/200Mi recommended for up to 100 node clusters.
https://github.com/kubernetes-sigs/metrics-server#scaling
openstack-cloud-controller-manager:
200m CPU taken from example manifests.
kubernetes-dashboard:
100m/100Mi taken from helm chart defaults.
heapster:
100m/128Mi taken from helm chart defaults.
influxdb:
100m/256Mi taken from influx helm chart defaults.
grafana (for influxdb):
100m/200Mi same as monitoring grafana.
ingress-traefik:
100m/50Mi taken from helm chart defaults.
cluster-autoscaler:
100m/300Mi taken from helm chart defaults.
csi-cinder-nodeplugin:
25m CPU on both containers to ensure
Burstable QoS class.
csi-cinder-controllerplugin:
20m CPU on all containers to ensure
Burstable QoS class.
tiller-deploy:
25m CPU to ensure it can always handle
the readiness probe.
octavia-ingress-controller:
50m CPU, a rough estimate.
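As an example of the stanza these values translate to, a sketch of the
metrics-server container fragment (requests only, as this change does
not set limits):

    # illustrative container spec fragment for metrics-server
    resources:
      requests:
        cpu: 100m
        memory: 200Mi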
Story: 2008825
Task: 42290
Change-Id: Ifcd764c00d7046744ba63609078cc6c5d02fdc1c
* in 1.20 the insecure port 8080 is not supported anymore
** use only the secure port 6443
** change all health probes to use kubectl and port 6443
* configure the signing key in the API server
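A sketch of the kind of kubectl-based check this implies; the probe
form and kubeconfig path are assumptions, not the exact template
change:

    # hypothetical exec-style health check against the secure port;
    # admin.conf is assumed to point at https://127.0.0.1:6443
    livenessProbe:
      exec:
        command:
        - kubectl
        - --kubeconfig=/etc/kubernetes/admin.conf
        - get
        - --raw=/healthz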
story: 2008524
task: 41731
Change-Id: Ibaf1840214016d2dd6ac15e2137eb3cd3d767889
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
Without this, heat container agents using kubectl version
1.18.x (e.g. ussuri-dev) fail because they do not have the correct
KUBECONFIG in the environment.
Task: 39938
Story: 2007591
Change-Id: Ifc212478ae09c658adeb6ba4c8e8afc8943e3977
Improve the taint of the master node kubelet so that the conformance
tests pass, and update the OCCM and Helm/Tiller tolerations
accordingly.
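For illustration, the tolerations updated here are of this shape,
assuming the conventional master taint key:

    # sketch: tolerate the (assumed) master NoSchedule taint
    tolerations:
    - key: node-role.kubernetes.io/master
      effect: NoSchedule
      operator: Exists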
Task: 39223
Story: 2007256
Change-Id: Ief452e05ddf13a1d1ee77641311c3ae7abbe90f2
A new config option `post_install_manifest_url` is added to support
installing cloud provider/vendor specific manifests after the k8s
cluster is booted. It is a URL pointing to a manifest file. For
example, the cloud admin can put their specific storage class into
this file, and it will then be applied automatically after the
cluster is created.
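For example, the manifest behind such a URL could be a default
StorageClass like this sketch (name and provisioner are illustrative):

    # hypothetical manifest served from post_install_manifest_url
    apiVersion: storage.k8s.io/v1
    kind: StorageClass
    metadata:
      name: cinder-default  # illustrative name
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: kubernetes.io/cinder  # in-tree Cinder provisioner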
Task: 35798
Story: 2006209
Change-Id: Ib5a2c5cd7970085db941f189613e175f622aea3f
At present, the openstack cloud controller manager tag v1.17.0 is
broken due to a missing RBAC policy for leases. This patch addresses
this shortcoming, thereby allowing the nodes in the cluster to be
untainted.
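The missing policy is of this shape; a sketch (the ClusterRole name
and verb list are assumptions):

    # sketch: RBAC fragment granting lease access for leader election
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:openstack-cloud-controller-manager
    rules:
    - apiGroups: ["coordination.k8s.io"]
      resources: ["leases"]
      verbs: ["get", "create", "update"]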
story: 2007031
task: 37838
Change-Id: Ide46d90dd30b41edaeaa8632205cc23b9ba7f162
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>
There are some small regression issues introduced by the podman
support patch. Another issue is that since k8s v1.16, DaemonSets have
moved from extensions to apps/v1 [1], so we need to update the
system:node-drainer ClusterRole so that kubectl can be called on the
worker node to trigger the drain process. Both issues are fixed in
this patch.
[1] https://kubernetes.io/docs/setup/release/notes/#deprecations-and-removals
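A sketch of the rule shape added to the system:node-drainer
ClusterRole (the verbs are assumptions):

    # sketch: allow reading daemonsets from apps/v1 during drain
    - apiGroups: ["apps"]
      resources: ["daemonsets"]
      verbs: ["get", "list"]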
Task: 37642
Story: 2005201
Change-Id: I87ed49fd1e9cd513ae54f6758717379adafae3a4
At the moment, cluster deployment fails when cluster_user_trust=False.
This is because the entire SoftwareDeployment exits rather than a
single script fragment. This patch fixes this by making the remainder
of the script conditional on whether TRUST_ID is defined.
Finally, default `cloud_provider_enabled` to false when
`cluster_user_trust` is false, and raise an error when
`cloud_provider_enabled` is overridden to true while
`cluster_user_trust` is false. This ensures that the minion kubelet
is correctly configured.
Change-Id: Ibd9270c87bfa5d2f490e2e226e33ca56696d9e81
Story: 2006531
Task: 36587
To move to 1.15.x and beyond we need a PSP for privileged pods:
flannel, calico and node-problem-detector need it.
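A minimal sketch of a privileged PSP of the kind these pods need,
following the upstream privileged example:

    apiVersion: policy/v1beta1
    kind: PodSecurityPolicy
    metadata:
      name: privileged
    spec:
      privileged: true
      allowPrivilegeEscalation: true
      allowedCapabilities: ['*']
      volumes: ['*']
      hostNetwork: true
      hostPID: true
      hostIPC: true
      hostPorts:
      - min: 0
        max: 65535
      runAsUser:
        rule: RunAsAny
      seLinux:
        rule: RunAsAny
      supplementalGroups:
        rule: RunAsAny
      fsGroup:
        rule: RunAsAny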
PSP
story: 2006515
task: 36513
Allow-priv
story: 2006252
task: 35867
Change-Id: I306a249afb275fdbd71354ed75043ffc4d466304
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
When there is more than one NIC attached to an instance, the
openstack cloud provider returns a random InternalIP back to the
host, resulting in instability with the API server, which only talks
to a default interface.
This patch incorporates the changes made in
https://github.com/kubernetes/cloud-provider-openstack/pull/444 which
enable the OpenStack Cloud Controller Manager (OCCM) to respect the
`internal-network-name` in the cloud-config file, ensuring that the
InternalIP remains stable.
Uses a separate cloud-config file for OCCM to keep in-tree Cinder
volumes compatible.
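For illustration, the separate OCCM cloud-config could carry the
option like this; how the file is shipped, and all names and values,
are assumptions:

    # hypothetical secret wrapping the OCCM-only cloud-config
    apiVersion: v1
    kind: Secret
    metadata:
      name: cloud-config-occm
      namespace: kube-system
    stringData:
      cloud-config: |
        [Networking]
        internal-network-name=private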
Change-Id: Idfa52ed2d512e7dc383a556371e896205dd542f9
Story: 2005333
Task: 30271
Rolling upgrade is an important feature for a managed k8s service;
at this stage, two use cases will be covered:
1. Upgrade the base operating system
2. Upgrade the k8s version
Known limitation: when doing an operating system upgrade, there is no
chance to call kubectl drain to evict pods on that node.
Task: 30185
Story: 2002210
Change-Id: Ibbed59bc135969174a20e5243ff8464908801a23
Openstack-cloud-controller-manager restarts several times during
cluster creation.
This happens because cloud-controller-manager starts running before
the secrets it needs exist in kubernetes. Cloud-controller-manager
lists secrets; if the secret exists it uses it and moves on, but if
the secret doesn't exist it starts a watch until it does. As this
watch is not allowed, the pod fails.
This is triggered by Issue
https://github.com/kubernetes/cloud-provider-openstack/issues/545
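The watch the controller starts only succeeds with an RBAC rule of
this shape (a sketch; its placement in a role is assumed):

    # sketch: verbs for the controller's list-then-watch on secrets;
    # without "watch" the pod fails as described above
    - apiGroups: [""]
      resources: ["secrets"]
      verbs: ["get", "list", "watch"]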
Story: 2005270
Change-Id: If8f34dc45b3b8a76e3d561ed41b4d0a783ceecb5
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Add a new secret in kube-system holding the trustee information. This is
useful for any service running within kubernetes needing to contact
OpenStack services.
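A sketch of such a secret; the name, keys and placeholder values are
illustrative only:

    # hypothetical trustee secret consumed by in-cluster services
    apiVersion: v1
    kind: Secret
    metadata:
      name: os-trustee
      namespace: kube-system
    type: Opaque
    stringData:
      trustee-user-id: "<trustee user id>"
      trustee-password: "<trustee password>"
      auth-url: "<keystone auth url>"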
Change-Id: I1939fb6a33c9eb6a45697d070f58c9510be774b3
The scripts in the kube_cluster_config SoftwareConfig resource are
combined into one script inside the VM, so any 'exit' statement stops
execution of the scripts that follow.
Change-Id: I25965c663e6e1ca5a59d0f28098174810bd76df1
* Use the external cloud-provider [0]
* Label master nodes
* Make the script that deploys the cloud-provider and clusterroles
for the apiserver a SoftwareDeployment
* Rename kube_openstack_config to cloud-config; for cinder to work,
the kubelet expects exactly this cloud config name. Keep a copy of
kube_openstack_config for backwards compatibility.
Change-Id: Ife5558f1db4e581b64cc4a8ffead151f7b405702
Task: 22361
Story: 2002652
Co-Authored-By: Spyros Trigazis <spyridon.trigazis@cern.ch>
- Start workers as soon as the master VM is created, rather than
waiting for all the services to be ready.
- Move all the SoftwareDeployments outside of the kubemaster stack.
- Tweak the scripts in SoftwareDeployment so that they can be
combined into a single script.
Story: 2004573
Task: 28347
Change-Id: Ie48861253615c8f60b34a2c1e9ad6b91d3ae685e
Co-Authored-By: Lingxian Kong <anlin.kong@gmail.com>
Currently, Magnum is using the k8s API /version endpoint to check API
availability, which is not a good way because /version only reflects
whether the basic k8s api is working or not, and it returns a
response even when the etcd service is down. This patch fixes it by
using /healthz instead of /version.
Task: 22566
Story: 1775759
Change-Id: I45a1bd48a22842a251dafa6c349f0022fd319e3f
Create the admin cluster role for k8s_fedora_atomic; it is defined in
the configuration but it wasn't applied.
story: 1766284
task: 22208
Change-Id: I112fe2ddb1d5400fcbc73bbdbc8d483d5a92d120
When creating a multi-master cluster, all master nodes will attempt
to create kubernetes resources in the cluster at the same time, like
coredns, the dashboard, calico etc. This race condition shouldn't be
a problem when doing declarative calls instead of imperative ones
(kubectl apply instead of create). However, due to [1], kubectl fails
to apply the changes, the deployment scripts fail, and this causes
cluster creation to fail in the case of Heat SoftwareDeployments.
This patch passes the ResourceGroup index of every master so that
resource creation will be attempted only from the first master node.
[1] https://github.com/kubernetes/kubernetes/issues/44165
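A sketch of how the index can be passed down using Heat's %index%
substitution in a ResourceGroup (property and file names are
assumptions):

    # hypothetical Heat fragment: hand each master its group index so
    # only index 0 applies the shared kubernetes resources
    kube_masters:
      type: OS::Heat::ResourceGroup
      properties:
        count: {get_param: number_of_masters}
        resource_def:
          type: kubemaster.yaml
          properties:
            master_index: '%index%'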
Task: 21673
Story: 1775759
Change-Id: I83f78022481aeef945334c37ac6c812bba9791fd
Add an admin service account and give it the admin cluster role. It
can be used to access apps with token authentication, like the
kubernetes-dashboard.
Remove the cluster role from the dashboard service account.
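A minimal sketch of the pair this describes (names are assumptions):

    # hypothetical admin service account bound to cluster-admin
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: admin
      namespace: kube-system
    ---
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: admin
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: cluster-admin
    subjects:
    - kind: ServiceAccount
      name: admin
      namespace: kube-system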
Change-Id: I7980c0e72b0d71921e42af7338d02b8a1e563c34
Closes-Bug: #1766284
Due to several small connected patches for the fedora atomic driver,
this patch includes 4 smaller patches.
Patch 1:
k8s: Do not start kubelet and kube-proxy on master
Patch [1] misses the removal of kubelet and kube-proxy from
enable-services-master.sh and therefore they are started if they
exist in the image or the script will fail.
[1] https://review.openstack.org/#/c/533593/
Closes-Bug: #1726482
Patch 2:
k8s: Set require-kubeconfig when needed
From kubernetes 1.8 [1] --require-kubeconfig is deprecated and
in kubernetes 1.9 it is removed.
Add --require-kubeconfig only for k8s <= 1.8.
[1] https://github.com/kubernetes/kubernetes/issues/36745
https://review.openstack.org/#/c/534309/
Closes-Bug: #1718926
Patch 3:
k8s_fedora: Add RBAC configuration
* Make certificates and kubeconfigs compatible
with NodeAuthorizer [1].
* Add CoreDNS roles and rolebindings.
* Create the system:kube-apiserver-to-kubelet ClusterRole.
* Bind the system:kube-apiserver-to-kubelet ClusterRole to
the kubernetes user.
* Remove creation of the kube-system namespace, it is created
by default
* Update client cert generation in the conductor to match
kubernetes' requirements
* Add --insecure-bind-address=127.0.0.1 to work on
multi-master too. The controller manager on each
node needs to contact the apiserver (on the same node)
on 127.0.0.1:8080
[1] https://kubernetes.io/docs/admin/authorization/node/
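For reference, the ClusterRole is conventionally of this shape (a
sketch following the common upstream example):

    # sketch: allow the apiserver to reach kubelet APIs; bound to the
    # kubernetes user per the bullet above
    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: system:kube-apiserver-to-kubelet
    rules:
    - apiGroups: [""]
      resources:
      - nodes/proxy
      - nodes/stats
      - nodes/log
      - nodes/spec
      - nodes/metrics
      verbs: ["*"]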
Closes-Bug: #1742420
Depends-On: If43c3d0a0d83c42ff1fceffe4bcc333b31dbdaab
https://review.openstack.org/#/c/527103/
Patch 4:
k8s_fedora: Update coredns config to pass e2e
To pass the e2e conformance tests, coredns needs to be configured
with the kubernetes plugin's 'pods verified' mode. Otherwise, pods
won't be resolvable [1].
[1] https://github.com/coredns/coredns/tree/master/plugin/kubernetes
https://review.openstack.org/#/c/528566/
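A sketch of the relevant Corefile fragment inside the coredns
ConfigMap; the zones and other plugins are illustrative, 'pods
verified' is the point:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: coredns
      namespace: kube-system
    data:
      Corefile: |
        .:53 {
            errors
            kubernetes cluster.local in-addr.arpa ip6.arpa {
                pods verified
            }
            proxy . /etc/resolv.conf
            cache 30
        }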
Closes-Bug: #1738633
Change-Id: Ibd5245ca0f5a11e1d67a2514cebb2ffe8aa5e7de