Commit Graph

25 Commits

Author SHA1 Message Date
Jake Yip a41c884463 Update cloud-provider-openstack registry
cloud-provider-openstack has changed their image repo. To use the
plugins matching later versions of k8s, this needs to be updated.

Also update tags for CI test to match version being tested.

[1] https://github.com/kubernetes/cloud-provider-openstack/pull/2169

Change-Id: I9390db5e1aa357c17a39a7c208d837befafd3820
2024-02-28 18:57:55 +11:00
Jake Yip 1b1c2122f0 Remove PodSecurityPolicy
PodSecurityPolicy has been removed in Kubernetes v1.25 [1]. To allow Magnum
to support Kubernetes v1.25 and above, PodSecurityPolicy Admission
Controller has been removed.

[1] https://kubernetes.io/docs/concepts/security/pod-security-policy/

Change-Id: I0fb0c372b484275b0677114193289469ee788b84
2023-04-26 20:33:44 +10:00
Spyros c1c9942f8b fcos-k8s: Update to v1.22
* change rbac.authorization.k8s.io/v1beta1 to v1
  * update metrics-server
* change storage.k8s.io/v1beta1 to v1
* drop kubelet-https
* update to FCOS 35

story: 2009828
task: 44416

Signed-off-by: Spyros <strigazi@gmail.com>
Change-Id: I24b89366a4a8e8bc4c90f6a85ef6de2ac77dae1d
2022-02-03 13:59:32 +00:00
Thomas George Hartland 04477b13f8 Add resource requests for system components
Set resource requests for system pods to
guarantee at least some amount of resources.
This prevents them from being starved of
CPU/memory when running alongside resource
intensive workloads in the cluster and
gives them a higher quality of service class.

metrics-server:
  100m/200Mi recommended for up to 100 node clusters.
  https://github.com/kubernetes-sigs/metrics-server#scaling

openstack-cloud-controller-manager:
  200m CPU taken from example manifests.

kubernetes-dashboard:
  100m/100Mi taken from helm chart defaults.
  heapster:
    100m/128Mi taken from helm chart defaults.
  influxdb:
    100m/256Mi taken from influx helm chart defaults.
  grafana (for influxdb):
    100m/200Mi same as monitoring grafana.

ingress-traefik:
  100m/50Mi taken from helm chart defaults.

cluster-autoscaler:
  100m/300Mi taken from helm chart defaults.

csi-cinder-nodeplugin:
  25m CPU on both containers to ensure
  Burstable QoS class.

csi-cinder-controllerplugin:
  20m CPU on all containers to ensure
  Burstable QoS class.

tiller-deploy:
  25m CPU to ensure it can always handle
  the readiness probe.

octavia-ingress-controller:
  50m CPU, just a guess really.
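
The QoS reasoning above can be sketched in Python. This is a simplified
model of the Kubernetes QoS rules, not code from this patch: the function
name and the dict layout are hypothetical, and quantities are compared as
plain strings (so "100m" and "0.1" would not be treated as equal).

```python
def qos_class(containers):
    """Classify a pod from its containers' resource settings.

    containers: list of dicts with optional 'requests' and 'limits' maps
    (hypothetical layout used only for this sketch).
    """
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        for resource in ("cpu", "memory"):
            req = requests.get(resource)
            lim = limits.get(resource)
            if req is not None or lim is not None:
                any_set = True
            # Guaranteed needs a limit for every resource, and any
            # explicit request must equal that limit.
            if lim is None or (req is not None and req != lim):
                guaranteed = False
    if not any_set:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"
```

Setting only a small CPU request, as done for the csi-cinder plugins, is
enough to move a pod from BestEffort to Burstable in this model.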

Story: 2008825
Task: 42290
Change-Id: Ifcd764c00d7046744ba63609078cc6c5d02fdc1c
2021-11-26 09:52:45 +00:00
Bartosz Bezak 12766eaff8 Add cloud-provider flag to openstack cloud control manager
Recent OpenStack Cloud Controller Manager (OCCM) releases fail without the
cloud-provider flag. From v1.21.0 onwards --cloud-provider cannot be
empty, and OCCM reports:
  Error: --cloud-provider cannot be empty

Additionally, add create role for serviceaccounts/token resource [1].

[1] 7d844dac9d/manifests/controller-manager/cloud-controller-manager-roles.yaml (L52-L57)

Story: 2009023
Task: 42745
Change-Id: I55042665c25704cd65eb4e4883f8a796bdcdaa7f
2021-08-03 13:51:55 +00:00
Spyros Trigazis 1b72456e12 k8s: Do not use insecure api port
* in 1.20 8080 is not supported anymore
** use only 6443
** change all probes for health to use kubectl and 6443
* configure the signing key in API

story: 2008524
task: 41731

Change-Id: Ibaf1840214016d2dd6ac15e2137eb3cd3d767889
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
2021-02-02 09:10:25 +00:00
Bharat Kunwar 799563eb61 Remove shebang from scripts
Without this, heat container agents using kubectl version
1.18.x (e.g. ussuri-dev) fail because they do not have the correct
KUBECONFIG in the environment.

Task: 39938
Story: 2007591

Change-Id: Ifc212478ae09c658adeb6ba4c8e8afc8943e3977
2020-06-16 20:53:07 +00:00
Feilong Wang 076547e170 [k8s] Improve the taint of master node kubelet
Improve the taint of master node kubelet to get the conformance
test passed and update the OCCM and Helm/Tiller tolerations accordingly.

Task: 39223
Story: 2007256

Change-Id: Ief452e05ddf13a1d1ee77641311c3ae7abbe90f2
2020-04-01 09:15:16 +13:00
Feilong Wang d61dd1d5b5 [k8s] Support post install manifest URL
A new config option `post_install_manifest_url` is added to support
installing a cloud provider/vendor specific manifest after the k8s
cluster has booted. It is a URL pointing to the manifest file. For
example, a cloud admin can put their specific storageclass into
this file, and it will be set up automatically after the cluster
is created.

Task: 35798
Story: 2006209

Change-Id: Ib5a2c5cd7970085db941f189613e175f622aea3f
2020-03-05 20:30:12 +13:00
Bharat Kunwar b2393220c6 [k8s] Fix RBAC for OCCM v1.17.0
At present, the openstack cloud controller manager tag v1.17.0 is broken
due to a missing RBAC policy for leases. This patch addresses this
shortcoming, thereby allowing the nodes in the cluster to be
untainted.

story: 2007031
task: 37838

Change-Id: Ide46d90dd30b41edaeaa8632205cc23b9ba7f162
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>
2019-12-18 17:28:10 +00:00
Feilong Wang cab9492dff [k8s] Fix rolling upgrade with podman
The podman support patch introduced some small regressions. In
addition, since k8s v1.16 daemonsets have moved from extensions to
apps/v1 [1], so we need to update the system:node-drainer ClusterRole
so that kubectl can be called on a worker node to trigger the drain
process. Both issues are fixed in this patch.

[1] https://kubernetes.io/docs/setup/release/notes/#deprecations-and-removals

Task: 37642
Story: 2005201

Change-Id: I87ed49fd1e9cd513ae54f6758717379adafae3a4
2019-12-03 17:15:25 +00:00
Bharat Kunwar eebcc9b7a1 Fix k8s deployment when cluster_user_trust=False
At the moment, cluster deployment fails when cluster_user_trust=False.
This is because the entire SoftwareDeployment exits rather than a single
script fragment. This patch fixes this by scoping the remainder of the
script conditional on whether TRUST_ID is defined.

Finally, default `cloud_provider_enabled` to false when
`cluster_user_trust` is false. Raise an error when
`cloud_provider_enabled` is overridden to true when `cluster_user_trust`
is false. This ensures that the minion kubelet is correctly configured.
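
The defaulting and validation rule above can be sketched in Python. The
function name is hypothetical and this is not the code from the patch;
the default when `cluster_user_trust` is true is an assumption, since the
message only states the false case.

```python
def resolve_cloud_provider_enabled(cluster_user_trust, cloud_provider_enabled=None):
    """Return the effective cloud_provider_enabled value.

    cloud_provider_enabled=None means the user did not override the label.
    """
    if cloud_provider_enabled is None:
        # No override: false when trust is false. Following the trust
        # setting in the true case is an assumption of this sketch.
        return bool(cluster_user_trust)
    if cloud_provider_enabled and not cluster_user_trust:
        # Explicit override to true without a trust is rejected.
        raise ValueError(
            "cloud_provider_enabled=true requires cluster_user_trust=true"
        )
    return bool(cloud_provider_enabled)
```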

Change-Id: Ibd9270c87bfa5d2f490e2e226e33ca56696d9e81
Story: 2006531
Task: 36587
2019-09-20 03:49:03 +00:00
Spyros Trigazis 7267c1ea43 k8s_fedora_atomic: Add PodSecurityPolicy
For moving to 1.15.x and beyond we need to have PSP for privileged pods.
flannel, calico and node-problem-detector need it.

PSP
story: 2006515
task: 36513

Allow-priv
story: 2006252
task: 35867

Change-Id: I306a249afb275fdbd71354ed75043ffc4d466304
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
2019-09-11 08:38:42 +00:00
Bharat Kunwar 425fb0fa32 Add network config to stabilise multi-NIC scenario
When there is more than one NIC attached to an instance, openstack cloud
provider returns a random InternalIP back to the host resulting in instability
with API server which only talks to a default interface.

This patch incorporates the changes made in
https://github.com/kubernetes/cloud-provider-openstack/pull/444 which enables
OpenStack Cloud Controller Manager (OCCM) to respect the
`internal-network-name` in cloud-config file which ensures that InternalIP
remains stable.

Uses a separate cloud-config file for OCCM to ensure in-tree Cinder volumes
remain compatible.

Change-Id: Idfa52ed2d512e7dc383a556371e896205dd542f9
Story: 2005333
Task: 30271
2019-07-29 09:07:26 +00:00
Feilong Wang 05c27f2d73 [k8s][fedora atomic] Rolling upgrade support
Rolling upgrade is an important feature for a managed k8s service.
At this stage, two use cases are covered:

1. Upgrade base operating system
2. Upgrade k8s version

Known limitation: When doing operating system upgrade, there is no
chance to call kubectl drain to evict pods on that node.

Task: 30185
Story: 2002210

Change-Id: Ibbed59bc135969174a20e5243ff8464908801a23
2019-06-07 14:48:08 +12:00
Diogo Guerra 21acb8dc9a Fix openstack-cloud-controller-manager restarts
Openstack-cloud-controller-manager restarts several times during
cluster creation.

This happens because cloud-controller-manager starts running before
needed secrets exist in kubernetes. Cloud-controller-manager lists secrets
and if the secret exists it uses it and moves on, but if the secret
doesn't exist it starts a watch until it does. As this is not allowed the
pod fails.

This is triggered by Issue
https://github.com/kubernetes/cloud-provider-openstack/issues/545

Story: 2005270

Change-Id: If8f34dc45b3b8a76e3d561ed41b4d0a783ceecb5
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
2019-03-20 14:55:23 +01:00
Ricardo Rocha ca442a7202 [k8s] Add trustee as a secret in kube-system
Add a new secret in kube-system holding the trustee information. This is
useful for any service running within kubernetes needing to contact
OpenStack services.

Change-Id: I1939fb6a33c9eb6a45697d070f58c9510be774b3
2019-02-20 09:45:14 +01:00
Lingxian Kong 49d0444974 Do not use 'exit' in the script
The scripts in the kube_cluster_config SoftwareConfig resource are combined
together as one script inside the VM, so any 'exit' statement stops the
scripts that follow it from executing.
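
The effect can be reproduced with a small Python sketch that joins shell
fragments into one script, the way the combined SoftwareConfig behaves.
The fragments and the SKIP variable are hypothetical and are not taken
from the Magnum scripts; the sketch assumes a POSIX `sh` is on the PATH.

```python
import subprocess


def run_combined(fragments):
    """Join script fragments into one shell script and return stdout lines."""
    script = "\n".join(fragments)
    result = subprocess.run(
        ["sh", "-c", script], capture_output=True, text=True, check=False
    )
    return result.stdout.splitlines()


# The middle fragment bails out with 'exit', which ends the whole script.
with_exit = [
    "echo one",
    'SKIP=1\nif [ "$SKIP" = 1 ]; then exit 0; fi\necho two',
    "echo three",
]

# The middle fragment guards its own body instead of exiting.
guarded = [
    "echo one",
    'SKIP=1\nif [ "$SKIP" != 1 ]; then\n  echo two\nfi',
    "echo three",
]
```

With `with_exit` the third fragment never runs; with `guarded` only the
middle fragment's body is skipped and the third fragment still runs.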

Change-Id: I25965c663e6e1ca5a59d0f28098174810bd76df1
2019-01-16 13:45:55 +13:00
Jim Bach 6c61a1a949 k8s_fedora: Use external kubernetes/cloud-provider-openstack
* Use the external cloud-provider [0]
* Label master nodes
* Make the script that deploys the cloud-provider and clusterroles
  for the apiserver a SoftwareDeployment
* Rename kube_openstack_config to cloud-config;
  for cinder to work, the kubelet expects the cloud config to have
  exactly this name. Keep a copy of kube_openstack_config for backwards
  compatibility.

Change-Id: Ife5558f1db4e581b64cc4a8ffead151f7b405702
Task: 22361
Story: 2002652
Co-Authored-By: Spyros Trigazis <spyridon.trigazis@cern.ch>
2018-12-19 10:56:47 +01:00
Lingxian Kong cae7fa21b6 [k8s] Cluster creation speedup
- Start workers as soon as the master VM is created, rather than
  waiting for all the services to be ready.
- Move all the SoftwareDeployment outside of kubemaster stack.
- Tweak the scripts in SoftwareDeployment so that they can be combined
  into a single script.

Story: 2004573
Task: 28347
Change-Id: Ie48861253615c8f60b34a2c1e9ad6b91d3ae685e
Co-Authored-By: Lingxian Kong <anlin.kong@gmail.com>
2018-12-15 11:59:57 +00:00
Feilong Wang f6d1c0de46 Fix etcd race condition issue
Currently, Magnum uses the k8s API /version endpoint to check API
availability, which is not a good way because /version only
reflects whether the basic k8s API is working or not, and it
returns a response even when the etcd service is down. This patch
fixes it by using /healthz instead of /version.
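
A readiness wait based on /healthz could look like the following Python
sketch. It is illustrative only and not the script changed by this patch:
the function names, retry count and delay are assumptions, and TLS and
authentication are left out.

```python
import time
import urllib.error
import urllib.request


def http_probe(url, timeout=5):
    """Return the response body as text, or None if the request fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", "replace").strip()
    except (urllib.error.URLError, OSError):
        return None


def wait_for_healthz(base_url, probe=http_probe, attempts=60, delay=5.0,
                     sleep=time.sleep):
    """Poll <base_url>/healthz until it answers 'ok' or attempts run out."""
    url = base_url.rstrip("/") + "/healthz"
    for attempt in range(attempts):
        # Only the literal body 'ok' counts as healthy; an error body or a
        # failed request means the API (or a dependency) is not ready yet.
        if probe(url) == "ok":
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False
```

Passing `probe` and `sleep` as parameters keeps the loop testable without
a running API server.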

Task: 22566
Story: 1775759

Change-Id: I45a1bd48a22842a251dafa6c349f0022fd319e3f
2018-07-11 15:57:56 +12:00
Spyros Trigazis 4c5d38adef k8s_fedora: Create admin cluster-role
Create the admin cluster role for k8s_fedora_atomic. It is defined in
the configuration but it wasn't applied.

story: 1766284
task: 22208

Change-Id: I112fe2ddb1d5400fcbc73bbdbc8d483d5a92d120
2018-06-18 13:59:14 +02:00
Feilong Wang 3c72d7b88b Fix race condition issue for k8s multi masters
When creating a multi-master cluster, all master nodes will attempt to
create kubernetes resources in the cluster at the same time, like
coredns, the dashboard, calico etc. This race condition shouldn't be
a problem when doing declarative calls instead of imperative (kubectl
apply instead of create). However, due to [1], kubectl fails to apply
the changes and the deployment scripts fail, causing cluster creation
to fail in the case of Heat SoftwareDeployments. This patch passes the
ResourceGroup index of every master so that resource creation will be
attempted only from the first master node.
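
The gating idea can be sketched in Python. The function name and the
manifest names are hypothetical, and the sketch assumes the ResourceGroup
index of the first master is 0; the actual patch does this in the
deployment scripts.

```python
def manifests_to_apply(master_index, manifests):
    """Return the manifests this master should create.

    master_index: ResourceGroup index passed to the node (string or int).
    Only the first master (index 0, an assumption of this sketch) creates
    cluster resources; the others create nothing, which avoids the race.
    """
    if int(master_index) != 0:
        return []
    return list(manifests)
```

For example, `manifests_to_apply("1", ["coredns.yaml"])` returns an empty
list, so the second master skips resource creation entirely.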

[1] https://github.com/kubernetes/kubernetes/issues/44165

Task: 21673
Story: 1775759

Change-Id: I83f78022481aeef945334c37ac6c812bba9791fd
2018-06-14 09:16:32 +12:00
Spyros Trigazis 91d5229b9c k8s_fedora: Add admin user
Add an admin service account and give it the
cluster role. It can be used to access apps
with token authentication like the
kubernetes-dashboard.

Remove the cluster role from the dashboard service account.

Change-Id: I7980c0e72b0d71921e42af7338d02b8a1e563c34
Closes-Bug: #1766284
2018-04-25 12:22:43 +00:00
Spyros Trigazis 2329cb7fb4 k8s: Fix kubelet, add RBAC and pass e2e tests
Because the fedora atomic driver has several small connected
patches, this patch includes 4 smaller patches.

Patch 1:
k8s: Do not start kubelet and kube-proxy on master

Patch [1] misses the removal of kubelet and kube-proxy from
enable-services-master.sh, so they are started if they
exist in the image; otherwise the script fails.

https://review.openstack.org/#/c/533593/
Closes-Bug: #1726482

Patch 2:
k8s: Set require-kubeconfig when needed

From kubernetes 1.8 [1] --require-kubeconfig is deprecated and
in kubernetes 1.9 it is removed.

Add --require-kubeconfig only for k8s <= 1.8.
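
The version gate can be sketched in Python. The helper names are
hypothetical and the patch itself implements this in the node
configuration scripts; the point of the sketch is that versions are
compared numerically, so that v1.10 is not mistaken for something older
than v1.8.

```python
def parse_minor_version(tag):
    """Turn a tag such as 'v1.8.5' into a (major, minor) tuple of ints."""
    parts = tag.lstrip("v").split(".")
    return int(parts[0]), int(parts[1])


def kubelet_extra_args(kube_tag):
    """Return --require-kubeconfig only for k8s <= 1.8."""
    if parse_minor_version(kube_tag) <= (1, 8):
        return ["--require-kubeconfig"]
    return []
```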

[1] https://github.com/kubernetes/kubernetes/issues/36745

Closes-Bug: #1718926

https://review.openstack.org/#/c/534309/

Patch 3:
k8s_fedora: Add RBAC configuration

* Make certificates and kubeconfigs compatible
  with NodeAuthorizer [1].
* Add CoreDNS roles and rolebindings.
* Create the system:kube-apiserver-to-kubelet ClusterRole.
* Bind the system:kube-apiserver-to-kubelet ClusterRole to
  the kubernetes user.
* remove creation of kube-system namespaces, it is created
  by default
* update client cert generation in the conductor with
  kubernetes' requirements
* Add --insecure-bind-address=127.0.0.1 to work on
  multi-master too. The controller manager on each
  node needs to contact the apiserver (on the same node)
  on 127.0.0.1:8080

[1] https://kubernetes.io/docs/admin/authorization/node/

Closes-Bug: #1742420
Depends-On: If43c3d0a0d83c42ff1fceffe4bcc333b31dbdaab
https://review.openstack.org/#/c/527103/

Patch 4:
k8s_fedora: Update coredns config to pass e2e

To pass the e2e conformance tests, coredns needs to
be configured with POD-MODE verified. Otherwise, pods
won't be resolvable [1].

[1] https://github.com/coredns/coredns/tree/master/plugin/kubernetes

https://review.openstack.org/#/c/528566/
Closes-Bug: #1738633

Change-Id: Ibd5245ca0f5a11e1d67a2514cebb2ffe8aa5e7de
2018-02-08 13:35:00 +00:00