magnum

Commit Graph

Author	SHA1	Message	Date
Dale Smith	dc2b3724f5	Support k8s 1.27: Remove unsupported kubelet arg This argument has been defined for containerd clusters in Magnum, and is set to the default (and only valid) value of 'remote'. Kubelet warning in 1.26: * Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote' Kubelet error in 1.27: * E0801 03:10:26.723998 8889 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime" Change-Id: I072fab1342593941414b86e28b8a76edf2b19a6f	2024-01-02 06:44:13 +00:00
Zuul	2c193622de	Merge "Fix pods unable to send traffic to ClusterIP"	2023-05-10 10:52:51 +00:00
Jake Yip	ae7a50e2af	Fix pods unable to send traffic to ClusterIP Flannel with VXLAN suffers from a bug[1] where pods on the same node are unable to send traffic to a service's ClusterIP when the endpoint is on the same node. This is due to improper NATTing of the return traffic. The fix is to load the br_netfilter module as specified in the kubernetes doc.[2] [1] https://github.com/flannel-io/flannel/issues/1702 [2] https://kubernetes.io/docs/setup/production-environment/container-runtimes/#forwarding-ipv4-and-letting-iptables-see-bridged-traffic Change-Id: Ic182bba9d480421c2cb581558ebde8dfb20421c8	2023-03-29 19:27:17 +11:00
Dale Smith	16baf85716	Support k8s 1.26: remove logtostderr klog args have been removed from kubernetes in 1.26, and deprecated since 1.23. https://github.com/kubernetes/kubernetes/pull/112120 The argument --logtostderr has defaulted to true for a long time, so this removal on older versions should have no impact. Change-Id: I64f934a9bbc39c5e054d8a83b3f6edee061469e6	2023-02-13 23:12:26 +00:00
Dale Smith	5061dc5bb5	Fix kubelet for Fedora CoreOS 36 to provide real resolvconf to containers. In Fedora CoreOS 36 CoreDNS cannot start correctly due to a loopback issue where /etc/resolv.conf is mounted and points to localhost. Tested on Fedora CoreOS 35,36,37, with Docker and containerd. https://coredns.io/plugins/loop/#troubleshooting-loops-in-kubernetes-clusters https://fedoraproject.org/wiki/Changes/systemd-resolved#Detailed_Description Story: 2010519 Depends-On: I3242b718e32c92942ac471bc7e182a42e803005b Change-Id: I8106324ce71d6c22fa99e1a84b5a09743315811a	2023-02-05 09:01:56 +00:00
Dale Smith	b318560b59	Fix pods stuck terminating. If the kubelet container is restarted on a host (during upgrades, or manually) the bind mounts duplicate into /rootfs and kubelet cannot unmount these. This leads to stuck terminating pods that must be resolved with either --force or restart of kubelet container. Adding 'rslave' means that when the kubelet unmounts volumes at /var/lib/kubelet/pods this propagates to the host (using 'rshared'), and back into the container in /rootfs. This bug was likely introduced when mounting of /rootfs was added[0]. [0] `1994e9448a` Change-Id: I44f80ccc97c0eeab98f1edbe4a22763732b7f4da	2022-10-26 00:09:48 +00:00
Daniel Meyerholt	f7cd2928d6	Support K8s 1.24+ Only specify dockershim options when container runtime is not containerd. Those options were ignored in the past when using containerd but since 1.24 kubelet refuses to start. Task: 45282 Story: 2010028 Signed-off-by: Daniel Meyerholt <dxm523@gmail.com> Change-Id: Ib44cc30285c8bd4219d4a45dc956696505ddd570	2022-05-28 13:32:51 +02:00
Spyros	c1c9942f8b	fcos-k8s: Update to v1.22 * change rbac.authorization.k8s.io/v1beta1 to v1 * update metrics-server * change storage.k8s.io/v1beta1 to v1 * drop kubelet-https * update to FCOS 35 story: 2009828 task: 44416 Signed-off-by: Spyros <strigazi@gmail.com> Change-Id: I24b89366a4a8e8bc4c90f6a85ef6de2ac77dae1d	2022-02-03 13:59:32 +00:00
Feilong Wang	fe75ca3459	Fix kubelet on FCOS 34 Fedora CoreOS 34 has switched from cgroups v1 to cgroups v2 by default, which changes the sysfs hierarchy. Task: 42809 Story: 2009045 Change-Id: I2f9651421370ba44e2f0ddc7bb6526745b62ad40	2021-07-12 11:09:28 +00:00
Zuul	bfffeca927	Merge "Revert "[K8S] Enable --use-service-account-credentials""	2021-04-19 09:28:15 +00:00
Bharat Kunwar	ec0927242e	Revert "[K8S] Enable --use-service-account-credentials" This reverts commit `e9b4889670`. Reason for revert: breaks cluster deployment Change-Id: Ifefc3715acf8a87bf75c1d1aa0297db6b8333431	2021-04-16 13:05:38 +00:00
Zuul	c07628bca6	Merge "Support hyperkube_prefix label"	2021-04-07 19:09:49 +00:00
Zuul	4ce323f760	Merge "Add separated CA cert for etcd and front-proxy"	2021-04-07 12:35:10 +00:00
Feilong Wang	16344a5a95	Add separated CA cert for etcd and front-proxy Support creating different CA certs for k8s, etcd and front-proxy for security hardening. We're following some best practices[1][2] but adjusted based on the current Magnum deployment approach. [1] https://kubernetes.io/docs/setup/best-practices/certificates/ [2] https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/ Task: 40687 Story: 2008031 Change-Id: I523a4a85867f82d234ba1f3e6fad8b8cd2291182	2021-04-01 17:31:34 +00:00
Lingxian Kong	e9b4889670	[K8S] Enable --use-service-account-credentials Enable the config --use-service-account-credentials=true. This is necessary to support Pod Security Policy[1]. See https://kubernetes.io/docs/reference/command-line-tools-reference/kube-controller-manager/ for the option description, and more information here[2]. [1]: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#troubleshooting [2]: https://docs.datadoghq.com/security_monitoring/default_rules/cis-kubernetes-1.5.1-1.3.3/ Change-Id: I053808fac72a63af7ebf6f33d94659134b6cbdac	2021-03-30 19:04:42 +13:00
Bharat Kunwar	fc1f27a569	Support hyperkube_prefix label Additionally for k8s_fedora_coreos_v1 driver: * Introduce hyperkube_prefix which defaults to k8s.gcr.io/ * Bump default kube_tag to v1.18.16 Story: 1668998 Task: 41791 Change-Id: I38b8df45a00f1a2a1604059b8329d1dd762e05cd	2021-02-18 13:18:56 +00:00
Zuul	f64cfa490a	Merge "k8s: Do not use insecure api port"	2021-02-10 10:00:26 +00:00
Zuul	f6dafb5084	Merge "Make kubelet and kube-proxy use the secure port"	2021-02-10 10:00:18 +00:00
Spyros Trigazis	1b72456e12	k8s: Do not use insecure api port * in 1.20 8080 is not supported anymore use only 6443 change all probes for health to use kubectl and 6443 * configure the signing key in API story: 2008524 task: 41731 Change-Id: Ibaf1840214016d2dd6ac15e2137eb3cd3d767889 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2021-02-02 09:10:25 +00:00
Bharat Kunwar	7bfd7519af	[k8s-fcos] Fix insecure registry At present, insecure registry doesn't work as expected when Podman is used. This patch addresses the issue by fixing the ignition user data so that Podman is configured correctly. Then it ensures that --insecure-registry flag is provided to Docker in /etc/sysconfig/docker. Story: 2008479 Task: 41519 Change-Id: I2e1c86e0c88ab5b59185fd523e9c9696ce0f951e	2021-01-29 11:01:38 +00:00
Spyros Trigazis	d11f4e8393	Make kubelet and kube-proxy use the secure port Create certificates for kubelet and kube-proxy on control-plane nodes similar to worker nodes. Use the secure kube-apiserver port on control-plane nodes. story: 2008524 task: 41602 Change-Id: Ibeb32a24ca25914cab32c63a9ccafaf711148a84 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2021-01-15 12:27:54 +00:00
Feilong Wang	a837b5c03d	Update default k8s admission controller list There are two issues with the current k8s admission controller list: 1. The existing default list is not consistent between the cases where the user passes in extra controllers and where they do not 2. The existing list is out of date. The new list is based on the considerations below: 1. Get the default list based on k8s v1.16.x[1] because it's the oldest supported version. 2. Keep it consistent whether or not the user passes in extra controllers 3. Keep all the admission controllers we have used in the code [1] https://v1-16.docs.kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#which-plugins-are-enabled-by-default Task: 40767 Story: 2008076 Change-Id: Ie5b89b97710d2e2d41c9ce4f3ec30046390acbeb	2020-10-12 21:44:26 +00:00
Feilong Wang	662b831fc7	Drop KUBE_API_PORT for kube-apiserver From k8s v1.19.x, kube-apiserver binary can't accept any parameter, and actually we're not using the pass-in KUBE_API_PORT. So it's safe to drop it. Change-Id: I12a0bb3441d18c3b68a8db4ab3234e04e5218cd2	2020-09-07 21:08:54 +12:00
Bharat Kunwar	799563eb61	Remove shebang from scripts Without this, heat container agents using kubectl version 1.18.x (e.g. ussuri-dev) fail because they do not have the correct KUBECONFIG in the environment. Task: 39938 Story: 2007591 Change-Id: Ifc212478ae09c658adeb6ba4c8e8afc8943e3977	2020-06-16 20:53:07 +00:00
Spyros Trigazis	40f40b7772	k8s: Use the same kubectl version as API In the heat-agent we use kubectl to install several deployments, it is better if we use matching versions of kubectl and apiserver to minimize errors. Additionally, the heat-agent won't need kubectl anymore. story: 2007591 task: 39536 Change-Id: If8f6d84efc70606ac0d888c084c82d8c7eff54f8 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-04-24 17:11:13 +00:00
Spyros Trigazis	5b10eb7001	k8s: Add admin.conf kubeconfig A proper kubeconfig is required since 1.18 [0], instead of talking to apiserver's default insecure port :8080. [0] https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#kubectl story: 2007591 task: 39535 Change-Id: I15ef91bbec20a8037d47902225eabb3082579705 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-04-23 12:42:19 +00:00
Spyros Trigazis	1ea8db948c	fcos-kubelet: Add rpc-statd dependency To mount nfs volumes with the embedded volume pkg [0], rpc-statd is required and should be started by mount.nfs. When running kubelet in a chroot this fails. With atomic containers it used to work. [0] https://github.com/kubernetes/kubernetes/tree/master/pkg/volume/nfs story: 2005201 task: 39403 Change-Id: Ib64efe7ecbe9a24e86fa9d9a35a4d90c0e8bbf2e Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-04-14 15:02:41 +00:00
Feilong Wang	076547e170	[k8s] Improve the taint of master node kubelet Improve the taint of master node kubelet to get the conformance test passed and update the OCCM and Helm/Tiller tolerations accordingly. Task: 39223 Story: 2007256 Change-Id: Ief452e05ddf13a1d1ee77641311c3ae7abbe90f2	2020-04-01 09:15:16 +13:00
Bharat Kunwar	1994e9448a	fcos: Mount /:/rootfs:ro to Kubelet Kubelet fails to handle SELinux labelling of Cinder PV without presenting the rootfs to Kubelet and as a result, an unprivileged container lacks the ability to access the path. With this patch, Kubelet handles the correct labelling automatically when a Cinder PV is attached to a pod. The default behaviour using system containers in Fedora Atomic is to mount rootfs [1] but we did not implement the same behaviour in Fedora CoreOS which was a mistake as this was a missing piece of code. [1] https://github.com/openstack/magnum/blob/master/dockerfiles/kubernetes-kubelet/config.json.template#L335 Story: 2007413 Task: 39129 Change-Id: Id59c604928244bf49773b7519fa756d5b2814b69	2020-03-28 09:13:57 +00:00
Zuul	3022f530b6	Merge "k8s-fedora: Set max-size to 10m for containers"	2020-03-16 10:05:45 +00:00
Zuul	305a0095ff	Merge "Add cinder_csi_enabled label"	2020-03-16 06:43:47 +00:00
Spyros Trigazis	af74b326d0	k8s-fedora: Set max-size to 10m for containers Set the max-size for container/pod logs to 10m and max of 5 rotated files. The values match the defaults of kubernetes when it is using a remote container runtime [0] (container-log-max-files and container-log-max-size). These defaults cover the case of containerd. [0] https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/ story: 2007402 task: 39031 Change-Id: Ie3106b40b4d1c6866761c507122047e88e513651 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-03-12 15:18:21 +00:00
Zuul	fa45002e21	Merge "Add opt-in containerd support"	2020-02-25 12:36:33 +00:00
Zuul	32a404fd69	Merge "[k8s] Fix instance ID issue with podman and autoscaler"	2020-02-21 11:39:21 +00:00
Bharat Kunwar	9565984fd9	Add cinder_csi_enabled label Add support for out of tree Cinder CSI. This is installed when the cinder_csi_enabled=true label is added. This will allow us to eventually deprecate in-tree Cinder. story: 2007048 task: 37868 Change-Id: I8305b9f8c9c37518ec39198693adb6f18542bf2e Signed-off-by: Bharat Kunwar <brtknr@bath.edu>	2020-02-21 10:24:36 +00:00
Zuul	73b894a671	Merge "core-podman: Mount os-release properly"	2020-02-21 02:05:33 +00:00
Spyros Trigazis	de21e0431a	Add opt-in containerd support New labels: container_runtime, containerd or fallback to host-docker containerd_version, taken from https://github.com/containerd/containerd/releases containerd_tarball_url, eg https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.2.4.linux-amd64.tar.gz containerd_tarball_sha256, sha256 of the above tarball story: 2007317 task: 38823 Change-Id: I6c6599cdee61f508bd2a5e4c454da3125a256753 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2020-02-20 15:47:40 +00:00
Feilong Wang	eb2b688f3e	[k8s] Fix instance ID issue with podman and autoscaler Adding the volume mount for /etc/machine-id so that the kubelet bootstrapped by podman can access the correct instance ID. Without this, autoscaler will fail to delete empty nodes. This issue is reported on the autoscaler repo[1]. [1] https://github.com/kubernetes/autoscaler/issues/2819 Task: 38743 Story: 2007286 Change-Id: I2852f4b255e782bb65b13571502194ee9f455ae3	2020-02-12 21:38:00 +13:00
Xinliang Liu	0b5d029179	Upgrade pause image to version 3.1 The k8s.gcr.io/pause image repository supports multiarch as of version 3.1. When pulling an image for a non-amd64 arch, there is no need to go to an arch-specific repository, like pause-arm64. It is easy to support multiarch, like amd64, arm64, ppc64le, when using version 3.1. See 3.1 manifest: https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea/details?tab=info Change-Id: Id4a5dfc3bc3c9ce909c7b0a3709506a250b2851f Task: 37841 Story: 2007026	2020-02-11 06:57:11 +00:00
Spyros Trigazis	e731a7cb5e	core-podman: Mount os-release properly To display the node OS-IMAGE in k8s properly we need to mount /usr/lib/os-release, /etc/os-release is just a symlink. story: 2006459 task: 38505 Change-Id: I0c850126c7299cb7a4fe201efee311d76bc14ce6 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2020-01-29 10:03:04 +01:00
Spyros Trigazis	8fa77dae0b	Fix entrypoint for k8s components in podman Upstream k8s images changed the entrypoint to /hyperkube instead of shell. Set the entrypoint to /hyperkube which works for v1.17.x and v1.16.x. podman inspect k8s.gcr.io/hyperkube:v1.16.0 | grep Entrypoint -A 2 podman inspect k8s.gcr.io/hyperkube:v1.17.0 | grep Entrypoint -A 2 "Entrypoint": [ "/hyperkube" ] story: 2007031 task: 37834 Change-Id: I021aeeef9f39dd426c1f335161a3d4b3f51670e8 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2019-12-18 14:17:59 +00:00
Zuul	b07f8ba861	Merge "Support TimeoutStartSec for k8s systemd services"	2019-11-13 11:20:16 +00:00
Feilong Wang	15a4ea14e1	Support TimeoutStartSec for k8s systemd services Magnum now uses podman and systemd to manage the k8s components. In cases where the nodes pull images from docker.io or another mirror registry with high latency, some of the components may take a long time to start, which causes a timeout when bootstrapping the k8s cluster for the fedora atomic/coreos drivers. This patch fixes it by adding TimeoutStartSec for the systemd services. Task: 37251 Story: 2006459 Change-Id: I709bac620e4ceec1858672076eb0aef997704b62	2019-11-13 22:19:51 +13:00
Spyros Trigazis	aa6b3bbeba	k8s_fedora: Add use_podman label Choose whether system containers etcd, kubernetes and the heat-agent will be installed with podman or atomic. This label is relevant for k8s_fedora drivers. k8s_fedora_atomic_v1 defaults to use_podman=false, meaning atomic will be used pulling containers from docker.io/openstackmagnum. use_podman=true is accepted as well, which will pull containers by k8s.gcr.io. k8s_fedora_coreos_v1 defaults and accepts only use_podman=true. Fix upgrade for k8s_fedora_coreos_v1 and magnum-cordon systemd unit. Task: 37242 Story: 2005201 Change-Id: I0d5e4e059cd4f0458746df7c09d2fd47c389c6a0 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-10-23 10:43:52 +00:00
Theodoros Tsioutsias	113fdc44b2	ng-12: Label nodegroup nodes With this change each node will be labeled with the following: * --node-labels=magnum.openstack.org/role=${NODEGROUP_ROLE} * --node-labels=magnum.openstack.org/nodegroup=${NODEGROUP_NAME} Change-Id: Ic410a059b19a1252cdf6eed786964c5c7b03d01c	2019-10-16 11:53:44 +00:00
Spyros Trigazis	73dc57c319	Support Fedora CoreOS 30 Add fedora coreos driver. To deploy clusters with fedora coreos operators or users need to add os_distro=fedora-coreos to the image. The scripts to deploy kubernetes on top are the same with fedora atomic. Note that this driver has selinux enabled. The startup of the heat-container-agent uses a workaround to copy the SoftwareDeployment credentials to /var/lib/cloud/data/cfn-init-data. The fedora coreos driver requires heat train to support ignition. Task: 29968 Story: 2005201 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch> Change-Id: Iffcaa68d385b1b829b577ebce2df465073dfb5a1	2019-10-16 09:44:19 +00:00
Spyros Trigazis	3674b3617a	k8s_atomic: Run all syscontainer with podman Using the atomic cli to install kubelet breaks mount propagation of secrets, configmaps and so on. Using podman in a systemd unit works. Additionally, with this change all atomic commands are dropped, containers are pulled from gcr.io (official kubernetes containers). Finally, after this patch only by starting the heat-agent with ignition, we can use fedora coreos as a drop-in replacement. * Drop del of docker0 This command to remove docker0 is carried from earlier versions of docker. This is not an issue anymore. story: 2006459 task: 36871 Change-Id: I2ed8e02f5295e48d371ac9e1aff2ad5d30d0c2bd Signed-off-by: Spyros Trigazis <spyridon.trigazi@cern.ch>	2019-10-08 09:14:36 +00:00
Spyros Trigazis	bb747ac5e7	k8s_fedora: Move rp_filter=1 for calico up follow up of: I828cec27968ffe0961011e34a66e0eef3e567c91 Move set of sysctl.conf up as it does need to depend on NetworkManager configuration. upstream docs: Cluster nodes must have rp_filter set to strict (1). https://github.com/projectcalico/calico/blob/master/v3.9/getting-started/kubernetes/installation/migration-from-flannel.md story: 2006441 task: 36564 Change-Id: I8a6e970a8ea3d1d3424eab05f1617509cf27d52b Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-10-02 10:16:57 +00:00
Spyros Trigazis	9dc92654d2	k8s_fedora: Label master nodes with kubectl Due to [0], we can not label nodes with node-role.kubernetes.io/master="". We need to do it with the kubernetes API. [0] https://github.com/kubernetes/kubernetes/issues/75457 story: 2006459 task: 36872 Change-Id: I2dc2a125c49f9fc33aa02d3d0c99a5bb0eec1156 Signed-off-by: Spyros Trigazis <spyridon.trigazi@cern.ch>	2019-10-01 18:58:13 +00:00
Spyros Trigazis	ddf27e935d	Add hostname-override to kube-proxy Pass the node name to kube-proxy and do not rely on the cloud provider to set it. Kube-proxy needs to start before the cloud-provider. Without it kube-proxy fails to find the node in the Kubernetes API. story: 2006459 task: 36873 Change-Id: Ie04d8d99e68ee43c9d407dbd6f746f6249337ba2	2019-10-01 18:38:35 +00:00


113 Commits