magnum

Commit Graph

Author	SHA1	Message	Date
Dale Smith	dc2b3724f5	Support k8s 1.27: Remove unsupported kubelet arg This argument has been defined for containerd clusters in Magnum, and is set to the default (and only valid) value of 'remote'. Kubelet warning in 1.26: * Flag --container-runtime has been deprecated, will be removed in 1.27 as the only valid value is 'remote' Kubelet error in 1.27: * E0801 03:10:26.723998 8889 run.go:74] "command failed" err="failed to parse kubelet flag: unknown flag: --container-runtime" Change-Id: I072fab1342593941414b86e28b8a76edf2b19a6f	2024-01-02 06:44:13 +00:00
Zuul	2c193622de	Merge "Fix pods unable to send traffic to ClusterIP"	2023-05-10 10:52:51 +00:00
Jake Yip	ae7a50e2af	Fix pods unable to send traffic to ClusterIP Flannel with VXLAN suffers from a bug[1] where pods on the same node are unable to send traffic to a service's ClusterIP when the endpoint is on the same node. This is due to improper NATTing of the return traffic. The fix is to load the br_netfilter module as specified in the kubernetes doc.[2] [1] https://github.com/flannel-io/flannel/issues/1702 [2] https://kubernetes.io/docs/setup/production-environment/container-runtimes/#forwarding-ipv4-and-letting-iptables-see-bridged-traffic Change-Id: Ic182bba9d480421c2cb581558ebde8dfb20421c8	2023-03-29 19:27:17 +11:00
Dale Smith	16baf85716	Support k8s 1.26: remove logtostderr klog args have been removed from kubernetes in 1.26, and deprecated since 1.23. https://github.com/kubernetes/kubernetes/pull/112120 The argument --logtostderr has defaulted to true for a long time, so this removal on older versions should have no impact. Change-Id: I64f934a9bbc39c5e054d8a83b3f6edee061469e6	2023-02-13 23:12:26 +00:00
Dale Smith	5061dc5bb5	Fix kubelet for Fedora CoreOS 36 to provide real resolvconf to containers. In Fedora CoreOS 36 CoreDNS cannot start correctly due to a loopback issue where /etc/resolv.conf is mounted and points to localhost. Tested on Fedora CoreOS 35,36,37, with Docker and containerd. https://coredns.io/plugins/loop/#troubleshooting-loops-in-kubernetes-clusters https://fedoraproject.org/wiki/Changes/systemd-resolved#Detailed_Description Story: 2010519 Depends-On: I3242b718e32c92942ac471bc7e182a42e803005b Change-Id: I8106324ce71d6c22fa99e1a84b5a09743315811a	2023-02-05 09:01:56 +00:00
Dale Smith	b318560b59	Fix pods stuck terminating. If the kubelet container is restarted on a host (during upgrades, or manually) the bind mounts duplicate into /rootfs and kubelet cannot unmount these. This leads to stuck terminating pods that must be resolved with either --force or restart of kubelet container. Adding 'rslave' means that when the kubelet unmounts volumes at /var/lib/kubelet/pods this propagates to the host (using 'rshared'), and back into the container in /rootfs. This bug was likely introduced when mounting of /rootfs was added[0]. [0] `1994e9448a` Change-Id: I44f80ccc97c0eeab98f1edbe4a22763732b7f4da	2022-10-26 00:09:48 +00:00
Daniel Meyerholt	f7cd2928d6	Support K8s 1.24+ Only specify dockershim options when container runtime is not containerd. Those options were ignored in the past when using containerd but since 1.24 kubelet refuses to start. Task: 45282 Story: 2010028 Signed-off-by: Daniel Meyerholt <dxm523@gmail.com> Change-Id: Ib44cc30285c8bd4219d4a45dc956696505ddd570	2022-05-28 13:32:51 +02:00
Feilong Wang	fe75ca3459	Fix kubelet on FCOS 34 Fedora CoreOS 34 has switched from cgroups v1 to cgroups v2 by default, which changes the sysfs hierarchy. Task: 42809 Story: 2009045 Change-Id: I2f9651421370ba44e2f0ddc7bb6526745b62ad40	2021-07-12 11:09:28 +00:00
Zuul	c07628bca6	Merge "Support hyperkube_prefix label"	2021-04-07 19:09:49 +00:00
Bharat Kunwar	fc1f27a569	Support hyperkube_prefix label Additionally for k8s_fedora_coreos_v1 driver: * Introduce hyperkube_prefix which defaults to k8s.gcr.io/ * Bump default kube_tag to v1.18.16 Story: 1668998 Task: 41791 Change-Id: I38b8df45a00f1a2a1604059b8329d1dd762e05cd	2021-02-18 13:18:56 +00:00
Zuul	f64cfa490a	Merge "k8s: Do not use insecure api port"	2021-02-10 10:00:26 +00:00
Spyros Trigazis	1b72456e12	k8s: Do not use insecure api port * in 1.20 8080 is not supported anymore use only 6443 change all probes for health to use kubectl and 6443 * configure the signing key in API story: 2008524 task: 41731 Change-Id: Ibaf1840214016d2dd6ac15e2137eb3cd3d767889 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2021-02-02 09:10:25 +00:00
Bharat Kunwar	7bfd7519af	[k8s-fcos] Fix insecure registry At present, insecure registry doesn't work as expected when Podman is used. This patch addresses the issue by fixing the ignition user data so that Podman is configured correctly. Then it ensures that --insecure-registry flag is provided to Docker in /etc/sysconfig/docker. Story: 2008479 Task: 41519 Change-Id: I2e1c86e0c88ab5b59185fd523e9c9696ce0f951e	2021-01-29 11:01:38 +00:00
Feilong Wang	385bc9700b	Update default values for docker nofile and vm.max_map_count Task: 40801 Story: 2008098 Change-Id: I1802e4002d9aa89a321f130a16fd8021a773b73a	2020-09-01 22:14:53 +00:00
Bharat Kunwar	799563eb61	Remove shebang from scripts Without this, heat container agents using kubectl version 1.18.x (e.g. ussuri-dev) fail because they do not have the correct KUBECONFIG in the environment. Task: 39938 Story: 2007591 Change-Id: Ifc212478ae09c658adeb6ba4c8e8afc8943e3977	2020-06-16 20:53:07 +00:00
Spyros Trigazis	1afaa545bf	atomic: Do not install control-plane on minions apiserver controller-manager and scheduler are not used in the minions. story: 2007568 task: 39837 Change-Id: I93b380c484b7e3881b2aa0620fe41ab9d61c1eec Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2020-05-25 14:50:58 +03:00
Spyros Trigazis	40f40b7772	k8s: Use the same kubectl version as API In the heat-agent we use kubectl to install several deployments, it is better if we use matching versions of kubectl and apiserver to minimize errors. Additionally, the heat-agent won't need kubectl anymore. story: 2007591 task: 39536 Change-Id: If8f6d84efc70606ac0d888c084c82d8c7eff54f8 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-04-24 17:11:13 +00:00
Spyros Trigazis	1ea8db948c	fcos-kubelet: Add rpc-statd dependency To mount nfs volumes with the embedded volume pkg [0], rpc-statd is required and should be started by mount.nfs. When running kubelet in a chroot this fails. With atomic containers it used to work. [0] https://github.com/kubernetes/kubernetes/tree/master/pkg/volume/nfs story: 2005201 task: 39403 Change-Id: Ib64efe7ecbe9a24e86fa9d9a35a4d90c0e8bbf2e Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-04-14 15:02:41 +00:00
Bharat Kunwar	1994e9448a	fcos: Mount /:/rootfs:ro to Kubelet Kubelet fails to handle SELinux labelling of Cinder PV without presenting the rootfs to Kubelet and as a result, an unprivileged container lacks the ability to access the path. With this patch, Kubelet handles the correct labelling automatically when a Cinder PV is attached to a pod. The default behaviour using system containers in Fedora Atomic is to mount rootfs [1] but we did not implement the same behaviour in Fedora CoreOS which was a mistake as this was a missing piece of code. [1] https://github.com/openstack/magnum/blob/master/dockerfiles/kubernetes-kubelet/config.json.template#L335 Story: 2007413 Task: 39129 Change-Id: Id59c604928244bf49773b7519fa756d5b2814b69	2020-03-28 09:13:57 +00:00
Spyros Trigazis	af74b326d0	k8s-fedora: Set max-size to 10m for containers Set the max-size for container/pod logs to 10m and max of 5 rotated files. The values mirror the defaults of kubernetes when it is using a remote container runtime [0] (container-log-max-files and container-log-max-size). These defaults cover the case of containerd. [0] https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/ story: 2007402 task: 39031 Change-Id: Ie3106b40b4d1c6866761c507122047e88e513651 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-03-12 15:18:21 +00:00
Zuul	fa45002e21	Merge "Add opt-in containerd support"	2020-02-25 12:36:33 +00:00
Zuul	32a404fd69	Merge "[k8s] Fix instance ID issue with podman and autoscaler"	2020-02-21 11:39:21 +00:00
Zuul	73b894a671	Merge "core-podman: Mount os-release properly"	2020-02-21 02:05:33 +00:00
Spyros Trigazis	de21e0431a	Add opt-in containerd support New labels: container_runtime, containerd or fallback to host-docker containerd_version, taken from https://github.com/containerd/containerd/releases containerd_tarball_url, eg https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.2.4.linux-amd64.tar.gz containerd_tarball_sha256, sha256 of the above tarball story: 2007317 task: 38823 Change-Id: I6c6599cdee61f508bd2a5e4c454da3125a256753 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2020-02-20 15:47:40 +00:00
Feilong Wang	eb2b688f3e	[k8s] Fix instance ID issue with podman and autoscaler Adding the volume mount for /etc/machine-id so that the kubelet bootstrapped by podman can access the correct instance ID. Without this, autoscaler will fail to delete empty nodes. This issue is reported on the autoscaler repo[1]. [1] https://github.com/kubernetes/autoscaler/issues/2819 Task: 38743 Story: 2007286 Change-Id: I2852f4b255e782bb65b13571502194ee9f455ae3	2020-02-12 21:38:00 +13:00
Xinliang Liu	0b5d029179	Upgrade pause image to version 3.1 The k8s.gcr.io/pause image repository supports multiarch as of version 3.1. When pulling a non-amd64 image, there is no need to go to an arch-specific repository, like pause-arm64. It is easy to support multiarch, like amd64, arm64, ppc64le, when using version 3.1. See 3.1 manifest: https://console.cloud.google.com/gcr/images/google-containers/GLOBAL/pause@sha256:f78411e19d84a252e53bff71a4407a5686c46983a2c2eeed83929b888179acea/details?tab=info Change-Id: Id4a5dfc3bc3c9ce909c7b0a3709506a250b2851f Task: 37841 Story: 2007026	2020-02-11 06:57:11 +00:00
Spyros Trigazis	e731a7cb5e	core-podman: Mount os-release properly To display the node OS-IMAGE in k8s properly we need to mount /usr/lib/os-release, /etc/os-release is just a symlink. story: 2006459 task: 38505 Change-Id: I0c850126c7299cb7a4fe201efee311d76bc14ce6 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2020-01-29 10:03:04 +01:00
Spyros Trigazis	8fa77dae0b	Fix entrypoint for k8s components in podman Upstream k8s images changed the entrypoint to /hyperkube instead of shell. Set the entrypoint to /hyperkube which works for v1.17.x and v1.16.x. podman inspect k8s.gcr.io/hyperkube:v1.16.0 | grep Entrypoint -A 2 podman inspect k8s.gcr.io/hyperkube:v1.17.0 | grep Entrypoint -A 2 "Entrypoint": [ "/hyperkube" ] story: 2007031 task: 37834 Change-Id: I021aeeef9f39dd426c1f335161a3d4b3f51670e8 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2019-12-18 14:17:59 +00:00
Zuul	b07f8ba861	Merge "Support TimeoutStartSec for k8s systemd services"	2019-11-13 11:20:16 +00:00
Feilong Wang	15a4ea14e1	Support TimeoutStartSec for k8s systemd services Now Magnum is using podman and systemd to manage the k8s components. In cases where the nodes pull images from docker.io or another mirror registry with high latency, some of the components may take a long time to start, which causes timeouts when bootstrapping a k8s cluster for fedora atomic/coreos drivers. This patch fixes it by adding TimeoutStartSec for the systemd services. Task: 37251 Story: 2006459 Change-Id: I709bac620e4ceec1858672076eb0aef997704b62	2019-11-13 22:19:51 +13:00
Spyros Trigazis	aa6b3bbeba	k8s_fedora: Add use_podman label Choose whether system containers etcd, kubernetes and the heat-agent will be installed with podman or atomic. This label is relevant for k8s_fedora drivers. k8s_fedora_atomic_v1 defaults to use_podman=false, meaning atomic will be used pulling containers from docker.io/openstackmagnum. use_podman=true is accepted as well, which will pull containers by k8s.gcr.io. k8s_fedora_coreos_v1 defaults and accepts only use_podman=true. Fix upgrade for k8s_fedora_coreos_v1 and magnum-cordon systemd unit. Task: 37242 Story: 2005201 Change-Id: I0d5e4e059cd4f0458746df7c09d2fd47c389c6a0 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-10-23 10:43:52 +00:00
Theodoros Tsioutsias	113fdc44b2	ng-12: Label nodegroup nodes With this change each node will be labeled with the following: * --node-labels=magnum.openstack.org/role=${NODEGROUP_ROLE} * --node-labels=magnum.openstack.org/nodegroup=${NODEGROUP_NAME} Change-Id: Ic410a059b19a1252cdf6eed786964c5c7b03d01c	2019-10-16 11:53:44 +00:00
Spyros Trigazis	73dc57c319	Support Fedora CoreOS 30 Add fedora coreos driver. To deploy clusters with fedora coreos operators or users need to add os_distro=fedora-coreos to the image. The scripts to deploy kubernetes on top are the same with fedora atomic. Note that this driver has selinux enabled. The startup of the heat-container-agent uses a workaround to copy the SoftwareDeployment credentials to /var/lib/cloud/data/cfn-init-data. The fedora coreos driver requires heat train to support ignition. Task: 29968 Story: 2005201 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch> Change-Id: Iffcaa68d385b1b829b577ebce2df465073dfb5a1	2019-10-16 09:44:19 +00:00
Spyros Trigazis	3674b3617a	k8s_atomic: Run all syscontainer with podman Using the atomic cli to install kubelet breaks mount propagation of secrets, configmaps and so on. Using podman in a systemd unit works. Additionally, with this change all atomic commands are dropped, containers are pulled from gcr.io (official kubernetes containers). Finally, after this patch only by starting the heat-agent with ignition, we can use fedora coreos as a drop-in replacement. * Drop del of docker0 This command to remove docker0 is carried from earlier versions of docker. This is not an issue anymore. story: 2006459 task: 36871 Change-Id: I2ed8e02f5295e48d371ac9e1aff2ad5d30d0c2bd Signed-off-by: Spyros Trigazis <spyridon.trigazi@cern.ch>	2019-10-08 09:14:36 +00:00
Spyros Trigazis	bb747ac5e7	k8s_fedora: Move rp_filter=1 for calico up follow up of: I828cec27968ffe0961011e34a66e0eef3e567c91 Move set of sysctl.conf up as it does need to depend on NetworkManager configuration. upstream docs: Cluster nodes must have rp_filter set to strict (1). https://github.com/projectcalico/calico/blob/master/v3.9/getting-started/kubernetes/installation/migration-from-flannel.md story: 2006441 task: 36564 Change-Id: I8a6e970a8ea3d1d3424eab05f1617509cf27d52b Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-10-02 10:16:57 +00:00
Spyros Trigazis	ddf27e935d	Add hostname-override to kube-proxy Pass the node name to kube-proxy and do not rely on the cloud provider to set it. Kube-proxy needs to start before the cloud-provider. Without it kube-proxy fails to find the node in the kubernetes API. story: 2006459 task: 36873 Change-Id: Ie04d8d99e68ee43c9d407dbd6f746f6249337ba2	2019-10-01 18:38:35 +00:00
Spyros Trigazis	3a38cfb2ef	k8s_fedora: Set rp_filter=1 for calico upstream docs: Cluster nodes must have rp_filter set to strict (1). https://github.com/projectcalico/calico/blob/master/v3.9/getting-started/kubernetes/installation/migration-from-flannel.md story: 2006441 task: 36564 Change-Id: I828cec27968ffe0961011e34a66e0eef3e567c91 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-09-11 10:36:57 +00:00
Theodoros Tsioutsias	7871859514	Trivial fix for cluster creation in master This is the fix for the "line 528: KUBE_PROXY_ARGS: unbound variable" error in master. Change-Id: Iaf5bbc8e4946c6625e82b6f68e754328f08b6ce7 Story: 2006492 Task: 36448	2019-09-05 13:55:18 +00:00
Ricardo Rocha	00f518fc59	Take kubeproxy_options into account on proxy setup The label kubeproxy_options was being ignored when setting up both master and minions. Add it to the kube proxy args. Change-Id: Ic830f19e1af062e90d066e6df4df2e4376e4f379 Story: 2006465 Task: 36394	2019-08-28 11:56:35 +02:00
Zuul	04fd0470ad	Merge "k8s: stop introspecting instance name"	2019-08-08 19:50:58 +00:00
Mohammed Naser	2f2d05c826	k8s: stop introspecting instance name We kept introspecting the name of the instance with the assumption that the network always existed under .novalocal This is not always the case, with certain variables changed inside Neutron it is possible to control this, therefore, leading in failing deploys. With this change, we pass the instance name directly to the cluster and therefore we always have the accurate name. Task: 36160 Story: 2006371 Change-Id: I2ba32844b822ffc14da043e6ef7d071bb62a22ee	2019-08-07 21:24:06 +00:00
Lingxian Kong	52155f0e76	Support auto_healing_controller This patch allows the user to choose the auto-healing service by introducing a new label 'auto_healing_controller', currently, 'draino' and 'magnum-auto-healer'[1] are supported. 'draino' is the default value for backward compatibility. [1]: https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-magnum-auto-healer.md Change-Id: I7ff14837a8d7d360b72c8f40733e84c88c4269d4	2019-07-24 17:52:33 +12:00
Spyros Trigazis	afd2403adc	k8s: Clear cni configuration In fedora atomic 29, podman is present and configures its own cni. We need to clear the cni configuration otherwise we will get that cni0 is already used. story: 2006171 task: 35682 Change-Id: Ic70938184bdb98eaaf4f384ce553818cf2624a2a Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-07-04 14:37:40 +02:00
Feilong Wang	cd67553f76	Fix overlay2 + docker_volume_size When using docker_storage_driver=overlay2 plus docker_volume_size > 0, user will run into problem that some pods can't be created. The root cause is kubelet needs the permission for /var/lib/docker to read/write. This patch fixes it by add /var/lib/docker to kubelet container's mount. Task: 30221 Story: 2005314 Change-Id: Ie19c95e6280e16644c686550950359cc9934c719	2019-06-10 10:18:10 +12:00
Feilong Wang	05c27f2d73	[k8s][fedora atomic] Rolling upgrade support Rolling upgrade is an important feature for a managed k8s service. At this stage, two use cases will be covered: 1. Upgrade base operating system 2. Upgrade k8s version Known limitation: When doing operating system upgrade, there is no chance to call kubectl drain to evict pods on that node. Task: 30185 Story: 2002210 Change-Id: Ibbed59bc135969174a20e5243ff8464908801a23	2019-06-07 14:48:08 +12:00
Feilong Wang	75fab6ff37	[fedora_atomic] Support auto healing for k8s Using Node Problem Detector, Draino and AutoScaler to support auto healing for K8s cluster, user can use a new label "auto_healing_enabled" to turn it on/off. Meanwhile, a new label "auto_scaling_enabled" is also introduced to enable the capability to let the k8s cluster auto scale based on its workload. Task: 28923 Story: 2004782 Change-Id: I25af2a72a7a960205929374d2300bd83d4d20960	2019-04-17 14:47:39 +12:00
Jonathan Rosser	2595fda3e3	Ensure http proxy environment is available during 'atomic install' for k8s The scripts run by cloud-init for the master and minion nodes currently write proxy environment variables into /etc/bashrc when they are defined. These variables will only be introduced into the running environment when a new bash shell is started. The /bin/sh used by the fragment scripts will ignore /etc/bashrc, so the new shells invoked per fragment will not have the http proxy variables present. This means that the master/minion node deployment fails when behind an http proxy. This patch adds explicit exports for HTTP_PROXY and HTTPS_PROXY when those variables are defined, and not empty. Task: 29863 Change-Id: Id05c90d5bf99d720ae6002b38d3291e364e1e0c4	2019-03-07 22:16:38 +00:00
Spyros Trigazis	2ab874a5be	[k8s] Make flannel self-hosted Similar to calico, deploy flannel as a DS. Flannel can use the kubernetes API to store data, so it doesn't need to contact the etcd server directly anymore. This patch drops two relatively large files for flannel's config, flannel-config-service.sh and write-flannel-config.sh. All required config is in the manifests. Additional options to the controller manager: --allocate-node-cidrs=true and --cluster-cidr. Change-Id: I4f1129e155e2602299394b5866165260f4ea0df8 story: 2002751 task: 24870	2019-03-05 18:33:45 +01:00
Jim Bach	6c61a1a949	k8s_fedora: Use external kubernetes/cloud-provider-openstack * Use the external cloud-provider [0] * Label master nodes * Make the script that deploys the cloud-provider and clusterroles for the apiserver a SoftwareDeployment * Rename kube_openstack_config to cloud-config; for cinder to work, the kubelet expects the cloud config name only like this. Keep a copy of kube_openstack_config for backwards compatibility. Change-Id: Ife5558f1db4e581b64cc4a8ffead151f7b405702 Task: 22361 Story: 2002652 Co-Authored-By: Spyros Trigazis <spyridon.trigazis@cern.ch>	2018-12-19 10:56:47 +01:00
Zuul	3a50a242d3	Merge "[k8s] Add proxy to master and set cluster-cidr"	2018-08-21 08:46:10 +00:00

81 Commits