magnum

Commit Graph

Author	SHA1	Message	Date
Michal Nasiadka	ed699b0c9a	Drop k8s_fedora_atomic_v1 driver Change-Id: I3551ae244ecf99f67a9b142c964c020a5fae70a3	2024-02-27 16:35:35 +00:00
Grzegorz Bialas	9643abc9ae	Upgrade to calico_tag=v3.21.2 Additionally, use fixed subnet CIDR for IP_AUTODETECTION_METHOD supported from v3.16.x onwards. Story: 2007256 Task: 42017 Change-Id: Iaa25cd5054cec5482f01d90e2cd150bcd9700dbe	2022-01-21 08:50:15 +00:00
Zuul	c07628bca6	Merge "Support hyperkube_prefix label"	2021-04-07 19:09:49 +00:00
Bharat Kunwar	fc1f27a569	Support hyperkube_prefix label Additionally for k8s_fedora_coreos_v1 driver: * Introduce hyperkube_prefix which defaults to k8s.gcr.io/ * Bump default kube_tag to v1.18.16 Story: 1668998 Task: 41791 Change-Id: I38b8df45a00f1a2a1604059b8329d1dd762e05cd	2021-02-18 13:18:56 +00:00
Zuul	f6dafb5084	Merge "Make kubelet and kube-proxy use the secure port"	2021-02-10 10:00:18 +00:00
Diogo Guerra	ea64468ab3	3. Configure monitoring apps path based endpoints * Add monitoring_ingress_enabled magnum label to set up ingress with path based routing for all the configured services {alertmanager,grafana,prometheus}. When using this, cluster_root_domain_name magnum label must be used to set up the base path where these services are available. * Add cluster_basic_auth_secret magnum label to configure basic auth on unprotected services {alertmanager and prometheus}. This is only in effect when app access is routed by ingress. * Set services logFormat to json to enable easier machine log parsing. task: 39477 story: 2006765 Depends-On: Ieb90605182626869528349a7fdeed65061914bcb Change-Id: Ie0e7000e0d94b2037f2c398fa67a2a2b7e256bc3 Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2021-02-05 15:52:52 +00:00
Diogo Guerra	37497ccf5b	1. Configurable prometheus monitoring persistent storage * Add metrics_retention_days magnum label allowing user to specify prometheus server scraped metrics retention days (default: 14) * Add metrics_retention_size magnum label allowing user to specify prometheus server metrics storage maximum size in GiB (default: 14) * Add metrics_scrape_interval allowing user to specify prometheus scrape frequency in seconds (default: 30) * Add metrics_storage_class_name allowing user to specify the storageClass to use as external retention for pod fail-over data persistency task: 39509 story: 2006765 Change-Id: I42117837e8e3cd03f3cb723df4d73692ead0d169 Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2021-02-05 15:52:33 +00:00
Spyros Trigazis	d11f4e8393	Make kubelet and kube-proxy use the secure port Create certificates for kubelet and kube-proxy on control-plane nodes similar to worker nodes. Use the secure kube-apiserver port on control-plane nodes. story: 2008524 task: 41602 Change-Id: Ibeb32a24ca25914cab32c63a9ccafaf711148a84 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2021-01-15 12:27:54 +00:00
Mohammed Naser	2c63aca8c6	Stop using delete_on_termination for BFV instances When using delete_on_termination and the booting of the instance fails on the first attempt, the second attempt will fail with Heat. The reason is that with delete_on_termination set to True, Nova will delete the volume when Heat deletes the ERROR'd instance, which then causes the follow-up boot to fail with an error along the lines of unable to find volume, which masks the real failure from the user (which could potentially be a quota issue). With this patch, we no longer set this and instead use the default of false. This will not mean we will leak volumes because when we delete the stack, Heat will do all the right things and delete them in order, making sure the volume disappears eventually. Change-Id: I362cea7bf57825035d13d234d0181a2b1fca5743	2020-08-26 20:53:06 -04:00
Bharat Kunwar	799563eb61	Remove shebang from scripts Without this, heat container agents using kubectl version 1.18.x (e.g. ussuri-dev) fail because they do not have the correct KUBECONFIG in the environment. Task: 39938 Story: 2007591 Change-Id: Ifc212478ae09c658adeb6ba4c8e8afc8943e3977	2020-06-16 20:53:07 +00:00
Bharat Kunwar	a79f8f52f9	[k8s] Use Helm v3 by default - Refactor helm installer to use a single meta chart install job and config which use Helm v3 client. - Use upstream helm client binary instead of using helm-client container maintained by us. To verify checksum, helm_client_sha256 label is introduced for helm_client_tag (or alternatively for URL specified using new helm_client_url label). - Default helm_client_tag=v3.2.1. - Default tiller_tag=v2.16.7, tiller_enabled=false. Story: 2007514 Task: 39295 Change-Id: I9b9633c81afb08b91576a9a4d3c5a0c445e0cee4	2020-05-26 15:23:14 +00:00
Zuul	715a27dcb7	Merge "Update prometheus monitoring chart and images"	2020-05-12 23:01:33 +00:00
Zuul	5ada350502	Merge "[k8s] Upgrade k8s dashboard version to v2.0.0"	2020-05-01 14:20:42 +00:00
Spyros Trigazis	40f40b7772	k8s: Use the same kubectl version as API In the heat-agent we use kubectl to install several deployments, it is better if we use matching versions of kubectl and apiserver to minimize errors. Additionally, the heat-agent won't need kubectl anymore. story: 2007591 task: 39536 Change-Id: If8f6d84efc70606ac0d888c084c82d8c7eff54f8 Signed-off-by: Spyros Trigazis <strigazi@gmail.com>	2020-04-24 17:11:13 +00:00
Feilong Wang	b4965416b1	[k8s] Upgrade k8s dashboard version to v2.0.0 Heapster has been deprecated for a while and the new k8s dashboard 2.0.0 version supports metrics-server now. So it's time to upgrade the default k8s dashboard to v2.0.0. Task: 39101 Story: 2007256 Change-Id: I02f8cb77b472142f42ecc59a339555e60f5f38d0	2020-04-24 16:34:36 +12:00
Diogo Guerra	62a4b8ba09	Update prometheus monitoring chart and images Features: * Add to prometheus federation exported metrics the cluster_uuid label Updates: * prometheus-operator chart tag bumped to 8.12.13 * Update container_infra_prefix to missing prometheusOperator images task: 39540 task: 39541 story: 2006765 Change-Id: I76bca268bf4e0b8c253f112c5665bd2b43fc8d44 Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2020-04-23 17:59:57 +02:00
Diogo Guerra	06659759f1	[k8s] Introduce helm_client_tag label. Added label helm_client_tag to allow user to specify helm client container version. Task: 39294 Story: 2007514 Change-Id: I5d1cf238511951ac4a1849ca66b74dc747865391 Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2020-04-17 12:52:08 +00:00
Bharat Kunwar	fd80e1989f	Add selinux_mode label Fedora Atomic default: permissive Fedora CoreOS default: enforcing Story: 2007413 Task: 39033 Change-Id: Ibc1e02098155ac95bb35fcea5f21cc380bdf0d03 Signed-off-by: Bharat Kunwar <brtknr@bath.edu>	2020-03-28 17:57:25 +00:00
Zuul	305a0095ff	Merge "Add cinder_csi_enabled label"	2020-03-16 06:43:47 +00:00
Feilong Wang	d61dd1d5b5	[k8s] Support post install manifest URL A new config option `post_install_manifest_url` is added to support installing a cloud provider/vendor specific manifest after the k8s cluster has booted. It's a URL pointing to the manifest file. For example, a cloud admin can put their specific storageclass into this file, and it will then be set up automatically after the cluster is created. Task: 35798 Story: 2006209 Change-Id: Ib5a2c5cd7970085db941f189613e175f622aea3f	2020-03-05 20:30:12 +13:00
Bharat Kunwar	9565984fd9	Add cinder_csi_enabled label Add support for out of tree Cinder CSI. This is installed when the cinder_csi_enabled=true label is added. This will allow us to eventually deprecate in-tree Cinder. story: 2007048 task: 37868 Change-Id: I8305b9f8c9c37518ec39198693adb6f18542bf2e Signed-off-by: Bharat Kunwar <brtknr@bath.edu>	2020-02-21 10:24:36 +00:00
Spyros Trigazis	de21e0431a	Add opt-in containerd support New labels: container_runtime, containerd or fallback to host-docker containerd_version, taken from https://github.com/containerd/containerd/releases containerd_tarball_url, eg https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.2.4.linux-amd64.tar.gz containerd_tarball_sha256, sha256 of the above tarball story: 2007317 task: 38823 Change-Id: I6c6599cdee61f508bd2a5e4c454da3125a256753 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2020-02-20 15:47:40 +00:00
Zuul	16ea8b6397	Merge "Fix api-cert-manager=true blocking cluster creation"	2020-02-03 17:53:15 +00:00
Diogo Guerra	1ecec95b8c	Fix api-cert-manager=true blocking cluster creation In the current release, cert-api-manager runs on kubecluster.yaml [1], but in the kubemaster.yaml [2] the script [3] expects the existence of the ca.key file (if cert_api_manager_enabled=true), otherwise it gets blocked. This file (ca.key), in turn, is created only when enable-cert-api-manager.sh runs [4]. So we have a deadlock, and we need to move the call to enable-cert-api-manager.sh into kubemaster.yaml. [1] https://github.com/openstack/magnum/blob/master/magnum/drivers/k8s_fedora_atomic_v1/templates/kubecluster.yaml#L1158-L1161 [2] https://github.com/openstack/magnum/blob/master/magnum/drivers/k8s_fedora_atomic_v1/templates/kubemaster.yaml#L760 [3] https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/enable-services-master.sh#L12-L16 [4] https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/enable-cert-api-manager.sh#L11 On another issue, the chown of this file (ca.key) is not working. Moving the call of this file into kubemaster.yaml makes cluster creation fail because of an error [7] in [5]. If we check a cluster created in stein [6] we notice that the file is owned by root:root. Knowing this, we can comment out [5] for now. [5] https://github.com/openstack/magnum/blob/master/magnum/drivers/common/templates/kubernetes/fragments/enable-cert-api-manager.sh#L13 [6] http://paste.openstack.org/show/788534/ [7] http://paste.openstack.org/show/788537/ Change-Id: Ibee2df435c3f7c34bff74e9146fb28d8367124b1 Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2020-01-17 14:29:36 +01:00
Feilong Wang	a0e62df093	[k8s] Fix volumes availability zone issue For a multi AZ env, if Nova doesn't support cross AZ volume mount, then the cluster creation may fail because of block device mapping error. The patch fixes this issue by passing in the AZ information when creating volumes for etcd, docker and the node root disk. Task: 38131 Story: 2007097 Change-Id: I39c99259abc84cbbee50ac1a827e9349ede6593c	2020-01-16 12:41:26 +13:00
Diogo Guerra	355c71924b	Add calico_ipv4pool_ipip label IPIP Mode to use for the IPv4 POOL created at start up allowed_values: ["Always", "CrossSubnet", "Never", "Off"] default: "Off" Change-Id: Ib834a1f86a6db408047cc8f86fc7744d16d83904 Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2020-01-09 14:22:23 +01:00
Feilong Wang	ad2ef4962c	Fix proxy issue for k8s fedora drivers Due to the big changes made recently to support k8s rolling upgrade, a regression was introduced which broke the proxy function for image downloading. This patch fixes it for both the fedora atomic driver and the fedora coreos driver. Task: 37784 Story: 2007005 Change-Id: I11113d69629e1a97a58e5270f67c7404292b45c3	2019-12-20 09:40:00 +13:00
Diogo Guerra	df52f9c9ea	[k8s] Update metrics-server Magnum allows to use CONTAINER_INFRA_PREFIX to specify a local repository from which we can pull container images. This repository defaults to the upstream one that is specified in the metrics helm chart. * This patch allows for the usage of CONTAINER_INFRA_PREFIX to correctly configure the pull of the metric-server container image from the specified repo. * Add label metrics_server_chart_tag to allow user to specify stable/metrics-server chart tag to use * Add label metrics_server_enabled to allow enable/disable of component (defaults: true) Story: 2004816 Task: 37390 Change-Id: Idc315937a82317b76349bbe8466d900d00194953 Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>	2019-12-16 13:06:24 +01:00
Zuul	1af2826dd9	Merge "Add prometheus-adapter"	2019-12-11 14:17:30 +00:00
Bharat Kunwar	1ad4a9d0a0	[k8s] Add heapster_enabled label Story: 2004816 Task: 37654 Change-Id: Icd7f380d87672c00257e34df385d81e1c3e36ddf Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>	2019-12-11 11:40:47 +00:00
Diogo Guerra	354575804f	Add prometheus-adapter This will install the prometheus-adapter stable helm chart. Requires monitoring_enabled=true. The chart version can be configured using prometheus_adapter_chart_tag and an option is available to overwrite the default configuration rules for a user defined ConfigMap referenced by using prometheus_adapter_configmap label. story: 2006765 task: 37278 Change-Id: I5b86f4455f88c8dbeac6e56942e1ca55f1d1726c Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>	2019-12-10 13:54:39 +01:00
Bharat Kunwar	7d6e344f1a	Add nginx_ingress_controller_chart_tag Additionally, bumping up the Chart version to 1.24.7 without which the ingress controller fails to deploy on 1.16.x. Additionally, bump up nginx_ingress_controller_tag version to 0.26.1. This is to ensure that we are running an up to date nginx ingress controller with fixes for known CVEs. Story: 2006853 Task: 37444 Change-Id: Ibf045a06d19b02095e19d9a21d14a91a39a3751c	2019-11-24 11:24:33 +00:00
Spyros Trigazis	aa6b3bbeba	k8s_fedora: Add use_podman label Choose whether system containers etcd, kubernetes and the heat-agent will be installed with podman or atomic. This label is relevant for k8s_fedora drivers. k8s_fedora_atomic_v1 defaults to use_podman=false, meaning atomic will be used pulling containers from docker.io/openstackmagnum. use_podman=true is accepted as well, which will pull containers by k8s.gcr.io. k8s_fedora_coreos_v1 defaults and accepts only use_podman=true. Fix upgrade for k8s_fedora_coreos_v1 and magnum-cordon systemd unit. Task: 37242 Story: 2005201 Change-Id: I0d5e4e059cd4f0458746df7c09d2fd47c389c6a0 Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>	2019-10-23 10:43:52 +00:00
Fei Long Wang	09f85f3746	[fedora-atomic][k8s] Support operating system upgrade Along with the kubernetes version upgrade support we just released, we're adding the support to upgrade the operating system of the k8s cluster (including master and worker nodes). It's an inplace upgrade leveraging the atomic/ostree upgrade capability. Story: 2002210 Task: 33607 Change-Id: If6b9c054bbf5395c30e2803314e5695a531c22bc	2019-10-18 14:44:27 +00:00
Theodoros Tsioutsias	113fdc44b2	ng-12: Label nodegroup nodes With this change each node will be labeled with the following: * --node-labels=magnum.openstack.org/role=${NODEGROUP_ROLE} * --node-labels=magnum.openstack.org/nodegroup=${NODEGROUP_NAME} Change-Id: Ic410a059b19a1252cdf6eed786964c5c7b03d01c	2019-10-16 11:53:44 +00:00
Stanislav Dmitriev	cd054f20ac	Change the order of resource creation Resource creation order in kubernetes templates for Fedora Atomic was changed to avoid neutron bug https://bugs.launchpad.net/neutron/+bug/1845360 Floating IP should be assigned to network port after instance creation Change-Id: Ib7e0503d475d7cd3164a116c3a0325c4ae417a0a Story: 2006631 Task: 36844	2019-10-01 18:29:05 +00:00
Zuul	60d2485d83	Merge "[fedora atomic k8s] Add boot from volume support"	2019-09-20 11:21:33 +00:00
Zuul	83569e8394	Merge "calico: drop calico_cni_tag"	2019-09-20 11:08:53 +00:00
Mohammed Naser	cfe2753fd3	[fedora atomic k8s] Add boot from volume support Support boot from volume for all Kubernetes nodes (master and worker) so that user can create a big size root volume, which could be more flexible than using docker_volume_size. And user can specify the volume type so that user can leverage high performance storage, e.g. NVMe etc. And a new label etcd_volume_type is added as well so that user can set volume type for etcd volume. If the boot_volume_type or etcd_volume_type are not passed by labels, Magnum will try to read them from config option default_boot_volume_type and default_etcd_volume_type. A random volume type from Cinder will be used if those options are not set. Task: 30374 Story: 2005386 Co-Authorized-By: Feilong Wang<flwang@catalyst.net.nz> Change-Id: I39dd456bfa285bf06dd948d11c86867fc03d5afb	2019-09-20 05:00:29 +00:00
Bharat Kunwar	e84cc4c975	Convert network UUID to name required for OCCM Sometimes, the fixed_network value gets rendered as UUID. However OCCM's internal-network-name requires the network name, it does not support UUID. This patch introduces a new parameter called fixed_network_name which converts fixed_network UUID to name if it is UUID-like. Story: 2005333 Task: 36313 Change-Id: I3453bc0dbea285687d39c9782685cb1f2a3ecd39	2019-08-25 22:16:42 +00:00
Zuul	04fd0470ad	Merge "k8s: stop introspecting instance name"	2019-08-08 19:50:58 +00:00
Mohammed Naser	2f2d05c826	k8s: stop introspecting instance name We kept introspecting the name of the instance with the assumption that the network always existed under .novalocal This is not always the case, with certain variables changed inside Neutron it is possible to control this, therefore leading to failing deploys. With this change, we pass the instance name directly to the cluster and therefore we always have the accurate name. Task: 36160 Story: 2006371 Change-Id: I2ba32844b822ffc14da043e6ef7d071bb62a22ee	2019-08-07 21:24:06 +00:00
Zuul	f1cf3d0b38	Merge "Support auto_healing_controller"	2019-08-06 08:40:25 +00:00
Bharat Kunwar	425fb0fa32	Add network config to stabilise multi-NIC scenario When there is more than one NIC attached to an instance, openstack cloud provider returns a random InternalIP back to the host resulting in instability with API server which only talks to a default interface. This patch incorporates the changes made in https://github.com/kubernetes/cloud-provider-openstack/pull/444 which enables OpenStack Cloud Controller Manager (OCCM) to respect the `internal-network-name` in cloud-config file which ensures that InternalIP remains stable. Uses a separate cloud-config file for OCCM to ensure in-tree Cinder volumes remain compatible. Change-Id: Idfa52ed2d512e7dc383a556371e896205dd542f9 Story: 2005333 Task: 30271	2019-07-29 09:07:26 +00:00
Lingxian Kong	52155f0e76	Support auto_healing_controller This patch allows the user to choose the auto-healing service by introducing a new label 'auto_healing_controller', currently, 'draino' and 'magnum-auto-healer'[1] are supported. 'draino' is the default value for backward compatibility. [1]: https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/using-magnum-auto-healer.md Change-Id: I7ff14837a8d7d360b72c8f40733e84c88c4269d4	2019-07-24 17:52:33 +12:00
Zuul	1963fce81a	Merge "Add npd_enabled label"	2019-07-10 00:35:48 +00:00
Diogo Guerra	41b83cef43	[k8s] Update prometheus monitoring helm based configuration * prometheus-operator chart version upgraded from 0.1.31 to 5.12.3 * Fix an issue where when using Feature Gate Priority the scheduler would evict the prometheus monitoring node-exporter pods * Fix an issue where intensive CPU utilization would make the metrics fail intermittently or completely fail * Prometheus resources are now calculated based on the MAX_NODE_COUNT requested * Change the sampling rate from the standard 30s to 1 minute (Rollback) * Add the missing tiller CONTAINER_INFRA_PREFIX variable to the ConfigMap * Add label prometheus_operator_chart_tag to enable the user to specify the stable/prometheus-operator chart to use * Fix breaking changes on CoreDNS metrics introduced by `8fb27da2fc` * Fix Grafana dashboard not showing data. Change-Id: If42873cd6668c07e4e911e4eef5e4ae2232be66f Task: 30777 Task: 30779 Story: 2005588 Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>	2019-06-25 10:07:55 +00:00
Diogo Guerra	10a5996e32	Add npd_enabled label Change-Id: Id3c5fdda6424d1a51f2e60ae26ca3069d93e00ee Story: 2004782 Task: 34192 Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>	2019-06-20 19:01:42 +02:00
Mohammed Naser	cd26be16c6	calico: drop calico_cni_tag This variable was not being used anywhere so it was an extra parameter that served no purpose. Change-Id: I7ae84ab6683530d95a8bca51487558b381f9cef2	2019-06-18 16:36:22 -04:00
Feilong Wang	05c27f2d73	[k8s][fedora atomic] Rolling upgrade support Rolling upgrade is an important feature for a managed k8s service, at this stage, two use cases will be covered: 1. Upgrade base operating system 2. Upgrade k8s version Known limitation: When doing operating system upgrade, there is no chance to call kubectl drain to evict pods on that node. Task: 30185 Story: 2002210 Change-Id: Ibbed59bc135969174a20e5243ff8464908801a23	2019-06-07 14:48:08 +12:00
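
Commit e84cc4c975 above describes converting fixed_network to a name "if it is UUID-like". The following is a minimal sketch of what such a check could look like, not the code from that change: the function names is_uuid_like and resolve_fixed_network_name and the lookup_name callback are hypothetical, and the real patch may detect UUIDs and resolve names differently.

```python
import uuid


def is_uuid_like(value):
    """Return True if value is a canonically formatted UUID string."""
    try:
        # uuid.UUID also accepts braces, URN prefixes and bare hex, so
        # compare against the canonical form to reject those variants.
        return str(uuid.UUID(value)) == value.lower()
    except (TypeError, ValueError, AttributeError):
        return False


def resolve_fixed_network_name(fixed_network, lookup_name):
    """Return a network name suitable for OCCM's internal-network-name.

    lookup_name is a hypothetical callback that maps a network UUID to
    its name (for example, by querying the networking service).
    """
    if is_uuid_like(fixed_network):
        return lookup_name(fixed_network)
    # Already a name: pass it through unchanged.
    return fixed_network
```

With a stub lookup, a value such as "private" is returned unchanged, while a UUID string is passed to the callback and replaced by whatever name it returns.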

140 Commits