218 Commits

Author SHA1 Message Date
Jakub Darmach e3aaa89376 Removing Tiller support
Removed Tiller support from Helm modules install kubernetes fragment.

Change-Id: I81db0055ae82c64218498ae3e2a4fcc802f8d0e4
2024-02-29 14:41:29 +00:00
Takashi Kajinami 91f181e3ad Remove six from drivers module
This is part of the steps to remove usage of six library, which is no
longer needed since python 2 support was removed.

Change-Id: If6fb372f72a469e55e956e127c49863b5a557552
2024-02-19 10:43:24 +00:00
Michal Nasiadka 68c8acba39 Remove execution bit on unnecessary files
Change-Id: Ia41b843fdf20154750b129a8ab5dd42f5c3989fb
2024-02-19 00:30:21 +00:00
Zuul 39af658193 Merge "heat: Update addresses on CREATE_FAILED" 2024-02-02 10:31:53 +00:00
Michal Nasiadka 339a771587 heat: Update addresses on CREATE_FAILED
This is required for Tempest CI to fetch master/node addresses in order
to collect logs from them on cluster creation failure.

Change-Id: I24ac7ff632a8758bfefa5b66341a19eb9712dac6
2024-01-31 11:07:10 +00:00
Michal Nasiadka bc79012f46 Drop Swarm support
The label validator function has been kept, although it does not check
anything right now; it might be useful in the future.

Change-Id: I74c744dc957d73aef7556aff00837611dadbada7
2024-01-24 13:20:21 +13:00
Dale Smith 2fd3059f38 Remove support for in-place upgrades with the Heat driver.
Heat stack SoftwareConfig is unable to provide a reliable upgrade
experience, so is being disabled. More details in code comments.

A Cluster API driver provides a way forward for Magnum to support
these again, and implement upgrade_cluster.

Change-Id: Ibea354ebfe36e8d689a95c30820709ec2b633964
2023-12-20 21:54:44 +13:00
ricolin 6169eb26ed Fix pep8 gate
This fix has two parts:
* introduce a timeout (60s) for requests calls
* remove `file` scheme support for requests calls.

Change-Id: Ide2c2915ba5d6ff03933160b74f7206492276968
2023-03-14 09:17:54 +08:00
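The two parts of the commit above can be illustrated with a minimal Python sketch. This is not the code from the change itself: the names `check_url_scheme`, `fetch`, `ALLOWED_SCHEMES` and `REQUEST_TIMEOUT` are hypothetical, and the assumption is that only `http` and `https` remain acceptable once `file` is dropped.

```python
from urllib.parse import urlparse

# Assumption: only these schemes stay acceptable once `file` is rejected.
ALLOWED_SCHEMES = {"http", "https"}
# The 60 second value is the one quoted in the commit message.
REQUEST_TIMEOUT = 60


def check_url_scheme(url):
    """Return the URL unchanged, or raise ValueError for a disallowed scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError("unsupported URL scheme: %r" % scheme)
    return url


def fetch(url):
    """Fetch a URL with an explicit timeout (needs the requests package)."""
    import requests  # imported lazily so the validator works without it

    response = requests.get(check_url_scheme(url), timeout=REQUEST_TIMEOUT)
    response.raise_for_status()
    return response.text
```

Without an explicit `timeout`, a `requests` call can wait indefinitely on an unresponsive server, which is why the timeout is passed on every call in this sketch.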
Michal Nasiadka ac5702c406 Adapt Cinder CSI to upstream manifest
- Also bump components to upstream manifest versions.
- Add a small tool to sync Cinder CSI manifests automatically.

Change-Id: Icd19b41d03b7aa200965a3357a8ddf8b4b40794a
2022-09-26 13:28:37 +00:00
ricolin 1ed78a4438 Allow update cluster status with admin context
A trust token can be deleted outside of Magnum. When the trust token is
not found, the periodic status update job stays in the in-progress
state unless another cluster action is triggered.

Use the admin context when the trust cannot be found in the periodic
status update job.

Story: 2010232
Task: 46031

Change-Id: I9cc9a0e654fb26ebec517e3413a592ac6613777c
2022-08-18 05:29:32 +08:00
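The fallback described in the commit above can be sketched as follows. The exception `TrustNotFound` and the two loader callables are hypothetical placeholders for illustration, not Magnum's real API.

```python
class TrustNotFound(Exception):
    """Hypothetical stand-in for the error raised when a trust is gone."""


def pick_context(load_trust_context, load_admin_context):
    """Prefer the trust-scoped context, fall back to the admin context.

    Both arguments are zero-argument callables supplied by the caller;
    they are assumptions made for this sketch.
    """
    try:
        return load_trust_context()
    except TrustNotFound:
        # The trust was deleted outside the service, so use the admin
        # context to let the periodic status update finish.
        return load_admin_context()
```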
Michal Nasiadka 5af49aa2fa Add support for choosing Octavia provider
Story: 2008805

Add support for:
* choosing Octavia provider
* setting octavia_lb_algorithm
* disabling Octavia healthcheck

Change-Id: I2d424fc2e2f8967e4b3007faedbc089f37126968
2022-05-03 09:32:24 +00:00
Zuul c07628bca6 Merge "Support hyperkube_prefix label" 2021-04-07 19:09:49 +00:00
Bharat Kunwar 00f8aa5d67 Fix debug logging during cluster upgrade
Incorrectly formatted logging causes error. This PS fixes it.

Story: 2008628
Task: 41833

Change-Id: Iac87a4a56187694d5f5b3454de380de6b6db48fa
2021-03-17 17:17:50 +00:00
Bharat Kunwar fc1f27a569 Support hyperkube_prefix label
Additionally for k8s_fedora_coreos_v1 driver:
* Introduce hyperkube_prefix which defaults to k8s.gcr.io/
* Bump default kube_tag to v1.18.16

Story: 1668998
Task: 41791

Change-Id: I38b8df45a00f1a2a1604059b8329d1dd762e05cd
2021-02-18 13:18:56 +00:00
guilhermesteinmuller 439548e3de Fix ostree_* upgrade
Currently, the code assumes that both
ostree_commit and ostree_remote are present
in cluster_template.labels. If one of them is
missing, the ostree upgrade fails [1] and leaves
the cluster with UPDATE_FAILED status.

According to the docs [2], users are allowed to set
just one of the two labels.

[1] https://gist.github.com/guilhermesteinmuller/7bf9f51e421283783cf737900797232c
[2] https://github.com/openstack/magnum/blob/master/doc/source/user/rolling-upgrade.rst

Change-Id: I0f65169305ba74c082b65bf39083def278404b93
2021-02-15 06:50:45 +00:00
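One way to tolerate a missing label, as the commit above requires, is to collect only the ostree labels that are actually set. The helper name `ostree_upgrade_params` is hypothetical and a plain dict stands in for `cluster_template.labels`; this is a sketch, not the actual fix.

```python
def ostree_upgrade_params(labels):
    """Return only the ostree labels that are present and non-empty."""
    params = {}
    for key in ("ostree_remote", "ostree_commit"):
        value = labels.get(key)  # .get() avoids a KeyError on a missing label
        if value:
            params[key] = value
    return params
```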
Diogo Guerra ea64468ab3 3. Configure monitoring apps path based endpoints
* Add monitoring_ingress_enabled magnum label to set up ingress with
path based routing for all the configured services
{alertmanager,grafana,prometheus}. When using this, the
cluster_root_domain_name magnum label must be used to set up the base
path where these services are available.
* Add cluster_basic_auth_secret magnum label to configure basic auth
on unprotected services {alertmanager and prometheus}. This is only
in effect when app access is routed by ingress.
* Set services logFormat to json to enable easier machine log parsing.

task: 39477
story: 2006765

Depends-On: Ieb90605182626869528349a7fdeed65061914bcb
Change-Id: Ie0e7000e0d94b2037f2c398fa67a2a2b7e256bc3
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
2021-02-05 15:52:52 +00:00
Diogo Guerra 37497ccf5b 1. Configurable prometheus monitoring persistent storage
* Add metrics_retention_days magnum label allowing user to specify
prometheus server scraped metrics retention days (default: 14)
* Add metrics_retention_size magnum label allowing user to specify
prometheus server metrics storage maximum size in GiB (default: 14)
* Add metrics_scrape_interval allowing user to specify prometheus
scrape frequency in seconds (default: 30)
* Add metrics_storage_class_name allowing user to specify the
storageClass to use as external retention for pod fail-over data
persistency

task: 39509
story: 2006765

Change-Id: I42117837e8e3cd03f3cb723df4d73692ead0d169
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
2021-02-05 15:52:33 +00:00
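The numeric defaults listed in the commit above can be applied to a label dict as in this small sketch. The helper `metrics_settings` is hypothetical, and treating label values as strings that convert to integers is an assumption.

```python
# Defaults as quoted in the commit message (days, GiB, seconds).
METRICS_DEFAULTS = {
    "metrics_retention_days": 14,
    "metrics_retention_size": 14,
    "metrics_scrape_interval": 30,
}


def metrics_settings(labels):
    """Merge user-supplied labels over the defaults, as integers."""
    return {
        key: int(labels.get(key, default))
        for key, default in METRICS_DEFAULTS.items()
    }
```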
Bharat Kunwar 1cdc0628a2 [fix] Sync nodegroup status before delete_complete
Magnum cluster deletion is not behaving as expected. While it appears to
delete successfully, the _delete_complete routine in
magnum/drivers/heat/driver.py is never called because the status of the
nodegroups has not had the chance to sync with Heat before
_check_delete_complete is called. As a result, for example, trustee user
accounts are left orphaned. This PS changes the order of activities so
that _delete_complete is invoked successfully.

Story: 2007965
Task: 40459

Change-Id: Ibadd5b57fe175bb0b100266e2dbcc2e1ea4efcf9
2020-10-12 20:39:09 +00:00
Feilong Wang 8020391e4a [k8s] Support CA certs rotate
A k8s cluster owner can now rotate the CA cert to regenerate the CA of
the cluster. The service account keys and the certs of all nodes are
regenerated as well. Cluster users need to get a new kubeconfig
to access the Kubernetes API. This function is only supported by the
Fedora CoreOS driver.

To test this patch with python-magnumclient, you need this patch
https://review.opendev.org/#/c/724243/, otherwise, you will see
an error about "not enough values to unpack", though the CA cert
rotate request has been processed by Magnum server side correctly.

Task: 39580
Story: 2005201

Change-Id: I4ae12f928e4f49b99732fba097371692cb35d9ee
2020-08-24 16:31:58 +12:00
Zuul e2135ac11f Merge "[fix] Append v3/v1 to auth_url/magnum_url if discovery fails" 2020-07-30 14:31:08 +00:00
Feilong Wang 946c1d67c7 Add master_lb_enabled to cluster
Add the master_lb_enabled option when creating a cluster, which
benefits both the cloud provider and the end user. Cloud providers
no longer have to maintain separate cluster templates with and
without master_lb_enabled. End users can use a single template to
create different clusters with different configs.

Task: 39680
Story: 2007634

Change-Id: I0b586f05168ece84fd340ef7493a56688191053d
2020-07-21 11:07:33 +12:00
Bharat Kunwar 5a688b1869 [fix] Append v3/v1 to auth_url/magnum_url if discovery fails
Sometimes version discovery fails when Magnum cannot talk to
Keystone via the specified trustee_keystone_interface intended for
cluster instances, either because it is unreachable from the
controller or because CA certs are missing for a TLS enabled
interface. The returned auth_url or magnum_url may then not be
suffixed with /v3 or /v1 respectively. In that case, append the
suffix to the URL so that instances can still talk to Keystone/Magnum.

Story: 2007868
Task: 40235

Change-Id: Iae831dc549a855269b4639c31676e75d2a9433d6
2020-06-30 13:36:24 +00:00
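The suffix handling described in the commit above can be sketched in a few lines. The helper `ensure_version_suffix` and the example hosts are hypothetical; stripping a trailing slash before comparing is an assumption of this sketch.

```python
def ensure_version_suffix(url, suffix):
    """Append /<suffix> to the URL unless it already ends with it."""
    trimmed = url.rstrip("/")
    if trimmed.endswith("/" + suffix):
        return trimmed
    return trimmed + "/" + suffix


# Example hosts are made up for illustration.
auth_url = ensure_version_suffix("http://keystone.example:5000", "v3")
magnum_url = ensure_version_suffix("http://magnum.example:9511/v1", "v1")
```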
Zuul 82d92d08a3 Merge "Lower log level of missing output" 2020-06-20 05:16:52 +00:00
Feilong Wang b2e3f2346b Fix proxy issue for etcd and k8s
When the cloud is behind a proxy, podman needs to access Docker Hub
via the proxy to pull the image, so the proxy settings need to be
exported to the etcd systemd file as well. We're setting the heat-params
as environment file for k8s components already.

Also, because the CIDR of the fixed subnet varies between clusters,
the subnet CIDR should be added to the NO_PROXY list. Otherwise,
it will affect the communication between etcd members and also the
communication between k8s components.

Task: 39990
Story: 2007768

Change-Id: I4dba79e04abe38b9806e847348d3dd77ef96bee5
2020-06-17 09:54:11 +12:00
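Adding the subnet CIDR to a NO_PROXY value, as the commit above describes, might look like the following sketch. The helper `extend_no_proxy` is hypothetical, and a comma-separated NO_PROXY string is assumed.

```python
def extend_no_proxy(no_proxy, subnet_cidr):
    """Return a NO_PROXY string that includes the cluster subnet CIDR once."""
    entries = [e.strip() for e in (no_proxy or "").split(",") if e.strip()]
    if subnet_cidr and subnet_cidr not in entries:
        entries.append(subnet_cidr)
    return ",".join(entries)
```

Checking for an existing entry keeps the helper idempotent, so it can run on every configuration pass without growing the list.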
Spyros Trigazis 65ab249189 Lower log level of missing output
Lower the log level of a warning for a missing output to debug.
This log line appears repeatedly on successful cluster deletion,
creation failure (for unrelated reasons) and nodegroup creation
failure (again for unrelated reasons, eg timeout). This is
triggered when having multiple magnum conductors all trying to
query the status in heat. Additionally, this warning is not an
indication of a malfunction in a cluster or a failure, so it is
useful only for debugging. Finally, add the cluster id, cluster
status and stack id to have more context.

story: 2007636
task: 40062

Change-Id: Ie44b1d13899d77bd2a5d5b1e6107c384277788b9
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
2020-06-12 12:18:48 +03:00
Zuul 52690900a7 Merge "Fix label fixed_network_cidr" 2020-06-11 11:20:37 +00:00
Feilong Wang 001b9c6101 Fix label fixed_network_cidr
Currently the label `fixed_network_cidr` is not handled correctly:
whether or not the label is set, the default value '10.0.0.0/24' is
used for the fixed network. This patch fixes that and renames the
label to `fixed_subnet_cidr` to reduce confusion. The new behaviour is:
1. If the label `fixed_subnet_cidr` is set but no fixed subnet is passed
   in, then a new subnet will be created with the given CIDR.
2. If a fixed subnet is passed in by the user, then the label
   `fixed_subnet_cidr` will be overridden with the CIDR from the given
   subnet.

Task: 39847
Story: 2007712

Change-Id: Id05e36696bf85297a556fcd959ed897fe47b7354
2020-06-11 13:54:59 +12:00
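The two rules in the commit above reduce to a simple precedence order, sketched here. The helper `resolve_fixed_subnet_cidr` is hypothetical, and falling back to '10.0.0.0/24' when neither a subnet nor the label is given is an assumption based on the default quoted in the message.

```python
DEFAULT_CIDR = "10.0.0.0/24"  # default value quoted in the commit message


def resolve_fixed_subnet_cidr(labels, existing_subnet_cidr=None):
    """Pick the CIDR for the fixed subnet.

    Rule 2: an existing subnet's CIDR overrides the label.
    Rule 1: otherwise the label value is used for the new subnet.
    Assumption: with neither, the default applies.
    """
    if existing_subnet_cidr:
        return existing_subnet_cidr
    return labels.get("fixed_subnet_cidr", DEFAULT_CIDR)
```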
Spyros Trigazis 9f4c63a0df resize: Send only nodes_to_remove and node_count
When resizing a NG we should strictly send the
desired node_count and the nodes_to_remove.
Otherwise the stack update operation may replace/rebuild
nodes or other resources.

This was the functionality with:
Id84e5d878b21c908021e631514c2c58b3fe8b8b0
But it was reverted with:
I725413e77f5a7bdb48131e8a10e5dc884b5e066a

Story: 2005266
task: 39860

Change-Id: Ib31b6801e0e2d954c31ac91e77ae9d3ef1afebd2
Signed-off-by: Spyros Trigazis <strigazi@gmail.com>
2020-06-05 08:47:53 +00:00
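The idea in the commit above, sending only the two resize fields, can be sketched as a filter over a parameter dict. The key names follow the commit title; the real Heat parameter names may differ, and `resize_params` is a hypothetical helper.

```python
# Assumption: these two keys mirror the names in the commit title.
RESIZE_KEYS = ("node_count", "nodes_to_remove")


def resize_params(all_params):
    """Keep only the parameters a resize is meant to change."""
    return {key: all_params[key] for key in RESIZE_KEYS if key in all_params}
```

Dropping every other key means the stack update cannot accidentally change a property that would cause nodes to be replaced or rebuilt.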
Bharat Kunwar a79f8f52f9 [k8s] Use Helm v3 by default
- Refactor helm installer to use a single meta chart install job
  and config which use the Helm v3 client.
- Use upstream helm client binary instead of using helm-client container
  maintained by us. To verify checksum, helm_client_sha256 label is
  introduced for helm_client_tag (or alternatively for URL specified
  using new helm_client_url label).
- Default helm_client_tag=v3.2.1.
- Default tiller_tag=v2.16.7, tiller_enabled=false.

Story: 2007514
Task: 39295

Change-Id: I9b9633c81afb08b91576a9a4d3c5a0c445e0cee4
2020-05-26 15:23:14 +00:00
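Checksum verification of a downloaded client binary, as enabled by the helm_client_sha256 label in the commit above, can be sketched in Python as follows. The helper `sha256_matches` is hypothetical and the actual installer may perform this check differently, for example in a shell script.

```python
import hashlib


def sha256_matches(data, expected_hex):
    """Compare the SHA-256 digest of the bytes with an expected hex string."""
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_hex.strip().lower()
```

A caller would read the downloaded archive as bytes and refuse to install it when the function returns False.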
Zuul a2f4b28c60 Merge "[k8s] Add label 'master_lb_allowed_cidrs'" 2020-05-15 07:50:59 +00:00
Feilong Wang 3b87c5cc6f [k8s] Add label 'master_lb_allowed_cidrs'
A new label named `master_lb_allowed_cidrs` is added to control
the IP range which can access the k8s api and etcd load balancers.
It's a good security enhancement.

Task: 39188
Story: 2007414

Change-Id: I157a3b01d169e550e79b94316803fde8ddf77b03
2020-05-14 21:31:10 +12:00
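To illustrate what an allowed-CIDR list expresses, the sketch below parses a label value and checks a client address against it with the standard `ipaddress` module. The comma-separated format and both helper names are assumptions; in a real deployment the load balancer enforces the restriction, not Python code.

```python
import ipaddress


def parse_allowed_cidrs(label_value):
    """Parse a comma-separated CIDR list (format assumed) into networks."""
    return [
        ipaddress.ip_network(part.strip())
        for part in label_value.split(",")
        if part.strip()
    ]


def is_allowed(client_ip, allowed_networks):
    """Return True if the address falls inside any allowed network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in allowed_networks)
```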
Theodoros Tsioutsias f7a50223e7 More verbose logs for cluster ops
Most of the time, issues with cluster update/upgrade/resize can be
identified just by looking at the parameters sent to Heat. This patch
changes the existing log messages for cluster update and resize from
debug to info, and adds a log message for cluster upgrade.

story: #2007636
task: #39689
Change-Id: Ibac5e105885b6e7042e88dea31cfeafe42a401ab
2020-05-07 09:38:44 +00:00
Zuul 5ada350502 Merge "[k8s] Upgrade k8s dashboard version to v2.0.0" 2020-05-01 14:20:42 +00:00
Zuul e87b6e632e Merge "[k8s] Fix no IP address in api_address" 2020-04-27 07:44:13 +00:00
Feilong Wang b4965416b1 [k8s] Upgrade k8s dashboard version to v2.0.0
Heapster has been deprecated for a while and the new k8s dashboard
2.0.0 version supports metrics-server now. So it's time to upgrade
the default k8s dashboard to v2.0.0.

Task: 39101
Story: 2007256

Change-Id: I02f8cb77b472142f42ecc59a339555e60f5f38d0
2020-04-24 16:34:36 +12:00
Feilong Wang 5dfb0d94c0 [k8s] Fix no IP address in api_address
This is a corner case that occurs when floating_ip_enabled=False,
master_lb_enabled=True, master_lb_floating_ip_enabled=False are set in
the cluster template, but floating_ip_enabled=True is set when
creating the cluster. The current logic is incorrect, which
results in a missing IP address in the api_address of the cluster.

Task: 39519
Story: 2007586

Change-Id: I5e2ca270c4f4e2c48d067cd5b8f6609c037cb6e5
2020-04-22 21:58:28 +12:00
Zuul 69225341a8 Merge "Fix ServerAddressOutputMapping for private clusters" 2020-04-21 12:30:20 +00:00
Diogo Guerra 06659759f1 [k8s] Introduce helm_client_tag label.
Added label helm_client_tag to allow user to specify helm client
container version.

Task: 39294
Story: 2007514

Change-Id: I5d1cf238511951ac4a1849ca66b74dc747865391
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
2020-04-17 12:52:08 +00:00
Bharat Kunwar 0e58e267d1 Fix ServerAddressOutputMapping for private clusters
Following changes were introduced in Train release:

- Allow setting network, subnet and FIP when creating cluster
  (I11579ff6b83d133c71c2cbf49ee4b20996dfb918)
- ng-7: Adapt parameter and output mappings
  (I45cf765977c7f5a92f28ae12c469b98435763163)

The first change allowed setting cluster.floating_ip_enabled, but the
second change introduced ServerAddressOutputMapping conditional on
cluster_template.floating_ip_enabled. This leads to an edge case: if
floating_ip_enabled is overridden to False when a cluster is created
while it is True in the cluster_template scope, we see this error in the
conductor logs: ValueError: Field `node_addresses[0]' cannot be None,
and the cluster remains stuck in CREATE_IN_PROGRESS status despite
Heat reaching CREATE_COMPLETE. This commit addresses the issue by
correctly referring to cluster.floating_ip_enabled.

Change-Id: Ic83f625178786d4750a66dd6dd6db35c05bc0272
Story: 2007550
Task: 39401
2020-04-15 08:07:56 +00:00
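The precedence the commit above restores, the cluster-level setting winning over the template, can be sketched with plain dicts standing in for the cluster and cluster template objects. The helper name and the fallback to the template when the cluster value is unset are assumptions of this sketch.

```python
def effective_floating_ip_enabled(cluster, cluster_template):
    """Prefer the cluster's own flag; fall back to the template's flag.

    Both arguments are plain dicts used as stand-ins for the real objects.
    """
    value = cluster.get("floating_ip_enabled")
    if value is None:
        return bool(cluster_template.get("floating_ip_enabled"))
    return bool(value)
```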
Zuul 3b9f06726d Merge "Add selinux_mode label" 2020-04-10 00:09:32 +00:00
Andreas Jaeger ae228bb5cc Update hacking for Python3
The repo is Python 3 now, so update hacking to version 3.0 which
supports Python 3.

Fix problems found.

Update local hacking checks for new flake8.

Remove hacking and friends from lower-constraints, those are not needed
for co-installing.

Change-Id: I926efaef501f190e78da9cab40c1e94203277258
2020-03-31 20:09:46 +02:00
Bharat Kunwar fd80e1989f Add selinux_mode label
Fedora Atomic default: permissive
Fedora CoreOS default: enforcing

Story: 2007413
Task: 39033

Change-Id: Ibc1e02098155ac95bb35fcea5f21cc380bdf0d03
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>
2020-03-28 17:57:25 +00:00
Zuul 441a81910c Merge "Use cluster name for fixed_network instead of private" 2020-03-27 23:14:43 +00:00
Feilong Wang 529b036e78 Fix calico regression issue caused by default ipv4pool change
With I13aa0c58bf168bc069edf1d5c0187f89011fffdb, we missed updating
the default value of pods_network_cidr. As a result, there is a
mismatch between the calico_ipv4pool and the CIDR configured in
Kubernetes (kube-proxy and kube-controller-manager). The mismatch
causes connection issues between pods/nodes. This patch
fixes it.

Task: 39153
Story: 2007426

Change-Id: Ic560322f5009f28e7e72704508705c1572a9262d
2020-03-27 09:56:19 +13:00
Bharat Kunwar 2864fc57d4 Use cluster name for fixed_network instead of private
At present, when a fixed_network is not specified, it is given the name
"private" by default. When multiple clusters are created, we end up
with multiple networks all with the same name.
This PS intends to make it easier to see which cluster the resources
belong to by using the cluster name.

Story: 2007460
Task: 39139

Change-Id: I7f8028b716f9a9eced17d85ca2e46e2b1e34875f
2020-03-24 06:38:38 +00:00
Bharat Kunwar dfea2741f2 Fix join of status_reason
At present, the status reason resolves to:

    default-master <reason> ,default-worker <reason>

It should be:

    default-master <reason>, default-worker <reason>

This minor patch fixes this.

Task: 39092
Story: 2007438

Change-Id: I3382da8d950279713861e14d97997d5a5205b1e7
2020-03-18 10:39:51 +00:00
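The difference between the two outputs in the commit above comes down to the separator string passed to `join`. A minimal Python illustration, with the hypothetical helper `join_status_reasons`:

```python
def join_status_reasons(reasons):
    """Join per-nodegroup status reasons with a comma followed by a space."""
    return ", ".join(reasons)


reasons = ["default-master <reason>", "default-worker <reason>"]
before = " ,".join(reasons)   # separator with the space on the wrong side
after = join_status_reasons(reasons)
```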
Zuul 305a0095ff Merge "Add cinder_csi_enabled label" 2020-03-16 06:43:47 +00:00
Feilong Wang d61dd1d5b5 [k8s] Support post install manifest URL
A new config option `post_install_manifest_url` is added to support
installing a cloud provider/vendor specific manifest after the k8s
cluster has booted. It's a URL pointing to the manifest file. For
example, a cloud admin can put their specific storageclass into
this file, and it will be set up automatically after the cluster
is created.

Task: 35798
Story: 2006209

Change-Id: Ib5a2c5cd7970085db941f189613e175f622aea3f
2020-03-05 20:30:12 +13:00
Bharat Kunwar 9565984fd9 Add cinder_csi_enabled label
Add support for out of tree Cinder CSI. This is installed when the
cinder_csi_enabled=true label is added. This will allow us to eventually
deprecate in-tree Cinder.

story: 2007048
task: 37868

Change-Id: I8305b9f8c9c37518ec39198693adb6f18542bf2e
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>
2020-02-21 10:24:36 +00:00
Spyros Trigazis de21e0431a Add opt-in containerd support
New labels:
* container_runtime: containerd, or fallback to host-docker
* containerd_version: taken from https://github.com/containerd/containerd/releases
* containerd_tarball_url: e.g. https://storage.googleapis.com/cri-containerd-release/cri-containerd-1.2.4.linux-amd64.tar.gz
* containerd_tarball_sha256: sha256 of the above tarball

story: 2007317
task: 38823

Change-Id: I6c6599cdee61f508bd2a5e4c454da3125a256753
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
2020-02-20 15:47:40 +00:00
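The opt-in behaviour of the container_runtime label in the commit above can be sketched as a simple selection with a fallback. The helper `select_container_runtime` is hypothetical, and treating any value other than `containerd` as the host-docker fallback is an assumption drawn from the label description.

```python
def select_container_runtime(labels):
    """Return 'containerd' only when explicitly requested, else 'host-docker'."""
    if labels.get("container_runtime") == "containerd":
        return "containerd"
    return "host-docker"
```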