This is one step toward removing usage of the six library, which is no
longer needed now that Python 2 support has been removed.
Change-Id: If6fb372f72a469e55e956e127c49863b5a557552
This is required for Tempest CI to fetch master/node addresses in order
to collect logs from them on cluster creation failure.
Change-Id: I24ac7ff632a8758bfefa5b66341a19eb9712dac6
The label validator function has been kept even though it does not
check anything right now; it might be useful in the future.
Change-Id: I74c744dc957d73aef7556aff00837611dadbada7
Heat stack SoftwareConfig is unable to provide a reliable upgrade
experience, so it is being disabled. More details in code comments.
A Cluster API driver provides a way forward for Magnum to support
these again, and implement upgrade_cluster.
Change-Id: Ibea354ebfe36e8d689a95c30820709ec2b633964
This fix has two parts:
* introduce a timeout (60s) for requests calls
* remove `file` scheme support for requests calls.
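
A minimal sketch of the two parts, assuming the calls go through the
`requests` library. The helper names `check_url_scheme` and `fetch_url`
and the allowed-scheme set are hypothetical and not taken from the
Magnum code:

```python
from urllib.parse import urlparse

# Assumption: only these schemes are acceptable; `file` is deliberately absent.
ALLOWED_SCHEMES = frozenset({"http", "https"})
REQUEST_TIMEOUT = 60  # seconds, as stated in the commit message


def check_url_scheme(url):
    """Return url unchanged, or raise ValueError for a disallowed scheme."""
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        raise ValueError("unsupported URL scheme: %r" % scheme)
    return url


def fetch_url(url, timeout=REQUEST_TIMEOUT):
    """Hypothetical wrapper: validate the scheme, then GET with a timeout."""
    import requests  # imported lazily so the scheme check has no dependency

    response = requests.get(check_url_scheme(url), timeout=timeout)
    response.raise_for_status()
    return response.text
```

Rejecting the scheme before the request is made keeps local files out
of reach, and the explicit timeout keeps a stalled server from blocking
the caller indefinitely.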
Change-Id: Ide2c2915ba5d6ff03933160b74f7206492276968
- Also bump components to upstream manifest versions.
- Add a small tool to sync Cinder CSI manifests automatically.
Change-Id: Icd19b41d03b7aa200965a3357a8ddf8b4b40794a
The trust token can be deleted outside of Magnum. When the trust token
is not found, the periodic update status job stays in an in-progress
state unless another cluster action is triggered.
This patch proposes using the admin context in the periodic update
status job when the trust cannot be found.
Story: 2010232
Task: 46031
Change-Id: I9cc9a0e654fb26ebec517e3413a592ac6613777c
* Add monitoring_ingress_enabled magnum label to set up ingress with
path based routing for all the configured services
{alertmanager,grafana,prometheus}. When using this, the
cluster_root_domain_name magnum label must be used to set up the base
path where these services are available.
* Add cluster_basic_auth_secret magnum label to configure basic auth
on unprotected services {alertmanager and prometheus}. This is only
in effect when app access is routed by ingress.
* Set services logFormat to json to enable easier machine log parsing.
task: 39477
story: 2006765
Depends-On: Ieb90605182626869528349a7fdeed65061914bcb
Change-Id: Ie0e7000e0d94b2037f2c398fa67a2a2b7e256bc3
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
* Add metrics_retention_days magnum label allowing user to specify
prometheus server scraped metrics retention days (default: 14)
* Add metrics_retention_size magnum label allowing user to specify
prometheus server metrics storage maximum size in GiB (default: 14)
* Add metrics_scrape_interval allowing user to specify prometheus
scrape frequency in seconds (default: 30)
* Add metrics_storage_class_name allowing user to specify the
storageClass to use as external retention for pod fail-over data
persistency
task: 39509
story: 2006765
Change-Id: I42117837e8e3cd03f3cb723df4d73692ead0d169
Signed-off-by: Diogo Guerra <diogo.filipe.tomas.guerra@cern.ch>
Magnum cluster deletion is not behaving as expected. While it appears to
delete successfully, the _delete_complete routine in
magnum/drivers/heat/driver.py is never called because the status of the
nodegroups has not had the chance to sync with Heat before
_check_delete_complete is called. As a result, for example, trustee user
accounts are left orphaned. This PS changes the order of activities so
that _delete_complete is invoked successfully.
Story: 2007965
Task: 40459
Change-Id: Ibadd5b57fe175bb0b100266e2dbcc2e1ea4efcf9
The k8s cluster owner can now rotate the CA cert to regenerate the CA
of the cluster; the service account keys and the certs of all nodes are
regenerated as well. Cluster users need to get a new kubeconfig
to access the Kubernetes API. This function is only supported by the
Fedora CoreOS driver.
To test this patch with python-magnumclient, you need the patch at
https://review.opendev.org/#/c/724243/; otherwise you will see
an error about "not enough values to unpack", even though the CA cert
rotate request has been processed correctly on the Magnum server side.
Task: 39580
Story: 2005201
Change-Id: I4ae12f928e4f49b99732fba097371692cb35d9ee
Add the master_lb_enabled option when creating a cluster, which
benefits both the cloud provider and the end user. Cloud providers
no longer have to maintain separate cluster templates with and
without master_lb_enabled. End users can use a single template to
create different clusters with different configs.
Task: 39680
Story: 2007634
Change-Id: I0b586f05168ece84fd340ef7493a56688191053d
Sometimes version discovery fails when Magnum cannot talk to
Keystone via the specified trustee_keystone_interface intended for
cluster instances, either because it is unreachable from the
controller or because CA certs are missing for a TLS enabled interface.
The returned auth_url or magnum_url may then not be suffixed with /v3 or
/v1 respectively, in which case append the suffix to the URL so that
instances can still talk to Keystone/Magnum.
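
The suffix handling can be sketched as follows. This is an illustration
only; the function name `ensure_version_suffix` is hypothetical and the
actual Magnum code may differ:

```python
def ensure_version_suffix(url, version):
    """Return url ending in /<version>, with any trailing slash removed."""
    trimmed = url.rstrip("/")
    if trimmed.endswith("/" + version):
        # Already versioned, e.g. discovery succeeded.
        return trimmed
    return trimmed + "/" + version


# Example with made-up endpoints: auth_url gets /v3, magnum_url gets /v1.
auth_url = ensure_version_suffix("http://keystone.example:5000", "v3")
magnum_url = ensure_version_suffix("http://magnum.example:9511/v1", "v1")
```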
Story: 2007868
Task: 40235
Change-Id: Iae831dc549a855269b4639c31676e75d2a9433d6
When the cloud is behind a proxy, podman needs to access the dockerhub
via the proxy to pull the image, so the proxy settings need to be
exported to the etcd systemd file as well. We're already setting the
heat-params as environment file for k8s components.
In addition, because the CIDR of the fixed subnet varies between
clusters, the subnet CIDR should be added to the NO_PROXY list.
Otherwise, communication between etcd members and between k8s
components is affected.
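
The NO_PROXY handling can be illustrated in Python, although the change
itself lives in the cluster templates and scripts. The function name
`extend_no_proxy` and the sample values are hypothetical:

```python
def extend_no_proxy(no_proxy, subnet_cidr):
    """Append subnet_cidr to a comma-separated NO_PROXY value if missing."""
    entries = [entry.strip() for entry in no_proxy.split(",") if entry.strip()]
    if subnet_cidr not in entries:
        entries.append(subnet_cidr)
    return ",".join(entries)


# Each cluster passes the CIDR of its own fixed subnet.
no_proxy = extend_no_proxy("localhost,127.0.0.1", "10.0.0.0/24")
```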
Task: 39990
Story: 2007768
Change-Id: I4dba79e04abe38b9806e847348d3dd77ef96bee5
Lower the log level of a warning for a missing output to debug.
This log line appears repeatedly on successful cluster deletion,
creation failure (for unrelated reasons) and nodegroup creation
failure (again for unrelated reasons, e.g. timeout). This is
triggered when having multiple magnum conductors all trying to
query the status in heat. Additionally, this warning is not an
indication of a malfunction in a cluster or a failure, so it is
useful only for debugging. Finally, add the cluster id, cluster
status and stack id to have more context.
story: 2007636
task: 40062
Change-Id: Ie44b1d13899d77bd2a5d5b1e6107c384277788b9
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
Currently the label `fixed_network_cidr` is not handled correctly:
whether or not the label is set, the default value '10.0.0.0/24' is
used for the fixed network. This patch fixes that and renames the label
to `fixed_subnet_cidr` to reduce confusion. The new behaviour is:
1. If the label `fixed_subnet_cidr` is set but no fixed subnet is passed
in, then a new subnet will be created with the given CIDR.
2. If a fixed subnet is passed in by the user, then the label
`fixed_subnet_cidr` will be overridden with the CIDR of the given subnet.
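
The two rules can be sketched as below. The function
`resolve_fixed_subnet_cidr` and the `lookup_cidr` callable (standing in
for a query that returns the CIDR of an existing subnet) are
hypothetical, and creating the new subnet is outside the sketch:

```python
def resolve_fixed_subnet_cidr(labels, fixed_subnet=None, lookup_cidr=None):
    """Return a copy of labels with `fixed_subnet_cidr` resolved.

    Rule 1: no fixed subnet given, so the label (if set) is kept and
            is later used as the CIDR of the newly created subnet.
    Rule 2: a fixed subnet is given, so its CIDR overrides the label.
    """
    resolved = dict(labels)
    if fixed_subnet is not None:
        resolved["fixed_subnet_cidr"] = lookup_cidr(fixed_subnet)
    return resolved
```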
Task: 39847
Story: 2007712
Change-Id: Id05e36696bf85297a556fcd959ed897fe47b7354
When resizing a NG we should strictly send the
desired node_count and the nodes_to_remove.
Otherwise the stack update operation may replace/rebuild
nodes or other resources.
This was the functionality with:
Id84e5d878b21c908021e631514c2c58b3fe8b8b0
But it was reverted with:
I725413e77f5a7bdb48131e8a10e5dc884b5e066a
Story: 2005266
task: 39860
Change-Id: Ib31b6801e0e2d954c31ac91e77ae9d3ef1afebd2
Signed-off-by: Spyros Trigazis <strigazi@gmail.com>
- Refactor helm installer to use a single meta chart install job
and config which use the Helm v3 client.
- Use upstream helm client binary instead of using helm-client container
maintained by us. To verify checksum, helm_client_sha256 label is
introduced for helm_client_tag (or alternatively for URL specified
using new helm_client_url label).
- Default helm_client_tag=v3.2.1.
- Default tiller_tag=v2.16.7, tiller_enabled=false.
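
The checksum verification that helm_client_sha256 enables can be
illustrated in Python. This is a sketch of the idea only; the function
name `sha256_matches` is hypothetical and the real check runs inside the
cluster node scripts:

```python
import hashlib


def sha256_matches(data, expected_sha256):
    """Return True if the SHA-256 hex digest of data equals the expected one.

    data is the downloaded client archive as bytes; expected_sha256 is the
    hex string supplied via the helm_client_sha256 label.
    """
    actual = hashlib.sha256(data).hexdigest()
    return actual == expected_sha256.strip().lower()
```

A download whose digest does not match should be discarded rather than
installed.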
Story: 2007514
Task: 39295
Change-Id: I9b9633c81afb08b91576a9a4d3c5a0c445e0cee4
A new label named `master_lb_allowed_cidrs` is added to control
the IP range which can access the k8s api and etcd load balancers.
It's a good security enhancement.
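
A label value of this kind can be validated with the standard
`ipaddress` module. The sketch assumes the label holds a comma-separated
list of CIDRs; the function name `parse_allowed_cidrs` is hypothetical:

```python
import ipaddress


def parse_allowed_cidrs(label_value):
    """Split a comma-separated CIDR list and validate every entry.

    Raises ValueError for anything that is not a valid network,
    including addresses with host bits set such as 10.0.0.1/24.
    """
    cidrs = []
    for item in label_value.split(","):
        item = item.strip()
        if not item:
            continue
        cidrs.append(str(ipaddress.ip_network(item, strict=True)))
    return cidrs
```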
Task: 39188
Story: 2007414
Change-Id: I157a3b01d169e550e79b94316803fde8ddf77b03
Most of the time, issues with cluster update/upgrade/resize can be
identified just by looking at the parameters sent to Heat. This patch
changes the existing log messages for cluster update and resize from
debug to info, and adds a log message for cluster upgrade.
story: #2007636
task: #39689
Change-Id: Ibac5e105885b6e7042e88dea31cfeafe42a401ab
Heapster has been deprecated for a while and the new k8s dashboard
2.0.0 version supports metrics-server now. So it's time to upgrade
the default k8s dashboard to v2.0.0.
Task: 39101
Story: 2007256
Change-Id: I02f8cb77b472142f42ecc59a339555e60f5f38d0
This is a corner case where floating_ip_enabled=False,
master_lb_enabled=True and master_lb_floating_ip_enabled=False are set
in the cluster template, but floating_ip_enabled=True is set when
creating the cluster. The current logic is incorrect and
results in a missing IP address in the api_address of the cluster.
Task: 39519
Story: 2007586
Change-Id: I5e2ca270c4f4e2c48d067cd5b8f6609c037cb6e5
The following changes were introduced in the Train release:
- Allow setting network, subnet and FIP when creating cluster
(I11579ff6b83d133c71c2cbf49ee4b20996dfb918)
- ng-7: Adapt parameter and output mappings
(I45cf765977c7f5a92f28ae12c469b98435763163)
The first change allowed setting cluster.floating_ip_enabled, but the
second change introduced ServerAddressOutputMapping conditional on
cluster_template.floating_ip_enabled. This leads to an edge case: if
floating_ip_enabled is overridden to False when a cluster is created
while it is True in the cluster_template scope, we see this error in the
conductor logs:
ValueError: Field `node_addresses[0]' cannot be None
The cluster then remains stuck in CREATE_IN_PROGRESS status forever,
despite Heat reaching CREATE_COMPLETE. This commit addresses the issue by
correctly referring to cluster.floating_ip_enabled.
Change-Id: Ic83f625178786d4750a66dd6dd6db35c05bc0272
Story: 2007550
Task: 39401
The repo is Python 3 now, so update hacking to version 3.0 which
supports Python 3.
Fix problems found.
Update local hacking checks for new flake8.
Remove hacking and friends from lower-constraints, those are not needed
for co-installing.
Change-Id: I926efaef501f190e78da9cab40c1e94203277258
With I13aa0c58bf168bc069edf1d5c0187f89011fffdb, we forgot to update
the default value of pods_network_cidr. As a result, there is a
mismatch between the calico_ipv4pool and the CIDR configured in
Kubernetes (kube-proxy and kube-controller-manager). The mismatch
causes connection issues between pods/nodes. This patch
fixes it.
Task: 39153
Story: 2007426
Change-Id: Ic560322f5009f28e7e72704508705c1572a9262d
At present, when a fixed_network is not specified, it is given the name
"private" by default. When multiple clusters are created, we end up
with multiple networks that all have the same name.
This PS makes it easier to see which cluster the resources belong to
by using the cluster name.
Story: 2007460
Task: 39139
Change-Id: I7f8028b716f9a9eced17d85ca2e46e2b1e34875f
At present, the status reason resolves to:
default-master <reason> ,default-worker <reason>
It should be:
default-master <reason>, default-worker <reason>
This minor patch fixes this.
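
The corrected separator can be illustrated as follows. How Magnum
assembles the per-nodegroup reasons is not shown in this message, so the
function name `join_status_reasons` and its input shape are hypothetical:

```python
def join_status_reasons(reasons):
    """Join (nodegroup name, reason) pairs with ", " between entries."""
    return ", ".join("%s %s" % (name, reason) for name, reason in reasons)


status_reason = join_status_reasons(
    [("default-master", "<reason>"), ("default-worker", "<reason>")]
)
```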
Task: 39092
Story: 2007438
Change-Id: I3382da8d950279713861e14d97997d5a5205b1e7
A new config option `post_install_manifest_url` is added to support
installing a cloud provider/vendor specific manifest after the k8s
cluster has booted. It is a URL pointing to the manifest file. For
example, a cloud admin can put their specific storageclass into
this file, and it will be set up automatically after the cluster
is created.
Task: 35798
Story: 2006209
Change-Id: Ib5a2c5cd7970085db941f189613e175f622aea3f
Add support for out of tree Cinder CSI. This is installed when the
cinder_csi_enabled=true label is added. This will allow us to eventually
deprecate in-tree Cinder.
story: 2007048
task: 37868
Change-Id: I8305b9f8c9c37518ec39198693adb6f18542bf2e
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>