Sometimes, version discovery fails when Magnum cannot talk to
Keystone via specified trustee_keystone_interface intended for
cluster instances either because it is not unreachable from the
controller or CA certs are missing for TLS enabled interface and the
returned auth_url or magnum_url may not be suffixed with /v3 or /v1
respectively, in which case append the url with the suffix so that
instances can still talk to Keystone/Magnum.
Story: 2007868
Task: 40235
Change-Id: Iae831dc549a855269b4639c31676e75d2a9433d6
When the cloud is behind a proxy, podman needs to access the dockerhub
via proxy to pull the image, so the proxy settings need to be exported
to etcd systemd file as well. We're setting the heat-params as
environment file for k8s components already.
Besides, because CIDR of fixed subnet vary for different clusters,
so the subnet CIDR should be added into NO_PROXY list. Otherwise,
it will affect the communication between etcd members and also the
communication between k8s components.
Task: 39990
Story: 2007768
Change-Id: I4dba79e04abe38b9806e847348d3dd77ef96bee5
Lower the log level of a warning for a missing output to debug.
This log line appears repeatedly on successful cluster deletion,
creation failure (for unrelated reasons) and nodegroup creation
failure (again for unrelated reasons, eg timeout). This is
triggered when having multiple magnum conductors all trying to
query the status in heat. Additionally, this warning is not an
indication of a malfunction in a cluster or a failure, so it is
useful only for debugging. Finally, add the cluster id, cluster
status and stack id to have more context.
story: 2007636
task: 40062
Change-Id: Ie44b1d13899d77bd2a5d5b1e6107c384277788b9
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
Now the label `fixed_network_cidr` is not handled correctly, no matter
if the label is set, the default value '10.0.0.0/24' is used for
fixed network anyway. This patch fixes it and renamed it as
`fixed_subnet_cidr` to make less confusion. The new behaviour will be:
1. If the label `fixed_subnet_cidr` is set but no fixed subnet passed
in, then a new subnet will be created with the given CIDR.
2. If a fixed subnet is passed in by user, then label `fixed_subnet_cidr`
will be override with the CIDR from the given subnet.
Task: 39847
Story: 2007712
Change-Id: Id05e36696bf85297a556fcd959ed897fe47b7354
When resizing a NG we should strictly send the
desired node_count and the nodes_to_remove.
Otherwise the stack update operation may replace/rebuild
nodes or other resources.
This was the functionality with:
Id84e5d878b21c908021e631514c2c58b3fe8b8b0
But it was reverted with:
I725413e77f5a7bdb48131e8a10e5dc884b5e066a
Story: 2005266
task: 39860
Change-Id: Ib31b6801e0e2d954c31ac91e77ae9d3ef1afebd2
Signed-off-by: Spyros Trigazis <strigazi@gmail.com>
A new label named `master_lb_allowed_cidrs` is added to control
the IP range which can access the k8s api and etcd load balancers.
It's a good security enhancement.
Task: 39188
Story: 2007414
Change-Id: I157a3b01d169e550e79b94316803fde8ddf77b03
Following changes were introduced in Train release:
- Allow setting network, subnet and FIP when creating cluster
(I11579ff6b83d133c71c2cbf49ee4b20996dfb918)
- ng-7: Adapt parameter and output mappings
(I45cf765977c7f5a92f28ae12c469b98435763163)
The first change allowed setting cluster.floating_ip_enabled but the
second change introduced ServerAddressOutputMapping conditional on
cluster_template.floating_ip_enabled which leads to an edge case where
if floating_ip_enabled is overriden to False when a cluster is created
when it is True in the cluster_template scope, we see this error in the
conductor logs: ValueError: Field `node_addresses[0]' cannot be None and
the cluster remains forever stuck in CREATE_IN_PROGRESS status despite
Heat reaching CREATE_COMPLETE. This commit addresses this issue by
correctly referring to the cluster.floating_ip_enabled.
Change-Id: Ic83f625178786d4750a66dd6dd6db35c05bc0272
Story: 2007550
Task: 39401
At present, when a fixed_network is not specified, it is given the name
"private" by default. When multiple clusters are created, we end up in a
situation where we end up with multiple networks all with the same name.
This PS intends to make it easier to see where the resources belong to
by using the cluster name.
Story: 2007460
Task: 39139
Change-Id: I7f8028b716f9a9eced17d85ca2e46e2b1e34875f
Docker volume size as well as volume env files should be fetched
based on the nodegroup and not the cluster.
story: 2006701
task: 37008
Change-Id: Ia9e7f4612f36f4e57626b2e931b84898523e9ccb
Without this patch, it is impossible to create a cluster without
defining a fixed_network or a fixed_subnet that already exists since we
get a Fixed{Network,Subnet}NotFound error, and Heat is unable to create
these for us.
Story: 2002652
Task: 37201
Change-Id: I0e26682b0b6093b215393eb4ce8e94eae8e5e8f7
Signed-off-by: Bharat Kunwar <brtknr@bath.edu>
With this change each node will be labeled with the following:
* --node-labels=magnum.openstack.org/role=${NODEGROUP_ROLE}
* --node-labels=magnum.openstack.org/nodegroup=${NODEGROUP_NAME}
Change-Id: Ic410a059b19a1252cdf6eed786964c5c7b03d01c
Removes the role heat param from all templates. Instead and only for
k8s templates adds the master_role and worker_role params. The new
worker_only condition should be true for all roles except for master.
Finally, adds the missing is_cluster_stack param to all templates.
Change-Id: Ie0799373fe492c2e0a0cad903ed6e8c93e6266b5
Since OpenStack Cloud Controller Manager only accepts fixed_subnet uuid,
convert fixed_subnet name to uuid when a cluster is created.
Without this patch, there is a chance OCCM fails to start in come cases
when fixed_subnet is rendered as name.
Story: 2002652
Task: 28816
Change-Id: Ie70bc00f5617ef94c39c9faea7d39617ee01b07b
This adds the support for creating and deleting worker nodegroups
using different stack per nodegroup. In order to be backwards
compatible, default nodegroups will remain in one stack.
Having this in mind cluster status is now calculated aggregating the
statuses of the underlying stacks.
Change-Id: I97839ab8495ed5d860785dff1f6e3cc59b6a9ff7
This is a missing case after we fixed[1]. When user passing in
an existing network when creating cluster, the network name is
missed in the code. This patch fixes it.
[1] https://review.opendev.org/678067
Task: 36430
Story: 2005333
Change-Id: I3a005089c4a755812c40589d8fa1e3ab7bbf062d
Sometimes, the fixed_network value gets rendered as UUID. However OCCM's
internal-network-name requires the network name, it does not support
UUID. This patch introduces a new parameter called fixed_network_name
which converts fixed_network UUID to name if it is UUID-like.
Story: 2005333
Task: 36313
Change-Id: I3453bc0dbea285687d39c9782685cb1f2a3ecd39
When using a public cluster template, user still need the capability
to reuse their existing network/subnet, and they also need to be
able to turn of/off the floatingip to overwrite the setting in the
public template. This patch supports that by adding those three
items as parameters when creating cluster.
Story: 2006208
Task: 35797
Change-Id: I11579ff6b83d133c71c2cbf49ee4b20996dfb918
This is a regression issue introduced by the rolling upgrade feature,
without setting the master_kube_tag and minion_kube_tag, they will
be set with the default value. This patch fixes it by keeping them
consistent with the kube_tag label.
Change-Id: I8b0ca3f87c9a52d48ecb75e4dd8de18a61a10d6f
Now the coe_version is out of sync with the k8s version deployed
for the cluster. This patch will make sure the kube_version is
consistent with the kube_tag when creating the cluster and upgrading
the cluster.
Task: 33608
Story: 2002210
Change-Id: I5812dac340099ecd8923c1e4a60ce0e6611f7ca4
This reverts commit e8d0ee1b14.
This commit is reverted for two reasons:
* It is undesirable that the end user can inject proxy config into
the magnum-conductor service via the cluster template.
* The proxy settings for the magnum-conductor service may not be
the same as those which are required in the cluster template for
the end user VM.
Systemd, docker and podman all include native mechanisms for setting
environment variables for proecesses, and this should be used by the
cloud operator / deployment tooling to configure the required proxy
settings for the magnum-conductor service.
In particular this patch makes it impossible for the cloud operator
to specify their own http_proxy via the environment, the user supplied
cluster template setting will always be used.
Change-Id: I33da19ad6764bedcf15f2a08381063e2471f8991
The existing drivers are adapted to get node_count and master_count
information from the cluster's nodegroups. At the same time the
output mappings were updated to reflect the changes in the stack to
the nodegroups.
story: 2005266
Change-Id: I725413e77f5a7bdb48131e8a10e5dc884b5e066a
Now an OpenStack driver for Kubernetes Cluster Autoscaler is being
proposed to support autoscaling when running k8s cluster on top of
OpenStack. However, currently there is no way in Magnum to let
the external consumer to control which node will be removed. The
alternative option is calling Heat API directly but obviously it
is not the best solution and it's confusing k8s community. So with
this patch, we're going to add a new API:
POST <ClusterID>/actions/resize
And the post body will be:
{
"node_count": 3,
"nodes_to_remove": ["dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2"],
"nodegroup": "production_group"
}
The API will be working in a declarative way. For example, there
are 3 nodes in the cluser now, user can propose an API request
like above. Magnum will call Heat to remove the node
dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2 firstly, then bring the node
count back to 3 again.
Task: 29563
Story: 2005052
Change-Id: I7e36ce82c3f442976cc498153950b19c56a1759f
- Add "octavia" as one of the "ingress_controller" options.
- Add label "octavia_ingress_controller_tag".
- Use external network ID in the heat templates.
Story: 2004838
Change-Id: I7d889a054cd5feb2eeef523b20607a6c7630d777
HTTP(S) proxy can be specified when creating the template.
https://docs.openstack.org/magnum/latest/admin/magnum-proxy.html
However, it is not being utilized when talking to a public etcd discovery
service, which result in failed cluster creation. We need to be able to
use HTTP(S) proxy when services are running behind a firewall.
Change-Id: I13d86b0dc7c232a51149107f0412219388d8c2cd
story: 2004664
Cluster update was used for scaling operations only,
but if the heat-temaplates where changed for any reason
(eg upgrade of the magnum server), the stack update command
was destructive.
This patch uses the existing parameter in the stack update call.
story: 1722573
task: 21583
Change-Id: Id84e5d878b21c908021e631514c2c58b3fe8b8b0
A new label `service_cluster_ip_range` is added for k8s so that
user can set the service portal ip range to avoid conflicts with
pod ip range.
Task: 22568
Story: 2002725
Change-Id: Ie6e95a953059cc4bd5cf15a44f8666b714defb13
In the OpenStack deployment with Octavia service enabled, the octavia
service should be used not only for master nodes high availability, but
also for k8s LoadBalancer type service implementation as well.
Change-Id: Ib61f59507510253794a4780a91e49aa6682c8039
Closes-Bug: #1770133
Define a set of new labels to pass additional options to the kubernetes
daemons - kubelet_options, kubeapi_options, kubescheduler_options,
kubecontroller_options, kubeproxy_options.
In all cases the default value is "", meaning no extra options are
passed to the daemons.
Change-Id: Idabe33b1365c7530edc53d1a81dee3c857a4ea47
Closes-Bug: #1701223
Add ingress controller configuration and backend to kubernetes clusters.
A new label 'ingress_controller' defines which backend should serve
ingress, with traefik added as the only option for now.
It is defined as a DaemonSet, with instances on all nodes defined with a
certain role. This role is set as an additional cluster label
'ingress_controller_role', with a default value of 'ingress'.
For now no node is automatically set with this role, with users or operators
having to do this manually after cluster creation.
Change-Id: I5175cf91f37e2988dc3d33042558d994810842f3
Closes-Bug: #1738808
Add a new label 'cert_manager_api' to kubernetes clusters controlling the
enable/disable of the kubernetes certificate manager api.
The same cluster cert/key pair is used by this api. The heat agent is used
to install the key in the master node(s), as this is required for kubernetes
to later sign new certificate requests.
The master template init order is changed so the heat agent is launched
previous to enabling the services - the controller manager requires the CA key
to be locally available before being launched.
Change-Id: Ibf85147316e3a194d8a3f92cbb4ae9ce8e16c98f
Partial-Bug: #1734318
Add flavor_id as an option during cluster create. If not given,
the default is taken from the cluster template.
Add flavor_id in the Cluster object and use that instead
of the one from ClusterTemplate.
Update both magnum and magnum cli documentation to reflect the above changes.
Partial-Bug: #1699245
Change-Id: Ib60c05cce1cf2639ca4740abdd264403033433f9
Add master_flavor_id as an option during cluster create. If not given,
the default is taken from the cluster template.
Add master_flavor_id in the Cluster object and use that instead
of the one from ClusterTemplate.
Update both magnum and magnum cli documentation to reflect the above changes.
Partial-Bug: #1699247
Change-Id: Id1d973167b381538121583a0a9691304b39e98de
Add labels as an option during cluster create. If not given,
the default is taken from the cluster template.
Add labels in the Cluster object and use that instead
of the one from ClusterTemplate.
Update both magnum and magnum cli documentation to reflect the above changes.
Partial-Bug: #1697651
Implements: blueprint flatten-attributes
Change-Id: I8990c78433dcbbca5bc4aa121678b02636346802
Allow setting the size of a volume for etcd storage.
Default is 0 which matches the current behavior - no persistency.
Related-Bug: #1697655
Change-Id: I8a30df63684133a902ae209ba6c124da2a567d3f
kube-ui [2] is deprecated and not actively maintained since long time.
Instead kubernetes dashboard [1] has lot of features and is actively
managed.
With this patch kube-ui is removed and kubernetes dashboard is added
and enabled in k8s cluster by default.
The kubernetes dashboard is enabled by default. To disable it, set the
label 'kube_dashboard_enabled' to False
Reference:
[1] https://github.com/kubernetes/dashboard
[2] https://github.com/kubernetes/kube-ui
Change-Id: I8864c097a3da6a602e0f25d3ff8ade788aa134a9
Implements: blueprint add-kube-dashboard
Profit from the default cAdvisor deployed by k8s to deploy the
remaining monitoring stack on top, made of node-exporter,
Prometheus and Grafana.
Node-exporter is ran as a normal pod through a manifest, while
Prometheus and Grafana are deployments with 1 replica.
Prometheus has compliance with Kubernetes, so the discovery of
the nodes and other k8s components is configured directly in
Prometheus configuration.
Change-Id: If2cab996b9458580a55b5212ab298c909622e7f3
Partially-Implements: blueprint container-monitoring
If nothing is specified a set of recommended default plugins is used,
which includes the ServiceAccount one.
Change-Id: I1383aae09ba68f8e83b07e3eaae40ab071f7be94
Closes-Bug: #1646489
Make Kubernetes' kube-controller-manager and kube-scheduler
health checks configurable as a parameter to the cluster-template
(label).
Set their value higher for all deployments. And set their value
to a high number for tests, for the CI.
Change-Id: I65e2da12487c513419125f0525a4e21bac22210e
Closes-Bug: 1648826
If a fixed_network and fixed_subnet is specified no private network
is created by the templates and the specified network is
used instead for VMs provisioning, like in the Ironic driver.
Currently missing is the code to handle the use case where you
specify a fixed_network but not a fixed_subnet, this will come
in a following patch.
Partially Implements: blueprint decouple-private-network
Change-Id: I2003eb709b22b905063d846eb71570fc5e033618
Refactor driver interface to encapsulate the orchestration
strategy. This first patch only refactors the main driver
operations. A follow-on will handle the state synchronization
and removing the poller from the conductor.
1. Make driver interface abstract
2. Move external cluster operations into driver interface
3. Make Heat-based driver abstract and update based on
driver interface changes
4. Move Heat driver code into its own module
5. Update existing Heat drivers based on interface changes
Change-Id: Icfa72e27dc496862d950ac608885567c911f47f2
Partial-Blueprint: bp-driver-consolodation