Rolling upgrade is an important feature for a managed k8s service.
At this stage, two use cases are covered:
1. Upgrade base operating system
2. Upgrade k8s version
Known limitation: When doing an operating system upgrade, there is no
chance to call kubectl drain to evict pods on that node.
Task: 30185
Story: 2002210
Change-Id: Ibbed59bc135969174a20e5243ff8464908801a23
The current magnum traefik deployment always pulls the latest traefik
container image. With the launch of traefik v2
(https://blog.containo.us/back-to-traefik-2-0-2f9aa17be305) this
affects how the ingress is described in k8s.
This patch:
* Sets the default traefik tag to v1.7.9, a stable release
prior to v2.
* Adds a new label <traefik_ingress_controller_tag> that lets users
specify a traefik release other than the default.
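A minimal sketch of the tag selection described above. The helper name,
the dict-based label lookup and the bare "traefik" image repository are
assumptions for illustration, not Magnum's actual code:

```python
# Hypothetical helper; Magnum resolves the tag in its Heat templates and
# shell fragments, not in a Python function like this one.
DEFAULT_TRAEFIK_TAG = "v1.7.9"  # default named in this commit message


def traefik_image(labels):
    """Return the traefik image reference for a cluster's labels."""
    # An unset or empty label falls back to the pinned default tag.
    tag = labels.get("traefik_ingress_controller_tag") or DEFAULT_TRAEFIK_TAG
    return "traefik:" + tag  # image repository name is assumed
```

Without the label the pinned v1 release is used, so existing ingress
definitions keep working.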
Task: 30143
Task: 30146
Story: 2005286
Change-Id: I031a594f7b6014d88df055664afcf51b1cd2cd94
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Use Node Problem Detector, Draino and AutoScaler to support
auto healing for K8s clusters. Users can turn it on or off with a new
label "auto_healing_enabled".
Meanwhile, a new label "auto_scaling_enabled" is also introduced
to let the k8s cluster auto scale based on
its workload.
Task: 28923
Story: 2004782
Change-Id: I25af2a72a7a960205929374d2300bd83d4d20960
Add an nginx based Ingress controller for Kubernetes.
The goal is to better support use cases which require either
L4 access or SSL passthrough, which lack proper support in Traefik.
Selection is done via the same label 'ingress_controller' with value
'nginx'. Deployment relies on the upstream nginx-ingress helm chart.
Change-Id: I1db2074fce9d43c03f479a6aaeb4f238d7101555
Story: 2005327
Task: 30255
When more than one NIC is attached to an instance, the openstack cloud
provider returns a random InternalIP to the host, resulting in instability
with the API server, which only talks to a default interface.
This patch incorporates the changes made in
https://github.com/kubernetes/cloud-provider-openstack/pull/444 which enable
OpenStack Cloud Controller Manager to respect the `internal-network-name` in
the cloud-config file, ensuring that the InternalIP remains stable.
Story: 2005333
Task: 30271
Change-Id: I9e3ad459dd05753b53cb4ce75ee3aed649fef196
The Kubernetes Helm repository includes in its stable distribution
a prometheus-operator Chart.
This stable/prometheus-operator chart can be used to install all the
dependencies and some default configurations to use prometheus.
The installed extra charts are:
* stable/prometheus-node-exporter (data scraping)
* stable/prometheus (prometheus and alertmanager server)
* stable/grafana (visualization dashboard)
* stable/prometheus-operator (supervision and simple configuration)
The prometheus-operator is installed by using the label
monitoring_enabled=True. Also, the label grafana_admin_passwd can be
used to set the admin password for access to the grafana dashboard.
This patch allows the prometheus monitoring maintenance work to be
handed over to the kubernetes/helm team.
Task: 28544
Story: 2004623
depends_on: I99d3a78085ba10030200f12bbfe58a72964e2326
Change-Id: I80d590785bf30f9d634debeaf51c0d4cce0aeb93
Signed-off-by: Diogo Guerra <dy090.guerra@gmail.com>
Deploy Node Problem Detector to all nodes to detect problems which
can be leveraged by auto healing. This is the first step of enabling
the auto healing feature.
Task: 29886
Story: 2004782
Change-Id: I1b6075025c5f369821b4136783e68b16535dc6ef
Similar to calico, deploy flannel as a DS.
Flannel can use the kubernetes API to store
data, so it doesn't need to contact the etcd
server directly anymore.
This patch drops two relatively large files for
flannel's config, flannel-config-service.sh and
write-flannel-config.sh. All required config is
in the manifests.
Additional options to the controller manager:
--allocate-node-cidrs=true and --cluster-cidr.
Change-Id: I4f1129e155e2602299394b5866165260f4ea0df8
story: 2002751
task: 24870
Add enable_tiller label to install tiller in k8s_fedora_atomic
clusters. Defaults to false.
Add tiller_tag label to select the version of tiller. If the
tag is not set the tag that matches the helm client version in
the heat-agent will be picked. The tiller image can be stored
in a private registry and the cluster can pull it using the
container_infra_prefix label.
Install tiller securely using helper container.
TODO:
* add instructions on how RBAC is designed
https://docs.helm.sh/using_helm/#example-deploy-tiller-in-a-namespace-restricted-to-deploying-resources-in-another-namespace
* add docs on how to install addon in the cluster using this tiller
* how users can get the creds to talk to tiller
NOTE:
The main goal of this tiller is internal usage!
Users can still deploy other tillers in other namespaces.
story: 2003902
task: 26780
Change-Id: I99d3a78085ba10030200f12bbfe58a72964e2326
Signed-off-by: dioguerra <dy090.guerra@gmail.com>
- Add "octavia" as one of the "ingress_controller" options.
- Add label "octavia_ingress_controller_tag".
- Use external network ID in the heat templates.
Story: 2004838
Change-Id: I7d889a054cd5feb2eeef523b20607a6c7630d777
cloud-provider-openstack of Kubernetes now has a webhook to support
Keystone authentication and authorization. With this feature, users
can use a new label 'keystone-auth-enabled' to enable keystone
authN and authZ.
DocImpact
Task: 21637
Story: 1755770
Change-Id: I3d21ad8f55c0d7308a302f62db9e9af147a604f8
* Use the external cloud-provider [0]
* Label master nodes
* Make the script that deploys the cloud-provider and clusterroles
for the apiserver a SoftwareDeployment
* Rename kube_openstack_config to cloud-config.
For cinder to work, the kubelet expects the cloud config to have
exactly this name. Keep a copy of kube_openstack_config for backwards
compatibility.
Change-Id: Ife5558f1db4e581b64cc4a8ffead151f7b405702
Task: 22361
Story: 2002652
Co-Authored-By: Spyros Trigazis <spyridon.trigazis@cern.ch>
- Start workers as soon as the master VM is created, rather than
waiting for all the services to be ready.
- Move all the SoftwareDeployment outside of kubemaster stack.
- Tweak the scripts in SoftwareDeployment so that they can be combined
into a single script.
Story: 2004573
Task: 28347
Change-Id: Ie48861253615c8f60b34a2c1e9ad6b91d3ae685e
Co-Authored-By: Lingxian Kong <anlin.kong@gmail.com>
To upgrade a cluster we need to be able to set image tags,
so this change adds labels for the corresponding containers.
Task: 23314
Story: 2003171
Change-Id: I4cd0270a69fb889c59bdb28966821adb11fd0292
Add 'cloud_provider_enabled' label for the k8s_fedora_atomic
driver. Defaults to true. For specific kubernetes versions if
'cinder' is selected as a 'volume_driver', it is implied that
the cloud provider will be enabled since they are combined.
The motivation for this change is that in environments with
high load to the OpenStack APIs, users might want to disable
the cloud provider.
story: 1775358
task: 1775358
Change-Id: I2920f699654af1f4ba45644ab60a04a3f70918fe
Kubernetes should initialize its Global configuration for the OpenStack
provider with the region specified in the Heat stack.
This allows users to create Magnum Kubernetes clusters in a
multiregional OpenStack installation with different public endpoints
for services.
Task: 22576
Story: 2002728
Change-Id: I66820369b889e16445cad7a48cd0f458aae1c41f
Multi master deployments for the k8s driver use different service
account keys for each api/controller manager server, which leads to 401
errors for service accounts. This patch explicitly creates a signed cert
and private key dedicated to the k8s cluster for its service account
keys, avoiding the inconsistent keys issue.
Task: 21653
Story: 1766546
Change-Id: I61547405f866d3c5a84da63de66724b55af1066a
When creating a multi-master cluster, all master nodes attempt to
create kubernetes resources in the cluster at the same time, like
coredns, the dashboard, calico etc. This race condition shouldn't be
a problem when doing declarative calls instead of imperative (kubectl
apply instead of create). However, due to [1], kubectl fails to apply
the changes and the deployment scripts fail, causing cluster creation
to fail in the case of Heat SoftwareDeployments. This patch passes the
ResourceGroup index of every master so that resource creation is
attempted only from the first master node.
[1] https://github.com/kubernetes/kubernetes/issues/44165
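The gating described above can be sketched as follows. The function
names are hypothetical and the real check lives in the deployment
scripts; the sketch only assumes that the first master has ResourceGroup
index 0 and that the index may arrive as a string:

```python
# Hypothetical sketch, not Magnum code.
def is_first_master(master_index):
    """Return True only for the master with ResourceGroup index 0."""
    return int(master_index) == 0


def create_cluster_resources(master_index, create_fn):
    """Run create_fn once per cluster by gating on the master index."""
    if not is_first_master(master_index):
        return False  # another master is responsible for creation
    create_fn()
    return True
```

Since only one node ever issues the create calls, the race between
masters cannot occur regardless of how kubectl handles concurrent apply.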
Task: 21673
Story: 1775759
Change-Id: I83f78022481aeef945334c37ac6c812bba9791fd
This patch allows specification of the Cgroup driver for the Kubelet
service. The necessity of this patch was realised after upgrading Docker
to the new community edition (17.3+), which defaults to the `cgroupfs`
Cgroup driver, while Fedora Atomic (version 27) comes with Docker
1.13. The Cgroup drivers of the two services, Docker and Kubelet,
need to be identical for them to work together.
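A minimal sketch of passing the chosen driver to the kubelet. The
helper name and the validation set are illustrative assumptions; only
the kubelet's --cgroup-driver flag and the two driver names are taken
as given:

```python
# Hypothetical helper, not Magnum code.
VALID_CGROUP_DRIVERS = ("cgroupfs", "systemd")


def kubelet_cgroup_flag(driver):
    """Build the kubelet flag for a given Cgroup driver.

    The value must match the driver Docker runs with, otherwise the
    two services cannot work together.
    """
    if driver not in VALID_CGROUP_DRIVERS:
        raise ValueError("unsupported cgroup driver: %r" % (driver,))
    return "--cgroup-driver=" + driver
```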
Story: 2002533
Task: 22079
Change-Id: Ia4b38a63ede59e18c8edb01e93acbb66f1e0b0e4
In an OpenStack deployment with the Octavia service enabled, the octavia
service should be used not only for master nodes high availability, but
also for the k8s LoadBalancer type service implementation.
Change-Id: Ib61f59507510253794a4780a91e49aa6682c8039
Closes-Bug: #1770133
To allow the api server to access pods, we need
flannel to be running on the master node.
* Run flannel on the master node in a system
container.
Change-Id: Ic0996ba36e335e970f3d2255840b24a8b4f738b8
Closes-Bug: #1757936
Define a set of new labels to pass additional options to the kubernetes
daemons - kubelet_options, kubeapi_options, kubescheduler_options,
kubecontroller_options, kubeproxy_options.
In all cases the default value is "", meaning no extra options are
passed to the daemons.
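As a sketch of how such a label could be applied (the helper is
hypothetical and the flags in the usage note are only examples), the
label value is split like a shell command line and appended to the
daemon's base arguments:

```python
import shlex


# Hypothetical helper, not Magnum code.
def build_daemon_args(base_args, extra_options=""):
    """Append the options from a *_options label to a daemon's arguments.

    The default label value "" adds nothing, so the daemon runs with its
    base arguments only.
    """
    return list(base_args) + shlex.split(extra_options)
```

For example, a kubelet_options value of "--max-pods=50" would add that
single flag, while the default "" leaves the argument list unchanged.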
Change-Id: Idabe33b1365c7530edc53d1a81dee3c857a4ea47
Closes-Bug: #1701223
Add ingress controller configuration and backend to kubernetes clusters.
A new label 'ingress_controller' defines which backend should serve
ingress, with traefik added as the only option for now.
It is defined as a DaemonSet, with instances on all nodes defined with a
certain role. This role is set as an additional cluster label
'ingress_controller_role', with a default value of 'ingress'.
For now no node is automatically set with this role, with users or operators
having to do this manually after cluster creation.
Change-Id: I5175cf91f37e2988dc3d33042558d994810842f3
Closes-Bug: #1738808
In Fedora Atomic 27 etcd and flanneld are removed from the base image.
Install them as system containers.
* update docker-storage configuration
* add etcd and flannel tags as labels
Change-Id: I2103c7c3d50f4b68ddc11abff72bc9e3f22839f3
Closes-Bug: #1735381
Add a new label 'cert_manager_api' to kubernetes clusters that enables
or disables the kubernetes certificate manager api.
The same cluster cert/key pair is used by this api. The heat agent is used
to install the key in the master node(s), as this is required for kubernetes
to later sign new certificate requests.
The master template init order is changed so the heat agent is launched
before the services are enabled - the controller manager requires the CA key
to be locally available before being launched.
Change-Id: Ibf85147316e3a194d8a3f92cbb4ae9ce8e16c98f
Partial-Bug: #1734318
Added configuration parameter, verify_ca, to magnum.conf with default
value of True. This parameter is passed to the heat templates to
indicate whether the cluster nodes validate the Certificate Authority
when making requests to the OpenStack APIs (Keystone, Magnum, Heat).
This configuration parameter can be set to False to disable CA
validation.
Co-Authored-By: Vijendar Komalla <vijendar.komalla@rackspace.com>
Change-Id: Iab02cb1338b811dac0c147378dbd0e63c83f0413
Partial-Bug: #1663757
Add a label to prefix all container images used by magnum:
* kubernetes components
* coredns
* node-exporter
* kubernetes-dashboard
Using this label all containers will be pulled from the specified
registry and group in the registry.
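A minimal sketch of the prefixing, assuming that the prefix replaces
the registry and group part of the default image reference and that
only the image name and tag are kept. The function name and the example
registry are hypothetical:

```python
# Hypothetical helper, not Magnum code.
def prefixed_image(image, prefix=""):
    """Return the image reference to pull, honouring an optional prefix."""
    if not prefix:
        return image  # label unset: keep the default registry and group
    name_and_tag = image.rsplit("/", 1)[-1]
    return prefix.rstrip("/") + "/" + name_and_tag
```

With a prefix of "registry.example.com/magnum/", an image such as
"docker.io/coredns/coredns:011" would be pulled as
"registry.example.com/magnum/coredns:011".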
TODO:
* grafana
* prometheus
Closes-Bug: #1712810
Change-Id: Iefe02f5ebc97787ee80431e0f16f73ae8444bdc0
1. Cluster creation fails if the tenant name contains Chinese characters
2. TENANT_NAME is unnecessary after changing to trustee
This patch is for k8s_fedora_atomic and k8s_fedora_ironic
Change-Id: Ie072f183110ae95861fb3694a913a3a4526549fb
Close-Bug: #1711308
Separate the image tag to pull from the kubernetes version.
Currently the tag and the version happen to be the
same, but the fedoraproject has not yet decided how the
images are going to be tagged. Also, operators might want to try
their own container images with custom tags.
Depends-On: Icddb8ed1598f2ba1f782622f86fb6083953c3b3f
Implements: blueprint run-kube-as-container
Change-Id: I4c4bc055d7df5e65aede93464bff51e6d5971504
Allow setting the size of a volume for etcd storage.
Default is 0 which matches the current behavior - no persistency.
Related-Bug: #1697655
Change-Id: I8a30df63684133a902ae209ba6c124da2a567d3f
Enable internal cluster DNS by deploying CoreDNS in the kube-system
namespace. It covers dns queries for both the cluster and external,
acting as a proxy with a cache layer in front.
Version of CoreDNS hard-coded to 007, image taken from dockerhub.
Related-Bug: #1692449
Change-Id: I0a9703b531fe872416dcd79fa7d4d27c1ea61586
Apply to Fedora the same fix as for CoreOS, which enables multimaster
with TLS and the etcd load balancer.
Closes-Bug: #1679724
Change-Id: I45b62a20f0a89ebd1494ad61021384fc7a416e8e
kube-ui [2] is deprecated and has not been actively maintained for a
long time. In contrast, kubernetes dashboard [1] has a lot of features
and is actively maintained.
With this patch kube-ui is removed and kubernetes dashboard is added
and enabled in k8s clusters by default. To disable it, set the
label 'kube_dashboard_enabled' to False.
Reference:
[1] https://github.com/kubernetes/dashboard
[2] https://github.com/kubernetes/kube-ui
Change-Id: I8864c097a3da6a602e0f25d3ff8ade788aa134a9
Implements: blueprint add-kube-dashboard
Profit from the default cAdvisor deployed by k8s to deploy the
remaining monitoring stack on top, made of node-exporter,
Prometheus and Grafana.
Node-exporter is run as a normal pod through a manifest, while
Prometheus and Grafana are deployments with 1 replica.
Prometheus supports Kubernetes service discovery, so the discovery of
the nodes and other k8s components is configured directly in the
Prometheus configuration.
Change-Id: If2cab996b9458580a55b5212ab298c909622e7f3
Partially-Implements: blueprint container-monitoring
This commit addresses multiple potential vulnerabilities in
Magnum. It makes the following changes:
* Permissions for /etc/sysconfig/heat-params inside Magnum
created instances are tightened to 0600 (used to be 0755).
* Certificate retrieval is modified to work without the need
for a Keystone trust.
* The cluster's Keystone trust id is only passed into
instances for clusters where that is actually needed. This
prevents the trustee user from consuming the trust in cases
where it is not needed.
* The configuration setting trust/cluster_user_trust (False by
default) is introduced. It needs to be explicitly enabled
by the cloud operator to allow clusters that need the
trust_id to be passed into instances to work. Without this
setting, attempts to create such clusters will fail.
Please note that none of these changes apply to existing
clusters. They will have to be deleted and rebuilt to benefit
from these changes.
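The tightened file mode from the first item above can be illustrated
with a short sketch. The helper name is hypothetical; in Magnum the
file is written during instance initialisation, not by Python code like
this:

```python
import os


# Hypothetical helper, not Magnum code.
def write_private_file(path, content):
    """Write content to path so that only the owner can read or write it."""
    # Create the file with mode 0600 from the start instead of widening
    # and tightening afterwards.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    # Enforce 0600 even if the file already existed with a wider mode.
    os.chmod(path, 0o600)
```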
Change-Id: I643d408cde0d6e30812cf6429fb7118184793400