Commit Graph

41 Commits

Author SHA1 Message Date
Takashi Kajinami 91f181e3ad Remove six from drivers module
This is part of the steps to remove usage of six library, which is no
longer needed since python 2 support was removed.

Change-Id: If6fb372f72a469e55e956e127c49863b5a557552
2024-02-19 10:43:24 +00:00
Michal Nasiadka 68c8acba39 Remove execution bit on unnecessary files
Change-Id: Ia41b843fdf20154750b129a8ab5dd42f5c3989fb
2024-02-19 00:30:21 +00:00
Michal Nasiadka 339a771587 heat: Update addresses on CREATE_FAILED
This is required for Tempest CI to fetch master/node addresses in order
to collect logs from them on cluster creation failure.

Change-Id: I24ac7ff632a8758bfefa5b66341a19eb9712dac6
2024-01-31 11:07:10 +00:00
Dale Smith 2fd3059f38 Remove support for in-place upgrades with the Heat driver.
Heat stack SoftwareConfig is unable to provide a reliable upgrade
experience, so is being disabled. More details in code comments.

A Cluster API driver provides a way forward for Magnum to support
these again, and implement upgrade_cluster.

Change-Id: Ibea354ebfe36e8d689a95c30820709ec2b633964
2023-12-20 21:54:44 +13:00
ricolin 1ed78a4438 Allow update cluster status with admin context
A trust token can be deleted outside of Magnum. When the trust token is
not found, the periodic status update job leaves the cluster in an
in-progress state until another cluster action is triggered.

Use the admin context in the periodic status update job when the trust
cannot be found.

Story: 2010232
Task: 46031

Change-Id: I9cc9a0e654fb26ebec517e3413a592ac6613777c
2022-08-18 05:29:32 +08:00
Bharat Kunwar 00f8aa5d67 Fix debug logging during cluster upgrade
An incorrectly formatted logging call causes an error. This PS fixes it.

Story: 2008628
Task: 41833

Change-Id: Iac87a4a56187694d5f5b3454de380de6b6db48fa
2021-03-17 17:17:50 +00:00
guilhermesteinmuller 439548e3de Fix ostree_* upgrade
Currently, the code assumes that both
ostree_commit and ostree_remote are present
in cluster_template.labels. If one of them is
missing, the ostree upgrade fails [1] and leaves
the cluster with UPDATE_FAILED status.

According to the docs [2], users may
set only one of the labels.
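
A minimal sketch of a lookup that tolerates a missing label, assuming
the labels are a plain dict. The helper name ostree_upgrade_params is
hypothetical and is not taken from the Magnum source:

```python
def ostree_upgrade_params(labels):
    """Collect whichever ostree_* labels are present (hypothetical helper)."""
    wanted = ("ostree_remote", "ostree_commit")
    # Skip missing labels instead of raising KeyError when only one is set.
    return {key: labels[key] for key in wanted if key in labels}
```

With only ostree_commit set, the result contains just that key, so the
caller can decide what to do about the absent ostree_remote.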

[1] https://gist.github.com/guilhermesteinmuller/7bf9f51e421283783cf737900797232c
[2] https://github.com/openstack/magnum/blob/master/doc/source/user/rolling-upgrade.rst

Change-Id: I0f65169305ba74c082b65bf39083def278404b93
2021-02-15 06:50:45 +00:00
Bharat Kunwar 1cdc0628a2 [fix] Sync nodegroup status before delete_complete
Magnum cluster deletion is not behaving as expected. While it appears to
delete successfully, the _delete_complete routine in
magnum/drivers/heat/driver.py is never called because the status of
nodegroups has not had the chance to sync with Heat before
_check_delete_complete is called. As a result, for example, trustee user
accounts are left orphaned. This PS changes the order of activities so
that _delete_complete is invoked successfully.

Story: 2007965
Task: 40459

Change-Id: Ibadd5b57fe175bb0b100266e2dbcc2e1ea4efcf9
2020-10-12 20:39:09 +00:00
Feilong Wang 8020391e4a [k8s] Support CA certs rotate
A k8s cluster owner can now rotate the CA certificate to regenerate the
CA of the cluster. The service account keys and the certs of all nodes
are regenerated as well. Cluster users need to get a new kubeconfig
to access the Kubernetes API. This function is only supported by
the Fedora CoreOS driver.

To test this patch with python-magnumclient, you need this patch
https://review.opendev.org/#/c/724243/, otherwise, you will see
an error about "not enough values to unpack", though the CA cert
rotate request has been processed by Magnum server side correctly.

Task: 39580
Story: 2005201

Change-Id: I4ae12f928e4f49b99732fba097371692cb35d9ee
2020-08-24 16:31:58 +12:00
Spyros Trigazis 9f4c63a0df resize: Send only nodes_to_remove and node_count
When resizing a NG we should strictly send the
desired node_count and the nodes_to_remove.
Otherwise the stack update operation may replace/rebuild
nodes or other resources.

This was the functionality with:
Id84e5d878b21c908021e631514c2c58b3fe8b8b0
But it was reverted with:
I725413e77f5a7bdb48131e8a10e5dc884b5e066a

Story: 2005266
task: 39860

Change-Id: Ib31b6801e0e2d954c31ac91e77ae9d3ef1afebd2
Signed-off-by: Spyros Trigazis <strigazi@gmail.com>
2020-06-05 08:47:53 +00:00
Theodoros Tsioutsias f7a50223e7 More verbose logs for cluster ops
Most of the time, issues with cluster update/upgrade/resize can be
identified just by looking at the parameters sent to Heat. This patch
raises the existing log messages for cluster update and resize from
debug to info, and adds a log message for cluster upgrade.

story: #2007636
task: #39689
Change-Id: Ibac5e105885b6e7042e88dea31cfeafe42a401ab
2020-05-07 09:38:44 +00:00
Andreas Jaeger ae228bb5cc Update hacking for Python3
The repo is Python 3 now, so update hacking to version 3.0 which
supports Python 3.

Fix problems found.

Update local hacking checks for new flake8.

Remove hacking and friends from lower-constraints, those are not needed
for co-installing.

Change-Id: I926efaef501f190e78da9cab40c1e94203277258
2020-03-31 20:09:46 +02:00
Bharat Kunwar dfea2741f2 Fix join of status_reason
At present, the status reason resolves to:

    default-master <reason> ,default-worker <reason>

It should be:

    default-master <reason>, default-worker <reason>

This minor patch fixes this.
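
A minimal sketch of the corrected join, assuming the per-nodegroup
reasons arrive as a name-to-reason mapping. The function name and the
input shape are hypothetical, not Magnum's actual code:

```python
def join_status_reasons(reasons):
    """Join "<nodegroup> <reason>" entries with a comma followed by a space."""
    entries = [f"{name} {reason}" for name, reason in reasons.items()]
    # ", " puts the space after the comma, not before it.
    return ", ".join(entries)
```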

Task: 39092
Story: 2007438

Change-Id: I3382da8d950279713861e14d97997d5a5205b1e7
2020-03-18 10:39:51 +00:00
Bharat Kunwar 895b693c07 [fix] Allow cluster OS upgrade without specifying kube_tag
If kube_tag is not specified in the new cluster_template, the existing
kube_tag should be reused. At the moment, we simply see this error which
does not make any sense:

    $ openstack coe cluster upgrade k8s k8s-alt
    '\'kube_tag\'\n (HTTP 500) (Request-ID: req-652883e9-05f3-43f1-b94d-8c6e0de75a2e)
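
A minimal sketch of the fallback described above, assuming both label
sets are plain dicts. The helper name resolve_kube_tag is hypothetical:

```python
def resolve_kube_tag(new_labels, current_labels):
    """Prefer the new template's kube_tag; otherwise keep the current one."""
    # dict.get returns a default instead of raising KeyError.
    return new_labels.get("kube_tag", current_labels.get("kube_tag"))
```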

Story: 2005201
Task: 37712

Change-Id: Ic15ad96a13f18c820bba592d4550f3a4fa951ffb
2019-12-16 11:39:43 +00:00
Zuul a2b5f5139b Merge "Failed state was ignored for default ngs"
2019-10-23 15:16:17 +00:00
Fei Long Wang 09f85f3746 [fedora-atomic][k8s] Support operating system upgrade
Along with the Kubernetes version upgrade support we just released, we're
adding support for upgrading the operating system of the k8s cluster
(including master and worker nodes). It's an in-place upgrade leveraging the
atomic/ostree upgrade capability.

Story: 2002210
Task: 33607

Change-Id: If6b9c054bbf5395c30e2803314e5695a531c22bc
2019-10-18 14:44:27 +00:00
Theodoros Tsioutsias 0ac4db955f ng-13: Support nodegroup upgrade
Adds support for upgrading nodegroups. All non-default nodegroups
can be upgraded using the CT set in the cluster. The
only label that gets upgraded for now is kube_tag. All other labels
in the new cluster_template are ignored.

Change-Id: Icade1a70f160d5ec1c0e6f06ee642e29fe9b02ff
2019-10-16 11:53:44 +00:00
Spyros Trigazis 73dc57c319 Support Fedora CoreOS 30
Add a Fedora CoreOS driver. To deploy clusters with Fedora CoreOS,
operators or users need to set os_distro=fedora-coreos on the image.
The scripts that deploy Kubernetes on top are the same as for Fedora
Atomic. Note that this driver has SELinux enabled.

The startup of the heat-container-agent uses a workaround to copy the
SoftwareDeployment credentials to /var/lib/cloud/data/cfn-init-data.
The fedora coreos driver requires heat train to support ignition.

Task: 29968
Story: 2005201

Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>

Change-Id: Iffcaa68d385b1b829b577ebce2df465073dfb5a1
2019-10-16 09:44:19 +00:00
Theodoros Tsioutsias ae159882e4 Failed state was ignored for default ngs
This change addresses an issue with state aggregation where default
ngs were in a failed state and it was ignored. e.g. default ngs were
in UPDATE_FAILED, a non-default ng in UPDATE_COMPLETE and the cluster
reported UPDATE_COMPLETE.
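
A minimal sketch of an aggregation in which a failed nodegroup is never
masked by a completed one. The precedence order (failed, then in
progress, then complete) and all names are illustrative assumptions,
not the logic of the actual driver:

```python
# Assumed precedence: the earliest matching suffix wins.
STATUS_PRECEDENCE = ("_FAILED", "_IN_PROGRESS", "_COMPLETE")


def aggregate_status(nodegroup_statuses):
    """Return the most severe status among all nodegroups, or None."""
    for suffix in STATUS_PRECEDENCE:
        for status in nodegroup_statuses:
            if status.endswith(suffix):
                return status
    return None
```

With default ngs in UPDATE_FAILED and a non-default ng in
UPDATE_COMPLETE, this returns UPDATE_FAILED regardless of input order.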

Change-Id: I317c896f0f161427fada677393df5fd2435e7bbd
story: 2006713
task: 37084
2019-10-15 07:54:33 +00:00
Theodoros Tsioutsias e52f77b299 ng-9: Driver for nodegroup operations
This adds support for creating and deleting worker nodegroups
using a different stack per nodegroup. To remain backwards
compatible, default nodegroups stay in one stack.

With this in mind, cluster status is now calculated by aggregating the
statuses of the underlying stacks.

Change-Id: I97839ab8495ed5d860785dff1f6e3cc59b6a9ff7
2019-09-26 08:45:57 +00:00
Theodoros Tsioutsias 5027e0daf8 ng-8: APIs for nodegroup CRUD operations
This adds the changes needed at the API and conductor levels to support
creating, updating, and deleting nodegroups.

Change-Id: I4ad60994ad6b4cb9cac18129557e1e87e61ae98c
2019-09-26 08:45:57 +00:00
Mohammed Naser b5d50ddd89 k8s: refactor functions into KubernetesDriver
We currently have a lot of duplicate functions across our drivers
which use Kubernetes.  This takes them and brings them into a
common class called KubernetesDriver and cleans up the subclasses.

Change-Id: I6f880cb03ed43ec3bc9d3d9e5a7b87eaceda40e9
2019-06-24 16:33:06 -04:00
Spyros Trigazis (strigazi) 9b1bd5da54 Add cluster upgrade to the API
To enable rolling upgrades of Kubernetes clusters, this
patch proposes a new API, /upgrade, that supports upgrading the
base operating system of nodes, the version of Kubernetes, and even
add-ons running on the k8s cluster:

POST <ClusterID>/actions/upgrade

And the post body will be:

{
    "cluster_template": "dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2",
    "max_batch_size": 1,
    "nodegroup": "production_group"
}

Co-Authored-By: Feilong Wang <flwang@catalyst.net.nz>

Task: 30168
Story: 2002210

Change-Id: Ia168877778aa0d473383eb06b1c8a16dc06b0576
2019-06-07 12:01:10 +12:00
Theodoros Tsioutsias ea95b0dc5c ng-3: Adapt existing drivers
The existing drivers are adapted to get node_count and master_count
information from the cluster's nodegroups. At the same time the
output mappings were updated to reflect the changes in the stack to
the nodegroups.

story: 2005266

Change-Id: I725413e77f5a7bdb48131e8a10e5dc884b5e066a
2019-03-28 10:31:01 +00:00
Feilong Wang 15ecdb8033 Support <ClusterID>/actions/resize API
An OpenStack driver for the Kubernetes Cluster Autoscaler is being
proposed to support autoscaling when running a k8s cluster on top of
OpenStack. However, there is currently no way in Magnum to let
an external consumer control which node will be removed. The
alternative is calling the Heat API directly, but that
is not the best solution and it confuses the k8s community. So with
this patch, we're going to add a new API:

POST <ClusterID>/actions/resize

And the post body will be:

{
    "node_count": 3,
    "nodes_to_remove": ["dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2"],
    "nodegroup": "production_group"
}

The API works in a declarative way. For example, if there
are 3 nodes in the cluster now, a user can send an API request
like the one above. Magnum will call Heat to remove the node
dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2 first, then bring the node
count back to 3 again.
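
A minimal sketch of assembling the request body shown above. The helper
name build_resize_body is hypothetical, and the sketch only builds and
serializes the payload; it does not call Magnum:

```python
import json


def build_resize_body(node_count, nodes_to_remove, nodegroup):
    """Build the POST body for <ClusterID>/actions/resize (illustrative)."""
    return {
        # Desired final size of the nodegroup, not a delta.
        "node_count": node_count,
        # Nodes that must be removed before the count is restored.
        "nodes_to_remove": list(nodes_to_remove),
        "nodegroup": nodegroup,
    }


body = build_resize_body(
    3, ["dd9cc5ed-3a2b-11e9-9233-fa163e46bcc2"], "production_group"
)
payload = json.dumps(body, indent=4)
```

Because node_count states the target size rather than a difference,
sending the same body twice describes the same end state.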

Task: 29563
Story: 2005052

Change-Id: I7e36ce82c3f442976cc498153950b19c56a1759f
2019-03-19 20:13:17 +00:00
Lingxian Kong e18ced4d5c Delete Octavia loadbalancers for fedora atomic k8s driver
For k8s cluster, the loadbalancers created for LoadBalancer type
services should be deleted before the cluster deletion.

Change-Id: I75f44187b7be7d0ffb6a8f195f755de4b1564335
Closes-Bug: #1712062
2018-12-13 13:18:40 +13:00
Erik Olof Gunnar Andersson f2fd732ce2 Trivial code cleanups
Cleaning up comments and logging to make sure they properly adhere
to OpenStack standards.

* Consistently use """ instead of ''' for comments.
* Always lazy-load logging parameters.
* Fixed bad log line in cert_manager.

Change-Id: I547f5dfa61609a899aef9b1470be8d8a6d8e4b81
2018-10-02 19:41:34 +00:00
Spyros Trigazis 3f773f1fd0 Use existing templates for cluster-update command
Cluster update was used for scaling operations only,
but if the Heat templates were changed for any reason
(e.g. an upgrade of the Magnum server), the stack update command
was destructive.

This patch uses the existing parameter in the stack update call.

story: 1722573
task: 21583

Change-Id: Id84e5d878b21c908021e631514c2c58b3fe8b8b0
2018-09-24 11:17:02 +02:00
Spyros Trigazis 797f0157d6 Resolve stack outputs only on COMPLETE
While the stack is not COMPLETE, we do not need
to resolve the outputs of the stack. Resolving the
outputs is expensive for large stacks.

story: 2002959
task: 22961

Change-Id: I26861214bba8cc92f4e7f9ecba5ba51df99346cb
Signed-off-by: Spyros Trigazis <spyridon.trigazis@cern.ch>
2018-07-20 06:52:57 +00:00
Kirsten G d9e590bdc6 Cache barbican certs for periodic tasks
Added configuration parameter, temp_cache_dir, to magnum.conf with
default value of "/var/lib/magnum/certificate-cache". This local
directory will hold cached cluster TLS credentials that are generated
during periodic tasks, to reduce load as the number of clusters
increases. If the temp_cache_dir does not exist, the certificates
will be created as tempfiles.

Closes-Bug: #1659545

Change-Id: I8808c4098a7c8d22dbfc841142c9f9c8b976dde1
2018-04-03 06:15:58 +00:00
Clenimar Filemon ec950be894 federation api: api endpoints
This commit introduces a new '/federations'
endpoint to the Magnum API, as well as its controllers,
entities and conductor handlers.

This corresponds to the first phase of the
federation-api spec. Please refer to [1] for more
details.

[1] https://review.openstack.org/#/c/489609/

Change-Id: I662ac2d6ddec07b50712109541486fd26c5d21de
Partially-Implements: blueprint federation-api
2018-02-09 00:59:31 -03:00
Costin Gamenț 336404c861 Generate lower case stack name
If stack name contains upper case characters, Openstack Cloud Provider
will complain.

Change-Id: I482e7143fb772d409d2c25b68ce6a6020e130bc3
Close-Bug: #1718947
2017-11-23 11:13:50 +01:00
Costin Gamenț 283c093187 Generate stack name as a valid hostname
Truncate cluster name to 30 characters, map ('_', '.') to '-' and remove non
alpha-numeric characters.
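
A minimal sketch of the three steps described above. The function name,
the order of the steps, and the choice to keep '-' when stripping are
assumptions made for illustration, not the driver's actual code:

```python
import re


def sanitize_stack_name(cluster_name, max_len=30):
    """Derive a hostname-friendly stack name prefix (hypothetical helper)."""
    # Step 1: truncate the cluster name to 30 characters.
    name = cluster_name[:max_len]
    # Step 2: map '_' and '.' to '-'.
    name = re.sub(r"[_.]", "-", name)
    # Step 3: drop anything that is not a letter, digit, or '-'.
    return re.sub(r"[^a-zA-Z0-9-]", "", name)
```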

Change-Id: Ibb2bddc5b602a34d0e2bebd1f6bb197669bf21ec
Close-Bug: #1718947
2017-11-02 13:17:32 +00:00
Mark Goddard 88a6e3bab5 Don't poll heat if no stack exists
Cluster objects are created asynchronously from their underlying
heat stacks, meaning that the periodic update can sometimes end up
trying to poll a cluster's heat stack before the stack has been created.

This change checks whether the stack_id is None and skips polling heat
if so. This has the side effect of resolving bug 1682058, since we don't
try to use a trust and trustee that do not exist.
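
A minimal sketch of the guard described above, assuming the cluster is
a dict and the Heat poller is passed in as a callable. Both names are
hypothetical:

```python
def poll_cluster_status(cluster, poll_heat):
    """Return the polled status, or None when no stack exists yet."""
    stack_id = cluster.get("stack_id")
    if stack_id is None:
        # The stack has not been created yet, so there is nothing to poll.
        return None
    return poll_heat(stack_id)
```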

Change-Id: I73f039659250f1d5b69b23141835c4602c8e019a
Closes-Bug: #1682058
2017-07-28 07:46:47 +00:00
coldmoment ba8ad5e37f Add a hacking rule for string interpolation at logging
String interpolation should be delayed to be handled
by the logging code, rather than being done at the point
of the logging call.
See the oslo i18n guideline
* https://docs.openstack.org/oslo.i18n/latest/user/guidelines.html#adding-variables-to-log-messages
and
* https://github.com/openstack-dev/hacking/blob/master/hacking/checks/other.py#L39
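
A minimal sketch of the difference, using only the standard logging
module. The function name and the message text are illustrative:

```python
import logging

LOG = logging.getLogger(__name__)


def log_cluster_status(cluster_id, status):
    """Log with delayed interpolation, as the hacking rule expects."""
    # Preferred: pass the values as arguments so the logging module
    # formats the message only if the record is actually emitted.
    LOG.info("Cluster %s is now %s", cluster_id, status)
    # Discouraged: interpolating at the call site, for example
    #   LOG.info("Cluster %s is now %s" % (cluster_id, status))
    # formats the string even when INFO logging is disabled.
```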

Change-Id: I8a4f5f896865aebbff88ee894f0081e58cfce9ef
2017-07-15 14:49:45 +08:00
yuanpeng 71d25456d2 Remove log translations
Log messages are no longer being translated. This removes all use of
the _LE, _LI, and _LW translation markers to simplify logging and to
avoid confusion with new contributions.

See:
http://lists.openstack.org/pipermail/openstack-i18n/2016-November/002574.html
http://lists.openstack.org/pipermail/openstack-dev/2017-March/113365.html

Change-Id: If1f4bd2f6be967368f52fb367c5a428d3eb58a9d
Closes-Bug: #1674551
2017-03-30 17:05:10 +08:00
Randall Burt c40413518a Use correct context synching status
Use the trust context when synching cluster status
with orchestration status.

Change-Id: I8ae0d1b92c3adce83032bb6c5f269d8d23c20c5e
Partial-Blueprint: bp-driver-consolodation
Closes-Bug: #1651243
2016-12-20 17:02:00 +00:00
Jason Dunsmore 1cdb4f94aa Import magnum.i18n._ in driver/heat/driver.py
Closes-Bug: 1651169
Change-Id: I7d0cf66517935a13dfdf7ff24dd933024353f6f8
2016-12-19 09:52:17 -06:00
Randall Burt 84a9464957 Move cluster status notifications out of driver
Move cluster status change notifications into the
periodic task so that drivers do not have to have
any knowledge of Magnum notification strategy.

Change-Id: I5c71dd780f7bd6d4b683e491f5b4ce22cecb396c
Partial-Blueprint: bp-driver-consolodation
2016-12-07 15:27:44 +00:00
Randall Burt 759c1b3b2b Move cluster status updates into driver
This is an alternative implementation to:

https://review.openstack.org/#/c/397961

This version implements an earlier proposal from the
spec that adds a driver method for synchronizing
cluster state. This method is optional so that drivers
that do not wish to leverage the existing periodic
synchronization task can do so in whatever manner
they wish and Magnum will not force them to do anything
unnecessarily.

1. Add an update_cluster_status method to the driver
   interface
2. Implement update_cluster_status for Heat drivers
   using the existing tested logic
3. Remove cluster status updates from the cluster conductor
   in favor of the periodic sync_cluster_status task - this
   should avoid timeouts and race conditions possible in the
   previous implementation
4. Update the periodic sync_cluster_status method to use
   the driver to update cluster status rather than calling
   Heat directly

Change-Id: Iae0ec7af2542343cc51e85f0efd21086d693e540
Partial-Blueprint: bp-driver-consolodation
2016-12-01 19:52:06 -06:00
Randall Burt 7890725c52 Refactor driver interface (pt 1)
Refactor driver interface to encapsulate the orchestration
strategy. This first patch only refactors the main driver
operations. A follow-on will handle the state synchronization
and removing the poller from the conductor.

1. Make driver interface abstract
2. Move external cluster operations into driver interface
3. Make Heat-based driver abstract and update based on
   driver interface changes
4. Move Heat driver code into its own module
5. Update existing Heat drivers based on interface changes

Change-Id: Icfa72e27dc496862d950ac608885567c911f47f2
Partial-Blueprint: bp-driver-consolodation
2016-12-01 09:23:46 -06:00