Commit Graph

284 Commits

Author SHA1 Message Date
Dawud 140722f74e
Enable HAProxy Prometheus metrics endpoint
HAProxy exposes a Prometheus metrics endpoint, it just needs to be
enabled. Enable this and remove configuration for
prometheus-haproxy-exporter. Remaining prometheus-haproxy-exporter
containers will automatically be removed.

Change-Id: If6e75691d2a996b06a9b95cb0aae772db54389fb
Co-Authored-By: Matt Anson <matta@stackhpc.com>
2024-01-05 10:36:31 +00:00
Mark Goddard af6e1ca4fd Support Ansible max_fail_percentage
This allows us to continue execution until a certain proportion of hosts
fail. This can be useful at scale, where failures are common and
restarting a deployment is time-consuming.

The default max failure percentage is 100, keeping the default
behaviour. A global max failure percentage may be set via
kolla_max_fail_percentage, and individual services may define a max
failure percentage via <service>_max_fail_percentage.

Note that all hosts in the inventory must be reachable for fact
gathering, even those not included in a --limit.
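
As a rough illustration of the settings described above, the Python sketch below models how a per-service value might override the global one and how a failure percentage is compared against the threshold. The fallback order and the helper names are assumptions made for illustration, not code from kolla-ansible; the comparison follows Ansible's documented rule that the percentage must be exceeded, not merely equalled.

```python
def effective_max_fail_percentage(settings, service):
    """Pick the threshold for a service.

    Assumption: <service>_max_fail_percentage, when set, takes precedence
    over kolla_max_fail_percentage, which defaults to 100.
    """
    global_value = settings.get("kolla_max_fail_percentage", 100)
    return settings.get(f"{service}_max_fail_percentage", global_value)


def play_aborts(failed_hosts, total_hosts, max_fail_percentage):
    """Return True when the failed share of hosts exceeds the threshold."""
    if total_hosts == 0:
        return False
    return failed_hosts * 100 / total_hosts > max_fail_percentage


# Hypothetical settings: a global limit plus an override for one service.
settings = {"kolla_max_fail_percentage": 50, "nova_max_fail_percentage": 25}
nova_limit = effective_max_fail_percentage(settings, "nova")
glance_limit = effective_max_fail_percentage(settings, "glance")
```

With the default of 100 the threshold can never be exceeded, so this setting alone never aborts a play.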

Closes-Bug: #1833737
Change-Id: I808474a75c0f0e8b539dc0421374b06cea44be4f
2023-12-05 11:49:42 +01:00
Jan Gutter ed3b27cc92 etcd: Add support for more scenarios
This commit addresses a few shortcomings in the etcd service:
  * Adding or removing etcd nodes required manual intervention.

  * The etcd service would have brief outages during upgrades or
    reconfigures because restarts weren't always serialised.

This makes the etcd service follow a similar pattern to mariadb:
  * There is now a distinction between bootstrapping the cluster
    and adding / removing another member.

  * This more closely follows etcd's upstream bootstrapping
    guidelines.

  * The etcd role now serialises restarts internally so the
    kolla_serial pattern is no longer appropriate (or necessary).

This does not remove the need for manual intervention in all
failure modes: the documentation has been updated to address the
most common issues.

Note that there's repetition in the container specifications: this
is somewhat deliberate. In a future cleanup, it's intended to reduce
the duplication.

Change-Id: I39829ba0c5894f8e549f9b83b416e6db4fafd96f
2023-11-28 18:43:56 +01:00
James Kirsch 5581a28253 Add support for LetsEncrypt-managed certs
Add support for automatic provisioning and renewal of HTTPS
certificates via LetsEncrypt.

Spec is available at:
https://etherpad.opendev.org/p/kolla-ansible-letsencrypt-https

Depends-On: https://review.opendev.org/c/openstack/kolla/+/887347
Co-Authored-By: Michal Arbet <michal.arbet@ultimum.io>
Implements: blueprint letsencrypt-https
Change-Id: I35317ea0343f0db74ddc0e587862e95408e9e106
2023-11-07 10:59:51 +01:00
Mark Goddard 6c037790f2 Refactor MariaDB and RabbitMQ restart procedure
Ansible 2.14.3 introduced a change that broke the method used for
restarting MariaDB and RabbitMQ serially [1][2]. In
I57425680a4cdbf0daeb9b2cc35920f1b933aa4a8 we limited to 2.14.2 to work
around this. Ansible upstream claim this behaviour was unintentional,
and will not fix it.

This change moves to a different approach where we use separate plays
with a 'serial' keyword to execute the restart.
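
To illustrate what the 'serial' keyword does conceptually, here is a small Python sketch that splits a host list into batches processed one after another. The host names and the `restart` callback are hypothetical, and the sketch models only an integer batch size, not the kolla-ansible plays themselves.

```python
def serial_batches(hosts, serial=1):
    """Yield consecutive batches of at most `serial` hosts."""
    if serial < 1:
        raise ValueError("serial must be a positive integer")
    for start in range(0, len(hosts), serial):
        yield hosts[start:start + serial]


def restart_serially(hosts, restart, serial=1):
    """Restart hosts batch by batch; a batch finishes before the next starts."""
    order = []
    for batch in serial_batches(hosts, serial):
        for host in batch:
            restart(host)
            order.append(host)
    return order


# Hypothetical three-node cluster restarted one host at a time.
restarted = []
order = restart_serially(["db1", "db2", "db3"], restarted.append, serial=1)
```

With a batch size of 1, at most one cluster member is down at any moment, which is the property a serialised restart is after.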

This change also removes the restriction on the maximum supported
version of 2.14.2 on ansible-core - any 2.14 release is now supported.

[1] 65366f663d
[2] https://github.com/ansible/ansible/issues/80848

Depends-On: https://review.opendev.org/c/openstack/kolla/+/884208

Change-Id: I5a12670d07077d24047aaff57ce8d33ccf7156ff
2023-06-17 21:02:49 +00:00
Zuul 0a128d24b9 Merge "Put etcd behind HTTP loadbalancer" 2023-02-14 11:31:09 +00:00
Will Szumski 6f536a4f71 Put etcd behind HTTP loadbalancer
etcd-compatible tooz drivers do not support multiple endpoints via
backend_url. We can put a loadbalancer in front of etcd and configure
backend_url to use the VIP instead. The issue with hard coding the first
host is that we break coordination if we take this host offline. In the
case of cinder, we would not be able to perform any volume related
operations.

Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Change-Id: Ib684501ba03c386dc5ac71e5cbea05c99f191665
2023-02-13 11:45:53 +00:00
yangshaoxue 113b77c8cb Add skyline service
Support deploying Skyline with kolla-ansible.

Implements: blueprint skyline
Depends-On: https://review.opendev.org/c/openstack/kolla/+/826948

Change-Id: Ice5621491a432ba32138abd6f62d1f815cc219e0
2023-01-31 13:47:18 +08:00
Michal Nasiadka 673ca8c7e7 Drop skydive
Change-Id: I8855bd60c2fd77f33fb55d4123131a94327bd166
2023-01-05 14:55:53 +01:00
Michal Nasiadka 3a94996b41 ovn: Change order of deployment
ovn-controller should be deployed first according to the OVN upgrade guide.
Since we are getting newer OVN/OVS versions from RDO/Ubuntu in a cycle,
let's apply that to deployment.

Closes-Bug: #1979329

Change-Id: I017aec611a057db1634cfc2634164b21cb210193
2022-12-22 09:50:40 +01:00
Michal Nasiadka f128d19957 Remove kafka, storm, zookeeper
Their cleanup has been added to monasca cleanup command.

Change-Id: I19a846e2683ae70b33ca64d2aba7ac71eb724588
2022-12-08 06:50:15 +00:00
Zuul 113242c864 Merge "Replace ElasticSearch and Kibana with OpenSearch" 2022-12-01 14:38:51 +00:00
Michal Nasiadka e1ec02eddf Replace ElasticSearch and Kibana with OpenSearch
This change replaces ElasticSearch with OpenSearch, and Kibana
with OpenSearch Dashboards. It migrates the data from ElasticSearch
to OpenSearch upon upgrade.

No TLS support is in this patch (will be a followup).

A replacement for ElasticSearch Curator will be added as a followup.

Depends-On: https://review.opendev.org/c/openstack/kolla/+/830373

Co-authored-by: Doug Szumski <doug@stackhpc.com>
Co-authored-by: Kyle Dean <kyle@stackhpc.com>
Change-Id: Iab10ce7ea5d5f21a40b1f99b28e3290b7e9ce895
2022-12-01 10:27:50 +00:00
Michal Nasiadka 63a7968d8d ovn: Break out role into ovn-db and ovn-controller roles
Instead of handling everything in one role, let's have small
fit-for-purpose roles, because in reality these are two host
roles and performance should be better with this approach.

[1]: https://docs.ovn.org/en/latest/intro/install/ovn-upgrades.html

Change-Id: I8f9dbe9d950323f16375ad5e1dbaedfb1be6585f
2022-11-28 13:52:30 +01:00
Doug Szumski adb8f89a36 Remove support for deploying OpenStack Monasca
Kolla Ansible is switching to OpenSearch and is dropping support for
deploying ElasticSearch. This is because the final OSS release of
ElasticSearch has exceeded its end of life.

Monasca is affected because it uses both Logstash and ElasticSearch.
Whilst it may continue to work with OpenSearch, Logstash remains an
issue.

In the absence of any renewed interest in the project, we remove
support for deploying it. This helps to reduce the complexity
of log processing configuration in Kolla Ansible, freeing up
development time.

Change-Id: I6fc7842bcda18e417a3fd21c11e28979a470f1cf
2022-11-11 15:48:11 +00:00
LinPeiWen 322e288368 Performance: site.yml remove redundant 'when'
The group key defined from facts already decides whether a role is included,
so remove the redundant 'when' statements to speed up execution.

Partially-Implements: blueprint performance-improvements
Change-Id: If22255f1adc07ab16b46f8ad1280efdf7d713d28
2022-04-25 18:40:55 +08:00
Zuul 4601fdbabd Merge "drop qdrouterd support" 2022-04-11 11:52:36 +00:00
Marcin Juszkiewicz b540717387 drop qdrouterd support
Change-Id: I562fa187094f212003d0b17d20675f771cf082e6
2022-04-08 17:21:33 +02:00
Radosław Piliszek e8025b3cb8 Ironic: rename containers
Change-Id: I8e4096d7136d0ce9e54f1af0bb9ba110487fb35b
2022-04-06 08:51:05 +00:00
Michal Nasiadka 8fe9872031 monasca: Remove monasca-grafana leftovers
In Xena [1] we removed the Monasca Grafana service, but some components were left
to support cleanup operations.

[1]: https://review.opendev.org/c/openstack/kolla-ansible/+/788228

Change-Id: Iccc7bc3628bb7cbab1ac28f41c7b7dc7695894c6
2022-03-23 07:04:57 +00:00
jinyuanliu 3ccb176f13 ADD venus for kolla-ansible
This project [1] provides a one-stop solution for log collection,
cleaning, indexing, analysis, alarms, visualization, report generation
and related needs. It helps operators and maintainers quickly retrieve
logs and solve problems, grasp the operational health of the platform,
and improve the level of platform management.

[1] https://wiki.openstack.org/wiki/Venus

Change-Id: If3562bbed6181002b76831bab54f863041c5a885
2022-03-17 20:35:08 +08:00
Zuul 63706667e1 Merge "Add support for deploying Prometheus libvirt exporter" 2022-02-21 21:35:55 +00:00
LinPeiWen 1f3dcce5ac Support enable/disable rabbitmq prometheus plugins
RabbitMQ has built-in Prometheus support starting from 3.8.0, and the
Prometheus plugins are enabled by default. When enable_prometheus is
set to "no", the rabbitmq role now disables the Prometheus plugins.

Closes-Bug: #1885106

Change-Id: I4d694d6224c813285d228d6bc7eece5731db1078
2022-01-09 09:50:00 +00:00
Doug Szumski 491d418476 Add support for deploying Prometheus libvirt exporter
Add support for deploying the Kolla Prometheus libvirt exporter image to
facilitate gathering metrics from the Nova libvirt service.

Co-Authored-by: Dr. Jens Harbott <harbott@osism.tech>
Change-Id: Ib27e60c39297b86ae674297370f9543ab08cda05
Partially-Implements: blueprint libvirt-exporter
2022-01-05 13:30:45 +01:00
Radosław Piliszek 0cbdedd0a3 Drop vmtp
Details in the attached reno.

Change-Id: I438a453ca522493524fdb9760c1edb330916084b
2021-12-21 07:29:32 +00:00
Doug Szumski 5b06115be8 Finish removing Monasca Log Transformer
This service was disabled in the Wallaby release and all
references to it can now be removed.

Change-Id: I482640dd63959143732d86fcffb320cc94611247
2021-11-15 10:40:21 +00:00
wu.chunyang 1f71df1a8b Remove chrony role from kolla
chrony is not supported in the Xena cycle, so remove it from kolla.

Moved tasks from chrony role to chrony-cleanup.yml playbook to avoid a
vestigial chrony role.

Co-Authored-By: Mark Goddard <mark@stackhpc.com>

Change-Id: I5a730d55afb49d517c85aeb9208188c81e2c84cf
2021-09-30 18:56:14 +02:00
Mark Goddard 8c5012e940 Add support for Ceph RadosGW integration
* Register Swift-compatible endpoints in Keystone
* Load balance across RadosGW API servers using HAProxy

The support is exercised in the cephadm CI jobs, but since RGW is
not currently enabled via cephadm, it is not yet tested.

https://docs.ceph.com/en/latest/radosgw/keystone/

Implements: blueprint ceph-rgw

Change-Id: I891c3ed4ed93512607afe65a42dd99596fd4dbf9
2021-09-30 13:08:13 +00:00
Michal Arbet f0241f807f Remove haproxy,keepalived groups
Haproxy was renamed in [1].

[1] https://review.opendev.org/c/openstack/kolla-ansible/+/770618

Change-Id: Ib2d7f0774fede570a8c4c315d83afd420c31da0b
2021-09-16 13:41:13 +02:00
Michal Arbet ffd53512af Rename role haproxy to loadbalancer
For now the haproxy role maintains both haproxy
and keepalived. Follow-up changes also add
proxysql.

This patch *only* renames/moves things to the more
fitting loadbalancer role, and also moves specific
templates to a subdirectory.

This was done only to make the diffs in follow-up
changes easier to read.

Change-Id: I1d39d5bcaefc4016983bf267a2736b742cc3a555
2021-08-19 21:20:33 +02:00
wu.chunyang 5261998467 Remove tempest role
Remove tempest role as planned

Change-Id: If3cf073e88c83f670c867a49afe48845f9e81008
2021-07-07 21:58:39 +08:00
wu.chunyang 3009109616 Remove rally deployment
Remove rally role as planned

Change-Id: Ic898efe42b21b01c45d4621af2cf90ecd7afc398
2021-06-16 09:12:34 +08:00
Matthias Runge ccf8cc5dca Remove support for panko
The project is deprecated and in the process of being removed
from OpenStack upstream.

Change-Id: I9d5ebed293a5fb25f4cd7daa473df152440e8b50
2021-06-11 18:00:05 +02:00
Mark Goddard db517a44e4 masakari: support host monitor
Change-Id: I3f43df7766c57622ab8d01a759fbeeef0a0c2b93
Implements: blueprint masakari-hostmonitor
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2021-04-08 16:39:47 +00:00
Gaëtan Trellu 9f578c85e0 Add HAcluster Ansible role
Adds HAcluster Ansible role. This role contains High Availability
clustering solution composed of Corosync, Pacemaker and Pacemaker Remote.

HAcluster is added as a helper role for Masakari, which requires it for
its host monitoring, allowing it to provide HA to instances on a failed
compute host.

Kolla hacluster images merged in [1].

[1] https://review.opendev.org/#/c/668765/

Change-Id: I91e5c1840ace8f567daf462c4eb3ec1f0c503823
Implements: blueprint ansible-pacemaker-support
Co-Authored-By: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
2021-04-08 06:39:19 +00:00
Zuul c2ff7d74c0 Merge "Register Elasticsearch in Keystone" 2021-03-26 09:54:58 +00:00
Michal Arbet 209dc1e9dc Set changed_when to false for group_by tasks
This trivial patch turns off Ansible's
'changed' reporting for group_by tasks, as it could
be confusing for users.

Change-Id: I7512af573782359a6f01290a55291ac7eb0de867
2021-03-13 13:59:23 +00:00
Doug Szumski 9e668902c2 Register Elasticsearch in Keystone
This makes it possible for services to fetch the Elasticsearch endpoint
from Keystone. It is useful for both operators and Monasca Tempest.

Change-Id: Id60298582496a8959e82b970676669ca17e2e9d4
2021-02-23 10:22:50 +00:00
Kendall Nelson 25b9de91a2 Remove Retired Karbor Support
As announced on the openstack-discuss ML[1], Karbor is retiring
this cycle (Wallaby).

Needed-By: https://review.opendev.org/c/openstack/karbor/+/767032

[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-November/018643.html

Change-Id: I222cf302e507f6a9de0347c79ec536aa7be22bb6
2020-12-22 09:50:49 +00:00
Zuul f30cf26271 Merge "Remove retired Searchlight support" 2020-12-19 03:36:07 +00:00
Ghanshyam Mann c7386a8168 Remove retired Searchlight support
Searchlight project is retiring in Wallaby cycle[1].
This commit removes the ansible roles of Searchlight project
before its code is removed.

Needed-By: https://review.opendev.org/c/openstack/searchlight/+/764526

[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-November/018637.html

Change-Id: I85aab66376ea4f1376c2705066ba3c7e5645644f
2020-12-15 18:37:34 -06:00
Ghanshyam Mann dafde93fe2 Remove retired Qinling support
Qinling project is retiring in Wallaby cycle[1].
This commit removes the ansible roles of Qinling project
before its code is removed.

Needed-By: https://review.opendev.org/c/openstack/qinling/+/764521

[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-November/018638.html

Change-Id: I6543bacff638b1649511f7e779807954c34ef570
2020-12-15 18:35:09 -06:00
zhoulinhui a637d6c67d Add the missing hosts for vitrage
refer to https://github.com/openstack/kolla-ansible/blob/master/ansible/inventory/multinode#L728

Change-Id: Ib6cd78cb2058a35f15b5affb98e0f63805b0edf3
2020-08-21 15:20:39 +00:00
Mark Goddard 9bca246b10 Fix play hosts for ironic, monasca, neutron, nova
Some plays were not applied to all groups referenced by the services
they deploy. In most cases this works fine, but if the default inventory
is modified this may cause problems where containers are not deployed to
hosts in the missing groups, if they are not a member of other groups
that the play is targeted to.

This change syncs up the play hosts for all services.
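
The problem described above can be pictured with plain sets. The sketch below uses a hypothetical inventory and illustrative group names; it finds hosts that belong to a group referenced by a service but to none of the groups the play targets.

```python
def hosts_in(inventory, groups):
    """Union of the hosts in the given inventory groups."""
    hosts = set()
    for group in groups:
        hosts |= set(inventory.get(group, ()))
    return hosts


def missed_hosts(inventory, play_groups, service_groups):
    """Hosts a service expects to run on that the play never targets."""
    return hosts_in(inventory, service_groups) - hosts_in(inventory, play_groups)


# Hypothetical modified inventory: "net2" is only in a group the play omits.
inventory = {
    "neutron-server": ["ctl1"],
    "neutron-dhcp-agent": ["ctl1", "net2"],
}
service_groups = ["neutron-server", "neutron-dhcp-agent"]
missed = missed_hosts(inventory, ["neutron-server"], service_groups)
synced = missed_hosts(inventory, service_groups, service_groups)
```

Once the play lists every group the service references, the set of missed hosts is empty regardless of how the inventory is arranged.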

Closes-Bug: #1889387

Change-Id: I6b92d8e53a29b06a065e0611840140d09c8a6695
2020-08-03 09:50:59 +01:00
Radosław Piliszek fffe9021ff Drop a no-longer-relevant note
Modern Ansible handles this just fine.

Change-Id: Iea4d0499b92e2449ef8bc01651af6d3548ceab20
2020-07-27 17:34:54 +02:00
Christian Berendt 6eb02245d6 Remove Hyper-V integration
Change-Id: I2e22ec47f644de2f1509a0111c9e1fffe8da0a1a
2020-07-27 10:25:46 +01:00
Mark Goddard 56ae2db7ac Performance: Run common role in a separate play
The common role was previously added as a dependency to all other roles.
It would set a fact after running on a host to avoid running twice. This
had the nice effect that deploying any service would automatically pull
in the common services for that host. When using tags, any services with
matching tags would also run the common role. This could be both
surprising and sometimes useful.

When using Ansible at large scale, there is a penalty associated with
executing a task against a large number of hosts, even if it is skipped.
The common role introduces some overhead, just in determining that it
has already run.

This change extracts the common role into a separate play, and removes
the dependency on it from all other roles. New groups have been added
for cron, fluentd, and kolla-toolbox, similar to other services. This
changes the behaviour in the following ways:

* The common role is now run for all hosts at the beginning, rather than
  prior to their first enabled service
* Hosts must be in the necessary group for each of the common services
  in order to have that service deployed. This is mostly to avoid
  deploying on localhost or the deployment host
* If tags are specified for another service e.g. nova, the common role
  will *not* automatically run for matching hosts. The common tag must
  be specified explicitly

The last of these is probably the largest behaviour change. While it
would be possible to determine which hosts should automatically run the
common role, it would be quite complex, and would introduce some
overhead that would probably negate the benefit of splitting out the
common role.
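
The tag-related behaviour change can be pictured with a toy model of tag selection. This is a deliberately simplified sketch (real Ansible tag handling has more rules, and the play list here is hypothetical): a play runs if no tags are requested or if it carries one of the requested tags.

```python
def selected_plays(plays, requested_tags=()):
    """Return the names of plays that would run for the requested tags."""
    requested = set(requested_tags)
    if not requested:
        # No tag filter: every play runs.
        return [name for name, _ in plays]
    return [name for name, tags in plays if requested & set(tags)]


# Hypothetical play list after the common role became its own play.
plays = [("common", {"common"}), ("nova", {"nova"})]
only_nova = selected_plays(plays, ["nova"])
with_common = selected_plays(plays, ["common", "nova"])
```

In this model, requesting only the nova tag skips the common play, so the common tag has to be listed explicitly to get both.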

Partially-Implements: blueprint performance-improvements

Change-Id: I6a4676bf6efeebc61383ec7a406db07c7a868b2a
2020-07-07 15:00:47 +00:00
gugug dc56401b42 Use the children group for site.yml
1. Use the children group for site.yml
2. Add some missing groups

Change-Id: I01d686368b11a105a8965cf987d23772ecbf97de
2020-07-05 22:56:17 +08:00
Mark Goddard 76c3f05680 Skip storm play when not enabled
Minor scalability improvement, not currently applied to storm.

Change-Id: I928d362067c52c3113bc0fbd3ae4b9be1810b7e5
TrivialFix
2020-06-26 14:42:54 +01:00
gugug f13847a5a2 Remove the congress roles since it has been retired
more info: https://review.opendev.org/#/c/721733/

Depends-On: I561ead226f714d98c8e06e6027715a64c3a8e47e
Depends-On: I21c9ab9820f78cf76adf11c5f0591c60f76372a8
Change-Id: Ic740d090211ee331b374a6dac69dfde466df7200
Co-Authored-By: jacky06 <zhang.min@99cloud.net>
2020-06-20 01:51:03 +00:00