898 Commits

Author SHA1 Message Date
Sven Kieske c508479f44 CI: fix check-failure.sh sudo missing
Without sudo, the script itself produces errors, e.g.:

```
 for container in $failed_containers
+ docker inspect prometheus_openstack_exporter
[]
permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock: Get "http://%2Fvar%2Frun%2Fdocker.sock/v1.45/containers/prometheus_openstack_exporter/json": dial unix /var/run/docker.sock: connect: permission denied
```

Signed-off-by: Sven Kieske <kieske@osism.tech>
Change-Id: I280e2660d187d966098ae39df6392503b1aa5bdf
2024-03-22 11:54:14 +01:00
Zuul 439e7fa177 Merge "Revert "Pin zun jobs to Docker 20"" 2024-03-19 15:32:33 +00:00
Zuul ccc768414c Merge "CI: Increase galera node timeouts" 2024-03-18 09:22:52 +00:00
Michal Nasiadka 23e961e8b3 CI: Increase galera node timeouts
Some multinode jobs have been failing due to galera suspecting inactive
nodes. Tweak the timeouts as per [1].

[1]: https://galeracluster.com/library/documentation/recovery.html

Change-Id: I5577ea2c23d6dbd440bd52899a30ea9531996256
2024-03-15 19:20:55 +00:00
Zuul f65e4257dd Merge "CI: Use 2023.2 image for rabbitmq on ipv6 scenario" 2024-03-14 16:31:32 +00:00
Michal Nasiadka b04486df07 Bump ansible-core versions to 2.15 and 2.16
Change-Id: Iab40eb92c7e4a9092471bef9d4477a4fa34f1c85
2024-03-14 06:13:38 +00:00
Michal Nasiadka 0ec71d87cd CI: Use 2023.2 image for rabbitmq on ipv6 scenario
Currently RMQ 3.13 fails on the ipv6 multinode scenario, so use 3.12 from
2023.2 until [1] gets resolved.

[1]: https://github.com/rabbitmq/rabbitmq-server/issues/10728

Change-Id: If11710e99cf2e340e558d68e2071c1bb16825e55
2024-03-13 16:22:32 +00:00
Michal Nasiadka a88ebd77b0 CI: Replace etcd with redis in GATE_IMAGES for cephadm scenario
We replaced etcd with redis in that scenario, but GATE_IMAGES
were not updated.

Change-Id: Ie9d6642f8ce51bc2a35b800c6c149153c14378db
2024-03-05 16:49:45 +01:00
Zuul 311fd881e4 Merge "Template system scoped admin-openrc and clouds.yml files" 2024-02-19 12:40:06 +00:00
Bartosz Bezak 6e835ae758 Template system scoped admin-openrc and clouds.yml files
Ironic enabled secure RBAC with system scoped enforcement [1].

Some API calls, for instance 'baremetal:driver:get', need a system
scoped role by design [2], even with the elevated-access project scoped
service role [3].

[1] https://review.opendev.org/c/openstack/ironic/+/902009
[2] 8ec5606622/ironic/common/policy.py (L1349-L1357)
[3] https://review.opendev.org/c/openstack/kolla-ansible/+/908007

Related-Bug: #2051837

Change-Id: Id6313d7dd343b82d4c9ccf7bf429d340ea0e93d1
2024-02-15 15:01:59 +00:00
Doug Szumski afa202e259 CI: Fix prometheus-opensearch-upgrade CI job
The upgrade job needs the haproxy exporter group, which
was missing from the inventory.

Change-Id: Ie4ecf283a2f4ac056ace5e76f2acc4ba1a8fe0b4
2024-02-15 10:59:34 +00:00
Michal Nasiadka 63cf525af5 CI: Increase RADOS timeout for cephadm jobs
Default timeout is 5 and we're often hitting that on our poor man's
Ceph.

Change-Id: Ide92b3c32150c0045b0723155f94b21ea9cdce66
2024-02-14 10:02:35 +00:00
Michal Nasiadka fe155496e1 CI: Switch cephadm jobs to redis
etcd is flaky and complains about slow disks.

Change-Id: I1f5191015b53bdb218cfeaa43586ecf2d71a161e
2024-02-13 12:46:23 +01:00
Zuul 07bbf1707f Merge "[CI] Enable testing horizon" 2024-02-09 13:03:12 +00:00
Zuul 23909f1b9e Merge "CI: Run SLURP upgrade job" 2024-02-09 10:41:54 +00:00
Michal Nasiadka 09fb029569 CI: Run SLURP upgrade job
Change-Id: I246b14c9b547c6a0ff0be68ad57e723839cc3275
2024-02-08 13:13:35 +00:00
Michal Arbet 05462c471c [CI] Enable testing horizon
Change Ib7f72b2663199ef80844a412bc436c6ef09322cc
disabled horizon testing. This patch enables
horizon tests again.

Change-Id: Iff670525c91c8adbcf2a01288b12456cb4a31809
2024-02-07 16:13:27 +01:00
Zuul 074d8b0ebf Merge "Enable HAProxy Prometheus metrics endpoint" 2024-02-07 10:33:24 +00:00
Michal Arbet f0b7bf33ab [CI] Test neutron DNS integration and designate
This patch adds tests for neutron and designate DNS
integration.

Tests are based on scenarios described below in [1][2].

[1] https://docs.openstack.org/neutron/latest/admin/config-dns-int.html
[2] https://docs.openstack.org/neutron/latest/admin/config-dns-int-ext-serv.html

Change-Id: I3953f760458285e5c9a818599492c6176e857dde
2024-01-30 09:56:46 +01:00
Michal Arbet 2624e93852 [CI] Fix podman cross-dependency build
Change-Id: I3501e6bf17ccb94adfcdb62956dceba9d67b0881
2024-01-26 13:23:59 +01:00
Michal Arbet 47ddac4131 Bump ansible-lint version
The version that we were capping to is no longer compatible with latest
upper-constraints.txt, so let us free float again.

The resulting linting errors are included for now to unblock the gate,
these will still need to be discussed or fixed later.

NOTE(kevko): Temporarily disabling horizon deployment, as it's not
possible to unblock gates without it

Co-Authored-By: Michal Arbet <michal.arbet@ultimum.io>
Change-Id: Ib7f72b2663199ef80844a412bc436c6ef09322cc
2024-01-22 22:49:46 +01:00
hongbin 9c77220f6e Revert "Pin zun jobs to Docker 20"
This reverts commit 94a74f58c7.

Reason for revert: Zun/Kuryr-libnetwork has switched to "local" scope which doesn't require docker 20 anymore. It should work for latest docker version. Related patches:
https://review.opendev.org/c/openstack/zun/+/903884

Change-Id: Ieb545ae5a5917322f599728587c3f04ea8356126
2024-01-22 12:24:37 +00:00
Bartosz Bezak 1d38ff5e9c use docker_custom_config override for Kolla CI upgrade jobs
In Kolla CI, the K-A upgrade job needs a docker_custom_config override
because the docker_registry var is used both for the docker daemon
config (for kolla image builds) and as the source of kolla-ansible
container images (where we use the quay.io mirror).
docker_custom_config takes precedence in the docker daemon
configuration.

The docker_custom_config override was removed in [1].

[1] https://review.opendev.org/c/openstack/kolla-ansible/+/904067

Change-Id: I1e890223faf25b1169a49e22a9529f90806d2f3a
2024-01-17 13:37:28 +00:00
Zuul 3490b0f14e Merge "Test haproxy single external frontend" 2024-01-12 21:06:10 +00:00
Zuul aac86a9248 Merge "CI: Rework docker config vars" 2024-01-12 14:50:39 +00:00
Pierre Riteau f86ed0270f CI: Test Nova server resize functionality
This adds an extra resize operation to core OpenStack tests. This should
be fast since we are only increasing the number of cores of the VM and
could help catch additional errors in CI tests.

Change-Id: Ia61b995dbffcda4f1e6494548df457231cb67bd7
2024-01-08 22:15:04 +00:00
Zuul 1538092522 Merge "CI: Use ControlPersist and ControlMaster" 2024-01-08 11:49:02 +00:00
Dawud 140722f74e Enable HAProxy Prometheus metrics endpoint
HAProxy exposes a Prometheus metrics endpoint, it just needs to be
enabled. Enable this and remove configuration for
prometheus-haproxy-exporter. Remaining prometheus-haproxy-exporter
containers will automatically be removed.

Change-Id: If6e75691d2a996b06a9b95cb0aae772db54389fb
Co-Authored-By: Matt Anson <matta@stackhpc.com>
2024-01-05 10:36:31 +00:00
Michal Nasiadka 9bc99b9434 Test haproxy single external frontend
Change-Id: Id25b4407a8170f69e4cd7278e0aff64c609ace7d
2024-01-03 08:31:14 +00:00
Michal Nasiadka 85e6432630 CI: Rework docker config vars
Change-Id: I552fea9f9b461e57611f1d2aa5c767a1f4043ff8
2023-12-20 15:40:10 +00:00
Michal Nasiadka 2cc21b0e63 Drop redundant note in globals-default.j2
Change-Id: I4d09018f4e921e90cbe7457c1f7fb025ef3acfa8
2023-12-20 07:24:53 +01:00
Zuul 7fc76fb4e9 Merge "CI: Move openstack clients setup to a role" 2023-12-14 21:01:51 +00:00
Michal Nasiadka 56cbf8c031 CI: Move openstack clients setup to a role
Change-Id: Id5c53b63e88502c999b89cbc62405bb8953fef3a
2023-12-14 16:38:26 +01:00
Zuul 17a76d2a0e Merge "Add precheck for RabbitMQ quorum queues" 2023-12-14 14:54:40 +00:00
Matt Crees 61f84e3beb Add precheck for RabbitMQ quorum queues
Adds a precheck to fail if non-quorum queues are found in RabbitMQ.

Currently excludes fanout and reply queues, pending support in
oslo.messaging [1].

[1]: https://review.opendev.org/c/openstack/oslo.messaging/+/888479

Closes-Bug: #2045887
Change-Id: Ibafdcd58618d97251a3405ef9332022d4d930e2b
2023-12-13 14:49:05 +00:00
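The check described in commit 61f84e3beb above can be pictured with a small sketch. This is not the actual kolla-ansible precheck: the function name, the input shape (pairs of queue name and queue type) and the name patterns used to recognise reply and fanout queues are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple


def is_transient(name: str) -> bool:
    """Guess whether a queue is a reply or fanout queue.

    Assumption: reply queues start with "reply_" and fanout queues
    contain "_fanout" in their name. The real precheck may match differently.
    """
    return name.startswith("reply_") or "_fanout" in name


def non_quorum_queues(queues: Iterable[Tuple[str, str]]) -> List[str]:
    """Return names of queues that are neither quorum queues nor excluded."""
    offenders = []
    for name, queue_type in queues:
        if is_transient(name):
            # Excluded for now, pending quorum support for these queues.
            continue
        if queue_type != "quorum":
            offenders.append(name)
    return offenders


# Hypothetical listing of (name, type) pairs.
listing = [
    ("conductor", "quorum"),
    ("scheduler", "classic"),
    ("reply_abc123", "classic"),
    ("scheduler_fanout_def456", "classic"),
]
offenders = non_quorum_queues(listing)
if offenders:
    print("Non-quorum queues found:", ", ".join(offenders))
```

A precheck built this way would fail the run whenever the returned list is non-empty.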
Michal Nasiadka 70ebf95c2b CI: Pin docker to <7 in setup_gate.sh
Change-Id: I46c05a54171cbf43a51594998561db94af7d17e6
2023-12-13 14:05:28 +00:00
Zuul 7a29abb590 Merge "CI: Move test-ovn before test-core-openstack" 2023-12-07 16:30:17 +00:00
Michal Nasiadka 4a3d6b6e51 CI: Add SCENARIO env var to upgrade.sh
Change-Id: I373f6d13809674c521155ca51962785a8b1ac598
2023-12-06 14:59:23 +01:00
Michal Nasiadka 6160c232b1 CI: Use ControlPersist and ControlMaster
Similar to [1].

[1]: https://review.opendev.org/c/openstack/openstack-ansible/+/851426

Change-Id: I254f71d607353e0cf4d3d5ebafd6813287c4fa9f
2023-12-05 17:24:26 +01:00
Zuul 21236c2e90 Merge "CI: Install rich on depends-on podman builds" 2023-11-30 18:58:51 +00:00
Michal Nasiadka 6fb1220b65 CI: Install rich on depends-on podman builds
Change-Id: I54f94c383ae5a1185b364495422e1ab79cbd1afb
2023-11-30 15:29:43 +01:00
Sven Kieske 64575519aa enable quorum queues
This implements a global toggle `om_enable_rabbitmq_quorum_queues`
to enable quorum queues for each service in RabbitMQ, similar to
what was done for HA[0].

Quorum Queues are enabled by default.

Quorum queues are more reliable, safer, simpler and faster than
replicated mirrored classic queues[1].

Mirrored classic queues are deprecated and scheduled for removal
in RabbitMQ 4.0[2].

Note that we do not need a new policy in the RabbitMQ definitions
template, because quorum queues are enabled on the client side and
can't be set using a policy[3].

Note also that quorum queues are not yet enabled in oslo.messaging
for reply_ and fanout_ queues (transient queues).
This will change once [4] is merged.

[0]: https://review.opendev.org/c/openstack/kolla-ansible/+/867771
[1]: https://www.rabbitmq.com/quorum-queues.html
[2]: https://blog.rabbitmq.com/posts/2021/08/4.0-deprecation-announcements/
[3]: https://www.rabbitmq.com/quorum-queues.html#declaring
[4]: https://review.opendev.org/c/openstack/oslo.messaging/+/888479

Signed-off-by: Sven Kieske <kieske@osism.tech>
Change-Id: I6c033d460a5c9b93c346e9e47e93b159d3c27830
2023-11-30 13:53:00 +00:00
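Commit 64575519aa above notes that quorum queues are selected by the client rather than by a policy. In RabbitMQ this is done by passing the `x-queue-type` argument when the queue is declared, and quorum queues must be durable. The sketch below only builds the declaration parameters a client might send; the helper name and the returned dictionary layout are hypothetical and do not reflect how oslo.messaging is implemented.

```python
from typing import Any, Dict


def queue_declaration(name: str, enable_quorum: bool = True) -> Dict[str, Any]:
    """Build illustrative parameters for declaring a queue.

    enable_quorum stands in for the om_enable_rabbitmq_quorum_queues toggle.
    """
    arguments: Dict[str, str] = {}
    if enable_quorum:
        # The queue type is fixed at declaration time via this argument.
        arguments["x-queue-type"] = "quorum"
    return {
        "queue": name,
        # Quorum queues must be durable. Treating every other queue as
        # non-durable is a simplification made for this sketch only.
        "durable": enable_quorum,
        "arguments": arguments,
    }


print(queue_declaration("conductor"))
print(queue_declaration("conductor", enable_quorum=False))
```

Because the type is part of the declaration, an existing classic queue is not converted by flipping the toggle; it has to be deleted and declared again.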
Michal Nasiadka ba54f8cdda CI: Fail on fluentd log parsing errors
Change-Id: Ie3963f5ed20f7fb61ef2e03f0cf12a4ea1c87c9c
2023-11-29 17:42:44 +00:00
Dr. Jens Harbott 0b1a59dd8c podman: install "rich" dependency
This dependency was added to podman-py in version 4.8.0, but not added
properly to their requirements [0]. Install it explicitly for our tox
and integration testing as a workaround.

[0] https://github.com/containers/podman-py/issues/350

Change-Id: I61a5fdfc4e505f2577185f0c0f1297cf2709be2c
2023-11-29 17:04:06 +00:00
Zuul 29e1827bd1 Merge "CI: Add oslo_db.exception.DBConnectionError to check-logs.sh" 2023-11-29 14:15:59 +00:00
Zuul e971d0c795 Merge "etcd: Add support for more scenarios" 2023-11-29 11:13:18 +00:00
Michal Nasiadka 6e9e66b892 CI: Add oslo_db.exception.DBConnectionError to check-logs.sh
Change-Id: Ia1de6d9452e2c900169e9b4ccb7dfc1280283909
2023-11-29 10:04:44 +01:00
Jan Gutter ed3b27cc92 etcd: Add support for more scenarios
This commit addresses a few shortcomings in the etcd service:
  * Adding or removing etcd nodes required manual intervention.

  * The etcd service would have brief outages during upgrades or
    reconfigures because restarts weren't always serialised.

This makes the etcd service follow a similar pattern to mariadb:
  * There is now a distinction between bootstrapping the cluster
    and adding / removing another member.

  * This more closely follows etcd's upstream bootstrapping
    guidelines.

  * The etcd role now serialises restarts internally so the
    kolla_serial pattern is no longer appropriate (or necessary).

This does not remove the need for manual intervention in all
failure modes: the documentation has been updated to address the
most common issues.

Note that there's repetition in the container specifications: this
is somewhat deliberate. In a future cleanup, it's intended to reduce
the duplication.

Change-Id: I39829ba0c5894f8e549f9b83b416e6db4fafd96f
2023-11-28 18:43:56 +01:00
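The distinction drawn in commit ed3b27cc92 above, between bootstrapping a cluster and changing its membership, can be sketched as a small planning function. etcd itself distinguishes the two cases through its initial cluster state setting (`new` when bootstrapping, `existing` when a member joins a running cluster). Everything else here, including the function name and the plan layout, is hypothetical and not taken from the kolla-ansible role.

```python
from typing import Dict, List, Set


def plan_etcd_changes(desired: Set[str], current: Set[str]) -> Dict[str, object]:
    """Compare desired and current members and describe what to do."""
    if not current:
        # No running cluster: every desired node bootstraps together.
        return {
            "initial_cluster_state": "new",
            "add": sorted(desired),
            "remove": [],
        }
    # A cluster already exists: new nodes join it, stale ones are removed.
    to_add: List[str] = sorted(desired - current)
    to_remove: List[str] = sorted(current - desired)
    return {
        "initial_cluster_state": "existing",
        "add": to_add,
        "remove": to_remove,
    }


print(plan_etcd_changes({"ctl1", "ctl2", "ctl3"}, set()))
print(plan_etcd_changes({"ctl1", "ctl2", "ctl4"}, {"ctl1", "ctl2", "ctl3"}))
```

Applying such a plan one member at a time fits the serialised restarts the commit describes.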
Zuul 9a0ac440df Merge "Revert "Enable RabbitMQ HA queues by default"" 2023-11-28 16:45:06 +00:00
Zuul db79eb0a55 Merge "Rename kolla_docker to kolla_container" 2023-11-28 12:06:09 +00:00