42 Commits

Author SHA1 Message Date
Michal Nasiadka 85e6432630 CI: Rework docker config vars
Change-Id: I552fea9f9b461e57611f1d2aa5c767a1f4043ff8
2023-12-20 15:40:10 +00:00
Michal Arbet 2c0f416e3a Fix podman logs
Change-Id: I7b1a5bef0cb1dc1dfad1c6c4ca486e6ead847f12
2023-10-12 07:29:10 +00:00
Martin Hiner 53e8b80ed3 Add container engine option to scripts
This patch adds a way to choose the container engine inside tool and test
scripts. This is in preparation for the introduction of Podman but still
leaves Docker as the default container engine.

Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I395d2bdb0dfb4b325b6ad197c8893c8a0f768324
2023-04-28 16:16:55 +02:00
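
As an illustration of the container engine option from 53e8b80ed3, here is a minimal sketch, not the code from the patch. The variable name `CONTAINER_ENGINE` and the function `select_engine` are hypothetical; the only details taken from the commit message are that an engine can be chosen and that Docker remains the default.

```shell
# Hypothetical sketch: CONTAINER_ENGINE and select_engine are illustrative
# names, not taken from the actual scripts.

# Print the container engine to use. An explicit argument wins, then the
# CONTAINER_ENGINE environment variable, and Docker is the fallback.
select_engine() {
    local engine="${1:-${CONTAINER_ENGINE:-docker}}"
    case "$engine" in
        docker|podman)
            printf '%s\n' "$engine"
            ;;
        *)
            printf 'unsupported container engine: %s\n' "$engine" >&2
            return 1
            ;;
    esac
}
```

A script could then call `engine=$(select_engine "$1") || exit 1` once and use `"$engine"` wherever it previously hard-coded `docker`.
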
Michal Nasiadka f8e1b8f47f CI: cephadm: copy cephadm log
Change-Id: I186651e2ad05a76a606444ee673b73e171456312
2023-02-16 15:40:32 +00:00
Zuul 25dcee46a8 Merge "CI: cephadm - stop copying keyrings" 2022-04-14 09:15:22 +00:00
Zuul 9a1bd06098 Merge "[CI] Collect info about environment before deploy" 2022-03-03 16:35:34 +00:00
Michal Nasiadka cf8098d3e4 CI: cephadm - stop copying keyrings
Change-Id: I9ce8727e0ed04a7193e3257ba5ee97954b2532b7
2022-02-28 12:15:40 +00:00
Zuul 599f82ad32 Merge "CI: Bump Ceph to Pacific" 2022-02-10 18:17:42 +00:00
Michal Nasiadka 496a3df95f CI: Bump Ceph to Pacific
Change-Id: I9c736a586a757b49170977c7f9cf2c4890557a33
2022-02-10 12:15:54 +00:00
Radosław Piliszek 654edefca3 [CI] Replace parted with lsblk
parted hangs waiting for user input (see examples below)
on Debian and Ubuntu nodes which have created a cinder
volume on lvm, causing POST_FAILURE of the entire CI job.
Zun (Cinder iSCSI LVM) jobs are affected.
parted seemingly tries to interpret contents of the created
volume and fails miserably.
Since there is no reason why we would need to see the output
of parted specifically, this patch is switching to use
lsblk to simply list visible block devices.
Along with the rest of the commands, this should be just
the right level of detail.
And we avoid having parted interpret internals of otherwise
opaque block devices.

Example issues:

Warning: Not all of the space available to
/dev/mapper/cinder--volumes-cinder--volumes--pool appears to be used, you can
fix the GPT to use all of the space (an extra 9732096 blocks) or continue with
the current setting?
Fix/Ignore?

Warning: Not all of the space available to
/dev/mapper/cinder--volumes-cinder--volumes--pool-tpool appears to be used, you
can fix the GPT to use all of the space (an extra 9732096 blocks) or continue
with the current setting?
Fix/Ignore?

Warning: Not all of the space available to
/dev/mapper/cinder--volumes-cinder--volumes--pool_tdata appears to be used, you
can fix the GPT to use all of the space (an extra 9732096 blocks) or continue
with the current setting?
Fix/Ignore?

Change-Id: I7beecf2dd6c49c8934722cf22efa74e920ecb060
2022-02-04 22:32:23 +01:00
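
The switch described in 654edefca3 can be sketched as below. This is an assumption-laden illustration rather than the patched script: the function name `dump_block_devices` and its output-file argument are hypothetical. It uses only a plain `lsblk` call and redirects stdin from `/dev/null` so that nothing can block on a prompt.

```shell
# Hypothetical helper; the name and the output file argument are assumptions.
# Write the list of visible block devices to a file without ever prompting.
dump_block_devices() {
    local out="$1"
    if command -v lsblk >/dev/null 2>&1; then
        # stdin comes from /dev/null so no command can wait for user input
        lsblk </dev/null >"$out" 2>&1 || echo "lsblk failed" >>"$out"
    else
        echo "lsblk not available" >"$out"
    fi
}
```

Unlike parted, lsblk gathers its listing from sysfs and the udev database, so it does not try to interpret a partition table inside the volume.
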
Dmitry Tantsur 10616fb19a Prepare tests for the Ironic combined service
In I51bf7226aea145dc7c8fd93d61caa233ca16c9c9 we are introducing a way to
run Ironic API and conductor in one process. In the Bifrost change
I9faecfe6ece6d3c35396e3378c1e3930a487e130 we are switching Bifrost to
it, which breaks the Kolla job.

This change makes get_logs.sh aware of the new service. Also drop
RabbitMQ since Bifrost hasn't supported it for a while already.

Change-Id: I30ac6bd4332dacbdce1f5e25bd6a97d2982b208e
2021-12-06 11:39:14 +01:00
Radosław Piliszek e6edec78e5 [CI] Collect info about environment before deploy
And also collect lsmod listing.

Both are useful for getting a clearer picture of the original
environment and the effect kolla ansible had on it.

Change-Id: I5d87cfd45e4369df40b8195124535e59d24700c3
2021-09-25 20:10:15 +00:00
Zuul 5c95fa32b7 Merge "[CI] Log dbus services" 2021-05-21 13:36:16 +00:00
Radosław Piliszek e8c4b2e1b2 [CI] Log dbus services
Change-Id: I1b113d23ca3267a801409383bf39cda5cbcbb4c3
2021-05-14 13:34:34 +00:00
Radosław Piliszek e548b5969d [CI] Save systemctl info
Change-Id: Ia08b0372110ee366f4c48c5ea3bc95db0edbbe31
2021-04-06 11:20:07 +00:00
Michał Nasiadka 191b46ef40 Reduce number of logs and disable ara HTML report
- Remove /var/log/kolla link to omit uploading the same logs twice
- Remove ARA HTML report (usually takes around 120MB) - can be easily
  generated from the sqlite db

Change-Id: I74cd6d1128689ab2c73f00ee08af3778d7d670a4
2021-03-10 15:16:05 +00:00
Michał Nasiadka 65a16a08e2 CI: Move from ceph-ansible to cephadm
Change-Id: I81a4f8f8b8faa7559740531bb16d8aec7fc23f9b
2021-03-02 17:49:12 +01:00
Hongbin Lu 91678f67af Zun: Add zun-cni-daemon to compute node
Zun has a new component "zun-cni-daemon" which should be
deployed on every compute node. It is basically an implementation
of CNI (Container Network Interface) that performs the neutron
port binding.

If users are using the capsule (pod) API, the recommended deployment
option is using "cri" as capsule driver. This is basically to use
a CRI runtime (i.e. CRI plugin for containerd) for supporting
capsules (pods). A CRI runtime needs a CNI plugin which is what
the "zun-cni-daemon" provides.

The configuration is based on the Zun installation guide [1].
It consists of the following steps:
* Configure the containerd daemon on the host. The "zun-compute"
  container will use grpc to communicate with this service.
* Install the "zun-cni" binary on the host. The containerd process
  will invoke this binary to call the CNI plugin.
* Run a "zun-cni-daemon" container. The "zun-cni" binary will
  communicate with this container via HTTP.

Relevant patches:
Blueprint: https://blueprints.launchpad.net/zun/+spec/add-support-cri-runtime
Install guide: https://review.opendev.org/#/c/707948/
Devstack plugin: https://review.opendev.org/#/c/705338/
Kolla image: https://review.opendev.org/#/c/708273/

[1] https://docs.openstack.org/zun/latest/install/index.html

Depends-On: https://review.opendev.org/#/c/721044/
Change-Id: I9c361a99b355af27907cf80f5c88d97191193495
2020-04-30 02:22:20 +00:00
Michal Nasiadka 4e6fe7a6da Remove kolla-ceph
Kolla-Ansible Ceph deployment mechanism has been deprecated in Train [1].

This change removes the Ansible code and associated CI jobs.

[1]: https://review.opendev.org/669214

Change-Id: Ie2167f02ad2f525d3b0f553e2c047516acf55bc2
2020-02-11 11:42:06 +01:00
Zuul b103989642 Merge "CI: Add Ceph-Ansible jobs" 2020-01-27 09:09:13 +00:00
Michal Nasiadka d8c15ad4e8 CI: Add Ceph-Ansible jobs
* Adding zuul centos-source/ubuntu-source ceph-ansible jobs
* Jobs will deploy all Ceph integrated OpenStack components, i.e.
  cinder, glance, nova
* Will utilize core openstack testing script

Depends-On: https://review.opendev.org/685032
Depends-On: https://review.opendev.org/698301

Implements: blueprint ceph-ansible
Change-Id: I233082b46785f74014177f579aeac887a25b2ae2
2020-01-24 22:37:03 +01:00
Michal Nasiadka d597cece85 CI: Add timestamps to Docker container logs
Change-Id: Ie5111b898da980d63e9d90003f823172e7a78bc2
2020-01-24 09:59:01 +01:00
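
For d597cece85, a minimal sketch of saving per-container logs with timestamps might look like the following. The function name `save_container_logs` and the one-file-per-container layout are assumptions, not the contents of the script; `logs --timestamps` and `ps -a --format '{{.Names}}'` are existing Docker CLI options.

```shell
# Hypothetical helper; name, arguments and file layout are assumptions.
# $1: container engine command (for example docker), $2: output directory.
save_container_logs() {
    local engine="$1" log_dir="$2" name
    mkdir -p "$log_dir"
    # List all containers, running or not, by name.
    "$engine" ps -a --format '{{.Names}}' | while read -r name; do
        # --timestamps prefixes every log line with the time it was emitted.
        "$engine" logs --timestamps "$name" >"$log_dir/$name.txt" 2>&1
    done
}
```

A call such as `save_container_logs docker /tmp/logs/docker_logs` would write one timestamped log file per container.
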
Michal Nasiadka ac62b560ff Stop gzipping logs in get-logs.sh
As per [1] we should stop compressing the logs sent to swift in order
to get them back readable via a browser.

[1]: http://lists.openstack.org/pipermail/openstack-discuss/2020-January/011875.html

Change-Id: I9b5afceb8a2792491a339bf87bcd9db1c10274e8
2020-01-09 09:20:43 +01:00
Radosław Piliszek bc053c09c1 Implement IPv6 support in the control plane
Introduce kolla_address filter.
Introduce put_address_in_context filter.

Add AF config to vars.

Address contexts:
- raw (default): <ADDR>
- memcache: inet6:[<ADDR>]
- url: [<ADDR>]

Other changes:

globals.yml - mention just IP in comment

prechecks/port_checks (api_intf) - kolla_address handles validation

3x interface conditional (swift configs: replication/storage)

2x interface variable definition with hostname
(haproxy listens; api intf)

1x interface variable definition with hostname with bifrost exclusion
(baremetal pre-install /etc/hosts; api intf)

neutron's ml2 'overlay_ip_version' set to 6 for IPv6 on tunnel network

basic multinode source CI job for IPv6

prechecks for rabbitmq and qdrouterd use proper NSS database now

MariaDB Galera Cluster WSREP SST mariabackup workaround
(socat and IPv6)

Ceph naming workaround in CI
TODO: probably needs documenting

RabbitMQ IPv6-only proto_dist

Ceph ms switch to IPv6 mode

Remove neutron-server ml2_type_vxlan/vxlan_group setting
as it is not used (let's avoid any confusion)
and could break setups without proper multicast routing
if it started working (also IPv4-only)

haproxy upgrade checks for slaves based on ipv6 addresses

TODO:

ovs-dpdk grabs ipv4 network address (w/ prefix len / submask)
not supported, invalid by default because neutron_external has no address
No idea whether ovs-dpdk works at all atm.

ml2 for xenapi
Xen is not supported too well.
This would require working with XenAPI facts.

rp_filter setting
This would require meddling with ip6tables (there is no sysctl param).
By default nothing is dropped.
Unlikely we really need it.

ironic dnsmasq is configured IPv4-only
dnsmasq needs DHCPv6 options and testing in vivo.

KNOWN ISSUES (beyond us):

One cannot use IPv6 address to reference the image for docker like we
currently do, see: https://github.com/moby/moby/issues/39033
(docker_registry; docker API 400 - invalid reference format)
workaround: use hostname/FQDN

RabbitMQ may fail to bind to IPv6 if hostname resolves also to IPv4.
This is due to old RabbitMQ versions available in images.
IPv4 is preferred by default and may fail in the IPv6-only scenario.
This should be no problem in real life as IPv6-only is indeed IPv6-only.
Also, when new RabbitMQ (3.7.16/3.8+) makes it into images, this will
no longer be relevant as we supply all the necessary config.
See: https://github.com/rabbitmq/rabbitmq-server/pull/1982

For reliable runs, at least Ansible 2.8 is required (2.8.5 confirmed
to work well). Older Ansible versions are known to miss IPv6 addresses
in interface facts. This may affect redeploys, reconfigures and
upgrades which run after VIP address is assigned.
See: https://github.com/ansible/ansible/issues/63227

Bifrost Train does not support IPv6 deployments.
See: https://storyboard.openstack.org/#!/story/2006689

Change-Id: Ia34e6916ea4f99e9522cd2ddde03a0a4776f7e2c
Implements: blueprint ipv6-control-plane
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-10-16 10:24:35 +02:00
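
The three address contexts listed in bc053c09c1 (raw, memcache, url) can be illustrated with a small shell function. This is only a sketch of the formatting rules from the commit message, not the Ansible filter itself. It assumes that an address containing a colon is IPv6 and that any other address passes through unchanged; neither assumption is stated in the commit.

```shell
# Illustrative only: mimics the address contexts from the commit message.
# $1: address, $2: context (raw, memcache or url; default raw).
put_address_in_context() {
    local addr="$1" context="${2:-raw}"
    case "$context" in
        raw|memcache|url) ;;
        *) printf 'unknown context: %s\n' "$context" >&2; return 1 ;;
    esac
    case "$addr" in
        *:*) ;;                                # assumption: a colon means IPv6
        *) printf '%s\n' "$addr"; return 0 ;;  # assumption: others unchanged
    esac
    case "$context" in
        raw)      printf '%s\n' "$addr" ;;
        memcache) printf 'inet6:[%s]\n' "$addr" ;;
        url)      printf '[%s]\n' "$addr" ;;
    esac
}
```

For example, `put_address_in_context fd00::10 url` prints `[fd00::10]`, the form needed when the address is embedded in a URL followed by a port.
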
Zuul 94ece4a702 Merge "CI: collect more system configs (name resolution)" 2019-09-20 17:20:54 +00:00
Radosław Piliszek e7d5c58415 CI: collect more system configs (name resolution)
This patch adds configs relevant to name resolution.

Change-Id: I7ebc2409e9ec0bd875abf0bf4e452bc89efe940d
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-09-20 10:30:49 +02:00
Mark Goddard 8e40629161 CI: Use VXLAN overlay network
VXLAN is necessary to run HA in CI (due to floating VIP
address handled by keepalived).
It also turned out to be required to have private
IPv6 address assignments.
This patch is based on linux bridge rather than OVS
to avoid problems with OVS deployed in containers.

This patch enables haproxy in multinode jobs.

Includes saving of linux networking details.

Makes DASHBOARD_URL agree with OS_AUTH_URL - properly uses the
pre-upgrade value for testing.

Co-authored-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Depends-on: https://review.opendev.org/683068
Depends-on: https://review.opendev.org/682957
Change-Id: I66888712da80c3d6f84ee4949762961664d3adea
2019-09-19 11:07:02 +02:00
Michal Nasiadka 8cf24bcc81 CI: Add docker inspect output to docker_info logs
Change-Id: I081f2f4762651bca935f08a67b20f21946aaf051
2019-08-16 09:30:16 +00:00
Zuul 571c89712d Merge "CI: Collect docker and systemd configs" 2019-08-12 17:19:36 +00:00
Radosław Piliszek 826f6850d0 ceph: fixes to deployment and upgrade
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream.
v3 and UDP require portmapper (aka rpcbind) which we
do not want, except where Ubuntu ganesha version (2.6)
forces it by requiring enabled UDP, see [1].
The issue has been fixed in 2.8, included in CentOS.
Additionally disable v3 helper protocols and kerberos
to avoid meaningless warnings.

2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try
handling it on host.

3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy.
The order has been corrected - nfs goes after mds.
Additionally upgrade takes care of rgw for keystone
(for swift emulation).

4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after
any failure. This used to hide the real issue.

5) Retry ceph admin keyring update until cluster works
Reordering the deployment caused an issue with the ceph cluster not being
fully operational before actions were taken on it.

6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by healthy MON and no healthy MGR.
A descriptive note is left in its place.

7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.

[1] https://review.opendev.org/669315

Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-08-05 06:26:25 +00:00
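
Item 7 of 826f6850d0 bounds how long informational commands may run. The commit message does not say which mechanism it uses, so the sketch below is an assumption: it wraps a command in the coreutils `timeout` utility. The names `run_info_cmd` and `INFO_CMD_TIMEOUT` are hypothetical.

```shell
# Hypothetical wrapper; the actual patch may limit the time differently.
# Run an informational command, giving up after INFO_CMD_TIMEOUT seconds
# (default 5) so that a hung command cannot stall log collection.
run_info_cmd() {
    timeout "${INFO_CMD_TIMEOUT:-5}" "$@"
}

# Example (not executed here): run_info_cmd ceph status
```

With GNU `timeout`, a command that exceeds the limit is terminated and the wrapper returns exit status 124, which a caller can log in place of the missing output.
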
Radosław Piliszek 2430c29033 CI: Collect docker and systemd configs
Change-Id: I59a05e8a0a2656596d2cced61bd98f2aa790d60b
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
2019-07-03 19:45:00 +00:00
Mark Goddard 845040ad3f Add CI job for ironic
Adds four new CI jobs for testing centos/ubuntu binary/source deploys
with ironic enabled. These are run only when there are changes to the
ironic role.

Performs some simple testing by creating a node using the fake-hardware
hardware type and creating a server.

Change-Id: Ie669e57ce2af53257b4ca05f45193cb73f48827a
Depends-On: https://review.opendev.org/664011
2019-06-11 10:22:04 +01:00
Michal Nasiadka 3f01c7c7cd Add haproxy stats to gate logs
Change-Id: Iebd98acf03418817d3707c4a117771b73da80166
2019-02-21 12:55:54 +01:00
Eduardo Gonzalez 362b6ee40e Test zun container creation in gates
Change-Id: If5b4ba975a65e07d2704eb6bdb9d841d6a9c3d42
2018-12-19 19:50:59 +01:00
Eduardo Gonzalez a7dbc39240 Suppress log copy output in gates
Change-Id: I01e58d3548d6adc4a2d6f1088773df7941da3865
2018-11-20 11:14:24 +00:00
Mark Goddard 45a4f9c075 Add a job for testing deployment of bifrost
Deploys a bifrost container using kolla-ansible bifrost-deploy.

IPA and disk images are downloaded rather than built to improve
reliability.

Currently only minimal testing of the deployment is performed, creating
and deleting an ironic node. Ideally we would perform a bare metal node
deployment.

The job is based on CentOS, as Ubuntu bifrost deployment is currently
failing with a python-MySQLdb error.

Change-Id: Ic45094594c21116b5b0d6a606f568fc7954175e3
2018-06-18 13:40:43 +01:00
Jeffrey Zhang cdd125117f Optimize zuul v3 jobs
- move check container failure from post.yml to run.yml
- add binary related jobs
- use static kolla-ansible src dir, which is helpful for kolla project
  to use.
- generate correct /etc/hosts by using private ip address and hostname
- fix the wrong api interface in globals.yml file

Change-Id: Idfdee6dfe18f0fa2d4f984df59b57553122ce298
2017-10-26 09:58:29 +08:00
Jeffrey Zhang baa9319a75 Move to zuul v3 in project jobs
Partial-Bug: #1720601
Change-Id: Ibc20a6ae8c645ff82f3c14a6286073dffd4cfae2
2017-10-18 12:31:52 -07:00
Eduardo Gonzalez fee1538c38 Retrieve fluentd logs in gates
Fluentd sends logs to stdout;
this change creates a file with the fluentd log
output to make it easy to discover missing patterns during fluentd
changes.

Change-Id: I131f95089eac60ccb4c48cf5071c3b44c5ea42ca
2017-09-12 23:03:25 +02:00
Eduardo Gonzalez c27338bf2f Retrieve ceph logs in gate
Check status of ceph cluster

Change-Id: I4919a32794cc75bd3e8f411a219f778238a334ee
2017-08-29 15:54:26 +00:00
Eduardo Gonzalez 52f73f4061 Fix logging collection in gates
Log retrieval was out of sync since repo split
and from multinode gates.

Much useful information that was retrieved before, such as
ps, df, docker info, etc., is not in kolla-ansible
gates.

Also, this change fixes log visualization to provide a
colored view, making it easy to identify errors.

Change-Id: I948233e26ceb6efc58b962bcb4b710b3f006232b
2017-07-24 14:32:09 +00:00
Michal (inc0) Jastrzebski f5354f55b1 Enable multinode gate
This patch changes deploy_gate quite a bit, so in reality all
deployments will now assume multinode (even if it's single node). After
that we will refactor it even further to enable easy addition of new
scenarios.

Change-Id: I1faada46e6a7aa026128b2f01d77eabb04759439
2017-06-05 11:35:20 -07:00