Commit Graph

237 Commits

Author SHA1 Message Date
Clark Boylan 4195dcf119 Revert "Remove rax from zuul log upload config"
This reverts commit 9d25a857da.

We revert this as our testing shows the MFA updates to our regular
accounts don't seem to affect swift uploads currently. That may change
when MFA is required across the board on the 26th but we can reapply
this sort of disablement and testing then and in the meantime continue
to have two providers of log upload locations.

Change-Id: I9f24f67253934a6a128a6cee3cceb9c1f0bcdf37
2024-03-20 11:55:18 -07:00
Clark Boylan 9d25a857da Remove rax from zuul log upload config
Rax is requiring everyone to use multi factor auth by March 26, 2024.
We're currently transitioning to MFA early to control when and how it
happens. One benefit of doing this early is we can pull rax out of the
log upload destination list to enable us to test it still works after
the MFA switch.

Do this by removing Rax from the prod uploads and removing ovh from the
test uploads so that only rax is used in base-test. After we update to
MFA we can check that uploads for base-test still work then revert this
change.

Change-Id: I8dafb5ea7ad6b10989ca6258c3f56bc8b91d0e06
2024-03-20 09:53:42 -07:00
James E. Blair ddb3137d87 Revert "Temporarily disable uploads to ovh_gra"
This restores our standard log upload config now that the outage
in ovh is resolved.

This reverts commit babd74eb4f.
This reverts commit 7e033243f3.

Change-Id: Ib308489085a6ad2ace501e4ae078606634eda50a
2023-12-11 09:10:56 -08:00
James E. Blair babd74eb4f Force base-test to upload to ovh_gra
Test if the recent outage is resolved.

Change-Id: I3b83b084ed2fd8adf09f7dc12cae825fb53384f2
2023-12-11 08:45:01 -08:00
James E. Blair 7e033243f3 Temporarily disable uploads to ovh_gra
We've seen a high incidence of failures uploading to ovh_gra recently.

Change-Id: I687427d9eab5754f7db6021a2250c78ea1053b05
2023-12-11 07:29:52 -08:00
Clark Boylan 84cffab894 Reset log upload targets
This resets the log upload targets to our full list of options. This
should only be landed after the previous commit has been used to
confirm all is well in rax iad and dfw again.

Change-Id: Icc22521b812bfb10b158c5e1e07665f3e45aaa4e
2023-11-06 13:56:11 -08:00
Clark Boylan a6da5a83e1 Force base-test to upload only to rax-iad and rax-dfw
These were disabled due to service issues. Independent testing seems to
indicate that things are happy again. Set up base-test to upload only to
these two regions to confirm. We'll reset base and base-test jobs to the
full set of upload targets if this goes well.

Change-Id: I90141b8f666101fabbcc0b820b370fe20af73632
2023-11-06 13:54:03 -08:00
Clark Boylan 1e5f2cd932 Temporarily disable log uploads to rax dfw and iad swift
We've seen errors with uploads to rax iad and rax dfw resulting in jobs
reporting POST_FAILURE with no logs uploaded. Testing container listings
against iad and dfw results in 500 errors but the same requests succeed
against ord so we leave it in place.

Based on the 500 errors received in testing above we believe this issue
is not on our end (client/sdk updates for example) but rather on the
server/service side and we need to disable the endpoints and ride it
out.

Change-Id: Ic38fb681c1de84bdb96edb1ecaa7c632a5bec706
2023-11-06 12:34:44 -08:00
Radosław Piliszek ca59b60fbe buildset-registry: Always use Docker
Since run-buildset-registry depends on the container_command,
but buildset-registry supports only Docker, we need to enforce it.

Change-Id: I8966251030dcb3342befa727b2cc6e20b7229b11
2023-05-22 21:44:40 +02:00
Clark Boylan 3fc688b08d Run ensure-quay-repo in our base container jobs
This will ensure that properly defined container images are created in
quay registries for us. You need to do this out of band of the docker
push if you want the image repos to be public. The ensure-quay-repo role
should ignore images that don't have the correct metadata making this
safe for all container base jobs.

Depends-On: https://review.opendev.org/c/zuul/zuul-jobs/+/881521
Change-Id: Ic358f5e2f44c2a1e02140f8c848fe352214ba65a
2023-04-25 15:28:39 -07:00
James E. Blair dbd7b981db Move pull-from-intermediate-registry to localhost
This should run on the executor, not the nodes.

Change-Id: I61fd52982c81d6dfe309b641cdb28278b4b438f2
2023-03-23 13:19:19 -07:00
James E. Blair 30b02a86dc Fix container-image pre playbook container_command default
This variable can be undefined, so use the documented default.

Change-Id: Ied1f79a5ec0ca769301ed2ecd38e67f1f511aa50
2023-03-23 12:55:33 -07:00
James E. Blair 326c244f8c Add ensure-* roles to container image jobs
This is in the zuul-jobs pre-playbook, but we don't actually inherit
from those jobs so we need to duplicate it.

Change-Id: I875df74936736b80dbb2f29bbb474b993f4616ea
2023-03-23 11:51:10 -07:00
James E. Blair 4119042c7e Add container-image jobs
These are the analogs of the opendev-build-docker-image jobs,
using the newer container roles.

Change-Id: Ifec8fd7db3b238536b396a9012bdf93d0d19547e
Depends-On: https://review.opendev.org/c/zuul/zuul-jobs/+/878291
2023-03-23 10:47:33 -07:00
Clark Boylan 69efb45018 Revert "Temporarily disable rax swift log uploads"
This reverts commit 5ce784d816.
This reverts commit 5497c3aa3a.

Revert the two changes that were used to disable rax and then force rax
under base-test for testing. The testing performed after that change
landed seems to indicate things are working.

Change-Id: Ibc3e71399205895d1508786f1eb40cb13d44817a
2023-02-09 14:18:18 -08:00
Clark Boylan 5497c3aa3a Update base-test to only upload to rax
We recently disabled rax swift log uploads due to errors. Force all
uploads under the base-test job to go to rax so that we can test this
more now that the immediate fire is contained.

Change-Id: I7cb8b312356fbaf0d8b4db02b6cc9363f3b13c6f
2023-02-09 11:22:24 -08:00
Clark Boylan 5ce784d816 Temporarily disable rax swift log uploads
We are seeing failures to these regions. Disable them until we can debug
further to avoid unnecessary job failures.

Change-Id: If47636adf08279f8c691c3e9b6351b08067f3191
2023-02-09 11:05:47 -08:00
Dr. Jens Harbott 16470c3780 Revert "Disable OVH BHS1 and GRA1 log uploads"
This reverts commit 022af868f1.

Reason for revert: OVH issue fixed

Change-Id: I7413b87e6c7e520661fd6e51f7ba417eed042225
2023-01-11 13:18:10 +00:00
Clark Boylan 022af868f1 Disable OVH BHS1 and GRA1 log uploads
These regions are returning 503s for file retrievals and some jobs are
failing to upload logs there. Disable until it stabilizes.

Change-Id: Ic5d75b95bf8e3c71025c7297644e7fb3ed2fd9b3
2023-01-10 14:47:14 -08:00
James E. Blair 675ff8b712 Add nox-docs base jobs
This adds a copy of the tox-docs related jobs but using nox instead.

Depends-On: https://review.opendev.org/868134
Change-Id: I445202f366c748191fe6a05e145c05cbad1bb8f5
2022-12-20 09:17:00 -08:00
Clark Boylan 8a7b3895d4 Revert "Disable ovh swift endpoints due to errors"
This reverts commit 85e1ff20ea.

The incident [0] has been marked as resolved by our provider. We should
be good to return to our full set of swift backends for log storage.

[0] https://public-cloud.status-ovhcloud.com/incidents/by8279p6sdjd

Change-Id: I46d5ae367412081808c22f6b2626fbb83fe2e34c
2022-11-16 11:50:42 -08:00
Clark Boylan 85e1ff20ea Disable ovh swift endpoints due to errors
There is an incident producing errors with ovh swift object storage [0].
Disable these regions until that incident is resolved.

Note we disable rax on the test job so that we can easily test things
are functional once this incident is resolved. Reverting this change
will reenable all swift endpoints in base and base-test jobs.

[0] https://public-cloud.status-ovhcloud.com/incidents/by8279p6sdjd

Change-Id: I8f0655f95308a31881680d1b0c25ed6af8f54fb7
2022-11-16 07:49:45 -08:00
Ian Wienand 543a02f059 infra-prod: use prod_bastion group
This is similar in purpose to
I137ab824b9a09ccb067b8d5f0bb2896192291883 to separate out where we are
talking to the bastion host from the executor, versus the nested
ansible CD run.

Add the host in the "prod_bastion" group, and switch the source setup
playbook to use "prod_bastion[0]".  This reduces the number of places
you have to update the bridge name when you change the host.

Change-Id: I66df4057b3990eed2230d894ff42d0a425a2381a
2022-11-04 09:23:46 +11:00
Ian Wienand f53b34c171 infra-prod: Move project-config reset into base-jobs
Currently we reset trees to master in two places; here and in
sync-project-config (Ib999731fe132b1e9f197e51d74066fa75cb6c69b).  This
is a bit confusing, and requires delegating tasks to the bridge node
which isn't great.  Also, as we think about trying to make jobs run in
parallel it's another place to get things wrong.

This merges the update into one place.

Change-Id: I6ffeb6e6562fb34db89f4e475da27b60e30f6fe7
2022-10-28 18:00:34 +11:00
Ian Wienand e9526fe69e Switch to bridge01.opendev.org
Switch to the new bastion host

Change-Id: I8b7547af99f8858934af2593f8ac9b4172484895
2022-10-25 16:07:42 +11:00
Ian Wienand 94857d2f38 setup-keys: add bridge node to "bastion" group
This puts the dynamically added bridge node in the "bastion" group.
This way the production jobs can refer to the generic group name, and
be abstracted from the actual hostname.

Change-Id: Ie35f3f003f21472be2ca87ab962141d17fc2a7b6
2022-10-12 14:12:37 +11:00
Ian Wienand accdc49eef Fix zuul_console_disabled typo
Similar to Ie0a0d8f4ae137dc12f4c13f901096ee39d9a088e in system-config;
fix the typo on this variable name.

Change-Id: I579af80831ec6c317aa4c03d68a1e1934c2fe16c
2022-10-11 13:52:26 +11:00
Ian Wienand 83e03c36e8 Add zuul_console_disable flag to added hosts
This stops the bridge trying to write out console streaming files that
will never be read, because we don't allow connections to the
streaming port.  c.f. Ifbb5b8acb1f231812905cf9643bfec6fbbd08324

Change-Id: I82f194631c2a6d4ed2e46e057a609e5d68ffd2dc
2022-10-11 10:05:48 +11:00
Clark Boylan d1dc777fd1 Use test-prepare-workspace-git in base-test
This will enable us to test changes to test-prepare-workspace-git and
ultimately prepare-workspace-git.

Change-Id: Ic6badd58a7021595508cad0d3ecb9c7d80780858
2022-09-22 10:49:29 -07:00
Ian Wienand c81fc34da4 configure-mirrors: enable extras-common for 9-stream
This uses the new argument provided in the dependent change to enable
the extras-common repo for 9-stream.  Since this is already running
with the default arguments, it should be low-risk to change them here
and only affect CentOS 9-stream.

Change-Id: I185657987fd1b454db683bd1329a985940014750
2022-09-20 08:47:00 +10:00
Ian Wienand ceba1e4f5a base-test: add descriptive names
This ports I40f2592a316bb8293f91d90be3996a6c697de196 to keep this file
in sync with base/pre.yaml

Change-Id: Ie0063edc6e6ae9e1c478b538cbda010dd03177c9
2022-09-20 08:47:00 +10:00
Ian Wienand 3bfdda452c Revert "Switch base-test to test-prepare-workspace-git"
This reverts commit 88e7b0da57.  The
testing is complete.

Change-Id: I4e5420e05bc8ef8ece56fb53746236e751869cd7
2022-09-20 08:46:55 +10:00
Ian Wienand df855f8a33 base-pre: Add some descriptive names to playbooks
In doing some work on the Zuul console, I noticed that none of these
have descriptive names.  That looks a bit ugly in the console where
you just get a generic "Play: all".  Give them some names as a clue to
what's going on.

Change-Id: I40f2592a316bb8293f91d90be3996a6c697de196
2022-08-26 11:25:00 +10:00
Jeremy Stanley dc40f85d80 Revert "Temporarily stop trying to save logs to Rackspace"
Merge once https://identity.api.rackspacecloud.com/ has a valid cert
expiration again.

This reverts commit fd5b8fbdc5.

Change-Id: Ibfe20409c8a9d7bc1c424ef8dc6656ff28d89e66
2022-07-28 13:26:11 +00:00
Jeremy Stanley fd5b8fbdc5 Temporarily stop trying to save logs to Rackspace
The Keystone API endpoint at identity.api.rackspacecloud.com 443/tcp
is currently serving an X.509 cert with an expiration of
2022-07-28 12:00:00 UTC (roughly 1.5 hours ago), so logs can't be
uploaded to their swift by our base job, resulting in widespread
POST_FAILURE results. Remove them from the round-robin destination
list for now, and we can revert this once the cert has been renewed.

Change-Id: Icfc593196a1176cb41657c277f80cb01cf2eb654
2022-07-28 13:17:26 +00:00
Jonathan Rosser 91a648d42b Separate swift provider selection from the swift log upload task
Select the provider in a separate task so that the provider name
can then be included in the task name for the upload. This will
enable the provider to be seen in the job console even if the upload
subsequently fails.

In addition, the upload role can now be called more than once in a
future patch, keeping the provider constant between invocations.

base-test results https://review.opendev.org/c/zuul/zuul-jobs/+/848880

Change-Id: Ie69cbfaebfbe80ad9ce7de789c12b5db7cb6e0c2
2022-07-14 10:20:41 +01:00
Jeremy Stanley 72b24672e7 Revert "Temporarily disable log uploads to OVH"
Service is restored.

This reverts commit f8756d0080.

Change-Id: I2519045ff2791290c172598e5c34edb440d6fffa
2022-07-12 14:31:23 +00:00
Jeremy Stanley f8756d0080 Temporarily disable log uploads to OVH
We've suddenly started getting errors from OVH's Swift endpoints
saying "Payment Required: Access was denied for financial reasons."
Stop uploading new logs here since this may also be causing
POST_FAILURE results for builds.

Change-Id: I4928ed439a34484ac73a4162d6ab09e5d11de106
2022-07-12 11:47:13 +00:00
Jonathan Rosser 9a29fc9fb3 Avoid potential loop variable collision in base-test post-logs
Make the variable different from 'item' to avoid any possible
collision with other uses of that name from ansible loops.

Change-Id: I6dfb6f8494538acfdfa4f3f93e02cb955fd2bd9c
2022-07-06 17:09:01 +01:00
Jonathan Rosser 8fe82f11ed Separate swift provider selection from the swift log upload task for base-test
Select the provider in a separate task so that the provider name
can then be included in the task name for the upload. This will
enable the provider to be seen in the job console even if the upload
subsequently fails.

In addition, the upload role can now be called more than once in a
future patch, keeping the provider constant between invocations.

Change-Id: I37ec05125824a0442652e6444369967bc5170aae
2022-06-28 21:46:50 +01:00
Jeremy Stanley 9c4b60356b Revert "Temporarily stop uploading logs to Rackspace"
After pinning openstacksdk<0.99 in the zuul images with
If1cf1f8c301de09df1d212b6cef151317f6dc6bf, the problem with missing
CORS headers for Rackspace Swift object uploads seems to have
subsided, so we no longer need to limit where we write our logs.

This reverts commit 7b85fb90df.

Change-Id: I7e1a9cc87fea1bd1517b9340342e74ab578c9cb5
2022-06-01 15:36:28 +00:00
Jeremy Stanley 7b85fb90df Temporarily stop uploading logs to Rackspace
Something has broken with CORS headers for logs recently uploaded to
the three Rackspace regions we use. They can still be browsed in raw
form, but the Zuul results page is unable to provide a failure
summary, console breakdown, or deep-linkable logs.

While we work to identify the underlying cause, avoid uploading
further logs to those locations with our base/post-logs playbook,
relying instead exclusively on the two OVH regions configured (which
seem to still work as before). Leave the Rackspace entries in
base-test/post-logs so we'll still be able to easily replicate the
problem for further troubleshooting.

Change-Id: I92ede6bf4717c07e78f43c11fb2b1cd94e1a5478
2022-05-28 12:02:06 +00:00
James E. Blair 88e7b0da57 Switch base-test to test-prepare-workspace-git
This tests https://review.opendev.org/842573

Change-Id: Id028f97749f0c8bcd47706d2dbb84e4cacf4e362
2022-05-19 13:11:54 -07:00
Clark Boylan 9078f71fbb Stop submitting logstash/subunit gearman jobs
The openstack health server stopped working a few months ago and we
ended up shutting down the subunit workers and the health api server as
a result. This means we can stop submitting gearman jobs to process
subunit files.

Also about a year ago we indicated to OpenStack that we could keep the
logstash tooling running through the yoga cycle which is now over. We
haven't had any volunteers or help to continue running the ELK stack in
opendev so we're going to shut it down now that yoga is out the door.
OpenStack did end up working with AWS to set up an opensearch
replacement which users can look to for log indexing of CI jobs in
OpenStack.

Change-Id: I5f0f3805e191f0cd6354285299ed33c42d3899fd
2022-04-12 13:03:37 -07:00
Ian Wienand 19dc53290a base: fail centos-8 if pointing to centos-8-stream image type
As noted inline, this fails the job if using the centos-8 label (see
If32d0c4c503e11285fdcb7c45188568a5dc010bf) that actually points at a
centos-8-stream node.  This should encourage people to fix the node
usage.

Change-Id: I602f2c48fa4845288d72de0cf1d46149815d1cbc
2022-02-10 10:00:27 +11:00
Ian Wienand aa27a8920f base-test: fix typos in centos-8 detection
We don't need to quote the when: statements.  Follow-on to
I84fe5bca76884d8f258a292d0814ad43ac7f2be1.

Change-Id: Ieb114467dea3be0a0ec7a96fbd10ba47f7f00cac
2022-02-10 09:59:15 +11:00
Ian Wienand b16e29083a base-test: fail centos-8 if pointing to centos-8-stream image type
As noted inline, this fails the job if using the centos-8 label (see
If32d0c4c503e11285fdcb7c45188568a5dc010bf) that actually points at a
centos-8-stream node.  This should encourage people to fix the node
usage.

This adds the job to base-test.
I602f2c48fa4845288d72de0cf1d46149815d1cbc adds it to production.

Change-Id: I84fe5bca76884d8f258a292d0814ad43ac7f2be1
2022-02-09 10:37:38 +11:00
Ian Wienand 9f64fef266 base-test: sync with base/pre.yaml
Change-Id: I0fa477968d6b428d427ff72a4dd1af3902bb3f49
2022-02-09 10:35:27 +11:00
Jeremy Stanley 38a3a7febc Revert "infra-prod-setup-keys: drop inventory add"
This reverts commit 70827542ad.

Change-Id: Id3b9dd28b775fffd4e4c5ec43754bfc12430f6d2
2021-12-02 19:18:12 +00:00
Ian Wienand 70827542ad infra-prod-setup-keys: drop inventory add
This playbook does not need to actually access the bastion host
(bridge.o.o) so does not need the inventory setup steps here.

It was pointed out in review of the prior change
(I1bbf4f1402938216401dd924da62aa869a08875b) that we could drop this
job and do this known_hosts setup in system-config for each job.
However, I think it's not a bad idea to keep a synchronization point
for the infra-prod jobs here in a trusted playbook.

Depends-On: https://review.opendev.org/c/opendev/system-config/+/807808
Change-Id: I43285bf61a2902851a15929ac3725fe131ef5b1f
2021-12-01 07:01:59 +00:00