393 Commits

Author SHA1 Message Date
James E. Blair 8821dbaa13 Make ansible package check more robust
The JSON parser will fail if given the empty string, so make sure
that if we don't get any output back, we don't try to parse it.

Additionally, if no extra packages are required, then don't bother
running the command in the first place.

Finally, include the missing package list in the error log message
rather than in a separate debug log entry, for easier correlation.

Change-Id: I0c39c74fdf05611439b35cd72b8ab70b836f4c1a
2024-03-25 11:25:35 -07:00
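
A rough sketch of the first two points in the message above. This is not the code from the change; `parse_package_output`, `find_missing`, and the `run_check` callable are hypothetical names, and the JSON shape (a list of objects with a "name" key) is an assumption.

```python
import json


def parse_package_output(output):
    """Return the decoded JSON output, or an empty list if there is none."""
    # json.loads("") raises JSONDecodeError, so check for empty output first.
    if not output or not output.strip():
        return []
    return json.loads(output)


def find_missing(required, run_check):
    """Return the names in `required` that the check does not report.

    `run_check` is a hypothetical callable returning the raw command output.
    """
    # With nothing to check, skip running the command entirely.
    if not required:
        return []
    installed = {pkg["name"] for pkg in parse_package_output(run_check())}
    return [name for name in required if name not in installed]
```

Returning the missing names as a list lets the caller put them straight into the error log message.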
Clark Boylan dfe96519b1 Replace pkg_resource dep resolution with pip
Python 3.12 has more fully excised setuptools and as a result
pkg_resources. Since we are using pip to install things we can instead
rely on pip to resolve whether or not packages are installed. Do this so
that we don't need to explicitly install setuptools where it may not be
needed.

Change-Id: I8ee189c828914fd648847b5650b5fb2fb255ff17
2024-03-20 09:17:19 -07:00
James E. Blair 341e8d8ccd Use importlib instead of pkg_resources for ansible conf
Using importlib to read resource strings is said to be much faster
than pkg_resources[1].  See if we see a performance improvement.

Thanks to Clark Boylan for the suggestion.

[1] https://docs.python.org/3/library/importlib.resources.html#importlib.resources.files

Change-Id: I0b7d16d14989918682e137a56da7b0c65b14c336
2024-03-20 09:17:19 -07:00
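
For reference, reading a resource string with `importlib.resources.files` (Python 3.9 or later) looks roughly like this. The helper name `read_resource_text` is made up for the example, and the standard-library `json` package stands in for the real package that holds the Ansible config.

```python
from importlib.resources import files


def read_resource_text(package, name):
    """Read a text resource bundled inside an importable package."""
    return files(package).joinpath(name).read_text(encoding="utf-8")


# Demonstration against a standard-library package; a real caller would
# pass its own package name and the name of a bundled data file.
text = read_resource_text("json", "__init__.py")
```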
James E. Blair 5a8e373c3b Replace Ansible 6 with Ansible 9
Ansible 6 is EOL and Ansible 9 is available.  Remove 6 and add 9.

This is usually done in two changes, but this time it's in one
since we can just rotate the 6 around to make it a 9.

command.py has been updated for ansible 9.

Change-Id: I537667f66ba321d057b6637aa4885e48c8b96f04
2024-02-15 16:20:45 -08:00
James E. Blair b038dcaf9f Deprecate Ansible 6
Ansible 6 is no longer supported and 8 is available and working.
Deprecate Ansible 6.

Change-Id: I721ae1659cc062d9938ceea863ad746996892cc7
2024-02-07 13:22:21 -08:00
James E. Blair f99cee543e Use getBuildTimes for build time estimator
The existing build time estimator uses the normal getBuilds query
method.  We recently added an optimized query for a new build times
API endpoint.  As one would expect, the build time estimator has
limited query needs and should be able to use the new optimized
query as well.

The one thing missing is a search for a specific result (ie, SUCCESS),
so that is added.  The unused sort operator is removed.

From what I can tell, switching to sorting by end_time should not
produce a reduction in performance compared to sorting by id.

Change-Id: I1096d466accad5574b6cfa226e68b070f769128f
2024-01-04 06:30:52 -08:00
Zuul f291e52850 Merge "Reduce config error context" 2023-11-08 15:50:58 +00:00
James E. Blair a29b7e75af Reduce config error context
We include the yaml text of the entire configuration object in
many of our configuration error reports.  This can be quite large
which is not always helpful to the user, and can cause operational
problems with large numbers of errors of large objects.

To address that, let's only include a few lines.  But let's also
try to make those lines useful by centering the actual attribute
that caused the error in our snippet.

To do this, we need to keep the line number of all of our configuration
attribute keys.  This is accomplished by subclassing str and adding
a "line" attribute so that it behaves exactly like a string most
of the time, but when we go to assemble a configuration error, we can
check to see if we have a line number, and if so, zoom in on that line.

Example:

Zuul encountered a deprecated syntax while parsing its configuration
in the repo org/common-config on branch master.  The problem was:

  All regular expressions must conform to RE2 syntax, but an
  expression using the deprecated Perl-style syntax has been detected.
  Adjust the configuration to conform to RE2 syntax.

  The RE2 syntax error is: invalid perl operator: (?!

The problem appears in the "branches" attribute
of the "org/project" project stanza:

  ...
      name: org/project
      check:
        jobs:
          - check-job:
              branches: ^(?!invalid).*$
      gate:
        jobs:
          - check-job
      post:
  ...

  in "org/common-config/zuul.yaml@master", line 84

This reduction is currently only performed for deprecation warnings but
can be extended to the rest of the configuration errors in subsequent
changes.

Change-Id: I9d9cb286dacee86d54ea854df48d7a4dd37f5f12
2023-10-26 08:50:20 -07:00
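
The str-subclass approach described above can be sketched as follows. `LineStr` and `snippet` are hypothetical names chosen for the example, not the names used in the change.

```python
class LineStr(str):
    """A str that also remembers the line it was parsed from."""

    def __new__(cls, value, line=None):
        obj = super().__new__(cls, value)
        obj.line = line
        return obj


def snippet(text, line, context=2):
    """Return the lines of `text` around 1-based `line`, with ellipses."""
    lines = text.splitlines()
    start = max(0, line - 1 - context)
    end = min(len(lines), line + context)
    return "\n".join(["..."] + lines[start:end] + ["..."])


# The key compares, hashes, and prints like a plain string, but the
# error-reporting code can look for the extra attribute.
key = LineStr("branches", line=4)
```

Code that only wants the string never notices the difference; code that builds an error report can use `getattr(key, "line", None)` and fall back to the old behavior when no line number is present.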
James E. Blair 18fb324f1e Add auth token to websocket
When making a websocket request, browsers do not send the
"Authorization" header.  Therefore if a Zuul tenant is run in
a configuration where authz is required for read-only access,
the websocket-based log streaming will always fail.

To correct this, we will remove the http request authz check
from the console-stream endpoint, and add an optional token
parameter to the websocket message payload.  The JS web app
will be responsible for sending the auth token in the payload,
and the web server will validate it if it is required for the
tenant.  Thanks to Andrei Dmitriev for this suggestion.

Since we essentially have two different authz code paths in
zuul-web now, in order to share as much code as possible, the
authz sequence is refactored in such a way that the final authz
check can be deferred.  First we create an AuthContext at the
start of the request which stores tenant and header information,
then the actual validation is performed in a separate step where
the token can optionally be provided.

In the http code path, we create the AuthContext and validate
immediately, using the Authorization header, and we do all of that
in the cherrypy tool at the start of the request.

In the websocket code path, we create the AuthContext as the
websocket handler is being created by the cherrypy request handler,
then we perform validation after receiving a message on the
websocket.  We use the token supplied from the request.

Error handling is adjusted so in the http code path, exceptions
that return appropriate http errors are raised, but in the
websocket path, these are caught and translated into websocket
close calls.

A related issue is that we perform no validation that the
streaming build log being requested belongs to the tenant via
which the request is being sent.  This was unnecessary before
read-only access was an option, but now that it is, we should
check that a streaming build request arrives via the correct
tenant URL.  This change adjusts that as well.

During testing, it was noted that the tenant configuration syntax
allows admin-rules and access-rules to use the scalar-or-list
pattern, however some parts of the code assumed only lists.  The
configloader is updated to use scalar-or-list for both of those
values.

Change-Id: Ifd4c21bb1fe962bf23acb5b4f10b3bbaba61e63a
Co-Authored-By: Andrei Dmitriev <andrei.dmitriev@nokia.com>
2023-10-24 07:29:55 -07:00
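
A minimal sketch of the deferred check described above, assuming a pluggable `is_valid` callable in place of real token verification. The class and method names mirror the message, but the signatures are guesses for illustration only.

```python
class AuthError(Exception):
    """Raised when a request is not authorized."""


class AuthContext:
    """Holds request details so validation can happen later."""

    def __init__(self, tenant, header_token=None, auth_required=False):
        self.tenant = tenant
        self.header_token = header_token
        self.auth_required = auth_required

    def validate(self, token=None, is_valid=lambda tenant, token: False):
        # Prefer an explicitly supplied token (websocket payload) and
        # fall back to the one from the Authorization header (http).
        token = token or self.header_token
        if not self.auth_required:
            return True
        if token is None or not is_valid(self.tenant, token):
            raise AuthError("unauthorized for tenant %s" % self.tenant)
        return True
```

In the http path the context would be created and validated in one step; in the websocket path it would be created with the handler and `validate(token=...)` called once the first message arrives, with `AuthError` translated into a websocket close.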
Zuul 502b178ad9 Merge "Use cyaml when reading/writing Ansible" 2023-09-15 08:03:13 +00:00
James E. Blair d4fac1a0e8 Register RE2 syntax errors as warnings
This adds a configuration warning (viewable in the web UI) for any
regular expressions found in in-repo configuration that can not
be compiled and executed with RE2.

Change-Id: I092b47e9b43e9548cafdcb65d5d21712fc6cc3af
2023-08-28 15:04:49 -07:00
James E. Blair 3d5f87359d Add configuration support for negative regex
The re2 library does not support negative lookahead expressions.
Expressions such as "(?!stable/)", "(?!master)", and "(?!refs/)" are
very useful branch specifiers with likely many instances in the wild.
We need to provide a migration path for these.

This updates the configuration options which currently accept Python
regular expressions to additionally accept a nested dictionary which
allows specifying that the regex should be negated.  In the future,
other options (global, multiline, etc) could be added.

A very few options are currently already compiled with re2.  These are
left alone for now, but once the transition to re2 is complete, they
can be upgraded to use this syntax as well.

Change-Id: I509c9821993e1886cef1708ddee6d62d1a160bb0
2023-08-28 15:03:58 -07:00
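
To illustrate the idea, here is a sketch that accepts either a plain string or a nested dictionary. The key names "regex" and "negate" and the class name `ConfigRegex` are assumptions for the example, and Python's `re` module stands in for re2.

```python
import re


class ConfigRegex:
    """A regex from configuration that may be marked as negated."""

    def __init__(self, spec):
        # A bare string keeps its old meaning: a non-negated regex.
        if isinstance(spec, str):
            spec = {"regex": spec}
        self.pattern = re.compile(spec["regex"])
        self.negate = bool(spec.get("negate", False))

    def matches(self, value):
        found = self.pattern.match(value) is not None
        # Negation flips the result, so no lookahead is needed.
        return found != self.negate
```

With this shape, a branch specifier like "(?!stable/)" could be written as a positive "^stable/.*$" pattern with negation turned on.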
James E. Blair 1d07a097ee Use cyaml when reading/writing Ansible
We subclass the yaml.SafeDumper class to adjust its behavior with an
override of the ignore_aliases method.  It is possible to subclass
the cyaml.CSafeDumper class as well.  The "C" part is actually the
Parser and Emitter, not the Dumper/Representer, so our override
is still effective whether we use the C or Python versions.

This can produce a significant performance increase when exchanging
large amounts of data with Ansible.

The C emitter is more aggressive about not using unnecessary quotes,
so the ansible dumper test assertions need to change.  To add some
extra assurance, that test is also updated to check that the round-trip
load is as expected as well.

Change-Id: I30fd82c0b9472120d010f3f4a65e17fb426b0f7e
2023-08-22 16:15:19 -07:00
James E. Blair 1b042ba4ab Add job failure output detection regexes
This allows users to trigger the new early failure detection by
matching regexes in the streaming job output.

For example, if a unit test job outputs something sufficiently
unique on failure, one could write a regex that matches that and
triggers the early failure detection before the playbook completes.

For hour-long unit test jobs, this could save a considerable amount
of time.

Note that this adds the google-re2 library to the Ansible venvs.  It
has manylinux wheels available, so is easy to install with
zuul-manage-ansible.  In Zuul itself, we use the fb-re2 library which
requires compilation and is therefore more difficult to use with
zuul-manage-ansible.  Presumably using fb-re2 to validate the syntax
and then later actually using google-re2 to run the regexes is
sufficient.  We may want to switch Zuul to use google-re2 later for
consistency.

Change-Id: Ifc9454767385de4c96e6da6d6f41bcb936aa24cd
2023-08-21 16:41:21 -07:00
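
The matching step can be pictured like this. It is only a sketch: `first_failure_line` is a hypothetical helper, and Python's `re` is used in place of the re2 libraries mentioned above.

```python
import re


def first_failure_line(lines, patterns):
    """Return the 1-based number of the first line matching any pattern.

    `lines` may be any iterable, including a live stream of job output,
    so the scan can stop as soon as a failure marker is seen.
    """
    compiled = [re.compile(p) for p in patterns]
    for number, line in enumerate(lines, start=1):
        if any(c.search(line) for c in compiled):
            return number
    return None
```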
James E. Blair 60a8dfd451 Add Ansible 8
This is the currently supported version of Ansible.  Since 7 is out
of support, let's skip it.

Change-Id: I1d13c23189dce7fd9db291ee03a452089b92a421
2023-07-19 15:46:48 -07:00
Clark Boylan c1b0a00c60 Only check bwrap execution under the executor
The reason for this is that containers for zuul services need to run
privileged in order to successfully run bwrap. We currently only expect
users to run the executor as privileged and the new bwrap execution
checks have broken other services as a result. (Other services load the
bwrap system because it is a normal zuul driver and all drivers are
loaded by all services).

This works around this by adding a check_bwrap flag to connection setup and
only setting it to true on the executor. A better longer term followup
fixup would be to only instantiate the bwrap driver on the executor in
the first place. This can probably be accomplished by overriding the
ZuulApp configure_connections method in the executor and dropping bwrap
creation in ZuulApp.

Temporarily stop running the quick-start job since it's apparently not
using speculative images.

Change-Id: Ibadac0450e2879ef1ccc4b308ebd65de6e5a75ab
2023-05-17 13:45:23 -07:00
Simon Westphahl ac88ab76eb Fix deprecated use of currentThread() in REPL
zuul/lib/repl.py:29: DeprecationWarning: currentThread() is deprecated, use current_thread() instead
  obj = self.files.get(threading.currentThread(), self.default)

Change-Id: I6f5a9b6b169b882024a41623364eefe0955796ef
2023-02-14 11:14:15 +01:00
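
The replacement call is `threading.current_thread()`. A small sketch of the per-thread lookup pattern from the warning above; the class name `ThreadLocalFiles` and its methods are invented for the example.

```python
import threading


class ThreadLocalFiles:
    """Map each thread to its own object, with a shared default."""

    def __init__(self, default):
        self.default = default
        self.files = {}

    def register(self, obj):
        # current_thread() is the non-deprecated spelling.
        self.files[threading.current_thread()] = obj

    def current(self):
        return self.files.get(threading.current_thread(), self.default)
```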
Clark Boylan 2747ea6f56 Fix DeprecationWarning: ssl.PROTOCOL_TLS is deprecated
Since python 3.10 ssl.PROTOCOL_TLS has been deprecated. We are expected
to use ssl.PROTOCOL_TLS_CLIENT and ssl.PROTOCOL_TLS_SERVER depending on
how the sockets are to be used. Switch over to these new constants to
avoid the DeprecationWarning.

One thing to note is that PROTOCOL_TLS_CLIENT has default behaviors
around cert verification and hostname checking. Zuul is already
explicitly setting those options the way it wants to and I've left that
alone to avoid trouble if the defaults change later.

Finally, this doesn't fix the occurrence of this error that happens
within kazoo. A separate PR has been made upstream to kazoo and this
should be fixed in the next kazoo release.

Change-Id: Ib41640f1d33d60503066464c8c98f865a74f003a
2023-02-07 16:37:20 -08:00
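
A sketch of the pattern, with hypothetical helper names. Both options are set explicitly on the client context rather than inherited from the PROTOCOL_TLS_CLIENT defaults.

```python
import ssl


def make_client_context(verify=True):
    """Build a client-side context with explicit verification settings."""
    ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    # Set both options explicitly instead of relying on the defaults.
    # check_hostname must be turned off before verify_mode can be
    # relaxed to CERT_NONE, so the assignment order matters.
    ctx.check_hostname = verify
    ctx.verify_mode = ssl.CERT_REQUIRED if verify else ssl.CERT_NONE
    return ctx


def make_server_context():
    """Build a server-side context."""
    return ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
```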
Clark Boylan 26523d8e56 Fix ResourceWarnings in fingergw
The fingergw (and its associated testing) was not properly managing ssl
sockets. The issue was we were in a context manager for the underlying
tcp socket which will get closed, but that doesn't call close() on the
ssl socket wrapping the tcp socket. Fix this by moving common recv()
code into a function then use the ssl socket in an inner context manager
if we are using ssl.

Both ssl and plain tcp will close() properly and we avoid duplicating
common code.

Change-Id: I1feefbd03a90734cf3c16baa6ed8f52cd8e00d14
2023-02-07 16:17:14 -08:00
James E. Blair 343904e1a4 Use unsafe_skip_rsa_key_validation with cryptography
This is a partial revert of c4476d1b6a
which added the use of a private flag to skip unnecessary (for us)
cryptography checks.  The cryptography package has now normalized
that flag into a parameter we can pass, so use the new param and
update the dependency to require the version that supports it.

Change-Id: I1dfa203525e85020ccf942422ad3cc7040b851dd
2023-01-11 10:37:24 -08:00
Clark Boylan 647940925f Cleanup test logging
We were overlogging because we check for an openssl flag early and warn
if it isn't present. That warning creates a default root streamhandler
that emits to stderr causing all our logging to be emitted there.

Fix this by creating a specific logger for this warning (avoids
polluting the root logger) and add an assertion that the root logger's
handler list is empty when we modify it for testing.

Note I'm not sure why this warning is happening now and wasn't before.
Maybe our openssl installations changed or cryptography modified the
flag? This is worth investigating in a followup.

Change-Id: I2a82cd6575e86facb80b28c81418ddfee8a32fa5
2023-01-11 10:36:15 -08:00
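
The difference between the two logging calls can be sketched as below. The logger name "example.openssl" and the function name are placeholders, not the names used in the change.

```python
import logging


def warn_missing_flag(message):
    """Emit a warning without touching the root logger's handlers."""
    # The module-level logging.warning() logs through the root logger
    # and calls basicConfig() when the root logger has no handlers,
    # which installs a default stderr StreamHandler.  Calling warning()
    # on a named logger does not install any handler.
    logging.getLogger("example.openssl").warning(message)
```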
James E. Blair f9eb499870 Remove Ansible 5
Change-Id: Icd8c33dfe1c8ffd21a717a1a94f1783c244a6b82
2022-10-11 17:03:57 -07:00
James E. Blair 81e9a51185 Deprecate Ansible 5, make Ansible 6 the default
Ansible 5 is no longer supported and 6 is available and working.
Deprecate Ansible 5.

Change-Id: I8c152f7c0818bccd07f50e85bef9a82ddb863a68
2022-10-11 16:56:57 -07:00
Simon Westphahl bbe89422e7 Store parent span context with span info
Change-Id: Idb9b673542c2054f7bbae094ad5702a472197fe1
2022-10-06 09:14:59 -07:00
Zuul 591b0b5da8 Merge "Create link to previous buildset span" 2022-09-28 19:38:11 +00:00
Zuul f1f6090a92 Merge "Remove support for Ansible 2" 2022-09-23 17:31:54 +00:00
Simon Westphahl 7d3b186b3d Create link to previous buildset span
Create a link to the previous buildset span on gate reset. To make this
work we'll start the buildset span when the buildset is created instead
of only when we set the configuration.

This change also adds the `is_remote` flag of the span context of
related links. This is required for creating a `SpanContext` in order to
deserialize the links.

Change-Id: If3a3a83739c1472659d71d05dcf67f84ddce4247
2022-09-19 14:52:26 +02:00
James E. Blair 8c2433a2c4 Tracing: implement span save/restore
This adds methods to allow us to save and restore spans using
ZooKeeper data.

Additionally, we subclass the tracing.Span class so that we can
transparently handle timestamps which are stored as floating point
numbers rather than integer nanoseconds.

To exercise the new features, emit spans for QueueItems and BuildSets.

Because most of our higher-level (parent) spans may start on
one host and end on another, we save the full information about
the span in ZK and restore it whenever we do anything with it,
including starting child spans.  This works well for starting
a Build span given a BuildSet, since both objects are used by
the executor client and so the span information for both is
available.

However, there are cases where we would like to have child spans
and we do not have the full information of the parent, such as
any children of the Build span on the executor.  We could
duplicate all the information of the Build span in ZK and send
it along with the build request, but we really only need a few
bits of info to start a remote child span.  In OpenTelemetry,
this is called trace propagation, and there are some tools for
this which assume that the implicit trace context is being used
and format information for an HTTP header.  We could use those
methods, but this change adds a simpler API that is well suited
to our typical json-serialization method of propagation.

To use it, we will add a small extra dictionary to build and merge
requests.  This should serialize to about 104 bytes.

So that we can transparently handle upgrades from having no
saved state for spans and span context in our ZK data, have our
tracing API return a NonRecordingSpan when we try to restore
from a None value.  This code uses tracing.INVALID_SPAN or
tracing.INVALID_SPAN_CONTEXT which are suitable constants.  They
are sufficiently real for the purpose of context managers, etc.

The only down side is that any child spans of these will be
real, actual reported spans, so in these cases, we will emit
what we intend to be child spans as actual parent traces.
Since this should only happen when a user first enables tracing
on an already existing system, that seems like a reasonable
trade-off.  As new objects are populated, the spans will be emitted
as expected.

The trade off here is that our operational code can be much
simpler as we can avoid null value checks and any confusion regarding
context managers.

In particular, we can just assume that tracing spans and contexts
are always valid.

Change-Id: If55b06572b5e95f8c21611b2a3c23f7fd224a547
2022-09-19 08:42:28 +02:00
James E. Blair ce40b29677 Add support for configuring and testing tracing
This adds support for configuring tracing in Zuul along with
basic documentation of the configuration.

It also adds test infrastructure that runs a gRPC-based collector
so that we can test tracing end-to-end, and exercises a simple
test span.

Change-Id: I4744dc2416460a2981f2c90eb3e48ac93ec94964
2022-09-19 08:42:28 +02:00
James E. Blair 2d6b5c19ba Remove support for Ansible 2
Versions 2.8 and 2.9 are no longer supported by the Ansible project.

Change-Id: I888ddcbecadd56ced83a27ae5a6e70377dc3bf8c
2022-09-14 17:14:10 -07:00
James E. Blair 7949efd255 Add Ansible 6
Change-Id: I0d450d9385b9aaab22d2d87fb47798bf56525f50
2022-09-02 10:12:55 -07:00
James E. Blair ad03402dec Deprecate Ansible 2, make Ansible 5 default
Change-Id: I2576d0dcec7c8f7bbb76bdd469fd992874742edc
2022-09-02 10:12:54 -07:00
Zuul 8434446c98 Merge "Handle jwt decoding error, fix exception default messages" 2022-07-14 18:29:09 +00:00
James E. Blair c4476d1b6a Skip RSA key validation on load
OpenSSL 3.0.0 performs key validation in a very slow manner.  Since
our keys are internally generated and securely stored, we can skip
validation.  See https://github.com/pyca/cryptography/issues/7236

This reduces key loading time from 0.7 to 0.005 seconds/key in
OpenDev.

OpenSSL 1.1.1, which was being used until recently took a similarly
short amount of time.

Change-Id: Ie3841da2c9f7ca2da5b8de4bb619e8bad9c215af
2022-06-18 11:51:38 -07:00
Zuul ad5b910a38 Merge "Temporarily pin OpenStackSDK before 0.99" 2022-05-31 20:50:51 +00:00
Jeremy Stanley a399318b60 Temporarily pin OpenStackSDK before 0.99
The OpenStack SDK/CLI team made an experimental "release candidate"
with 0.99.0, and OpenDev observed missing CORS headers for objects
uploaded to RackSpace's Swift service after restarting executors
onto container images with this version installed. Pin to an earlier
version for now, in order to avoid this and any other as of yet
undiscovered problems in that version.

Change-Id: If1cf1f8c301de09df1d212b6cef151317f6dc6bf
2022-05-31 14:46:59 +00:00
James E. Blair 591d7e624a Unify service stop sequence
We still had some variations in how services stop.  Finger, merger,
and scheduler all used signal.pause in a while loop which is
incompatible with stopping via the command socket (since we would
always restart the pause).  Sending these components a stop or
graceful signal would cause them to wait forever.

Instead of using signal.pause, use the thread.join methods within
a while loop, and if we encounter a KeyboardInterrupt (C-c) during
the join, call our exit handler and retry the join loop.

This maintains the intent of the signal.pause loop (which is to
make C-c exit cleanly) while also being compatible with an internal
stop issued via the command socket.

The stop sequence is now unified across all components.  The executor
has an additional complication in that it forks a process to handle
streaming.  To keep a C-c shutdown clean, we also handle a keyboard
interrupt in the child process and use it to indicate the start of
a shutdown.  In the main executor process, we now close the socket
which is used to keep the child running and then wait for the child
to exit before the main process exits (so that the child doesn't
keep running and emit a log line after the parent returns control
to the terminal).

Change-Id: I216b76d6aaf7ebd01fa8cca843f03fd7a3eea16d
2022-05-28 10:27:50 -07:00
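
The join loop described above might look roughly like this. `join_until_stopped` and the `stop` event are hypothetical; the event stands in for whatever the real exit handler and command socket do.

```python
import threading


def join_until_stopped(thread, exit_handler):
    """Block until `thread` finishes, treating C-c as a stop request."""
    while True:
        try:
            thread.join()
            return
        except KeyboardInterrupt:
            # C-c: ask the service to stop, then resume waiting so the
            # thread can finish shutting down cleanly.
            exit_handler()


# Demonstration: the worker runs until the stop event is set, which is
# what either the exit handler or an internal stop command would do.
stop = threading.Event()
worker = threading.Thread(target=stop.wait)
worker.start()
stop.set()
join_until_stopped(worker, stop.set)
```

Unlike a signal.pause loop, the join returns on its own once the thread exits, so a stop issued internally does not leave the main thread waiting forever.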
Matthieu Huin 03878ee643 Handle jwt decoding error, fix exception default messages
Using a badly formatted token resulted in an error 500 from zuul-web.
Return a more precise error message and an error 401 in zuul-web when
this occurs.

Also fix a typo in default messages for some auth-related exceptions.

Change-Id: I4abe013e76ac51c3dad7ccd969ffe79f5cb459e3
2022-05-12 18:48:19 +02:00
Daniel Pawlik 66844797d2 Add passlib to Ansible venvs
The passlib library is needed to generate bcrypt password hashes.
If passlib is not installed, Zuul will print a message:

    *0

or

    crypt.crypt does not support 'bcrypt' algorithm

Change-Id: Ib1adc385bea519ac55fd23ba9da21e0d78f14dcb
2022-05-05 16:28:24 +02:00
Jeremy Stanley a89ce345c0 Add netaddr to Ansible venvs
Change I733e48127f2b1cf7d2d52153844098163e48bae8 removed ARA, which
was indirectly depending on netaddr. Without netaddr, Ansible IPv6
tasks will break, even though Ansible doesn't itself declare a
dependency on it.

Explicitly add netaddr to our Ansible venvs, so we can perform tasks
which require it.

Change-Id: Ic214377c3e50acc93c2a4a9e564818169b8e2552
2022-04-30 23:36:44 +00:00
James E. Blair ebf5c96d57 Add support for Ansible 5
This adds support for Ansible 5.  As mentioned in the reno, only
the major version is specified; that corresponds to major.minor in
Ansible core, so is approximately equivalent to our current regime.

The command module is updated to be based on the current code in
ansible core 2.12.4 (corresponding to community 5.6.0).  The previous
version is un-symlinked and copied to the 2.8 and 2.9 directories
for easy deletion as they age out.

The new command module has corrected a code path we used to test
that the zuul_stream module handles python exceptions in modules,
so instead we now take advantage of the ability to load
playbook-adjacent modules to add a test fixture module that always
raises an exception.  The zuul stream functional test validation is
adjusted to match the new values.

Similarly, in test_command in the remote tests, we relied on that
behavior, but there is already a test for module exceptions in
test_module_exception, so that check is simply removed.

Among our Ansible version tests, we occasionally had tests which
exercised 2.8 but not 2.9 because it is the default and is otherwise
tested.  This change adds explicit tests for 2.9 even if they are
redundant in order to make future Ansible version updates easier and
more mechanical (we don't need to remember to add 2.9 later when
we change the default).

This is our first version of Ansible where the value of
job.ansible-version could be interpreted as an integer, so the
configloader is updated to handle that possibility transparently,
as it already does for floating point values.

Change-Id: I694b979077d7944b4b365dbd8c72aba3f9807329
2022-04-14 13:33:53 -07:00
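
As an illustration of the last point, a YAML value such as 5 parses as an integer and 2.9 as a float, so the loader needs to coerce both to strings. The function below is a sketch with a made-up name, not the configloader code.

```python
def normalize_ansible_version(value):
    """Return the configured Ansible version as a string."""
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (str, int, float)):
        raise TypeError("unsupported ansible-version: %r" % (value,))
    return str(value)
```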
James E. Blair 2a8b29aa94 Remove built-in ARA support
This has been pinned to a very old version of ARA for some time, and
newer versions of Ansible are no longer compatible with the old version
of ARA.  Since this isn't receiving maintenance keeping it up to date,
remove it.

Note that if there is desire for support for this or other callback
plugins, it would be quite reasonable and relatively straightforward
to add the ability to generically configure additional callback plugins.
This would have the advantage of not requiring tight internal integration
between Zuul and other callback plugins.  Such a change would likely
be welcome.

Change-Id: I733e48127f2b1cf7d2d52153844098163e48bae8
2022-04-13 16:44:34 -07:00
James E. Blair 008e3b45dd Add IBM and Azure deps to the default ansible config
These are required for the log upload roles in zuul-jobs to work,
so include them by default.

Change-Id: Ibc1ffa9d0acb7a988ac207765a67da5a6d2ac3ce
2022-04-13 14:17:04 -07:00
James E. Blair 6214731f8b Fix Ansible plugin loading
This corrects a security vulnerability related to loading Ansible
plugins under the `ansible.builtin.*` aliases.

Change-Id: I3a394904765e22080aa038c44bfe26e07a1e86c7
Story: 2009941
2022-03-24 14:50:20 -07:00
Felix Edel 8db6b6113a Look up worker_zone for log streaming from executor
Currently, we are looking up the worker_zone for the log streaming from
the BuildRequest's path in ZooKeeper. This is a problem for unzoned
builds as those builds don't provide zone information in their path
(zone=None).

Due to this, the log streaming won't use the FingerGateways and instead
always falls back to use the direct connection to the executor. This
works as long as the executor is located in the same region as zuul-web,
but in other cases the log streaming is broken.

To fix this, the executor will now store its zone information in the
worker_info of the BuildRequest when accepting the BuildRequest. In
the streamer_utils library we will use this zone information instead of
the zone from the ZooKeeper path.

Co-Authored-By: Simon Westphahl <simon.westphahl@bmw.de>
Change-Id: I63b148fa29e05157fce032d0f41b909da8a11e87
2022-02-24 20:50:03 +01:00
James E. Blair 864a2b7701 Make a global component registry
We generally try to avoid global variables, but in this case, it
may be helpful to set the component registry as a global variable.

We need the component registry to determine the ZK data model API
version.  It's relatively straightforward to pass it through the
zkcontext for zkobjects, but we also may need it in other places
where we might alter processing of data we previously got from zk
(eg, the semaphore cleanup).  Or we might need it in serialize or
deserialize methods of non-zkobjects (for example, ChangeKey).

To account for all potential future uses, instantiate a global
singleton object which holds a registry and use that instead of
local-scoped component registry objects.  We also add a clear
method so that we can be sure unit tests start with clean data.

Change-Id: Ib764dbc3a3fe39ad6d70d4807b8035777d727d93
2022-02-14 10:58:34 -08:00
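
A sketch of the singleton-holder pattern described above. All names here are hypothetical, and a plain dict stands in for the real registry object.

```python
class GlobalRegistry:
    """Process-wide holder for a single component registry."""

    def __init__(self):
        self.registry = None

    def create(self, factory):
        self.registry = factory()
        return self.registry

    def clear(self):
        # Lets unit tests start from a clean state.
        self.registry = None


# Module-level singleton; callers import this instead of passing a
# registry object through every call chain.
COMPONENT_REGISTRY = GlobalRegistry()


def model_api_version():
    """Look up the data model API version from the global registry."""
    registry = COMPONENT_REGISTRY.registry
    if registry is None:
        raise RuntimeError("component registry is not initialized")
    return registry["model_api"]
```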
James E. Blair a160484a86 Add zuul-scheduler tenant-reconfigure
This is a new reconfiguration command which behaves like full-reconfigure
but only for a single tenant.  This can be useful after connection issues
with code hosting systems, or potentially with Zuul cache bugs.

Because this is the first command-socket command with an argument, some
command-socket infrastructure changes are necessary.  Additionally, this
includes some minor changes to make the services more consistent around
socket commands.

Change-Id: Ib695ab8e7ae54790a0a0e4ac04fdad96d60ee0c9
2022-02-08 14:14:17 -08:00
James E. Blair 482338f70c Identify cherrypy requests in logs
This adds a unique ID to every cherrypy request and adds it to
all of the log lines in the web server.

Some requests produce multiple log entries, and this will allow us
to associate them with each other as well as the cherrypy access log
which is emitted at the end of the request.

To add this to cherrypy's request logging, we subclass the cherrypy
LogManager class and add accessor methods so that whenever the
superclass requests a logger, it gets our annotated logger.

See: https://github.com/cherrypy/cherrypy/blob/main/cherrypy/_cplogging.py

There is an underscore in that module name, but several parts of the
cherrypy documentation suggest importing underscore modules for the
purpose of extending cherrypy.

On the ZuulApi class, we add a similar accessor method for obtaining
the annotated logger for any request.

The two log streaming classes initialize their log attribute in the
constructor since their instances are dedicated to a single request.

The remaining classes retain the class-level logger since they operate
outside of the request context.

Change-Id: I66af717ab34fa4ad3492a01badb25e3865133104
2022-02-02 16:16:27 -08:00
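
The annotated-logger idea can be sketched with the standard `logging.LoggerAdapter`. The adapter class, the "[req: ...]" prefix format, and `get_request_logger` are assumptions for the example; the cherrypy LogManager subclass itself is not shown.

```python
import logging
import uuid


class RequestLogAdapter(logging.LoggerAdapter):
    """Prefix every message with the ID of the current request."""

    def process(self, msg, kwargs):
        return "[req: %s] %s" % (self.extra["request_id"], msg), kwargs


def get_request_logger(base_logger, request_id=None):
    """Return a logger annotated with a (possibly new) request ID."""
    request_id = request_id or uuid.uuid4().hex
    return RequestLogAdapter(base_logger, {"request_id": request_id})
```

Handing the same adapter to every piece of code that handles a request is what lets the separate log entries be correlated afterwards.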
James E. Blair 29fbee7375 Add a model API version
This is a framework for making upgrades to the ZooKeeper data model
in a manner that can support a rolling Zuul system upgrade.

Change-Id: Iff09c95878420e19234908c2a937e9444832a6ec
2022-01-27 12:19:11 -08:00
James E. Blair 41d8e478a5 Remove "sql connection" backwards compatibility for database
In 4.0 we deprecated connections using the "sql" driver in favor of
using the new "database" config file section.  Remove the backwards
compatible handling of that so that "sql" connections or lack of
"database" section report an error.

Change-Id: I7e592cf5ff63f73f487e41bb6e3e4a4ae523e3fc
2022-01-25 16:07:08 -08:00