zuul/zuul - zuul - OpenDev: Free Software Needs Free Tools

Commit Graph

Author	SHA1	Message	Date
James E. Blair	9105ffe00b	Add script to generate openapi spec The existing openapi spec document (used to generate the swagger ui page in the web app as well as the rst documentation) is both incomplete and wrong due to bitrot. This change adds a script which automatically generates much of the api documentation from the code. The output is still incomplete, but it does include at least the same endpoints currently documented, and of those, all of the inputs and outputs. Due to its automatic generation, all of the endpoints and their inputs are now documented. Only some outputs are missing (as well as explanatory text, which was pretty thin before). It does the following: * Inspects the cherrypy router object to determine the endpoints to include, and identifies their HTTP methods and the python functions that implement them. * It inspects the function python docstring to get summary documentation for the endpoint. * It inspects the function arguments and compares them to the router path to determine if each is a path or query parameter, as well as whether each is required. * It merges type and descriptive information from the python docstring about each parameter. * For output, a schema system similar to voluptuous is used to describe the output names and types, as well as optional descriptive information. One of two function decorators are used to describe the output. It removes the documentation for the status page output format. This API is specially optimized for the Zuul status page, is very complex, and we should therefore not encourage end-users to develop against it. The endpoint itself is documented as such, but the response value is undocumented. Future work: More descriptive text and output formats can be documented. Change-Id: Ib1a2aad728c4a7900841a8e3b617c146f2224953	2024-03-09 11:25:40 -08:00
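The path-versus-query classification described in this commit can be sketched with the standard `inspect` module. This is a minimal illustration, not the script from the commit: the function `classify_params`, the `{name}` route placeholder syntax, and the `builds` handler are assumptions made for the example.

```python
import inspect
import re


def classify_params(route, func):
    """Classify each argument of func as a path or query parameter.

    An argument whose name appears as a {placeholder} in the route is a
    path parameter (always required); anything else is a query parameter,
    required only if it has no default value.
    """
    path_names = set(re.findall(r"\{(\w+)\}", route))
    params = []
    for name, param in inspect.signature(func).parameters.items():
        if name == "self":
            continue
        in_path = name in path_names
        params.append({
            "name": name,
            "in": "path" if in_path else "query",
            "required": in_path or param.default is inspect.Parameter.empty,
        })
    return params


# Hypothetical handler, used only to exercise the sketch.
def builds(tenant_name, project=None, limit=50):
    """List builds for a tenant."""
    return []


spec = classify_params("/api/tenant/{tenant_name}/builds", builds)
for entry in spec:
    print(entry)
```

The docstring summary mentioned in the commit could be read the same way with `inspect.getdoc(func)`.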
Zuul	617bbb229c	Merge "Fix validate-tenants isolation"	2024-02-28 02:46:55 +00:00
James E. Blair	1cc276687a	Change status json to use "refs" instead of "changes" This is mostly an internal API change to replace the use of the word "change" with "ref" in the status json. This matches the database and build/buildsets records. Change-Id: Id468d16d6deb0af3d1c0f74beb1b25630455b8f9	2024-02-09 07:39:52 -08:00
James E. Blair	1f026bd49c	Finish circular dependency refactor This change completes the circular dependency refactor. The principal change is that queue items may now include more than one change simultaneously in the case of circular dependencies. In dependent pipelines, the two-phase reporting process is simplified because it happens during processing of a single item. In independent pipelines, non-live items are still used for linear dependencies, but multi-change items are used for circular dependencies. Previously, changes were enqueued recursively and then bundles were made out of the resulting items. Since we now need to enqueue entire cycles in one queue item, the dependency graph generation is performed at the start of enqueuing the first change in a cycle. Some tests exercise situations where Zuul is processing events for old patchsets of changes. The new change query sequence mentioned in the previous paragraph necessitates more accurate information about out-of-date patchsets than the previous sequence, therefore the Gerrit driver has been updated to query and return more data about non-current patchsets. This change is not backwards compatible with the existing ZK schema, and will require that Zuul systems delete all pipeline states during the upgrade. A later change will implement a helper command for this. All backwards compatibility handling for the last several model_api versions which was added to prepare for this upgrade has been removed. In general, all model data structures involving frozen jobs are now indexed by the frozen job's uuid and no longer include the job name since a job name no longer uniquely identifies a job in a buildset (either the uuid or the (job name, change) tuple must be used to identify it). Job deduplication is simplified and now only needs to consider jobs within the same buildset. 
The fake github driver had a bug (fakegithub.py line 694) where it did not correctly increment the check run counter, so our tests that verified that we closed out obsolete check runs when re-enqueuing were not valid. This has been corrected, and in doing so, has necessitated some changes around quiet dequeuing when we re-enqueue a change. The reporting in several drivers has been updated to support reporting information about multiple changes in a queue item. Change-Id: I0b9e4d3f9936b1e66a08142fc36866269dc287f1 Depends-On: https://review.opendev.org/907627	2024-02-09 07:39:40 -08:00
James E. Blair	fb7d24b245	Fix validate-tenants isolation The validate-tenants scheduler subcommand is supposed to perform complete tenant validation, and in doing so, it interacts with zk. It is supposed to isolate itself from the production data, but it appears to accidentally use the same unparsed config cache as the production system. This is mostly okay, but if the loading paths are different, it could lead to writing cache errors into the production file cache. The error is caused because the ConfigLoader creates an internal reference to the unparsed config cache and therefore ignores the temporary/isolated unparsed config cache created by the scheduler. To correct this, we will always pass the unparsed config cache into the configloader. Change-Id: I40bdbef4b767e19e99f58cbb3aa690bcb840fcd7	2024-01-31 14:58:45 -08:00
Zuul	d12ec11321	Merge "Improve support for web enqueue/dequeue"	2024-01-14 16:31:05 +00:00
Zuul	0ffe18a930	Merge "Index job map by uuid"	2024-01-13 17:27:33 +00:00
James E. Blair	50f068ee6d	Add a build-times web endpoint This endpoint runs an optimized query for returning information suitable for displaying a graph of build times. This includes a schema migration to add some indexes to aid the query. Change-Id: I56e8422a599c1ee51216f26fcae5a39013066e6b	2024-01-03 13:06:07 -08:00
James E. Blair	a9ff6b3410	Improve support for web enqueue/dequeue This change: * Returns the build/buildset oldrev through the REST API (this field was missing). * Updates the web UI so that when enqueuing or dequeuing a ref it will send exactly the oldrev/newrev values it received, including None/null. * No longer translates None to 40*'0' when creating internal management events. In concert, these changes allow a user to re-enqueue exactly as originally enqueued buildsets for branch tips (periodic pipeline) as well as ref updates (tag/post pipelines). Additionally, the re-enqueue method in the web UI is updated to support re-enqueuing tag and branch heads (it only worked on change and ref-updates before). Finally, the buildset page is updated to show the old and new revs if they are non-null. Change-Id: I9886cd44f8b4bae6f4a5ce3644f0598a73ecfe0a	2023-12-14 10:18:33 -08:00
James E. Blair	cb3c4883f2	Index job map by uuid This is part of the circular dependency refactor. It changes the job map (a dictionary shared by the BuildSet and JobGraph classes (BuildSet.jobs is JobGraph._job_map -- this is because JobGraph is really just a class to encapsulate some logic for BuildSet)) to be indexed by FrozenJob.uuid instead of job name. This helps prepare for supporting multiple jobs with the same name in a buildset. Change-Id: Ie17dcf2dd0d086bd18bb3471592e32dcbb8b8bda	2023-12-12 10:22:25 -08:00
Zuul	50e06b4e74	Merge "Tidy some auth exceptions"	2023-12-01 20:04:14 +00:00
Matthieu Huin	2ba13b9575	zuul-web: ensure HTTPError is invoked with 4xx status code The _getTenantOrRaise helper function would raise an HTTP error if a tenant isn't ready, and attempt to return a 204 status code along the error. Cherrypy however doesn't allow for HTTP errors with status codes under 400. Instead, return status code 422 ("Unprocessable Entity") when the tenant isn't ready, which seems more appropriate: the configuration isn't loaded so the query cannot be processed. Change-Id: I41547df26a04698627c8f0697557e1e5039c0e1e	2023-11-22 14:59:26 +01:00
James E. Blair	00949554c9	Implement server-side filtering and pagination of config errors In order to support pagination of config errors (so that when a user does decide to look at the config errors page, we don't necessarily need to transfer all of the data each time), implement server-side filtering and pagination. A later change can implement the same in the web ui. Change-Id: I0c6cb8a10cd4d807ed92cad438ef592b1cdaf19b	2023-11-20 14:17:16 -08:00
Zuul	138b6a1379	Merge "Refactor bundle in sql connection"	2023-11-17 01:03:36 +00:00
Simon Westphahl	68d7a99cee	Send job parent data + artifacts via build request With job parents that supply data we might end up updating the (secret) parent data and artifacts of a job multiple times in addition to also storing duplicate data as most of this information is part of the parent's build result. Instead we will collect the parent data and artifacts before scheduling a build request and send it as part of the request parameters. If those parameters are part of the build request the executor will use them, otherwise it falls back on using the data from the job for backward compatibility. This change affects the behavior of job deduplication in that input data from parent jobs is no longer considered when deciding if a job can be deduplicated or not. Change-Id: Ic4a85a57983d38f033cf63947a3b276c1ecc70dc	2023-11-15 07:24:52 +01:00
James E. Blair	c3efd73b11	Tidy some auth exceptions These two web auth exceptions produce output that's a little different than the others, in that they duplicate the error and description json fields. Use an error field that looks like the other auth errors (which are unique short strings derived from class names). This lets clients concatenate the two and produce a reasonable output. Change-Id: I1703444c19bfa0a06e11c3c521b7f46b31053d7b	2023-10-25 13:08:46 -07:00
James E. Blair	18fb324f1e	Add auth token to websocket When making a websocket request, browsers do not send the "Authorization" header. Therefore if a Zuul tenant is run in a configuration where authz is required for read-only access, the websocket-based log streaming will always fail. To correct this, we will remove the http request authz check from the console-stream endpoint, and add an optional token parameter to the websocket message payload. The JS web app will be responsible for sending the auth token in the payload, and the web server will validate it if it is required for the tenant. Thanks to Andrei Dmitriev for this suggestion. Since we essentially have two different authz code paths in zuul-web now, in order to share as much code as possible, the authz sequence is refactored in such a way that the final authz check can be deferred. First we create an AuthContext at the start of the request which stores tenant and header information, then the actual validation is performed in a separate step where the token can optionally be provided. In the http code path, we create the AuthContext and validate immediately, using the Authorization header, and we do all of that in the cherrypy tool at the start of the request. In the websocket code path, we create the AuthContext as the websocket handler is being created by the cherrypy request handler, then we perform validation after receiving a message on the websocket. We use the token supplied from the request. Error handling is adjusted so in the http code path, exceptions that return appropriate http errors are raised, but in the websocket path, these are caught and translated into websocket close calls. A related issue is that we perform no validation that the streaming build log being requested belongs to the tenant via which the request is being sent. This was unnecessary before read-only access was an option, but now that it is, we should check that a streaming build request arrives via the correct tenant URL. 
This change adjusts that as well. During testing, it was noted that the tenant configuration syntax allows admin-rules and access-rules to use the scalar-or-list pattern, however some parts of the code assumed only lists. The configloader is updated to use scalar-or-list for both of those values. Change-Id: Ifd4c21bb1fe962bf23acb5b4f10b3bbaba61e63a Co-Authored-By: Andrei Dmitriev <andrei.dmitriev@nokia.com>	2023-10-24 07:29:55 -07:00
James E. Blair	0a08299b5f	Refactor bundle in sql connection This refactors the sql connection to accommodate multiple simultaneous changes in a buildset. The change information is removed from the buildset table and placed in a ref table. Buildsets are associated with refs many-to-many via the zuul_buildset_ref table. Builds are also associated with refs, many-to-one, so that we can support multiple builds with the same job name in a buildset, but we still know which change they are for. In order to maintain a unique index in the new zuul_ref table (so that we only have one entry for a given ref-like object (change, branch, tag, ref)) we need to shorten the sha fields to 40 characters (to accommodate mysql's index size limit) and also avoid nulls (to accommodate postgres's inability to use null-safe comparison operators on indexes). So that we can continue to use change=None, patchset=None, etc, values in Python, we add a sqlalchemy TypeDecorator to coerce None to and from null-safe values such as 0 or the empty string. Some previous schema migration tests inserted data with null projects, which should never have actually happened, so these tests are updated to be more realistic since the new data migration requires non-null project fields. The migration itself has been tested with a data set consisting of about 3 million buildsets with 22 million builds. The runtime on one ssd-based test system in mysql is about 22 minutes and in postgres about 8 minutes. Change-Id: I21f3f3dfc8f93a23744856e5b82b3c948c118dc2	2023-10-19 17:42:09 -07:00
James E. Blair	4ebf9296f3	Add tenant_status web endpoint The config error list is getting longer, for everyone, especially now that we are including warnings. To avoid loading the entirety when it's not necessary, add an API endpoint that simply returns the number of config errors. This can later be used by the web app to highlight the blue bell without actually fetching the errors. This endpoint is named tenant_status so that if we find more details like this that we want to add in the future, we can extend the returned dictionary with them. It is not added to the tenant info endpoint because that is unauthenticated and should not leak information about the tenant. Change-Id: Ie11eb26dc38e28922ddabbca39d89cda7e763d13	2023-09-22 07:10:55 -07:00
James E. Blair	eb803984a0	Use tenant-level layout locks The current "layout_lock" in the scheduler is really an "abide" lock. We lock it every time we change something in the abide (including tenant layouts). The name is inherited from pre-multi-tenant Zuul. This can cause some less-than-optimal behavior when we need to wait to acquire the "layout_lock" for a tenant reconfiguration event in one thread while another thread holds the same lock because it is reloading the configuration for a different tenant. Ideally we should be able to have finer-grained tenant-level locking instead, allowing for less time waiting to reconfigure. The following sections describe the layout lock use prior to this commit and how this commit adjusts the code to make it safe for finer-grained locking. 1) Tenant iteration The layout lock is used in some places (notably some cleanup methods) to avoid having the tenant list change during the method. However, the configloader already performs an atomic replacement of the tenant list making it safe for iteration. This change adds a lock around updates to the tenant list to prevent corruption if two threads update it at the same time. The semaphore cleanup method indirectly references the abide and layout for use in global and local semaphores. This is just for path construction, and the semaphores exist apart from the abide and layout configurations and so should not be affected by either changing while the cleanup method is running. The node request cleanup method could end up running with outdated layout objects, including pipelines, however it should not be a problem if these orphaned objects end up refreshing data from ZK right before they are removed. In these cases, we can simply remove the layout lock. 2) Protecting the unparsed project branch cache The config cache cleanup method uses the unparsed project branch cache (that is, the in-memory cache of the contents of zuul config files) to determine what the active projects are. 
Within the configloader, the cache is updated and then also used while loading tenant configuration. The layout lock would have made sure all of these actions were mutually exclusive. In order to remove the layout lock here, we need to make sure the Abide's unparsed_project_branch_cache is safe for concurrent updates. The unparsed_project_branch_cache attribute is a dictionary that contains references to UnparsedBranchCache objects. Previously, the configloader would delete each UnparsedBranchCache object from the dictionary, reinitialize it, then incrementally add to it. This current process has a benign flaw. The branch cache is cleared, and then loaded with data based on the tenant project config (TPC) currently being processed. Because the cache is loaded based on data from the TPC, it is really only valid for one tenant at a time despite our intention that it be valid for the entire abide. However, since we do check whether it is valid for a given TPC, and then clear and reload it if it is not, there is no error in data, merely an incomplete utilization of the cache. In order to make the cache safe for use by different tenants at the same time, we address this problem (and effectively make it so that it is also effective for different tenants, even at different times). The cache is updated to store the ltime for each entry in the cache, and also to store null entries (with ltimes) for files and paths that have been checked but are not present in the project-cache. This means that at any given time we can determine whether the cache is valid for a given TPC, and support multiple TPCs (i.e., multiple tenants). It's okay for the cache to be updated simultaneously by two tenants since we don't allow the cache contents to go backwards in ltime. The cache will either have the data with at least the ltime required, or if not, that particular tenant load will spawn cat jobs and update it. 
3) Protecting Tenant Project Configs (TPCs) The loadTPC method on the ConfigLoader would similarly clear the TPCs for a tenant, then add them back. This could be problematic for any other thread which might be referencing or iterating over TPCs. To correct this, we take a similar approach of atomic replacement. Because there are two flavors of TPCs (config and untrusted) and they are stored in two separate dictionaries, in order to atomically update a complete tenant at once, the storage hierarchy is restructured as "tenant -> {config/untrusted} -> project" rather than "{config/untrusted} -> tenant -> project". A new class named TenantTPCRegistry holds both flavors of TPCs for a given tenant, and it is this object that is atomically replaced. Now that these issues are dealt with, we can implement a tenant-level thread lock that is used simply to ensure that two threads don't update the configuration for the same tenant at the same time. The scheduler's unparsed abide is updated in two places: upon full reconfiguration, or when another scheduler has performed a full reconfiguration and updated the copy in ZK. To prevent these two methods from performing the same update simultaneously, we add an "unparsed_abide_lock" and mutually exclude them. Change-Id: Ifba261b206db85611c16bab6157f8d1f4349535d	2023-08-24 17:32:25 -07:00
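The ltime rule from section 2 of this commit (entries never move backwards in ltime, and checked-but-absent paths are stored as null entries) can be illustrated with a small sketch. The class `LtimeCache` and its methods are hypothetical names chosen for this example and are not Zuul's actual UnparsedBranchCache API.

```python
import threading


class LtimeCache:
    """Toy cache mapping a path to an (ltime, content) pair.

    content is None for a path that was checked and found to be absent.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def put(self, path, ltime, content):
        """Store an entry unless a strictly newer one is already present."""
        with self._lock:
            current = self._entries.get(path)
            if current is not None and current[0] > ltime:
                return False  # never let the cache go backwards in ltime
            self._entries[path] = (ltime, content)
            return True

    def get(self, path):
        with self._lock:
            return self._entries.get(path)

    def is_valid_for(self, paths, min_ltime):
        """True if every path has an entry at least as new as min_ltime."""
        with self._lock:
            return all(
                path in self._entries and self._entries[path][0] >= min_ltime
                for path in paths
            )


cache = LtimeCache()
cache.put("zuul.yaml", 10, "- job: {name: example}")
cache.put(".zuul.d/", 10, None)       # checked, not present
cache.put("zuul.yaml", 5, "stale")    # ignored: older than the stored entry
print(cache.is_valid_for(["zuul.yaml", ".zuul.d/"], 10))
```

Because an older write is simply dropped, two tenants updating the same path concurrently cannot leave the cache in a state older than either of them required; a tenant that needs a newer ltime than the cache holds would refresh the entry itself.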
James E. Blair	1b042ba4ab	Add job failure output detection regexes This allows users to trigger the new early failure detection by matching regexes in the streaming job output. For example, if a unit test job outputs something sufficiently unique on failure, one could write a regex that matches that and triggers the early failure detection before the playbook completes. For hour-long unit test jobs, this could save a considerable amount of time. Note that this adds the google-re2 library to the Ansible venvs. It has manylinux wheels available, so is easy to install with zuul-manage-ansible. In Zuul itself, we use the fb-re2 library which requires compilation and is therefore more difficult to use with zuul-manage-ansible. Presumably using fb-re2 to validate the syntax and then later actually using google-re2 to run the regexes is sufficient. We may want to switch Zuul to use google-re2 later for consistency. Change-Id: Ifc9454767385de4c96e6da6d6f41bcb936aa24cd	2023-08-21 16:41:21 -07:00
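The idea of stopping early when streamed output matches a failure pattern can be sketched as follows. This uses Python's built-in `re` module as a stand-in for the re2 libraries named in the commit, and the function `first_failure_line` and the sample log lines are invented for the example.

```python
import re


def first_failure_line(lines, patterns):
    """Return the 1-based number of the first line matching any pattern.

    lines may be any iterable, including a generator that yields output
    as it arrives, so scanning stops as soon as a match is found.
    """
    compiled = [re.compile(pattern) for pattern in patterns]
    for number, line in enumerate(lines, start=1):
        if any(regex.search(line) for regex in compiled):
            return number
    return None


# Invented sample output from a long-running unit test job.
output = [
    "test_alpha ... ok",
    "test_beta ... ok",
    "test_gamma ... FAILED",
    "test_delta ... ok",
]
print(first_failure_line(output, [r"\.\.\. FAILED$"]))
```

Patterns written for re2 avoid features such as backreferences and lookaround, so a pattern that works with `re` is not guaranteed to be accepted by re2.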
Simon Westphahl	3b011296e6	Keep task stdout/stderr separate in result object Combining stdout/stderr in the result can lead to problems when e.g. the stdout of a task is used as an input for another task. This is also different from the normal Ansible behavior and can be surprising and hard to debug for users. The new behavior is configurable and off by default to retain backward compatibility. Change-Id: Icaced970650913f9632a8db75a5970a38d3a6bc4 Co-Authored-By: James E. Blair <jim@acmegating.com>	2023-08-17 16:22:41 -07:00
James E. Blair	9e0f2b5694	Fix setting autoholds through API with change supplied The set autohold api endpoint incorrectly handled supplied values such that if the user supplied a change without a ref it would always use the default ref (.*). This corrects the case handling and adds tests. Change-Id: I1ae14c327fd8fd2b866013d4d5078a9fbd85f843	2023-06-01 15:45:59 -07:00
James E. Blair	84e0e76e2f	Add error information to config-errors API endpoint This is the first in a series of changes to improve the usability of the web view of config errors. The end goal is to be able to display them in a more structured manner. A secondary goal is to eventually add warnings (eg, deprecation warnings) which is really only feasible if we have structured presentation of errors. This change does the following: * Adds severity and error names to existing configuration errors * And makes them available via the config-errors API endpoint * Reduces the call sites for the error accumulator (LoadingErrors.addError) * Unifies the calling convention for the accumulator (we stop passing in Exception objects) Change-Id: Ia17dd3e7ad8cdfa8a07bb03b871078415d0c145e	2023-05-25 15:41:37 -07:00
Zuul	e812ce6a3d	Merge "Add missing event id to management events"	2023-05-22 12:07:51 +00:00
Simon Westphahl	711e1e5c98	Add missing event id to management events The change management events via Zuul web and the command socket did not have an event ID assigned. This made it harder to debug issues where we need to find the logs related to a certain action. Change-Id: I05ccbc13c7f906f91e13fb66e4a01a51fc822676	2023-04-14 08:27:29 +02:00
James E. Blair	84c0420792	Add statement timeouts to some web sql queries The SQL queries are designed to be highly optimized and should return in milliseconds even with millions of rows. However, sometimes query planners are misled by certain characteristics and can end up performing suboptimally. To protect the web server in case that happens, set a statement or query timeout for the queries which list builds or buildsets. This will instruct mysql or postgresql to limit execution of the buildset or build listing queries to 30 seconds -- but only if these queries originate in zuul-web. Other users (such as the admin tools) may still run these queries without an explicit time limit (though the server may still have one). Unfortunately (or perhaps fortunately) the RDBMSs can occasionally satisfy the queries we use in testing in less than 1ms, making a functional test of this feature impractical (we are unable to set the timeout to 0ms). Change-Id: If2f01b33dc679ab7cf952a4fbf095a1f3b6e4faf	2023-03-13 14:57:29 -07:00
Clark Boylan	2747ea6f56	Fix DeprecationWarning: ssl.PROTOCOL_TLS is deprecated Since python 3.10 ssl.PROTOCOL_TLS has been deprecated. We are expected to use ssl.PROTOCOL_TLS_CLIENT and ssl.PROTOCOL_TLS_SERVER depending on how the sockets are to be used. Switch over to these new constants to avoid the DeprecationWarning. One thing to note is that PROTOCOL_TLS_CLIENT has default behaviors around cert verification and hostname checking. Zuul is already explicitly setting those options the way it wants to and I've left that alone to avoid trouble if the defaults change later. Finally, this doesn't fix the occurrence of this error that happens within kazoo. A separate PR has been made upstream to kazoo and this should be fixed in the next kazoo release. Change-Id: Ib41640f1d33d60503066464c8c98f865a74f003a	2023-02-07 16:37:20 -08:00
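A short sketch of the constant switch described here, using only the standard `ssl` module. The helper names `make_client_context` and `make_server_context` are hypothetical and do not come from Zuul; the point is that verification options are set explicitly rather than relying on the defaults of `PROTOCOL_TLS_CLIENT`.

```python
import ssl


def make_client_context(verify=True):
    """Build a client-side context with verification set explicitly."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    if verify:
        context.check_hostname = True
        context.verify_mode = ssl.CERT_REQUIRED
    else:
        # check_hostname must be disabled before verify_mode can be
        # lowered to CERT_NONE, otherwise ssl raises ValueError.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    return context


def make_server_context():
    """Build a server-side context; certificates are loaded by the caller."""
    return ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)


client = make_client_context()
insecure_client = make_client_context(verify=False)
server = make_server_context()
print(client.verify_mode, insecure_client.verify_mode, server.verify_mode)
```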
James E. Blair	fe04739c78	Reuse queue items after reconfiguration When we reconfigure, we create new Pipeline objects, empty the values in the PipelineState and then reload all the objects from ZK. We then re-enqueue all the QueueItems to adjust and correct the object pointers between them (item_ahead and items_behind). We can avoid reloading all the objects from ZK if we keep queue items from the previous layout and rely on the re-enqueue method correctly resetting any relevant object pointers. We already defer this re-enqueue work to the next pipeline processing after a reconfiguration (so the reconfiguration itself doesn't take very long, but now the first pipeline run after a reconfiguration must perform a complete refresh). With this change, that first refresh is no longer a complete refresh but a normal refresh, so we will get the benefits of previous reductions in refresh times. The main risk of this change is that it could introduce a memory leak. During development, additional debugging was performed to verify that after a re-enqueue, there are no obsolete layout or pipeline objects reachable from the pipeline state object. On schedulers where a re-enqueue does not take place (these schedulers would simply see the layout update and re-create their PipelineState python objects and refresh them after another scheduler has already performed the re-enqueue), we need to ensure that we update any internal references to Pipeline objects (which then lead to Layout objects and can cause memory leaks). To address that, we update the pipeline references in the ChangeQueue instances underneath a given PipelineState when that state is being reset after a reconfiguration. This change also removes the pipeline reference from the QueueItem, replacing it with a property that uses the pipeline reference on the ChangeQueue instead. This removes one extra place where an incorrect reference could cause a memory leak. 
Change-Id: I7fa99cd83a857216321f8d946fd42abd9ec427a3	2022-12-13 13:19:48 -08:00
James E. Blair	1245d100ca	Refactor merge mode name lookup This is repeated in a few places, centralize it. Change-Id: I7bbed1f5f9faad31affa71ef17fbfc1740c54db8	2022-11-10 15:52:46 -08:00
James E. Blair	3a981b89a8	Parallelize some pipeline refresh ops We may be able to speed up pipeline refreshes in cases where there are large numbers of items or jobs/builds by parallelizing ZK reads. Quick refresher: the ZK protocol is async, and kazoo uses a queue to send operations to a single thread which manages IO. We typically call synchronous kazoo client methods which wait for the async result before returning. Since this is all thread-safe, we can attempt to fill the kazoo pipe by having multiple threads call the synchronous kazoo methods. If kazoo is waiting on IO for an earlier call, it will be able to start a later request simultaneously. Quick aside: it would be difficult for us to use the async methods directly since our overall code structure is still ordered and effectively single threaded (we need to load a QueueItem before we can load the BuildSet and the Builds, etc). Thus it makes the most sense for us to retain our ordering by using a ThreadPoolExecutor to run some operations in parallel. This change parallelizes loading QueueItems within a ChangeQueue, and also Builds/Jobs within a BuildSet. These are the points in a pipeline refresh tree which potentially have the largest number of children and could benefit the most from the change, especially if the ZK server has some measurable latency. Change-Id: I0871cc05a2d13e4ddc4ac284bd67e5e3003200ad	2022-11-09 10:51:29 -08:00
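The approach of calling a synchronous read from several threads while keeping results in order can be sketched with `concurrent.futures`. Here `fake_read` is a stand-in that simulates latency; it is not a kazoo call, and `load_children` and the ZooKeeper-style paths are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_read(path):
    """Stand-in for a blocking read that waits on network IO."""
    time.sleep(0.05)
    return f"data for {path}"


def load_children(paths, read, max_workers=4):
    """Run read() for every path in parallel, returning results in order.

    executor.map yields results in the order of its inputs, so callers
    can keep treating the children as an ordered sequence.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(read, paths))


paths = [f"/pipeline/queue/item-{n}" for n in range(8)]
start = time.monotonic()
results = load_children(paths, fake_read)
elapsed = time.monotonic() - start
print(results[0], f"{elapsed:.2f}s")
```

With eight reads of roughly 50 ms each and four workers, the wall-clock time should be close to two rounds of latency rather than eight, which is the kind of saving the commit expects when the ZK server has measurable latency.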
Zuul	7606304159	Merge "Change merge mode default based on driver"	2022-10-27 02:25:37 +00:00
James E. Blair	9d2e1339ff	Support authz for read-only web access This updates the web UI to support the requirement for authn/z for read-only access. If authz is required for read access, we will automatically redirect. If we return and still aren't authorized, we will display an "Authorization required" page (rather than continuing and popping up API error notifications). The API methods are updated to send an authorization token whenever one is present. Change-Id: I31c13c943d05819b4122fcbcf2eaf41515c5b1d9	2022-10-25 20:22:42 -07:00
James E. Blair	95ec2c45e5	Set Access-Control-Allow-Origin headers in check_auth tool Since we check authorization in every method except info now, set the headers in the check_auth tool instead of the individual methods; that way they are set even in the case of a 401. Change-Id: I397180122e03915694ba6e59b4bd3a743120ee6e	2022-10-25 20:22:40 -07:00
James E. Blair	c22f2c98e0	Add access-rules configuration and documentation This allows configuration of read-only access rules, and corresponding documentation. It wraps every API method in an auth check (other than info endpoints). It exposes information in the info endpoints that the web UI can use to decide whether it should send authentication information for all requests. A later change will update the web UI to use that. Change-Id: I3985c3d0b9f831fd004b2bb010ab621c00486e05	2022-10-25 20:22:33 -07:00
James E. Blair	8c47d9ce4e	Add api-root tenant config object In order to allow for authenticated read-only access to zuul-web, we need to be able to control the authz of the API root. Currently, we can only specify auth info for tenants. But if we want to control access to the tenant list itself, we need to be able to specify auth rules. To that end, add a new "api-root" tenant configuration object which, like tenants themselves, will allow attaching authz rules to it. We don't have any admin-level API endpoints at the root, so this change does not add "admin-rules" to the api-root object, but if we do develop those in the future, it could be added. A later change will add "access-rules" to the api-root in order to allow configuration of authenticated read-only access. This change does add an "authentication-realm" to the api-root object since that already exists for tenants and it will make sense to have that in the future as well. Currently the /info endpoint uses the system default authentication realm, but this will override it if set. In general, the approach here is that the "api-root" object should mirror the "tenant" object for all attributes that make sense. Change-Id: I4efc6fbd64f266e7a10e101db3350837adce371f	2022-10-25 20:19:39 -07:00
James E. Blair	8f2dd91cbf	Add check_auth tool to zuul-web Authentication checking in the admin methods of zuul-web is very duplicative. Consolidate all of the auth checks into a cherrypy tool that we can use to decorate methods. This tool also anticipates that we will have read-only checks in the future, but for now, it is still only used for admin checks. This tool also populates some additional parameters (like tenant and auth info) so that we don't need to call "getTenantOrRaise" multiple times in a request. Several methods performed HTTP method checks inside the method which inhibits our ability to wrap an entire method with an auth_check. To resolve this, we now use method conditions on the routes dispatcher. As a convention, I have put the options handling on the "GET" methods since they are most likely to be universal. Change-Id: Id815efd9337cbed621509bb0f914bdb552379bc7	2022-10-25 20:19:25 -07:00
Zuul	99959a3fa3	Merge "Simplify tenant_authorizatons check"	2022-10-26 02:21:25 +00:00
Zuul	75573b7aec	Merge "Remove unused /api/user/authorizations REST endpoint"	2022-10-25 23:31:51 +00:00
Zuul	dcc1c9194a	Merge "Rename admin-rule to authorization-rule"	2022-10-25 23:31:48 +00:00
Zuul	b70d8de85b	Merge "Include skipped builds in database and web ui"	2022-10-25 04:10:18 +00:00
James E. Blair	e2a472bc97	Change merge mode default based on driver The default merge mode is 'merge-resolve' because it has been observed that it more closely matches the behavior of jgit in Gerrit (or, at least it did the last time we looked into this). The other drivers are unlikely to use jgit and more likely to use the default git merge strategy. This change allows the default to differ based on the driver, and changes the default for all non-gerrit drivers to 'merge'. The implementation anticipates that we may want to add more granularity in the future, so the API accepts a project as an argument, and in the future, drivers could provide a per-project default (which they may obtain from the remote code review system). That is not implemented yet. This adds some extra data to the /projects endpoint in the REST api. It is currently not easy (and perhaps not possible) to determine what a project's merge mode is through the api. This change adds a metadata field to the output which will show the resulting value computed from all of the project stanzas. The project stanzas themselves may have null values for the merge modes now, so the web app now protects against that. Change-Id: I9ddb79988ca08aba4662cd82124bd91e49fd053c	2022-10-13 10:31:19 -07:00
James E. Blair	55ec721fa8	Simplify tenant_authorizatons check This method iterates over all tenants but only needs to return information about a single tenant. Simplify the calculation for efficiency. This includes a change in behavior for unknown tenants. Currently, a request to /api/tenant/{name}/authorizations will always succeed even if the tenant does not exist (it will return an authorization entry indicating the user is not an admin of the unknown tenant). This is unnecessary and confusing. It will now return a 404 for the unknown tenant. In the updated unit test, tenant-two was an unknown tenant; its name has been updated to 'unknown' to make that clear. (Since the test asserted that data were returned either way, it is unclear whether the original author of the unit test expected tenant-two to be unknown or known.) Change-Id: I545575fb73ef555b34c207f8a5f2e70935c049aa	2022-10-06 15:38:24 -07:00
James E. Blair	5e6dbf2001	Remove unused /api/user/authorizations REST endpoint This has not been used for a while and can be removed. This will simplify the authorization code in zuul-web. Change-Id: I0fa6c4fb87672c44d3f97db0be558737b4f102bc	2022-10-06 15:38:24 -07:00
James E. Blair	3a0eaa1ffe	Rename admin-rule to authorization-rule This is a preparatory step to add access-control for read-level access to the API and web UI. Because we will likely end up with tenant config that looks like: - tenant: name: example admin-rules: ['my-admin-rule'] access-rules: ['my-read-only-rule'] It does not make sense for 'my-read-only-rule' to be defined as: - admin-rule: name: read-only-rule In other words, the current nomenclature conflates (new word: nomenconflature) the idea of an abstract authorization rule and what it authorizes. The new name makes it more clear that an authorization-rule can be used to authorize more than just admin access. Change-Id: I44da8060a804bc789720bd207c34d802a52b6975	2022-10-06 15:38:24 -07:00
James E. Blair	0738d31b08	Include skipped builds in database and web ui We have had an on-and-off relationship with skipped builds in the database. Generally we have attempted to exclude them from the db, but we have occasionally (accidentally?) included them. The status quo is that builds with a result of SKIPPED (as well as several other results which don't come from the executor) are not recorded in the database. With a greater interest in being able to determine which jobs ran or did not run for a change after the fact, this change deliberately adds all builds (whether they touch an executor or not, whether real or not) to the database. This means that anything that could potentially show up on the status page or in a code-review report will be in the database, and can therefore be seen in the web UI. It is still the case that we are not actually interested in seeing a page full of SKIPPED builds when we visit the "Builds" tab in the web ui (which is the principal reason we have not included them in the database so far). To address this, we set the default query in the builds tab to exclude skipped builds (it is easy to add other types of builds to exclude in the future if we wish). If a user then specifies a query filter to include specific results, we drop the exclusion from the query string. This allows for the expected behavior of not showing SKIPPED by default, then as specific results are added to the filter, we show only those, and if the user selects that they want to see SKIPPED, they will then be included. On the buildset page, we add a switch similar to the current "show retried jobs" switch that selects whether skipped builds in a buildset should be displayed (again, it hides them by default). Change-Id: I1835965101299bc7a95c952e99f6b0b095def085	2022-10-06 13:28:02 -07:00
James E. Blair	1eda9ccf96	Correct exit routine in web, merger Change I216b76d6aaf7ebd01fa8cca843f03fd7a3eea16d unified the service stop sequence but omitted changes to zuul-web. Update zuul-web to match and make its sequence more robust. Also remove unnecessary sys.exit calls from the merger. Change-Id: Ifdebc17878aa44d57996e4bdd46e49e6144b406b	2022-10-05 13:25:07 -07:00
Zuul	ac9958ada5	Merge "Trace received Github events"	2022-10-04 03:34:14 +00:00
Simon Westphahl	7d52b98373	Trace received Github events We'll create a span when zuul-web receives a Github webhook event which is then linked to the span for the event pre-processing step. The pre-processing span context will be added to the trigger events and, with Icd240712b86cc22e55fb67f6787a0974d5308043, complete the tracing of the whole chain from receiving a Github event until a change is enqueued. Change-Id: I1734a3a9e44f0ae01f5ed3453f8218945c90db58	2022-09-30 09:50:37 +02:00
James E. Blair	06cfe2cacd	Add semaphores to REST API This adds information about semaphores to the REST API. It allows for inspection of the known semaphores in a tenant, the current number of jobs holding the semaphore, and information about each holder iff that holder is in the current tenant. Followup changes will add zuul-client and zuul-web support for the API, along with docs and release notes. Change-Id: I6ff57ca8db11add2429eefcc8b560abc9c074f4a	2022-09-07 14:28:12 -07:00
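
The default-filter behavior described in commit 0738d31b08 ("Include skipped builds in database and web ui") can be sketched roughly as follows. This is only an illustration of the logic in the commit message, not code from the Zuul web app: the function name `build_query_params` and the parameter names `result` and `exclude_result` are assumptions made for the example.

```python
from typing import Iterable, List, Optional, Tuple

# Results hidden by default. The commit message notes that other results
# could be added here later; only SKIPPED is mentioned in the source.
DEFAULT_EXCLUDED_RESULTS = ("SKIPPED",)


def build_query_params(
    selected_results: Optional[Iterable[str]] = None,
) -> List[Tuple[str, str]]:
    """Return (key, value) query parameters for a builds listing.

    Hypothetical helper: "result" and "exclude_result" are illustrative
    parameter names, not confirmed names from the Zuul REST API.
    """
    selected = list(selected_results or [])
    if selected:
        # The user asked for specific results, so the default exclusion
        # is dropped and only the requested results are shown.
        return [("result", r) for r in selected]
    # No explicit filter: hide skipped builds by default.
    return [("exclude_result", r) for r in DEFAULT_EXCLUDED_RESULTS]
```

With no filter the sketch excludes SKIPPED; once the user selects any results, including SKIPPED itself, only the selected results are requested and the exclusion is no longer sent.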

277 Commits