Commit Graph

32 Commits

Author SHA1 Message Date
Antoine Musso 0c47b4292d Retire the project on OpenDev
It has been migrated to the Jenkins community:
https://github.com/jenkinsci/gearman-plugin/

Depends-On: Ib6010d7ce85a934501c50a53e9ac78dcf74bc403
Change-Id: I0c84db2ad3fbb4d9f0eff793a0159c6ed3a8e25c
2021-05-27 17:23:43 +02:00
Timothy Chavez 2f10c6a06f Send node labels back on build completion
Zuul will not necessarily know which node type the job it dispatches
to Jenkins will run on, so we send that information back to Zuul on
build completion so it can use it to submit metrics in that context.

Change-Id: Ibca938fcf8a65facd7e39dab4eb994dfc637722a
2015-08-25 16:37:07 -05:00
Jenkins f2024bd53e Merge "Use TextParameterValue instead of String" 2015-06-18 02:17:11 +00:00
James E. Blair 7abfdbd2d0 Stop sending status updates
Don't send status updates every 10 seconds.  Only send them at the
start of a job (to fill in information like worker and expected
duration, etc).  We don't actually do anything with subsequent
updates, and if Zuul wants to know how long a job has been running
it's perfectly capable of working that out on its own.

Change-Id: I4df5f82b3375239df35e3bc4b03e1263026f0a68
2015-05-05 10:39:53 -07:00
James E. Blair b37c6a2789 Use TextParameterValue instead of String
If a value with a newline is received, Jenkins does not display the
value correctly in the parameters page.  Based on a quick reading
of similar issues elsewhere, it may also not save the value correctly
for later use by plugins such as the 'rebuild' plugin.  Switching
from string to text parameter types solves this.  However, it does
cause _all_ parameters to be treated as text, which wastes a bit of
real-estate on the parameters listing screen with tall textboxes.
We could scan the string for '\n' and choose appropriately as an
alternative.

Change-Id: I84ef198fd6ef852fc0a403e126f13b8cbb58a7b1
2014-07-02 16:54:15 -07:00
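
The alternative mentioned at the end of b37c6a2789 (scan the value for '\n' and pick the parameter type accordingly) could look roughly like the sketch below. It is illustrative Python, not the plugin's Java code; the helper name, the "text"/"string" labels and the sample parameter names are all made up for the example.

```python
def choose_parameter_type(value):
    """Return 'text' for multi-line values and 'string' otherwise.

    Hypothetical helper: the two labels stand in for a text-style and a
    string-style Jenkins parameter value.
    """
    return "text" if "\n" in value else "string"


# Sample parameters (names invented for the example).
params = {"CHANGES": "first line\nsecond line", "PIPELINE": "check"}
kinds = {name: choose_parameter_type(value) for name, value in params.items()}
```

Only values that actually contain a newline would then get the tall text box on the parameters page.
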
James E. Blair ad75b7e0b0 Set a node offline even if there is an exception
In particular, an InterruptedException is likely in the portions
of safeExecuteFunction that wait for the Jenkins job to complete.
In those cases, we still want to return an exception, but we also
want to make sure that once we have scheduled a build on a node,
when that build is finished (even if it is due to some catastrophe
such as a failure to communicate with the node), we still take
the node offline.

We have seen the occasional scheduled job stuck in Jenkins because
of a situation where a node fails, and the gearman plugin schedules
another build on the node because the offline method has not run.
Meanwhile, nodepool deletes the node (because Jenkins said the
job finished) and the scheduled build gets stuck.  This should
eliminate that.

Change-Id: I69b1e4b21430b7427ed47c3cb43bd94e04213321
2013-09-25 11:04:31 -07:00
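
The behaviour described in ad75b7e0b0 (still report the exception, but always take the node offline once a build has been scheduled on it) is essentially a try/finally. A minimal Python sketch, assuming a stand-in node object and a callable that waits for the build; neither is a real Jenkins or plugin API:

```python
class FakeNode:
    """Stand-in for a Jenkins node (hypothetical, not a Jenkins API)."""

    def __init__(self):
        self.offline = False


def run_build_then_offline(node, wait_for_build):
    """Wait for an already-scheduled build, then always offline the node."""
    try:
        return wait_for_build()  # may raise, e.g. if the wait is interrupted
    finally:
        node.offline = True  # runs on success and on exception alike


def interrupted():
    raise RuntimeError("lost contact with node")


node = FakeNode()
try:
    run_build_then_offline(node, interrupted)
except RuntimeError as exc:
    error = str(exc)  # the exception still reaches the caller
```
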
James E. Blair af21876dfe Always return WORK_COMPLETE when a build finishes regardless of
the result.

This is a change to the Zuul-Gearman protocol, however, Zuul is
already compatible with this mode of operation.

The new idea is that WORK_COMPLETE should be returned for every
completed job, regardless of the result.  The result of the build
should be determined by the client by inspecting the data returned
with the WORK_COMPLETE packet.

WORK_FAIL should now instead be used to indicate that the job failed
to run for some reason (perhaps the scheduler was, after all, unable
to schedule a build that it had accepted from the gearman server).

WORK_EXCEPTION continues to indicate something similar to WORK_FAIL,
but with exception information attached.

(At the moment, gearman-plugin should now only return WORK_COMPLETE
or WORK_EXCEPTION; the option to use WORK_FAIL as described is a
future enhancement.)

Change-Id: I32187065dac7e83573636021faf964df8dfd63be
2013-09-16 14:11:11 -05:00
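
On the client side, the packet semantics described in af21876dfe might be handled along these lines. This is a sketch only: the JSON encoding of the packet data and the "result" key are assumptions, and the "NOT_RUN" and "UNKNOWN" labels are invented for the example.

```python
import json


def interpret(packet_type, payload=b""):
    """Map a Gearman response packet to a build outcome string."""
    if packet_type == "WORK_COMPLETE":
        # The build ran to completion; its result travels in the packet data.
        data = json.loads(payload)
        return data.get("result", "UNKNOWN")  # "result" key is assumed
    if packet_type in ("WORK_FAIL", "WORK_EXCEPTION"):
        # The job could not be run at all.
        return "NOT_RUN"
    raise ValueError("unexpected packet type: %s" % packet_type)
```

The point of the design is that a failed build and a job that never ran are no longer reported through the same packet type.
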
Khai Do 9f388435d2 remove restriction on slave to run single job at a time
The gearman plugin was restricting slaves to running
only one job at a time even though the slave may have the
ability to run multiple jobs in parallel (by having
multiple executors).  This commit fixes the restriction.

Change-Id: I62b4ff9d12474a5885c549f0b532366cb60fcbca
2013-08-16 08:04:10 -07:00
James E. Blair c97253eff5 Rework starting/stopping executors
There was at least one error, likely a race condition, with the
previous code which could cause more than one ExecutorWorkerThread for
a node to be spawned.  In particular, I think the bogus comparison in
ComputerListenerImpl may have a large part in that (it checked to see
if a _Computer_ object was in a list of _Thread_ objects).

To improve reliability around adding and removing nodes, all related
functionality is moved to the GearmanProxy class.  Any methods (most
of them) that have to do with starting or stopping worker threads are
synchronized on the GearmanProxy monitor (the important parts of most
threads were already synchronized on the worker list before, so this
should not be much of a performance change).

The methods that start management and executor threads now do their
own checking to verify that such threads do not already exist, making
it so that calling them is more idempotent.  Existing checks external
to the class have been removed (these were likely somewhat racy).

To avoid keeping redundant data structures, the node availability list
is removed, and instead if we need to find an availability object, we
walk the list of worker threads and compare to their nodes.  Because
we do this so much, the lists of worker and management threads are
changed to use those explicit classes instead of
AbstractWorkerThreads.

The accessor methods for the internal lists of worker threads are
removed to ensure that they are only managed through GearmanProxy.
This changed some unit tests and required the removal of one complete
test (which was not doing much more than verifying the addition
operator).

Also, when stopping ExecutorWorkerThreads, stop all of the ones
associated with a node.

When a computer goes offline, Computer.getNode() returns null, so we
can't know which workers should be deleted.  Instead of using Nodes as
keys for our workers, use Computers instead and change everything to
use Computer (most functions needed Computer rather than Node anyway),
and in the few remaining places where a Node is needed, convert the
other direction.

Change-Id: Ia5084579317f972400069cc3e84db4e0b6560a80
2013-08-13 11:31:50 -07:00
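
The approach in c97253eff5 (one monitor guarding all start/stop operations, start methods that check for existing workers themselves, workers keyed by computer) can be sketched as follows. The class and method names are hypothetical and the workers are represented by plain name strings rather than threads.

```python
import threading


class WorkerRegistry:
    """Hypothetical stand-in for the bookkeeping done in GearmanProxy."""

    def __init__(self):
        self._lock = threading.Lock()
        self._workers = {}  # computer name -> list of worker names

    def start_workers(self, computer, executors):
        """Start workers for a computer; a repeated call is a no-op."""
        with self._lock:
            if computer in self._workers:
                return False  # already started, nothing to do
            self._workers[computer] = [
                "%s-exec-%d" % (computer, i) for i in range(executors)
            ]
            return True

    def stop_workers(self, computer):
        """Stop and return all workers associated with a computer."""
        with self._lock:
            return self._workers.pop(computer, [])
```

Because the existence check and the insertion happen under the same lock, two concurrent callers cannot both spawn workers for the same computer.
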
James E. Blair e45ffe249d Add OFFLINE_NODE_WHEN_COMPLETE option
If a build job is requested with the "OFFLINE_NODE_WHEN_COMPLETE"
parameter set to a true value, then mark the node as temporarily
offline when the build is complete (regardless of the outcome).

This facilitates single-use slaves (or slaves that need cleanup
after their jobs).  "Temporarily offline" was chosen as the
most lightweight method of preventing new builds to facilitate
either performing an external cleanup action (which would then
online the existing node), or external deletion of the node.

To accomplish this, the NodeAvailabilityMonitor unlock call is
moved from the StartJobWorker gearman function out into the gearman
worker so that the lock is held during the entire run of the job
and further past the point where the StartJobWorker will set
the node offline.

Also, supply the name of the gearman worker (which includes the
node name) with the build data to the client.  This way the client
will know which worker performed the job, and whose node may need
to be manipulated if the offline flag is set.

Change-Id: I5cda75eb44b26ec58e5f03d0aa980af09ee023f6
2013-08-06 11:09:14 -07:00
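
A client requesting the behaviour from e45ffe249d would add the parameter to the job data it submits. The sketch below assumes the parameters are sent as a JSON object and that "1" counts as a true value; both are assumptions, as is the helper name.

```python
import json


def build_request(params, offline_when_complete=False):
    """Serialize build parameters, optionally asking for the node to be offlined."""
    payload = dict(params)  # copy so the caller's dict is not modified
    if offline_when_complete:
        payload["OFFLINE_NODE_WHEN_COMPLETE"] = "1"  # assumed truthy value
    return json.dumps(payload)
```
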
James E. Blair a9adc4b7d9 Make logging more consistent
Change-Id: Ifedb1ddbd7663900438fd89f2eabb3c1f4c2a5aa
2013-07-11 12:39:24 -07:00
James E. Blair 474afb007b Use AbstractProject instead of Project in function factory.
Fixes an illegal argument exception.

Change-Id: I9a1fde359e043b7cdf2fb1e635ac65a6196dc0c0
2013-06-14 16:17:11 -07:00
James E. Blair 4556818799 Report exceptions while running the job to the client.
Don't catch any exceptions while running the job; instead, report
them back to the client (via a catch-all exception handler in
StartJobWorker).

If the worker raises an exception, unlock the node monitor, in case
the worker didn't get to the point where it would be unlocked.

This change has the side effect that if the gearman server disconnects
while the job is running, the worker should return from watching the
job run (as soon as it notices, currently up to 5 seconds).  This is
helpful in that it will be available to register with gearman again,
including sending CAN_DO packets.  But the node monitor will still
prevent it from scheduling a new job while the one it started earlier
is still running.

Change-Id: Ie01ef0f9e706d81452b189099e36242ab9967950
2013-06-14 15:23:07 -07:00
James E. Blair 6041401766 Handle mutex scheduling from Gearman or Jenkins.
Every node (slave or master) gets an AvailabilityMonitor that
handles mutually exclusive access to scheduling builds on that
node.  If Jenkins wants to run a build on the node, it will only
be able to do so if we are not waiting for a response to a
GRAB_JOB packet from Gearman.  Likewise, immediately before
sending a GRAB_JOB, we lock the monitor and only unlock it if
we either get a NO_JOB response, or after the job we were just
assigned starts building.

(As an exception to the above rule, since Jenkins will apply the
same scheduling veto logic to the build that we request via Gearman,
(while we still hold the lock) we tell the monitor to expect a request
for that build from Jenkins and we permit Jenkins to build it even
if the lock is held.)

Change-Id: Iae03932aef4b503c69699b99d38a6fc2691fb02e
2013-06-13 12:42:51 -07:00
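
The locking rules in 6041401766 can be modelled with a small monitor object. This is a non-blocking Python sketch with hypothetical method names; the plugin's actual AvailabilityMonitor is Java and may block rather than return False.

```python
import threading


class AvailabilityMonitor:
    """Per-node mutex between Gearman-driven and Jenkins-driven scheduling."""

    def __init__(self):
        self._guard = threading.Lock()
        self._holder = None    # worker that is waiting on a GRAB_JOB reply
        self._expected = None  # build Jenkins may run despite the lock

    def lock(self, worker):
        """Called right before GRAB_JOB; False if another worker holds it."""
        with self._guard:
            if self._holder is not None:
                return False
            self._holder = worker
            return True

    def unlock(self, worker):
        """Called on NO_JOB, or once the assigned build has started."""
        with self._guard:
            if self._holder == worker:
                self._holder = None
                self._expected = None

    def expect(self, worker, build_id):
        """Tell the monitor which build Jenkins will ask about."""
        with self._guard:
            if self._holder == worker:
                self._expected = build_id

    def can_take(self, build_id):
        """Jenkins' veto check: may this build run on the node now?"""
        with self._guard:
            return self._holder is None or self._expected == build_id
```
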
James E. Blair 7a8d940c4b Update zuul-gearman protocol.
* Use name+number as the build identifier for all meta-jobs.
  (Zuul has name + number as build metadata, so avoid adding new/duplicate
  features).
* Refer to the name of the manager worker as 'manager' instead of 'master'
  to avoid jenkins specifics.
* Just call the url "url" instead of "full_url".
* Change SetDescriptionWorker to use name+number as the build id.
  Also, expect 'html_description' instead of 'description'.
* Don't catch as many exceptions, instead, let them propagate so that they
  get turned into WORK_EXCEPTION packets with information (instead of
  WORK_FAIL packets which have no info).
* Change StopJobWorker to use the same name+number interface as
  setDescriptionWorker (for consistency, expandability, plus it makes
  the code simpler).

Change-Id: I8e078540c252bf9c1f14b79f8182517cbaa13555
2013-06-06 15:40:01 -07:00
zaro0508 42c95ac958 add gearman worker to set build descriptions
This commit adds another management worker for setting a build's description.
The worker will look up the build by its build_id (jobName:jobId)
and, if it can find it, will set the description and return COMPLETED.
If it cannot find the build or cannot set the description for any reason,
it will return FAILED.

This commit also adds the build_id to the data returned by the StartJobWorker.

Usage:

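  import gear
  import simplejson

  client = gear.Client()
  client.addServer('localhost')
  client.waitForServer()  # block until connected to the gearman server
  job_name = 'set_description:JenkinsHostName'
  data = {'build_id': 'pep8:2013-05-15_17-32-07', 'description': '<h1>My Test Build</h1>'}
  # encode the JSON string: newer gear releases require bytes arguments
  job = gear.Job(job_name, simplejson.dumps(data).encode('utf-8'))
  client.submitJob(job)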

Change-Id: I4990772d591d27bbb3b4f20abfb4d988077b4995
2013-05-18 11:26:15 -07:00
James E. Blair 4e560245c0 Reference jenkins master in workers.
The "stop:" function should be qualified by the name of the jenkins master
to make it globally unique (eg "stop:jenkins.example.com").

Supply the name of the master in the WORK_DATA that is sent to the client
when the job starts building so that the client can direct a subsequent
"stop:" request to the right worker.

Change-Id: I0112b84ae614ce4faaed880ea3d1073674dfe5fe
2013-04-29 13:25:58 -07:00
James E. Blair 5525624d6f Send build info as WORK_DATA.
Send a WORK_DATA packet as soon as the build starts as well as
immediately after it terminates (and before the WORK_FAIL/SUCCESS
packet) with build information such as name, number, url, status,
etc.

Remove the textual build result descriptions -- the client can
refer to the most recent WORK_DATA packet for information.

Don't send parameters back; the client probably already knows them.

Change-Id: I7432cd73951a91afe428994c7b8222c63fb0eab8
2013-04-29 10:19:57 -07:00
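
With 5525624d6f the client is expected to remember the latest WORK_DATA packet and consult it when the job ends. A minimal sketch of that bookkeeping, assuming the packet data is a JSON object and that the fields listed in the message appear under the keys "name", "number", "url" and "status" (the key strings and the class are assumptions):

```python
import json


class BuildTracker:
    """Keep only the most recent WORK_DATA payload for one job."""

    def __init__(self):
        self.latest = {}

    def on_work_data(self, payload):
        # Each packet replaces the previous build information.
        self.latest = json.loads(payload)

    def on_finished(self):
        # No textual result accompanies the final packet; read the last data.
        return self.latest.get("status")
```
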
James E. Blair 20844c7e46 Add local GearmanWorker.
Add a package local implementation of something like the GearmanWorker from
java-gearman (based on GearmanWorkerImpl).

It is much simpler than the existing GearmanWorkerImpl and is more suited to
the way we need to use it in the Jenkins plugin.  It assumes jobs are always
changed in batches, and only changes jobs at the top of the event loop (not
when a job is running).

The worker threads are updated to only request job changes when there is an
actual difference.

WORK_STATUS events are sent every 10 seconds while a job is running.

run-fast is updated to only remove the gearman plugin from the work directory,
preserving any other plugins that may be installed.

This isn't very elegant, but is a start and broadly demonstrates what we need
the plugin to do.

Change-Id: I26df504534ec50f03c9e0ef772a709046cf88a23
2013-04-22 10:29:03 -07:00
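
The "only request job changes when there is an actual difference" rule from 20844c7e46 amounts to comparing the registered function set with the wanted one and re-registering in a single batch. The sketch below is an illustration, not the plugin's code: `send` is a hypothetical callback, and replacing the whole set via RESET_ABILITIES followed by CAN_DO packets is an assumed way of doing the batch update.

```python
def update_functions(registered, wanted, send):
    """Re-register functions in one batch, but only if the set changed.

    registered: set of function names currently registered
    wanted:     iterable of function names that should be registered
    send:       callback taking one packet description string
    Returns the set that is registered after the call.
    """
    wanted = set(wanted)
    if wanted == registered:
        return registered  # nothing changed, send nothing
    send("RESET_ABILITIES")
    for name in sorted(wanted):
        send("CAN_DO " + name)
    return wanted
```
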
James E. Blair c4e660fd34 Send WORK_STATUS packets on job start.
This implementation relies on implementation knowledge of
java-gearman, but with the current API, there don't seem to be
many alternatives short of implementing a new GearmanWorker.

Change-Id: I9c8d5da91012a0d69ba296ac3c4123310f25c4f2
2013-04-11 14:52:32 -07:00
zaro0508 9c9acdbdba fix executor names, clean up, remove code duplication
src/main/java/hudson/plugins/gearman/AbstractWorkerThread.java
    Removed Id field.  It was initially added because I thought it was the plugin's
    responsibility to cancel jobs that are on the gearman queue.  We've decided that
    it will be the client's (zuul or otherwise) responsibility to cancel jobs from the gearman
    queue.  The gearman plugin will cancel jobs that have already been put on the
    jenkins queue.

src/main/java/hudson/plugins/gearman/ComputerListenerImpl.java
src/main/java/hudson/plugins/gearman/ExecutorWorkerThread.java
src/main/java/hudson/plugins/gearman/GearmanPluginConfig.java
src/main/java/hudson/plugins/gearman/GearmanProxy.java
src/main/java/hudson/plugins/gearman/ManagementWorkerThread.java
src/main/java/hudson/plugins/gearman/StartJobWorker.java
    Refactor to reduce code duplication. Consolidated creation of management worker and
    executor workers.  Added a fix so that workers spawned on the master node
    are named 'master-manager' for the manager and 'master-exec-#' for executors

src/test/java/hudson/plugins/gearman/ManagementWorkerThreadTest.java
    Added test to make sure worker name is set correctly

src/main/java/hudson/plugins/gearman/GearmanPluginUtil.java
src/test/java/hudson/plugins/gearman/GearmanPluginUtilTest.java
    Useful utils for the gearman plugin with tests

Change-Id: I96e097dc0dbf5cd78e5e82af584976085aee61b3
2013-03-22 15:47:19 -07:00
zaro0508 c461e204f2 misc doc and logging updates
Checkin to change README from txt to rst format.
Jenkins seems to hijack logging, so I've removed all of the
logging-specific bindings that were added previously.
I also added "----" to the beginning of this plugin's logging
messages so I could easily keep track of them.

Change-Id: Ibd8c56af5b9ad18152bcb0d3ff0c41168a6d2fd1
2013-03-12 10:51:24 -07:00
zaro0508 0bf4a7d2ff return status messages to gearman client
This checkin is to return correct job status messages to the gearman
client.  It wasn't working before due to this gearman-java issue:
https://bugs.launchpad.net/gearman-java/+bug/1126496

src/main/java/hudson/plugins/gearman/StartJobWorker.java
src/main/java/hudson/plugins/gearman/StopJobWorker.java
    Updates to return job error, warning, and success results to gearman client.
    Would like to point out that gearman java<->python translation doesn't work quite
    right.  I believe the python implementation of the gearman worker never
    sends exception messages back to the client

src/main/java/hudson/plugins/gearman/example/StartJobClient.py
src/main/java/hudson/plugins/gearman/example/StopJobClient.py
    Update examples to show how to extract messages

pom.xml
    Updated developer info

Change-Id: Ie8d82be8a8e7c34bc368efda953d5ddfb9547e01
2013-03-08 13:36:34 -08:00
zaro 0e81e014da decouple gearman from the gearman configuration
This change is to create a new object to store Gearman
objects and state info.

src/main/java/hudson/plugins/gearman/GearmanProxy.java
  created to keep Gearman state info.

src/main/java/hudson/plugins/gearman/GearmanPluginConfig.java
  simplified this class by moving the core gearman stuff out to
  a GearmanProxy.java class

src/main/java/hudson/plugins/gearman/Constants.java
  Use one logger instead of two.  updated logger reference in all
  of the other files in this checkin

src/main/java/hudson/plugins/gearman/ProjectListener.java
src/main/java/hudson/plugins/gearman/StartJobWorker.java
src/main/java/hudson/plugins/gearman/StopJobWorker.java
src/main/java/hudson/plugins/gearman/ComputerListenerImpl.java
  update references to changed class and methods

Change-Id: I879cdb8839c8b5437bccf6d7e1602c33eff434a6
2013-03-05 09:51:10 -08:00
zaro 7f0cffa5d2 Update to enable gearman to Grab UUID from client
This fix is driven by a bug fix in gearman-java.  The bug fix enabled
Gearman worker to get the job id from the client
(https://bugs.launchpad.net/gearman-java/+bug/1098816)

src/main/java/hudson/plugins/gearman/AbstractWorkerThread.java
    set the worker to perform GRAB_JOB_UNIQ

src/main/java/hudson/plugins/gearman/StartJobWorker.java
    Get and decode the job UUID from gearman client

src/main/java/hudson/plugins/gearman/StopJobWorker.java
    Get and decode the job UUID from gearman client
    Remove cancelBuild method because it's not needed when using the gearman plugin,
    the gearman queue replaces the jenkins queue.
    Put in a stub for cancelJob method to cancel jobs from the Gearman queue.

src/main/java/hudson/plugins/gearman/example/StopJobClient.py
    minor update - don't need to send in data for stop job.

Change-Id: Ie3bb512cf17796091970fec4fa4c7afd05592bdb
2013-03-04 09:43:47 -08:00
Khai Do e9ff9fba0a pass jenkins build result back to gearman client
This checkin enables the jenkins build result to be passed back to the
requesting gearman client.  However, I think gearman-java
fails to return results on a job failure:
https://answers.launchpad.net/gearman-java/+question/221348

Change-Id: Ia35458c23dea2ca04febfb63e933b99f2f0d0cb2
2013-02-21 11:15:01 -08:00
Khai Do 7f4fee3e45 Fix to enable execution of builds on master computer
This checkin enables gearman to execute builds on the master node.  The
master node is a special case because it does not inherit from the Slave Node object in
Jenkins; instead it is a Computer object, so Jenkins.getNodes() does not include the master
in its list of Nodes.  Also, the master's name is "", but to run a build against the
master you need to pass a "master" label to the scheduleBuild method.

Change-Id: I65c21e7cf7f2e244c94638f8858ab0fa5fc8acb9
2013-02-08 15:58:37 -08:00
Khai Do e812a72d7d Change to spawn a thread for each jenkins executor instead of just one thread per jenkins node.
Also added functionality to wait until a StartJobWorker can service a build request.  This change eliminates
putting builds on the jenkins queue.  Now jobs are either running or they are not.  The only cancel that
makes sense is an abort (of currently running jobs).
AbstractWorkerThread.java - add comments, set worker id to name instead of random uuid
ExecutorWorkerThread.java - create a thread for each jenkins executor
GearmanPlugin.java - refactor to spawn a thread for every executor
NodeAssignmentAction.java - provide access to label name
StartJobWorker.java - make thread block execution until there is an available
jenkins executor to run the job. Also set the gearman job return parameters.
StopJobWorker.java - Set gearman job return parameters.

Change-Id: I30cec8ca3900eb7976c38077383505ea73e744dd
2013-02-08 15:58:35 -08:00
Khai Do ef3df2039a Add comments, decouple build actions, and cancel jobs.
ExecutorWorkerThread - added comments
ManagementWorkerThread - added comments
NodeAssignmentAction - de-couple build actions. Action to send jenkins build to a specific jenkins node
NodeParametersAction - de-couple build actions. Action to send parameters to a jenkins build
StartJobWorker - De-couple build actions.  Now schedule a build with NodeAssignmentAction and NodeParametersAction
StopJobWorker - Cancel or abort builds.  This only cancels builds right now, abort does not work yet

Change-Id: I72247f9a292fc78f5ea48b7c50d1f5df386efd00
2013-02-08 15:56:31 -08:00
Khai Do e6d78ad75f Starting jenkins builds with Gearman workers
Gearman worker can now start jenkins jobs with passed in parameters.  The
uuid is also passed in as a parameter due to a non-existing gearman-java
feature: https://answers.launchpad.net/gearman-java/+question/218865.

Change-Id: If04488ec2bfc19ca1b7b3fc94ca3e04154fa55c3
2013-02-08 15:54:18 -08:00
Khai Do 6efb971490 Added methods to register gearman functions
Gearman functions to start jenkins jobs and abort jenkins jobs are now registered when
'launchWorker' checkbox is selected in the Jenkins global config page.  There is a
stop:$host function to stop a build.  There are build:$project:$label functions
to start jenkins builds.  The functions are registered and plumbed thru but don't
actually do anything yet.  I also had to fake UUID functionality due to
gearman-java issue: https://bugs.launchpad.net/gearman-java/+bug/1098816

Change-Id: I7cb772c7119fa289d17edff5d81a041cb01031ae
2013-02-08 15:54:08 -08:00
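
The function naming scheme from 6efb971490 (one stop:$host function plus a build:$project:$label function per project and label) can be illustrated with a few lines of Python. The helper and its inputs are hypothetical; the plugin itself derives them from Jenkins in Java.

```python
def function_names(host, projects):
    """Return the Gearman function names for a host.

    projects maps a project name to the list of labels it can build on.
    """
    names = ["stop:" + host]
    for project, labels in projects.items():
        for label in labels:
            names.append("build:%s:%s" % (project, label))
    return names
```

For example, a host named jenkins.example.com with a project pep8 that can run on label precise would register stop:jenkins.example.com and build:pep8:precise.
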
Khai Do 147d3b3d45 setup gearman workers
Added gearman executor and management workers.  It's all plumbed thru
and started from jenkins config page.

Change-Id: I58a7a150e9ba4748bd61254ed9328a1b9f28c3b9
2013-02-06 10:21:20 -08:00