Updated openstack/openstack

Project: openstack-infra/gear  865be4cec8009207e57c67f80a5b2b9ee4015fd6

Fix race between stopWaitingForJobs() and getJob()

Turbo-Hipster's test suite was failing for me non-deterministically:
sometimes the ZuulClient would get stuck in a call to the gearman
Worker's getJob(). It turned out that an entire call to
worker.stopWaitingForJobs() could finish before a call to
worker.getJob() even got scheduled. This meant that the logic in
stopWaitingForJobs(), which tries hard to interrupt any pending
getJob() calls, had no effect on that later call.

The fix is to let getJob() check whether it missed its chance, i.e.
whether the worker is no longer supposed to be running. To guarantee
thread safety, both setting this flag (self.running) and checking it
must happen under a lock. This is where things get messy: both getJob()
and stopWaitingForJobs() acquire the lock, which means that getJob()
must *not* hold it while it blocks (otherwise stopWaitingForJobs()
would deadlock before it got a chance to enqueue its Nones, and so
could not interrupt getJob()).
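
The locking scheme described above can be sketched roughly as follows.
This is a minimal illustration, not the actual gear code: the class name
SketchWorker, the Interrupted exception and the attribute names job_lock
and job_queue are assumptions made for the example; only running,
waiting_for_jobs, getJob() and stopWaitingForJobs() come from the
description.

```python
import queue
import threading


class Interrupted(Exception):
    """Hypothetical exception: the worker is shutting down."""


class SketchWorker:
    """Illustrative stand-in for a worker; not gear's real class."""

    def __init__(self):
        self.running = True
        self.waiting_for_jobs = 0
        self.job_lock = threading.Lock()   # assumed name
        self.job_queue = queue.Queue()     # assumed name

    def getJob(self):
        # Check the flag and register as a waiter atomically.
        with self.job_lock:
            if not self.running:
                # stopWaitingForJobs() already ran; nobody will wake us.
                raise Interrupted()
            self.waiting_for_jobs += 1
        # The lock is released here, *before* blocking, so that
        # stopWaitingForJobs() can acquire it and enqueue its Nones.
        job = self.job_queue.get()
        with self.job_lock:
            self.waiting_for_jobs -= 1
        if job is None:
            raise Interrupted()
        return job

    def stopWaitingForJobs(self):
        with self.job_lock:
            self.running = False
            # One None per registered waiter wakes each blocked getJob().
            for _ in range(self.waiting_for_jobs):
                self.job_queue.put(None)
```

Calling stopWaitingForJobs() before getJob() makes getJob() raise
immediately, and calling it while getJob() is blocked wakes the waiter
with a None.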

There seems to be one elusive race, though: when thread B is executing
getJob() and gets preempted right after the try/finally terminates (so
the lock is already released) and execution switches to
stopWaitingForJobs(), self.running may well get unset after getJob()
has already checked it. However, the lock also protects
self.waiting_for_jobs, which means that either stopWaitingForJobs()
will see an incremented waiting_for_jobs counter, or getJob() will
notice the updated self.running.
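
The either/or argument can be checked with a small single-threaded
model. Because both critical sections run under the same lock, they can
only execute in one of two orders. The sketch below (function names and
the state dictionary are hypothetical, chosen for illustration) runs
both orders and records whether the getJob() caller ends up interrupted
or with a None waiting for it in the queue.

```python
from itertools import permutations


def get_job_critical(state):
    """Model of getJob()'s locked section: check running, then register."""
    if not state["running"]:
        state["interrupted"] = True      # getJob() raises right away
    else:
        state["waiting_for_jobs"] += 1   # getJob() will go on to block


def stop_critical(state):
    """Model of stopWaitingForJobs()'s locked section."""
    state["running"] = False
    # One None is enqueued per waiter that is visible at this point.
    state["nones_enqueued"] += state["waiting_for_jobs"]


def outcomes():
    """Run both possible lock orders; report whether getJob() is woken."""
    results = {}
    for order in permutations((get_job_critical, stop_critical)):
        state = {
            "running": True,
            "waiting_for_jobs": 0,
            "nones_enqueued": 0,
            "interrupted": False,
        }
        for step in order:
            step(state)
        woken = state["interrupted"] or state["nones_enqueued"] >= 1
        results[tuple(step.__name__ for step in order)] = woken
    return results
```

In this model both orders leave the getJob() caller either interrupted
up front or with a None to consume, which is the property the paragraph
above relies on.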

Change-Id: I51ec9cf06622d91ab22a4ff80630fae7913d4b5d
Jenkins 2015-02-27 15:52:39 +00:00 committed by Gerrit Code Review
parent 928f9d7698
commit 9cf5714a1f
1 changed files with 1 additions and 0 deletions

1
gear Submodule

@@ -0,0 +1 @@
Subproject commit 865be4cec8009207e57c67f80a5b2b9ee4015fd6