Updated openstack/openstack
Project: openstack-infra/gear 865be4cec8009207e57c67f80a5b2b9ee4015fd6 Fix race between stopWaitingForJobs() and getJob() Turbo-Hipster's test suite was failing for me in a very non-deterministic manner -- sometimes the ZuulClient would get stuck in a call to gearman-Worker's getJob. It turned out that it was possible for the whole worker.stopWaitingForJobs to finish before a call to worker.getJob) gets scheduled. This meant that the stopWaitingForJobs' logic which tried hard to interrupt any pending getJob() calls failed. The fix is to let the getJob() check whether it missed it chance, i.e. whether the whole worker is not supposed to be running anymore. In order to guarantee thread safety, both setting of this variable and checking whether it's set should happen in a synchronized manner. Stuff gets messy here: both getJob() and stopWaitingForJobs() acquire a lock, which means that getJob() must *not* hold the lock while it blocks (otherwise, stopWaitingForJobs() won't be able to interrupt it because it would get deadlocked before it gets a chance to enqueue its Nones). It seems that there's one illusive race, though -- when thread B is executing getJob() and gets interrupted right after the try/finally terminates (and hence the lock is already released) and execution turns into stopWaitingForJobs, it's quite possible that the self.running gets unset after the getJob has already checked it. However, the lock also protects the self.waiting_for_jobs which means that either the stopWaitingForJobs() will see an increased waiting_for_jobs integer, or that the getJob() will notice an updated self.running. Change-Id: I51ec9cf06622d91ab22a4ff80630fae7913d4b5d
This commit is contained in:
parent
928f9d7698
commit
9cf5714a1f
|
@ -0,0 +1 @@
|
|||
Subproject commit 865be4cec8009207e57c67f80a5b2b9ee4015fd6
|
Loading…
Reference in New Issue