Follows the same configuration that is used in
tripleo-quickstart-extras and documented in tripleo-docs.
Change-Id: Iba8a2db92137f9f6ad28f498627eb1b87039d99f
Story: https://tree.taiga.io/project/tripleo-ci-board/task/381
When we updated the client code on the te-broker to support some
changes that required newer clients, we broke the instance liveness
check thanks to a backwards incompatibility in novaclient. Since
all future te-brokers will be using the newer novaclient, this change
just updates the Client creation call to use the new syntax.
Change-Id: Ib3331ee21c533f028007acae355528d9c5edbf76
There are situations where we may need to deploy additional
undercloud-like nodes in a test environment. Support for this has
been added to OVB and this commit wires it into the te-broker.
Change-Id: I84bfba3ee67cd5564ad0a4372c424a2622a97e6f
This adds support for heterogeneous OVB environments to the
te-broker. It is primarily intended for scale testing jobs, since
the normal test jobs only deploy a single compute node. We wouldn't
gain much by using a smaller flavor for that one VM and there is a
cost in complexity to setting up the environment.
Right now this will only work for jobs that deploy just control and
compute nodes. Support for a third role type for ceph or others
could be added in a similar fashion.
Change-Id: I398d13356b3c15c0c7cd448366186b7589ad93e4
We sometimes see more testenvs in existence than there are running
Jenkins jobs. The likely cause of this is that a job gets killed
in some unusual way (maybe a new patch set gets pushed), and the
gear client doesn't get a chance to signal back.
Since we can't rely on anything in the instance to handle this
scenario, let's add a check to the testenv-worker that makes sure
the Jenkins instance is still around, and if not signals the gear
client to proceed and delete the testenv.
Note that this replaces the previously added zuul status check,
which proved ineffective because the instance on which it was
running gets deleted before it has a chance to do anything.
Change-Id: I0270ee2ea1498247e0aeb007f4707f9502af8324
The nonha jobs don't use net-iso, so there's no need to spend the
time creating a lot of networks and ports. In addition, OVB now
has the ability to deploy a network environment that supports
basic bonding, and this change adds support for deploying that as
well. No jobs currently use bonding, but that will be added in a
follow-up patch.
Change-Id: Ifb65d962293b8b69b2a84597c29c1ffae5d9bc2c
This allows us to deploy and boot an additional node, which we can have
ready alongside the undercloud node itself.
Change-Id: I352de761841568e2820ba334757496702980d65a
This reverts commit 0030f6c664.
Test env requests are queuing up and nothing is getting an env in a timely
manner. I think we are still trying to create envs for jobs that have
timed out. Revert to our behaviour before last week while we investigate
the problem.
Change-Id: I79353941d838628e492b46acce6dde8ef2ec3aff
Adds a named semaphore to the testenv-worker code that will
prevent more than 5 testenvs from creating at once. This requires
the use of the posix_ipc Python module, which is added to the
te-broker puppet manifest.
Right now if someone pushes a patch series of tripleo-ci changes,
which each start 8 jobs at the moment, we can end up with 20 or 30
testenvs creating concurrently. Not surprisingly, this does not
work well and most of those testenvs will fail or time out. Using
an approximate testenv creation time of 3 minutes (which is based
solely on my observation of how long the Heat stacks generally take),
the 5 concurrent creations allowed should process up to 30 testenvs
in the 20-minute timeout of the te-broker. Even if we do exceed
that number, the testenvs that _do_ get processed are far more
likely to succeed in a less loaded environment so our overall pass
rate will be higher than all 30+ at once. And of course we can
increase the concurrency if we find that 5 at once can be handled
easily, which would only increase the potential throughput.
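For reference, the throughput estimate and the concurrency cap can be
sketched as follows. This is only an illustration, not the
testenv-worker code: it uses threading.BoundedSemaphore as a stand-in
for the named posix_ipc semaphore (which works across processes), and
the names max_testenvs, run_workers and create_testenv are hypothetical.

```python
import threading
import time

MAX_CONCURRENT = 5  # cap described in this commit message


def max_testenvs(concurrency, creation_minutes, timeout_minutes):
    """Upper bound on testenvs that can finish before the timeout.

    Only full creation rounds count: 20 // 3 = 6 rounds of 5.
    """
    full_rounds = timeout_minutes // creation_minutes
    return concurrency * full_rounds


def run_workers(total, limit=MAX_CONCURRENT, work_seconds=0.01):
    """Run `total` fake creations and return the peak number in flight."""
    # Stand-in for the named semaphore; the real one is shared by
    # separate worker processes, not threads.
    gate = threading.BoundedSemaphore(limit)
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def create_testenv():
        with gate:  # blocks while `limit` creations are already running
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(work_seconds)  # placeholder for the Heat stack create
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=create_testenv) for _ in range(total)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["peak"]


if __name__ == "__main__":
    print(max_testenvs(5, 3, 20))  # 30
    print(run_workers(30) <= MAX_CONCURRENT)
```

Raising MAX_CONCURRENT scales the bound linearly under the same
3 minute assumption, which is the throughput increase mentioned above.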
The one major drawback of this approach is that Linux IPC is fairly
terrible at error-handling, so if a testenv-worker process holding
a semaphore dies in some ungraceful fashion that doesn't allow the
Python code to release the semaphore, we may have to reboot the
te-broker to clean up. I don't anticipate such a situation happening
often in the simple te-broker environment, but it's something to be
aware of.
Change-Id: Id80105a1578aa2120d2508018e53846affe254a0
Add scripts to prepare rh2 (an OVB-based cloud) for CI.
This patch only includes what's needed to prepare the cloud
for CI; the changes to the CI scripts themselves will be
part of another patch.
Change-Id: Ie2d1c607f283e6babb00ea19d32bebae5383867a