As a first step towards supporting multiple ansible versions we need
tooling to manage ansible installations. This moves the installation
of ansible from the requirements.txt into zuul. This is called as a
setup hook to install the ansible versions into
<prefix>/lib/zuul/ansible. Further this tooling abstracts knowledge
that the executor must know in order to actually run the correct
version of ansible.
The actual usage of multiple ansible versions will be done in
follow-ups.
For better maintainability the ansible plugins live in
zuul/ansible/base where plugins can be kept in different versions if
necessary. For each supported ansible version there is a specific
folder that symlinks the corresponding plugins.
Change-Id: I5ce1385245c76818777aa34230786a9dbaf723e5
Depends-On: https://review.openstack.org/623927
Today we expect zuul_return to be run on localhost (zuul-executor).
With that in mind, convert to an action plugin so it only runs on a
zuul-executor.
Change-Id: I236360563c812ee628f78ac062e9ae6cc183aee4
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
Currently we do live streaming of command and shell tasks in
loops. However this is severely broken as we only get the output of
the first task that ran. This is hard to solve as a per task streaming
in loops is not possible because of missing start events. Further
streaming several tasks at once is difficult too because we don't know
how many tasks we need to stream upfront.
So the intermediate solution until the whole log streaming is reworked
is to just not live stream tasks in loops.
While we're at it also fix a broken log statement and remove duplicate
exit code prints of action tasks.
Change-Id: Ic1358d5b9f939549ffdd5d770fe4eafc047be6f1
flake8 3.6.0 introduces a couple of new checks; handle them in the zuul
code base:
* Disable "W504 line break after binary operator", this is a new warning
with different coding style.
* Fix "F841 local variable 'e' is assigned to but never used"
* Fix "W605 invalid escape sequence" - use raw strings for regexes.
* Fix "F901 'raise NotImplemented' should be 'raise
NotImplementedError'"
* Ignore "E252 missing whitespace around parameter equals" since it
reports on parameters like:
def makeNewJobs(self, old_job, parent: Job=None):
Change "flake8: noqa" to "noqa" since "flake8: noqa" is a file level
noqa and gets ignored with flake8 3.6.0 if it's not at the beginning of
a line. This results in many warnings for the files
./zuul/driver/bubblewrap/__init__.py and ./zuul/cmd/migrate.py. Fix any
issues there.
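As an illustration of the W605 and F901 fixes listed above, here is a
minimal sketch. VERSION_RE and BaseDriver are hypothetical names chosen
for the example; they are not taken from the zuul code base.

```python
import re

# W605: '\d' inside a normal string literal is an invalid escape sequence.
# A raw string passes the backslashes through to the regex engine untouched.
VERSION_RE = re.compile(r'^(\d+)\.(\d+)$')  # hypothetical pattern


class BaseDriver(object):  # hypothetical class
    def run(self):
        # F901: NotImplemented is a constant, not an exception class, so
        # the exception to raise is NotImplementedError.
        raise NotImplementedError()


match = VERSION_RE.match('3.6')
major, minor = match.groups()
```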
Change-Id: Ia79bbc8ac0cd8e4819f61bda0091f4398464c5dc
This fixes an issue when using zuul_return several times.
The zuul_return module takes a dictionary as a parameter with one root
key: 'zuul'. Using dict.update with two identical keys causes the later
value to override the earlier one. When using zuul_return in child and
parent playbooks, the parent zuul_return value overwrites the child
zuul_return value.
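The overwrite can be reproduced in a few lines. The sketch below also
shows one possible remedy, a recursive merge. This is an assumption made
for illustration and not necessarily how the change itself resolves the
issue; merge_return_data and the keys under 'zuul' are hypothetical.

```python
import copy


def merge_return_data(existing, new):
    """Recursively merge `new` into `existing` and return `existing`."""
    for key, value in new.items():
        if isinstance(value, dict) and isinstance(existing.get(key), dict):
            # Both sides hold a dict under this key: merge one level down.
            merge_return_data(existing[key], value)
        else:
            existing[key] = value
    return existing


child = {'zuul': {'log_url': 'http://example.com/logs/'}}   # hypothetical
parent = {'zuul': {'artifacts': ['a.tar.gz']}}              # hypothetical

# Plain update: the parent's 'zuul' value replaces the child's entirely.
flat = dict(child)
flat.update(parent)

# Recursive merge: both sets of keys survive under 'zuul'.
merged = merge_return_data(copy.deepcopy(child), parent)
```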
Change-Id: I6308020d2af0670eb5c598a3daf0ef51066651f6
The console daemon has, thus far, primarily been tested on python2 for
various reasons.
However, when forcing it to python3 by setting
ansible_python_interpreter=/usr/bin/python3, we find that the console
daemon explodes because Python 3 does not allow unbuffered I/O on
non-binary files.
It should be fine to have /dev/null be binary mode, as it is just being
used to do low-level fileno() operations; the file object is discarded
shortly thereafter.
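The Python 3 behavior described above can be demonstrated with a short
sketch. It uses os.devnull rather than a literal /dev/null path, and
open_devnull_unbuffered is a hypothetical helper name.

```python
import os


def open_devnull_unbuffered():
    # buffering=0 is only permitted in binary mode under Python 3.
    return open(os.devnull, 'rb', 0)


# Text mode with buffering=0 is rejected by Python 3 with a ValueError.
try:
    open(os.devnull, 'r', 0)
    text_mode_error = None
except ValueError as exc:
    text_mode_error = str(exc)

# Binary mode works, and the file descriptor is all that is needed.
with open_devnull_unbuffered() as devnull:
    fd = devnull.fileno()
```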
Change-Id: Ib030863c2de17825e29874733fc5a9b023f7a601
With the update to Ansible 2.5 command and shell tasks used in
handlers are broken and fail with [1]. The reason for this is that the
callback v2_playbook_on_task_start is not called anymore for
handlers. Instead the callback v2_playbook_on_handler_task_start is
called for them. This leads to a missing zuul_log_id in the handler
task and trying to log to /tmp/console-None.log. In case this file was
already created by a handler using sudo it may not be accessible which
leads to the exception.
This can be fixed by also defining v2_playbook_on_handler_task_start
in zuul_stream.
Also add a validation of zuul_log_id in the command module. This
should make it easier to spot such errors next time.
[1] Trace
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/tmp/ansible_X_j_M1/ansible_module_command.py", line 185, in follow
with Console(log_uuid) as console:
File "/tmp/ansible_X_j_M1/ansible_module_command.py", line 162, in __enter__
self.logfile = open(self.logfile_name, 'ab', buffering=0)
IOError: [Errno 13] Permission denied: '/tmp/console-None.log'
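The shape of the fix can be sketched as follows. The class below is a
stand-in and not the real zuul_stream plugin; only the two callback
method names come from Ansible. The id format, the use of plain strings
as tasks, and validate_log_id are assumptions made for the example.

```python
class StreamCallbackSketch(object):
    """Stand-in for a callback plugin; not the real zuul_stream class."""

    def __init__(self):
        self.log_ids = {}
        self._counter = 0

    def v2_playbook_on_task_start(self, task, is_conditional):
        # Assign the id that the command module later uses for its log.
        self._counter += 1
        self.log_ids[task] = 'task-%d' % self._counter

    def v2_playbook_on_handler_task_start(self, task):
        # Ansible 2.5 calls this hook for handlers instead of the one
        # above, so route handlers through the same setup.
        self.v2_playbook_on_task_start(task, False)


def validate_log_id(log_id):
    # Fail early instead of silently writing to /tmp/console-None.log.
    if log_id is None:
        raise ValueError('zuul_log_id is missing')
    return log_id


callback = StreamCallbackSketch()
callback.v2_playbook_on_handler_task_start('restart service')
```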
Change-Id: Ib9dd7fe09e4e7734f7a9ada876e6ce450ebc5038
Story: 2002528
Task: 22067
We suspect there are more fixes required, and the additional check
these changes added may cause more harm in the interim.
This reverts commit 3512a5608c.
This reverts commit d94a0d6f06.
Change-Id: I6e874cd68a1cf6f974e36ab1870573bddf4e647b
With the update to Ansible 2.5 command and shell tasks used in
handlers are broken and fail with [1]. The reason for this is that the
callback v2_playbook_on_task_start is not called anymore for
handlers. Instead the callback v2_playbook_on_handler_task_start is
called for them. This leads to a missing zuul_log_id in the handler
task and trying to log to /tmp/console-None.log. In case this file was
already created by a handler using sudo it may not be accessible which
leads to the exception.
This can be fixed by also defining v2_playbook_on_handler_task_start
in zuul_stream.
Also add a validation of zuul_log_id in the command module. This
should make it easier to spot such errors next time.
[1] Trace
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/tmp/ansible_X_j_M1/ansible_module_command.py", line 185, in follow
with Console(log_uuid) as console:
File "/tmp/ansible_X_j_M1/ansible_module_command.py", line 162, in __enter__
self.logfile = open(self.logfile_name, 'ab', buffering=0)
IOError: [Errno 13] Permission denied: '/tmp/console-None.log'
Change-Id: I1978aa8faa488fec87406e1481d455d49731f867
Story: 2002528
Task: 22067
Our custom command.py Ansible module is updated to match the
version from 2.5, plus our additions.
strip_internal_keys() is moved within Ansible yet again.
Change-Id: Iab951c11b23a24757cf5334b36bc8f7d12e19db0
Depends-On: https://review.openstack.org/567007
This updates both the dependency to ansible 2.4 and also ports in the
needed changes to the command module.
Version 2.4.0 definitely does not work for us because YAML
hosts file parsing is broken, but 2.4.1 and greater should
be fine.
Change-Id: I63f72b45ecb9533eac5ba9eb0eef426beec905e3
There was a call to fail_json where the message was passed as
a positional argument instead of via the msg keyword.
Change-Id: Ie4b9935869fab01e598fd7f34a5245515152c09b
The Zuul implementation of the command module doesn't need to stream
the console when the no_log attribute is enabled.
Change-Id: I04856c194d815855ff098a1259db16d7e0098a65
Squashed changes:
- Use 'inventory' instead of 'hostfile' in ansible.cfg.
'hostfile' is deprecated.
- Use 'os.environ.copy()' in zuul_return.py since this causes 2.4 to
throw an exception now deep within module.exit_json().
Change-Id: I0a52c9e169a54d24a7b361010045fb10211418b7
This module reads output from a command (via a pipe) one line at
a time. The only input we should receive from it is either:
* a byte string from the command output terminating with a \n
* a python "string" terminating with a \n
* that means in python2, a bytestring with a \n
* and in python3, a unicode string with a \n
For now, we only need to focus on python2 because we explicitly run
this code under python2, however, it's wise to be forward-compatible
with python3.
The error in the previous version of this code is to assume that the
value we read from the command was a unicode string which needed to
be encoded in order to be written to the log file. That's incorrect;
what we receive from the command should already be encoded according
to the system locale.
This change no longer encodes the lines received from the command
(because they are always bytestrings, in python2 or python3, they
will follow the code path where no further encoding happens). Of course,
if it turns out not to be encoded in utf-8, then zuul_stream is likely
going to bomb because it assumes everything it reads is utf-8, but
that's a different problem. In practice, we have utf-8 or C locales
universally at the moment.
Finally, there is a bunch of explicit encoding and bytestring handling
added to this method. That is mostly in service of the future codepath
under python3; elsewhere in this file we call ".addLine('[Zuul] ...')".
Under python2, that's a bytestring so no further work is necessary. In
python3, that's a unicode string, so we need to encode it.
We should never hit the exception handler; however, if we somehow manage
to, it should at least be able to write some data to the log file which
approximates what it was given.
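The rule described above (pass bytestrings through untouched, encode
only unicode strings) can be sketched like this. to_log_bytes is a
hypothetical helper and not the actual method in the module; utf-8 is
assumed as the target encoding, matching what zuul_stream expects.

```python
import io


def to_log_bytes(line):
    # Command output arrives as bytes and is already encoded according
    # to the system locale, so it is written as-is.
    if isinstance(line, bytes):
        return line
    # Literal messages such as '[Zuul] ...' are unicode under python3
    # and need encoding (under python2 they are already bytes).
    return line.encode('utf-8')


log = io.BytesIO()
log.write(to_log_bytes(b'caf\xc3\xa9\n'))           # raw command output
log.write(to_log_bytes('[Zuul] Task finished\n'))    # hypothetical message
written = log.getvalue()
```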
Change-Id: Iae2f3ee012d914454c335184a8ec7c7ecb924ec7
zuul/executor/ansiblelaunchserver.py has been superseded by
zuul/executor/server.py. zuul_afs was needed in zuul tree for 2.5 but
for 3 can just go in a role in zuul-jobs.
Change-Id: I7b8efeaa2fef32ca24d639e00a2e48a94634d3b4
Currently when ansible has an error when trying to run a command or
shell task the command module has no chance to send any console
log. Thus zuul_stream doesn't terminate by itself and gets killed
after a timeout of 30s. To fix this, zuul_console periodically sends
a notice that the logfile was not found. When the streamer is asked to
exit, it can then check whether there is still no console file and
exit by itself without waiting for the timeout.
Change-Id: I42bc05b0d2c530fbfc00c6295da24d18a6ec6435
This loads a json file (work/results.json) that the job can write
to. It will be loaded by the executor after the job completes and
returned to the scheduler.
We can use the data in this file as the reported log URL for the
build. Later we can use it to supply file/line comments in
reviews.
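A minimal sketch of loading such a file is shown below. load_results,
the fallback to an empty dict, and the 'log_url' key are assumptions made
for illustration; only the results.json file name in the work directory
comes from the description above.

```python
import json
import os
import tempfile


def load_results(work_dir):
    """Return the job's result data, or {} if none can be read."""
    path = os.path.join(work_dir, 'results.json')
    try:
        with open(path) as f:
            return json.load(f)
    except (IOError, OSError, ValueError):
        # Missing or malformed file: treat it as no data returned.
        return {}


# Simulate a job writing its results, then the executor reading them.
work_dir = tempfile.mkdtemp()
with open(os.path.join(work_dir, 'results.json'), 'w') as f:
    json.dump({'zuul': {'log_url': 'http://example.com/logs/1/'}}, f)

loaded = load_results(work_dir)
missing = load_results(os.path.join(work_dir, 'does-not-exist'))
```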
Change-Id: Ib4eb743405f337c5bd541dd147e687fd44699713
The check 'if not ret' not only matches a missing return code but also
the value 0 which is actually a successful run. Thus successful
commands end with an error 'Something went horribly wrong during task
execution'. This can be fixed by explicitly checking for None.
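The difference between the two checks is easy to see in isolation. The
function names below are hypothetical and exist only for the comparison.

```python
def is_missing_buggy(ret):
    # 'not ret' is true for None, but also for the exit code 0.
    return not ret


def is_missing_fixed(ret):
    # Only a truly absent return code counts as missing.
    return ret is None


results = [(ret, is_missing_buggy(ret), is_missing_fixed(ret))
           for ret in (None, 0, 1)]
```

With this, an exit code of 0 is flagged as missing by the first check but
not by the second, while None is flagged by both.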
Also adds a successful shell task to the test_playbook test case which
fails without this patch.
Change-Id: If1d0721574a82e247659ab0f865ae6acfe12a6be
If there's an exception, we might not write the final lines to the log.
Put those in a finally block.
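The pattern can be sketched as below. copy_output, the footer text and
failing_source are hypothetical; the point is only that the trailing
write happens even when reading the output raises.

```python
import io


def copy_output(lines, log, footer=b'[done]\n'):
    try:
        for line in lines:
            log.write(line)
    finally:
        # Runs on success and on error, so the final line is never lost.
        log.write(footer)


def failing_source():
    yield b'first\n'
    raise RuntimeError('reader failed')


log = io.BytesIO()
try:
    copy_output(failing_source(), log)
except RuntimeError:
    pass
logged = log.getvalue()
```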
Change-Id: I0a294af3874c53543176a7bd9c4b89253717b83c
This caused us to fail to log lines when running commands on
localhost (because in that situation, we use python3).
Change-Id: If9f5cf6ed62cb24a6fc16ebf189d36a752fa66d5
It gets decoded as utf-8 on the other side of the socket, and it's being
written to the file as utf-8 - so there's no reason to read it in to a
string before sending it over the wire.
Change-Id: Iad3d33108835fad3ceae0eec38985659449a6452
These are causing WILDLY strange issues on ze01. Revert them until we
understand.
Revert "Update run_command to latest ansible"
This reverts commit 1dbf5f96b5.
Revert "Sync command from ansible"
This reverts commit 429428c020.
Change-Id: Ib481160312216a4dbc9c3184d31cbc35b5b36371
Update the run_command from the basic module_utils that zuul overwrites.
This has a whole bunch of python3 changes that we can simply port in.
I've tried to just go around the zuul changes.
Change-Id: Ifca6b345f633add8e8bd136bd133ad62d1b92169
Sync the base of the command module from the latest ansible. There are a
few fixes in here that relate to python 3.
Change-Id: I474c2dd82bb11c43c52e6bba0507539951579780
This should look for the process holding open the port, and then delete
all of the remaining files.
Co-Authored-By: Clark Boylan <clark.boylan@gmail.com>
Change-Id: Iba3eda63e84c4357a121a1782b97a12232e1b8ce
Spawning a single reader at the top isn't actually working. In cases
where there are multiple playbooks per host (literally every zuul v3 job
given pre playbooks for git repos), the stamp file was preventing the
following playbook from spawning a daemon, but the daemon was only persistent in
the context of a single playbook.
We can't just spawn a new one per playbook without some modifications,
as otherwise the existing already-streamed content would get streamed
again.
Grab the finger streaming code, which accepts an argument as to what to
stream, and re-use it in zuul_console. Combine this with adding a unique
id to each task. That way each task on a host will log to a distinct
logfile, and each callback will stream only that task's log output.
This allows us to join the reader as well, so that we won't get
streaming overlap across tasks.
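The per-task log naming can be sketched as follows. The
/tmp/console-<id>.log pattern matches the path that shows up elsewhere
in this log; the use of uuid4 and the names LOG_TEMPLATE and
new_task_log are assumptions made for the example.

```python
import uuid

LOG_TEMPLATE = '/tmp/console-{log_uuid}.log'


def new_task_log():
    """Return a fresh (id, path) pair for one task's console log."""
    log_id = uuid.uuid4().hex
    return log_id, LOG_TEMPLATE.format(log_uuid=log_id)


first_id, first_path = new_task_log()
second_id, second_path = new_task_log()
```

Because every task gets its own file, a streamer that is told the id only
ever sees that task's output.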
Change-Id: Ic5eb6c38af698f4ba8b4504aa69170834ec4036a
It's a legitimate use case in ansible to register the results of a shell
command and then test its value and make decisions. Not sending stdout
back in the result object would make playbooks that do that break.
There is a risk that we die under the load of really long log files
though - so let's keep track of that.
Change-Id: I2c2fe558d6ec93cef7bb4b1c5abd983488edd745
We don't need zuul_log anymore. People can write ansible themselves,
which means they can write debug statements, which will go into the
collected log.
Also, put the port and the filename into constants so that they don't
seem like magic numbers as much.
Change-Id: I8020cc3e841617366831e80fe92fc477452d6da2
The logic to rsync files into AFS is very complex, requiring
an rsync command for each of the pseudo-build-roots that are
produced by our docs jobs. Rather than try to do this in ansible
YAML, move it into an ansible module where it is much simpler.
Change-Id: I4cab8003442734ed48c67e09ea8407ec69303d87
Having a modified command module with the zuul_runner logic allows us to
use normal command and shell entries in the playbooks. (shell is just a
wrapper around command)
At this moment in time it's an invasive fork of the run_command method
on AnsibleModule. That's not optimal for long term, but should get us
closer to being able to discuss appropriate hook points with upstream
ansible.
Use environment task parameter instead of parameters
ansible has a structure for passing in environment variables which we
can use. We did not use it before due to a behavior in ansible from
pre-2.2 that set LANG settings in the environment in a way that caused
us to need to clean things in zuul_runner. The module_set_locale
variable defaults to False in 2.2, but to True in 2.1 (which was the
regression). Set the config value explicitly just to be sure.
Change-Id: Iae4769f923ecf74462e1fe43168ea93ff1c61d6e
In the next patch, we're going to change the body of zuul_runner. But,
in order to render that diff well, do the rename in this patch.
Change-Id: I3727f506cae5da561948869bd8f8daaf42e4dc0d
Now that osic-cloud1 is providing IPv6 only, we need to update
zuul_console to also include ipv6 support. Previously, our logic
would only bind to 1 address, even if the host supported both
ipv4/ipv6.
With help from fungi, we can simplify our logic binding to both ipv4
and ipv6 at the same time.
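One common way to listen on both address families with a single socket
is an IPv6 socket with IPV6_V6ONLY disabled. The sketch below assumes
that approach; it is not necessarily the exact code in zuul_console, and
make_dual_stack_listener is a hypothetical name.

```python
import socket


def make_dual_stack_listener(port, backlog=5):
    """Listen on `port` for both IPv4 and IPv6 clients."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        # With V6ONLY off, IPv4 clients connect as IPv4-mapped addresses.
        sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
        sock.bind(('::', port))
        sock.listen(backlog)
    except OSError:
        sock.close()
        raise
    return sock
```

Calling make_dual_stack_listener(0) lets the kernel pick a free port. On
a host without IPv6 support the call raises OSError, so a caller that
must also run there would need an IPv4 fallback.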
Change-Id: Ia0286e017f14eab77c5d60333ee092c87bb1a84b
Signed-off-by: Paul Belanger <pabelanger@redhat.com>