It removes the existing TQE collect-logs role in order to
test with the new opendev/ansible-role-collect-logs role.
https://tree.taiga.io/project/tripleo-ci-board/task/1001
Change-Id: Ib7892fca145a8c1947f54bfa8f7a35675e625e4d
Signed-off-by: Chandan Kumar <chkumar@redhat.com>
The reproducer script assumes a job was started by Zuul. In some instances
we are still running jobs on platforms other than Zuul, such as ci.centos. In
those instances log collection fails because the reproducer script creation
misses a lot of files.
This patch includes that role only on the condition that the zuul
variable is defined.
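The guard can be sketched as an Ansible task like the following; the role name is a placeholder, only the `when: zuul is defined` condition reflects this change:

```yaml
# Hypothetical include; the role name is illustrative, but the
# guard matches the condition described above.
- name: Create the reproducer script only for Zuul-started jobs
  include_role:
    name: create-reproducer-script
  when: zuul is defined
```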
Change-Id: I01ec134f85943a8332832bdc6caf138d978c1c37
ARA reports are generated automatically in infra when
they are placed in an ara-report directory. Don't generate HTML reports
in upstream jobs, to save space.
Change-Id: I7aa81ae3b06878baeab471e340477ca8a20f8594
zuul_variables and zuul_console were added when there
was an investigation into simulating zuul behavior
for the reproducer. We are no longer following that
workflow for the reproducer and therefore these
failing lines can be removed.
Change-Id: I5f057ce78273ddf8cd6381c9a420b317713379b6
Makes those files conformant with current linting rules and avoids
linting errors when we need to touch them again.
Previously, running "pre-commit run -a" uncovered these errors; now it
no longer reports any errors.
Change-Id: Ie4cf229c8f11c2b55b323eac23c89483b26d3781
This uses the zuul-variables file for zuul.projects, and the JSON console
log written by the Depends-On change for the playbooks. This is tracked by
the CI team with [1]. These are made available to, but not yet used in, the
reproducer (that will come later; see the story linked in [1]).
[1] https://tree.taiga.io/project/tripleo-ci-board/task/270
Depends-On: https://review.openstack.org/615191
Co-Authored-By: Ronelle Landy <rlandy@redhat.com>
Change-Id: I64923cfd75e697b98507be1ff398c14654108ddf
In the latest versions of the undercloud install, ansible is used.
Use ARA to profile the undercloud install steps as driven by TripleO
and save the ARA report to logs/ara_oooq_root/
Change-Id: I2b034b83ba7779d15a5d69263e67d3aea3f631a8
When using pipes in the Ansible shell module we can miss errors
because the shell executes without pipefail by default.
This rule will check every shell task for "set -o pipefail"
and will fail if it is missing.
It excludes tasks which have 'ignore_errors: true' or register
variables.
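A task that passes such a rule might look like this sketch (the log path is illustrative):

```yaml
# With pipefail set, a failing grep is no longer masked by the
# succeeding wc at the end of the pipe.
- name: Count error lines in the install log
  shell: |
    set -o pipefail
    grep ERROR /var/log/install.log | wc -l
```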
Change-Id: I394c72040d62dff76180aeb9d703bb8a212bcc98
Using var/log/extra/dump_variables_hostvars.json.txt.gz
we can see the existence of this variable in the logs.
Closes-Bug: #1742557
Change-Id: Ifd02ab371731dd4be863c193878567d170b02ee7
If the Graphite server has a problem or collecting data fails, we
still need to continue publishing logs, so ignore errors in the
ara graphite task.
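The change amounts to adding ignore_errors to that task; this sketch uses a placeholder script path:

```yaml
# Illustrative task: the script path is hypothetical; the point is
# that a Graphite failure must not abort log publishing.
- name: Send ARA statistics to Graphite
  shell: ./ara-graphite.sh
  ignore_errors: true
```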
Change-Id: I7a330b673a7b06ebc0ef1a7617ba07069fecb1b1
This review adds functionality to create a reproducer
script in the logs. The reproducer script will allow
users to recreate failing OVB and multinode jobs
in personal cloud tenants.
User documentation for the reproducer-quickstart script
is added.
Change-Id: I9fe8550a75c3ffb6d1271b01b1144bfbdc82c95d
The ui_validate_simple test has been failing of late and
it is hard to debug since logs are not collected
from localhost.
This review copies all the .sh and .log files
from the local working dir to the log directory so that
they are included in log collection.
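A minimal sketch of that copy step, assuming the role's usual variable names (local_working_dir, artcl_collect_dir are assumptions here):

```yaml
# Hedged sketch: copy local scripts and logs into the collect dir
# so they are picked up by publishing; variable names are assumed.
- name: Copy local .sh and .log files into the log directory
  shell: >
    cp {{ local_working_dir }}/*.sh {{ local_working_dir }}/*.log
    {{ artcl_collect_dir }}/
  ignore_errors: true
```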
Related-Bug: 1734928
Change-Id: I3bc646f5d1d9584cef4e624e5904f7e88c59442e
Send ARA statistics of particular tasks to Graphite server
Depends-On: Ie5324b3328c1516d5a0e6af263da61b1d8692b4b
Change-Id: I7167b62dada67403faf1f5171d6cddef419e8da2
This change will create a README file with a simple job debug guide and
links to the frequently used but somewhat hidden files within the
collected logs.
Change-Id: I818067952017c88e855bfeee76fa438638cdd942
In upstream tripleo-ci:
- The collect_dir is a link, not a dir. If the file module is invoked with
  state: directory, it will change the link to a dir, creating a new
  empty dir and preventing the rest of the tasks from working properly.
  Stop using the file module; use a shell command to create the dir instead.
- The collect dir is a link, not a dir, so find {{ dir }} is not finding
  anything.
  Add a trailing slash to all find commands to be sure find follows the link.
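The two fixes can be sketched as follows (the directory variable is assumed):

```yaml
# mkdir -p leaves an existing symlink alone, unlike the file
# module, which would replace it with a new empty directory.
- name: Ensure the collect dir exists
  shell: mkdir -p {{ artcl_collect_dir }}

# The trailing slash makes find follow the symlink into the
# directory it points to.
- name: Find collected log files
  shell: find {{ artcl_collect_dir }}/ -type f -name '*.log'
```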
Change-Id: Ibe7b5083a8aaf60a7858d759d54ede0b630e7bbf
Currently the first task in publish checks if the logs dir is a
directory, and fails if it's a link to a directory.
Add follow: yes so the location can also be a directory pointed to
from a link.
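As a sketch, the check with the added option (the path variable is assumed):

```yaml
# With follow: yes, stat resolves the symlink and reports on the
# directory it points to, so a linked logs dir passes the check.
- name: Check that the logs location is a directory
  stat:
    path: "{{ artcl_collect_dir }}"
    follow: yes
  register: collect_dir_stat
```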
Change-Id: I992a9c89d7b04fd2a890b8d85f86cc34e8767740
In addition to known text file extensions, also rename files without
extension in the /var/log and /etc directories.
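The extensionless-file rename could look like this sketch (paths and loop are illustrative, not the patch's exact tasks):

```yaml
# Append .txt to files without an extension under var/log and etc
# so upstream log servers render them as text.
- name: Rename files without an extension
  shell: |
    find {{ artcl_collect_dir }}/var/log {{ artcl_collect_dir }}/etc \
      -type f ! -name '*.*' | while read -r f; do
      mv "$f" "$f.txt"
    done
```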
Change-Id: Ia9898816831392951cd927b7661d4d8fdcb4d007
Collecting the console logs from internal jobs is failing due to
certificate issues. Adding the -k option to curl will ignore
any certificate requirements and collect the console log.
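Sketch of the fetch with the added flag (the URL variable is hypothetical):

```yaml
# -k (--insecure) makes curl skip TLS certificate verification,
# working around the internal certificate issues.
- name: Download the console log
  shell: curl -sk "{{ console_log_url }}" -o {{ artcl_collect_dir }}/console.log
```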
Change-Id: Ic90a045c5cc848996dd23be3210347fb95319a13
Add the artcl_txt_rename option. When enabled, the publishing step
renames files with known text extensions to end in .txt.gz, which is
displayed directly by upstream log servers.
Also simplify the way we handle the stackviz and tempest results.
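The renaming step could be sketched like this; only the artcl_txt_rename flag comes from this change, the extension handling is illustrative:

```yaml
# Rename gzipped logs so they end in .txt.gz, which upstream log
# servers display inline instead of offering as a download.
- name: Rename known text extensions to .txt.gz
  shell: |
    find {{ artcl_collect_dir }}/ -type f -name '*.log.gz' | while read -r f; do
      mv "$f" "${f%.gz}.txt.gz"
    done
  when: artcl_txt_rename|bool
```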
Change-Id: I793088995ca5a945738c5b04c1cefdd974e5f2d1
ARA is being set up by quickstart.sh and the static report is
later generated by collect-logs.sh in the workspace.
The collect-logs role should seek that location and recover it if
it's available.
Change-Id: I611a071bb839f3c402a6c1bc2db35951f75461e0
We are having issues with collect-logs and potentially infra. Use
shell so we get better output.
This was motivated by a failure in collect-logs.
https://ci.centos.org/job/tripleo-quickstart-promote-master-delorean-minimal/858/console
failed here:
https://github.com/openstack/tripleo-quickstart-extras/blame/master/roles/collect-logs/tasks/publish.yml#L23
The output from command module looks like this:
01:24:17.955 fatal: [localhost]: FAILED! => {"ansible_job_id":
"897063162915.10850", "changed": false, "cmd":
"/home/rdo-ci/.ansible/tmp/ansible-tmp-1484081443.92-162620171328073/command.py",
"failed": 1, "finished": 1, "msg": "[Errno 2] No such file or directory",
"outdata": "", "stderr": ""}
The actual command is buried in {longpath}/command.py, and for
the CI jobs only a handful of folks can get ssh access to the
nodes, which would allow for setting ANSIBLE_KEEP_REMOTE_FILES=1
to debug this (by looking at the actual command.py after failure).
Switching to shell block will output the expanded shell command
to the output by default.
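The switch itself is mechanical; this sketch uses placeholder names for the destination:

```yaml
# With the command module, a failure only shows the tmp command.py
# path; with shell, the expanded command line appears in the output.
- name: Publish collected logs
  shell: >
    rsync -av {{ artcl_collect_dir }}/ {{ artcl_publish_dest }}
```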
Change-Id: Ie94492f023a2c7af8b6361f9538184c7de55cd7a
Uploading the logs can take too much time, resulting in an ssh connection
timeout during the Ansible task execution. Log collection usually runs
in post-build tasks, making it immune to Jenkins job timeouts.
This change makes the upload tasks asynchronous and adds a timeout.
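A minimal sketch of the asynchronous pattern (values and names are illustrative):

```yaml
# async runs the upload in the background on the remote side and
# bounds it to an hour; poll reconnects every 15 seconds, so no
# single ssh connection has to stay open for the whole upload.
- name: Upload logs
  shell: rsync -av {{ artcl_collect_dir }}/ {{ artcl_publish_dest }}
  async: 3600
  poll: 15
```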
Change-Id: I65cf017717775ac85b953fe554f84c79e4f808b5
We should run and publish a minimal set of files even without any host,
when the run failed before inventory generation.
This change separates the collection step, which runs on all hosts except
localhost, from the rest, which runs on localhost. Running on localhost
always succeeds, even with an empty inventory.
Also add a log environment file, for local collect-logs.sh runs, that
does not upload logs.
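The split might be sketched as two plays; the play names are illustrative, only the host patterns reflect the change:

```yaml
# Collection runs on the real nodes and may be skipped when the
# inventory is empty; publishing runs on localhost and always proceeds.
- name: Collect logs from all remote hosts
  hosts: all:!localhost
  roles:
    - collect-logs        # collection tasks

- name: Publish logs
  hosts: localhost
  roles:
    - collect-logs        # publish tasks; always has a host to run on
```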
Change-Id: I48d07d42be879026fb80afd73835484770006f85
Zuul uses different variables and files to handle artifact uploading,
this change adds the required modifications to collect and publish logs
in the rdoproject.org Zuul environment.
Change-Id: I5d74392210e55be5f5ecd889a5017750a874d45a