Run wheel build jobs in parallel and keep logs

Even after increasing the build-time to 180 minutes with Idc1d53d1e73b4a9eb13f258f4a4d5627ec3cf300 it's not enough to avoid timeouts. I did a manual run and it took 214 minutes on the centos builder.

We can parallelise builds. The "parallel" program provides us with plenty of useful help in this regard. As described in the comments, this will run ncpu jobs (4 on our builders) and capture and store the logs for each individual job. A test run on the centos build host was reduced to ~1 hour.

The job output is particularly handy because we can do a little work to make an easy-to-parse failure log. This job is a little weird because failures aren't hard -- we can still release everything we did manage to build. Initially this overview log file will enable a 3rd party monitor that can alert interested people when failing builds occur.

We store the build logs as a .tar.bz2 bundle as described, but the smaller files are copied directly.

Change-Id: Ifec15f71cc3530d47da4f3304d3fe094d77e0980
Depends-On: I9635721a4f8d718ad402d23600840f091267952c
2017-01-24 15:18:46 +11:00
parent b63283e583
commit 8928d5f626
2 changed files with 54 additions and 4 deletions
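
The commit message describes turning the per-job overview log into an easy-to-parse failure list. The following is a minimal sketch of that idea, not the script from this change. It assumes the joblog is tab-separated with the exit status in column 7 and the full command in column 9, as in GNU parallel's --joblog format. The helper name failed_packages, the sample rows, the package pins and the /tmp/wheelhouse path are all made up for illustration.

```shell
#!/bin/bash
# Sketch only: the joblog written below is hand-made sample data,
# not output captured from a real build.

# failed_packages (hypothetical helper): print the last word of the
# command for every job whose exit status is non-zero.
#   NR > 1   skips the header row at the top of the joblog
#   $7       is the exit status column
#   $9       is the whole command, because fields are split on tabs
failed_packages() {
    awk -F'\t' 'NR > 1 && $7 != 0 { n = split($9, cmd, " "); print cmd[n] }' "$1"
}

SAMPLE_JOBLOG=$(mktemp)
trap 'rm -f "$SAMPLE_JOBLOG"' EXIT

# Header row, then one successful job and one failed job.
printf 'Seq\tHost\tStarttime\tJobRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$SAMPLE_JOBLOG"
printf '1\t:\t1485231526.000\t12.300\t0\t100\t0\t0\tbuild_env/bin/pip --verbose wheel -w /tmp/wheelhouse six===1.10.0\n' >> "$SAMPLE_JOBLOG"
printf '2\t:\t1485231526.100\t45.600\t0\t200\t1\t0\tbuild_env/bin/pip --verbose wheel -w /tmp/wheelhouse lxml===3.7.2\n' >> "$SAMPLE_JOBLOG"

failed_packages "$SAMPLE_JOBLOG"   # prints: lxml===3.7.2
```

Splitting on tabs and taking the last word of the command avoids depending on a fixed field number such as $14, which only holds while the pip command line has exactly six whitespace-separated words.
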
--- a/jenkins/jobs/wheel-mirror.yaml
+++ b/jenkins/jobs/wheel-mirror.yaml
@@ -65,6 +65,7 @@
          python: "{python}"

    publishers:
+      - devstack-logs
      - console-log

 - job-template:
--- a/jenkins/scripts/wheel-build.sh
+++ b/jenkins/scripts/wheel-build.sh
@@ -5,16 +5,65 @@ WHEELHOUSE_DIR=$1
 PROJECT=openstack/requirements
 WORKING_DIR=`pwd`/$PROJECT
 PYTHON_VERSION=$2
+LOGS=$WORKSPACE/logs
+
+FAIL_LOG=${LOGS}/failed.txt
+
+# preclean logs
+mkdir -p ${LOGS}
+rm -rf ${LOGS}/*

 # Extract and iterate over all the branch names.
 BRANCHES=`git --git-dir=$WORKING_DIR/.git branch -r | grep '^  origin/[^H]'`
 for BRANCH in $BRANCHES; do
    git --git-dir=$WORKING_DIR/.git show $BRANCH:upper-constraints.txt \
        2>/dev/null > /tmp/upper-constraints.txt  || true
+
+    # setup the building virtualenv.  We want to freshen this for each
+    # branch.
    rm -rf build_env
    virtualenv -p $PYTHON_VERSION build_env
-    for pkg in $(cat /tmp/upper-constraints.txt); do
-        build_env/bin/pip --log $WORKSPACE/pip.log wheel -w $WHEELHOUSE_DIR "${pkg}" || \
-            echo "*** WHEEL BUILD FAILURE: ${pkg}"
-    done
+
+    # SHORT_BRANCH is just "master","newton","kilo" etc. because this
+    # keeps the output log hierarchy much simpler.
+    SHORT_BRANCH=${BRANCH##origin/}
+    SHORT_BRANCH=${SHORT_BRANCH##stable/}
+
+    # Failed parallel jobs don't fail the whole job, we just report
+    # the issues for investigation.
+    set +e
+
+    # This runs all the jobs under "parallel".  The stdout, stderr and
+    # exit status for each pip invocation will be captured into files
+    # kept in ${LOGS}/build/${SHORT_BRANCH}/1/[package].  The --joblog
+    # file keeps an overview of all run jobs, which we can probe to
+    # find failed jobs.
+    cat /tmp/upper-constraints.txt | \
+        parallel --files --progress --joblog ${LOGS}/$SHORT_BRANCH-job.log \
+                --results ${LOGS}/build/$SHORT_BRANCH \
+                build_env/bin/pip --verbose wheel -w $WHEELHOUSE_DIR {}
+    set -e
+
+    # Column $7 is the exit status of the job, $14 is the last
+    # argument to pip, which is our package.
+    FAILED=$(awk -e '$7!=0 {print $14}' ${LOGS}/$SHORT_BRANCH-job.log)
+    if [ -n "${FAILED}" ]; then
+        echo "*** FAILED BUILDS FOR BRANCH ${BRANCH}" >> ${FAIL_LOG}
+        echo "${FAILED}" >> ${FAIL_LOG}
+        echo -e "***\n\n" >> ${FAIL_LOG}
+    fi
 done
+
+if [ -f ${FAIL_LOG} ]; then
+    cat ${FAIL_LOG}
+fi
+
+# XXX This does make a lot of log files; about 80mb after compression.
+# In theory we could correlate just the failed logs and keep those
+# from the failure logs above.  This is currently (2017-01) left as an
+# exercise for when the job is stable :) bz2 gave about 20%
+# improvement over gzip in testing.
+pushd ${LOGS}
+tar cvjf build-logs.tar.bz2 ./build
+rm -rf ./build
+popd
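
The SHORT_BRANCH derivation in the loop relies on shell prefix-stripping parameter expansion. A minimal standalone sketch of the same two expansions follows. The function name short_branch is hypothetical and is not part of wheel-build.sh.

```shell
#!/bin/bash
# short_branch (hypothetical helper): reduce a remote branch ref such
# as "origin/stable/newton" to the short name used for log paths.
short_branch() {
    local name="${1##origin/}"   # drop a leading "origin/"
    name="${name##stable/}"      # then drop a leading "stable/", if any
    echo "$name"
}

short_branch origin/master          # prints: master
short_branch origin/stable/newton   # prints: newton
```

Only the "origin/" and "stable/" prefixes are removed, so a ref with some other prefix would keep its slash and produce a nested log directory.
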