Merge "Spec: Python Build/Install Process Simplification"
commit 77efc8c47a

@@ -0,0 +1,339 @@

Python Build/Install Process Simplification
###########################################
:date: 2017-09-06 13:00
:tags: python, build, source, repo

The current Python wheel/venv build process is not easily understood, and the
install process has become complicated. This blueprint aims to make deployment
simpler, to make the process easier to understand, and to make many of the
features which are currently forced on all deployers opt-in.

Launchpad Blueprint:
https://blueprints.launchpad.net/openstack-ansible/+spec/python-build-install-simplification

Problem description
===================

Building
~~~~~~~~
The Python repository used for OpenStack-Ansible deployments is used to
prepare `Python wheels`_ for any git- or pypi-sourced packages for an
environment. Using wheels speeds up the installation of the package and
takes away the need to install the distribution packages required to
compile the package when installing.

The repository preparation process also prepares `Python virtualenvs`_
for all OSA roles with the prefix ``os_`` (which are expected to be
OpenStack services) in order to speed up the deployment of the services
by downloading a complete virtualenv instead of installing the packages
individually for every host that needs the service.

The ``py_pkgs`` lookup, which pulls together the information used by the
build process, is a black box in terms of what it does. It makes some
decisions about the information it reads and outputs which are not
documented anywhere other than in the code itself. The code is not easily
modified without breaking the process and is therefore most often left
alone and not well maintained, resulting in an increasing amount of technical
debt. The subsequent jinja in the repo-build role which processes its output
is tough to work through and not easily maintained. While both of these could
be adjusted to make use of different plugins or filters, they would remain
a set of black boxes which are complex to untangle.

The way that git repositories are specified and parameters are provided
to the build process does not scale very well. Each git repo requires at
least two flat variables to be set (``git_repo`` and ``git_install_branch``)
and can optionally have more set. This model of setting variables makes it
easy to override individual settings, but requires the use of a pattern
match mechanism to discover all the settings (which is why we use the lookup
to do it). The settings are also spread across disparate places
(defaults/repo_packages, role/defaults), making them hard to find. It is not
obvious to most newcomers how to change them, and it is not obvious to many
veterans what many of the settings mean. It often requires a lot of code
walking to understand the meaning of some settings like ``venvwithindex``
and ``ignorerequirements``.

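To illustrate why a pattern match is needed with flat variables, the
following minimal sketch groups them by service prefix. The variable names,
the URLs and the ``discover_git_sources`` helper are assumptions made for
illustration only; this is not the ``py_pkgs`` lookup itself.

```python
import re

# Hypothetical flat variables, as they might be spread over several files.
FLAT_VARS = {
    "nova_git_repo": "https://git.example.org/openstack/nova",
    "nova_git_install_branch": "stable/pike",
    "glance_git_repo": "https://git.example.org/openstack/glance",
    "glance_git_install_branch": "stable/pike",
    "debug": False,  # unrelated variable, ignored by the pattern
}

# Matches "<service>_git_repo" and "<service>_git_install_branch".
GIT_VAR = re.compile(r"^(?P<service>\w+?)_git_(?P<key>repo|install_branch)$")


def discover_git_sources(variables):
    """Group flat git variables into one dict per service."""
    sources = {}
    for name, value in variables.items():
        match = GIT_VAR.match(name)
        if match:
            service = sources.setdefault(match.group("service"), {})
            service[match.group("key")] = value
    return sources


print(discover_git_sources(FLAT_VARS))
```

Every additional optional setting would need the pattern to be extended,
which is part of why this model becomes hard to follow.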
The git clone process which fetches the git sources used when building is
run asynchronously in order to improve the time to completion. However,
individual asynchronous tasks cannot be retried in Ansible, and the git
clones commonly fail. This is an Ansible limitation which we could work
around by implementing our own action module, but this would increase the
technical debt as we would have to constantly keep the module code updated
as we update to later versions of Ansible.

When building wheels, pip has no way of resolving all dependencies up-front.
The only capability it has is to resolve the requirements for the current
package. It then processes each package requirement in turn. To do so requires
downloading the package and unpacking it to read the requirements. This is a
sequential process and therefore takes a long time when processing packages
with a lot of requirements, as is typical for OpenStack projects.

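The sequential walk described above can be modelled with a toy example. The
package names and the ``REQUIREMENTS`` mapping are made up, and appending to
a list stands in for the download and unpack step; this is a sketch of the
shape of the process, not of pip's internals.

```python
# Hypothetical metadata: package -> direct requirements. In the process
# described above this is only known once a package has been unpacked.
REQUIREMENTS = {
    "service-a": ["lib-config", "lib-db"],
    "lib-config": ["lib-compat"],
    "lib-db": ["lib-config", "lib-sql"],
    "lib-sql": [],
    "lib-compat": [],
}


def resolve_sequentially(root):
    """Visit requirements one package at a time, starting from root."""
    fetched = []  # order in which packages were "downloaded"
    queue = [root]
    while queue:
        package = queue.pop(0)
        if package in fetched:
            continue  # already downloaded and inspected
        fetched.append(package)  # stands in for download + unpack
        queue.extend(REQUIREMENTS[package])  # requirements only visible now
    return fetched


print(resolve_sequentially("service-a"))
```

Each package has to be fetched before its own requirements are known, so the
work cannot be planned ahead and the total time grows with the size of the
dependency tree.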
In Kilo the OpenStack requirements management process did not have the jobs
which tested the co-installability of all OpenStack packages and produced the
``upper-constraints.txt`` file as a manifest of which package versions worked
together. We therefore needed to do our own processing of all python packages
which would be installed by the roles and had to compile a set of requirements
and constraints across them all for the purpose of building the wheels, and
ensuring that the installed set was consistent for a build. When the
OpenStack requirements repository started publishing the upper-constraints
file we adopted it immediately to help keep builds more consistent. However,
we still produce our own ``requirements_absolute_requirements.txt`` file
which is used for all pip install tasks in order to ensure consistency and
to ensure that the packages we built from git are used (instead of making
the install in the role do the install from the git source, it installs from
the wheel held in the repo server). This is no longer practical, as
different services have requirements and needs which cannot be resolved
down to a common set. We need to be able to allow the installation of any
version of packages and only apply constraints when needed.

Some of the venvs we build do not adhere to the OpenStack requirements process
and therefore sometimes cannot be built using the upper constraints file.
There has also been some interest in being able to do mixed series deployments
instead of homogeneous deployments. This would involve preparing a venv
containing packages from a different series with a different set of
constraints. Currently the constraints used in the repo build process are
global - we only have the ability to enable/disable their use when building
venvs. It would be better to be able to specify a global fallback for
constraints, but to allow per-venv constraints too.

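A global fallback with per-venv overrides could look roughly like the sketch
below. The file names, the venv names and the ``constraints_for`` helper are
hypothetical and only show the lookup order, not an agreed interface.

```python
# Hypothetical constraint sources; names and paths are illustrative only.
GLOBAL_CONSTRAINTS = "upper-constraints.txt"
VENV_CONSTRAINTS = {
    "tempest": None,  # explicitly built unconstrained
    "nova-other-series": "upper-constraints-other-series.txt",
}


def constraints_for(venv):
    """Return the constraints file for a venv, or None for unconstrained."""
    # An explicit entry (even None) overrides the global fallback.
    if venv in VENV_CONSTRAINTS:
        return VENV_CONSTRAINTS[venv]
    return GLOBAL_CONSTRAINTS


for name in ("glance", "tempest", "nova-other-series"):
    print(name, "->", constraints_for(name))
```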
The use of Python 2.7 for OpenStack and Ansible is waning and the need to
shift everything to use Python 3.5 has arisen as a new requirement. The
tooling will need to be shifted to implement the venvs using Python 3.5
where applicable, but may still need to prepare venvs using Python 2.7 if
a service does not yet support running in a Python 3.5 environment.

In Newton we introduced the ability to do multi-architecture builds, then
also had to split out multi-distro builds because the C libraries that the
wheels and venvs reference differ between distros. Currently this is working,
but it makes the repo build process much more complex and makes it take a lot
more time. The process to synchronise the per-distro and per-architecture
built artifacts is error prone and confuses many newcomers to the project.

In order to facilitate using the repo-server to respond to pip index queries,
multiple directories and symlinks have been used to prepare the appropriate
structure so that the correct responses are given back to pip. The process of
setting up all the symlinks is very time consuming and in some places the
process may cause dead links, especially when rebuilding for a specific release
tag.

Storing
~~~~~~~
Once the wheels, venvs and other artifacts are built for an environment they
are stored and synchronised between the repo containers using a combination of
rsync and lsyncd. While this sync process is generally OK, it is commonly a
cause for confusion and requires a complex troubleshooting process to figure
out why packages are not present.

Installing
~~~~~~~~~~
The consumption of the prepared wheels and virtualenvs has changed over
time. With the introduction of ``developer_mode`` into the roles there
is a lot of code and functionality duplication between the repo build
process and the role installation process.

The need to cater for the optional inclusion of a variety of plugins/drivers
in the venvs either through the use of additional Python packages or by
symlinking system packages into the venv (when the package is proprietary or
unavailable via git or pypi) causes further complexity in the process.

When executing a pip installation, pip always looks for packages in the
following order: local cache, local folder, default index, extra indexes.
Pip will always check all locations before deciding which to use for the
installation. This means that if there are multiple indexes used, it queries
them all. This can be slow if any of those are not local to the environment.

.. _Python wheels: http://pythonwheels.com
.. _Python virtualenvs: https://virtualenv.pypa.io

Proposed change
===============

* Change the repo build process to, by default, only build wheels for git
  sources it is given without also building the dependencies. The ability
  to build all wheels will still be there, but will not be the default
  behaviour. This will cut down the time taken in this process when in CI,
  development environments or small online environments where it is not
  necessary to build/store all the wheels. The full build will only be
  necessary for offline deployments and for environments where the deployer
  specifically opts in to ensuring that everything is built.

* Replace the current storage structure for wheels with a flat directory.
  This directory will be served via the pypi API provided by the very simple
  `pypiserver`_ application. If we need to continue to provide per-distro
  or per-architecture wheels then we could implement distro/arch indexes
  which are supplied by individual folders. However, it is unlikely that
  this will be necessary.

* Use nginx as a reverse proxy which responds to requests from pip by first
  trying against the local pypiserver, then against tarballs.openstack.org
  and then against pypi. This will allow nginx to cache all downloaded
  packages (speeding up subsequent requests) without the repo server having
  to build them.

* Implement changes to the roles to allow service-specific constraints to
  be applied when building venvs. This allows a CI process to build service
  venvs and to publish the list of tested versions for that service. Then
  for production builds the published list can be used as a constraint for
  the venv to ensure that production builds use the same versions. This
  solves a problem we have today where some projects (eg: tempest, rally,
  gnocchi) have to be built unconstrained as they do not conform to the
  global requirements process.

* Implement changes to each role to handle the wheel building and venv
  building, but do it in such a way that the build alone can be executed
  by using tags, setting a specific flag, or using ``include_role`` with
  ``tasks_from``. The specific dependencies can then be itemised in the
  role and the role can be used for artifact preparation.

* Remove all pip install activities from hosts, replacing them with the
  use of distro packages exclusively for any python requirements on the
  hosts. We should avoid installing python packages on the host wherever
  possible and focus all efforts on implementing everything we need
  (including the Ansible requirements for targeted hosts) into venvs.
  All Ansible tasks should then specifically use the appropriate venv
  when executing tasks, avoiding the use of any python libraries on the
  host. This prevents system package conflicts and will reduce the host
  package installation requirements.

* Implement a playbook which is optionally used to prepare pre-built venvs
  for an environment as they are today. If a deployer wishes to prepare
  the venvs in a build process, the playbook should be exercised in the
  build process and should be executed on a designated 'build host' which
  will make use of ephemeral containers and/or virtual machines on the build
  host to exercise the builds for the necessary distribution and architecture
  combinations.

* Remove the complex git caching/staging process which exists today and
  make the use of the repo server for git caching for the services that
  require it (eg: nova-console uses novnc/spice from git) entirely optional.

* Implement a playbook which can be used to stage offline installs by
  downloading all built artifacts (completed, perhaps by a CI job) to the
  deployment host, then distributing them appropriately.

* Simplify the constraints management by passing ``--constraint`` options
  to pip in the following order:

    --constraint user-specified-constraints.txt
    --constraint openstack-ansible-pins.txt
    --constraint openstack-upper-constraints.txt

  This would replace the current method which merges the various constraints
  into one file, requiring a fair amount of jinja magic because a single
  file cannot hold two constraints for the same package and still resolve
  successfully into the single result our current mechanism needs.

* Implement changes to roles to ensure that the build process and the
  packages only required when building (dev headers, etc) are only
  used when a build is being executed. The build packages and the runtime
  packages will be changed into separate lists so that the runtime environment
  is only installing the packages it needs.

* Ensure that 'optional' pip packages are installed into the venv during the
  build stage, rather than during the install stage.

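The lookup order proposed for the reverse proxy in the list above (local
pypiserver, then tarballs.openstack.org, then pypi, with caching) can be
modelled as follows. This is a sketch of the intended behaviour only, not
nginx configuration; the ``make_upstream`` and ``fetch_package`` helpers and
the package sets are hypothetical.

```python
def make_upstream(name, available):
    """Return a stand-in lookup function for one upstream source."""
    def fetch(package):
        return f"{package} from {name}" if package in available else None
    return fetch


# Upstreams in the order named in the proposal; contents are made up.
UPSTREAMS = [
    make_upstream("local pypiserver", {"nova"}),
    make_upstream("tarballs.openstack.org", {"nova", "neutron"}),
    make_upstream("pypi", {"nova", "neutron", "requests"}),
]

CACHE = {}  # stands in for the proxy cache


def fetch_package(package):
    """Try each upstream in order, caching the first successful answer."""
    if package in CACHE:
        return CACHE[package]
    for upstream in UPSTREAMS:
        result = upstream(package)
        if result is not None:
            CACHE[package] = result
            return result
    return None


print(fetch_package("nova"))
print(fetch_package("requests"))
```

A package that is only available further down the chain is fetched once and
then answered from the cache, which is the speed-up the proposal relies on.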
.. _pypiserver: https://pypiserver.readthedocs.io

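The ordering given in the constraints item of the list above could be
assembled as shown in this minimal sketch. The ``pip_install_args`` helper
and the ``nova`` package argument are hypothetical, while the file names are
the ones listed in this spec. The sketch only builds the command line and
makes no claim about how pip treats overlapping entries.

```python
# Constraint files in the order proposed in this spec.
CONSTRAINT_FILES = [
    "user-specified-constraints.txt",
    "openstack-ansible-pins.txt",
    "openstack-upper-constraints.txt",
]


def pip_install_args(packages, constraint_files):
    """Build a pip command line with one --constraint option per file."""
    args = ["pip", "install"]
    for path in constraint_files:
        args += ["--constraint", path]
    return args + list(packages)


print(" ".join(pip_install_args(["nova"], CONSTRAINT_FILES)))
```

Keeping the files separate means no merge step is needed before the
install task runs.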
Alternatives
------------

* The build process can remain as-is, continuing to confuse deployers and
  remaining difficult to maintain.

* The build process can be changed to only build and store wheels for packages
  which are pip installed onto the hosts, and only to build and store the
  venvs for distribution.

Playbook/Role impact
--------------------

Playbooks will be added to cater for the build process and the staging
process. The roles will be adjusted to properly separate out the build
tasks and the distro packages to install for the build (versus those
required when using pre-built wheels).

Upgrade impact
--------------

Care will be taken to ensure that upgrades happen as they do today.

Security impact
---------------

The security posture should be improved by the reduction of packages installed
onto hosts and containers when a full set of artifacts is built.

Performance impact
------------------

The performance of the deployment should be improved due to the reduction in
time taken to deploy with pre-built packages if a full set of artifacts is
built.

End user impact
---------------

There is no end-user impact for consumers of an OpenStack cloud, except
perhaps that upgrades will be quicker to execute, thus resulting in reduced
maintenance slot requirements.

Deployer impact
---------------

* As deployments and upgrades will be quicker to execute, deployers will be
  able to execute them in shorter maintenance slots.

* Deployers will need to understand how better to utilise the CI process to
  prepare the required artifacts to speed up deployments.

Developer impact
----------------

As the build process will be integrated into the roles, it will be easier to
understand how it works and what it does.

Dependencies
------------

This spec will be implemented in partnership with
https://blueprints.launchpad.net/openstack-ansible/+spec/deployment-stages

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  jesse-pretorius (odyssey4me)

Work items
----------

Each of the roles implemented in the default AIO will be worked through in
sequence to re-arrange and optimise based on this workflow. The work items
are not being detailed here but will be reflected in gerrit through the
blueprint's topic and will be visible in launchpad.

Testing
=======

As this process matures, it may be simpler to use the integrated build for
all role testing instead of having two separate test implementations. This
would reduce technical debt for the project.

Documentation impact
====================

This work will need to include documentation updates which describe the new
way that deployments can be implemented using full artifact builds and
how to implement offline installs.

References
==========

* https://12factor.net/

* http://www.clearlytech.com/2014/01/04/12-factor-apps-plain-english/