In cases where certificates were regenerated for OVN, a service restart
is required in order to apply and use the new certs.
We also provide a unique handler name to distinguish the certs
installed for neutron-server from the ones installed for OVN.
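A minimal sketch of the idea, with illustrative handler and service
names (the role's actual names may differ):

    # handlers/main.yml sketch: a handler name unique to OVN, so that
    # regenerated certs restart OVN without also touching neutron-server.
    - name: cert installed for ovn
      ansible.builtin.service:
        name: ovn-northd
        state: restarted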
Depends-On: https://review.opendev.org/c/openstack/openstack-ansible/+/912768
Change-Id: Iedea6f1a67349bafecca5c792072fcd8f95cc546
With the update of ansible-lint to version >=6.0.0, a lot of new
linters were added that are enabled by default. In order to comply
with the linter rules we apply changes to the role.
With that we also update the role metadata to reflect its current state.
Depends-On: https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/888223
Change-Id: I3905e334cfbeb7ccb976358016f81c5edd6cd284
By overriding the variable `neutron_backend_ssl: True`, HTTPS will
be enabled and HTTP support disabled on the neutron backend API.
The ansible-role-pki is used to generate the required TLS
certificates if this functionality is enabled.
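For example, in a deployment's user_variables.yml:

    # Serve the neutron backend API over TLS only; the required
    # certificates are generated by ansible-role-pki.
    neutron_backend_ssl: True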
Depends-On: https://review.opendev.org/c/openstack/openstack-ansible/+/879085
Change-Id: I9f16f916d1ef3e5937c91f6b09a3d4073594ecb4
At the moment we don't restart services if the systemd unit file is changed.
We knowingly prevent the systemd_service role handlers from executing
by providing `state: started`, as otherwise the service would be restarted twice.
With this change we ensure that the role handlers also listen for systemd
unit changes.
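Schematically, the restart handler gains an extra listen topic (the
topic and loop variable names here are illustrative):

    # handlers/main.yml sketch: restart on config changes *and* on a
    # changed systemd unit file.
    - name: Restart neutron services
      ansible.builtin.service:
        name: "{{ item.service_name }}"
        state: restarted
      loop: "{{ filtered_neutron_services | default([]) }}"
      listen:
        - Restart neutron services
        - systemd service changed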
Change-Id: I831f6d62f0d31384258571e01a4e7cdd75b73e2c
There is no need to configure and run Open vSwitch (data-plane) services
on `neutron-ovn-northd` (control-plane) nodes.
Change-Id: I6fdc5b0e212a8b21fc576639a2a82dfe3324244e
When installing/upgrading packages (like ovs), a race condition may occur
when neutron-ovs-agent tries to restore the mesh while ovs is shutting
down. This results in errors in neutron-ovs-agent like:
ovsdb-client: tcp:127.0.0.1:6640: Open_vSwitch database was removed
ovsdbapp.exceptions.TimeoutException: TXN queue is full
In order to prevent that, we disable the ability to restart services on
package installation/upgrade, and instead restart them with a handler
after neutron-ovs-agent is stopped.
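One Debian-family way to express this, as a sketch rather than the
role's exact implementation (the handler name is hypothetical):

    # Deny service (re)starts while packages are installed/upgraded;
    # the notified handler restarts things once the agent is stopped.
    - name: Prevent service restarts during package install
      ansible.builtin.copy:
        dest: /usr/sbin/policy-rc.d
        content: |
          #!/bin/sh
          exit 101
        mode: "0755"

    - name: Install OVS packages
      ansible.builtin.apt:
        name: openvswitch-switch
        state: latest
      notify: Restart ovs services

    - name: Allow service restarts again
      ansible.builtin.file:
        path: /usr/sbin/policy-rc.d
        state: absent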
Change-Id: I4bd717c35e030aa1ede21d9a01460037d1ab070c
This patch adjusts some variables for the C8-Stream job to fix
OVN deployment for CentOS-8-Stream. It also renames ovn-central to
ovn-northd as a more generic name.
Change-Id: Ifdb773f9f539469e21d37075f6b88259eb1ffa3e
This patch adds OVN clustering support. Basically, the first node
starts the cluster and new nodes then use the leader node to join
the cluster.
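Conceptually (the group and variable names below are illustrative):

    # The first host in the northd group bootstraps the cluster; all
    # other hosts join through that leader.
    - name: Set OVN cluster facts
      ansible.builtin.set_fact:
        ovn_cluster_leader: "{{ groups['neutron_ovn_northd'][0] }}"
        ovn_bootstrap_cluster: "{{ inventory_hostname == groups['neutron_ovn_northd'][0] }}"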
Change-Id: I4b11d3484c99e538ecd6f7d05570486b5f59c782
Currently we symlink /etc/neutron to an empty directory at the pre
stage, and fill it with config only during post_install. This means
that policies and rootwrap filters are not working properly until
the playbook execution finishes. Additionally, we replace the sudoers
file with the new path in it, which makes current operations impossible
for the service, since rootwrap can not gain sudo privileges.
With this change we move the symlinking and rootwrap steps to handlers,
which means that we replace the configs while the service is stopped.
During post_install we place all of the configs inside the venv,
which is versioned at the moment.
This way we minimise the downtime of the service while performing upgrades.
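The handler-side symlink step looks roughly like this (the venv path
variable is illustrative):

    # While the services are stopped, point /etc/neutron at the config
    # directory shipped inside the versioned venv.
    - name: Symlink neutron config into the venv
      ansible.builtin.file:
        src: "{{ neutron_install_dir }}/etc/neutron"
        dest: /etc/neutron
        state: link
        force: true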
Change-Id: I6d1686ab79647acfc086f21864bde14c8a1a1a49
As per the community goal of migrating the policy file
format from JSON to YAML [1], we need to replace policy.json with
policy.yaml and remove the deprecated policy.json.
config_template has been chosen instead of copy, since it can
properly handle content that has been looked up.
We use a separate task so as not to restart the service when it's not needed.
[1] https://governance.openstack.org/tc/goals/selected/wallaby/migrate-policy-format-from-json-to-yaml.html
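The resulting task is along these lines (a sketch; the override
variable follows the usual naming pattern):

    # config_template can merge overrides and handle looked-up content,
    # which a plain copy cannot.
    - name: Deploy neutron policy.yaml
      openstack.config_template.config_template:
        content: "{{ lookup('file', 'policy.yaml') }}"
        dest: /etc/neutron/policy.yaml
        owner: root
        group: neutron
        mode: "0640"
        config_overrides: "{{ neutron_policy_overrides | default({}) }}"
        config_type: yaml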
Change-Id: Ib47a387f15e358e15104fd1cf37e53e61c92ce61
The existing handler which prevents keepalived from being killed
during neutron-l3-agent restarts doesn't go far enough at present.
This patch adjusts the processes which are killed to exclude the
haproxy instance which is responsible for proxying the Nova
metadata service.
Change-Id: I407c6662841bec9d3e3208d2e46bd7d1d5db00fb
Now that we do not kill keepalived for the l3 agent, it might be useful
to override that. This is possible with neutron_l3_cleanup_on_shutdown.
When it is set to True, keepalived will be restarted by the l3 agent
except on the first service restart, where it will be killed by the
handler, since the config should be loaded first.
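For example, in user_variables.yml:

    # Let the l3 agent clean up (and thus restart) keepalived on
    # shutdown instead of leaving it running across restarts.
    neutron_l3_cleanup_on_shutdown: True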
Change-Id: I9eea72d68398f9fd272b1e9ae0c0c0198336c2f5
Systemd processes use a default KillMode of 'control-group'
which causes all other processes spawned during execution to be
killed on service stop. Neutron expects the keepalived processes
it starts to remain running in order to prevent data-plane
interruptions for HA routers.
This change switches the systemd KillMode to process in order to
prevent this issue. In doing so we also have to clean up
non-keepalived processes started by neutron so that upon restart
everything is running from the latest virtualenv which may have
changed during an upgrade.
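A sketch of the unit change, written as a drop-in override for
illustration (the role renders this through its unit template):

    # KillMode=process stops only the main agent process, leaving
    # keepalived and other children running across restarts.
    # (Assumes the drop-in directory already exists.)
    - name: Override KillMode for neutron-l3-agent
      ansible.builtin.copy:
        dest: /etc/systemd/system/neutron-l3-agent.service.d/kill-mode.conf
        content: |
          [Service]
          KillMode=process
        mode: "0644"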
Change-Id: I958fda17e6207553466d8a7512e35c30b122c22c
Closes-Bug: #1846198
Depends-On: https://review.opendev.org/771770
We use the same condition, which defines which host some "service"
tasks should run against, several times. It's hard to keep it the same
across the role, and Ansible spends additional resources evaluating
it each time, so it's simpler and better for maintenance to set
a boolean variable which tells all tasks that should run only against
a single host whether they should run now or not.
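Schematically (the fact name is illustrative):

    # Evaluate the "first play host in the service group" condition
    # once, then reuse the boolean everywhere it is needed.
    - name: Set neutron service facts
      ansible.builtin.set_fact:
        _neutron_is_first_play_host: "{{ inventory_hostname == (groups['neutron_server'] | intersect(ansible_play_hosts) | list | first) }}"

Tasks then simply use `when: _neutron_is_first_play_host`.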
Change-Id: Ic7277f4d3c6697a5be2fa86e07f8b19f0c5db069
In order to radically simplify how we prepare the service
venvs, we use a common role to do the wheel builds and the
venv preparation. This makes the process far simpler to
understand, because the role does its own building and
installing. It also reduces the code maintenance burden,
because instead of duplicating the build processes in the
repo_build role and the service role - we only have it all
done in a single place.
We also change the role venv tag var to use the integrated
build's common venv tag so that we can remove the role's
venv tag in group_vars in the integrated build. This reduces
memory consumption and also reduces the duplication.
This is by no means the final stop in the simplification
process, but it is a step forward. There will be follow-up work
which:
1. Replaces 'developer mode' with an equivalent mechanism
that uses the common role and is simpler to understand.
We will also simplify the provisioning of pip install
arguments when doing this.
2. Simplifies the installation of optional pip packages.
Right now it's more complicated than it needs to be due
to us needing to keep the py_pkgs plugin working in the
integrated build.
3. Deduplicates the distro package installs. Right now the
role installs the distro packages twice - just before
building the venv, and during the python_venv_build role
execution.
Depends-On: https://review.openstack.org/598957
Change-Id: I1c1dcd73c9d5d78c0a8e40149ff84cbba828f3e2
Implements: blueprint python-build-install-simplification
Signed-off-by: Jesse Pretorius <jesse.pretorius@rackspace.co.uk>
The files and templates we carry are almost always in a state of
maintenance. The upstream services are maintaining these files and
there's really no reason we need to carry duplicate copies of them. This
change removes all of the files we expect to get from the upstream
service. While the focus of this change is to remove configuration file
maintenance burdens, it also allows the role to execute faster.
* Source installs have the configuration files within the venv at
"<<VENV_PATH>>/etc/<<SERVICE_NAME>>". The role will now link the
default configuration path to this directory. When the service is
upgraded the link will move to the new venv path.
* Distro installs package all of the required configuration files.
To maintain our current capabilities to override configuration the
role will fetch files from the disk whenever an override is provided and
then push the fetched file back to the target using `config_template`.
Change-Id: I8fba4a1f70d7f5870ad81c8a84e3b1d15742c70f
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
With more recent versions of Ansible, we should now use
"is" instead of the "|" sign for tests.
This patch fixes that.
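For example:

    # Deprecated filter-style test:
    #   when: result | changed
    # Test syntax expected by newer Ansible:
    - name: Show the new test syntax
      ansible.builtin.debug:
        msg: "the task reported a change"
      when: result is changed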
Change-Id: Ie83262629060866dd0d208ed41704aef21416aa0
When using Ansible with python3, the result of
the intersect filter is a set, not a list. This
causes a failure when trying to access item 0
in the list.
In this patch we cast the set to a list before
accessing item 0. This will work for both
python2 and python3.
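For example:

    # Under python3 the intersection is a set, which is not indexable;
    # casting to a list makes item access work on both versions.
    - name: Pick the first matching host
      ansible.builtin.set_fact:
        first_host: "{{ (groups['neutron_all'] | intersect(ansible_play_hosts) | list)[0] }}"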
Change-Id: I9b2a01882bfd5f02e0e0b85e1d57fab0fa88d0e9
When executing the playbook with limits set, the scoping
currently does not work quite as well as one would hope.
This tightens it a bit more to ensure that it operates
as expected.
Change-Id: I8ebd42bcdf9676fd580955256498bbd8801f1cbd
As it turns out, run_once executes a task
once per batch when executed from a serialised
playbook instead of once per set of play hosts
as previously thought.
In this patch we implement a combination of
dynamic inclusion, inventory scoping and
play host scoping to achieve the required
goal of only running it once, even when the
playbook is executed using limits.
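The replacement pattern looks roughly like this (the file and group
names are illustrative):

    # run_once fires once per serial batch; this condition fires exactly
    # once across the whole play, even under --limit.
    - name: Perform the DB migrations on a single host
      ansible.builtin.include_tasks: neutron_db_setup.yml
      when:
        - inventory_hostname == (groups['neutron_server'] | intersect(ansible_play_hosts) | list | first)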
Depends-On: I548971e2de92280fe2cc4ff0a6a926497733fa7d
Change-Id: Icde03cd2610f52859fd8d2d7540f442a36e51695
A run_once task decides on a host without the
context of conditionals related to the task.
The filter is done prior to evaluating any
conditionals.
Also, as a security measure only the neutron-server
neutron.conf has the database connection credentials,
so a host that is not running neutron-server is not
able to execute the database migrations.
This patch ensures that the database migrations
are only run once, but are also executed
wherever neutron-server is. It also adds
multiple instances of neutron-server and
neutron-agents to better test against problems
of this nature happening again.
Additionally, the patch uses the handlers for
the offline migrations to ensure that
neutron-server is only restarted once when a
new tag is deployed.
Change-Id: I672ceb0848415c8f2653ebc8f7556db77f7f001c
Currently when multiple services share a host, the
restart order is random. This is due to an unordered
dict being used to facilitate the mapping of services
to their groups, names and other options.
Based on [1], this patch implements changes to the role
to ensure that services on the same host are restarted
in the correct order when the software/config changes.
[1] https://docs.openstack.org/developer/neutron/devref/upgrade.html
Change-Id: I368b51ef37763f4163ead591d6743c4d56962ef9
Due to the debug message plugin, the handler restart
messages show at the end of the playbook execution,
which is a little confusing. Using debug also
requires setting changed_when to true, which is a
little extra bit of code that we do not have to
carry.
Instead we use the command module, which is simple,
works, and is less wordy.
Change-Id: I3861447fe685b38ecb9dfad8f3dff666b160af36
When the policy file is copied from the templated
file to the active file, it loses its group/mode
settings. This patch ensures that they are properly
replicated during the copy.
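A sketch of the fix (the source path and venv tag are illustrative):

    # Carry the ownership and mode over explicitly when promoting the
    # templated file to the active policy file on the same host.
    - name: Copy the policy file into place
      ansible.builtin.copy:
        src: "/etc/neutron/policy.json-{{ neutron_venv_tag | default('untagged') }}"
        dest: /etc/neutron/policy.json
        remote_src: true
        owner: root
        group: neutron
        mode: "0640"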
Change-Id: I17c242bc0af72b839009fc2115d1ff940d4251cb
Since we create a filtered list of services we should use the handler to
only restart those services, rather than all services.
Additionally, we no longer need to check "service_en" since that will
happen when the services are filtered. We can then check if the service
exists before performing the metadata-proxy cleanup.
Change-Id: I4ca3d90123e146d09cab20a0eda627014d9717e6
Depends-On: Ic152cfb01930d8bde10ea6d9aa5bba173a87e376
The policy.json file is currently read continually by the
services and is not only read on service start. We therefore
cannot template directly to the file read by the service
(if the service is already running) because the new policies
may not be valid until the service restarts. This is
particularly important during a major upgrade. We therefore
only put the policy file in place after the service restart.
This patch also tidies up the handlers and some of the install
tasks to simplify them and reduce the tasks/code a little.
Change-Id: Ib213d7272c3d7c692dabedd95ff8ab1cc2088c87
This creates a specific slice which all OpenStack services will operate
from. By creating an independent slice these components will be governed
away from the system slice allowing us to better optimise resource
consumption.
See the following for more information on slices:
* https://www.freedesktop.org/software/systemd/man/systemd.slice.html
See for following for more information on resource controls:
* https://www.freedesktop.org/software/systemd/man/systemd.resource-control.html
Tools like ``systemd-cgtop`` and ``systemd-cgls`` will now give us
insight into specific processes, process groups, and resource consumption
in ways that we've not had access to before. To enable some of this reporting
the accounting options have been added to the [Service] section of the unit
file.
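The additions amount to something like the following, shown here as an
illustrative drop-in rather than the role's actual unit template:

    # Place the service in a dedicated OpenStack slice and enable the
    # accounting that feeds systemd-cgtop / systemd-cgls.
    - name: Add slice and accounting options
      ansible.builtin.copy:
        dest: /etc/systemd/system/neutron-server.service.d/slice.conf
        content: |
          [Service]
          Slice=openstack.slice
          CPUAccounting=yes
          MemoryAccounting=yes
          TasksAccounting=yes
        mode: "0644"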
Change-Id: I6078d6cf19da27afc22b5f699165acea0139ddc1
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
The NS metadata proxy pid cleanup process hunts for and removes
PIDs executing old code by using version tags. Under certain
conditions it's possible for an old PID to have expired before
the cleanup action has run. This change simply wraps the
`pkill` command with a test to ensure the task doesn't fail.
Should a PID actually be cleaned up the task will print to stdout
and log using the logger command.
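Schematically (the match pattern and tag variable are illustrative):

    # pkill exiting non-zero because nothing matched is not a failure;
    # only report (stdout + syslog) when something was actually killed.
    - name: Clean up stale metadata proxy PIDs
      ansible.builtin.shell: |
        if pkill -f "neutron-ns-metadata-proxy.*{{ old_venv_tag | default('old') }}"; then
          echo "stale metadata proxy cleaned up"
          logger "stale metadata proxy cleaned up"
        fi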
Closes-Bug: #1627185
Change-Id: I8c012feb399f8ca65172e9404b859c8f6111de35
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
It is moved to the Nova role where libvirt/qemu is managed in
Id2cfa3353543fecd55f1135abad89f07071e2f60.
Depends-On: Id2cfa3353543fecd55f1135abad89f07071e2f60
Change-Id: Ib2d2056962e38f6fa4f96785a333413bf2c2fead
Integrate deployment for Project Calico's Neutron networking
plugin into the os_neutron role.
See http://docs.openstack.org/developer/networking-calico/
for more information about Calico.
Change-Id: I80546b6deefe0878398716d173b7dcc36c3bef3a
In cases where no old process was found, the shell
command was exiting 1 and causing the handler to report
a failure.
Change-Id: Ic1701f7495abdec713c05d2c95bfdd1e7b1ff73d
Closes-Bug: #1603136
The metadata proxy service will now be cleaned up when this role is
executed if an old version of the metadata-proxy is still running.
Once the old versions of the metadata-proxy have been cleaned up
the metadata agent will respawn the correct process within 60 seconds.
This change is required to ensure upgrades are always executing the
correct code from a given release.
Change-Id: I3a0b5c5b75742f06a94e5334c749b88e54f7f43c
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
UpgradeImpact:
The neutron_service_names variable has been removed from
playbooks/roles/os_neutron/defaults/main.yml
Closes-Bug: #1516655
Change-Id: Id7265ff687f672256b8014d177e21337eb172230
Signed-off-by: Javeria Khan <javeriak@plumgrid.com>
This change implements the blueprint to convert all roles and plays into
a more generic setup, following upstream ansible best practices.
Items Changed:
* All tasks have tags.
* All roles use namespaced variables.
* All redundant tasks within a given play and role have been removed.
* All of the repetitive plays have been removed in favor of a simpler
approach. This change duplicates code within the roles but
ensures that the roles only ever run within their own scope.
* All roles have been built using an ansible galaxy syntax.
* The `*requirement.txt` files have been reformatted to follow upstream
OpenStack practices.
* Dynamically generated inventory is now more organized; this should assist
anyone who may want or need to dive into the JSON blob that is created.
In the inventory a properties field is used for items that customize containers
within the inventory.
* The environment map has been modified to support additional host groups to
enable the separation of infrastructure pieces. While the old infra_hosts group
will still work, this change allows for groups to be divided up into separate
chunks; eg: deployment of a swift-only stack.
* The LXC logic now exists within the plays.
* etc/openstack_deploy/user_variables.yml has all password/token
variables extracted into the separate file
etc/openstack_deploy/user_secrets.yml in order to allow separate
security settings on that file.
Items Excised:
* All of the roles have had the LXC logic removed from within them, which
should allow the roles to be consumed outside of the `os-ansible-deployment`
reference architecture.
Note:
* The directory rpc_deployment still exists and is presently pointed at plays
containing a deprecation warning instructing the user to move to the standard
playbooks directory.
* While all of the Rackspace-specific components and variables have been removed
and/or refactored, the repository still relies on an upstream mirror of
OpenStack-built python files and container images. This upstream mirror is hosted
at Rackspace at "http://rpc-repo.rackspace.com", though it is
not locked to or tied to Rackspace-specific installations. This repository
contains all of the needed code to create and/or clone your own mirror.
DocImpact
Co-Authored-By: Jesse Pretorius <jesse.pretorius@rackspace.co.uk>
Closes-Bug: #1403676
Implements: blueprint galaxy-roles
Change-Id: I03df3328b7655f0cc9e43ba83b02623d038d214e