According to the Ansible documentation[1], the filter syntax should not
be used for tests since 2.5.
One strange outcome I experienced was that a ceph update failure went
undetected and the job ended in a timeout.
Convert all the tests still using that idiom to the new syntax.
[1] https://docs.ansible.com/ansible/latest/user_guide/playbooks_tests.html#test-syntax
Change-Id: Ibee9655139c0b61ba04417ac39967d1f53793404
(cherry picked from commit dac931018e)
The current conditional is failing, causing the bug below. Rely on the
ansible_python version instead. Seen as part of the work at [1].
Related-Bug: 1886681
[1] https://tree.taiga.io/project/tripleo-ci-board/task/1817
Change-Id: I01c971b8e289e324439edfac8819ed5b3264b9ac
(cherry picked from commit 4c178922b2)
(cherry picked from commit c27075d244)
(cherry picked from commit 44d481df34)
ceph-ansible and python-tripleoclient specify different "minimum"
required versions of ansible. This patch upgrades ceph-ansible as a
prerequisite so that all dependencies are met. The fast-forward script
checks whether ceph-ansible is installed. If it is, the script upgrades
it to the latest version available in the repositories.
Change-Id: I3eadfad34aa563084fd4cc68990cb247e734d508
Related-Bug: 1885637
(cherry picked from commit 3fb8fe1912)
Passing tags to the tripleo-upgrade role doesn't work, as all the
initial set_facts tasks are skipped.
Make sure we always run those set_facts regardless of the tags used by
the user.
Change-Id: I62a2e21fd062e302a03b898730555e2ab7d5a542
Closes-Bug: #1843442
(cherry picked from commit 593afa2337)
(cherry picked from commit 5a41e20937)
This adds those two options. They work like their ir overcloud
counterpart.
This adds the possibility to add new parameters *during*
update/upgrade/ffwd if required.
Currently the adjustment has been made only to the update templates.
The other limitation is that this is currently compatible only with
the infrared context and is thus skipped for upstream CI.
Change-Id: I43148a0bc494cf3fc6ff109e7f3a4b94cf751d99
(cherry picked from commit 212c02fb4e)
(cherry picked from commit 29e8fc8470)
(cherry picked from commit f9cf26d8bf)
1. The python_bin var is unused, so it should be removed.
2. Tasks should be separated by a newline in order to easily see
that they are separate tasks.
3. The return code of 2 was in the wrong place. It should have been
where the data was changed, as per the changed_when condition.
4. With the return code adjusted, the final task's condition also
needed adjusting to match.
Related-Bug: #1856865
Change-Id: I916c3ed394f54561fff2a106288a667c0b877360
(cherry picked from commit 53377cdb14)
(cherry picked from commit 967f3805d3)
(cherry picked from commit 6c5631bb3d)
The python binary is not available in RHEL8, so add a variable holding
the binary name. From Stein on, we use python3 by default, where the
print statement is no longer supported because print is a function. This
patch defines python_bin based on the OS version (python in RHEL7 or
python3 in RHEL8). Output is assigned to an Ansible registered variable
so it can be saved to a file later on.
Additionally, this patch moves all pipes to jq to minimize the number of
forks.
Co-Authored-By: Sergii Golovatiuk <sgolovat@redhat.com>
Closes-Bug: #1856865
Change-Id: I8b2cacd4271a59dfda948462146c0866b8b7725f
(cherry picked from commit 614322b9f6)
(cherry picked from commit bceafc0ab4)
We can encounter corner-case pacemaker issues with parallel role
updates. While we solve them, we need a way to disable parallel role
updates.
Using an idiom mentioned in the Ansible documentation[1], we start role
updates in batches. When the batch size is 1, this is a serial update,
one role after another.
This is the default.
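The batching idea can be sketched in a few lines of Python. This is only
an illustration of how roles fall into batches, not the Ansible async
idiom the role actually uses; the function name role_batches and its
parameters are hypothetical.

```python
from typing import Iterator, List, Sequence


def role_batches(roles: Sequence[str], batch_size: int = 1) -> Iterator[List[str]]:
    """Yield consecutive groups of roles, each at most batch_size long.

    Hypothetical helper: with batch_size == 1 every role is its own
    batch, which corresponds to a serial, one-role-after-another update.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(roles), batch_size):
        yield list(roles[start:start + batch_size])


if __name__ == "__main__":
    roles = ["Controller", "Compute", "CephStorage"]
    for batch in role_batches(roles, batch_size=2):
        # All roles in one batch would be started together, and the next
        # batch would only start once the current one has finished.
        print(batch)
```

With the default batch size of 1 each batch holds a single role, so the
roles are processed strictly one after another.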
[1] https://docs.ansible.com/ansible/latest/user_guide/playbooks_async.html
Change-Id: I03378557653d07113fa70782e5d22bf5e3e969b8
(cherry picked from commit 8d2027f1f1)
(cherry picked from commit a7704f6559)
(cherry picked from commit 52c2dda30a)
Packet loss detection did not work correctly when the percentage was
lower than 1% (e.g. 0.172282%). This caused spurious failures of upgrade
jobs in CI. See the example below:
5224 packets transmitted, 5215 received, 0.172282% packet loss, time 5570ms
rtt min/avg/max/mdev = 0.398/0.801/14.855/0.449 ms
Ping loss higher than 1% detected
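A minimal sketch of a check that handles fractional percentages,
assuming the ping summary format shown above. The names LOSS_RE,
packet_loss_percent and loss_exceeds are hypothetical and this is not
the code used by the role; it only shows parsing the value as a float
before comparing it to the threshold.

```python
import re
from typing import Optional

# Matches both integer ("10% packet loss") and fractional
# ("0.172282% packet loss") values in the ping summary line.
LOSS_RE = re.compile(r"(\d+(?:\.\d+)?)% packet loss")


def packet_loss_percent(ping_output: str) -> Optional[float]:
    """Return the packet loss percentage as a float, or None if absent."""
    match = LOSS_RE.search(ping_output)
    return float(match.group(1)) if match else None


def loss_exceeds(ping_output: str, threshold: float = 1.0) -> bool:
    """Return True only when the loss is strictly above the threshold."""
    loss = packet_loss_percent(ping_output)
    if loss is None:
        raise ValueError("no packet loss summary found in ping output")
    return loss > threshold


if __name__ == "__main__":
    summary = ("5224 packets transmitted, 5215 received, "
               "0.172282% packet loss, time 5570ms")
    print(packet_loss_percent(summary), loss_exceeds(summary))
```

Comparing the parsed float, rather than matching on the text of the
percentage, keeps values such as 0.172282% below the 1% threshold.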
Change-Id: Ief3fb1d22fc6b69d79b8b28ec8e986437f456e80
(cherry picked from commit 6ecf38c4ed)
(cherry picked from commit 4de17fde3f)
tripleo-ansible-inventory uses overcloud as the name of the stack by
default. On some CI systems that value may be different, which then
produces an undercloud-only inventory file.
Change-Id: Ic420c0717165e01df99ad2368a23fc9fc10e71c1
Closes-Bug: #1857120
(cherry picked from commit 6483436d7d)
(cherry picked from commit 1da968309d)
The tripleo-upgrade role has workarounds logic which allows us to
apply patches or specific modifications before and after some of
the upgrade steps. However, if these workarounds needed to be applied
on the overcloud nodes, the only way to do it was via a bash script
which would iterate over the nodes and apply the patch or perform a
change on each of them.
This patch adds a new workarounds field: ansible_hosts. When this
option is present in the workaround and is not an empty string, the
workaround is applied via Ansible on the nodes specified in that
ansible_hosts field. The ansible_hosts option needs to be used in
combination with the command one, as the command is transformed into
a shell Ansible task which is executed on the nodes passed in the
ansible_hosts option.
Example:
    pre_overcloud_upgrade_prepare_workarounds:
      - set_root_password:
          patch: false
          basedir: ''
          id: ''
          ansible_hosts: 'overcloud'
          command: |
            echo redhat | passwd root --stdin
will turn into a set_root_password.yaml Ansible playbook under
~/ansible_workarounds:
    cat ~/ansible_workarounds/set_root_password.yaml
    - hosts: overcloud
      tasks:
        - name: set_root_password workaround
          shell: |
            echo redhat | passwd root --stdin
To execute the workarounds, a new bash function, ansible_patch, has
been added; it takes care of executing the generated Ansible
playbook.
Also, an optional input parameter can be passed to workarounds.sh.
When passed, it is used as the value of the --limit option when
executing ansible-playbook. This way, we can execute a workaround
on a specific server instead of running it on all of them.
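The transformation from a workaround entry to a playbook can be
sketched as follows. This is an illustration in Python, not the
templating the role actually uses; the function names uses_ansible and
render_workaround_playbook are hypothetical, and the playbook text is
built by hand to avoid depending on a YAML library.

```python
import textwrap
from typing import Any, Dict


def uses_ansible(workaround: Dict[str, Any]) -> bool:
    """True when ansible_hosts is present and not an empty string."""
    return bool(workaround.get("ansible_hosts", ""))


def render_workaround_playbook(name: str, workaround: Dict[str, Any]) -> str:
    """Render a one-task playbook running the workaround command via shell."""
    # Indent the command so it sits inside the "shell: |" block scalar.
    command = textwrap.indent(workaround["command"].rstrip("\n"), " " * 8)
    return (
        f"- hosts: {workaround['ansible_hosts']}\n"
        f"  tasks:\n"
        f"    - name: {name} workaround\n"
        f"      shell: |\n"
        f"{command}\n"
    )


if __name__ == "__main__":
    entry = {
        "patch": False,
        "basedir": "",
        "id": "",
        "ansible_hosts": "overcloud",
        "command": "echo redhat | passwd root --stdin\n",
    }
    if uses_ansible(entry):
        print(render_workaround_playbook("set_root_password", entry), end="")
```

Running such a playbook with ansible-playbook and a --limit value would
then restrict it to a single server, as described above.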
Change-Id: I421ebecfc5504ac2fd225de0c4fb0cbf735bbdaf
(cherry picked from commit e582d2f304)
(cherry picked from commit b6bac90695)
On the update_noop or upgrade_noop the operation_type is undefined.
(cherry picked from commit bf32b828d6)
Change-Id: I25e47c987ff796171f40fffedd89b855eab18265
(cherry picked from commit 1719ee4f41)
In the middle of the Queens and Rocky releases we introduced an option
to enable parallel execution of the minor update on selected role types.
If the environment went through FFWD or was deployed before this
option got backported, we need to update the roles_data.
By default, update_serial is either not set at all for OSP13 and OSP14,
or it is set to 1 for Pacemaker-enabled nodes, CephOSD nodes and
Networkers. This is mostly a defensive precaution; we do allow running
in parallel for CephOSD and Networkers on production systems that did
enough testing on preprod or can tolerate a small outage. We should also
parallelize it in CI, as otherwise we just waste time here.
Change-Id: I4cff09dc6aa9ac944b20a52ae087a8923f55209f
(cherry picked from commit 2044957694)
(cherry picked from commit ad0e9b3954)
Previously we ran the update role by role, which is neither necessary
nor the intended way of running a production update. After proper
testing on preprod, customers should run the update as fast as
possible, which means leveraging the ability to run all roles at once
and even all nodes of selected roles at once. With this patch the nodes
of Pacemaker-enabled roles are still updated serially, but the roles
are updated in parallel. In the case of a 3 controller, 3 database,
3 messaging setup it will look roughly like this:
Update of controller[0] database[0] messaging[0]
Update of controller[0] database[1] messaging[1]
Update of controller[1] database[2] messaging[2]
Update of controller[2]
This is because there is no blocking between roles and each role takes
a different amount of time to apply the update. The pacemaker quorum
is not broken at any moment.
Change-Id: Ib119210139886382726bc0ccddfdb4f7f6803015
(cherry picked from commit a2b433133d)
The --reverse option causes patch to fail even if the diff is
just a few lines off. Using the reverse option to check whether a patch
was applied is not a good idea with yaml: it's fairly easy to create
a reverse patch for a patch that wasn't applied.
Change-Id: I4a1459344794f5d602dc1b781d15a591ea2ac135
(cherry picked from commit a4fa67c1f9)
Up to now, the minor update workarounds were not used very much and
lagged behind what was implemented for the upgrade ones. This patch
allows minor updates to benefit from the same workarounds mechanism,
so that any improvement in the upgrades mechanism will be available
for updates too.
Also, references to the {{ working_dir }} variable were removed from
those shell tasks which already have an argument changing directory to
that very same {{ working_dir }}, mostly to simplify the tasks and
remove redundancy.
Change-Id: Ibc57c51ff19ebad093c887bee545ca6a7d51827f
(cherry picked from commit ef15456503)
To synchronize with other infrared components that use 0.4.0 already.
Change-Id: I4c40690877dcaa666617760d224fea6e23db965c
(cherry picked from commit 9602c30e98)
To speed up CI and the development process, it would be better to
disable this script. CI will fail anyway even without this validation.
Users will be able to enable it if/when they decide to use
tripleo-upgrade for a production upgrade.
Closes-bug: #1844567
Change-Id: Ia5a767491c2f297b396c5cc937c1495e4267a4e3
(cherry picked from commit 72dd2c49e3)
Moved the file to README.rst for consistency with other OpenStack
project documentation.
Change-Id: I4754a085c6255f977142302d2bee135220056c4f
(cherry picked from commit e3e97beb53)
Remove the documentation build from tox, as the project doesn't use
sphinx. This fixes an error during the documentation build, caused by
the sphinx configuration file not being found.
Change-Id: I577a05c2f7916bfca637ecb4451abbee5bd7714e
(cherry picked from commit 4776547196)
Change the mail author to use a generic address. I think it's better.
Change-Id: I6d535a6d4a0d8483ddb958a9d1dcb7810bccd468
(cherry picked from commit a51bd6f178)
With the release of Ansible 2.5, the recommended way to perform
loops is to use the new loop keyword instead of with_X style loops.
This review makes that change for common tasks within the
tripleo-upgrade role.
Change-Id: I70d387b381b6ce297507cbfe669ea7be902df605
(cherry picked from commit b25817a233)
Use same mechanism for templates as is used within upgrades tasks.
Change-Id: Idcec723addb392363241d8b625cdd53ece4f3c83
(cherry picked from commit 42beb3e008)
Include has some unintuitive behaviours depending on whether it runs
statically or dynamically, in a play or in a playbook context.
To clarify these behaviours, move to the new set of modules:
include_tasks, include_role, import_playbook, import_tasks.
Change-Id: I32198527a084d35f8a2c91e3e7d3f32b6fbe9e1e
(cherry picked from commit d67e6e7166)
Include has some unintuitive behaviours depending on whether it runs
statically or dynamically, in a play or in a playbook context.
To clarify these behaviours, move to the new set of modules:
include_tasks, include_role, import_playbook, import_tasks.
Change-Id: I33018bcc8f4798f33f73e1aad47419d8094269c8
(cherry picked from commit 6e9762f3ec)
Include has some unintuitive behaviours depending on whether it runs
statically or dynamically, in a play or in a playbook context.
To clarify these behaviours, move to the new set of modules:
include_tasks, include_role, import_playbook, import_tasks.
Change-Id: I08e9abfc9a39a4ca50e5c747f65b2953d34ccbfa
(cherry picked from commit 571d6b0b5b)
Include has some unintuitive behaviours depending on whether it runs
statically or dynamically, in a play or in a playbook context.
To clarify these behaviours, we are moving to the new set of modules:
include_tasks, include_role, import_playbook, import_tasks.
Also, include might get deprecated, so let's switch from the generic
module to the more specific ones.
Note that this change also includes a fix for the invalid option passed
to include_tasks, which was not needed at the time of the original
commit.
Closes-Bug: #1827906
Change-Id: Ib7ec62a4e8311c4da2af85e30eb33184d05a046f
(cherry picked from commit 2ff51bd816)
The post-ceph upgrade workarounds need to be applied after the ceph
upgrade is performed, not after the pre-converge workarounds are
applied.
Change-Id: I3981324bb092f408dd597d895b2b2017fed516ba
(cherry picked from commit 9fec3161c6)
The ceph_osds variable is obtained from the ceph configuration,
and it seems it can be a dictionary with the OSD directories as keys,
for example {'/dev/vdc': {}, '/dev/vdb': {}, '/dev/vdf': {},
'/dev/vde': {}, '/dev/vdd': {}}.
For that reason, we need to distinguish whether the variable is
a list or a dictionary, as otherwise the loop won't be able to
iterate over the values.
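The list-or-dictionary handling can be sketched like this. It is a
Python illustration under the assumption that only the device paths
matter to the loop; the helper name osd_devices is hypothetical and the
role itself does this in Ansible.

```python
from typing import Dict, List, Union


def osd_devices(ceph_osds: Union[List[str], Dict[str, dict]]) -> List[str]:
    """Return the OSD entries as a plain list, whatever the input shape.

    A dictionary such as {'/dev/vdc': {}, '/dev/vdb': {}} is reduced to
    its keys; a list is returned as a copy.
    """
    if isinstance(ceph_osds, dict):
        return list(ceph_osds.keys())
    return list(ceph_osds)


if __name__ == "__main__":
    print(osd_devices({"/dev/vdc": {}, "/dev/vdb": {}}))
    print(osd_devices(["/dev/vdb", "/dev/vdc"]))
```

Normalizing the value once means the code that loops over it does not
need to care which of the two shapes the ceph configuration produced.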
Change-Id: Ib7aded0ff3668079dbe43cc3fbfa6a7e067cec83
Closes-Bug: #1844974
(cherry picked from commit 75290252d7)
(cherry picked from commit b70b3e0a2b)
TripleO offers a default network topology when deploying
with network isolation enabled, and this is reflected in
the network_data.yaml file in tripleo-heat-templates.
If this is the case it has to be handled during update/upgrade.
Change-Id: If50109753d4845357c869986c256186e8e22006d
(cherry picked from commit 663d380f6f)
(cherry picked from commit 44810b2a10)
workload_memory has a default set, so there is no need for an extra
condition. Option 'workload_launch_post_composable_upgrade' is not
valid starting from Q, hence no need to keep it.
Change-Id: I901f90174ed974efd90c99e43a378150a1c5e7f8
(cherry picked from commit 82b0ad633d)
We always start the upgrade from the Controller role; that's a
requirement. Other roles are upgraded in the order in which they appear
in the 'overcloud' hostgroup from tripleo-ansible-inventory.
So far we cannot guarantee more than upgrading controllers first.
This change introduces a new variable, 'roles_upgrade_order'
(semicolon-separated list of roles), which can specify the desired
order of the roles' upgrade.
WARNING: it's the operator's responsibility to specify the 'controller'
role as the first item of the list.
Example of using 'roles_upgrade_order' variable:
-e 'roles_upgrade_order=Controller,Database;Networker;Compute'
Here we start the upgrade with the Controller and Database roles,
then the Networker role, and finish with the Compute role.
Default behavior is unchanged: we start by upgrading the Controller
role, followed by the roles from tripleo-ansible-inventory.
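A sketch of how such a value could be split into ordered steps. The
grouping of comma-joined roles into one step is inferred from the
example above and is an assumption, as is the helper name
parse_roles_upgrade_order; the role itself handles this in Ansible.

```python
from typing import List


def parse_roles_upgrade_order(value: str) -> List[List[str]]:
    """Split 'A,B;C;D' into [['A', 'B'], ['C'], ['D']].

    Semicolons separate the steps; commas separate the roles assumed to
    belong to the same step. Empty items are ignored.
    """
    steps = []
    for group in value.split(";"):
        roles = [role.strip() for role in group.split(",") if role.strip()]
        if roles:
            steps.append(roles)
    return steps


if __name__ == "__main__":
    order = parse_roles_upgrade_order("Controller,Database;Networker;Compute")
    for step in order:
        print(step)
```

The parser does not enforce that a controller role comes first; as the
warning above says, that remains the operator's responsibility.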
Change-Id: I8807877990f569cf6eec76e162bc353e8a34ffe8
(cherry picked from commit 34b12e0dee)
Next scripts are not relevant for current release,
so no need to keep them around.
Change-Id: I4b7f2ac15e9a4ff8f40975aa1deecc6f58530bc1
(cherry picked from commit 711a673c8f)
SSL is enabled on the undercloud by default starting from R,
so here is how the SSL cert path is resolved:
1. If undercloud_service_certificate is configured in undercloud.conf,
use it
2. Check whether generate_service_certificate is specified and
set to 'true' in undercloud.conf, or is not present in undercloud.conf
(defaults to 'true')
3. Find the autogenerated file in the format:
/etc/pki/tls/certs/undercloud-[undercloud_public_host].pem
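The resolution order above can be sketched in Python. This is an
illustration only: it assumes the options live in the [DEFAULT] section
of undercloud.conf, takes the public host as an argument instead of
discovering it, and returns the expected path without checking that the
file exists. The function name resolve_cert_path is hypothetical.

```python
import configparser
from typing import Optional


def resolve_cert_path(conf_text: str, public_host: str) -> Optional[str]:
    """Return the certificate path following the three steps above."""
    # interpolation=None so that '%' characters in values are kept as-is.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    defaults = parser["DEFAULT"]

    # Step 1: an explicitly configured certificate wins.
    explicit = defaults.get("undercloud_service_certificate", "").strip()
    if explicit:
        return explicit

    # Step 2: generation is on when set to 'true' or when absent.
    generate = defaults.get("generate_service_certificate", "true")
    if generate.strip().lower() == "true":
        # Step 3: the autogenerated file name is built from the host.
        return f"/etc/pki/tls/certs/undercloud-{public_host}.pem"

    return None


if __name__ == "__main__":
    print(resolve_cert_path("[DEFAULT]\n", "192.168.24.2"))
```

Returning None when generation is disabled and no certificate is
configured leaves it to the caller to decide how to handle that case.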
Change-Id: I014474001882874d84c4a60f35bd33db77baf55a
(cherry picked from commit 96b4bec38d)
Follows the same linting configuration that was first implemented in
tripleo-quickstart-extras, which makes use of the pre-commit tool for
managing all linters.
This also avoids problems where a new linter release may break our
gates, because pre-commit always pins versions.
Removes ansible from requirements.txt, as it needs to be listed only
in ansible-requirements.txt.
Change-Id: Ia229d3d58763d743bd19ad9099d7907561f3c77f
Depends-On: https://review.openstack.org/#/c/627186/
(cherry picked from commit 8ac74a8177)