This is occasionally failing with a timeout. It has been for months,
but it seems like the frequency has increased lately.
Change-Id: Ib1270e4f5bada8680a5d19133a888a8ade8f73c3
Closes-Bug: #2039707
While troubleshooting grenade failures in ironic,
we've observed a sporadic failure that causes the
ironic-grenade job to fail as much as 20-40%
of the time. It seems that cirros, due to its
boot sequencing and interfaces, is not
online for networking until more than sixty seconds
have passed. In one case it took 83 seconds before
networking was fully online.
After consulting with neutron contributors, a random
check of job results suggests that even with pure VM
workloads, the instance responds on average about 35
seconds into the ping check. As such, it seems reasonable
to make the setting configurable so ironic-grenade can
run with an increased timeout more appropriate
to the job settings and test environment.
Adds an INSTANCE_WAIT variable, which defaults to
the prior static setting of 60 seconds.
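
A minimal sketch of how such a default is commonly expressed in shell.
Only the INSTANCE_WAIT name and the 60 second default come from this
change; the describe_ping_timeout helper is a hypothetical name used
purely for illustration.

```shell
# Keep the prior static value (60 seconds) unless the job overrides it.
INSTANCE_WAIT=${INSTANCE_WAIT:-60}

# Hypothetical helper showing where the value would be consumed.
describe_ping_timeout() {
    echo "waiting up to ${INSTANCE_WAIT}s for the instance to answer ping"
}
```

A job that needs more time could export a larger INSTANCE_WAIT before
grenade runs; the exact value is up to the job.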
Change-Id: Ie5acf1ad7f8dca39db07f7e61035a8916439004d
Nova has deprecated the server-side keypair generation process, which
is used by cinder's resource create phase in grenade. Even though it
still works for old microversions, it generates RSA keys which will
not be accepted (by default) in newer distros. This makes us generate
the keypair locally and upload the public key instead of generating
it server-side.
Change-Id: I0275a44f2ec4977b085b8129fd06e80a0ed6e68d
The job fails intermittently on the ping check
with the current timeout of 30 seconds.
On slow systems using the qemu hypervisor, the testvm
can take a while to boot and bring up its network,
which leads to a ping timeout.
With [1] included, the testvm was seen taking more than
40 seconds to boot, so bumping the timeout to 60 seconds
will help in such cases.
[1] https://review.opendev.org/c/openstack/grenade/+/874417
Related-Bug: #1463631
Related-Bug: #2007357
Change-Id: Ibdd030e126d508e6ff24cde180c611ada7f24cb3
As reported in the related bug, the rootwrap filters are not updated
between versions. This patch calls the devstack method that sets up
the Neutron rootwrap configuration and filters.
Closes-Bug: #1999235
Change-Id: Iaebf1b33ccf3bfd64191f9a898408bcfe11dd557
In [1] we finally got rid of the unfinished lib/neutron module and kept
only lib/neutron-legacy. It's renamed to lib/neutron now and it's the
only neutron related module in Devstack.
So this patch removes leftovers related to the old lib/neutron-legacy.
[1] https://review.opendev.org/c/openstack/devstack/+/865014
Change-Id: I2a856b15eda992f0e78ee8eff65f39646e24c936
Grenade jobs stop services, check fip connectivity
for a nova server and then upgrade to the next release.
But since the ovn data plane and db services are stopped along
with the other services, fip connectivity fails as a result.
We shouldn't stop these services along with the other
neutron services. This patch sets "SKIP_STOP_OVN" to True
to skip stopping the ovn services.
This fixes the ovn grenade jobs.
Depends-On: https://review.opendev.org/c/openstack/devstack/+/839362
Change-Id: I2bd3f7e5f0af9a6532db7a1cdb4bc45a963042ca
Nova recently introduced an issue that breaks FFU due to requiring
new compute service records to be recorded before control services are
allowed to start. However the former depends on the latter, which
creates an unsatisfiable circular dependency. Since this has been
both released and backported, Nova introduced a workaround config flag
specifically to get out of that loop, which we set here if needed.
Related-Bug: #1958883
Change-Id: If4820dd4ec7e05ade6eb4a82a328797102704570
Depends-On: https://review.opendev.org/c/openstack/nova/+/826097
Recently we added a check in cinder for the cipher algorithm [1].
Currently grenade does not pass a cipher when creating an
encrypted type, hence the grenade job is failing in the glance and
cinder gates. This patch fixes that by passing the cipher and other
encryption properties.
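
As a rough sketch of what passing the cipher can look like with
python-openstackclient: the helper name and the specific provider,
cipher, key size and control location values below are assumptions
for illustration, not values confirmed by this change.

```shell
# Hypothetical helper: create an encrypted volume type with an explicit
# cipher. The provider, cipher, key size and control location are
# assumed example values.
create_encrypted_volume_type() {
    type_name=$1
    openstack volume type create \
        --encryption-provider luks \
        --encryption-cipher aes-xts-plain64 \
        --encryption-key-size 256 \
        --encryption-control-location front-end \
        "$type_name"
}
```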
[1] https://review.opendev.org/c/openstack/cinder/+/800268
Closes-Bug: #1938763
Change-Id: I1eb352deca11916f4db1010a50d15a83e33a61bb
The openstackclient has defaulted to using the v3 API since 4.0.0 [0],
so there is no reason for grenade to insist on using v2, especially since
v2 is being removed in Xena [1].
[0] a96089ff6d
[1] https://review.opendev.org/c/openstack/cinder/+/792299
Change-Id: Ie558d45374f92f79942b80a9c27fd393ad4f4761
Grenade does not support fast-forward upgrades, so the per-release
scripts for old releases are not used by current grenade.
A from-* script in current grenade is only run via from-$BASE_DEVSTACK_BRANCH
- 1a1f58a69e/inc/upgrade (L92)
and BASE_DEVSTACK_BRANCH is wallaby for current master grenade, or in
general the immediately previous release, which means we do not need to
keep the extra upgrade scripts for older releases.
The same applies to the within-* scripts.
Change-Id: I66b021f03faa38f976243b7f194454360def215f
Remove condition for checking if BASE_DEVSTACK_DIR/lib/neutron
exists which was introduced ~5 years ago.
Based on the comment it looks like it can be deleted.
Change-Id: Ice9a709ad75bb5794aebed379fb35f78f0b3422f
Victoria only supports focal, so the upgrade from victoria
(starting from victoria+1) should use focal.
This patch should not be backported to victoria, because the
upgrades from ussuri still need to use bionic.
In order to make it work on focal:
- do not fail if ebtables does not contain the broute table
(which happens when it is based on nft);
- when the swift loopback image is remounted on upgrade,
do not use the nobarrier option, which was removed
with the 4.19 kernel. See also the corresponding change
in devstack: I6871a7765e3e04122d8d546f43d36bb8415383fc
Change-Id: If57c54828baf4e250ad08fdd95351490010e1b41
Cinder volume deletion is async and can take time to complete in the
c-vol backend *after* c-api has already returned to the caller.
As such we need to wait until encrypted volumes are deleted fully before
attempting to delete the associated encrypted volume type as this
request will fail when the volume is still being deleted by c-vol.
This change adds a simple waiter to ensure the volume is removed before
removing the type.
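
A minimal sketch of such a waiter, assuming a probe command that
succeeds while the resource still exists and fails once it is gone.
The wait_until_gone name and the probe shown in the usage comment are
hypothetical, not the code added by this change.

```shell
# Hypothetical waiter: poll until a probe command starts failing
# (i.e. the resource is gone) or the timeout in seconds is reached.
wait_until_gone() {
    timeout_s=$1
    shift
    elapsed=0
    while "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Assumed usage: "openstack volume show" is expected to exit non-zero
# once the volume no longer exists.
# wait_until_gone 60 openstack volume show "$volume_id" &&
#     openstack volume type delete "$type_name"
```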
Change-Id: I466763ae9fc5a7ad13b498d43b0c16802c1b800b
Closes-Bug: #1907157
The screen_stop function was removed some time ago together
with the screen support (I8c27182f60b0f5310b3a8bf5feb02beb7ffbb26a).
This probably means no proper horizon/grenade testing has been
done in the past 3 years...
Change-Id: I389e2a4f8135e0e22a70098b95c17457cbedf1cf
This runs the online data migrations for placement after upgrading
to new code and synchronizing the DB schema.
Change-Id: I91e7a7e451b5756722eef20ed4ee61f2d35c2fd6
This change adds some basic coverage for attaching and detaching an
encrypted LUKS volume during an upgrade. To keep things readable a new
_wait_for_volume_update utility function is introduced and called
throughout the cinder tests when waiting for a volume to switch state or
to become bootable.
Change-Id: I894ef91a1e38775b1b56feb84612c7661046b4ba
Starting with train, Cinder API v1 is no longer deployed, so we should
no longer hardcode its use.
Change-Id: I31513b5d18ac1fdbf58d932d1ef132675a019e28
Until now we haven't needed this because placement was being upgraded
as part of nova and placement was not requiring a db sync to upgrade.
Now it does, so an upgrade.sh is added to 15_placement, based on
keystone's (which is simpler than nova's), and the repeated parts
in nova's upgrade.sh are removed. A settings file is also added.
Change-Id: I5354a6b32b73c613018bbe17f0691c69f7839b5f
Needed-By: https://review.opendev.org/669170
In the cinder resource creation we have been timing out ssh connection
tests to the test server. Unfortunately all of the existing logs I can
find indicate that this server is fine. Let's dump the console log to see
if that gives us any more info when ssh connectivity times out.
Change-Id: Ife397941deab80658fff375846a690a46a0c8b02
When tls-proxy is enabled, a few functions in lib/tls rely on lib/apache
functions (restart_apache_server, stop_apache_server, enable_apache_site,
enable_apache_mod).
It is thus necessary to source lib/apache whenever lib/tls is used.
Change-Id: Ia9ffdc490d0cba04a084d572312e177b19791585
In the from-rocky placement extraction script, this patch reverses
the order of migrating the database and creating placement.conf.
This is because ``[placement_database]/connection`` should be set
before the database migration shell script is executed if we want
to stamp the database version within the shell script.
Needed-By: https://review.openstack.org/621733
Change-Id: I8fe401814f887c8b29f1c6262d8b9e5263a07bcc
On slower test nodes we're seeing failures where the
2nd cinder volume created is not yet available by the
time we try to attach it to a server, so the attach fails
with a 400 InvalidVolume error. This fixes the bug by waiting
for the volume to be available before trying to attach it.
Change-Id: I833d79ecc97ddc844bf156ab64477c7c77424f20
Closes-Bug: #1807520
When running grenade under python3, swift runs under python2. Swift
includes openstack client in its requirements so when it installs under
Python 2 it clobbers the existing openstack client at /usr/local/bin and
any previously installed plugins are wiped out because their entry
points aren't seen. This was discovered because the osc-placement
commands were not available to nova inventory verification commands
during the nova upgrade, which comes after swift.
To address this, we do a reinstall at the end of swift's upgrade.
Because python-openstackclient is in ENABLED_PYTHON3_PACKAGES,
it will install as Python3 when installed standalone.
Note that the issue being fixed here is python3 specific, not
OpenStack release specific.
Closes-Bug: #1805156
Change-Id: If23619885a4766e243b39eae6041efe4a4d0a3af
Under python3, the output of the openstack resource provider
inventory list command comes back in random order, which breaks how the
_get_inventory_value function is used to store and verify
inventory before and after the upgrade.
This change uses the "openstack resource provider inventory show"
command to target the specific resource class and column value (total)
so we don't have to worry about sort order.
While in here, a comment is left about why we can't do the
same for getting the resource class allocation value.
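
The targeted lookup described above could be sketched roughly as
follows. The get_inventory_total helper is a hypothetical name (it is
not the _get_inventory_value function from grenade), and the sketch
assumes the osc-placement plugin is installed.

```shell
# Hypothetical helper: print the "total" of one resource class for one
# resource provider, so the result does not depend on list ordering.
get_inventory_total() {
    provider_uuid=$1
    resource_class=$2
    openstack resource provider inventory show \
        "$provider_uuid" "$resource_class" -f value -c total
}
```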
Change-Id: I1e8dadae631bee87628c5b5390609deb8a1a71e5
Closes-Bug: #1803312
This adds the main upgrade steps for placement extraction
from nova in stein. A from-rocky script is added which will
run after openstack/placement is cloned/installed by devstack
but before services are started on the stein side of the
grenade run.
If the CI infrastructure (devstack-gate/zuul) does not clone the
openstack/placement repo then we need to do it ourselves.
Note that until devstack is actually new enough to configure
the extracted placement repo from change
https://review.openstack.org/#/c/600162/ this upgrade
script will noop and we'll continue to deploy and use placement
from nova. This is needed since we have a catch-22 dependency
with that devstack change which can't land without this grenade
change.
The script will copy the placement-related tables from the
nova_api database and put them into the placement database,
which is also created by the upgrade script.
Then, it will write out the placement config file along
with the placement database connection so the placement
service can start on the stein (new) side.
Finally, it will write out the uwsgi config for placement,
and disable the nova-placement-api apache site in favor of
placement-api.
Co-Authored-By: Chris Dent <cdent@anticdent.org>
Change-Id: Ia0f19debb442be2b3d04eae238a3d7287393b5eb
Starting with the Stein series, the ironic virt driver no longer exposes
the standard VCPU/MEMORY/DISK resource classes. If for any reason the
inventory for them is created on the Rocky side of grenade, the verification
phase fails after upgrading to Stein. This breaks ironic-inspector CI.
This patch updates the logic to check for a custom resource class.
Change-Id: I265046d1615a34f55836264ea8f6ce72d32391cf
Change I3da3530aa73a8a1500116bcefdcba7b947d5e05e in devstack
renamed the "Member" role to "member" which is OK for MySQL
which is case insensitive by default, but not postgresql which
is case sensitive by default. This change fixes the role name
usage in grenade and also removes the reference to the invalid
bug against keystone.
Change-Id: I3e24581c5c77e07edc7c867296e066f40acbbc29
Closes-Bug: #1792983
Cells v2 has been required for nova since Ocata
so we can drop the conditional logic on the
NOVA_CONFIGURE_CELLSV2 variable, set by the (now branch-specific)
neutron-grenade job. The only remaining check for
the variable is in the from-mitaka script (which might
be irrelevant at this point given mitaka and newton
are both end of life).
Change-Id: Ic3101ef5f82b3341772a591669ff96bf9ab72ab6
This makes the nova resources module save information about the host
inventory (first node only), as well as the allocation information
about the test instance before the upgrade. Afterwards in the
verify phase, we compare those values to the current ones to make
sure we have not lost or changed anything during the upgrade.
Change-Id: Ifef2acf7dce17d0fa21baac9da6f8403e69136a4
This patch swaps the order of doing api_db sync with db_sync.
The API database migration should be run before the migrations for
the main/cell databases. This is because the former contains
information about how to find and connect to the latter.
This was discovered in Rocky since we added a data model change
(a new column to the cell_mappings table) that depends on db sync
being run after having added this new column to the api_db (i.e running
api_db sync before db sync). The data model change was the first time
where the actual sync order became meaningful.
This has been correctly done in devstack and the fact that grenade was
doing the db sync before api_db sync was hidden by the fact that
devstack had run at least once before reaching this part of grenade,
which meant that the database was already initialized.
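
A sketch of the corrected ordering, assuming the standard nova-manage
subcommands. NOVA_MANAGE and sync_nova_databases are hypothetical
names for illustration; the real upgrade script may pass additional
options such as a config file.

```shell
# NOVA_MANAGE is a hypothetical override for the command to run;
# it defaults to the nova-manage CLI.
NOVA_MANAGE=${NOVA_MANAGE:-nova-manage}

sync_nova_databases() {
    # API database first: it holds the cell mappings that tell nova
    # how to find and connect to the main/cell databases.
    $NOVA_MANAGE api_db sync || return 1
    # Main/cell databases second.
    $NOVA_MANAGE db sync
}
```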
Change-Id: Ic790ef7c3531c2b672621310524797548246b2ef
Closes-Bug: #1761775
The openstack server remove floating ip CLI no longer works
with python-novaclient 10.0.0 directly because novaclient
removed its deprecated API bindings for updating a floating
IP to unset its associated port.
We need to fix python-openstackclient, deprecate that CLI and
eventually remove it, but at the same time, we can get ahead of
that deprecation and change to the 'floating ip unset --port' CLI.
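
For illustration, the replacement call might be wrapped like this.
The disassociate_floating_ip helper name is an assumption; only the
'floating ip unset --port' command comes from this change.

```shell
# Hypothetical helper: detach a floating IP from whatever port it is
# associated with, without going through the
# "openstack server remove floating ip" path.
disassociate_floating_ip() {
    floating_ip=$1
    openstack floating ip unset --port "$floating_ip"
}
```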
Change-Id: I63d69bc2b96df04777f00f32930e92564e33e8c2
Closes-Bug: #1745795
Placement API was not being shut down at the end of the OLD target,
thus when it was being started at the beginning of the NEW target it
was not actually getting new code. A call to stop_placement is added.
In addition, make sure placement is installed in upgrade.sh so that
any new stuff is in place.
The depends-on is to a devstack change which makes sure that
lib/placement does not call remove_uwsgi_config when stop_placement is
called. Without this, there's no config file for the
devstack@placement-api systemd unit to run, and there is an immediate
exit.
Change-Id: I7f2158aeaef82a47e11c6e29675e542023fff4be
Depends-On: Iee763adf7895145d97b184924896db3f1f48a015
Closes-Bug: #1736385
In the pike release we migrated everything to use uwsgi. There is no
need to force everything back to eventlet. This was necessary when pike
was master because in ocata we used eventlet. We don't want to switch
deployment mechanisms in an upgrade. But now that pike is base we don't
need to have any of these since everything is consistent on both sides
of the upgrade.
Depends-On: I066f5f87ff22d7da2e3814f8c2de75f2af625d2b
Change-Id: Ib504ab21dfc5e32eb3f73f57d636981963e20520
* Stop nova-conductor explicitly if it is still running
(also filed a devstack specific change in
I9ffd6d09df6f390a842b8a374097f144564d2db4)
* Run keystone, cinder, nova etc under mod_wsgi as we run
into problems with uwsgi (need to fix those but not
right now during the release process)
* Make sure we use singleconductor as grenade doesn't yet
support multiple rabbitmq(s) in the multinode scenario
* Hack to pass through some of the variables from above
to the additional node before it runs stack.sh as
the defaults won't work (superconductor etc)
Depends-On: I075eb5a88113acfa36519e2c6e2aab87836be065
Depends-On: I9ffd6d09df6f390a842b8a374097f144564d2db4
Change-Id: If4c82ca12fe7b8b1ca7cfd8181d24dbd8dad3baa
This commit removes the ensure_logs_exist call in all the upgrade
scripts. This function currently only works in a non-systemd world. Now
that all the logs go in the journal we can't check for separate log
files anymore. In the future we'll modify the function to handle
systemd and the journal properly, but for right now this is to unblock
things.
Change-Id: Iedf824a1772115e0dff287a898636f8e58471269
Nova now has multiple databases, let's add them to the list:
nova_api nova_cell0 nova_cell1
Since it's hard to detect which set of databases are present,
we should disable bailing out when trying to save these.
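
One way to express "try every database, but don't bail out" is
sketched below. DB_DUMP and save_databases are hypothetical names, and
the mysqldump default assumes a MySQL backend; this is not the exact
code changed here.

```shell
# DB_DUMP is a hypothetical override for the dump command; the default
# assumes a MySQL backend with mysqldump on the PATH.
DB_DUMP=${DB_DUMP:-mysqldump}

# Try to dump every named database into out_dir. A failed dump (for
# example, a database that does not exist on this side of the upgrade)
# is reported but does not abort the loop.
save_databases() {
    out_dir=$1
    shift
    for db in "$@"; do
        if ! $DB_DUMP "$db" > "$out_dir/$db.sql"; then
            echo "skipping $db: dump failed" >&2
            rm -f "$out_dir/$db.sql"
        fi
    done
    return 0
}

# Assumed usage: save_databases /tmp/db-dumps nova nova_cell0 nova_cell1
```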
Change-Id: I075eb5a88113acfa36519e2c6e2aab87836be065