This resolves a TODO to remove the delete attachment call from refresh
after the remove_volume_connection call.
The remove volume connection process itself deletes the attachment when
the delete_attachment flag is passed.
Bumps RPC API version.
Change-Id: I03ec3ee3ee1eeb6563a1dd6876094a7f4423d860
cmd nova-manage volume_attachment refresh vm-id vol-id connector
There were cases where the instance was recorded as living on compute#1
but the connection_info in the BDM record was for compute#2, and when the
script called `remove_volume_connection`, nova would call os-brick on
compute#1 (the wrong node) and try to detach it.
In some cases os-brick would mistakenly think that the volume was
attached (because the target and lun matched an existing volume on the
host) and would try to disconnect, resulting in errors on the compute
logs.
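
The guard against this mismatch can be pictured roughly as below. This
is a minimal sketch only: the HostConflict name comes from this change,
but check_connector_host and the dict shapes are hypothetical and are
not nova's actual code.

```python
class HostConflict(Exception):
    """Raised when the connector's host differs from the instance's host."""


def check_connector_host(instance, connector):
    # Hypothetical helper: refuse to continue when the connector was
    # generated on a different compute host than the one the instance
    # is recorded as living on.
    if instance["host"] != connector["host"]:
        raise HostConflict(
            "instance is on %s but connector is for %s"
            % (instance["host"], connector["host"]))
```

Failing early like this avoids ever asking os-brick on the wrong node to
disconnect a volume.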
- Added HostConflict exception
- Fixes dedent in cmd/manage.py
- Updates nova-manage doc
Closes-Bug: #2012365
Change-Id: I21109752ff1c56d3cefa58fcd36c68bf468e0a73
When people transition from three ironic nova-compute processes down
to one process, we need a way to move the ironic nodes, and any
associated instances, between nova-compute processes.
For safety, a nova-compute process must first be forced_down via
the API, similar to when using evacuate, before moving the associated
ironic nodes to another nova-compute process. The destination
nova-compute process should ideally not be running, but not forced
down.
blueprint ironic-shards
Change-Id: I33034ec77b033752797bd679c6e61cef5af0a18f
OSError is only raised if the file path is not readable because of a
permission issue.
With this change we get the correct error message.
Change-Id: Iad3b0f2ab3e6eafd9f6c98477edfa35c4cd46ee8
Moved lock and unlock instance code to context manager.
Updated _refresh volume attachment method to use instance
lock context manager
Now there will be a single request ID for the lock, refresh, and unlock
actions. Earlier, the volume_attachment refresh operation used to have a
unique req-id for each action.
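
A context manager of this kind might look like the sketch below. The
name locked_instance and the compute_api object with lock/unlock methods
are assumptions made for illustration, not the code in this change.

```python
import contextlib


@contextlib.contextmanager
def locked_instance(compute_api, context, instance):
    # Hypothetical sketch: lock on entry and always unlock on exit.
    # Reusing one request context for both calls is what lets the lock,
    # the wrapped work, and the unlock share a single request ID.
    compute_api.lock(context, instance)
    try:
        yield
    finally:
        compute_api.unlock(context, instance)
```

The refresh work would then run inside "with locked_instance(...):", so
the unlock happens even if the refresh raises.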
Related-Bug: #2012365
Change-Id: I6588836c3484a26d67a5995710761f0f6b6a4c18
This change adds the pre-commit config and
tox targets to run codespell both independently
and via the pep8 target.
This change corrects all the final typos in the
codebase as detected by codespell.
Change-Id: Ic4fb5b3a5559bc3c43aca0a39edc0885da58eaa2
This addresses comments from code review to add handling of PCPU during
the migration/copy of limits from the Nova database to Keystone. In
legacy quotas, there is no settable quota limit for PCPU, so the limit
for VCPU is used for PCPU. With unified limits, PCPU will have its own
quota limit, so for the automated migration command, we will simply
create a dedicated limit for PCPU that is the same value as the limit
for VCPU.
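
The PCPU handling can be illustrated with a small sketch. The function
name and the legacy-to-unified name mapping below are assumptions for
illustration; only the "PCPU gets the VCPU value" rule comes from this
change.

```python
def legacy_to_unified(legacy_limits):
    # Illustrative mapping from legacy quota names to unified limit
    # resource names (an assumption, not the full list).
    mapping = {"cores": "class:VCPU", "ram": "class:MEMORY_MB"}
    unified = {
        mapping[name]: value
        for name, value in legacy_limits.items()
        if name in mapping
    }
    # Legacy quotas have no separate PCPU limit, so reuse the VCPU value.
    if "class:VCPU" in unified:
        unified["class:PCPU"] = unified["class:VCPU"]
    return unified
```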
On the docs side, this adds more detail about the token authorization
settings needed to use the nova-manage limits migrate_to_unified_limits
CLI command and documents more OSC limit commands like show and delete.
Related to blueprint unified-limits-nova-tool-and-docs
Change-Id: Ifdb1691d7b25d28216d26479418ea323476fee1a
When people transition from three ironic nova-compute processes down
to one process, we need a way to move the ironic nodes, and any
associated instances, between nova-compute processes.
For safety, a nova-compute process must first be forced_down via
the API, similar to when using evacuate, before moving the associated
ironic nodes to another nova-compute process. The destination
nova-compute process should ideally not be running, but not forced
down.
blueprint ironic-shards
Change-Id: I7ef25e27bf8c47f994e28c59858cf3df30975b05
This command aims to help migrate to unified limits quotas by reading
legacy quota limits from the Nova database and calling the Keystone API
to create corresponding unified limits.
Related to blueprint unified-limits-nova-tool-and-docs
Change-Id: I5536010ea1212918e61b3f4f22c2077fadc5ebfe
Previously, we archived deleted rows in batches of max_rows parents +
their child rows in a single database transaction. Doing it that way
limited how high a value of max_rows could be specified by the caller
because of the size of the database transaction it could generate.
For example, in a large scale deployment with hundreds of thousands of
deleted rows and constant server creation and deletion activity, a
value of max_rows=1000 might exceed the database's configured maximum
packet size or timeout due to a database deadlock, forcing the operator
to use a much lower max_rows value like 100 or 50.
And when the operator has e.g. 500,000 deleted instances rows (and
millions of deleted rows total) they are trying to archive, being
forced to use a max_rows value several orders of magnitude lower than
the number of rows they need to archive was a poor user experience.
This changes the logic to archive one parent row and its foreign key
related child rows one at a time in a single database transaction
while limiting the total number of rows per table as soon as it reaches
>= max_rows. Doing this will allow operators to choose more predictable
values for max_rows and get more progress per invocation of
archive_deleted_rows.
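
The new batching can be sketched as follows. This is a simplified
illustration that counts rows across a single table and models each
transaction as a list; archive_table and its arguments are hypothetical
names, not the real database code.

```python
def archive_table(parents, children_of, max_rows):
    """Archive parents one at a time; return (rows_archived, transactions)."""
    archived = 0
    transactions = []
    for parent in parents:
        # Stop as soon as the running total reaches max_rows.
        if archived >= max_rows:
            break
        # One transaction holds one parent row plus its FK child rows.
        batch = [parent] + children_of.get(parent, [])
        transactions.append(batch)
        archived += len(batch)
    return archived, transactions
```

Because the limit is checked between parents, a run can finish slightly
above max_rows, but every transaction stays small regardless of how
large max_rows is.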
Closes-Bug: #2024258
Change-Id: I2209bf1b3320901cf603ec39163cf923b25b0359
Calling __str__ on a SQLAlchemy URL returns a URL with a masked password
in SQLAlchemy 2.0. We want to store the unmasked version when creating
new Cell objects.
Change-Id: I23fd38465f7dec20b00dc25776dfde18318000b1
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Suggested-by: Melanie Witt <melwittt@gmail.com>
This creates an online data migration that will allow manual pushing
of the instance.compute_id linkage to the ComputeNode object. This
will be necessary for instances that are deleted, shelved, never
scheduled or otherwise on a compute host that isn't updating it.
Related to blueprint compute-object-ids
Change-Id: I7a3d82d4d176ba12c4b8ffe6941b32a888db9b05
Hacking has bumped the version of flake8 that it's using to 5.0 in its
6.0.1 release. This turns up quite a few pep8 errors lurking in our
code. Fix them.
Needed-by: https://review.opendev.org/c/openstack/hacking/+/874516
Change-Id: I3b9c7f9f5de757f818ec358c992ffb0e5f3e310f
In 3.3.0 the align attribute applied to both the header
and the data. In 3.4.0 align only applies to the data.
This change restores the previous left align behavior
for both header and data.
Closes-Bug: #1988482
Change-Id: Ia77410b10c1706bc6561b11cf5d2ef72b936795e
We have many places where we implement singleton behavior for the
placement client. This unifies them into a single place and
implementation. Not only does this DRY things up, but may cause us
to initialize it fewer times and also allows for emitting a common
set of error messages about expected failures for better
troubleshooting.
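
A single shared accessor of this kind could be sketched as below. The
get_placement_client name and the factory argument are assumptions made
for illustration rather than the actual nova implementation.

```python
import threading

_client = None
_lock = threading.Lock()


def get_placement_client(factory):
    # Hypothetical accessor: build the client once, hand the same object
    # to every caller, and report construction failures in one place.
    global _client
    with _lock:
        if _client is None:
            try:
                _client = factory()
            except Exception as exc:
                print("Failed to initialize placement client: %s" % exc)
                raise
        return _client
```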
Change-Id: Iab8a791f64323f996e1d6e6d5a7e7a7c34eb4fb3
Related-Bug: #1846820
This is no longer needed as our minimum is 1.4.13.
Change-Id: I946a790f3461f1cf76a49c18596cc0a6f8058f6c
Signed-off-by: Stephen Finucane <sfinucan@redhat.com>
This is a follow-up for change Ic8783053778cf4614742186e94059d5675121db1
and corrects the 'image_property set --property' arg format in the
hw_machine_type doc. Newline formats in the nova-manage CLI doc are
cleaned up to be consistent throughout and the unnecessary () is removed
from the ImagePropertyCommands class.
Related to blueprint libvirt-device-bus-model-update
Change-Id: I5b67e9ae5125f6dad68cae7ac0601ac5b02e74b3
This adds an image property show and image property set command to
nova-manage to allow users to update image properties stored for an
instance in system metadata without having to rebuild the instance.
This is intended to ease migration to new machine types, as updating
the machine type could potentially invalidate the existing image
properties of an instance.
Co-Authored-By: melanie witt <melwittt@gmail.com>
Blueprint: libvirt-device-bus-model-update
Change-Id: Ic8783053778cf4614742186e94059d5675121db1
Previously the volume_attachments show command would incorrectly use the
nova.objects.BlockDeviceMapping.get_by_volume helper to fetch the
underlying BlockDeviceMapping object from the database that does not
support multiattach volumes.
This is corrected by switching to the get_by_volume_and_instance helper
that can pick out unique BlockDeviceMapping objects using both of the
supplied volume and instance UUIDs.
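
The difference between the two lookups can be shown with in-memory
records. The functions below only borrow the helper names; they are a
toy sketch over plain dicts, not the nova.objects implementation.

```python
def get_by_volume(bdms, volume_id):
    matches = [b for b in bdms if b["volume_id"] == volume_id]
    if not matches:
        raise LookupError("no block device mapping for %s" % volume_id)
    if len(matches) > 1:
        # A multiattach volume is attached to several instances at once.
        raise LookupError("volume %s has multiple attachments" % volume_id)
    return matches[0]


def get_by_volume_and_instance(bdms, volume_id, instance_uuid):
    # The (volume, instance) pair is unique even for multiattach volumes.
    for b in bdms:
        if b["volume_id"] == volume_id and b["instance_uuid"] == instance_uuid:
            return b
    raise LookupError("no matching block device mapping")
```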
Change-Id: Ifab05abf3775efb0f29f80c9300297208f60d5d9
Closes-Bug: #1945452
The nova-manage placement heal_allocations CLI is capable of healing
missing placement allocations due to port resource requests. To support
the new extended port resource request this code needs to be adapted
too.
When the heal_allocation command got the port resource request
support in train, the only way to figure out the missing allocations was
to dig into the placement RP tree directly. Since then nova gained
support for interface attach with such ports and to support that
placement gained support for in_tree filtering in allocation candidate
queries. So now the healing logic can be generalized to the following:
For a given instance
1) Find the ports that have a resource request but no allocation key in
the binding profile. These are the ports we need to heal
2) Gather the RequestGroups from these ports and run an
allocation_candidates query restricted to the current compute of the
instance with in_tree filtering.
3) Extend the existing instance allocation with a returned allocation
candidate and update the instance allocation in placement.
4) Update the binding profile of these ports in neutron
The main change compared to the existing implementation is in step 2);
the rest is mostly the same.
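
Step 1 of the list above can be sketched as a simple filter. The
ports_to_heal name is hypothetical, and the ports are assumed to be
plain dicts shaped like Neutron port API responses.

```python
def ports_to_heal(ports):
    # A port needs healing when it has a resource request but its
    # binding profile carries no allocation key.
    return [
        port for port in ports
        if port.get("resource_request")
        and "allocation" not in (port.get("binding:profile") or {})
    ]
```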
Note that support for old resource request format is kept alongside of
the new resource request format until Neutron makes the new format
mandatory.
blueprint: qos-minimum-guaranteed-packet-rate
Change-Id: I58869d2a5a4ed988fc786a6f1824be441dd48484
This optional kwarg to the nova.volume.cinder.API.attachment_update
method ends up stashed in the connector passed to c-api and sets the
device associated with the attachment within Cinder. While leaving this
unset has no real-world impact, it should be kept the same as the
original attachment for completeness.
The Cinder fixture is extended to mimic the behaviour of
nova.volume.cinder.API.attachment_update prior to calling Cinder
allowing us to assert the value stashed in the connector and attachment
record.
Closes-Bug: #1945450
Change-Id: Ib2938a407598bf2dd466aae41700f350d2d34418
While these commands would previously use LOG.exception to at least log
the exceptions in nova-manage.log, the user wouldn't see anything printed
to stdout by default. This change logs a simple message to the user
pointing them in the direction of the more verbose log if they need
more help.
Change-Id: I28ed8e35e057e5b57d1da437616f8aff1a184fe4
Currently, when 'nova-manage db archive_deleted_rows' is run with
the --until-complete option, the process will archive rows in batches
in a tight loop, which can cause problems in busy environments where
the aggressive archiving interferes with other requests trying to write
to the database.
This adds an option for users to specify an amount of time in seconds
to sleep between batches of rows while archiving with --until-complete,
allowing the process to be throttled.
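
The throttled loop can be pictured as in the sketch below. The
archive_until_complete function, its archive_batch callable, and the
injectable sleep argument are hypothetical names used for illustration.

```python
import time


def archive_until_complete(archive_batch, sleep_seconds=0, sleep=time.sleep):
    total = 0
    while True:
        archived = archive_batch()
        if archived == 0:
            break
        total += archived
        # Pause between batches so other writers get a turn at the DB.
        if sleep_seconds:
            sleep(sleep_seconds)
    return total
```

With sleep_seconds left at 0 the loop behaves like the previous tight
loop, so the throttling stays opt-in.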
Closes-Bug: #1912579
Change-Id: I638b2fa78b81919373e607458e6f68a7983a79aa
This adds a force kwarg to delete_allocation_for_instance which
defaults to True because that was found to be the most common use case
by a significant margin during implementation of this patch.
In most cases, this method is called when we want to delete the
allocations because they should be gone, e.g. server delete, failed
build, or shelve offload. The alternative in these cases is the caller
could trap the conflict error and retry but we might as well just force
the delete in that case (it's cleaner).
When force=True, it will DELETE the consumer allocations rather than
GET and PUT with an empty allocations dict and the consumer generation
which can result in a 409 conflict from Placement. For example, bug
1836754 shows that in one tempest test that creates a server and then
immediately deletes it, we can hit a very tight window where the method
GETs the allocations and before it PUTs the empty allocations to remove
them, something changes which results in a conflict and the server
delete fails with a 409 error.
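
The race can be reproduced with a toy model. FakePlacement below is an
in-memory stand-in written for this illustration, and the racer hook
exists only to simulate a concurrent writer; neither is part of nova or
of the Placement API.

```python
class Conflict(Exception):
    """Stands in for a 409 response from Placement."""


class FakePlacement:
    # Toy in-memory consumer record; not the real Placement service.
    def __init__(self):
        self.allocations = {"rp1": {"VCPU": 1}}
        self.generation = 1

    def get(self):
        return dict(self.allocations), self.generation

    def put(self, allocations, generation):
        if generation != self.generation:
            raise Conflict("consumer generation conflict")
        self.allocations = allocations
        self.generation += 1

    def delete(self):
        # DELETE carries no consumer generation, so it cannot conflict.
        self.allocations = {}


def delete_allocation_for_instance(placement, force=True, racer=None):
    if force:
        placement.delete()
        return
    _current, generation = placement.get()
    if racer is not None:
        racer()  # simulate another writer between the GET and the PUT
    placement.put({}, generation)
```

With force=False and a racing write, the PUT raises Conflict; with
force=True the allocations are removed regardless.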
It's worth noting that delete_allocation_for_instance used to just
DELETE the allocations before Stein [1] when we started taking consumer
generations into account. There was also a related mailing list thread
[2].
Closes-Bug: #1836754
[1] I77f34788dd7ab8fdf60d668a4f76452e03cf9888
[2] http://lists.openstack.org/pipermail/openstack-dev/2018-August/133374.html
Change-Id: Ife3c7a5a95c5d707983ab33fd2fbfc1cfb72f676
Add a combination of commands to allow users to show existing stashed
connection_info for a volume attachment and update volume attachments
with fresh connection_info from Cinder by recreating the attachments.
Unfortunately we don't have an easy way to access host connector
information remotely (i.e. over the RPC API), meaning we need to also
provide a command to get the compute specific connector information
which must be run on the compute node that the instance is located on.
Blueprint: nova-manage-refresh-connection-info
Co-authored-by: Stephen Finucane <stephenfin@redhat.com>
Change-Id: I2e3a77428f5f6113c10cc316f94bbec83f0f46c1
Rewrap some code just to make it a little less fugly to read.
Change-Id: If78bbd578bbba73fc85446ad55d34d3addd6c4af
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Merge these, removing an unnecessary layer of abstraction, and place
them in the new 'nova.db.main' directory. The resulting change is huge,
but it's mainly the result of 's/sqlalchemy import api/main import api/'
and 's/nova.db.api/nova.db.main.api/' with some necessary cleanup. We
also need to rework how we do the blocking of API calls since we no
longer have a 'DBAPI' object that we can monkey patch as we were doing
before. This is now done via a global variable that is set by the 'main'
function of 'nova.cmd.compute'.
The main impact of this change is that it's no longer possible to set
'[database] use_db_reconnect' and have all APIs automatically wrapped in
a DB retry. Seeing as this behavior is experimental, isn't applied to
any of the API DB methods (which don't use oslo.db's 'DBAPI' helper),
and is used explicitly in what would appear to be the critical cases
(via the explicit 'oslo_db.api.wrap_db_retry' decorator), this doesn't
seem like a huge loss.
Change-Id: Iad2e4da4546b80a016e477577d23accb2606a6e4
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Nested allocations are only partially supported in nova-manage placement
heal_allocations CLI. This patch documents the missing support and
blocks healing instances with VGPU or Cyborg device profile request in
the embedded flavor. Blocking is needed because if --force is used with
such instances, the tool could recreate an allocation ignoring some of
these resources.
Change-Id: I89ac90d2ea8bc268940869dbbc90352bfad5c0de
Related-Bug: #1939020
To query a resource provider by uuid, request path should look like
/resource_providers?uuid=<uuid>
instead of
/resource_providers&uuid=<uuid>
This change fixes the wrong path so that "nova-manage placement audit"
command can look up the target resource provider properly.
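
Building the path with a URL helper avoids this class of mistake. A
minimal sketch, where resource_provider_query is a hypothetical name:

```python
from urllib.parse import urlencode


def resource_provider_query(uuid):
    # The query string starts with '?'; '&' only separates the
    # parameters that follow the first one.
    return "/resource_providers?" + urlencode({"uuid": uuid})
```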
Closes-Bug: #1936278
Change-Id: I2ae7e9782316e3662e4e51e3f5ba0ef597bf281b
No need for these things to live in two places.
Change-Id: Ided8e57ddd0cf12fd7d37d4fe029fb1807f8d719
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
The task_log table contains instance usage audit records if
nova-compute has been configured with [DEFAULT]instance_usage_audit =
True. This will be the case if OpenStack Telemetry is being used in the
deployment, as the option causes nova to generate audit information
that Telemetry then retrieves from the server usage audit log API [1].
Historically, there has been no way to delete task_log table records
other than manual database modification. Because of this, task_log
records could pile up over time and operators are forced to perform
manual steps to periodically truncate the table.
This adds a --task-log option to the 'nova-manage db
archive_deleted_rows' CLI to also archive task_log records while
archiving the database. --task-log works in conjunction with --before
if operators desire archiving only records that are older than <date>.
The 'updated_at' field is used by --task-log --before <date> to
determine the age of a task_log record for archival.
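
The age check can be sketched as a filter on 'updated_at'. The function
name is hypothetical, the records are assumed to be plain dicts, and the
strict "older than" comparison is an assumption about the boundary case.

```python
from datetime import datetime


def task_logs_to_archive(records, before=None):
    # Without a cutoff date every task_log record is a candidate.
    if before is None:
        return list(records)
    # Otherwise keep only records last updated before the cutoff.
    return [r for r in records if r["updated_at"] < before]
```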
Closes-Bug: #1877189
[1] https://docs.openstack.org/api-ref/compute/#server-usage-audit-log-os-instance-usage-audit-log
Change-Id: Ibed67854a693c930effd4dba7aca6cd03b65bd92
This patch makes the necessary change to adapt to the SQLAlchemy 1.4
release in a way that is still compatible with the currently pinned 1.3
versions.
This is related to the overall effort to bump SQLAlchemy to 1.4
https://review.opendev.org/c/openstack/requirements/+/788339
Co-Authored-By: Mike Bayer <mike_mp@zzzcomputing.com>
Closes-Bug: #1926426
Change-Id: I8a0ab3b91b4203ab603caac02ee5132be7890e9a
This change adds a libvirt command to list all instance UUIDs with
hw_machine_type unset in their image metadata. This will be useful to
operators attempting to change the [libvirt]hw_machine_type default in
the future as it allows them to confirm if it is safe to change the
configurable without impacting existing instances.
blueprint: libvirt-default-machine-type
Change-Id: I39909ace97f62e87f326d4d822d4e4c391445765
This change adds a second update command to the libvirt group
within nova-manage. This command will set or update the machine type of
the instance when the following criteria are met:
* The instance must have a ``vm_state`` of ``STOPPED``, ``SHELVED`` or
``SHELVED_OFFLOADED``.
* The machine type is supported. The supported list includes alias and
versioned types of ``pc``, ``pc-i440fx``, ``pc-q35``, ``q35``, ``virt``
or ``s390-ccw-virtio``.
* The update will not move the instance between underlying machine types.
For example, ``pc`` to ``q35``.
* The update will not move the instance between an alias and versioned
machine type or vice versa. For example, ``pc`` to ``pc-1.2.3`` or
``pc-1.2.3`` to ``pc``.
A --force flag is provided to skip the above checks but caution
should be taken as this could easily lead to the underlying ABI of the
instance changing when moving between machine types.
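
The checks above can be approximated by the sketch below. It is an
illustration under stated assumptions, not the nova-manage code: the
function name, the FAMILY grouping (treating "pc" as the alias of the
"pc-i440fx" family and "q35" as the alias of "pc-q35"), and the version
pattern are all assumptions.

```python
import re

ALLOWED_VM_STATES = {"stopped", "shelved", "shelved_offloaded"}
# Assumed grouping of type names into underlying machine type families.
FAMILY = {
    "pc": "pc", "pc-i440fx": "pc",
    "q35": "q35", "pc-q35": "q35",
    "virt": "virt", "s390-ccw-virtio": "s390-ccw-virtio",
}
# Group 1 is the type name, group 2 the optional version suffix.
_PATTERN = re.compile(
    r"^(%s)(-\d+\.\d+.*)?$" % "|".join(re.escape(n) for n in FAMILY))


def check_machine_type_update(vm_state, current, new, force=False):
    if force:
        return  # --force skips every check below
    if vm_state not in ALLOWED_VM_STATES:
        raise ValueError("vm_state %s does not allow an update" % vm_state)
    new_match = _PATTERN.match(new)
    if new_match is None:
        raise ValueError("unsupported machine type %s" % new)
    cur_match = _PATTERN.match(current) if current else None
    if cur_match is None:
        return  # nothing recognizable recorded yet, nothing to compare
    if FAMILY[cur_match.group(1)] != FAMILY[new_match.group(1)]:
        raise ValueError("cannot move between underlying machine types")
    if bool(cur_match.group(2)) != bool(new_match.group(2)):
        raise ValueError("cannot move between alias and versioned types")
```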
blueprint: libvirt-default-machine-type
Change-Id: I6b80021a2f90d3379c821dc8f02a72f350169eb3
This change introduces the first machine_type command to nova-manage to
fetch and display the current machine type if set in the system metadata
of the instance.
blueprint: libvirt-default-machine-type
Change-Id: Idc035671892e4668141a93763f8f2bed7a630812
This command was helpful to assist users FFUing past the Pike release,
however, it's no longer helpful and should be removed now. Do just that.
Change-Id: Ib42f65dbcf61ead571e9107e7ffbba2b29f48d64
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Change I2aae01ed235f1257b0b3ddc6aee4efc7be38eb6e indicated that this
command was no longer necessary and could be removed. In hindsight, it's
been unnecessary since Liberty, which introduced a blocking migration
requiring this script be run, and it could have been deleted years ago.
No time like the present though.
Change-Id: I532c7918a8e2c887f29d2f0e1e33b80f2b3a7507
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>