At the moment, if an instance migration/resize fails and the instance
is deleted while in an error state, the instance revert files are not deleted.
This change fixes this issue, ensuring that instance migration files
are cleaned up when an instance is destroyed.
(cherry picked from commit 24cc1f3fff)
Closes-Bug: #1724240
Change-Id: I3752e7d0bd96a7d418563afd997739e88a87914d
When spawning a Hyper-V instance with NICs having the vif_type "hyperv",
neutron will fail to bind the port to the Hyper-V host if the neutron
server doesn't have the "hyperv" mechanism driver installed and configured,
resulting in a PortBindingFailed exception on the nova-compute side.
When this exception is encountered, the logs will say to check the
neutron-server logs, but the problem and its solution are not obvious
or clear, resulting in plenty of questions / reports that all share the
same solution: make sure there is a live L2 agent on the host reporting
to neutron, and, if the neutron Hyper-V agent is used, install
networking-hyperv and configure neutron-server to use the "hyperv"
mechanism_driver.
(cherry picked from commit d4fa1d7da1)
Change-Id: I94ed8f15f3c6312a6f675b9255aff7e9d76f96fc
Closes-Bug: #1744032
The Nova Compute manager expects a VirtualInterfacePlugException
exception to be raised when a vif plug fails.
This particularly affects the service initialization step. If we're
raising an exception other than the above-mentioned one, the exception
won't be caught, in which case the Nova Compute service will stop.
This change ensures that we're raising the expected exception if vif
plugging fails.
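A minimal sketch of the pattern (the exception name mirrors the nova.exception class the compute manager catches; the function and its arguments are placeholders, not the actual driver code):

```python
# Sketch only: VirtualInterfacePlugException stands in for the
# nova.exception class the compute manager expects; plug_vif is a
# stand-in for the driver's real vif plugging call.
class VirtualInterfacePlugException(Exception):
    pass


def plug_vifs(instance_name, network_info, plug_vif):
    for vif in network_info:
        try:
            plug_vif(instance_name, vif)
        except Exception as ex:
            # Re-raise as the exception type the compute manager
            # handles, so service initialization doesn't die on an
            # unexpected exception type.
            raise VirtualInterfacePlugException(
                "Failed to plug vif %s for instance %s: %s"
                % (vif, instance_name, ex))
```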
(cherry picked from commit 6adc5cf4d4)
Change-Id: Ibaa405577d4fa06f27fdd9440aca49be0c620af5
Closes-Bug: #1736392
Reporting the disk_available_least field can help in making sure
the scheduler doesn't pick a host that cannot fit a specific flavor's
disk.
The reported local_gb_used is calculated based on the instances spawned
by nova on a certain compute node, and might not reflect reality,
especially in shared storage scenarios.
Change-Id: I20992acef119f11f6584094438043a760fc4a287
Closes-Bug: #1717892
(cherry picked from commit 6479f5539d)
During cold migration, the following steps happen:
1. The source node will stop the VM, copy the VM files to a _revert path,
destroy the VM, and return the _revert path as disk_info.
2. The destination node will get that disk_info, and if it was configured
not to move the storage (multiple CSVs cluster scenario), it will just
copy the files to a path without the ending _revert.
The issue is that rstrip can strip more characters than desired: it
removes all trailing characters contained in the "_revert" character
set from the given path.
For example:
"instance_000000be_revert".rstrip("_revert") => "instance_000000b"
Change-Id: I784835800a961be4ba755a4d51edbc931eccdc57
Closes-Bug: #1716862
(cherry picked from commit e00240f5d7)
In the case of multiple CSVs scenario, disks are not migrated on
cold migration. Thus, pathutils relies on the VM's ConfigurationDataRoot
when determining its disk file locations, but the VM's
ConfigurationDataRoot is set after checking the disks, resulting in
a DiskNotFoundException.
This commit addresses this issue.
Change-Id: I589239c6ebcd3cf706bc188179a74c97bb831a0d
Closes-Bug: #1716886
(cherry picked from commit 0c673d5c5c)
This change ensures that vif ports as well as volume connections
are properly removed after an instance fails to spawn.
In order to avoid having similar issues in the future, the
'block_device_info' and 'network_info' arguments become mandatory
for the VMOps.destroy method.
Side note: for convenience, one redundant unit test has been squashed.
Closes-Bug: #1714285
Change-Id: Ifa701459b15b5a2046528fa45eca7ab382f1f7e8
(cherry picked from commit 6e715ed580)
After cold migrating an instance, volumes are not disconnected on
the source node side. This change fixes it.
A recent change fixed a similar issue regarding vif ports but we've
missed volume connections.
Change-Id: I094b405a151366236d6b86e45e7a989818006e2b
Closes-Bug: #1705683
(cherry picked from commit fedb492a14)
The planned_vm_exists method exists in migrationutils, not in vmutils.
Change-Id: I9af57254b90ef787b4633fcb367a77857a018ff4
(cherry picked from commit 90b372ea42)
If an instance having iSCSI volumes attached is being
live-migrated, a Planned VM is created at the destination.
If the live-migration fails, the Planned VM is not cleaned
up at the destination.
Depends-On: I91636a82b057f566eab9887c422911163668f556
Change-Id: If62941eb44ff1a5bbf5df01f5cfd19d9008d98bb
Closes-Bug: #1604078
A recent change introduced a regression which prevents SMB volumes
from being detached. By passing the volume tag to os-win when
detaching a disk, os-win then ignores the disk path and attempts
to retrieve the attachment using the tag.
Virtual disk images don't get tagged, which is why such detach
operations fail silently.
For now, we'll just skip passing the tag when detaching such disks.
Change-Id: Ie0500c29bfbbd63761b732872f1dda92720dbe21
Closes-Bug: #1710616
Adds the following config options:
* max_failover_count - the maximum number of failovers that can occur
in the failover_period. Once a VM's number of failovers reaches
this number, the VM will simply end up in a Failed state.
* failover_period - the number of hours in which the max_failover_count
failovers can occur.
* auto_failback - allow the VM to failback to its original host once it
is available.
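With these options, a deployment limiting failovers might look like the following config fragment (the section name and the values shown are assumptions based on the option descriptions above, not documented defaults):

```ini
[hyperv]
# Allow at most 3 failovers within a 6 hour window; once a VM hits
# this limit, it will simply end up in a Failed state.
max_failover_count = 3
failover_period = 6
# Let the VM fail back to its original host once it is available.
auto_failback = true
```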
Closes-Bug: #1709598
Depends-On: I2af82c8ecf3bee2e2f2161976d75123f8a0c89e9
Change-Id: Ib28b41f95b544b1a9547b80f389d65dc8ecf5f9b
When attaching passthrough disks, we're passing a Msvm_DiskDrive
object path instead of the actual disk path.
The issue is that even if the corresponding disk path is retrieved, we
may fail to retrieve the above-mentioned resource path.
This change adds the missing check. This way, a retry may be
performed if needed. Also, this ensures that we don't get a VM disk
with no actual disk attached (Hyper-V allows it).
Closes-Bug: #1709049
Change-Id: I55abf82f51d4dc739083eec2ea479d2c8115670f
Retrieving the disk path can be a time consuming operation in the case
of passthrough disks. Since such disk attachments are tagged using the
volume id, we'll just use that instead.
Note that Hyper-V does not allow us to attach the same passthrough
disk to multiple instances, which means that we're safe to rely
on this tag.
Depends-On: Ic3775c3e81a5f05b20221ef52e393fdb54d0190c
Change-Id: I99448881d8f9e247e1c34b1cd489c5488eff7146
When the driver is initialized, we're attempting to capture serial
console output for instances that were not created by Nova,
although those should be skipped.
The event handler decides whether an instance should be skipped
based on its instance notes, which are expected to contain a UUID
(the instance uuid) in the case of Nova instances.
We'll do the same when the driver is initialized. Note that there
may be VMs that were not created by us but do have a UUID in their
"Notes" field, but this workflow is *good enough* and consistent for
now.
Alternatively, we could fetch from nova a list of instances that are
expected to reside on this node, but this may be out of sync. The
other option would be to change the notes format, adding some specific
prefix, but that would not be backwards compatible.
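The heuristic described above can be sketched as follows (the function name and the assumption that the instance uuid is the first note are illustrative, not the actual driver code):

```python
import uuid


def is_nova_managed(notes):
    """Sketch: treat a VM as Nova-managed if its first note parses
    as a UUID (the instance uuid)."""
    if not notes:
        return False
    try:
        uuid.UUID(notes[0])
        return True
    except ValueError:
        # Operator-created VMs typically have free-form notes.
        return False
```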
Change-Id: I156348e39a0db80160a7223d18dd9cfd277e7e5a
Closes-Bug: #1709057
There is a race condition during failover migration that, when
triggered, prevents the host from performing proper cleanup.
If the destination node finishes the failover before the host,
proper cleanup is done; otherwise there will be an ovs port
remaining on the source host.
This change addresses this issue by removing the racy check which
was previously used to determine whether the migration was done
properly.
This doesn't completely fix the bug because, if the host was already
updated, no action is taken at the source node.
Change-Id: I50213de235907f80e1d4266dfd6785ea147a7f0c
Partial-Bug: #1705992
In order to avoid namespace sharing with networking-hyperv, compute-hyperv
should use the compute_hyperv namespace.
Closes-Bug: #1709085
Change-Id: I44eede0ea6b6558c0ddf6ebdc342ed8f11395f3b
At the moment, the vif ports are not unplugged after cold migration
on the source node.
This change addresses this issue by passing the network info when
destroying the instance.
Change-Id: I902ab7499fa1aca7fab415393a9670350b8437aa
Closes-Bug: #1705683
This change fixes the number of arguments used by the cluster
driver destroy method, as per a recent nova change.
The corresponding argument was removed from the "standard" driver by
a different change, which omitted the cluster one.
Change-Id: I52ab6ed4ebed89ce55fb21d2d252187af98ab152
Starting with the Pike series, OpenStack no longer supports log
translation.
This patch removes log translations and updates hacking rules to prevent
future mistakes. It also enables H904, which allows the logging package
to skip creating the formatted log message if the message is not going
to be emitted because of the current log level. [1]
https://docs.openstack.org/oslo.i18n/latest/user/guidelines.html#adding-variables-to-log-messages
Change-Id: I584d10356acea0ede6d219a8ca9e1b001b8d6064
At the moment, when doing the live migration precheck on the
destination, if the source or destination instance location is
not available, an OSError will be raised.
We need to catch it and throw a MigrationPreCheckError instead
so that we allow a different destination to be used.
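The translation described above can be sketched as follows (MigrationPreCheckError stands in for the nova.exception class, and the path check is simplified):

```python
import os


class MigrationPreCheckError(Exception):
    """Stand-in for nova.exception.MigrationPreCheckError."""


def check_instance_path(instance_path):
    # Accessing an unavailable (e.g. remote) instance path raises
    # OSError; translate it so the scheduler can pick a different
    # destination instead of failing with an unhandled error.
    try:
        return os.listdir(instance_path)
    except OSError as ex:
        raise MigrationPreCheckError(
            "Unable to access instance dir %s: %s" % (instance_path, ex))
```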
Closes-Bug: #1694636
Change-Id: I3286c32ca205ffd2d5d1aaab88cc96699476e410
When creating an instance, the driver doesn't check if nested
virtualization is required. This change fixes that.
Closes-Bug: #1698771
Change-Id: Ia30942dfcd0a4ab88ce3818020c5476d007a2395