libvirt: Add workaround to cleanup instance dir when using rbd

At present all virt drivers provide a cleanup method that takes a single
destroy_disks boolean to indicate when the underlying storage of an
instance should be destroyed.

When cleaning up after an evacuation or revert resize the value of
destroy_disks is determined by the compute layer calling down both into
the check_instance_shared_storage_local method of the local virt driver
and remote check_instance_shared_storage method of the virt driver on
the host now running the instance.

For the Libvirt driver the initial local call will return None when
using the shared block RBD imagebackend as it is assumed all instance
storage is shared resulting in destroy_disks always being False when
cleaning up. This behaviour is wrong as the instance disks are stored
separately to the instance directory that still needs to be cleaned up
on the host. Additionally this directory could also be shared
independently of the disks on a NFS share for example and would need to
also be checked before removal.

This change introduces a backportable workaround configurable for the
Libvirt driver with which operators can ensure that the instance
directory is always removed during cleanup when using the RBD
imagebackend. When enabling this workaround operators will need to
ensure that the instance directories are not shared between computes.

Future work will allow for the removal of this workaround by separating
the shared storage checks from the compute to virt layers between the
actual instance disks and any additional storage required by the
specific virt backend.

Related-Bug: #1761062
Partial-Bug: #1414895
Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77
This commit is contained in:
Lee Yarwood 2018-12-03 09:03:26 +00:00
parent 357b8b38e8
commit d6c1f6a103
6 changed files with 90 additions and 1 deletions

View File

@ -569,6 +569,10 @@ Possible values:
* $state_path/instances where state_path is a config option that specifies
the top-level directory for maintaining nova's state. (default) or
Any string representing directory path.
Related options:
* ``[workarounds]/ensure_libvirt_rbd_instance_dir_cleanup``
"""),
cfg.BoolOpt('instance_usage_audit',
default=False,

View File

@ -800,6 +800,7 @@ Related options:
* virt.use_cow_images
* images_volume_group
* [workarounds]/ensure_libvirt_rbd_instance_dir_cleanup
"""),
cfg.StrOpt('images_volume_group',
help="""

View File

@ -212,6 +212,36 @@ Related options:
* ``compute_driver``: Only the libvirt driver is affected.
.. _bug #1289064: https://bugs.launchpad.net/nova/+bug/1289064
"""),
cfg.BoolOpt(
'ensure_libvirt_rbd_instance_dir_cleanup',
default=False,
help="""
Ensure the instance directory is removed during clean up when using rbd.
When enabled this workaround will ensure that the instance directory is always
removed during cleanup on hosts using ``[libvirt]/images_type=rbd``. This
avoids the following bugs with evacuation and revert resize clean up that lead
to the instance directory remaining on the host:
https://bugs.launchpad.net/nova/+bug/1414895
https://bugs.launchpad.net/nova/+bug/1761062
Both of these bugs can then result in ``DestinationDiskExists`` errors being
raised if the instances ever attempt to return to the host.
.. warning:: Operators will need to ensure that the instance directory itself,
specified by ``[DEFAULT]/instances_path``, is not shared between computes
before enabling this workaround otherwise the console.log, kernels, ramdisks
and any additional files being used by the running instance will be lost.
Related options:
* ``compute_driver`` (libvirt)
* ``[libvirt]/images_type`` (rbd)
* ``instances_path``
"""),
]

View File

@ -17117,6 +17117,27 @@ class LibvirtConnTestCase(test.NoDBTestCase,
self.assertTrue(instance.cleaned)
save.assert_called_once_with()
@mock.patch.object(libvirt_driver.LibvirtDriver, 'unfilter_instance')
@mock.patch.object(libvirt_driver.LibvirtDriver, 'delete_instance_files',
return_value=True)
@mock.patch.object(objects.Instance, 'save')
@mock.patch.object(libvirt_driver.LibvirtDriver, '_undefine_domain')
def test_cleanup_instance_dir_with_rbd_workaround(self,
_undefine_domain, save, delete_instance_files, unfilter_instance):
self.flags(images_type='rbd', group='libvirt')
self.flags(ensure_libvirt_rbd_instance_dir_cleanup=True,
group='workarounds')
instance = objects.Instance(self.context, **self.test_instance)
drvr = libvirt_driver.LibvirtDriver(fake.FakeVirtAPI())
# destroy_disks=False here as check_instance_shared_storage_local call
# would return None when using the rbd imagebackend
drvr.cleanup(self.context, instance, network_info={},
destroy_disks=False)
delete_instance_files.assert_called_once_with(instance)
self.assertEqual(1, int(instance.system_metadata['clean_attempts']))
self.assertTrue(instance.cleaned)
save.assert_called_once_with()
@mock.patch.object(libvirt_driver.LibvirtDriver, '_get_volume_encryption')
@mock.patch.object(libvirt_driver.LibvirtDriver, '_use_native_luks')
def test_swap_volume_native_luks_blocked(self, mock_use_native_luks,

View File

@ -1064,7 +1064,16 @@ class LibvirtDriver(driver.ComputeDriver):
is_shared_block_storage = False
if migrate_data and 'is_shared_block_storage' in migrate_data:
is_shared_block_storage = migrate_data.is_shared_block_storage
if destroy_disks or is_shared_block_storage:
# NOTE(lyarwood): The following workaround allows operators to ensure
# that non-shared instance directories are removed after an evacuation
# or revert resize when using the shared RBD imagebackend. This
# workaround is not required when cleaning up migrations that provide
# migrate_data to this method as the existing is_shared_block_storage
# conditional will cause the instance directory to be removed.
if ((destroy_disks or is_shared_block_storage) or
(CONF.workarounds.ensure_libvirt_rbd_instance_dir_cleanup and
CONF.libvirt.images_type == 'rbd')):
attempts = int(instance.system_metadata.get('clean_attempts',
'0'))
success = self.delete_instance_files(instance)

View File

@ -0,0 +1,24 @@
---
other:
- |
The ``[workarounds]/ensure_libvirt_rbd_instance_dir_cleanup`` configuration
option has been introduced. This can be used by operators to ensure that
instance directories are always removed during cleanup within the Libvirt
driver while using ``[libvirt]/images_type = rbd``. This works around known
issues such as `bug 1414895`_ when cleaning up after an evacuation and
`bug 1761062`_ when reverting from an instance resize.
Operators should be aware that this workaround only applies when using the
libvirt compute driver and rbd images_type as enabled by the following
configuration options:
* ``[DEFAULT]/compute_driver = libvirt``
* ``[libvirt]/images_type = rbd``
.. warning:: Operators will need to ensure that the instance directory
itself, specified by ``[DEFAULT]/instances_path``, is not shared between
computes before enabling this workaround, otherwise files associated with
running instances may be removed.
.. _bug 1414895: https://bugs.launchpad.net/nova/+bug/1414895
.. _bug 1761062: https://bugs.launchpad.net/nova/+bug/1761062