diff --git a/specs/2024.1/approved/libvirt-virtiofs-attach-manila-shares.rst b/specs/2024.1/approved/libvirt-virtiofs-attach-manila-shares.rst new file mode 100644 index 000000000..1a769df7d --- /dev/null +++ b/specs/2024.1/approved/libvirt-virtiofs-attach-manila-shares.rst @@ -0,0 +1,551 @@ +.. + This work is licensed under a Creative Commons Attribution 3.0 Unported + License. + + http://creativecommons.org/licenses/by/3.0/legalcode + +============================================================================= +Allow Manila shares to be directly attached to an instance when using libvirt +============================================================================= + +https://blueprints.launchpad.net/nova/+spec/libvirt-virtiofs-attach-manila-shares + +Manila is the OpenStack Shared Filesystems service. This spec will outline API, +database, compute and libvirt driver changes required in Nova to allow the +shares provided by Manila to be associated with and attached to instances. + +Problem description +=================== + +At present users must manually connect to and mount shares provided by Manila +within their instances. As a result operators need to ensure that Manila +backend storage is routable from the guest subnets. + +Use Cases +--------- + +- As an operator I want the Manila datapath to be separate to any tenant + accessible networks. + +- As a user I want to attach Manila shares directly to my instance and have a + simple interface with which to mount them within the instance. + +- As a user I want to detach a directly attached Manila share from my instance. + +- As a user I want to track the Manila shares attached to my instance. + +Proposed change +=============== + +This initial implementation will only provide support for attaching a share to +and later detaching a share from an existing ``SHUTOFF`` instance. The ability +to express attachments during the initial creation of an instance will not be +covered by this spec. + +Support for move operations once a share is attached will also not +be covered by this spec, any requests to shelve, evacuate, resize, cold migrate +or live migrate an instance with a share attached will be rejected with a +HTTP409 response for the time being. + +A new server ``shares`` API will be introduced under a new microversion. This +will list current shares, show their details and allow a share to be +attached or detached. + +A new ``share_mapping`` database table and associated ``ShareMapping`` +versioned objects will be introduced to capture details of the share +attachment. A base ShareMapping versioned object will be provided from which +virt driver and backend share specific objects can be derived providing +specific share attach and detach implementations. + +.. note:: + + One thing to note here is that no Manila state will be stored within Nova + aside from export details used to initially attach the share. These details + later being used when detaching the share. If the share is then reattached + Nova will request fresh export details from Manila and store these in a + new share attachment within Nova. + +The libvirt driver will be extended to support the above with initial support +for cold attach and detach. Future work will aim to add live attach and detach +once `support lands in libvirt itself`__. + +.. __: https://listman.redhat.com/archives/libvir-list/2021-October/msg00097.html + +This initial libvirt support will target the basic NFS and slightly more +complex CephFS backends within Manila. Shares will be mapped through to the +underlying libvirt domains using ``virtio-fs``. This will require ``QEMU`` +>=5.0 and ``libvirt`` >= 6.2 on the compute host and a kernel version of >= 5.4 +within the instance guest OS. + +Additionally this initial implementation will require that the associated +instances use `file backed memory`__ or `huge pages`__. This is a requirement +of `virtio-fs`__ as the ``virtiofsd`` service uses the `vhost-user`__ protocol +to communicate directly with the underlying guest. +(ref: `vhost-user documentation`__) + +Two new compute capability traits and filters will be introduced to model an +individual compute's support for virtio-fs and file backed memory. +And while associating a share to an instance, a check will ensure the host +running the instance will support the + +- ``COMPUTE_STORAGE_VIRTIO_FS`` trait + +and either the + +- ``COMPUTE_MEM_BACKING_FILE`` trait + +or + +that the instance is configured with ``hw:mem_page_size`` extra spec. + +From an operator's point of view, it means +``COMPUTE_STORAGE_VIRTIO_FS`` support requires that +operators must upgrade all their compute nodes to the version supporting +shares using virtiofs. + +``COMPUTE_MEM_BACKING_FILE`` support requires that operators configure one or +more hosts with file backed memory. Ensuring the instance will land on one of +these hosts can be achieved by creating an AZ englobing these hosts. +And then instruct users to deploy their instances in this AZ. +Alternatively, operators can guide the scheduler to choose a suitable host +by adding ``trait:COMPUTE_MEM_BACKING_FILE=required`` as an extra spec or +image property. + +.. __: https://docs.openstack.org/nova/latest/admin/file-backed-memory.html +.. __: https://docs.openstack.org/nova/latest/admin/huge-pages.html +.. __: https://virtio-fs.gitlab.io/ +.. __: https://libvirt.org/kbase/virtiofs.html#other-options-for-vhost-user-memory-setup +.. __: https://qemu-project.gitlab.io/qemu/interop/vhost-user.html + +Users will be able to mount the attached shares using a mount tag, this is +either the share UUID from Manila or a string provided by the users with their +request to attach the share. + +.. code-block:: shell + + user@instance $ mount -t virtiofs $tag /mnt/mount/path + +A previously discussed ``os-share`` library will not be created with this +initial implementation but could be in the future if the logic required to +mount and track shares on the underlying host is also required by other +projects. For the time being `existing code within the libvirt driver`__ used +to track filesystem host mounts used by volumes hosted on ``remoteFS`` based +storage (such as NFS, SMB etc) will be reused as much as possible. + +.. __: https://github.com/openstack/nova/blob/8f250f50446ca2d7aa84609d5144088aa4cded78/nova/virt/libvirt/volume/mount.py#L152-L174 + + +Share mapping status:: + + +----------------------------------------------------+ Reboot VM + Start VM | | --------------+ + Share mounted | active | | + +------------------> | | <-------------+ + | +----------------------------------------------------+ + | | | | + | | Stop VM | | + | | Fail to umount | | + | v | | + | +------------------+ | | + | | error | <+-------------+-------------------+ + | +------------------+ | | | + | | | | | + | | Detach share or | | | + | | delete VM | Delete VM | | + | v | | | + | +------------------+ | | | + | +-------------> | φ | <+ | | Start VM + | | +------------------+ | | Fail to mount + | | | | | + | | Detach share | | Stop VM | + | | or delete VM | Attach share | Share unmounted | + | | v v | + | | +----------------------------------------------------+ | + | +-------------- | inactive | -+ + | +----------------------------------------------------+ + | | + +----------------------+ + + +φ + means no entry in the database. No association between a share and a server. + +Attach share + means POST /servers/{server_id}/shares + +Detach share + means DELETE /servers/{server_id}/shares + +This chart describe the share mapping status (nova), this is independent from +the status of the Manila share. + +Share attachment/detachment can only be done if the VM state is ``STOPPED`` +or ``ERROR``. +These are operations only on the database, and no RPC calls will be required +to the compute API. This is an intentional design for this spec. +As a result, this could lead to situation where the VM start operation fails +as an underlying share attach fails. + +Mount operation will be done when the share is not mounted on the compute host. +If a previous share would have been mounted on the compute host for another +server, then it will attempt to mount it and a warning will be logged that +the share was already mounted. + +Umount operation will be really done when the share is mounted and not used +anymore by another server. + +With the above mount and umount operation, the state is stored in memory and +do not require a lookup in the database. + +The share will be mounted on the compute host using read/write mode. +Read-only will not be supported as a share could not be mounted read-only +and read/write at the same time. If the user wants to mount the share +read-only, it will have to do it in the VM fstab. + +Manila share removal issue: + +Despite a share being used by instances, it can be removed by the user. +As a result, the instances will lose access to the data and might cause +difficulties in removing the missing share and fixing the instance. +This is an identified issue that requires Manila modifications. +A solution was identified with the Manila team to attach metadata to the share +access-allow policy that will lock the share and prevent its deletion until +the lock is not removed. +If the above Manila change can land in the Zed cycle, +the proposal here is to use the lock mechanism in Nova. +Otherwise, clearly document the known issue as unsupported and warn users that +they should take care and avoid this pitfall. + +Instance metadata: + +Add instace shares in the instance metadata. +Extend DeviceMetadata with ShareMetadata object containing `shareId` and +`tag` used to mount the virtiofs on an instance by the user. +See :ref:`caracal-other-end-user-impact`. + +Alternatives +------------ + +The only alternative is to continue with the current situation where users must +mount the shares within their instances manually. The downside being that these +instances must have access to the storage network used by the Manila backends. + +REST API impact +--------------- + +A new server level ``shares`` API will be introduced under a new microversion +with the following methods: + +* GET ``/servers/{server_id}/shares`` + +List all shares attached to an instance. + +Return Code(s): 200,400,401,403,404 + +.. code-block:: json + + { + "shares": [ + { + "shareId": "48c16a1a-183f-4052-9dac-0e4fc1e498ad", + "status": "active", + "tag": "foo" + }, + { + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986", + "status": "active", + "tag": "bar" + } + ] + } + +* GET ``/servers/{server_id}/shares/{shareId}`` + +Show details of a specific share attached to an instance. + +Return Code(s): 200,400,401,403,404 + +.. code-block:: json + + { + "share": { + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986", + "status": "active", + "tag": "bar" + } + } + +PROJECT_ADMIN will be able to see details of the attachment id and export +location stored within Nova: + +.. code-block:: json + + { + "share": { + "attachmentId": "715335c1-7a00-4dfe-82df-9dc2a67bd8bf", + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986", + "status": "active", + "tag": "bar", + "export_location": "server.com/nfs_mount,foo=bar" + } + } + +* ``POST /servers/{server_id}/shares`` + +Attach a share to an instance. + +Prerequisite(s): + +- Instance much be in the ``SHUTOFF`` state. +- Instance should have the required capabilities to enable + virtiofs (see above). + +This is a synchronous API. As a result, the VM share attachement state +is defined in the database and set as inactive. +Then, power on the VM will do the required operations to attach the share and +set it as active if there are no errors. + +Return Code(s): 202,400,401,403,404,409 + +Request body: + +.. note:: + + ``tag`` will be an optional request parameter in the request body, when not + provided it will be the shareId(UUID) as always provided in the request. + + ``tag`` if povided by the user must be an ASCII string with a maximum + lenght of 64 bytes. + + +.. code-block:: json + + { + "share": { + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986" + } + } + +Response body: + +.. code-block:: json + + { + "share": { + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986", + "status": "active", + "tag": "e8debdc0-447a-4376-a10a-4cd9122d7986" + } + } + +* ``DELETE /servers/{server_id}/shares/{shareId}`` + +Detach a share from an instance. + +Prerequisite(s): Instance much be in the ``SHUTOFF`` or ``ERROR`` state. + +Return Code(s): 202,400,401,403,404,409 + +Data model impact +----------------- + +A new ``share_mapping`` database table will be introduced. + +* ``id`` - Primary key autoincrement + +* ``uuid`` - Unique UUID to identify the particular share attachment + +* ``instance_uuid`` - The UUID of the instance the share will be attached to + +* ``share_id`` - The UUID of the share in Manila + +* ``status`` - The status of the share attachment within Nova + (``active``, ``inactive``, ``error``) + +* ``tag`` - The device tag to be used by users to mount the share within + the instance. + +* ``export_location`` - The export location used to attach the share to the + underlying host + +* ``share_proto`` - The Shared File Systems protocol (``NFS``, ``CEPHFS``) + +A new base ``ShareMapping`` versioned object will be introduced to encapsulate +the above database entries and to be used as the parent class of specific virt +driver implementations. + +The database field `status` and `share_proto` values will not be enforced +using enums allowing future changes and avoid database migrations. +However, to make code more robust, enums will be defined on the object fields. + +Fields containing text will use String and not Text type in the database schema +to limit the column width and be stored inline in the database. + +This base ``ShareMapping`` object will provide stub ``attach`` and ``detach`` +methods that will need to be implemented by any child objects. + +New ``ShareMappingLibvirt``, ``ShareMappingLibvirtNFS`` and +``ShareMappingLibvirtCephFS`` objects will be introduced as part of the libvirt +implementation. + +Security impact +--------------- + +The ``export_location`` JSON blob returned by Manila and used to mount the +share to the host and the host filesystem location should +not be logged by Nova and only accessible by default through the API by admins. + +Notifications impact +-------------------- + +New notifications will be added: + +* One to add new notifications for share attach and share detach. +* One to extend the instance update notification with the share mapping + information. + +Share mapping in the instance payload will be optional and controlled via the +``include_share_mapping`` notification configuration parameter. It will be +disabled by default. + +Proposed payload for attached and detached notification will be the same as +the one returned by the show command with admin rights. + +.. code-block:: json + + { + "share": { + "instance_uuid": "7754440a-1cb7-4d5b-b357-9b37151a4f2d", + "attachmentId": "715335c1-7a00-4dfe-82df-9dc2a67bd8bf", + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986", + "status": "active", + "tag": "bar", + "export_location": "server.com/nfs_mount,foo=bar" + } + } + +Proposed instance payload for instance updade, will be the list of share +attached to this instance. + +.. code-block:: json + + { + "shares": + [ + { + "instance_uuid": "7754440a-1cb7-4d5b-b357-9b37151a4f2d", + "attachmentId": "715335c1-7a00-4dfe-82df-9dc2a67bd8bf", + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7986", + "status": "active", + "tag": "bar", + "export_location": "server.com/nfs_mount,foo=bar" + }, + { + "instance_uuid": "7754440a-1cb7-4d5b-b357-9b37151a4f2d", + "attachmentId": "715335c1-7a00-4dfe-82df-ffffffffffff", + "shareId": "e8debdc0-447a-4376-a10a-4cd9122d7987", + "status": "active", + "tag": "baz", + "export_location": "server2.com/nfs_mount,foo=bar" + } + ] + } + +.. _caracal-other-end-user-impact: + +Other end user impact +--------------------- + +Users will need to mount the shares within their guestOS using the returned +``tag``. + +Users could use the instance metadata to discover and auto mount the share. + +Performance Impact +------------------ + +Through the use of ``vhost-user`` ``virtio-fs`` should have near local +(mounted) file system performance within the guestOS. +While there will be near local performance between the vm and host, +the actual performance will be limited by the network performance of +the network file share protocol and hardware. + +Other deployer impact +--------------------- + +None + +Developer impact +---------------- + +None + +Upgrade impact +-------------- + +A new compute service version and capability traits will be introduced to +ensure both the compute service and underlying virt stack are new enough to +support attaching a share via ``virtio-fs`` before the request is accepted. + +Implementation +============== + +Assignee(s) +----------- + +Primary assignee: + uggla (rene.ribaud) + +Other contributors: + lyarwood (initial contributor) + +Feature Liaison +--------------- + +Feature liaison: + uggla + +Work Items +---------- + +- Add new capability traits within os-traits +- Add support within the libvirt driver for cold attach and detach +- Add new shares API and microversion + +Dependencies +============ + +None + +Testing +======= + +- Functional libvirt driver and API tests +- Integration Tempest tests + +Documentation Impact +==================== + +Extensive admin and user documentation will be provided. + +References +========== + +History +======= + +.. list-table:: Revisions + :header-rows: 1 + + * - Release Name + - Description + * - Yoga + - Introduced + * - Zed + - Reproposed + * - Antelope + - Reproposed + * - Bobcat + - Reproposed + * - Caracal + - Reproposed