Encrypted backups

This spec adds the possibility for chunked backup drivers to encrypt/decrypt the data. This is primarily intended for backups that are stored in public clouds, such as GCS and S3. Change-Id: Ib9999e01e02dc48a14a268e5277203a6ac0dd89b
2022-10-25 13:49:19 +02:00 · 0cfece2ed6
parent 5adae83030
commit 0cfece2ed6
2 changed files with 283 additions and 7 deletions
--- a/specs/2024.1/chunked-backup-encryption.rst
+++ b/specs/2024.1/chunked-backup-encryption.rst
@@ -0,0 +1,283 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=========================
+Encrypted Chunked Backups
+=========================
+
+https://blueprints.launchpad.net/cinder/+spec/encrypted-backups
+
+Backup drivers inheriting from the Chunked Backup Driver always store data
+unencrypted, which may be undesirable in some scenarios.
+
+
+Problem description
+===================
+
+Keeping data in public cloud storage services, such as GCS and S3, means that
+access control to that information is only partially in our control, and we are
+forced to trust that the provider has tight control over its access.
+
+This is not the best policy with sensitive information, and it is recommended
+to encrypt sensitive data.
+
+Unfortunately, none of the public cloud backup drivers in OpenStack, GCS and
+S3, support encryption, which means that anyone with access to the backup
+location will be able to read the raw data of every unencrypted Cinder volume
+that has been backed up.
+
+
+Use Cases
+=========
+
+As a system administrator I want to configure my OpenStack deployment to make
+my backups to S3 without having to worry about sensitive data being readable
+by anyone that has access to the S3 account, authorized or not.
+
+
+Proposed change
+===============
+
+The proposal is to support encrypted backups for the S3 and GCS Cinder backup
+drivers using a single static encryption key for all encrypted backups.
+
+Users would **not** be allowed to select whether they want encryption or not,
+as this would be configured by the system administrator that configures the
+Cinder services and knows where backups are being stored.
+
+Note:
+  The RBD driver would not support encryption.
+
+The reason to use a single key instead of per-backup encryption keys using
+Barbican, like Cinder does for volumes, is twofold: the primary objective of
+encrypting backups is to prevent people who have access to S3 from reading the
+contents of the backups, and in a disaster scenario where the Barbican store is
+also lost, all backups would be rendered unusable by the loss of the keys.
+
+The proposal is to have 3 new configuration options,
+``backup_encryption_mode``, ``backup_encryption_algorithm``, and
+``backup_encryption_key``, to configure how chunked backup drivers would handle
+encryption.
+
+The ``backup_encryption_mode`` would accept 3 different options: ``off``,
+``on``, and ``auto``, where:
+
+- ``off`` would disable encryption, just like we do for the
+  ``backup_compression_algorithm``, for example if we were using our own Swift
+  cluster.
+
+- ``on`` would force all backups to be encrypted.
+
+- ``auto`` would only encrypt unencrypted volumes and would back up already
+  encrypted volumes as they are, since double encryption would not provide
+  additional security.
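+
+The three modes reduce to a small decision, sketched below.  The function name
+``should_encrypt`` and its arguments are hypothetical and only illustrate the
+behaviour described in the list; they are not existing driver code.
+
```python
def should_encrypt(mode: str, volume_is_encrypted: bool) -> bool:
    """Return True when the backup driver itself must encrypt the chunks."""
    if mode == "off":
        return False
    if mode == "on":
        return True
    if mode == "auto":
        # Already encrypted volumes are backed up as they are.
        return not volume_is_encrypted
    raise ValueError(f"unsupported backup_encryption_mode: {mode!r}")
```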
+
+The ``backup_encryption_key`` would be a string configuration option.
+
+For simplicity this initial implementation will only support the ``Fernet``
+algorithm from the ``cryptography`` Python module, which Cinder already
+requires and which provides symmetric authenticated encryption.  It will be the
+default algorithm and will be selected with the value ``fernet`` in the
+``backup_encryption_algorithm`` configuration option.
+
+To support encrypted backups we need a new version of the *metadata* file
+format that stores the encryption algorithm and thereby marks the backup as
+encrypted.
+
+First we would bump the ``DRIVER_VERSION`` and update the
+``DRIVER_VERSION_MAPPING`` to include the restore method for the new version,
+which would support decrypting volumes.
+
+In the ``_write_metadata`` method we would save a new key called
+``encryption_algorithm`` that would contain the configured
+``backup_encryption_algorithm`` when we are encrypting the backup or ``none``
+if the data is not being encrypted because backup encryption is disabled or the
+volume is already encrypted.
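+
+As a rough illustration, the new key could be written and read back as in the
+sketch below.  The metadata document is reduced to a few fields, and the
+version string ``1.1.0`` and both helper names are assumptions made for the
+example, not values fixed by this spec.
+
```python
import json

# Hypothetical version string for the bumped metadata format.
ENCRYPTED_METADATA_VERSION = "1.1.0"


def build_metadata(backup_id, objects, algorithm, encrypting):
    """Serialize a reduced metadata document for one backup."""
    metadata = {
        "version": ENCRYPTED_METADATA_VERSION,
        "backup_id": backup_id,
        "objects": objects,
        # "none" marks data stored as-is (encryption off, or volume already
        # encrypted); otherwise the configured algorithm name is stored.
        "encryption_algorithm": algorithm if encrypting else "none",
    }
    return json.dumps(metadata, sort_keys=True).encode("utf-8")


def stored_algorithm(raw_metadata):
    """Return the algorithm recorded in a metadata file.

    Metadata written in the 1.0.0 format has no such key, so it is treated
    as unencrypted.
    """
    return json.loads(raw_metadata).get("encryption_algorithm", "none")
```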
+
+Backup drivers supporting encryption shall fail to start if the
+``backup_encryption_algorithm`` has an incorrect value or if
+``backup_encryption_mode`` is not ``off`` and there is no valid
+``backup_encryption_key``.
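+
+A start-up check along these lines could look like the sketch below.  It
+assumes that the configured string is used directly as a Fernet key, which is
+32 bytes encoded as URL-safe base64; the spec does not say so, and an
+implementation could instead derive the key from the string.  The function
+names are hypothetical.
+
```python
import base64
import binascii

SUPPORTED_ALGORITHMS = ("fernet",)


def is_fernet_key(key):
    """Check that key is URL-safe base64 text that decodes to 32 bytes."""
    if not key:
        return False
    try:
        return len(base64.urlsafe_b64decode(key)) == 32
    except (binascii.Error, ValueError):
        return False


def check_encryption_options(mode, algorithm, key):
    """Raise ValueError for a configuration the driver must refuse."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unknown backup_encryption_algorithm: {algorithm!r}")
    # The key is only required when encryption can actually happen.
    if mode != "off" and not is_fernet_key(key):
        raise ValueError("backup_encryption_key is missing or not valid")
```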
+
+The new code should be able to support doing an incremental encrypted backup
+even if the parent backup is unencrypted, and even if the parent was made by
+an older Cinder release that used version ``1.0.0`` of the metadata format.
+This means that the restore operation must select the appropriate restore
+method based on each backup's metadata, and not assume that all backups from a
+backup chain are the same.
+
+In terms of code this is quite easy: the ``restore`` method just needs to move
+the assignment of ``restore_func`` into the ``while`` loop, after the
+``backup1`` metadata has been read.
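+
+The per-backup dispatch can be pictured with the simplified stand-in below.
+It is not the driver's real ``restore`` method: the class, the ``1.1.0``
+version and the ``_restore_v1``/``_restore_v2`` method bodies are placeholders
+that only show the restore function being chosen inside the loop.
+
```python
class RestoreSketch:
    # "1.0.0" is the existing format named in the spec; "1.1.0" and the
    # method names are placeholders for this example.
    DRIVER_VERSION_MAPPING = {"1.0.0": "_restore_v1", "1.1.0": "_restore_v2"}

    def __init__(self, metadata_by_backup):
        self._metadata = metadata_by_backup
        self.calls = []

    def _read_metadata(self, backup):
        return self._metadata[backup]

    def _restore_v1(self, backup):
        self.calls.append((backup, "v1: copy chunks"))

    def _restore_v2(self, backup):
        self.calls.append((backup, "v2: decrypt if needed, then copy"))

    def restore(self, backup_chain):
        """Restore a chain ordered from the full backup to the newest one."""
        pending = list(backup_chain)
        while pending:
            backup1 = pending.pop(0)
            metadata = self._read_metadata(backup1)
            # Chosen per backup, inside the loop, not once for the chain.
            restore_func = getattr(
                self, self.DRIVER_VERSION_MAPPING[metadata["version"]])
            restore_func(backup1)
```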
+
+When compression is enabled with the ``backup_compression_algorithm`` option,
+compression should be done prior to encryption, since encryption turns the
+data into high-entropy data, reducing the likelihood of the compression doing
+anything useful.
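+
+The effect of the ordering can be seen with a small experiment.  In this
+sketch seeded pseudo-random bytes stand in for ciphertext, so no real cipher
+is involved, and ``zlib`` is used only as an example compressor.
+
```python
import random
import zlib

CHUNK_SIZE = 64 * 1024

# Repetitive data, standing in for a typical volume chunk.
plain_chunk = (b"cinder volume block " * 4096)[:CHUNK_SIZE]

# Seeded pseudo-random bytes, standing in for ciphertext.
rng = random.Random(0)
cipher_like_chunk = bytes(rng.getrandbits(8) for _ in range(CHUNK_SIZE))

compress_first = len(zlib.compress(plain_chunk))
encrypt_first = len(zlib.compress(cipher_like_chunk))

print(f"compressing plaintext:       {CHUNK_SIZE} -> {compress_first} bytes")
print(f"compressing ciphertext-like: {CHUNK_SIZE} -> {encrypt_first} bytes")
```
+
+Compressing the repetitive chunk shrinks it drastically, while the
+ciphertext-like chunk stays at roughly its original size.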
+
+Since encryption algorithms are implemented in C libraries, the encryption and
+decryption of data shall be done in a native thread from the ``tpool`` to
+prevent greenthreads from blocking and making the whole service unresponsive.
+
+Backup internals should be hidden from end users, so Cinder will not expose to
+them whether backups are encrypted or not, but since this information could be
+useful for system administrators it will be made visible to them.
+
+For this purpose a new DB column called ``encrypted`` will be added to the
+``backups`` table to reflect whether the data in that specific backup is
+encrypted or not.  A backup will be encrypted whenever an encrypted volume is
+backed up or when an unencrypted volume is backed up and the chunked backup
+driver encrypts it.
+
+It's important to state in the documentation that this field only describes
+that specific backup in a chain of backups, so we could have an incremental
+backup that is reported as encrypted while its parent is unencrypted.
+
+Alternatives
+------------
+
+An alternative could be to use a single Barbican key for the whole
+cinder-backup service or to use a per-backup Barbican encryption key.
+
+Data model impact
+-----------------
+
+This change will require the addition of a new DB column called ``encrypted``
+in the ``backups`` table that defaults to ``False`` and records whether the
+data of the backup is encrypted.
+
+Besides the DB change in ``cinder/db/sqlalchemy/models.py`` the corresponding
+``Backup`` Oslo Versioned Object (OVO) in ``cinder/objects/backup.py`` will
+also need to be changed, which means:
+
+- Bumping the version of the OVO
+
+- Adding the ``encrypted`` boolean field and ensuring it's set to ``False`` by
+  default.
+
+- Adding the compatibility code that removes the field when sending it to an
+  older RPC service (using the ``obj_make_compatible`` method).
+
+- Adding the new ``Backup`` OVO version to the OVO history in
+  ``cinder/objects/base.py``.
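+
+The compatibility step boils down to dropping the field for older receivers.
+The stand-in below uses plain dictionaries instead of Oslo Versioned Objects;
+in Cinder the same check would sit in ``obj_make_compatible``.  The version
+``1.8`` is a placeholder for whatever the bumped ``Backup`` OVO version turns
+out to be.
+
```python
# Placeholder for the OVO version that introduces the field.
ENCRYPTED_ADDED_IN = (1, 8)


def make_backup_compatible(primitive, target_version):
    """Return a copy of primitive that a target_version service understands."""
    target = tuple(int(part) for part in target_version.split("."))
    compatible = dict(primitive)
    if target < ENCRYPTED_ADDED_IN:
        # Older services do not know the field, so it must not be sent.
        compatible.pop("encrypted", None)
    return compatible
```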
+
+The DB change also affects the ``import_record`` and ``export_record`` methods
+from ``cinder/backup/manager.py``, and ``import_record`` should remain backward
+compatible with older backup exports.
+
+Since the new DB field has no value for existing rows, and we no longer do
+data migrations when updating the schema, a new online data migration will
+need to be added to the ``cinder/cmd/manage.py`` CLI, and the ``Backup`` OVO
+will need to handle records read from the DB that don't have the ``encrypted``
+value set.
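+
+A batched migration of this kind is sketched below with ``sqlite3`` so that it
+is self-contained.  Two things are assumptions, not part of this spec: that
+backups made before this feature hold encrypted data only when the source
+volume was encrypted (detected here through a non-null ``encryption_key_id``),
+and that the function reports how many rows needed migrating and how many it
+updated.
+
```python
import sqlite3


def migrate_backup_encrypted(conn, max_count):
    """Fill in ``encrypted`` for up to max_count rows that lack a value.

    Returns (rows that needed migrating, rows updated in this call).
    """
    total = conn.execute(
        "SELECT COUNT(*) FROM backups WHERE encrypted IS NULL").fetchone()[0]
    rows = conn.execute(
        "SELECT id, encryption_key_id FROM backups "
        "WHERE encrypted IS NULL ORDER BY id LIMIT ?",
        (max_count,)).fetchall()
    for backup_id, key_id in rows:
        # Assumption: older backups were never encrypted by the driver, so
        # they hold encrypted data only if the source volume was encrypted.
        conn.execute("UPDATE backups SET encrypted = ? WHERE id = ?",
                     (int(key_id is not None), backup_id))
    conn.commit()
    return total, len(rows)


# Tiny in-memory table with three pre-existing backups.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backups "
             "(id TEXT PRIMARY KEY, encryption_key_id TEXT, encrypted BOOLEAN)")
conn.executemany("INSERT INTO backups VALUES (?, ?, NULL)",
                 [("a", None), ("b", "key-1"), ("c", None)])
```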
+
+REST API impact
+---------------
+
+A new REST API microversion is needed since admins will now be able to see the
+``encrypted`` field for backups.
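+
+The visibility rule can be sketched as a view function that only adds the
+field for administrators on a new enough microversion.  The microversion value
+is a placeholder, and ``backup_view`` is a hypothetical helper, not the
+existing Cinder view builder.
+
```python
# Placeholder only: the real microversion is assigned when the change merges.
ENCRYPTED_MICROVERSION = (3, 999)


def backup_view(backup, is_admin, request_version):
    """Build the API representation of a backup for one request."""
    view = {"id": backup["id"], "status": backup["status"]}
    # Non-admin users, and older microversions, never see the field.
    if is_admin and request_version >= ENCRYPTED_MICROVERSION:
        view["encrypted"] = backup["encrypted"]
    return view
```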
+
+Security impact
+---------------
+
+Security will be increased when using external public clouds for Cinder
+backups.
+
+Active/Active HA impact
+-----------------------
+
+No impact.
+
+Notifications impact
+--------------------
+
+Notifications will include the new ``encrypted`` field when sending backup
+notifications.
+
+Other end user impact
+---------------------
+
+None.
+
+Performance Impact
+------------------
+
+Doing encryption is computationally expensive, so this will have an impact on
+the speed at which backups are performed.
+
+This can be mitigated for encrypted volumes by setting
+``backup_encryption_mode`` to ``auto``.
+
+Better performance can be achieved by using multiple backup processes with the
+``backup_workers`` configuration option or by horizontally scaling the
+cinder-backup service.
+
+Other deployer impact
+---------------------
+
+Deployers will now have 3 new configuration options related to this feature as
+described above:
+
+- ``backup_encryption_mode``
+- ``backup_encryption_algorithm``
+- ``backup_encryption_key``
+
+Developer impact
+----------------
+
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+TBD
+
+Work Items
+----------
+
+- Update the DB model for the ``backups`` table.
+
+- Update the ``Backup`` OVO.
+
+- Update the import and export record methods.
+
+- Update the REST API with a new microversion.
+
+- Add new configuration options and update the chunked backup driver.
+
+
+Dependencies
+============
+
+None
+
+
+Testing
+=======
+
+Besides the standard unit tests, one of the tempest CI jobs should be modified
+to run with encryption enabled, and the backup tempest tests should be made to
+look for the value of the new ``encrypted`` field in the admin REST API
+responses and confirm it is not present for normal users.
+
+
+Documentation Impact
+====================
+
+The document ``doc/source/admin/volume-backups.rst`` must be updated to reflect
+the new encryption possibilities, describing the main purpose of the feature
+(public clouds) and noting that the Ceph backup driver doesn't support it.
+
+
+References
+==========
+
+`Antelope PTG discussion`_
+
+
+.. _`Antelope PTG discussion`: https://etherpad.opendev.org/p/antelope-ptg-cinder#L115
--- a/specs/2024.1/remove-me.rst
+++ b/specs/2024.1/remove-me.rst
@@ -1,7 +0,0 @@
-.. This file is a place holder.  It should be removed by
-   any patch proposing a spec for the 2024.1 release
-
-================================
-No specs have yet been approved.
-================================
-