Encrypted backups

This spec adds the ability for chunked backup drivers to
encrypt/decrypt the data.

This is primarily intended for backups that are stored in public clouds,
such as GCS and S3.

Change-Id: Ib9999e01e02dc48a14a268e5277203a6ac0dd89b
Gorka Eguileor 2022-10-25 13:49:19 +02:00 committed by Jon Bernard
parent 5adae83030
commit 0cfece2ed6
2 changed files with 283 additions and 7 deletions


@ -0,0 +1,283 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
=========================
Encrypted Chunked Backups
=========================
https://blueprints.launchpad.net/cinder/+spec/encrypted-backups
Backups made by drivers inheriting from the Chunked Backup Driver always store
data unencrypted, which may be undesirable in some scenarios.
Problem description
===================
Keeping data in public cloud storage services, such as GCS and S3, means that
access control to that information is only partially in our control, and we are
forced to trust that the provider has tight control over its access.
This is not the best policy with sensitive information, and it is recommended
to encrypt sensitive data.
Unfortunately, none of the public cloud backup drivers in OpenStack (GCS and
S3) support encryption, which means that anyone with access to the OpenStack
backup location will be able to access the raw data of all unencrypted Cinder
volumes.
Use Cases
=========
As a system administrator I want to configure my OpenStack deployment to make
backups to S3 without having to worry about sensitive data being readable by
anyone who has access to the S3 account, authorized or not.
Proposed change
===============
The proposal is to support encrypted backups for the S3 and GCS Cinder backup
drivers using a single static encryption key for all encrypted backups.
Users would **not** be allowed to select whether they want encryption or not,
as this would be configured by the system administrator that configures the
Cinder services and knows where backups are being stored.
Note:
The RBD driver would not support encryption.
A single key is used, instead of per-backup encryption keys stored in
Barbican like Cinder does for volumes, for two reasons. First, the primary
objective of encrypting backups is to prevent people who have access to S3
from reading the contents of the backups. Second, in a disaster scenario where
the Barbican store is also lost, all backups would be rendered unusable by the
loss of the keys.
The proposal is to have 3 new configuration options,
``backup_encryption_mode``, ``backup_encryption_algorithm``, and
``backup_encryption_key``, to configure how chunked backup drivers would handle
encryption.
The ``backup_encryption_mode`` would accept 3 different options: ``off``,
``always``, and ``auto``, where:
- ``off`` would disable encryption, just like we do for the
``backup_compression_algorithm``, for example if we were using our own Swift
cluster.
- ``always`` would force all backups to be encrypted.
- ``auto`` would only encrypt unencrypted volumes and would back up already
encrypted volumes as they are, since double encryption would not provide
additional security.
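The three modes can be sketched as a small decision function. This is only an illustration: ``should_encrypt`` and its parameters are hypothetical names, not part of the proposed driver code, and the sketch assumes the accepted values are ``off``, ``always``, and ``auto``.

```python
def should_encrypt(mode: str, volume_is_encrypted: bool) -> bool:
    """Return True when the backup driver itself should encrypt the chunks.

    Hypothetical helper illustrating the backup_encryption_mode semantics.
    """
    if mode == "off":
        return False
    if mode == "always":
        return True
    if mode == "auto":
        # Already-encrypted volumes are backed up as they are.
        return not volume_is_encrypted
    raise ValueError(f"Unknown backup_encryption_mode: {mode!r}")
```

With ``auto``, an encrypted volume yields ``False`` (no second encryption layer), while an unencrypted volume yields ``True``.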
The ``backup_encryption_key`` would be a string configuration option.
For simplicity this initial implementation will only support the ``Fernet``
algorithm from the ``cryptography`` Python module, which Cinder already
requires and which provides symmetric authenticated encryption. It will be the
default algorithm and will be configured with the value ``fernet`` in the
``backup_encryption_algorithm`` configuration option.
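As a minimal sketch of what Fernet provides, the following encrypts and decrypts one chunk and shows that a modified token is rejected. The key is generated on the spot purely for illustration; in the proposal it would come from ``backup_encryption_key``. The variable names are hypothetical.

```python
from cryptography.fernet import Fernet, InvalidToken

# Illustration only: a freshly generated key stands in for the configured one.
key = Fernet.generate_key()
fernet = Fernet(key)

chunk = b"raw volume data" * 4
token = fernet.encrypt(chunk)      # URL-safe base64 token, includes an HMAC
restored = fernet.decrypt(token)   # raises InvalidToken if verification fails

# Change one character in the middle of the token to simulate tampering.
mid = len(token) // 2
flipped = b"A" if token[mid:mid + 1] != b"A" else b"B"
tampered = token[:mid] + flipped + token[mid + 1:]

try:
    fernet.decrypt(tampered)
    tamper_detected = False
except InvalidToken:
    tamper_detected = True
```

Because the scheme is authenticated, a corrupted or modified chunk fails on restore instead of silently producing bad data.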
To support encrypted backups we need a new version of the *metadata* file
format that stores the encryption algorithm and thereby marks the backup as
encrypted.
First we would bump the ``DRIVER_VERSION`` and update the
``DRIVER_VERSION_MAPPING`` to include the restore method for the new version,
which would support decrypting volumes.
In the ``_write_metadata`` method we would save a new key called
``encryption_algorithm`` that would contain the configured
``backup_encryption_algorithm`` when we are encrypting the backup or ``none``
if the data is not being encrypted because backup encryption is disabled or the
volume is already encrypted.
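The value stored under ``encryption_algorithm`` could be derived as in the sketch below. The function name and the other metadata fields are simplified assumptions for illustration; only the ``encryption_algorithm`` key and its ``none`` fallback come from the text above.

```python
import json


def build_metadata(backup_id, volume_id, version, driver_encrypts,
                   algorithm="fernet"):
    """Build a simplified metadata dict (hypothetical helper).

    driver_encrypts is True only when the driver encrypts the chunks itself,
    that is, encryption is enabled and the mode does not skip this volume.
    """
    return {
        "version": version,
        "backup_id": backup_id,
        "volume_id": volume_id,
        "encryption_algorithm": algorithm if driver_encrypts else "none",
    }


encrypted_meta = json.dumps(
    build_metadata("b-1", "v-1", "1.1.0", driver_encrypts=True))
plain_meta = json.dumps(
    build_metadata("b-2", "v-2", "1.1.0", driver_encrypts=False))
```

The ``1.1.0`` version string is a placeholder; the spec does not state the new metadata version number.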
Backup drivers supporting encryption shall fail to start if the
``backup_encryption_algorithm`` has an incorrect value or if
``backup_encryption_mode`` is not ``off`` and there is no valid
``backup_encryption_key``.
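A start-up check along these lines could enforce the rule. It is a sketch under assumptions: the function names are hypothetical, and key validity is approximated by the Fernet key format (32 bytes encoded as URL-safe base64).

```python
import base64
import binascii

SUPPORTED_ALGORITHMS = ("fernet",)


def is_valid_fernet_key(key) -> bool:
    """Check that key decodes from URL-safe base64 to exactly 32 bytes."""
    if not key:
        return False
    try:
        raw = base64.urlsafe_b64decode(key)
    except (binascii.Error, ValueError):
        return False
    return len(raw) == 32


def check_encryption_config(mode, algorithm, key):
    """Raise ValueError for a configuration the driver must not start with."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"Unsupported backup_encryption_algorithm: {algorithm!r}")
    if mode != "off" and not is_valid_fernet_key(key):
        raise ValueError("backup_encryption_key is missing or invalid")
```

Note that the algorithm is checked even when the mode is ``off``, while the key is only required once encryption is enabled.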
The new code should be able to support doing an incremental encrypted backup
even if the parent backup is unencrypted, and even if the parent was made by
an older Cinder release that used version ``1.0.0`` of the metadata format.
This means that the restore operation must select the appropriate restore
method based on each backup's metadata, and not assume that all backups from a
backup chain are the same.
In terms of code this is quite easy: the ``restore`` method just needs to
move the assignment of the ``restore_func`` into the ``while`` loop, after the
``backup1`` metadata has been read.
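The per-backup selection can be sketched as follows. This is not the driver's actual ``restore`` method: the class, its stub restore methods, and the ``1.1.0`` version string are placeholders used to show the lookup happening inside the loop.

```python
class ChunkedRestoreSketch:
    # "1.0.0" is the version named in the text; "1.1.0" is a placeholder
    # for the new, encryption-aware metadata version.
    DRIVER_VERSION_MAPPING = {"1.0.0": "_restore_v1", "1.1.0": "_restore_v2"}

    def __init__(self, metadata_by_backup):
        self._metadata = metadata_by_backup
        self.calls = []  # records which restore method handled each backup

    def _restore_v1(self, backup_id):
        self.calls.append(("v1", backup_id))

    def _restore_v2(self, backup_id):
        self.calls.append(("v2", backup_id))

    def restore_chain(self, backup_ids):
        remaining = list(backup_ids)
        while remaining:
            backup_id = remaining.pop(0)
            metadata = self._metadata[backup_id]
            # Chosen per backup, after its metadata has been read, so a chain
            # may mix old unencrypted backups with new encrypted ones.
            name = self.DRIVER_VERSION_MAPPING[metadata["version"]]
            restore_func = getattr(self, name)
            restore_func(backup_id)


sketch = ChunkedRestoreSketch({
    "full": {"version": "1.0.0"},
    "incr": {"version": "1.1.0"},
})
sketch.restore_chain(["full", "incr"])
```

Here the full backup made by an older release is restored with the old method and its incremental child with the new one.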
When compression is enabled with the ``backup_compression_algorithm`` the
compression should be done prior to the encryption, since encryption turns the
data into high-entropy data reducing the likelihood of the compression doing
anything useful.
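The effect of the ordering can be illustrated with the standard library alone. In this sketch random bytes stand in for ciphertext, on the assumption that encrypted output is statistically similar to random data; no real encryption is performed.

```python
import os
import zlib

# Highly repetitive data, similar to sparse or zero-filled volume regions.
plaintext = b"cinder backup chunk " * 4096
compressed_plain = zlib.compress(plaintext)

# Stand-in for encrypted data: random bytes of the same size.
ciphertext_like = os.urandom(len(plaintext))
compressed_cipher = zlib.compress(ciphertext_like)

plain_ratio = len(compressed_plain) / len(plaintext)
cipher_ratio = len(compressed_cipher) / len(ciphertext_like)
```

Compressing first shrinks the plaintext substantially, whereas compressing the high-entropy data gains nothing and only adds overhead.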
Since encryption algorithms are implemented in C libraries, the encryption and
decryption of data shall be done in a native thread from the ``tpool`` to
prevent greenthreads from blocking and making the whole service unresponsive.
Backup internals should be hidden from end users, so Cinder will not expose
to them whether backups are encrypted or not. Since this information could be
useful for system administrators, it will be made visible to them.
For this purpose a new DB column called ``encrypted`` will be added to the
``backups`` table to reflect whether the data in that specific backup is
encrypted or not. A backup will be encrypted whenever an encrypted volume is
backed up or when an unencrypted volume is backed up and the chunked backup
driver encrypts it.
It's important to state in the documentation that this field only reflects
that specific backup in a chain of backups, so we could have an incremental
backup that is marked as encrypted while its parent is unencrypted.
Alternatives
------------
An alternative could be to use a single Barbican key for the whole
cinder-backup service or to use a per-backup Barbican encryption key.
Data model impact
-----------------
This change will require the addition of a new DB column called ``encrypted``
in the ``backups`` table, defaulting to ``False``, to be able to detect whether
a backup is encrypted.
Besides the DB change in ``cinder/db/sqlalchemy/models.py`` the corresponding
``Backup`` Oslo Versioned Object (OVO) in ``cinder/objects/backup.py`` will
also need to be changed, which means:
- Bumping the version of the OVO
- Adding the ``encrypted`` boolean field and ensuring it's set to ``False`` by
default.
- Adding the compatibility code that removes the field when sending it to an
older RPC service (using ``obj_make_compatible`` method).
- Adding the new ``Backup`` OVO version to the OVO history in
``cinder/objects/base.py``.
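The compatibility step boils down to dropping the field for older targets. The sketch below shows that logic as a standalone function rather than as a real ``obj_make_compatible`` override; the function names and the version ``1.8`` are placeholders, since the spec does not state which OVO version introduces the field.

```python
# Placeholder: the Backup OVO version that first carries ``encrypted``.
ENCRYPTED_ADDED_IN = (1, 8)


def version_tuple(version: str):
    """Convert a dotted version string such as "1.7" into (1, 7)."""
    return tuple(int(part) for part in version.split("."))


def make_backup_primitive_compatible(primitive: dict, target_version: str) -> dict:
    """Remove fields that a service at target_version would not understand."""
    if version_tuple(target_version) < ENCRYPTED_ADDED_IN:
        primitive.pop("encrypted", None)
    return primitive
```

Comparing tuples instead of strings avoids ordering mistakes such as ``"1.10" < "1.8"``.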
The DB change also affects the ``import_record`` and ``export_record`` methods
from ``cinder/backup/manager.py``; ``import_record`` should remain backward
compatible with older backup exports.
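Backward compatibility on import could look like the sketch below. It assumes a record is a base64-encoded JSON document and that a missing ``encrypted`` key falls back to the column default of ``False``; both the helper names and that fallback are assumptions made for illustration.

```python
import base64
import json


def encode_record(fields: dict) -> str:
    """Serialize backup fields into an exportable string (simplified)."""
    return base64.b64encode(json.dumps(fields).encode("utf-8")).decode("ascii")


def decode_record(record: str) -> dict:
    """Deserialize a record, tolerating exports that predate ``encrypted``."""
    fields = json.loads(base64.b64decode(record).decode("utf-8"))
    # Older exports have no "encrypted" key; assume the column default.
    fields.setdefault("encrypted", False)
    return fields


old_export = encode_record({"id": "b-1", "volume_id": "v-1"})
new_export = encode_record({"id": "b-2", "volume_id": "v-2", "encrypted": True})
```

Decoding ``old_export`` yields ``encrypted`` set to ``False``, while ``new_export`` keeps the value it was exported with.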
Since the new DB field has no value for existing records, and we no longer do
data migrations when updating the schema, a new online data migration will
need to be added to the ``cinder/cmd/manage.py`` CLI, and also to the
``Backup`` OVO so that it is applied when a record is read from the DB and it
doesn't have the ``encrypted`` value set.
REST API impact
---------------
A new REST API microversion is needed since admins will now be able to see the
``encrypted`` field for backups.
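The visibility rule could be sketched as a view function like the one below. The function name, the shape of the view, and the microversion ``3.99`` are placeholders; the spec does not state the actual microversion number.

```python
# Placeholder microversion; the real number is not stated in the spec.
ENCRYPTED_FIELD_MICROVERSION = (3, 99)


def backup_view(backup: dict, is_admin: bool, request_version: tuple) -> dict:
    """Build a simplified API view of a backup (hypothetical helper)."""
    view = {"id": backup["id"], "status": backup["status"]}
    # Only admins on a new enough microversion see the ``encrypted`` field.
    if is_admin and request_version >= ENCRYPTED_FIELD_MICROVERSION:
        view["encrypted"] = backup["encrypted"]
    return view


backup = {"id": "b-1", "status": "available", "encrypted": True}
admin_view = backup_view(backup, is_admin=True, request_version=(3, 99))
user_view = backup_view(backup, is_admin=False, request_version=(3, 99))
legacy_admin_view = backup_view(backup, is_admin=True, request_version=(3, 0))
```

Normal users and clients on older microversions get the same response they receive today.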
Security impact
---------------
Security will be increased when using external public clouds for Cinder
backups.
Active/Active HA impact
-----------------------
No impact.
Notifications impact
--------------------
Notifications will include the new ``encrypted`` field when sending backup
notifications.
Other end user impact
---------------------
None.
Performance Impact
------------------
Doing encryption is computationally expensive, so this will have an impact on
the speed at which backups are performed.
This can be mitigated for encrypted volumes by setting
``backup_encryption_mode`` to ``auto``.
Better performance can be achieved by using multiple backup processes with the
``backup_workers`` configuration option or by horizontally scaling the
cinder-backup service.
Other deployer impact
---------------------
Deployers will now have 3 new configuration options related to this feature as
described above:
- ``backup_encryption_mode``
- ``backup_encryption_algorithm``
- ``backup_encryption_key``
Developer impact
----------------
None
Implementation
==============
Assignee(s)
-----------
TBD
Work Items
----------
- Update the DB model for the ``backups`` table.
- Update the ``Backup`` OVO.
- Update the import and export record methods.
- Update the REST API with a new microversion.
- Add new configuration options and update the chunked backup driver.
Dependencies
============
None
Testing
=======
Besides the standard unit tests, one of the tempest CI jobs should be modified
to run with encryption enabled, and the backup tempest tests should be made to
look for the value of the new ``encrypted`` field in the admin REST API
responses and confirm it's not present for normal users.
Documentation Impact
====================
Document ``doc/source/admin/volume-backups.rst`` must be updated to reflect the
new encryption possibilities, describing the main purpose of the feature
(public clouds) and noting that the Ceph backup driver doesn't support it.
References
==========
`Antelope PTG discussion`_
.. _`Antelope PTG discussion`: https://etherpad.opendev.org/p/antelope-ptg-cinder#L115


@ -1,7 +0,0 @@
.. This file is a place holder. It should be removed by
any patch proposing a spec for the 2024.1 release
================================
No specs have yet been approved.
================================