Encrypted backups

This spec adds the possibility for chunked backup drivers to encrypt/decrypt the data. This is primarily intended for backups that are stored in public clouds, such as GCS and S3.

Change-Id: Ib9999e01e02dc48a14a268e5277203a6ac0dd89b
parent 5adae83030
commit 0cfece2ed6

@ -0,0 +1,283 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=========================
Encrypted Chunked Backups
=========================

https://blueprints.launchpad.net/cinder/+spec/encrypted-backups

Backups inheriting from the Chunked Backup Driver always store data
unencrypted, which may be undesirable in some scenarios.


Problem description
===================

Keeping data in public cloud storage services, such as GCS and S3, means that
access control to that information is only partially in our control, and we are
forced to trust that the provider has tight control over its access.

This is not the best policy with sensitive information, and it is recommended
to encrypt sensitive data.

Unfortunately, none of the OpenStack public cloud backup drivers (GCS and S3)
supports encryption, which means that anyone with access to the OpenStack
backup location will be able to access the raw data of all unencrypted Cinder
volumes.


Use Cases
=========

As a system administrator I want to configure my OpenStack deployment to make
my backups to S3 without having to worry about sensitive data being readable
by anyone who has access to the S3 account, authorized or not.


Proposed change
===============

The proposal is to support encrypted backups for the S3 and GCS Cinder backup
drivers using a single static encryption key for all encrypted backups.

Users would **not** be allowed to select whether they want encryption or not,
as this would be configured by the system administrator that configures the
Cinder services and knows where backups are being stored.

Note:
  The RBD driver would not support encryption.

The reason to use a single key instead of per-backup encryption keys stored in
Barbican, as Cinder does for volumes, is twofold: the primary objective of
encrypting backups is to prevent people who have access to S3 from reading the
contents of the backups, and in a disaster scenario where the Barbican store is
also lost, all backups would be rendered unusable by the loss of the keys.

The proposal is to have 3 new configuration options,
``backup_encryption_mode``, ``backup_encryption_algorithm``, and
``backup_encryption_key``, to configure how chunked backup drivers would
handle encryption.

The ``backup_encryption_mode`` would accept 3 different options: ``off``,
``always``, and ``auto``, where:

- ``off`` would disable encryption, just like we do for the
  ``backup_compression_algorithm``, for example if we were using our own Swift
  cluster.

- ``always`` would force all backups to be encrypted.

- ``auto`` would only encrypt unencrypted volumes and would back up already
  encrypted volumes as they are, since double encryption would not provide
  additional security.

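The three modes reduce to a small decision function. The following is a
minimal sketch, assuming the driver knows whether the source volume is already
encrypted and using the ``always`` spelling from the option list;
``should_encrypt`` and its parameters are hypothetical names, not existing
Cinder code:

```python
VALID_MODES = ("off", "always", "auto")


def should_encrypt(mode: str, volume_is_encrypted: bool) -> bool:
    """Return True when the backup driver itself should encrypt the chunks.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown backup_encryption_mode: {mode!r}")
    if mode == "off":
        return False
    if mode == "always":
        return True
    # "auto": leave already encrypted volumes as they are.
    return not volume_is_encrypted
```

With ``auto``, only unencrypted volumes pay the encryption cost, which is the
trade-off described in the list above.
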
The ``backup_encryption_key`` would be a string configuration option.

For simplicity this initial implementation will only support the ``Fernet``
scheme from the ``cryptography`` Python library, which Cinder already
requires. Fernet provides symmetric authenticated encryption. It will be the
default algorithm and will be configured with the value ``fernet`` in the
``backup_encryption_algorithm`` configuration option.

To support encrypted backups we need a new version of the *metadata* file
format so that it can store the encryption algorithm and thereby mark the
backup as encrypted.

First we would bump the ``DRIVER_VERSION`` and update the
``DRIVER_VERSION_MAPPING`` to include the restore method for the new version,
which would support decrypting the backed-up data.

In the ``_write_metadata`` method we would save a new key called
``encryption_algorithm`` that would contain the configured
``backup_encryption_algorithm`` when we are encrypting the backup or ``none``
if the data is not being encrypted because backup encryption is disabled or the
volume is already encrypted.

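As an illustration of the new key, the sketch below builds a reduced metadata
dictionary. It is not the real ``_write_metadata`` implementation: the function
name, its parameters, and the subset of keys shown are assumptions, and the
version string is whatever the caller passes in.

```python
import json


def build_metadata(version: str, backup_id: str, driver_encrypts: bool,
                   algorithm: str = "fernet") -> dict:
    """Return a reduced, illustrative metadata dictionary.

    Hypothetical helper: the real metadata file carries more keys than these.
    """
    return {
        "version": version,
        "backup_id": backup_id,
        # "none" covers both cases from the text: encryption disabled, or
        # the volume is already encrypted and the driver leaves it alone.
        "encryption_algorithm": algorithm if driver_encrypts else "none",
    }


def serialize_metadata(metadata: dict) -> bytes:
    """Encode the metadata as JSON bytes (the encoding is an assumption)."""
    return json.dumps(metadata, sort_keys=True).encode("utf-8")
```
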
Backup drivers supporting encryption shall fail to start if the
``backup_encryption_algorithm`` has an incorrect value or if
``backup_encryption_mode`` is not ``off`` and there is no valid
``backup_encryption_key``.

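The start-up check could look roughly like the sketch below. It assumes that a
"valid" key means a well-formed Fernet key (32 bytes, URL-safe base64
encoded), which this spec does not spell out; ``check_encryption_config`` and
``is_valid_fernet_key`` are hypothetical names, and only the standard library
is used so that the check does not depend on the ``cryptography`` package.

```python
import base64
import binascii
from typing import Optional

SUPPORTED_ALGORITHMS = ("fernet",)


def is_valid_fernet_key(key: Optional[str]) -> bool:
    """Check that the key decodes to 32 bytes of URL-safe base64."""
    if not key:
        return False
    try:
        raw = base64.urlsafe_b64decode(key.encode("ascii"))
    except (binascii.Error, ValueError):
        # Covers bad padding and non-ASCII characters in the key.
        return False
    return len(raw) == 32


def check_encryption_config(mode: str, algorithm: str,
                            key: Optional[str]) -> None:
    """Raise ValueError when the driver should refuse to start."""
    if algorithm not in SUPPORTED_ALGORITHMS:
        raise ValueError(f"unsupported backup_encryption_algorithm: "
                         f"{algorithm!r}")
    if mode != "off" and not is_valid_fernet_key(key):
        raise ValueError("backup_encryption_key is missing or invalid")
```

Note that the algorithm is validated even when the mode is ``off``, matching
the wording of the paragraph above.
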
The new code should be able to support doing an incremental encrypted backup
even if the parent backup is unencrypted, and even if the parent was made by
an older Cinder release that used version ``1.0.0`` of the metadata format.
This means that the restore operation must select the appropriate restore
method based on each backup's metadata, and not assume that all backups in a
backup chain are the same.

In terms of code this is a small change: in the ``restore`` method, the
assignment of ``restore_func`` moves into the ``while`` loop, after the
metadata of ``backup1`` has been read.

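The per-backup selection can be sketched as follows. This is a simplified
stand-in for the driver's ``restore`` method, not the actual code: the class,
the ``1.1.0`` version string, and the ``_restore_v1_1`` method name are
hypothetical, and metadata is passed in as a plain dictionary instead of being
read from the backup store.

```python
DRIVER_VERSION_MAPPING = {
    "1.0.0": "_restore_v1",
    "1.1.0": "_restore_v1_1",  # hypothetical new version with decryption
}


class ChainRestorer:
    """Illustrative only: records which restore method each backup uses."""

    def __init__(self, metadata_by_backup):
        self.metadata_by_backup = metadata_by_backup
        self.calls = []

    def _restore_v1(self, backup_id):
        self.calls.append(("plain", backup_id))

    def _restore_v1_1(self, backup_id):
        self.calls.append(("decrypting", backup_id))

    def restore(self, chain):
        """Restore a chain ordered from the full backup to the newest one."""
        pending = list(chain)
        while pending:
            backup_id = pending.pop(0)
            # Read this backup's metadata first, then pick the method, so
            # that mixed chains (old parent, new incremental) work.
            version = self.metadata_by_backup[backup_id]["version"]
            method_name = DRIVER_VERSION_MAPPING.get(version)
            if method_name is None:
                raise ValueError(f"unsupported metadata version: {version}")
            restore_func = getattr(self, method_name)
            restore_func(backup_id)
```
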
When compression is enabled with the ``backup_compression_algorithm`` option,
the compression should be done prior to the encryption, since encryption turns
the data into high-entropy data, reducing the likelihood of the compression
doing anything useful.

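The effect of the ordering can be observed with the standard library alone.
The sketch below uses random bytes as a stand-in for raw ciphertext (an
assumption made to avoid depending on a crypto library) and compares how well
``zlib`` compresses repetitive data versus high-entropy data;
``compression_ratio`` is a hypothetical helper name.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data)) / len(data)


# Repetitive data, standing in for a typical compressible volume chunk.
plaintext = b"cinder backup chunk " * 4096

# Random bytes, standing in for the raw output of an encryption step.
ciphertext_like = os.urandom(len(plaintext))

ratio_plain = compression_ratio(plaintext)
ratio_encrypted = compression_ratio(ciphertext_like)
```

Here ``ratio_plain`` is a small fraction while ``ratio_encrypted`` stays at
roughly 1, which is why compressing after encrypting gains nothing.
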
Since encryption algorithms are implemented in C libraries, the encryption and
decryption of data shall be done in a native thread from the ``tpool`` to
prevent greenthreads from blocking and making the whole service unresponsive.

Backup internals should be hidden from end users, so Cinder will not expose to
them whether backups are encrypted or not. Since this information could be
useful for system administrators, it will be made visible to them.

For this purpose a new DB column called ``encrypted`` will be added to the
``backups`` table to reflect whether the data in that specific backup is
encrypted or not. A backup will be encrypted whenever an encrypted volume is
backed up or when an unencrypted volume is backed up and the chunked backup
driver encrypts it.

It's important to state in the documentation that this field only describes
that specific backup in a chain of backups, so we could have an incremental
backup that is marked as encrypted while its parent is unencrypted.

Alternatives
------------

An alternative could be to use a single Barbican key for the whole
cinder-backup service or to use a per-backup Barbican encryption key.

Data model impact
-----------------

This change will require the addition of a new DB column called ``encrypted``
in the ``backups`` table that defaults to ``False``.

Besides the DB change in ``cinder/db/sqlalchemy/models.py`` the corresponding
``Backup`` Oslo Versioned Object (OVO) in ``cinder/objects/backup.py`` will
also need to be changed, which means:

- Bumping the version of the OVO.

- Adding the ``encrypted`` boolean field and ensuring it's set to ``False`` by
  default.

- Adding the compatibility code that removes the field when sending it to an
  older RPC service (using the ``obj_make_compatible`` method).

- Adding the new ``Backup`` OVO version to the OVO history in
  ``cinder/objects/base.py``.

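The compatibility step from the list above can be sketched as a standalone
function. The real logic would live in the ``obj_make_compatible`` method of
the ``Backup`` OVO; the function below only mirrors the idea, and the version
number ``1.8`` is a placeholder assumption, not the actual OVO version.

```python
# Hypothetical version in which the ``encrypted`` field is introduced.
ENCRYPTED_FIELD_VERSION = "1.8"


def _version_tuple(version: str) -> tuple:
    """Turn "1.10" into (1, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def make_backup_compatible(primitive: dict, target_version: str) -> dict:
    """Return a copy of the primitive that an older service can accept."""
    compatible = dict(primitive)
    if _version_tuple(target_version) < _version_tuple(ENCRYPTED_FIELD_VERSION):
        # Older RPC services do not know the field, so drop it.
        compatible.pop("encrypted", None)
    return compatible
```

Comparing versions as integer tuples avoids the string-comparison pitfall
where ``"1.10"`` would sort before ``"1.8"``.
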
The DB change also affects the ``import_record`` and ``export_record`` methods
from ``cinder/backup/manager.py``; the former must remain backward compatible
with older backup exports.

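Backward compatibility on import mostly means tolerating records that lack the
new key. The sketch below assumes an exported record is base64-encoded JSON
and that a missing ``encrypted`` key should fall back to the column default of
``False``; both the helper names and these choices are assumptions made for
illustration.

```python
import base64
import json


def encode_backup_record(record: dict) -> str:
    """Serialize a record for export (assumed format: base64 of JSON)."""
    payload = json.dumps(record).encode("utf-8")
    return base64.b64encode(payload).decode("ascii")


def decode_backup_record(backup_url: str) -> dict:
    """Decode an exported record, accepting exports from older releases."""
    record = json.loads(base64.b64decode(backup_url).decode("utf-8"))
    # Older exports have no "encrypted" key; use the column default.
    record.setdefault("encrypted", False)
    return record
```
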
Since there is a new DB field that has no value in the DB, and we no longer do
data migrations when updating the schema, a new online data migration will
need to be added to the ``cinder/cmd/manage.py`` CLI, and the ``Backup`` OVO
will need to handle records read from the DB that don't have the ``encrypted``
value set.

REST API impact
---------------

A new REST API microversion is needed since admins will now be able to see the
``encrypted`` field for backups.

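The visibility rule could be expressed as in the sketch below, which assumes
the field is gated both on the caller being an admin and on the request
microversion. The function, the reduced set of view keys, and the ``(3, 99)``
microversion are placeholders, not the values the implementation would use.

```python
# Hypothetical microversion that introduces the field.
ENCRYPTED_MICROVERSION = (3, 99)


def backup_view(backup: dict, is_admin: bool, microversion: tuple) -> dict:
    """Build a reduced API view of a backup (illustrative keys only)."""
    view = {"id": backup["id"], "status": backup["status"]}
    if is_admin and microversion >= ENCRYPTED_MICROVERSION:
        # Only admins on a new enough microversion see the field.
        view["encrypted"] = backup["encrypted"]
    return view
```
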
Security impact
---------------

Security will be increased when using external public clouds for Cinder
backups.

Active/Active HA impact
-----------------------

No impact.

Notifications impact
--------------------

Notifications will include the new ``encrypted`` field when sending backup
notifications.

Other end user impact
---------------------

None.

Performance Impact
------------------

Encryption is computationally expensive, so this will have an impact on the
speed at which backups are performed.

For already encrypted volumes this can be mitigated by setting
``backup_encryption_mode`` to ``auto``.

Better performance can be achieved by using multiple backup processes with the
``backup_workers`` configuration option or by horizontally scaling the
cinder-backup service.

Other deployer impact
---------------------

Deployers will now have 3 new configuration options related to this feature as
described above:

- ``backup_encryption_mode``
- ``backup_encryption_algorithm``
- ``backup_encryption_key``

Developer impact
----------------

None


Implementation
==============

Assignee(s)
-----------

TBD

Work Items
----------

- Update the DB model for the ``backups`` table.

- Update the ``Backup`` OVO.

- Update the import and export record methods.

- Update the REST API with a new microversion.

- Add new configuration options and update the chunked backup driver.


Dependencies
============

None


Testing
=======

Besides the standard unit tests, one of the tempest CI jobs should be modified
to run with encryption enabled, and the backup tempest tests should be made to
look for the value of the new ``encrypted`` field in the admin REST API
responses and confirm it's not present for normal users.


Documentation Impact
====================

Document ``doc/source/admin/volume-backups.rst`` must be updated to reflect the
new encryption possibilities, describing the main purpose of the feature
(public clouds) and the fact that the Ceph backup driver doesn't support it.


References
==========

`Antelope PTG discussion`_


.. _`Antelope PTG discussion`: https://etherpad.opendev.org/p/antelope-ptg-cinder#L115

@ -1,7 +0,0 @@
.. This file is a place holder. It should be removed by
   any patch proposing a spec for the 2024.1 release

================================
No specs have yet been approved.
================================