Merge "crypto - add overview doc" into feature/crypto

Jenkins 2016-06-03 19:44:18 +00:00 committed by Gerrit Code Review
commit 0a5c9af5c5
3 changed files with 374 additions and 2 deletions

@ -59,6 +59,7 @@ Overview and Concepts
overview_erasure_code
overview_backing_store
ring_background
overview_encryption
associated_projects
Developer Documentation

@ -0,0 +1,373 @@
=================
Object Encryption
=================
Swift supports the optional encryption of object data at rest on storage nodes.
The encryption of object data is intended to mitigate the risk of users' data
being read if an unauthorised party were to gain physical access to a disk.
Encryption of data at rest is implemented by a set of three middleware modules
that may be included in the proxy server WSGI pipeline. The feature is internal
to a Swift cluster and not exposed through the API. Clients are unaware that
data is encrypted by this feature internally to the Swift service; internally
encrypted data should never be returned to clients via the Swift API.
The following data are encrypted while at rest in Swift:
* Object content i.e. the content of an object PUT request's body
* The entity tag (ETag) of the object
* All custom user metadata values i.e. metadata sent using X-Object-Meta-
prefixed headers with PUT or POST requests
Any data not included in the list above are not encrypted, including:
* Account, container and object names
* Account and container custom user metadata
* Custom user metadata names
* Object Content-Type values
* Object size
* System metadata
------------------------
Deployment and operation
------------------------
Encryption at rest is deployed by adding three middleware filters to the proxy
server WSGI pipeline and including their respective filter configuration
sections in the `proxy-server.conf` file::
... decrypter keymaster encrypter proxy-logging proxy-server
[filter:decrypter]
use = egg:swift#decrypter
[filter:keymaster]
use = egg:swift#keymaster
encryption_root_secret = your_secret
[filter:encrypter]
use = egg:swift#encrypter
# disable_encryption = False
See the example pipeline in `proxy-server.conf-sample` for further details on
the positioning of those middlewares relative to other middleware.
The keymaster config option ``encryption_root_secret`` MUST be set to a value
of at least 44 valid base-64 characters before the middleware is used and
should be consistent across all proxy servers.
.. note::
The ``encryption_root_secret`` option holds the master secret key used for
encryption. The security of all encrypted data critically depends on this
key, therefore it should be set to a high-entropy value. For example, a
suitable ``encryption_root_secret`` may be obtained by base-64 encoding a
32 byte (or longer) value generated by a cryptographically secure random
number generator.
The ``encryption_root_secret`` value is necessary to recover any encrypted
data from the storage system, and therefore, it must be guarded against
accidental loss. Its value (and consequently, the proxy-server.conf file)
should not be stored on any disk that is in any account, container or
object ring.
One method for generating a suitable value for ``encryption_root_secret`` is to
use the ``openssl`` command line tool::
openssl rand -base64 32
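An equivalent value could also be produced from Python. The following is a
minimal sketch rather than a tool shipped with Swift; the function name
``generate_root_secret`` is an assumption made for illustration:

```python
import base64
import secrets


def generate_root_secret(num_bytes: int = 32) -> str:
    """Return a base-64 encoded secret built from num_bytes random bytes.

    Hypothetical helper: secrets.token_bytes draws from the operating
    system's cryptographically secure random number generator.
    """
    return base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")


root_secret = generate_root_secret()
print(root_secret)
```

Encoding 32 random bytes yields 44 base-64 characters, which satisfies the
minimum length stated above.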
Once deployed, the encrypter will by default encrypt object data and metadata
when PUT and POST requests are made to the proxy server and the decrypter will
decrypt object data and metadata when handling GET and HEAD requests.
Objects that existed in the cluster prior to the encryption middlewares being
deployed are still readable with GET and HEAD requests. The content of those
objects will not be encrypted unless they are written again by a PUT or COPY
request. Any user metadata of those objects will not be encrypted unless it is
written again by a PUT, POST or COPY request.
Once deployed, the encryption middlewares should not be removed from the
pipeline. To do so might cause encrypted object data and/or metadata to be
returned in response to GET or HEAD requests.
Encryption of inbound object data may be disabled by setting the encrypter
``disable_encryption`` option to ``True``, in which case existing encrypted
objects will remain encrypted but new data written with PUT, POST or COPY
requests will not be encrypted. The encryption middlewares should remain in the
pipeline even when encryption of new objects is not required. The encrypter
middleware is needed to handle conditional GET requests that may be for
previously encrypted objects. The decrypter middleware is needed to handle all
GET requests that are for encrypted objects. The keymaster is needed to provide
keys for those requests.
.. _container_sync_client_config:
Container sync
--------------
If container sync is being used then the encryption middlewares must be added
to the container sync internal client pipeline. The following configuration
steps are required:
#. Create a custom internal client configuration file for container sync (if
one is not already in use) based on the sample file
`internal-client.conf-sample`. For example, copy
`internal-client.conf-sample` to `/etc/swift/container-sync-client.conf`.
#. Modify this file to include the encryption middlewares in the
pipeline in the same way as described above for the proxy server.
#. Modify the container-sync section of all container server config files to
point to this internal client config file using the
``internal_client_conf_path`` option. For example::
internal_client_conf_path = /etc/swift/container-sync-client.conf
--------------------------
Performance Considerations
--------------------------
TODO
--------------
Implementation
--------------
Encryption scheme
-----------------
Plaintext data is encrypted to a ciphertext using the AES cipher with 256-bit
keys. The cipher is used in counter mode so that any byte or range of bytes in
the ciphertext may be decrypted independently of any other bytes in the
ciphertext. This enables very simple handling of ranged GETs.
In general an item of plaintext data ``plaintext`` is transformed to a
``ciphertext``::
ciphertext = E(plaintext, k, iv)
where ``E`` is the encryption function, ``k`` is an encryption key and ``iv``
is a unique initialization vector (IV) chosen for each encryption operation.
The IV is stored as metadata of the encrypted item so that it is available for
decryption::
plaintext = D(ciphertext, k, iv)
where ``D`` is the decryption function.
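The property that makes ranged GETs simple, namely that any byte range of the
ciphertext can be decrypted on its own, can be illustrated with a toy
counter-mode cipher. This is a sketch under stated assumptions: the Python
standard library has no AES, so the keystream below is built from HMAC-SHA256
as a stand-in for AES, and the names ``keystream`` and ``ctr_xor`` are
hypothetical. It is not the code used by the encryption middleware:

```python
import hashlib
import hmac
import os

BLOCK_SIZE = 32  # one HMAC-SHA256 digest per counter value (AES would use 16)


def keystream(key: bytes, iv: bytes, offset: int, length: int) -> bytes:
    """Return `length` keystream bytes starting at byte position `offset`."""
    first_block = offset // BLOCK_SIZE
    last_block = (offset + length - 1) // BLOCK_SIZE
    # Stand-in for AES: each keystream block depends only on key, IV and counter.
    stream = b"".join(
        hmac.new(key, iv + counter.to_bytes(8, "big"), hashlib.sha256).digest()
        for counter in range(first_block, last_block + 1)
    )
    start = offset - first_block * BLOCK_SIZE
    return stream[start:start + length]


def ctr_xor(data: bytes, key: bytes, iv: bytes, offset: int = 0) -> bytes:
    """Encrypt or decrypt `data`, which begins at byte `offset` of the item."""
    stream = keystream(key, iv, offset, len(data))
    return bytes(a ^ b for a, b in zip(data, stream))


key = os.urandom(32)
iv = os.urandom(16)
plaintext = bytes(range(256)) * 4

ciphertext = ctr_xor(plaintext, key, iv)

# A ranged GET for bytes 100-199 only needs that slice of the ciphertext.
partial = ctr_xor(ciphertext[100:200], key, iv, offset=100)
```

Because each keystream block is computed from the key, the IV and a counter
alone, decrypting a range never requires the preceding ciphertext.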
In general any encrypted item has accompanying crypto-metadata that describes
the IV and the cipher algorithm used for the encryption::
crypto_metadata = {"iv": <16 byte value>,
"cipher": "AES_CTR_256"}
Key management
--------------
A keymaster middleware is responsible for providing the keys required for each
encryption and decryption operation. The keymaster middleware should provide
different keys for each object and container. These are made available to the
encrypter and decrypter via a callback function that the keymaster installs in
the WSGI request environ.
The current keymaster implementation derives container and object keys from the
``encryption_root_secret`` in a deterministic way by constructing an SHA256
HMAC using the ``encryption_root_secret`` as a key and the container or object
path as a message, for example::
object_key = HMAC(encryption_root_secret, "/a/c/o")
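In Python this derivation might look like the sketch below. The function name
``derive_key`` is hypothetical, and decoding the configured secret from
base-64 before use is an assumption rather than a statement about the
keymaster's code. The secret shown is the example-only value from the sample
configuration and must never be used in production:

```python
import base64
import hashlib
import hmac

# Example-only secret from proxy-server.conf-sample; never use in production.
EXAMPLE_SECRET = "ZG9udEV2ZXJVc2VUaGlzSW5fUFJPRFVDVElPTl94eHh4eHh4eHh4eHh4eHg="


def derive_key(root_secret_b64: str, path: str) -> bytes:
    """Derive a 32 byte key for a container or object path (sketch)."""
    # Assumption: the configured value is base-64 decoded before use.
    root_secret = base64.b64decode(root_secret_b64)
    return hmac.new(root_secret, path.encode("utf-8"), hashlib.sha256).digest()


container_key = derive_key(EXAMPLE_SECRET, "/a/c")
object_key = derive_key(EXAMPLE_SECRET, "/a/c/o")
```

The derivation is deterministic, so the same path always yields the same key,
while different paths yield different keys.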
Other strategies for providing object and container keys may be employed by
future implementations of alternative keymaster middleware.
The encrypter uses the object key to `wrap` other randomly generated keys that
are used to encrypt object data. A random key is `wrapped` by encrypting it
using the object key provided by the keymaster. This makes it safe to then
store the wrapped key alongside object data and metadata.
This process of `key wrapping` is performed to make re-keying more efficient.
If the object key ever needs to be replaced, any data encrypted directly with
that key must be re-encrypted. Key wrapping limits the data encrypted with the
object key to other randomly chosen keys, which can be re-wrapped efficiently
without needing to re-encrypt the larger amounts of data that were encrypted
using the random keys.
For example, as described below, the object body is encrypted using a random
key which is then wrapped using the object key. If re-keying requires the
object key to be replaced then only the random key needs to be re-encrypted and
not the object body, which is potentially a large amount of data.
.. note::
Re-keying is not currently implemented. Key wrapping is implemented
in anticipation of future re-keying operations.
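The benefit of key wrapping for a hypothetical re-keying event can be sketched
as follows. This is an illustration only: re-keying is not implemented,
``toy_ctr`` is a stand-in keystream cipher built from HMAC-SHA256 because the
Python standard library has no AES, and ``rewrap`` is an assumed helper name:

```python
import hashlib
import hmac
import os


def toy_ctr(data: bytes, key: bytes, iv: bytes) -> bytes:
    """Toy counter-mode cipher (stand-in for AES-CTR); it is its own inverse."""
    blocks = (len(data) + 31) // 32
    stream = b"".join(
        hmac.new(key, iv + n.to_bytes(8, "big"), hashlib.sha256).digest()
        for n in range(blocks)
    )
    return bytes(a ^ b for a, b in zip(data, stream))


def rewrap(wrapped_key: bytes, iv: bytes, old_key: bytes, new_key: bytes):
    """Unwrap with the old object key, then wrap again with the new one."""
    new_iv = os.urandom(16)
    unwrapped = toy_ctr(wrapped_key, old_key, iv)
    return toy_ctr(unwrapped, new_key, new_iv), new_iv


object_key = os.urandom(32)
body_key = os.urandom(32)
body_iv = os.urandom(16)
body_key_iv = os.urandom(16)

# The (potentially large) body is encrypted once with the random body key.
body_plaintext = b"object body data " * 1000
body_ciphertext = toy_ctr(body_plaintext, body_key, body_iv)
wrapped_body_key = toy_ctr(body_key, object_key, body_key_iv)

# Re-keying touches only the 32 byte wrapped key, not the body ciphertext.
new_object_key = os.urandom(32)
rewrapped_body_key, new_body_key_iv = rewrap(
    wrapped_body_key, body_key_iv, object_key, new_object_key)

recovered_body_key = toy_ctr(rewrapped_body_key, new_object_key, new_body_key_iv)
```

After the re-wrap, the unchanged ``body_ciphertext`` can still be decrypted
using the body key recovered with the new object key.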
Encrypter operation
-------------------
Custom user metadata
++++++++++++++++++++
The encrypter encrypts each item of custom user metadata using the object key
provided by the keymaster and an IV that is randomly chosen for that metadata
item. For example::
X-Object-Meta-Private1: value1
X-Object-Meta-Private2: value2
are transformed to::
X-Object-Meta-Private1: E(value1, object_key, header_iv_1)
X-Object-Meta-Private2: E(value2, object_key, header_iv_2)
For each custom user metadata header the encrypter stores the associated
crypto-metadata using an ``X-Object-Transient-Sysmeta-`` header. For the same
example::
X-Object-Transient-Sysmeta-Crypto-Meta-Private1:{"iv": header_iv_1,
"cipher": "AES_CTR_256"}
X-Object-Transient-Sysmeta-Crypto-Meta-Private2:{"iv": header_iv_2,
"cipher": "AES_CTR_256"}
Object body
+++++++++++
Encryption of an object body is performed using a randomly chosen body key
and a randomly chosen IV::
body_ciphertext = E(body_plaintext, body_key, body_iv)
The body_key is wrapped using the object key provided by the keymaster and a
randomly chosen IV::
wrapped_body_key = E(body_key, object_key, body_key_iv)
The encrypter stores the associated crypto metadata in a system metadata
header::
X-Object-Sysmeta-Crypto-Meta:
{"iv": body_iv,
"cipher": "AES_CTR_256",
"body_key": {"key": wrapped_body_key,
"iv": body_key_iv}}
Note that in this case there is an extra item of crypto metadata which stores
the wrapped body key and its IV.
Entity tag
++++++++++
While encrypting the object body the encrypter also calculates the ETag (md5
digest) of the plaintext body. This value is encrypted using a keymaster
provided container key, and an IV that is derived from the object's path, and
saved as an item of system metadata::
X-Object-Sysmeta-Crypto-Etag: E(md5(plaintext), container_key, F(path))
The encrypter stores the associated crypto metadata in a system metadata
header::
X-Object-Sysmeta-Crypto-Meta-Etag: {"iv": F(path),
"cipher": "AES_CTR_256"}
The reason for using the container key for this encryption is that the
encrypted ETag must also be included in the object update to the container
server, and will be included in container listings. The decrypter must be able
to decrypt the ETags in container listings using only the container key (since
object keys may not be available when handling a container request) so the
ETags must therefore be encrypted using the container key.
The encrypter forces the encrypted plaintext ETag to be sent with container
updates by adding an update override header to the PUT request. The associated
crypto metadata is appended to the encrypted ETag value in this header::
X-Object-Sysmeta-Container-Update-Override-Etag:
E(md5(plaintext), container_key, F(path));
meta={"iv": F(path), "cipher": "AES_CTR_256"}
The reason an IV derived from the object's path is used when encrypting the
ETag is to allow the encrypter to perform the same transformation on ETag
values specified in subsequent conditional GET or HEAD requests, so that they
can be compared against the encrypted object ETag when the object server
evaluates the conditional request. So, when handling a conditional GET or HEAD
request, the encrypter updates ``If[-None]-Match`` headers::
If[-None]-Match: E(ETag, container_key, F(path))
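The effect of a path-derived IV can be sketched as below. The choice of ``F``
here (the first 16 bytes of a SHA-256 digest of the path) is an assumption made
for illustration and is not taken from the middleware, ``toy_ctr`` is a
stand-in for AES in counter mode built from HMAC-SHA256, and ``derive_iv`` and
``encrypt_etag`` are hypothetical names:

```python
import hashlib
import hmac
import os


def toy_ctr(data: bytes, key: bytes, iv: bytes) -> bytes:
    """Toy counter-mode cipher (stand-in for AES-CTR); it is its own inverse."""
    blocks = (len(data) + 31) // 32
    stream = b"".join(
        hmac.new(key, iv + n.to_bytes(8, "big"), hashlib.sha256).digest()
        for n in range(blocks)
    )
    return bytes(a ^ b for a, b in zip(data, stream))


def derive_iv(path: str) -> bytes:
    """Hypothetical F(path): a 16 byte IV that depends only on the path."""
    return hashlib.sha256(path.encode("utf-8")).digest()[:16]


def encrypt_etag(etag: str, container_key: bytes, path: str) -> bytes:
    """E(ETag, container_key, F(path)) using the toy cipher."""
    return toy_ctr(etag.encode("ascii"), container_key, derive_iv(path))


container_key = os.urandom(32)
path = "/a/c/o"

# At PUT time the encrypter stores the encrypted md5 of the plaintext body.
stored_etag = encrypt_etag(
    hashlib.md5(b"object body").hexdigest(), container_key, path)

# Later, a client sends the plaintext ETag in an If-None-Match header.
client_etag = hashlib.md5(b"object body").hexdigest()
condition_matches = encrypt_etag(client_etag, container_key, path) == stored_etag
```

Because neither the key nor the IV changes between requests for the same path,
the transformed header value equals the stored value exactly when the
plaintext ETags are equal.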
Since the plaintext ETag value is only known once the encrypter has completed
processing the entire object body, the ``X-Object-Sysmeta-Crypto-Etag``,
``X-Object-Sysmeta-Crypto-Meta-Etag`` and
``X-Object-Sysmeta-Container-Update-Override-Etag`` headers are sent after the
encrypted object body using the proxy server's support for request footers.
Decrypter operation
-------------------
For each GET or HEAD request to an object, the decrypter inspects the response
for encrypted items (revealed by crypto metadata headers), and if any are
discovered then it will:
#. Fetch container and object keys from the keymaster via its callback
#. Decrypt the ``X-Object-Sysmeta-Crypto-Etag`` value using the container
key and the IV found in the ``X-Object-Sysmeta-Crypto-Meta-Etag`` header
#. Decrypt metadata headers using the object key
#. Decrypt the wrapped body key found in ``X-Object-Sysmeta-Crypto-Meta``
#. Decrypt the body using the body key
For each GET request to a container that includes a format param, the
decrypter will:
#. GET the container listing
#. Fetch container key from the keymaster via its callback
#. Decrypt the response body ETag entries using the container key
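The container listing steps above can be sketched as follows. The wire format
used here is an assumption: the ciphertext and IV are base-64 encoded and
joined with ``; meta=`` in the style of the update override header shown
earlier, and the listing is taken to be JSON with the ETag in a ``hash`` field.
``toy_ctr`` is a stand-in for AES in counter mode, and all function names are
hypothetical:

```python
import base64
import hashlib
import hmac
import json
import os


def toy_ctr(data: bytes, key: bytes, iv: bytes) -> bytes:
    """Toy counter-mode cipher (stand-in for AES-CTR); it is its own inverse."""
    blocks = (len(data) + 31) // 32
    stream = b"".join(
        hmac.new(key, iv + n.to_bytes(8, "big"), hashlib.sha256).digest()
        for n in range(blocks)
    )
    return bytes(a ^ b for a, b in zip(data, stream))


def encode_etag(etag: str, container_key: bytes, iv: bytes) -> str:
    """Build an 'encrypted ETag; meta=...' value (assumed encoding)."""
    ciphertext = toy_ctr(etag.encode("ascii"), container_key, iv)
    # The cipher label is copied from this document; the cipher here is a toy.
    meta = {"iv": base64.b64encode(iv).decode("ascii"), "cipher": "AES_CTR_256"}
    return "%s; meta=%s" % (
        base64.b64encode(ciphertext).decode("ascii"), json.dumps(meta))


def decode_etag(value: str, container_key: bytes) -> str:
    """Recover the plaintext ETag from an encoded listing value."""
    ciphertext_b64, _, meta_json = value.partition("; meta=")
    iv = base64.b64decode(json.loads(meta_json)["iv"])
    plaintext = toy_ctr(base64.b64decode(ciphertext_b64), container_key, iv)
    return plaintext.decode("ascii")


def decrypt_listing(listing_json: str, container_key: bytes) -> list:
    """Decrypt every encrypted ETag entry; leave unencrypted entries alone."""
    entries = json.loads(listing_json)
    for entry in entries:
        if "; meta=" in entry.get("hash", ""):
            entry["hash"] = decode_etag(entry["hash"], container_key)
    return entries


container_key = os.urandom(32)
plain_etag = "d41d8cd98f00b204e9800998ecf8427e"
listing = json.dumps([
    {"name": "encrypted-object",
     "hash": encode_etag(plain_etag, container_key, os.urandom(16))},
    {"name": "legacy-object", "hash": plain_etag},
])
decrypted = decrypt_listing(listing, container_key)
```

Only the container key is needed here, which is why ETags are encrypted with
the container key rather than the object key.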
Impact on other Swift services
------------------------------
`Container Sync` uses an internal client to GET objects that are to be sync'd.
This internal client must be configured to use the encryption middlewares as
described `above`__.
.. __: container_sync_client_config_
Encryption has no impact on the `object-auditor` service. The ETag header
saved with the object at rest is the md5 sum of the encrypted object body, so
the auditor verifies that the encrypted data is valid.
Encryption has no impact on the `object-expirer` service. ``X-Delete-At`` and
``X-Delete-After`` headers are not encrypted.
Encryption has no impact on the `object-replicator` and `object-reconstructor`
services. These services are unaware of the object or EC fragment data being
encrypted.
Encryption has no impact on the `container-reconciler` service. The
`container-reconciler` uses an internal client to move objects between
different policy rings. The destination object has the same URL as the source
object and the object is moved without re-encryption.
Considerations for developers
-----------------------------
Developers should be aware that encryption middlewares rely on the path of an
object remaining unchanged. The keymaster derives keys for containers and
objects based on their paths. The encrypter also uses the object path to derive
an IV for encrypting the ETag. As explained above, this choice of IV is
made to enable conditional request ETag values to be encrypted in an
identical fashion prior to matching with the object ETag.
Developers should therefore give careful consideration to any new features that
would relocate object data and metadata within a Swift cluster by means that do
not cause the object data and metadata to pass through the encryption
middlewares in the proxy pipeline and be re-encrypted.
The keymaster does persist the path that was used to derive keys as an item of
system metadata named ``X-Object-Sysmeta-Crypto-Id``. This metadata has been
included in anticipation of future scenarios when it may be necessary to
decrypt an object that has been relocated without re-encrypting, in which case
the value of ``X-Object-Sysmeta-Crypto-Id`` could be used to derive the keys
that were used for encryption. However, this alone is not sufficient to handle
conditional requests and to decrypt container listings where objects have been
relocated, and further work will be required to solve those issues.

@ -769,7 +769,6 @@ use = egg:swift#copy
# Note: To enable encryption, add the following 3 dependent pieces of
# crypto middleware to the proxy-server pipeline as follows:
# ... decrypter keymaster encrypter proxy-logging (end of pipeline)
[filter:decrypter]
use = egg:swift#decrypter
@ -788,6 +787,5 @@ use = egg:swift#keymaster
# to the devstack proxy-config so that gate tests can pass.
# base64 encoding of "dontEverUseThisIn_PRODUCTION_xxxxxxxxxxxxxxx"
encryption_root_secret = ZG9udEV2ZXJVc2VUaGlzSW5fUFJPRFVDVElPTl94eHh4eHh4eHh4eHh4eHg=
[filter:encrypter]
use = egg:swift#encrypter