Object Encryption

Swift supports the optional encryption of object data at rest on storage nodes. The encryption of object data is intended to mitigate the risk of users' data being read if an unauthorised party were to gain physical access to a disk.

Encryption of data at rest is implemented by a set of three middleware modules that may be included in the proxy server WSGI pipeline. The feature is internal to a Swift cluster and is not exposed through the API. Clients are unaware that their data is encrypted within the Swift service, and encrypted data should never be returned to clients via the Swift API.

The following data are encrypted while at rest in Swift:

  • Object content i.e. the content of an object PUT request's body
  • The entity tag (ETag) of the object
  • All custom user metadata values i.e. metadata sent using X-Object-Meta-prefixed headers with PUT or POST requests

Any data not included in the list above are not encrypted, including:

  • Account, container and object names
  • Account and container custom user metadata
  • Custom user metadata names
  • Object Content-Type values
  • Object size
  • System metadata

Deployment and operation

Encryption at rest is deployed by adding three middleware filters to the proxy server WSGI pipeline and including their respective filter configuration sections in the proxy-server.conf file:

... decrypter keymaster encrypter proxy-logging proxy-server

[filter:decrypter]
use = egg:swift#decrypter

[filter:keymaster]
use = egg:swift#keymaster
encryption_root_secret = your_secret

[filter:encrypter]
use = egg:swift#encrypter
# disable_encryption = False

See the example pipeline in proxy-server.conf-sample for further details on the positioning of those middlewares relative to other middleware.

The keymaster config option encryption_root_secret MUST be set to a value of at least 44 valid base-64 characters before the middleware is used and should be consistent across all proxy servers.

Note

The encryption_root_secret option holds the master secret key used for encryption. The security of all encrypted data critically depends on this key, therefore it should be set to a high-entropy value. For example, a suitable encryption_root_secret may be obtained by base-64 encoding a 32 byte (or longer) value generated by a cryptographically secure random number generator.

The encryption_root_secret value is necessary to recover any encrypted data from the storage system, and therefore, it must be guarded against accidental loss. Its value (and consequently, the proxy-server.conf file) should not be stored on any disk that is in any account, container or object ring.

One method for generating a suitable value for encryption_root_secret is to use the openssl command line tool:

openssl rand -base64 32
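A comparable value can also be produced from Python. The following is a minimal sketch using only the standard library; the function names are illustrative and are not part of Swift. The check mirrors the documented minimum of 44 valid base-64 characters.

```python
import base64
import binascii
import os


def generate_root_secret(num_bytes=32):
    """Return a base-64 encoded secret built from num_bytes of OS randomness."""
    return base64.b64encode(os.urandom(num_bytes)).decode('ascii')


def looks_like_valid_root_secret(value):
    """Check the documented minimum: at least 44 valid base-64 characters."""
    try:
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return False
    return len(value) >= 44
```

Encoding 32 random bytes yields exactly 44 base-64 characters, which is why 32 bytes is the smallest input that satisfies the minimum.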

Once deployed, the encrypter will by default encrypt object data and metadata when PUT and POST requests are made to the proxy server and the decrypter will decrypt object data and metadata when handling GET and HEAD requests.

Objects that existed in the cluster prior to the encryption middlewares being deployed are still readable with GET and HEAD requests. The content of those objects will not be encrypted unless they are written again by a PUT or COPY request. Any user metadata of those objects will not be encrypted unless it is written again by a PUT, POST or COPY request.

Once deployed, the encryption middlewares should not be removed from the pipeline. To do so might cause encrypted object data and/or metadata to be returned in response to GET or HEAD requests.

Encryption of inbound object data may be disabled by setting the encrypter disable_encryption option to True, in which case existing encrypted objects will remain encrypted but new data written with PUT, POST or COPY requests will not be encrypted. The encryption middlewares should remain in the pipeline even when encryption of new objects is not required. The encrypter middleware is needed to handle conditional GET requests that may be for previously encrypted objects. The decrypter middleware is needed to handle all GET requests that are for encrypted objects. The keymaster is needed to provide keys for those requests.

Container sync

If container sync is being used then the encryption middlewares must be added to the container sync internal client pipeline. The following configuration steps are required:

  1. Create a custom internal client configuration file for container sync (if one is not already in use) based on the sample file internal-client.conf-sample. For example, copy internal-client.conf-sample to /etc/swift/container-sync-client.conf.

  2. Modify this file to include the encryption middlewares in the pipeline in the same way as described above for the proxy server.

  3. Modify the container-sync section of all container server config files to point to this internal client config file using the internal_client_conf_path option. For example:

    internal_client_conf_path = /etc/swift/container-sync-client.conf

Performance Considerations

TODO

Implementation

Encryption scheme

Plaintext data is encrypted to a ciphertext using the AES cipher with 256-bit keys. The cipher is used in counter mode so that any byte or range of bytes in the ciphertext may be decrypted independently of any other bytes in the ciphertext. This enables very simple handling of ranged GETs.

In general an item of plaintext data is transformed to a ciphertext:

ciphertext = E(plaintext, k, iv)

where E is the encryption function, k is an encryption key and iv is a unique initialization vector (IV) chosen for each encryption operation. The IV is stored as metadata of the encrypted item so that it is available for decryption:

plaintext = D(ciphertext, k, iv)

where D is the decryption function.

In general any encrypted item has accompanying crypto-metadata that describes the IV and the cipher algorithm used for the encryption:

crypto_metadata = {"iv": <16 byte value>,
                   "cipher": "AES_CTR_256"}
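The property that makes counter mode convenient for ranged GETs can be illustrated with a short sketch. To stay within the Python standard library, the keystream below is built from HMAC-SHA256 as a stand-in for AES; it is not AES-CTR and is not what Swift uses. All names here are hypothetical. The point is only that the keystream for any byte offset can be computed directly, so a slice of ciphertext decrypts without touching the rest.

```python
import hashlib
import hmac
import os

BLOCK = 32  # keystream block size of the stand-in PRF (SHA-256 output)


def keystream(key, iv, offset, length):
    """Return `length` keystream bytes starting at byte `offset`.

    Stand-in for AES-CTR: keystream block n is HMAC-SHA256(key, iv || n).
    """
    first = offset // BLOCK
    last = (offset + length + BLOCK - 1) // BLOCK
    blocks = b''.join(
        hmac.new(key, iv + n.to_bytes(8, 'big'), hashlib.sha256).digest()
        for n in range(first, last))
    start = offset - first * BLOCK
    return blocks[start:start + length]


def crypt(data, key, iv, offset=0):
    """XOR data with the keystream; the same call encrypts and decrypts."""
    ks = keystream(key, iv, offset, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))


key, iv = os.urandom(32), os.urandom(16)
plaintext = b'0123456789' * 10
ciphertext = crypt(plaintext, key, iv)

# Decrypt only bytes 40..59, as a ranged GET would require.
part = crypt(ciphertext[40:60], key, iv, offset=40)
```

Here `part` equals `plaintext[40:60]` even though only 20 bytes of ciphertext were processed.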

Key management

A keymaster middleware is responsible for providing the keys required for each encryption and decryption operation. The keymaster middleware should provide different keys for each object and container. These are made available to the encrypter and decrypter via a callback function that the keymaster installs in the WSGI request environ.

The current keymaster implementation derives container and object keys from the encryption_root_secret in a deterministic way by constructing an SHA256 HMAC using the encryption_root_secret as a key and the container or object path as a message, for example:

object_key = HMAC(encryption_root_secret, "/a/c/o")
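The derivation can be sketched with the standard `hmac` and `hashlib` modules. This is an illustration rather than the keymaster's code: whether the secret is used as raw bytes or base-64 decoded first, and how the path is encoded, are assumptions made here.

```python
import hashlib
import hmac


def derive_key(root_secret, path):
    """Return a 32-byte key: HMAC-SHA256 keyed by root_secret over the path."""
    return hmac.new(root_secret, path.encode('utf-8'), hashlib.sha256).digest()


root_secret = b'\x00' * 32  # placeholder only; never use a constant secret
container_key = derive_key(root_secret, '/a/c')
object_key = derive_key(root_secret, '/a/c/o')
```

Because the derivation is deterministic, the same path always yields the same key, and no per-object key needs to be stored.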

Other strategies for providing object and container keys may be employed by future implementations of alternative keymaster middleware.

The encrypter uses the object key to wrap other randomly generated keys that are used to encrypt object data. A random key is wrapped by encrypting it using the object key provided by the keymaster. This makes it safe to then store the wrapped key alongside object data and metadata.

Key wrapping is performed to make re-keying more efficient. When an object key needs to be replaced, any data encrypted using that key must be re-encrypted. Key wrapping limits the data encrypted using the object key to other randomly chosen keys, which can be re-wrapped efficiently without re-encrypting the larger amounts of data that were encrypted using the random keys.

For example, as described below, the object body is encrypted using a random key which is then wrapped using the object key. If re-keying requires the object key to be replaced then only the random key needs to be re-encrypted and not the object body, which is potentially a large amount of data.
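The idea can be sketched as follows. The cipher is a toy stand-in built from HMAC-SHA256 so that the example needs only the standard library; it is not AES-CTR. All function names are hypothetical, and `rewrap_key` illustrates a re-keying step that Swift does not implement.

```python
import hashlib
import hmac
import os


def toy_crypt(data, key, iv):
    """Toy stand-in for E and D on values of up to 32 bytes (not AES-CTR)."""
    pad = hmac.new(key, iv, hashlib.sha256).digest()
    if len(data) > len(pad):
        raise ValueError('toy cipher only handles up to 32 bytes')
    return bytes(a ^ b for a, b in zip(data, pad))


def wrap_key(key_to_wrap, wrapping_key):
    """Encrypt a random key under the object key with a fresh random IV."""
    iv = os.urandom(16)
    return {"key": toy_crypt(key_to_wrap, wrapping_key, iv), "iv": iv}


def unwrap_key(wrapped, wrapping_key):
    """Recover the random key from its wrapped form."""
    return toy_crypt(wrapped["key"], wrapping_key, wrapped["iv"])


def rewrap_key(wrapped, old_object_key, new_object_key):
    """Hypothetical re-keying step: only the 32-byte body key is re-encrypted."""
    return wrap_key(unwrap_key(wrapped, old_object_key), new_object_key)


body_key = os.urandom(32)
old_object_key, new_object_key = os.urandom(32), os.urandom(32)
wrapped = wrap_key(body_key, old_object_key)
rewrapped = rewrap_key(wrapped, old_object_key, new_object_key)
```

The body ciphertext never appears in `rewrap_key`: replacing the object key touches 32 bytes regardless of the object's size.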

Note

Re-keying is not currently implemented. Key wrapping is implemented in anticipation of future re-keying operations.

Encrypter operation

Custom user metadata

The encrypter encrypts each item of custom user metadata using the object key provided by the keymaster and an IV that is randomly chosen for that metadata item. For example:

X-Object-Meta-Private1: value1
X-Object-Meta-Private2: value2

are transformed to:

X-Object-Meta-Private1: E(value1, object_key, header_iv_1)
X-Object-Meta-Private2: E(value2, object_key, header_iv_2)

For each custom user metadata header the encrypter stores the associated crypto-metadata using an X-Object-Transient-Sysmeta- header. For the same example:

X-Object-Transient-Sysmeta-Crypto-Meta-Private1: {"iv": header_iv_1,
                                                  "cipher": "AES_CTR_256"}
X-Object-Transient-Sysmeta-Crypto-Meta-Private2: {"iv": header_iv_2,
                                                  "cipher": "AES_CTR_256"}

Object body

Encryption of an object body is performed using a randomly chosen body key and a randomly chosen IV:

body_ciphertext = E(body_plaintext, body_key, body_iv)

The body_key is wrapped using the object key provided by the keymaster and a randomly chosen IV:

wrapped_body_key = E(body_key, object_key, body_key_iv)

The encrypter stores the associated crypto metadata in a system metadata header:

X-Object-Sysmeta-Crypto-Meta:
    {"iv": body_iv,
     "cipher": "AES_CTR_256",
     "body_key": {"key": wrapped_body_key,
                  "iv": body_key_iv}}

Note that in this case there is an extra item of crypto metadata which stores the wrapped body key and its IV.

Entity tag

While encrypting the object body the encrypter also calculates the ETag (md5 digest) of the plaintext body. This value is encrypted using a keymaster-provided container key and an IV that is derived from the object's path, and saved as an item of system metadata:

X-Object-Sysmeta-Crypto-Etag: E(md5(plaintext), container_key, F(path))

The encrypter stores the associated crypto metadata in a system metadata header:

X-Object-Sysmeta-Crypto-Meta-Etag: {"iv": F(path),
                                    "cipher": "AES_CTR_256"}

The reason for using the container key for this encryption is that the encrypted ETag must also be included in the object update to the container server, and will be included in container listings. The decrypter must be able to decrypt the ETags in container listings using only the container key (since object keys may not be available when handling a container request) so the ETags must therefore be encrypted using the container key.

The encrypter forces the encrypted plaintext ETag to be sent with container updates by adding an update override header to the PUT request, which also has the associated crypto metadata appended to the encrypted ETag value:

X-Object-Sysmeta-Container-Update-Override-Etag:
    E(md5(plaintext), container_key, F(path));
    meta={"iv": F(path), "cipher": "AES_CTR_256"}

The reason an IV derived from the object's path is used when encrypting the ETag is to allow the encrypter to perform the same transformation on ETag values specified in subsequent conditional GET or HEAD requests, so that they can be compared against the encrypted object ETag when the object server evaluates the conditional request. So, when handling a conditional GET or HEAD request, the encrypter updates If[-None]-Match headers:

If[-None]-Match: E(ETag, container_key, F(path))
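The effect of a path-derived IV can be shown with a small sketch. The document does not define F, so `iv_from_path` below is a hypothetical choice (the first 16 bytes of a SHA-256 digest of the path). The cipher is a toy HMAC-SHA256 stand-in rather than AES-CTR, and the base-64 output encoding is also an assumption.

```python
import base64
import hashlib
import hmac


def iv_from_path(path):
    """Hypothetical F(path): first 16 bytes of SHA-256 of the path."""
    return hashlib.sha256(path.encode('utf-8')).digest()[:16]


def encrypt_etag(etag, container_key, path):
    """Encrypt a 32-character hex ETag with a path-derived IV (toy cipher)."""
    data = etag.encode('ascii')
    pad = hmac.new(container_key, iv_from_path(path), hashlib.sha256).digest()
    if len(data) > len(pad):
        raise ValueError('toy cipher only handles up to 32 bytes')
    ciphertext = bytes(a ^ b for a, b in zip(data, pad))
    return base64.b64encode(ciphertext).decode('ascii')


container_key = b'\x01' * 32  # placeholder key for illustration only
path = '/a/c/o'
plaintext_etag = 'd41d8cd98f00b204e9800998ecf8427e'  # md5 of an empty body

# Value stored at PUT time, and value computed later for If-None-Match.
stored = encrypt_etag(plaintext_etag, container_key, path)
if_none_match = encrypt_etag(plaintext_etag, container_key, path)
```

Because neither the key nor the IV involves any randomness, `stored` and `if_none_match` are identical, so the object server can compare them without seeing the plaintext ETag. With a random IV the two values would differ and the comparison would fail.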

Since the plaintext ETag value is only known once the encrypter has completed processing the entire object body, the X-Object-Sysmeta-Crypto-Etag, X-Object-Sysmeta-Crypto-Meta-Etag and X-Object-Sysmeta-Container-Update-Override-Etag headers are sent after the encrypted object body using the proxy server's support for request footers.

Decrypter operation

For each GET or HEAD request to an object, the decrypter inspects the response for encrypted items (revealed by crypto metadata headers), and if any are discovered then it will:

  1. Fetch container and object keys from the keymaster via its callback
  2. Decrypt the X-Object-Sysmeta-Crypto-Etag value using the container key and the IV found in the X-Object-Sysmeta-Crypto-Meta-Etag header
  3. Decrypt metadata headers using the object key
  4. Decrypt the wrapped body key found in X-Object-Sysmeta-Crypto-Meta
  5. Decrypt the body using the body key

For each GET request to a container that includes a format param, the decrypter will:

  1. GET the container listing
  2. Fetch container key from the keymaster via its callback
  3. Decrypt the response body ETag entries using the container key
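Before an ETag entry in a listing can be decrypted, the ciphertext and its crypto metadata have to be separated. The sketch below parses a value in the `<ciphertext>; meta=<json>` form shown for the update override header. It assumes, purely for illustration, that the ciphertext and the IV are base-64 encoded; the actual serialization may differ, and `parse_override_etag` is a hypothetical helper.

```python
import base64
import json


def parse_override_etag(value):
    """Split '<ciphertext>; meta=<json>' into (ciphertext_bytes, meta_dict).

    Returns None when the value carries no crypto metadata, in which case
    the listing entry is assumed to be unencrypted and is left as-is.
    """
    ciphertext, sep, meta = value.partition(';')
    meta = meta.strip()
    if not sep or not meta.startswith('meta='):
        return None
    crypto_meta = json.loads(meta[len('meta='):])
    crypto_meta['iv'] = base64.b64decode(crypto_meta['iv'])
    return base64.b64decode(ciphertext.strip()), crypto_meta


example = '%s; meta=%s' % (
    base64.b64encode(b'ciphertext-bytes').decode('ascii'),
    json.dumps({"iv": base64.b64encode(b'\x00' * 16).decode('ascii'),
                "cipher": "AES_CTR_256"}))
parsed = parse_override_etag(example)
```

Returning `None` for entries without crypto metadata lets objects written before encryption was deployed pass through a listing unchanged.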

Impact on other Swift services

Container Sync uses an internal client to GET objects that are to be sync'd. This internal client must be configured to use the encryption middlewares as described above.

Encryption has no impact on the object-auditor service. The ETag header saved with the object at rest is the md5 sum of the encrypted object body, so the auditor verifies the integrity of the encrypted data as stored.

Encryption has no impact on the object-expirer service. X-Delete-At and X-Delete-After headers are not encrypted.

Encryption has no impact on the object-replicator and object-reconstructor services. These services are unaware of the object or EC fragment data being encrypted.

Encryption has no impact on the container-reconciler service. The container-reconciler uses an internal client to move objects between different policy rings. The destination object has the same URL as the source object and the object is moved without re-encryption.

Considerations for developers

Developers should be aware that encryption middlewares rely on the path of an object remaining unchanged. The keymaster derives keys for containers and objects based on their paths. The encrypter also uses the object path to derive an IV for encrypting the ETag. As explained above, this choice of IV is made to enable conditional request ETag values to be encrypted in an identical fashion prior to matching with the object ETag.

Developers should therefore give careful consideration to any new features that would relocate object data and metadata within a Swift cluster by means that do not cause the object data and metadata to pass through the encryption middlewares in the proxy pipeline and be re-encrypted.

The keymaster does persist the path that was used to derive keys as an item of system metadata named X-Object-Sysmeta-Crypto-Id. This metadata has been included in anticipation of future scenarios when it may be necessary to decrypt an object that has been relocated without re-encrypting, in which case the value of X-Object-Sysmeta-Crypto-Id could be used to derive the keys that were used for encryption. However, this alone is not sufficient to handle conditional requests and to decrypt container listings where objects have been relocated, and further work will be required to solve those issues.