diff --git a/doc/source/index.rst b/doc/source/index.rst index c648d0af4f..cc86a02c35 100644 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -59,6 +59,7 @@ Overview and Concepts overview_erasure_code overview_backing_store ring_background + overview_encryption associated_projects Developer Documentation diff --git a/doc/source/overview_encryption.rst b/doc/source/overview_encryption.rst new file mode 100644 index 0000000000..127b2b699d --- /dev/null +++ b/doc/source/overview_encryption.rst @@ -0,0 +1,373 @@ +================= +Object Encryption +================= + +Swift supports the optional encryption of object data at rest on storage nodes. +The encryption of object data is intended to mitigate the risk of users' data +being read if an unauthorised party were to gain physical access to a disk. + +Encryption of data at rest is implemented by a set of three middleware modules +that may be included in the proxy server WSGI pipeline. The feature is internal +to a Swift cluster and not exposed through the API. Clients are unaware that +data is encrypted by this feature internally to the Swift service; internally +encrypted data should never be returned to clients via the Swift API. + +The following data are encrypted while at rest in Swift: + +* Object content i.e. the content of an object PUT request's body +* The entity tag (ETag) of the object +* All custom user metadata values i.e. 
metadata sent using X-Object-Meta- + prefixed headers with PUT or POST requests + +Any data not included in the list above are not encrypted, including: + +* Account, container and object names +* Account and container custom user metadata +* Custom user metadata names +* Object Content-Type values +* Object size +* System metadata + +------------------------ +Deployment and operation +------------------------ + +Encryption at rest is deployed by adding three middleware filters to the proxy +server WSGI pipeline and including their respective filter configuration +sections in the `proxy-server.conf` file:: + + ... decrypter keymaster encrypter proxy-logging proxy-server + + [filter:decrypter] + use = egg:swift#decrypter + + [filter:keymaster] + use = egg:swift#keymaster + encryption_root_secret = your_secret + + [filter:encrypter] + use = egg:swift#encrypter + # disable_encryption = False + +See the example pipeline in `proxy-server.conf-sample` for further details on +the positioning of those middlewares relative to other middleware. + +The keymaster config option ``encryption_root_secret`` MUST be set to a value +of at least 44 valid base-64 characters before the middleware is used and +should be consistent across all proxy servers. + +.. note:: + + The ``encryption_root_secret`` option holds the master secret key used for + encryption. The security of all encrypted data critically depends on this + key, therefore it should be set to a high-entropy value. For example, a + suitable ``encryption_root_secret`` may be obtained by base-64 encoding a + 32 byte (or longer) value generated by a cryptographically secure random + number generator. + + The ``encryption_root_secret`` value is necessary to recover any encrypted + data from the storage system, and therefore, it must be guarded against + accidental loss. Its value (and consequently, the proxy-server.conf file) + should not be stored on any disk that is in any account, container or + object ring. 
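The note's suggestion of base-64 encoding 32 random bytes can be sketched in Python as follows. This is a minimal illustration, not part of Swift: the helper name ``generate_root_secret`` is hypothetical, and it assumes the standard library's ``os.urandom`` is an acceptable cryptographically secure source on the host.

```python
import base64
import os


def generate_root_secret(num_bytes=32):
    """Return a base-64 string encoding num_bytes of OS-provided randomness.

    Hypothetical helper for illustration only; it is not a Swift API.
    """
    # os.urandom reads from the operating system's CSPRNG.
    raw = os.urandom(num_bytes)
    return base64.b64encode(raw).decode("ascii")


if __name__ == "__main__":
    # 32 random bytes encode to 44 base-64 characters.
    print(generate_root_secret())
```

The printed value could then be pasted into the ``encryption_root_secret`` option and kept identical on every proxy server.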
+ +One method for generating a suitable value for ``encryption_root_secret`` is to +use the ``openssl`` command line tool:: + + openssl rand -base64 32 + +Once deployed, the encrypter will by default encrypt object data and metadata +when PUT and POST requests are made to the proxy server and the decrypter will +decrypt object data and metadata when handling GET and HEAD requests. + +Objects that existed in the cluster prior to the encryption middlewares being +deployed are still readable with GET and HEAD requests. The content of those +objects will not be encrypted unless they are written again by a PUT or COPY +request. Any user metadata of those objects will not be encrypted unless it is +written again by a PUT, POST or COPY request. + +Once deployed, the encryption middlewares should not be removed from the +pipeline. To do so might cause encrypted object data and/or metadata to be +returned in response to GET or HEAD requests. + +Encryption of inbound object data may be disabled by setting the encrypter +``disable_encryption`` option to ``True``, in which case existing encrypted +objects will remain encrypted but new data written with PUT, POST or COPY +requests will not be encrypted. The encryption middlewares should remain in the +pipeline even when encryption of new objects is not required. The encrypter +middleware is needed to handle conditional GET requests that may be for +previously encrypted objects. The decrypter middleware is needed to handle all +GET requests that are for encrypted objects. The keymaster is needed to provide +keys for those requests. + +.. _container_sync_client_config: + +Container sync +-------------- + +If container sync is being used then the encryption middlewares must be added +to the container sync internal client pipeline. The following configuration +steps are required: + +#. 
Create a custom internal client configuration file for container sync (if + one is not already in use) based on the sample file + `internal-client.conf-sample`. For example, copy + `internal-client.conf-sample` to `/etc/swift/container-sync-client.conf`. +#. Modify this file to include the encryption middlewares in the + pipeline in the same way as described above for the proxy server. +#. Modify the container-sync section of all container server config files to + point to this internal client config file using the + ``internal_client_conf_path`` option. For example:: + + internal_client_conf_path = /etc/swift/container-sync-client.conf + +-------------------------- +Performance Considerations +-------------------------- + +TODO + +-------------- +Implementation +-------------- + +Encryption scheme +----------------- + +Plaintext data is encrypted to a ciphertext using the AES cipher with 256-bit +keys. The cipher is used in counter mode so that any byte or range of bytes in +the ciphertext may be decrypted independently of any other bytes in the +ciphertext. This enables very simple handling of ranged GETs. + +In general an item of data ``plaintext`` is transformed to a +``ciphertext``:: + + ciphertext = E(plaintext, k, iv) + +where ``E`` is the encryption function, ``k`` is an encryption key and ``iv`` +is a unique initialization vector (IV) chosen for each encryption operation. +The IV is stored as metadata of the encrypted item so that it is available for +decryption:: + + plaintext = D(ciphertext, k, iv) + +where ``D`` is the decryption function. + +In general any encrypted item has accompanying crypto-metadata that describes +the IV and the cipher algorithm used for the encryption:: + + crypto_metadata = {"iv": <16 byte value>, + "cipher": "AES_CTR_256"} + +Key management +-------------- + +A keymaster middleware is responsible for providing the keys required for each +encryption and decryption operation.
The keymaster middleware should provide +different keys for each object and container. These are made available to the +encrypter and decrypter via a callback function that the keymaster installs in +the WSGI request environ. + +The current keymaster implementation derives container and object keys from the +``encryption_root_secret`` in a deterministic way by constructing an SHA256 +HMAC using the ``encryption_root_secret`` as a key and the container or object +path as a message, for example:: + + object_key = HMAC(encryption_root_secret, "/a/c/o") + +Other strategies for providing object and container keys may be employed by +future implementations of alternative keymaster middleware. + +The encrypter uses the object key to `wrap` other randomly generated keys that +are used to encrypt object data. A random key is `wrapped` by encrypting it +using the object key provided by the keymaster. This makes it safe to then +store the wrapped key alongside object data and metadata. + +This process of `key wrapping` is performed to enable more efficient re-keying +events when the object key may need to be replaced and consequently any data +encrypted using that key must be re-encrypted. Key wrapping minimizes the +amount of data encrypted using those keys to just other randomly chosen keys +which can be re-wrapped efficiently without needing to re-encrypt the larger +amounts of data that were encrypted using the random keys. + +For example, as described below, the object body is encrypted using a random +key which is then wrapped using the object key. If re-keying requires the +object key to be replaced then only the random key needs to be re-encrypted and +not the object body, which is potentially a large amount of data. + +.. note:: + + Re-keying is not currently implemented. Key wrapping is implemented + in anticipation of future re-keying operations. 
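The path-based key derivation described in this section can be sketched with the Python standard library. This is an illustrative sketch under stated assumptions, not the keymaster's actual code: the function name ``derive_key`` is hypothetical, the root secret is assumed to be available as raw bytes, and paths are assumed to be UTF-8 encoded before hashing. Key wrapping is left out because AES in counter mode needs a third-party cipher library.

```python
import hashlib
import hmac
import os


def derive_key(root_secret, path):
    """Return HMAC-SHA256(root_secret, path) as a 32-byte key.

    Hypothetical helper mirroring the formula
    object_key = HMAC(encryption_root_secret, "/a/c/o").
    """
    return hmac.new(root_secret, path.encode("utf-8"),
                    hashlib.sha256).digest()


# Example-only secret; a real deployment would load the configured value.
root_secret = os.urandom(32)

container_key = derive_key(root_secret, "/a/c")
object_key = derive_key(root_secret, "/a/c/o")

# The derivation is deterministic: the same path yields the same key,
# while a different path yields a different key.
assert derive_key(root_secret, "/a/c/o") == object_key
assert object_key != container_key
```

Because the keys depend only on the root secret and the path, nothing other than the root secret needs to be stored in order to recover them, which is also why an object's path must stay unchanged.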
+ + +Encrypter operation +------------------- + +Custom user metadata +++++++++++++++++++++ + +The encrypter encrypts each item of custom user metadata using the object key +provided by the keymaster and an IV that is randomly chosen for that metadata +item. For example:: + + X-Object-Meta-Private1: value1 + X-Object-Meta-Private2: value2 + +are transformed to:: + + X-Object-Meta-Private1: E(value1, object_key, header_iv_1) + X-Object-Meta-Private2: E(value2, object_key, header_iv_2) + +For each custom user metadata header the encrypter stores the associated +crypto-metadata using an ``X-Object-Transient-Sysmeta-`` header. For the same +example:: + + X-Object-Transient-Sysmeta-Crypto-Meta-Private1:{"iv": header_iv_1, + "cipher": "AES_CTR_256"} + X-Object-Transient-Sysmeta-Crypto-Meta-Private2:{"iv": header_iv_2, + "cipher": "AES_CTR_256"} + +Object body ++++++++++++ + +Encryption of an object body is performed using a randomly chosen body key +and a randomly chosen IV:: + + body_ciphertext = E(body_plaintext, body_key, body_iv) + +The body_key is wrapped using the object key provided by the keymaster and a +randomly chosen IV:: + + wrapped_body_key = E(body_key, object_key, body_key_iv) + +The encrypter stores the associated crypto metadata in a system metadata +header:: + + X-Object-Sysmeta-Crypto-Meta: + {"iv": body_iv, + "cipher": "AES_CTR_256", + "body_key": {"key": wrapped_body_key, + "iv": body_key_iv}} + +Note that in this case there is an extra item of crypto metadata which stores +the wrapped body key and its IV. + +Entity tag +++++++++++ + +While encrypting the object body the encrypter also calculates the ETag (md5 +digest) of the plaintext body. 
This value is encrypted using a keymaster +provided container key, and an IV that is derived from the object's path, and +saved as an item of system metadata:: + + X-Object-Sysmeta-Crypto-Etag: E(md5(plaintext), container_key, F(path)) + +The encrypter stores the associated crypto metadata in a system metadata +header:: + + X-Object-Sysmeta-Crypto-Meta-Etag: {"iv": F(path), + "cipher": "AES_CTR_256"} + +The reason for using the container key for this encryption is that the +encrypted ETag must also be included in the object update to the container +server, and will be included in container listings. The decrypter must be able +to decrypt the ETags in container listings using only the container key (since +object keys may not be available when handling a container request) so the +ETags must therefore be encrypted using the container key. + +The encrypter forces the encrypted plaintext ETag to be sent with container +updates by adding an update override header to the PUT request, which also has +the associated crypto metadata appended to the encrypted ETag value:: + + X-Object-Sysmeta-Container-Update-Override-Etag: + E(md5(plaintext), container_key, F(path)); + meta={"iv": F(path), "cipher": "AES_CTR_256"} + +The reason an IV derived from the object's path is used when encrypting the +ETag is to allow the encrypter to perform the same transformation on ETag +values specified in subsequent conditional GET or HEAD requests, so that they +can be compared against the encrypted object ETag when the object server +evaluates the conditional request. 
So, when handling a conditional GET or HEAD +request, the encrypter updates ``If[-None]-Match`` headers:: + + If[-None]-Match: E(ETag, container_key, F(path)) + +Since the plaintext ETag value is only known once the encrypter has completed +processing the entire object body, the ``X-Object-Sysmeta-Crypto-Etag``, +``X-Object-Sysmeta-Crypto-Meta-Etag`` and +``X-Object-Sysmeta-Container-Update-Override-Etag`` headers are sent after the +encrypted object body using the proxy server's support for request footers. + + +Decrypter operation +------------------- + +For each GET or HEAD request to an object, the decrypter inspects the response +for encrypted items (revealed by crypto metadata headers), and if any are +discovered then it will: + +#. Fetch container and object keys from the keymaster via its callback +#. Decrypt the ``X-Object-Sysmeta-Crypto-Etag`` value using the container + key and the IV found in the ``X-Object-Sysmeta-Crypto-Meta-Etag`` header +#. Decrypt metadata headers using the object key +#. Decrypt the wrapped body key found in ``X-Object-Sysmeta-Crypto-Meta`` +#. Decrypt the body using the body key + +For each GET request to a container that includes a format param, the +decrypter will: + +#. GET the container listing +#. Fetch the container key from the keymaster via its callback +#. Decrypt the response body ETag entries using the container key + + +Impact on other Swift services +------------------------------ + +`Container Sync` uses an internal client to GET objects that are to be sync'd. +This internal client must be configured to use the encryption middlewares as +described `above`__. + +.. __: container_sync_client_config_ + +Encryption has no impact on the `object-auditor` service. Because the ETag +header saved with the object at rest is the md5 sum of the encrypted object +body, the auditor verifies that the encrypted data is valid. + +Encryption has no impact on the `object-expirer` service.
``X-Delete-At`` and +``X-Delete-After`` headers are not encrypted. + +Encryption has no impact on the `object-replicator` and `object-reconstructor` +services. These services are unaware of the object or EC fragment data being +encrypted. + +Encryption has no impact on the `container-reconciler` service. The +`container-reconciler` uses an internal client to move objects between +different policy rings. The destination object has the same URL as the source +object and the object is moved without re-encryption. + + +Considerations for developers +----------------------------- + +Developers should be aware that encryption middlewares rely on the path of an +object remaining unchanged. The keymaster derives keys for containers and +objects based on their paths. The encrypter also uses the object path to derive +an IV for encrypting the ETag. As explained above, this choice of IV is +made to enable conditional request ETag values to be encrypted in an +identical fashion prior to matching with the object ETag. + +Developers should therefore give careful consideration to any new features that +would relocate object data and metadata within a Swift cluster by means that do +not cause the object data and metadata to pass through the encryption +middlewares in the proxy pipeline and be re-encrypted. + +The keymaster does persist the path that was used to derive keys as an item of +system metadata named ``X-Object-Sysmeta-Crypto-Id``. This metadata has been +included in anticipation of future scenarios when it may be necessary to +decrypt an object that has been relocated without re-encrypting, in which case +the value of ``X-Object-Sysmeta-Crypto-Id`` could be used to derive the keys +that were used for encryption. However, this alone is not sufficient to handle +conditional requests and to decrypt container listings where objects have been +relocated, and further work will be required to solve those issues.
diff --git a/etc/proxy-server.conf-sample b/etc/proxy-server.conf-sample index cade60751e..a871b70af5 100644 --- a/etc/proxy-server.conf-sample +++ b/etc/proxy-server.conf-sample @@ -769,7 +769,6 @@ use = egg:swift#copy # Note: To enable encryption, add the following 3 dependent pieces of # crypto middleware to the proxy-server pipeline as follows: # ... decrypter keymaster encrypter proxy-logging (end of pipeline) - [filter:decrypter] use = egg:swift#decrypter @@ -788,6 +787,5 @@ use = egg:swift#keymaster # to the devstack proxy-config so that gate tests can pass. # base64 encoding of "dontEverUseThisIn_PRODUCTION_xxxxxxxxxxxxxxx" encryption_root_secret = ZG9udEV2ZXJVc2VUaGlzSW5fUFJPRFVDVElPTl94eHh4eHh4eHh4eHh4eHg= - [filter:encrypter] use = egg:swift#encrypter