123 Commits

Author SHA1 Message Date
Zuul 60db1f847c Merge "slo: part-number=N query parameter support" 2024-03-13 00:13:45 +00:00
indianwhocodes 6adbeb4036 slo: part-number=N query parameter support
This change allows individual SLO segments to be downloaded by adding
an extra 'part-number' query parameter to the GET request.  You can
also retrieve the Content-Length of an individual segment with a HEAD
request.
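
As a rough illustration of the request shape described above, the
sketch below appends the query parameter to an object URL.  The helper
name and the example URL are hypothetical, and the assumption that part
numbers start at 1 is not stated in this message; only the
'part-number' parameter itself comes from this change.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_part_number(url, part_number):
    """Return url with a part-number=N query parameter appended."""
    if part_number < 1:
        # Assumption: part numbers are 1-based.
        raise ValueError("part_number must be >= 1")
    parts = urlsplit(url)
    # Keep any query parameters that are already on the URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("part-number", str(part_number)))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical object URL, for illustration only.
print(with_part_number("https://swift.example.com/v1/AUTH_test/c/big", 2))
```

The same URL could then be used for either a GET or a HEAD request.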

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I7af0dc9898ca35f042b52dd5db000072f2c7512e
2024-03-12 06:47:02 -07:00
Clay Gerrard 130188b6c0 zero bytes manifests are not legacy
Change-Id: I7c8adb129b8770eee501748a378f3adc42c8cd39
2024-02-27 17:21:00 -06:00
Clay Gerrard 1c31973d33 test: couple raw manifests with their TestCase
The ?format=raw TestCase has its own manifest setup and doesn't do any
segment validation.  Its manifests are not suitable for use in other
TestCases.

Change-Id: Idf4b72bb59b8bf7232236ca544a3317b6e2e08fd
2023-12-14 00:12:14 -06:00
Alistair Coles 60c04f116b s3api: Stop propagating storage policy to sub-requests
The proxy_logging middleware needs an X-Backend-Storage-Policy-Index
header to populate the storage policy field in logs, and will look in
both request and response headers to find it.

Previously, the s3api middleware would indiscriminately copy the
X-Backend-Storage-Policy-Index from swift backend requests into the
S3Request headers [1]. This works for logging but causes the header
to leak between backend requests [2] and break mixed policy
multipart uploads. This patch sets the X-Backend-Storage-Policy-Index
header on s3api responses rather than requests.

Additionally, the middleware now looks for the
X-Backend-Storage-Policy-Index header in the swift backend request
*and* response headers, in the same way that proxy_logging would
(preferring a response header over a request header). This means that
a policy index is now logged for bucket requests, which only have
X-Backend-Storage-Policy-Index header in their response headers.
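
The lookup order described above (response header first, then request
header) can be sketched as follows.  The function name and the use of
plain dicts for headers are assumptions made for illustration; this is
not the middleware's actual code.

```python
POLICY_HEADER = "X-Backend-Storage-Policy-Index"


def policy_index_for_logging(req_headers, resp_headers):
    """Prefer the response header, fall back to the request header.

    Returns None when neither set of headers carries a policy index.
    """
    if POLICY_HEADER in resp_headers:
        return resp_headers[POLICY_HEADER]
    return req_headers.get(POLICY_HEADER)
```

With this ordering a bucket request, whose backend request carries no
policy index, still yields a value from the response headers.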

The s3api adds the value from the *final* backend request/response
pair to its response headers. Returning the policy index from the
final backend request/response is consistent with swift.backend_path
being set to that backend request's path i.e. proxy_logging will log
the correct policy index for the logged path.

The FakeSwift helper no longer looks in registered object responses
for an X-Backend-Storage-Policy-Index header to update an object
request. Real Swift object responses do not have an
X-Backend-Storage-Policy-Index header. By default, FakeSwift will now
update *all* object requests with an X-Backend-Storage-Policy-Index as
follows:

  - If a matching container HEAD response has been registered then
    any X-Backend-Storage-Policy-Index found with that is used.
  - Otherwise the default policy index is used.

Furthermore, FakeSwift now adds the X-Backend-Storage-Policy-Index
header to the request *after* the request has been captured. Tests
using FakeSwift.calls_with_headers() to make assertions about captured
headers no longer need to make allowance for the header that FakeSwift
added.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Closes-Bug: #2038459
[1] Related-Change: I5fe5ab31d6b2d9f7b6ecb3bfa246433a78e54808
[2] Related-Change: I40b252446b3a1294a5ca8b531f224ce9c16f9aba
Change-Id: I2793e335a08ad373c49cbbe6759d4e97cc420867
2023-11-14 15:09:18 +00:00
Clay Gerrard 4a37a2976b slo: refactor GET/HEAD response handling
This patch reorganizes the SLO read response handling.  The main goal
was to push the response header replacement for both GET/HEAD SLO and
multipart-manifest=get paths all into a common return path.  A new
RespAttrs primitive is used to carry around some metadata details from
requests made in SLO.  The authors hope these changes make the code more
easily readable and easier to modify.

Drive-By: add new "friendly_close" function in common.utils so we can
drain empty/error responses more confidently (and use it in swob and
request_helpers).

Drive-By: the tests added in the Related-Change discovered a 500 on
If-[Un]Modified-Since conditional GET requests - it probably wasn't
important, but this refactor fixed it as a side effect.

Closes-Bug: #2040178
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Ashwin Nair <nairashwin952013@gmail.com>
Related-Change-Id: I54094f3d2098f56b755ec19cc9315d06a6ca8b15
Change-Id: Idc84e70539fc7480b6ecb86e2f0da904baf2c727
2023-11-10 15:26:28 -06:00
Clay Gerrard 3dab88bdf8 tests: refactor SLO size/etag sysmeta tests
We've been writing SLO manifests with size/etag sysmeta for more than 5
years, but we want our tests and code to continue to support the legacy
format forever.  This test infra refactor makes it easier for test
authors to opt in to testing legacy manifests by reusing a common
pattern for manifest setup across tests.

This consolidation also cleans up some duplication where two TestCases
had identical manifest setup and paves the way to more tidying of
similar (but slightly different) manifest setup across TestCases and
sharing of setup across future TestCases.

This manifest setup standardization also adopts a consistent naming
scheme for manifest sysmeta values so test assertions are easier to read
as correct at a glance (e.g. slo_etag vs json_md5)

Additionally leak tracking is added to the common base; SLO was already
really good about *closing* requests, but in many cases seems to not
bother reading/draining them (even when they might be empty/small).

As part of the leak tracking investigation a couple new tests were added
to explore the behavior of SLO's SegmentedIterable in the
request_helpers module.

Drive-By: Fix SegmentedIterable docstring: the constructor has
expected an iterable yielding dicts, not tuples, since the
Related-Change [2].

Drive-By: remove FakeSwift's now unused "register_responses" interface
and provide "register_next_response" as a replacement.  This allows test
authors to extend the registered response for a given request key from a
common test setup into a "series of registered responses" by expressing
just the new/next response rather than forcing them to duplicate the
initial response in the explicit list passed to "register_responses".

Related-Bug: #2040178
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
[1] Related-Change: Ia6ad32354105515560b005cea750aa64a88c96f9
[2] Related-Change-Id: Ib8dc216a84d370e6da7d6b819af79582b671d699
Change-Id: I54094f3d2098f56b755ec19cc9315d06a6ca8b15
2023-11-01 17:26:58 -05:00
Tim Burke 5392a2057b tests: Add test(s) for MPU part copy from range
When using the copy-part API, s3api is expected to write down an
empty value for X-Object-Sysmeta-S3Api-Etag on segments.  This was
ostensibly to prevent writing down an unrelated S3Api-Etag when copying
a part from another MPU, since the copy transfers object sysmeta.  We
should assume an S3Api-Etag w/o X-Static-Large-Object is nonsense, and
SLO should forever expect empty values for its sysmeta.

Drive-By: consolidate handling of boto2 sigv4 skips

Related-Bug: #2035158
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Ic6f04a5a6af8a3e65b226cff2ed6c9fce8ce1fa2
2023-09-18 15:25:46 -05:00
Zuul 7cbfe7c924 Merge "Improve FakeSwift Backend-Ignore-Range support" 2023-09-06 21:26:36 +00:00
Clay Gerrard 451ae26a8b Improve FakeSwift Backend-Ignore-Range support
In keeping with the trend as of late, this change makes FakeSwift behave
more reliably like a real Swift backend.

Swift backend object servers grew support for ignoring Range request
headers when fetching SLO manifests in Jan-2020, and FakeSwift learned
how to mimic the real behavior in Jul-2022.  This change unifies the
implementation details with a request_helper and consolidates the
behavior in FakeSwift.  It also makes the modern object-server behavior
the default.

Between 2020 and 2022 there was arguably some utility in defaulting to
legacy behavior, but in 2023, as we endeavor to refactor the SLO
implementation and extend its tests, a reliable FakeSwift is paramount.

Since most of the existing tests for SLO's behavior responding to Range
requests did not reliably assert behavior across new and old swift this
change selects the most relevant tests to legacy behavior and has them
opt-in to can_ignore_range = False, while the others merely have their
backend request asserts cleaned-up to match the backend request pattern
you would expect in a production environment that was upgraded in the last
3 years.  Additional technical investment may be required to ensure
older clusters can upgrade proxies before object servers w/o tracebacks
until the upgrade finishes; however it appears the existing code is
sufficiently robust despite the lack of explicit multi-inheritance
testing like was done for the legacy manifest format change in Nov-2016
(N.B. unlike rolling upgrade bugs, data is forever).

Related-Change-Id: I4ff2a178d0456e7e37d561109ef57dd0d92cbd4e
Related-Change-Id: If3861e5b9c4f17ab3b82ea16673ddb29d07820a1
Related-Change-Id: Ia6ad32354105515560b005cea750aa64a88c96f9

Change-Id: I7ebfd557b9c8ec25498c628fcf0695cd52ad78d6
2023-09-06 13:06:33 +01:00
Zuul 957743f25f Merge "slo: 500 if we can't load the manifest" 2023-09-06 01:11:01 +00:00
Alistair Coles 8f85e27c27 FakeSwift: use env['PATH_INFO'] to index uploaded objects
Change-Id: If17e10d309826b815d2b5b470a8bff071f5c4e87
2023-08-18 14:48:17 +01:00
indianwhocodes dab7192e1e tests: fix FakeSwift HEAD with query param
The existing FakeSwift implementation already supports using registered
GET responses for GET requests with a query param.  It also supports
using registered GET responses for HEAD requests (if they either both
had the exact SAME matching query params, or both did not have ANY query
params).  But it did not support using registered GET responses w/o
query params for HEAD requests with a query param, even though a GET
with the same query param would work.

This change makes it a little more consistent: any client or test that
makes a GET request, should be able to make a similar HEAD request and
expect consistent response headers and status.

This test infra improvement is needed as we're going to be extending
test_slo with a bunch of tests that assert consistent response headers
for both GET and HEAD requests w/ new query params.

Change-Id: Idb4020fdeee87a9164312dc9647ab0820b098ff8
2023-08-18 14:28:25 +01:00
Tim Burke 6e5b5a659a slo: 500 if we can't load the manifest
When the proxy app is trying to send back an object and hits an error
(maybe a timeout, maybe an EC decode error) *after* it has sent headers
and started streaming data, it just stops sending data, expecting
clients to notice the discrepancy in Content-Length and retry.

When that happened in SLO while reading a manifest, previously we'd just
assume the manifest is empty and send back an empty response. This would
cause confusion for users (who'd think we lost data or something) and
was clearly wrong.

Now, return a 500 to the client. Retrying is perfectly reasonable.

Change-Id: I7fc923ad0ef37459b7a76ce360dd7f320053d3f7
2023-06-28 13:05:26 -07:00
Zuul 0470994a03 Merge "slo: Default allow_async_delete to true" 2022-12-01 19:25:50 +00:00
Tim Burke e6ee372744 slo: Reduce overhead for 'Not an SLO manifest' responses
When clients issue a ?multipart-manifest=delete request to non-SLOs, we
try to fetch the manifest then drain and close the response upon seeing
it wasn't actually an SLO manifest. This could previously cause the extra
transfer (and discard) of several gigabytes of data.

Now, add two extra headers to the request:

  * Range: bytes=-1
  * X-Backend-Ignore-Range-If-Metadata-Present: X-Static-Large-Object

The first limits how much data we'll be discarding, while the second tells
object servers to ignore the range header if it's an SLO manifest. Note
that object-servers may still need to return more than one byte to the
proxy -- an EC policy will require that we get a full fragment's worth
from each server -- but at least we've got a better cap on our downside.

Why one byte? Because range requests weren't designed to be able to
return no data. Why the last byte (as opposed to the first)? Because
bytes=0-0 will 416 on a zero-byte object, while bytes=-1 will 200.

Note that the backend header was introduced in Swift 2.24.0 -- if we get
a response from an older object-server, it may respect the Range header
even though it's returning an SLO manifest. In that case, retry without
either header.
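
The reasoning about bytes=-1 versus bytes=0-0 can be checked with a
small model.  This is a simplified sketch of single-range handling that
reproduces the statuses named above; it is not the object server's
implementation, and both function names are hypothetical.

```python
def manifest_check_headers():
    """The two extra headers this change adds to the manifest fetch."""
    return {
        "Range": "bytes=-1",
        "X-Backend-Ignore-Range-If-Metadata-Present":
            "X-Static-Large-Object",
    }


def range_status(first, last, length):
    """Status for one byte range against an object of `length` bytes.

    first=None models a suffix range: bytes=-<last>.
    """
    if first is None:
        if last == 0:
            return 416  # a zero-length suffix is never satisfiable
        # A suffix longer than the object selects the whole object, so
        # an empty object is simply returned in full.
        return 200 if length == 0 else 206
    # A normal range needs its first byte to exist.
    return 206 if first < length else 416


print(range_status(0, 0, 0))     # bytes=0-0 on a zero-byte object
print(range_status(None, 1, 0))  # bytes=-1 on a zero-byte object
```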

Related-Bug: #1980954
Co-Authored-By: Romain de Joux <romain.de-joux@ovhcloud.com>
Change-Id: If3861e5b9c4f17ab3b82ea16673ddb29d07820a1
2022-07-28 14:50:16 -07:00
Romain de Joux a5c1444faa Drain and close response in StaticLargeObject.get_slo_segments
In get_slo_segments a GET subrequest is processed to get the SLO manifest,
but if the object is not an SLO the response was not drained/closed.

Closes-Bug: 1980954
Change-Id: I7862c8ef153416c00c8ca7d6bf2f3556a1776d8c
2022-07-19 15:22:31 +02:00
Alistair Coles 5227cb702b Refactor rate-limiting helper into a class
Replaces the ratelimit_sleep helper function with an
EventletRateLimiter class that encapsulates the rate-limiting state
that previously needed to be maintained by the caller of the function.

The ratelimit_sleep function is retained but deprecated, and now
forwards to the EventletRateLimiter class.

The object updater's BucketizedUpdateSkippingLimiter is refactored to
take advantage of the new EventletRateLimiter class.

The rate limiting algorithm is corrected to make the allowed request
rate more uniform: previously pairs of requests would be allowed in
rapid succession before the rate limiter would then sleep for the time
allowance consumed by those two requests; now the rate limiter will
sleep as required after each allowed request. For example, before a
max_rate of 1 per second might result in 2 requests being allowed
followed by a 2 second sleep. That is corrected to be a sleep of 1
second after each request.
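
A minimal sketch of the corrected behaviour, assuming a limiter object
that owns its own state.  The class below is hypothetical and is not
the EventletRateLimiter API; the clock and sleep functions are injected
only so the pacing can be observed without really sleeping.

```python
import time


class UniformRateLimiter:
    """Space allowed requests evenly at max_rate per second."""

    def __init__(self, max_rate, clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_rate
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # time at which the next request may run

    def wait(self):
        now = self.clock()
        if self.next_allowed is None or self.next_allowed < now:
            # First call, or we have been idle: no credit accumulates.
            self.next_allowed = now
        delay = self.next_allowed - now
        if delay > 0:
            self.sleep(delay)
        # Each allowed request consumes exactly one interval.
        self.next_allowed += self.interval
```

With max_rate=1, back-to-back calls to wait() are separated by one
second each, rather than two calls followed by a two second sleep.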

Change-Id: Ibcf4dbeb4332dee7e9e233473d4ceaf75a5a85c7
2022-05-04 11:22:50 +01:00
Matthew Oliver 589ac355f3 Move *_swift_info functions into a new registry module
The *_swift_info functions use module-global dicts to provide a
registry mechanism for registering and getting swift info.

This is an abnormal pattern and doesn't quite fit into utils. Further,
we are looking at following this pattern for sensitive info to trim in
the future.
So this patch does some house cleaning and moves this registry to a new
module, swift.common.registry, and updates all the references to it.

For backwards compat we still import the *_swift_info methods into utils
for any 3rd party tools or middleware.

Change-Id: I71fd7f50d1aafc001d6905438f42de4e58af8421
2022-02-03 14:41:13 +00:00
Tim Burke fa1058b6ed slo: Default allow_async_delete to true
We've had this option for a year now, and it seems to help. Let's enable
it for everyone. Note that Swift clients still need to opt into the
async delete via a query param, while S3 clients get it for free.

Change-Id: Ib4164f877908b855ce354cc722d9cb0be8be9921
2021-12-21 14:12:34 -08:00
Alistair Coles 5e33026495 Use CloseableChain when creating iterator of SLO response
When handling a GET response ProxyLoggingMiddleware will try to close
a reiterated [1] proxy response iterator if, for example, there is a
client disconnect.

The reiterate function encapsulates the result of calling iter() on
the proxy response. In the case of an SLO response, the iter method
returned an instance of itertools.chain, rather than the response
itself, which is an instance of SegmentedIterable. As a result the
SegmentedIterable.close() method would not be called and object server
connections would not be closed.

This patch replaces the itertools.chain with a CloseableChain which
encapsulates the SegmentedIterable and closes it when
CloseableChain.close() is called.
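
The idea can be sketched as a thin wrapper around itertools.chain that
remembers what it wraps.  The class name below is hypothetical and the
code is an illustration of the pattern, not the CloseableChain that
this patch adds.

```python
import itertools


class ClosingChain:
    """Iterate like itertools.chain, but forward close() to the parts."""

    def __init__(self, *iterables):
        self._iterables = iterables
        self._chain = itertools.chain(*iterables)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._chain)

    def close(self):
        # Close every wrapped iterable that knows how to be closed;
        # plain lists and the like are skipped.
        for iterable in self._iterables:
            close = getattr(iterable, "close", None)
            if close is not None:
                close()
```

A caller that only holds the chain can now release the underlying
response by calling close() on it.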

[1] The use of reiterate was introduced by the Related-Change.

Closes-Bug: #1909588
Related-Change: I27feabe923a6520e983637a9c68a19ec7174a0df
Change-Id: Ib7450a85692114973782525004466db49f63066d
2020-12-29 16:14:28 +00:00
Ade Lee 5320ecbaf2 replace md5 with swift utils version
md5 is not an approved algorithm in FIPS mode, and trying to
instantiate a hashlib.md5() will fail when the system is running in
FIPS mode.

md5 is allowed when in a non-security context.  There is a plan to
add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate
whether or not the instance is being used in a security context.

In the case where it is not, the instantiation of md5 will be allowed.
See https://bugs.python.org/issue9216 for more details.

Some downstream python versions already support this parameter.  To
support these versions, a new encapsulation of md5() is added to
swift/common/utils.py.  This encapsulation is identical to the one being
added to oslo.utils, but is recreated here to avoid adding a dependency.

This patch is to replace the instances of hashlib.md5() with this new
encapsulation, adding an annotation indicating whether the usage is
a security context or not.
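
A minimal sketch of such an encapsulation, assuming an interpreter
whose hashlib may or may not accept the usedforsecurity keyword.  It
illustrates the approach only and is not a copy of the function added
to swift/common/utils.py.

```python
import hashlib


def md5(data=b"", usedforsecurity=True):
    """Return an md5 object, annotating the context when supported."""
    try:
        return hashlib.md5(data, usedforsecurity=usedforsecurity)
    except TypeError:
        # Interpreters without the keyword: fall back to the plain call.
        return hashlib.md5(data)


# An ETag-style checksum is not a security use.
print(md5(b"abc", usedforsecurity=False).hexdigest())
```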

While this patch seems large, it is really just the same change over and
again.  Reviewers need to pay particular attention as to whether the
keyword parameter (usedforsecurity) is set correctly.   Right now, all
of them appear to be not used in a security context.

Now that all the instances have been converted, we can update the bandit
run to look for these instances and ensure that new invocations do not
creep in.

With this latest patch, the functional and unit tests all pass
on a FIPS enabled system.

Co-Authored-By: Pete Zaitcev
Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87
2020-12-15 09:52:55 -05:00
Zuul e22cad666a Merge "xlo: 5xx while validating first segment is a server error" 2020-11-20 01:13:26 +00:00
Zuul cd228fafad Merge "Add a new URL parameter to allow for async cleanup of SLO segments" 2020-11-18 00:50:54 +00:00
Clay Gerrard 3d787ddff8 xlo: 5xx while validating first segment is a server error
With DLO and SLO, we validate that we can read the first segment before
sending data to the client; this helps catch auth errors where the user
has access to read the manifest but not the segments.

Sometimes, though, that validation fails for transient reasons; if the
proxy couldn't get enough responses from primaries to determine whether
the object exists (for example), we should send back a 503 to indicate
to the client that it should retry the request.

Change-Id: Ice5358ff85ee2d5fe60785b73b67dea493044a2c
2020-11-17 16:38:33 +00:00
Tim Burke e78377624a Add a new URL parameter to allow for async cleanup of SLO segments
Add a new config option to SLO, allow_async_delete, to allow operators
to opt-in to this new behavior. If their expirer queues get out of hand,
they can always turn it back off.

If the option is disabled, handle the delete inline; this matches the
behavior of old Swift.

Only allow an async delete if all segments are in the same container and
none are nested SLOs, that way we only have two auth checks to make.

Have s3api try to use this new mode if the data seems to have been
uploaded via S3 (since it should be safe to assume that the above
criteria are met).

Drive-by: Allow the expirer queue and swift-container-deleter to use
high-precision timestamps.

Change-Id: I0bbe1ccd06776ef3e23438b40d8fb9a7c2de8921
2020-11-10 18:22:01 +00:00
Tim Burke 9a0bac4ceb Fail short reads in SegmentedIterable
...and check for it *before* doing the MD5 check. We see this happen
occasionally, but as best I can tell, it's always due to a
ChunkReadTimeout popping in the proxy that it can't recover from.

Change-Id: If238725bbec4fc3f6c8d000599c735a7c4972f7d
2020-09-28 13:56:00 -07:00
Tim Burke a8d2146266 xlo: Drain error responses
Otherwise, they get logged as 499s. While we're at it, also drain DLO
bodies (if they're small).

Change-Id: I7b54f25f42577020b10029c74f8fc01fa6fc591e
2020-09-15 12:51:16 -07:00
Zuul 24fcb380a8 Merge "Return correct etag for raw manifest" 2020-01-31 18:53:27 +00:00
Thiago da Silva b8c16de023 Return correct etag for raw manifest
When a client sends a '?multipart-manifest=get&format=raw' request, the
middleware will change the manifest returned from the object server.
This patch makes sure the response etag is updated to reflect the
changes to the manifest content.

Change-Id: I0ac6dd0808fb041ba7663f4a472a06ee3f1d9a71
2020-01-31 12:04:12 +11:00
Clay Gerrard 2759d5d51c New Object Versioning mode
This patch adds a new object versioning mode. This new mode provides
a new set of APIs for users to interact with older versions of an
object. It also changes the naming scheme of older versions and adds
a version-id to each object.

This new mode is not backwards compatible or interchangeable with the
other two modes (i.e., stack and history), especially due to the changes
in the naming scheme of older versions. This new mode will also serve
as a foundation for adding S3 versioning compatibility in the s3api
middleware.

Note that this does not (yet) support using a versioned container as
a source in container-sync. Container sync should be enhanced to sync
previous versions of objects.

Change-Id: Ic7d39ba425ca324eeb4543a2ce8d03428e2225a1
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Thiago da Silva <thiagodasilva@gmail.com>
2020-01-24 17:39:56 -08:00
Tim Burke e8b654f318 Have slo tell the object-server that it wants whole manifests
Otherwise, we waste a request on some 416/206 response that won't be
helpful.

To do this, add a new X-Backend-Ignore-Range-If-Metadata-Present header
whose value is a comma-separated list of header names. Middlewares may
include this header to tell object-servers to send the whole object
(rather than a 206 or 416) if *any* of the metadata are present.
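
The decision described above can be sketched as a small predicate.  The
function name and the dict-based headers and metadata are assumptions
made for illustration; the real object-server code may differ.

```python
HEADER = "X-Backend-Ignore-Range-If-Metadata-Present"


def should_ignore_range(request_headers, object_metadata):
    """True if any metadata name listed in the header is present."""
    value = request_headers.get(HEADER, "")
    wanted = {name.strip().lower()
              for name in value.split(",") if name.strip()}
    present = {name.lower() for name in object_metadata}
    return bool(wanted & present)
```

When the predicate is true the server would send the whole object with
a 200 instead of honouring the Range header.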

Have dlo and symlink use it, too; it won't save us any round-trips, but
it should clean up some object-server logging.

Change-Id: I4ff2a178d0456e7e37d561109ef57dd0d92cbd4e
2020-01-02 15:48:39 -08:00
Tim Burke d270596b67 Consistently use io.BytesIO
Change-Id: Ic41b37ac75b5596a8307c4962be86f2a4b0d9731
2019-10-15 15:09:46 +02:00
Samuel Merritt cb30811916 Allow bulk delete of big SLO manifests
If you set SLO's max_manifest_segments to a value larger than 10000,
then clients are able to create manifests with that many segments, but
unable to use "?multipart-manifest=delete" to delete them.

This is because the SLO middleware has its very own bulk-deleter that
it uses to handle such requests, and that bulk-deleter only allows
10000 deletions per request by default. This commit removes the
limitation so that any SLO manifest can be deleted along with its
segments.

I considered setting max-deletes-per-request to be equal to SLO's
max_manifest_segments, but that only works if max_manifest_segments
has never been decreased.

Note that this commit does not increase max_manifest_segments. Clients
cannot make SLOs any bigger than they could before. Also note that
this commit does not affect user-initiated bulk deletes, i.e. POST
requests with "?bulk-delete=true" set. Those requests are still
limited in their size, and those limits are not changed.

Change-Id: I6a35937e8418f4f2b8e29825fc9c40415e34742f
Closes-Bug: 1746685
2019-08-13 16:51:50 -07:00
Tim Burke 3ee6de408e slo: Add X-Manifest-Etag to responses
This matches the ETag of the underlying swift object, as opposed to the
MD5-of-MD5s that is the large object's ETag.
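
To make the distinction concrete, here is a simplified sketch of the
two values.  It assumes whole-object segments only (no per-segment
ranges) and uses hypothetical function names.

```python
import hashlib


def slo_etag(segment_etags):
    """MD5 of the concatenated segment ETags (the large object's ETag)."""
    joined = "".join(segment_etags).encode("ascii")
    return hashlib.md5(joined).hexdigest()


def manifest_etag(manifest_body):
    """MD5 of the stored manifest bytes (the underlying object's ETag)."""
    return hashlib.md5(manifest_body).hexdigest()
```

The two values differ for the same SLO, which is why a separate
X-Manifest-Etag header is useful to clients that need the latter.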

Change-Id: Ifab726f63739f62aeef495c970939410341694d1
2019-08-08 14:20:05 -07:00
Tim Burke ef5a37c2bf slo: Better handle non-manifest responses when refetching manifest
Previously, we never checked whether the response we get when refetching
is even successful, much less whether it's still coming from an SLO.

Now, if the refetched data is newer, act on it. If it's older, 503.

Closes-Bug: #1837270
Change-Id: I106b94c77da220c762869aa800c31b87c3dffeeb
2019-07-19 21:42:43 -07:00
Pete Zaitcev bd8c3067b4 py3: slo
This adds wsgi_to_str(self.path_info) everywhere we forgot it,
not only in the slo module itself.

Dropping the body=''.join(body) after call_slo() is obvious:
the latter only returns strings of bytes, not lists of such.

Change-Id: I6b4d87e4cda4945bc128dbc9c1edd39e736a59d2
2019-05-17 17:57:23 -05:00
Tim Burke 284bbdd391 Add slo_manifest_hook callback
... to allow other middlewares to impose additional constraints on
or make edits to SLO manifests before being written.

The callback takes a single argument: the python list that represents
the manifest to be written. All the normal list operations listed at
https://docs.python.org/2/library/stdtypes.html#mutable-sequence-types
are available to make changes to that before SLO serializes it as JSON.

The callback may return a list of problematic segments; each item in the
list should be a tuple of

    (quoted object name, description of problem)

This will be useful both for s3api minimum segment size validation and
creating tar large objects.
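
A hedged sketch of what such a callback might look like for a minimum
segment size check.  The "name" and "bytes" keys on each manifest
entry, the size limit, and the function name are all assumptions for
illustration; only the single list argument and the
(quoted object name, description of problem) return shape come from
the description above.

```python
from urllib.parse import quote

# Hypothetical limit, for illustration only.
MIN_SEGMENT_SIZE = 5 * 1024 * 1024


def check_min_segment_size(manifest):
    """Return a list of (quoted object name, problem) tuples."""
    problems = []
    # Assumption: the final segment is allowed to be small.
    for segment in manifest[:-1]:
        if segment.get("bytes", 0) < MIN_SEGMENT_SIZE:
            problems.append(
                (quote(segment.get("name", "")), "segment is too small"))
    return problems
```

An empty list signals that the manifest may be written as-is.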

Change-Id: I198c5196e0221a72b14597a06e5ce3c4b2bbf436
Related-Bug: #1636663
2018-12-11 15:46:01 -08:00
Alistair Coles 0cd42a2d33 Check other params preserved when slo_etag is extracted
Change-Id: Ie34ce2a33f2a642b97986fa28cf9db9e6da964d5
Related-Change: I67478923619b00ec1a37d56b6fec6a218453dafc
Related-Change: Ibaa630b5b4251cc4f821c01d3c09a8b8a6be342c
2018-07-12 10:01:58 -05:00
Clay Gerrard f8b9c24a1c Add unittest for slo_etag
Related-Change-Id: I67478923619b00ec1a37d56b6fec6a218453dafc

Change-Id: Ibaa630b5b4251cc4f821c01d3c09a8b8a6be342c
2018-07-12 10:01:54 -05:00
Tim Burke c4c98eb64d Include SLO ETag in container updates
Container servers will store an etag like

   <MD5 of manifest on disk>; slo_etag=<MD5 of concatenated ETags>

which the SLO middleware will break out into separate

   "hash": "<MD5 of manifest on disk>",
   "slo_etag": "\"<MD5 of concatenated ETags>\"",

keys for JSON listings. Text and XML listings are unaffected.
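
The split can be sketched as follows.  The function name is
hypothetical, and this simplified version drops any parameters other
than slo_etag, which the real middleware may need to preserve.

```python
def split_listing_etag(value):
    """Split '<md5>; slo_etag=<md5>' into separate listing keys."""
    base, _, params = value.partition(";")
    result = {"hash": base.strip()}
    for param in params.split(";"):
        key, sep, val = param.partition("=")
        if sep and key.strip() == "slo_etag":
            # The SLO ETag is reported wrapped in double quotes.
            result["slo_etag"] = '"%s"' % val.strip()
    return result
```

An etag stored by an older version of Swift has no parameter, so only
the "hash" key is produced for it.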

If a middleware left of SLO already specified a container update
override, the slo_etag parameter will be appended. If the base header
value was blank, the MD5 of the manifest will be inserted.

SLOs that were created on previous versions of Swift will continue to
just have the MD5 of the manifest in container listings.

Closes-Bug: 1618573
Change-Id: I67478923619b00ec1a37d56b6fec6a218453dafc
2018-07-10 15:41:29 -07:00
Timur Alperovich 0aad95005d Fix SLO delete for accounts with non-ASCII names.
If an account contains non-ASCII characters, currently SLO delete code
will fail, as get_slo_segments() method receives a unicode object, but
UTF-8 encoded account name. Attempting to concatenate the strings fails
with a UnicodeError, as it tries to use the ASCII codec to decode the
UTF-8 encoded account name.

This patch allows accounts with non-ASCII characters in their names to
delete SLOs.

Change-Id: I619d41e62c16b25bd5f58d300a3dc71aa4dc75c2
2018-05-23 16:19:50 -07:00
Tim Burke 42adbe561f Respect X-Backend-Etag-Is-At headers from left of SLO
If a middleware left of SLO wants to override the ETag for a large
object, it will need to send a X-Backend-Etag-Is-At on GETs if it wants
to be at all performant. This would work fine coming out of the object
controller (which would look at the headers in the response, figure out
what's the real conditional etag, and pass it to swob.Response), and
even encryption (which would do the same), but at SLO, we'd just replace
the ETag, flag it as a conditional response, and let swob assume the
*SLO* ETag is the conditional one.

Now, SLO will jump through the same resolve_backend_etag_is_at hoops that
other parts of the proxy have to deal with. This allows If-Match and
If-None-Match to work correctly if/when swift3 stores an S3-compatible
multipart-upload ETag.

Change-Id: Ibbf59d38d7bcc9c485b1d5305548144025d77441
2018-03-26 23:50:43 +00:00
Samuel Merritt 98d185905a Cleanup for iterators in SegmentedIterable
We had a pair of large, complicated iterators to handle fetching all
the segment data, and they were hard to read and think about. I tried
to break them out into some simpler pieces:

 * one to handle coalescing multiple requests to the same segment

 * one to handle fetching the bytes from each segment

 * one to check that the download isn't taking too long

 * one to count the bytes and make sure we sent the right number

 * one to catch errors and handle cleanup

It's more nesting, but each level now does just one thing.
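
The layering can be pictured as nested generators, each doing one job.
This is a much reduced sketch with hypothetical names covering three of
the five pieces; it is not the SegmentedIterable code.

```python
def fetch_bytes(segments):
    """Yield the chunks of each segment in order."""
    for segment in segments:
        for chunk in segment:
            yield chunk


def count_bytes(chunks, expected):
    """Pass chunks through and check the total once they run out."""
    sent = 0
    for chunk in chunks:
        sent += len(chunk)
        yield chunk
    if sent != expected:
        raise ValueError("sent %d bytes, expected %d" % (sent, expected))


def cleanup_on_exit(chunks, on_close):
    """Always run on_close, whether iteration finishes or fails."""
    try:
        yield from chunks
    finally:
        on_close()


segments = [[b"ab"], [b"c"]]
pipeline = cleanup_on_exit(
    count_bytes(fetch_bytes(segments), 3), lambda: print("closed"))
print(b"".join(pipeline))
```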

Change-Id: If6f5cbd79edeff6ecb81350792449ce767919bcc
2018-02-02 11:30:49 -08:00
Zuul 82844a3211 Merge "Add support for data segments to SLO and SegmentedIterable" 2018-02-01 12:52:55 +00:00
Joel Wright 11bf9e4588 Add support for data segments to SLO and SegmentedIterable
This patch updates the SLO middleware and SegmentedIterable to add
support for user-specified inlined-data segments. Such segments will
contain base64-encoded data to be added before/after an object-backed
segment within an SLO. To accommodate the potential extra data we
increase the default SLO maximum manifest size from 2MiB to 8MiB.
The default maximum number of segments remains 1000, but this will
only be enforced for object-backed segments.
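
A manifest mixing inlined data with an object-backed segment might be
built as in the sketch below.  The "data" and "path" key names and the
example object path are assumptions here, since this message does not
spell out the manifest format.

```python
import base64
import json


def data_segment(raw):
    """Build an inlined-data entry holding base64-encoded bytes."""
    # Assumption: inlined data is carried under a "data" key.
    return {"data": base64.b64encode(raw).decode("ascii")}


manifest = [
    data_segment(b"header bytes"),
    {"path": "/segments/part-0001"},  # hypothetical object-backed segment
    data_segment(b"trailer bytes"),
]
print(json.dumps(manifest))
```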

This patch is a prerequisite for a future patch enabling the
download of large objects as tarballs. The TLO patch will be added
as a dependent patch later.

UpgradeImpact
=============
During a rolling upgrade, an updated proxy may write a manifest that
out-of-date proxies will not be able to read. This will resolve itself
once the upgrade completes on all nodes.

Change-Id: Ib8dc216a84d370e6da7d6b819af79582b671d699
2018-01-31 02:13:22 +00:00
Tim Burke d1656e3349 slo: Send ETag header in 206 responses
Why weren't we doing that before?? The etag should be the same as for
GET/HEAD, and by sending it, we can assure resuming clients that they're
downloading the same object even if they didn't include an If-Match
header.

Change-Id: I4ccbd1ae3a909ecb4606ef18211d1b868f5cad86
Related-Change: Ic11662eb5c7176fbf422a6fc87a569928d6f85a1
2018-01-17 23:30:16 +00:00
Zuul 2596b3ca9d Merge "Let clients request heartbeats during SLO PUTs" 2017-11-03 16:05:18 +00:00
Tim Burke 77a8a4455d Let clients request heartbeats during SLO PUTs
An SLO PUT requires that we HEAD every referenced object; as a result, it
can be a very time-intensive operation. This makes it difficult as a
client to differentiate between a proxy-server that's still doing work and
one that's crashed but left the socket open.

Now, clients can opt-in to receiving heartbeats during long-running PUTs
by including the query parameter

    heartbeat=on

With heartbeating turned on, the proxy will start its response immediately
with 202 Accepted then send a single whitespace character periodically
until the request completes. At that point, a final summary chunk will be
sent which includes a "Response Status" key indicating success or failure
and (if successful) an "Etag" key indicating the Etag of the resulting SLO.

This mechanism is very similar to the way bulk extractions and deletions
work, and even the way SLO behaves for ?multipart-manifest=delete requests.

Note that this is opt-in: this prevents us from sending the 202 response
to existing clients that may mis-interpret it as an immediate indication
of success.
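
On the client side, handling such a response might look like the sketch
below.  It assumes the final summary chunk is JSON, which this message
does not state, and the function name is hypothetical; the
"Response Status" and "Etag" keys come from the description above.

```python
import json


def parse_heartbeat_body(body):
    """Return (ok, etag) from a heartbeat response body.

    The body is the heartbeat whitespace followed by the summary chunk.
    """
    summary = json.loads(body.strip())
    status = summary.get("Response Status", "")
    # The HTTP status is always 202, so success must be read from here.
    ok = status.startswith("2")
    return ok, (summary.get("Etag") if ok else None)
```

The key point is that the 202 status line alone says nothing about the
outcome; the client has to read the body to the end.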

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Related-Bug: 1718811
Change-Id: I65cee5f629c87364e188aa05a06d563c3849c8f3
2017-11-03 09:42:48 +00:00
Tim Burke e001c02ff9 Stop logging tracebacks on bad xLOs
The error messages alone should provide plenty enough information.
Plus, running functional tests really shouldn't cause tracebacks.

Also, tighten up tests for log messages.

Change-Id: I55136484d342af756fa153d971dcb9159a435f13
2017-10-18 22:39:37 +00:00