swift

Commit Graph

Author	SHA1	Message	Date
Zuul	60db1f847c	Merge "slo: part-number=N query parameter support"	2024-03-13 00:13:45 +00:00
indianwhocodes	6adbeb4036	slo: part-number=N query parameter support This change allows individual SLO segments to be downloaded by adding an extra 'part-number' query parameter to the GET request. You can also retrieve the Content-Length of an individual segment with a HEAD request. Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I7af0dc9898ca35f042b52dd5db000072f2c7512e	2024-03-12 06:47:02 -07:00
Clay Gerrard	130188b6c0	zero bytes manifests are not legacy Change-Id: I7c8adb129b8770eee501748a378f3adc42c8cd39	2024-02-27 17:21:00 -06:00
Tim Burke	76ca11773e	lint: Up-rev hacking Last time we did this was nearly 4 years ago; drag ourselves into something approaching the present. Address a few new pyflakes issues that seem reasonable to enforce: E275 missing whitespace after keyword E231 missing whitespace after ',' E721 do not compare types, for exact checks use `is` / `is not`, for instance checks use `isinstance()` Main motivator is that the old hacking kept us on an old version of flake8 et al., which no longer work with newer Pythons. Change-Id: I54b46349fabb9776dcadc6def1cfb961c123aaa0	2024-02-07 15:48:39 -08:00
Zuul	966340aeed	Merge "Remove per-service auto_create_account_prefix"	2023-12-01 01:48:57 +00:00
Takashi Kajinami	49b19613d2	Remove per-service auto_create_account_prefix The per-service option was deprecated almost 4 years ago[1]. [1] `4601548dab` Change-Id: I45f7678c9932afa038438ee841d1b262d53c9da8	2023-11-22 01:58:03 +09:00
Clay Gerrard	4a37a2976b	slo: refactor GET/HEAD response handling This patch reorganizes the SLO read response handling. The main goal was to push the response header replacement for both GET/HEAD SLO and multipart-manifest=get paths all into a common return path. A new RespAttrs primitive is used to carry around some metadata details from requests made in SLO. The authors hope these changes make the code more easily readable and easier to modify. Drive-By: add new "friendly_close" function in common.utils so we can drain empty/error responses more confidently (and use it in swob and request_helpers). Drive-By: the tests added in the Related-Change discovered a 500 on If-[Un]Modified-Since conditional GET requests - it probably wasn't important, but this refactor fixed it on accident as a side effect. Closes-Bug: #2040178 Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Co-Authored-By: Ashwin Nair <nairashwin952013@gmail.com> Related-Change-Id: I54094f3d2098f56b755ec19cc9315d06a6ca8b15 Change-Id: Idc84e70539fc7480b6ecb86e2f0da904baf2c727	2023-11-10 15:26:28 -06:00
Tim Burke	6e5b5a659a	slo: 500 if we can't load the manifest When the proxy app is trying to send back an object and hits an error (maybe a timeout, maybe an EC decode error) after it has sent headers and started streaming data, it just stops sending data, expecting clients to notice the discrepancy in Content-Length and retry. When that happened in SLO while reading a manifest, previously we'd just assume the manifest is empty and send back an empty response. This would cause confusion for users (who'd think we lost data or something) and was clearly wrong. Now, return a 500 to the client. Retrying is perfectly reasonable. Change-Id: I7fc923ad0ef37459b7a76ce360dd7f320053d3f7	2023-06-28 13:05:26 -07:00
Zuul	0470994a03	Merge "slo: Default allow_async_delete to true"	2022-12-01 19:25:50 +00:00
Tim Burke	e6ee372744	slo: Reduce overhead for 'Not an SLO manifest' responses When clients issue a ?multipart-manifest=delete request to non-SLOs, we try to fetch the manifest then drain and close the response upon seeing it wasn't actually an SLO manifest. This could previously cause the extra transfer (and discard) of several gigabytes of data. Now, add two extra headers to the request: * Range: bytes=-1 * X-Backend-Ignore-Range-If-Metadata-Present: X-Static-Large-Object The first limits how much data we'll be discarding, while the second tells object servers to ignore the range header if it's an SLO manifest. Note that object-servers may still need to return more than one byte to the proxy -- an EC policy will require that we get a full fragment's worth from each server -- but at least we've got a better cap on our downside. Why one byte? Because range requests weren't designed to be able to return no data. Why the last byte (as opposed to the first)? Because bytes=0-0 will 416 on a zero-byte object, while bytes=-1 will 200. Note that the backend header was introduced in Swift 2.24.0 -- if we get a response from an older object-server, it may respect the Range header even though it's returning an SLO manifest. In that case, retry without either header. Related-Bug: #1980954 Co-Authored-By: Romain de Joux <romain.de-joux@ovhcloud.com> Change-Id: If3861e5b9c4f17ab3b82ea16673ddb29d07820a1	2022-07-28 14:50:16 -07:00
Romain de Joux	a5c1444faa	Drain and close response in StaticLargeObject.get_slo_segments In get_slo_segments a GET subrequest is processed to get the SLO manifest, but if the object is not an SLO the response was not drained/closed. Closes-Bug: 1980954 Change-Id: I7862c8ef153416c00c8ca7d6bf2f3556a1776d8c	2022-07-19 15:22:31 +02:00
Matthew Oliver	589ac355f3	Move _swift_info functions into a new registry module The _swift_info functions use in-module global dicts to provide a registry mechanism for registering and getting swift info. This is an abnormal pattern and doesn't quite fit into utils. Further, we're looking at following this pattern for sensitive info to trim in the future. So this patch does some house cleaning and moves this registry to a new module, swift.common.registry, and updates all the references to it. For backwards compat we still import the *_swift_info methods into utils for any 3rd party tools or middleware. Change-Id: I71fd7f50d1aafc001d6905438f42de4e58af8421	2022-02-03 14:41:13 +00:00
Tim Burke	fa1058b6ed	slo: Default allow_async_delete to true We've had this option for a year now, and it seems to help. Let's enable it for everyone. Note that Swift clients still need to opt into the async delete via a query param, while S3 clients get it for free. Change-Id: Ib4164f877908b855ce354cc722d9cb0be8be9921	2021-12-21 14:12:34 -08:00
Ade Lee	5320ecbaf2	replace md5 with swift utils version md5 is not an approved algorithm in FIPS mode, and trying to instantiate a hashlib.md5() will fail when the system is running in FIPS mode. md5 is allowed when in a non-security context. There is a plan to add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate whether or not the instance is being used in a security context. In the case where it is not, the instantiation of md5 will be allowed. See https://bugs.python.org/issue9216 for more details. Some downstream python versions already support this parameter. To support these versions, a new encapsulation of md5() is added to swift/common/utils.py. This encapsulation is identical to the one being added to oslo.utils, but is recreated here to avoid adding a dependency. This patch is to replace the instances of hashlib.md5() with this new encapsulation, adding an annotation indicating whether the usage is a security context or not. While this patch seems large, it is really just the same change over and again. Reviewers need to pay particular attention as to whether the keyword parameter (usedforsecurity) is set correctly. Right now, all of them appear to be not used in a security context. Now that all the instances have been converted, we can update the bandit run to look for these instances and ensure that new invocations do not creep in. With this latest patch, the functional and unit tests all pass on a FIPS enabled system. Co-Authored-By: Pete Zaitcev Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87	2020-12-15 09:52:55 -05:00
Tim Burke	e78377624a	Add a new URL parameter to allow for async cleanup of SLO segments Add a new config option to SLO, allow_async_delete, to allow operators to opt-in to this new behavior. If their expirer queues get out of hand, they can always turn it back off. If the option is disabled, handle the delete inline; this matches the behavior of old Swift. Only allow an async delete if all segments are in the same container and none are nested SLOs, that way we only have two auth checks to make. Have s3api try to use this new mode if the data seems to have been uploaded via S3 (since it should be safe to assume that the above criteria are met). Drive-by: Allow the expirer queue and swift-container-deleter to use high-precision timestamps. Change-Id: I0bbe1ccd06776ef3e23438b40d8fb9a7c2de8921	2020-11-10 18:22:01 +00:00
Tim Burke	a8d2146266	xlo: Drain error responses Otherwise, they get logged as 499s. While we're at it, also drain DLO bodies (if they're small). Change-Id: I7b54f25f42577020b10029c74f8fc01fa6fc591e	2020-09-15 12:51:16 -07:00
Zuul	24fcb380a8	Merge "Return correct etag for raw manifest"	2020-01-31 18:53:27 +00:00
Thiago da Silva	b8c16de023	Return correct etag for raw manifest When client sends a '?multipart-manifest=get&format=raw' request middleware will change the manifest returned from object server. This patch makes sure the response etag is updated to reflect changes to manifest content Change-Id: I0ac6dd0808fb041ba7663f4a472a06ee3f1d9a71	2020-01-31 12:04:12 +11:00
Clay Gerrard	2759d5d51c	New Object Versioning mode This patch adds a new object versioning mode. This new mode provides a new set of APIs for users to interact with older versions of an object. It also changes the naming scheme of older versions and adds a version-id to each object. This new mode is not backwards compatible or interchangeable with the other two modes (i.e., stack and history), especially due to the changes in the naming scheme of older versions. This new mode will also serve as a foundation for adding S3 versioning compatibility in the s3api middleware. Note that this does not (yet) support using a versioned container as a source in container-sync. Container sync should be enhanced to sync previous versions of objects. Change-Id: Ic7d39ba425ca324eeb4543a2ce8d03428e2225a1 Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Co-Authored-By: Tim Burke <tim.burke@gmail.com> Co-Authored-By: Thiago da Silva <thiagodasilva@gmail.com>	2020-01-24 17:39:56 -08:00
Tim Burke	e8b654f318	Have slo tell the object-server that it wants whole manifests Otherwise, we waste a request on some 416/206 response that won't be helpful. To do this, add a new X-Backend-Ignore-Range-If-Metadata-Present header whose value is a comma-separated list of header names. Middlewares may include this header to tell object-servers to send the whole object (rather than a 206 or 416) if any of the metadata are present. Have dlo and symlink use it, too; it won't save us any round-trips, but it should clean up some object-server logging. Change-Id: I4ff2a178d0456e7e37d561109ef57dd0d92cbd4e	2020-01-02 15:48:39 -08:00
Tim Burke	1f7b97ec0f	Add normalize_etag() helper function ... and drive-by an import rename Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com> Change-Id: I1eaf075ff9855cfa03e7991bdf33375b0e4397e6	2019-12-14 01:53:43 +00:00
Zuul	9b4b57a880	Merge "py3: port SLO func tests"	2019-08-17 03:19:22 +00:00
Samuel Merritt	cb30811916	Allow bulk delete of big SLO manifests If you set SLO's max_manifest_segments to a value larger than 10000, then clients are able to create manifests with that many segments, but unable to use "?multipart-manifest=delete" to delete them. This is because the SLO middleware has its very own bulk-deleter that it uses to handle such requests, and that bulk-deleter only allows 10000 deletions per request by default. This commit removes the limitation so that any SLO manifest can be deleted along with its segments. I considered setting max-deletes-per-request to be equal to SLO's max_manifest_segments, but that only works if max_manifest_segments has never been decreased. Note that this commit does not increase max_manifest_segments. Clients cannot make SLOs any bigger than they could before. Also note that this commit does not affect user-initiated bulk deletes, i.e. POST requests with "?bulk-delete=true" set. Those requests are still limited in their size, and those limits are not changed. Change-Id: I6a35937e8418f4f2b8e29825fc9c40415e34742f Closes-Bug: 1746685	2019-08-13 16:51:50 -07:00
Zuul	196113f93e	Merge "Consolidate Container-Update-Override headers"	2019-08-13 02:52:08 +00:00
Clay Gerrard	996aa4547f	Consolidate Container-Update-Override headers Related-Change-Id: I179ea6180d31146bb947061c69b1807c59529ac8 Related-Change-Id: I056edc68aee8c0db2a2c4a5b9e3d242a895975b3 Change-Id: I84bd29ae48ff1b0826794a8fdf9aa87670ad4aa4	2019-08-09 10:35:26 -05:00
Tim Burke	3ee6de408e	slo: Add X-Manifest-Etag to responses This matches the ETag of the underlying swift object, as opposed to the MD5-of-MD5s that is the large object's ETag. Change-Id: Ifab726f63739f62aeef495c970939410341694d1	2019-08-08 14:20:05 -07:00
Tim Burke	a48104c545	py3: port SLO func tests Drive-by: remove some py2/py3 code branches where encoding/decoding should always succeed. Change-Id: Iabfa157f2b20e6fd650a441e9c0f6163478ce147	2019-07-22 15:02:37 +01:00
Tim Burke	ef5a37c2bf	slo: Better handle non-manifest responses when refetching manifest Previously, we never checked whether the response we get when refetching is even successful, much less whether it's still coming from an SLO. Now, if the refetched data is newer, act on it. If it's older, 503. Closes-Bug: #1837270 Change-Id: I106b94c77da220c762869aa800c31b87c3dffeeb	2019-07-19 21:42:43 -07:00
Tim Burke	5573354655	Move calls to self.app outside of error handling On py3, if/when you hit an error, you can get very noisy tracebacks like <traceback coming out of split_path()> During handling of the above exception, another exception occurred: <meaningful traceback> In general, I like this, but when we've used exception handling for flow-control, it gets difficult to separate the wheat from the chaff. Change-Id: I5f3bc6416207cab2c7e3a77ee6689360b55990e7	2019-06-17 13:43:48 -07:00
Pete Zaitcev	bd8c3067b4	py3: slo This adds wsgi_to_str(self.path_info) everywhere we forgot it, not only in the slo module itself. Dropping the body=''.join(body) after call_slo() is obvious: the latter only returns strings of bytes, not lists of such. Change-Id: I6b4d87e4cda4945bc128dbc9c1edd39e736a59d2	2019-05-17 17:57:23 -05:00
Tim Burke	fa678949ae	Fix quoting for large objects Change-Id: I46bdb6da8f778a6c86e0f8e883b52fc31e9fd44e Partial-Bug: 1774238 Closes-Bug: 1678022 Closes-Bug: 1598093 Closes-Bug: 1762997	2019-03-12 16:08:24 -07:00
Tim Burke	284bbdd391	Add slo_manifest_hook callback ... to allow other middlewares to impose additional constraints on or make edits to SLO manifests before being written. The callback takes a single argument: the python list that represents the manifest to be written. All the normal list operations listed at https://docs.python.org/2/library/stdtypes.html#mutable-sequence-types are available to make changes to that before SLO serializes it as JSON. The callback may return a list of problematic segments; each item in the list should be a tuple of (quoted object name, description of problem) This will be useful both for s3api minimum segment size validation and creating tar large objects. Change-Id: I198c5196e0221a72b14597a06e5ce3c4b2bbf436 Related-Bug: #1636663	2018-12-11 15:46:01 -08:00
Tim Burke	c4c98eb64d	Include SLO ETag in container updates Container servers will store an etag like <MD5 of manifest on disk>; slo_etag=<MD5 on concatenated ETags> which the SLO middleware will break out into separate "hash": "<MD5 of manifest on disk", "slo_etag": "\"<MD5 of concatenated ETags\"", keys for JSON listings. Text and XML listings are unaffected. If a middleware left of SLO already specified a container update override, the slo_etag parameter will be appended. If the base header value was blank, the MD5 of the manifest will be inserted. SLOs that were created on previous versions of Swift will continue to just have the MD5 of the manifest in container listings. Closes-Bug: 1618573 Change-Id: I67478923619b00ec1a37d56b6fec6a218453dafc	2018-07-10 15:41:29 -07:00
Tim Burke	d03fc9bc54	swob: Stop auto-encoding unicode bodies Instead, require that callers provide an encoding. Related-Change: I31408f525ba9836f634a35581d4aee6fa2c9428f Change-Id: I3e5ed9e4401eea76c375bb43ad4afc58b1d8006a	2018-06-28 09:58:44 -07:00
Timur Alperovich	0aad95005d	Fix SLO delete for accounts with non-ASCII names. If an account contains non-ASCII characters, currently SLO delete code will fail, as get_slo_segments() method receives a unicode object, but UTF-8 encoded account name. Attempting to concatenate the strings fails with a UnicodeError, as it tries to use the ASCII codec to decode the UTF-8 encoded account name. This patch allows accounts with non-ASCII characters in their names to delete SLOs. Change-Id: I619d41e62c16b25bd5f58d300a3dc71aa4dc75c2	2018-05-23 16:19:50 -07:00
Tim Burke	42adbe561f	Respect X-Backend-Etag-Is-At headers from left of SLO If a middleware left of SLO wants to override the ETag for a large object, it will need to send a X-Backend-Etag-Is-At on GETs if it wants to be at all performant. This would work fine coming out of the object controller (which would look at the headers in the response, figure out what's the real conditional etag, and pass it to swob.Response), and even encryption (which would do the same), but at SLO, we'd just replace the ETag, flag it as a conditional response, and let swob assume the SLO ETag is the conditional one. Now, SLO will jump through the same resolve_backend_etag_is_at hoops that other parts of the proxy have to deal with. This allows If-Match and If-None-Match to work correctly if/when swift3 stores an S3-compatible multipart-upload ETag. Change-Id: Ibbf59d38d7bcc9c485b1d5305548144025d77441	2018-03-26 23:50:43 +00:00
Zuul	82844a3211	Merge "Add support for data segments to SLO and SegmentedIterable"	2018-02-01 12:52:55 +00:00
Joel Wright	11bf9e4588	Add support for data segments to SLO and SegmentedIterable This patch updates the SLO middleware and SegmentedIterable to add support for user-specified inlined-data segments. Such segments will contain base64-encoded data to be added before/after an object-backed segment within an SLO. To accommodate the potential extra data we increase the default SLO maximum manifest size from 2MiB to 8MiB. The default maximum number of segments remains 1000, but this will only be enforced for object-backed segments. This patch is a prerequisite for a future patch enabling the download of large objects as tarballs. The TLO patch will be added as a dependent patch later. UpgradeImpact ============= During a rolling upgrade, an updated proxy may write a manifest that out-of-date proxies will not be able to read. This will resolve itself once the upgrade completes on all nodes. Change-Id: Ib8dc216a84d370e6da7d6b819af79582b671d699	2018-01-31 02:13:22 +00:00
Tim Burke	d1656e3349	slo: Send ETag header in 206 responses Why weren't we doing that before?? The etag should be the same as for GET/HEAD, and by sending it, we can assure resuming clients that they're downloading the same object even if they didn't include an If-Match header. Change-Id: I4ccbd1ae3a909ecb4606ef18211d1b868f5cad86 Related-Change: Ic11662eb5c7176fbf422a6fc87a569928d6f85a1	2018-01-17 23:30:16 +00:00
Zuul	2596b3ca9d	Merge "Let clients request heartbeats during SLO PUTs"	2017-11-03 16:05:18 +00:00
Tim Burke	77a8a4455d	Let clients request heartbeats during SLO PUTs An SLO PUT requires that we HEAD every referenced object; as a result, it can be a very time-intensive operation. This makes it difficult as a client to differentiate between a proxy-server that's still doing work and one that's crashed but left the socket open. Now, clients can opt-in to receiving heartbeats during long-running PUTs by including the query parameter heartbeat=on With heartbeating turned on, the proxy will start its response immediately with 202 Accepted then send a single whitespace character periodically until the request completes. At that point, a final summary chunk will be sent which includes a "Response Status" key indicating success or failure and (if successful) an "Etag" key indicating the Etag of the resulting SLO. This mechanism is very similar to the way bulk extractions and deletions work, and even the way SLO behaves for ?multipart-manifest=delete requests. Note that this is opt-in: this prevents us from sending the 202 response to existing clients that may mis-interpret it as an immediate indication of success. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Related-Bug: 1718811 Change-Id: I65cee5f629c87364e188aa05a06d563c3849c8f3	2017-11-03 09:42:48 +00:00
Zuul	058eb772e3	Merge "Make xml responses less insane"	2017-11-02 04:29:43 +00:00
Tim Burke	29e9ae1cc5	Make xml responses less insane Looking at bulk extractions: $ tar c *.py | curl http://saio:8090/v1/AUTH_test/c?extract-archive=tar \ -H accept:application/xml -T - <?xml version="1.0" encoding="UTF-8"?> <delete> <number_files_created>2</number_files_created> <response_body></response_body> <response_status>201 Created</response_status> <errors> </errors> </delete> Or SLO upload failures: $ curl http://saio:8090/v1/AUTH_test/c/slo?multipart-manifest=put -X PUT \ -d '[{"path": "not/found"}]' -H accept:application/xml <delete> <errors> <object><name>not/found</name><status>404 Not Found</status></object></errors> </delete> Why <delete>? Makes no sense. Drive-by: stop being so quadratic. Change-Id: I46b233864ba2815ac632d856b9f3c40cc9d0001a	2017-11-01 03:50:26 +00:00
Tim Burke	e001c02ff9	Stop logging tracebacks on bad xLOs The error messages alone should provide plenty enough information. Plus, running functional tests really shouldn't cause tracebacks. Also, tighten up tests for log messages. Change-Id: I55136484d342af756fa153d971dcb9159a435f13	2017-10-18 22:39:37 +00:00
Zuul	ae2a75ad52	Merge "Allow SLOs to have zero-byte last segments."	2017-10-17 18:08:18 +00:00
Tim Burke	c118059719	Respond 400 Bad Request when Accept headers fail to parse Change-Id: I6eb4e4bca95e2ee4fecdb703394cb2419737922d Closes-Bug: 1716509	2017-10-13 12:35:21 -07:00
Samuel Merritt	37f23b072e	Allow SLOs to have zero-byte last segments. Since we used to allow zero-byte last segments but now we don't, it can be difficult to deal with some old SLO manifests. Imagine you're writing some code to sync objects from Swift cluster A to Swift cluster B. You start off with just a GET from A piped into a PUT to B, and that works great until you hit a SLO manifest and B won't accept a 500GB object. So, you write some code to detect SLO manifests, sync their segments, then take the JSON manifest (?multipart-manifest=get) and sync that over. Now, life is good... until one day you get an exception notification that there's this manifest on cluster A that cluster B won't accept. Turns out that, back when Swift would take zero-byte final segments on SLOs (before commit `7f636a5`), someone uploaded such a SLO to cluster A. Now, however, zero-byte final segments are invalid, so that SLO that exists over in cluster A can't just be copied to cluster B. A little coding later, your sync tool detects zero-byte final segments and removes them when copying a manifest. But now your ETags don't match between clusters, so you have to figure out some way to deal with that, and so you put it in metadata, but then you realize that your syncer might encounter a SLO which contains a sub-SLO which has a zero-byte final segment, and it's right about then that you start thinking about giving up on programming and getting a job as an elevator mechanic. This commit makes life easier for developers of such applications by allowing SLOs to have zero-byte segments again. Change-Id: Ia37880bbb435e269ec53b2963eb1b9121696d479	2017-10-10 10:38:33 -07:00
Tim Burke	4665c175be	Clean up SLO tests and docs Change-Id: If7087cb674d6c575c4073ba09b5ef056d908655b	2017-10-04 00:16:31 +00:00
liuyamin	1eeb354c27	Fix the reST field raises in docstrings Probably the most common format for documenting arguments is reST field lists [1]. This change updates some docstrings to comply with the field lists syntax. [1] http://sphinx-doc.org/domains.html#info-field-lists Change-Id: I0c35c6b4df840018534737bca2ca32dc977b0e05	2017-06-28 09:10:24 +08:00
Jenkins	ab1701d1ff	Merge "Expand SLO manifest documentation."	2017-02-24 00:02:50 +00:00
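
Commit c4c98eb64d above describes container servers storing an etag of the form `<MD5 of manifest on disk>; slo_etag=<MD5 on concatenated ETags>`, which is broken out into separate "hash" and "slo_etag" keys for JSON listings. A minimal sketch of that split follows. The function name `split_listing_etag` is hypothetical and this is not the middleware's actual code; it only illustrates the string handling the commit message implies.

```python
def split_listing_etag(raw_etag):
    """Split '<md5>; slo_etag=<md5>' into JSON-listing style keys.

    Hypothetical helper for illustration only; not Swift's implementation.
    """
    base, _, params = raw_etag.partition(';')
    entry = {'hash': base.strip()}
    for param in params.split(';'):
        key, sep, value = param.strip().partition('=')
        if sep and key == 'slo_etag':
            # The commit message shows the slo_etag value wrapped in
            # double quotes in the listing, so the sketch does the same.
            entry['slo_etag'] = '"%s"' % value
    return entry
```

An etag with no `slo_etag` parameter (as the commit notes for SLOs created on previous versions of Swift) yields only the "hash" key in this sketch.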


118 Commits
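
Several commits above (3ee6de408e, c4c98eb64d) distinguish the ETag of the manifest object from the large object's ETag, described as an "MD5-of-MD5s" or the MD5 of the concatenated segment ETags. A rough sketch of that calculation follows, assuming every segment is a plain object-backed segment identified by its hex ETag. The function name `slo_etag` is hypothetical, and ranged or inlined-data segments (which may contribute differently) are ignored.

```python
import hashlib


def slo_etag(segment_etags):
    """Return the MD5 hex digest of the concatenated segment ETags.

    Hypothetical helper; assumes each ETag is an unquoted hex string.
    """
    concatenated = ''.join(segment_etags)
    # Plain hashlib.md5 is used here for brevity; commit 5320ecbaf2 notes
    # that this can fail on systems running in FIPS mode.
    return hashlib.md5(concatenated.encode('ascii')).hexdigest()
```

For example, `slo_etag([etag_of_part_1, etag_of_part_2])` hashes the two hex strings joined together, not the segment data itself.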