- Move statsd client into its own module
- Move all logging functions into their own module
- Move all config functions into their own module
- Move all helper functions into their own module
Partial-Bug: #2015274
Change-Id: Ic4b5005e3efffa8dba17d91a41e46d5c68533f9a
... in document_iters_to_http_response_body.
We seemed to be relying a little too heavily upon prompt garbage
collection to log client disconnects, leading to failures in
test_base.py::TestGetOrHeadHandler::test_disconnected_logging
under python 3.12.
Closes-Bug: #2046352
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I4479d2690f708312270eb92759789ddce7f7f930
If the proxy timed out while reading a replicated policy multi-part
response body, it would transform the ChunkReadTimeout to a
StopIteration. This masks the fact that the backend read has
terminated unexpectedly. The document_iters_to_multipart_byteranges
would complete iterating over parts and send a multipart terminator
line, even though no parts may have been sent.
This patch removes the conversion of ChunkReadTimeout to StopIteration.
The ChunkReadTimeout that is now raised prevents the
document_iters_to_multipart_byteranges 'for' loop completing and
therefore stops the multi-part terminator line being sent. It is
raised from the GetOrHeadHandler similar to other scenarios that raise
ChunkReadTimeouts while the resp body is being read.
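The masking can be sketched with a toy version of the 'for' loop (hypothetical generator names; the real loop is in document_iters_to_multipart_byteranges):

```python
class ChunkReadTimeout(Exception):
    pass

def backend_parts_old():
    # old behaviour: the timeout was swallowed, so iteration simply
    # ended as if the body had completed normally
    yield b'part-1'
    try:
        raise ChunkReadTimeout()
    except ChunkReadTimeout:
        return

def backend_parts_new():
    # new behaviour: the timeout propagates out of the generator
    yield b'part-1'
    raise ChunkReadTimeout()

def multipart_response(parts):
    for part in parts:
        yield part
    yield b'--terminator--'   # only reached if the 'for' loop completes

# old: the terminator is sent despite the broken backend read
assert list(multipart_response(backend_parts_old())) == \
    [b'part-1', b'--terminator--']

# new: the 'for' loop is aborted and no terminator is sent
received = []
try:
    for chunk in multipart_response(backend_parts_new()):
        received.append(chunk)
except ChunkReadTimeout:
    pass
assert received == [b'part-1']
```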
A ChunkReadTimeout exception handler is removed in the
_iter_parts_from_response method. This handler was previously never
reached (because StopIteration rather than ChunkReadTimeout was raised
from _get_next_response_part), but if it were reached (i.e. with this
change) then it would repeat logging of the error and repeat
incrementing the node's error counter.
This change in the GetOrHeadHandler mimics a similar change in the
ECFragGetter [1].
[1] Related-Change: I0654815543be3df059eb2875d9b3669dbd97f5b4
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: I6dd53e239f5e7eefcf1c74229a19b1df1c989b4a
Create new helper functions to set and get namespaces in cache. Use
these in both the object and container controllers when caching
namespaces for shard ranges in the updating and listing states
respectively.
Add unit tests for the new helper functions.
No intentional behavioural changes.
Change-Id: I6833ec64540fa19f658f0ee78952ecb43b49f169
The client_chunk_size attribute was introduced into GetOrHeadHandler
for EC support [1]. It was only ever not None for an
ECObjectController. The ECObjectController stopped using
GetOrHeadHandler for Object GET when the ECFragGetter class was
introduced [2], but the EC specific code was not expunged from
GetOrHeadHandler. In [3] the ECFragGetter client_chunk_size was renamed
to fragment_size to better reflect what it represented.
The skip_bytes attribute was similarly introduced for EC support. It
is only ever non-zero if client_chunk_size is an int. For EC,
skip_bytes is used to undo the effect of expanding the backend
range(s) to fetch whole fragments: the range(s) of decoded bytes
returned to the client may need to be narrower than the backend
ranges. There is no equivalent requirement for replicated GETs.
The elimination of client_chunk_size and skip_bytes simplifies the
yielding of chunks from the GetOrHeadHandler response iter.
Related-Change:
[1] I9c13c03616489f8eab7dcd7c5f21237ed4cb6fd2
[2] I0dc5644a84ededee753e449e053e6b1786fdcf32
[3] Ie1efaab3bd0510275d534b5c023cb73c98bec90d
Change-Id: I31ed36d32682469e3c5ca8bf9a2b383568d63c72
If the sharder is processing a node whose devices all have 0 weight,
`find_local_handoff_for_part` can fail because there will be no local
handoff devices available: it uses the replica2part2dev_id table to
find a device, and a 0-weighted device won't appear in that table.
This patch extends `find_local_handoff_for_part`: if it fails to find
a node from the ring, it falls back to a local device identified by
`_local_device_ids`, which is built up while the replicator or
sharder identifies local devices. That collection is built from
ring.devs, so it does include 0-weighted devices. This allows the
sharder to find a
location to write the shard_broker in a handoff location while
sharding.
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: Ic38698e9ca0397770c7362229baef1101a72788f
The SQLite in-memory databases have been great for testing but
as the swift DatabaseBrokers have become more complex, the limitations
of in-memory databases are being reached, mostly due
to the introduction of container sharding, where a broker sometimes needs
to make multiple connections to the same database at the same time.
Rather than rework the real broker logic to better support in-memory
testing, it's actually easier to just remove the in-memory broker tests
and use a "real" broker in a tempdir. This allows us to better test how
brokers behave in real life, pending files and all.
This patch replaces all the :memory: brokers in the tests with real ones
placed in a tempdir. To achieve this, a new base unittest class `TestDBBase`
has been added that creates, cleans up and provides some helper methods
to manage the db path and location.
Further, all references to :memory: in the Database brokers have been
removed.
Change-Id: I5983132f776b84db634fef39c833d5cfdce11980
s3api bucket listing elements currently have LastModified values with
millisecond precision. This is inconsistent with the value of the
Last-Modified header returned with an object GET or HEAD response
which has second precision. This patch reduces the precision to
seconds in bucket listings and upload part listings. This is also
consistent with observed AWS listing responses.
The last modified values in the swift native listing are rounded *up*
to the nearest second to be consistent with the seconds-precision
Last-Modified time header that is returned with an object GET or HEAD.
However, we continue to include millisecond digits set to 0 in the
last-modified string, e.g.: '2014-06-10T22:47:32.000Z'.
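A minimal sketch of the listing-time rounding, with a hypothetical helper name:

```python
import math
from datetime import datetime, timezone

def listing_last_modified(timestamp):
    # Round *up* to the next whole second, then render with the
    # millisecond digits kept but zeroed, matching the format above.
    whole = math.ceil(timestamp)
    dt = datetime.fromtimestamp(whole, tz=timezone.utc)
    return dt.strftime('%Y-%m-%dT%H:%M:%S.000Z')

# an object created part-way through a second lists under the next one
assert listing_last_modified(0.337) == '1970-01-01T00:00:01.000Z'
```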
Also, fix the last modified time returned in an object copy response
to be consistent with the last modified time of the object that was
created. Previously it was rounded down, but it should be rounded up.
Change-Id: I8c98791a920eeedfc79e8a9d83e5032c07ae86d3
Auth middlewares in particular may want to *know* when there's a
communication breakdown as opposed to a cache miss.
Update our shard-range cache stats to acknowledge the distinction.
Drive-by: Log an error if all memcached servers are error-limited.
Change-Id: Ic8d0915235d11124d06ec940c5be9a2edbe85c83
Between this and the (unreleased) pyeclib fix, I see unit and func tests
passing on py310. Haven't tried probe tests, yet.
Change-Id: Iacf66eda75fed6bf96900107250f393227c57ae5
inspect.getargspec was deprecated since Python 3.0 and
inspect.getfullargspec is its replacement with correct handling of
function annotations and keyword-only parameters[1].
This change ensures that inspect.getfullargspec is used in Python 3.
[1] https://docs.python.org/3/library/inspect.html#inspect.getargspec
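For example, getfullargspec handles a keyword-only parameter that getargspec could not describe:

```python
import inspect

def handler(req, *, timeout=10):
    return req

# getargspec() choked on keyword-only parameters (and was later
# removed entirely); getfullargspec() reports them:
spec = inspect.getfullargspec(handler)
assert spec.args == ['req']
assert spec.kwonlyargs == ['timeout']
assert spec.kwonlydefaults == {'timeout': 10}
```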
Change-Id: I63a4fda4f5da00c0f752e58f2e7192baea5012bb
This patch makes the reconciler PPI aware. It does this by adding a
helper method `can_reconcile_policy` that is used to check that the
policies used for the source and destination aren't in the middle of a
PPI (their ring doesn't have next_part_power set).
In order to accomplish this, the reconciler has had to include the
POLICIES singleton and has grown swift_dir and ring_check_interval
config options.
Closes-Bug: #1934314
Change-Id: I78a94dd1be90913a7a75d90850ec5ef4a85be4db
During sharding a shard range is moved to CLEAVED state when cleaved
from its parent. However, during shrinking an acceptor shard should
not be moved to CLEAVED state when the shrinking shard cleaves to it,
because the shrinking shard is not the acceptor's parent and does not
know if the acceptor has yet been cleaved from its parent.
The existing attempt to prevent a shrinking shard updating its
acceptor state relied on comparing the acceptor namespace to the
shrinking shard namespace: if the acceptor namespace fully enclosed
the shrinking shard then it was inferred that shrinking was taking
place. That check is sufficient for normal shrinking of one shard into
an expanding acceptor, but is not sufficient when shrinking in order
to fix overlaps, when a shard might shrink into more than one
acceptor, none of which completely encloses the shrinking shard.
Fortunately, since [1], it is possible to determine that a shard is
shrinking from its own shard range state being either SHRINKING or
SHRUNK.
It is still advantageous to delete and merge the shrinking shard range
into the acceptor when the acceptor fully encloses the shrinking shard
because that increases the likelihood of the root being updated with
the deleted shard range in a timely manner.
[1] Related-Change: I9034a5715406b310c7282f1bec9625fe7acd57b6
Change-Id: I91110bc747323e757d8b63003ad3d38f915c1f35
The multipart document handling in the proxy is consumed via iteration,
but the error handling code is not consistent with how it applies
conversions of IO errors/timeouts and retry failures to StopIteration.
In an effort to make the code more obvious and easier to debug and
maintain I've added comments and additional tests as well as tightening
up StopIteration exception handling.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I0654815543be3df059eb2875d9b3669dbd97f5b4
Move DebugLogger and associated classes to its own module under test
so that it can be imported (for example in probe tests) without
requiring all the dependencies in test/unit/__init__.py.
Change-Id: I0ea3c26e54d91f27159805a45e49ad7f8f0e0431
The context manager protocol requires that __exit__ be called with three
args: type, value, and traceback. In some places, we didn't include any
args at all, leading to test failures during clean-up.
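A minimal illustration of the protocol:

```python
class Resource:
    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # the protocol always supplies three arguments
        return False   # do not suppress any exception

res = Resource()
res.__exit__(None, None, None)   # correct explicit clean-up call

try:
    res.__exit__()               # the bug: no arguments at all
    raised = False
except TypeError:
    raised = True
assert raised
```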
Change-Id: I2998830e6eac685b1f753937d12cf5346a4eb081
This patch makes four significant changes to the handling of GET
requests for sharding or sharded containers:
- container server GET requests may now result in the entire list of
shard ranges being returned for the 'listing' state regardless of
any request parameter constraints.
- the proxy server may cache that list of shard ranges in memcache
and the requests environ infocache dict, and subsequently use the
cached shard ranges when handling GET requests for the same
container.
- the proxy now caches more container metadata so that it can
synthesize a complete set of container GET response headers from
cache.
- the proxy server now enforces more container GET request validity
checks that were previously only enforced by the backend server,
e.g. checks for valid request parameter values.
With this change, when the proxy learns from container metadata
that the container is sharded then it will cache shard
ranges fetched from the backend during a container GET in memcache.
On subsequent container GETs the proxy will use the cached shard
ranges to gather object listings from shard containers, avoiding
further GET requests to the root container until the cached shard
ranges expire from cache.
Cached shard ranges are most useful if they cover the entire object
name space in the container. The proxy therefore uses a new
X-Backend-Override-Shard-Name-Filter header to instruct the container
server to ignore any request parameters that would constrain the
returned shard range listing i.e. 'marker', 'end_marker', 'includes'
and 'reverse' parameters. Having obtained the entire shard range
listing (either from the server or from cache) the proxy now applies
those request parameter constraints itself when constructing the
client response.
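Roughly, applying those request parameters against a cached listing might look like this (hypothetical helper; real shard ranges are objects with lower/upper attributes, not tuples):

```python
def filter_namespaces(namespaces, marker='', end_marker='', reverse=False):
    # each namespace is a (lower, upper] pair; '' as upper means unbounded
    kept = []
    for lower, upper in namespaces:
        if marker and upper and upper <= marker:
            continue            # entirely before the marker
        if end_marker and lower >= end_marker:
            continue            # entirely at/after the end_marker
        kept.append((lower, upper))
    return list(reversed(kept)) if reverse else kept

ranges = [('', 'c'), ('c', 'p'), ('p', '')]
assert filter_namespaces(ranges, marker='d') == [('c', 'p'), ('p', '')]
assert filter_namespaces(ranges, end_marker='c') == [('', 'c')]
```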
When using cached shard ranges the proxy will synthesize response
headers from the container metadata that is also in cache. To enable
the full set of container GET response headers to be synthesized in
this way, the set of metadata that the proxy caches when handling a
backend container GET response is expanded to include various
timestamps.
The X-Newest header may be used to disable looking up shard ranges
in cache.
Change-Id: I5fc696625d69d1ee9218ee2a508a1b9be6cf9685
When handling a GET response ProxyLoggingMiddleware will try to close
a reiterated [1] proxy response iterator if, for example, there is a
client disconnect.
The reiterate function encapsulates the result of calling iter() on
the proxy response. In the case of an SLO response, the iter method
returned an instance of itertools.chain, rather than the response
itself, which is an instance of SegmentedIterable. As a result the
SegmentedIterable.close() method would not be called and object server
connections would not be closed.
This patch replaces the itertools.chain with a CloseableChain which
encapsulates the SegmentedIterable and closes it when
CloseableChain.close() is called.
[1] The use of reiterate was introduced by the Related-Change.
Closes-Bug: #1909588
Related-Change: I27feabe923a6520e983637a9c68a19ec7174a0df
Change-Id: Ib7450a85692114973782525004466db49f63066d
md5 is not an approved algorithm in FIPS mode, and trying to
instantiate a hashlib.md5() will fail when the system is running in
FIPS mode.
md5 is allowed when in a non-security context. There is a plan to
add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate
whether or not the instance is being used in a security context.
In the case where it is not, the instantiation of md5 will be allowed.
See https://bugs.python.org/issue9216 for more details.
Some downstream python versions already support this parameter. To
support these versions, a new encapsulation of md5() is added to
swift/common/utils.py. This encapsulation is identical to the one being
added to oslo.utils, but is recreated here to avoid adding a dependency.
This patch is to replace the instances of hashlib.md5() with this new
encapsulation, adding an annotation indicating whether the usage is
a security context or not.
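A sketch of such an encapsulation (the real helper lives in swift/common/utils.py; this is an approximation):

```python
import hashlib

def md5(data=b'', usedforsecurity=True):
    # Pass the annotation through on interpreters that accept it;
    # fall back to plain hashlib.md5 on those that do not.
    try:
        return hashlib.md5(data, usedforsecurity=usedforsecurity)
    except TypeError:
        return hashlib.md5(data)

# e.g. an ETag calculation is not a security context
etag = md5(b'object body', usedforsecurity=False).hexdigest()
```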
While this patch seems large, it is really just the same change over and
again. Reviewers need to pay particular attention as to whether the
keyword parameter (usedforsecurity) is set correctly. Right now, all
of them appear to be not used in a security context.
Now that all the instances have been converted, we can update the bandit
run to look for these instances and ensure that new invocations do not
creep in.
With this latest patch, the functional and unit tests all pass
on a FIPS enabled system.
Co-Authored-By: Pete Zaitcev
Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87
We're getting some blockage trying to feed backup requests in waterfall
EC because the pool_size was limited to the initial batch of requests.
This was (un?)fortunately working out in practice because there were
lots of initial primary fragment requests and some would inevitably be
quick enough to make room for the pending feeder requests. But when
enough of the initial requests were slow (network issue at the proxy?)
we wouldn't have the expected number of pending backup requests
in-flight. Since concurrent EC should never make extra requests to
non-primaries (at least not until an existing primary request
completes) ec_n_unique_fragments makes a reasonable cap for the pool.
Drive-bys:
* Don't make concurrent_ec_extra_requests unless you have enabled
concurrent_gets.
* Improved mock_http_connect extra requests tracking formatting
* FakeStatus __repr__'s w/ status code in AssertionErrors
Change-Id: Iec579ed874ef097c659dc80fff1ba326b6da05e9
Otherwise, we miss out on transaction id and client IP information when
timeouts pop.
Closes-Bug: #1892421
Change-Id: I6dea3ccf780bcc703db8447a2ef13c33838ff12d
A new header `X-Backend-Use-Replication-Network` is added; if true, use
the replication network instead of the client-data-path network.
Several background daemons are updated to use the replication network:
* account-reaper
* container-reconciler
* container-sharder
* container-sync
* object-expirer
Note that if container-sync is being used to sync data within the same
cluster, the replication network will only be used when communicating
with the "source" container; the "destination" traffic will continue to
use the configured realm endpoint.
The direct and internal client APIs still default to using the
client-data-path network; this maintains backwards compatibility for
external tools written against them.
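Ring devices carry separate client and replication endpoints; choosing between them looks roughly like this sketch (not the actual Swift helper, though the `replication_ip`/`replication_port` device keys are real):

```python
def node_endpoint(node, use_replication=False):
    # Each ring device dict carries both networks; pick one based on
    # whether the caller opted into the replication network.
    if use_replication:
        return node['replication_ip'], node['replication_port']
    return node['ip'], node['port']

node = {'ip': '10.0.0.1', 'port': 6200,
        'replication_ip': '10.1.0.1', 'replication_port': 6200}
assert node_endpoint(node) == ('10.0.0.1', 6200)
assert node_endpoint(node, use_replication=True) == ('10.1.0.1', 6200)
```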
UpgradeImpact
=============
Until recently, servers configured with
replication_server = true
would only handle REPLICATE (and, in the case of object servers, SSYNC)
requests, and would respond 405 Method Not Allowed to other requests.
When upgrading from Swift 2.25.0 or earlier, remove the config option
and restart services prior to upgrade to avoid a flood of background
daemon errors in logs.
Note that some background daemons find work by querying Swift rather
than walking local drives, and so now need access to the replication
network:
* container-reconciler
* object-expirer
Previously these may have been configured without access to the
replication network; ensure they have access before upgrading.
Closes-Bug: #1883302
Related-Bug: #1446873
Related-Change: Ica2b41a52d11cb10c94fa8ad780a201318c4fc87
Change-Id: Ieef534bf5d5fb53602e875b51c15ef565882fbff
DaemonStrategy class calls Daemon.is_healthy() method every 0.1 seconds
to ensure that all workers are running as wanted.
On object replicator/reconstructor daemons, is_healthy() checks whether
the rings have changed to decide if workers must be created/killed. With
large rings, this operation can be CPU intensive, especially on low-end
CPUs.
This patch:
- increases the check interval to 5 seconds by default, because none of
these daemons are critical for performance (they are not in the datapath).
But it allows each daemon to change this value if necessary
- ensures that before doing a computation of all devices in the ring,
object replicator/reconstructor checks that the ring really changed
(by checking the mtime of the ring.gz files)
On an Atom N2800 processor, this patch reduced the CPU usage of the main
object replicator/reconstructor from 70% of a core to 0%.
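The mtime short-circuit can be sketched like this (hypothetical class; the real check lives in the replicator/reconstructor daemons):

```python
import os

class RingCheck:
    """Skip the expensive device comparison unless the ring file changed."""

    def __init__(self, ring_path):
        self.ring_path = ring_path
        self._mtime = None

    def ring_changed(self):
        mtime = os.path.getmtime(self.ring_path)
        if mtime == self._mtime:
            return False       # cheap path, taken on almost every check
        self._mtime = mtime
        return True            # only now run the CPU-heavy ring rescan
```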
Change-Id: I2867e2be539f325778e2f044a151fd0773a7c390
unittest2 was needed for Python version <= 2.6, so it hasn't been needed
for quite some time. See the unittest2 note at:
https://docs.python.org/2.7/library/unittest.html
This drops unittest2 in favor of the standard unittest module.
Change-Id: I2e787cfbf1709b7f9c889230a10c03689e032957
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
Since we don't use 404s from handoffs anymore, we need to not let errors
on handoffs overwhelm primary responses either
Change-Id: I2624e113c9d945542f787e5f18f487bd7be3d32e
Closes-Bug: #1857909
Mostly this amounts to
Exception.message -> Exception.args[0]
'...' -> b'...'
StringIO -> BytesIO
makefile() -> makefile('rwb')
iter.next() -> next(iter)
bytes[n] -> bytes[n:n + 1]
integer division
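A few of those conversions in miniature:

```python
# bytes indexing returns an int on py3; slice to keep the bytes type
b = b'abc'
assert b[1] == 98
assert b[1:2] == b'b'

# iter.next() -> next(iter)
it = iter([1, 2])
assert next(it) == 1

# '/' became true division; use '//' where an int is required
assert 7 // 2 == 3
```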
Note that the versioning tests are mostly untouched; they seemed to get
a little hairy.
Change-Id: I167b5375e7ed39d4abecf0653f84834ea7dac635
This started with ShardRanges and its CLI. The sharder is at the
bottom of the dependency chain. Even container backend needs it.
Once we started tinkering with the sharder, it all snowballed to
include the rest of the container services.
Beware, this does affect some of Python 2 code. Mostly it's trivial
and obviously correct, but needs checking by reviewers.
About killing the stray "from __future__ import unicode_literals":
we do not do it in general. The specific problem it caused was
a failure of functional tests because unicode leaked into a field
that was supposed to be encoded. It is just too hard to track the
types when rules change from file to file, so off with its head.
Change-Id: Iba4e65d0e46d8c1f5a91feb96c2c07f99ca7c666
Change the behavior of the EC reconstructor to perform a fragment
rebuild to a handoff node when a primary peer responds with 507 to the
REPLICATE request.
Each primary node in a EC ring will sync with exactly three primary
peers, in addition to the left & right nodes we now select a third node
from the far side of the ring. If any of these partners respond
unmounted the reconstructor will rebuild its fragments to a handoff
node with the appropriate index.
To prevent ssync (which is uninterruptible) receiving a 409 (Conflict)
we must give the remote handoff node the correct backend_index for the
fragments it will receive. In the common case we will use
deterministically different handoffs for each fragment index to prevent
multiple unmounted primary disks from forcing a single handoff node to
hold more than one rebuilt fragment.
Handoff nodes will continue to attempt to revert rebuilt handoff
fragments to the appropriate primary until it is remounted or
rebalanced. After a rebalance of EC rings (potentially removing
unmounted/failed devices), it's most IO efficient to run in
handoffs_only mode to avoid unnecessary rebuilds.
Closes-Bug: #1510342
Change-Id: Ief44ed39d97f65e4270bf73051da9a2dd0ddbaec
This would cause some weird issues where get_more_nodes() would actually
yield out something, despite us only having two drives.
Change-Id: Ibf658d69fce075c76c0870a542348f220376c87a