Commit Graph

78 Commits

Author SHA1 Message Date
Jianjian Huo a53270a15a swift-manage-shard-ranges repair: check for parent-child overlaps.
Stuck shard ranges have been seen in production; the root cause was
traced to s-m-s-r failing to detect parent-child relationships in
overlaps, so it either shrank child shard ranges into their parents or
the other way around. A patch has been added to check a minimum age
before s-m-s-r performs repairs, which will most likely prevent this
from happening again, but we also need to check explicitly for
parent-child relationships in overlaps during repairs. This patch does
that: it removes parent or child shard ranges from the donors and
prevents s-m-s-r from shrinking them into acceptor shard ranges.
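
Conceptually, the donor filtering amounts to something like this sketch
(filter_donors and is_parent_child are hypothetical names, not the
actual s-m-s-r code):

    def filter_donors(acceptor, donors, is_parent_child):
        # is_parent_child(a, b) is a hypothetical predicate that returns
        # True when one shard range is an ancestor of the other; the real
        # repair logic lives in swift-manage-shard-ranges.
        kept, excluded = [], []
        for donor in donors:
            if is_parent_child(acceptor, donor):
                excluded.append(donor)  # never shrink a parent/child into the acceptor
            else:
                kept.append(donor)
        return kept, excluded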

Drive-by 1: fixup gap repair probe test.
The probe test is no longer appropriate because we're no longer
allowed to repair parent-child overlaps, so replace the test with a
manually created gap.

Drive-by 2: address probe test TODOs.
The commented assertion would fail because the node filtering
comparison failed to account for the same node having different indexes
when generated for the root versus the shard. Adding a new iterable
function filter_nodes makes the node filtering behave as expected.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: Iaa89e94a2746ba939fb62449e24bdab9666d7bab
2022-09-09 11:04:43 -07:00
Pete Zaitcev 85d0211279 Get rid of port to node assumptions and their modulo kludges
We have had get_server_number for many years now; there's no reason
to continue using the node_id=port%100//10 thing.

Change-Id: I5357a095110e8e4889c0468c154611209c6e8c07
2021-09-30 00:42:24 -05:00
Pete Zaitcev bcff1282b5 Band-aid and test the crash of the account server
We have a better fix in the works; see the change
Ic53068867feb0c18c88ddbe029af83a970336545. But it is
taking too long to coalesce, and users are unhappy right now.

Related: rhbz#1838242, rhbz#1965348
Change-Id: I3f7bfc2877355b7cb433af77c4e2dfdfa94ff14d
2021-08-12 16:26:48 -05:00
Alistair Coles 40aace89f0 Capture logs when running custom daemons in probe tests
Previously a DebugLogger was used when probe tests ran 'custom
daemons', which provided a means to inspect captured logs but logged
to console. This patch replaces that DebugLogger with a 'normal' swift
logger that is adapted to also capture logs and support log
inspection.

Change-Id: I25da3aa81018c5de7b63e5584ac6a9dbb73243db
2021-06-24 09:32:38 +01:00
Alistair Coles 46ea3aeae8 Quarantine stale EC fragments after checking handoffs
If the reconstructor finds a fragment that appears to be stale then it
will now quarantine the fragment.  Fragments are considered stale if
insufficient fragments at the same timestamp can be found to rebuild
missing fragments, and the number found is less than or equal to a new
reconstructor 'quarantine_threshold' config option.

Before quarantining a fragment the reconstructor will attempt to fetch
fragments from handoff nodes in addition to the usual primary nodes.
The handoff requests are limited by a new 'request_node_count'
config option.

'quarantine_threshold' defaults to zero, i.e. no fragments will be
quarantined. 'request_node_count' defaults to '2 * replicas'.

Closes-Bug: 1655608

Change-Id: I08e1200291833dea3deba32cdb364baa99dc2816
2021-05-10 20:45:17 +01:00
Alistair Coles 122840cc04 probe test: use helper functions more widely
Use the recently added assert_subprocess_success [1] helper function
more widely.

Add run_custom_sharder helper.

Add container-sharder key to the ProbeTest.configs dict.

[1] Related-Change: I9ec411462e4aaf9f21aba6c5fd7698ff75a07de3

Change-Id: Ic2bc4efeba5ae5bc8881f0deaf4fd9e10213d3b7
2021-04-08 12:18:40 +01:00
Alistair Coles 6ed82b106c Run garbage collector during probe test setUp
DatabaseBrokers cache opened connections.  If a probe test
instantiates a DatabaseBroker, or any other class that in turn
instantiates a DatabaseBroker, such as a ContainerSharder, then
connections may hold db files open until the DatabaseBroker is garbage
collected. This can cause subsequent probe tests to fail during their
setUp() because resetswift is unable to unmount device directories
while db files are open.

A call to gc.collect() is added during setUp() to ensure db files are
closed before resetswift() is called.
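
A minimal sketch of the change (ExampleProbeTest is illustrative, not
the real ProbeTest class):

    import gc
    import unittest

    class ExampleProbeTest(unittest.TestCase):
        def setUp(self):
            # Collect any DatabaseBroker instances left over from a previous
            # test so their cached connections (and the db files they hold
            # open) are closed before resetswift tries to unmount devices.
            gc.collect()
            # ... resetswift() and the rest of the fixture would follow here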

Closes-Bug: 1917050
Change-Id: Ifda4407c9ecff4c636fe07e013c3ebcebd0df018
2021-02-26 15:51:06 +00:00
Alistair Coles 1dceafa7d5 ssync: sync non-durable fragments from handoffs
Previously, ssync would neither sync nor clean up non-durable data
fragments on handoffs. When the reconstructor is syncing objects from
a handoff node (a 'revert' reconstructor job) it may be useful, and is
not harmful, to also send non-durable fragments if the receiver has
older or no fragment data.

Several changes are made to enable this. On the sending side:

  - For handoff (revert) jobs, the reconstructor instantiates
    SsyncSender with a new 'include_non_durable' option.
  - If configured with the include_non_durable option, the SsyncSender
    calls the diskfile yield_hashes function with options that allow
    non-durable fragments to be yielded.
  - The diskfile yield_hashes function is enhanced to include a
    'durable' flag in the data structure yielded for each object.
  - The SsyncSender includes the 'durable' flag in the metadata sent
    during the missing_check exchange with the receiver.
  - If the receiver requests the non-durable object, the SsyncSender
    includes a new 'X-Backend-No-Commit' header when sending the PUT
    subrequest for the object.
  - The SsyncSender includes the non-durable object in the collection
    of synced objects returned to the reconstructor so that the
    non-durable fragment is removed from the handoff node.

On the receiving side:

  - The object server includes a new 'X-Backend-Accept-No-Commit'
    header in its response to SSYNC requests. This indicates to the
    sender that the receiver has been upgraded to understand the
    'X-Backend-No-Commit' header.
  - The SsyncReceiver is enhanced to consider non-durable data when
    determining if the sender's data is wanted or not.
  - The object server PUT method is enhanced to check for an
    'X-Backend-No-Commit' header before committing a diskfile.

If a handoff sender has both a durable and newer non-durable fragment
for the same object and frag-index, only the newer non-durable
fragment will be synced and removed on the first reconstructor
pass. The durable fragment will be synced and removed on the next
reconstructor pass.
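
On the receiving side, the effect of the new header amounts to
something like this sketch (finish_put and the writer object are
illustrative stand-ins; only the header name comes from this change):

    def finish_put(headers, writer):
        # 'writer' stands in for a diskfile writer with put()/commit()
        # methods; the X-Backend-No-Commit header name is from this
        # change, the rest is illustrative.
        writer.put()             # write the (non-durable) fragment data
        if headers.get('X-Backend-No-Commit', 'false').lower() not in ('true', '1'):
            writer.commit()      # mark the fragment durable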

Change-Id: I1d47b865e0a621f35d323bbed472a6cfd2a5971b
Closes-Bug: 1778002
2021-01-20 12:00:10 +00:00
Alistair Coles 128f199508 Refactor reconstructor probe tests
Refactor the reconstructor probe test to share common setup and helper
methods.

Change-Id: If75803648169f85b854c3d5d8784aaebbd93805b
2021-01-11 13:57:55 +00:00
Samuel Merritt b971280907 Let developers/operators add watchers to object audit
Swift operators may find it useful to operate on each object in their
cluster in some way. This commit provides them a way to hook into the
object auditor with a simple, clearly-defined boundary so that they
can iterate over their objects without additional disk IO.

For example, a cluster operator may want to ensure semantic
consistency, with all SLO segments accounted for in their manifests,
or to locate objects that aren't in container listings. Now that Swift
has encryption support, this could be used to locate unencrypted
objects. The list goes on.

This commit makes the auditor locate, via entry points, the watchers
named in its config file.

A watcher is a class with at least these four methods:

   __init__(self, conf, logger, **kwargs)

   start(self, audit_type, **kwargs)

   see_object(self, object_metadata, data_file_path, **kwargs)

   end(self, **kwargs)

The auditor will call watcher.start(audit_type) at the start of an
audit pass, watcher.see_object(...) for each object audited, and
watcher.end() at the end of an audit pass. All method arguments are
passed as keyword args.
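
For illustration, a minimal watcher that just counts objects could
look like this (ExampleWatcher is not part of the patch; only the
four-method interface above is):

    class ExampleWatcher(object):
        # The four-method interface matches the list above; the counting
        # logic is purely illustrative.
        def __init__(self, conf, logger, **kwargs):
            self.logger = logger
            self.seen = 0

        def start(self, audit_type, **kwargs):
            self.seen = 0
            self.logger.info('starting %s audit pass', audit_type)

        def see_object(self, object_metadata, data_file_path, **kwargs):
            self.seen += 1

        def end(self, **kwargs):
            self.logger.info('audit pass saw %d objects', self.seen)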

This version of the API is implemented in the context of the
auditor itself, without spawning any additional processes.
If the plugins are not working well -- hanging, crashing, or leaking --
it's easier to debug them when there's no additional complication
of processes that run by themselves.

In addition, we include a reference implementation of a plugin for
the watcher API as a help to plugin writers.

Change-Id: I1be1faec53b2cdfaabf927598f1460e23c206b0a
2020-12-26 17:16:14 -06:00
Ade Lee 5320ecbaf2 replace md5 with swift utils version
md5 is not an approved algorithm in FIPS mode, and trying to
instantiate a hashlib.md5() will fail when the system is running in
FIPS mode.

md5 is allowed when in a non-security context.  There is a plan to
add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate
whether or not the instance is being used in a security context.

In the case where it is not, the instantiation of md5 will be allowed.
See https://bugs.python.org/issue9216 for more details.

Some downstream python versions already support this parameter.  To
support these versions, a new encapsulation of md5() is added to
swift/common/utils.py.  This encapsulation is identical to the one being
added to oslo.utils, but is recreated here to avoid adding a dependency.
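
The wrapper has roughly this shape (a sketch, not the exact code added
to swift/common/utils.py):

    import hashlib

    def md5(string=b'', usedforsecurity=True):
        # Pass the usedforsecurity annotation through where hashlib
        # supports it, and fall back to plain hashlib.md5 otherwise.
        try:
            return hashlib.md5(string, usedforsecurity=usedforsecurity)  # nosec
        except TypeError:
            # this interpreter's hashlib.md5 has no usedforsecurity parameter
            return hashlib.md5(string)  # nosec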

This patch is to replace the instances of hashlib.md5() with this new
encapsulation, adding an annotation indicating whether the usage is
a security context or not.

While this patch seems large, it is really just the same change over
and over again.  Reviewers need to pay particular attention to whether
the keyword parameter (usedforsecurity) is set correctly.  Right now,
none of them appears to be used in a security context.

Now that all the instances have been converted, we can update the bandit
run to look for these instances and ensure that new invocations do not
creep in.

With this latest patch, the functional and unit tests all pass
on a FIPS enabled system.

Co-Authored-By: Pete Zaitcev
Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87
2020-12-15 09:52:55 -05:00
Tim Burke 5bd95cf2b7 probe tests: Get rid of `server` arg for device_dir() and storage_dir()
It's not actually *used* anywhere.

Change-Id: I8f9b5cf7f5749481ef391a2029b0c4263443a89b
2020-07-16 13:50:58 -07:00
Tim Burke 630c9ef809 probe tests: Work when fronted by a TLS terminator
* Add a new config option, proxy_base_url
* Support HTTPS as well as HTTP connections
* Monkey-patch eventlet early so we never import an unpatched version
  from swiftclient

Change-Id: I4945d512966d3666f2738058f15a916c65ad4a6b
2020-05-04 10:54:01 -07:00
Clay Gerrard 2759d5d51c New Object Versioning mode
This patch adds a new object versioning mode. This new mode provides
a new set of APIs for users to interact with older versions of an
object. It also changes the naming scheme of older versions and adds
a version-id to each object.

This new mode is not backwards compatible or interchangeable with the
other two modes (i.e., stack and history), especially due to the changes
in the naming scheme of older versions. This new mode will also serve
as a foundation for adding S3 versioning compatibility in the s3api
middleware.

Note that this does not (yet) support using a versioned container as
a source in container-sync. Container sync should be enhanced to sync
previous versions of objects.

Change-Id: Ic7d39ba425ca324eeb4543a2ce8d03428e2225a1
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Thiago da Silva <thiagodasilva@gmail.com>
2020-01-24 17:39:56 -08:00
Clay Gerrard 698717d886 Allow internal clients to use reserved namespace
Reserve the namespace starting with the NULL byte for internal
use-cases.  Backend services will allow path names to include the NULL
byte in urls and validate names in the reserved namespace.  Database
services will filter all names starting with the NULL byte from
responses unless the request includes the header:

    X-Backend-Allow-Reserved-Names: true

The proxy server will not allow path names to include the NULL byte in
urls unless a middleware has set the X-Backend-Allow-Reserved-Names
header.  Middlewares can use the reserved namespace to create objects
and containers that cannot be directly manipulated by clients.  Any
objects and bytes created in the reserved namespace will be aggregated
to the user's account totals.

When deploying internal proxies, developers and operators may configure
the gatekeeper middleware to translate the X-Allow-Reserved-Names header
to the Backend header so they can manipulate the reserved namespace
directly through the normal API.
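
The listing-side behaviour amounts to something like this sketch
(filter_listing is an illustrative name, not the actual database-server
code):

    RESERVED = '\x00'   # reserved-namespace names start with a NULL byte

    def filter_listing(names, req_headers):
        # Hide reserved-namespace entries from listings unless the request
        # carries the backend override header.
        if req_headers.get('X-Backend-Allow-Reserved-Names', '').lower() == 'true':
            return list(names)
        return [name for name in names if not name.startswith(RESERVED)]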

UpgradeImpact: it's not safe to rollback from this change

Change-Id: If912f71d8b0d03369680374e8233da85d8d38f85
2019-11-27 11:22:00 -06:00
Tim Burke 1d7e1558b3 py3: (mostly) port probe tests
There's still one problem, though: since swiftclient on py3 doesn't
support non-ASCII characters in metadata names, none of the tests in
TestReconstructorRebuildUTF8 will pass.

Change-Id: I4ec879ade534e09c3a625414d8aa1f16fd600fa4
2019-09-04 10:17:45 -07:00
Clay Gerrard 771963c926 Increase node_timeout in gate
Give storage nodes more time to complete requests for multi-node upgrade
and probetests.

Also slightly decouple probetests from default configs.

Change-Id: I334ef517d833916a3b7be3151a812d4f9c66a6e1
2019-02-12 10:39:17 -06:00
Alistair Coles 9d742b85ad Refactoring, test infrastructure changes and cleanup
...in preparation for the container sharding feature.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I4455677abb114a645cff93cd41b394d227e805de
2018-05-15 18:18:25 +01:00
Alistair Coles 1f4ebbc990 kill orphans during probe test setup
orphan processes sometimes cause probe test failures, so
get rid of them before each test.

Change-Id: I4ba6748d30fbb28371f13aa95387c49bc8223402
2018-02-08 16:43:18 -08:00
Clay Gerrard 1d5cf3e730 add symlink to probetest for reconciler
Change-Id: Ib2c5616f2965ab92b1c76d573e869206c91464c6
2017-12-14 12:16:39 -08:00
Steve Kowalik 5a06e3da3b No longer import nose
Since Python 2.7, unittest in the standard library has included multiple
facilities for skipping tests, via decorators as well as an exception.
Switch to using those directly, rather than importing nose.
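
For reference, the standard-library facilities now used directly look
like this (ExampleTest is illustrative):

    import sys
    import unittest

    class ExampleTest(unittest.TestCase):
        @unittest.skipIf(sys.version_info < (2, 7), 'needs unittest skip support')
        def test_skipped_by_decorator(self):
            pass

        def test_skipped_by_exception(self):
            raise unittest.SkipTest('skipping from inside the test body')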

Change-Id: I4009033473ea24f0d0faed3670db844f40051f30
2017-11-07 15:39:25 +11:00
Clay Gerrard feee399840 Use check_drive consistently
We added check_drive to the account/container servers to unify how all
the storage wsgi servers treat device dirs/mounts.  This pushes that
unification down into the consistency engine.

Drive-by:
 * use FakeLogger less
 * clean up some repetition in the probe utility for device re-"mounting"

Related-Change-Id: I3362a6ebff423016bb367b4b6b322bb41ae08764
Change-Id: I941ffbc568ebfa5964d49964dc20c382a5e2ec2a
2017-11-01 16:33:40 +00:00
Kota Tsuyuzaki 1e79f828ad Remove all post_as_copy related code and configs
It was deprecated, and we discussed this topic at the Denver PTG
for the Queens cycle. The main motivation for this work is that the
deprecated post_as_copy option and its gate block future symlink work.

Change-Id: I411893db1565864ed5beb6ae75c38b982a574476
2017-09-16 05:50:41 +00:00
Thiago da Silva d0bfd036af ready yet? nope, please wait!
Related-Change: Iab923c4f48ac7a5dd41237761ed91d01a59dc77c

Change-Id: Id4e17569e9ec856663e1539eaf72872296698367
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2017-07-25 17:12:11 -04:00
Tim Burke 675145ef4a Remove deprecated vm_test_mode option
This was deprecated in the 2.5.0 release (i.e. Liberty cycle), and we've
been warning about it ever since. A year and a half seems like a long
enough time.

Change-Id: I5688e8f7dedb534071e67d799252bf0b2ccdd9b6
Related-Change: Iad91df50dadbe96c921181797799b4444323ce2e
2017-05-25 13:02:42 -07:00
Clay Gerrard d062af836c DRY out probe.common
Specifically, to facilitate reuse of the retry check server
function to fill in the creds for the test2 account, which is required
for probetests after the related change.

Change-Id: I9729faa4c8c8d6d65a481bc2ea3f0566d511034c
Related-Change: I8d503419b7996721a671ed6b2795224775a7d8c6
2016-09-14 10:12:38 -07:00
Alistair Coles f679ed0cc8 Make container sync copy SLO manifests
Currently the container sync daemon fails to copy
an SLO manifest, and the error will stall progress
of the sync process on that container. There are
several reasons why the sync of an SLO manifest
may fail:

1. The GET of the manifest from the source
   container returns an X-Static-Large-Object header
   that is not allowed to be included with a PUT
   to the destination container.

2. The format of the manifest object that is read
   from the source is not in the syntax required
   for an SLO manifest PUT.

3. Assuming 2 were fixed, the PUT of the manifest
   includes an ETag header which will not match the
   md5 of the manifest generated by the receiving
   proxy's SLO middleware.

4. If the manifest is being synced to a different
   account and/or cluster, then the SLO segments may
   not have been synced and so the validation of the
   PUT manifest will fail.

This patch addresses all of these obstacles by
enabling the destination container-sync middleware to
cause the SLO middleware to be bypassed by setting a
swift.slo_override flag in the request environ. This
flag is only set for requests that have been validated
as originating from a container sync peer.

This is justified by noting that an SLO manifest PUT from
a container sync peer can be assumed to have valid syntax
because it has already been validated when written to
the source container.

Furthermore, we must allow SLO manifests to be synced
without requiring the semantic of their content to be
re-validated because we have no way to enforce or check
that segments have been synced prior to the manifest, nor
to check that the semantic of the manifest is still valid
at the source.

This does mean that GETs to synced SLO manifests may fail
if segments have not been synced. This is however
consistent with the expectation for synced DLO manifests
and indeed for the source SLO manifest if segments have
been deleted since it was written.
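
The bypass amounts to something like this sketch (the helper callables
are illustrative; only the swift.slo_override environ key is from this
change):

    def handle_manifest_put(env, validate_and_put, passthrough_put):
        # Skip SLO validation when the container-sync middleware has
        # flagged the request as coming from a validated sync peer.
        if env.get('swift.slo_override'):
            return passthrough_put(env)
        return validate_and_put(env)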

Co-Authored-By: Oshrit Feder <oshritf@il.ibm.com>
Change-Id: I8d503419b7996721a671ed6b2795224775a7d8c6
Closes-Bug: #1605597
2016-09-14 13:32:00 +01:00
Kota Tsuyuzaki 95a5a4a7ec Don't run probe tests if resetswift failed
The probe tests clean up the swift environment for each test in the setUp
method. However, the probe tests will run even if we cannot use the
resetswift script for some reason (e.g. not permitted, or the script not
found), and will then probably fail only after a long execution time.

To prevent such an unfortunate situation and to make the reason easy to
find, this patch adds an exit code check for "resetswift": if it fails,
the test raises AssertionError with the captured stdout and stderr.
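
A sketch of the kind of check added (details here are illustrative):

    import subprocess

    def resetswift():
        # Fail fast, with output attached, if the resetswift script
        # exits non-zero.
        proc = subprocess.Popen(['resetswift'], stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE)
        stdout, stderr = proc.communicate()
        if proc.returncode:
            raise AssertionError(
                'resetswift failed with %s: stdout=%r stderr=%r'
                % (proc.returncode, stdout, stderr))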

Closes-Bug: #1613494

Change-Id: Id80d56ab6b71402ead4fe22c120064d78c1e74ac
2016-08-16 18:02:58 -07:00
Tim Burke 6b0e9a3e24 Remove unused (but defaulted) args
Every time we call start_server, check is True.
Every time we call check_server, we use the default timeout.

Change-Id: Id38182f15bcbfbb145b57cee179a8fd47ec8e2b7
2016-06-02 16:49:32 +00:00
Kota Tsuyuzaki e56a1a550a pids in probe is no longer used
Change-Id: I1fd76004257a8c05ce8bb1f3ca0e45000509f833
2016-06-01 23:53:35 -07:00
Jenkins 2a0935e9e3 Merge "Send correct size in POST async update for EC object" 2016-06-01 22:15:31 +00:00
Tim Burke a821dd42de Don't include holes when reporting how many devices a ring has
Change-Id: I9b933051aec009c6108ee9d2dd5c0978772bf699
2016-05-26 13:42:12 -07:00
Alistair Coles c1b1a5a0ee Send correct size in POST async update for EC object
When a PUT request is made to an EC object the resulting container
update must include the override values for the actual object
etag and size, as opposed to the fragment etag and size. When a POST
request is made the same override values should be included in the
container update, but currently the update includes the incorrect EC
fragment size (but the correct body etag).

This is ok so long as the update for the object PUT request arrives at
the container server first (whether by direct update or replication)
because the etag and size values in an update due to an object POST
will not have a newer timestamp than the PUT and will therefore be
ignored at the container server.

However, if the update due to the object PUT request has not arrived
at the container server when the update due to the object POST
arrives, then the etag and incorrect size sent with the POST update
will be recorded in the container server. If the update due to the PUT
subsequently arrives it will not fix this error because the timestamp
of its etag and size values is not greater than that of the already
recorded values.

Fortunately the correct object body size is persisted with the object
as X-Backend-Container-Update-Override-Size sysmeta so this patch
fixes the container update due to a POST to use that value instead of
the Content-Length metadata.
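
A sketch of the size selection described above (container_update_size
is an illustrative helper, not the actual code path):

    def container_update_size(metadata):
        # Prefer the whole-object size persisted as sysmeta over the EC
        # fragment's Content-Length when building the container update
        # for a POST.
        override = metadata.get('X-Backend-Container-Update-Override-Size')
        return override if override is not None else metadata['Content-Length']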

Closes-Bug: #1582723
Change-Id: Ide7c9c59eb41aa09eaced2acfd0700f882c6eab1
2016-05-17 15:00:21 +01:00
Alistair Coles e91de49d68 Update container on fast-POST
This patch makes a number of changes to enable content-type
metadata to be updated when using the fast-POST mode of
operation, as proposed in the associated spec [1].

* the object server and diskfile are modified to allow
  content-type to be updated by a POST and the updated value
  to be stored in .meta files.

* the object server accepts PUTs and DELETEs with older
  timestamps than existing .meta files. This is to be
  consistent with replication that will leave a later .meta
  file in place when replicating a .data file.

* the diskfile interface is modified to provide accessor
  methods for the content-type and its timestamp.

* the naming of .meta files is modified to encode two
  timestamps when the .meta file contains a content-type value
  that was set prior to the latest metadata update; this
  enables consistency to be achieved when rsync is used for
  replication.

* ssync is modified to sync meta files when content-type
  differs between local and remote copies of objects.

* the object server issues container updates when handling
  POST requests, notifying the container server of the current
  immutable metadata (etag, size, hash, swift_bytes),
  content-type with their respective timestamps, and the
  mutable metadata timestamp.

* the container server maintains the most recently reported
  values for immutable metadata, content-type and mutable
  metadata, each with their respective timestamps, in a single
  db row.

* new probe tests verify that replication achieves eventual
  consistency of containers and objects after discrete updates
  to content-type and mutable metadata, and that container-sync
  syncs objects after fast-post updates.

[1] spec change-id: I60688efc3df692d3a39557114dca8c5490f7837e

Change-Id: Ia597cd460bb5fd40aa92e886e3e18a7542603d01
2016-03-03 14:25:10 +00:00
Christian Schwede c30ceec6f1 Fix ring device checks in probetests
If a device has been removed from one of the rings, it is actually set to None
within the ring. In that case the device count is only correct if the None
devices are filtered out. However, if the count matched the condition but
included a removed device, the probetests would fail with a TypeError.

This fix could also be made in swift/common/ring/ring.py, but it seems to
affect only probetests right now, so it is fixed there without changing the
current behavior.
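
The probe-test check boils down to something like this sketch
(present_devices is an illustrative helper, assuming the ring exposes
its device list as ring.devs):

    def present_devices(ring):
        # Removed devices show up as None in the ring's device list, so
        # filter them out before counting or indexing.
        return [dev for dev in ring.devs if dev is not None]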

Change-Id: I8ccf9b32a51957e040dd370bc9f711d4328d17b1
2015-10-07 19:59:15 +00:00
Romain LE DISEZ 71f6fd025e Allow configuring the rsync modules where the replicators will send data
Currently, the rsync module where the replicators send data is static. It
prevents administrators from setting the rsync configuration based on their
current deployment or needs.

As an example, the rsyncd configuration example encourages setting a
connection limit for the account, container and object modules. This
protects devices from excessive parallel connections, which would hurt
performance.

On a server with many devices, it is tempting to increase this number
proportionally, but nothing guarantees that the distribution of the
connections will be balanced. In the worst scenario, a single device can
receive all the connections, which severely impacts performance.

This commit adds a new option named 'rsync_module' to the *-replicator sections
of the *-server configuration file. This configuration variable can be
extrapolated with device attributes like ip, port, device, zone, ... by using
the format {NAME}. eg:
    rsync_module = {replication_ip}::object_{device}

With this configuration, an administrator can solve the problem of
connection distribution by creating one module per device in the rsyncd
configuration.
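
Conceptually, the extrapolation is a plain template substitution over
the device dict; a sketch (expand_rsync_module is an illustrative name,
not the actual helper):

    def expand_rsync_module(template, device):
        # Substitute device attributes into the rsync_module template,
        # e.g. '{replication_ip}::object_{device}' with
        # {'replication_ip': '10.0.0.1', 'device': 'sdb1'}
        # yields '10.0.0.1::object_sdb1'.
        return template.format(**device)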

The default values are backward compatible:
    {replication_ip}::account
    {replication_ip}::container
    {replication_ip}::object

Option vm_test_mode is deprecated by this commit, but backward compatibility is
maintained. The option is only effective when rsync_module is not set. In that
case, {replication_port} is appended to the default value of rsync_module.

Change-Id: Iad91df50dadbe96c921181797799b4444323ce2e
2015-09-07 08:00:18 +02:00
paul luse 893f30c61d EC GET path: require fragments to be of same set
And if they are not, exhaust the node iter to go get more.  The
problem without this implementation is a simple overwrite where a GET
arrives before the handoff has put the newer object back on the 'alive
again' node, so the proxy gets n-1 fragments of the newest set and 1
of the older.

This patch bucketizes the fragments by etag and, if it doesn't have
enough, continues to exhaust the node iterator until it has a large
enough matching set.
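
A sketch of the bucketing idea (usable_fragment_set is an illustrative
helper; ec_ndata stands for the number of fragments needed to decode):

    from collections import defaultdict

    def usable_fragment_set(responses, ec_ndata):
        # Bucket fragment responses by etag and only return a set large
        # enough to decode; the caller keeps exhausting the node iterator
        # until some bucket reaches ec_ndata fragments.
        buckets = defaultdict(list)
        for resp in responses:
            buckets[resp['etag']].append(resp)
        for frags in buckets.values():
            if len(frags) >= ec_ndata:
                return frags
        return None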

Change-Id: Ib710a133ce1be278365067fd0d6610d80f1f7372
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Closes-Bug: 1457691
2015-08-27 21:09:41 -07:00
janonymous 923238aa1b test/(functional/probe):Replace python print operator with print function (pep H233, py33)
The 'print' function is compatible with both 2.x and 3.x python versions.
Link : https://www.python.org/dev/peps/pep-3105/

Python 2.6 has a __future__ import that removes print as language syntax,
letting you use the functional form instead

Change-Id: I416c6ac21ccbfb91ec328ffb1ed21e492ef52d58
2015-08-20 11:42:58 +09:00
Clay Gerrard 768d7ab074 Add a probetest for HUP/reload
This would have been enough to catch the regression, and we can extend
them as we work on any future enhancements to our process management.

Change-Id: I9a1b57aa15663380c45cf783afc8212ab4ffbace
2015-07-30 15:49:23 -07:00
Victor Stinner e24d7c36fa Use six to fix imports on Python 3
Get configparser, queue, http_client modules from six.moves.

Patch generated by the six_moves operation of the sixer tool:
https://pypi.python.org/pypi/sixer

Change-Id: I666241ab50101b8cc6f992dd80134ce27327bd7d
2015-07-24 11:48:28 +02:00
Victor Stinner e70b66586e Replace dict.iteritems() with dict.items()
The iteritems() of Python 2 dictionaries has been renamed to items() on
Python 3. According to a discussion on the openstack-dev mailing list,
the overhead of creating a temporary list using dict.items() on Python 2
is very low because most dictionaries are small:

http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html

Patch generated by the following command:

    sed -i 's,iteritems,items,g' \
      $(find swift -name "*.py") \
      $(find test -name "*.py")

Change-Id: I6070bb6c684be76e8e77222a7d280ec6edd43496
2015-06-24 09:39:55 +02:00
Darrell Bishop df134df901 Allow 1+ object-servers-per-disk deployment
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs.  The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring.  In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket.  The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk.  The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring.  The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).

Immediately stopping all swift-object-worker processes still works by
sending the parent a SIGTERM.  Likewise, a SIGHUP to the parent process
still causes the parent process to close all listen sockets and exit,
allowing existing children to finish serving their existing requests.
The drop_privileges helper function now has an optional param to
suppress the setsid() call, which otherwise screws up the child workers'
process management.

The class method RingData.load() can be told to only load the ring
metadata (i.e. everything except replica2part2dev_id) with the optional
kwarg, header_only=True.  This is used to keep the parent and all
forked off workers from unnecessarily having full copies of all storage
policy rings in memory.

A new helper class, swift.common.storage_policy.BindPortsCache,
provides a method to return a set of all device ports in all rings for
the server on which it is instantiated (identified by its set of IP
addresses).  The BindPortsCache instance will track mtimes of ring
files, so they are not opened more frequently than necessary.
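
The port collection amounts to something like this sketch
(local_bind_ports is an illustrative stand-in for BindPortsCache,
without the ring-file mtime tracking):

    def local_bind_ports(rings, my_ips):
        # Collect every port that any ring assigns to a device on this
        # server's IP addresses.
        ports = set()
        for ring in rings:
            for dev in ring.devs:
                if dev and dev['ip'] in my_ips:
                    ports.add(dev['port'])
        return ports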

This patch includes enhancements to the probe tests and
object-replicator/object-reconstructor config plumbing to allow the
probe tests to work correctly both in the "normal" config (same IP but
unique ports for each SAIO "server") and a server-per-port setup where
each SAIO "server" must have a unique IP address and unique port per
disk within each "server".  The main probe tests only work with 4
servers and 4 disks, but you can see the difference in the rings for the
EC probe tests where there are 2 disks per server for a total of 8
disks.  Specifically, swift.common.ring.utils.is_local_device() will
ignore the ports when the "my_port" argument is None.  Then,
object-replicator and object-reconstructor both set self.bind_port to
None if server_per_port is enabled.  Bonus improvement for IPv6
addresses in is_local_device().

This PR for vagrant-swift-all-in-one will aid in testing this patch:
https://github.com/swiftstack/vagrant-swift-all-in-one/pull/16/

Also allow SAIO to answer is_local_device() better; common SAIO setups
have multiple "servers" all on the same host with different ports for
the different "servers" (which happen to match the IPs specified in the
rings for the devices on each of those "servers").

However, you can configure the SAIO to have different localhost IP
addresses (e.g. 127.0.0.1, 127.0.0.2, etc.) in the ring and in the
servers' config files' bind_ip setting.

This new whataremyips() implementation combined with a little plumbing
allows is_local_device() to accurately answer, even on an SAIO.

In the default case (an unspecified bind_ip defaults to '0.0.0.0') as
well as an explict "bind to everything" like '0.0.0.0' or '::',
whataremyips() behaves as it always has, returning all IP addresses for
the server.

Also updated probe tests to handle each "server" in the SAIO having a
unique IP address.

For some (noisy) benchmarks that show servers_per_port=X is at least as
good as the same number of "normal" workers:
https://gist.github.com/dbishop/c214f89ca708a6b1624a#file-summary-md

Benchmarks showing the benefits of I/O isolation with a small number of
slow disks:
https://gist.github.com/dbishop/fd0ab067babdecfb07ca#file-results-md

If you were wondering what the overhead of threads_per_disk looks like:
https://gist.github.com/dbishop/1d14755fedc86a161718#file-tabular_results-md

DocImpact

Change-Id: I2239a4000b41a7e7cc53465ce794af49d44796c6
2015-06-18 12:43:50 -07:00
Clay Gerrard a3559edc23 Exclude local_dev from sync partners on failure
If the primary left or right hand partners are down, the next best thing
is to validate the rest of the primary nodes, where 'the rest' should
exclude not just the left and right hand partners but ourself as well.

This fixes an accidental noop when a partner node is unavailable and
another node is missing data.

Validation:

Add probetests to cover ssync failures for the primary sync_to nodes for
sync jobs.

Drive-by:

Make additional plumbing for the check_mount and check_dir constraints into
the remaining daemons.

Change-Id: I4d1c047106c242bca85c94b569d98fd59bb255f4
2015-05-26 12:50:31 -07:00
Clay Gerrard 52b102163e Don't apply the wrong Etag validation to rebuilt fragments
Because of the object-server's interaction with the ssync sender's
X-Backend-Replication-Headers, when an object (or fragment archive) is
pushed unmodified to another node its ETag value is duped into the
receiving end's metadata as Etag.  This interacts poorly with the
reconstructor's RebuildingECDiskFileStream, which cannot know ahead of
time the ETag of the fragment archive being rebuilt.

Don't send the Etag from the local source fragment archive being used as
the basis for the rebuilt fragment archive's metadata along to ssync.

Change-Id: Ie59ad93a67a7f439c9a84cd9cff31540f97f334a
2015-04-15 23:33:32 +01:00
paul luse 647b66a2ce Erasure Code Reconstructor
This patch adds the erasure code reconstructor. It follows the
design of the replicator but:
  - There is no notion of update() or update_deleted().
  - There is a single job processor
  - Jobs are processed partition by partition.
  - At the end of processing a rebalanced or handoff partition, the
    reconstructor will remove successfully reverted objects if any.

It also includes various ssync changes, such as the addition of a
reconstruct_fa() function, called from ssync_sender, which performs the
actual reconstruction while sending the object to the receiver.

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
blueprint ec-reconstructor
Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51
2015-04-14 00:52:17 -07:00
Martin Kletzander 76b106fc01 Fix common misspellings
Wikipedia's list of common misspellings [1] has a machine-readable
version.  This patch fixes those misspellings mentioned in the list
which don't have multiple right variants (as e.g. "accension", which can
be both "accession" and "ascension"), such misspellings are left
untouched.  The list of changes was manually re-checked for false
positives.

[1] https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines

Change-Id: Ic9a5438629664f7cea216413a28acc0e8992da05
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
2015-03-24 11:07:56 +01:00
Leah Klearman ca0fce8542 more probe test refactoring
* move get_to_final_state into ProbeTest
* get rid of kill_servers
* add replicators manager and updaters manager to ProbeTest

(this is all going someplace, i promise)

Change-Id: I8393a2ebc0d04051cae48cc3c49580f70818dbf2
2015-02-13 16:55:45 -08:00
Leah Klearman 2c1b5af062 refactor probe tests
* refactor probe tests to use probe.common.ProbeTest
* move reset_environment functionality to ProbeTest.setUp()
* choose rings and policies that meet the criteria - raise SkipTest if
nothing matches
* replace all AssertionErrors in setup with SkipTest

Change-Id: Id56c497d58083f5fd55f5283cdd346840df039d3
2015-02-12 11:30:21 -08:00
Alistair Coles 22b65846aa Make probe tests tolerate deprecated policies
A deprecated policy in swift.conf causes errors in
probe tests that may attempt to use that policy.

This patch introduces a list ENABLED_POLICIES in
test/probe/common.py and changes probe tests to only
use policies contained in that list.

Change-Id: Ie65477c15d631fcfc3a4a5772fbe6d7d171b22b0
2014-09-09 13:09:37 +01:00
Yuan Zhou ad2a9cefe5 Fixes probe tests with non-zero default storage policy
Add headers param to direct_client.direct_get_object, which is used in
probetests to pass through the X-Storage-Policy-Index header.

DocImpact
Implements: blueprint storage-policies
Change-Id: I19adbbcefbc086c8467bd904a275d55cde596412
2014-06-18 21:09:53 -07:00