* Get rid of a bunch of accidental blockquote formatting
* Always declare a lexer to use for ``.. code::`` blocks
Change-Id: I8940e75b094843e542e815dde6b6be4740751813
Previously, if the logger was enabled before entering the context
manager, it would be disabled upon exiting.
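The fix can be sketched as a state-restoring context manager (a minimal illustration, not the actual Swift utility; names are made up):

```python
import contextlib
import logging

# Remember the logger's prior state so exiting the context restores it,
# instead of unconditionally re-enabling (the old bug).
@contextlib.contextmanager
def quiet_logger(logger):
    was_disabled = logger.disabled
    logger.disabled = True
    try:
        yield
    finally:
        logger.disabled = was_disabled

log = logging.getLogger('demo')
log.disabled = True            # disabled *before* entering the context
with quiet_logger(log):
    pass
assert log.disabled            # still disabled afterwards, not reset
```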
Change-Id: I19e03309c2c1e3ddafb1eee6f3a4ead0267c26ca
Note that existing SAIOs with 60xx ports should still work fine.
Change-Id: If5dd79f926fa51a58b3a732b212b484a7e9f00db
Related-Change: Ie1c778b159792c8e259e2a54cb86051686ac9d18
They effectively already *were*, but if you used the RingBuilder API
directly (rather than the CLI) you could previously write down builders
that would hit KeyErrors on load.
Change-Id: I1de895d4571f7464be920345881789d47659729f
An odd thing happened: when my code did only
"from swift.common.ring import Ring", and nothing else,
the interpreter looped on the CPU.
The import from the top of the ring module is our standard
protocol for Ring. It causes no problem in places like
container updater or list_endpoints.py. It is a big
mystery why it causes Python 3.7.3 to loop, and only
in specific circumstances.
But we noticed that the recursive import is unnecessary
in this case, so an obviously harmless fix exists.
Change-Id: I7373bbb0a50d090d6becf141e5832f8ae98381a4
Currently one has to remove and re-add devices when the region or zone needs to
be changed. Adding the subcommands set_region and set_zone simplifies this:
now it is possible to change the region and/or zone easily. Note that there is no
change to the required rebalancing; it is likely that data still needs to be
moved within the cluster.
This is mostly a copy-and-paste of the existing set_weight subcommand, with
tests adapted accordingly. Some duplicated code in the tests has been aggregated as
well.
Change-Id: I37dd8e8ac24e2b0bb196758449a1d91a769b2e96
Plumb the version from the ringbuilder through to the metadata at the
start of the ring. Recover this (if available) when running
swift-ring-builder <ring> write_builder
When we load the ring, track the count and MD5 of the bytes off disk, as
well as the number of uncompressed bytes.
Expose all this new information as properties on the Ring, along with
- device_count (number of non-None entries in self._devs),
- weighted_device_count (number of devices that have weight), and
- assigned_device_count (number of devices that actually have
partition assignments).
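The three counts might be derived roughly like this (a standalone sketch with made-up device data; the real properties live on swift.common.ring.Ring):

```python
# Hypothetical device list: removed devices leave None holes, and a
# device may exist with zero weight or with no partition assignments.
devs = [
    {'id': 0, 'weight': 100.0},
    {'id': 1, 'weight': 0.0},
    None,                        # removed device
    {'id': 3, 'weight': 50.0},
]
# parts assigned per device id, e.g. from the replica2part2dev table
assignments = {0: 512, 3: 512}

device_count = sum(1 for d in devs if d is not None)
weighted_device_count = sum(
    1 for d in devs if d is not None and d['weight'] > 0)
assigned_device_count = sum(
    1 for d in devs if d is not None and assignments.get(d['id'], 0) > 0)

assert (device_count, weighted_device_count, assigned_device_count) == (3, 2, 2)
```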
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Change-Id: I73deaf6f1d9c1d37630c37c02c597b8812592351
Python 2.6 didn't have one, so we'd try to find logging.NullHandler
but fall back to our own. Since 2.7+ has logging.NullHandler, we can
just use it.
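The change boils down to dropping a py26-era fallback (sketch of the two patterns, not the exact Swift code):

```python
import logging

# Old pattern, needed when Python 2.6 lacked logging.NullHandler:
try:
    NullHandler = logging.NullHandler
except AttributeError:           # only reachable on py26
    class NullHandler(logging.Handler):
        def emit(self, record):
            pass

# New pattern on Python 2.7+: use the stdlib class directly.
handler = logging.NullHandler()
logging.getLogger('swift.demo').addHandler(handler)
assert isinstance(handler, logging.NullHandler)
```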
Change-Id: Ie2c27407efc2882e698abe6e4379a00a1d3f4301
I can't imagine us *not* having a py3 proxy server at some point, and
that proxy server is going to need a ring.
While we're at it (and since they were so close anyway), port
* cli/ringbuilder.py
* common/linkat.py
* common/daemon.py
Change-Id: Iec8d97e0ce925614a86b516c4c6ed82809d0ba9b
... to make testing more targeted and obvious
Related-Change-Id: I89439286b211f2c5ef19deffa77c202f48f07cf8
Change-Id: I93b99128a4fb35395e8e9caf11649e216f824fdf
There are three similar code segments in the rebalance process, as follows:

    tiers = ['cluster', 'regions', 'zones', 'servers', 'devices']
    for i, tier_name in enumerate(tiers):
        replicas_at_tier = sum(weighted_replicas_by_tier[t] for t in
                               weighted_replicas_by_tier if len(t) == i)
        if abs(self.replicas - replicas_at_tier) > 1e-10:
            raise exceptions.RingValidationError(
                '%s != %s at tier %s' % (
                    replicas_at_tier, self.replicas, tier_name))
I think we can encapsulate this code segment in a private function and
replace those code segments with a function call.
Change-Id: I89439286b211f2c5ef19deffa77c202f48f07cf8
These were likely written before the first related change, or created
from an existing ring file.
Also, tolerate missing dispersion when rebalancing -- that may not exist
in the builder file.
Change-Id: I26e3b4429c747c23206e4671f7c86543bb182a15
Related-Change: Ib165cf974c865d47c2d9e8f7b3641971d2e9f404
Related-Change: Ie239b958fc7e0547ffda2bebf61546bd4ef3d829
Related-Change: I551fcaf274876861feb12848749590f220842d68
Since dispersion info is cached, this can easily happen if we make
changes to how dispersion info is calculated or stored (e.g. we extend
the dispersion calculation to consider dispersion of all part-replicas
in the related change).
Related-Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686
Change-Id: I714deb9e349cd114a21ec591216a9496aaf9e0d1
With a sufficiently undispersed ring it's possible to move an entire
replica's worth of parts and yet the value of dispersion may not get any
better (even though in reality dispersion has dramatically improved).
The problem is that dispersion currently only represents up to one whole
replica's worth of parts being undispersed.
However, with EC rings it's possible for more than one whole replica's
worth of partitions to be undispersed; in these cases the builder will
require multiple rebalance operations to fully disperse replicas - but
the dispersion value should improve with every rebalance.
N.B. with this change it's possible for rings with a bad dispersion
value to measure as having a significantly smaller dispersion value
after a rebalance (even though they may not have had their dispersion
change) because the total amount of bad dispersion we can measure has
been increased but we're normalizing within a similar range.
Closes-Bug: #1697543
Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686
Repeated calls to each co-builder's _update_last_part_moves() are
unnecessary and have the unfortunate side effect of resetting the
_last_part_moved bitmap. When a component builder has zero
min_part_hours this results in it not preventing its co-builders from
moving parts that it has already moved.
This patch changes the CompositeRingBuilder to call each component
builder's _update_last_part_moves() *once* before rebalancing any
component builder. CooperativeRingBuilders no longer forward calls to
their _update_last_part_moves() method. Each component's
_last_part_moved bitmap is therefore preserved for the duration
of the composite rebalance.
The initialisation of the RingBuilder _last_part_moves array is moved
to the RingBuilder __init__ method, so that calls to
_update_last_part_moves() are effective even when rebalance() has
never been called on that builder. Otherwise, during a composite
rebalance, a component that has not previously been rebalanced will
not have its _last_part_moves_epoch updated during rebalance and as a
result may report erroneous min_part_seconds_left after its first
rebalance.
Related-Change: I1b30cb3d776be441346a4131007d2487a5440a81
Closes-Bug: #1714274
Change-Id: Ib165cf974c865d47c2d9e8f7b3641971d2e9f404
... but only if we *have* to!
During the initial gather for balance we prefer to avoid replicas on
over-weight devices that are already under-represented in any of their
tiers (i.e. if a zone has to have at least one, but may have as many as
two, don't take the only replica). Instead we hope by going for
replicas on over-weight devices that are at the limits of their
dispersion we might have a better than even chance we find a better
place for them during placement!
This normally works out - and especially so for rings which can
disperse and balance. But for existing rings where we'd have to
sacrifice dispersion to improve balance the existing optimistic gather
will end up refusing to trade dispersion for balance - and instead get
stuck without solving either!
You should always be able to solve for *either* dispersion or balance.
But if you can't solve *both* - we bail out on our optimistic gather
much more quickly and instead just focus on improving balance. With
this change, the ring can get into balanced (and un-dispersed) states
much more quickly!
Change-Id: I17ac627f94f64211afaccad15596a9fcab2fada2
Related-Change-Id: Ie6e2d116b65938edac29efa6171e2470bb3e8e12
Closes-Bug: 1699636
Closes-Bug: 1701472
Unit tests use the random module in places to randomise
test inputs, but once the tests in test_builder.py or
test_ring_builder_analyzer.py have been run the random
seed is left in a repeatable state because calls are made
to RingBuilder.balance with a seed value. Consequently,
subsequent calls to random in other tests get repeatable
results from one test run to another.
This patch resets the state of the random module before
returning from RingBuilder.rebalance.
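The fix can be sketched as snapshotting and restoring the random module's global state around seeded work (a standalone stand-in, not the actual RingBuilder.rebalance code):

```python
import random

# Hypothetical seeded operation: restore the module state on the way
# out so other tests see an unchanged (non-repeatable) state.
def seeded_shuffle(items, seed):
    state = random.getstate()
    try:
        random.seed(seed)
        random.shuffle(items)
    finally:
        random.setstate(state)

items = list(range(10))
before = random.getstate()
seeded_shuffle(items, seed=1639755)
assert random.getstate() == before   # no seeding side effect leaks out
assert sorted(items) == list(range(10))
```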
Closes-Bug: #1639755
Change-Id: I4b74030afc654e60452e65b3e0f1b45a189c16e3
...to formalize an otherwise-unwritten contract and get
CooperativeRingBuilder using more "public" interfaces.
Change-Id: Ib6666728eabeff948bb53dff054a69bada47556e
- Add a CooperativeRingBuilder subclass of RingBuilder. The subclass takes
a reference to a parent CompositeRingBuilder which is consulted about
whether a part can be moved during rebalance. The parent builder in turn
consults all component CooperativeRingBuilders to decide if a part can
be moved.
- Make CompositeRingBuilder load CooperativeRingBuilder instances.
- Add rebalance() method to CompositeRingBuilder class.
- Add a load_components() method to CompositeRingBuilder class.
- Change the CompositeRingBuilder compose() method to NOT by default
raise a ValueError if component builders have not been modified since
last loaded. With the load_components method being added it makes
less sense to insist by default on loaded components being modified, and
it is desirable to have the same semantic for all methods that load
components. Previously it has been necessary to use the 'force' flag
with compose() to prevent these errors being raised, which has the
unfortunate side effect of also disabling all other checks on
component builders. A new 'require_modified' parameter is added to
compose() which defaults to False but can be set to True if the
previous default behaviour is required.
Change-Id: I1b30cb3d776be441346a4131007d2487a5440a81
The gather/place debug logs used to just contain device IDs; now they
include region, zone, and IP. This makes it easier to see what's going
on when debugging rebalance operations.
Change-Id: I6314e327973c57a34b88ebbb4d3b1594dbacd357
This patch adds methods to increase the partition power of an existing
object ring without downtime for the users using a 3-step process. Data
won't be moved to other nodes; objects using the new increased partition
power will be located on the same device and are hardlinked to avoid
data movement.
1. A new setting "next_part_power" will be added to the rings, and once
the proxy server has reloaded the rings it will send this value to the
object servers on any write operation. Object servers will then create a
hard-link in the new location to the original DiskFile object. Already
existing data will be relinked to the new locations via hardlinks
using a new tool.
2. The actual partition power itself will be increased. Servers will now
use the new partition power for reads and writes. Hard links in the old
object locations that are no longer required are then removed by the
relinker tool; the relinker tool reads the next_part_power setting
to find object locations that need to be cleaned up.
3. The "next_part_power" flag will be removed.
This mostly implements the spec in [1]; however it's not using an
"epoch" as described there. The idea of the epoch was to store data
using different partition powers in their own namespace to avoid
conflicts with auditors and replicators as well as being able to abort
such an operation and just remove the new tree. This would require some
heavy change of the on-disk data layout, and other object-server
implementations would be required to adopt this scheme too.
Instead the object-replicator is now aware that there is a partition
power increase in progress and will skip replication of data in that
storage policy; the relinker tool should be simply run and afterwards
the partition power will be increased. This shouldn't take that much
time (it's only walking the filesystem and hardlinking); impact should
be low therefore. The relinker should be run on all storage nodes at the
same time in parallel to decrease the required time (though this is not
mandatory). Failures during relinking should not affect cluster
operations - relinking can be even aborted manually and restarted later.
Auditors do not quarantine objects written to a path with a different
partition power, and therefore work as before (though in the worst case
they read each object twice before the no-longer-needed hard
links are removed).
Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
[1] https://specs.openstack.org/openstack/swift-specs/specs/in_progress/
increasing_partition_power.html
Change-Id: I7d6371a04f5c1c4adbb8733a71f3c177ee5448bb
* Adds a composite_builder module which provides the functionality to
build a composite ring from a number of component ring builders.
* Add id to RingBuilder to differentiate rings in composite.
A RingBuilder now gets a UUID when it is saved to file if
it does not already have one. A RingBuilder loaded from
file does NOT get a UUID assigned unless it was previously persisted in
the file. This forces users to explicitly assign an id to
existing ring builders by saving the state back to file.
The UUID is included in the first line of the output from:
swift-ring-builder <builder-file>
Background:
This is another implementation of Composite Ring [1]
to enable better dispersion for a global erasure-coded cluster.
The most significant difference from the related-change [1] is that this
solution attempts to solve the problem as an offline tool rather than
dynamic compositing on the running servers. Due to the change, we gain
advantages such as:
- Less code and being simple
- No complex state validation on the running server
- Easy deployments with an offline tool
This patch does not provide a command line utility for managing
composite rings. The interface for such a tool is still under
discussion; this patch provides the enabling functionality first.
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
[1] Related-Change: I80ef36d3ac4d4b7c97a1d034b7fc8e0dc2214d16
Change-Id: I0d8928b55020592f8e75321d1f7678688301d797
... and remove some cruft that couldn't possibly work
Change-Id: I560f0a29f0a881c63ec3cb910dbf5476fe2a915a
Related-Change-Id: I0d8928b55020592f8e75321d1f7678688301d797
This is follow-up for https://review.openstack.org/#/c/419107
to address:
- Remove an unnecessary for-loop block that always breaks at the end of
the loop
- Correct a mistaken docstring that says "two more times" when the code
actually does only "one more time"
Change-Id: I3d76275afc6448709a4b3588259e085bce7fa21d
We only need one additional rebalance to get something well-balanced
and well-dispersed.
Drive-by to break a little earlier in the builder after we move a part.
Change-Id: I78b86292c98be5e247a694b9db71d9267e1a0a22
Currently swift ignores a lot of the Hacking style guide. This patch
enables the H401 and H403 checks and fixes any violations. With this
we can get a little closer to following the OpenStack style guidelines.
Change-Id: I5109a052f2ceb2e6a9a174cded62f4231156d39b
Previously, we could over-assign how many parts should be in a tier.
This would cause the local `parts` variable to go negative, which meant
that our `while parts` loop would never terminate.
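A toy reproduction of the failure mode (a hypothetical helper, not the builder's actual placement loop):

```python
# With `while parts:` a negative value is truthy, so over-assignment
# (subtracting more than remains) makes the loop spin forever.
# Testing `parts > 0` instead guarantees termination.
def place(parts, chunk):
    loops = 0
    while parts > 0:           # buggy version tested `while parts:`
        parts -= chunk         # may over-assign, pushing parts below zero
        loops += 1
    return loops

assert place(5, 2) == 3        # 5 -> 3 -> 1 -> -1, loop still exits
```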
Change-Id: Id7e7889742ca37cf1a9c0d55fba78d967e90e8d0
Closes-Bug: 1642538
With a min_part_hours of zero, it's possible to move more than one
replica of the same part in a single rebalance.
This change in behavior only affects min_part_hours-zero rings, which
are understood to be uncommon in production, mostly because of this
very specific and strange behavior of min_part_hours-zero rings.
With this change, no matter how small your min_part_hours it will
always require at least N rebalances to move N part-replicas of the
same part.
To supplement the existing persisted _last_part_moves structure to
enforce min_part_hours, this change adds a _part_moved_bitmap that
exists only during the life of the rebalance, to track when rebalance
moves a part, in order to prevent other replicas of the same part
from being moved in the same rebalance.
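The per-rebalance bitmap might look something like this (an illustrative stand-in with hypothetical names, not the actual Swift implementation):

```python
# One bit per partition; allocated fresh at the start of a rebalance and
# discarded afterwards, unlike the persisted _last_part_moves timestamps.
class PartMovedBitmap:
    def __init__(self, part_count):
        self._bits = bytearray((part_count + 7) // 8)

    def mark_moved(self, part):
        self._bits[part // 8] |= 1 << (part % 8)

    def has_moved(self, part):
        return bool(self._bits[part // 8] & (1 << (part % 8)))

moved = PartMovedBitmap(256)
moved.mark_moved(42)
assert moved.has_moved(42)      # second replica of part 42 stays put
assert not moved.has_moved(43)
```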
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Christian Schwede <cschwede@redhat.com>
Closes-bug: #1586167
Change-Id: Ia1629abd5ce6e1b3acc2e94f818ed8223eed993a
pretend_min_part_hours_passed does things like this:

    self._last_part_moves[part] = 0xff

This will throw an exception if self._last_part_moves is None.
This patch checks self._last_part_moves to prevent the exception.
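The guard can be sketched as follows (a hypothetical standalone version; the real method lives on RingBuilder):

```python
# Age all parts past min_part_hours, but only if move tracking exists.
def pretend_min_part_hours_passed(last_part_moves, part_count):
    if last_part_moves is None:
        # Nothing to age: the builder has never tracked part moves yet.
        return None
    for part in range(part_count):
        last_part_moves[part] = 0xff
    return last_part_moves

assert pretend_min_part_hours_passed(None, 8) is None   # no exception
aged = pretend_min_part_hours_passed(bytearray(8), 8)
assert all(b == 0xff for b in aged)
```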
Closes-bug: #1578835
Change-Id: Ic83c7a338b45bfcf61f5ab6100e6db335c3fa81a
When your device count falls below your replica count you can either add
devices or reduce the replica count.
Trying to reduce your replica count fails about half the time because
removing parts from failed devices temporarily invalidates your
_replica2part2dev table with NONE_DEV, which can result in an IndexError
in _adjust_replica2part2dev_size.
If you adjust the replica count first you won't have to worry about
tracking unassigned parts from failed devices.
Closes-Bug: #1558751
Change-Id: I99dc776fd260a2ba68ca77d7b5ed5120d10b06de
This method increases the partition power of an existing ring by one. It does
not move any data, nor does it expose a CLI command yet; it is only intended to
be used in a future version to do the actual ring modification itself.
An existing object that is currently located on partition X will be placed
either on partition 2*X or 2*X+1 after the partition power has been increased.
The reason for this is the Ring.get_part() method, which does a bitwise shift
to the right.
To avoid actual data movement to different disks or even nodes, the allocation
of partitions to nodes needs to be changed. The allocation is pairwise due to
the above mentioned new partition scheme. Therefore devices are allocated like
this, with the partition being the index and the value being the device id:
    OLD: 0, 3, 7, 5, 2, 1, ...
    NEW: 0, 0, 3, 3, 7, 7, 5, 5, 2, 2, 1, 1, ...
If an operator stops the cluster, increases the partition power and renames &
hardlinks the existing data it is possible to do a power shift without actually
moving data. Please see the partition power spec for further details on this.
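The 2*X / 2*X+1 mapping and the pairwise allocation can be illustrated like this (a sketch assuming a get_part()-style lookup that keeps the top part_power bits of a 32-bit hash; not the actual Ring code):

```python
# Raising part_power by one keeps one extra hash bit, so the old
# partition X becomes 2*X or 2*X+1.
def part_for(hash32, part_power):
    return hash32 >> (32 - part_power)

h = 0x9C2A4E10                  # arbitrary example hash value
old = part_for(h, 10)
new = part_for(h, 11)
assert new in (2 * old, 2 * old + 1)

# Pairwise device allocation keeps each object on the same device:
old_table = [0, 3, 7, 5, 2, 1]
new_table = [dev for dev in old_table for _ in (0, 1)]
assert new_table == [0, 0, 3, 3, 7, 7, 5, 5, 2, 2, 1, 1]
```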
Change-Id: I063fd8077497ee8c14d9065f07b4ec0fb5cbe180
Partially-Implements: spec increasing_partition_power
I've found that, given a sufficiently bad replica2part2dev table, we can
accidentally fail to entirely fix placement when more than two replicas of a
part are assigned to duplicate devices.
It shows up most on > 3 replica rings when you have two *different*
devices both holding two replicas. But you can see it on a three
replica ring when all three replicas are assigned to the same device.
Change-Id: Ieb213c1a259815a2ed657291242919cda568c7b5
swift-ring-builder currently only displays min_part_hours and
not the amount of time remaining before a rebalance can occur.
This information is readily available and is now displayed
as a quality-of-life improvement.
Additionally, this fixes a bug where the time since the last rebalance
was always updated when rebalance was called, regardless of
whether any partitions were reassigned. This could lead to partitions
being unable to be reassigned, as they never age according to
the time since the last rebalance.
Change-Id: Ie0e2b5e25140cbac7465f31a26a4998beb3892e9
Closes-Bug: #1526017
It's harder than it sounds. There were really three challenges.
Challenge #1 Initial Assignment
===============================
Before starting to assign parts on this new shiny ring you've
constructed, maybe we'll pause for a moment up front and consider the
lay of the land. This process is called the replica_plan.
The replica_plan approach is separating part assignment failures into
two modes:
1) we considered the cluster topology and its weights and came up with
the wrong plan
2) we failed to execute on the plan
I failed at both parts plenty of times before I got it this close. I'm
sure a counter example still exists, but when we find it the new helper
methods will let us reason about where things went wrong.
Challenge #2 Fixing Placement
=============================
With a sound plan in hand, it's much easier to fail to execute on it the
less material you have to execute with - so we gather up as many parts
as we can - as long as we think we can find them a better home.
Picking the right parts for gather is a black art - when you notice a
balance is slow it's because it's spending so much time iterating over
replica2part2dev trying to decide just the right parts to gather.
The replica plan can help at least in the gross dispersion collection to
gather up the worst offenders first before considering balance. I think
trying to avoid picking up parts that are stuck to the tier before
falling into a forced grab on anything over parts_wanted helps with
stability generally - but depending on where the parts_wanted are in
relation to the full devices it's pretty easy to pick up something that'll
end up really close to where it started.
I tried to break the gather methods into smaller pieces so it looked
like I knew what I was doing.
Going with a MAXIMUM gather iteration instead of balance (which doesn't
reflect the replica_plan) doesn't seem to be costing me anything - most
of the time the exit condition is either solved or all the parts overly
aggressively locked up on min_part_hours. So far, it mostly seems that if
the thing is going to balance this round it'll get it in the first
couple of shakes.
Challenge #3 Crazy replica2part2dev tables
==========================================
I think there's lots of ways "scars" can build up a ring which can
result in very particular replica2part2dev tables that are physically
difficult to dig out of. It's repairing these scars that will take
multiple rebalances to resolve.
... but at this point ...
... lacking a counter example ...
I've been able to close up all the edge cases I was able to find. It
may not be quick, but progress will be made.
Basically my strategy just required a better understanding of how
previous algorithms were able to *mostly* keep things moving by brute
forcing the whole mess with a bunch of randomness. Then when we detect
our "elegant" careful part selection isn't making progress - we can fall
back to same old tricks.
Validation
==========
We validate against duplicate part replica assignment after rebalance
and raise an ERROR if we detect more than one replica of a part assigned
to the same device.
In order to meet that requirement we have to have as many devices as
replicas, so attempting to rebalance with too few devices w/o changing
your replica_count is also an ERROR not a warning.
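The post-rebalance check described above can be sketched as scanning the assignment table for partitions with two replicas on one device (illustrative, not Swift's actual validation code):

```python
# replica2part2dev is a list of per-replica arrays mapping part -> dev id.
def find_duplicate_assignments(replica2part2dev):
    dupes = []
    part_count = len(replica2part2dev[0])
    for part in range(part_count):
        devs = [r2p2d[part] for r2p2d in replica2part2dev]
        if len(devs) != len(set(devs)):
            dupes.append(part)
    return dupes

table = [[0, 1], [1, 2], [2, 2]]    # part 1 has two replicas on dev 2
assert find_duplicate_assignments(table) == [1]
```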
Random Thoughts
===============
As usual with rings, the test diff can be hard to reason about -
hopefully I've added enough comments to assure future me that these
assertions make sense.
Despite being a large rewrite of a lot of important code, the existing
code is known to have failed us. This change fixes a critical bug that's
trivial to reproduce in a critical component of the system.
There's probably a bunch of error messages and exit status stuff that's
not as helpful as it could be considering the new behaviors.
Change-Id: I1bbe7be38806fc1c8b9181a722933c18a6c76e05
Closes-Bug: #1452431
Commit 71993d84e8 added
a new 'remove_dev' column to the
swift.common.ring.builder.rebalance return value.
This patch adds the docs for that and cleans up the
variable name a bit to make it easier to read.
Change-Id: Idfd46e47b9f6894cbafc8b7701a4c7414212f79f