Commit Graph

93 Commits

Author SHA1 Message Date
Tim Burke ce9e56a6d1 lint: Consistently use assertIsInstance
This has been available since py32 and was backported to py27; there
is no point in us continuing to carry the old idiom forward.

Change-Id: I21f64b8b2970e2dd5f56836f7f513e7895a5dc88
2024-02-07 15:48:39 -08:00
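A tiny illustration of the preferred idiom versus the old one (the test
name and value are hypothetical, just to show the lint change):

    import unittest

    class TestExample(unittest.TestCase):
        def test_response_type(self):
            resp = {}
            # preferred since py2.7/py3.2: reports the actual type on failure
            self.assertIsInstance(resp, dict)
            # old idiom being removed: only reports "False is not true"
            self.assertTrue(isinstance(resp, dict))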
Matthew Oliver 00bfc425ce Add FakeStatsdClient to unit tests
Currently we simply mock statsd calls in the FakeLogger, and there are
also some helper methods for counting and collating the metrics that
were called. This FakeLogger is overloaded and doesn't simulate the
real world.
In real life we use a StatsdClient that is attached to the logger.

We've been in the situation where unit tests pass but the statsd client
stacktraces, because we don't actually fake the StatsdClient based on
the real one and let it use its internal logic.

This patch creates a new FakeStatsdClient that is based on the real
one; it can then be used (like the real statsd client) and attached to
the FakeLogger.
There is quite a bit of churn in tests to make this work, because we
now have to look into the fake statsd client to check the faked calls
that were made.
The FakeStatsdClient does everything the real one does, except that it
overrides the _send method and socket creation so that no actual statsd
metrics are emitted.

Change-Id: I9cdf395e85ab559c2b67b0617f898ad2d6a870d4
2023-08-07 10:10:45 +01:00
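A minimal sketch of the FakeStatsdClient idea described above, assuming
a hypothetical StatsdClient base class with a _send() method; the real
classes live in Swift's statsd and test utility modules and differ in
detail:

    # Sketch only: StatsdClient and its _send()/increment() signatures are
    # illustrative assumptions, not Swift's actual API.
    class StatsdClient(object):
        def __init__(self, host='127.0.0.1', port=8125):
            self.host, self.port = host, port

        def _send(self, payload):
            # the real client writes the payload to a UDP socket
            raise NotImplementedError

        def increment(self, metric):
            self._send('%s:1|c' % metric)

    class FakeStatsdClient(StatsdClient):
        """Behaves like the real client, but records payloads instead of sending."""
        def __init__(self, *args, **kwargs):
            super(FakeStatsdClient, self).__init__(*args, **kwargs)
            self.sent = []  # inspected by tests

        def _send(self, payload):
            # override: no socket is created, no metrics are emitted
            self.sent.append(payload)

    client = FakeStatsdClient()
    client.increment('replicator.successes')
    assert client.sent == ['replicator.successes:1|c']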
Tim Burke 20b48a6900 Clean up a bunch of deprecation warnings
pytest still complains about some 20k warnings, but the vast majority
are actually because of eventlet, and a lot of those will get cleaned up
when upper-constraints picks up v0.33.2.

Change-Id: If48cda4ae206266bb41a4065cd90c17cbac84b7f
2022-12-27 13:34:00 -08:00
Matthew Oliver c4e00eb89f Sharder: Fall back to local device in get_shard_broker
If the sharder is processing a node that has 0 weight, especially when
that applies to all the devices on the node, `find_local_handoff_for_part`
can fail because there will be no local handoff devices available: it
uses the replica2part2dev table to find a device, and a 0-weighted
device won't appear in that table.

This patch extends `find_local_handoff_for_part`: if it fails to find
a node from the ring, it falls back to a local device identified by
`_local_device_ids`, which is built up while the replicator or sharder
is identifying local devices. That set is built from ring.devs, so it
does include 0-weighted devices. This allows the sharder to find a
location to write the shard broker to a handoff location while
sharding.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: Ic38698e9ca0397770c7362229baef1101a72788f
2022-07-29 15:02:26 +01:00
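An illustrative sketch of the fallback described above; the ring calls
and the local-device bookkeeping are simplified assumptions rather than
the sharder's exact code:

    def find_local_handoff_for_part(ring, part, local_device_ids):
        # Preferred path: a handoff from the ring's replica2part2dev table,
        # which only ever contains devices with non-zero weight.
        for node in ring.get_more_nodes(part):
            if node['id'] in local_device_ids:
                return node
        # Fallback: any local device known from ring.devs, including 0-weight
        # devices, so the sharder can still write its shard broker somewhere.
        for dev in ring.devs:
            if dev and dev['id'] in local_device_ids:
                return dev
        return None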
Zuul 5ff37a0d5e Merge "DB Replicator: Add handoff_delete option" 2022-07-22 01:45:31 +00:00
Matthew Oliver bf4edefce4 DB Replicator: Add handoff_delete option
Currently the object-replicator has an option called `handoff_delete`
which allows us to define the number of replicas which must be ensured
in swift. Once a handoff node sees that many successful responses it
can go ahead and delete the handoff partition.

By default it's 'auto', or rather the number of primary nodes, but this
can be reduced. It's useful for draining full disks, but has to be used
carefully.

This patch adds the same option to the DB replicator, where it works
the same way, except that instead of deleting a partition the deletion
happens at the per-DB level.

Because it's done at the DB replicator level, the option is now
available to both the account and container replicators.

Change-Id: Ide739a6d805bda20071c7977f5083574a5345a33
2022-07-21 13:35:24 +10:00
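A rough sketch of the decision the option controls, assuming
handoff_delete is either 'auto' or an integer and that `successes`
counts peers that acknowledged the DB; this is not Swift's exact code:

    def can_delete_handoff(successes, replica_count, handoff_delete='auto'):
        required = replica_count if handoff_delete == 'auto' else int(handoff_delete)
        return successes >= required

    # With 3 replicas and handoff_delete = 2, two good responses are enough to
    # remove the handoff copy -- useful when draining a full disk.
    assert can_delete_handoff(successes=2, replica_count=3, handoff_delete=2)
    assert not can_delete_handoff(successes=2, replica_count=3)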
Tim Burke f6f474e429 db: Close ReplConnection sockets on errors/timeouts
This could happen when there was a timeout running _sync_shard_ranges()
in _choose_replication_mode() -- syncing of shard ranges failed, but we
still want to attempt to replicate. If we close the socket, the
connection will automatically spin up a new one the next time we call
request() instead of raising CannotSendRequest.

Change-Id: I242351078e26213f43c1ccc0fed534b64aa29ab6
Closes-Bug: #1968224
2022-04-07 14:55:18 -07:00
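A sketch of the close-on-error pattern using the standard library HTTP
client (Swift's ReplConnection is built on a similar connection class);
the method and path here are placeholders:

    from http.client import HTTPConnection

    def replicate(conn, body):
        try:
            conn.request('REPLICATE', '/sda1/123', body)
            return conn.getresponse()
        except Exception:
            # Closing here means the next request() transparently opens a
            # fresh socket instead of raising CannotSendRequest on the
            # half-used one.
            conn.close()
            raise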
Tim Burke cd7159c69c db: Attempt to clean up part dir post replication
Change-Id: Id844f47ead7fab47915b6db76c06155b22586efe
2022-02-22 16:05:28 -08:00
Matthew Oliver 85e36f7122 recon: refactor common recon names into a common location
Change-Id: I0a0766cfb6672377de0f152ce179c874c327ec54
2021-06-29 15:22:57 -07:00
Clay Gerrard 2a312d1cd5 Cleanup tests' import of debug_logger
Change-Id: I19ca860deaa6dbf388bdcd1f0b0f77f72ff19689
2021-04-27 12:04:41 +01:00
Tim Burke 971023e4c8 replication: Allow databases_per_second to be a float
Sometimes even one database per second is too fast.

Change-Id: Iaf11743485e1ad320c82476430f450be0c4f849c
Closes-Bug: #1877827
2020-05-15 13:23:17 -07:00
Pete Zaitcev ac01d186b4 Leave less garbage in /var/tmp
All our tests that invoked broker.set_sharding_state() created
/var/tmp/tmp: the broker called DatabaseBroker.get_device_path(),
then added "tmp" to it. We provided one less path level than expected,
so it walked up outside of the test's temporary directory.

The case of "cleanUp" instead of "tearDown" didn't break out of
jail, but left trash in /var/tmp all the same.

Change-Id: I8030ea49e2a977ebb7048e1d5dcf17338c1616df
2019-02-12 18:43:30 +00:00
Tim Burke bc4494f24d py3: port account/container replicators
Change-Id: Ia2662d8f75883e1cc41b9277c65f8b771f56f902
2018-11-06 16:54:20 -08:00
Zuul a13e44b39d Merge "py3: adapt common/db_replicator.py" 2018-11-01 10:34:00 +00:00
Clay Gerrard 06cf5d298f Add databases_per_second to db daemons
Most daemons have a "go as fast as you can then sleep for 30 seconds"
strategy towards resource utilization; the object-updater and
object-auditor, however, have some "X_per_second" options that allow
operators much better control over how they spend their I/O budget.

This change extends that pattern to the account-replicator,
container-replicator, and container-sharder, which have been known to
peg CPUs when they're not I/O limited.

Partial-Bug: #1784753
Change-Id: Ib7f2497794fa2f384a1a6ab500b657c624426384
2018-10-30 22:28:05 +00:00
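A minimal sketch of what an "X_per_second" knob does; this is an
illustration, not Swift's actual ratelimit helper:

    import time

    def rate_limited(items, per_second):
        interval = 1.0 / per_second if per_second else 0
        for item in items:
            start = time.time()
            yield item
            elapsed = time.time() - start
            if interval > elapsed:
                time.sleep(interval - elapsed)

    # e.g. databases_per_second = 50 caps the replicator at roughly 50 DBs/sec
    for db in rate_limited(['db1', 'db2', 'db3'], per_second=50):
        pass  # process the database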
Pete Zaitcev 654187e1fe py3: adapt common/db_replicator.py
Another one of those almost-empty patches. This time my excuse is
that this one is needed in a couple of places (account & container).

Change-Id: Ieb8960763c64f88981b68884bfec92c17ebb4708
2018-10-22 21:42:58 -05:00
Tim Burke f192f51d37 Have check_drive raise ValueError on errors
...which helps us better differentiate in log messages between a drive
that's not mounted and one that's not a dir. We were already doing that
a bit in diskfile.py, and it seems like a useful distinction; let's do
it more.

While we're at it, remove some log translations.

Related-Change: I941ffbc568ebfa5964d49964dc20c382a5e2ec2a
Related-Change: I3362a6ebff423016bb367b4b6b322bb41ae08764
Change-Id: Ife0d34f9482adb4524d1ab1fe6c335c6b287c2fd
Partial-Bug: 1674543
2018-06-20 17:15:07 -07:00
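A simplified sketch of the behaviour the commit describes, loosely
modeled on check_drive rather than copied from it:

    import os

    def check_drive(root, device, mount_check):
        path = os.path.join(root, device)
        if mount_check:
            if not os.path.ismount(path):
                raise ValueError('%s is not mounted' % path)
        elif not os.path.isdir(path):
            raise ValueError('%s is not a directory' % path)
        return path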
Matthew Oliver 2641814010 Add sharder daemon, manage_shard_ranges tool and probe tests
The sharder daemon visits container dbs and when necessary executes
the sharding workflow on the db.

The workflow is, in overview:

- perform an audit of the container for sharding purposes.

- move any misplaced objects that do not belong in the container
  to their correct shard.

- move shard ranges from FOUND state to CREATED state by creating
  shard containers.

- move shard ranges from CREATED to CLEAVED state by cleaving objects
  to shard dbs and replicating those dbs. By default this is done in
  batches of 2 shard ranges per visit.

Additionally, when the auto_shard option is True (NOT yet recommended
in production), the sharder will identify shard ranges for containers
that have exceeded the threshold for sharding, and will also manage
the sharding and shrinking of shard containers.

The manage_shard_ranges tool provides a means to manually identify
shard ranges and merge them to a container in order to trigger
sharding. This is currently the recommended way to shard a container.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I7f192209d4d5580f5a0aa6838f9f04e436cf6b1f
2018-05-18 18:48:13 +01:00
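A rough sketch of the shard-range lifecycle that the workflow above
steps through; the state names are illustrative stand-ins for Swift's
ShardRange states:

    FOUND, CREATED, CLEAVED, ACTIVE = 'found', 'created', 'cleaved', 'active'
    LIFECYCLE = [FOUND, CREATED, CLEAVED, ACTIVE]

    def advance(state):
        # one sharder visit moves a shard range at most one step forward
        return LIFECYCLE[min(LIFECYCLE.index(state) + 1, len(LIFECYCLE) - 1)]

    assert advance(FOUND) == CREATED    # shard container created
    assert advance(CREATED) == CLEAVED  # objects cleaved, shard dbs replicated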
Alistair Coles 14af38a899 Add support for sharding in ContainerBroker
With this patch the ContainerBroker gains several new features:

1. A shard_ranges table to persist ShardRange data, along with
methods to merge ShardRange instances into that table, to access
them, and to remove expired shard ranges.

2. The ability to create a fresh db file to replace the existing db
file. Fresh db files are named using the hash of the container path
plus an epoch which is a serialized Timestamp value, in the form:

  <hash>_<epoch>.db

During sharding both the fresh and retiring db files co-exist on
disk. The ContainerBroker is now able to choose the newest on-disk db
file when instantiated. It also provides a method (get_brokers()) to
gain access to a broker instance for either on-disk file.

3. Methods to access the current state of the on-disk db files, i.e.
UNSHARDED (old file only), SHARDING (fresh and retiring files), or
SHARDED (fresh file only, with shard ranges).

Container replication is also modified:

1. shard ranges are replicated between container db peers. Unlike
objects, shard ranges are both pushed and pulled during a REPLICATE
event.

2. If a container db is capable of being sharded (i.e. it has a set of
shard ranges) then it will no longer attempt to replicate objects to
its peers. Object record durability is achieved by sharding rather than
peer to peer replication.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: Ie4d2816259e6c25c346976e181fb9d350f947190
2018-05-18 18:42:38 +01:00
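A sketch of the fresh-db naming scheme (<hash>_<epoch>.db); the hashing
and the Timestamp serialization shown here are simplified assumptions:

    import hashlib

    def fresh_db_name(account, container, epoch):
        # the real broker hashes the container path with Swift's hash
        # prefix/suffix; a bare md5 of the path stands in for that here
        path_hash = hashlib.md5(
            ('/%s/%s' % (account, container)).encode('utf-8')).hexdigest()
        return '%s_%s.db' % (path_hash, epoch)

    # e.g. fresh_db_name('AUTH_test', 'c', '1526663000.12345')
    #      -> '<32-char hash>_1526663000.12345.db'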
Alistair Coles 9d742b85ad Refactoring, test infrastructure changes and cleanup
...in preparation for the container sharding feature.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I4455677abb114a645cff93cd41b394d227e805de
2018-05-15 18:18:25 +01:00
Zuul b20893f540 Merge "Support -d <devs> and -p <partitions> in DB replicators." 2018-03-20 00:04:01 +00:00
Zuul 44070c0eec Merge "Add handoffs-only mode to DB replicators." 2018-03-14 20:36:31 +00:00
Tim Burke b640631daf Apply remote metadata in _handle_sync_response
We've already got it in the response, may as well apply it now rather
than wait for the other end to get around to running its replicators.

Change-Id: Ie36a6dd075beda04b9726dfa2bba9ffed025c9ef
2018-03-06 19:52:59 +00:00
Samuel Merritt b08c70d38e Support -d <devs> and -p <partitions> in DB replicators.
Similar to the object replicator and reconstructor, these arguments
are comma-separated lists of device names and partitions,
respectively, on which the account or container replicator will
operate. Other devices and partitions are ignored.

Change-Id: Ic108f5c38f700ac4c7bcf8315bf4c55306951361
2018-03-05 16:26:19 -08:00
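A sketch of how such comma-separated overrides are typically parsed;
the option names mirror the commit, while the parsing itself is an
illustration:

    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument('-d', '--devices', default='')
    parser.add_argument('-p', '--partitions', default='')
    args = parser.parse_args(['-d', 'sdb1,sdb3', '-p', '123,456'])

    devices = {d for d in args.devices.split(',') if d}
    partitions = {int(p) for p in args.partitions.split(',') if p}
    # the replicator then skips any device or partition not in these sets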
Samuel Merritt 47fed6f2f9 Add handoffs-only mode to DB replicators.
The object reconstructor has a handoffs-only mode that is very useful
when a cluster requires rapid rebalancing, like when disks are nearing
fullness. This mode's goal is to remove handoff partitions from disks
without spending effort on primary partitions. The object replicator
has a similar mode, though it varies in some details.

This commit adds a handoffs-only mode to the account and container
replicators.

Change-Id: I588b151ee65ae49d204bd6bf58555504c15edf9f
Closes-Bug: 1668399
2018-02-16 16:56:13 -08:00
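A sketch of the filter implied by handoffs-only mode; the node-dict
shape is an assumption for illustration:

    def should_process(part_nodes, local_dev_id, handoffs_only):
        # a partition is a handoff for this device if the device is not
        # one of the partition's primary nodes
        is_primary = any(n['id'] == local_dev_id for n in part_nodes)
        return not is_primary if handoffs_only else True

    primaries = [{'id': 1}, {'id': 2}, {'id': 3}]
    assert not should_process(primaries, local_dev_id=2, handoffs_only=True)
    assert should_process(primaries, local_dev_id=7, handoffs_only=True)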
Samuel Merritt 2bfd9c6a9b Make DB replicators ignore non-partition directories
If a cluster operator has some tooling that makes directories in
/srv/node/<disk>/accounts, then the account replicator will treat
those directories as partition dirs and may remove empty
subdirectories contained therein. This wastes time and confuses the
operator.

This commit makes DB replicators skip partition directories whose
names don't look like positive integers. This doesn't completely avoid
the problem since an operator can still use an all-digit name, but it
will skip directories like "tmp21945".

Change-Id: I8d6682915a555f537fc0ce8c39c3d52c99ff3056
2018-02-16 16:56:13 -08:00
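The check amounts to something like the following sketch:

    def looks_like_partition(name):
        # only directory names that are plain digits are treated as partitions
        return name.isdigit()

    assert looks_like_partition('21945')
    assert not looks_like_partition('tmp21945')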
Clay Gerrard feee399840 Use check_drive consistently
We added check_drive to the account/container servers to unify how all
the storage wsgi servers treat device dirs/mounts. This pushes that
unification down into the consistency engine.

Drive-by:
 * use FakeLogger less
 * clean up some repetition in the probe utility for device re-"mounting"

Related-Change-Id: I3362a6ebff423016bb367b4b6b322bb41ae08764
Change-Id: I941ffbc568ebfa5964d49964dc20c382a5e2ec2a
2017-11-01 16:33:40 +00:00
Jenkins 322c8551cd Merge "Log remote_merges during DB replication" 2017-09-14 22:30:35 +00:00
junboli 99a6d3b30a Test: Use assertIsNone() in unittest
Use assertIsNone() instead of assertEqual(), because assertEqual()
still fails on false values when compared to None

Change-Id: Ic52c319e3e55135df834fdf857982e1721bc44bb
2017-06-25 03:01:42 +00:00
Jenkins bb80773153 Merge "Using assertIsNone() instead of assertEqual(None)" 2017-06-08 01:13:49 +00:00
lingyongxu ee9458a250 Using assertIsNone() instead of assertEqual(None)
Following OpenStack Style Guidelines:
[1] http://docs.openstack.org/developer/hacking/#unit-tests-and-assertraises
[H203] Unit test assertions tend to give better messages for more specific
assertions. As a result, assertIsNone(...) is preferred over
assertEqual(None, ...) and assertIs(..., None)

Change-Id: If4db8872c4f5705c1fff017c4891626e9ce4d1e4
2017-06-07 14:05:53 +08:00
Daisuke Morita 843184f3fe Sync metadata in 'rsync_then_merge' in db_replicator
Previously, in 'rsync_then_merge' remote objects are merged with
rsync'ed local objects, but remote metadata is not merged with the
local metadata. The account/container replicator sometimes uses rsync
for db sync if there is a big difference in record history between the
'local' and 'remote' db files. If the replicator needs to rsync the
local db to the remote but the metadata in the local db is older, the
older metadata can be distributed, and some metadata values can go
missing or revert to older values.

This patch fixes this problem by merging the 'remote' metadata into the
rsync'ed local db file.

Closes-Bug: #1570118

Change-Id: Icdf0a936fc456c5462471938cbc365bd012b05d4
2017-05-11 17:39:16 -04:00
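A sketch of timestamp-wise metadata merging after the rsync; the
(value, timestamp) layout mirrors how broker metadata is commonly
described, simplified here:

    def merge_metadata(local_meta, remote_meta):
        merged = dict(local_meta)
        for key, (value, timestamp) in remote_meta.items():
            # keep whichever side has the newer timestamp for each key
            if key not in merged or merged[key][1] < timestamp:
                merged[key] = (value, timestamp)
        return merged

    local = {'X-Container-Meta-Color': ('red', '0000000001.00000')}
    remote = {'X-Container-Meta-Color': ('blue', '0000000002.00000')}
    assert merge_metadata(local, remote)['X-Container-Meta-Color'][0] == 'blue'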
Tim Burke 8e59dfbee2 Log remote_merges during DB replication
Change-Id: I1850f09bab16401479b5a0cc521f67a32ea9c9f5
2017-04-19 00:28:50 +00:00
Pavel Kvasnička bcd0eb70af Container drive error results in double space usage on the remaining drives
When a drive holding a container or account database is unmounted, the
replicator pushes the database to a handoff location. But this handoff
location finds the replica with the unmounted drive and pushes the
database to the *next* handoff, until all handoffs have a replica -
that is, all container/account servers end up with replicas of all
unmounted drives.

This patch solves:
- Consumption of the iterator at the handoff location, which results in
  replication to handoff after handoff.
- A StopIteration exception ending the not-yet-finished loop over
  available handoffs when no more nodes exist as db replication
  candidates.

The regression was introduced in 2.4.0 with rsync compression.

Co-Author: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>

Change-Id: I344f9daaa038c6946be11e1cf8c4ef104a09e68b
Closes-Bug: 1675500
2017-04-11 09:49:59 +02:00
Jenkins 8fdb01b8a7 Merge "Cleanup tests from empty suffix quarantined db fix" 2017-01-16 12:36:38 +00:00
Clay Gerrard f0122b69c4 Cleanup tests from empty suffix quarantined db fix
Remove some cruft and cleanup tests from related change.

Related-Change: I721fa5fe9a7ae22eead8d5141f93e116847ca058
Change-Id: Id3addaca9057569a535e4df1c4209a0ddad84d20
2017-01-13 16:24:33 -08:00
Jenkins e974960632 Merge "remove empty db hash and suffix directories" 2016-12-01 21:04:25 +00:00
Mahati Chamarthy 321bb9145f remove empty db hash and suffix directories
If a db gets quarantined we may fail to clean up an empty suffix dir or
a hash dir.

Change-Id: I721fa5fe9a7ae22eead8d5141f93e116847ca058
Closes-Bug: #1583719
2016-11-16 10:11:30 +05:30
Victor Stinner e6776306b7 Python 3: fix usage of reload()
Replace reload() builtin function with six.moves.reload_module() to
make the code compatible with Python 2 and Python 3.

Change-Id: I7572d613fef700b392d412501facc3bd5ee72a66
2016-07-25 14:56:21 +02:00
Christian Schwede 9729bc83eb Don't delete misplaced dbs if not replicated
If one uses only a single replica and a database file is placed on the
wrong partition, it will be removed instead of being replicated to the
correct partition.

There are two reasons for this:
1. The list of nodes is empty when there is only a single replica.
2. all(responses) is True even if there is no response at all, which is
always the case if there is no node to replicate to.

This patch fixes this by adding a special case for the single-replica
setup to the node selection loop and by ensuring that the list of
responses is not empty. It also adds a test that fails on current
master and passes with this change.

Closes-Bug: 1568591

Change-Id: I028ea8c1928e8c9a401db31fb266ff82606f8371
2016-05-12 07:28:40 +00:00
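The second pitfall comes down to Python's all() being vacuously true on
an empty sequence, e.g.:

    responses = []                  # no nodes to replicate to, no responses
    assert all(responses) is True   # vacuously true: db looked "replicated"
    # the fix requires at least one response before trusting all()
    safe_to_delete = bool(responses) and all(responses)
    assert safe_to_delete is False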
Shashirekha Gundur cf48e75c25 change default ports for servers
Changing the recommended ports for Swift services
from ports 6000-6002 to unused ports 6200-6202,
so they do not conflict with X-Windows or other services.

Updated SAIO docs.

DocImpact
Closes-Bug: #1521339
Change-Id: Ie1c778b159792c8e259e2a54cb86051686ac9d18
2016-04-29 14:47:38 -04:00
Peter Lisák 16de32f168 Log error if a local device is not identified in the replicator
Examples:
* A different port in the config than in the ring file.
* Running the daemon on a server that is not in the ring file.
In both cases the replication daemon is running but nothing is
replicated. An error log entry makes it clear that a local device
could not be identified.

Closes-Bug: 1508228
Change-Id: I99351b7d9946f250b7750df91c13d09352a145ce
2015-11-21 14:04:32 +01:00
Zack M. Davis 1b8b08039a remove remaining simplejson uses, prefer standard library import
a1c32702, 736cf54a, and 38787d0f remove uses of `simplejson` from
various parts of Swift in favor of the standard library `json`
module (introduced in Python 2.6). This commit performs the remaining
`simplejson` to `json` replacements, removes two comments highlighting
quirks of simplejson with respect to Unicode, and removes the references
to it in setup documentation and requirements.txt.

There were a lot of places where we were importing json from
swift.common.utils, which is less intuitive than a direct `import json`,
so that replacement is made as well.

(And in two more tiny drive-bys, we add some pretty-indenting to an XML
fragment and use `super` rather than naming a base class explicitly.)

Change-Id: I769e88dda7f76ce15cf7ce930dc1874d24f9498a
2015-11-16 12:34:24 -08:00
Lisak, Peter b6b7578190 node_timeout as float in configs
It is more convenient to use a float node_timeout for fine-tuning latency.

Change-Id: I7c57bba053711a27d3802efe6f2a0bf53483a54f
2015-10-19 21:14:34 +02:00
Matthew Oliver 4a13dcc4a8 Make db_replicator usync smaller containers
The current rule inside the db_replicator is to rsync+merge
containers during replication if the rowids of the two replicas
differ by more than 50%:

  # if the difference in rowids between the two differs by
  # more than 50%, rsync then do a remote merge.
  if rinfo['max_row'] / float(info['max_row']) < 0.5:

This means that smaller containers, which only have a few rows and
differ by a small number, still rsync+merge rather than copy rows.

This change adds a new condition: the difference in the rowids must
be greater than the defined per_diff, otherwise usync will be used:

  # if the difference in rowids between the two differs by
  # more than 50% and the difference is greater than per_diff,
  # rsync then do a remote merge.
  # NOTE: difference > per_diff stops us from dropping to rsync
  # on smaller containers, who have only a few rows to sync.
  if rinfo['max_row'] / float(info['max_row']) < 0.5 and \
          info['max_row'] - rinfo['max_row'] > self.per_diff:

Change-Id: I9e779f71bf37714919a525404565dd075762b0d4
Closes-bug: #1019712
2015-10-19 15:26:12 +01:00
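A worked example of the new condition; per_diff = 1000 is assumed here
as the typical default:

    per_diff = 1000
    local_max_row, remote_max_row = 40, 10      # tiny container, 75% "behind"
    ratio_wants_rsync = remote_max_row / float(local_max_row) < 0.5   # True
    gap_is_big_enough = (local_max_row - remote_max_row) > per_diff   # False
    # both must hold for rsync+merge, so this small container is usync'd
    assert not (ratio_wants_rsync and gap_is_big_enough)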
janonymous f5f9d791b0 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3, replacing it.

Change-Id: Ida206abbb13c320095bb9e3b25a2b66cc31bfba8
Co-Authored-By: Ondřej Nový <ondrej.novy@firma.seznam.cz>
2015-10-11 12:57:25 +02:00
Romain LE DISEZ 71f6fd025e Allow configuring the rsync modules where the replicators will send data
Currently, the rsync module to which the replicators send data is
static. This prevents administrators from adjusting the rsync
configuration to their deployment or needs.

As an example, the sample rsyncd configuration encourages setting a
connection limit for the account, container and object modules. This
protects devices from excessive parallel connections, which would hurt
performance.

On a server with many devices, it is tempting to increase this number
proportionally, but nothing guarantees that the distribution of the
connections will be balanced. In the worst case, a single device can
receive all the connections, severely impacting performance.

This commit adds a new option named 'rsync_module' to the *-replicator
sections of the *-server configuration file. This configuration
variable can be interpolated with device attributes like ip, port,
device, zone, ... by using the format {NAME}, e.g.:
    rsync_module = {replication_ip}::object_{device}

With this configuration, an administrator can solve the connection
distribution problem by creating one module per device in the rsyncd
configuration.

The default values are backward compatible:
    {replication_ip}::account
    {replication_ip}::container
    {replication_ip}::object

Option vm_test_mode is deprecated by this commit, but backward compatibility is
maintained. The option is only effective when rsync_module is not set. In that
case, {replication_port} is appended to the default value of rsync_module.

Change-Id: Iad91df50dadbe96c921181797799b4444323ce2e
2015-09-07 08:00:18 +02:00
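The {NAME} interpolation amounts to something like Python string
formatting over the device dict; the device attributes shown are
illustrative:

    rsync_module = '{replication_ip}::object_{device}'
    device = {'replication_ip': '10.0.0.12', 'replication_port': 6200,
              'device': 'sdb1', 'zone': 2}
    print(rsync_module.format(**device))   # -> 10.0.0.12::object_sdb1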
janonymous c5b5cf91a9 test/unit: Replace python print operator with print function (pep H233, py33)
The 'print' function is compatible with both the 2.x and 3.x Python versions.
Link: https://www.python.org/dev/peps/pep-3105/

Python 2.6 has a __future__ import that removes print as language syntax,
letting you use the functional form instead.

Change-Id: I94e1bc6bd83ad6b05695c7ebdf7cbfd8f6d9f9af
2015-07-28 21:03:05 +05:30
janonymous cd7b2db550 unit tests: Replace "self.assert_" by "self.assertTrue"
The assert_() method is deprecated and can be safely replaced by assertTrue().
This patch makes sure that running the tests does not create undesired
warnings.

Change-Id: I0602ba39ef93263386644ee68088d5f65fcb4a71
2015-07-21 19:23:00 +05:30
Victor Stinner 1cc3eff958 Fixes for mock 1.1
The new release of mock 1.1 is more strict. It helped to find bugs in
tests.

Closes-Bug: #1473369
Change-Id: Id179513c6010d827cbcbdda7692a920e29213bcb
2015-07-10 16:37:11 +02:00