Commit Graph

7985 Commits

Author SHA1 Message Date
OpenDev Sysadmins ee3f02179c OpenDev Migration Patch
This commit was bulk generated and pushed by the OpenDev sysadmins
as a part of the Git hosting and code review systems migration
detailed in these mailing list posts:

http://lists.openstack.org/pipermail/openstack-discuss/2019-March/003603.html
http://lists.openstack.org/pipermail/openstack-discuss/2019-April/004920.html

Attempts have been made to correct repository namespaces and
hostnames based on simple pattern matching, but it's possible some
were updated incorrectly or missed entirely. Please reach out to us
via the contact information listed at https://opendev.org/ with any
questions you may have.
2019-04-19 19:28:31 +00:00
Alistair Coles 28514903e0 Cleanup for review
reverting unnecessary changes from master

relocating some methods within modules

Change-Id: I33a46f4daa99e57d946793323d9396a2ad62cd1a
2018-05-02 11:28:01 +01:00
Zuul b72d040884 Merge "Make cleave_row_batch_size configurable" into feature/deep 2018-05-01 15:56:08 +00:00
Clay Gerrard 3587b73260 Make cleave_row_batch_size configurable
... and add some debug timing logging
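
For illustration only, a rough sketch of how a configurable batch size might drive the cleaving loop; the names here (including merge_items) are illustrative rather than taken from the patch:

  def cleave_in_batches(rows, shard_broker, cleave_row_batch_size):
      # merge object rows into the shard db one batch at a time
      batch = []
      for row in rows:
          batch.append(row)
          if len(batch) >= cleave_row_batch_size:
              shard_broker.merge_items(batch)
              batch = []
      if batch:
          shard_broker.merge_items(batch)   # flush any final partial batch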

Change-Id: I2f7facb360824e4b96414cdf7c7a47070366d5af
2018-05-01 09:04:52 +01:00
Zuul d71bc7e6d9 Merge "Use separate queries for deleted and undeleted objects" into feature/deep 2018-05-01 03:58:43 +00:00
Zuul 9a9cd11e64 Merge "Remove sharding_lock()" into feature/deep 2018-05-01 03:04:21 +00:00
Zuul 31bc2d0e66 Merge "Merge branch 'master' into feature/deep" into feature/deep 2018-05-01 02:30:48 +00:00
Zuul 6fc22143ce Merge "Remove TODO re taking lock before deleting retiring db" into feature/deep 2018-05-01 02:30:46 +00:00
Zuul 8295bb9204 Merge "Remove support for getting deleted objects via container server API" into feature/deep 2018-05-01 01:26:53 +00:00
Alistair Coles d9be8caae2 Remove sharding_lock()
This lock was taken in two places:

1. when cleaving into a shard broker
2. when removing misplaced objects from a shard broker

These two operations would not execute concurrently if there is just
one sharder process running.  If there were more than one sharder
process, each processing different nodes, then they might visit the
same db when one is visiting the parent container and cleaving to the
shard and one is visiting the shard container and handling misplaced
objects. However, any objects merged by the cleaving process will not
be removed by the misplaced object process because the removal is
limited to a max row count that is sampled at the start of the
misplaced objects handling. It is therefore not necessary to protect
these operations with a lock.
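
Roughly, the misplaced-objects pass described above can be pictured as follows; the broker method names are assumptions used only to illustrate the max-row sampling:

  def remove_misplaced(broker, shard_range):
      # sample the highest row id before doing any removal work ...
      max_row = broker.get_max_row()
      # ... so that rows merged by a concurrent cleave, which land above the
      # sampled row id, can never be removed by this pass
      broker.remove_objects(shard_range.lower, shard_range.upper,
                            max_row=max_row)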

Change-Id: Icb3f9d8843b0fe601a32006adb4dcb29779c8a06
2018-04-30 21:30:37 +01:00
Zuul 55ebf422f8 Merge "Include devices not being sharded in local devices" into feature/deep 2018-04-30 20:14:00 +00:00
Alistair Coles e203472a00 Merge branch 'master' into feature/deep
Change-Id: Idb930351a5ad2c9fbde2d4f60ae66816afce534c
2018-04-30 20:35:05 +01:00
Alistair Coles 36a34c590e Use separate queries for deleted and undeleted objects
To avoid using a 'deleted in (0, 1)' condition in the SQL query to get
all deleted and undeleted objects, yield the undeleted objects
followed by the deleted objects in each shard range.

This is an alternative to [1].

[1] Related-Change: I67159e5ae6a114298cfd61dec692e5a0235df10e
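
Conceptually, the two-query approach described above looks something like this sketch (the broker method and its parameters are illustrative, not the actual API):

  def yield_objects(broker, shard_range):
      # two simple equality filters issued one after the other, rather than
      # a single query with a 'deleted IN (0, 1)' condition
      for deleted in (0, 1):
          for row in broker.get_objects(shard_range, deleted=deleted):
              yield row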

Change-Id: I5e802c627f15e239023f954c5c12e5367d2bd4a0
2018-04-30 19:02:37 +01:00
Alistair Coles e19c8431b6 Remove support for getting deleted objects via container server API
Early versions of the proof of concept needed to include deleted
objects in container listings in order to merge multiple listing
sources in the proxy. This is not currently required.

Change-Id: I0fe405d2b1638467d7f6d669692f1c51fa9d3c85
2018-04-30 19:02:37 +01:00
Zuul 31f3a94136 Merge "Reset local device ids on each cycle" into feature/deep 2018-04-30 17:43:39 +00:00
Zuul 3313392462 Merge "Import swift3 into swift repo as s3api middleware" 2018-04-30 16:00:56 +00:00
Alistair Coles 4d4e33cb90 Include devices not being sharded in local devices
Previously the local devices list was restricted to devices that were
included for sharding, which in the extreme case of only one device
would force all handoff shard dbs to be created on the same device as
the sharding db.

Now, all devices are included for the purposes of choosing a device on
which to create a shard db.
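
In outline, something like the sketch below (names are illustrative): the full device list stays eligible for placement while only the filtered subset is processed:

  def split_devices(local_devs, override_devices):
      # every local device remains a candidate for placing a new shard db
      placement_dev_ids = {d['id'] for d in local_devs}
      # only the (possibly overridden) subset is actually sharded this pass
      devs_to_process = [d for d in local_devs
                         if not override_devices
                         or d['device'] in override_devices]
      return placement_dev_ids, devs_to_process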

Also add some unit tests for partition and device filtering.

Change-Id: Id4dfa83103f89a0398ed389a39bfa7c79e2e9a09
2018-04-30 16:33:46 +01:00
Zuul 8179285803 Merge "Don't modify source ShardRange instance in sharder yield_objects" into feature/deep 2018-04-30 15:17:42 +00:00
Alistair Coles 73f9f85105 Reset local device ids on each cycle
Change-Id: Iebfede8423748319924e0729f1149c1348703e2b
2018-04-30 14:45:31 +01:00
Alistair Coles a09a0358dd Don't modify source ShardRange instance in sharder yield_objects
Updating the lower attribute of the source shard range *copy* is fine,
but it is only a few steps away from a bug if the copy operation were
ever removed [1].  There is no need to mutate the shard range; just use
a marker variable.

[1] The shard range timestamp would have to also be updated for the
mutated shard range to ever merge, but nevertheless this is not a good
pattern.
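
The shape of the change, sketched with illustrative names (the batching helper passed in is hypothetical):

  def yield_object_batches(broker, shard_range, get_next_batch):
      # track progress in a local marker instead of mutating shard_range.lower
      marker = shard_range.lower
      while True:
          batch = get_next_batch(broker, marker, shard_range.upper)
          if not batch:
              break
          yield batch
          marker = batch[-1]['name']   # advance; shard_range is left untouched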

Change-Id: I1b9e5e7d20ce628e6f13f50763c72131effc3862
2018-04-30 12:39:48 +01:00
Alistair Coles 5d0b827afe Use correct deleted key in all queries
Change-Id: Ie881fac648e68f54d2814ef1f4523524e8ea50c4
Related-Change: Ifc358672ebd82b93ac6f5afa3f1f5dce9af9706e
2018-04-30 11:42:40 +01:00
Alistair Coles f5dad0f913 Move objects in all storage policies to shards
When cleaving or handling misplaced objects, move objects in all
storage policies.

Change-Id: Ia8947452c1cfc38c5f9acc8e79f10bf24ea51013
2018-04-30 10:30:05 +01:00
Zuul d5d16e66d5 Merge "Fix unicode errors with non-ascii object names" into feature/deep 2018-04-28 18:21:51 +00:00
Zuul 4098567bf5 Merge "Handful of formpost cleanups" 2018-04-28 00:16:33 +00:00
Zuul 4ea60c9fca Merge "Don't repeat merges when re-cleaving to existing shard db" into feature/deep 2018-04-27 22:31:47 +00:00
Zuul 1b9d1e0f30 Merge "Improve test coverage for metadata reclaim" into feature/deep 2018-04-27 18:49:37 +00:00
Alistair Coles a6ad703406 Don't repeat merges when re-cleaving to existing shard db
Previously, if a shard db was cleaved but failed to replicate, the
cleaving would be repeated, i.e. all rows in the shard range would be
merged from the retiring source db to the shard db. This is
unnecessary, and inefficient, if the shard db is still intact from the
previous attempt to cleave and replicate.

Now, once rows have been cleaved, a sync point is stored in the shard
db under the retiring source db id. This sync point is checked before
rows are merged. If the shard db was re-created or replaced by another
out-of-sync copy, the sync point will not be found and all rows will
be merged. If the shard db is unchanged, or is a copy to which the
original shard db has been replicated, the sync point will be found
and no rows will be merged.
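
In rough pseudocode (a sketch only; the real bookkeeping uses the brokers' sync tables and the names below are approximations):

  def cleave_rows_once(source_broker, shard_broker, rows):
      source_id = source_broker.get_info()['id']
      source_max_row = source_broker.get_max_row()
      if shard_broker.get_sync(source_id) >= source_max_row:
          return   # this shard db has already seen these rows; skip the merge
      shard_broker.merge_items(rows)
      # record how far this shard db has been cleaved from this source db
      shard_broker.merge_syncs([{'remote_id': source_id,
                                 'sync_point': source_max_row}])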

Change-Id: Ic9a5e525afcbc265b06d30c8897f7ffa5b3f95a7
2018-04-27 18:38:34 +01:00
Alistair Coles 55bc15b873 Always update shard brokers with their own range and sharding metadata
There is no need to conditionally update shard brokers when they are
created. The 'force' flag was a leftover from when own shard range
state was stored in sysmeta and the most recent writer of state would
take precedence.  We consequently needed to avoid updating the
sharding meta-timestamp in sysmeta as a side-effect of getting the
shard broker.

Since [1] the meta-timestamp is managed by the shard range merging
logic so that the most recent time takes precedence rather than the
most recent writer.

[1] Related-Change: Ie65cd4ce0abc98452163f6acdfad3cce5cdd216f

Change-Id: I3f3d13cf1caa684260f2ddfdb731ca941336f319
2018-04-27 11:15:03 +01:00
Zuul bfbc18c2f6 Merge "Stop internal client txn_id bleeding into sharder logs" into feature/deep 2018-04-27 10:03:45 +00:00
Zuul 23e76b1890 Merge "Parameterize SQL args" into feature/deep 2018-04-27 09:50:33 +00:00
Alistair Coles 0d967c31cb Improve test coverage for metadata reclaim
Actually test that the reclaim method does reclaim metadata.

Replace class invocation of _reclaim_metadata() with instance
invocation.

Fix reclaim() docstring.

Change-Id: I7a473e164c8c14b26b195db9a91fea1d2cd5b267
Related-Change: Ied1373362c38bbe7bab84fe4958888b0145e68ba
2018-04-27 10:18:03 +01:00
Kota Tsuyuzaki 636b922f3b Import swift3 into swift repo as s3api middleware
This attempts to import the openstack/swift3 package into the swift upstream
repository and namespace. This is mostly a straightforward port, except for
the following items.

1. Rename swift3 namespace to swift.common.middleware.s3api
1.1 Also rename some conflicting class names (e.g. Request/Response)

2. Port unit tests to the test/unit/s3api dir so that they can run on the gate.

3. Port functests to test/functional/s3api and set up in-process testing

4. Port docs to doc dir, then address the namespace change.

5. Use get_logger() instead of global logger instance

6. Avoid global conf instance

Also, fix various minor issues found during those steps (e.g. packages,
  dependencies, deprecated things)

The details and patch references for the work done on feature/s3api are listed
at https://trello.com/b/ZloaZ23t/s3api (completed board)

Note that, because this is just a port, no new features have been developed
since the last swift3 release. In future work, Swift upstream may continue to
work on the remaining items for further improvements and the best possible
compatibility with Amazon S3. Please read the new docs for your deployment and
keep track of what may change in future releases.

Change-Id: Ib803ea89cfee9a53c429606149159dd136c036fd
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
2018-04-27 15:53:57 +09:00
Zuul 4cc2d82962 Merge "Reclaim deleted shard range rows after 2 * reclaim_age" into feature/deep 2018-04-27 02:24:41 +00:00
Zuul b8af4019a3 Merge "Commit from pending file less often" into feature/deep 2018-04-27 02:08:28 +00:00
Zuul eb77c59418 Merge "Catch more exceptions when recording progress" into feature/deep 2018-04-27 00:06:25 +00:00
Zuul 2ad346d0b4 Merge "Clean up logs a bit" into feature/deep 2018-04-27 00:06:17 +00:00
Tim Burke bef2a855d2 Parameterize SQL args
'Cause I want to have shard ranges with quotes in them LIKE A CRAZY PERSON.
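
The gist, sketched with illustrative query text rather than the actual broker SQL:

  def objects_after(conn, marker):
      # before: string interpolation breaks on names containing quotes
      #   conn.execute("SELECT name FROM object WHERE name > '%s'" % marker)
      # after: let sqlite bind the value as a query parameter
      return conn.execute(
          "SELECT name FROM object WHERE name > ?", (marker,)).fetchall()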

Change-Id: I18f84caf2eb4fe17fbe28d7cb5d65cec4da7474d
2018-04-26 16:56:53 -07:00
Tim Burke 843e845dde Commit from pending file less often
Shard ranges *can't* be in the pending file, so no need to try to clear it
when fetching shard ranges. And we already try to commit down in _empty, so
is_reclaimable was previously trying to do it *twice*.

Change-Id: Ia5867b163eb4e9b516a0306a505cd2bc3dd49d43
2018-04-26 21:26:13 +00:00
Alistair Coles 20fda6673d Reclaim deleted shard range rows after 2 * reclaim_age
...but never reclaim own_shard_range, because we always want to know
what the own_shard_range state is right up until the db is unlinked.

The sharder audit process will delete a shard container if its own
shard range has been deleted for > reclaim_age.

The replicator will unlink the db if the shard container has been
deleted for > reclaim_age.

So there is potentially a window of time which is more than 2 *
reclaim_age after the own_shard_range was deleted but before the db
gets unlinked.
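
A rough sketch of the age check (illustrative only, not the actual reclaim code):

  def reclaimable_shard_ranges(broker, own_shard_range, reclaim_age, now):
      # deleted rows, other than the broker's own shard range, may go once
      # they are older than twice the reclaim window described above
      return [sr for sr in broker.get_shard_ranges(include_deleted=True)
              if sr != own_shard_range and sr.deleted
              and now - float(sr.timestamp) > 2 * reclaim_age]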

Change-Id: Ied1373362c38bbe7bab84fe4958888b0145e68ba
2018-04-26 21:35:56 +01:00
Alistair Coles 069733c86b Stop internal client txn_id bleeding into sharder logs
The LogAdapter txn_id is stored in a threading.local object that is a
class attribute, so all instances of the LogAdapter in the same thread
will pick up the txn_id set in another instance. That means that after
the internal client has been used to make a request, all subsequent
sharder logs include the request txn_id.

The txn_id might be useful if an error occurs in _fetch_shard_ranges
but is annoying elsewhere.
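
To see why the txn_id leaks, a minimal standalone illustration of the sharing (not Swift code):

  import threading

  class AdapterLike(object):
      _thread_local = threading.local()   # class attribute: shared per thread

      @property
      def txn_id(self):
          return getattr(self._thread_local, 'txn_id', None)

      @txn_id.setter
      def txn_id(self, value):
          self._thread_local.txn_id = value

  a, b = AdapterLike(), AdapterLike()
  a.txn_id = 'tx123'    # set by, say, an internal client request ...
  print(b.txn_id)       # ... and visible to every other adapter in this thread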

Change-Id: I4e1961c13be301381907885579d4137cd0a9b16a
2018-04-26 13:18:36 -07:00
Tim Burke f635d78a2d Clean up logs a bit
We don't need to log the fact that we created zero ranges, and we don't need
to log at info for *every* container we find.

It *is* handy to write down the time elapsed when we finish cleaving a shard,
though, so I don't have to correlate log lines, parse and diff dates, etc.

Change-Id: I2b2b8a6b8801082f3068ec0f264ab5d4086aa2c8
2018-04-26 20:00:59 +00:00
Tim Burke 7abc6ed4fa Catch more exceptions when recording progress
Change-Id: Idcca56ab2f42dcadb5b3a2194bda16c8dc451f71
2018-04-26 19:43:41 +00:00
Alistair Coles 5eb65e4b89 Remove unused code for passing shard ranges via the pending file
Remove, or revert to master, leftover code from when shard ranges were
written to the pending file.

Change-Id: I3e48850a0afbab0725859b4bd0c5b70bd78a16ff
2018-04-26 18:41:25 +01:00
Zuul e23562bd6d Merge "Refactor finding sharding and shrinking candidates" into feature/deep 2018-04-26 10:08:27 +00:00
Clay Gerrard 867757aaa0 make sharding 100x faster
... by *using* the (name, deleted) index
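
One way to picture it (an illustrative query, not the actual change): keep the predicates in a shape that lines up with the compound index so sqlite can walk it instead of scanning:

  def undeleted_names_after(conn, marker):
      # equality on 'deleted' plus a range on 'name' can be satisfied from
      # an index covering those columns
      return conn.execute(
          "SELECT name FROM object WHERE deleted = 0 AND name > ?",
          (marker,)).fetchall()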

Change-Id: Ifc358672ebd82b93ac6f5afa3f1f5dce9af9706e
2018-04-25 17:36:37 -07:00
Samuel Merritt c4751d0d55 Make reconstructor go faster with --override-devices
The object reconstructor will now fork all available worker processes
when operating on a subset of local devices.

Example:
  A system has 24 disks, named "d1" through "d24"
  reconstructor_workers = 8
  invoked with --override-devices=d1,d2,d3,d4,d5,d6

In this case, the reconstructor will now use 6 worker processes, one
per disk. The old behavior was to use 2 worker processes, one for d1,
d3, and d5 and the other for d2, d4, and d6 (because 24 / 8 = 3, so we
assigned 3 disks per worker before creating another).

I think the new behavior better matches operators' expectations. If I
give a concurrent program six tasks to do and tell it to operate on up
to eight at a time, I'd expect it to do all six tasks at once, not run
two concurrent batches of three tasks apiece.
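
A sketch of the allocation arithmetic (illustrative, not the actual reconstructor code):

  def worker_count(reconstructor_workers, selected_devices):
      # size the worker pool to the devices actually selected for this run
      return min(reconstructor_workers, len(selected_devices))

  # with the example above: the old logic grouped 24 // 8 = 3 disks per
  # worker, giving 2 workers for the 6 overridden disks; the new logic
  # gives worker_count(8, ['d1', 'd2', 'd3', 'd4', 'd5', 'd6']) == 6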

This has no effect when --override-devices is not specified. When
operating on all local devices instead of a subset, the new and old
code produce the same result.

The reconstructor's behavior now matches the object replicator's
behavior.

Change-Id: Ib308c156c77b9b92541a12dd7e9b1a8ea8307a30
2018-04-25 11:18:35 -07:00
Alistair Coles fc58926e5e Refactor finding sharding and shrinking candidates
Refactor to provide module-level functions for finding sharding and
shrinking candidates, so that these can be used by other callers.

Add unit tests.

Change-Id: Iada00e63f14238b67aaa818314fa6601eeec624e
2018-04-25 17:36:19 +01:00
Alistair Coles eccc552f52 Remove TODO re taking lock before deleting retiring db
Remove this TODO since we do not plan to take a lock before deleting a
retiring db.  A lock might be required if there was a risk of the
retiring db being modified between the cleaving context being checked
and the db being deleted in _complete_sharding(). The following steps
have been taken to avoid any such modification:

1. No object updates are committed to the retiring db from the pending
updates file [1].

2. No objects are replicated to the retiring db (nor the fresh db)
after sharding begins [2].

3. Multiple attempts are made to abort any rsync_then_merge process
that started before sharding began [3].

[1] Related-Change: I268d01a373491c693b793748065d212f9703ffab
[2] Related-Change: I289f558381d028b4d3129e4e51549d3f1a58dc2f
[3] Related-Change: Ib285efbadb222b7c843fc212e5ae912ccd7b7ead

Change-Id: Ia53aa7a04b483ff91ca03de1a723602eb777a289
2018-04-25 13:13:56 +01:00
Tim Burke 74356fc3cc Only zero out stats at the start of a shard cycle
Otherwise, we lose track of sharding candidates and currently-sharding
containers in recon when the sharding cycle time goes past an hour.
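
Schematically (method names are illustrative):

  def run_forever(sharder):
      while True:
          sharder._zero_stats()    # reset only at the top of each cycle ...
          sharder._one_cycle()     # ... so recon dumps made mid-cycle keep the
                                   # accumulated candidate/in-progress stats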

Change-Id: I8c6721d9f03c0f738254db37478e2813976fdf1a
2018-04-25 11:56:49 +01:00
Zuul 47efb5b969 Merge "Multiprocess object replicator" 2018-04-25 00:41:21 +00:00