Stuck shard ranges have been seen in production. The root cause has been
traced to s-m-s-r failing to detect parent-child relationships in
overlaps, so that it either shrank child shard ranges into parents or
the other way around. A patch has been added to check minimum age before
s-m-s-r performs repair, which will most likely prevent this from
happening again, but we also need to check for parent-child relationships
in overlaps explicitly during repairs. This patch does that: it
removes parent or child shard ranges from donors, which prevents s-m-s-r
from shrinking them into acceptor shard ranges.
Drive-by 1: fixup gap repair probe test.
The probe test is no longer appropriate because we're no longer
allowed to repair parent-child overlaps, so replace the test with a
manually created gap.
Drive-by 2: address probe test TODOs.
The commented assertion would fail because the node filtering
comparison failed to account for the same node having different indexes
when generated for the root versus the shard. Adding a new iterable
function filter_nodes makes the node filtering behave as expected.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Iaa89e94a2746ba939fb62449e24bdab9666d7bab
We have had get_server_number for many years now; there's no reason
to continue using the node_id=port%100//10 thing.
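For reference, the arithmetic being retired can be sketched as below. The
example ports and the mapping-based helper server_number_from_map are
illustrative assumptions; the real get_server_number may take different
arguments.

```python
def node_id_from_port(port):
    # Old convention: the server number is the tens digit of the port.
    return port % 100 // 10


def server_number_from_map(ipport, ipport2server):
    # Hypothetical lookup-based alternative: no arithmetic on the port,
    # just a mapping from (ip, port) to a server name such as 'object3'.
    server = ipport2server[ipport]
    return int(server[-1])


example_map = {('127.0.0.1', 6030): 'object3'}
print(node_id_from_port(6030))
print(server_number_from_map(('127.0.0.1', 6030), example_map))
```

The lookup keeps working if the port numbering scheme changes, which the
port arithmetic does not.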
Change-Id: I5357a095110e8e4889c0468c154611209c6e8c07
We have a better fix in the works, see the change
Ic53068867feb0c18c88ddbe029af83a970336545. But it is
taking too long to coalesce and users are unhappy right now.
Related: rhbz#1838242, rhbz#1965348
Change-Id: I3f7bfc2877355b7cb433af77c4e2dfdfa94ff14d
Previously a DebugLogger was used when probe tests ran 'custom
daemons', which provided a means to inspect captured logs but logged
to console. This patch replaces that DebugLogger with a 'normal' swift
logger that is adapted to also capture logs and support log
inspection.
Change-Id: I25da3aa81018c5de7b63e5584ac6a9dbb73243db
If the reconstructor finds a fragment that appears to be stale then it
will now quarantine the fragment. Fragments are considered stale if
insufficient fragments at the same timestamp can be found to rebuild
missing fragments, and the number found is less than or equal to a new
reconstructor 'quarantine_threshold' config option.
Before quarantining a fragment the reconstructor will attempt to fetch
fragments from handoff nodes in addition to the usual primary nodes.
The handoff requests are limited by a new 'request_node_count'
config option.
'quarantine_threshold' defaults to zero i.e. no fragments will be
quarantined. 'request_node_count' defaults to '2 * replicas'.
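The staleness rule above can be sketched as a small predicate. This is a
simplified reading of the description, not the reconstructor's code; the
function and argument names are hypothetical, and it assumes the count of
fragments found includes the local fragment (so it is always at least 1).

```python
def should_quarantine(num_found, num_needed_to_rebuild,
                      quarantine_threshold=0):
    # Stale means: too few fragments at this timestamp to rebuild the
    # missing ones, AND the number found is within the threshold.
    insufficient = num_found < num_needed_to_rebuild
    return insufficient and num_found <= quarantine_threshold


# With the default threshold of zero nothing is ever quarantined,
# because at least the local fragment is always counted.
print(should_quarantine(1, 4))
print(should_quarantine(1, 4, quarantine_threshold=1))
```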
Closes-Bug: 1655608
Change-Id: I08e1200291833dea3deba32cdb364baa99dc2816
Use the recently added assert_subprocess_success [1] helper function
more widely.
Add run_custom_sharder helper.
Add container-sharder key to the ProbeTest.configs dict.
[1] Related-Change: I9ec411462e4aaf9f21aba6c5fd7698ff75a07de3
Change-Id: Ic2bc4efeba5ae5bc8881f0deaf4fd9e10213d3b7
DatabaseBrokers cache opened connections. If a probe test
instantiates a DatabaseBroker, or any other class that in turn
instantiates a DatabaseBroker, such as a ContainerSharder, then
connections may hold db files open until the DatabaseBroker is garbage
collected. This can cause subsequent probe tests to fail during their
setUp() because resetswift is unable to unmount device directories
while db files are open.
A call to gc.collect() is added during setUp() to ensure db files are
closed before resetswift() is called.
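A minimal sketch of why the explicit collection helps, using a
hypothetical FakeBroker that stands in for an object holding a db file
open. The reference cycle is an assumption made to show a case where
reference counting alone does not release the resource.

```python
import gc


class FakeBroker(object):
    # Hypothetical stand-in for an object that keeps a db file open.
    open_handles = 0

    def __init__(self):
        FakeBroker.open_handles += 1
        self.me = self  # reference cycle: refcounting will not free this

    def __del__(self):
        FakeBroker.open_handles -= 1


def leak_a_broker():
    FakeBroker()  # dropped immediately, but kept alive by the cycle


def set_up():
    # Force collection so lingering brokers release their files before
    # any reset step (not shown here) tries to unmount devices.
    gc.collect()
    return FakeBroker.open_handles


leak_a_broker()
print(set_up())
```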
Closes-Bug: 1917050
Change-Id: Ifda4407c9ecff4c636fe07e013c3ebcebd0df018
Previously, ssync would neither sync nor clean up non-durable data
fragments on handoffs. When the reconstructor is syncing objects from
a handoff node (a 'revert' reconstructor job) it may be useful, and is
not harmful, to also send non-durable fragments if the receiver has
older or no fragment data.
Several changes are made to enable this. On the sending side:
- For handoff (revert) jobs, the reconstructor instantiates
SsyncSender with a new 'include_non_durable' option.
- If configured with the include_non_durable option, the SsyncSender
calls the diskfile yield_hashes function with options that allow
non-durable fragments to be yielded.
- The diskfile yield_hashes function is enhanced to include a
'durable' flag in the data structure yielded for each object.
- The SsyncSender includes the 'durable' flag in the metadata sent
during the missing_check exchange with the receiver.
- If the receiver requests the non-durable object, the SsyncSender
includes a new 'X-Backend-No-Commit' header when sending the PUT
subrequest for the object.
- The SsyncSender includes the non-durable object in the collection
of synced objects returned to the reconstructor so that the
non-durable fragment is removed from the handoff node.
On the receiving side:
- The object server includes a new 'X-Backend-Accept-No-Commit'
header in its response to SSYNC requests. This indicates to the
sender that the receiver has been upgraded to understand the
'X-Backend-No-Commit' header.
- The SsyncReceiver is enhanced to consider non-durable data when
determining if the sender's data is wanted or not.
- The object server PUT method is enhanced to check for an
'X-Backend-No-Commit' header before committing a diskfile.
If a handoff sender has both a durable and newer non-durable fragment
for the same object and frag-index, only the newer non-durable
fragment will be synced and removed on the first reconstructor
pass. The durable fragment will be synced and removed on the next
reconstructor pass.
Change-Id: I1d47b865e0a621f35d323bbed472a6cfd2a5971b
Closes-Bug: 1778002
Swift operators may find it useful to operate on each object in their
cluster in some way. This commit provides them a way to hook into the
object auditor with a simple, clearly-defined boundary so that they
can iterate over their objects without additional disk IO.
For example, a cluster operator may want to ensure a semantic
consistency with all SLO segments accounted in their manifests,
or locate objects that aren't in container listings. Now that Swift
has encryption support, this could be used to locate unencrypted
objects. The list goes on.
This commit makes the auditor locate, via entry points, the watchers
named in its config file.
A watcher is a class with at least these four methods:
__init__(self, conf, logger, **kwargs)
start(self, audit_type, **kwargs)
see_object(self, object_metadata, data_file_path, **kwargs)
end(self, **kwargs)
The auditor will call watcher.start(audit_type) at the start of an
audit pass, watcher.see_object(...) for each object audited, and
watcher.end() at the end of an audit pass. All method arguments are
passed as keyword args.
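A minimal watcher following the four-method shape above might look like
this. The CountingWatcher class, the 'ALL' audit type value and the
run_fake_audit_pass driver are hypothetical; the driver only imitates
the calling sequence described here and is not the auditor itself.

```python
import logging


class CountingWatcher(object):
    # Hypothetical watcher: counts objects and bytes seen in one pass.

    def __init__(self, conf, logger, **kwargs):
        self.conf = conf
        self.logger = logger
        self.objects = 0
        self.total_bytes = 0

    def start(self, audit_type, **kwargs):
        self.audit_type = audit_type
        self.objects = 0
        self.total_bytes = 0

    def see_object(self, object_metadata, data_file_path, **kwargs):
        self.objects += 1
        self.total_bytes += int(object_metadata.get('Content-Length', 0))

    def end(self, **kwargs):
        self.logger.info('%s pass saw %d objects, %d bytes',
                         self.audit_type, self.objects, self.total_bytes)


def run_fake_audit_pass(watcher, objects):
    # Stand-in for the auditor: every argument is passed by keyword.
    watcher.start(audit_type='ALL')
    for metadata, path in objects:
        watcher.see_object(object_metadata=metadata, data_file_path=path)
    watcher.end()


watcher = CountingWatcher(conf={}, logger=logging.getLogger('watcher'))
run_fake_audit_pass(watcher, [({'Content-Length': '10'}, '/tmp/a.data'),
                              ({'Content-Length': '20'}, '/tmp/b.data')])
```

Accepting **kwargs in every method lets the auditor add keyword
arguments later without breaking existing watchers.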
This version of the API is implemented in the context of the
auditor itself, without spawning any additional processes.
If the plugins are not working well -- hang, crash, or leak --
it's easier to debug them when there's no additional complication
of processes that run by themselves.
In addition, we include a reference implementation of a plugin for
the watcher API, as a help to plugin writers.
Change-Id: I1be1faec53b2cdfaabf927598f1460e23c206b0a
md5 is not an approved algorithm in FIPS mode, and trying to
instantiate a hashlib.md5() will fail when the system is running in
FIPS mode.
md5 is allowed when in a non-security context. There is a plan to
add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate
whether or not the instance is being used in a security context.
In the case where it is not, the instantiation of md5 will be allowed.
See https://bugs.python.org/issue9216 for more details.
Some downstream python versions already support this parameter. To
support these versions, a new encapsulation of md5() is added to
swift/common/utils.py. This encapsulation is identical to the one being
added to oslo.utils, but is recreated here to avoid adding a dependency.
This patch is to replace the instances of hashlib.md5() with this new
encapsulation, adding an annotation indicating whether the usage is
a security context or not.
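A minimal sketch of such an encapsulation is shown below. It is not
necessarily identical to the one added to swift/common/utils.py; the
TypeError fallback is an assumption about how to handle Python versions
whose hashlib.md5() does not accept the keyword.

```python
import hashlib


def md5(string=b'', usedforsecurity=True):
    # Pass the annotation through when hashlib understands it; fall
    # back to the plain constructor on Pythons that reject the keyword.
    try:
        return hashlib.md5(string, usedforsecurity=usedforsecurity)
    except TypeError:
        return hashlib.md5(string)


# Typical non-security use: computing an etag-style checksum.
print(md5(b'abc', usedforsecurity=False).hexdigest())
```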
While this patch seems large, it is really just the same change over and
again. Reviewers need to pay particular attention as to whether the
keyword parameter (usedforsecurity) is set correctly. Right now, all
of them appear to be not used in a security context.
Now that all the instances have been converted, we can update the bandit
run to look for these instances and ensure that new invocations do not
creep in.
With this latest patch, the functional and unit tests all pass
on a FIPS enabled system.
Co-Authored-By: Pete Zaitcev
Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87
* Add a new config option, proxy_base_url
* Support HTTPS as well as HTTP connections
* Monkey-patch eventlet early so we never import an unpatched version
from swiftclient
Change-Id: I4945d512966d3666f2738058f15a916c65ad4a6b
This patch adds a new object versioning mode. This new mode provides
a new set of APIs for users to interact with older versions of an
object. It also changes the naming scheme of older versions and adds
a version-id to each object.
This new mode is not backwards compatible or interchangeable with the
other two modes (i.e., stack and history), especially due to the changes
in the naming scheme of older versions. This new mode will also serve
as a foundation for adding S3 versioning compatibility in the s3api
middleware.
Note that this does not (yet) support using a versioned container as
a source in container-sync. Container sync should be enhanced to sync
previous versions of objects.
Change-Id: Ic7d39ba425ca324eeb4543a2ce8d03428e2225a1
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Thiago da Silva <thiagodasilva@gmail.com>
Reserve the namespace starting with the NULL byte for internal
use-cases. Backend services will allow path names to include the NULL
byte in urls and validate names in the reserved namespace. Database
services will filter all names starting with the NULL byte from
responses unless the request includes the header:
X-Backend-Allow-Reserved-Names: true
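The listing filter described above can be sketched as follows. The
function name and the plain-dict headers are hypothetical
simplifications; real header handling is case-insensitive and may
accept other true-ish values.

```python
RESERVED_PREFIX = '\x00'


def filter_reserved(names, headers):
    # Names in the reserved namespace are hidden unless the request
    # carries the backend header with a true value.
    allow = headers.get('X-Backend-Allow-Reserved-Names', '')
    if allow.lower() == 'true':
        return list(names)
    return [n for n in names if not n.startswith(RESERVED_PREFIX)]


names = ['\x00versions\x00c', 'photos', 'videos']
print(filter_reserved(names, {}))
```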
The proxy server will not allow path names to include the NULL byte in
urls unless a middleware has set the X-Backend-Allow-Reserved-Names
header. Middlewares can use the reserved namespace to create objects
and containers that can not be directly manipulated by clients. Any
objects and bytes created in the reserved namespace will be aggregated
to the user's account totals.
When deploying internal proxies, developers and operators may configure
the gatekeeper middleware to translate the X-Allow-Reserved-Names header
to the Backend header so they can manipulate the reserved namespace
directly through the normal API.
UpgradeImpact: it's not safe to rollback from this change
Change-Id: If912f71d8b0d03369680374e8233da85d8d38f85
There's still one problem, though: since swiftclient on py3 doesn't
support non-ASCII characters in metadata names, none of the tests in
TestReconstructorRebuildUTF8 will pass.
Change-Id: I4ec879ade534e09c3a625414d8aa1f16fd600fa4
Give storage nodes more time to complete requests for multi-node upgrade
and probetests.
Also slightly decouple probetests from default configs.
Change-Id: I334ef517d833916a3b7be3151a812d4f9c66a6e1
...in preparation for the container sharding feature.
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I4455677abb114a645cff93cd41b394d227e805de
Since Python 2.7, unittest in the standard library has included multiple
facilities for skipping tests by decorators as well as an exception.
Switch to that directly, rather than importing nose.
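The standard-library facilities in question look like this. The test
class and the HAVE_FEATURE flag are made up for illustration.

```python
import unittest

HAVE_FEATURE = False  # hypothetical capability flag


class ExampleTest(unittest.TestCase):

    @unittest.skip('always skipped')
    def test_decorator(self):
        self.fail('never runs')

    @unittest.skipUnless(HAVE_FEATURE, 'feature not available')
    def test_conditional(self):
        self.fail('never runs')

    def test_exception(self):
        # Skipping can also be decided at run time.
        raise unittest.SkipTest('decided at run time')


def run_example():
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTest)
    result = unittest.TestResult()
    suite.run(result)
    return result


print(len(run_example().skipped))
```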
Change-Id: I4009033473ea24f0d0faed3670db844f40051f30
We added check_drive to the account/container servers to unify how all
the storage wsgi servers treat device dirs/mounts. This pushes that
unification down into the consistency engine.
Drive-by:
* use FakeLogger less
* clean up some repetition in probe utility for device re-"mounting"
Related-Change-Id: I3362a6ebff423016bb367b4b6b322bb41ae08764
Change-Id: I941ffbc568ebfa5964d49964dc20c382a5e2ec2a
It was deprecated, and we discussed this topic at the Denver PTG
for the Queens cycle. The main motivation for this work is that the
deprecated post_as_copy option and its gate block future symlink work.
Change-Id: I411893db1565864ed5beb6ae75c38b982a574476
This was deprecated in the 2.5.0 release (i.e. Liberty cycle), and we've
been warning about it ever since. A year and a half seems like a long
enough time.
Change-Id: I5688e8f7dedb534071e67d799252bf0b2ccdd9b6
Related-Change: Iad91df50dadbe96c921181797799b4444323ce2e
Specifically to facilitate the reuse of the retry check server
function to fill in the creds for the test2 account which is required
for probetests after the related change.
Change-Id: I9729faa4c8c8d6d65a481bc2ea3f0566d511034c
Related-Change: I8d503419b7996721a671ed6b2795224775a7d8c6
Currently the container sync daemon fails to copy
an SLO manifest, and the error will stall progress
of the sync process on that container. There are
several reasons why the sync of an SLO manifest
may fail:
1. The GET of the manifest from the source
container returns an X-Static-Large-Object header
that is not allowed to be included with a PUT
to the destination container.
2. The format of the manifest object that is read
from the source is not in the syntax required
for a SLO manifest PUT.
3. Assuming 2 were fixed, the PUT of the manifest
includes an ETag header which will not match the
md5 of the manifest generated by the receiving
proxy's SLO middleware.
4. If the manifest is being synced to a different
account and/or cluster, then the SLO segments may
not have been synced and so the validation of the
PUT manifest will fail.
This patch addresses all of these obstacles by
enabling the destination container-sync middleware to
cause the SLO middleware to be bypassed by setting a
swift.slo_override flag in the request environ. This
flag is only set for requests that have been validated
as originating from a container sync peer.
This is justified by noting that a SLO manifest PUT from
a container sync peer can be assumed to have valid syntax
because it has already been validated when written to
the source container.
Furthermore, we must allow SLO manifests to be synced
without requiring the semantic of their content to be
re-validated because we have no way to enforce or check
that segments have been synced prior to the manifest, nor
to check that the semantic of the manifest is still valid
at the source.
This does mean that GETs to synced SLO manifests may fail
if segments have not been synced. This is however
consistent with the expectation for synced DLO manifests
and indeed for the source SLO manifest if segments have
been deleted since it was written.
Co-Authored-By: Oshrit Feder <oshritf@il.ibm.com>
Change-Id: I8d503419b7996721a671ed6b2795224775a7d8c6
Closes-Bug: #1605597
Probe tests clean up the swift environment for each test in the setUp
method. However, the probe tests will run even if the resetswift script
cannot be used for some reason (e.g. not permitted, script not found),
and will then most likely fail only after a long time has been spent on
the execution.
To prevent this, and to make the cause easy to find, this patch adds an
exit code check for "resetswift". If the script fails, the test raises
AssertionError with the stdout and stderr so that the reason is easy to
find.
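A sketch of such an exit code check is shown below. The run_reset helper
is hypothetical, and a Python one-liner stands in for the resetswift
script so the example runs anywhere.

```python
import subprocess
import sys


def run_reset(cmd):
    # Run the reset command and fail fast, with its output, if it
    # exits non-zero.
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    stdout, stderr = proc.communicate()
    if proc.returncode != 0:
        raise AssertionError(
            'reset command failed with exit code %d\n'
            'stdout:\n%s\nstderr:\n%s'
            % (proc.returncode, stdout.decode(), stderr.decode()))
    return stdout


# Stand-in for a failing reset script.
failing_cmd = [sys.executable, '-c',
               'import sys; sys.stderr.write("no permission"); sys.exit(3)']
try:
    run_reset(failing_cmd)
except AssertionError as err:
    print(err)
```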
Closes-Bug: #1613494
Change-Id: Id80d56ab6b71402ead4fe22c120064d78c1e74ac
Every time we call start_server, check is True.
Every time we call check_server, we use the default timeout.
Change-Id: Id38182f15bcbfbb145b57cee179a8fd47ec8e2b7
When a PUT request is made to an EC object the resulting container
update must include the override values for the actual object
etag and size, as opposed to the fragment etag and size. When a POST
request is made the same override values should be included in the
container update, but currently the update includes the incorrect EC
fragment size (but the correct body etag).
This is ok so long as the update for the object PUT request arrives at
the container server first (whether by direct update or replication)
because the etag and size values in an update due to an object POST
will not have a newer timestamp than the PUT and will therefore be
ignored at the container server.
However, if the update due to the object PUT request has not arrived
at the container server when the update due to the object POST
arrives, then the etag and incorrect size sent with the POST update
will be recorded in the container server. If the update due to the PUT
subsequently arrives it will not fix this error because the timestamp
of its etag and size values is not greater than that of the already
recorded values.
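The ordering problem can be shown with a deliberately simplified model
of the container row, in which etag and size are replaced only by an
update with a strictly newer timestamp. The dict layout, timestamps and
sizes are made up for illustration.

```python
def apply_update(row, update):
    # Simplified merge rule: stored values are replaced only if the
    # incoming timestamp is strictly newer.
    if row is None or update['ts'] > row['ts']:
        return dict(update)
    return row


def final_size(updates):
    row = None
    for update in updates:
        row = apply_update(row, update)
    return row['size']


# Both updates carry the same data timestamp; only the size differs.
put_update = {'ts': 1, 'etag': 'body-etag', 'size': 1000}  # body size
post_update = {'ts': 1, 'etag': 'body-etag', 'size': 250}  # fragment size

print(final_size([put_update, post_update]))  # PUT arrives first
print(final_size([post_update, put_update]))  # POST arrives first
```

Whichever update arrives first wins, so the POST update has to carry
the correct size for the result to be independent of arrival order.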
Fortunately the correct object body size is persisted with the object
as X-Backend-Container-Update-Override-Size sysmeta so this patch
fixes the container update due to a POST to use that value instead of
the Content-Length metadata.
Closes-Bug: #1582723
Change-Id: Ide7c9c59eb41aa09eaced2acfd0700f882c6eab1
This patch makes a number of changes to enable content-type
metadata to be updated when using the fast-POST mode of
operation, as proposed in the associated spec [1].
* the object server and diskfile are modified to allow
content-type to be updated by a POST and the updated value
to be stored in .meta files.
* the object server accepts PUTs and DELETEs with older
timestamps than existing .meta files. This is to be
consistent with replication that will leave a later .meta
file in place when replicating a .data file.
* the diskfile interface is modified to provide accessor
methods for the content-type and its timestamp.
* the naming of .meta files is modified to encode two
timestamps when the .meta file contains a content-type value
that was set prior to the latest metadata update; this
enables consistency to be achieved when rsync is used for
replication.
* ssync is modified to sync meta files when content-type
differs between local and remote copies of objects.
* the object server issues container updates when handling
POST requests, notifying the container server of the current
immutable metadata (etag, size, hash, swift_bytes),
content-type with their respective timestamps, and the
mutable metadata timestamp.
* the container server maintains the most recently reported
values for immutable metadata, content-type and mutable
metadata, each with their respective timestamps, in a single
db row.
* new probe tests verify that replication achieves eventual
consistency of containers and objects after discrete updates
to content-type and mutable metadata, and that container-sync
syncs objects after fast-post updates.
[1] spec change-id: I60688efc3df692d3a39557114dca8c5490f7837e
Change-Id: Ia597cd460bb5fd40aa92e886e3e18a7542603d01
If a device has been removed from one of the rings, it is actually set to None
within the ring. In that case the length of the device list is not accurate
without filtering out the None devices. Moreover, if the length matched the
condition but the list included a removed device, the probetests would fail
with a TypeError.
This fix could be done also in swift/common/ring/ring.py, but it seems it only
affects probetests right now, thus fixing it there and not changing the current
behavior.
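A sketch of the filtering, with a made-up device list in which one
device has been removed (the helper name is hypothetical):

```python
def usable_devs(ring_devs):
    # Removed devices are left in the list as None placeholders.
    return [dev for dev in ring_devs if dev is not None]


devs = [{'id': 0, 'device': 'sdb1'}, None, {'id': 2, 'device': 'sdb3'}]
print(len(devs), len(usable_devs(devs)))

# Without the filter, indexing into the placeholder raises TypeError.
try:
    [dev['device'] for dev in devs]
except TypeError as err:
    print('TypeError:', err)
```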
Change-Id: I8ccf9b32a51957e040dd370bc9f711d4328d17b1
Currently, the rsync module to which the replicators send data is static. This
prevents administrators from setting rsync configuration based on their current
deployment or needs.
As an example, the rsyncd configuration example encourages setting a connection
limit for the account, container and object modules. This protects devices
from excessive parallel connections, which would hurt performance.
On a server with many devices, it is tempting to increase this number
proportionally, but nothing guarantees that the distribution of the connections
will be balanced. In the worst scenario, a single device can receive all the
connections, which severely impacts performance.
This commit adds a new option named 'rsync_module' to the *-replicator sections
of the *-server configuration file. This configuration variable can be
interpolated with device attributes like ip, port, device, zone, ... by using
the format {NAME}, e.g.:
rsync_module = {replication_ip}::object_{device}
With this configuration, an administrator can solve the problem of connection
distribution by creating one module per device in the rsyncd configuration.
The default values are backward compatible:
{replication_ip}::account
{replication_ip}::container
{replication_ip}::object
Option vm_test_mode is deprecated by this commit, but backward compatibility is
maintained. The option is only effective when rsync_module is not set. In that
case, {replication_port} is appended to the default value of rsync_module.
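The interpolation can be pictured with Python's str.format. The helper
below is a hypothetical sketch of the behaviour described in this
message, not the replicator code, and the device values are made up.

```python
def rsync_module_for(device, rsync_module=None, vm_test_mode=False,
                     default='{replication_ip}::object'):
    # An explicit rsync_module wins; vm_test_mode only affects the
    # default, by appending the replication port.
    template = rsync_module
    if not template:
        template = default
        if vm_test_mode:
            template += '{replication_port}'
    return template.format(**device)


device = {'replication_ip': '10.0.0.5', 'replication_port': 6200,
          'device': 'sdb1', 'zone': 1}
print(rsync_module_for(device, '{replication_ip}::object_{device}'))
print(rsync_module_for(device))
print(rsync_module_for(device, vm_test_mode=True))
```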
Change-Id: Iad91df50dadbe96c921181797799b4444323ce2e
And if they are not, exhaust the node iter to go get more. The
problem without this implementation is a simple overwrite where
a GET follows before the handoff has put the newer obj back on
the 'alive again' node such that the proxy gets n-1 fragments
of the newest set and 1 of the older.
This patch bucketizes the fragments by etag and, if no bucket has
enough, continues to exhaust the node iterator until it
has a large enough matching set.
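The bucketing idea can be sketched as follows. The function name, the
(node, etag) tuples and the node labels are illustrative assumptions;
the proxy's real GET path deals with responses and fragment indexes,
which are left out here.

```python
from collections import defaultdict


def gather_matching_fragments(node_iter, ndata):
    # Group responses by etag and keep pulling from the node iterator
    # until some etag has enough fragments to decode.
    buckets = defaultdict(list)
    for node, etag in node_iter:
        buckets[etag].append(node)
        if len(buckets[etag]) >= ndata:
            return etag, buckets[etag]
    return None, []


# Three fragments are needed; one primary still holds the old object,
# so a handoff has to be consulted to complete the newer set.
responses = iter([('p1', 'new'), ('p2', 'old'), ('p3', 'new'),
                  ('h1', 'new'), ('h2', 'new')])
print(gather_matching_fragments(responses, ndata=3))
```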
Change-Id: Ib710a133ce1be278365067fd0d6610d80f1f7372
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Closes-Bug: 1457691
'print' function is compatible with 2.x and 3.x python versions
Link : https://www.python.org/dev/peps/pep-3105/
Python 2.6 has a __future__ import that removes print as language syntax,
letting you use the functional form instead
Change-Id: I416c6ac21ccbfb91ec328ffb1ed21e492ef52d58
This would have been enough to catch the regression, and we can extend
them as we work on any future enhancements to our process management.
Change-Id: I9a1b57aa15663380c45cf783afc8212ab4ffbace
Get configparser, queue, http_client modules from six.moves.
Patch generated by the six_moves operation of the sixer tool:
https://pypi.python.org/pypi/sixer
Change-Id: I666241ab50101b8cc6f992dd80134ce27327bd7d
The iteritems() of Python 2 dictionaries has been renamed to items() on
Python 3. According to a discussion on the openstack-dev mailing list,
the overhead of creating a temporary list using dict.items() on Python 2
is very low because most dictionaries are small:
http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html
Patch generated by the following command:
sed -i 's,iteritems,items,g' \
$(find swift -name "*.py") \
$(find test -name "*.py")
Change-Id: I6070bb6c684be76e8e77222a7d280ec6edd43496
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs. The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring. In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket. The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk. The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring. The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).
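The parent's bookkeeping for starting and killing per-port servers
amounts to a set difference between the ports in the ring and the ports
currently being served. The sketch below uses a hypothetical helper and
made-up ports, and leaves out forking, sockets and servers_per_port.

```python
def reconcile_ports(ring_ports, running_ports):
    ring_ports, running_ports = set(ring_ports), set(running_ports)
    to_start = sorted(ring_ports - running_ports)  # new in the ring
    to_kill = sorted(running_ports - ring_ports)   # gone from the ring
    return to_start, to_kill


print(reconcile_ports(ring_ports={6010, 6020, 6030},
                      running_ports={6010, 6040}))
```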
Immediately stopping all swift-object-worker processes still works by
sending the parent a SIGTERM. Likewise, a SIGHUP to the parent process
still causes the parent process to close all listen sockets and exit,
allowing existing children to finish serving their existing requests.
The drop_privileges helper function now has an optional param to
suppress the setsid() call, which otherwise screws up the child workers'
process management.
The class method RingData.load() can be told to only load the ring
metadata (i.e. everything except replica2part2dev_id) with the optional
kwarg, header_only=True. This is used to keep the parent and all
forked off workers from unnecessarily having full copies of all storage
policy rings in memory.
A new helper class, swift.common.storage_policy.BindPortsCache,
provides a method to return a set of all device ports in all rings for
the server on which it is instantiated (identified by its set of IP
addresses). The BindPortsCache instance will track mtimes of ring
files, so they are not opened more frequently than necessary.
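The mtime tracking can be sketched generically as below. MtimeCache and
load_ports are hypothetical and are not the BindPortsCache
implementation; a text file of port numbers stands in for a ring file.

```python
import os
import tempfile


def load_ports(path):
    # Stand-in loader: a whitespace-separated list of ports.
    with open(path) as f:
        return set(int(p) for p in f.read().split())


class MtimeCache(object):
    def __init__(self, path, loader):
        self.path = path
        self.loader = loader
        self.mtime = None
        self.value = None
        self.loads = 0

    def get(self):
        # Only re-open the file when its mtime has changed.
        mtime = os.path.getmtime(self.path)
        if mtime != self.mtime:
            self.value = self.loader(self.path)
            self.mtime = mtime
            self.loads += 1
        return self.value


fd, demo_path = tempfile.mkstemp()
with os.fdopen(fd, 'w') as f:
    f.write('6010 6020')
cache = MtimeCache(demo_path, load_ports)
cache.get()
cache.get()
print(cache.loads)
os.unlink(demo_path)
```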
This patch includes enhancements to the probe tests and
object-replicator/object-reconstructor config plumbing to allow the
probe tests to work correctly both in the "normal" config (same IP but
unique ports for each SAIO "server") and a server-per-port setup where
each SAIO "server" must have a unique IP address and unique port per
disk within each "server". The main probe tests only work with 4
servers and 4 disks, but you can see the difference in the rings for the
EC probe tests where there are 2 disks per server for a total of 8
disks. Specifically, swift.common.ring.utils.is_local_device() will
ignore the ports when the "my_port" argument is None. Then,
object-replicator and object-reconstructor both set self.bind_port to
None if server_per_port is enabled. Bonus improvement for IPv6
addresses in is_local_device().
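The port handling described for is_local_device() can be sketched like
this. The function below is a simplified illustration under an assumed
argument order; it omits the IPv6 and hostname handling of the real
helper.

```python
def is_local_device_sketch(my_ips, my_port, dev_ip, dev_port):
    if dev_ip not in my_ips:
        return False
    if my_port is None:
        # servers_per_port mode: any port on a local IP counts.
        return True
    return my_port == dev_port


my_ips = ['127.0.0.1']
print(is_local_device_sketch(my_ips, 6010, '127.0.0.1', 6020))
print(is_local_device_sketch(my_ips, None, '127.0.0.1', 6020))
```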
This PR for vagrant-swift-all-in-one will aid in testing this patch:
https://github.com/swiftstack/vagrant-swift-all-in-one/pull/16/
Also allow SAIO to answer is_local_device() better; common SAIO setups
have multiple "servers" all on the same host with different ports for
the different "servers" (which happen to match the IPs specified in the
rings for the devices on each of those "servers").
However, you can configure the SAIO to have different localhost IP
addresses (e.g. 127.0.0.1, 127.0.0.2, etc.) in the ring and in the
servers' config files' bind_ip setting.
This new whataremyips() implementation combined with a little plumbing
allows is_local_device() to accurately answer, even on an SAIO.
In the default case (an unspecified bind_ip defaults to '0.0.0.0') as
well as an explicit "bind to everything" like '0.0.0.0' or '::',
whataremyips() behaves as it always has, returning all IP addresses for
the server.
Also updated probe tests to handle each "server" in the SAIO having a
unique IP address.
For some (noisy) benchmarks that show servers_per_port=X is at least as
good as the same number of "normal" workers:
https://gist.github.com/dbishop/c214f89ca708a6b1624a#file-summary-md
Benchmarks showing the benefits of I/O isolation with a small number of
slow disks:
https://gist.github.com/dbishop/fd0ab067babdecfb07ca#file-results-md
If you were wondering what the overhead of threads_per_disk looks like:
https://gist.github.com/dbishop/1d14755fedc86a161718#file-tabular_results-md
DocImpact
Change-Id: I2239a4000b41a7e7cc53465ce794af49d44796c6
If the primary left or right hand partners are down, the next best thing
is to validate the rest of the primary nodes. Where the rest should
exclude not just the left and right hand partners - but ourself as well.
This fixes an accidental noop when a partner node is unavailable and
another node is missing data.
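The node selection can be sketched as follows, assuming the primaries
form a ring in which the partners are the neighbours of our own index.
The helper name and node labels are hypothetical.

```python
def partners_and_rest(primaries, my_index):
    n = len(primaries)
    left_index = (my_index - 1) % n
    right_index = (my_index + 1) % n
    # The rest excludes both partners and ourself.
    skip = {left_index, my_index, right_index}
    rest = [node for i, node in enumerate(primaries) if i not in skip]
    return primaries[left_index], primaries[right_index], rest


print(partners_and_rest(['a', 'b', 'c', 'd', 'e', 'f'], my_index=0))
```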
Validation:
Add probetests to cover ssync failures for the primary sync_to nodes for
sync jobs.
Drive-by:
Make additional plumbing for the check_mount and check_dir constraints into
the remaining daemons.
Change-Id: I4d1c047106c242bca85c94b569d98fd59bb255f4
Because of the object-server's interaction with ssync sender's
X-Backend-Replication-Headers, when an object (or fragment archive) is
pushed unmodified to another node its ETag value is duped into the
receiving end's metadata as Etag. This interacts poorly with the
reconstructor's RebuildingECDiskFileStream which can not know ahead of
time the ETag of the fragment archive being rebuilt.
Don't send the Etag from the local source fragment archive being used as
the basis for the rebuilt fragment archive's metadata along to ssync.
Change-Id: Ie59ad93a67a7f439c9a84cd9cff31540f97f334a
This patch adds the erasure code reconstructor. It follows the
design of the replicator but:
- There is no notion of update() or update_deleted().
- There is a single job processor
- Jobs are processed partition by partition.
- At the end of processing a rebalanced or handoff partition, the
reconstructor will remove successfully reverted objects if any.
And various ssync changes such as the addition of reconstruct_fa()
function called from ssync_sender which performs the actual
reconstruction while sending the object to the receiver
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
blueprint ec-reconstructor
Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51
Wikipedia's list of common misspellings [1] has a machine-readable
version. This patch fixes those misspellings mentioned in the list
which have a single correct variant. Misspellings with multiple possible
corrections (e.g. "accension", which can be either "accession" or
"ascension") are left untouched. The list of changes was manually
re-checked for false positives.
[1] https://en.wikipedia.org/wiki/Wikipedia:Lists_of_common_misspellings/For_machines
Change-Id: Ic9a5438629664f7cea216413a28acc0e8992da05
Signed-off-by: Martin Kletzander <mkletzan@redhat.com>
* move get_to_final_state into ProbeTest
* get rid of kill_servers
* add replicators manager and updaters manager to ProbeTest
(this is all going someplace, i promise)
Change-Id: I8393a2ebc0d04051cae48cc3c49580f70818dbf2
* refactor probe tests to use probe.common.ProbeTest
* move reset_environment functionality to ProbeTest.setUp()
* choose rings and policies that meet the criteria - raise SkipTest if
nothing matches
* replace all AssertionErrors in setup with SkipTest
Change-Id: Id56c497d58083f5fd55f5283cdd346840df039d3
A deprecated policy in swift.conf causes errors in
probe tests that may attempt to use that policy.
This patch introduces a list ENABLED_POLICIES in
test/probe/common.py and changes probe tests to only
use policies contained in that list.
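The idea can be sketched as below. FakePolicy, its is_deprecated
attribute and the policy names are stand-ins chosen for illustration;
the actual list in test/probe/common.py is built from the configured
storage policies.

```python
class FakePolicy(object):
    # Hypothetical stand-in for a storage policy object.
    def __init__(self, name, is_deprecated=False):
        self.name = name
        self.is_deprecated = is_deprecated


ALL_POLICIES = [FakePolicy('gold'),
                FakePolicy('silver', is_deprecated=True),
                FakePolicy('ec')]

# Probe tests pick only from policies that are not deprecated.
ENABLED_POLICIES = [p for p in ALL_POLICIES if not p.is_deprecated]
print([p.name for p in ENABLED_POLICIES])
```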
Change-Id: Ie65477c15d631fcfc3a4a5772fbe6d7d171b22b0
Add headers param to direct_client.direct_get_object, which is used in
probetests to passthrough the X-Storage-Policy-Index header.
DocImpact
Implements: blueprint storage-policies
Change-Id: I19adbbcefbc086c8467bd904a275d55cde596412