Commit Graph

32 Commits

Author SHA1 Message Date
indianwhocodes dab7192e1e tests: fix FakeSwift HEAD with query param
The existing FakeSwift implementation already supports using registered
GET responses for GET requests with a query param.  It also supports
using registered GET responses for HEAD requests (if they either both
had the exact SAME matching query params, or both did not have ANY query
params).  But it did not support using registered GET responses w/o
query params for HEAD requests with a query param, even though a GET
with the same query param would work.

This change makes it a little more consistent: any client or test that
makes a GET request, should be able to make a similar HEAD request and
expect consistent response headers and status.

This test infra improvement is needed as we're going to be extending
test_slo with a bunch of tests that assert consistent response headers
for both GET and HEAD requests w/ new query params.

Change-Id: Idb4020fdeee87a9164312dc9647ab0820b098ff8
2023-08-18 14:28:25 +01:00
Ade Lee 5320ecbaf2 replace md5 with swift utils version
md5 is not an approved algorithm in FIPS mode, and trying to
instantiate a hashlib.md5() will fail when the system is running in
FIPS mode.

md5 is allowed when in a non-security context.  There is a plan to
add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate
whether or not the instance is being used in a security context.

In the case where it is not, the instantiation of md5 will be allowed.
See https://bugs.python.org/issue9216 for more details.

Some downstream python versions already support this parameter.  To
support these versions, a new encapsulation of md5() is added to
swift/common/utils.py.  This encapsulation is identical to the one being
added to oslo.utils, but is recreated here to avoid adding a dependency.

This patch is to replace the instances of hashlib.md5() with this new
encapsulation, adding an annotation indicating whether the usage is
a security context or not.

While this patch seems large, it is really just the same change over and
again.  Reviewers need to pay particular attention as to whether the
keyword parameter (usedforsecurity) is set correctly.   Right now, all
of them appear to be not used in a security context.

Now that all the instances have been converted, we can update the bandit
run to look for these instances and ensure that new invocations do not
creep in.

With this latest patch, the functional and unit tests all pass
on a FIPS enabled system.

Co-Authored-By: Pete Zaitcev
Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87
2020-12-15 09:52:55 -05:00
Clay Gerrard 3d787ddff8 xlo: 5xx while validating first segment is a server error
With DLO and SLO, we validate that we can read the first segment before
sending data to the client; this helps catch auth errors where the user
has access to read the manifest but not the segments.

Sometimes, though, that validation fails for transient reasons; if the
proxy couldn't get enough responses from primaries to determine whether
the object exists (for example), we should send back a 503 to indicate
to the client that it should retry the request.

Change-Id: Ice5358ff85ee2d5fe60785b73b67dea493044a2c
2020-11-17 16:38:33 +00:00
Tim Burke 9a0bac4ceb Fail short reads in SegmentedIterable
...and check for it *before* doing the MD5 check. We see this happen
ocassionally, but as best I can tell, it's always due to a
ChunkReadTimeout popping in the proxy that it can't recover from.

Change-Id: If238725bbec4fc3f6c8d000599c735a7c4972f7d
2020-09-28 13:56:00 -07:00
Tim Burke a8d2146266 xlo: Drain error responses
Otherwise, they get logged as 499s. While we're at it, also drain DLO
bodies (if they're small).

Change-Id: I7b54f25f42577020b10029c74f8fc01fa6fc591e
2020-09-15 12:51:16 -07:00
Tim Burke 2a8d47f00e middlewares: Clean up app iters better
Previously, logs would often show 499s in places where some other status
would be more appropriate.

Change-Id: I68dbb8593101cd3b5b64a1a947c68e340e36ce02
2020-02-12 21:27:15 -08:00
Tim Burke a21c0be70c dlo: Respond 200 on multi-range GETs
We already had a unit test for this, but it was inadvertently retesting
what test_get_suffix_range_many_segments tested.

Change-Id: I43eee7029ca884268c3f2ad74300b94b299fd8d2
Closes-Bug: #1831790
2019-08-16 16:35:34 -07:00
Tim Burke b6ebabee78 Clean up dlo unit tests
We don't actually need so many py2/py3 branches, and once we
clean those up, there's hardly any reason to import six.

Change-Id: Ia3b4f02e7eb99ad1a76aa35c39dc198528fd39ad
2019-04-17 09:15:56 -07:00
Pete Zaitcev 893acffbc0 py3: port dlo
Change-Id: I7236ddea0acde93d0789ad8affa76df0097a86aa
2019-04-16 22:34:58 -05:00
Samuel Merritt 4a0afa9fea Enforce Content-Length in catch_errors
If a WSGI application produces the header "Content-Length: <N>" but
does not produce exactly N bytes of response, then that is an error
and an exception should be thrown so that the WSGI server can take the
correct action (close the TCP connection for HTTP <= 1.1, something
else for HTTP 2.0).

As part of this, I also fixed a bug in DLOs where a HEAD response
might have a body. The way it works is this:

 * user makes HEAD request for DLO manifest

 * DLO middleware makes GET request for container

 * authorize callback (e.g. from tempurl) replies 401 for container
   GET; response has a nonempty body (it's a GET response; that's
   fine)

 * DLO notes that response is non-2xx, returns it as-is

 * client gets response with nonempty body to a HEAD request

The fix there was simple; if the original request method was HEAD,
clear out the response body.

Change-Id: I74d8c13eba2a4917b5a116875b51a781b33a7abf
Related-Bug: 1568650
2018-06-18 14:50:59 -07:00
Alistair Coles dd113ab25a Refactor proxy-server conf loading to a helper function
There were two middlewares using a common pattern to load
the proxy-server app config section. The existing pattern
fails to recognise option overrides that are declared using
paste-deploy's 'set' notation, as illustrated by the change
to test_dlo.py in this patch.

This patch replaces the existing code with a helper function
that loads the proxy-server config using the paste-deploy loader.
The resulting config dict is therefore exactly the same as that
used to initialise the proxy-server app.

Change-Id: Ib58ce03e2010f41e7eb11f1a6dc78b0b7f55d466
2017-12-07 10:24:30 -08:00
Tim Burke e001c02ff9 Stop logging tracebacks on bad xLOs
The error messages alone should provide plenty enough information.
Plus, running functional tests really shouldn't cause tracebacks.

Also, tighten up tests for log messages.

Change-Id: I55136484d342af756fa153d971dcb9159a435f13
2017-10-18 22:39:37 +00:00
Tim Burke 4806434cb0 Move listing formatting out to proxy middleware
Make some json -> (text, xml) stuff in a common module, reference that in
account/container servers so we don't break existing clients (including
out-of-date proxies), but have the proxy controllers always force a json
listing.

This simplifies operations on listings (such as the ones already happening in
decrypter, or the ones planned for symlink and sharding) by only needing to
consider a single response type.

There is a downside of larger backend requests for text/plain listings, but
it seems like a net win?

Change-Id: Id3ce37aa0402e2d8dd5784ce329d7cb4fbaf700d
2017-09-15 06:38:26 +00:00
junboli 99a6d3b30a Test: Use assertIsNone() in unittest
Use assertIsNone() instead of assertEqual(), because assertEqual()
still fails on false values when compared to None

Change-Id: Ic52c319e3e55135df834fdf857982e1721bc44bb
2017-06-25 03:01:42 +00:00
Tim Burke e8a80e874a Let users know entity size in 416 responses
If a user sends a Range header with no satisfiable ranges, we send back
a 416 Requested Range Not Satisfiable response. Previously however,
there would be no indication of the size of the object they were
requesting, so they wouldn't know how to craft a satisfiable range. We
*do* send a Content-Length, but it is (correctly) the length of the
error message.

The RFC [1] has an answer for this:

>  A server generating a 416 (Range Not Satisfiable) response to a
>  byte-range request SHOULD send a Content-Range header field with an
>  unsatisfied-range value, as in the following example:
>
>    Content-Range: bytes */1234
>
>  The complete-length in a 416 response indicates the current length of
>  the selected representation.

Now, we'll send a Content-Range header for all 416 responses, including
those coming from the object server as well as those generated on a
proxy because of the Range mangling required to support EC policies.

[1] RFC 7233, section 4.2, although similar language was used in RFC
2616, sections 10.4.17 and 14.16

Change-Id: I80c7390fc6f84a10a212b0641bb07a64dfccbd45
2016-11-30 10:52:08 -08:00
Prashanth Pai 46d61a4dcd Refactor server side copy as middleware
Rewrite server side copy and 'object post as copy' feature as middleware to
simplify the PUT method in the object controller code. COPY is no longer
a verb implemented as public method in Proxy application.

The server side copy middleware is inserted to the left of dlo, slo and
versioned_writes middlewares in the proxy server pipeline. As a result,
dlo and slo copy_hooks are no longer required. SLO manifests are now
validated when copied so when copying a manifest to another account the
referenced segments must be readable in that account for the manifest
copy to succeed (previously this validation was not made, meaning the
manifest was copied but could be unusable if the segments were not
readable).

With this change, there should be no change in functionality or existing
behavior. This is asserted with (almost) no changes required to existing
functional tests.

Some notes (for operators):
* Middleware required to be auto-inserted before slo and dlo and
  versioned_writes
* Turning off server side copy is not configurable.
* object_post_as_copy is no longer a configurable option of proxy server
  but of this middleware. However, for smooth upgrade, config option set
  in proxy server app is also read.

DocImpact: Introducing server side copy as middleware

Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>

Change-Id: Ic96a92e938589a2f6add35a40741fd062f1c29eb
Signed-off-by: Prashanth Pai <ppai@redhat.com>
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2016-05-11 14:55:51 -04:00
Samuel Merritt 9430f4c9f5 Move HeaderKeyDict to avoid an inline import
There was a function in swift.common.utils that was importing
swob.HeaderKeyDict at call time. It couldn't import it at compilation
time since utils can't import from swob or else it blows up with a
circular import error.

This commit just moves HeaderKeyDict into swift.common.header_key_dict
so that we can remove the inline import.

Change-Id: I656fde8cc2e125327c26c589cf1045cb81ffc7e5
2016-03-07 12:26:48 -08:00
Samuel Merritt e31ecb24b6 Get rid of contextlib.nested() for py3
contextlib.nested() is missing completely in Python 3.

Since 2.7, we can use multiple context managers in a 'with' statement,
like so:

    with thing1() as t1, thing2() as t2:
        do_stuff()

Now, if we had some code that needed to nest an arbitrary number of
context managers, there's stuff we could do with contextlib.ExitStack
and such... but we don't. We only use contextlib.nested() in tests to
set up bunches of mocks without crazy-deep indentation, and all that
stuff fits perfectly into multiple-context-manager 'with' statements.

Change-Id: Id472958b007948f05dbd4c7fb8cf3ffab58e2681
2015-10-23 11:44:54 -07:00
Thiago da Silva 035a411660 versioned writes middleware
Rewrite object versioning as middleware to simplify the PUT method
in the object controller.

The functionality remains basically the
same with the only major difference being the ability to now
version slo manifest files. dlo manifests are still not
supported as part of this patch.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

DocImpact
Change-Id: Ie899290b3312e201979eafefb253d1a60b65b837
Signed-off-by: Thiago da Silva <thiago@redhat.com>
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2015-08-07 14:11:32 -04:00
Victor Stinner a0db56dcde Fix pep8 E265 warning of hacking 0.10
Fix the warning E265 "block comment should start with '# '" added in pep
1.5.

Change-Id: Ib57282e958be9c7cddffc7bca34fbbf1d4c460fd
2015-07-30 09:33:18 +02:00
Samuel Merritt 12d8a53fff Get better at closing WSGI iterables.
PEP 333 (WSGI) says: "If the iterable returned by the application has
a close() method, the server or gateway must call that method upon
completion of the current request[.]"

There's a bunch of places where we weren't doing that; some of them
matter more than others. Calling .close() can prevent a connection
leak in some cases. In others, it just provides a certain pedantic
smugness. Either way, we should do what WSGI requires.

Noteworthy goofs include:

  * If a client is downloading a large object and disconnects halfway
    through, a proxy -> obj connection may be leaked. In this case,
    the WSGI iterable is a SegmentedIterable, which lacked a close()
    method. Thus, when the WSGI server noticed the client disconnect,
    it had no way of telling the SegmentedIterable about it, and so
    the underlying iterable for the segment's data didn't get
    closed.

    Here, it seems likely (though unproven) that the object server
    would time out and kill the connection, or that a
    ChunkWriteTimeout would fire down in the proxy server, so the
    leaked connection would eventually go away. However, a flurry of
    client disconnects could leave a big pile of useless connections.

  * If a conditional request receives a 304 or 412, the underlying
    app_iter is not closed. This mostly affects conditional requests
    for large objects.

The leaked connections were noticed by this patch's co-author, who
made the changes to SegmentedIterable. Those changes helped, but did
not completely fix, the issue. The rest of the patch is an attempt to
plug the rest of the holes.

Co-Authored-By: Romain LE DISEZ <romain.ledisez@ovh.net>

Change-Id: I168e147aae7c1728e7e3fdabb7fba6f2d747d937
Closes-Bug: #1466549
2015-06-18 16:12:41 -07:00
Clay Gerrard a707829334 Update test infrastructure
* Get FakeConn ready for expect 100 continue
 * Use debug_logger more and with better interfaces
 * Fix patch_policies to be less annoying

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I28c0a3539d994cbb8e6b94d63a23ed4ea6cb956d
2015-04-13 22:57:42 -07:00
Hisashi Osanai 5ca49ca924 Fix the GET's response code when there is a missing segment in LO
This patch changes the response code from Internal Server Error to
Conflict when there is a missing segment and the position is first.

Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Closes-Bug: #1386568
Change-Id: Iac175b4dc6ac9081436738697a27fe669acce0eb
2014-12-25 03:24:44 +09:00
Keshava Bharadwaj 0f93fff46a Fixes unit tests to clean up temporary directories
This patch fixes the unit tests to remove the temporary directories
created during run of unit tests. Some of unit tests did not tear down
correctly, whatever it had set it up for running. This would over period
of time bloat up the tmp directory. As on date, there were around 49 tmp
directories left uncleared per round of unit tests. This patch fixes it.

Change-Id: If591375ca9cc87d52c7c9c6dc16c9fb4b49e99fc
2014-09-26 22:39:48 +05:30
Peter Portante 07fcf50c3a Rework use of constraints to ease testing
Prior to this patch both mainline code and testing modules imported
and used constraints directly into their own namespace, or relied on
the namespace of other modules that were not the constraints
module. This meant that if a unit test wanted to change a constraint
for its operation, it had to know how that module was using the
constraint, instead of referencing the constraint module itself.

This patch unifies the use of constraints so that all constraints are
referenced via the constraints module. In turn, this allows a test to
leverage the re-loadable nature of the constraints in the constraints
module.

It addition, a number of functional tests where using the default
values for constraints, instead of the configured value discovered in
a test.conf or in an existing swift.conf. This patch removes those
direct references in favor of the load_constraint() method from the
test/functional/tests.py module.

Change-Id: Ia5313d653c667dd9ca800786de59b59334c34eaa
2014-04-02 23:48:01 -04:00
Samuel Merritt cd149ae19f Check object segment MD5s
Now if a client is downloading a large object and one of the segment
fetches gets bad data, we'll abort the request right away instead of
letting the client carry on. We do this by checking each segment's
body against the etag in the GET response's headers, like we tell
clients to do.

To make sure that the etag is the MD5, we add ?multipart-manifest=get
to all segment GET requests. That way, a segment won't be another
large object, because a large object's etag isn't its MD5. This does
mean that clients can't make, say, a DLO manifest that points to a
bunch of SLO manifests. However, that capability was only introduced
in Swift 1.13.0 (post-Havana); prior to that, clients couldn't mix
large-object types. I'm more than happy to trade that capability away
in exchange for data-integrity checks.

Change-Id: I98c78a291ee1a863a36d54a22867bd01c126644e
2014-03-13 12:56:01 -07:00
David Goetz 8d1278cae8 copy over swift.authorize stuff into subrequests
If auth is setup in the env then it needs to be copied over with the
make_request wsgi helper.  Also renamed make_request to
make_subrequest- when I grepped for make_request I got > 250 results,
this'll make it easier to find references to this function in the
future.

Updated docs and sample confs to show tempurl needs to be before dlo and
slo as well as auth.

Change-Id: I9750555727f520a7c9fedd5f4fd31ff0f63d8088
2014-03-07 11:08:37 -08:00
Samuel Merritt 4f24ef87d6 Make DLO manifest copying work with ?multipart-manifest=get
If a client issues a COPY request for a DLO with the query param
multipart-manifest=get, then the expectation is a new
manifest. However, this wasn't the case; X-Object-Manifest wasn't
being set on the new object, so the result was a normal object with
the same contents as the old manifest (typically 0 bytes).

There was already a mechanism by which middlewares could modify COPY
requests; this commit extends that so they can set headers on the new
object.

Note that this has nothing to do with a "normal" DLO copy, i.e. one
without multipart-manifest=get. That one makes a new object that's the
concatenation of the segments, and it was working just fine.

Change-Id: I1073af9fee6e34ebdfad7b1a89aeb05e4523a151
2014-02-27 22:38:53 -08:00
Samuel Merritt 25ebf3aa9e Raise error on long or short DLO
The GET response for a DLO has a Content-Length that's computed from
the container listing, but the response body's length is determined by
the segments. If a segment grows or shrinks between when the headers
are sent and when the segment is requested, this confuses clients.

For example, if the DLO is longer than its Content-Length and the
client sends another request on the same TCP connection, then it can
get leftover object bytes instead of an HTTP status line.
Alternately, if the headers it sends fill up the TCP buffers since
Swift won't read them until the first response is done, then deadlock
hilarity ensues.

If the DLO is shorter than its Content-Length, you're pretty much
guaranteed a deadlock: the client is waiting for the rest of the
response, and the server is waiting for a new request.

Now SegmentedIterable detects both these conditions and raises an
exception so that the TCP connection gets torn down. It can't save
this request, but it can stop the next one from getting hosed too.

Change-Id: Icf79ba046ef7aaaab49ce6d0b33147332c967afc
2014-02-20 21:03:51 -08:00
Samuel Merritt 1f67eb7403 Support If-[None-]Match for object HEAD, SLO, and DLO
I moved the checking of If-Match and If-None-Match out of the object
server's GET method and into swob so that everyone can use it. The
interface is similar to the Range handling; make a response with
conditional_response=True, and you get handing of If-Match and
If-None-Match.

Since the only users of conditional_response are object GET, object
HEAD, SLO, and DLO, this has the effect of adding support for If-Match
and If-None-Match to just the latter three places and nowhere
else. This makes object GET and HEAD consistent for any kind of
object, large or small.

This also fixes a bug where various conditional headers (If-*) were
passed through to the object server on segment requests, which could
cause segment requests to fail with a 304 or 412 response. Now only
certain headers are copied to the segment requests, and that doesn't
include the conditional ones, so they can't goof up the segment
retrieval.

Note that I moved SegmentedIterable to swift.common.request_helpers
because it sprouted a transitive dependency on swob, and leaving it in
utils caused a circular import.

Bonus fix: unified the handling of DiskFileQuarantined and
DiskFileNotFound in object server GET and HEAD. Now in either case, a
412 will be returned if the client said "If-Match: *". If not, the
response is a 404, just like before.

Closes-Bug: 1279076
Closes-Bug: 1280022
Closes-Bug: 1280028

Change-Id: Id2ee78346244d516b980202e990aa38ce6812de5
2014-02-20 14:54:26 -08:00
Samuel Merritt 5fca0d07a8 Ensure swift.source is set for DLO/SLO requests
SLO was setting it sometimes (the PUT path), but not others (the GET
path), and DLO just wasn't doing it at all. One could already
distinguish DLO/SLO internal requests by the user agent, but we should
set swift.source too.

Change-Id: I5d5b7dd49bd1522a9c830d0abd21fff92ae79a39
2014-02-11 10:22:13 -08:00
Samuel Merritt 6acea29fa6 Move all DLO functionality to middleware
This is for the same reason that SLO got pulled into middleware, which
includes stuff like automatic retry of GETs on broken connection and
the multi-ring storage policy stuff.

The proxy will automatically insert the dlo middleware at an
appropriate place in the pipeline the same way it does with the
gatekeeper middleware. Clusters will still support DLOs after upgrade
even with an old config file that doesn't mention dlo at all.

Includes support for reading config values from the proxy server's
config section so that upgraded clusters continue to work as before.

Bonus fix: resolve 'after' vs. 'after_fn' in proxy's required filters
list. Having two was confusing, so I kept the more-general one.

DocImpact

blueprint multi-ring-large-objects

Change-Id: Ib3b3830c246816dd549fc74be98b4bc651e7bace
2014-02-03 18:29:48 -08:00