209 Commits

Author SHA1 Message Date
Tim Burke 38d0b3fabc Clean up watchdog threads
This shouldn't impact real servers, as those processes were about to
wrap up anyway. It *can* cause some confusing behaviors in tests,
though.

Change-Id: Ifd8a64efcd3fc983596ba7cd9fe28eb9663c93d6
2024-05-02 08:30:54 -07:00
Tim Burke 55f7833d86 systemd: Send STOPPING/RELOADING notifications
See https://www.freedesktop.org/software/systemd/man/sd_notify.html#Description
for more information.

Note that this requires that we keep the NOTIFY_SOCKET env var
around for more than just the first READY message, so we want to be
careful about when we're sending the default "READY=1".

UpgradeImpact
=============
Since prior versions of Swift would unset the NOTIFY_SOCKET env var,
services must be fully restarted (rather than seamlessly reloaded) to
emit the new messages.

Related-Change: Ice224fc2a6ba0150be180955037c13fc90365479
Change-Id: I201734ae0d6232ecb1923e67864dd928f90b6586
2023-10-16 15:44:06 -07:00
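
Illustration for 55f7833d86: the notification protocol referenced above can be sketched roughly as follows. This is a minimal sketch, not Swift's implementation; the helper name `notify` and its `env` parameter are hypothetical. It assumes NOTIFY_SOCKET names a datagram Unix socket, with a leading '@' denoting an abstract-namespace socket.

```python
import os
import socket


def notify(message, env=os.environ):
    """Send one sd_notify-style datagram; return True if it was sent.

    Hypothetical helper. NOTIFY_SOCKET is deliberately left in the
    environment so later messages (RELOADING=1, STOPPING=1) can still
    be delivered after the first READY=1.
    """
    path = env.get('NOTIFY_SOCKET')
    if not path:
        return False
    if path.startswith('@'):
        # A leading '@' denotes a Linux abstract-namespace socket.
        path = '\0' + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.connect(path)
        sock.sendall(message)
    return True
```

A service would call something like `notify(b'RELOADING=1')` before a reload and `notify(b'STOPPING=1')` on shutdown.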
Tim Burke c51e81f640 proxy: Bring back logging/metrics for get_*_info requests
A while back, we changed get_account_info and get_container_info to
call the proxy-server app directly, rather than whatever was right
of the current middleware. This reduced backend request amplification
on cache misses.

However, it *also* meant that we stopped emitting logs or metrics in
the proxy for these backend requests. This was an unfortunate and
unintended break from earlier behavior.

Now, extend the middleware decorating we're doing in loadapp() to
include a "logged app", i.e., the app wrapped by its right-most
proxy-logging middleware. If there is no logging middleware (such
as would happen for the backend servers), the "logged app" will be
the main app. Make account and container info requests through
*that* app, so we get logging and metrics again.

Closes-Bug: #2027726
Related-Change: I49447c62abf9375541f396f984c91e128b8a05d5
Change-Id: I3f531f904340e4c8407185ed64b41d7d614a308a
2023-08-01 15:58:58 -07:00
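
Illustration for c51e81f640: the "logged app" selection described above might look roughly like this. It is a sketch under assumptions, not the code in loadapp(); `ProxyLogging`, `MainApp`, and `find_logged_app` are hypothetical stand-ins, and the pipeline is modeled as a plain left-to-right list.

```python
class ProxyLogging:
    """Hypothetical stand-in for a proxy-logging middleware."""

    def __init__(self, app):
        self.app = app


class MainApp:
    """Hypothetical stand-in for the final WSGI application."""


def find_logged_app(pipeline):
    """Return the right-most logging middleware, else the main app.

    `pipeline` is ordered left to right; its last entry is the main app.
    """
    for app in reversed(pipeline):
        if isinstance(app, ProxyLogging):
            return app
    return pipeline[-1]
```

With no logging middleware in the list (as for a backend server), the function falls through and returns the main app.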
Zuul 144b69d901 Merge "Add --test-config option to WSGI servers" 2023-06-09 06:33:21 +00:00
Tim Burke d3d503f905 Drop more translations
Partial-Bug: #1674543
Change-Id: Iad3011e8603cd2771add97942d46122434d048cd
2023-05-24 13:58:24 -07:00
Zuul 89e2050d7f Merge "wsgi: Add keepalive_timeout option" 2023-05-09 02:56:09 +00:00
Chetan Mishra 84b995f275 Don't monkey patch logging on import
Previously swift.common.utils monkey patched logging.thread,
logging.threading, and logging._lock upon import with eventlet
threading modules, but that is no longer reasonable or necessary.

With py3, the existing logging._lock is not patched by eventlet,
unless the logging module is reloaded. The existing lock is not
tracked by the gc so would not be found by eventlet's
green_existing_locks().

Instead, we group all monkey patching into a utils function and apply
patching consistently across daemons and WSGI servers.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Closes-Bug: #1380815
Change-Id: I6f35ad41414898fb7dc5da422f524eb52ff2940f
2023-04-28 08:57:35 -07:00
Tim Burke 469c38e9fb wsgi: Add keepalive_timeout option
Clients sometimes hold open connections "just in case" they might later
pipeline requests. This can cause issues for proxies, especially if
operators restrict max_clients in an effort to improve response times
for the requests that *do* get serviced.

Add a new keepalive_timeout option to give proxies a way to drop these
established-but-idle connections without impacting active connections
(as may happen when reducing client_timeout). Note that this requires
eventlet 0.33.4 or later.

Change-Id: Ib5bb84fa3f8a4b9c062d58c8d3689e7030d9feb3
2023-04-18 11:49:05 -07:00
Matthew Oliver e5105ffa09 internal_client: Remove allow_modify_pipeline option
The internal client is supposed to be internal to the cluster, and as
such we rely on it to not remove any headers we decide to send. However,
if the allow_modify_pipeline option is set, the gatekeeper middleware is
added to the internal client's proxy pipeline.

First, this patch removes the allow_modify_pipeline option from the
internal client constructor. When calling loadapp,
allow_modify_pipeline is now always passed as False.

Further, an op could directly put the gatekeeper middleware into the
internal client config. The internal client constructor will now check
the pipeline and raise a ValueError if one has been placed in the
pipeline.

To do this, there is now a check_gatekeeper_loaded staticmethod that
walks the pipeline and is called from the InternalClient.__init__ method.
To enable this walk through the pipeline, we now stash the wsgi
pipeline in each filter so that we don't have to rely on 'app' naming
conventions to iterate the pipeline.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Idcca7ac0796935c8883de9084d612d64159d9f92
2023-04-14 10:37:40 +01:00
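
Illustration for e5105ffa09: the pipeline check described above could be sketched as follows. The class `GatekeeperMiddleware` and the function name `reject_gatekeeper` are hypothetical, and the pipeline is modeled as a simple list of filter objects rather than Swift's real pipeline structure.

```python
class GatekeeperMiddleware:
    """Hypothetical stand-in for the gatekeeper filter."""


def reject_gatekeeper(pipeline):
    """Raise ValueError if any filter in the pipeline is a gatekeeper.

    Walks the stashed pipeline directly instead of following 'app'
    attributes from one filter to the next.
    """
    for filt in pipeline:
        if isinstance(filt, GatekeeperMiddleware):
            raise ValueError(
                'gatekeeper middleware is not allowed in the '
                'internal client pipeline')
```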
Tim Burke 0a5f0253b1 Add --test-config option to WSGI servers
Previously, seamless reloads were a little risky: when they worked, they
worked great, but if they failed (say, because you wrote out an invalid
config), you were left with no usable server processes and possible
client downtime.

Now, add the ability to do a preflight check before reloading processes
to reduce the likelihood of the reloaded process immediately dying. For
example, you might use a systemd unit that includes something like

    ExecReload=swift-proxy-server --test-config /etc/swift/proxy-server.conf
    ExecReload=kill -USR1 $MAINPID

Change-Id: I9e5e158ce8be92535430b9cabf040063f5188bf4
2023-04-05 20:51:46 -07:00
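
Illustration for 0a5f0253b1: the same check-then-reload ordering, sketched in Python for cases where a script rather than systemd drives the reload. The function name `reload_if_config_ok` and its parameters are hypothetical; the sketch assumes only that the preflight command exits non-zero when the config is unusable.

```python
import subprocess


def reload_if_config_ok(check_cmd, send_reload):
    """Run the preflight check; trigger the reload only if it exits 0.

    `check_cmd` is an argv list and `send_reload` is a zero-argument
    callable; both are supplied by the caller.
    """
    result = subprocess.run(check_cmd)
    if result.returncode != 0:
        return False
    send_reload()
    return True
```

For example, `check_cmd` could be `['swift-proxy-server', '--test-config', '/etc/swift/proxy-server.conf']` and `send_reload` could send SIGUSR1 to the running server's PID.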
Zuul 477423f60a Merge "Clean up a bunch of deprecation warnings" 2023-01-18 21:01:11 +00:00
Tim Burke 20b48a6900 Clean up a bunch of deprecation warnings
pytest still complains about some 20k warnings, but the vast majority
are actually because of eventlet, and a lot of those will get cleaned up
when upper-constraints picks up v0.33.2.

Change-Id: If48cda4ae206266bb41a4065cd90c17cbac84b7f
2022-12-27 13:34:00 -08:00
Tim Burke 597887dedc Extract SwiftHttpProtocol to its own module
Change-Id: I35cade2c46eb6acb66c064cde75d78173f46864c
2022-12-06 11:15:53 -08:00
Tim Burke f08b8e0af2 wsgi: Start workers in parallel, rather than serially
A single worker can take a second or two to start up, which really adds
up when you've got, say, 48 workers.

Change-Id: Id9e8b7e670a67233ac8b79da0b7b8387e302ac02
Related-Change: I83cdaa2cbd394cbd49609c65bf9c5ed026c55417
2022-11-07 08:14:44 -08:00
Clay Gerrard 12bc79bf01 Add ring_ip option to object services
This will be used when finding their own devices in rings, defaulting to
the bind_ip.

Notably, this allows services to be containerized while servers_per_port
is enabled:

* For the object-server, the ring_ip should be set to the host ip and
  will be used to discover which ports need binding. Sockets will still
  be bound to the bind_ip (likely 0.0.0.0), with the assumption that the
  host will publish ports 1:1.

* For the replicator and reconstructor, the ring_ip will be used to
  discover which devices should be replicated. While bind_ip could
  previously be used for this, it would have required a separate config
  from the object-server.

Also rename object daemon's bind_ip attribute to ring_ip so that it's
more obvious wherever we're using the IP for ring lookups instead of
socket binding.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: I1c9bb8086994f7930acd8cda8f56e766938c2218
2022-06-02 16:31:29 -05:00
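
Illustration for 12bc79bf01: the fallback and lookup described above, as a small sketch. The function names, the `'0.0.0.0'` default for bind_ip, and the list-of-dicts ring model are assumptions made for illustration, not Swift's ring API.

```python
def get_ring_ip(conf):
    """ring_ip falls back to bind_ip when it is not set (assumed default)."""
    return conf.get('ring_ip', conf.get('bind_ip', '0.0.0.0'))


def ports_for_ip(devs, ring_ip):
    """Return the sorted set of ports the ring assigns to ring_ip."""
    return sorted({dev['port'] for dev in devs if dev['ip'] == ring_ip})
```

In a container, `bind_ip` would typically stay `0.0.0.0` for socket binding while `ring_ip` carries the host address used for the ring lookup.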
Tim Burke 05b2e894a9 Log signal handling at notice
Signals feel like they belong with the other process-management info
we already log at notice -- more important than a simple info(), but
definitely not an error().

While we're at it, have Daemon include the receiving pid when logging
about SIGTERM.

Change-Id: I787b9d5a35ce760450bc7389b53f0540f70c8d76
2022-04-29 09:27:38 -07:00
Tim Burke 9bc1c008a5 Get rid of pipeline_property
Instead, ensure every middleware gets a reference to the final WSGI
application. Note that this reimplements much of paste.deploy's pipeline
handling, but that code hasn't changed in years, anyway.

Change-Id: I2fbb21cabf72849ce84760a6d2607aa2af67f286
2022-01-27 14:40:27 -08:00
Alistair Coles 5079d8429d internal-client: pass global_conf to loadapp
The internal client previously provided no easy way to programmatically
customise the configuration of the proxy-server app or other
middlewares in its wsgi pipeline.  This patch allows a global_conf
dict to be passed via the InternalClient constructor to the wsgi
loadapp function. Items in the global_conf dict will override options
loaded from the config file. An example use case would be to change
the log_name from the default 'swift', which would be useful to
differentiate logs from different processes using an internal client.

The minimum version of PasteDeploy is increased to 2.0.0 to make the
global_conf behavior predictable: in older versions global_conf would
not override options in the conf file DEFAULT section, but since 2.0.0
it will.

Change-Id: Ida39ec7eb02a93cf4b2aa68fc07b7f0ae27b5439
2021-12-20 18:16:32 +00:00
Tim Burke 927691098e Plumb allow_modify_pipeline through run_wsgi/run_server
Otherwise, there's no good way to launch a multi-worker, gatekeeper-less
proxy server.

Change-Id: I343609e5c0baeddf91facde463f4458c82fd3193
2021-08-17 17:08:54 -07:00
Tim Burke c374a7a851 Allow floats for all intervals
Change-Id: I91e9bc02d94fe7ea6e89307305705c383087845a
2021-05-05 15:30:21 -07:00
Zuul bed368154a Merge "Test proxy-server.conf-sample" 2021-03-31 08:56:27 +00:00
Alistair Coles 4539837647 Avoid loops when gathering container listings from shards
Previously the proxy container controller could, in corner cases, get
into a loop while building a listing for a sharded container. For
example, if a root has a single shard then the proxy will be
redirected to that shard, but if that shard has shrunk into the root
then it will redirect the proxy back to the root, and so on until the
root is updated with the shard's shrunken status.

There is already a guard to prevent the proxy fetching shard ranges
again from the same container that it is *currently* querying for
listing parts. That deals with the case when a container fills in gaps
in its listing shard ranges with a reference to itself. This patch
extends that guard to prevent the proxy fetching shard ranges again
from any container that has previously been queried for listing parts.

Change-Id: I7dc793f0ec65236c1278fd93d6b1f17c2db98d7b
2021-01-06 16:22:26 +00:00
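
Illustration for 4539837647: the extended guard amounts to remembering every container already queried. The toy model below is an assumption-laden sketch, not the proxy's code: `gather_listing` is a hypothetical name, and each container is modeled as a pair of (objects it lists itself, containers it redirects to).

```python
def gather_listing(containers, root):
    """Build a listing while never querying the same container twice.

    `containers` maps a name to (objects, redirect_targets).
    """
    listing = []
    queried = set()
    pending = [root]
    while pending:
        name = pending.pop(0)
        if name in queried:
            # Guard: this container was already asked for listing parts.
            continue
        queried.add(name)
        objects, redirects = containers[name]
        listing.extend(objects)
        pending.extend(redirects)
    return listing
```

A root that redirects to a shard which in turn redirects back to the root terminates after two queries instead of looping.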
Tim Burke 918ab8543e Use socket_timeout kwarg instead of useless eventlet.wsgi.WRITE_TIMEOUT
No version of eventlet that I'm aware of has any sort of support for
eventlet.wsgi.WRITE_TIMEOUT; I don't know why we've been setting that.
On the other hand, the socket_timeout argument for eventlet.wsgi.Server
has been supported for a while -- since 0.14 in 2013.

Drive-by: Fix up handling of sub-second client_timeouts.

Change-Id: I1dca3c3a51a83c9d5212ee5a0ad2ba1343c68cf9
Related-Change: I1d4d028ac5e864084a9b7537b140229cb235c7a3
Related-Change: I433c97df99193ec31c863038b9b6fd20bb3705b8
2020-11-11 14:23:40 -08:00
Tim Burke 0d2604c110 wsgi: Ensure _response_headers is a list
It's reasonably common that apps or middlewares may send back headers
via dict.items(), but WSGIContext users often expect the headers to be
a list.  Convert it for them, to avoid errors like

    AttributeError: 'dict_items' object has no attribute 'append'

Change-Id: I4d061fad4da370c1cbb77ab78a55133319ea2dd7
2020-10-02 16:42:50 -07:00
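
Illustration for 0d2604c110: the conversion is a one-liner, shown here with a hypothetical helper name. Only built-in behavior is used; how WSGIContext stores the result is not shown.

```python
def as_header_list(headers):
    """Copy any iterable of (name, value) pairs into a mutable list."""
    return list(headers)


# dict.items() returns a view, which has no append(); a list does.
headers = as_header_list({'Content-Type': 'text/plain'}.items())
headers.append(('X-Trans-Id', 'tx1'))
```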
Clay Gerrard f0d406127c Test proxy-server.conf-sample
AFAICT we don't test it anywhere in unittests; but I think we depend on
it for probe tests in the gate - it should definitely be valid and
loadable.  We have lots of other tests that make proxy servers out of
configs - why not test this one if we can!

Change-Id: Iea9e01d5a66ccce6a73c7d23b358579194de89e2
2020-10-01 09:18:30 -05:00
Tim Burke 3f5e712be6 wsgi: Allow workers to gracefully exit
...via the expected HUP/USR1 mechanisms.

Change-Id: I2b7bf21fab433c4ffb7db09314ba527fb05b9f45
2020-08-21 14:12:37 -07:00
Tim Burke 452db14a09 Bind a new socket per-worker
We've seen lumpy distributions of requests to workers, which seems to
parallel what some other projects have seen [0][1].

The solution (as best we can tell) is to take advantage of the
SO_REUSEPORT that eventlet's been setting for us since basically
forever [2].

[0] https://lwn.net/Articles/542629/
[1] https://github.com/varnish/hitch/issues/142
[2] https://github.com/eventlet/eventlet/commit/f9a3074a3

Change-Id: I83cdaa2cbd394cbd49609c65bf9c5ed026c55417
2020-08-20 21:03:37 -07:00
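
Illustration for 452db14a09: a sketch of binding one listen socket per worker on a shared port. The helper name `bind_worker_socket` is hypothetical, and the sketch assumes a platform that defines SO_REUSEPORT (such as Linux); it sets the option directly rather than relying on eventlet.

```python
import socket


def bind_worker_socket(port, host='127.0.0.1'):
    """Bind a fresh listening socket that can share its port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_REUSEPORT must be set on every socket before bind().
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(128)
    return sock
```

Each worker would call this with the same port, letting the kernel spread incoming connections across the per-worker sockets.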
Tim Burke 7753eff662 py3: Stop munging RAW_PATH_INFO
We rely on having byte-accurate representations of the request path
as-seen-on-the-wire to compute signatures in s3api; the unquoting /
requoting madness to make non-ascii paths work with python's stdlib can
lead to erroneous SignatureDoesNotMatch responses.

Change-Id: I87fe3477d8b7ef186421ef2d08bc3b205c18a0c1
Closes-Bug: #1884991
2020-06-26 17:15:59 -07:00
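
Illustration for 7753eff662: a short demonstration of why unquoting and requoting cannot stand in for the bytes seen on the wire. The helper name `requote` and the example paths are made up; only the standard library is used.

```python
from urllib.parse import quote, unquote


def requote(raw_path):
    """Round-trip a path through unquote/quote, as a parser might."""
    return quote(unquote(raw_path))


# Lower-case hex digits come back upper-cased...
case_changed = requote('/v1/a/c/caf%c3%a9')
# ...and an encoded slash comes back as a literal one.
slash_changed = requote('/v1/a/c/x%2Fy')
```

Either change alters the string a client signed, which is enough to produce a signature mismatch.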
Tim Burke 2854eddb44 py3: (Better) fix percentages in configs
We previously fixed a bunch of places, but not quite *all* the places;
at the very least, some account-layer services (like the replicator and
auditor IIRC) could still bomb out -- and it's important that
replicators still respect fallocate_reserve!

Now, do the NicerInterpolation thing every time we call readconf.
Additionally, clean up the original fix to avoid globally
monkey-patching configparser.

Related-Bug: #1844368
Closes-Bug: #1872553
Change-Id: I4512e686cde37930f0482909f537220a57fef76b
2020-06-09 17:46:44 -07:00
Zuul 3cceec2ee5 Merge "Update hacking for Python3" 2020-04-09 15:05:28 +00:00
Zuul 62fc62bb12 Merge "py3: stop barfing on message/rfc822 Content-Types" 2020-04-07 08:59:05 +00:00
Andreas Jaeger 96b56519bf Update hacking for Python3
The repo is using both Python 2 and 3 now, so update hacking to
version 2.0, which supports Python 2 and 3. Note that the latest hacking
release, 3.0, only supports Python 3.

Fix problems found.

Remove hacking and friends from lower-constraints; they are not needed
for installation.

Change-Id: I9bd913ee1b32ba1566c420973723296766d1812f
2020-04-03 21:21:07 +02:00
Romain LE DISEZ d361e5febf Make wsgi server uses systemd's NOTIFY_SOCKET
Change-Id: Ice224fc2a6ba0150be180955037c13fc90365479
2020-03-31 15:22:48 -04:00
Tim Burke 04cc11b938 py3: stop barfing on message/rfc822 Content-Types
Closes-Bug: #1863053
Change-Id: I7493d3e201e26df9f200e16bc081d8a0f30308b9
2020-03-26 12:55:54 -07:00
Clay Gerrard 4601548dab Deprecate per-service auto_create_account_prefix
If we move it to constraints it's more globally accessible in our code,
but more importantly it's more obvious to ops that everything breaks if
you try to mis-configure different values per-service.

Change-Id: Ib8f7d08bc48da12be5671abe91a17ae2b49ecfee
2020-01-05 09:53:30 -06:00
Zuul e890b0f0fc Merge "WSGI server workers must drop_privledges" 2019-12-16 18:48:43 +00:00
Clay Gerrard 6b33cf99f4 WSGI server workers must drop_privledges
... just like they always have and server per port strategy still does.

Related-Change-Id: I3e5229d2fb04be67e53533ff65b0870038accbb7
Change-Id: I14e3ed201ceaceef0f8dbc44685395f350a0e7fc
2019-12-12 17:02:51 -06:00
Tim Burke 8c0fd3f138 py3: Make seamless reloads work
Starting with Python 3.4, newly-created file descriptors are non-inheritable
[0], which causes trouble when we try to use a pipe for IPC. Fortunately, the
same PEP that implemented this change *also* provided a new API to mark file
descriptors as being inheritable -- so just do that.

While we're at it,

* Fix up the probe tests to work on py3
* Fix up the probe tests to work when policy-0 is erasure-coded
* Decode the bytes read so py3 doesn't log a b'pid'
* Log a warning if the read() is empty; something surely went wrong
  in the re-exec

[0] https://www.python.org/dev/peps/pep-0446/

Change-Id: I2a8a9f3dc78abb99bf9cbcf6b44c32ca644bb07b
Related-Change: I3e5229d2fb04be67e53533ff65b0870038accbb7
2019-12-11 01:07:19 +00:00
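
Illustration for 8c0fd3f138: the PEP 446 API mentioned above, in a minimal sketch. The helper name `make_inheritable_pipe` is hypothetical, and which end of the pipe Swift marks inheritable is not stated here; the write end is chosen only for the example.

```python
import os


def make_inheritable_pipe():
    """Create a pipe whose write end survives an exec()."""
    read_fd, write_fd = os.pipe()
    # Since Python 3.4, new descriptors are non-inheritable by default.
    os.set_inheritable(write_fd, True)
    return read_fd, write_fd
```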
Darrell Bishop 1107f24179 Seamlessly reload servers with SIGUSR1
Swift servers can now be seamlessly reloaded by sending them a SIGUSR1
(instead of a SIGHUP).  The server forks off a synchronized child to
wait to close the old listen socket(s) until the new server has started
up and bound its listen socket(s).  The new server is exec'ed from the
old one so its PID doesn't change.  This makes systemd happier, so an
ExecReload= stanza can now be used.

The seamless part means that incoming connections will always get
accepted either by the old server or the new one.  This eliminates
client-perceived "downtime" during server reloads, while allowing the
server to fully reload, re-reading configuration, becoming a fresh
Python interpreter instance, etc.  The SO_REUSEPORT socket option has
already been getting used, so nothing had to change there.

This patch also includes a non-invasive fix for a current eventlet bug;
see https://github.com/eventlet/eventlet/pull/590
That bug prevents a SIGHUP "reload" from properly servicing existing
requests before old worker processes close sockets and exit.  The
existing probe tests missed this, but the new ones, in this patch, caught
it.

New probe tests cover both old SIGHUP "reload" behavior as well as the
new SIGUSR1 seamless reload behavior.

Change-Id: I3e5229d2fb04be67e53533ff65b0870038accbb7
2019-11-07 10:15:26 -08:00
Tim Burke d270596b67 Consistently use io.BytesIO
Change-Id: Ic41b37ac75b5596a8307c4962be86f2a4b0d9731
2019-10-15 15:09:46 +02:00
Zuul 6114965ab9 Merge "Fix some request-smuggling vectors on py3" 2019-10-02 23:09:48 +00:00
Tim Burke bf9346d88d Fix some request-smuggling vectors on py3
A Python 3 bug causes us to abort header parsing in some cases. We
mostly worked around that in the related change, but that was *after*
eventlet used the parsed headers to determine things like message
framing. As a result, a client sending a malformed request (for example,
sending both Content-Length *and* Transfer-Encoding: chunked headers)
might have that request parsed properly and authorized by a proxy-server
running Python 2, but the proxy-to-backend request could get misparsed
if the backend is running Python 3. As a result, the single client
request could be interpreted as multiple requests by an object server,
only the first of which was properly authorized at the proxy.

Now, after we find and parse additional headers that weren't parsed by
Python, fix up eventlet's wsgi.input to reflect the message framing we
expect given the complete set of headers. As an added precaution, if the
client included Transfer-Encoding: chunked *and* a Content-Length,
ensure that the Content-Length is not forwarded to the backend.

Change-Id: I70c125df70b2a703de44662adc66f740cc79c7a9
Related-Change: I0f03c211f35a9a49e047a5718a9907b515ca88d7
Closes-Bug: 1840507
2019-10-02 08:20:20 -07:00
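
Illustration for bf9346d88d: the "added precaution" can be sketched as a header filter. The function name `backend_headers` is hypothetical and headers are modeled as a plain dict with canonical names; the wsgi.input fix-up is not shown.

```python
def backend_headers(headers):
    """Drop Content-Length when the request is also chunked."""
    cleaned = dict(headers)
    encoding = cleaned.get('Transfer-Encoding', '')
    if 'chunked' in encoding.lower() and 'Content-Length' in cleaned:
        # Two framing headers are ambiguous; keep only the chunked one.
        del cleaned['Content-Length']
    return cleaned
```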
Tim Burke 9a33365f06 py3: Allow percentages in configs
Previously, configs like

    fallocate_reserve = 1%

would cause a py3 backend server to fail to start, complaining like

    configparser.InterpolationSyntaxError: Error in file
    /etc/swift/object-server/1.conf.d: '%' must be followed
    by '%' or '(', found: '%'

This could also come up in proxy-server configs, with things like
percent signs in tempauth passwords.

In general, we haven't really thought much about interpolation in
configs. Python's default ConfigParser has always supported it, though,
so we got it "for free". On py2, we didn't really have to think about
it, since values like "1%" would pass through just fine. (It would blow
up a SafeConfigParser, but a normal ConfigParser only does replacements
when there's something like a "%(opt)s" in the value.)

On py3, SafeConfigParser became ConfigParser, and the old interpolation
mode (AFAICT) doesn't exist.

Unfortunately, since we "supported" interpolation, we have to assume
there are deployments in the wild that use it, and try not to break
them.  So, do what we can to mimic the py2 behavior.

Change-Id: I0f9cecd11f00b522a8486972551cb30af151ce32
Closes-Bug: #1844368
2019-09-27 11:41:12 -07:00
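
Illustration for 9a33365f06: one way to approximate the py2 behavior on py3 is to interpolate only when a value contains something like "%(opt)s". This is a sketch under that assumption and may differ from the change itself; the class name `LenientInterpolation` and the section name are hypothetical.

```python
import configparser


class LenientInterpolation(configparser.BasicInterpolation):
    """Only interpolate values that contain a '%(' reference."""

    def before_get(self, parser, section, option, value, defaults):
        if '%(' not in value:
            # Bare percent signs such as "1%" pass through untouched.
            return value
        return super().before_get(parser, section, option, value, defaults)


parser = configparser.ConfigParser(interpolation=LenientInterpolation())
parser.read_string(
    '[app]\n'
    'fallocate_reserve = 1%\n'
    'base = /srv\n'
    'devices = %(base)s/node\n')
```

Values with a bare percent sign are returned as written, while existing "%(opt)s" references keep working.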
Tim Burke 424a2603d8 Fix up errno checking
Change-Id: I196b4e886942eccd2cb3aff3819da1558ddb19d4
2019-07-16 17:02:57 -07:00
Tim Burke 76fde89261 py3: Be able to read and write non-ASCII headers
Apparently Python's stdlib got more picky about what a header should
look like. As a result, if an account, container, or object had a
non-ASCII metadata name (values were fine), the proxy-server wouldn't
parse all of the headers. See https://bugs.python.org/issue37093 for
more information.

This presented several problems:
- Since the non-ASCII header aborts parsing, we may lose important
  HTTP-level information like Content-Length or Transfer-Encoding.
- Since the offending header wouldn't get parsed, the client wouldn't
  even know what the problem was.
- Even if the client knew what the bad header was, it would have no way
  to clear it, as the server uses the same logic to parse incoming
  requests.

So, hack in our own header parsing if we detect that parsing was
aborted. Note that we also have to mangle bufferedhttp's putheader so we
can get non-ASCII headers to the backend servers.

Now, we can run the test_unicode_metadata tests in
test/functional/test_account.py and test/functional/test_container.py
under py2 against services running under py3.

Change-Id: I0f03c211f35a9a49e047a5718a9907b515ca88d7
2019-07-03 02:01:55 +00:00
Tim Burke ff4459ed6b Move call to global_conf_callback after loadapp()
Otherwise, paste complains about not being able to interpolate values
into the replication_semaphore. As long as it gets dropped in before we
fork(), I think it's OK?

Closes-Bug: 1691075
Change-Id: Ib7e065c47871876786bcc9ff39737f5d1bb3c12c
2019-06-21 22:30:47 -07:00
Tim Burke 8b3d0a6c64 py3: finish porting proxy/test_server.py
Change-Id: I8287db75b4f19581203360c646e72f64fe45f170
2019-05-08 17:47:40 -07:00
Tim Burke 93b49c5e48 py3: Be able to parse non-RFC-compliant request lines
There's a bug in CPython [1] that causes servers to mis-parse request
lines that include the bytes \x85 or \xa0.  Naturally, we have
functional tests that (with high probability) will send such request
lines. There's a fix proposed, but the earliest it's likely to land
would be for 3.8, and we need to be able to target an earlier Python.

So, intercept the request line immediately before parsing and re-write
it to be RFC-compliant. Note that this is done for Python 2 as well,
though there should be no change in the request environment that
eventlet eventually hands to us.

[1] https://bugs.python.org/issue33973

Change-Id: Ie648f5c04d4415f3b620fb196fa567ce7575d522
2019-05-02 14:44:18 -07:00
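
Illustration for 93b49c5e48: once the request line is decoded as text, the bytes \x85 and \xa0 count as whitespace for a bare split(), so the line breaks into too many words. The rewrite below is a simplified sketch; `fix_request_line` is a hypothetical name, and it percent-encodes only those two bytes rather than doing a full RFC-compliant requote.

```python
def fix_request_line(raw_line):
    """Percent-encode \\x85 and \\xa0 in the target of a request line.

    `raw_line` is bytes of the form b'METHOD target VERSION'.
    """
    method, rest = raw_line.split(b' ', 1)
    target, version = rest.rsplit(b' ', 1)
    target = target.replace(b'\x85', b'%85').replace(b'\xa0', b'%A0')
    return b' '.join([method, target, version])


raw = b'GET /v1/a/c/o\xa0name HTTP/1.1'
# Decoded as Latin-1, the raw line splits into four words, not three.
words_before = raw.decode('latin-1').split()
words_after = fix_request_line(raw).decode('latin-1').split()
```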
Tim Burke 6c93c57685 Run docs tox env under py3
...and fix this one warning-treated-as-error that crept in since
https://github.com/openstack/swift/commit/d185b60

Change-Id: Id46ee3ab23e2703170191528427aaa2788aba1ee
2019-03-29 16:16:53 -07:00
Tim Burke e5eb673ccb Stop monkey-patching mimetools
You could *try* doing something similar to what we were doing
there over in email.message for py3, but you would end up
breaking pkg_resources (and therefore entrypoints) in the
process.

Drive-by: have mem_diskfile implement more of the diskfile API.

Change-Id: I1ece4b4500ce37408799ee634ed6d7832fb7b721
2019-03-13 21:51:36 -07:00