swift

Commit Graph

Author	SHA1	Message	Date
Tim Burke	307315bde2	docs: Move metric name/description tables out to separate page(s) Offer it both by service and as a single, more easily searchable, page. That admin guide is still too long, but this should help a bit. Change-Id: I946c72f40dce2f33ef845a0ca816038727848b3a	2023-05-30 11:38:42 -07:00
Tim Burke	52a4fe37aa	Various doc formatting cleanups * Get rid of a bunch of accidental blockquote formatting * Always declare a lexer to use for ``.. code::`` blocks Change-Id: I8940e75b094843e542e815dde6b6be4740751813	2022-08-02 14:28:36 -07:00
Matthew Oliver	7a105b5ef0	Add and pipe reconstructor stats through recon This patch plumbs the object-reconstructor stats that are dropped into recon cache out through the middleware and swift-recon tool. This adds a '/recon/reconstruction/object' to the middleware. As such the swift-recon tool has grown a '-R' or '--reconstruction' option access this data from each node. Plus some tests and documentation updates. Change-Id: I98582732ca5ccb2e7d2369b53abf9aa8c0ede00c	2021-08-20 00:03:40 +00:00
Tim Burke	314347a3cb	Update SAIO & docker image to use 62xx ports Note that existing SAIOs with 60xx ports should still work fine. Change-Id: If5dd79f926fa51a58b3a732b212b484a7e9f00db Related-Change: Ie1c778b159792c8e259e2a54cb86051686ac9d18	2020-07-20 15:17:12 -07:00
Darrell Bishop	1107f24179	Seamlessly reload servers with SIGUSR1 Swift servers can now be seamlessly reloaded by sending them a SIGUSR1 (instead of a SIGHUP). The server forks off a synchronized child to wait to close the old listen socket(s) until the new server has started up and bound its listen socket(s). The new server is exec'ed from the old one so its PID doesn't change. This makes Systemd happier, so a ReloadExec= stanza can now be used. The seamless part means that incoming connections will always get accepted either by the old server or the new one. This eliminates client-perceived "downtime" during server reloads, while allowing the server to fully reload, re-reading configuration, becoming a fresh Python interpreter instance, etc. The SO_REUSEPORT socket option has already been getting used, so nothing had to change there. This patch also includes a non-invasive fix for a current eventlet bug; see https://github.com/eventlet/eventlet/pull/590 That bug prevents a SIGHUP "reload" from properly servicing existing requests before old worker processes close sockets and exit. The existing probe tests missed this, but the new ones, in this patch, caught it. New probe tests cover both old SIGHUP "reload" behavior as well as the new SIGUSR1 seamless reload behavior. Change-Id: I3e5229d2fb04be67e53533ff65b0870038accbb7	2019-11-07 10:15:26 -08:00
Alexandra Settle	0c16fd9536	Fixing broken links Small changes, but helpful, mostly. Backport: stein rocky Change-Id: Ic4b6524d7804d2f74b2973b6acdb9e2679209cd4	2019-08-16 11:45:52 +00:00
Samuel Merritt	8e651a2d3d	Add fallocate_reserve to account and container servers. The object server can be configured to leave a certain amount of disk space free; default is 1%. This is useful in avoiding 100%-full filesystems, as those can get Swift in a state where the filesystem is too full to write tombstones, so you can't delete objects to free up space. When a cluster has accounts/containers and objects on the same disks, then you can wind up with a 100%-full disk since account and container servers don't respect fallocate_reserve. This commit makes account and container servers respect fallocate_reserve so that disks shared between account/container and object rings won't get 100% full. When a disk's free space falls below the configured reserve, account and container PUT, POST, and REPLICATE requests will fail with a 507 status code. These are the operations that can significantly increase the disk space used by a given database. I called the parameter "fallocate_reserve" for consistency with the object server. No actual fallocate() call happens under Swift's control in the account or container servers (sqlite3 might make such a call, but it's out of our hands). Change-Id: I083442eef14bf83c0ea717b1decb3e6b56dbf1d0	2018-07-18 17:27:11 +10:00
chengebj5238	222df91857	Modify redirection URL and broken URL Change-Id: I9a04cb2fbe61e1fbd8185ab2fac9abbcea4d55cc	2018-01-18 17:05:10 +08:00
Clay Gerrard	7013e70ca6	Represent dispersion worse than one replicanth With a sufficiently undispersed ring it's possible to move an entire replicas worth of parts and yet the value of dispersion may not get any better (even though in reality dispersion has dramatically improved). The problem is dispersion will currently only represent up to one whole replica worth of parts being undispersed. However with EC rings it's possible for more than one whole replicas worth of partitions to be undispersed, in these cases the builder will require multiple rebalance operations to fully disperse replicas - but the dispersion value should improve with every rebalance. N.B. with this change it's possible for rings with a bad dispersion value to measure as having a significantly smaller dispersion value after a rebalance (even though they may not have had their dispersion change) because the total amount of bad dispersion we can measure has been increased but we're normalizing within a similar range. Closes-Bug: #1697543 Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686	2017-12-28 11:16:17 -08:00
junboli	df00122e74	doc migration: update the doc link address[2/3] Update the doc links affected by the doc migration. Although there was some effort to fix these, many bad doc links remained. These changes are separated into 3 patches that aim to fix all of them; this is the 2nd patch, for doc/manpages. Change-Id: Id426c5dd45a812ef801042834c93701bb6e63a05	2017-09-15 06:31:00 +00:00
shangxiaobj	c93c0c0c6e	[Trivialfix]Fix typos in swift Fix typos that found in swift. Change-Id: I52fad1a4882cec4456f22174b46d54e42ec66d97	2017-08-04 07:50:10 +00:00
Tim Burke	13a07aa77a	Misc doc cleanup * Change some absolute URLs to internal links * Fix some bulletted list indentation * Choose a better lexer for some syntax highlighting * Use ``inline code`` instead of `italics` for some example command lines * Change some quoted paragraphs that only included inlined code to be proper code blocks Change-Id: Iaaa7eefb690122f5af9dcb1c871358c22335c743	2017-07-12 12:14:45 -07:00
lijunbo	21396bc106	keep consistent naming convention of swift and urls Change-Id: Iddd4f69abf77a5c643ce8b164fc6cfd72c068229	2017-03-23 02:28:41 +00:00
Jenkins	8ed8077a04	Merge "Add missing expirer recon metric to admin_guide"	2016-12-01 17:52:27 +00:00
Alistair Coles	463e22a314	Add missing expirer recon metric to admin_guide Add expirer/object to recon metrics, which reports such as: $ curl -s http://localhost:6010/recon/expirer/object {"object_expiration_pass": 0.19765901565551758, "expired_last_pass": 1} Change-Id: Ia9a171c09efebe5ad56c9de2952a8f29188c4970	2016-12-01 10:32:19 +00:00
Jenkins	b4fd962cad	Merge "Add missing recon metrics to admin_guide"	2016-12-01 10:04:44 +00:00
Ondřej Nový	9847796f01	Set owner of drive-audit recon cache to swift user Fixes this problem: * swift-drive-audit needs to be run by root, because only root has "umount" permission * swift-object servers typically run as user swift * if swift-drive-audit is run by root, /var/cache/swift/drive.recon is owned by root, with 0o600 * recon middleware (inside swift-object-server) can't read this cache file: swift-object: Error reading recon cache file This patch adds a "user" option to the drive-audit config file. Recon cache is chowned to this user. Change-Id: Ibf20543ee690b7c5a37fabd1540fd5c0c7b638c9	2016-10-19 17:16:42 +00:00
Jenkins	6daa382c34	Merge "Revises 'url' to 'URL' and 'json' to 'JSON'"	2016-10-06 00:23:41 +00:00
Yushiro FURUKAWA	9b98c89983	Revises 'url' to 'URL' and 'json' to 'JSON' Change-Id: I44743fbb9bcbce3a50ed6770264ba0f4b17803d7	2016-09-30 22:21:03 +09:00
zheng yin	05642d2958	fix word spelling mistake Change-Id: Ia7b03e52b8d6a334fc2b67c94912effe0e659941	2016-09-30 16:43:54 +08:00
Kota Tsuyuzaki	dfa5523d8c	Add Pros/Cons docs for global cluster consideration This comes from discussion in Bristol Hackathon (Feb 2016). Currently Swift has a couple of choices (Global Cluster and Container Sync) to sync the stored data into geographically distributed locations. This patch adds the summary of the discussion comparing between Global Cluster and Container Sync to enable operators to know which functionality fits their own use case. And, to be fairness with container-sync, this patch moves global cluster docs into overview_global_cluster.rst from admin_guide.rst. Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Change-Id: I624eb519503ae71dbc82245c33dab6e8637d0f8b	2016-08-17 12:52:25 +01:00
Christian Schwede	699953508a	Add doc entry to check partition count A high or increasing partition count due to storing handoffs can have some severe side-effects, and replication might never be able to catch up. This patch adds a note to the admin_guide on how to check this. Change-Id: Ib4e161d68f1a82236dbf5fac13ef9a13ac4bbf18	2016-07-26 12:23:54 +02:00
Christian Schwede	b5a16beb38	Add missing recon metrics to admin_guide Change-Id: Ibd484e088c915269a46f5fffe3ce627a80b3418e	2016-07-17 14:31:37 +00:00
Jenkins	11c5ef7d22	Merge "[Docs] Document prevention of disk full scenarios"	2016-06-08 21:51:02 +00:00
Nelson Almeida	daae74ca65	Adding sorting_method to admin_guide Change-Id: I1162f154e3a577a95f9f5ea0e0f723b7df5a4baf	2016-06-01 17:29:10 -03:00
Clay Gerrard	b52eccb3b1	Clarify overload best practices in admin guide Change-Id: Ib7c08bdeab6374771bb8e2b05053e7e16973524d	2016-05-25 11:21:25 -07:00
Christian Schwede	f1fd50723b	Add dispersion --verbose example to admin guide Change-Id: I5f9cacedde2a329332ccf744800b6f2453e8b28e	2016-05-25 09:53:33 +02:00
Matthew Oliver	b3ab715c05	Add ring-builder dispersion command to admin guide This change updates the admin guide to point out the dispersion command in swift-ring-builder and mentions the dispersion verbose table to make it more obvious to operators. Change-Id: I72b4c8b2d718e6063de0fdabbaf4f2b73694e0a4	2016-05-25 14:35:54 +10:00
Andy McCrae	efdf123a40	[Docs] Document prevention of disk full scenarios Adds section to detail how to prevent disk full scenarios from occurring. Change-Id: Iafb4a47fa4892f6067252f3a80de87cd76506a40	2016-05-16 10:09:33 +00:00
Shashirekha Gundur	cf48e75c25	change default ports for servers Changing the recommended ports for Swift services from ports 6000-6002 to unused ports 6200-6202; so they do not conflict with X-Windows or other services. Updated SAIO docs. DocImpact Closes-Bug: #1521339 Change-Id: Ie1c778b159792c8e259e2a54cb86051686ac9d18	2016-04-29 14:47:38 -04:00
Donagh McCabe	e38b53393f	Cleanup of Swift Ops Runbook This patch cleans up some rough edges that were left (due to time constraints) in the original commit. Change-Id: Id4480be8dc1b5c920c19988cb89ca8b60ace91b4 Co-Authored-By: Gerry Drudy gerry.drudy@hpe.com	2016-03-10 17:39:54 +00:00
Christian Schwede	043fbca6d0	Remove Erasure Coding beta status from docs This removes notes stating support for Erasure coding as beta. Questions regarding the stability of EC are coming up regularly, and are often referring to the docs that state EC as still in beta. Besides this, a note marking statsd support as beta has been removed as well. Change-Id: If4fb6a5c4cb741d42953db3cee8cb17a1d774e15	2016-03-04 14:27:23 +00:00
Jenkins	eaf6af3179	Merge "Allow IPv6 addresses/hostnames in StatsD target"	2016-02-04 03:23:01 +00:00
Darrell Bishop	26327e1e8b	Allow IPv6 addresses/hostnames in StatsD target The log_statsd_host value can now be an IPv6 address or a hostname which only resolves to an IPv6 address. In both cases, the new behavior is to use an AF_INET6 socket on which .sendto() is called with the originally-configured hostname (or IP). This means the Swift process is not caching a DNS resolution for the lifetime of the process (a good thing). If a hostname resolves to both an IPv6 or IPv4 address, an AF_INET socket is used (i.e. only the IPv4 address will receive the UDP packet). The old behavior is preserved: any invalid IP address literals and failures in DNS resolution or actual StatsD packet sending do not halt the process or bubble up; they are caught, logged, and otherwise ignored. Change-Id: Ibddddcf140e2e69b08edf3feed3e9a5fa17307cf	2016-02-03 00:26:31 -08:00
HugoKuo	e75888b281	Add more description for write_affinity_node_count parameter in the doc. Change-Id: Iad410a2be4f9a2cd5c53e860b9f91993aa7f2369 Closes-Bug: #1531173	2016-01-06 14:33:23 +08:00
Ondřej Nový	e0430fc74a	Compare Swift config checksum in swift-recon --all Change-Id: I796fe0895f4e5ddeb04c0d79a73579ce8bb9aa40	2015-11-05 21:21:21 +01:00
Paul Dardeau	73e032049f	Update admin guide with region. Added region prefix to example commands for adding devices to ring. Also updates description to include region prefix. Change-Id: Ie6d6485b497cea973e37909b5b19b44946c8aa89	2015-10-23 18:20:25 +00:00
Jenkins	63ab40db9a	Merge "Improving statistics sent to Graphite."	2015-09-09 07:12:01 +00:00
Carlos Cavanna	4765189ef3	Improving statistics sent to Graphite. Currently, statistics are organized by command. However, it would also be useful to display statistics organized by policy. Different policies may be based on different storage properties (ie, faster disks). With this change, all the statistics for object timers will be sent per policy as well. Policy statistics reporting will use policy index and the name in Graphite will show as proxy-server.object.policy.<policy-index>.<verb>, etc. Updated unit tests for per-policy stat reporting and added new unit tests for invalid cases. Updated documentation in the Administrator's Guide to reflect this new aggregation. Change-Id: Id70491e4833791a3fb8ff385953d69018514cd9c	2015-08-21 13:45:00 -04:00
Hisashi Osanai	79ba4a8598	Enable Object Replicator's failure count in recon This patch exposes the count of object replication failures in recon. In addition, "failure_nodes" is added to Account Replicator and Container Replicator. Recon shows the count of object replication failures as follows: $ curl http://<ip>:<port>/recon/replication/object { "replication_last": 1416334368.60865, "replication_stats": { "attempted": 13346, "failure": 870, "failure_nodes": { "192.168.0.1": {"sdb1": 3}, "192.168.0.2": {"sdb1": 851, "sdc1": 1, "sdd1": 8}, "192.168.0.3": {"sdb1": 3, "sdc1": 4} }, "hashmatch": 0, "remove": 0, "rsync": 0, "start": 1416354240.9761429, "success": 1908 }, "replication_time": 2316.5563162644703, "object_replication_last": 1416334368.60865, "object_replication_time": 2316.5563162644703 } Note that 'object_replication_last' and 'object_replication_time' are considered to be transitional and will be removed in subsequent releases. Use 'replication_last' and 'replication_time' instead. Additionally, this patch adds the count to swift-recon, where it will be shown as follows: $ swift-recon object -r ======================================================================== ======= --> Starting reconnaissance on 4 hosts ======================================================================== ======= [2014-11-27 16:14:09] Checking on replication [replication_failure] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 4 [replication_success] low: 3, high: 3, avg: 3.0, total: 12, Failed: 0.0%, no_result: 0, reported: 4 [replication_time] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 4 [replication_attempted] low: 1, high: 1, avg: 1.0, total: 4, Failed: 0.0%, no_result: 0, reported: 4 Oldest completion was 2014-11-27 16:09:45 (4 minutes ago) by 192.168.0.4:6002. Most recent completion was 2014-11-27 16:14:19 (-10 seconds ago) by 192.168.0.1:6002. ======================================================================== ======= Consider a cluster in which one server runs with this patch and the other servers run without it. If swift-recon is executed on the server that runs with this patch, the output contains unnecessary information such as [failure], [success] and [attempted], because the servers that run without this patch cannot send a response with the information this patch needs. Therefore, once you apply this patch, also apply it to the other servers before you execute swift-recon. DocImpact Change-Id: Iecd33655ae2568482833131f422679996c374d78 Co-Authored-By: Kenichiro Matsuda <matsuda_kenichi@jp.fujitsu.com> Co-Authored-By: Brian Cline <bcline@softlayer.com> Implements: blueprint enable-object-replication-failure-in-recon	2015-08-18 11:40:02 +09:00
Jenkins	617c6b0107	Merge "Time synchronization check in recon."	2015-08-18 01:21:22 +00:00
Jenkins	57791b6cd2	Merge "+Document method to avoid rsync filling root drive"	2015-08-11 08:27:17 +00:00
Ben Martin	89f5906286	+Document method to avoid rsync filling root drive When rsync pushes to a remote node with an unmounted drive and if certain steps are not taken, rsync may attempt to write files to the local drive at the location where the drive was mounted. There are two suggested solutions for this issue: 1) Set the permissions for all mount points in /srv/node/ to root:root 755 2) Mount the drives elsewhere and symlink the drives to /srv/.../ The first method ensures that only root and not the swift user can write in the /srv/.../ directories. The second method will prompt a broken link issue if rsync attempts to write to an unmounted drive. Change-Id: I60ce4ed9ef8401768d5f78b6806cbb2e2a65303e Closes-Bug: #1470576	2015-08-05 09:29:07 -05:00
Jenkins	e1683fdb2e	Merge "Support keystone v3 domains in swift-dispersion"	2015-07-31 06:59:01 +00:00
Falk Reimann	363a256e58	Support keystone v3 domains in swift-dispersion This provides the capability to specify a project_name, project_domain_name and user_domain_name in /etc/swift/dispersion.conf. If these values are set in dispersion.conf, they get populated to the swift-client. With this it is possible to have a specific dispersion project specified, which is not the keystone default domain. Changes were applied to swift-dispersion-populate and swift-dispersion-report. Relevant man pages, the example dispersion.conf and the admin guide were updated accordingly. DocImpact Closes-Bug: #1468374 Change-Id: I0e716f8d281b4d0f510bc568bcee4a13fc480ff7	2015-07-24 13:40:24 -05:00
Ondrej Novy	dd2f1be3b1	Time synchronization check in recon. This change adds a time call to the recon middleware and a --time parameter to the recon CLI. This is useful for checking whether time in the cluster is synchronized. Change-Id: I62373e681f64d0bd71f4aeb287953dd3b2ea5662	2015-07-23 11:35:02 +02:00
paul luse	e6165a7879	Add policy support to dispersion tools Doesn't work for anything other than policy 0. updated to allow user to specify policy name on cmd line (as with object-info) which then makes populate/report work with 3x, 2x, or EC style policies Change-Id: Ib7c298f0f6d666b1ecca25315b88539f45cf9f95 Closes-Bug: 1458688	2015-06-23 02:14:02 -07:00
Christian Schwede	55dd705a86	Add missing statsd metrics section for object-reconstructor Change-Id: Id3f98e5f637ff537a387262b40f21c05876fca91	2015-05-06 19:53:09 +02:00
Samuel Merritt	8d3b3b2ee0	Add some debug output to the ring builder Sometimes, I get handed a builder file in a support ticket and a question of the form "why is the balance [not] doing $thing?". When that happens, I add a bunch of print statements to my local swift/common/ring/builder.py, figure things out, and then delete the print statements. This time, instead of deleting the print statements, I turned them into debug() calls and added a "--debug" flag to the rebalance command in hopes that someone else will find it useful. Change-Id: I697af90984fa5b314ddf570280b4585ba0ba363c	2015-03-30 17:47:28 -07:00
Shilla Saebi	a1872b0498	Fix 2 typos in admin_guide file Change-Id: Ibf1e5dbf6ff4747c7f23f6638321ab41bba3021b	2014-11-24 15:38:25 +00:00

100 Commits