swift

Commit Graph

Author	SHA1	Message	Date
Matthew Oliver	7a105b5ef0	Add and pipe reconstructor stats through recon This patch plumbs the object-reconstructor stats that are dropped into recon cache out through the middleware and swift-recon tool. This adds a '/recon/reconstruction/object' to the middleware. As such the swift-recon tool has grown a '-R' or '--reconstruction' option to access this data from each node. Plus some tests and documentation updates. Change-Id: I98582732ca5ccb2e7d2369b53abf9aa8c0ede00c	2021-08-20 00:03:40 +00:00
Matthew Oliver	85e36f7122	recon: refactor common recon names into a common location Change-Id: I0a0766cfb6672377de0f152ce179c874c327ec54	2021-06-29 15:22:57 -07:00
Tim Burke	39ad468dfe	Add async_pending_last time to object.recon The async_pending count isn't near as useful when we don't know how out of date it is. Change-Id: I3e5e904ffc0eba7a7e141e1c2d9f9840e4952041	2021-06-15 08:12:05 -07:00
Matthew Oliver	4ce907a4ae	relinker: Add /recon/relinker endpoint and drop progress stats To further benefit the stats capturing for the relinker, drop partition progress to a new relinker.recon recon cache and add a new recon endpoint: GET /recon/relinker To get live relinking progress data: $ curl http://127.0.0.3:6030/recon/relinker | python -mjson.tool { "devices": { "sdb3": { "parts_done": 523, "policies": { "1": { "next_part_power": 11, "start_time": 1618998724.845616, "stats": { "errors": 0, "files": 1630, "hash_dirs": 1630, "linked": 1630, "policies": 1, "removed": 0 }, "timestamp": 1618998730.24672, "total_parts": 1029, "total_time": 5.400741815567017 }}, "start_time": 1618998724.845946, "stats": { "errors": 0, "files": 836, "hash_dirs": 836, "linked": 836, "removed": 0 }, "timestamp": 1618998730.24672, "total_parts": 523, "total_time": 5.400741815567017 }, "sdb7": { "parts_done": 506, "policies": { "1": { "next_part_power": 11, "part_power": 10, "parts_done": 506, "start_time": 1618998724.845616, "stats": { "errors": 0, "files": 794, "hash_dirs": 794, "linked": 794, "removed": 0 }, "step": "relink", "timestamp": 1618998730.166175, "total_parts": 506, "total_time": 5.320528984069824 } }, "start_time": 1618998724.845616, "stats": { "errors": 0, "files": 794, "hash_dirs": 794, "linked": 794, "removed": 0 }, "timestamp": 1618998730.166175, "total_parts": 506, "total_time": 5.320528984069824 } }, "workers": { "100": { "drives": ["sda1"], "return_code": 0, "timestamp": 1618998730.166175} }} Also, add a constant DEFAULT_RECON_CACHE_PATH to help fix failing tests by mocking recon_cache_path, so that errors are not logged due to dump_recon_cache exceptions. Mock recon_cache_path more widely and assert no error logs more widely. Change-Id: I625147dadd44f008a7c48eb5d6ac1c54c4c0ef05	2021-05-10 16:13:32 +01:00
Tim Burke	f2a4c50dce	Include sharding cycle time in recon Change-Id: Id7e828a56c8a62a1f3e9a1dbbff5a56c928ac6b8	2021-04-25 15:11:49 +00:00
Zuul	75e86425b9	Merge "Plumb sharding stats though recon middleware"	2021-02-26 18:42:21 +00:00
Matthew Oliver	b1309c95e5	Plumb sharding stats though recon middleware To make it easier to have access to the sharding stats add /recon/sharding as a recon middleware endpoint. This allows an easy way to ask a container server for its sharding stats using REST inside the cluster: curl <container-server>/recon/sharding Also add a get_recon method to the direct client so it can also be used easily inside tooling and probe tests. Co-Authored-By: Alistair Coles <alistairncoles@gmail.com> Change-Id: I2a6024277d1198d8c996682682bfe28797344951	2021-02-26 15:51:06 +00:00
Alistair Coles	9eac76258a	Trivial fixes in recon middleware Fix docstring and remove unused mount_check var. Change-Id: I30ead8b72cb616d1311ffc81d9cfecb1afc9a05e	2021-02-24 11:02:47 +00:00
Tim Burke	f192f51d37	Have check_drive raise ValueError on errors ...which helps us differentiate between a drive that's not mounted vs. not a dir better in log messages. We were already doing that a bit in diskfile.py, and it seems like a useful distinction; let's do it more. While we're at it, remove some log translations. Related-Change: I941ffbc568ebfa5964d49964dc20c382a5e2ec2a Related-Change: I3362a6ebff423016bb367b4b6b322bb41ae08764 Change-Id: Ife0d34f9482adb4524d1ab1fe6c335c6b287c2fd Partial-Bug: 1674543	2018-06-20 17:15:07 -07:00
Pavel Kvasnička	163fb4d52a	Always require device dir for containers For test purposes (e.g. saio probetests) even if mount_check is False, still require check_dir for account/container server storage when real mount points are not used. This behavior is consistent with the object-server's checks in diskfile. Co-Author: Clay Gerrard <clay.gerrard@gmail.com> Related lp bug #1693005 Related-Change-Id: I344f9daaa038c6946be11e1cf8c4ef104a09e68b Depends-On: I52c4ecb70b1ae47e613ba243da5a4d94e5adedf2 Change-Id: I3362a6ebff423016bb367b4b6b322bb41ae08764	2017-09-01 10:32:12 -07:00
Alistair Coles	609b5182c4	Refactor recon to use single md5_hash_for_file function There were several implementations of hashing the content of a file in cli/recon.py and common/middleware/recon.py. This patch relocates one implementation (_hash_for_ringfile, introduced in the Related Change) to common/utils.py and refactors recon cli and middleware to use that function. Also improves use of mocking in the unit tests to eliminate passing custom file opener functions to the ReconMiddleware get_ring_md5 and get_swift_conf_md5 methods. Related-Change: I9623752c3cd2361f57864f3e938e1baf5e9292d7 Change-Id: Iaad88e49aadeb28f614aafa1e9596fe07ce9793a	2016-12-02 18:22:59 +00:00
Clay Gerrard	053b625f42	Remove ring md5 integration check from recon unittests The actual value computed by md5 isn't that important; even in recon it's only used as an opaque identifier that is assumed to be consistent across nodes for the same file. However the way these tests were written with hard coded md5 values makes them brittle to changes in the RingData format and susceptible to the burden of needless unrelated test maintenance churn. e.g. Related-Change: I23b5e0a8082b30ca257aeb1fab03ab74e6f0b2d4 Change-Id: I9623752c3cd2361f57864f3e938e1baf5e9292d7	2016-11-30 16:55:05 -08:00
Brian Cline	a537684c77	Don't report recon mount/usage status on files Today recon will include normal files in the payload it returns for /recon/unmounted and /recon/diskusage. As a result it can trigger bogus alarms on any operations-side monitoring checking for unmounted disks or disks that show up in diskusage with weird looking stats. This change adds an isdir check for the entries it finds in /srv/node. Change-Id: Iad72e03fdda11ff600b81b4c5d58020cc4b9048e Closes-bug: #1556747	2016-03-14 00:17:47 -05:00
Zack M. Davis	1b8b08039a	remove remaining simplejson uses, prefer standard library import `a1c32702`, `736cf54a`, and `38787d0f` remove uses of `simplejson` from various parts of Swift in favor of the standard library `json` module (introduced in Python 2.6). This commit performs the remaining `simplejson` to `json` replacements, removes two comments highlighting quirks of simplejson with respect to Unicode, and removes the references to it in setup documentation and requirements.txt. There were a lot of places where we were importing json from swift.common.utils, which is less intuitive than a direct `import json`, so that replacement is made as well. (And in two more tiny drive-bys, we add some pretty-indenting to an XML fragment and use `super` rather than naming a base class explicitly.) Change-Id: I769e88dda7f76ce15cf7ce930dc1874d24f9498a	2015-11-16 12:34:24 -08:00
Brian Cline	460a7e4b64	Fixes recon bug with initially missing rings Previously the recon middleware was doing a basic scan for object rings that exist at init time. In situations where an object-server was started without an object ring present, but received one shortly after, recon still would not report it in the /recon/ringmd5 response. This persists even when object-server gleefully chugs along after picking up the ring, and recon's behavior would only be corrected by an object-server reload/restart. This change brings the middleware a bit more up to date to use the common POLICIES instance to determine what policies were already loaded based on configuration, and derives the path for each ring. This effectively makes the config the source of truth for what rings should be present, rather than what's present at startup. Since we already dynamically check in ReconMiddleware.get_ring_md5 whether each of the predetermined ring files exist, recon now correctly reports a previously-missing ring whenever it falls into place. Change-Id: Ia079418e54ffac5e01ef6a15511f5069b7fe83ea	2015-09-13 19:10:17 -05:00
Hisashi Osanai	79ba4a8598	Enable Object Replicator's failure count in recon This patch exposes the count of object replication failures in recon. And "failure_nodes" is added to Account Replicator and Container Replicator. Recon shows the count of object replication failures as follows: $ curl http://<ip>:<port>/recon/replication/object { "replication_last": 1416334368.60865, "replication_stats": { "attempted": 13346, "failure": 870, "failure_nodes": { "192.168.0.1": {"sdb1": 3}, "192.168.0.2": {"sdb1": 851, "sdc1": 1, "sdd1": 8}, "192.168.0.3": {"sdb1": 3, "sdc1": 4} }, "hashmatch": 0, "remove": 0, "rsync": 0, "start": 1416354240.9761429, "success": 1908 }, "replication_time": 2316.5563162644703, "object_replication_last": 1416334368.60865, "object_replication_time": 2316.5563162644703 } Note that 'object_replication_last' and 'object_replication_time' are considered to be transitional and will be removed in the subsequent releases. Use 'replication_last' and 'replication_time' instead. Additionally this patch adds the count in swift-recon and it will be shown as follows: $ swift-recon object -r ======================================================================== ======= --> Starting reconnaissance on 4 hosts ======================================================================== ======= [2014-11-27 16:14:09] Checking on replication [replication_failure] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 4 [replication_success] low: 3, high: 3, avg: 3.0, total: 12, Failed: 0.0%, no_result: 0, reported: 4 [replication_time] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 4 [replication_attempted] low: 1, high: 1, avg: 1.0, total: 4, Failed: 0.0%, no_result: 0, reported: 4 Oldest completion was 2014-11-27 16:09:45 (4 minutes ago) by 192.168.0.4:6002. Most recent completion was 2014-11-27 16:14:19 (-10 seconds ago) by 192.168.0.1:6002. ======================================================================== ======= In a cluster where some servers run with this patch and the other servers run without it, if swift-recon executes on a server which runs with this patch, there is unnecessary information in the output such as [failure], [success] and [attempted], because the servers which run without this patch are not able to send a response with the information that this patch needs. Therefore, once you apply this patch to one server, also apply it to the other servers before you execute swift-recon. DocImpact Change-Id: Iecd33655ae2568482833131f422679996c374d78 Co-Authored-By: Kenichiro Matsuda <matsuda_kenichi@jp.fujitsu.com> Co-Authored-By: Brian Cline <bcline@softlayer.com> Implements: blueprint enable-object-replication-failure-in-recon	2015-08-18 11:40:02 +09:00
Jenkins	617c6b0107	Merge "Time synchronization check in recon."	2015-08-18 01:21:22 +00:00
Victor Stinner	a0db56dcde	Fix pep8 E265 warning of hacking 0.10 Fix the warning E265 "block comment should start with '# '" added in pep8 1.5. Change-Id: Ib57282e958be9c7cddffc7bca34fbbf1d4c460fd	2015-07-30 09:33:18 +02:00
Ondrej Novy	dd2f1be3b1	Time synchronization check in recon. This change adds a time call to the recon middleware and a --time parameter to the recon CLI. This is useful for checking whether time in the cluster is synchronized. Change-Id: I62373e681f64d0bd71f4aeb287953dd3b2ea5662	2015-07-23 11:35:02 +02:00
Lorcan	0a46793662	Add swift-recon feature to track swift-drive-audit error count This is a follow-on from a previous commit which added recon info for swift-drive-audit (https://review.openstack.org/#/c/122468/). Here, the "--driveaudit" option is added to swift-recon tool. This feature gives the statistics for the system-wide drive errors flagged by swift-drive-audit. An example of the output is as follows: (verbose mode) swift-recon --driveaudit -v =============================================================================== --> Starting reconnaissance on 5 hosts =============================================================================== [2015-03-11 17:13:39] Checking drive-audit errors -> http://1.2.3.4:6000/recon/driveaudit: {'drive_audit_errors': 14} -> http://1.2.3.5:6000/recon/driveaudit: {'drive_audit_errors': 0} -> http://1.2.3.6:6000/recon/driveaudit: {'drive_audit_errors': 37} -> http://1.2.3.7:6000/recon/driveaudit: {'drive_audit_errors': 101} -> http://1.2.3.8:6000/recon/driveaudit: {'drive_audit_errors': 0} [drive_audit_errors] low: 0, high: 101, avg: 30.4, total: 152, Failed: 0.0%, no_result: 0, reported: 5 =============================================================================== Change-Id: Ia16c52a9d613eeb3de1a5a428d88dd1233631912	2015-03-23 11:38:32 +00:00
Daisuke Morita	f8fa1a9234	Show each policy's information on quarantined files in recon After the release of Swift ver. 2.0.0, some recon responses do not show each policy's information yet. To make things worse, some recon results only count on policy-0's score, therefore the total is not shown in the recon results. This patch makes the count of quarantined files policy-aware for recon requests. Suppose a number of quarantined objects for policy-0 is 2 and a number for policy-1 is 3, recon sums up every policy's amount and shows information for each policy as follows. $ curl http://<host>:<port>/recon/quarantined {"accounts": 0, "containers": 0, "objects": 5, "policies": {"0": {"objects": 2}, "1": {"objects": 3}}} Moreover, this patch adds stats for each policy in CLI output. Change-Id: I07217c635f6fc4ea809ddbc3d859c4e81c4fde37 Related-Bug: 1375327 Related-Bug: 1375332	2015-01-20 18:42:20 +09:00
Paul Luse	8326dc9f2a	Add Storage Policy Support to Recon Middleware Recon middleware returns object ring file MD5 sums; this patch updates it to include other object files that may be present because of Storage Policies. Also adds unit test coverage for the MD5 reporting function which previously had none. The recon script will now check all rings the server responds with match the on-disk md5's regardless of server-type; including any storage policy object rings. Note the small change to the ring save method, needed to stimulate the right code paths in 2.6 and 2.7 versions of gzip to enable testing of ring MD5 sums. DocImpact Implements: blueprint storage-policies Change-Id: I01efd2999d6d9c57ee8693ac3a6236ace17c5566	2014-06-18 21:09:54 -07:00
Samuel Merritt	31dac18625	Check swift.conf MD5 with recon I've seen several folks recently have problems with their Swift clusters because they had different hash prefixes on different nodes. Let's help them out by having recon check that. Note that MD5-equality is stronger than what we need (which is ConfigParser-equality for a particular set of keys), but this way we don't expose the secret hash prefix and suffix across the internal network, just the MD5 checksum of the file containing them. Change-Id: I3af984ee45947345891b3c596a88e3464f178cc7	2014-04-10 14:08:27 -07:00
Greg Lange	8b4876f32a	Fix recon docs Change-Id: Icaa0f61e5796253dcc57b8c005577890de8aa537	2014-02-10 14:31:14 +00:00
Peter Portante	a708295d82	Remove trailing slash for consistency Change-Id: Idd4fd116b6be226e46e33f421883b6fb34947a84 Signed-off-by: Peter Portante <peter.portante@redhat.com>	2014-01-06 18:12:42 -05:00
Florian Hines	62254e42c4	Fix checkmount error parsing in swift-recon - swift-recon now handles parsing instances where 'mounted' key (in unmounted and disk_usage) is an error message instead of a bool. - Adds checkmount exception handling to the recon unmounted endpoint. - Updates existing unittest to have ismount throw an error. - Updates unittests to cover the corner cases Change-Id: Id51d14a8b98de69faaac84b2b34b7404b7df69e9	2013-12-28 20:58:27 -08:00
Kun Huang	fd4843f8e7	catch OSError to prevent breaking request /recon/diskusage swift.common.utils.ismount may raise an OSError in some special cases, and the request against /recon/diskusage did not handle it before. This patch makes the value of the 'mounted' key the error's message in that case. Change-Id: I5d9018f580181e618a3fa072b7a760d41795d8eb Closes-Bug: #1249181	2013-11-13 22:46:20 +08:00
ZhiQiang Fan	f72704fc82	Change OpenStack LLC to Foundation Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58 Closes-bug: #1214176	2013-09-20 01:02:31 +08:00
Clay Gerrard	ce12d66cf9	fix swift i18n Change-Id: I53cea28a6d7593a1b308dbcf77dddf7f40d76cb2	2013-09-09 20:25:00 -07:00
Dirk Mueller	3d36a76156	Use Python 3.x compatible except construct except x,y: was deprecated and is removed in Python 3.x. Use "except x as y:" instead which works in any Python version >= 2.6. Change-Id: I7008c74b807340f3457d3a0c8bd0b83f23169d14	2013-09-07 10:50:54 +02:00
Alex Gaynor	0f3b0410e3	Removed unnecessary monkeypatching of __builtin__ Replaced it with explicitly importing the gettext function, which is significantly more readable. Change-Id: Ia0a7edcf685fb6e4052a8290367b233169529ab8	2013-07-27 21:34:35 -07:00
Marcelo Martins	7fbb97b39e	Retrieve the swift version with recon Adding a '/recon/version' in order to get the swift version Change-Id: I7b7ddbe70abb87c6a3b1010ddefa09d0acc09710	2013-05-23 15:06:12 -05:00
Monty Taylor	abe70e8323	Cleanup based on pyflakes. pyflakes itself can't be used in any automated gating way, because there are two sets of false errors it raises. However, as an exercise, cleaning up the 'valid' ones uncovered three actual bugs. The other changes (mostly unused variables) are included here for fun. Command run: pyflakes swift | grep -v "undefined name '_'" Change-Id: I18696bf047dedad1a9fdbde3463e214fba95f7c6	2013-02-01 07:50:17 +11:00
Michael Barton	c45e435d1f	Add wsgify and split_path utilities to swob And refactor some of the code to use them. Remove unused imports. Change-Id: Ica479c10247fa85c740bb99cf7d1db7fbb1b2c80	2013-01-25 00:38:32 -08:00
gholt	a88b412e17	swift-recon: Added oldest and most recent repl I've been doing this with cluster-wide log searches for far too long. This adds support for reporting the oldest replication pass completion as well as the most recent. This is quite useful for finding those odd replicators that have hung up for some reason and need intervention. Change-Id: I7fd7260eca162d6b085f3e82aaa3cf90670f2d53	2013-01-12 05:49:14 +00:00
John Dickinson	8ac292595f	changed TRUE_VALUES references to utils.config_true_value() call cleaned up pep8 (v1.3.3) in all files this patch touches Change-Id: I30e8314dfdc23fb70ab83741a548db9905dfccff	2012-10-29 13:59:01 -07:00
Michael Barton	5e3e9a882d	local WSGI Request and Response classes This change replaces WebOb with a mostly compatible local library, swift.common.swob. Subtle changes to WebOb's API over the years have been a huge headache. Swift doesn't even run on the current version. There are a few incompatibilities to simplify the implementation/interface: * It only implements the header properties we use. More can be easily added. * Casts header values to str on assignment. * Response classes ("HTTPNotFound") are no longer subclasses, but partials on Response, so things like isinstance no longer work on them. * Unlike newer webob versions, will never return unicode objects. Change-Id: I76617a0903ee2286b25a821b3c935c86ff95233f	2012-09-28 14:48:48 -07:00
Florian Hines	243b439507	Ensure empty results are returned Make sure that empty but still valid results (like no unmounted drives) aren't treated as 500 errors. Change-Id: I9588e2711d7916406f15613d5a26b9f0cf38235a	2012-05-31 18:25:05 -05:00
Florian Hines	ccb6334c17	Expand recon middleware support Expand recon middleware to include support for account and container servers in addition to the existing object servers. Also add support for retrieving recent information from auditors, replicators, and updaters. In the case of certain checks (such as container auditors) the stats returned are only for the most recent path processed. The middleware has also been refactored and should now also handle errors better in cases where stats are unavailable. While new checks have been added the output from pre-existing checks has not changed. This should allow existing 3rd party utilities such as the Swift ZenPack to continue to function. Change-Id: Ib9893a77b9b8a2f03179f2a73639bc4a6e264df7	2012-05-24 14:50:00 -05:00
Jenkins	6168e37fd5	Merge "tests for recon middleware."	2012-03-22 20:39:50 +00:00
John Dickinson	1ecf5ebba1	updated copyright date for all files Change-Id: Ifd909d3561c2647770a7e0caa3cd91acd1b4f298	2012-03-19 13:45:34 -05:00
Florian Hines	0a461a5b8a	tests for recon middleware. My first stab at unittests for the recon middleware. Also, made some minor changes to the middleware to make testing easier now and down the road. Change-Id: I23ce853398ff035ffbfc2082e90e22038832b966	2012-03-19 13:44:43 -05:00
Chmouel Boudjnah	16a5faaaba	PEP8 fixes. Change-Id: I3c33c03547f97ca7afbb47c3bddfdeabf152afe2	2012-01-20 15:07:55 -06:00
Florian Hines	413ca11a5f	Add sockstat info to recon. Adds support for pulling info from /proc/net/sockstat and /proc/net/sockstat6 via recon. Change-Id: Idb403c6eda199c5d36d96cc9027ee249c12c7d8b	2011-11-15 17:55:14 +00:00
Florian Hines	e9b5cb83ac	simplejson import and exception/logging fixes	2011-09-01 13:46:13 -05:00
Florian Hines	b762c5acd0	pep8 fix	2011-08-14 10:49:15 -05:00
Florian Hines	dcd39d098f	account for parent/.. hardlinks	2011-08-12 16:29:13 -05:00
Florian Hines	44803a835d	add quarantine stats	2011-08-12 15:01:28 -05:00
Florian Hines	7938a5d777	quick comment on how to load recon.py	2011-08-01 21:43:55 -05:00
Florian Hines	aa622eb799	recon middlewear for the object server and utils for cluster monitoring	2011-07-27 10:41:07 -05:00

50 Commits