Commit Graph

56 Commits

Author SHA1 Message Date
Clark Boylan fb7c8790dd Retire this project
We've shut down the log processing service and don't need to manage it
with puppet anymore.

Depends-On: https://review.opendev.org/c/openstack/project-config/+/839235
Change-Id: I451488faf6a7502a5171d2a4299d7a4e40d96072
2022-04-25 09:48:21 -07:00
melanie witt 89bfe00dda Stream log files instead of loading full files into memory
For a while now, we have been seeing Elasticsearch indexing
quickly fall behind as some log files generated in the gate have become
larger. Currently, we download a full log file into memory and then
emit it line-by-line to be received by a logstash listener. When log
files are large (example: 40M) logstash gets bogged down processing
them.

Instead of downloading full files into memory, we can stream the files
and emit their lines on-the-fly to try to alleviate load on the log
processor.

This:

  * Replaces urllib2.urlopen with requests using stream=True
  * Removes manual decoding of gzip and deflate compression
    formats as these are decoded automatically by requests.iter_lines
  * Removes unrelated unused imports
  * Removes an unused arg 'retry' from the log retrieval method
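The streaming approach described above can be sketched roughly as follows (a minimal illustration, not the actual log-gearman-worker code; the function name is made up, and `requests.get(stream=True)` with `iter_lines` is the real API being relied on):

```python
import requests


def stream_log_lines(url):
    # Stream the response instead of reading the whole body into memory.
    # requests transparently decodes gzip/deflate Content-Encoding, so no
    # manual decompression step is needed here.
    resp = requests.get(url, stream=True, timeout=30)
    resp.raise_for_status()
    # iter_lines yields one decoded line at a time as chunks arrive, so a
    # 40M log file never has to fit in memory all at once.
    for line in resp.iter_lines(decode_unicode=True):
        yield line
```

Because the lines are emitted on the fly, the logstash listener starts receiving data before the download completes, which is the load-smoothing effect the commit is after.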

Change-Id: I6d32036566834da75f3a73f2d086475ef3431165
2020-11-09 09:49:07 -08:00
Zuul 09b0ed74f7 Merge "Handle case where content encoding isn't set" 2019-08-27 18:01:38 +00:00
Zuul c0f67cef05 Merge "Don't try to get .gz suffixed files in addition to base url" 2019-08-24 02:25:13 +00:00
Clark Boylan 972b6355b0 Handle case where content encoding isn't set
Belts and suspenders for cases where content encoding may not be
present. I believe this is possible if the content is served with the
identity encoding. In that case setting the encoding header isn't
required.

Change-Id: If18670d4fd3656a35f818247539b7afad39493e6
2019-08-23 18:45:06 -07:00
Clark Boylan 15991cfded Don't try to get .gz suffixed files in addition to base url
Zuul gives us the source url to index. Previously we tried to fetch
url + .gz because in many cases we uploaded the file as a gzip file but
logically treated it as unzipped. Now with logs in swift we compress
files without the .gz suffix. This means we should be able to always
fetch the url that zuul provides unmodified.

Depends-On: https://review.opendev.org/678303
Change-Id: I0ea4d9daa905ccb50372b73b5035758fc0963716
2019-08-23 21:55:17 +00:00
Clark Boylan b9063a7e7e Fix systemd severity filter input data
The severity filters are passed the entire json event and not just a
string. Update the systemd filter to access the message string out of
the event json dict.

Prior to this we get a type error:

  2019-08-19 17:18:48,055 Exception handling log event.
  Traceback (most recent call last):
    File "/usr/local/bin/log-gearman-worker.py", line 255, in
  _handle_event
      keep_line = f.process(out_event)
    File "/usr/local/bin/log-gearman-worker.py", line 183, in process
      m = self.SYSTEMDRE.match(msg)
  TypeError: expected string or buffer
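The fix amounts to pulling the message string out of the event dict before matching, along these lines (a sketch only: the regex and the "message" field name are illustrative stand-ins, not the exact ones in log-gearman-worker.py):

```python
import re

# Illustrative systemd-journal style line:
#   "Aug 19 17:18:48 host service[123]: DEBUG some text"
SYSTEMDRE = re.compile(
    r'^\w{3} \d{2} \d{2}:\d{2}:\d{2} \S+ \S+\[\d+\]: (?P<severity>\w+)')


def keep_line(event):
    # The filter is handed the full json event dict, not a bare string,
    # so extract the message field first -- matching on the dict itself
    # is what raised the TypeError above.
    msg = event.get("message", "")
    m = SYSTEMDRE.match(msg)
    if m and m.group("severity") == "DEBUG":
        return False
    return True
```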

Change-Id: I7ab56ac397133f00539d9d3374fa400363ef12d6
2019-08-19 10:27:45 -07:00
Ian Wienand 3119c0cddd log-gearman-worker: remove obsolete GET debug filter, add local filter
Now that logs have moved into swift, the os-loganalyze middleware that
stripped DEBUG level logs when the URL was given a ?level= parameter
no longer functions.

We move to filtering DEBUG statements directly.  Because services in
devstack now run as systemd services, their log files are actually
journalctl dumps.  Thus we add a new filter for systemd style
timestamps and messages (this is loosely based on the zuul log viewer
at [1]).

[1] 8c1f4e9d6b/web/src/actions/logfile.js

Change-Id: I54087c95c809612758139136d5b3e86b1a6372be
2019-08-19 17:15:25 +10:00
Ian Wienand 5b30a3a6c0 log-gearman-worker: Remove jenkins streaming workaround
We don't need to worry about the file changing under us any more; this
was all pre-zuul, let alone pre-using-swift for logs.  Remove this
workaround.

Change-Id: I5938dcef5550d4c62c8158c5f89ace75ae99aedc
2019-08-19 13:06:58 +10:00
Ian Wienand bca04e3155 log-gearman-worker: handle deflate encoded values
We are now logging to swift and store the objects as deflate encoded
data [1].  This means that we get back "Content-Encoding: deflate"
data when downloading the logs (even though we don't advertise accepting
it).

So put in a path for deflate encoding to the extant code with zlib.
For completeness we also update our Accept-Encoding: header to show we
accept deflate.

[1] 60e7542875/roles/upload-logs-swift/library/zuul_swift_upload.py (L608)
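The extra decode path can be sketched like this (an illustration of the idea, not the worker's exact code: the helper name is made up, and the raw-deflate fallback with `-zlib.MAX_WBITS` is belt-and-suspenders for servers that omit the zlib header):

```python
import zlib


def decode_body(data, content_encoding):
    # Swift serves the stored objects back with
    # "Content-Encoding: deflate"; plain zlib.decompress handles the
    # zlib-wrapped form, and a negative window size decodes raw deflate
    # streams that lack the zlib header.
    if content_encoding == "deflate":
        try:
            return zlib.decompress(data)
        except zlib.error:
            return zlib.decompress(data, -zlib.MAX_WBITS)
    return data
```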

Change-Id: I328bafea3ddae858fd77af043f16c499ddd5a30e
2019-08-19 13:02:32 +10:00
Clark Boylan 772a94ff6d Force geard to listen on ::
By default geard only listens on ipv4 0.0.0.0 which means ipv6
connections don't work. Because we run dual stack and things expect ipv6
to work (we have AAAA dns records after all) force geard to listen on ::
which will accept ipv6 and ipv4 connections.
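The dual-stack behavior being relied on here is standard socket semantics, sketched below with a plain stdlib socket rather than geard's own code (the helper name is made up): binding an AF_INET6 socket to `::` with IPV6_V6ONLY disabled accepts IPv4 connections too, as v4-mapped addresses.

```python
import socket


def dual_stack_listener(port):
    # An AF_INET6 socket bound to "::" accepts both IPv6 and IPv4
    # connections (IPv4 peers appear as v4-mapped addresses), provided
    # IPV6_V6ONLY is off -- the same effect geard gets when told to
    # listen on :: instead of the default 0.0.0.0.
    sock = socket.socket(socket.AF_INET6, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", port))
    sock.listen(5)
    return sock
```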

Change-Id: Ibf3bfc5f80ca139b375ee2902dc3149ac791ef96
2018-10-18 15:47:14 -07:00
Zuul 8c748b0cd5 Merge "Add support for running a standalone geard" 2018-10-18 16:44:21 +00:00
James E. Blair 625bb48d13 Add severity info to logstash and filter out DEBUG lines
This adds severity as a logstash field for every oslo formatted
log line, and does not add any lines which are at DEBUG level.

This means we no longer rely on the level=INFO query parameter
in order to remove DEBUG lines, so we will avoid sending them
to logstash regardless of whether os-loganalyze is used.
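The filter's shape is roughly the following (a sketch with made-up field names and a simplified oslo-format regex, not the worker's actual implementation): parse the severity out of each oslo-formatted line, drop DEBUG lines outright, and attach the severity as a field on everything else.

```python
import re

# Simplified oslo log format:
#   "2018-08-06 10:00:40.123 1234 INFO nova.compute [...] message"
OSLORE = re.compile(
    r'^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}(\.\d+)? \d+ '
    r'(?P<severity>[A-Z]+)')


def annotate(event):
    # Non-oslo lines pass through untouched; DEBUG lines are dropped
    # before they ever reach logstash; everything else gains a
    # "severity" field to query on.
    m = OSLORE.match(event.get("message", ""))
    if not m:
        return event
    if m.group("severity") == "DEBUG":
        return None  # filtered out entirely
    event["severity"] = m.group("severity")
    return event
```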

Change-Id: I8c4ac76a7fa0c3badd82fc7c54959ef6eb052732
2018-08-06 10:00:40 -07:00
Clark Boylan ffb50e9e8e Add support for running a standalone geard
We don't use the jenkins log client 0mq events anymore with zuulv3.
Instead zuul jobs submit the log indexing jobs directly to the gearman
queue for log processing. This means we only need a geard to be running
so add support for running just that daemon.

Change-Id: Iedcb5b29875494b8e18fa125adb08ec2e34d0064
2017-12-22 10:21:52 -08:00
Clark Boylan 54eb1a0785 Collapse logically identical filenames for crm114
Log files come with many names while still containing the same logical
content. That may be because the path to them differs (eg /var/log/foo.log
and /opt/stack/log/foo.log) or due to file rotations (eg
/var/log/foo.log and /var/log/foo.log.1) or due to compression (eg
/var/log/foo.log and /var/log/foo.log.gz). At the end of the day these
are all the same foo.log log file.

This means when we do machine learning on the log files we can collapse
all these different cases down into a single case that we learn on. This
has become more important since we recently ran out of disk space due
to all the non-unique log paths out there for our log files, but it should
also result in better learning.
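The normalization amounts to stripping path, rotation, and compression differences down to one basename, roughly like this (a sketch of the idea, with an invented helper name and illustrative suffix patterns):

```python
import os
import re


def logical_name(path):
    # /var/log/foo.log, /opt/stack/log/foo.log, foo.log.1 and foo.log.gz
    # all collapse to the same logical name, so CRM114 learns a single
    # model per log file instead of one per path variant.
    name = os.path.basename(path)
    name = re.sub(r'\.gz$', '', name)   # drop compression suffix
    name = re.sub(r'\.\d+$', '', name)  # drop rotation suffix
    return name
```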

Change-Id: I4ba276870b73640909ac469b336a436eb127f611
2017-11-22 23:05:35 +00:00
Clark Boylan f35b4e2490 Reduce log worker internal queue size
We are having OOM problems with larger log files. Attempt to make this
more robust by having much smaller internal log line message queues (we
reduce the queue size to about 10% of the original size). The idea here
is that with the old 130k queue size full, grabbing a large log file
adds significant memory overhead on top of the queue, whereas with a
small 16k queue we really only have to worry about the size of the
logfile itself.

Depends-On: Iddbbab9ea5996df4922bf7927deb8f0354378ab7
Change-Id: I761fabaa1b5aae64790def721980151f9fdc720d
2017-11-18 23:40:46 +00:00
Clark Boylan 88e0d21347 Only send mqtt events for processed files
We were previously sending events for every file we attempted to
process, not just those that were actually processed, and also for every
single log line event. This effectively doubled the io performed by the
logstash workers which seemed to slow the whole pipeline down. Trim it
down to only recording events for log files that are processed which
should significantly trim down the total number of events.

Change-Id: I0daf3eb2e2b3240e3efa4f2c7bac57de99505df0
2017-08-03 15:28:45 -07:00
Clark Boylan becc05e0aa else needs to be else:
Fix missing ':' syntax error.

Change-Id: I65d26db42eb871c230fd880457e12a25016baf1e
2017-08-03 09:44:46 -07:00
Matthew Treinish 662ae3777c
Handle cases without a build_change field
Previously the mqtt topic generation always assumed a build_change was
present. However there are some cases where there isn't a build_change in
the metadata, like periodic, post, and release jobs. This commit handles
those edge cases so it uses the build queue in the topic instead of the
build_change. If that doesn't work the topic is just the project.
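The fallback chain reads roughly like this (a sketch with invented names: the real `_generate_topic` method and its metadata keys live in the gearman client, and the topic prefix here is illustrative):

```python
def generate_topic(metadata, basename="gearman-logstash"):
    # Review jobs carry a build_change; periodic, post, and release jobs
    # do not, so fall back to the build queue, and finally to just the
    # project name.
    project = metadata.get("project", "unknown")
    change = metadata.get("build_change")
    if change:
        return "%s/%s/%s" % (basename, project, change)
    queue = metadata.get("build_queue")
    if queue:
        return "%s/%s/%s" % (basename, project, queue)
    return "%s/%s" % (basename, project)
```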

Change-Id: I26dba76e3475749d00a45b076d981778f885c339
2017-08-03 11:46:55 -04:00
Clark Boylan 8a55071b10 Fix syntax errors
There was bad indentation and a missing '.' in config.get.
_generate_topic() is an object method, not a global function, and it
takes an action argument.

Change-Id: I01c4af83cf98f0d7191041a864618a1608f97647
2017-08-02 15:23:29 -07:00
Matthew Treinish b1a4357058
Add MQTT support to the gearman worker
This commit adds support to the gearman worker for publishing an mqtt
message when processing a gearman job succeeds or fails. It also adds
a message for when the processor passes the logs to logstash either via
stdout or over a socket. By default this is disabled since it requires
extra configuration to tell the worker how to talk to the mqtt broker.

Depends-On: Id0308d2d4d1843fcca73f459cffa2ae944bebd0c
Change-Id: I43be3562780c61591ebede61f3a8929e8217f199
2017-04-27 10:06:34 -04:00
Clark Boylan 1f0f91fdbb Reduce log client logging by default
We had been running at debug level which is incredibly verbose. Remove
the -d flag. This will cause the logs which are logged to go to
stdout/err which should mean that upstart (or whatever init system) will
deal with them for us.

We should properly clean this up so that debug logging is useful again
in the long term.

Change-Id: I613c135ea56507d083df8c66e8846c6fbfa8b2ed
2016-09-27 17:18:23 -07:00
Clark Boylan 8491d10d26 Decouple log processing from logstash
As part of the move to logstash 2.0 we are relying on upstream packaging
for logstash. This packaging replaces a lot of the micromanagement of
users and groups and dirs that was done in puppet for logstash. This is
great news because it's less work for us but means that the log
processors can't rely on puppet resources for those items and we don't
actually want to install logstash package everywhere we run log
processor daemons.

Since the log processors don't need a logstash service running and
actually don't need any of the logstash stuff at all decouple them
completely and have the log processor daemons use their own user, group,
log dir, config dir, etc. With this in place we can easily switch to
using the logstash packages only where we actually need logstash to be
running.

Change-Id: I2354fbe9d3ab25134c52bfe58f562dfdf9ff6786
2016-03-09 13:52:35 -08:00
James E. Blair d7d9d50ee2 Change node_region to node_provider
This matches nodepool terminology to reduce confusion.

Change-Id: I3a8776010dcaf6677a450d0a9cb770313e604019
2015-12-17 14:51:59 -08:00
K Jonathan Harker 84c7e72312 Revert "Switch to using the new log_processor project"
This reverts commit b548b141ce.

b548b141ce was supposed to depend-on
https://review.openstack.org/248868

Change-Id: If3d4ad8a1cd45e6e63155a76dc1477ab38b156e3
2015-12-07 16:21:00 -08:00
K Jonathan Harker b548b141ce Switch to using the new log_processor project
The python scripts have been moved to their own project at
openstack-infra/log_processor. Delete the files here and start
installing that project from source. As a part of this split, the
.py extension has been dropped from the filename of the installed
executables.

Change-Id: Ied3025df46b5014a092be0c26e43d4f90699a43f
2015-11-25 15:23:26 -08:00
Matthew Treinish b2190b1a42
Add a node_region field to the job metadata
The node region can be figured out from the build_node very easily and
having a discrete field will make filtering to a single region much
simpler. This commit adds a new metadata field 'node_region' which is
the cloud region that the build_node ran in.

Change-Id: I06bbb62d21871ee61dbfb911143efff376992b98
2015-11-19 19:16:13 -05:00
Joshua Hesketh cd55cdf7d7 Revert "Create subunit processor subclass"
This reverts commit 135ac1809d.

EventProcessor was called before being defined. The code also doesn't
look entirely right. Reverting this to fix up the logstash servers

Change-Id: I2fb8081426646565814090c152d04d7349c16945
2015-11-19 11:05:53 +00:00
Jenkins 75ed9aca88 Merge "Create subunit processor subclass" 2015-11-18 14:23:41 +00:00
Jenkins 4b308c5308 Merge "Add the ability to filter on project" 2015-11-15 13:49:48 +00:00
Jenkins 6f4720fd7b Merge "Process ZUUL_VOTING parameter" 2015-10-08 15:35:29 +00:00
Matt Riedemann eeddbf5a43 Process ZUUL_VOTING parameter
Read the ZUUL_VOTING parameter and add to the event before posting for
log processing.

The plan is that elastic-recheck will eventually use this field for
filtering out non-voting jobs from the e-r uncategorized bugs page.

Depends-On: I40746bb77aab900c1dd2637f940c14f72a904a61

Change-Id: I1f3c2a65104db39fdd7d786d421cded1b436a5f6
2015-09-16 09:04:27 -07:00
Anita Kuno 2b6961467a Add build_zuul_url parameter
Currently logstash does not track the zuul url. The zuul url
contains the zm (zuul merger) node identifier.

While trying to troubleshoot a zuul cloning issue, I noticed all
faiures were coming from the same zm (zuul merger) node. Tracking
the build_zuul_url can be helpful. This patch adds the
build_zuul_url parameter.

Change-Id: I83358dc0d9b27852df2395a9c52d2daaaeda712b
2015-09-01 14:22:16 -04:00
Clark Boylan 17883b76b0 Import socket so we can use it to get name info
Previously this was using socket.getaddrinfo() without importing socket
and causing the daemon to fail. Running in the foreground did not use
statsd thus did not attempt to resolve the statsd host which is how this
got past manual testing. Import socket to get everything working again.

Change-Id: I280973bdcdf472736a07d19173559b062ed74d3c
2015-07-17 11:15:19 -07:00
Jenkins a193d901cf Merge "Lazily connect to logstash" 2015-07-15 22:50:58 +00:00
Jenkins 2595ee1273 Merge "Retry on EAI_AGAIN name resolution failures" 2015-07-15 22:50:57 +00:00
K Jonathan Harker 135ac1809d Create subunit processor subclass
This allows for subunit files that do not include subunit in the name.

Change-Id: I8504fad6a4dea98700c204984cf00fea95de8369
2015-06-11 11:18:06 -07:00
K Jonathan Harker 622f6d9471 Add the ability to filter on project
Implement a project-filter option to gearman client config alongside the
job-filter and build-queue-filter options.

Change-Id: Ia71f216f4acc9de145eb9124df691393d2a86808
2015-06-11 09:13:41 -07:00
Clark Boylan 2aa7b07ebb Lazily connect to logstash
Because boot order is such a mess we will lazily connect to the logstash
TCP/UDP ports to allow for logstash to come up before we start writing
to it. This takes advantage of existing logstash restart handler code in
the log processors.

Change-Id: I836c55806c88cc86b7973b3d40f4bfce076970f5
2015-03-05 11:14:15 -08:00
Clark Boylan 3cd22c77cc Retry on EAI_AGAIN name resolution failures
There is no sane way to convince Ubuntu to start these services after
name resolution is working (because sysv init is horribly broken on
Ubuntu). Work around this by catching EAI_AGAIN errors during name
resolution and retrying until we can resolve names.

This logs each failed resolution attempt so that users are aware of the
issue if investigating logs.
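The retry loop looks roughly like this (a minimal sketch, not the daemon's exact code; `socket.EAI_AGAIN` is the stdlib constant for "temporary failure in name resolution"):

```python
import socket
import time


def resolve_with_retry(host, port, delay=1):
    # sysv init on Ubuntu can start us before name resolution works, so
    # keep retrying on EAI_AGAIN (a temporary failure) and log every
    # attempt so the condition is visible when investigating logs.
    while True:
        try:
            return socket.getaddrinfo(host, port)
        except socket.gaierror as e:
            if e.errno == socket.EAI_AGAIN:
                print("Temporary name resolution failure for %s, "
                      "retrying" % host)
                time.sleep(delay)
            else:
                raise
```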

Change-Id: If94d4f04d0e1cfedc358fd9d678a36fc9cd8aa7b
2015-03-05 10:29:48 -08:00
Clark Boylan e3641f727f Start processes after network and named
Log processing requires networking and name resolution to be available.
Specify these deps in the LSB init headers so that we get proper boot
time start sequences for these services.

Change-Id: Ic36eba2654e7425f3aba8ee5c215150b7d94d658
2015-03-04 08:36:51 -08:00
Matthew Treinish 19d70bee3e Add support to log gearman client to filter on build-queue
This commit adds a new job filter to the gearman client to filter
based on the build queue. This is used for the subunit jobs which
we don't want to run on check jobs.

Change-Id: If81fe98d8d67bb718c53a963695a7d06f5f6625d
2014-11-19 09:42:47 -05:00
Matthew Treinish e5fbd6ca48 Add subunit2sql gearman workers
This adds a new gearman worker to process the subunit files from
the gate job runs. It will use subunit2sql to connect to a sql
server and process the data from the subunit file. The
log-gearman-client is modified to allow for pushing subunit jobs
to gearman, and the worker model for processing logs is borrowed
to process the subunit files.

Change-Id: I83103eb6afc22d91f916583c36c0e956c23a64b3
2014-10-29 13:03:49 -04:00
Clark Boylan 742c92e537 Handle log processing subprocess cleanup better
We are leaking file descriptors in our log worker processes because we
are not catching all possible errors, leaving some cleanup actions
undone. Catch errors more aggressively so that all cleanup happens.

Change-Id: I7a73a36c6fc42d4eba636cf36c8cfffcea48a318
2014-09-03 17:03:27 -07:00
Christian Berendt 46b9ae5771 Use except x as y instead of except x, y
According to https://docs.python.org/3/howto/pyporting.html the
syntax changed in Python 3.x. The new syntax is usable with
Python >= 2.6 and should be preferred to be compatible with Python3.

Enabled hacking check H231.
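The syntax change in question, shown side by side (the function here is just a toy to demonstrate the portable form):

```python
# Old, Python 2-only spelling -- a SyntaxError on Python 3:
#     try:
#         risky()
#     except ValueError, e:
#         ...
#
# Portable spelling, valid on Python >= 2.6 and all of Python 3:
def parse_int(value):
    try:
        return int(value)
    except ValueError as e:
        # 'as' binds the exception instance in both Python 2.6+ and 3.
        return "error: %s" % e
```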

Change-Id: I4c20a04bc7732efc2d4bbcbc3d285107b244e5fa
2014-05-29 23:55:41 +02:00
Clark Boylan bbbf64f74c Don't treat IDs as uniquely special in CRM114
The openstack logs are full of various IDs and UUIDs, but they are not
uniquely special when it comes to filtering them. Instead replace each
ID with a token, making CRM114's life much easier.
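The substitution can be sketched as follows (the patterns and token are illustrative, not the exact ones the filter uses): every UUID-shaped or 32-hex-digit ID becomes one fixed token, so CRM114 sees a single feature instead of millions of unique ones.

```python
import re

# Matches dashed UUIDs and bare 32-hex-digit IDs (illustrative patterns).
UUID_RE = re.compile(
    r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
    r'|[0-9a-f]{32}', re.IGNORECASE)


def tokenize_ids(line):
    # Replace every ID with a fixed token so CRM114 doesn't treat each
    # unique ID as a distinct feature to learn.
    return UUID_RE.sub('_ID_', line)
```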

Change-Id: Id9b430c0d31889b89e4e0c1790a2405d73f501b5
2014-03-24 15:19:49 -07:00
Clark Boylan 5a3ff67db4 Better logstash field data.
We are currently using a lot of wildcard searches in elasticsearch which
are slow. Provide better field data so that we can replace those
wildcard searches with filters. In particular add a short uuid field and
make the filename tag field the basename of the filepath so that grenade
and non grenade files all end up with the same tags.

Change-Id: If558017fceae96bcf197e611ab5cac1cfe7ae9bf
2014-03-13 14:42:58 -07:00
James E. Blair c24a8a75e7 Use statsd in logstash client
Have the log-gearman-client (aka jenkins-log-client) initialize
the statsd parameters when starting the geard server.  Also, make
sure that the python statsd package is installed on the host.

Change-Id: I04fe1a7609f08bc710891b6a3b92d0f4d156d86c
2014-02-24 15:34:48 -08:00
Clark Boylan cc5d9265ec Handle log filter exceptions more gracefully.
If there is an exception filtering a log event handle that by removing
the filter and continuing to process the remaining log events for the
associated file. This prevents non-filter data from being lost when the
filters have an exception.

Change-Id: I65141daf21a873096829c41fdc2c77cbeecde2e3
2014-02-10 10:20:12 -08:00
Clark Boylan 585112564d Close unneeded fds before execing CRM 114.
CRM 114 is being forked off of the gearman worker processes and as a
result has open fds for log files and tcp connections. CRM 114 should be
isolated from the fds so that it doesn't crash when they change
unexpectedly. Close the fds using the subprocess.Popen close_fds flag.
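The flag in question is the standard `subprocess.Popen(close_fds=True)`; a minimal sketch of the fork (the helper name is invented and `cmd` stands in for the real crm invocation):

```python
import subprocess


def run_classifier(cmd, line):
    # close_fds=True keeps the child from inheriting the worker's
    # log-file and tcp descriptors, isolating it from fds that may
    # change underneath it unexpectedly.
    p = subprocess.Popen(cmd, stdin=subprocess.PIPE,
                         stdout=subprocess.PIPE, close_fds=True)
    out, _ = p.communicate(line)
    return out
```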

Change-Id: I4fbdf3564771be7d7a7e4c518e571634de576253
2014-02-05 09:44:55 -08:00