puppet-log_processor/files
melanie witt 89bfe00dda Stream log files instead of loading full files into memory
For a while now, we have been seeing Elasticsearch indexing quickly
fall behind as some log files generated in the gate have become
larger. Currently, we download a full log file into memory and then
emit it line-by-line to be received by a logstash listener. When log
files are large (for example, 40 MB), logstash gets bogged down
processing them.

Instead of downloading full files into memory, we can stream the files
and emit their lines on-the-fly to try to alleviate load on the log
processor.

This change:

  * Replaces urllib2.urlopen with requests using stream=True (see the
    sketch after this list)
  * Removes manual decoding of the gzip and deflate compression
    formats, as these are decoded automatically by requests.iter_lines
  * Removes unrelated unused imports
  * Removes the unused 'retry' arg from the log retrieval method
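
A minimal sketch of the streaming pattern described above, assuming a
plain HTTP(S) log URL; the function name and example URL are
illustrative and not taken from log-gearman-worker.py, which wraps this
in gearman job handling and logstash emission.

    import requests

    def stream_log_lines(url):
        """Yield decoded log lines without holding the whole file in memory."""
        # stream=True defers downloading the response body until it is iterated.
        resp = requests.get(url, stream=True)
        resp.raise_for_status()
        # iter_lines decodes gzip/deflate content encodings automatically,
        # so no manual decompression step is needed.
        for line in resp.iter_lines(decode_unicode=True):
            if line:
                yield line

    if __name__ == "__main__":
        # Hypothetical example URL; substitute a real gate log location.
        for log_line in stream_log_lines("https://example.org/job-output.txt"):
            print(log_line)

Because lines are yielded as they arrive, memory use stays roughly
constant regardless of log file size, which is the point of the change.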

Change-Id: I6d32036566834da75f3a73f2d086475ef3431165
2020-11-09 09:49:07 -08:00
classify-log.crm Don't treat IDs as uniquely special in CRM114 2014-03-24 15:19:49 -07:00
geard.init Force geard to listen on :: 2018-10-18 15:47:14 -07:00
jenkins-log-client.init Reduce log client logging by default 2016-09-27 17:18:23 -07:00
log-gearman-client.py Change node_region to node_provider 2015-12-17 14:51:59 -08:00
log-gearman-worker.py Stream log files instead of loading full files into memory 2020-11-09 09:49:07 -08:00