Support timeout for stats capture cron job

In this charm we run a cron job to check rabbitmq status, and the commands it runs could fail or hang if, e.g., rabbit is not healthy. Currently the cron job never times out and could hang forever, so we add a new config option 'cron-timeout' which, when set, results in a SIGINT being sent to the application; if it fails to exit within 10s, a SIGKILL is sent. We also fix logging so that all output goes to syslog local0.notice.

Change-Id: I0bb8780c5cc64a24384648f00c8068d5d666d28c
Closes-Bug: 1716854
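The signal escalation described above can be illustrated with a minimal Python sketch. This is not part of the commit, which relies on the `timeout` utility in the cron entry; the helper name `run_with_timeout` and its parameters are hypothetical, and a POSIX system is assumed.

```python
import signal
import subprocess
import sys


def run_with_timeout(argv, timeout, kill_after=10):
    """Run argv and return its exit status.

    Hypothetical helper: roughly mirrors `timeout -k 10s -s SIGINT <timeout>`.
    After `timeout` seconds the child gets SIGINT; if it is still running
    `kill_after` seconds later, it gets SIGKILL.
    """
    proc = subprocess.Popen(argv)
    try:
        # Normal case: the command finishes within the time limit.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Time limit reached: ask the command to stop.
        proc.send_signal(signal.SIGINT)
    try:
        # Grace period for the command to exit on its own.
        return proc.wait(timeout=kill_after)
    except subprocess.TimeoutExpired:
        # Still running: force termination and reap the process.
        proc.kill()
        return proc.wait()


if __name__ == "__main__":
    # A command that finishes immediately is unaffected by the limit.
    print(run_with_timeout([sys.executable, "-c", "pass"], timeout=5))
```

A child that ignores SIGINT is only stopped by the second stage, which is why the grace period followed by SIGKILL matters for a job that may be genuinely hung.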
parent a68b912cf5
commit ff4da882a2
@@ -83,6 +83,14 @@ options:
     description: |
       Cron schedule used to generate rabbitmq stats. To disable,
       either unset this config option or set it to an empty string ('').
+  cron-timeout:
+    type: int
+    default: 300
+    description: |
+      Time limit, in seconds, for the command run by cron.
+      This timeout governs the rabbitmq stats capture: once the
+      timeout is reached a SIGINT is sent to the program, and if it has
+      not exited within 10 seconds a SIGKILL is sent.
   queue_thresholds:
     type: string
     default: "[['\\*', '\\*', 100, 200]]"
@@ -109,6 +109,8 @@ STATS_CRONFILE = '/etc/cron.d/rabbitmq-stats'
 STATS_DATAFILE = os.path.join(RABBIT_DIR, 'data',
                               '{}_queue_stats.dat'
                               ''.format(rabbit.get_unit_hostname()))
+CRONJOB_CMD = ("{schedule} root timeout -k 10s -s SIGINT {timeout} "
+               "{command} 2>&1 | logger -p local0.notice\n")
 INITIAL_CLIENT_UPDATE_KEY = 'initial_client_update_done'
 
 
@@ -590,7 +592,9 @@ def update_nrpe_checks():
           os.path.join(NAGIOS_PLUGINS, 'check_rabbitmq_queues.py'))
     if config('stats_cron_schedule'):
         script = os.path.join(SCRIPTS_DIR, 'collect_rabbitmq_stats.sh')
-        cronjob = "{} root {}\n".format(config('stats_cron_schedule'), script)
+        cronjob = CRONJOB_CMD.format(schedule=config('stats_cron_schedule'),
+                                     timeout=config('cron-timeout'),
+                                     command=script)
         rsync(os.path.join(charm_dir(), 'scripts',
                            'collect_rabbitmq_stats.sh'), script)
         write_file(STATS_CRONFILE, cronjob)
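To make the effect of the new template concrete, the following sketch renders one cron entry from `CRONJOB_CMD`. The template string is copied from the diff; the wrapper `render_cronjob`, the schedule `*/5 * * * *`, and the script path `/usr/local/bin/collect_rabbitmq_stats.sh` are illustrative assumptions, not values taken from the charm.

```python
# Template copied from the diff above.
CRONJOB_CMD = ("{schedule} root timeout -k 10s -s SIGINT {timeout} "
               "{command} 2>&1 | logger -p local0.notice\n")


def render_cronjob(schedule, timeout, command):
    """Hypothetical wrapper: fill the template the same way the hook does."""
    return CRONJOB_CMD.format(schedule=schedule, timeout=timeout,
                              command=command)


if __name__ == "__main__":
    # Example values are assumptions chosen for illustration only.
    line = render_cronjob("*/5 * * * *", 300,
                          "/usr/local/bin/collect_rabbitmq_stats.sh")
    print(line, end="")
    # prints:
    # */5 * * * * root timeout -k 10s -s SIGINT 300 /usr/local/bin/collect_rabbitmq_stats.sh 2>&1 | logger -p local0.notice
```

In the rendered line, `2>&1` merges stderr into stdout before the pipe, so both streams of the stats script reach `logger` and are recorded at `local0.notice`.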