Enhance the self-monitoring of Heka plugins

Since the AFD and GSE filters make intensive use of the timer_event
function, it is relevant to collect TimerEventAvgDuration and
TimerEventSamples.

The memory used by decoders and filters is also collected.

Change-Id: Ib82eb57985a0eb2709ccd1eb5ec2e65e5810669e
Swann Croiset 2015-10-08 17:19:32 +02:00
parent e6525c5606
commit 6fda6983e3
2 changed files with 22 additions and 9 deletions

@@ -45,11 +45,18 @@ function process_table(typ, array)
['type'] = typ,
['name'] = name,
}
msgCount = v['ProcessMessageCount']['value']
avgDuration = v['ProcessMessageAvgDuration']['value']
utils.add_to_bulk_metric('hekad_msg_count', v['ProcessMessageCount']['value'], tags)
utils.add_to_bulk_metric('hekad_msg_avg_duration', v['ProcessMessageAvgDuration']['value'], tags)
utils.add_to_bulk_metric('hekad_msg_count', v.ProcessMessageCount.value, tags)
utils.add_to_bulk_metric('hekad_msg_avg_duration', v.ProcessMessageAvgDuration.value, tags)
if v.Memory then
utils.add_to_bulk_metric('hekad_memory', v.Memory.value, tags)
end
if v.TimerEventAvgDuration then
utils.add_to_bulk_metric('hekad_timer_event_avg_duration', v.TimerEventAvgDuration.value, tags)
end
if v.TimerEventSamples then
utils.add_to_bulk_metric('hekad_timer_event_count', v.TimerEventSamples.value, tags)
end
end
end
end

@@ -24,9 +24,15 @@ Metrics have a ``service`` field with the name of the service it applies to. Val
Heka pipeline
^^^^^^^^^^^^^
Metrics have a ``name`` field that contains the name of the decoder or filter as defined by *Heka*.
Metrics have two fields: ``name``, which contains the name of the decoder or filter as defined by *Heka*, and ``type``, which is either *decoder* or *filter*.
* ``hekad_decoder_count``, the total number of messages processed by the decoder. This will reset to 0 when the process is restarted.
* ``hekad_decoder_duration``, the average time for processing the message (in nanoseconds).
* ``hekad_filter_count``, the total number of messages processed by the filter. This will reset to 0 when the process is restarted.
* ``hekad_filter_duration``, the average time for processing the message (in nanoseconds).
Metrics for both types:
* ``hekad_msg_avg_duration``, the average time for processing the message (in nanoseconds).
* ``hekad_msg_count``, the total number of messages processed by the decoder or filter. This will reset to 0 when the process is restarted.
* ``hekad_memory``, the total memory used by the Sandbox (in bytes).
Additional metrics for *filter* type:
* ``hekad_timer_event_avg_duration``, the average time for executing the *timer_event* function (in nanoseconds).
* ``hekad_timer_event_count``, the total number of executions of the *timer_event* function. This will reset to 0 when the process is restarted.