Updating metric names to match new format
Change-Id: I395c56864373dc43c89c86e455102ea54c50a64c
parent 178509a3aa
commit b65128c818

README.md | 42 changed lines
@@ -351,12 +351,8 @@ This section documents the system metrics that are sent by the Agent.
 | cpu.stolen_perc | | Percentage of stolen CPU time, i.e. the time spent in other OS contexts when running in a virtualized environment |
 | cpu.system_perc | | Percentage of time the CPU is used at the system level |
 | cpu.user_perc | | Percentage of time the CPU is used at the user level |
-| disk.free_inodes | device | The number of inodes that are free on a device |
-| disk.used_inodes | device | The number of inodes that are used on a device |
-| disk.total_inodes | device | The total number of inodes that are available on a device |
-| disk.used_kbytes | device | The number of kilobytes of disk space that are used on a device |
-| disk.total_kbytes | device | The total number of kilobytes of disk space that are available on a device |
-| disk.free_kbytes | device | The number of kilobytes of disk space that are free on a device |
+| disk_inode_utilization_perc | device | The percentage of inodes that are used on a device |
+| disk_space_utilization_perc | device | The percentage of disk space that is being used on a device |
 | io.read_kbytes_sec | device | Kbytes/sec read by an io device
 | io.read_req_sec | device | Number of read requests/sec to an io device
 | io.write_kbytes_sec | device | Kbytes/sec written by an io device
@@ -373,7 +369,7 @@ This section documents the system metrics that are sent by the Agent.
 | mem.usable_mb | | Total megabytes of usable memory
 | mem.usable_perc | | Percentage of total memory that is usable
 | mem.used_buffers | | Number of buffers being used by the kernel for block io
-| mem_used_cached | | Memory used for the page cache
+| mem.used_cached | | Memory used for the page cache
 | mem.used_shared | | Memory shared between separate processes and typically used for inter-process communication
 | net.bytes_in | device | Number of network bytes received
 | net.bytes_out | device | Number of network bytes sent
@@ -381,9 +377,9 @@ This section documents the system metrics that are sent by the Agent.
 | net.packets_out | device | Number of network packets sent
 | net.errors_in | device | Number of network errors on incoming network traffic
 | net.errors_out | device | Number of network errors on outgoing network traffic
-| collector.threads.count | service=monasca component=agent | Number of threads that the collector is consuming for this collection run
-| collector.emit.time | service=monasca component=agent | Amount of time that the collector took for sending the collected metrics to the Forwarder for this collection run
-| collector.collection.time | service=monasca component=agent | Amount of time that the collector took for this collection run
+| thread_count | service=monasca component=collector | Number of threads that the collector is consuming for this collection run
+| emit_time | service=monasca component=collector | Amount of time that the collector took for sending the collected metrics to the Forwarder for this collection run
+| collection_time | service=monasca component=collector | Amount of time that the collector took for this collection run

 # Plugin Checks

 Plugins are the way to extend the Monasca Agent. Plugins add additional functionality that allow the agent to perform checks on other applications, servers or services.
@@ -557,19 +553,19 @@ The process checks return the following metrics:

 | Metric Name | Dimensions | Semantics |
 | ----------- | ---------- | --------- |
-| processes.mem.real | process_name | Amount of real memory a process is using
-| processes.mem.rss | process_name | Amount of rss memory a process is using
-| processes.io.read_count | process_name | Number of reads by a process
-| processes.io.write_count | process_name | Number of writes by a process
-| processes.io.read_bytes | process_name | Bytes read by a process
-| processes.io.write_bytes | process_name | Bytes written by a process
-| processes.threads | process_name | Number of threads a process is using
-| processes.cpu_perc | process_name | Percentage of cpu being consumed by a process
-| processes.vms | process_name | Amount of virtual memory a process is using
-| processes.open_file_decorators | process_name | Number of files being used by a process
-| processes.involuntary_ctx_switches | process_name | Number of involuntary context switches for a process
-| processes.voluntary_ctx_switches | process_name | Number of voluntary context switches for a process
-| processes.pid_count | process_name | Number of processes that exist with this process name
+| process.mem.real | process_name, service, component | Amount of real memory a process is using
+| process.mem.rss | process_name, service, component | Amount of rss memory a process is using
+| process.io.read_count | process_name, service, component | Number of reads by a process
+| process.io.write_count | process_name, service, component | Number of writes by a process
+| process.io.read_bytes | process_name, service, component | Bytes read by a process
+| process.io.write_bytes | process_name, service, component | Bytes written by a process
+| process.threads | process_name, service, component | Number of threads a process is using
+| process.cpu_perc | process_name, service, component | Percentage of cpu being consumed by a process
+| process.vms | process_name, service, component | Amount of virtual memory a process is using
+| process.open_file_decorators | process_name, service, component | Number of files being used by a process
+| process.involuntary_ctx_switches | process_name, service, component | Number of involuntary context switches for a process
+| process.voluntary_ctx_switches | process_name, service, component | Number of voluntary context switches for a process
+| process.pid_count | process_name, service, component | Number of processes that exist with this process name

 ## Http Endpoint Checks

 This section describes the http endpoint check that can be performed by the Agent. Http endpoint checks are checks that perform simple up/down checks on services, such as HTTP/REST APIs. An agent, given a list of URLs, can dispatch an http request and report success/failure to the API as a metric.
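The http endpoint check described above reduces each probe to an up/down gauge. A minimal sketch of the value convention, assuming the common mapping of 1 for success and 0 for failure (the function name here is illustrative, not the agent's actual http_check code):

```python
def endpoint_metric(status_code):
    """Map an HTTP status code to an up/down gauge value.

    A 2xx response counts as up (1); any other status, or None for a
    failed/timed-out request, counts as down (0).
    """
    if status_code is None:  # request failed or timed out
        return 0
    return 1 if 200 <= status_code < 300 else 0
```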
@@ -108,16 +108,16 @@ class Collector(object):
     def collector_stats(self, num_metrics, num_events, collection_time, emit_time):
         metrics = {}
         thread_count = threading.active_count()
-        metrics['monagent.collector.threads.count'] = thread_count
+        metrics['threads_count'] = thread_count
         if thread_count > MAX_THREADS_COUNT:
             log.warn("Collector thread count is high: %d" % thread_count)

-        metrics['monagent.collector.collection.time'] = collection_time
+        metrics['collection_time'] = collection_time
         if collection_time > MAX_COLLECTION_TIME:
             log.info("Collection time (s) is high: %.1f, metrics count: %d, events count: %d" %
                      (collection_time, num_metrics, num_events))

-        metrics['monagent.collector.emit.time'] = emit_time
+        metrics['emit_time'] = emit_time
         if emit_time is not None and emit_time > MAX_EMIT_TIME:
             log.info("Emit time (s) is high: %.1f, metrics count: %d, events count: %d" %
                      (emit_time, num_metrics, num_events))
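The collector's self-metrics above drop the old `monagent.collector.*` prefix; the source is identified by `service`/`component` dimensions instead. A simplified standalone sketch of the new naming (threshold logging omitted; this is not the agent's actual class):

```python
import threading


def collector_stats(collection_time, emit_time):
    """Return the collector's internal metrics under their new short names."""
    # Short keys replace the old 'monagent.collector.*' prefixed names;
    # the dimensions below now identify where the metrics came from.
    metrics = {'threads_count': threading.active_count(),
               'collection_time': collection_time,
               'emit_time': emit_time}
    dimensions = {'service': 'monasca', 'component': 'collector'}
    return metrics, dimensions
```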
@@ -163,7 +163,10 @@ class Collector(object):
         # Add in metrics on the collector run, emit_duration is from the previous run
         for name, value in self.collector_stats(len(metrics_list), len(events),
                                                 collect_duration, self.emit_duration).iteritems():
-            metrics_list.append(Measurement(name, timestamp, value, {}))
+            metrics_list.append(Measurement(name,
+                                            timestamp,
+                                            value,
+                                            {'service': 'monasca', 'component': 'collector'}))

         emitter_statuses = self._emit(metrics_list)
         self.emit_duration = timer.step()
@@ -113,13 +113,9 @@ class Disk(Check):
             self.logger.exception("Cannot parse %s" % (parts,))

         if inodes:
-            usage_data['%s.disk_total_inodes' % parts[0]] = parts[1]
-            usage_data['%s.disk_used_inodes' % parts[0]] = parts[2]
-            usage_data['%s.disk_free_inodes' % parts[0]] = parts[3]
+            usage_data['%s.disk_inode_utilization_perc' % parts[0]] = float(parts[2]) / parts[1] * 100
         else:
-            usage_data['%s.disk_total_kbytes' % parts[0]] = parts[1]
-            usage_data['%s.disk_used_kbytes' % parts[0]] = parts[2]
-            usage_data['%s.disk_free_kbytes' % parts[0]] = parts[3]
+            usage_data['%s.disk_space_utilization_perc' % parts[0]] = float(parts[2]) / parts[1] * 100

         return usage_data
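The Disk change above replaces raw total/used/free counters with a single utilization percentage computed from the same `df`-style fields, where `parts` is `[device, total, used, free]`. The arithmetic in isolation:

```python
def utilization_perc(parts):
    """Compute used-over-total as a percentage from a df-style row.

    parts follows the layout [device, total, used, free]. float() avoids
    integer division under Python 2, which this codebase targets.
    """
    return float(parts[2]) / parts[1] * 100
```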
@@ -261,20 +257,20 @@ class IO(Check):
         names = {"wait": "await",
                  "svc_t": "svctm",
                  "%b": "%util",
-                 "kr/s": "io_read_kbytes_sec",
-                 "kw/s": "io_write_kbytes_sec",
+                 "kr/s": "io.read_kbytes_sec",
+                 "kw/s": "io.write_kbytes_sec",
                  "actv": "avgqu-sz"}
     elif os_name == "freebsd":
         names = {"svc_t": "await",
                  "%b": "%util",
-                 "kr/s": "io_read_kbytes_sec",
-                 "kw/s": "io_write_kbytes_sec",
+                 "kr/s": "io.read_kbytes_sec",
+                 "kw/s": "io.write_kbytes_sec",
                  "wait": "avgqu-sz"}
     elif os_name == "linux":
-        names = {"rkB/s": "io_read_kbytes_sec",
-                 "r/s": "io_read_req_sec",
-                 "wkB/s": "io_write_kbytes_sec",
-                 "w/s": "io_write_req_sec"}
+        names = {"rkB/s": "io.read_kbytes_sec",
+                 "r/s": "io.read_req_sec",
+                 "wkB/s": "io.write_kbytes_sec",
+                 "w/s": "io.write_req_sec"}
     # translate if possible
     return names.get(metric_name, metric_name)
@@ -435,9 +431,9 @@ class Load(Check):

         # Split out the 3 load average values
         load = [res.replace(',', '.') for res in re.findall(r'([0-9]+[\.,]\d+)', uptime)]
-        return {'load_avg_1_min': float(load[0]),
-                'load_avg_5_min': float(load[1]),
-                'load_avg_15_min': float(load[2]),
+        return {'cpu.load_avg_1_min': float(load[0]),
+                'cpu.load_avg_5_min': float(load[1]),
+                'cpu.load_avg_15_min': float(load[2]),
                 }
@@ -537,35 +533,35 @@ class Memory(Check):
         # Physical memory
         # FIXME units are in MB, we should use bytes instead
         try:
-            memData['mem_total_mb'] = int(meminfo.get('MemTotal', 0)) / 1024
-            memData['mem_free_mb'] = int(meminfo.get('MemFree', 0)) / 1024
-            memData['memphysBuffers'] = int(meminfo.get('Buffers', 0)) / 1024
-            memData['memphysCached'] = int(meminfo.get('Cached', 0)) / 1024
-            memData['memphysShared'] = int(meminfo.get('Shmem', 0)) / 1024
+            memData['mem.total_mb'] = int(meminfo.get('MemTotal', 0)) / 1024
+            memData['mem.free_mb'] = int(meminfo.get('MemFree', 0)) / 1024
+            memData['mem.used_buffers'] = int(meminfo.get('Buffers', 0)) / 1024
+            memData['mem.used_cached'] = int(meminfo.get('Cached', 0)) / 1024
+            memData['mem.used_shared'] = int(meminfo.get('Shmem', 0)) / 1024

-            memData['mem_usable_perc'] = memData['mem_total_mb'] - memData['mem_free_mb']
+            memData['mem.usable_perc'] = memData['mem.total_mb'] - memData['mem.free_mb']
             # Usable is relative since cached and buffers are actually used to speed things up.
-            memData['mem_usable_mb'] = memData['mem_free_mb'] + \
-                memData['memphysBuffers'] + memData['memphysCached']
+            memData['mem.usable_mb'] = memData['mem.free_mb'] + \
+                memData['mem.used_buffers'] + memData['mem.used_cached']

-            if memData['mem_total_mb'] > 0:
-                memData['mem_usable_perc'] = float(
-                    memData['mem_usable_mb']) / float(memData['mem_total_mb'])
+            if memData['mem.total_mb'] > 0:
+                memData['mem.usable_perc'] = float(
+                    memData['mem.usable_mb']) / float(memData['mem.total_mb'])
         except Exception:
             self.logger.exception('Cannot compute stats from /proc/meminfo')

         # Swap
         # FIXME units are in MB, we should use bytes instead
         try:
-            memData['mem_swap_total_mb'] = int(meminfo.get('SwapTotal', 0)) / 1024
-            memData['mem_swap_free_mb'] = int(meminfo.get('SwapFree', 0)) / 1024
+            memData['mem.swap_total_mb'] = int(meminfo.get('SwapTotal', 0)) / 1024
+            memData['mem.swap_free_mb'] = int(meminfo.get('SwapFree', 0)) / 1024

-            memData['mem_swap_used_mb'] = memData[
-                'mem_swap_total_mb'] - memData['mem_swap_free_mb']
+            memData['mem.swap_used_mb'] = memData[
+                'mem.swap_total_mb'] - memData['mem.swap_free_mb']

-            if memData['mem_swap_total_mb'] > 0:
-                memData['mem_swap_free_perc'] = float(
-                    memData['mem_swap_free_mb']) / float(memData['mem_swap_total_mb'])
+            if memData['mem.swap_total_mb'] > 0:
+                memData['mem.swap_free_perc'] = float(
+                    memData['mem.swap_free_mb']) / float(memData['mem.swap_total_mb'])
         except Exception:
             self.logger.exception('Cannot compute swap stats')
@@ -747,11 +743,11 @@ class Cpu(Check):
         When figures are not available, False is sent back.
         """
         def format_results(us, sy, wa, idle, st):
-            data = {'cpu_user_perc': us,
-                    'cpu_system_perc': sy,
-                    'cpu_wait_perc': wa,
-                    'cpu_idle_perc': idle,
-                    'cpu_stolen_perc': st}
+            data = {'cpu.user_perc': us,
+                    'cpu.system_perc': sy,
+                    'cpu.wait_perc': wa,
+                    'cpu.idle_perc': idle,
+                    'cpu.stolen_perc': st}
             for key in data.keys():
                 if data[key] is None:
                     del data[key]
@@ -67,6 +67,9 @@ class MySql(AgentCheck):
         host, port, user, password, mysql_sock, defaults_file, dimensions, options = self._get_config(
             instance)

+        if 'service' not in dimensions:
+            dimensions.update({'service': 'mysql'})
+
         if (not host or not user) and not defaults_file:
             raise Exception("Mysql host and user are needed.")
@@ -84,7 +87,7 @@ class MySql(AgentCheck):
         password = instance.get('pass', '')
         mysql_sock = instance.get('sock', '')
         defaults_file = instance.get('defaults_file', '')
-        dimensions = instance.get('dimensions', None)
+        dimensions = instance.get('dimensions', {})
         options = instance.get('options', {})

         return host, port, user, password, mysql_sock, defaults_file, dimensions, options
@@ -28,18 +28,18 @@ class Network(AgentCheck):
     }

     NETSTAT_GAUGE = {
-        ('udp4', 'connections'): 'net_udp4_connections',
-        ('udp6', 'connections'): 'net_udp6_connections',
-        ('tcp4', 'established'): 'net_tcp4_established',
-        ('tcp4', 'opening'): 'net_tcp4_opening',
-        ('tcp4', 'closing'): 'net_tcp4_closing',
-        ('tcp4', 'listening'): 'net_tcp4_listening',
-        ('tcp4', 'time_wait'): 'net_tcp4_time_wait',
-        ('tcp6', 'established'): 'net_tcp6_established',
-        ('tcp6', 'opening'): 'net_tcp6_opening',
-        ('tcp6', 'closing'): 'net_tcp6_closing',
-        ('tcp6', 'listening'): 'net_tcp6_listening',
-        ('tcp6', 'time_wait'): 'net_tcp6_time_wait',
+        ('udp4', 'connections'): 'net.udp4_connections',
+        ('udp6', 'connections'): 'net.udp6_connections',
+        ('tcp4', 'established'): 'net.tcp4_established',
+        ('tcp4', 'opening'): 'net.tcp4_opening',
+        ('tcp4', 'closing'): 'net.tcp4_closing',
+        ('tcp4', 'listening'): 'net.tcp4_listening',
+        ('tcp4', 'time_wait'): 'net.tcp4_time_wait',
+        ('tcp6', 'established'): 'net.tcp6_established',
+        ('tcp6', 'opening'): 'net.tcp6_opening',
+        ('tcp6', 'closing'): 'net.tcp6_closing',
+        ('tcp6', 'listening'): 'net.tcp6_listening',
+        ('tcp6', 'time_wait'): 'net.tcp6_time_wait',
     }

     def __init__(self, name, init_config, agent_config, instances=None):
@@ -100,7 +100,7 @@ class Network(AgentCheck):
             if iface in self._excluded_ifaces and metric in exclude_iface_metrics:
                 # skip it!
                 continue
-            self.rate('net_%s' % metric, val, device_name=iface)
+            self.rate('net.%s' % metric, val, device_name=iface)
             count += 1
         self.log.debug("tracked %s network metrics for interface %s" % (count, iface))
@@ -5,18 +5,18 @@ from monagent.common.util import Platform

 class ProcessCheck(AgentCheck):

-    PROCESS_GAUGE = ('processes_threads',
-                     'processes_cpu.pct',
-                     'processes_mem.rss',
-                     'processes_mem.vms',
-                     'processes_mem.real',
-                     'processes_open_file_decorators',
-                     'processes_ioread_count',
-                     'processes_iowrite_count',
-                     'processes_ioread_bytes',
-                     'processes_iowrite_bytes',
-                     'processes_voluntary_ctx_switches',
-                     'processes_involuntary_ctx_switches')
+    PROCESS_GAUGE = ('process.threads',
+                     'process.cpu_pct',
+                     'process.mem.rss',
+                     'process.mem.vms',
+                     'process.mem.real',
+                     'process.open_file_descriptors',
+                     'process.io.read_count',
+                     'process.io.write_count',
+                     'process.io.read_bytes',
+                     'process.io.write_bytes',
+                     'process.voluntary_ctx_switches',
+                     'process.involuntary_ctx_switches')

     @staticmethod
     def is_psutil_version_later_than(v):
@@ -193,7 +193,7 @@ class ProcessCheck(AgentCheck):

         self.log.debug('ProcessCheck: process %s analysed' % name)

-        self.gauge('processes_pid_count', len(pids), dimensions=dimensions)
+        self.gauge('process.pid_count', len(pids), dimensions=dimensions)

         metrics = dict(zip(ProcessCheck.PROCESS_GAUGE,
                            self.get_process_metrics(pids,
@@ -41,6 +41,8 @@ class Zookeeper(AgentCheck):
         timeout = float(instance.get('timeout', 3.0))
         dimensions = instance.get('dimensions', {})

+        if 'service' not in dimensions:
+            dimensions.update({'service': 'zookeeper'})
         sock = socket.socket()
         sock.settimeout(timeout)
         buf = StringIO()
@@ -74,14 +76,14 @@ class Zookeeper(AgentCheck):
         if buf is not None:
             # Parse the response
             metrics, new_dimensions = self.parse_stat(buf)
-            dimensions.update(new_dimensions)
+            new_dimensions.update(dimensions)

             # Write the data
             for metric, value in metrics:
-                self.gauge(metric, value, dimensions=dimensions)
+                self.gauge(metric, value, dimensions=new_dimensions)
         else:
             # Reading from the client port timed out, track it as a metric
-            self.increment('zookeeper.timeouts', dimensions=dimensions)
+            self.increment('zookeeper.timeouts', dimensions=new_dimensions)

     @classmethod
     def parse_stat(cls, buf):
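The Zookeeper change above reverses the merge direction: parsed dimensions become the base and the configured instance dimensions win on conflict. A sketch of the resulting precedence (the helper name is illustrative, not part of the commit):

```python
def merge_dimensions(parsed, configured):
    """Merge dimensions so that configured values override parsed ones.

    Mirrors new_dimensions.update(dimensions): anything set in the
    instance config beats what parse_stat extracted from the server.
    """
    merged = dict(parsed)
    merged.update(configured)
    return merged
```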
@@ -39,13 +39,13 @@ class ServicePlugin(Plugin):
         for process in self.found_processes:
             # Watch the service processes
             log.info("\tMonitoring the {0} {1} process.".format(process, self.service_name))
-            config.merge(watch_process([process], self.service_name))
+            config.merge(watch_process([process], self.service_name, process))

         if self.service_api_url and self.search_pattern:
             # Setup an active http_status check on the API
             log.info("\tConfiguring an http_check for the {0} API.".format(self.service_name))
             config.merge(service_api_check(self.service_name + '-api', self.service_api_url,
-                                           self.search_pattern, self.service_name))
+                                           self.search_pattern, self.service_name + '_api'))

         return config
@@ -26,7 +26,7 @@ def find_process_name(pname):
     return None


-def watch_process(search_strings, service=None):
+def watch_process(search_strings, service=None, component=None):
    """Takes a list of process search strings and returns a Plugins object with the config set.
    This was built as a helper as many plugins setup process watching
    """
@@ -34,17 +34,16 @@ def watch_process(search_strings, service=None):
     parameters = {'name': search_strings[0],
                   'search_string': search_strings}

-    # If service parameter is set in the plugin config, add the service dimension which
-    # will override the service in the agent config
-    if service:
-        parameters['dimensions'] = {'service': service}
+    dimensions = _get_dimensions(service, component)
+    if len(dimensions) > 0:
+        parameters['dimensions'] = dimensions

     config['process'] = {'init_config': None,
                          'instances': [parameters]}
     return config


-def service_api_check(name, url, pattern, service=None):
+def service_api_check(name, url, pattern, service=None, component=None):
     """Setup a service api to be watched by the http_check plugin."""
     config = agent_config.Plugins()
     parameters = {'name': name,
@@ -53,12 +52,26 @@ def service_api_check(name, url, pattern, service=None):
                   'timeout': 10,
                   'use_keystone': True}

-    # If service parameter is set in the plugin config, add the service dimension which
-    # will override the service in the agent config
-    if service:
-        parameters['dimensions'] = {'service': service}
+    dimensions = _get_dimensions(service, component)
+    if len(dimensions) > 0:
+        parameters['dimensions'] = dimensions

     config['http_check'] = {'init_config': None,
                             'instances': [parameters]}

     return config
+
+
+def _get_dimensions(service, component):
+    dimensions = {}
+    # If service parameter is set in the plugin config, add the service dimension which
+    # will override the service in the agent config
+    if service:
+        dimensions.update({'service': service})
+
+    # If component parameter is set in the plugin config, add the component dimension which
+    # will override the component in the agent config
+    if component:
+        dimensions.update({'component': component})
+
+    return dimensions