Add options to limit certain metrics
Some Agent plugins generate metrics that may not be desirable on some
installations where resources are severely limited. This change adds
the ability to limit potentially non-essential metrics from some
plugins using the following boolean parameters:

* system: cpu_idle_only (reduces total CPU metrics by 3)
* system: net_bytes_only (reduces network metrics by 6 per device)
* libvirt: ping_only (reduces per-VM metrics by 20+)

These changes are described in more detail in the documentation
accompanying this patch.

monasca-setup now has the ability to configure these parameters from
the command line. For example, to limit all non-essential system
metrics:

    monasca-setup -d system -a 'cpu_idle_only=true net_bytes_only=true send_io_stats=false'

To limit libvirt per-VM metrics to host_alive_status only:

    monasca-setup -d libvirt -a 'ping_only=true' --overwrite

Change-Id: I1fc7839907100dae52432e1af33170457b5888ef
parent 65d850c988
commit 630d7805e0
@@ -4,6 +4,7 @@
 
 - [System Checks](#system-checks)
 - [System Metrics](#system-metrics)
+- [Limiting System Metrics](#limiting-system-metrics)
 - [Standard Plugins](#standard-plugins)
 - [Dot File Configuration](#dot-file-configuration)
 - [Default Plugin Detection](#default-plugin-detection)
@@ -114,6 +115,20 @@ This section documents the system metrics that are sent by the Agent. This sect
 | monasca.emit_time_sec | service=monitoring component=monasca-agent | Amount of time that the forwarder took to send metrics to the Monasca API.
 | monasca.collection_time_sec | service=monitoring component=monasca-agent | Amount of time that the collector took for this collection run
 
+### Limiting System Metrics
+The number of system metrics can be reduced with the following configuration parameters.
+
+| Config Option | Values | Description |
+| -------------- | ---------- | ------------------------------------------------------------------------------------------ |
+| net_bytes_only | true/false | Sends bytes/sec metrics only, disabling packets/sec, packets_dropped/sec, and errors/sec. |
+| cpu_idle_only | true/false | Sends idle_perc only, disabling wait/stolen/system/user metrics. |
+| send_io_stats | true/false | If true, sends I/O metrics for each disk device. If false, sends only disk space metrics. |
+
+These parameters may be added to `instances` in the plugin `.yaml` configuration file, or set via `monasca-setup` like this:
+```
+monasca-setup -d system -a 'cpu_idle_only=true net_bytes_only=true send_io_stats=false' --overwrite
+```
+By default, all metrics are enabled.
 
 # Standard Plugins
 Plugins are the way to extend the Monasca Agent. Plugins add additional functionality that allow the agent to perform checks on other applications, servers or services. This section describes the standard plugins that are delivered by default.
@@ -997,6 +1012,8 @@ If the owner of the VM is in a different tenant the Agent Cross-Tenant Metric Su
 
 `ping_check` includes the command line (sans the IP address) used to perform a ping check against instances. Set to False (or omit altogether) to disable ping checks. This is automatically populated during `monasca-setup` from a list of possible `ping` command lines. Generally, `fping` is preferred over `ping` because it can return a failure with sub-second resolution, but if `fping` does not exist on the system, `ping` will be used instead. If ping_check is disabled, the `host_alive_status` metric will not be published unless that VM is inactive. This is because the host status is inconclusive without a ping check.
 
+`ping_only` suppresses all per-VM metrics aside from `host_alive_status` and `vm.host_alive_status`, including all I/O, network, memory, and CPU metrics. [Aggregate Metrics](#aggregate-metrics) are still published when `ping_only` is true. By default, `ping_only` is false. If `ping_only` is true and `ping_check` is false, the only metrics published by the Libvirt plugin are the Aggregate Metrics.
+
 Example config:
 ```
 init_config:
@@ -1009,6 +1026,7 @@ init_config:
     nova_refresh: 14400
     vm_probation: 300
     ping_check: /usr/bin/fping -n -c1 -t250 -q
+    ping_only: false
 instances:
     - {}
 ```
@@ -1016,6 +1034,11 @@ instances:
 
 Note: If the Nova service login credentials are changed, `monasca-setup` would need to be re-run to use the new credentials. Alternately, `/etc/monasca/agent/conf.d/libvirt.yaml` could be modified directly.
 
+Example `monasca-setup` usage:
+```
+monasca-setup -d libvirt -a 'ping_check=false ping_only=false'
+```
+
 ### Instance Cache
 The instance cache (`/dev/shm/libvirt_instances.yaml` by default) contains data that is not available to libvirt, but queried from Nova. To limit calls to the Nova API, the cache is only updated if a new instance is detected (libvirt sees an instance not already in the cache), or every `nova_refresh` seconds (see Configuration above).
 
@@ -28,13 +28,13 @@ class Cpu(checks.AgentCheck):
             cpu_stats.iowait,
             cpu_stats.idle,
             cpu_stats.steal,
-            dimensions)
+            dimensions, instance)
         if send_rollup_stats:
             self.gauge('cpu.total_logical_cores', psutil.cpu_count(logical=True), dimensions)
             num_of_metrics += 1
         log.debug('Collected {0} cpu metrics'.format(num_of_metrics))
 
-    def _format_results(self, us, sy, wa, idle, st, dimensions):
+    def _format_results(self, us, sy, wa, idle, st, dimensions, instance):
         data = {'cpu.user_perc': us,
                 'cpu.system_perc': sy,
                 'cpu.wait_perc': wa,
@@ -42,7 +42,7 @@ class Cpu(checks.AgentCheck):
                 'cpu.stolen_perc': st}
 
         for key in data.keys():
-            if data[key] is None:
+            if (data[key] is None or instance.get('cpu_idle_only') and 'idle_perc' not in key):
                 del data[key]
 
         [self.gauge(key, value, dimensions) for key, value in data.iteritems()]
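The `cpu_idle_only` condition added to `_format_results` can be tried in isolation. The following is a minimal Python 3 sketch, not the Agent's code: `filter_cpu_metrics` is a hypothetical helper, and the `cpu.idle_perc` key is assumed from the `'idle_perc' not in key` test because that dictionary entry falls between the two hunks. It builds a new dict instead of deleting while looping, since `dict.keys()` is a live view in Python 3, and it adds parentheses that spell out the `or`/`and` precedence the patched condition relies on.

```python
def filter_cpu_metrics(data, instance):
    """Return the metrics that would be sent for one instance config.

    `data` maps metric name to value and `instance` is the plugin's
    instance dict. Hypothetical helper, for illustration only.
    """
    idle_only = bool(instance.get('cpu_idle_only'))
    kept = {}
    for key, value in data.items():
        # Drop a metric when it has no value, or when only idle_perc is
        # wanted and this is some other metric.
        if value is None or (idle_only and 'idle_perc' not in key):
            continue
        kept[key] = value
    return kept


sample = {'cpu.user_perc': 1.5,
          'cpu.system_perc': 0.5,
          'cpu.wait_perc': 0.0,
          'cpu.idle_perc': 98.0,    # key name assumed, see lead-in
          'cpu.stolen_perc': None}  # missing values are dropped either way

all_metrics = filter_cpu_metrics(sample, {})
idle_metrics = filter_cpu_metrics(sample, {'cpu_idle_only': True})
```

With an empty instance dict the four metrics that have values are kept; with `cpu_idle_only` set, only the idle percentage remains.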
@@ -233,6 +233,10 @@ class LibvirtCheck(AgentCheck):
                     except OSError as e:
                         self.log.warn("OS error running '{0}' returned {1}".format(ping_cmd, e))
 
+            # Skip the remainder of the checks if ping_only is True in the config
+            if self.init_config.get('ping_only'):
+                continue
+
             # Accumulate aggregate data
             for gauge in agg_gauges:
                 if gauge in instance_cache.get(inst_name):
@@ -37,6 +37,8 @@ class Network(checks.AgentCheck):
                 nic = nics[nic_name]
                 self.rate('net.in_bytes_sec', nic.bytes_recv, device_name=nic_name, dimensions=dimensions)
                 self.rate('net.out_bytes_sec', nic.bytes_sent, device_name=nic_name, dimensions=dimensions)
+                if instance.get('net_bytes_only'):
+                    continue
                 self.rate('net.in_packets_sec', nic.packets_recv, device_name=nic_name, dimensions=dimensions)
                 self.rate('net.out_packets_sec', nic.packets_sent, device_name=nic_name, dimensions=dimensions)
                 self.rate('net.in_errors_sec', nic.errin, device_name=nic_name, dimensions=dimensions)
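The effect of the early `continue` on the per-device metric set can be sketched as below. This is an illustration under assumptions, not the plugin's code: `network_metric_names` is a hypothetical helper, and the last three metric names are not visible in the hunk above; they are assumed from the documentation's mention of packets_dropped/sec and errors/sec.

```python
def network_metric_names(instance):
    """List the per-device metric names sent for one instance config."""
    names = ['net.in_bytes_sec', 'net.out_bytes_sec']
    if instance.get('net_bytes_only'):
        # Mirrors the early `continue`: nothing after the byte rates is sent.
        return names
    names += ['net.in_packets_sec',
              'net.out_packets_sec',
              'net.in_errors_sec',
              # The names below are assumed; the hunk is cut off before them.
              'net.out_errors_sec',
              'net.in_packets_dropped_sec',
              'net.out_packets_dropped_sec']
    return names


full = network_metric_names({})
limited = network_metric_names({'net_bytes_only': True})
```

Under these assumptions the difference between the two lists is six names per device, which matches the figure given in the commit message.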
@@ -2,6 +2,7 @@
 
 Detection classes should be platform independent
 """
+import ast
 import logging
 import sys
 
@@ -58,6 +59,15 @@ class Plugin(object):
         """
         raise NotImplementedError
 
+    def literal_eval(self, testval):
+        """Return a literal boolean value if applicable
+
+        """
+        if 'false' in str(testval).lower() or 'true' in str(testval).lower():
+            return ast.literal_eval(str(testval).capitalize())
+        else:
+            return testval
+
     @property
     def name(self):
         """Return _name if set otherwise the class name.
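The `literal_eval` helper above uses a substring test, so a value that merely contains `true` or `false` (for example `untrue`) is also handed to `ast.literal_eval`, which raises instead of returning the value unchanged. A stricter variant could compare the whole string. The sketch below assumes that only an exact `true` or `false`, in any letter case, should be coerced; `coerce_bool` is a hypothetical name and is not part of this patch.

```python
def coerce_bool(value):
    """Return True/False for 'true'/'false' in any letter case.

    Any other value is returned unchanged, so command lines and numbers
    passed through monasca-setup keep their original form.
    """
    text = str(value).strip().lower()
    if text == 'true':
        return True
    if text == 'false':
        return False
    return value


examples = [coerce_bool('true'),
            coerce_bool('False'),
            coerce_bool('300'),
            coerce_bool('untrue'),
            coerce_bool('/usr/bin/fping -n -c1 -t250 -q')]
```

The exact comparison trades a little flexibility for predictability: anything that is not a plain boolean word passes through as the string the user typed.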
@@ -1,4 +1,3 @@
-import ast
 import ConfigParser
 import grp
 import logging
@@ -120,7 +119,12 @@ class Libvirt(Plugin):
                         break
             if 'ping_check' not in init_config:
                 log.info("\tUnable to find suitable ping command, disabling ping checks.")
-                init_config['ping_check'] = ast.literal_eval('False')
+                init_config['ping_check'] = self.literal_eval('False')
 
+            # Handle monasca-setup detection arguments, which take precedence
+            if self.args:
+                for arg in self.args:
+                    init_config[arg] = self.literal_eval(self.args[arg])
+
             config['libvirt'] = {'init_config': init_config,
                                  'instances': [{}]}
@@ -31,6 +31,9 @@ class System(Plugin):
                 with open(os.path.join(self.template_dir, 'conf.d/' + metric + '.yaml'), 'r') as metric_template:
                     default_config = yaml.load(metric_template.read())
                 config[metric] = default_config
+                if self.args:
+                    for arg in self.args:
+                        config[metric]['instances'][0][arg] = self.literal_eval(self.args[arg])
                 log.info('\tConfigured {0}'.format(metric))
             except (OSError, IOError):
                 log.info('\tUnable to configure {0}'.format(metric))
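How the `-a` arguments end up in a plugin's `instances` entry can be sketched end to end. This is a rough illustration under stated assumptions: `parse_detection_args` and `apply_args` are hypothetical helpers, the `key=value` splitting is assumed from the command-line examples in this patch rather than taken from `monasca-setup`, and the template dict is made up.

```python
import copy

# Only the exact words 'true' and 'false' are turned into booleans here.
BOOLEANS = {'true': True, 'false': False}


def parse_detection_args(arg_string):
    """Split "k1=v1 k2=v2" into a dict of strings (assumed -a format)."""
    return dict(item.split('=', 1) for item in arg_string.split())


def apply_args(default_config, args):
    """Copy a plugin config and write each argument into its first instance."""
    config = copy.deepcopy(default_config)
    for name, value in args.items():
        config['instances'][0][name] = BOOLEANS.get(value.lower(), value)
    return config


# Made-up template standing in for a conf.d/<metric>.yaml default.
template = {'init_config': None, 'instances': [{'name': 'cpu_stats'}]}

args = parse_detection_args(
    'cpu_idle_only=true net_bytes_only=true send_io_stats=false')
result = apply_args(template, args)
```

As in the patched loop, every argument is written into the first instance of each plugin config it is applied to, so a plugin may receive options that the hunks shown here never read.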