Update congestion agent plugin

This commit corrects some mistakes in the documentation
and fixes small issues.

Change-Id: I0a1bda7353c7e742ceb36f182011b3b92dc18da1
Fouad Benamrane 2018-01-03 14:24:01 +01:00
parent 08ac1308b8
commit 4e4faac80c
3 changed files with 67 additions and 41 deletions


@ -839,32 +839,50 @@ These options can be set if desired:
## Congestion
This section describes the congestion check performed by monasca-agent. Congestion check collects metrics from special iptable chain created by the agent called congestion. Metric names that are cross-posted to the infrastructure project will have the ecn. prefix.
This section describes the congestion check performed by monasca-agent.
The congestion check collects metrics from a special iptables chain, called
congestion, that the agent creates. Metric names that are cross-posted to
the infrastructure project will have the ecn. prefix.
Configuration
The congestion check requires a configuration file called congestion.yaml to be available in the agent conf.d configuration directory. An example of the configuration is given below.
### Configuration
`auth_url` is the keystone endpoint for authentication
The congestion check requires a configuration file called congestion.yaml
to be available in the agent conf.d configuration directory. An example
of the configuration is given below.
`cache_dir` will be used to cache ecn metrics in a file called congestion_status.json.
`auth_url` is the keystone endpoint for authentication.
`enable_ecn` optional method that activates ecn marking in each machine. When the transmission equipment in the network encounters a congestion in its queues, it will not mark packets until the ecn is enabled by the sender. enable_ecn method ensures that ecn marking is enabled for both the sender and the receiver. This method is optional because the end user could enable ecn by changing the value of tcp_ecn from 0 to 2 or running 'echo 2 > /proc/sys/net/ipv4/tcp_ecn' in each machine.
`cache_dir` will be used to cache ecn metrics in a file called
congestion_status.json.
`enable_vm` optional method that gathers ecn metrics ecn.packets, ecn.bytes, and ecn.cong.rate of each VM hosted in a remote compute. By default the agent collects ecn metrics of computes, activating this method add fine-grained control of the congestion.
`enable_ecn` is an optional method that activates ecn marking on each machine.
When transmission equipment in the network encounters congestion in its
queues, it will not mark packets unless ecn is enabled by the sender. The
enable_ecn method ensures that ecn marking is enabled for both the sender
and the receiver. This method is optional because the end user could enable
ecn by changing the value of tcp_ecn from 0 to 2 or by running
'echo 2 > /proc/sys/net/ipv4/tcp_ecn' on each machine.
`enable_vm` is an optional method that gathers the ecn metrics ecn.packets,
ecn.bytes, and ecn.cong.rate for each VM hosted on a remote compute. By default
the agent collects ecn metrics of computes; activating this method adds
fine-grained control of the congestion.
`s_factor` is the smoothing factor used to compute the ecn congestion rate.
`collect_period` Period of time in sec to collect metrics and used also in ecn congestion rate calculation.
`collect_period` is the period of time, in seconds, between metric collections;
it is also used in the ecn congestion rate calculation.
`password` is the password for the nova user.
`project_name` is the project/tenant to POST ecn metrics.
`project_name` is the project/tenant to POST ecn metrics.
`region_name` is used to add the region dimension to metrics.
`username` is the username capable of making administrative nova calls.
`instances` are not used and should be empty in congestion.yaml because the plugin runs against all computes.
`instances` are not used and should be empty in congestion.yaml because
the plugin runs against all computes.
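
As a rough illustration of the `enable_ecn` description above, the sketch below reads the kernel's tcp_ecn setting and reports whether ecn is accepted on the machine. It is not taken from the plugin; the helper names `ecn_accepted` and `read_ecn_accepted` are hypothetical, and the interpretation of the values (0 disables ecn, 1 and 2 accept it on incoming connections) follows the Linux kernel sysctl documentation.

```python
TCP_ECN_PATH = "/proc/sys/net/ipv4/tcp_ecn"


def ecn_accepted(raw_value):
    """Return True if the tcp_ecn value allows ecn on incoming connections.

    Hypothetical helper: 0 means ecn is disabled, 1 and 2 both accept
    ecn when it is requested by an incoming connection.
    """
    return int(raw_value.strip()) in (1, 2)


def read_ecn_accepted(path=TCP_ECN_PATH):
    """Read the sysctl file and interpret it (Linux only)."""
    with open(path) as sysctl_file:
        return ecn_accepted(sysctl_file.read())
```

A machine that reports False here is one where `enable_ecn`, or the manual `echo 2` command, would be needed before any packets can be marked.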
Sample config (`congestion.yaml`):
```
@ -886,20 +904,22 @@ instances:
```
The congestion checks return the following metrics:
| Metric Name | Dimensions | Semantics |
|---------------| --------------------------------------------------------| ---------------------------------------------------------|
| ecn.packets | hostname, device, component=Neutron, service=networking | Number of packets marked as Congestion Experienced |
| ecn.bytes | hostname, device, component=Neutron, service=networking | Number of bytes marked as Congestion Experienced |
| ecn.cong.rate | hostname, device, component=Neutron, service=networking | Congestion rate in kbps calculated using ecn.bytes value |
There is a detection plugin that should be used to configure this plugin. It is invoked as:
| Metric Name | Dimensions | Semantics |
|-------------|------------|-----------|
| ecn.packets | hostname, device, component=neutron, service=networking | Number of packets marked as Congestion Experienced |
| ecn.bytes | hostname, device, component=neutron, service=networking | Number of bytes marked as Congestion Experienced |
| ecn.cong.rate | hostname, device, component=neutron, service=networking | Congestion rate in kbps calculated using ecn.bytes value |
There is a detection plugin that should be used to configure this plugin.
It is invoked as:
```
$ monasca-setup -d congestion
You can check the current congestion status of the network by simply running
```
You can check the current congestion status of the network by simply running:
```
$ sudo monasca-collector -v check congestion
```
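
The relationship between `s_factor`, `collect_period`, and the ecn.cong.rate metric can be sketched as follows. The exact formula used by the plugin is not shown in this document, so this is only an assumption: an exponentially smoothed rate computed from two successive ecn.bytes readings. The function name and parameters are hypothetical.

```python
def congestion_rate_kbps(prev_bytes, curr_bytes, collect_period,
                         s_factor, prev_rate=0.0):
    """Return a smoothed congestion rate in kbps (illustrative only).

    Assumed formula, not the plugin's confirmed implementation:
        rate = s_factor * instant + (1 - s_factor) * prev_rate
    """
    if collect_period <= 0:
        raise ValueError("collect_period must be positive")
    # A counter reset (for example after the chain is flushed) would
    # give a negative difference, so clamp it to zero.
    delta_bytes = max(curr_bytes - prev_bytes, 0)
    # Bytes to kilobits, divided by the collection period in seconds.
    instant_kbps = delta_bytes * 8 / 1000.0 / collect_period
    return s_factor * instant_kbps + (1.0 - s_factor) * prev_rate
```

Under this assumption, a larger `s_factor` makes the reported rate follow the most recent collection period more closely, while a smaller value damps short bursts.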
## Couch
See [the example configuration](https://github.com/openstack/monasca-agent/blob/master/conf.d/couch.yaml.example) for how to configure the Couch plugin.


@ -6,14 +6,15 @@ from copy import deepcopy
import json
import logging
import math
from monasca_agent.collector.checks import AgentCheck
from monasca_agent.common import keystone
from novaclient import client as nova_client
import os
import stat
import subprocess
import time
from monasca_agent.collector.checks import AgentCheck
from monasca_agent.common import keystone
from novaclient import client as nova_client
log = logging.getLogger(__name__)
prerouting_chain = "PREROUTING"
congestion_chain = "congestion"
@ -53,7 +54,7 @@ class Congestion(AgentCheck):
"""Extend check method to collect and update congestion metrics.
"""
dimensions = self._set_dimensions({'service': 'networking',
'component': 'Neutron'}, instance)
'component': 'neutron'}, instance)
self.sample_time = float("{:9f}".format(time.time()))
"""Check iptables information and verify/install the ECN rule for
specific hypervisor"""
@ -138,7 +139,7 @@ class Congestion(AgentCheck):
"""Ensures that ECN marking is enabled on each tap interface
"""
tos_rule = "TOS --set-tos 0x02/0xff"
tos_tap = False
taps = None
"""Collect tap interfaces attached to linux bridge"""
try:
taps = subprocess.check_output(
@ -146,21 +147,24 @@ class Congestion(AgentCheck):
shell=True, stderr=subprocess.STDOUT)
except subprocess.CalledProcessError as e:
self.log.error(e.output)
taps = filter(None, taps.split('\n'))
"""Collect installed rules in Forward chain"""
forw_rules = self._get_rule(forward_chain)
for tap in taps:
for rule in forw_rules:
"""Check if the rule was applied to tap interface"""
if (tap + " -j " + tos_rule) in rule:
tos_tap = True
break
if not tos_tap:
"""Enable ECN"""
match = "physdev --physdev-out " + tap
self._add_rule(forward_chain, None, match, tos_rule)
self.log.info("ECN is enabled for %s interface.", tap)
break
if taps:
taps = filter(None, taps.split('\n'))
"""Collect installed rules in Forward chain"""
forw_rules = self._get_rule(forward_chain)
for tap in taps:
tap = tap + " --physdev-is-bridged"
if not self._find_tap(tap, forw_rules, tos_rule):
"""Enable ECN"""
match = "physdev --physdev-in " + tap
self._add_rule(forward_chain, None, match, tos_rule)
self.log.info("ECN is enabled for %s interface.", tap)
def _find_tap(self, tap, chain, tos_rule):
for rule in chain:
"""Check if the rule was applied to tap interface"""
if (tap + " -j " + tos_rule) in rule:
return True
return False
def _add_chain(self, chain):
"""This method adds 'chain' into iptables.


@ -2,10 +2,12 @@
import ConfigParser
import logging
import os
from monasca_agent.common.psutil_wrapper import psutil
import monasca_setup.agent_config
import monasca_setup.detection
import os
log = logging.getLogger(__name__)
# Directory to use for metric caches