[docs] Edits Alarms and Appendix

Edits the following sections of the StackLight Collector
plugin 0.10.0 documentation:

* Configuring alarms
* Appendix

Change-Id: I534611a4eae9aeb97bfedb3971d7a8ec76e20bac
Maria Zlatkova 2016-07-19 19:56:52 +03:00
parent 5c0d43aaec
commit 8581289600
17 changed files with 760 additions and 654 deletions


@ -1,9 +1,13 @@
.. _alarms:
.. raw:: latex
\pagebreak
List of built-in alarms
-----------------------
The following is a list of StackLight built-in alarms::
alarms:
- name: 'cpu-critical-controller'
@ -732,5 +736,4 @@ Here is a list of all the alarms that are built-in in StackLight::
threshold: 5
window: 60
periods: 0
function: min


@ -3,8 +3,8 @@
List of metrics
---------------
The following is a list of metrics that are emitted by the StackLight Collector.
The metrics are listed by category, then by metric name.
System
++++++
@ -63,7 +63,7 @@ Clusters
.. include:: metrics/clusters.rst
Self-monitoring
+++++++++++++++
.. include:: metrics/lma.rst
@ -78,4 +78,4 @@ Elasticsearch
InfluxDB
++++++++
.. include:: metrics/influxdb.rst


@ -3,139 +3,130 @@
Overview
--------
The process of running alarms in StackLight is not centralized, as it is often
the case in more conventional monitoring systems, but distributed across all
the StackLight Collectors.
Each Collector is individually responsible for monitoring the resources and
services that are deployed on the node and for reporting any anomaly or fault
it has detected to the Aggregator.
The anomaly and fault detection logic in StackLight is designed more like an
*expert system* in that the Collector and the Aggregator use artifacts we
can refer to as *facts* and *rules*.
The *facts* are the operational data ingested in the StackLight's
stream-processing pipeline. The *rules* are either alarm rules or aggregation
rules. They are declaratively defined in YAML files that can be modified.
Those rules are turned into a collection of Lua plugins that are executed by
the Collector and the Aggregator. They are generated dynamically using the
Puppet modules of the StackLight Collector Plugin.
The following are the two types of Lua plugins related to the processing of
alarms:
* The **AFD plugin** -- Anomaly and Fault Detection plugin
* The **GSE plugin** -- Global Status Evaluation plugin
These plugins create special types of metrics, as follows:
* The **AFD metric**, which contains information about the health status of a
node or service in the OpenStack environment. The AFD metrics are sent on a
regular basis to the Aggregator where they are further processed by the GSE
plugins.
* The **GSE metric**, which contains information about the health status of a
cluster in the OpenStack environment. A cluster is a logical grouping of
nodes or services. We call them node clusters and service clusters hereafter.
A service cluster can be anything like a cluster of API endpoints or a
cluster of workers. A cluster of nodes is a grouping of nodes that have the
same role. For example, *compute* or *storage*.
.. note:: The AFD and GSE metrics are new types of metrics introduced in
StackLight version 0.8. They contain detailed information about the fault
and anomalies detected by StackLight. For more information about the
message structure of these metrics, refer to
`Metrics section of the Developer Guide
<http://lma-developer-guide.readthedocs.io/en/latest/metrics.html>`_.
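As a rough illustration only, an AFD metric for a service can be pictured as
a metric named ``afd_service_metric`` whose fields identify the service
cluster and the alarm group it reports on. The field names below are taken
from the aggregation-rule examples later in this section; the hostname and
the numeric value are purely hypothetical, and the authoritative message
structure is described in the Developer Guide linked above::

    name:  afd_service_metric     # AFD metric emitted by a Collector
    value: 1                      # encoded health status of the member
    Fields:
      service:  nova-api          # the service cluster being reported on
      source:   http_errors       # the alarm group that produced the status
      hostname: node-3            # illustrative node name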
The following figure shows the StackLight stream-processing pipeline workflow:
.. figure:: ../../images/AFD_and_GSE_message_flow.*
:width: 800
:alt: Message flow for the AFD and GSE metrics
:align: center
.. raw:: latex
\pagebreak
The AFD and GSE plugins
-----------------------
The current version of StackLight contains the following three types of GSE
plugins:
* The **Service Cluster GSE Plugin**, which receives AFD metrics for services
from the AFD plugins.
* The **Node Cluster GSE Plugin**, which receives AFD metrics for nodes
from the AFD plugins.
* The **Global Cluster GSE Plugin**, which receives GSE metrics from the
GSE plugins above. It aggregates and correlates the GSE metrics to issue a
global health status for the top-level clusters like Nova, MySQL, and others.
The health status exposed in the GSE metrics is as follows:
* ``Down``: One or several primary functions of a cluster have failed or are
failing. For example, the API service for Nova or Cinder is not accessible.
* ``Critical``: One or several primary functions of a cluster are severely
degraded. The quality of service delivered to the end user is severely
impacted.
* ``Warning``: One or several primary functions of the cluster are slightly
degraded. The quality of service delivered to the end user is slightly
impacted.
* ``Unknown``: There is not enough data to infer the actual health status of
the cluster.
* ``Okay``: None of the above was found to be true.
The AFD and GSE persisters
--------------------------
The AFD and GSE metrics are also consumed by other types of Lua plugins called
**persisters**:
* The **InfluxDB persister** transforms the GSE metrics into InfluxDB data
points and Grafana annotations. They are used in Grafana to graph the health
status of the OpenStack clusters.
* The **Elasticsearch persister** transforms the AFD metrics into events that
are indexed in Elasticsearch. Using Kibana, these events can be searched to
display a fault or an anomaly that occurred in the environment (not yet
implemented).
* The **Nagios persister** transforms the GSE and AFD metrics into passive
checks that are sent to Nagios for alerting and escalation.
New persisters can be easily created to feed other systems with the
operational insight contained in the AFD and GSE metrics.
.. _alarm_configuration:
Alarms configuration
--------------------
StackLight comes with a predefined set of alarm rules. We have tried to make
these rules as comprehensive and relevant as possible, but your mileage may
vary depending on the specifics of your OpenStack environment and monitoring
requirements. Therefore, it is possible to modify those predefined rules and
create new ones. To do so, modify the ``/etc/hiera/override/alarming.yaml``
file and apply the :ref:`Puppet manifest <puppet_apply>` that will dynamically
generate Lua plugins, known as the AFD Plugins, which are the actuators of the
alarm rules. Before you proceed, make sure that you understand the structure
of that file.
.. _alarm_structure:
Alarm structure
+++++++++++++++
An alarm rule is defined declaratively using the YAML syntax. For example::
name: 'fs-warning'
description: 'Filesystem free space is low'
@ -180,7 +171,7 @@ as shown in the example below::
| logical_operator
| Type: Enum('and' | '&&' | 'or' | '||')
| The conjunction relation for the alarm rules
| metric
| Type: unicode
@ -192,24 +183,25 @@ as shown in the example below::
| fields
| Type: list
| List of field name / value pairs, also known as dimensions, used to select
a particular device for the metric, such as a network interface name or
file system mount point. If the value is specified as an empty string (""),
then the rule is applied to all the aggregated values for the specified
field name. For example, the file system mount point. If value is
specified as the '*' wildcard character, then the rule is applied to each
of the metrics matching the metric name and field name. For example, the
alarm definition sample given above would run the rule for each of the
file system mount points associated with the *fs_space_percent_free*
metric.
| window
| Type: integer
| The in-memory time-series analysis window in seconds
| periods
| Type: integer
| The number of prior time-series analysis windows to compare the window
| with (this is not implemented yet).
| function
| Type: enum('last' | 'min' | 'max' | 'sum' | 'count' | 'avg' | 'median' | 'mode' | 'roc' | 'mww' | 'mww_nonparametric')
@ -232,46 +224,49 @@ as shown in the example below::
| returns the value that occurs most often in all the values
| (not implemented yet)
| roc:
| The 'roc' function detects a significant rate of change when comparing
current metrics values with historical data. To achieve this, it
computes the average of the values in the current window and the
average of the values in the window before the current window and
compares the difference against the standard deviation of the
historical window. The function returns ``true`` if the difference
exceeds the standard deviation multiplied by the 'threshold' value.
This function uses the rate of change algorithm already available in the
anomaly detection module of Heka. It can only be applied to normal
distributions. With an alarm rule using the 'roc' function, the
'window' parameter specifies the duration in seconds of the current
window, and the 'periods' parameter specifies the number of windows
used for the historical data. You need at least one period and the
'periods' parameter must not be zero. If you choose a period of 'p',
the function will compute the rate of change using a historical data
window of ('p' * window) seconds. For example, if you specify the
following in the alarm rule:
|
| window = 60
| periods = 3
| threshold = 1.5
|
| the function will store in a circular buffer the value of the metrics
received during the last 300 seconds (5 minutes) where:
|
| Current window (CW) = 60 sec
| Previous window (PW) = 60 sec
| Historical window (HW) = 180 sec
|
| and apply the following formula:
|
| abs(avg(CW) - avg(PW)) > std(HW) * 1.5 ? true : false
| mww:
| returns the result (true, false) of the Mann-Whitney-Wilcoxon test
function of Heka that can be used only with normal distributions (not
implemented yet)
| mww-nonparametric:
| returns the result (true, false) of the Mann-Whitney-Wilcoxon test
function of Heka that can be used with non-normal distributions (not
implemented yet)
| diff:
| returns the difference between the last value and the first value of
all the values
| threshold
| Type: float
@ -281,15 +276,13 @@ as shown in the example below::
Modify or create an alarm
+++++++++++++++++++++++++
To modify or create an alarm, edit the ``/etc/hiera/override/alarming.yaml``
file. This file has the following sections:
#. The ``alarms`` section contains a global list of alarms that are executed
by the Collectors. These alarms are global to the LMA toolchain and should
be kept identical on all nodes of the OpenStack environment. The following
is another example of the definition of an alarm::
alarms:
- name: 'cpu-critical-controller'
@ -312,30 +305,29 @@ This file has four sections:
periods: 0
function: avg
This alarm is called 'cpu-critical-controller'. It says that CPU activity
is critical (severity: 'critical') if any of the rules in the alarm
definition evaluate to true.
The rule says that the alarm will evaluate to 'true' if the value of the
metric ``cpu_idle`` has been on average (function: avg) below or equal
(relational_operator: <=) to 5 for the last 2 minutes (window: 120).
OR (logical_operator: 'or')
If the value of the metric ``cpu_wait`` has been on average (function: avg)
greater than or equal (relational_operator: >=) to 35 for the last 2 minutes
(window: 120).
Note that these metrics are expressed as percentages.
What alarms are executed on which node depends on the mapping between the
alarm definition and the definition of a cluster as described in the
following sections.
#. The ``node_cluster_roles`` section defines the mapping between the internal
definition of a cluster of nodes and one or several Fuel roles.
For example::
node_cluster_roles:
controller: ['primary-controller', 'controller']
@ -343,22 +335,19 @@ This file has four sections:
storage: ['cinder', 'ceph-osd']
[ ... ]
Creates a mapping between the 'primary-controller' and 'controller' Fuel
roles, and the internal definition of a cluster of nodes called 'controller'.
Likewise, the internal definition of a cluster of nodes called 'storage' is
mapped to the 'cinder' and 'ceph-osd' Fuel roles. The internal definition
of a cluster of nodes is used to assign the alarms to the relevant category
of nodes. This mapping is also used to configure the **passive checks**
in Nagios. Therefore, it is critically important to keep exactly the same
copy of ``/etc/hiera/override/alarming.yaml`` across all nodes of the
OpenStack environment including the node(s) where Nagios is installed.
#. The ``service_cluster_roles`` section defines the mapping between the
internal definition of a cluster of services and one or several Fuel roles.
For example::
service_cluster_roles:
rabbitmq: ['primary-controller', 'controller']
@ -366,18 +355,17 @@ This file has four sections:
elasticsearch: ['primary-elasticsearch_kibana', 'elasticsearch_kibana']
[ ... ]
Creates a mapping between the 'primary-controller' and 'controller' Fuel
roles, and the internal definition of a cluster of services called 'rabbitmq'.
Likewise, the internal definition of a cluster of services called
'elasticsearch' is mapped to the 'primary-elasticsearch_kibana' and
'elasticsearch_kibana' Fuel roles. As for the clusters of nodes, the
internal definition of a cluster of services is used to assign the alarms
to the relevant category of services.
#. The ``node_cluster_alarms`` section defines the mapping between the
internal definition of a cluster of nodes and the alarms that are assigned
to that category of nodes. For example::
node_cluster_alarms:
controller:
@ -385,121 +373,105 @@ This file has four sections:
root-fs: ['root-fs-critical', 'root-fs-warning']
log-fs: ['log-fs-critical', 'log-fs-warning']
Creates three alarm groups for the cluster of nodes called 'controller':
* The *cpu* alarm group is mapped to two alarms defined in the ``alarms``
section known as the 'cpu-critical-controller' and
'cpu-warning-controller' alarms. These alarms monitor the CPU on the
controller nodes. The order matters here since the first alarm that
evaluates to 'true' stops the evaluation. Therefore, it is important
to start the list with the most critical alarms.
* The *root-fs* alarm group is mapped to two alarms defined in the
``alarms`` section known as the 'root-fs-critical' and 'root-fs-warning'
alarms. These alarms monitor the root file system on the controller nodes.
* The *log-fs* alarm group is mapped to two alarms defined in the ``alarms``
section known as the 'log-fs-critical' and 'log-fs-warning' alarms. These
alarms monitor the file system where the logs are created on the
controller nodes.
.. note:: An *alarm group* is a mere implementation artifact (although it
has functional value) that is primarily used to distribute the alarms
evaluation workload across several Lua plugins. Since the Lua plugins
runtime is sandboxed within Heka, it is preferable to run smaller sets
of alarms in different plugins rather than a large set of alarms in a
single plugin. This is to avoid having alarms evaluation plugins
shut down by Heka. Furthermore, the alarm groups are used to identify
what is called a *source*. A *source* is a tuple in which we associate
a cluster with an alarm group. For example, the tuple
['controller', 'cpu'] is a *source*. It associates a 'controller'
cluster with the 'cpu' alarm group. The tuple ['controller', 'root-fs']
is another *source* example. The *source* is used by a GSE Plugin to remember
the AFD metrics it has received. If a GSE Plugin stops receiving
AFD metrics it used to get, then the GSE Plugin infers that the health
status of the cluster associated with the source is *Unknown*.
This is evaluated every *ticker-interval*. By default, the
*ticker interval* for the GSE Plugins is set to 10 seconds.
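Putting these sections together, adding a brand-new alarm and assigning it to
a category of nodes touches the ``alarms``, ``node_cluster_roles``, and
``node_cluster_alarms`` sections. The following is a hypothetical sketch
only: the alarm name, the 'compute' mappings, and the ``trigger`` nesting are
illustrative assumptions, so align them with the existing entries in your
``/etc/hiera/override/alarming.yaml`` before applying the
:ref:`Puppet manifest <puppet_apply>`::

    alarms:
      - name: 'cpu-warning-compute'       # hypothetical alarm name
        description: 'CPU wait is high on the compute node'
        severity: 'warning'
        trigger:                          # assumed nesting, mirror the built-in alarms
          logical_operator: 'or'
          rules:
            - metric: cpu_wait
              relational_operator: '>='
              threshold: 25
              window: 120
              periods: 0
              function: avg

    node_cluster_roles:
      compute: ['compute']                # map the 'compute' Fuel role to a node cluster

    node_cluster_alarms:
      compute:
        cpu: ['cpu-warning-compute']      # assign the new alarm group to the cluster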
.. _aggreg_correl_config:
Aggregation and correlation configuration
-----------------------------------------
StackLight comes with a predefined set of aggregation rules and correlation
policies. However, you can create new aggregation rules and correlation
policies or modify the existing ones. To do so, modify the ``/etc/hiera/override/gse_filters.yaml`` file and apply the
:ref:`Puppet manifest <puppet_apply>` that will generate Lua plugins known as
the GSE Plugins, which are the actuators of these aggregation rules and
correlation policies. But before you proceed, verify that you understand the
structure of that file.
.. note:: As for ``/etc/hiera/override/alarming.yaml``, it is critically
important to keep exactly the same copy of
``/etc/hiera/override/gse_filters.yaml`` across all the nodes of the
OpenStack environment including the node(s) where Nagios is installed.
The aggregation rules and correlation policies are defined in the ``/etc/hiera/override/gse_filters.yaml`` configuration file.
This file has the following sections:
#. The ``gse_policies`` section contains the :ref:`health status correlation
policies <gse_policies>` that apply to the node clusters and service
clusters.
#. The ``gse_cluster_service`` section contains the :ref:`aggregation rules
<gse_cluster_service>` for the service clusters. These aggregation rules
are actuated by the Service Cluster GSE Plugin that runs on the Aggregator.
#. The ``gse_cluster_node`` section contains the :ref:`aggregation rules
<gse_cluster_node>` for the node clusters. These aggregation rules are
actuated by the Node Cluster GSE Plugin that runs on the Aggregator.
#. The ``gse_cluster_global`` section contains the :ref:`aggregation
rules <gse_cluster_global>` for the so-called top-level clusters. A global
cluster is a kind of logical construct of node clusters and service
clusters. These aggregation rules are actuated by the Global Cluster GSE
Plugin that runs on the Aggregator.
.. _gse_policies:
Health status policies
++++++++++++++++++++++
The correlation logic implemented by the GSE plugins is policy-based. The
policies define how the GSE plugins infer the health status of a cluster.
By default, there are two policies:
* The **highest_severity** policy defines that the cluster's status depends on
the member with the highest severity, typically used for a cluster of
services.
* The **majority_of_members** policy defines that the cluster is healthy as
long as (N+1)/2 members of the cluster are healthy. This is typically used
for clusters managed by Pacemaker.
A policy consists of a list of rules that are evaluated against the current
status of the cluster's members. When one of the rules matches, the cluster's
status gets the value associated with the rule and the evaluation stops. The
last rule of the list is usually a catch-all rule that defines the default
status if none of the previous rules matches.
The following example shows the policy rule definition::
# The following rule definition reads as: "the cluster's status is critical
# if more than 50% of its members are either down or critical"
- status: critical
trigger:
logical_operator: or
@ -517,7 +489,7 @@ Where
| logical_operator
| Type: Enum('and' | '&&' | 'or' | '||')
| The conjunction relation for the condition rules
| rules
| Type: list
@ -543,7 +515,7 @@ Where
| Type: float
| The threshold value
Consider the policy called *highest_severity*::
gse_policies:
@ -582,28 +554,31 @@ Lets take a closer look at the policy called *highest_severity*::
threshold: 0
- status: unknown
The policy definition reads as follows:
* The status of the cluster is ``Down`` if the status of at least one
cluster's member is ``Down``.
* Otherwise, the status of the cluster is ``Critical`` if the status of at
least one cluster's member is ``Critical``.
* Otherwise, the status of the cluster is ``Warning`` if the status of at
least one cluster's member is ``Warning``.
* Otherwise, the status of the cluster is ``Okay`` if the status of at least
one cluster's member is ``Okay``.
* Otherwise, the status of the cluster is ``Unknown``.
.. _gse_cluster_service:
Service cluster aggregation rules
+++++++++++++++++++++++++++++++++
The service cluster aggregation rules are used to designate the members of a
service cluster along with the AFD metrics that must be taken into account to
derive a health status for the service cluster. The following is an example of
the service cluster aggregation rules::
gse_cluster_service:
input_message_types:
@ -673,7 +648,7 @@ Where
Service cluster definition
++++++++++++++++++++++++++
The following example shows the service clusters definition::
gse_cluster_service:
[...]
@ -691,36 +666,36 @@ Where
| members
| Type: list
| The list of cluster members.
The AFD messages are associated with the cluster when the ``cluster_field``
value is equal to the cluster name and the ``member_field`` value is in
this list.
| group_by
| Type: Enum(member, hostname)
| This parameter defines how the incoming AFD metrics are aggregated.
|
| member:
| aggregation by member, irrespective of the host that emitted the AFD
| metric. This setting is typically used for AFD metrics that are not
| host-centric.
|
| hostname:
| aggregation by hostname then by member.
| This setting is typically used for AFD metrics that are host-centric,
| such as those working on the file system or CPU usage metrics.
| policy:
| Type: unicode
| The policy to use for computing the service cluster status.
See :ref:`gse_policies` for details.
A closer look at the example above shows that the Service Cluster GSE
plugin resulting from those rules will emit a *gse_service_cluster_metric*
message every 10 seconds to report the current status of the *nova-api*
cluster. This status is computed using the *afd_service_metric* metric for
which Fields[service] is 'nova-api' and Fields[source] is one of 'backends',
'endpoint', or 'http_errors'. The 'nova-api' cluster's status is computed using
the 'highest_severity' policy, which means that it will be equal to the 'worst'
status across all members.
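To monitor an additional service cluster, you would add a similar entry next
to *nova-api*. The following sketch is hypothetical: it assumes that the
cluster definitions are nested under a ``clusters`` key and it reuses the
member names from the example above, so check the shipped
``/etc/hiera/override/gse_filters.yaml`` for the exact layout::

    gse_cluster_service:
      [...]
      clusters:
        cinder-api:                  # hypothetical additional service cluster
          policy: highest_severity   # the worst member status wins
          group_by: member
          members:
            - backends
            - endpoint
            - http_errors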
.. _gse_cluster_node:
@ -728,11 +703,10 @@ status across all members.
Node cluster aggregation rules
++++++++++++++++++++++++++++++
The node cluster aggregation rules are used to designate the members of a node
cluster along with the AFD metrics that must be taken into account to derive
a health status for the node cluster. The following is an example of the node
cluster aggregation rules::
gse_cluster_node:
input_message_types:
@ -804,7 +778,7 @@ Where
Node cluster definition
+++++++++++++++++++++++
The following example shows the node clusters definition::
gse_cluster_node:
[...]
@ -822,36 +796,35 @@ Where
| members
| Type: list
| The list of cluster members.
The AFD messages are associated with the cluster when the ``cluster_field``
value is equal to the cluster name and the ``member_field`` value is in
this list.
| group_by
| Type: Enum(member, hostname)
| This parameter defines how the incoming AFD metrics are aggregated.
|
| member:
| aggregation by member, irrespective of the host that emitted the AFD
| metric. This setting is typically used for AFD metrics that are not
| host-centric.
|
| hostname:
| aggregation by hostname then by member.
| This setting is typically used for AFD metrics that are host-centric,
| such as those working on the file system or CPU usage metrics.
| policy:
| Type: unicode
| The policy to use for computing the node cluster status.
See :ref:`gse_policies` for details.
A closer look at the example above shows that the Node Cluster GSE plugin
resulting from those rules will emit a *gse_node_cluster_metric* message every
10 seconds to report the current status of the *controller* cluster. This
status is computed using the *afd_node_metric* metric for which
Fields[node_role] is 'controller' and Fields[source] is one of 'cpu',
'root-fs', or 'log-fs'. The 'controller' cluster's status is computed using the
'majority_of_members' policy, which means that it will be equal to the 'majority'
status across all members.
.. _gse_cluster_global:
@ -859,23 +832,20 @@ status across all members.
Top-level cluster aggregation rules
+++++++++++++++++++++++++++++++++++
The top-level aggregation rules aggregate GSE metrics from the Service
Cluster GSE Plugin and the Node Cluster GSE Plugin. This is the last
aggregation stage that issues health status for the top-level clusters.
A top-level cluster is a logical construct of service and node clustering.
By default, we define that the health status of Nova, as a top-level cluster,
depends on the health status of several service clusters related to Nova and
the health status of the 'controller' and 'compute' node clusters. But it can
be anything. For example, you can define a 'control-plane' top-level cluster
that would exclude the health status of the 'compute' node cluster if required.
The top-level cluster aggregation rules are used to designate the node
clusters and service clusters members of a top-level cluster along with the
GSE metrics that must be taken into account to derive a health status for the
top-level cluster. The following is an example of top-level cluster
aggregation rules::
gse_cluster_global:
input_message_types:
@ -954,7 +924,7 @@ Where
Top-level cluster definition
++++++++++++++++++++++++++++
The following example shows the top-level clusters definition::
gse_cluster_global:
[...]
@ -987,15 +957,16 @@ Where
| members
| Type: list
| The list of cluster members.
| The GSE messages are associated with the cluster when the ``member_field``
| value (``cluster_name``) is in this list.
| hints
| Type: list
| The list of clusters that are indirectly associated with the top-level
| cluster. The GSE messages are indirectly associated with the cluster when
| the ``member_field`` value (``cluster_name``) is in this list. This means
| that they are not used to derive the health status of the top-level
| cluster but serve as 'hints' for root cause analysis.
| group_by
| Type: Enum(member, hostname)
@ -1004,8 +975,8 @@ Where
| policy:
| Type: unicode
| The policy to use for computing the top-level cluster status.
See :ref:`gse_policies` for details.
.. _puppet_apply:
@ -1015,11 +986,10 @@ Apply your configuration changes
Once you have edited and saved your changes in
``/etc/hiera/override/alarming.yaml`` and/or
``/etc/hiera/override/gse_filters.yaml``,
apply the following Puppet manifest on all the nodes of your OpenStack
environment **including the node(s) where Nagios is installed**
for the changes to take effect::
# puppet apply --modulepath=/etc/fuel/plugins/lma_collector-<version>/puppet/modules:\
/etc/puppet/modules \
/etc/fuel/plugins/lma_collector-<version>/puppet/manifests/configure_afd_filters.pp


@ -77,6 +77,10 @@ Plugin configuration
.. _plugin_verification:
.. raw:: latex
\pagebreak
Plugin verification
-------------------


@ -1,108 +1,128 @@
.. _Ceph_metrics:
All Ceph metrics have a ``cluster`` field containing the name of the Ceph
cluster (*ceph* by default).
For details, see
`Cluster monitoring <http://docs.ceph.com/docs/master/rados/operations/monitoring/>`_
and `RADOS monitoring <http://docs.ceph.com/docs/master/rados/operations/monitoring-osd-pg/>`_.
Cluster
^^^^^^^
* ``ceph_health``, the health status of the entire cluster where values
``1``, ``2``, ``3`` represent ``OK``, ``WARNING`` and ``ERROR``, respectively.
* ``ceph_monitor_count``, the number of ceph-mon processes.
* ``ceph_quorum_count``, the number of ceph-mon processes participating in the
quorum.
Pools
^^^^^
* ``ceph_pool_total_avail_bytes``, the total available size in bytes for all
pools.
* ``ceph_pool_total_bytes``, the total number of bytes for all pools.
* ``ceph_pool_total_number``, the total number of pools.
* ``ceph_pool_total_used_bytes``, the total used size in bytes by all pools.
The following metrics have a ``pool`` field that contains the name of the
Ceph pool.
* ``ceph_pool_bytes_used``, the amount of data in bytes used by the pool.
* ``ceph_pool_max_avail``, the available size in bytes for the pool.
* ``ceph_pool_objects``, the number of objects in the pool.
* ``ceph_pool_op_per_sec``, the number of operations per second for the pool.
* ``ceph_pool_pg_num``, the number of placement groups for the pool.
* ``ceph_pool_read_bytes_sec``, the number of bytes read per second for the pool.
* ``ceph_pool_size``, the number of data replications for the pool.
* ``ceph_pool_write_bytes_sec``, the number of bytes written per second for the
pool.
Placement Groups
^^^^^^^^^^^^^^^^
* ``ceph_pg_bytes_avail``, the available size in bytes.
* ``ceph_pg_bytes_total``, the cluster total size in bytes.
* ``ceph_pg_bytes_used``, the data stored size in bytes.
* ``ceph_pg_data_bytes``, the stored data size in bytes before it is
replicated, cloned or snapshotted.
* ``ceph_pg_state``, the number of placement groups in a given state. The
metric contains a ``state`` field whose value is a combination of two or
more states from the following list, separated by ``+``: ``creating``,
``active``, ``clean``, ``down``, ``replay``, ``splitting``, ``scrubbing``,
``degraded``, ``inconsistent``, ``peering``, ``repair``, ``recovering``,
``recovery_wait``, ``backfill``, ``backfill-wait``, ``backfill_toofull``,
``incomplete``, ``stale``, ``remapped``.
* ``ceph_pg_total``, the total number of placement groups.
OSD Daemons
^^^^^^^^^^^
* ``ceph_osd_down``, the number of OSD daemons DOWN.
* ``ceph_osd_in``, the number of OSD daemons IN.
* ``ceph_osd_out``, the number of OSD daemons OUT.
* ``ceph_osd_up``, the number of OSD daemons UP.
The following metrics have an ``osd`` field that contains the OSD identifier:
* ``ceph_osd_apply_latency``, apply latency in ms for the given OSD.
* ``ceph_osd_commit_latency``, commit latency in ms for the given OSD.
* ``ceph_osd_total``, the total size in bytes for the given OSD.
* ``ceph_osd_used``, the data stored size in bytes for the given OSD.
OSD Performance
^^^^^^^^^^^^^^^
All the following metrics are retrieved per OSD daemon from the corresponding
``/var/run/ceph/ceph-osd.<ID>.asok`` socket by issuing the :command:`perf dump`
command.
All metrics have an ``osd`` field that contains the OSD identifier.
.. note:: These metrics are not collected when a node has both the ceph-osd
and controller roles.
For details, see `OSD performance counters <http://ceph.com/docs/firefly/dev/perf_counters/>`_.
* ``ceph_perf_osd_op``, the number of client operations.
* ``ceph_perf_osd_op_in_bytes``, the number of bytes received from clients for
write operations.
* ``ceph_perf_osd_op_latency``, the average latency in ms for client operations
(including queue time).
* ``ceph_perf_osd_op_out_bytes``, the number of bytes sent to clients for read
operations.
* ``ceph_perf_osd_op_process_latency``, the average latency in ms for client
operations (excluding queue time).
* ``ceph_perf_osd_op_r``, the number of client read operations.
* ``ceph_perf_osd_op_r_latency``, the average latency in ms for read operations
(including queue time).
* ``ceph_perf_osd_op_r_out_bytes``, the number of bytes sent to clients for
read operations.
* ``ceph_perf_osd_op_r_process_latency``, the average latency in ms for read
operations (excluding queue time).
* ``ceph_perf_osd_op_rw``, the number of client read-modify-write operations.
* ``ceph_perf_osd_op_rw_in_bytes``, the number of bytes per second received
from clients for read-modify-write operations.
* ``ceph_perf_osd_op_rw_latency``, the average latency in ms for
read-modify-write operations (including queue time).
* ``ceph_perf_osd_op_rw_out_bytes``, the number of bytes per second sent to
clients for read-modify-write operations.
* ``ceph_perf_osd_op_rw_process_latency``, the average latency in ms for
read-modify-write operations (excluding queue time).
* ``ceph_perf_osd_op_rw_rlat``, the average latency in ms for read-modify-write
operations with readable/applied.
* ``ceph_perf_osd_op_w``, the number of client write operations.
* ``ceph_perf_osd_op_wip``, the number of replication operations currently
being processed (primary).
* ``ceph_perf_osd_op_w_in_bytes``, the number of bytes received from clients
for write operations.
* ``ceph_perf_osd_op_w_latency``, the average latency in ms for write
operations (including queue time).
* ``ceph_perf_osd_op_w_process_latency``, the average latency in ms for write
operations (excluding queue time).
* ``ceph_perf_osd_op_w_rlat``, the average latency in ms for write operations
with readable/applied.
* ``ceph_perf_osd_recovery_ops``, the number of recovery operations in progress.


@ -3,24 +3,23 @@
The cluster metrics are emitted by the GSE plugins. For details, see
:ref:`Configuring alarms <configure_alarms>`.
* ``cluster_node_status``, the status of the node cluster.
The metric contains a ``cluster_name`` field that identifies the node cluster.
* ``cluster_node_status``, the status of the node cluster. The metric contains
a ``cluster_name`` field that identifies the node cluster.
* ``cluster_service_status``, the status of the service cluster.
The metric contains a ``cluster_name`` field that identifies the service cluster.
* ``cluster_status``, the status of the global cluster.
The metric contains a ``cluster_name`` field that identifies the global cluster.
* ``cluster_service_status``, the status of the service cluster. The metric
contains a ``cluster_name`` field that identifies the service cluster.
* ``cluster_status``, the status of the global cluster. The metric contains a
``cluster_name`` field that identifies the global cluster.
The supported values for these metrics are:
* `0` for the *Okay* status.
* ``0`` for the *Okay* status.
* `1` for the *Warning* status.
* ``1`` for the *Warning* status.
* `2` for the *Unknown* status.
* ``2`` for the *Unknown* status.
* `3` for the *Critical* status.
* ``3`` for the *Critical* status.
* `4` for the *Down* status.
* ``4`` for the *Down* status.
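For example, the following minimal Python sketch (the mapping comes from the
list above; the helper name is illustrative only) translates a numeric status
into its symbolic name::

    # Numeric values of the cluster_*_status metrics and their meaning.
    CLUSTER_STATUS = {
        0: 'Okay',
        1: 'Warning',
        2: 'Unknown',
        3: 'Critical',
        4: 'Down',
    }

    def status_name(value):
        """Return the symbolic name of a cluster status metric value."""
        return CLUSTER_STATUS.get(int(value), 'Unknown')

    print(status_name(3))  # Critical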


@ -1,20 +1,19 @@
.. _Elasticsearch:
The following metrics represent the overall health status of the cluster.
See `cluster health`_ for further details.
For details, see `Cluster health <https://www.elastic.co/guide/en/elasticsearch/reference/1.7/cluster-health.html>`_.
* ``elasticsearch_cluster_active_primary_shards``, the number of active primary
shards.
* ``elasticsearch_cluster_active_shards``, the number of active shards.
* ``elasticsearch_cluster_health``, the health status of the entire cluster
where values ``1``, ``2`` , ``3`` represent respectively ``green``,
``yellow`` and ``red``. The ``red`` status may also be reported when the
Elasticsearch API returns an unexpected result (network failure for instance).
where the values ``1``, ``2``, and ``3`` represent ``green``, ``yellow``, and
``red``, respectively. The ``red`` status may also be reported when the
Elasticsearch API returns an unexpected result, for example, a network
failure.
* ``elasticsearch_cluster_initializing_shards``, the number of initializing
shards.
* ``elasticsearch_cluster_number_of_nodes``, the number of nodes in the cluster.
* ``elasticsearch_cluster_number_of_pending_tasks``, the number of pending tasks.
* ``elasticsearch_cluster_relocating_shards``, the number of relocating shards.
* ``elasticsearch_cluster_unassigned_shards``, the number of unassigned shards.
.. _cluster health: https://www.elastic.co/guide/en/elasticsearch/reference/1.7/cluster-health.html
* ``elasticsearch_cluster_unassigned_shards``, the number of unassigned shards.
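As an illustration, the same health information can be fetched directly from
the Elasticsearch API. The following Python sketch assumes a default endpoint
on ``localhost:9200`` and maps the textual status to the numeric value reported
by ``elasticsearch_cluster_health``::

    import json
    from urllib.request import urlopen

    # The endpoint is an assumption for a default deployment; adjust as needed.
    URL = 'http://localhost:9200/_cluster/health'

    # Textual API statuses mapped to the values emitted by the Collector.
    HEALTH_TO_METRIC = {'green': 1, 'yellow': 2, 'red': 3}

    health = json.loads(urlopen(URL, timeout=5).read().decode('utf-8'))
    # Unknown or unexpected results are reported as 'red' (3).
    print(health['status'], HEALTH_TO_METRIC.get(health['status'], 3))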


@ -1,6 +1,6 @@
.. _haproxy_metrics:
``frontend`` and ``backend`` field values can be:
The ``frontend`` and ``backend`` field values can be as follows:
* cinder-api
* glance-api
@ -35,7 +35,8 @@ Server
Frontends
^^^^^^^^^
The following metrics have a ``frontend`` field that contains the name of the frontend server.
The following metrics have a ``frontend`` field that contains the name of the
front-end server:
* ``haproxy_frontend_bytes_in``, the number of bytes received by the frontend.
* ``haproxy_frontend_bytes_out``, the number of bytes transmitted by the frontend.
@ -55,25 +56,33 @@ Backends
^^^^^^^^
.. _haproxy_backend_metric:
The following metrics have a ``backend`` field that contains the name of the backend server.
The following metrics have a ``backend`` field that contains the name of the
back-end server:
* ``haproxy_backend_bytes_in``, the number of bytes received by the backend.
* ``haproxy_backend_bytes_out``, the number of bytes transmitted by the backend.
* ``haproxy_backend_bytes_in``, the number of bytes received by the back end.
* ``haproxy_backend_bytes_out``, the number of bytes transmitted by the back end.
* ``haproxy_backend_denied_requests``, the number of denied requests.
* ``haproxy_backend_denied_responses``, the number of denied responses.
* ``haproxy_backend_downtime``, the total downtime in second.
* ``haproxy_backend_downtime``, the total downtime in seconds.
* ``haproxy_backend_error_connection``, the number of error connections.
* ``haproxy_backend_error_responses``, the number of error responses.
* ``haproxy_backend_queue_current``, the number of requests in queue.
* ``haproxy_backend_redistributed``, the number of times a request was redispatched to another server.
* ``haproxy_backend_redistributed``, the number of times a request was
redispatched to another server.
* ``haproxy_backend_response_1xx``, the number of HTTP responses with 1xx code.
* ``haproxy_backend_response_2xx``, the number of HTTP responses with 2xx code.
* ``haproxy_backend_response_3xx``, the number of HTTP responses with 3xx code.
* ``haproxy_backend_response_4xx``, the number of HTTP responses with 4xx code.
* ``haproxy_backend_response_5xx``, the number of HTTP responses with 5xx code.
* ``haproxy_backend_response_other``, the number of HTTP responses with other code.
* ``haproxy_backend_retries``, the number of times a connection to a server was retried.
* ``haproxy_backend_servers``, the count of servers grouped by state. This metric has an additional ``state`` field that contains the state of the backends (either 'down' or 'up').
* ``haproxy_backend_response_other``, the number of HTTP responses with other
code.
* ``haproxy_backend_retries``, the number of times a connection to a server
was retried.
* ``haproxy_backend_servers``, the count of servers grouped by state. This
metric has an additional ``state`` field that contains the state of the
back ends (either 'down' or 'up').
* ``haproxy_backend_session_current``, the number of current sessions.
* ``haproxy_backend_session_total``, the cumulative number of sessions.
* ``haproxy_backend_status``, the global backend status where values ``0`` and ``1`` represent respectively ``DOWN`` (all backends are down) and ``UP`` (at least one backend is up).
* ``haproxy_backend_status``, the global back-end status where values ``0``
and ``1`` represent, respectively, ``DOWN`` (all back ends are down) and ``UP``
(at least one back end is up).


@ -1,37 +1,47 @@
.. InfluxDB:
The following metrics are extracted from the output of ``show stats`` command.
The values are reset to zero when InfluxDB is restarted.
The following metrics are extracted from the output of the :command:`show stats`
command. The values are reset to zero when InfluxDB is restarted.
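For reference, these statistics can also be queried over the InfluxDB HTTP API.
The following Python sketch assumes a default endpoint on ``localhost:8086``
and no authentication::

    import json
    from urllib.parse import urlencode
    from urllib.request import urlopen

    # Host and port are assumptions for a default InfluxDB deployment.
    url = 'http://localhost:8086/query?' + urlencode({'q': 'SHOW STATS'})

    # One series is returned per module (cluster, httpd, write, runtime, ...);
    # all counters reset to zero when InfluxDB restarts.
    stats = json.loads(urlopen(url, timeout=5).read().decode('utf-8'))
    for series in stats['results'][0]['series']:
        print(series['name'], dict(zip(series['columns'], series['values'][0])))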
cluster
^^^^^^^
These metrics are only available if there are more than one node in the cluster.
The following metrics are only available if there is more than one node in the
cluster:
* ``influxdb_cluster_write_shard_points_requests``, the number of requests for writing a time series points to a shard.
* ``influxdb_cluster_write_shard_requests``, the number of requests for writing to a shard.
* ``influxdb_cluster_write_shard_points_requests``, the number of requests for
writing time-series points to a shard.
* ``influxdb_cluster_write_shard_requests``, the number of requests for writing
to a shard.
httpd
^^^^^
* ``influxdb_httpd_failed_auths``, the number of times failed authentications.
* ``influxdb_httpd_failed_auths``, the number of failed authentications.
* ``influxdb_httpd_ping_requests``, the number of ping requests.
* ``influxdb_httpd_query_requests``, the number of query requests received.
* ``influxdb_httpd_query_response_bytes``, the number of bytes returned to the client.
* ``influxdb_httpd_query_response_bytes``, the number of bytes returned to the
client.
* ``influxdb_httpd_requests``, the number of requests received.
* ``influxdb_httpd_write_points_ok``, the number of points successfully written.
* ``influxdb_httpd_write_request_bytes``, the number of bytes received for write requests.
* ``influxdb_httpd_write_request_bytes``, the number of bytes received for
write requests.
* ``influxdb_httpd_write_requests``, the number of write requests received.
write
^^^^^
* ``influxdb_write_local_point_requests``, the number of write points requests from the local data node.
* ``influxdb_write_local_point_requests``, the number of write points requests
from the local data node.
* ``influxdb_write_ok``, the number of successful writes at the requested
consistency level.
* ``influxdb_write_point_requests``, the number of write points requests across all data nodes.
* ``influxdb_write_remote_point_requests``, the number of write points requests to remote data nodes.
* ``influxdb_write_requests``, the number of write requests across all data nodes.
* ``influxdb_write_sub_ok``, the number of successful points send to subscriptions.
* ``influxdb_write_point_requests``, the number of write points requests across
all data nodes.
* ``influxdb_write_remote_point_requests``, the number of write points requests
to remote data nodes.
* ``influxdb_write_requests``, the number of write requests across all data
nodes.
* ``influxdb_write_sub_ok``, the number of successful points sent to
subscriptions.
runtime
^^^^^^^
@ -41,11 +51,12 @@ runtime
* ``influxdb_heap_idle``, the number of bytes in idle spans.
* ``influxdb_heap_in_use``, the number of bytes in non-idle spans.
* ``influxdb_heap_objects``, the total number of allocated objects.
* ``influxdb_heap_released``, the number of bytes released to the operating system.
* ``influxdb_heap_released``, the number of bytes released to the operating
system.
* ``influxdb_heap_system``, the number of bytes obtained from the system.
* ``influxdb_memory_alloc``, the number of bytes allocated and not yet freed.
* ``influxdb_memory_frees``, the number of free operations.
* ``influxdb_memory_lookups``, the number of pointer lookups.
* ``influxdb_memory_mallocs``, the number of malloc operations.
* ``influxdb_memory_system``, the number of bytes obtained from the system.
* ``influxdb_memory_total_alloc``, the number of bytes allocated (even if freed).
* ``influxdb_memory_total_alloc``, the number of bytes allocated (even if freed).


@ -1,6 +1,6 @@
.. _libvirt-metrics:
Every metric contains an ``instance_id`` field which is the UUID of the
Every metric contains an ``instance_id`` field, which is the UUID of the
instance for the Nova service.
CPU
@ -17,7 +17,7 @@ Disk
^^^^
Metrics have a ``device`` field that contains the virtual disk device to which
the metric applies (eg 'vda', 'vdb' and so on).
the metric applies. For example, 'vda', 'vdb', and others.
* ``virt_disk_octets_read``, the number of octets (bytes) read per second.
@ -37,7 +37,7 @@ Network
^^^^^^^
Metrics have an ``interface`` field that contains the interface name to which
the metric applies (eg 'tap0dc043a6-dd', 'tap769b123a-2e' and so on).
the metric applies. For example, 'tap0dc043a6-dd', 'tap769b123a-2e', and others.
* ``virt_if_dropped_rx``, the number of dropped packets per second when
receiving from the interface.
@ -61,4 +61,4 @@ the metric applies (eg 'tap0dc043a6-dd', 'tap769b123a-2e' and so on).
interface.
* ``virt_if_packets_tx``, the number of packets transmitted per second by the
interface.
interface.


@ -3,49 +3,67 @@
System
^^^^^^
Metrics have a ``service`` field with the name of the service it applies to. Values can be: hekad, collectd, influxd, grafana-server or elasticsearch.
The metrics have a ``service`` field with the name of the service they apply
to. The values can be: ``hekad``, ``collectd``, ``influxd``, ``grafana-server``,
or ``elasticsearch``.
* ``lma_components_count_processes``, number of processes currently running.
* ``lma_components_count_threads``, number of threads currently running.
* ``lma_components_cputime_syst``, percentage of CPU time spent in system mode by the service.
It can be greater than 100% when the node has more than one CPU.
* ``lma_components_cputime_user``, percentage of CPU time spent in user mode by the service.
It can be greater than 100% when the node has more than one CPU.
* ``lma_components_disk_bytes_read``, number of bytes read from disk(s) per second.
* ``lma_components_disk_bytes_write``, number of bytes written to disk(s) per second.
* ``lma_components_disk_ops_read``, number of read operations from disk(s) per second.
* ``lma_components_disk_ops_write``, number of write operations to disk(s) per second.
* ``lma_components_memory_code``, physical memory devoted to executable code (bytes).
* ``lma_components_memory_data``, physical memory devoted to other than executable code (bytes).
* ``lma_components_memory_rss``, non-swapped physical memory used (bytes).
* ``lma_components_memory_vm``, virtual memory size (bytes).
* ``lma_components_count_processes``, the number of processes currently running.
* ``lma_components_count_threads``, the number of threads currently running.
* ``lma_components_cputime_syst``, the percentage of CPU time spent in system
mode by the service. It can be greater than 100% when the node has more than
one CPU.
* ``lma_components_cputime_user``, the percentage of CPU time spent in user
mode by the service. It can be greater than 100% when the node has more than
one CPU.
* ``lma_components_disk_bytes_read``, the number of bytes read from disk(s) per
second.
* ``lma_components_disk_bytes_write``, the number of bytes written to disk(s)
per second.
* ``lma_components_disk_ops_read``, the number of read operations from disk(s)
per second.
* ``lma_components_disk_ops_write``, the number of write operations to disk(s)
per second.
* ``lma_components_memory_code``, the physical memory devoted to executable code
in bytes.
* ``lma_components_memory_data``, the physical memory devoted to other than
executable code in bytes.
* ``lma_components_memory_rss``, the non-swapped physical memory used in bytes.
* ``lma_components_memory_vm``, the virtual memory size in bytes.
* ``lma_components_pagefaults_majflt``, major page faults per second.
* ``lma_components_pagefaults_minflt``, minor page faults per second.
* ``lma_components_stacksize``, absolute value of the start address (the bottom)
* ``lma_components_stacksize``, the absolute value of the start address (the bottom)
of the stack minus the address of the current stack pointer.
Heka pipeline
^^^^^^^^^^^^^
Metrics have two fields: ``name`` that contains the name of the decoder or filter as defined by *Heka* and ``type`` that is either *decoder* or *filter*.
The metrics have two fields: ``name`` that contains the name of the decoder
or filter as defined by *Heka* and ``type`` that is either *decoder* or
*filter*.
Metrics for both types:
The metrics for both types are as follows:
* ``hekad_memory``, the total memory used by the Sandbox (in bytes).
* ``hekad_msg_avg_duration``, the average time for processing the message (in nanoseconds).
* ``hekad_msg_count``, the total number of messages processed by the decoder. This will reset to 0 when the process is restarted.
* ``hekad_memory``, the total memory in bytes used by the Sandbox.
* ``hekad_msg_avg_duration``, the average time in nanoseconds for processing
the message.
* ``hekad_msg_count``, the total number of messages processed by the decoder.
This resets to ``0`` when the process is restarted.
Additional metrics for *filter* type:
* ``heakd_timer_event_avg_duration``, the average time for executing the *timer_event* function (in nanoseconds).
* ``hekad_timer_event_count``, the total number of executions of the *timer_event* function. This will reset to 0 when the process is restarted.
* ``hekad_timer_event_avg_duration``, the average time in nanoseconds for
executing the *timer_event* function.
* ``hekad_timer_event_count``, the total number of executions of the
*timer_event* function. This resets to ``0`` when the process is restarted.
Backend checks
^^^^^^^^^^^^^^
Back-end checks
^^^^^^^^^^^^^^^
* ``http_check``, the backend's API status, 1 if it is responsive, if not 0.
The metric contains a ``service`` field that identifies the LMA backend service being checked.
* ``http_check``, the API status of the back end, ``1`` if it is responsive,
if not, then ``0``. The metric contains a ``service`` field that identifies
the LMA back-end service being checked.
``<service>`` is one of the following values (depending of which Fuel plugins are deployed in the environment):
``<service>`` is one of the following values, depending on which Fuel plugins
are deployed in the environment:
* 'influxdb'
* 'influxdb'


@ -1,25 +1,26 @@
.. _memcached_metrics:
* ``memcached_command_flush``, cumulative number of flush reqs.
* ``memcached_command_get``, cumulative number of retrieval reqs.
* ``memcached_command_set``, cumulative number of storage reqs.
* ``memcached_command_touch``, cumulative number of touch reqs.
* ``memcached_connections_current``, number of open connections.
* ``memcached_df_cache_free``, current number of free bytes to store items.
* ``memcached_df_cache_used``, current number of bytes used to store items.
* ``memcached_items_current``, current number of items stored.
* ``memcached_octets_rx``, total number of bytes read by this server from network.
* ``memcached_octets_tx``, total number of bytes sent by this server to network.
* ``memcached_ops_decr_hits``, number of successful decr reqs.
* ``memcached_ops_decr_misses``, number of decr reqs against missing keys.
* ``memcached_ops_evictions``, number of valid items removed from cache to free memory for new items.
* ``memcached_ops_hits``, number of keys that have been requested.
* ``memcached_ops_incr_hits``, number of successful incr reqs.
* ``memcached_ops_incr_misses``, number of successful incr reqs.
* ``memcached_ops_misses``, number of items that have been requested and not found.
* ``memcached_percent_hitratio``, percentage of get command hits (in cache).
* ``memcached_command_flush``, the cumulative number of flush reqs.
* ``memcached_command_get``, the cumulative number of retrieval reqs.
* ``memcached_command_set``, the cumulative number of storage reqs.
* ``memcached_command_touch``, the cumulative number of touch reqs.
* ``memcached_connections_current``, the number of open connections.
* ``memcached_df_cache_free``, the current number of free bytes to store items.
* ``memcached_df_cache_used``, the current number of bytes used to store items.
* ``memcached_items_current``, the current number of items stored.
* ``memcached_octets_rx``, the total number of bytes read by this server from
the network.
* ``memcached_octets_tx``, the total number of bytes sent by this server to
the network.
* ``memcached_ops_decr_hits``, the number of successful decr reqs.
* ``memcached_ops_decr_misses``, the number of decr reqs against missing keys.
* ``memcached_ops_evictions``, the number of valid items removed from cache to
free memory for new items.
* ``memcached_ops_hits``, the number of keys that have been requested.
* ``memcached_ops_incr_hits``, the number of successful incr reqs.
* ``memcached_ops_incr_misses``, the number of incr reqs against missing keys.
* ``memcached_ops_misses``, the number of items that have been requested and
not found.
* ``memcached_percent_hitratio``, the percentage of get command hits (in cache).
See `memcached documentation`_ for further details.
.. _memcached documentation: https://github.com/memcached/memcached/blob/master/doc/protocol.txt#L488
For details, see the `Memcached documentation <https://github.com/memcached/memcached/blob/master/doc/protocol.txt#L488>`_.
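These counters are exposed by the ``stats`` command of the memcached text
protocol. The following Python sketch reads them directly, assuming a default
instance listening on ``127.0.0.1:11211``::

    import socket

    # Host and port are assumptions for a default memcached instance.
    sock = socket.create_connection(('127.0.0.1', 11211), timeout=5)
    sock.sendall(b'stats\r\n')

    data = b''
    while not data.endswith(b'END\r\n'):
        chunk = sock.recv(4096)
        if not chunk:
            break
        data += chunk
    sock.close()

    # Each statistic is reported as a line of the form: STAT <name> <value>
    for line in data.decode().splitlines():
        if line.startswith('STAT '):
            _, name, value = line.split(' ', 2)
            print(name, value)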


@ -4,8 +4,8 @@ Commands
^^^^^^^^
``mysql_commands``, the number of times per second a given statement has been
executed. The metric has a ``statement`` field that contains the statement to
which it applies. The values can be:
executed. The metric has a ``statement`` field that contains the statement to
which it applies. The values can be as follows:
* ``change_db`` for the USE statement.
* ``commit`` for the COMMIT statement.
@ -29,7 +29,7 @@ Handlers
``mysql_handler``, the number of times per second a given handler has been
executed. The metric has a ``handler`` field that contains the handler
it applies to. The values can be:
it applies to. The values can be as follows:
* ``commit`` for the internal COMMIT statements.
* ``delete`` for the internal DELETE statements.
@ -40,56 +40,69 @@ it applies to. The values can be:
* ``read_prev`` for the requests that read the previous row in key order.
* ``read_rnd`` for the requests that read a row based on a fixed position.
* ``read_rnd_next`` for the requests that read the next row in the data file.
* ``rollback`` the requests that perform rollback operation.
* ``rollback`` for the requests that perform a rollback operation.
* ``update`` for the requests that update a row in a table.
* ``write`` for the requests that insert a row in a table.
Locks
^^^^^
* ``mysql_locks_immediate``, the number of times per second the requests for table locks could be granted immediately.
* ``mysql_locks_waited``, the number of times per second the requests for table locks had to wait.
* ``mysql_locks_immediate``, the number of times per second the requests for
table locks could be granted immediately.
* ``mysql_locks_waited``, the number of times per second the requests for
table locks had to wait.
Network
^^^^^^^
* ``mysql_octets_rx``, the number of bytes received per second by the server.
* ``mysql_octets_tx``, the number of bytes sent per second by the server.
* ``mysql_octets_rx``, the number of bytes per second received by the server.
* ``mysql_octets_tx``, the number of bytes per second sent by the server.
Threads
^^^^^^^
* ``mysql_threads_cached``, the number of threads in the thread cache.
* ``mysql_threads_connected``, the number of currently open connections.
* ``mysql_threads_created``, the number of threads created per second to handle connections.
* ``mysql_threads_created``, the number of threads created per second to
handle connections.
* ``mysql_threads_running``, the number of threads that are not sleeping.
Cluster
^^^^^^^
These metrics are collected with statement 'SHOW STATUS'. see `Percona documentation`_
for further details.
The following metrics are collected with the 'SHOW STATUS' statement. For
details, see `Percona documentation <http://www.percona.com/doc/percona-xtradb-cluster/5.6/wsrep-status-index.html>`_.
* ``mysql_cluster_connected``, ``1`` when the node is connected to the cluster, if not ``0``.
* ``mysql_cluster_local_cert_failures``, number of writesets that failed the certification test.
* ``mysql_cluster_local_commits``, number of writesets commited on the node.
* ``mysql_cluster_local_recv_queue``, the number of writesets waiting to be applied.
* ``mysql_cluster_local_send_queue``, the number of writesets waiting to be sent.
* ``mysql_cluster_ready``, ``1`` when the node is ready to accept queries, if not ``0``.
* ``mysql_cluster_received``, total number of writesets received from other nodes.
* ``mysql_cluster_received_bytes``, total size in bytes of writesets received from other nodes.
* ``mysql_cluster_replicated``, total number of writesets sent to other nodes.
* ``mysql_cluster_replicated_bytes`` total size in bytes of writesets sent to other nodes.
* ``mysql_cluster_size``, current number of nodes in the cluster.
* ``mysql_cluster_status``, ``1`` when the node is 'Primary', ``2`` if 'Non-Primary' and ``3`` if 'Disconnected'.
* ``mysql_cluster_connected``, ``1`` when the node is connected to the cluster,
if not, then ``0``.
* ``mysql_cluster_local_cert_failures``, the number of write sets that failed
the certification test.
* ``mysql_cluster_local_commits``, the number of write sets committed on the
node.
* ``mysql_cluster_local_recv_queue``, the number of write sets waiting to be
applied.
* ``mysql_cluster_local_send_queue``, the number of write sets waiting to be
sent.
* ``mysql_cluster_ready``, ``1`` when the node is ready to accept queries, if
not, then ``0``.
* ``mysql_cluster_received``, the total number of write sets received from
other nodes.
* ``mysql_cluster_received_bytes``, the total size in bytes of write sets
received from other nodes.
* ``mysql_cluster_replicated``, the total number of write sets sent to other
nodes.
* ``mysql_cluster_replicated_bytes``, the total size in bytes of write sets
sent to other nodes.
* ``mysql_cluster_size``, the current number of nodes in the cluster.
* ``mysql_cluster_status``, ``1`` when the node is 'Primary', ``2`` if
'Non-Primary', and ``3`` if 'Disconnected'.
.. _Percona documentation: http://www.percona.com/doc/percona-xtradb-cluster/5.6/wsrep-status-index.html
Slow Queries
Slow queries
^^^^^^^^^^^^
This metric is collected with statement 'SHOW STATUS where Variable_name = 'Slow_queries'.
* ``mysql_slow_queries``, number of queries that have taken more than X seconds,
depending of the MySQL configuration parameter 'long_query_time' (10s per default)
The following metric is collected with the statement
``SHOW STATUS WHERE Variable_name = 'Slow_queries'``:
* ``mysql_slow_queries``, the number of queries that have taken more than X
seconds, depending on the MySQL configuration parameter 'long_query_time'
(10 seconds by default).
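A minimal Python sketch of reading the same counter directly is shown below;
the PyMySQL driver and the connection credentials are assumptions and must be
adapted to the deployment::

    import pymysql  # assumption: the PyMySQL driver is installed

    # Connection parameters are assumptions; adjust to your deployment.
    conn = pymysql.connect(host='127.0.0.1', user='monitor', password='secret')
    try:
        with conn.cursor() as cursor:
            cursor.execute("SHOW STATUS WHERE Variable_name = 'Slow_queries'")
            name, value = cursor.fetchone()
            print(name, value)  # for example: Slow_queries 42
    finally:
        conn.close()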


@ -4,10 +4,12 @@ Service checks
^^^^^^^^^^^^^^
.. _service_checks:
* ``openstack_check_api``, the service's API status, 1 if it is responsive, if not 0.
The metric contains a ``service`` field that identifies the OpenStack service being checked.
* ``openstack_check_api``, the service's API status, ``1`` if it is responsive,
if not, then ``0``. The metric contains a ``service`` field that identifies
the OpenStack service being checked.
``<service>`` is one of the following values with their respective resource checks:
``<service>`` is one of the following values with their respective resource
checks:
* 'ceilometer-api': '/v2/capabilities'
* 'cinder-api': '/'
@ -21,61 +23,75 @@ Service checks
* 'swift-api': '/healthcheck'
* 'swift-s3-api': '/healthcheck'
.. note:: All checks are performed without authentication except for Ceilometer.
.. note:: All checks except for Ceilometer are performed without authentication.
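As an illustration, the following simplified Python sketch performs the same
kind of unauthenticated check. It is not the Collector's actual implementation,
and the endpoint URL is an assumption that must be replaced with the URL of the
service to verify::

    from urllib.error import HTTPError, URLError
    from urllib.request import urlopen

    # The URL is an assumption; use the endpoint of the service to check,
    # for example the Cinder API root ('/') listed above.
    URL = 'http://127.0.0.1:8776/'

    try:
        urlopen(URL, timeout=5)
        status = 1  # the API answered
    except HTTPError:
        status = 1  # the API answered, even if with an HTTP error code
    except URLError:
        status = 0  # the API is unreachable

    print('openstack_check_api =', status)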
Compute
^^^^^^^
These metrics are emitted per compute node.
The following metrics are emitted per compute node:
* ``openstack_nova_free_disk``, the disk space (in GB) available for new instances.
* ``openstack_nova_free_ram``, the memory (in MB) available for new instances.
* ``openstack_nova_free_vcpus``, the number of virtual CPU available for new instances.
* ``openstack_nova_instance_creation_time``, the time (in seconds) it took to launch a new instance.
* ``openstack_nova_instance_state``, the number of instances which entered a given state (the value is always 1).
* ``openstack_nova_free_disk``, the disk space in GB available for new instances.
* ``openstack_nova_free_ram``, the memory in MB available for new instances.
* ``openstack_nova_free_vcpus``, the number of virtual CPUs available for new
instances.
* ``openstack_nova_instance_creation_time``, the time in seconds it took to
launch a new instance.
* ``openstack_nova_instance_state``, the number of instances which entered a
given state (the value is always ``1``).
The metric contains a ``state`` field.
* ``openstack_nova_running_instances``, the number of running instances.
* ``openstack_nova_running_tasks``, the number of tasks currently executed.
* ``openstack_nova_used_disk``, the disk space (in GB) used by the instances.
* ``openstack_nova_used_ram``, the memory (in MB) used by the instances.
* ``openstack_nova_used_vcpus``, the number of virtual CPU used by the instances.
* ``openstack_nova_used_disk``, the disk space in GB used by the instances.
* ``openstack_nova_used_ram``, the memory in MB used by the instances.
* ``openstack_nova_used_vcpus``, the number of virtual CPUs used by the
instances.
These metrics are retrieved from the Nova API and represent the aggregated
values across all compute nodes.
The following metrics are retrieved from the Nova API and represent the
aggregated values across all compute nodes:
* ``openstack_nova_total_free_disk``, the total amount of disk space (in GB) available for new instances.
* ``openstack_nova_total_free_ram``, the total amount of memory (in MB) available for new instances.
* ``openstack_nova_total_free_vcpus``, the total number of virtual CPU available for new instances.
* ``openstack_nova_total_running_instances``, the total number of running instances.
* ``openstack_nova_total_running_tasks``, the total number of tasks currently executed.
* ``openstack_nova_total_used_disk``, the total amount of disk space (in GB) used by the instances.
* ``openstack_nova_total_used_ram``, the total amount of memory (in MB) used by the instances.
* ``openstack_nova_total_used_vcpus``, the total number of virtual CPU used by the instances.
* ``openstack_nova_total_free_disk``, the total amount of disk space in GB
available for new instances.
* ``openstack_nova_total_free_ram``, the total amount of memory in MB available
for new instances.
* ``openstack_nova_total_free_vcpus``, the total number of virtual CPUs
available for new instances.
* ``openstack_nova_total_running_instances``, the total number of running
instances.
* ``openstack_nova_total_running_tasks``, the total number of tasks currently
executed.
* ``openstack_nova_total_used_disk``, the total amount of disk space in GB
used by the instances.
* ``openstack_nova_total_used_ram``, the total amount of memory in MB used by
the instances.
* ``openstack_nova_total_used_vcpus``, the total number of virtual CPUs used by
the instances.
These metrics are retrieved from the Nova API.
The following metrics are retrieved from the Nova API:
* ``openstack_nova_instances``, the total count of instances in a given state.
The metric contains a ``state`` field which is one of 'active', 'deleted',
'error', 'paused', 'resumed', 'rescued', 'resized', 'shelved_offloaded' or
'suspended'.
These metrics are retrieved from the Nova database.
The following metrics are retrieved from the Nova database:
.. _compute-service-state-metrics:
* ``openstack_nova_service``, the Nova service state (either 0 for 'up', 1 for 'down' or 2 for 'disabled').
The metric contains a ``service`` field (one of 'compute', 'conductor', 'scheduler', 'cert'
or 'consoleauth') and a ``state`` field (one of 'up', 'down' or 'disabled').
* ``openstack_nova_service``, the Nova service state (either ``0`` for 'up',
``1`` for 'down' or ``2`` for 'disabled').
The metric contains a ``service`` field (one of 'compute', 'conductor',
'scheduler', 'cert' or 'consoleauth') and a ``state`` field (one of 'up',
'down' or 'disabled').
* ``openstack_nova_services``, the total count of Nova
services by state. The metric contains a ``service`` field (one of 'compute',
'conductor', 'scheduler', 'cert' or 'consoleauth') and a ``state`` field (one
of 'up', 'down' or 'disabled').
of 'up', 'down', or 'disabled').
Identity
^^^^^^^^
These metrics are retrieved from the Keystone API.
The following metrics are retrieved from the Keystone API:
* ``openstack_keystone_roles``, the total number of roles.
* ``openstack_keystone_tenants``, the number of tenants by state. The metric
@ -86,28 +102,37 @@ These metrics are retrieved from the Keystone API.
Volume
^^^^^^
These metrics are emitted per volume node.
The following metrics are emitted per volume node:
* ``openstack_cinder_volume_creation_time``, the time (in seconds) it took to create a new volume.
* ``openstack_cinder_volume_creation_time``, the time in seconds it took to
create a new volume.
.. note:: When using Ceph as the backend storage for volumes, the ``hostname`` value is always set to ``rbd``.
.. note:: When using Ceph as the back-end storage for volumes, the ``hostname``
value is always set to ``rbd``.
These metrics are retrieved from the Cinder API.
The following metrics are retrieved from the Cinder API:
* ``openstack_cinder_snapshots``, the number of snapshots by state. The metric contains a ``state`` field.
* ``openstack_cinder_snapshots_size``, the total size (in bytes) of snapshots by state. The metric contains a ``state`` field.
* ``openstack_cinder_volumes``, the number of volumes by state. The metric contains a ``state`` field.
* ``openstack_cinder_volumes_size``, the total size (in bytes) of volumes by state. The metric contains a ``state`` field.
* ``openstack_cinder_snapshots``, the number of snapshots by state. The metric
contains a ``state`` field.
* ``openstack_cinder_snapshots_size``, the total size (in bytes) of snapshots
by state. The metric contains a ``state`` field.
* ``openstack_cinder_volumes``, the number of volumes by state. The metric
contains a ``state`` field.
* ``openstack_cinder_volumes_size``, the total size (in bytes) of volumes by
state. The metric contains a ``state`` field.
``state`` is one of 'available', 'creating', 'attaching', 'in-use', 'deleting', 'backing-up', 'restoring-backup', 'error', 'error_deleting', 'error_restoring', 'error_extending'.
``state`` is one of 'available', 'creating', 'attaching', 'in-use', 'deleting',
'backing-up', 'restoring-backup', 'error', 'error_deleting', 'error_restoring',
'error_extending'.
These metrics are retrieved from the Cinder database.
The following metrics are retrieved from the Cinder database:
.. _volume-service-state-metrics:
* ``openstack_cinder_service``, the Cinder service state (either 0 for 'up', 1 for 'down' or 2 for 'disabled').
The metric contains a ``service`` field (one of 'volume', 'backup', 'scheduler'),
and a ``state`` field (one of 'up', 'down' or 'disabled').
* ``openstack_cinder_service``, the Cinder service state (either ``0`` for
'up', ``1`` for 'down', or ``2`` for 'disabled'). The metric contains a
``service`` field (one of 'volume', 'backup', 'scheduler') and a ``state``
field (one of 'up', 'down' or 'disabled').
* ``openstack_cinder_services``, the total count of Cinder services by state.
The metric contains a ``service`` field (one of 'volume', 'backup',
@ -116,17 +141,18 @@ These metrics are retrieved from the Cinder database.
Image
^^^^^
These metrics are retrieved from the Glance API.
The following metrics are retrieved from the Glance API:
* ``openstack_glance_images``, the number of images by state and visibility.
The metric contains ``state`` and ``visibility`` field.
The metric contains ``state`` and ``visibility`` fields.
* ``openstack_glance_images_size``, the total size (in bytes) of images by
state and visibility. The metric contains ``state`` and ``visibility`` field.
state and visibility. The metric contains ``state`` and ``visibility``
fields.
* ``openstack_glance_snapshots``, the number of snapshot images by state and
visibility. The metric contains ``state`` and ``visibility`` field.
visibility. The metric contains ``state`` and ``visibility`` fields.
* ``openstack_glance_snapshots_size``, the total size (in bytes) of snapshots
by state and visibility. The metric contains ``state`` and ``visibility``
field.
fields.
``state`` is one of 'queued', 'saving', 'active', 'killed', 'deleted',
'pending_delete'. ``visibility`` is either 'public' or 'private'.
@ -134,27 +160,32 @@ These metrics are retrieved from the Glance API.
Network
^^^^^^^
These metrics are retrieved from the Neutron API.
The following metrics are retrieved from the Neutron API:
* ``openstack_neutron_floatingips``, the total number of floating IP addresses.
* ``openstack_neutron_networks``, the number of virtual networks by state. The metric contains a ``state`` field.
* ``openstack_neutron_ports``, the number of virtual ports by owner and state. The metric contains ``owner`` and ``state`` fields.
* ``openstack_neutron_routers``, the number of virtual routers by state. The metric contains a ``state`` field.
* ``openstack_neutron_networks``, the number of virtual networks by state. The
metric contains a ``state`` field.
* ``openstack_neutron_ports``, the number of virtual ports by owner and state.
The metric contains ``owner`` and ``state`` fields.
* ``openstack_neutron_routers``, the number of virtual routers by state. The
metric contains a ``state`` field.
* ``openstack_neutron_subnets``, the number of virtual subnets.
``<state>`` is one of 'active', 'build', 'down' or 'error'.
``<owner>`` is one of 'compute', 'dhcp', 'floatingip', 'floatingip_agent_gateway', 'router_interface', 'router_gateway', 'router_ha_interface', 'router_interface_distributed' or 'router_centralized_snat'.
``<owner>`` is one of 'compute', 'dhcp', 'floatingip', 'floatingip_agent_gateway', 'router_interface', 'router_gateway', 'router_ha_interface',
'router_interface_distributed', or 'router_centralized_snat'.
These metrics are retrieved from the Neutron database.
The following metrics are retrieved from the Neutron database:
.. _network-agent-state-metrics:
.. note:: These metrics are not collected when the Contrail plugin is deployed.
* ``openstack_neutron_agent``, the Neutron agent state (either 0 for 'up', 1 for 'down' or 2 for 'disabled').
The metric contains a ``service`` field (one of 'dhcp', 'l3', 'metadata' or 'openvswitch'),
and a ``state`` field (one of 'up', 'down' or 'disabled').
* ``openstack_neutron_agent``, the Neutron agent state (either ``0`` for 'up',
``1`` for 'down', or ``2`` for 'disabled').
The metric contains a ``service`` field (one of 'dhcp', 'l3', 'metadata', or
'openvswitch'), and a ``state`` field (one of 'up', 'down' or 'disabled').
* ``openstack_neutron_agents``, the total number of Neutron agents by service
and state. The metric contains ``service`` (one of 'dhcp', 'l3', 'metadata'
@ -164,12 +195,17 @@ API response times
^^^^^^^^^^^^^^^^^^
* ``openstack_<service>_http_response_times``, HTTP response time statistics.
The statistics are ``min``, ``max``, ``sum``, ``count``, ``upper_90`` (90 percentile) over 10 seconds.
The metric contains ``http_method`` (eg 'GET', 'POST', and so forth) and ``http_status`` (eg '2xx', '4xx', and so forth) fields.
The statistics are ``min``, ``max``, ``sum``, ``count``, ``upper_90``
(90th percentile) over 10 seconds. The metric contains an ``http_method`` field,
for example, 'GET', 'POST', and others, and an ``http_status`` field, for
example, '2xx', '4xx', and others.
``<service>`` is one of 'cinder', 'glance', 'heat' 'keystone', 'neutron' or 'nova'.
``<service>`` is one of 'cinder', 'glance', 'heat', 'keystone', 'neutron', or
'nova'.
Logs
^^^^
* ``log_messages``, the number of log messages per second for the given service and severity level. The metric contains ``service`` and ``level`` (one of 'debug', 'info', ... ) fields.
* ``log_messages``, the number of log messages per second for the given
service and severity level. The metric contains ``service`` and ``level``
(one of 'debug', 'info', and others) fields.


@ -4,6 +4,6 @@ Resource location
^^^^^^^^^^^^^^^^^
* ``pacemaker_resource_local_active``, ``1`` when the resource is located on
the host reporting the metric, if not ``0``. The metric contains a
the host reporting the metric, if not, then ``0``. The metric contains a
``resource`` field which is one of 'vip__public', 'vip__management',
'vip__vrouter_pub' or 'vip__vrouter'.
'vip__vrouter_pub', or 'vip__vrouter'.


@ -3,16 +3,23 @@
Cluster
^^^^^^^
* ``rabbitmq_connections``, total number of connections.
* ``rabbitmq_consumers``, total number of consumers.
* ``rabbitmq_channels``, total number of channels.
* ``rabbitmq_exchanges``, total number of exchanges.
* ``rabbitmq_messages``, total number of messages which are ready to be consumed or not yet acknowledged.
* ``rabbitmq_queues``, total number of queues.
* ``rabbitmq_running_nodes``, total number of running nodes in the cluster.
* ``rabbitmq_disk_free``, the disk free space.
* ``rabbitmq_disk_free_limit``, the minimum amount of free disk for RabbitMQ. When ``rabbitmq_disk_free`` drops below this value, all producers are blocked.
* ``rabbitmq_remaining_disk``, the difference between ``rabbitmq_disk_free`` and ``rabbitmq_disk_free_limit``.
* ``rabbitmq_connections``, the total number of connections.
* ``rabbitmq_consumers``, the total number of consumers.
* ``rabbitmq_channels``, the total number of channels.
* ``rabbitmq_exchanges``, the total number of exchanges.
* ``rabbitmq_messages``, the total number of messages which are ready to be
consumed or not yet acknowledged.
* ``rabbitmq_queues``, the total number of queues.
* ``rabbitmq_running_nodes``, the total number of running nodes in the cluster.
* ``rabbitmq_disk_free``, the free disk space.
* ``rabbitmq_disk_free_limit``, the minimum amount of free disk space for
RabbitMQ.
When ``rabbitmq_disk_free`` drops below this value, all producers are blocked.
* ``rabbitmq_remaining_disk``, the difference between ``rabbitmq_disk_free``
and ``rabbitmq_disk_free_limit``.
* ``rabbitmq_used_memory``, the number of bytes of memory used by the whole
RabbitMQ process.
* ``rabbitmq_vm_memory_limit``, the maximum amount of memory allocated for RabbitMQ. When ``rabbitmq_used_memory`` uses more than this value, all producers are blocked.
* ``rabbitmq_remaining_memory``, the difference between ``rabbitmq_vm_memory_limit`` and ``rabbitmq_used_memory``.
* ``rabbitmq_vm_memory_limit``, the maximum amount of memory allocated for
RabbitMQ. When ``rabbitmq_used_memory`` uses more than this value, all
producers are blocked.
* ``rabbitmq_remaining_memory``, the difference between
``rabbitmq_vm_memory_limit`` and ``rabbitmq_used_memory``.
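For clarity, the two ``remaining`` metrics above are simple differences of the
reported values. The following Python sketch uses illustrative figures only::

    # The figures below are illustrative; the real values come from the
    # RabbitMQ node being monitored.
    rabbitmq_disk_free = 50 * 1024 ** 3        # 50 GiB free on the partition
    rabbitmq_disk_free_limit = 5 * 1024 ** 3   # producers blocked below 5 GiB
    rabbitmq_used_memory = 2 * 1024 ** 3
    rabbitmq_vm_memory_limit = 6 * 1024 ** 3

    rabbitmq_remaining_disk = rabbitmq_disk_free - rabbitmq_disk_free_limit
    rabbitmq_remaining_memory = rabbitmq_vm_memory_limit - rabbitmq_used_memory

    print(rabbitmq_remaining_disk, rabbitmq_remaining_memory)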


@ -3,36 +3,45 @@
CPU
^^^
Metrics have a ``cpu_number`` field that contains the CPU number to which the metric applies.
Metrics have a ``cpu_number`` field that contains the CPU number to which the
metric applies.
* ``cpu_idle``, percentage of CPU time spent in the idle task.
* ``cpu_interrupt``, percentage of CPU time spent servicing interrupts.
* ``cpu_nice``, percentage of CPU time spent in user mode with low priority (nice).
* ``cpu_softirq``, percentage of CPU time spent servicing soft interrupts.
* ``cpu_steal``, percentage of CPU time spent in other operating systems.
* ``cpu_system``, percentage of CPU time spent in system mode.
* ``cpu_user``, percentage of CPU time spent in user mode.
* ``cpu_wait``, percentage of CPU time spent waiting for I/O operations to complete.
* ``cpu_idle``, the percentage of CPU time spent in the idle task.
* ``cpu_interrupt``, the percentage of CPU time spent servicing interrupts.
* ``cpu_nice``, the percentage of CPU time spent in user mode with low
priority (nice).
* ``cpu_softirq``, the percentage of CPU time spent servicing soft interrupts.
* ``cpu_steal``, the percentage of CPU time spent in other operating systems.
* ``cpu_system``, the percentage of CPU time spent in system mode.
* ``cpu_user``, the percentage of CPU time spent in user mode.
* ``cpu_wait``, the percentage of CPU time spent waiting for I/O operations to
complete.
Disk
^^^^
Metrics have a ``device`` field that contains the disk device number the metric applies to (eg 'sda', 'sdb' and so on).
Metrics have a ``device`` field that contains the disk device name the metric
applies to. For example, 'sda', 'sdb', and others.
* ``disk_merged_read``, the number of read operations per second that could be merged with already queued operations.
* ``disk_merged_write``, the number of write operations per second that could be merged with already queued operations.
* ``disk_merged_read``, the number of read operations per second that could be
merged with already queued operations.
* ``disk_merged_write``, the number of write operations per second that could
be merged with already queued operations.
* ``disk_octets_read``, the number of octets (bytes) read per second.
* ``disk_octets_write``, the number of octets (bytes) written per second.
* ``disk_ops_read``, the number of read operations per second.
* ``disk_ops_write``, the number of write operations per second.
* ``disk_time_read``, the average time for a read operation to complete in the last interval.
* ``disk_time_write``, the average time for a write operation to complete in the last interval.
* ``disk_time_read``, the average time for a read operation to complete in the
last interval.
* ``disk_time_write``, the average time for a write operation to complete in
the last interval.
File system
^^^^^^^^^^^
Metrics have a ``fs`` field that contains the partition's mount point to which the metric applies (eg '/', '/var/lib' and so on).
Metrics have a ``fs`` field that contains the partition's mount point to which
the metric applies. For example, '/', '/var/lib', and others.
* ``fs_inodes_free``, the number of free inodes on the file system.
* ``fs_inodes_percent_free``, the percentage of free inodes on the file system.
@ -52,46 +61,53 @@ System load
* ``load_longterm``, the system load average over the last 15 minutes.
* ``load_midterm``, the system load average over the last 5 minutes.
* ``load_shortterm``, the system load averge over the last minute.
* ``load_shortterm``, the system load average over the last minute.
Memory
^^^^^^
* ``memory_buffered``, the amount of memory (in bytes) which is buffered.
* ``memory_cached``, the amount of memory (in bytes) which is cached.
* ``memory_free``, the amount of memory (in bytes) which is free.
* ``memory_used``, the amount of memory (in bytes) which is used.
* ``memory_buffered``, the amount of buffered memory in bytes.
* ``memory_cached``, the amount of cached memory in bytes.
* ``memory_free``, the amount of free memory in bytes.
* ``memory_used``, the amount of used memory in bytes.
Network
^^^^^^^
Metrics have a ``interface`` field that contains the interface name the metric applies to (eg 'eth0', 'eth1' and so on).
Metrics have an ``interface`` field that contains the interface name the
metric applies to. For example, 'eth0', 'eth1', and others.
* ``if_errors_rx``, the number of errors per second detected when receiving from the interface.
* ``if_errors_tx``, the number of errors per second detected when transmitting from the interface.
* ``if_octets_rx``, the number of octets (bytes) received per second by the interface.
* ``if_octets_tx``, the number of octets (bytes) transmitted per second by the interface.
* ``if_packets_rx``, the number of packets received per second by the interface.
* ``if_packets_tx``, the number of packets transmitted per second by the interface.
* ``if_errors_rx``, the number of errors per second detected when receiving
from the interface.
* ``if_errors_tx``, the number of errors per second detected when transmitting
from the interface.
* ``if_octets_rx``, the number of octets (bytes) received per second by the
interface.
* ``if_octets_tx``, the number of octets (bytes) transmitted per second by the
interface.
* ``if_packets_rx``, the number of packets received per second by the
interface.
* ``if_packets_tx``, the number of packets transmitted per second by the
interface.
Processes
^^^^^^^^^
* ``processes_count``, the number of processes in a given state. The metric has
a ``state`` field (one of 'blocked', 'paging', 'running', 'sleeping', 'stopped'
or 'zombies').
a ``state`` field (one of 'blocked', 'paging', 'running', 'sleeping',
'stopped' or 'zombies').
* ``processes_fork_rate``, the number of processes forked per second.
Swap
^^^^
* ``swap_cached``, the amount of cached memory (in bytes) which is in the swap.
* ``swap_free``, the amount of free memory (in bytes) which is in the swap.
* ``swap_cached``, the amount of cached memory (in bytes) that is in the swap.
* ``swap_free``, the amount of free memory (in bytes) that is in the swap.
* ``swap_io_in``, the number of swap pages written per second.
* ``swap_io_out``, the number of swap pages read per second.
* ``swap_used``, the amount of used memory (in bytes) which is in the swap.
* ``swap_used``, the amount of used memory (in bytes) that is in the swap.
Users
^^^^^
* ``logged_users``, the number of users currently logged-in.
* ``logged_users``, the number of users currently logged in.