telemetry: reorg and cleanup data collection

- ha notes are redundant to ha-guide - move meter definition section under notifications as that's what it relates to - move the configuration of standard meters to install guide. Change-Id: Ib09dea06c609e3611a8a9d20d5f3a1c8f757a547
2017-02-22 14:58:59 +00:00 · 2017-02-22 14:58:59 +00:00 · b6946421d0
parent a98a174092
commit b6946421d0
4 changed files with 103 additions and 262 deletions
--- a/doc/admin-guide/source/telemetry-alarms.rst
+++ b/doc/admin-guide/source/telemetry-alarms.rst
@ -109,9 +109,8 @@ The alarm evaluation process uses the same mechanism for workload
 partitioning as the central and compute agents. The
 `Tooz <https://pypi.python.org/pypi/tooz>`_ library provides the
 coordination within the groups of service instances. For further
-information about this approach, see the section called
-:ref:`Support for HA deployment of the central and compute agent services
-<ha-deploy-services>`.
+information about this approach, see the `high availability guide
+<https://docs.openstack.org/ha-guide/controller-ha-telemetry.html>`_.

 To use this workload partitioning solution set the
 ``evaluation_service`` option to ``default``. For more
--- a/doc/admin-guide/source/telemetry-best-practices.rst
+++ b/doc/admin-guide/source/telemetry-best-practices.rst
@ -42,7 +42,8 @@ Data collection

 #. If polling many resources or at a high frequency, you can add additional
   central and compute agents as necessary. The agents are designed to scale
-   horizontally. For more information see, :ref:`ha-deploy-services`.
+   horizontally. For more information refer to the `high availability guide
+   <https://docs.openstack.org/ha-guide/controller-ha-telemetry.html>`_.

 Data storage
 ------------
--- a/doc/admin-guide/source/telemetry-data-collection.rst
+++ b/doc/admin-guide/source/telemetry-data-collection.rst
@ -38,6 +38,7 @@ RESTful API (deprecated in Ocata)

 Notifications
 ~~~~~~~~~~~~~
+
 All OpenStack services send notifications about the executed operations
 or system state. Several notifications carry information that can be
 metered. For example, CPU time of a VM instance created by OpenStack
@ -199,35 +200,104 @@ compute service, see
 telemetry/ocata/configure_services/nova/install-nova-ubuntu.html>`__ in the
 Installation Tutorials and Guides.

-Middleware for the OpenStack Object Storage service
---------------------------------------------------
+Meter definitions
+-----------------

-A subset of Object Store statistics requires additional middleware to
-be installed behind the proxy of Object Store. This additional component
-emits notifications containing data-flow-oriented meters, namely the
-``storage.objects.(incoming|outgoing).bytes values``. The list of these
-meters are listed in :ref:`telemetry-object-storage-meter`, marked with
-``notification`` as origin.
+The Telemetry service collects a subset of the meters by filtering
+notifications emitted by other OpenStack services. You can find the meter
+definitions in a separate configuration file, called
+``ceilometer/meter/data/meters.yaml``. This enables
+operators/administrators to add new meters to Telemetry project by updating
+the ``meters.yaml`` file without any need for additional code changes.

-The instructions on how to install this middleware can be found in
-`Configure the Object Storage service for Telemetry
-<https://docs.openstack.org/project-install-guide/
-telemetry/ocata/configure_services/swift/install-swift-ubuntu.html>`__
-section in the Installation Tutorials and Guides.
+.. note::

-Telemetry middleware
--------------------
+   The ``meters.yaml`` file should be modified with care. Unless intended,
+   do not remove any existing meter definitions from the file. Also, the
+   collected meters can differ in some cases from what is referenced in the
+   documentation.

-Telemetry provides HTTP request and API endpoint counting
-capability in OpenStack. This is achieved by
-storing a sample for each event marked as ``audit.http.request``,
-``audit.http.response``, ``http.request`` or ``http.response``.
+A standard meter definition looks like:

-It is recommended that these notifications be consumed as events rather
-than samples to better index the appropriate values and avoid massive
-load on the Metering database. If preferred, Telemetry can consume these
-events as samples if the services are configured to emit ``http.*``
-notifications.
+.. code-block:: yaml
+
+   ---
+   metric:
+     - name: 'meter name'
+       event_type: 'event name'
+       type: 'type of meter eg: gauge, cumulative or delta'
+       unit: 'name of unit eg: MB'
+       volume: 'path to a measurable value eg: $.payload.size'
+       resource_id: 'path to resource id eg: $.payload.id'
+       project_id: 'path to project id eg: $.payload.owner'
+       metadata: 'addiitonal key-value data describing resource'
+
+The definition above shows a simple meter definition with some fields,
+from which ``name``, ``event_type``, ``type``, ``unit``, and ``volume``
+are required. If there is a match on the event type, samples are generated
+for the meter.
+
+The ``meters.yaml`` file contains the sample
+definitions for all the meters that Telemetry is collecting from
+notifications. The value of each field is specified by using JSON path in
+order to find the right value from the notification message. In order to be
+able to specify the right field you need to be aware of the format of the
+consumed notification. The values that need to be searched in the notification
+message are set with a JSON path starting with ``$.`` For instance, if you need
+the ``size`` information from the payload you can define it like
+``$.payload.size``.
+
+A notification message may contain multiple meters. You can use ``*`` in
+the meter definition to capture all the meters and generate samples
+respectively. You can use wild cards as shown in the following example:
+
+.. code-block:: yaml
+
+   ---
+   metric:
+     - name: $.payload.measurements.[*].metric.[*].name
+       event_type: 'event_name.*'
+       type: 'delta'
+       unit: $.payload.measurements.[*].metric.[*].unit
+       volume: payload.measurements.[*].result
+       resource_id: $.payload.target
+       user_id: $.payload.initiator.id
+       project_id: $.payload.initiator.project_id
+
+In the above example, the ``name`` field is a JSON path with matching
+a list of meter names defined in the notification message.
+
+You can use complex operations on JSON paths. In the following example,
+``volume`` and ``resource_id`` fields perform an arithmetic
+and string concatenation:
+
+.. code-block:: yaml
+
+   ---
+   metric:
+   - name: 'compute.node.cpu.idle.percent'
+     event_type: 'compute.metrics.update'
+     type: 'gauge'
+     unit: 'percent'
+     volume: payload.metrics[?(@.name='cpu.idle.percent')].value * 100
+     resource_id: $.payload.host + "_" + $.payload.nodename
+
+You can use the ``timedelta`` plug-in to evaluate the difference in seconds
+between two ``datetime`` fields from one notification.
+
+.. code-block:: yaml
+
+   ---
+   metric:
+   - name: 'compute.instance.booting.time'
+     event_type: 'compute.instance.create.end'
+    type: 'gauge'
+    unit: 'sec'
+    volume:
+      fields: [$.payload.created_at, $.payload.launched_at]
+      plugin: 'timedelta'
+    project_id: $.payload.tenant_id
+    resource_id: $.payload.instance_id

 Polling
 ~~~~~~~
@ -378,107 +448,6 @@ The list of collected meters can be found in
   compute node. If ``conductor.send_sensor_data`` is set, this
   misconfiguration causes duplicated IPMI sensor samples.

-
-.. _ha-deploy-services:
-
-Support for HA deployment
-~~~~~~~~~~~~~~~~~~~~~~~~~
-
-Both the polling agents and notification agents can run in an HA deployment,
-which means that multiple instances of these services can run in
-parallel with workload partitioning among these running instances.
-
-The `Tooz <https://pypi.python.org/pypi/tooz>`__ library provides the
-coordination within the groups of service instances. Tooz supports `various
-drivers <https://docs.openstack.org/developer/tooz/drivers.html>`__
-including the following back end solutions:
-
-  `Zookeeper <http://zookeeper.apache.org/>`__. Recommended solution by
-   the Tooz project.
-
-  `Redis <http://redis.io/>`__. Recommended solution by the Tooz
-   project.
-
-  `Memcached <http://memcached.org/>`__. Recommended for testing.
-
-You must configure a supported Tooz driver for the HA deployment of the
-Telemetry services.
-
-For information about the required configuration options that have to be
-set in the ``ceilometer.conf`` configuration file for both the central
-and Compute agents, see the `Coordination section
-<https://docs.openstack.org/ocata/config-reference/telemetry/telemetry-config-options.html>`__
-in the OpenStack Configuration Reference.
-
-Notification agent HA deployment
--------------------------------
-
-Workload partitioning support is particularly useful as the pipeline processing
-is handled exclusively by the notification agent now which may result
-in a larger amount of load.
-
-To enable workload partitioning by notification agent, the ``backend_url``
-option must be set in the ``ceilometer.conf`` configuration file.
-Additionally, ``workload_partitioning`` should be enabled in the
-`Notification section <https://docs.openstack.org/ocata/config-reference/telemetry/telemetry-config-options.html>`__ in the OpenStack Configuration Reference.
-
-The notification agent creates multiple queues to divide the workload across
-all active agents. The number of queues can be controlled by  the
-``pipeline_processing_queues`` option in the ``ceilometer.conf`` configuration
-file.
-
-.. note::
-
-   A larger value will result in better distribution of
-   tasks but will also require more memory and longer startup time. It is
-   recommended to have a value approximately three times the number of active
-   notification agents. At a minimum, the value should be equal to the number
-   of active agents.
-
-Polling agent HA deployment
---------------------------
-
-.. note::
-
-    Without the ``backend_url`` option being set only one instance of
-    both the central and Compute agent service is able to run and
-    function correctly.
-
-The availability check of the instances is provided by heartbeat
-messages. When the connection with an instance is lost, the workload
-will be reassigned within the remained instances in the next polling
-cycle.
-
-.. note::
-
-    ``Memcached`` uses a ``timeout`` value, which should always be set
-    to a value that is higher than the ``heartbeat`` value set for
-    Telemetry.
-
-For backward compatibility and supporting existing deployments, the
-central agent configuration also supports using different configuration
-files for groups of service instances of this type that are running in
-parallel. For enabling this configuration set a value for the
-``partitioning_group_prefix`` option in the `polling section
-<https://docs.openstack.org/ocata/config-reference/telemetry/telemetry-config-options.html>`__
-in the OpenStack Configuration Reference.
-
-.. warning::
-
-    For each sub-group of the central agent pool with the same
-    ``partitioning_group_prefix`` a disjoint subset of meters must be
-    polled, otherwise samples may be missing or duplicated. The list of
-    meters to poll can be set in the ``/etc/ceilometer/pipeline.yaml``
-    configuration file. For more information about pipelines see
-    :ref:`telemetry-data-pipelines`.
-
-To enable the Compute agent to run multiple instances simultaneously
-with workload partitioning, the ``workload_partitioning`` option has to
-be set to ``True`` under the `Compute section
-<https://docs.openstack.org/ocata/config-reference/telemetry/telemetry-config-options.html>`__
-in the ``ceilometer.conf`` configuration file.
-
-
 Send samples to Telemetry
 ~~~~~~~~~~~~~~~~~~~~~~~~~

@ -545,131 +514,3 @@ following command should be invoked:
   | user_id           | 679b0499e7a34ccb9d90b64208401f8e           |
   | volume            | 48.0                                       |
   +-------------------+--------------------------------------------+
-
-.. _telemetry-meter-definitions:
-
-Meter definitions
-----------------
-
-The Telemetry service collects a subset of the meters by filtering
-notifications emitted by other OpenStack services. You can find the meter
-definitions in a separate configuration file, called
-``ceilometer/meter/data/meters.yaml``. This enables
-operators/administrators to add new meters to Telemetry project by updating
-the ``meters.yaml`` file without any need for additional code changes.
-
-.. note::
-
-   The ``meters.yaml`` file should be modified with care. Unless intended,
-   do not remove any existing meter definitions from the file. Also, the
-   collected meters can differ in some cases from what is referenced in the
-   documentation.
-
-A standard meter definition looks like:
-
-.. code-block:: yaml
-
-   ---
-   metric:
-     - name: 'meter name'
-       event_type: 'event name'
-       type: 'type of meter eg: gauge, cumulative or delta'
-       unit: 'name of unit eg: MB'
-       volume: 'path to a measurable value eg: $.payload.size'
-       resource_id: 'path to resource id eg: $.payload.id'
-       project_id: 'path to project id eg: $.payload.owner'
-       metadata: 'addiitonal key-value data describing resource'
-
-The definition above shows a simple meter definition with some fields,
-from which ``name``, ``event_type``, ``type``, ``unit``, and ``volume``
-are required. If there is a match on the event type, samples are generated
-for the meter.
-
-If you take a look at the ``meters.yaml`` file, it contains the sample
-definitions for all the meters that Telemetry is collecting from
-notifications. The value of each field is specified by using JSON path in
-order to find the right value from the notification message. In order to be
-able to specify the right field you need to be aware of the format of the
-consumed notification. The values that need to be searched in the notification
-message are set with a JSON path starting with ``$.`` For instance, if you need
-the ``size`` information from the payload you can define it like
-``$.payload.size``.
-
-A notification message may contain multiple meters. You can use ``*`` in
-the meter definition to capture all the meters and generate samples
-respectively. You can use wild cards as shown in the following example:
-
-.. code-block:: yaml
-
-   ---
-   metric:
-     - name: $.payload.measurements.[*].metric.[*].name
-       event_type: 'event_name.*'
-       type: 'delta'
-       unit: $.payload.measurements.[*].metric.[*].unit
-       volume: payload.measurements.[*].result
-       resource_id: $.payload.target
-       user_id: $.payload.initiator.id
-       project_id: $.payload.initiator.project_id
-
-In the above example, the ``name`` field is a JSON path with matching
-a list of meter names defined in the notification message.
-
-You can even use complex operations on JSON paths. In the following example,
-``volume`` and ``resource_id`` fields perform an arithmetic
-and string concatenation:
-
-.. code-block:: yaml
-
-   ---
-   metric:
-   - name: 'compute.node.cpu.idle.percent'
-     event_type: 'compute.metrics.update'
-     type: 'gauge'
-     unit: 'percent'
-     volume: payload.metrics[?(@.name='cpu.idle.percent')].value * 100
-     resource_id: $.payload.host + "_" + $.payload.nodename
-
-You can use the ``timedelta`` plug-in to evaluate the difference in seconds
-between two ``datetime`` fields from one notification.
-
-.. code-block:: yaml
-
-   ---
-   metric:
-   - name: 'compute.instance.booting.time'
-     event_type: 'compute.instance.create.end'
-    type: 'gauge'
-    unit: 'sec'
-    volume:
-      fields: [$.payload.created_at, $.payload.launched_at]
-      plugin: 'timedelta'
-    project_id: $.payload.tenant_id
-    resource_id: $.payload.instance_id
-
-Block Storage audit script setup to get notifications
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-If you want to collect OpenStack Block Storage notification on demand,
-you can use :command:`cinder-volume-usage-audit` from OpenStack Block Storage.
-This script becomes available when you install OpenStack Block Storage,
-so you can use it without any specific settings and you don't need to
-authenticate to access the data. To use it, you must run this command in
-the following format:
-
-.. code-block:: console
-
-   $ cinder-volume-usage-audit \
-     --start_time='YYYY-MM-DD HH:MM:SS' --end_time='YYYY-MM-DD HH:MM:SS' --send_actions
-
-This script outputs what volumes or snapshots were created, deleted, or
-exists in a given period of time and some information about these
-volumes or snapshots. Information about the existence and size of
-volumes and snapshots is store in the Telemetry service. This data is
-also stored as an event which is the recommended usage as it provides
-better indexing of data.
-
-Using this script via cron you can get notifications periodically, for
-example, every 5 minutes::
-
-    */5 * * * * /path/to/cinder-volume-usage-audit --send_actions
--- a/doc/ha-guide/source/controller-ha-telemetry.rst
+++ b/doc/ha-guide/source/controller-ha-telemetry.rst
@ -1,15 +1,15 @@
-==============================
-Highly available Telemetry API
-==============================
+==========================
+Highly available Telemetry
+==========================

 The `Telemetry service
 <https://docs.openstack.org/admin-guide/common/get-started-telemetry.html>`_
 provides a data collection service and an alarming service.

-Telemetry central agent
+Telemetry polling agent
 ~~~~~~~~~~~~~~~~~~~~~~~

-The Telemetry central agent can be configured to partition its polling
+The Telemetry polling agent can be configured to partition its polling
 workload between multiple agents. This enables high availability (HA).

 Both the central and the compute agent can run in an HA deployment.