From 56c1f631a0e744f13512be341b526cce54855aad Mon Sep 17 00:00:00 2001
From: Dantali0n <info@dantalion.nl>
Date: Tue, 14 May 2019 17:12:06 +0200
Subject: [PATCH] Spec for Grafana proxy datasource

Spec for a new datasource to collect metrics using Grafana as a proxy
to the databases it integrates with.

Change-Id: Id224bf37a72347e634567c933c42a12ef9b048d3
Implements: blueprint grafana-proxy-datasource
---
 .../approved/grafana-proxy-datasource.rst     | 250 ++++++++++++++++++
 1 file changed, 250 insertions(+)
 create mode 100644 specs/train/approved/grafana-proxy-datasource.rst

diff --git a/specs/train/approved/grafana-proxy-datasource.rst b/specs/train/approved/grafana-proxy-datasource.rst
new file mode 100644
index 0000000..3647a12
--- /dev/null
+++ b/specs/train/approved/grafana-proxy-datasource.rst
@@ -0,0 +1,250 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+========================
+Grafana proxy datasource
+========================
+
+https://blueprints.launchpad.net/watcher/+spec/grafana-proxy-datasaource
+
+Watcher requires metrics from compute nodes and instances to perform the
+resource optimization. Metrics are retrieved using several different
+datasources such as Gnocchi and Monasca, however, not every OpenStack cloud
+might have the available datasources deployed. Grafana can be implemented as an
+additional datasource that can be used to retrieve metrics from
+`several different databases`_ that the Grafana endpoint is configured to use.
+
+.. _several different databases: https://grafana.com/plugins?type=datasource`
+
+Problem description
+===================
+
+Not every OpenStack cloud might use the currently available datasources
+limiting the use of Watcher. By offering a widely used monitoring platform as
+datasource with flexible configurations options more OpenStack clouds could
+start using Watcher.
+
+Use Cases
+----------
+
+As a service operator I want Watcher to integrate with currently deployed
+monitoring solutions.
+
+As a service operator I want to limit the amount of external credentials I have
+to configure for Watcher.
+
+Proposed change
+===============
+
+The new Grafana datasource will be able to query for different metrics
+depending on what is configured by the end user. Flexible configuration options
+will allow Grafana to work for each user's configuration. Some of these options
+will be configurable per metric using key value pairs in a dictionary. These
+options are called maps as they map the value to a specific metric.
+Configuration options include:
+
+* Endpoint url
+* Authorization token
+* Project id map
+* Attribute map
+* Database map
+* Translator map
+* Query map
+
+A configuration example configured for the ``host_cpu_usage`` and
+``instance_cpu_usage`` metrics could look as follows:
+
+::
+
+  [grafana_client]
+  token = uyyNKUJOZiLW7AVKRF7XAAAAQQDzoXbnS6cOxxcqJfS8ZEQyxgakF0bSUo0D==
+  base_url = https://grafana.ch/api/datasources/proxy/
+  project_id_map = host_cpu_usage:1337,instance_cpu_usage:4337
+  metric_db_map = host_cpu_usage:production_db,instance_cpu_usage:production_db
+  attribute_map = host_cpu_usage:hostname,instance_cpu_usage:human_id
+  translator_map = host_cpu_usage:influxdb,instance_cpu_usage:influxdb
+  query_map = host_cpu_usage:SELECT 100-{0}("{0}_value") FROM {3}.cpu_percent
+              WHERE ("host" =~ /^{1}$/ AND "type_instance" =~/^idle$/ AND
+              time > now()-{2}m),instance_cpu_usage:SELECT 100-{0}("{0}_value")
+              FROM {3}.cpu_percent WHERE ("host" =~ /^{1}$/ AND "type_instance"
+              =~/^idle$/ AND time > now()-{2}m)
+
+Grafana uses project ids to proxy to different ``databases`` each of these
+projects could contain a different type of ``database``. **The term project**
+**will be used throughout this document to prevent possible confusion**.
+All the desired metrics can be collected from multiple project or a single one
+depending on how the monitoring is configured but is limited to a single metric
+per project. This is because there is no method to aggregate a single metric
+across multiple projects.
+
+The way queries have to be sent to the endpoint and how to interpret the
+retrieved data will depend upon the project behind Grafana. To account for
+these differences between projects specific translators will be developed.
+The influxdb translator will be developed first. The translator map is used
+to perform the correct translations per metric depending on the type of
+project.
+
+Projects could contain one or more databases similar to schema's in MySQL. The
+database map allows to define a specific database per metric.
+
+Similar to the project map and the database map is the query map. This map
+contains the queries that will be send to the project to retrieve metrics.
+Queries are depended on the type of project and in the case of influxdb they
+are similar to SQL statements.
+
+The attribute map is used to select specific attribute from the resource
+objects. This is necessary because the attribute used as identifier in projects
+can differ per deployment and per metric.
+
+From the query map the entries will be formatted so that essential information
+for retrieving the specific metric for the desired host can be achieved.
+
+::
+
+  query = 'SELECT "{0}_value" FROM cpu_util WHERE host =~ /^{1}$/ AND time > '
+          'now() - {2}m'
+  query.format(aggregate, attribute, period, translator_specific)
+
+The format options can be extended overtime in case other specific
+projects such as elastic search require different parameters to successfully
+build a query.
+
+The initial format options will be:
+
+* {0} = aggregate
+* {1} = attribute
+* {2} = period
+* {3} = { influxdb: retention_period, }
+
+Because the amount of metrics available to Grafana depends on user
+configuration some minor changes are made to the datasource manager to build
+the metric list for Grafana at runtime.
+
+Instead of configuration many parameters using the default configuration file
+`the metric yaml`_ can used to set the configuration but the expected
+parameters differ from other datasources because of the large amount of
+parameters.
+
+::
+
+  grafana:
+    host_cpu_usage:
+      project: 1337
+      db: production_db
+      attribute: hostname
+      translator: influxdb
+      query: SELECT 100-{0}("{0}_value") FROM {3}.cpu_percent
+              WHERE ("host" =~ /^{1}$/ AND "type_instance" =~/^idle$/ AND
+              time > now()-{2}m)
+
+.. _the metric yaml: https://specs.openstack.org/openstack/watcher-specs/specs/train/approved/file-based-metricmap.html
+
+Alternatives
+------------
+
+Datasources for individual projects that Grafana integrates with could be
+developed but this would be significantly more development effort and possibly
+complicate authorization as it would have to be configured per database.
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Security impact
+---------------
+
+The configuration file will need to contain the Grafana authorization token
+which provides read access to the databases Grafana is configured for.
+The configuration file already contains other important credentials.
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+None
+
+Performance Impact
+------------------
+
+None
+
+Other deployer impact
+---------------------
+
+None
+
+Developer impact
+----------------
+
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  Dantali0n
+
+Work Items
+----------
+
+* Configuration options
+* General Grafana datasource
+* Translator interface
+* InfluxDB translator
+* Unit tests for Grafana
+* Unit tests for translators
+
+
+Dependencies
+============
+
+* The communication with Grafana is realized using the requests library
+
+Testing
+=======
+
+Unit tests for both the datasource itself as well as the translator base class
+and any subsequent translators will be created.
+
+
+Documentation Impact
+====================
+
+A page containing documentation on how end user's can configure the options
+to successfully use Grafana as a datasource will be created.
+
+
+References
+==========
+
+* https://specs.openstack.org/openstack/watcher-specs/specs/train/approved/formal-datasource-interface.html
+* https://specs.openstack.org/openstack/watcher-specs/specs/train/approved/file-based-metricmap.html
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - Train
+     - Introduced
+