Merge "Add a document about the resource monitoring"

2018-02-07 08:32:52 +00:00 · 2018-02-07 08:32:52 +00:00 · 4c58611d01
parent dcf42e75ca 55c98c365e
commit 4c58611d01
6 changed files with 118 additions and 7 deletions
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -5,14 +5,13 @@ Blazar is an OpenStack service to provide resource reservations in the
 OpenStack cloud for different resource types - both virtual (instances,
 volumes, stacks) and physical (hosts).

-Overview
--------
+User Guide
+----------

 .. toctree::
-    :maxdepth: 1
+    :maxdepth: 2

-    introduction
-    architecture
+    user/index

 Installation Guide
 ------------------
@@ -65,4 +64,4 @@ Indices and tables
 ------------------

 * :ref:`genindex`
-* :ref:`search`
+* :ref:`search`
--- a/doc/source/user/architecture.rst
+++ b/doc/source/user/architecture.rst
@@ -4,7 +4,7 @@ Blazar architecture

 Blazar design can be described by following diagram:

-.. image:: images/blazar-architecture.png
+.. image:: ../images/blazar-architecture.png
    :width: 700 px
    :scale: 99 %
    :align: left
--- a/doc/source/user/compute-host-monitor.rst
+++ b/doc/source/user/compute-host-monitor.rst
@@ -0,0 +1,40 @@
+====================
+Compute Host Monitor
+====================
+
+The compute host monitor detects failures and recoveries of compute hosts.
+If it detects a failure, it triggers healing of the host reservations and
+instance reservations that use the failed host. This document describes the
+compute host monitor plugin in detail.
+
+Monitoring Type
+===============
+
+Both the push-based and the polling-based monitoring types are supported
+for the compute host monitor.
+Each type can be enabled or disabled with the following configuration options:
+
+* **enable_notification_monitor**: Set to *True* to enable the push-based monitor.
+* **enable_polling_monitor**: Set to *True* to enable the polling-based monitor.
+
+Failure Detection
+=================
+
+The compute host monitor detects failed and recovered hosts by subscribing to
+Nova notifications or by polling the *List Hypervisors* API of Nova. If a
+failure is detected, Blazar sets the *reservable* field of the failed host to
+*False* and heals the affected reservations as described below.
+
+Reservation Healing
+===================
+
+If a host failure is detected, Blazar tries to heal the host and instance
+reservations that use the failed host by reserving an alternative host.
+
+Configurations
+==============
+
+To enable the compute host monitor, set the **enable_notification_monitor**
+option, the **enable_polling_monitor** option, or both to *True*.
+See :doc:`../configuration/blazar-conf` for details on these and other
+options.
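As a rough illustration of the polling path described in compute-host-monitor.rst, the sketch below classifies hosts from entries shaped like a *List Hypervisors* response. It is not Blazar's implementation. The function name ``classify_hosts``, the ``reservable`` bookkeeping dict, the rule that a hypervisor counts as failed when its ``state`` is ``down`` or its ``status`` is ``disabled``, and the rule that an unreservable host that is up again counts as recovered are all assumptions made for this example.

```python
from typing import Dict, Iterable, List, Tuple


def classify_hosts(
    hypervisors: Iterable[dict],
    reservable: Dict[str, bool],
) -> Tuple[List[str], List[str]]:
    """Return (failed, recovered) hostnames among the managed hosts.

    `hypervisors` mimics entries of a List Hypervisors response.
    `reservable` maps each managed hostname to its current reservable
    flag and is updated in place for failed hosts.
    """
    failed: List[str] = []
    recovered: List[str] = []
    for hv in hypervisors:
        name = hv["hypervisor_hostname"]
        if name not in reservable:
            continue  # host is not managed here, so ignore it
        # Assumed failure rule for this sketch only.
        is_down = hv["state"] == "down" or hv["status"] == "disabled"
        if is_down and reservable[name]:
            failed.append(name)
        elif not is_down and not reservable[name]:
            # Assumed recovery rule: an unreservable host is up again.
            recovered.append(name)
    # Per the document, a failed host is marked as not reservable.
    for name in failed:
        reservable[name] = False
    return failed, recovered
```

Healing of the affected reservations would then start from the ``failed`` list. What happens to the hosts in ``recovered`` is left open here because the document does not describe it.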
--- a/doc/source/user/index.rst
+++ b/doc/source/user/index.rst
@@ -5,5 +5,8 @@ User Guide
 .. toctree::
    :maxdepth: 2

+    introduction.rst
+    architecture.rst
+    resource-monitoring.rst
    ../cli/index
    ../restapi/index
--- a/doc/source/user/introduction.rst
+++ b/doc/source/user/introduction.rst
--- a/doc/source/user/resource-monitoring.rst
+++ b/doc/source/user/resource-monitoring.rst
@@ -0,0 +1,69 @@
+===================
+Resource Monitoring
+===================
+
+Blazar monitors the states of resources and heals reservations that are
+expected to suffer from a resource failure.
+Resource-specific functionality, e.g., calling Nova APIs, is provided by a
+monitoring plugin.
+The following sections describe the resource monitoring feature in detail.
+
+Monitoring Type
+===============
+
+Blazar supports two types of monitoring: push-based and polling-based.
+
+1. Push-based monitoring
+
+   The monitor listens to notification messages sent by other components,
+   e.g., by Nova for the host monitoring plugin,
+   and picks up the messages that refer to resources managed by Blazar.
+   Event types, topics to subscribe to, and notification callbacks are
+   provided by monitoring plugins.
+
+2. Polling-based monitoring
+
+   The blazar-manager periodically calls the state check method of each
+   monitoring plugin. Each plugin then checks the states of its resources.
+   For example, the host monitoring plugin uses *List Hypervisors* of the
+   Compute API.
+
+Admins can enable or disable each monitoring type by setting configuration options.
+
+Healing
+=======
+
+When the monitor detects a resource failure, it heals reservations which
+are expected to suffer from the failure.
+
+Flags
+=====
+
+Leases and reservations have flags that indicate states of reserved
+resources. Reservations have the following two flags:
+
+* **missing_resources**: If any resource allocated to the reservation fails
+  and no alternative resource is found, this flag is set to *True*.
+
+* **resources_changed**: If any resource allocated to an *active* reservation
+  fails and an alternative resource is reallocated, this flag is set to *True*.
+
+Leases have the following flag:
+
+* **degraded**: If the **missing_resources** or the **resources_changed** flag
+  of any reservation included in the lease is *True*, this flag is *True*.
+
+Lease owners can see the health of a lease and the reservations included in
+the lease by checking these flags.
+
+Monitoring Resources
+====================
+
+Resource-specific functionality is provided as a monitoring plugin.
+The following resource is currently supported.
+
+.. toctree::
+   :maxdepth: 1
+
+   compute-host-monitor.rst
+
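The flag rules in the Flags section of resource-monitoring.rst can be sketched as a small model. This is only an illustration of the documented semantics, not Blazar's data model: the ``Reservation`` and ``Lease`` classes, the ``status`` strings, and the ``heal`` helper are hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reservation:
    status: str = "pending"  # hypothetical status values for this sketch
    missing_resources: bool = False
    resources_changed: bool = False


@dataclass
class Lease:
    reservations: List[Reservation] = field(default_factory=list)

    @property
    def degraded(self) -> bool:
        # True if either flag is set on any reservation in the lease.
        return any(
            r.missing_resources or r.resources_changed
            for r in self.reservations
        )


def heal(reservation: Reservation, alternative_found: bool) -> None:
    """Update flags after a resource allocated to `reservation` failed."""
    if not alternative_found:
        # The resource failed and no alternative resource was found.
        reservation.missing_resources = True
    elif reservation.status == "active":
        # An active reservation had an alternative resource reallocated.
        reservation.resources_changed = True
```

In this model a reservation that is not yet active and receives an alternative resource sets neither flag, so the lease is not reported as degraded. This follows from the two flag definitions as written.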