Update the high level architecture

Updated the architecture document to include new features since folsom.

Change-Id: I8f3bee2f881341a18ad20063d081f0fb7d63c3ad
Nicolas Barcet (nijaba) 2013-09-03 00:45:24 +02:00
parent a9f147c62a
commit ebd7a7d9be
9 changed files with 236 additions and 26 deletions


BIN
doc/source/3-Pipeline.png Normal file


@ -15,42 +15,250 @@ High Level Description
double: database; architecture
double: API; architecture
The following diagram summarizes ceilometer logical architecture:
Objectives
----------
.. The image source can be found at https://docs.google.com/drawings/d/1-6-DxU5ITyRcVJtJtPsc_zeiqzafZlir0AF7AkG4ZeQ/edit
The Ceilometer project was started in 2012 with one simple goal in mind: to
provide an infrastructure to collect any information needed regarding
OpenStack projects. It was designed so that rating engines could use this
single source to transform events into billable items which we
label as "metering".
.. image:: ./Ceilometer_Architecture.png
As the project started to come to life, collecting an
`increasing number of metrics`_ across multiple projects, the OpenStack
community started to realize that a secondary goal could be added to
Ceilometer: become a standard way to collect metrics, regardless of the
purpose of the collection. For example, Ceilometer can now publish information for
monitoring, debugging and graphing tools in addition to, or in parallel with, the
metering backend. We labelled this effort “multi-publisher”.
As shown in the above diagram, there are 5 basic components to the system:
.. _increasing number of metrics: http://docs.openstack.org/developer/ceilometer/measurements.html
1. A :term:`compute agent` runs on each compute node and polls for
resource utilization statistics. There may be other types of agents
in the future, but for now we will focus on creating the compute
agent.
Most recently, as the Heat project started to come to
life, it soon became clear that the OpenStack project needed a tool to watch for
variations in key values in order to trigger various reactions.
As Ceilometer already had the tooling to collect vast quantities of data, it
seemed logical to add this as an extension of the Ceilometer project, which we
tagged as “alarming”.
2. A :term:`central agent` runs on a central management server to
poll for resource utilization statistics for resources not tied
to instances or compute nodes.
Metering
--------
3. A :term:`collector` runs on one or more central management
servers to monitor the message queues (for notifications and for
metering data coming from the agent). Notification messages are
processed and turned into metering messages and sent back out onto
the message bus using the appropriate topic. Metering messages are
written to the data store without modification.
If you divide a billing process into three steps, as is commonly done in
the telco industry, the steps are:
4. A :term:`data store` is a database capable of handling concurrent
writes (from one or more collector instances) and reads (from the
API server).
1. :term:`Metering` is the process of collecting information about what,
who, when and how much regarding anything that can be billed. The result of
this is a collection of “tickets” (a.k.a. samples) which are ready to be
processed in any way you want.
2. :term:`Rating` is the process of analysing a series of tickets,
according to business rules defined by marketing, in order to transform
them into bill line items with a currency value.
3. :term:`Billing` is the process of assembling bill line items into a
single per-customer bill and emitting the bill to start the payment collection.
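The three steps above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the ``Sample`` fields, the meter names and the ``PRICE_PER_UNIT`` table are hypothetical and are not Ceilometer's data model. Only the first step corresponds to what Ceilometer itself provides.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """A metering "ticket": who used how much of what, and when (hypothetical fields)."""
    user_id: str
    meter: str
    volume: float
    timestamp: str


# Hypothetical business rules: price per unit of each meter.
PRICE_PER_UNIT = {"instance_hours": 0.05, "disk_gb_hours": 0.001}


def rate(samples):
    """Rating: turn each ticket into a (user, meter, amount) bill line item."""
    return [(s.user_id, s.meter, s.volume * PRICE_PER_UNIT[s.meter]) for s in samples]


def bill(line_items):
    """Billing: assemble line items into a single total per customer."""
    totals = defaultdict(float)
    for user_id, _meter, amount in line_items:
        totals[user_id] += amount
    return dict(totals)


# Metering: the tickets that a metering system would hand over.
samples = [
    Sample("alice", "instance_hours", 10, "2013-09-01T00:00:00"),
    Sample("alice", "disk_gb_hours", 2000, "2013-09-01T00:00:00"),
    Sample("bob", "instance_hours", 4, "2013-09-01T00:00:00"),
]
invoices = bill(rate(samples))
```

Keeping ``rate`` and ``bill`` as separate functions mirrors the separation of concerns described here: the metering output can be reused by any rating engine without changing how samples are collected.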
5. An :term:`API server` runs on one or more central management
servers to provide access to the data from the data store. See
`API Description`_ for details.
Ceilometer's initial goal was, and still is, strictly limited to step
one. This is a choice made from the beginning not to go into rating or billing,
as the variety of possibilities seemed too large for the project to ever deliver
a solution that would fit everyone's needs, from private to public clouds. This
means that if you are looking at this project to solve your billing needs, this
is the right way to go, but certainly not the end of the road for you. Once
Ceilometer is in place on your OpenStack deployment, you will still have
quite a few things to do before you can produce a bill for your customers.
One of your first tasks could be finding the right queries within the Ceilometer
API to extract the information you need for your very own rating engine.
.. _API Description: api.html
You can of course use the same API to satisfy other needs, such as a data mining
solution to help you identify unexpected or new usage types, or a capacity
planning solution. In general, it is recommended to download the data from the API in
order to work on it in a separate database to avoid overloading the one which
should be dedicated to storing tickets. In practice, the
Ceilometer metering DB often keeps only a couple of months' worth of data, while data is
regularly offloaded into a long-term store connected to the billing system,
but this is left entirely up to the implementor.
These services communicate using the standard OpenStack messaging
bus. Only the collector and API server have access to the data store.
.. note::
We do not guarantee that we won't change the DB schema, so it is
highly recommended to access the database through the API and not use
direct queries.
How is data collected?
----------------------
.. The source for the 7 diagrams below can be found at: https://docs.google.com/presentation/d/1P50qO9BSAdGxRSbgHSbxLo0dKWx4HDIgjhDVa8KBR-Q/edit?usp=sharing
.. figure:: ./1-Collectorandagents.png
:figwidth: 100%
:align: center
:alt: Collectors and agents
This is a representation of how the collectors and agents gather data from multiple sources.
In a perfect world, each and every project that you want to instrument should
send events on the Oslo bus about anything that could be of interest to
you. Unfortunately, not all
projects have implemented this and you will often need to instrument
other tools which may not use the same bus as OpenStack has defined. To
circumvent this, the Ceilometer project created 3 independent methods to
collect data:
1. :term:`Bus listener agent` which takes events generated on the Oslo
notification bus and transforms them into Ceilometer samples. Again, this
is the preferred method of data collection. If you are working on some
OpenStack related project and are using the Oslo library, you are kindly
invited to come and talk to one of the project members to learn how you
could quickly add instrumentation for your project.
2. :term:`Push agents` which is the only solution to fetch data within projects
which do not expose the required data in a remotely usable way. This is not
the preferred method as it makes deployment a bit more complex, having to add
a component to each of the nodes that need to be monitored. However, we do
prefer this to a polling agent method as resilience (high
availability) will not be a problem with this method.
3. :term:`Polling agents` which is the least preferred method, that will poll
some API or other tool to collect information at a regular interval. The main
reason why we do not like this method is the inherent difficulty of making such
a component resilient.
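The difference between listening and polling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary keys, the ``origin`` field and both function names are hypothetical and do not reflect Ceilometer's actual sample format or agent code. A push agent would build samples in much the same way as the polling function, except that it runs on the monitored node and sends the result itself.

```python
def sample_from_notification(notification):
    """Bus listener: reuse an event the project already emitted on the bus."""
    payload = notification["payload"]
    return {
        "meter": "instance",
        "resource_id": payload["instance_id"],
        "volume": 1,
        "timestamp": notification["timestamp"],
        "origin": "notification",
    }


def sample_from_poll(read_value, meter, resource_id, timestamp):
    """Polling agent: actively ask some API or tool for the current value."""
    return {
        "meter": meter,
        "resource_id": resource_id,
        "volume": read_value(),
        "timestamp": timestamp,
        "origin": "poll",
    }


# A hypothetical event, shaped loosely like a notification on the bus.
event = {
    "event_type": "compute.instance.exists",
    "timestamp": "2013-09-03T00:45:24",
    "payload": {"instance_id": "vm-42"},
}
listened = sample_from_notification(event)

# A hypothetical reader standing in for a call to some external tool.
polled = sample_from_poll(lambda: 17.5, "disk.usage", "vm-42", "2013-09-03T00:50:00")
```

Note that the listener does no work until an event arrives, whereas the poller has to be scheduled and kept alive, which is where the resilience concern comes from.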
How to access collected data?
-----------------------------
Once collected, the data is stored in a database. There can be multiple types of
databases through the use of different database plugins (see the section
`which database to use`_). Moreover, the schema and dictionary of
this database can also evolve over time. For both reasons, we offer a REST API
that should be the only way for you to access the collected data rather than
accessing the underlying database directly. It is possible that the way
you'd like to access your data is not yet supported by the API. If you think
this is the case, please contact us with your feedback as this will certainly
lead us to improve the API.
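As an illustration of going through the REST API rather than the database, the following Python sketch only builds a query URL and does not contact any server. The host name, port and resource identifier are placeholders, and the path and ``q.field`` / ``q.op`` / ``q.value`` parameters reflect our understanding of the v2 API, so please check them against the API reference before relying on them.

```python
from urllib.parse import urlencode


def statistics_url(base_url, meter, resource_id, period):
    """Build a URL asking for statistics of one meter on one resource.

    The endpoint layout and parameter names are assumptions to verify
    against the API documentation of your Ceilometer version.
    """
    query = urlencode([
        ("q.field", "resource_id"),
        ("q.op", "eq"),
        ("q.value", resource_id),
        ("period", period),
    ])
    return f"{base_url}/v2/meters/{meter}/statistics?{query}"


# Placeholder endpoint and resource id, for illustration only.
url = statistics_url("http://ceilometer.example.com:8777", "cpu_util", "abc-123", 3600)
```

A rating engine would typically send such a request with an authentication token and store the result in its own database.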
.. figure:: ./2-accessmodel.png
:figwidth: 100%
:align: center
:alt: data access model
This is a representation of how to access data stored by Ceilometer.
The :ref:`list of currently built in meters <measurements>` is
available in the developer documentation,
and it is also relatively easy to add your own (and eventually contribute it).
Ceilometer is part of OpenStack, but is not tied to OpenStack's definition of
"users" and "tenants." The "source" field of each sample refers to the authority
defining the user and tenant associated with the sample. Deployers can define
custom sources through a configuration file, and then create agents to collect
samples for new meters using those sources. This means that you can collect
data for applications running on top of OpenStack, such as a PaaS or SaaS
layer, and use the same tools for metering your entire cloud.
Moreover, end users can also :ref:`send their own application centric data <user-defined-data>` into the
database through the REST API for a variety of use cases (see the section
“Alarming” later in this article).
.. _send their own application centric data: ./webapi/v2.html#user-defined-data
Multi-Publisher
---------------
.. figure:: ./3-Pipeline.png
:figwidth: 100%
:align: center
:alt: Ceilometer pipeline
The assembly of components making up the Ceilometer pipeline.
Publishing meters for different uses is actually a two-dimensional problem.
The first variable is the frequency of publication. Typically, a meter that
you publish for billing needs to be updated every 30 minutes, while the
same meter may be needed every 10 seconds for performance tuning.
The second variable is the transport. In the case of data intended for a
monitoring system, losing an update or not ensuring security
(non-repudiability) of a message is not really a problem while the same meter
will need both security and guaranteed delivery in the case of data intended
for rating and billing systems.
To solve this, multiple publishers can now be configured for each
meter within Ceilometer, allowing the same technical meter to be published
multiple times to multiple destinations, each potentially using a different
transport and frequency of publication. At the time of writing, two
transports have been implemented: the original and relatively secure
Oslo RPC queue-based transport, and one using UDP packets.
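The two dimensions, frequency and transport, can be sketched as follows. This is a toy model under stated assumptions, not Ceilometer's pipeline implementation: the ``Publisher`` class, its fields and the simulated clock are all hypothetical, and the ``transport`` value is only a label here.

```python
from dataclasses import dataclass, field


@dataclass
class Publisher:
    """One destination for a meter (hypothetical model)."""
    name: str
    transport: str   # label only, e.g. "rpc" (guaranteed) or "udp" (lossy)
    interval: int    # seconds between two publications
    sent: list = field(default_factory=list)

    def maybe_publish(self, now, sample):
        """Publish only when the simulated clock hits this publisher's interval."""
        if now % self.interval == 0:
            self.sent.append((now, sample))


publishers = [
    Publisher("billing", "rpc", 1800),    # every 30 minutes
    Publisher("monitoring", "udp", 10),   # every 10 seconds
]

# Simulate one hour with a 10 second tick; the same meter feeds both publishers.
for now in range(0, 3600, 10):
    for publisher in publishers:
        publisher.maybe_publish(now, {"meter": "cpu_util", "volume": 12.5})
```

Over the simulated hour the monitoring publisher emits on every tick, while the billing publisher emits only twice, even though both are fed by the same meter.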
.. figure:: ./4-Transformer.png
:figwidth: 100%
:align: center
:alt: Transformer example
Example of aggregation of multiple CPU time usage samples into a single
CPU percentage sample.
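The arithmetic behind such a transformation can be sketched in Python. This is an illustration only, assuming cumulative CPU time expressed in nanoseconds and timestamps in seconds; the function name and argument layout are hypothetical and are not the transformer's actual interface.

```python
def cpu_percent(previous, current, cpu_count=1):
    """Derive a CPU utilisation percentage from two cumulative CPU time samples.

    Each sample is a (timestamp_seconds, cumulative_cpu_time_ns) pair.
    """
    (t0, ns0), (t1, ns1) = previous, current
    elapsed_ns = (t1 - t0) * 1e9
    if elapsed_ns <= 0:
        raise ValueError("samples must be in increasing time order")
    # CPU time consumed, as a share of the CPU time available in the interval.
    return 100.0 * (ns1 - ns0) / (elapsed_ns * cpu_count)


# 5 seconds of CPU time consumed over a 10 second window.
one_cpu = cpu_percent((0, 0), (10, 5_000_000_000))
two_cpus = cpu_percent((0, 0), (10, 5_000_000_000), cpu_count=2)
```

With one CPU the example yields 50%, and with two CPUs the same consumption yields 25%, since twice as much CPU time was available during the interval.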
.. figure:: ./5-multi-publish.png
:figwidth: 100%
:align: center
:alt: Multi-publish
This figure shows how a sample can be published to multiple destinations.
Alarming
--------
The Alarming component of Ceilometer, which is being delivered in the Havana
version, allows you to set alarms based on threshold evaluation for a collection
of samples. An alarm can be set on a single meter, or on a combination. For
example, you may want to trigger an alarm when the memory consumption
reaches 70% on a given instance if the instance has been up for more than
10 min. To set up an alarm, you will call :ref:`Ceilometer's API server <alarms-api>`, specifying
the alarm conditions and an action to take.
Of course, if you are not an administrator of the cloud itself, you can only
set alarms on meters for your own components. The good news is that you can also
:ref:`send your own meters <user-defined-data>` from within your instances,
meaning that you can trigger
alarms based on application centric data.
There can be multiple forms of action, but two have been implemented so far:
1. HTTP callback: you provide a URL to be called whenever the alarm has been set
off. The payload of the request contains all the details of why the alarm went
off.
2. log: mostly useful for debugging, stores alarms in a log file.
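Threshold evaluation followed by a log action can be sketched in Python as shown here. This is a simplified model under stated assumptions: the function names, the use of a plain average and the in-memory ``log`` list are hypothetical and do not describe how the alarm evaluator is actually implemented.

```python
import operator

COMPARATORS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def evaluate(values, threshold, comparison="ge"):
    """Compare the average of the collected values against a threshold."""
    if not values:
        return "insufficient data"
    average = sum(values) / len(values)
    return "alarm" if COMPARATORS[comparison](average, threshold) else "ok"


def log_action(alarm_name, state, log):
    """Log action: record the state change (here in a list, for illustration)."""
    log.append(f"alarm {alarm_name} is now in state {state}")


log = []
memory_percent = [68.0, 71.0, 74.0]  # hypothetical samples, average 71.0
state = evaluate(memory_percent, threshold=70.0)
if state == "alarm":
    log_action("high-memory", state, log)
```

An HTTP callback action would follow the same pattern, posting the alarm details to the URL supplied when the alarm was created instead of appending to a log.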
For more details on this, we recommend reading the blog post by
Mehdi Abaakouk `Autoscaling with Heat and Ceilometer`_. Particular attention
should be given to the section “Some notes about deploying alarming” as the
database setup (using a separate database from the one used for metering)
will be critical in all cases of production deployment.
.. _Autoscaling with Heat and Ceilometer: http://techs.enovance.com/5991/autoscaling-with-heat-and-ceilometer
.. _which database to use:
Which database to use
---------------------
.. figure:: ./6-storagemodel.png
:figwidth: 100%
:align: center
:alt: Storage model
An overview of the Ceilometer storage model.
Since the beginning of the project, a plugin model has been put in place
to allow for various types of database backends to be used. However, not
all implementations are equal and, at the time of writing, MongoDB
is the recommended backend because it is the most tested. Have a look
at the “choosing a database backend” section of the documentation for more
details. In short, ensure a dedicated database is used when deploying
Ceilometer as the volume of data generated can be extensive in a production
environment and will generally use a lot of I/O.
.. figure:: ./7-overallarchi.png
:figwidth: 100%
:align: center
:alt: Architecture summary
An overall summary of Ceilometer's logical architecture.
Detailed Description
====================


@ -32,6 +32,7 @@ Samples and Statistics
.. autotype:: ceilometer.api.controllers.v2.Statistics
:members:
.. _alarms-api:
Alarms
======
@ -292,6 +293,7 @@ parameter to the query::
This query would only return the last 3 samples.
.. _user-defined-data:
User-defined data
+++++++++++++++++