Zabbix Plugin for Application Monitoring in Tacker VNF Manager

Develop a Zabbix plugin in Tacker VNF manager to monitor application level
parameters that can't be supported by current Tacker monitoring driver.

Change-Id: Id9bef93a6bff2434a68556c895fe50aec3c1e2ce
Blueprint: zabbix-plugin
MinWookKim 2017-09-20 11:10:32 +09:00
parent 5000b3c076
commit 83b75e0d48
1 changed files with 391 additions and 0 deletions

@ -0,0 +1,391 @@
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==============================================================
Zabbix Plugin for Application Monitoring in Tacker VNF Manager
==============================================================

The URL of the launchpad blueprint:

https://blueprints.launchpad.net/tacker/+spec/zabbix-plugin

Develop a Zabbix plugin in the Tacker VNF manager to monitor application-level
parameters that the current Tacker monitoring drivers cannot support.

Problem description
===================

The current Tacker monitoring drivers provide only basic monitoring. Ceilometer
covers only infrastructure resource parameters (e.g. CPU or memory usage), and
the ping and http_ping drivers only check whether a VNF is alive. To guarantee
the availability and stability of the network services that VNFs provide, the
Tacker VNF manager needs a more advanced monitoring tool that supports
application-level parameters.

Zabbix is a widely used monitoring tool that tracks many types of network
services, applications, resources and servers and quickly notifies
administrators of failures. It provides agents and a simple, easy-to-read web
interface.

The monitoring targets provided by Zabbix are as follows.

* CPU and memory performance
* OS performance
* Process load and process count
* Swap and memory space
* Status of applications such as FTP, HTTP, MySQL, NTP, POP, SSH, Telnet and
  SMTP
* Custom monitoring

Zabbix monitors not only the operating system inside a VNF but also the status,
memory usage, CPU usage and so on of the applications running in it. Many
additional monitoring templates are available from the Zabbix community.

Zabbix also provides APIs that enable accurate and versatile monitoring of each
VDU. It collects data through agents, presents it visually as graphs and maps,
and provides triggers and alarms that promptly notify the user when the
collected data exceeds a user-defined threshold. When a trigger fires,
user-defined commands can be executed through the agent in the VDU to recover
quickly from the failure.

Using these features, Tacker can provide the following VNF monitoring
capabilities.

* Application monitoring
* Resource and network monitoring inside VNFs
* Fault management and recovery for applications inside VNFs
* Execution of predefined commands
* Graphical display of collected data
* Custom trigger conditions

Proposed changes
================

Interoperation with Zabbix requires a template policy, a Service Monitoring
Driver and a Zabbix Plugin. Tacker must send the IP address, name and other
details of each VDU to the Zabbix server through API requests. It must also
request, through the API, the creation of the Template, Item, Trigger, Graph
and other objects needed for monitoring. In addition, a service monitoring
policy should be defined in the TOSCA template so that users can define the
condition values that trigger notifications about the monitored application,
CPU load and status.

Zabbix can collect data (CPU, RAM, disk and network usage) from VNFs using
agents installed in each VM. Through the Zabbix APIs, Tacker can monitor the
VNFs and retrieve the information shown in the Zabbix dashboard, such as
graphs.

In the Tacker VNF descriptor, the user sets up the Zabbix agent via user-data
and also defines the policies and resources for Zabbix, such as thresholds and
the metrics to be monitored.

Tacker tracks the status of each VDU. If a VDU is respawned by a respawn action
in a policy, or an additional VDU is created for scaling, the Zabbix plugin
registers the new VDU as a monitoring host by sending its IP information to the
Zabbix server. In the opposite case, when a VDU is removed, the plugin calls
the delete API for the corresponding host registered in the Zabbix server.
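
The host registration and removal described above could map onto the Zabbix
JSON-RPC methods ``host.create`` and ``host.delete``. The following is a
minimal sketch that only builds the request bodies. The helper names, the host
group ID and the agent port ``10050`` are assumptions made for illustration and
are not part of this proposal's implementation.

.. code-block:: python

  import json


  def build_host_create(token, vdu_name, vdu_ip, group_id, request_id=1):
      """Build a JSON-RPC body that registers a VDU as a Zabbix host."""
      return {
          "jsonrpc": "2.0",
          "method": "host.create",
          "params": {
              "host": vdu_name,
              "interfaces": [{
                  "type": 1,        # 1 = Zabbix agent interface
                  "main": 1,        # default interface for this host
                  "useip": 1,       # connect by IP address, not DNS name
                  "ip": vdu_ip,
                  "dns": "",
                  "port": "10050",  # assumed default agent port
              }],
              "groups": [{"groupid": group_id}],
          },
          "auth": token,
          "id": request_id,
      }


  def build_host_delete(token, host_ids, request_id=1):
      """Build a JSON-RPC body that removes hosts by their Zabbix host IDs."""
      return {
          "jsonrpc": "2.0",
          "method": "host.delete",
          "params": list(host_ids),
          "auth": token,
          "id": request_id,
      }


  if __name__ == "__main__":
      # Hypothetical values; a real token comes from the Zabbix server.
      body = build_host_create("example-token", "vdu1", "10.0.0.5", "2")
      print(json.dumps(body, indent=2))

Keeping the payload construction separate from the HTTP call makes it possible
to unit-test the plugin without a running Zabbix server.
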

The overall workflow is as follows::

     +---------------+           +----------------------------+
     | Zabbix Web UI |           | Tosca_zabbix_template.yaml |
     +---------------+           +--------------+-------------+
                                                |
  +---------------------+   +-------------------v---------------------+
  | Zabbix              |   | Tacker                                  |
  | +-----------------+ |   | +-------------------------------------+ |
  | | Zabbix Frontend | |   | | VNFM                                | |
  | +-----------------+ |   | | +---------------+   +-------------+ | |
  | +-----------------+ |   | | |               |   | Service     | | |
  | |  Zabbix Server  |<--------+ Zabbix Plugin |<--+ Monitoring  | | |
  | |                 | |   | | |               |   | Driver      | | |
  | +-------+-^-+-^---+ |   | | +---------------+   +-------------+ | |
  +---------|-|-|-|-----+   | +-------------------------------------+ |
            | | | |         +-----------------------------------------+
            | | | +--------------------------+
            | | +--------------------------+ |
  +---------|-|----------------------------|-|------------------------+
  | NFVI    | |                            | |                        |
  | +-------|-|----------------------------|-|----------------------+ |
  | | VNF   | |                            | |                      | |
  | | +-----|-|--------------------+ +-----|-|--------------------+ | |
  | | | VDU | |                    | | VDU | |                    | | |
  | | | +---v-+------------------+ | | +---v-+------------------+ | | |
  | | | |      Zabbix Agent      | | | |      Zabbix Agent      | | | |
  | | | +----+--------------+----+ | | +----+--------------+----+ | | |
  | | |      |              |      | |      |              |      | | |
  | | |     +v--------------v+     | |     +v--------------v+     | | |
  | | |     |    Scripts     |     | |     |    Scripts     |     | | |
  | | |     +----+---------^-+     | |     +----+---------^-+     | | |
  | | |          |         |       | |          |         |       | | |
  | | | +--------v---------+-----+ | | +--------v---------+-----+ | | |
  | | | | Application or OS info | | | | Application or OS info | | | |
  | | | |           or           | | | |           or           | | | |
  | | | |      user-defined      | | | |      user-defined      | | | |
  | | | +------------------------+ | | +------------------------+ | | |
  | | +----------------------------+ +----------------------------+ | |
  | +---------------------------------------------------------------+ |
  +-------------------------------------------------------------------+

The Zabbix server asks the Zabbix agent in each VDU to collect the monitored
items, and the agent runs the monitoring script for each item. The agent sends
the collected data to the Zabbix server, which decides whether a trigger should
fire. Triggers can be based on, for example, the average value of the collected
data.

The Tacker VNFM needs a Service Monitoring Driver and a Zabbix Plugin to send
API requests to the Zabbix server so that VDUs can be monitored. The Service
Monitoring Driver extracts the service monitoring policy from the TOSCA
template, converts it into a dictionary and passes it to the Zabbix Plugin. The
Zabbix Plugin first sends a token request to the Zabbix server. It then calls
the host creation API, including the IP address, to register each created VDU
as a monitoring host, and finally calls the creation APIs with the trigger
values and application information needed to monitor each VDU. All of these
requests require the token. Below is an example of a service monitoring policy.

.. code-block:: yaml

  app_monitoring_policy:
    name: zabbix
    zabbix_username: Admin
    zabbix_password: zabbix
    zabbix_server_ip: 192.168.11.53
    zabbix_server_port: 80
    parameters:
      application:
        appname: apache2
        appport: 80
        app_status:
          condition: [down]
          actioname: cmd
          cmd-action: service apache2 restart
        app_memory:
          condition: [greater,22]
          actioname: cmd
          cmd-action: service apache2 stop
      OS:
        os_agent_info:
          condition: [down]
          actioname: cmd
          cmd-action: service zabbix-agent restart
        os_cpu_usage:
        os_proc_value:
          condition: [and less,30]
          actioname: cmd
          cmd-action: reboot
        os_cpu_load:
          condition: [greater,30]
          actioname: cmd
          cmd-action: reboot

The ID, password, IP address and port of the Zabbix server are entered in the
template. The plugin uses this information to access the Zabbix server (e.g. to
obtain a token) and perform monitoring actions.

* zabbix_username, zabbix_password, zabbix_server_ip and zabbix_server_port:
  The information needed to authenticate with Zabbix: the ID and password of
  the Zabbix user, and the IP address and port number of the Zabbix server,
  respectively.
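
As a rough illustration of how these four fields could be used, the sketch
below builds the API URL and a ``user.login`` request. It assumes the Zabbix
3.0 API referenced by this spec (where the login parameter is named ``user``)
and the default frontend path ``/zabbix``. The function names are hypothetical,
and ``request_token`` only works against a reachable Zabbix server.

.. code-block:: python

  import json
  import urllib.request


  def api_url(server_ip, server_port):
      """Return the JSON-RPC endpoint (assumes the default '/zabbix' path)."""
      return "http://%s:%s/zabbix/api_jsonrpc.php" % (server_ip, server_port)


  def build_login(username, password, request_id=1):
      """Build the user.login body; no 'auth' field is sent for this call."""
      return {
          "jsonrpc": "2.0",
          "method": "user.login",
          "params": {"user": username, "password": password},
          "id": request_id,
      }


  def request_token(policy):
      """Send user.login using the policy fields and return the token."""
      body = json.dumps(
          build_login(policy["zabbix_username"], policy["zabbix_password"])
      ).encode("utf-8")
      req = urllib.request.Request(
          api_url(policy["zabbix_server_ip"], policy["zabbix_server_port"]),
          data=body,
          headers={"Content-Type": "application/json-rpc"},
      )
      with urllib.request.urlopen(req) as resp:
          return json.loads(resp.read().decode("utf-8"))["result"]

The returned token would then be passed as the ``auth`` field of every later
request.
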

Zabbix monitors the inside of the VNFs managed by Tacker. From Zabbix's point
of view, the VNFs are ordinary servers that it monitors. It therefore monitors
the CPU, memory and applications of the operating system inside each VNF, sends
notifications when a failure occurs and triggers recovery. All of this can be
checked through the Zabbix web interface. In this way Zabbix helps Tacker
operate VNFs reliably.

Service monitoring policies should be set individually for each VDU, because
the monitoring scope and the monitored application may differ from VDU to VDU.
The parameters are divided into application monitoring and VNF-internal OS
monitoring.

* name: The name of the open source monitoring tool.
* application and OS: Information about application monitoring and VNF OS
  monitoring.
* appname and appport: The name and port of the monitored application.
* app_status and app_memory: The state of the application and its memory usage
  in bytes.
* os_cpu_usage, os_proc_value and os_cpu_load: CPU-related values: the I/O
  throughput, the usage and the number of running processes.
* os_agent_info: The status of the Zabbix agent.

The authentication information for the Zabbix server is entered in the
template, and the Zabbix plugin uses it to communicate with the Zabbix server.
As a result, Tacker can make various monitoring requests to the Zabbix server.
Because each request is authenticated with a token, this approach is also sound
from a security point of view.

Each monitoring item can define a detailed comparison value and a corresponding
action. Zabbix notifies the user when a problem occurs with an item and
performs the defined action. Performing predefined actions when a failure
occurs increases the stability of VNF application operations.

The details that can be set for each monitoring target are as follows.

* condition: [comparison, value]

  A condition consists of a comparison and a value. The comparisons include
  ``greater``, ``less``, ``and greater`` and ``and less``. Zabbix decides
  whether an item has failed by comparing the collected value with this
  baseline value.

* actioname: [action name]
* cmd-action: [command to be executed in the VNF]

Currently the only supported action is cmd; respawn and scale actions will be
added in the future. If the condition is true, the cmd-action is executed
inside the VNF.
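
One way a ``condition`` pair could be turned into a Zabbix trigger expression
is sketched below. It uses the Zabbix 3.0 expression form
``{host:key.function(param)}operator constant``. The function name, the
operator table and the item keys are assumptions for illustration. Only
``greater`` and ``less`` are handled, because this spec does not define how
the other comparisons translate.

.. code-block:: python

  # Assumed mapping from policy comparisons to Zabbix operators.
  OPERATORS = {"greater": ">", "less": "<"}


  def trigger_expression(host, item_key, condition, function="last()"):
      """Build a trigger expression string from a [comparison, value] pair."""
      comparison, value = condition
      try:
          operator = OPERATORS[comparison]
      except KeyError:
          raise ValueError("unsupported comparison: %r" % (comparison,))
      return "{%s:%s.%s}%s%s" % (host, item_key, function, operator, value)


  if __name__ == "__main__":
      # Hypothetical host name; the item key is a standard agent key.
      print(trigger_expression(
          "vdu1", "system.cpu.load[percpu,avg1]", ["greater", 30]))

Passing ``function="avg(5m)"`` instead of the default would make the trigger
depend on the five-minute average rather than on the latest value.
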

Actions make it possible to respond to a failure. The actions provided by
Zabbix include remote commands, script execution and so on, and they can be run
either through the Zabbix agent or on the Zabbix server. The cmd action uses
the script execution provided by Zabbix. If cmd is chosen, cmd-action must also
be defined; it is the command to be executed through the Zabbix agent in the
VNF. In summary, cmd is executed as follows.

* cmd: The Zabbix server automatically executes the user's cmd-action through
  the Zabbix agent in the VNF.

The Zabbix agent can be preinstalled in the VDU image, or the monitoring
environment can be set up automatically by a script in the user data section of
the TOSCA template. An example script used in user data is shown below.

.. code-block:: console

  user_data: |
    #!/bin/bash
    # Install the Zabbix agent and the monitored application.
    sudo apt-get -y update
    sudo apt-get -y upgrade
    sudo apt-get -y install zabbix-agent
    sudo apt-get -y install apache2
    # Write the VDU address, hostname and Zabbix server into /etc/hosts.
    sudo sed -i "2s/.*/`ifconfig [Interface name in VNF] | grep ""\"inet addr:\"""| cut -d: -f2 | awk ""\"{ print $1 }\"""`/g" "/etc/hosts"
    sudo sed -i "s/Bcast/`cat /etc/hostname`/g" "/etc/hosts"
    sudo sed -i "3s/.*/[Zabbix server's IP Address]\tmonitor/g" "/etc/hosts"
    sudo /etc/init.d/networking restart
    # Let the zabbix user run remote commands; tee keeps the append privileged.
    echo 'zabbix ALL=NOPASSWD: ALL' | sudo tee -a /etc/sudoers > /dev/null
    # Point the agent at the Zabbix server and enable remote commands.
    sudo sed -i "s/# EnableRemoteCommands=0/EnableRemoteCommands=1/" "/etc/zabbix/zabbix_agentd.conf"
    sudo sed -i "s/Server=127.0.0.1/Server=[Zabbix server's IP Address]/" "/etc/zabbix/zabbix_agentd.conf"
    sudo sed -i "s/ServerActive=127.0.0.1/ServerActive=[Zabbix server's IP Address:Port]/" "/etc/zabbix/zabbix_agentd.conf"
    sudo sed -i "s/Hostname=Zabbix server/Hostname=`cat /etc/hostname`/" "/etc/zabbix/zabbix_agentd.conf"
    sudo service apache2 restart
    sudo service zabbix-agent restart

The script installs the Zabbix agent with apt-get and then edits its
configuration file with sed. This is done during VDU initialization.

As a result, Zabbix gives Tacker a monitoring function in which the monitoring
scope and targets can be set in detail for each VDU, providing more stable
support for the applications running in the VDU.

If an application monitoring policy and an existing monitoring policy are used
at the same time and a respawn or scale action is triggered by the existing
monitoring policy, that action proceeds as usual. However, once such an action
occurs, the application monitoring policy no longer works. Within the scope of
this proposal, the user must therefore choose either the existing monitoring
policy or the application monitoring policy. This limitation can be resolved by
adding respawn and scale actions to application monitoring, separately from the
existing monitoring policy.

Alternatives
------------

Other open source monitoring tools that provide an API (e.g. Nagios).

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Notifications impact
--------------------

None

Other end user impact
---------------------

None

Performance Impact
------------------

None

Other deployer impact
---------------------

None

Developer impact
----------------

None

Implementation
==============

Assignee(s)
-----------

Primary author and contact:
  MinWookKim <delightwook@dcn.ssu.ac.kr>

Primary assignee:
  MinWookKim <delightwook@dcn.ssu.ac.kr>

Work Items
----------

1. Zabbix plugin in monitoring_driver
2. Plugin in the VNFM that calls the Zabbix plugin
3. Additional features in the alarm receiver for receiving alarms from the
   Zabbix server
4. Definition in tacker.conf of the server IP address, port, ID and password
   that the Zabbix plugin uses for requests to the Zabbix server
5. Extraction of the app_monitoring_policy defined in the template (utils,
   translate_template)

Dependencies
============

* Zabbix server installation
* Zabbix agent installation

Testing
=======

* Add functional tests for VNF service monitoring
* Performance tests for VNF failure detection and recovery in Zabbix
* Test of the overall workflow based on the service monitoring policy

Documentation Impact
====================

None

References
==========

.. [#first] https://www.zabbix.com/documentation/3.0/manual
.. [#second] https://share.zabbix.com/
.. [#third] https://www.nagios.org/documentation/