Retire repository

Fuel (in the openstack namespace) and fuel-ccp (in the x namespace)
repositories are unused and ready to retire.

This change removes all content from the repository and adds the usual
README file to point out that the repository is retired, following the
process from
https://docs.openstack.org/infra/manual/drivers.html#retiring-a-project

See also
http://lists.openstack.org/pipermail/openstack-discuss/2019-December/011647.html

Depends-On: https://review.opendev.org/699362
Change-Id: I6b38110f2d230006cd9cce1da5d2cf76cf470d35
This commit is contained in:
parent 617e225baa
commit 16f5f4cf44
@@ -1,69 +0,0 @@
*.py[cod]

# C extensions
*.so

# Packages
*.egg
*.egg-info
dist
build
.eggs
eggs
parts
bin
var
sdist
develop-eggs
.installed.cfg
lib
lib64

# Installer logs
pip-log.txt

# Unit test / coverage reports
.coverage
cover
.tox
nosetests.xml
.testrepository
.venv

# Translations
*.mo

# Mr Developer
.mr.developer.cfg
.project
.pydevproject

# Complexity
output/*.html
output/*/index.html

# Sphinx
doc/build

# oslo-config-generator
etc/*.sample

# pbr generates these
AUTHORS
ChangeLog

# Editors
*~
.*.swp
.*sw?

# Vagrant
.vagrant
vagrant/Vagrantfile.custom
vagrant/vagrantkey*

# generated openrc
openrc

# tests
tests/.cache/*
LICENSE (176 lines)
@@ -1,176 +0,0 @@

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.
@@ -0,0 +1,10 @@
This project is no longer maintained.

The contents of this repository are still available in the Git
source code management system. To see the contents of this
repository before it reached its end of life, please check out the
previous commit with "git checkout HEAD^1".

For any further questions, please email
openstack-discuss@lists.openstack.org or join #openstack-dev on
Freenode.
@@ -1,16 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}

# Install alarm-manager and dependencies
COPY alarm-manager.py /opt/ccp/bin/
COPY requirements.txt /tmp/requirements.txt
COPY config-files /etc/alarm-manager/

RUN pip install --no-cache-dir -r /tmp/requirements.txt \
    && useradd --user-group alarm-manager \
    && usermod -a -G microservices alarm-manager \
    && chown -R alarm-manager: /etc/alarm-manager \
    && chmod 755 /opt/ccp/bin/alarm-manager.py \
    && rm -f /tmp/requirements.txt

USER alarm-manager
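For context, the Dockerfile above is a Jinja2 template consumed by the fuel-ccp image build, not a plain Dockerfile. As a purely illustrative sketch (the expansion of `image_spec("base-tools")` and `maintainer` below is assumed, not taken from fuel-ccp), the rendered result could look roughly like:

```dockerfile
# Hypothetical rendering -- registry path and maintainer are assumptions
FROM example-registry/ccp/base-tools:latest
MAINTAINER MOS Microservices <mos-microservices@example.org>

# Install alarm-manager and dependencies
COPY alarm-manager.py /opt/ccp/bin/
COPY requirements.txt /tmp/requirements.txt
COPY config-files /etc/alarm-manager/

RUN pip install --no-cache-dir -r /tmp/requirements.txt \
    && useradd --user-group alarm-manager \
    && usermod -a -G microservices alarm-manager \
    && chown -R alarm-manager: /etc/alarm-manager \
    && chmod 755 /opt/ccp/bin/alarm-manager.py \
    && rm -f /tmp/requirements.txt

USER alarm-manager
```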
@@ -1,603 +0,0 @@
#!/usr/bin/env python
#
# Copyright 2016 Mirantis, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#

# Global imports
# --------------
import argparse
import hashlib
import jinja2
import logging
import logging.config
import os
import pyinotify
import re
import sys
import yaml

# Best practice code for logging
# ------------------------------
try:  # Python 2.7+
    from logging import NullHandler
except ImportError:
    class NullHandler(logging.Handler):
        def emit(self, record):
            pass

# Global variables initialization
# -------------------------------

dflt_cfg_dir = os.path.join(
    '/etc', 'alarm-manager')
dflt_config = os.path.join(
    dflt_cfg_dir, 'config', 'alarm-manager.ini')
dflt_template = os.path.join(
    dflt_cfg_dir, 'templates', 'lua_alarming_template.j2')
dflt_cfg_template = os.path.join(
    dflt_cfg_dir,
    'templates', 'alarm_manager_lua_config_template.cfg.j2')
dflt_dest_dir = os.path.join(
    '/opt', 'ccp', 'lua', 'modules', 'stacklight_alarms')
dflt_cfg_dest_dir = os.path.join(
    '/var', 'lib', 'hindsight', 'load', 'analysis')
dflt_alarm_file = 'alarms.yaml'

# Logging initialization
# ----------------------


def logger_init(cfg_file):
    """Initialize logger instance."""
    log = logging.getLogger()
    log.setLevel(logging.DEBUG)
    try:
        log.debug('Looking for log configuration file: %s' % cfg_file)
        # Default logging configuration file
        logging.config.fileConfig(cfg_file)
    except Exception:
        # Only add handler if not already done
        if len(log.handlers) == 0:
            # Hardcoded default logging configuration if no/bad config file
            console_handler = logging.StreamHandler(sys.stdout)
            fmt_str = "[%(asctime)s.%(msecs)03d %(name)s %(levelname)s] " \
                      "%(message)s"
            console_handler.setFormatter(
                logging.Formatter(fmt_str, "%Y-%m-%d %H:%M:%S"))
            log.addHandler(console_handler)
            log.setLevel(logging.DEBUG)
            log.debug('Defaulting to stdout')
    return log

log = logger_init(None)

# Class for keeping configuration parameters
# ------------------------------------------


class AlarmConfig():
    """
    Class used to store parameters
    """
    def __init__(self, code_dest_dir, config_dest_dir,
                 source_file, template, config_template):
        self._code_dest_dir = code_dest_dir
        self._config_dest_dir = config_dest_dir
        self._source_file = source_file
        self._template = template
        self._config_template = config_template
        self._sha256 = None

# Class for processing inotify events
# -----------------------------------


class InotifyEventsHandler(pyinotify.ProcessEvent):
    """
    Class used to process inotify events.
    """
    def my_init(self, cfg, name, out=None):
        """
        @param cfg: configuration to use for generation callback.
        @type cfg: AlarmConfig.
        @param name: File name to be watched.
        @type name: String.
        @param out: Logger where events will be written.
        @type out: Object providing a valid logging object interface.
        """
        if out is None:
            out = log
        self._out = out
        self._cfg = cfg
        self._name = name

    def process_default(self, event):
        """
        Writes event string representation to logging object provided to
        my_init().

        @param event: Event to be processed. Can be of any type of events but
                      IN_Q_OVERFLOW events (see method process_IN_Q_OVERFLOW).
        @type event: Event instance
        """
        self._out.debug(
            'Received event %s'
            % str(event))
        # File name on which inotify event has been triggered does
        # not match => return right away
        if event.name != self._name:
            self._out.debug(
                'Ignoring event %s (path does not match %s)'
                % (str(event), self._name))
            return
        self._out.info('File %s has been updated' % event.name)
        # Callback function called with proper parameters
        if not yaml_alarms_2_lua_and_hindsight_cfg_files(
            self._cfg
        ):
            log.error('Error converting YAML alarms into LUA code')

# Check alarm entry for field existence and type
# TODO: see if we can use similar methods from
# fuel-ccp which uses jsonschema to validate types.
# -------------------------------------------------


def check_alarm_entry_field(alarm, field, ftype):
    try:
        akeys = alarm.keys()
        # Field lookup
        if field not in akeys:
            log.error(('Error parsing file: alarm entry does ' +
                       'not have a %s field: %s')
                      % (field, alarm))
            return False
        # Do we need to check for proper type too ?
        if ftype is not None:
            vfield = alarm[field]
            vftype = type(vfield)
            # Check for proper type
            if vftype is not ftype:
                log.error(('Error parsing file: alarm entry ' +
                           'field %s is not of type %s: found %s [%s]')
                          % (field, ftype.__name__, vftype.__name__, alarm))
                return False
    except Exception as e:
        log.error('Error checking for %s: %s' % (field, e))
        return False
    return True

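The TODO in check_alarm_entry_field() mentions moving to declarative, jsonschema-style validation. As a hedged, dependency-free sketch of that idea (the schema fields below are illustrative, not the actual fuel-ccp schema): describe the expected shape of an alarm entry once, then check every field in a single pass and collect all errors instead of failing on the first one.

```python
# Illustrative schema: field name -> expected type. The fields shown
# ('name', 'description') are assumptions for the example, not the real
# fuel-ccp alarm schema.
ALARM_SCHEMA = {
    'name': str,
    'description': str,
}


def check_entry(entry, schema):
    """Return a list of error strings; an empty list means the entry is valid."""
    errors = []
    for field, ftype in schema.items():
        if field not in entry:
            errors.append('missing field: %s' % field)
        elif not isinstance(entry[field], ftype):
            errors.append('field %s is not of type %s'
                          % (field, ftype.__name__))
    return errors


good = {'name': 'cpu-high', 'description': 'CPU usage above threshold'}
bad = {'name': 42}
print(check_entry(good, ALARM_SCHEMA))  # []
print(check_entry(bad, ALARM_SCHEMA))   # two errors: wrong type + missing field
```

Collecting all errors at once also addresses the second TODO in validate_yaml() below ("do not return false right away when processing lists").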
# YAML alarms structure validation
#
# TODO: see if we can use similar methods from
# fuel-ccp which uses jsonschema to validate types.
#
# TODO: do not return false right away
# when processing lists so that most errors
# are reported at once allowing for faster
# achievement of correctness
# -------------------------------------------------


def validate_yaml(alarms_yaml):
    log.info('Validating YAML alarms structure')
    ctx = ''
    try:
        log.debug('Retrieving all alarms')
        # Try to retrieve alarms definitions
        # and check for overall validity
        alarms = alarms_yaml['alarms']
        if alarms is None:
            log.error('Error parsing file: empty alarm list')
            return False
        # alarms entry should be a list
        atype = type(alarms)
        if atype is not list:
            log.error('Error parsing file: alarms entry is not a list (%s)'
                      % atype.__name__)
            return False
        # Keep the complete list of alarm names
        anames = []
        # Checking all alarms
        for alarm in alarms:
            if not check_alarm_entry_field(alarm, 'name', str):
                return False
            # TODO do we need to add some more checks here ?
            anames.append(alarm['name'])
        log.debug('Found %d alarms' % len(anames))
        # Try to retrieve alarms groups definitions
        # and check overall validity
        log.debug('Retrieving alarms groups')
        cluster_alarms = alarms_yaml['node_cluster_alarms']
        ckeys = cluster_alarms.keys()
        for ckey in ckeys:
            log.debug('Parsing alarms group %s' % ckey)
            ctx = ' under node_cluster_alarms[%s]' % ckey
            # Are there some alarm key defined
            # (if not, the next line throws exception)
            c_alarms = cluster_alarms[ckey]['alarms']
            if c_alarms is None:
                log.error('Error parsing file: empty alarm list%s' % ctx)
                return False
            # Now check validity of alarm entries
            akeys = c_alarms.keys()
            log.debug('Found %d alarms in group %s' % (len(akeys), ckey))
            for k in akeys:
                # Must be a list
                v = c_alarms[k]
                ktype = type(v)
                if ktype is not list:
                    log.error(('Error parsing file: alarm entry for %s ' +
                               'is not a list (%s)%s')
                              % (k, ktype.__name__, ctx))
                    return False
                # Each member of list must be a string
                for s in v:
                    stype = type(s)
                    if stype is not str:
                        log.error(('Error parsing file: alarm entry for %s ' +
                                   'is not a list of strings (%s) [%s]%s')
                                  % (k, stype.__name__, s, ctx))
                        return False
            # Now check that all alarm referenced in
            # alarm groups have been defined
            for agroup in c_alarms:
                for aname in c_alarms[agroup]:
                    if aname not in anames:
                        log.error(
                            ('Error parsing file: alarm with name %s is not ' +
                             'defined but is referenced in alarm group %s')
                            % (aname, agroup))
                        return False
    except KeyError as e:
        log.error('Error parsing file: can not find %s key%s' % (e, ctx))
        return False
    except Exception as e:
        log.error('Error parsing file: unknown exception %s %s'
                  % (type(e), str(e)))
        return False
    return True

# Retrieve alarm by its name within list
# --------------------------------------


def find_alarm_by_name(aname, alarms):
    for alarm in alarms:
        if alarm['name'] == aname:
            return alarm
    return None


# Check file for content changes and returns boolean
# True => content has changed
# False => content is unchanged
#
# File path can be altered using string substitutions
# so as to adapt to Hindsight current running state
# when files are moved around once taken into account
# ---------------------------------------------------


def content_changed(file_fullpath, file_content, replace=None):
    log.debug(
        'Checking file %s for changes'
        % file_fullpath)
    fullpath = file_fullpath
    # Do we need to replace some parts of the path
    if replace is not None:
        for k in replace.keys():
            fullpath = fullpath.replace(k, replace[k])
        log.debug(
            'Checking file path %s for changes'
            % fullpath)
    # File does not exist => needs to be created therefore
    # content has changed
    if not os.path.isfile(fullpath):
        log.debug(
            'File %s does not exist'
            % fullpath)
        return True
    # Read the file content
    with open(fullpath, 'r') as in_fd:
        try:
            old_content = in_fd.read()
            # Compare former content to new one
            if old_content == file_content:
                log.debug(
                    'File %s content has not changed'
                    % fullpath)
                return False
        except Exception as e:
            log.error(
                'Error reading %s got exception: %s'
                % (fullpath, e))
            return True
    log.debug(
        'File %s content has changed'
        % fullpath)
    return True

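The point of content_changed() is to avoid rewriting files whose content is identical, so file mtimes (and anything watching them, such as Hindsight here) are not disturbed needlessly. A minimal, self-contained sketch of that write-only-if-changed pattern, with the path-substitution detail omitted and illustrative names:

```python
import os


def write_if_changed(path, content):
    """Write content to path; return True only if the file was (re)written."""
    if os.path.isfile(path):
        with open(path, 'r') as fd:
            if fd.read() == content:
                return False  # unchanged: skip the write, leave mtime alone
    with open(path, 'w') as fd:
        fd.write(content)
    return True
```

A second call with the same content is a no-op, which is exactly why the script pairs this check with the forced rewrite flag (updated_lua_code) when a related file did change.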
# Convert YAML file containing alarms into lua code
# and create Hindsight configuration files
# -------------------------------------------------


def yaml_alarms_2_lua_and_hindsight_cfg_files(
        alarm_config):
    (lua_code_dest_dir,
     lua_config_dest_dir,
     yaml_file,
     template,
     cfg_template) = (alarm_config._code_dest_dir,
                      alarm_config._config_dest_dir,
                      alarm_config._source_file,
                      alarm_config._template,
                      alarm_config._config_template)
    log.info(
        'Converting alarm YAML file %s to LUA code in %s and configs in %s'
        % (yaml_file, lua_code_dest_dir, lua_config_dest_dir))
    try:
        if os.stat(yaml_file).st_size == 0:
            log.error('File %s will not be parsed: size = 0' % yaml_file)
            return False
        # Open file and retrieve YAML structure if correctly formed
        with open(yaml_file, 'r') as in_fd:
            try:
                alarms_defs = in_fd.read()
                sha256sum = hashlib.sha256(alarms_defs).hexdigest()
                if sha256sum == alarm_config._sha256:
                    log.warning('No change detected in file: %s' % yaml_file)
                    return True
                alarm_config._sha256 = sha256sum
                alarms_yaml = yaml.load(alarms_defs)
            except yaml.YAMLError as exc:
                log.error('Error parsing file: %s' % exc)
                return False
        # Check overall validity of alarms definitions
        if not validate_yaml(alarms_yaml):
            log.error('Error validating alarms definitions')
            return False
        # Now retrieve the information for config and code files generation
        cluster_alarms = alarms_yaml['node_cluster_alarms']
        for afd_cluster_name in cluster_alarms:
            for key in cluster_alarms[afd_cluster_name]['alarms'].keys():
                # Key can not contain dash or other non letter/numbers
                if not re.match('^[A-Za-z0-9]*$', key):
                    log.error('Alarm group name can only contain letters ' +
                              'and digits: %s'
                              % key)
                    return False
                # Build list of associated alarms
                alarms = []
                for aname in cluster_alarms[afd_cluster_name]['alarms'][key]:
                    alarms.append(
                        find_alarm_by_name(
                            aname, alarms_yaml['alarms']))
                # Write LUA code file
                afd_file = 'afd_node_%s_%s_alarms' % (afd_cluster_name, key)
                lua_code_dest_file = os.path.join(
                    lua_code_dest_dir, "%s.lua" % afd_file)
                lua_code = template.render(alarms=alarms)
                updated_lua_code = False
                # Check if the generated code has changed
                if not content_changed(lua_code_dest_file, lua_code):
                    log.info('Unchanged LUA file %s' % lua_code_dest_file)
                else:
                    # LUA code changes should force re-generation of config
                    # file so as to force Hindsight to take changes into
                    # account
                    updated_lua_code = True
                    log.info('Writing LUA file: %s' % lua_code_dest_file)
                    # Produce LUA code file corresponding to alarm
                    with open(lua_code_dest_file, 'w') as out_fd:
                        try:
                            out_fd.write(lua_code)
                        except Exception as e:
                            log.error('Error writing %s: got exception: %s'
                                      % (lua_code_dest_file, e))
                            return False
                # Write LUA config file
                afd_file = 'afd_node_%s_%s_alarms' % (afd_cluster_name, key)
                lua_config_dest_file = os.path.join(
                    lua_config_dest_dir, "%s.cfg" % afd_file)
                lua_config = cfg_template.render(
                    afd_file=afd_file,
                    afd_cluster_name=afd_cluster_name,
                    afd_logical_name=key
                )
                # Check if the generated config has changed
                # or if we need to force config writing due to
                # changes in LUA code above
                #
                # Note that config is written into .../load/...
                # and moved to .../run/... by Hindsight
                if (
                        not content_changed(
                            lua_config_dest_file,
                            lua_config,
                            {'/load/': '/run/'}) and
                        not updated_lua_code):
                    log.info('Unchanged config file %s' % lua_config_dest_file)
                else:
                    log.info('Writing config file: %s' % lua_config_dest_file)
                    with open(lua_config_dest_file, 'w') as out_fd:
                        try:
                            out_fd.write(lua_config)
                        except Exception as e:
                            log.error('Error writing %s: got exception: %s'
                                      % (lua_config_dest_file, e))
                            return False
    except Exception as e:
        log.error('Error got exception: %s' % e)
        return False
    return True

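The function above short-circuits when the alarms file has not changed by remembering the SHA-256 of the last content it processed (the _sha256 attribute on AlarmConfig). The same idea in isolation, as a minimal sketch with illustrative names (it hashes bytes explicitly, which also makes it Python 3 safe, whereas the original hashes the raw Python 2 str):

```python
import hashlib


class ChangeTracker(object):
    """Remember the digest of the last content seen and detect repeats."""

    def __init__(self):
        self._sha256 = None

    def changed(self, data):
        """Return True the first time data is seen, False on exact repeats."""
        digest = hashlib.sha256(data.encode('utf-8')).hexdigest()
        if digest == self._sha256:
            return False  # same content as last time: skip reprocessing
        self._sha256 = digest
        return True
```

Hashing the whole file is cheap compared to re-rendering every template, which is why the script checks the digest before even parsing the YAML.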
# Command line argument parsing
# -----------------------------


def cmd_line_args_parser():
    parser = argparse.ArgumentParser(
        description="""Alarm manager watches for new alarms definitions
        in specified directory and applies them TBC ...
        """
    )
    parser.add_argument(
        '-c', '--config',
        help='log level and format configuration file (default %s)'
             % dflt_config,
        default=dflt_config,
        dest='config'
    )
    parser.add_argument(
        '-d', '--code-destdir',
        help='destination path for LUA plugins code files ' +
             '(default %s)' % dflt_dest_dir,
        default=dflt_dest_dir,
        dest='code_dest_dir'
    )
    parser.add_argument(
        '-D', '--config-destdir',
        help='destination path for LUA plugins configuration ' +
             'files (default %s)' % dflt_cfg_dest_dir,
        default=dflt_cfg_dest_dir,
        dest='config_dest_dir'
    )
    parser.add_argument(
        '-t', '--template',
        help='LUA template file (default %s)' % dflt_template,
        default=dflt_template,
        dest='template'
    )
    parser.add_argument(
        '-T', '--config-template',
        help='LUA plugins configuration template file (default %s)' %
             (dflt_cfg_template),
        default=dflt_cfg_template,
        dest='cfg_template'
    )
    parser.add_argument(
        '-w', '--watch-path',
        help='path to watch for changes (default %s)' %
             (dflt_cfg_dir),
        default=dflt_cfg_dir,
        dest='watch_path'
    )
    parser.add_argument(
        '-x', '--exit',
        help='exit program without watching filesystem changes',
        action='store_const',
        const=True, default=False,
        dest='exit'
    )
    args = parser.parse_args()
    log = logger_init(args.config)
    log.info('Watch path: %s\n\tConfig: %s\n\tTemplate: %s'
             % (args.watch_path, args.config, args.template))

    if (
            not os.path.isdir(args.watch_path) or
            not os.access(args.watch_path, os.R_OK)):
        log.error("{} not a directory or is not readable"
                  .format(args.watch_path))
        sys.exit(1)

    if (
            not os.path.isdir(args.code_dest_dir) or
            not os.access(args.code_dest_dir, os.W_OK)):
        log.error("{} not a directory or is not writable"
                  .format(args.code_dest_dir))
        sys.exit(1)

    if (
            not os.path.isdir(args.config_dest_dir) or
            not os.access(args.config_dest_dir, os.W_OK)):
        log.error("{} not a directory or is not writable"
                  .format(args.config_dest_dir))
        sys.exit(1)

    if (
            not os.path.isfile(args.template) or
            not os.access(args.template, os.R_OK)):
        log.error("{} not a file or is not readable".format(args.template))
        sys.exit(1)

    if (
            not os.path.isfile(args.cfg_template) or
            not os.access(args.cfg_template, os.R_OK)):
        log.error("{} not a file or is not readable".format(args.cfg_template))
        sys.exit(1)

    src = os.path.join(args.watch_path, dflt_alarm_file)
    log.info('Looking for existing readable file: %s' % src)
    if os.access(src, os.R_OK):
        log.info('Using LUA template %s and LUA config template %s'
                 % (args.template, args.cfg_template))
        j2_env = jinja2.Environment(
            loader=jinja2.FileSystemLoader(
                os.path.dirname(
                    args.template)),
            trim_blocks=True)
        template = j2_env.get_template(
            os.path.basename(
                args.template))
        j2_cfg_env = jinja2.Environment(
            loader=jinja2.FileSystemLoader(
                os.path.dirname(
                    args.cfg_template)),
            trim_blocks=True)
        cfg_template = j2_cfg_env.get_template(
            os.path.basename(
                args.cfg_template))
        alarm_cfg = AlarmConfig(
            args.code_dest_dir,
            args.config_dest_dir,
            src,
            template,
            cfg_template)
        if not yaml_alarms_2_lua_and_hindsight_cfg_files(
            alarm_cfg
        ):
            log.error('Error converting YAML alarms into LUA code')

    # Asked to leave right away or continue watching inotify events ?
    if args.exit:
        sys.exit(0)

    # watch manager instance
    wm = pyinotify.WatchManager()
    # notifier instance and init
    notifier = pyinotify.Notifier(
        wm,
        default_proc_fun=InotifyEventsHandler(
            cfg=alarm_cfg,
            name=dflt_alarm_file))
    # What mask to apply
    mask = pyinotify.IN_CLOSE_WRITE
    log.debug('Start monitoring of %s' % args.watch_path)
    # Do not recursively dive into path
    # Do not add watches on newly created subdir in path
    # Do not do globbing on path name
    wm.add_watch(args.watch_path,
                 mask, rec=False,
                 auto_add=False,
|
||||
do_glob=False)
|
||||
# Loop forever (until sigint signal get caught)
|
||||
notifier.loop(callback=None)
|
||||
|
||||
if __name__ == '__main__':
|
||||
cmd_line_args_parser()
|
|
@ -1,22 +0,0 @@
[loggers]
keys=root

[handlers]
keys=stream_handler

[formatters]
keys=formatter

[logger_root]
level=DEBUG
handlers=stream_handler

[handler_stream_handler]
class=StreamHandler
level=DEBUG
formatter=formatter
args=(sys.stdout,)

[formatter_formatter]
format=%(asctime)s.%(msecs)03d - %(name)s - %(levelname)s - %(message)s
datefmt=%Y-%m-%d %H:%M:%S

@ -1,41 +0,0 @@
local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

local alarms = {
{% for alarm in alarms %}
  {
{% for fkey in alarm.keys()|sort() %}
{% if fkey != "trigger" %}
    ['{{ fkey }}'] = '{{ alarm[fkey] }}',
{% endif %}
{% endfor %}
{% if alarm.trigger is defined %}
    ['trigger'] = {
{% if alarm.trigger.logical_operator is defined %}
      ['logical_operator'] = '{{ alarm.trigger.logical_operator }}',
{% endif %}
      ['rules'] = {
{% for rule in alarm.trigger.rules %}
        {
{% for fkey in rule.keys()|sort() %}
{% if fkey != "fields" %}
          ['{{ fkey }}'] = '{{ rule[fkey] }}',
{% endif %}
{% endfor %}
{% if rule.fields is defined %}
          ['fields'] = {
{% for fkey in rule.fields.keys() %}
            ['{{ fkey }}'] = '{{ rule.fields[fkey] }}'
{% endfor %}
          },
{% endif %}
        },
{% endfor %}
      },
    },
{% endif %}
  },
{% endfor %}
}

return alarms

@ -1,6 +0,0 @@
# The order of packages is significant, because pip processes them in the order
# of appearance. Changing the order has an impact on the overall integration
# process, which may cause wedges in the gate later.
pyinotify>=0.9.6;sys_platform!='win32' and sys_platform!='darwin' and sys_platform!='sunos5' # MIT
PyYAML>=3.1.0 # MIT
Jinja2>=2.8 # BSD License (3 clause)

@ -1,26 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}

RUN apt-get -y -t jessie-backports --no-install-recommends install golang \
    && apt-get clean

# ReplaceMe with a heka package install
COPY install-heka.sh /tmp/
RUN mkdir -p /var/cache/hekad /usr/share/heka/lua_modules /etc/heka
RUN bash -x /tmp/install-heka.sh

# Add this to the heka package?
COPY plugins/modules /usr/share/heka/lua_modules/
COPY plugins/decoders /usr/share/heka/lua_decoders/
COPY plugins/encoders /usr/share/heka/lua_encoders/

RUN useradd --user-group heka \
    && usermod -a -G microservices heka \
    && chown -R heka: /usr/share/heka /etc/heka /var/cache/hekad

# https://github.com/mozilla-services/heka/issues/1881
ENV GODEBUG cgocheck=0

# We need to mount docker.sock for the docker plugin, and this socket needs
# docker group or root user permissions.
#USER heka

@ -1,4 +0,0 @@
%microservices ALL=(root) NOPASSWD: /bin/chown heka\:microservices /var/log/microservices, /usr/bin/chown heka\:microservices /var/log/microservices
%microservices ALL=(root) NOPASSWD: /bin/chmod 2775 /var/log/microservices, /usr/bin/chmod 2775 /var/log/microservices
%microservices ALL=(root) NOPASSWD: /bin/chown heka\: /var/cache/hekad, /usr/bin/chown heka\: /var/cache/hekad
%microservices ALL=(root) NOPASSWD: /bin/chown heka\:microservices /var/lib/microservices/heka, /usr/bin/chown heka\:microservices /var/lib/microservices/heka

@ -1,36 +0,0 @@
#!/bin/bash

set -e

PLUGINDIR="$1"
export GOPATH="/go"

mkdir -p "$GOPATH/src" "$GOPATH/bin"
chmod -R 777 "$GOPATH"

export PATH=/usr/local/go/bin:$GOPATH/bin/:$PATH

echo "Get system dependencies..."
BUILD_DEPS="git gcc g++ libc6-dev make cmake debhelper fakeroot patch"
apt-get update
apt-get install -y --no-install-recommends $BUILD_DEPS

echo "Get and build Heka..."
cd /tmp
git clone -b dev --single-branch https://github.com/mozilla-services/heka
cd heka
touch message/message.pb.go  # make sure message/message.pb.go has a date
                             # more recent than message/message.proto, to
                             # prevent make from attempting to re-generate
                             # message.pb.go
source build.sh  # changes GOPATH to /tmp/heka/build/heka and builds Heka
install -vD /tmp/heka/build/heka/bin/* /usr/local/bin/
cp -rp /tmp/heka/build/heka/lib/lib* /usr/lib/
cp -rp /tmp/heka/build/heka/lib/luasandbox/modules/* /usr/share/heka/lua_modules/

echo "Clean up..."
apt-get purge -y --auto-remove $BUILD_DEPS
apt-get clean
rm -rf /tmp/heka
rm -rf /var/lib/apt/lists/*
rm -rf $GOPATH

@ -1,100 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local l = require 'lpeg'
l.locale(l)

local dt = require "date_time"
local common_log_format = require 'common_log_format'
local patt = require 'os_patterns'
local utils = require 'os_utils'

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = nil,
    Payload = nil,
    Pid = nil,
    Fields = nil,
    Severity = nil,
}

local severity_label = utils.severity_to_label_map[msg.Severity]

local access_log_pattern = read_config("access_log_pattern") or error(
    "access_log_pattern configuration must be specified")
local access_log_grammar = common_log_format.build_apache_grammar(access_log_pattern)
local request_grammar = l.Ct(patt.http_request)

-- Since "common_log_format.build_apache_grammar" doesn't support ErrorLogFormat,
-- we have to build the error log grammar ourselves. Example error log line:
-- 2016-08-15 10:46:27.679999 wsgi:error 340:140359488239360 Not Found: /favicon.ico

local sp = patt.sp
local colon = patt.colon
local p_timestamp = l.Cg(l.Ct(dt.rfc3339_full_date * (sp + l.P"T") * dt.rfc3339_partial_time * (dt.rfc3339_time_offset + dt.timezone_offset)^-1), "Timestamp")
local p_module = l.Cg(l.R("az")^0, "Module")
local p_errtype = l.Cg(l.R("az")^0, "ErrorType")
local p_pid = l.Cg(l.digit^-5, "Pid")
local p_tid = l.Cg(l.digit^-15, "ThreadID")
local p_mess = l.Cg(patt.Message, "Message")
local error_log_grammar = l.Ct(p_timestamp * sp * p_module * colon * p_errtype * sp * p_pid * colon * p_tid * sp * p_mess)

function prepare_message (timestamp, pid, severity, severity_label, programname, payload)
    msg.Logger = 'openstack.horizon-apache'
    msg.Payload = payload
    msg.Timestamp = timestamp
    msg.Pid = pid
    msg.Severity = severity
    msg.Fields = {}
    msg.Fields.programname = programname
    msg.Fields.severity_label = severity_label
end

function process_message ()
    -- logger is either "horizon-access" or "horizon-error"
    local logger = read_message("Logger")
    local log = read_message("Payload")
    local m

    if logger == "horizon-access" then
        m = access_log_grammar:match(log)
        if m then
            prepare_message(m.Timestamp, m.Pid, "6", "INFO", logger, log)
            msg.Fields.http_status = m.status
            msg.Fields.http_response_time = m.request_time.value / 1e6 -- us to sec
            local request = m.request
            local r = request_grammar:match(request)
            if r then
                msg.Fields.http_method = r.http_method
                msg.Fields.http_url = r.http_url
                msg.Fields.http_version = r.http_version
            end
        else
            return -1, string.format("Failed to parse %s log: %s", logger, string.sub(log, 1, 64))
        end
    elseif logger == "horizon-error" then
        m = error_log_grammar:match(log)
        if m then
            prepare_message(m.Timestamp, m.Pid, "3", "ERROR", logger, m.Message)
        else
            return -1, string.format("Failed to parse %s log: %s", logger, string.sub(log, 1, 64))
        end
    else
        error("Logger unknown")
    end

    return utils.safe_inject_message(msg)
end

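The hand-built LPEG error log grammar in the decoder can be illustrated with a rough Python regex analogue. This is only an approximation for reference (the real parsing lives in the LPEG grammar; `parse_error_log` is a hypothetical helper, not part of the plugin):

```python
import re

# Rough regex analogue of the LPEG error_log_grammar, matching e.g.
# "2016-08-15 10:46:27.679999 wsgi:error 340:140359488239360 Not Found: /favicon.ico"
ERROR_LOG_RE = re.compile(
    r'(?P<timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?) '
    r'(?P<module>[a-z]+):(?P<errtype>[a-z]+) '      # e.g. "wsgi:error"
    r'(?P<pid>\d{1,5}):(?P<tid>\d{1,15}) '          # "pid:thread_id"
    r'(?P<message>.*)')


def parse_error_log(line):
    """Return the captured fields as a dict, or None when the line
    does not match (mirroring the decoder's return -1 branch)."""
    m = ERROR_LOG_RE.match(line)
    return m.groupdict() if m else None
```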
@ -1,72 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local l = require 'lpeg'
l.locale(l)

local common_log_format = require 'common_log_format'
local patt = require 'os_patterns'
local utils = require 'os_utils'

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = nil,
    Payload = nil,
    Pid = nil,
    Fields = nil,
    Severity = 6,
}

local severity_label = utils.severity_to_label_map[msg.Severity]

local apache_log_pattern = read_config("apache_log_pattern") or error(
    "apache_log_pattern configuration must be specified")
local apache_grammar = common_log_format.build_apache_grammar(apache_log_pattern)
local request_grammar = l.Ct(patt.http_request)

function process_message ()
    -- logger is either "keystone-apache-public" or "keystone-apache-admin"
    local logger = read_message("Logger")
    local log = read_message("Payload")

    local m = apache_grammar:match(log)
    if m then
        msg.Logger = 'openstack.keystone-apache'
        msg.Payload = log
        msg.Timestamp = m.time

        msg.Fields = {}
        msg.Fields.http_status = m.status
        msg.Fields.http_response_time = m.request_time.value / 1e6 -- us to sec
        msg.Fields.programname = logger
        msg.Fields.severity_label = severity_label

        local request = m.request
        m = request_grammar:match(request)
        if m then
            msg.Fields.http_method = m.http_method
            msg.Fields.http_url = m.http_url
            msg.Fields.http_version = m.http_version
        end

        return utils.safe_inject_message(msg)
    end

    return -1, string.format("Failed to parse %s log: %s", logger, string.sub(log, 1, 64))
end

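Both Apache decoders split the request string into method, URL, and version via `patt.http_request`. The exact grammar lives in the `os_patterns` module; for illustration only, a regex sketch of the same split (the names below are assumptions, not the library's API):

```python
import re

# Illustrative analogue of patt.http_request: "<METHOD> <url> HTTP/<version>"
HTTP_REQUEST_RE = re.compile(
    r'(?P<http_method>[A-Z]+) (?P<http_url>\S+) HTTP/(?P<http_version>[\d.]+)')


def parse_request(request):
    """Split an Apache request string into the http_* fields the
    decoders attach to msg.Fields; None when it does not match."""
    m = HTTP_REQUEST_RE.match(request)
    return m.groupdict() if m else None
```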
@ -1,62 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
require "string"
local dt = require "date_time"
local l = require 'lpeg'
l.locale(l)

local patt = require 'os_patterns'
local utils = require 'os_utils'

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = nil,
    Payload = nil,
    Pid = nil,
    Fields = {
        programname = 'mysql',
        severity_label = nil,
    },
    Severity = nil,
}

-- mysqld logs are cranky: the hours have no leading zero and the "real"
-- severity level is enclosed in square brackets...
-- 2016-07-28 11:09:24 139949080807168 [Note] InnoDB: Dumping buffer pool(s) not yet started

-- Different pieces of the pattern
local sp = patt.sp
local colon = patt.colon
local p_timestamp = l.Cg(dt.rfc3339_full_date * sp^1 * dt.rfc3339_partial_time, "Timestamp")
local p_thread_id = l.digit^-15
local p_severity_label = l.P"[" * l.Cg(l.R("az", "AZ")^0 / string.upper, "SeverityLabel") * l.P"]"
local p_message = l.Cg(patt.Message, "Message")

local mysql_grammar = l.Ct(p_timestamp * sp^1 * p_thread_id * sp^1 * p_severity_label * sp^1 * p_message)


function process_message ()
    local log = read_message("Payload")
    local logger = read_message("Logger")

    local m = mysql_grammar:match(log)
    if not m then return -1 end

    msg.Timestamp = m.Timestamp
    msg.Logger = logger
    msg.Payload = m.Message
    msg.Fields.severity_label = m.SeverityLabel

    return utils.safe_inject_message(msg)
end

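The mysqld grammar above (timestamp, thread id, bracketed severity, message) can be sketched as a Python regex for reference. This is an illustration of the line shape documented in the decoder's comment, not the plugin's actual implementation:

```python
import re

# Rough analogue of mysql_grammar, matching e.g.
# "2016-07-28 11:09:24 139949080807168 [Note] InnoDB: Dumping buffer pool(s) not yet started"
MYSQL_LOG_RE = re.compile(
    r'(?P<timestamp>\d{4}-\d{2}-\d{2} +\d{1,2}:\d{2}:\d{2}) +'  # hour may lack a leading zero
    r'(?P<thread_id>\d{1,15}) +'
    r'\[(?P<severity>[A-Za-z]+)\] +'
    r'(?P<message>.*)')


def parse_mysql_log(line):
    m = MYSQL_LOG_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    d['severity'] = d['severity'].upper()  # the grammar upper-cases the label
    return d
```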
@ -1,164 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
require "string"
require "table"
local l = require 'lpeg'
l.locale(l)

local patt = require 'os_patterns'
local utils = require 'os_utils'

local service_pattern = read_config("heka_service_pattern") or
    error('heka_service_pattern must be specified')

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = nil,
    Payload = nil,
    Pid = nil,
    Fields = nil,
    Severity = nil,
}

-- traceback_lines is a reference to a table used to accumulate lines of
-- a Traceback. traceback_key represents the key of the Traceback lines
-- being accumulated in traceback_lines. It is used to know when to
-- stop accumulating and inject the Heka message.
local traceback_key = nil
local traceback_lines = nil

function prepare_message (service, timestamp, pid, severity_label,
                          python_module, programname, cont_name, payload)
    msg.Logger = 'openstack.' .. service
    msg.Timestamp = timestamp
    msg.Payload = payload
    msg.Pid = pid
    msg.Severity = utils.label_to_severity_map[severity_label] or 7
    msg.Fields = {}
    msg.Fields.severity_label = severity_label
    msg.Fields.python_module = python_module
    msg.Fields.programname = programname
    msg.Fields.container_name = cont_name
end

-- OpenStack log messages are of this form:
-- 2015-11-30 08:38:59.306 3434 INFO oslo_service.periodic_task [-] Blabla...
--
-- [-] is the "request" part; it can take multiple forms.

function process_message ()
    local cont_name = read_message("Fields[ContainerName]")
    local program = string.match(cont_name, service_pattern)
    local service = nil

    if program == nil then
        program = "unknown_program"
    else
        service = string.match(program, '(.-)%-.*')
        --- Most OpenStack services match the pattern, e.g. "nova-api",
        --- but some of them don't, e.g. "keystone".
        if service == nil then
            service = string.match(program, '(%a+)')
        end
    end

    --- If service is still nil, we failed to match the current service
    --- with both patterns, so we fall back to a default name.
    if service == nil then
        service = "unknown_service"
    end

    local log = read_message("Payload")
    local m

    m = patt.openstack:match(log)
    if not m then
        return -1 --, string.format("Failed to parse %s log: %s", logger, string.sub(log, 1, 64))
    end

    -- You could debug something here using this:
    -- add_to_payload(string.format("Debug: %s\n", VAR))
    -- inject_payload("txt", "debug")

    local key = {
        Timestamp = m.Timestamp,
        Pid = m.Pid,
        SeverityLabel = m.SeverityLabel,
        PythonModule = m.PythonModule,
        service = service,
        program = program,
    }

    if traceback_key ~= nil then
        -- If traceback_key is not nil it means we have started accumulating
        -- lines of a Python traceback. We keep accumulating the traceback
        -- lines until we get a different log key.
        if utils.table_equal(traceback_key, key) then
            table.insert(traceback_lines, m.Message)
            return 0
        else
            prepare_message(traceback_key.service, traceback_key.Timestamp,
                            traceback_key.Pid, traceback_key.SeverityLabel,
                            traceback_key.PythonModule, traceback_key.program,
                            cont_name, table.concat(traceback_lines, ''))
            traceback_key = nil
            traceback_lines = nil
            -- Ignore the safe_inject_message status code here to still get a
            -- chance to inject the current log message.
            utils.safe_inject_message(msg)
        end
    end

    if patt.traceback:match(m.Message) then
        -- Python traceback detected: begin accumulating the lines making
        -- up the traceback.
        traceback_key = key
        traceback_lines = {}
        table.insert(traceback_lines, m.Message)
        return 0
    end

    prepare_message(service, m.Timestamp, m.Pid, m.SeverityLabel, m.PythonModule,
                    program, cont_name, m.Message)

    m = patt.openstack_request_context:match(msg.Payload)
    if m then
        msg.Fields.request_id = m.RequestId
        if m.UserId then
            msg.Fields.user_id = m.UserId
        end
        if m.TenantId then
            msg.Fields.tenant_id = m.TenantId
        end
    end

    m = patt.openstack_http:match(msg.Payload)
    if m then
        msg.Fields.http_method = m.http_method
        msg.Fields.http_status = m.http_status
        msg.Fields.http_url = m.http_url
        msg.Fields.http_version = m.http_version
        msg.Fields.http_response_size = m.http_response_size
        msg.Fields.http_response_time = m.http_response_time
        m = patt.ip_address:match(msg.Payload)
        if m then
            msg.Fields.http_client_ip_address = m.ip_address
        end
    end

    return utils.safe_inject_message(msg)
end

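The OpenStack log line format documented in the decoder's comment (timestamp, pid, severity, Python module, then the request part and message) can be sketched in Python for reference. An illustrative approximation of `patt.openstack` only, not the actual LPEG grammar:

```python
import re

# Rough analogue of patt.openstack, matching e.g.
# "2015-11-30 08:38:59.306 3434 INFO oslo_service.periodic_task [-] Blabla..."
OPENSTACK_LOG_RE = re.compile(
    r'(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) '
    r'(?P<pid>\d+) (?P<severity>[A-Z]+) (?P<python_module>\S+) '
    r'(?P<message>.*)')  # message still contains the "[-]" request part


def parse_openstack_log(line):
    m = OPENSTACK_LOG_RE.match(line)
    return m.groupdict() if m else None
```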
@ -1,87 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
require "string"
local l = require 'lpeg'
l.locale(l)
local dt = require "date_time"

local patt = require 'os_patterns'
local utils = require 'os_utils'

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = nil,
    Payload = nil,
    Fields = {},
    Severity = nil,
}

-- ovs logs look like this:
-- 2016-08-10T09:27:41Z|00038|connmgr|INFO|br-ex<->tcp:127.0.0.1:6633: 2 flow_mods 10 s ago (2 adds)

-- Different pieces of the pattern
local sp = patt.sp
local colon = patt.colon
local pipe = patt.pipe
local dash = patt.dash

local p_timestamp = l.Cg(l.Ct(dt.rfc3339_full_date * (sp + l.P"T") * dt.rfc3339_partial_time * (dt.rfc3339_time_offset + dt.timezone_offset)^-1), "Timestamp")
local p_id = l.Cg(l.digit^-5, "Message_ID")
local p_module = l.Cg(l.R("az")^0, "Module")
local p_severity_label = l.Cg(l.R("AZ")^0, "SeverityLabel")
local p_message = l.Cg(patt.Message, "Message")

local ovs_grammar = l.Ct(p_timestamp * pipe * p_id * pipe * p_module * pipe * p_severity_label * pipe * p_message)
local pattern = read_config("heka_service_pattern") or "^k8s_(.-)%..*"


function process_message ()
    local cont_name = read_message("Fields[ContainerName]")
    local program = string.match(cont_name, pattern)
    local service = nil

    if program == nil then
        program = "unknown_program"
    else
        service = string.match(program, '(.-)%-.*')
    end

    --- If service is still nil, we failed to match the current service
    --- with the patterns above, so we fall back to a default name.
    if service == nil then
        service = "unknown_service"
    end

    local log = read_message("Payload")

    local m = ovs_grammar:match(log)
    if not m then return -1 end

    if m.SeverityLabel == "WARN" then
        m.SeverityLabel = "WARNING"
    end

    msg.Timestamp = m.Timestamp
    msg.Logger = service
    msg.Payload = m.Message
    msg.Severity = utils.label_to_severity_map[m.SeverityLabel] or 7
    msg.Fields.module = m.Module
    msg.Fields.message_id = m.Message_ID
    msg.Fields.programname = program
    msg.Fields.container_name = cont_name
    msg.Fields.severity_label = m.SeverityLabel

    return utils.safe_inject_message(msg)
end

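Because OVS log fields are pipe-delimited, the grammar above amounts to a five-way split. A minimal Python sketch of the same decomposition, including the decoder's WARN-to-WARNING normalization (illustrative only, with assumed field names):

```python
def parse_ovs_log(line):
    """Split an OVS log line into its pipe-delimited fields, e.g.
    "2016-08-10T09:27:41Z|00038|connmgr|INFO|br-ex<->...: 2 flow_mods ..."
    Returns None when the line does not have the expected shape."""
    parts = line.split('|', 4)
    if len(parts) != 5:
        return None
    timestamp, message_id, module, severity, message = parts
    if severity == 'WARN':  # normalize like the decoder does
        severity = 'WARNING'
    return {'timestamp': timestamp, 'message_id': message_id,
            'module': module, 'severity': severity, 'message': message}
```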
@ -1,73 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
local dt = require "date_time"
local l = require 'lpeg'
l.locale(l)

local patt = require 'os_patterns'
local utils = require 'os_utils'

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = nil,
    Payload = nil,
    Pid = nil,
    Fields = {
        programname = 'rabbitmq',
        severity_label = nil,
    },
    Severity = nil,
}

-- RabbitMQ message logs are formatted like this:
-- =ERROR REPORT==== 2-Jan-2015::09:17:22 ===
-- Blabla
-- Blabla
--
local message = l.Cg(patt.Message / utils.chomp, "Message")
-- The token before 'REPORT' isn't standardized, so it can be a valid severity
-- level such as 'INFO' or 'ERROR', but also 'CRASH' or 'SUPERVISOR'.
local severity = l.Cg(l.R"AZ"^1, "SeverityLabel")
local day = l.R"13" * l.R"09" + l.R"19"
local datetime = l.Cg(day, "day") * patt.dash * dt.date_mabbr * patt.dash * dt.date_fullyear *
    "::" * dt.rfc3339_partial_time
local timestamp = l.Cg(l.Ct(datetime) / dt.time_to_ns, "Timestamp")

local grammar = l.Ct("=" * severity * " REPORT==== " * timestamp * " ===" * l.P'\n' * message)

function process_message ()
    local log = read_message("Payload")

    local m = grammar:match(log)
    if not m then
        return -1
    end

    msg.Timestamp = m.Timestamp
    msg.Payload = m.Message
    msg.Logger = read_message("Logger")

    if utils.label_to_severity_map[m.SeverityLabel] then
        msg.Severity = utils.label_to_severity_map[m.SeverityLabel]
    elseif m.SeverityLabel == 'CRASH' then
        msg.Severity = 2 -- CRITICAL
    else
        msg.Severity = 5 -- NOTICE
    end

    msg.Fields.severity_label = utils.severity_to_label_map[msg.Severity]

    return utils.safe_inject_message(msg)
end

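The RabbitMQ report header parsed by the grammar above can be sketched in Python, including its unusual `2-Jan-2015::09:17:22` timestamp (day without a leading zero). An illustrative analogue only; `parse_report_header` is a hypothetical helper:

```python
import re
from datetime import datetime

# Rough analogue of the report-header grammar, matching e.g.
# "=ERROR REPORT==== 2-Jan-2015::09:17:22 ==="
RABBIT_HEADER_RE = re.compile(
    r'=(?P<severity>[A-Z]+) REPORT==== '
    r'(?P<datetime>\d{1,2}-[A-Za-z]{3}-\d{4}::\d{2}:\d{2}:\d{2}) ===')


def parse_report_header(line):
    """Return (severity_token, datetime) or None; the severity token may be
    a non-standard label like CRASH or SUPERVISOR, as the decoder notes."""
    m = RABBIT_HEADER_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m.group('datetime'), '%d-%b-%Y::%H:%M:%S')
    return m.group('severity'), ts
```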
@ -1,49 +0,0 @@
-- Copyright 2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
--
-- The code in this file was inspired by Heka's rsyslog.lua decoder plugin.
-- https://github.com/mozilla-services/heka/blob/master/sandbox/lua/decoders/rsyslog.lua

local syslog = require "syslog"
local utils = require "os_utils"

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = read_config("hostname"),
    Payload = nil,
    Pid = nil,
    Severity = nil,
    Fields = nil
}

-- See https://github.com/openstack/swift/blob/2a8b455/swift/common/utils.py#L1423-L1424
local swift_grammar = syslog.build_rsyslog_grammar('<%PRI%>%programname%: %msg%')

function process_message ()
    local log = read_message("Payload")

    local fields = swift_grammar:match(log)
    if not fields then return -1 end

    msg.Severity = fields.pri.severity
    fields.syslogfacility = fields.pri.facility
    fields.pri = nil

    msg.Payload = fields.msg
    fields.msg = nil

    msg.Fields = fields
    return utils.safe_inject_message(msg)
end

@ -1,55 +0,0 @@
-- Copyright 2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
-- http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
--
-- The code in this file was inspired by Heka's rsyslog.lua decoder plugin.
-- https://github.com/mozilla-services/heka/blob/master/sandbox/lua/decoders/rsyslog.lua

local syslog = require "syslog"
local utils = require "os_utils"

local msg = {
    Timestamp = nil,
    Type = 'log',
    Hostname = read_config("hostname"),
    Payload = nil,
    Pid = nil,
    Severity = nil,
    Fields = nil
}

-- See https://tools.ietf.org/html/rfc3164
local grammar = syslog.build_rsyslog_grammar('<%PRI%>%TIMESTAMP% %syslogtag% %msg%')

function process_message ()
    local log = read_message("Payload")
    local fields = grammar:match(log)
    if not fields then return -1 end

    msg.Timestamp = fields.timestamp
    fields.timestamp = nil

    msg.Severity = fields.pri.severity
    fields.syslogfacility = fields.pri.facility
    fields.pri = nil

    fields.programname = fields.syslogtag.programname
    msg.Pid = fields.syslogtag.pid
    fields.syslogtag = nil

    msg.Payload = fields.msg
    fields.msg = nil

    msg.Fields = fields
    return utils.safe_inject_message(msg)
end

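Both syslog decoders split the `%PRI%` token into a facility and a severity. Per RFC 3164, PRI is `facility * 8 + severity`, so the decomposition is just a shift and a mask. A one-liner sketch of that arithmetic:

```python
def split_pri(pri):
    """Decompose an RFC 3164 PRI value into (facility, severity):
    PRI = facility * 8 + severity, so facility = PRI >> 3 and
    severity = PRI & 7."""
    return pri >> 3, pri & 7
```

For example, `<165>` (local4.notice) decomposes to facility 20 and severity 5.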
@ -1,26 +0,0 @@
|
|||
-- Copyright 2016 Mirantis, Inc.
|
||||
--
|
||||
-- Licensed under the Apache License, Version 2.0 (the "License");
|
||||
-- you may not use this file except in compliance with the License.
|
||||
-- You may obtain a copy of the License at
|
||||
--
|
||||
-- http://www.apache.org/licenses/LICENSE-2.0
|
||||
--
|
||||
-- Unless required by applicable law or agreed to in writing, software
|
||||
-- distributed under the License is distributed on an "AS IS" BASIS,
|
||||
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
|
||||
-- See the License for the specific language governing permissions and
|
||||
-- limitations under the License.
|
||||
require "string"
|
||||
|
||||
local interpolate = require "msg_interpolate"
|
||||
local utils = require "os_utils"
|
||||
|
||||
local header_template = "<%{Severity}>%{%FT%TZ} %{Hostname} %{programname}[%{Pid}]:"
|
||||
|
||||
function process_message()
|
||||
local timestamp = read_message("Timestamp") / 1e9
|
||||
local header = interpolate.interpolate_from_msg(header_template, timestamp)
|
||||
local payload = string.format("%s %s\n", header, read_message("Payload"))
|
||||
return utils.safe_inject_payload("txt", "", payload)
|
||||
end

@@ -1,145 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
local table = require 'table'
local dt = require "date_time"
local l = require 'lpeg'
l.locale(l)

local tonumber = tonumber

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

function format_uuid(t)
    return table.concat(t, '-')
end

function anywhere (patt)
    return l.P {
        patt + 1 * l.V(1)
    }
end

sp = l.space
colon = l.P":"
dash = l.P"-"
dot = l.P'.'
quote = l.P'"'
pipe = l.P'|'

local x4digit = l.xdigit * l.xdigit * l.xdigit * l.xdigit
local uuid_dash = l.C(x4digit * x4digit * dash * x4digit * dash * x4digit * dash * x4digit * dash * x4digit * x4digit * x4digit)
local uuid_nodash = l.Ct(l.C(x4digit * x4digit) * l.C(x4digit) * l.C(x4digit) * l.C(x4digit) * l.C(x4digit * x4digit * x4digit)) / format_uuid

-- Return a UUID string in canonical format (e.g. with dashes)
Uuid = uuid_nodash + uuid_dash

-- Parse a datetime string and return a table with the following keys
--   year (string)
--   month (string)
--   day (string)
--   hour (string)
--   min (string)
--   sec (string)
--   sec_frac (number less than 1, can be nil)
--   offset_sign ('-' or '+', can be nil)
--   offset_hour (number, can be nil)
--   offset_min (number, can be nil)
--
-- The datetime string can be formatted as
-- 'YYYY-MM-DD( |T)HH:MM:SS(.ssssss)?(offset indicator)?'
TimestampTable = l.Ct(dt.rfc3339_full_date * (sp + l.P"T") * dt.rfc3339_partial_time * (dt.rfc3339_time_offset + dt.timezone_offset)^-1)

-- Returns the parsed datetime converted to nanoseconds
Timestamp = TimestampTable / dt.time_to_ns

programname = (l.R("az", "AZ", "09") + l.P"." + dash + l.P"_")^1
Pid = l.digit^1
SeverityLabel = l.P"CRITICAL" + l.P"ERROR" + l.P"WARNING" + l.P"INFO" + l.P"AUDIT" + l.P"DEBUG"
Message = l.P(1)^0

-- Capture for OpenStack logs producing five values: Timestamp, Pid,
-- SeverityLabel, PythonModule and Message.
--
-- OpenStack log messages are of this form:
--   2015-11-30 08:38:59.306 3434 INFO oslo_service.periodic_task [-] Blabla...
--
-- [-] is the "request" part; it can take multiple forms. See below.
openstack = l.Ct(l.Cg(Timestamp, "Timestamp") * sp * l.Cg(Pid, "Pid") * sp *
    l.Cg(SeverityLabel, "SeverityLabel") * sp * l.Cg(programname, "PythonModule") *
    sp * l.Cg(Message, "Message"))

-- Capture for the OpenStack request context producing three values: RequestId,
-- UserId and TenantId.
--
-- Notes:
--
-- OpenStack logs include a request context, enclosed between square brackets.
-- It takes one of these forms:
--
--   [-]
--   [req-0fd2a9ba-448d-40f5-995e-33e32ac5a6ba - - - - -]
--   [req-4db318af-54c9-466d-b365-fe17fe4adeed 8206d40abcc3452d8a9c1ea629b4a8d0 112245730b1f4858ab62e3673e1ee9e2 - - -]
--
-- In the 1st case the capture produces nil.
-- In the 2nd case the capture produces one value: RequestId.
-- In the 3rd case the capture produces three values: RequestId, UserId and TenantId.
--
-- The request id may be formatted as 'req-xxx' or 'xxx' depending on the project.
-- The user id and tenant id may not be present depending on the OpenStack release.
openstack_request_context = (l.P(1) - "[" )^0 * "[" * l.P"req-"^-1 *
    l.Ct(l.Cg(Uuid, "RequestId") * sp * ((l.Cg(Uuid, "UserId") * sp *
    l.Cg(Uuid, "TenantId")) + l.P(1)^0)) - "]"

local http_method = l.Cg(l.R"AZ"^3, "http_method")
local url = l.Cg((1 - sp)^1, "http_url")
local http_version = l.Cg(l.digit * dot * l.digit, "http_version")

-- Pattern for the "<http_method> <http_url> HTTP/<http_version>" format found
-- in both OpenStack and Apache log files.
-- Example: OPTIONS /example.com HTTP/1.0
http_request = http_method * sp * url * sp * l.P'HTTP/' * http_version

-- Patterns for HTTP status, HTTP response size and HTTP response time in
-- OpenStack logs.
--
-- Notes:
-- Nova changes the default log format of eventlet.wsgi (see nova/wsgi.py) and
-- prefixes the HTTP status, response size and response time values with
-- respectively "status: ", "len: " and "time: ".
-- Other OpenStack services just rely on the default log format.
-- TODO(pasquier-s): build the LPEG grammar based on the log_format parameter
-- passed to eventlet.wsgi.server similar to what the build_rsyslog_grammar
-- function does for RSyslog.
local openstack_http_status = l.P"status: "^-1 * l.Cg(l.digit^3, "http_status")
local openstack_response_size = l.P"len: "^-1 * l.Cg(l.digit^1 / tonumber, "http_response_size")
local openstack_response_time = l.P"time: "^-1 * l.Cg(l.digit^1 * dot^0 * l.digit^0 / tonumber, "http_response_time")

-- Capture for OpenStack HTTP producing six values: http_method, http_url,
-- http_version, http_status, http_response_size and http_response_time.
openstack_http = anywhere(l.Ct(
    quote * http_request * quote * sp *
    openstack_http_status * sp * openstack_response_size * sp *
    openstack_response_time
))

-- Capture for IP addresses producing one value: ip_address.
ip_address = anywhere(l.Ct(
    l.Cg(l.digit^-3 * dot * l.digit^-3 * dot * l.digit^-3 * dot * l.digit^-3, "ip_address")
))

-- Pattern used to match the beginning of a Python Traceback.
traceback = l.P'Traceback (most recent call last):'

return M
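The `openstack` capture pulls five named values out of a standard OpenStack log line. The equivalent decomposition, sketched here with a Python regex purely for illustration (the module itself is built on LPEG), using the example line from the comments above:

```python
import re

# Regex analogue (illustrative only) of the `openstack` LPEG capture:
# Timestamp, Pid, SeverityLabel, PythonModule and Message.
OPENSTACK_RE = re.compile(
    r'(?P<Timestamp>\d{4}-\d{2}-\d{2}[ T]\d{2}:\d{2}:\d{2}(?:\.\d+)?) '
    r'(?P<Pid>\d+) '
    r'(?P<SeverityLabel>CRITICAL|ERROR|WARNING|INFO|AUDIT|DEBUG) '
    r'(?P<PythonModule>[\w.-]+) '
    r'(?P<Message>.*)$'
)

def parse_openstack(line):
    m = OPENSTACK_RE.match(line)
    return m.groupdict() if m else None

rec = parse_openstack(
    '2015-11-30 08:38:59.306 3434 INFO oslo_service.periodic_task [-] Blabla...')
```

Note that, as in the LPEG version, the request context (`[-]`, `[req-...]`) stays inside Message and would be parsed separately, the way `openstack_request_context` does above.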

@@ -1,89 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.
local cjson = require 'cjson'
local string = require 'string'

local patt = require 'os_patterns'

local pairs = pairs
local inject_message = inject_message
local inject_payload = inject_payload
local read_message = read_message
local pcall = pcall

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

severity_to_label_map = {
    [0] = 'EMERGENCY',
    [1] = 'ALERT',
    [2] = 'CRITICAL',
    [3] = 'ERROR',
    [4] = 'WARNING',
    [5] = 'NOTICE',
    [6] = 'INFO',
    [7] = 'DEBUG',
}

label_to_severity_map = {
    EMERGENCY = 0,
    ALERT = 1,
    CRITICAL = 2,
    ERROR = 3,
    WARNING = 4,
    NOTICE = 5,
    INFO = 6,
    DEBUG = 7,
}

function chomp(s)
    return string.gsub(s, "\n$", "")
end

-- Call inject_message() wrapped by pcall()
function safe_inject_message(msg)
    local ok, err_msg = pcall(inject_message, msg)
    if not ok then
        return -1, err_msg
    else
        return 0
    end
end

-- Call inject_payload() wrapped by pcall()
function safe_inject_payload(payload_type, payload_name, data)
    local ok, err_msg = pcall(inject_payload, payload_type, payload_name, data)
    if not ok then
        return -1, err_msg
    else
        return 0
    end
end

-- Shallow comparison between two tables.
-- Return true if the two tables have the same keys with identical
-- values, otherwise false.
function table_equal(t1, t2)
    -- all key-value pairs in t1 must be in t2
    for k, v in pairs(t1) do
        if t2[k] ~= v then return false end
    end
    -- there must not be other keys in t2
    for k, v in pairs(t2) do
        if t1[k] == nil then return false end
    end
    return true
end

return M
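`table_equal` needs both passes because a one-way check would miss keys present only in `t2`. The same shallow, two-direction comparison written as a Python sketch (Lua tables have no built-in equality, which is why the module defines this at all):

```python
def table_equal(t1, t2):
    # all key-value pairs in t1 must be in t2
    for k, v in t1.items():
        if k not in t2 or t2[k] != v:
            return False
    # there must not be other keys in t2
    for k in t2:
        if k not in t1:
            return False
    return True
```

In Python, `t1 == t2` on dicts already does this; the explicit loops are shown only to mirror the Lua logic above.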

@@ -1,29 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}

# We use MOS packages for hindsight, lua_sandbox and lua_sandbox_extensions

COPY sources.mos.list /etc/apt/sources.list.d/
COPY mos.pref /etc/apt/preferences.d/
COPY bootstrap-hindsight.sh /opt/ccp/bin/

RUN apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys 1FA22B08 \
    && apt-get update \
    && apt-get install -y --no-install-recommends \
        hindsight \
        lua-sandbox-extensions \
    && cp /usr/share/luasandbox/sandboxes/heka/input/prune_input.lua \
        /usr/share/luasandbox/sandboxes/heka/input/heka_tcp.lua \
        /var/lib/hindsight/run/input/

ADD output/*.lua /var/lib/hindsight/run/output/
ADD input/*.lua /var/lib/hindsight/run/input/
ADD analysis/*.lua /var/lib/hindsight/run/analysis/
ADD modules/*.lua /opt/ccp/lua/modules/stacklight/

RUN useradd --user-group hindsight \
    && usermod -a -G microservices hindsight \
    && chown -R hindsight: /var/lib/hindsight /etc/hindsight \
    && tar cf - -C /var/lib hindsight | tar xf - -C /opt/ccp

USER hindsight

@@ -1,117 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local string = require 'string'

local message = require 'stacklight.message'
local afd = require 'stacklight.afd'
local afd_annotation = require 'stacklight.afd_annotation'

-- node or service
local afd_type = read_config('afd_type') or error('afd_type must be specified!')
local afd_msg_type
local afd_metric_name

if afd_type == 'node' then
    afd_msg_type = 'afd_node_metric'
    afd_metric_name = 'node_status'
elseif afd_type == 'service' then
    afd_msg_type = 'afd_service_metric'
    afd_metric_name = 'service_status'
else
    error('invalid afd_type value')
end

-- e.g. controller for node AFD / rabbitmq for service AFD
local afd_cluster_name = read_config('afd_cluster_name') or
    error('afd_cluster_name must be specified!')

-- e.g. cpu for node AFD / queue for service AFD
local afd_logical_name = read_config('afd_logical_name') or
    error('afd_logical_name must be specified!')

local hostname = read_config('hostname') or error('hostname must be specified')

local afd_file = read_config('afd_file') or error('afd_file must be specified')
local all_alarms = require('stacklight_alarms.' .. afd_file)
local A = require 'stacklight.afd_alarms'
A.load_alarms(all_alarms)

function process_message()
    local metric_name = read_message('Fields[name]')
    local ts = read_message('Timestamp')

    local value, err_msg = message.read_values()
    if not value then
        return -1, err_msg
    end
    -- retrieve field values
    local fields = {}
    for _, field in ipairs(A.get_metric_fields(metric_name)) do
        local field_value = read_message(string.format('Fields[%s]', field))
        if not field_value then
            return -1, "Cannot find Fields[" .. field .. "] for the metric " .. metric_name
        end
        fields[field] = field_value
    end
    A.add_value(ts, metric_name, value, fields)
    return 0
end

function timer_event(ns)
    if A.is_started() then
        local state, alarms = A.evaluate(ns)
        if state then -- it was time to evaluate at least one alarm
            for _, alarm in ipairs(alarms) do
                afd.add_to_alarms(
                    alarm.state,
                    alarm.alert['function'],
                    alarm.alert.metric,
                    alarm.alert.fields,
                    {}, -- tags
                    alarm.alert.operator,
                    alarm.alert.value,
                    alarm.alert.threshold,
                    alarm.alert.window,
                    alarm.alert.periods,
                    alarm.alert.message)
            end

            -- Message example:
            -- msg = {
            --     Type = 'afd_node_metric',
            --     Payload = '{"alarms":[...]}',
            --     Fields = {
            --         name = 'node_status',
            --         value = 0,
            --         hostname = 'node1',
            --         source = 'cpu',
            --         cluster = 'system',
            --         dimensions = {'cluster', 'source', 'hostname'},
            --     }
            -- }
            local msg = afd.inject_afd_metric(
                afd_msg_type, afd_metric_name, afd_cluster_name, afd_logical_name,
                state, hostname)

            if msg then
                afd_annotation.inject_afd_annotation(msg)
            end
        end
    else
        A.set_start_time(ns)
    end
end
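The init block above resolves `afd_type` into a message type and metric name before any message is processed, failing fast on an invalid value. A minimal Python sketch of that dispatch (the constant strings come from the plugin; the function itself is hypothetical):

```python
def resolve_afd_type(afd_type):
    # mirrors the node/service dispatch done at plugin init time
    mapping = {
        'node': ('afd_node_metric', 'node_status'),
        'service': ('afd_service_metric', 'service_status'),
    }
    if afd_type not in mapping:
        raise ValueError('invalid afd_type value')
    return mapping[afd_type]
```

Doing the validation once at startup, rather than per message, means a misconfigured plugin never starts, which is the behavior the Lua `error(...)` calls give.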

@@ -1,31 +0,0 @@
#!/bin/bash

# This script bootstraps Hindsight with the proper directory contents
# when using emptyDir Kubernetes volumes. Since those volumes are
# created empty, Hindsight would fail to start because required files
# would be missing. Run this script with the proper destination
# directory as its command line parameter.

set -e

if [ $# -ne 1 ]; then
    echo "Usage: $0 directory"
    exit 1
fi

if [ ! -d "$1" ]; then
    echo "Error: $1 does not exist or is not a directory"
    exit 1
fi

SRC=/opt/ccp/hindsight
if [ ! -d "$SRC" ]; then
    echo "Error: $SRC does not exist or is not a directory"
    exit 1
fi

tar cf - -C "$SRC" . | tar xf - -C "$1" --strip-components=1
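The final pipeline copies the seeded tree into the empty volume, preserving ownership and permissions, with `--strip-components=1` dropping the leading `./` path element that `tar -C "$SRC" .` produces. The idiom can be tried in isolation with throwaway directories (the paths below are stand-ins, not the script's real ones):

```shell
# Demonstrate the tar-pipe copy idiom with temporary directories.
SRC=$(mktemp -d)
DST=$(mktemp -d)
mkdir -p "$SRC/sub"
echo hello > "$SRC/sub/file.txt"

# Copy the contents of $SRC into $DST; the leading "./" component is stripped.
tar cf - -C "$SRC" . | tar xf - -C "$DST" --strip-components=1
```

After the copy, `$DST/sub/file.txt` exists with the same content, without a spurious `./` top-level entry.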

@@ -1,700 +0,0 @@
--
-- This sandbox queries the kubelet "stats" API to collect statistics on Kubernetes
-- pods and namespaces.
--
-- The sandbox injects Heka messages for the following metrics:
--
-- * k8s_check: Expresses the success or failure of the data collection.
-- * k8s_pods_count: The number of pods in a given namespace.
-- * k8s_pods_count_total: The total number of pods on the node.
-- * k8s_pod_cpu_usage: The CPU usage of a given pod. For example 50 means that
--   the pod consumes 50% of CPU. The value may be greater than 100 on
--   multicore nodes.
-- * k8s_namespace_cpu_usage: The CPU usage of all the pods of a given namespace.
-- * k8s_pods_cpu_usage: The CPU usage of all the pods on the node.
-- * k8s_pod_memory_usage: The memory in Bytes used by a given pod. For example
--   100000 means that the pod consumes 100000 Bytes of memory.
-- * k8s_namespace_memory_usage: The memory in Bytes used by all the pods of
--   a given namespace.
-- * k8s_pods_memory_usage: The memory in Bytes used by all the pods on the
--   node.
-- * k8s_pod_working_set: The working set in Bytes of a given pod.
-- * k8s_namespace_working_set: The working set in Bytes of all the pods of a
--   given namespace.
-- * k8s_pods_working_set: The working set in Bytes of all the pods on the
--   node.
-- * k8s_pod_major_page_faults: The number of major page faults per second
--   for a given pod.
-- * k8s_namespace_major_page_faults: The number of major page faults per second
--   for all the pods of a given namespace.
-- * k8s_pods_major_page_faults: The number of major page faults per second for
--   all the pods on the node.
-- * k8s_pod_page_faults: The number of minor page faults per second for
--   a given pod.
-- * k8s_namespace_page_faults: The number of minor page faults per second for
--   all the pods of a given namespace.
-- * k8s_pods_page_faults: The number of minor page faults per second for all
--   the pods on the node.
-- * k8s_pod_rx_bytes: The number of bytes per second received over the network
--   for a given pod.
-- * k8s_namespace_rx_bytes: The number of bytes per second received over the
--   network for all the pods of a given namespace.
-- * k8s_pods_rx_bytes: The number of bytes per second received over the
--   network for all the pods on the node.
-- * k8s_pod_tx_bytes: The number of bytes per second sent over the network
--   for a given pod.
-- * k8s_namespace_tx_bytes: The number of bytes per second sent over the
--   network for all the pods of a given namespace.
-- * k8s_pods_tx_bytes: The number of bytes per second sent over the
--   network for all the pods on the node.
-- * k8s_pod_rx_errors: The number of errors per second received over the network
--   for a given pod.
-- * k8s_namespace_rx_errors: The number of errors per second received over the
--   network for all the pods of a given namespace.
-- * k8s_pods_rx_errors: The number of errors per second received over the
--   network for all the pods on the node.
-- * k8s_pod_tx_errors: The number of errors per second sent over the network
--   for a given pod.
-- * k8s_namespace_tx_errors: The number of errors per second sent over the
--   network for all the pods of a given namespace.
-- * k8s_pods_tx_errors: The number of errors per second sent over the
--   network for all the pods on the node.
--
-- Configuration variables:
--
-- * kubernetes_host: The hostname or IP to use to access the Kubernetes
--   API. Optional. Default is "kubernetes".
-- * kubelet_stats_node: The name of the Kubernetes node onto which the
--   Kubelet to query runs. At init time the plugin uses the Kubernetes API
--   to get the corresponding internal IP address. Required.
-- * kubelet_stats_port: The port to use to access the Kubelet stats API.
--   Optional. Default value is 10255.
--
-- Configuration example:
--
--   filename = "kubelet_stats.lua"
--   kubelet_stats_node = "node1"
--   kubelet_stats_port = 10255
--   ticker_interval = 10 -- query Kubelet every 10 seconds
--

local cjson = require 'cjson'
local date_time = require 'lpeg.date_time'
local http = require 'socket.http'
local https = require 'ssl.https'
local io = require 'io'
local ltn12 = require 'ltn12'


local function read_file(path)
    local fh, err = io.open(path, 'r')
    if err then return nil, err end
    local content = fh:read('*all')
    fh:close()
    return content, nil
end


-- get the node IP for "node_name". Done by querying the Kubernetes API
local function get_node_ip_address(kubernetes_host, node_name)
    local token_path = '/var/run/secrets/kubernetes.io/serviceaccount/token'
    local token, err_msg = read_file(token_path)
    if not token then
        return nil, err_msg
    end
    local url = string.format('https://%s/api/v1/nodes/%s',
        kubernetes_host, node_name)
    local resp_body = {}
    local res, code, headers, status = https.request {
        url = url,
        cafile = '/var/run/secrets/kubernetes.io/serviceaccount/ca.crt',
        headers = {
            Authorization = string.format('Bearer %s', token)
        },
        sink = ltn12.sink.table(resp_body)
    }
    if not res then
        return nil, code
    end
    local ok, doc = pcall(cjson.decode, table.concat(resp_body))
    if not ok then
        local err_msg = string.format(
            'HTTP response does not contain valid JSON: %s', doc)
        return nil, err_msg
    end
    local status = doc['status']
    if not status then
        return nil, 'HTTP JSON does not contain node status'
    end
    local addresses = status['addresses']
    if not addresses then
        return nil, 'HTTP JSON does not contain node addresses'
    end
    for _, address in ipairs(addresses) do
        if address['type'] == 'InternalIP' then
            return address['address'], ''
        end
    end
    return nil, string.format('No IP address found for %s', node_name)
end

local kubernetes_host = read_config('kubernetes_host') or 'kubernetes'
local kubelet_stats_port = read_config('kubelet_stats_port') or 10255
local kubelet_stats_node = read_config('kubelet_stats_node')
assert(kubelet_stats_node, 'kubelet_stats_node missing in plugin config')

local kubelet_stats_ip_address, err_msg = get_node_ip_address(
    kubernetes_host, kubelet_stats_node)
assert(kubelet_stats_ip_address, err_msg)

local summary_url = string.format('http://%s:%d/stats/summary',
    kubelet_stats_ip_address, kubelet_stats_port)

local pods_stats = {}

-- message skeletons for each metric type
local k8s_check_msg = {
    Type = 'metric',
    Timestamp = nil,
    Hostname = nil,
    Fields = {
        name = 'k8s_check',
        value = nil,
        dimensions = {'hostname'},
        hostname = nil
    }
}
local k8s_pod_msg = {
    Type = 'metric',
    Timestamp = nil,
    Hostname = nil,
    Fields = {
        name = nil,
        value = nil,
        dimensions = {'pod_name', 'pod_namespace', 'hostname'},
        hostname = nil,
        pod_name = nil,
        pod_namespace = nil
    }
}
local k8s_namespace_msg = {
    Type = 'metric',
    Timestamp = nil,
    Hostname = nil,
    Fields = {
        name = nil,
        value = nil,
        dimensions = {'pod_namespace', 'hostname'},
        hostname = nil,
        pod_namespace = nil
    }
}
local k8s_pods_msg = {
    Type = 'metric',
    Timestamp = nil,
    Hostname = nil,
    Fields = {
        name = nil,
        value = nil,
        dimensions = {'hostname'},
        hostname = nil
    }
}


-- inject a pod-level metric message
local function inject_pod_metric(name, value, hostname, pod_namespace, pod_name)
    k8s_pod_msg.Fields.name = name
    k8s_pod_msg.Fields.value = value
    k8s_pod_msg.Fields.hostname = hostname
    k8s_pod_msg.Fields.pod_namespace = pod_namespace
    k8s_pod_msg.Fields.pod_name = pod_name
    inject_message(k8s_pod_msg)
end


-- inject a namespace-level metric message
local function inject_namespace_metric(name, value, hostname, pod_namespace)
    k8s_namespace_msg.Fields.name = name
    k8s_namespace_msg.Fields.value = value
    k8s_namespace_msg.Fields.hostname = hostname
    k8s_namespace_msg.Fields.pod_namespace = pod_namespace
    inject_message(k8s_namespace_msg)
end


-- inject a node-level metric message
local function inject_pods_metric(name, value, hostname)
    k8s_pods_msg.Fields.name = name
    k8s_pods_msg.Fields.value = value
    k8s_pods_msg.Fields.hostname = hostname
    inject_message(k8s_pods_msg)
end


-- Send a "stats" query to kubelet, and return the JSON response in a Lua table
local function send_stats_query()
    local resp_body, resp_status = http.request(summary_url)
    if resp_body and resp_status == 200 then
        -- success
        local ok, doc = pcall(cjson.decode, resp_body)
        if ok then
            return doc, ''
        else
            local err_msg = string.format('HTTP response does not contain valid JSON: %s', doc)
            return nil, err_msg
        end
    else
        -- error
        local err_msg = resp_status
        if resp_body then
            err_msg = string.format('kubelet stats query error: [%s] %s',
                resp_status, resp_body)
        end
        return nil, err_msg
    end
end


-- Collect cpu statistics for a container
local function collect_container_cpu_stats(container_cpu, prev_stats, curr_stats)
    local cpu_usage
    if container_cpu then
        local cpu_scrape_time = date_time.rfc3339:match(container_cpu['time'])
        curr_stats.cpu = {
            scrape_time = date_time.time_to_ns(cpu_scrape_time),
            usage = container_cpu['usageCoreNanoSeconds']
        }
        if prev_stats and prev_stats.cpu then
            local time_diff = curr_stats.cpu.scrape_time - prev_stats.cpu.scrape_time
            if time_diff > 0 then
                cpu_usage = 100 *
                    (curr_stats.cpu.usage - prev_stats.cpu.usage) / time_diff
            end
        end
    end
    return cpu_usage
end


-- Collect memory statistics for a container
local function collect_container_memory_stats(container_memory, prev_stats, curr_stats)
    local memory_usage, major_page_faults, page_faults, working_set
    if container_memory then
        memory_usage = container_memory['usageBytes']
        working_set = container_memory['workingSetBytes']
        local memory_scrape_time = date_time.rfc3339:match(container_memory['time'])
        curr_stats.memory = {
            scrape_time = date_time.time_to_ns(memory_scrape_time),
            major_page_faults = container_memory['majorPageFaults'],
            page_faults = container_memory['pageFaults']
        }
        if prev_stats and prev_stats.memory then
            local time_diff = curr_stats.memory.scrape_time - prev_stats.memory.scrape_time
            if time_diff > 0 then
                major_page_faults = 1e9 *
                    (curr_stats.memory.major_page_faults -
                     prev_stats.memory.major_page_faults) / time_diff
                page_faults = 1e9 *
                    (curr_stats.memory.page_faults -
                     prev_stats.memory.page_faults) / time_diff
            end
        end
    end
    return memory_usage, major_page_faults, page_faults, working_set
end


-- Collect statistics for a container
local function collect_container_stats(container, prev_stats, curr_stats)
    -- cpu stats
    local cpu_usage =
        collect_container_cpu_stats(container['cpu'], prev_stats, curr_stats)
    -- memory stats
    local memory_usage, major_page_faults, page_faults, working_set =
        collect_container_memory_stats(container['memory'], prev_stats, curr_stats)
    return cpu_usage, memory_usage, major_page_faults, page_faults, working_set
end


-- Collect statistics for a group of containers
local function collect_containers_stats(containers, prev_stats, curr_stats)
    local aggregated_cpu_usage, aggregated_memory_usage,
        aggregated_major_page_faults, aggregated_page_faults,
        aggregated_working_set
    for _, container in ipairs(containers) do
        local container_name = container['name']
        curr_stats[container_name] = {}
        local container_prev_stats
        if prev_stats then
            container_prev_stats = prev_stats[container_name]
        end
        local cpu_usage, memory_usage, major_page_faults, page_faults, working_set =
            collect_container_stats(container,
                container_prev_stats, curr_stats[container_name])
        if cpu_usage then
            aggregated_cpu_usage = (aggregated_cpu_usage or 0) + cpu_usage
        end
        if memory_usage then
            aggregated_memory_usage = (aggregated_memory_usage or 0) + memory_usage
        end
        if major_page_faults then
            aggregated_major_page_faults = (aggregated_major_page_faults or 0) +
                major_page_faults
        end
        if page_faults then
            aggregated_page_faults = (aggregated_page_faults or 0) + page_faults
        end
        if working_set then
            aggregated_working_set = (aggregated_working_set or 0) + working_set
        end
    end
    return aggregated_cpu_usage, aggregated_memory_usage,
        aggregated_major_page_faults, aggregated_page_faults,
        aggregated_working_set
end


-- Collect statistics for a pod
local function collect_pod_stats(pod, prev_stats, curr_stats)
    curr_stats.containers = {}
    local containers_prev_stats
    if prev_stats then
        containers_prev_stats = prev_stats.containers
    end

    -- collect cpu and memory containers stats
    local cpu_usage, memory_usage, major_page_faults, page_faults, working_set =
        collect_containers_stats(pod['containers'] or {},
            containers_prev_stats, curr_stats.containers)

    -- collect network stats
    local rx_bytes, tx_bytes, rx_errors, tx_errors
    local pod_network = pod['network']
    if pod_network then
        local network_scrape_time = date_time.rfc3339:match(pod_network['time'])
        curr_stats.network = {
            scrape_time = date_time.time_to_ns(network_scrape_time),
            rx_bytes = pod_network['rxBytes'],
            tx_bytes = pod_network['txBytes'],
            rx_errors = pod_network['rxErrors'],
            tx_errors = pod_network['txErrors']
        }
        if prev_stats and prev_stats.network then
            local time_diff = curr_stats.network.scrape_time -
                prev_stats.network.scrape_time
            if time_diff > 0 then
                rx_bytes = 1e9 *
                    (curr_stats.network.rx_bytes -
                     prev_stats.network.rx_bytes) / time_diff
                tx_bytes = 1e9 *
                    (curr_stats.network.tx_bytes -
                     prev_stats.network.tx_bytes) / time_diff
                rx_errors = 1e9 *
                    (curr_stats.network.rx_errors -
                     prev_stats.network.rx_errors) / time_diff
                tx_errors = 1e9 *
                    (curr_stats.network.tx_errors -
                     prev_stats.network.tx_errors) / time_diff
            end
        end
    end

    return cpu_usage, memory_usage, major_page_faults, page_faults, working_set,
        rx_bytes, tx_bytes, rx_errors, tx_errors
end


-- Collect statistics for a group of pods
local function collect_pods_stats(node_name, pods, prev_stats, curr_stats)
    local pods_count_by_ns = {}
    local pods_stats_by_ns = {}

    local pods_count_total = 0
    local pods_cpu_usage = 0
    local pods_memory_usage = 0
    local pods_major_page_faults = 0
    local pods_page_faults = 0
    local pods_working_set = 0
    local pods_rx_bytes = 0
    local pods_tx_bytes = 0
    local pods_rx_errors = 0
    local pods_tx_errors = 0

    for _, pod in ipairs(pods) do
        local pod_ref = pod['podRef']
        local pod_uid = pod_ref['uid']
        local pod_name = pod_ref['name']
        local pod_namespace = pod_ref['namespace']

        curr_stats[pod_uid] = {}

        local pod_cpu_usage,
            pod_memory_usage,
            pod_major_page_faults,
            pod_page_faults,
            pod_working_set,
            pod_rx_bytes,
            pod_tx_bytes,
            pod_rx_errors,
            pod_tx_errors = collect_pod_stats(
                pod, prev_stats[pod_uid], curr_stats[pod_uid])

        if pod_cpu_usage then
            -- inject k8s_pod_cpu_usage metric
            inject_pod_metric('k8s_pod_cpu_usage',
                pod_cpu_usage, node_name, pod_namespace, pod_name)

            if not pods_stats_by_ns[pod_namespace] then
                pods_stats_by_ns[pod_namespace] = {cpu_usage = pod_cpu_usage}
            else
                pods_stats_by_ns[pod_namespace].cpu_usage =
|
||||
(pods_stats_by_ns[pod_namespace].cpu_usage or 0) + pod_cpu_usage
|
||||
end
|
||||
|
||||
pods_cpu_usage = pods_cpu_usage + pod_cpu_usage
|
||||
end
|
||||
|
||||
if pod_memory_usage then
|
||||
-- inject k8s_pod_memory_usage metric
|
||||
inject_pod_metric('k8s_pod_memory_usage',
|
||||
pod_memory_usage, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {memory_usage = pod_memory_usage}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].memory_usage =
|
||||
(pods_stats_by_ns[pod_namespace].memory_usage or 0) + pod_memory_usage
|
||||
end
|
||||
|
||||
pods_memory_usage = pods_memory_usage + pod_memory_usage
|
||||
end
|
||||
|
||||
if pod_major_page_faults then
|
||||
-- inject k8s_pod_major_page_faults metric
|
||||
inject_pod_metric('k8s_pod_major_page_faults',
|
||||
pod_major_page_faults, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {major_page_faults = pod_major_page_faults}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].major_page_faults =
|
||||
(pods_stats_by_ns[pod_namespace].major_page_faults or 0) + pod_major_page_faults
|
||||
end
|
||||
|
||||
pods_major_page_faults = pods_major_page_faults + pod_major_page_faults
|
||||
end
|
||||
|
||||
if pod_page_faults then
|
||||
-- inject k8s_pod_page_faults metric
|
||||
inject_pod_metric('k8s_pod_page_faults',
|
||||
pod_page_faults, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {page_faults = pod_page_faults}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].page_faults =
|
||||
(pods_stats_by_ns[pod_namespace].page_faults or 0) + pod_page_faults
|
||||
end
|
||||
|
||||
pods_page_faults = pods_page_faults + pod_page_faults
|
||||
end
|
||||
|
||||
if pod_working_set then
|
||||
-- inject k8s_pod_working_set metric
|
||||
inject_pod_metric('k8s_pod_working_set',
|
||||
pod_working_set, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {working_set = pod_working_set}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].working_set =
|
||||
(pods_stats_by_ns[pod_namespace].working_set or 0) + pod_working_set
|
||||
end
|
||||
|
||||
pods_working_set = pods_working_set + pod_working_set
|
||||
end
|
||||
|
||||
if pod_rx_bytes then
|
||||
-- inject k8s_pod_rx_bytes metric
|
||||
inject_pod_metric('k8s_pod_rx_bytes',
|
||||
pod_rx_bytes, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {rx_bytes = pod_rx_bytes}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].rx_bytes =
|
||||
(pods_stats_by_ns[pod_namespace].rx_bytes or 0) + pod_rx_bytes
|
||||
end
|
||||
|
||||
pods_rx_bytes = pods_rx_bytes + pod_rx_bytes
|
||||
end
|
||||
|
||||
if pod_tx_bytes then
|
||||
-- inject k8s_pod_tx_bytes metric
|
||||
inject_pod_metric('k8s_pod_tx_bytes',
|
||||
pod_tx_bytes, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {tx_bytes = pod_tx_bytes}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].tx_bytes =
|
||||
(pods_stats_by_ns[pod_namespace].tx_bytes or 0) + pod_tx_bytes
|
||||
end
|
||||
|
||||
pods_tx_bytes = pods_tx_bytes + pod_tx_bytes
|
||||
end
|
||||
|
||||
if pod_rx_errors then
|
||||
-- inject k8s_pod_rx_errors metric
|
||||
inject_pod_metric('k8s_pod_rx_errors',
|
||||
pod_rx_errors, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {rx_errors = pod_rx_errors}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].rx_errors =
|
||||
(pods_stats_by_ns[pod_namespace].rx_errors or 0) + pod_rx_errors
|
||||
end
|
||||
|
||||
pods_rx_errors = pods_rx_errors + pod_rx_errors
|
||||
end
|
||||
|
||||
if pod_tx_errors then
|
||||
-- inject k8s_pod_tx_errors metric
|
||||
inject_pod_metric('k8s_pod_tx_errors',
|
||||
pod_tx_errors, node_name, pod_namespace, pod_name)
|
||||
|
||||
if not pods_stats_by_ns[pod_namespace] then
|
||||
pods_stats_by_ns[pod_namespace] = {tx_errors = pod_tx_errors}
|
||||
else
|
||||
pods_stats_by_ns[pod_namespace].tx_errors =
|
||||
(pods_stats_by_ns[pod_namespace].tx_errors or 0) + pod_tx_errors
|
||||
end
|
||||
|
||||
pods_tx_errors = pods_tx_errors + pod_tx_errors
|
||||
end
|
||||
|
||||
if not pods_count_by_ns[pod_namespace] then
|
||||
pods_count_by_ns[pod_namespace] = 1
|
||||
else
|
||||
pods_count_by_ns[pod_namespace] = pods_count_by_ns[pod_namespace] + 1
|
||||
end
|
||||
pods_count_total = pods_count_total + 1
|
||||
end
|
||||
|
||||
for pod_namespace, namespace_stats in pairs(pods_stats_by_ns) do
|
||||
if namespace_stats.cpu_usage then
|
||||
-- inject k8s_namespace_cpu_usage metric
|
||||
inject_namespace_metric('k8s_namespace_cpu_usage',
|
||||
namespace_stats.cpu_usage, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.memory_usage then
|
||||
-- inject k8s_namespace_memory_usage metric
|
||||
inject_namespace_metric('k8s_namespace_memory_usage',
|
||||
namespace_stats.memory_usage, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.major_page_faults then
|
||||
-- inject k8s_namespace_major_page_faults metric
|
||||
inject_namespace_metric('k8s_namespace_major_page_faults',
|
||||
namespace_stats.major_page_faults, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.page_faults then
|
||||
-- inject k8s_namespace_page_faults metric
|
||||
inject_namespace_metric('k8s_namespace_page_faults',
|
||||
namespace_stats.page_faults, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.working_set then
|
||||
-- inject k8s_namespace_working_set metric
|
||||
inject_namespace_metric('k8s_namespace_working_set',
|
||||
namespace_stats.working_set, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.rx_bytes then
|
||||
-- inject k8s_namespace_rx_bytes metric
|
||||
inject_namespace_metric('k8s_namespace_rx_bytes',
|
||||
namespace_stats.rx_bytes, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.tx_bytes then
|
||||
-- inject k8s_namespace_tx_bytes metric
|
||||
inject_namespace_metric('k8s_namespace_tx_bytes',
|
||||
namespace_stats.tx_bytes, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.rx_errors then
|
||||
-- inject k8s_namespace_rx_errors metric
|
||||
inject_namespace_metric('k8s_namespace_rx_errors',
|
||||
namespace_stats.rx_errors, node_name, pod_namespace)
|
||||
end
|
||||
if namespace_stats.tx_errors then
|
||||
-- inject k8s_namespace_tx_errors metric
|
||||
inject_namespace_metric('k8s_namespace_tx_errors',
|
||||
namespace_stats.tx_errors, node_name, pod_namespace)
|
||||
end
|
||||
end
|
||||
|
||||
for pod_namespace, pods_count in pairs(pods_count_by_ns) do
|
||||
-- inject k8s_pods_count metric
|
||||
inject_namespace_metric('k8s_pods_count',
|
||||
pods_count, node_name, pod_namespace)
|
||||
end
|
||||
|
||||
-- inject k8s_pods_count_total metric
|
||||
inject_pods_metric('k8s_pods_count_total', pods_count_total, node_name)
|
||||
|
||||
-- inject k8s_pods_cpu_usage metric
|
||||
inject_pods_metric('k8s_pods_cpu_usage', pods_cpu_usage, node_name)
|
||||
|
||||
-- inject k8s_pods_memory_usage metric
|
||||
inject_pods_metric('k8s_pods_memory_usage', pods_memory_usage, node_name)
|
||||
|
||||
-- inject k8s_pods_major_page_faults metric
|
||||
inject_pods_metric('k8s_pods_major_page_faults', pods_major_page_faults, node_name)
|
||||
|
||||
-- inject k8s_pods_page_faults metric
|
||||
inject_pods_metric('k8s_pods_page_faults', pods_page_faults, node_name)
|
||||
|
||||
-- inject k8s_pods_working_set metric
|
||||
inject_pods_metric('k8s_pods_working_set', pods_working_set, node_name)
|
||||
|
||||
-- inject k8s_pods_rx_bytes metric
|
||||
inject_pods_metric('k8s_pods_rx_bytes', pods_rx_bytes, node_name)
|
||||
|
||||
-- inject k8s_pods_tx_bytes metric
|
||||
inject_pods_metric('k8s_pods_tx_bytes', pods_tx_bytes, node_name)
|
||||
|
||||
-- inject k8s_pods_rx_errors metric
|
||||
inject_pods_metric('k8s_pods_rx_errors', pods_rx_errors, node_name)
|
||||
|
||||
-- inject k8s_pods_tx_errors metric
|
||||
inject_pods_metric('k8s_pods_tx_errors', pods_tx_errors, node_name)
|
||||
end
|
||||
|
||||
|
||||
-- Function called every ticker interval. Queries the kubelet "stats" API,
|
||||
-- does aggregations, and inject metric messages.
|
||||
function process_message()
|
||||
local doc, err_msg = send_stats_query()
|
||||
if not doc then
|
||||
-- inject a k8s_check "failure" metric
|
||||
k8s_check_msg.Fields.value = 0
|
||||
k8s_check_msg.Fields.hostname = node_name
|
||||
inject_message(k8s_check_msg)
|
||||
return -1, err_msg
|
||||
end
|
||||
|
||||
local pods = doc['pods']
|
||||
if not pods then
|
||||
return -1, "no pods in kubelet stats response"
|
||||
end
|
||||
|
||||
local curr_stats = {}
|
||||
collect_pods_stats(doc['node']['nodeName'], pods, pods_stats, curr_stats)
|
||||
pods_stats = curr_stats
|
||||
|
||||
-- inject a k8s_check "success" metric
|
||||
k8s_check_msg.Fields.value = 1
|
||||
k8s_check_msg.Fields.hostname = node_name
|
||||
inject_message(k8s_check_msg)
|
||||
|
||||
return 0
|
||||
end
|
|
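The network counters above are cumulative, so per-second rates must be derived from two consecutive scrapes. A minimal standalone sketch of that computation (hypothetical sample values; assumes nanosecond timestamps as produced by `date_time.time_to_ns`):

```lua
-- Counters are cumulative and scrape times are in nanoseconds, so the
-- per-second rate is 1e9 * (counter delta) / (time delta in ns).
local function counter_rate(prev, curr)
    local time_diff = curr.scrape_time - prev.scrape_time -- nanoseconds
    if time_diff <= 0 then
        return nil -- no time elapsed (or clock skew): no rate available
    end
    return 1e9 * (curr.value - prev.value) / time_diff
end

-- hypothetical samples taken 10 seconds apart
local prev = {scrape_time = 0, value = 1000}
local curr = {scrape_time = 10 * 1e9, value = 6000}
print(counter_rate(prev, curr)) -- 500 bytes per second for these samples
```

This is why the rate variables stay `nil` when `time_diff <= 0`: no metric is injected rather than a bogus one.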
@ -1,185 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local cjson = require 'cjson'
local string = require 'string'
local table = require 'table'

local utils = require 'stacklight.utils'
local constants = require 'stacklight.constants'

local read_message = read_message
local assert = assert
local ipairs = ipairs
local pcall = pcall

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

local function read_field(msg, name)
    return msg.Fields[name]
end

function read_status(msg)
    return read_field(msg, 'value')
end

function read_source(msg)
    return read_field(msg, 'source')
end

function read_hostname(msg)
    return read_field(msg, 'hostname')
end

function read_cluster(msg)
    return read_field(msg, 'cluster')
end

function extract_alarms(msg)
    local ok, payload = pcall(cjson.decode, msg.Payload)
    if not ok or not payload.alarms then
        return nil
    end
    return payload.alarms
end

-- return a human-readable message from an alarm table
-- for instance: "CPU load too high (WARNING, rule='last(load_midterm)>=5', current=7)"
function get_alarm_for_human(alarm)
    local metric
    if #(alarm.fields) > 0 then
        local fields = {}
        for _, field in ipairs(alarm.fields) do
            fields[#fields+1] = field.name .. '="' .. field.value .. '"'
        end
        metric = string.format('%s[%s]', alarm.metric, table.concat(fields, ','))
    else
        metric = alarm.metric
    end

    local host = ''
    if alarm.hostname then
        host = string.format(', host=%s', alarm.hostname)
    end

    return string.format(
        "%s (%s, rule='%s(%s)%s%s', current=%.2f%s)",
        alarm.message,
        alarm.severity,
        alarm['function'],
        metric,
        alarm.operator,
        alarm.threshold,
        alarm.value,
        host
    )
end

function alarms_for_human(alarms)
    local alarm_messages = {}
    local hint_messages = {}

    for _, v in ipairs(alarms) do
        if v.tags and v.tags.dependency_level and v.tags.dependency_level == 'hint' then
            hint_messages[#hint_messages+1] = get_alarm_for_human(v)
        else
            alarm_messages[#alarm_messages+1] = get_alarm_for_human(v)
        end
    end

    if #hint_messages > 0 then
        alarm_messages[#alarm_messages+1] = "Other related alarms:"
    end
    for _, v in ipairs(hint_messages) do
        alarm_messages[#alarm_messages+1] = v
    end

    return alarm_messages
end

local alarms = {}

-- append an alarm to the list of pending alarms
-- the list is sent when inject_afd_metric is called
function add_to_alarms(status, fn, metric, fields, tags, operator, value,
        threshold, window, periods, message)
    local severity = constants.status_label(status)
    assert(severity)
    alarms[#alarms+1] = {
        severity=severity,
        ['function']=fn,
        metric=metric,
        fields=fields or {},
        tags=tags or {},
        operator=operator,
        value=value,
        threshold=threshold,
        window=window or 0,
        periods=periods or 0,
        message=message
    }
end

function get_alarms()
    return alarms
end

function reset_alarms()
    alarms = {}
end

-- inject an AFD event into the pipeline
function inject_afd_metric(msg_type, metric_name, cluster_name, logical_name,
        state, hostname)
    local payload

    if #alarms > 0 then
        payload = utils.safe_json_encode({alarms=alarms})
        reset_alarms()
        if not payload then
            return
        end
    else
        -- because cjson encodes empty tables as objects instead of arrays
        payload = '{"alarms":[]}'
    end

    local msg = {
        Type = msg_type,
        Payload = payload,
        Fields = {
            name = metric_name,
            value = state,
            hostname = hostname,
            cluster = cluster_name,
            source = logical_name,
            dimensions = {'cluster', 'hostname', 'source'},
        }
    }

    local err_code, err_msg = utils.safe_inject_message(msg)

    if err_code ~= 0 then
        return nil, err_msg
    end

    return msg
end

MATCH = 1
NO_MATCH = 2
NO_DATA = 3
MISSING_DATA = 4

return M
@ -1,224 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local assert = assert
local ipairs = ipairs
local pairs = pairs
local string = string
local setmetatable = setmetatable

local table_utils = require 'stacklight.table_utils'
local constants = require 'stacklight.constants'
local afd = require 'stacklight.afd'
local Rule = require 'stacklight.afd_rule'

local SEVERITIES = {
    warning = constants.WARN,
    critical = constants.CRIT,
    down = constants.DOWN,
    unknown = constants.UNKW,
    okay = constants.OKAY,
}

local Alarm = {}
Alarm.__index = Alarm

setfenv(1, Alarm) -- Remove external access to contain everything in the module

function Alarm.new(alarm)
    local a = {}
    setmetatable(a, Alarm)
    a._metrics_list = nil
    a.name = alarm.name
    a.description = alarm.description
    if alarm.trigger.logical_operator then
        a.logical_operator = string.lower(alarm.trigger.logical_operator)
    else
        a.logical_operator = 'or'
    end
    a.severity_str = string.upper(alarm.severity)
    a.severity = SEVERITIES[string.lower(alarm.severity)]
    assert(a.severity ~= nil)

    a.skip_when_no_data = false
    if alarm.no_data_policy then
        if string.lower(alarm.no_data_policy) == 'skip' then
            a.skip_when_no_data = true
        else
            a.no_data_severity = SEVERITIES[string.lower(alarm.no_data_policy)]
        end
    else
        a.no_data_severity = constants.UNKW
    end
    assert(a.skip_when_no_data or a.no_data_severity ~= nil)

    a.rules = {}
    a.initial_wait = 0
    for _, rule in ipairs(alarm.trigger.rules) do
        local r = Rule.new(rule)
        a.rules[#a.rules+1] = r
        -- compare in nanoseconds on both sides so the longest rule window
        -- wins (the original compared seconds against nanoseconds)
        local wait_ns = r.window * r.periods * 1e9
        if wait_ns > a.initial_wait then
            a.initial_wait = wait_ns
        end
    end
    a.start_time_ns = 0

    return a
end

-- return the Set of metrics used by the alarm
function Alarm:get_metrics()
    if not self._metrics_list then
        self._metrics_list = {}
        for _, rule in ipairs(self.rules) do
            if not table_utils.item_find(rule.metric, self._metrics_list) then
                self._metrics_list[#self._metrics_list+1] = rule.metric
            end
        end
    end
    return self._metrics_list
end

-- return a list of field names used for the metric
-- (can have duplicate names)
function Alarm:get_metric_fields(metric_name)
    local fields = {}
    for _, rule in ipairs(self.rules) do
        if rule.metric == metric_name then
            for k, _ in pairs(rule.fields) do
                fields[#fields+1] = k
            end
            for _, g in ipairs(rule.group_by) do
                fields[#fields+1] = g
            end
        end
    end
    return fields
end

function Alarm:has_metric(metric)
    return table_utils.item_find(metric, self:get_metrics())
end

-- dispatch a datapoint to the datastores of the matching rules
function Alarm:add_value(ts, metric, value, fields)
    for _, rule in ipairs(self.rules) do
        if rule.metric == metric then
            rule:add_value(ts, value, fields)
        end
    end
end

-- convert a fields table to a list of name/value pairs
-- {foo="bar"} --> {{name="foo", value="bar"}}
local function convert_field_list(fields)
    local named_fields = {}
    for name, value in pairs(fields or {}) do
        named_fields[#named_fields+1] = {name=name, value=value}
    end
    return named_fields
end

-- return: state of the alarm and a list of alarm details.
--
-- with alarm list when state != OKAY:
-- {
--   {
--     value = <current value>,
--     fields = <metric fields table>,
--     message = <string>,
--   },
-- }
function Alarm:evaluate(ns)
    local state = constants.OKAY
    local matches = 0
    local all_alerts = {}
    local function add_alarm(rule, value, message, fields)
        all_alerts[#all_alerts+1] = {
            severity = self.severity_str,
            ['function'] = rule.fct,
            metric = rule.metric,
            operator = rule.relational_operator,
            threshold = rule.threshold,
            window = rule.window,
            periods = rule.periods,
            value = value,
            fields = fields,
            message = message
        }
    end
    local one_unknown = false
    local msg

    for _, rule in ipairs(self.rules) do
        local eval, context_list = rule:evaluate(ns)
        if eval == afd.MATCH then
            matches = matches + 1
            msg = self.description
        elseif eval == afd.MISSING_DATA then
            msg = 'No datapoints have been received over the last ' ..
                rule.observation_window .. ' seconds'
            one_unknown = true
        elseif eval == afd.NO_DATA then
            msg = 'No datapoints have ever been received'
            one_unknown = true
        end
        for _, context in ipairs(context_list) do
            add_alarm(rule, context.value, msg,
                convert_field_list(context.fields))
        end
    end

    if self.logical_operator == 'and' then
        if one_unknown then
            if self.skip_when_no_data then
                state = nil
            else
                state = self.no_data_severity
            end
        elseif #self.rules == matches then
            state = self.severity
        end
    elseif self.logical_operator == 'or' then
        if matches > 0 then
            state = self.severity
        elseif one_unknown then
            if self.skip_when_no_data then
                state = nil
            else
                state = self.no_data_severity
            end
        end
    end

    if state == nil or state == constants.OKAY then
        all_alerts = {}
    end
    return state, all_alerts
end

function Alarm:set_start_time(ns)
    self.start_time_ns = ns
end

function Alarm:is_evaluation_time(ns)
    local delta = ns - self.start_time_ns
    if delta >= self.initial_wait then
        return true
    end
    return false
end

return Alarm
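The `logical_operator` handling in `Alarm:evaluate` reduces to: 'and' raises the alarm only when every rule matches, 'or' when at least one does. A simplified standalone sketch of that decision (not the module's API; rule results are plain booleans here):

```lua
-- Simplified sketch of the 'and'/'or' combination used by Alarm:evaluate.
local function combine(logical_operator, rule_results)
    local matches = 0
    for _, matched in ipairs(rule_results) do
        if matched then matches = matches + 1 end
    end
    if logical_operator == 'and' then
        return matches == #rule_results -- every rule must match
    end
    return matches > 0 -- 'or': any match is enough
end

print(combine('and', {true, false})) -- false
print(combine('or',  {true, false})) -- true
```

The real method additionally tracks rules with missing data and substitutes `no_data_severity` (or skips evaluation) instead of a boolean result.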
@ -1,118 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local pairs = pairs
local ipairs = ipairs
local table_utils = require 'stacklight.table_utils'
local constants = require 'stacklight.constants'
local Alarm = require 'stacklight.afd_alarm'

local all_alarms = {}

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

-- return a list of field names required for the metric
function get_metric_fields(metric_name)
    local fields = {}
    for _, alarm in pairs(all_alarms) do
        local mf = alarm:get_metric_fields(metric_name)
        if mf then
            for _, field in pairs(mf) do
                if not table_utils.item_find(field, fields) then
                    fields[#fields+1] = field
                end
            end
        end
    end
    return fields
end

-- return the list of alarms interested in a metric
function get_interested_alarms(metric)
    local interested_alarms = {}
    for _, alarm in pairs(all_alarms) do
        if alarm:has_metric(metric) then
            interested_alarms[#interested_alarms+1] = alarm
        end
    end
    return interested_alarms
end

function add_value(ts, metric, value, fields)
    local interested_alarms = get_interested_alarms(metric)
    for _, alarm in ipairs(interested_alarms) do
        alarm:add_value(ts, metric, value, fields)
    end
end

function reset_alarms()
    all_alarms = {}
end

function evaluate(ns)
    local global_state
    local all_alerts = {}
    for _, alarm in pairs(all_alarms) do
        if alarm:is_evaluation_time(ns) then
            local state, alerts = alarm:evaluate(ns)
            global_state = constants.max_status(state, global_state)
            for _, a in ipairs(alerts) do
                all_alerts[#all_alerts+1] = { state=state, alert=a }
            end
            -- raise the first triggered alarm except for OKAY/UNKW states
            if global_state ~= constants.UNKW and global_state ~= constants.OKAY then
                break
            end
        end
    end
    return global_state, all_alerts
end

function get_alarms()
    return all_alarms
end

function get_alarm(alarm_name)
    for _, a in ipairs(all_alarms) do
        if a.name == alarm_name then
            return a
        end
    end
end

function load_alarm(alarm)
    local A = Alarm.new(alarm)
    all_alarms[#all_alarms+1] = A
end

function load_alarms(alarms)
    for _, alarm in ipairs(alarms) do
        load_alarm(alarm)
    end
end

local started = false
function set_start_time(ns)
    for _, alarm in ipairs(all_alarms) do
        alarm:set_start_time(ns)
    end
    started = true
end

function is_started()
    return started
end

return M
@ -1,102 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local string = require 'string'
local table = require 'table'

local utils = require 'stacklight.utils'
local consts = require 'stacklight.constants'
local afd = require 'stacklight.afd'

local M = {}
setfenv(1, M)

local statuses = {}

local annotation_msg = {
    Type = 'metric',
    Fields = {
        name = 'annotation',
        dimensions = {'cluster', 'source', 'hostname'},
        value_fields = {'title', 'tags', 'text'},
        title = nil,
        tags = nil,
        text = nil,
        cluster = nil,
        source = nil,
        hostname = nil,
    }
}

function inject_afd_annotation(msg)
    local previous
    local text
    local title -- declared local: the original leaked this as a global

    local source = afd.read_source(msg)
    local status = afd.read_status(msg)
    local hostname = afd.read_hostname(msg)
    local alarms = afd.extract_alarms(msg)
    local cluster = afd.read_cluster(msg)

    if not source or not status or not hostname or not alarms or not cluster then
        return -1
    end

    if not statuses[source] then
        statuses[source] = {}
    end
    previous = statuses[source]

    text = table.concat(afd.alarms_for_human(alarms), '<br />')

    -- build the title
    if not previous.status and status == consts.OKAY then
        -- don't send an annotation when we detect a new cluster which is OKAY
        return 0
    elseif not previous.status then
        title = string.format('General status is %s',
            consts.status_label(status))
    elseif previous.status ~= status then
        title = string.format('General status %s -> %s',
            consts.status_label(previous.status),
            consts.status_label(status))

    -- TODO(pasquier-s): generate an annotation when the set of alarms has
    -- changed. The following code generated an annotation whenever at least
    -- one value associated to an alarm was changing. This led to way too
    -- many annotations with alarms monitoring the CPU usage for instance.

    -- elseif previous.text ~= text then
    --     title = string.format('General status remains %s',
    --         consts.status_label(status))
    else
        -- nothing has changed since the last message
        return 0
    end

    annotation_msg.Fields.title = title
    annotation_msg.Fields.tags = source
    annotation_msg.Fields.text = text
    annotation_msg.Fields.source = source
    annotation_msg.Fields.hostname = hostname
    annotation_msg.Fields.cluster = cluster

    -- store the last status and alarm text for future messages
    previous.status = status
    previous.text = text

    return utils.safe_inject_message(annotation_msg)
end

return M
@@ -1,279 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local circular_buffer = require 'circular_buffer'
local stats = require 'lsb.stats'
local setmetatable = setmetatable
local ipairs = ipairs
local pairs = pairs
local math = require 'math'
local string = string
local table = table
local assert = assert
local type = type

-- StackLight libs
local table_utils = require 'stacklight.table_utils'
local constants = require 'stacklight.constants'
local afd = require 'stacklight.afd'
local matching = require 'stacklight.value_matching'

local MIN_WINDOW = 10
local MIN_PERIOD = 1
local SECONDS_PER_ROW = 5

local Rule = {}
Rule.__index = Rule

setfenv(1, Rule) -- Remove external access to contain everything in the module

function Rule.new(rule)
    local r = {}
    setmetatable(r, Rule)

    local win = MIN_WINDOW
    if rule.window and rule.window + 0 > 0 then
        win = rule.window + 0
    end
    r.window = win
    local periods = MIN_PERIOD
    if rule.periods and rule.periods + 0 > 0 then
        periods = rule.periods + 0
    end
    r.periods = periods
    r.relational_operator = rule.relational_operator
    r.metric = rule.metric
    r.fields = rule.fields or {}

    -- build field matching
    r.field_matchers = {}
    for f, expression in pairs(r.fields) do
        r.field_matchers[f] = matching.new(expression)
    end

    r.fct = rule['function']
    r.threshold = rule.threshold + 0
    r.value_index = rule.value or nil -- Can be nil

    -- build unique rule id
    local arr = {r.metric, r.fct, r.window, r.periods}
    for f, v in table_utils.orderedPairs(r.fields or {}) do
        arr[#arr+1] = string.format('(%s=%s)', f, v)
    end
    r.rule_id = table.concat(arr, '/')

    r.group_by = rule.group_by or {}

    r.cbuf_size = math.ceil(r.window * r.periods / SECONDS_PER_ROW)

    r.ids_datastore = {}
    r.datastore = {}
    r.observation_window = math.ceil(r.window * r.periods)

    return r
end

function Rule:get_datastore_id(fields)
    if #self.group_by == 0 or fields == nil then
        return self.rule_id
    end

    local arr = {}
    arr[#arr + 1] = self.rule_id
    for _, g in ipairs(self.group_by) do
        arr[#arr + 1] = fields[g]
    end
    return table.concat(arr, '/')
end

function Rule:fields_accepted(fields)
    if not fields then
        fields = {}
    end
    local matched_fields = 0
    local no_match_on_fields = true
    for f, expression in pairs(self.field_matchers) do
        no_match_on_fields = false
        for k, v in pairs(fields) do
            if k == f then
                if expression:matches(v) then
                    matched_fields = matched_fields + 1
                else
                    return false
                end
            end
        end
    end
    return no_match_on_fields or matched_fields > 0
end

function Rule:get_circular_buffer()
    local fct
    if self.fct == 'min' or self.fct == 'max' then
        fct = self.fct
    else
        fct = 'sum'
    end
    local cbuf = circular_buffer.new(self.cbuf_size, 1, SECONDS_PER_ROW)
    cbuf:set_header(1, self.metric, fct, fct)
    return cbuf
end

-- store datapoints in the cbuf, creating the cbuf if it doesn't exist.
-- value can be a table where the index to choose is referenced by self.value_index
function Rule:add_value(ts, value, fields)
    if not self:fields_accepted(fields) then
        return
    end
    if type(value) == 'table' then
        value = value[self.value_index]
    end
    if value == nil then
        return
    end

    local data
    local uniq_field_id = self:get_datastore_id(fields)
    if not self.datastore[uniq_field_id] then
        self.datastore[uniq_field_id] = {
            fields = self.fields,
            cbuf = self:get_circular_buffer()
        }
        if #self.group_by > 0 then
            self.datastore[uniq_field_id].fields = fields
        end

        self:add_datastore(uniq_field_id)
    end
    data = self.datastore[uniq_field_id]

    if self.fct == 'avg' then
        data.cbuf:add(ts, 1, value)
    else
        data.cbuf:set(ts, 1, value)
    end
end

function Rule:add_datastore(id)
    if not table_utils.item_find(id, self.ids_datastore) then
        self.ids_datastore[#self.ids_datastore+1] = id
    end
end

function Rule:compare_threshold(value)
    return constants.compare_threshold(value, self.relational_operator, self.threshold)
end

local function isnumber(value)
    return value ~= nil and not (value ~= value)
end

local available_functions = {last=true, avg=true, max=true, min=true, sum=true,
                             variance=true, sd=true, diff=true}

-- evaluate the rule against datapoints
-- return a list: match (bool or string), context ({value=v, fields=list of field table})
--
-- examples:
--   true, { {value=100, fields={{queue='nova'}, {queue='neutron'}}, ..}
--   false, { {value=10, fields={}}, ..}
-- with 2 special cases:
--   - no datapoint was ever received:
--     'nodata', {}
--   - no more datapoints are received for a metric:
--     'missing', {value=-1, fields={}}
-- There is a drawback with the 'missing' state: it can lead to false positive
-- states. For example, when the monitored thing has been renamed or deleted,
-- it's normal to no longer receive datapoints .. for example a filesystem.
function Rule:evaluate(ns)
    local fields = {}
    local one_match, one_no_match, one_missing_data = false, false, false
    for _, id in ipairs(self.ids_datastore) do
        local data = self.datastore[id]
        if data then
            local cbuf_time = data.cbuf:current_time()
            -- if we didn't receive a datapoint within the observation window,
            -- this means we no longer receive data and cannot compute the rule.
            if ns - cbuf_time > self.observation_window * 1e9 then
                one_missing_data = true
                fields[#fields+1] = {value = -1, fields = data.fields}
            else
                assert(available_functions[self.fct])
                local result

                if self.fct == 'last' then
                    local last
                    local t = ns
                    while (not isnumber(last)) and t >= ns - self.observation_window * 1e9 do
                        last = data.cbuf:get(t, 1)
                        t = t - SECONDS_PER_ROW * 1e9
                    end
                    if isnumber(last) then
                        result = last
                    else
                        one_missing_data = true
                        fields[#fields+1] = {value = -1, fields = data.fields}
                    end
                elseif self.fct == 'diff' then
                    local first, last

                    local t = ns
                    while (not isnumber(last)) and t >= ns - self.observation_window * 1e9 do
                        last = data.cbuf:get(t, 1)
                        t = t - SECONDS_PER_ROW * 1e9
                    end

                    if isnumber(last) then
                        t = ns - self.observation_window * 1e9
                        while (not isnumber(first)) and t <= ns do
                            first = data.cbuf:get(t, 1)
                            t = t + SECONDS_PER_ROW * 1e9
                        end
                    end

                    if not isnumber(last) or not isnumber(first) then
                        one_missing_data = true
                        fields[#fields+1] = {value = -1, fields = data.fields}
                    else
                        result = last - first
                    end
                else
                    local values = data.cbuf:get_range(1)
                    result = stats[self.fct](values)
                end

                if result then
                    local m = self:compare_threshold(result)
                    if m then
                        one_match = true
                        fields[#fields+1] = {value=result, fields=data.fields}
                    else
                        one_no_match = true
                    end
                end
            end
        end
    end
    if one_match then
        return afd.MATCH, fields
    elseif one_missing_data then
        return afd.MISSING_DATA, fields
    elseif one_no_match then
        return afd.NO_MATCH, {}
    else
        return afd.NO_DATA, {{value=-1, fields=self.fields}}
    end
end

return Rule
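For intuition, a rule such as "function `avg`, window 120, threshold 5" asks whether the aggregate of the datapoints inside the observation window crosses the threshold, with a special `nodata` outcome when the window is empty. A minimal Python re-sketch of that idea, deliberately independent of the circular-buffer machinery (the function name and argument shapes here are illustrative, not part of the Lua module):

```python
def evaluate_avg_rule(datapoints, now_ns, window_s, op_fn, threshold):
    """Average the datapoints inside [now - window, now] and compare the
    result against the threshold; return 'nodata' when the window is empty."""
    window_ns = window_s * int(1e9)
    # keep only the values whose timestamp falls inside the window
    values = [v for ts, v in datapoints if now_ns - ts <= window_ns]
    if not values:
        return 'nodata'
    avg = sum(values) / len(values)
    return op_fn(avg, threshold)
```

The Lua version additionally tracks per-`group_by` datastores and a `missing` state for series that stop emitting; this sketch only shows the core window-aggregate-compare step.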
@@ -1,78 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

-- The status values were chosen to match the Grafana constraints:
-- OKAY => green
-- WARN & UNKW => orange
-- CRIT & DOWN => red
OKAY=0
WARN=1
UNKW=2
CRIT=3
DOWN=4

local STATUS_LABELS = {
    [OKAY]='OKAY',
    [WARN]='WARN',
    [UNKW]='UNKNOWN',
    [CRIT]='CRITICAL',
    [DOWN]='DOWN'
}

function status_label(v)
    return STATUS_LABELS[v]
end

local STATUS_WEIGHTS = {
    [UNKW]=0,
    [OKAY]=1,
    [WARN]=2,
    [CRIT]=3,
    [DOWN]=4
}

function max_status(val1, val2)
    if not val1 then
        return val2
    elseif not val2 then
        return val1
    elseif STATUS_WEIGHTS[val1] > STATUS_WEIGHTS[val2] then
        return val1
    else
        return val2
    end
end

function compare_threshold(value, op, threshold)
    local rule_matches = false
    if op == '==' or op == 'eq' then
        rule_matches = value == threshold
    elseif op == '!=' or op == 'ne' then
        rule_matches = value ~= threshold
    elseif op == '>=' or op == 'gte' then
        rule_matches = value >= threshold
    elseif op == '>' or op == 'gt' then
        rule_matches = value > threshold
    elseif op == '<=' or op == 'lte' then
        rule_matches = value <= threshold
    elseif op == '<' or op == 'lt' then
        rule_matches = value < threshold
    end
    return rule_matches
end

return M
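The `compare_threshold` helper above accepts both symbolic (`>=`) and mnemonic (`gte`) operator spellings and falls back to no match for anything else. A compact Python re-sketch of the same dispatch (a sketch of the Lua logic, not part of the module itself):

```python
import operator

# Map each spelling of an operator to a comparison function; the symbolic
# ('>=') and mnemonic ('gte') forms resolve to the same check.
_OPS = {
    '==': operator.eq, 'eq': operator.eq,
    '!=': operator.ne, 'ne': operator.ne,
    '>=': operator.ge, 'gte': operator.ge,
    '>':  operator.gt, 'gt':  operator.gt,
    '<=': operator.le, 'lte': operator.le,
    '<':  operator.lt, 'lt':  operator.lt,
}

def compare_threshold(value, op, threshold):
    fn = _OPS.get(op)
    # an unknown operator matches nothing, mirroring the Lua default of false
    return fn(value, threshold) if fn else False
```

A table-driven dispatch like this keeps the two spellings in one place instead of repeating each comparison in an if/elseif chain.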
@@ -1,88 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local cjson = require 'cjson'

local inject_message = inject_message
local read_message = read_message
local string = string
local pcall = pcall

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

-- Return the value and index of the last field with a given name.
function read_field(name)
    local i = -1
    local value = nil
    local variable_name = string.format('Fields[%s]', name)
    repeat
        local tmp = read_message(variable_name, i + 1)
        if tmp == nil then
            break
        end
        value = tmp
        i = i + 1
    until false
    return value, i
end


-- Extract value(s) from the message. The value can be either a scalar value
-- or a table for multi-value metrics. Return nil and an error message on
-- failure. The argument "tags" is optional; it's used for sanity checks.
function read_values(tags)
    if not tags then
        tags = {}
    end
    local value
    local value_fields, value_fields_index = read_field('value_fields')
    if value_fields ~= nil then
        if tags['value_fields'] ~= nil and value_fields_index == 0 then
            return nil, 'index of field "value_fields" should not be 0'
        end
        local i = 0
        value = {}
        repeat
            local value_key = read_message(
                'Fields[value_fields]', value_fields_index, i)
            if value_key == nil then
                break
            end
            local value_val, value_index = read_field(value_key)
            if value_val == nil then
                return nil, string.format('field "%s" is missing', value_key)
            end
            if tags[value_key] ~= nil and value_index == 0 then
                return nil, string.format(
                    'index of field "%s" should not be 0', value_key)
            end
            value[value_key] = value_val
            i = i + 1
        until false
    else
        local value_index
        value, value_index = read_field('value')
        if value == nil then
            -- "value" is a required field
            return nil, 'field "value" is missing'
        end
        if tags['value'] ~= nil and value_index == 0 then
            return nil, 'index of field "value" should not be 0'
        end
    end
    return value, ''
end

return M
@@ -1,34 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local l = require 'lpeg'
l.locale(l)

local tonumber = tonumber

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

function anywhere (patt)
    return l.P {
        patt + 1 * l.V(1)
    }
end

sp = l.space

-- Pattern used to match a number
Number = l.P"-"^-1 * l.xdigit^1 * (l.S(".,") * l.xdigit^1 )^-1 / tonumber

return M
@@ -1,83 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local table = require 'table'
local ipairs = ipairs
local pairs = pairs
local type = type

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

-- return the position (index) of an item in a list, nil if not found
function item_pos(item, list)
    if type(list) == 'table' then
        for i, v in ipairs(list) do
            if v == item then
                return i
            end
        end
    end
end

-- return true if an item is present in the list, false otherwise
function item_find(item, list)
    return item_pos(item, list) ~= nil
end

-- from http://lua-users.org/wiki/SortedIteration
function __genOrderedIndex( t )
    local orderedIndex = {}
    for key in pairs(t) do
        table.insert( orderedIndex, key )
    end
    table.sort( orderedIndex )
    return orderedIndex
end

function orderedNext(t, state)
    -- Equivalent of the next function, but returns the keys in alphabetic
    -- order. We use a temporary ordered key table that is stored in the
    -- table being iterated.

    key = nil
    if state == nil then
        -- the first time, generate the index
        t.__orderedIndex = __genOrderedIndex( t )
        key = t.__orderedIndex[1]
    else
        -- fetch the next value
        for i = 1,table.getn(t.__orderedIndex) do
            if t.__orderedIndex[i] == state then
                key = t.__orderedIndex[i+1]
            end
        end
    end

    if key then
        return key, t[key]
    end

    -- no more values to return, cleanup
    t.__orderedIndex = nil
    return
end

function orderedPairs(t)
    -- Equivalent of the pairs() function on tables, but iterates over the
    -- keys in order
    return orderedNext, t, nil
end

return M
@@ -1,46 +0,0 @@
-- Copyright 2015-2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local cjson = require 'cjson'

local inject_message = inject_message
local read_message = read_message
local string = string
local pcall = pcall

local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

-- Encode a Lua variable as JSON without raising an exception if the encoding
-- fails for some reason (for instance, the encoded buffer exceeds the sandbox
-- limit)
function safe_json_encode(v)
    local ok, data = pcall(cjson.encode, v)
    if not ok then
        return
    end
    return data
end

-- Call inject_message() wrapped by pcall()
function safe_inject_message(msg)
    local ok, err_msg = pcall(inject_message, msg)
    if not ok then
        return -1, err_msg
    else
        return 0
    end
end

return M
@@ -1,171 +0,0 @@
-- Copyright 2016 Mirantis, Inc.
--
-- Licensed under the Apache License, Version 2.0 (the "License");
-- you may not use this file except in compliance with the License.
-- You may obtain a copy of the License at
--
--     http://www.apache.org/licenses/LICENSE-2.0
--
-- Unless required by applicable law or agreed to in writing, software
-- distributed under the License is distributed on an "AS IS" BASIS,
-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-- See the License for the specific language governing permissions and
-- limitations under the License.

local l = require "lpeg"
l.locale(l)
local pcall = pcall
local string = require 'string'

local patterns = require 'stacklight.patterns'
local error = error
local setmetatable = setmetatable
local tonumber = tonumber

local C = l.C
local P = l.P
local S = l.S
local V = l.V
local Ct = l.Ct
local Cc = l.Cc

local Optional_space = patterns.sp^0
local Only_spaces = patterns.sp^1 * -1

local function space(pat)
    return Optional_space * pat * Optional_space
end

local EQ = P'=='
local NEQ = P'!='
local GT = P'>'
local LT = P'<'
local GTE = P'>='
local LTE = P'<='
local MATCH = P'=~'
local NO_MATCH = P'!~'

local OR = P'||'
local AND = P'&&'

local function get_operator(op)
    if op == '' then
        return '=='
    end
    return op
end

local numerical_operator = (EQ + NEQ + LTE + GTE + GT + LT )^-1 / get_operator
local sub_numerical_expression = space(numerical_operator) * patterns.Number * Optional_space
local is_plain_numeric = (sub_numerical_expression * ((OR^1 + AND^1) * sub_numerical_expression)^0) * -1

local quoted_string = (P'"' * C((P(1) - (P'"'))^1) * P'"' + C((P(1) - patterns.sp)^1))
local string_operator = (EQ + NEQ + MATCH + NO_MATCH)^-1 / get_operator
local sub_string_expression = space(string_operator) * quoted_string * Optional_space
local is_plain_string = (sub_string_expression * ((OR^1 + AND^1) * sub_string_expression)^0) * -1

local numerical_expression = P {
    'OR';
    AND = Ct(Cc('and') * V'SUB' * space(AND) * V'AND' + V'SUB'),
    OR = Ct(Cc('or') * V'AND' * space(OR) * V'OR' + V'AND'),
    SUB = Ct(sub_numerical_expression)
} * -1

local string_expression = P {
    'OR';
    AND = Ct(Cc('and') * V'SUB' * space(AND) * V'AND' + V'SUB'),
    OR = Ct(Cc('or') * V'AND' * space(OR) * V'OR' + V'AND'),
    SUB = Ct(sub_string_expression)
} * -1

local is_complex = patterns.anywhere(EQ + NEQ + LTE + GTE + GT + LT + MATCH + NO_MATCH + OR + AND)

local function eval_tree(tree, value)
    local match = false

    if type(tree[1]) == 'table' then
        match = eval_tree(tree[1], value)
    else
        local operator = tree[1]
        if operator == 'and' or operator == 'or' then
            match = eval_tree(tree[2], value)
            for i=3, #tree, 1 do
                local m = eval_tree(tree[i], value)
                if operator == 'or' then
                    match = match or m
                else
                    match = match and m
                end
            end
        else
            local matcher = tree[2]
            if operator == '==' then
                return value == matcher
            elseif operator == '!=' then
                return value ~= matcher
            elseif operator == '>' then
                return value > matcher
            elseif operator == '<' then
                return value < matcher
            elseif operator == '>=' then
                return value >= matcher
            elseif operator == '<=' then
                return value <= matcher
            elseif operator == '=~' then
                local ok, m = pcall(string.find, value, matcher)
                return ok and m ~= nil
            elseif operator == '!~' then
                local ok, m = pcall(string.find, value, matcher)
                return ok and m == nil
            end
        end
    end
    return match
end

local MatchExpression = {}
MatchExpression.__index = MatchExpression

setfenv(1, MatchExpression) -- Remove external access to contain everything in the module

function MatchExpression.new(expression)
    local r = {}
    setmetatable(r, MatchExpression)
    if is_complex:match(expression) then
        r.is_plain_numeric_exp = is_plain_numeric:match(expression) ~= nil

        if r.is_plain_numeric_exp then
            r.tree = numerical_expression:match(expression)
        elseif is_plain_string:match(expression) ~= nil then
            r.tree = string_expression:match(expression)
        end
        if r.tree == nil then
            error('Invalid expression: ' .. expression)
        end
    else
        if expression == '' or Only_spaces:match(expression) then
            error('Expression is empty')
        end
        r.is_simple_equality_matching = true
    end
    r.expression = expression

    return r
end

function MatchExpression:matches(value)
    if self.is_simple_equality_matching then
        return self.expression == value or
            tonumber(self.expression) == value or
            tonumber(value) == self.expression
    end
    if self.is_plain_numeric_exp then
        value = tonumber(value)
        if value == nil then
            return false
        end
    end
    return eval_tree(self.tree, value)
end

return MatchExpression
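The match expressions above combine comparison atoms with `&&` and `||`, with an empty operator defaulting to equality, and the LPeg grammar captures them as nested `('and'|'or', sub, sub, ...)` trees with `(operator, operand)` leaves. A hedged Python sketch of the same tree evaluation for the numeric operators (the LPeg parser and the `=~`/`!~` regex cases are not reproduced):

```python
def eval_tree(tree, value):
    """Evaluate a parsed match-expression tree against a value.

    A tree is either ('and'|'or', subtree, subtree, ...) or an
    (operator, operand) leaf, mirroring the Lua eval_tree above.
    """
    op = tree[0]
    if op in ('and', 'or'):
        # evaluate each sub-expression and combine with the boolean operator
        results = [eval_tree(sub, value) for sub in tree[1:]]
        return any(results) if op == 'or' else all(results)
    operand = tree[1]
    return {
        '==': value == operand,
        '!=': value != operand,
        '>':  value > operand,
        '<':  value < operand,
        '>=': value >= operand,
        '<=': value <= operand,
    }[op]

# An expression like ">= 5 || == 0" would parse to a tree such as:
example_tree = ('or', ('>=', 5), ('==', 0))
```

This mirrors the short-circuit-free folding loop in the Lua version, which also evaluates every sub-expression before combining.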
@@ -1,71 +0,0 @@
local M = {}
setfenv(1, M) -- Remove external access to contain everything in the module

local alarms = {
    {
        ['name'] = 'cpu-critical',
        ['description'] = 'The CPU usage is too high',
        ['severity'] = 'critical',
        ['trigger'] = {
            ['logical_operator'] = 'or',
            ['rules'] = {
                {
                    ['metric'] = 'intel.procfs.cpu.idle_percentage',
                    ['fields'] = {
                        ['cpuID'] = 'all'
                    },
                    ['relational_operator'] = '<=',
                    ['threshold'] = '5',
                    ['window'] = '120',
                    ['periods'] = '0',
                    ['function'] = 'avg',
                },
                {
                    ['metric'] = 'intel.procfs.cpu.iowait_percentage',
                    ['fields'] = {
                        ['cpuID'] = 'all'
                    },
                    ['relational_operator'] = '>=',
                    ['threshold'] = '35',
                    ['window'] = '120',
                    ['periods'] = '0',
                    ['function'] = 'avg',
                },
            },
        },
    },
    {
        ['name'] = 'cpu-warning',
        ['description'] = 'The CPU usage is high',
        ['severity'] = 'warning',
        ['trigger'] = {
            ['logical_operator'] = 'or',
            ['rules'] = {
                {
                    ['metric'] = 'intel.procfs.cpu.idle_percentage',
                    ['fields'] = {
                        ['cpuID'] = 'all'
                    },
                    ['relational_operator'] = '<=',
                    ['threshold'] = '15',
                    ['window'] = '120',
                    ['periods'] = '0',
                    ['function'] = 'avg',
                },
                {
                    ['metric'] = 'intel.procfs.cpu.iowait_percentage',
                    ['fields'] = {
                        ['cpuID'] = 'all'
                    },
                    ['relational_operator'] = '>=',
                    ['threshold'] = '25',
                    ['window'] = '120',
                    ['periods'] = '0',
                    ['function'] = 'avg',
                },
            },
        },
    },
}

return alarms
@@ -1,3 +0,0 @@
Package: *
Pin: origin "mirror.fuel-infra.org"
Pin-Priority: 500
@ -1,207 +0,0 @@
|
|||
--
|
||||
-- Inspired from the lua_sandbox Postgres Output Example
|
||||
-- https://github.com/mozilla-services/lua_sandbox/blob/f1ee9eb/docs/heka/output.md#example-postgres-output
|
||||
--
|
||||
|
||||
local os = require 'os'
|
||||
local http = require 'socket.http'
|
||||
local message = require 'stacklight.message'
|
||||
|
||||
--local write = require 'io'.write
|
||||
--local flush = require 'io'.flush
|
||||
|
||||
local influxdb_host = read_config('host') or error('influxdb host is required')
|
||||
local influxdb_port = read_config('port') or error('influxdb port is required')
|
||||
|
||||
local batch_max_lines = read_config('batch_max_lines') or 3000
|
||||
assert(batch_max_lines > 0, 'batch_max_lines must be greater than zero')
|
||||
|
||||
local db = read_config("database") or error("database config is required")
|
||||
|
||||
local write_url = string.format('http://%s:%d/write?db=%s', influxdb_host, influxdb_port, db)
|
||||
local query_url = string.format('http://%s:%s/query', influxdb_host, influxdb_port)
|
||||
|
||||
local database_created = false
|
||||
|
||||
local buffer = {}
|
||||
local buffer_len = 0
|
||||
|
||||
|
||||
local function escape_string(str)
|
||||
return tostring(str):gsub("([ ,])", "\\%1")
|
||||
end
|
||||
|
||||
local function encode_scalar_value(value)
|
||||
if type(value) == "number" then
|
||||
-- Always send numbers as formatted floats, so InfluxDB will accept
|
||||
-- them if they happen to change from ints to floats between
|
||||
-- points in time. Forcing them to always be floats avoids this.
|
||||
return string.format("%.6f", value)
|
||||
elseif type(value) == "string" then
|
||||
-- string values need to be double quoted
|
||||
return '"' .. value:gsub('"', '\\"') .. '"'
|
||||
elseif type(value) == "boolean" then
|
||||
return '"' .. tostring(value) .. '"'
|
||||
end
|
||||
end
|
||||
|
||||
local function encode_value(value)
|
||||
if type(value) == "table" then
|
||||
local values = {}
|
||||
for k, v in pairs(value) do
|
||||
table.insert(
|
||||
values,
|
||||
string.format("%s=%s", escape_string(k), encode_scalar_value(v))
|
||||
)
|
||||
end
|
||||
return table.concat(values, ',')
|
||||
else
|
||||
return "value=" .. encode_scalar_value(value)
|
||||
end
|
||||
end
|
||||
|
||||
local function write_batch()
|
||||
assert(buffer_len > 0)
|
||||
local body = table.concat(buffer, '\n')
|
||||
local resp_body, resp_status = http.request(write_url, body)
|
||||
if resp_body and resp_status == 204 then
|
||||
-- success
|
||||
buffer = {}
|
||||
buffer_len = 0
|
||||
return resp_body, ''
|
||||
else
|
||||
-- error
|
||||
local err_msg = resp_status
|
||||
if resp_body then
|
||||
err_msg = string.format('influxdb write error: [%s] %s',
|
||||
resp_status, resp_body)
|
||||
end
|
||||
return nil, err_msg
|
||||
end
|
||||
end
|
||||
|
||||
|
||||
local function create_database()
|
||||
-- query won't fail if database already exists
|
||||
local body = string.format('q=CREATE DATABASE %s', db)
|
||||
local resp_body, resp_status = http.request(query_url, body)
|
||||
if resp_body and resp_status == 200 then
|
||||
-- success
|
||||
return resp_body, ''
|
||||
else
|
||||
-- error
|
||||
local err_msg = resp_status
|
||||
if resp_body then
|
||||
err_msg = string.format('influxdb create database error [%s] %s',
|
||||
resp_status, resp_body)
|
||||
end
|
||||
return nil, err_msg
|
||||
end
|
||||
end
|
||||
|
||||
|
||||
-- create a line for the current message, return nil and an error string
|
||||
-- if the message is invalid
|
||||
local function create_line()
|
||||
|
local tags = {}
local dimensions, dimensions_index = message.read_field('dimensions')
if dimensions then
    local i = 0
    repeat
        local tag_key = read_message('Fields[dimensions]', dimensions_index, i)
        if tag_key == nil then
            break
        end
        -- skip the plugin_running_on dimension
        if tag_key ~= 'plugin_running_on' then
            local variable_name = string.format('Fields[%s]', tag_key)
            local tag_val = read_message(variable_name, 0)
            if tag_val == nil then
                -- the dimension is advertised in the "dimensions" field
                -- but there is no field for it, so we consider the
                -- entire message as invalid
                return nil, string.format('dimension "%s" is missing', tag_key)
            end
            tags[escape_string(tag_key)] = escape_string(tag_val)
        end
        i = i + 1
    until false
end

if tags['dimensions'] ~= nil and dimensions_index == 0 then
    return nil, 'index of field "dimensions" should not be 0'
end

local name, name_index = message.read_field('name')
if name == nil then
    -- "name" is a required field
    return nil, 'field "name" is missing'
end
if tags['name'] ~= nil and name_index == 0 then
    return nil, 'index of field "name" should not be 0'
end

local value, err_msg = message.read_values(tags)
if value == nil then
    return nil, err_msg
end

local tags_array = {}
for tag_key, tag_val in pairs(tags) do
    table.insert(tags_array, string.format('%s=%s', tag_key, tag_val))
end

return string.format('%s,%s %s %d',
                     escape_string(name),
                     table.concat(tags_array, ','),
                     encode_value(value),
                     string.format('%d', read_message('Timestamp'))), ''
end


function process_message()
    if not database_created then
        local ok, err_msg = create_database()
        if not ok then
            return -3, err_msg -- retry
        end
        database_created = true
    end

    local line, err_msg = create_line()
    if line == nil then
        -- the message is not valid, skip it
        return -2, err_msg -- skip
    end

    buffer_len = buffer_len + 1
    buffer[buffer_len] = line

    if buffer_len > batch_max_lines then
        local ok, err_msg = write_batch()
        if not ok then
            buffer[buffer_len] = nil
            buffer_len = buffer_len - 1
            -- recreate database on retry
            if string.match(err_msg, 'database not found') then
                database_created = false
            end
            return -3, err_msg -- retry
        end
        return 0
    end

    return -4 -- batching
end


function timer_event(ns)
    if buffer_len > 0 then
        local ok, _ = write_batch()
        if ok then
            update_checkpoint()
        end
    end
end
@ -1 +0,0 @@
deb http://mirror.fuel-infra.org/mos-repos/ubuntu/snapshots/9.0-latest/ mos9.0-proposed main
@ -1,18 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}

# NOTE(elemoine): the InfluxDB package is downloaded from dl.influxdata.com. Do
# we want to host the package instead?

RUN gpg \
    --keyserver hkp://ha.pool.sks-keyservers.net \
    --recv-keys 05CE15085FC09D18E99EFB22684A14CF2582E0C5 \
    && curl https://dl.influxdata.com/influxdb/releases/influxdb_{{ influxdb_version }}_amd64.deb.asc -o /tmp/influxdb.deb.asc \
    && curl https://dl.influxdata.com/influxdb/releases/influxdb_{{ influxdb_version }}_amd64.deb -o /tmp/influxdb.deb \
    && gpg --batch --verify /tmp/influxdb.deb.asc /tmp/influxdb.deb \
    && dpkg -i /tmp/influxdb.deb \
    && chown -R influxdb: /etc/influxdb \
    && usermod -a -G microservices influxdb \
    && rm -f /tmp/influxdb.deb.asc /tmp/influxdb.deb

USER influxdb
@ -1,11 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}

RUN curl https://download.elastic.co/kibana/kibana/kibana-{{ kibana_version }}-amd64.deb -o /tmp/kibana.deb \
    && dpkg -i /tmp/kibana.deb \
    && rm -f /tmp/kibana.deb

RUN usermod -a -G microservices kibana \
    && chown -R kibana: /opt/kibana

USER kibana
@ -1,16 +0,0 @@
FROM {{ image_spec("base-tools") }}
MAINTAINER {{ maintainer }}

# Install Snap
ADD install.sh /tmp/
RUN mkdir -p /etc/snap/auto \
    && bash /tmp/install.sh /etc/snap/auto \
    && rm /tmp/install.sh \
    && useradd --user-group snap \
    && usermod -a -G microservices snap \
    && chown -R snap: /etc/snap \
    && apt-get purge -y --auto-remove \
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

USER snap
@ -1,58 +0,0 @@
#!/bin/bash

set -e

AUTO_DISCOVERY_PATH="$1"

#
# Install Snap and Snap plugins
#

# Snap release, platform and architecture
RELEASE=v0.15.0-beta-126-g9a05d66
PLATFORM=linux
ARCH=amd64
TDIR=/tmp/snap

# Binary storage service URI components
PROTOCOL=https
HOST=bintray.com
BASEURL="mirantis/snap/download_file?file_path="

mkdir -p $TDIR
# Retrieve archived binaries and extract them in a temporary location
for a in snap snap-plugins; do
    f="${a}-${RELEASE}-${PLATFORM}-${ARCH}.tar.gz"
    # -L required due to potential successive redirections
    curl -s -k -L -o $TDIR/$f ${PROTOCOL}://${HOST}/${BASEURL}$f
    tar zxCf $TDIR $TDIR/$f --exclude '*mock[12]'
done

# Copy retrieved binaries, excluding demo plugins
install --owner=root --group=root --mode=755 $TDIR/snap-${RELEASE}/bin/* $TDIR/snap-${RELEASE}/plugin/* /usr/local/bin
# Make the plugins auto-loadable by the Snap framework
for f in /usr/local/bin/snap-plugin*; do
    ln -s $f $AUTO_DISCOVERY_PATH
done

# Update some permissions for plugins which require privileged access to the
# filesystem
#
# the processes snap plugin accesses files like /proc/1/io which
# only the root user can read
#
# the smart snap plugin accesses files in /host-proc and /host-dev (/proc and /dev
# from the host) which also requires root user access
#
for f in snap-plugin-collector-processes snap-plugin-collector-smart; do
    chmod u+s /usr/local/bin/$f
done

#
# Clean up
#
apt-get purge -y --auto-remove $BUILD_DEPS
apt-get clean
rm -rf /var/lib/apt/lists/*
rm -rf $TDIR

exit 0
@ -1,36 +0,0 @@
dsl_version: 0.1.0
service:
  name: cron
  kind: DaemonSet
  containers:
    - name: cron
      image: cron
      volumes:
        - name: mysql-logs
          path: "/var/log/ccp/mysql"
          type: host
          readOnly: False
        - name: rabbitmq-logs
          path: "/var/log/ccp/rabbitmq"
          type: host
          readOnly: False
        - name: keystone-logs
          path: "/var/log/ccp/keystone"
          type: host
          readOnly: False
        - name: horizon-logs
          path: "/var/log/ccp/horizon"
          type: host
          readOnly: False
      daemon:
        command: cron -f
        files:
          - logrotate.conf
          - logrotate-services.conf
files:
  logrotate.conf:
    path: /etc/logrotate.conf
    content: cron-logrotate-global.conf.j2
  logrotate-services.conf:
    path: /etc/logrotate.d/logrotate-services.conf
    content: cron-logrotate-services.conf.j2
@ -1,11 +0,0 @@
filename = "afd.lua"
-- log_level = 7
message_matcher = "TRUE"
ticker_interval = 10
{% raw %}
afd_type = "node"
afd_file = "{{ afd_file }}"
afd_cluster_name = "{{ afd_cluster_name }}"
afd_logical_name = "{{ afd_logical_name }}"
{% endraw %}
hostname = "{{ node_name }}"
@ -1,79 +0,0 @@
alarms:
  - name: 'root-fs-warning'
    description: 'The root filesystem free space is low'
    severity: 'warning'
    enabled: 'true'
    trigger:
      rules:
        - metric: 'intel.procfs.filesystem.space_percent_free'
          fields:
            filesystem: 'rootfs'
          relational_operator: '<'
          threshold: 10
          window: 60
          periods: 0
          function: min
  - name: 'root-fs-critical'
    description: 'The root filesystem free space is too low'
    severity: 'critical'
    enabled: 'true'
    trigger:
      rules:
        - metric: 'intel.procfs.filesystem.space_percent_free'
          fields:
            filesystem: 'rootfs'
          relational_operator: '<'
          threshold: 5
          window: 60
          periods: 0
          function: min
  - name: 'cpu-critical'
    description: 'The CPU usage is too high'
    severity: 'critical'
    trigger:
      logical_operator: 'or'
      rules:
        - metric: 'intel.procfs.cpu.idle_percentage'
          fields:
            cpuID: 'all'
          relational_operator: '<='
          threshold: '5'
          window: '120'
          periods: '0'
          function: 'avg'
        - metric: 'intel.procfs.cpu.iowait_percentage'
          fields:
            cpuID: 'all'
          relational_operator: '>='
          threshold: '35'
          window: '120'
          periods: '0'
          function: 'avg'
  - name: 'cpu-warning'
    description: 'The CPU usage is high'
    severity: 'warning'
    trigger:
      logical_operator: 'or'
      rules:
        - metric: 'intel.procfs.cpu.idle_percentage'
          fields:
            cpuID: 'all'
          relational_operator: '<='
          threshold: '15'
          window: '120'
          periods: '0'
          function: 'avg'
        - metric: 'intel.procfs.cpu.iowait_percentage'
          fields:
            cpuID: 'all'
          relational_operator: '>='
          threshold: '25'
          window: '120'
          periods: '0'
          function: 'avg'

node_cluster_alarms:
  system:
    alarms:
      rootfs: ['root-fs-critical', 'root-fs-warning']
      cpu: ['cpu-critical', 'cpu-warning']
@ -1,12 +0,0 @@
{{ cron.rotate.interval }}
rotate {{ cron.rotate.days }}
copytruncate
compress
delaycompress
notifempty
missingok

minsize {{ cron.rotate.minsize }}
maxsize {{ cron.rotate.maxsize }}

include /etc/logrotate.d
@ -1,12 +0,0 @@
{% set services = [
    'mysql',
    'rabbitmq',
    'keystone',
    'horizon',
  ]
%}

{% for service in services %}
"/var/log/ccp/{{ service }}/*.log"
{}
{% endfor %}
@ -1,27 +0,0 @@
configs:
  kibana:
    port:
      cont: 5601
  heka:
    max_procs: 2
    service_pattern: "^k8s_(.-)%..*"
  hindsight_heka_tcp_port:
    cont: 5565
  influxdb:
    database: "ccp"
    host: "influxdb"
    password: ""
    port:
      cont: 8086
    user: ""
  snap:
    log_level: 3
  cron:
    rotate:
      interval: "daily"
      days: 6
      minsize: "1M"
      maxsize: "100M"
versions:
  influxdb_version: "0.13.0"
  kibana_version: "4.6.1"
@ -1,40 +0,0 @@
#!/bin/bash

GRAFANA_URL="http://{{ grafana.user }}:{{ grafana.password }}@{{ address('grafana') }}:{{ grafana.port.cont }}"

echo "Waiting for Grafana to come up..."
until $(curl --fail --output /dev/null --silent ${GRAFANA_URL}/api/org); do
    printf "."
    sleep 2
done
echo -e "Grafana is up and running.\n"

echo "Creating InfluxDB datasource..."
curl -i -XPOST -H "Accept: application/json" -H "Content-Type: application/json" "${GRAFANA_URL}/api/datasources" -d '
{
    "name": "CCP InfluxDB",
    "type": "influxdb",
    "access": "proxy",
    "isDefault": true,
    "url": "'"{{ address('influxdb', influxdb.port, with_scheme=True) }}"'",
    "password": "'"{{ influxdb.password }}"'",
    "user": "'"{{ influxdb.user }}"'",
    "database": "'"{{ influxdb.database }}"'"
}'
if [ $? -ne 0 ]; then
    echo "Cannot create InfluxDB datasource"
    exit 1
fi
echo -e "InfluxDB datasource was successfully created.\n"

echo "Importing default dashboards..."
for dashboard in /tmp/*.dashboard.json; do
    echo -e "\tImporting ${dashboard}..."
    curl -i -XPOST --data "@${dashboard}" -H "Accept: application/json" -H "Content-Type: application/json" "${GRAFANA_URL}/api/dashboards/db"
    if [ $? -ne 0 ]; then
        echo "Error importing ${dashboard}"
        exit 1
    fi
    echo -e "\tDone"
done
echo -e "Default dashboards successfully imported\n"
@ -1,10 +0,0 @@
[docker_logs_decoder]
type = "MultiDecoder"
subs = ['openstack_log_decoder', 'ovs_log_decoder']
cascade_strategy = "first-wins"
log_sub_errors = false

[docker_log_input]
type = "DockerLogInput"
decoder = "docker_logs_decoder"
log_decode_failures = false
@ -1,16 +0,0 @@
[elasticsearch_json_encoder]
type = "ESJsonEncoder"
index = {% raw %}"%{Type}-%{%Y.%m.%d}"{% endraw %}
es_index_from_timestamp = true
fields = ["Timestamp", "Type", "Logger", "Severity", "Payload", "Pid", "Hostname", "DynamicFields"]

[elasticsearch_output]
type = "ElasticSearchOutput"
server = "{{ address('elasticsearch', elasticsearch.port, with_scheme=True) }}"
message_matcher = "Type == 'log'"
encoder = "elasticsearch_json_encoder"
use_buffering = true
[elasticsearch_output.buffering]
max_buffer_size = 1073741824 # 1024 * 1024 * 1024
max_file_size = 134217728 # 128 * 1024 * 1024
full_action = "block"
@ -1,10 +0,0 @@
[hekad]
maxprocs = {{ heka.max_procs }}

[debug_output]
type = "LogOutput"
message_matcher = "Fields[payload_name] == 'debug'"
encoder = "rst_encoder"

[rst_encoder]
type = "RstEncoder"
@ -1,13 +0,0 @@
[horizon_apache_log_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/os_horizon_apache_log.lua"
[horizon_apache_log_decoder.config]
access_log_pattern = '%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\"'

[horizon_apache_logstreamer_input]
type = "LogstreamerInput"
decoder = "horizon_apache_log_decoder"
log_directory = "/var/log/ccp/horizon"
file_match = 'horizon-(?P<Service>.+)\.log\.?(?P<Seq>\d*)$'
priority = ["^Seq"]
differentiator = ["horizon-", "Service"]
@ -1,13 +0,0 @@
[keystone_apache_log_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/os_keystone_apache_log.lua"
[keystone_apache_log_decoder.config]
apache_log_pattern = '%{X-Forwarded-For}i %l %u %t \"%r\" %>s %b %D \"%{Referer}i\" \"%{User-Agent}i\"'

[keystone_apache_logstreamer_input]
type = "LogstreamerInput"
decoder = "keystone_apache_log_decoder"
log_directory = "/var/log/ccp/keystone"
file_match = 'keystone-(?P<Service>.+)\.log\.?(?P<Seq>\d*)$'
priority = ["^Seq"]
differentiator = ["keystone-", "Service"]
@ -1,11 +0,0 @@
[mariadb_log_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/os_mysql_log.lua"

[mariadb_logstreamer_input]
type = "LogstreamerInput"
decoder = "mariadb_log_decoder"
log_directory = "/var/log/ccp/mysql"
file_match = 'mysql\.log\.?(?P<Seq>\d*)$'
priority = ["^Seq"]
differentiator = ['mysql']
@ -1,6 +0,0 @@
[openstack_log_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/os_openstack_log.lua"

[openstack_log_decoder.config]
heka_service_pattern = "{{ heka.service_pattern }}"
@ -1,6 +0,0 @@
[ovs_log_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/os_ovs.lua"

[ovs_log_decoder.config]
heka_service_pattern = "{{ heka.service_pattern }}"
@ -1,18 +0,0 @@
[rabbitmq_log_decoder]
type = "SandboxDecoder"
filename = "lua_decoders/os_rabbitmq_log.lua"

[rabbitmq_log_splitter]
type = "RegexSplitter"
delimiter = '\n\n(=[^=]+====)'
delimiter_eol = false
deliver_incomplete_final = true

[rabbitmq_logstreamer_input]
type = "LogstreamerInput"
decoder = "rabbitmq_log_decoder"
splitter = "rabbitmq_log_splitter"
log_directory = "/var/log/ccp/rabbitmq"
file_match = '(?P<Service>rabbitmq.*)\.log\.?(?P<Seq>\d*)$'
priority = ["^Seq"]
differentiator = ["Service"]
@ -1,22 +0,0 @@
output_path = "/var/lib/hindsight/output"
sandbox_load_path = "/var/lib/hindsight/load"
sandbox_run_path = "/var/lib/hindsight/run"
analysis_lua_path = "/usr/lib/x86_64-linux-gnu/luasandbox/modules/?.lua;/opt/ccp/lua/modules/?.lua"
analysis_lua_cpath = "/usr/lib/x86_64-linux-gnu/luasandbox/modules/?.so"
io_lua_path = analysis_lua_path .. ";/usr/lib/x86_64-linux-gnu/luasandbox/io_modules/?.lua"
io_lua_cpath = analysis_lua_cpath .. ";/usr/lib/x86_64-linux-gnu/luasandbox/io_modules/?.so"

hostname = "{{ node_name }}"

input_defaults = {
    -- output_limit = 64 * 1024
    -- memory_limit = 8 * 1024 * 1024
    -- instruction_limit = 1e6
    -- preserve_data = false
    -- ticker_interval = 0
}
analysis_defaults = {
}
output_defaults = {
}
@ -1,9 +0,0 @@
filename = "afd.lua"
log_level = 7
message_matcher = "TRUE"
ticker_interval = 10
afd_type = "node"
afd_file = "afd_node_default_cpu_alarms"
afd_cluster_name = "default"
afd_logical_name = "cpu"
hostname = "{{ node_name }}"
@ -1,6 +0,0 @@
filename = "heka_tcp.lua"
address = "localhost"
port = {{ hindsight_heka_tcp_port.cont }}
-- the heka_tcp plugin is a "Continuous" plugin, so instruction_limit
-- must be set to zero
instruction_limit = 0
@ -1,8 +0,0 @@
filename = "influxdb_tcp.lua"
host = "influxdb"
port = {{ influxdb.port.cont }}
database = "{{ influxdb.database }}"
batch_max_lines = 3000
message_matcher = "TRUE"
ticker_interval = 10
log_level = 7
@ -1,4 +0,0 @@
filename = "kubelet_stats.lua"
kubelet_stats_port = 10255
kubelet_stats_node = "{{ node_name }}"
ticker_interval = 10
@ -1,5 +0,0 @@
filename = "prune_input.lua"
ticker_interval = 60
input = true
analysis = true
exit_on_stall = false
@ -1,17 +0,0 @@
reporting-disabled = true

[meta]
dir = "/var/lib/influxdb/meta"

[data]
engine = "tsm1"
dir = "/var/lib/influxdb/data"
wal-dir = "/var/lib/influxdb/wal"

[admin]
enabled = true

[http]
auth-enabled = false # FIXME(elemoine)
bind-address = "{{ network_topology["private"]["address"] }}:{{ influxdb.port.cont }}"
log-enabled = false
@ -1,62 +0,0 @@
# Kibana is served by a back end server. This controls which port to use.
port: {{ kibana.port.cont }}

# The host to bind the server to.
host: "{{ network_topology["private"]["address"] }}"

# The Elasticsearch instance to use for all your queries.
elasticsearch_url: "{{ address('elasticsearch', elasticsearch.port, with_scheme=True) }}"

# elasticsearch_preserve_host true will send the hostname specified in `elasticsearch`. If you set it to false,
# then the host you use to connect to *this* Kibana instance will be sent.
elasticsearch_preserve_host: true

# Kibana uses an index in Elasticsearch to store saved searches, visualizations
# and dashboards. It will create a new index if it doesn't already exist.
kibana_index: ".kibana"

# If your Elasticsearch is protected with basic auth, these are the user credentials
# used by the Kibana server to perform maintenance on the kibana_index at startup. Your Kibana
# users will still need to authenticate with Elasticsearch (which is proxied through
# the Kibana server)
# kibana_elasticsearch_username: user
# kibana_elasticsearch_password: pass

# The default application to load.
# kibana.defaultAppId: "dashboard/Main"

# Time in milliseconds to wait for responses from the back end or Elasticsearch.
# This must be > 0
request_timeout: 300000

# Time in milliseconds for Elasticsearch to wait for responses from shards.
# Set to 0 to disable.
shard_timeout: 0

# Set to false to have a complete disregard for the validity of the SSL
# certificate.
verify_ssl: false

# If you need to provide a CA certificate for your Elasticsearch instance, put
# the path of the pem file here.
# ca: /path/to/your/CA.pem

# SSL for outgoing requests from the Kibana Server (PEM formatted)
# ssl_key_file: /path/to/your/server.key
# ssl_cert_file: /path/to/your/server.crt

# Set the path to where you would like the process id file to be created.
# pid_file: /var/run/kibana.pid

# Plugins that are included in the build, and no longer found in the plugins/ folder
bundled_plugin_ids:
  - plugins/dashboard/index
  - plugins/discover/index
  - plugins/doc/index
  - plugins/kibana/index
  - plugins/markdown_vis/index
  - plugins/metric_vis/index
  - plugins/settings/index
  - plugins/table_vis/index
  - plugins/vis_types/index
  - plugins/visualize/index
File diff suppressed because it is too large
@ -1,108 +0,0 @@
{
  "version": 1,
  "schedule": {
    "type": "simple",
    "interval": "10s"
  },
  "max-failures": 5,
  "workflow": {
    "collect": {
      "config": {
        "/intel": {
          "proc_path": "/host-proc"
        },
        "/intel/disk": {
          "dev_path": "/host-dev"
        }
      },
      "tags": {
        "/intel": {
          "hostname": "{{ node_name }}"
        }
      },
      "metrics": {
        "/intel/disk/smart/*": {},
        "/intel/procfs/cpu/*/idle_percentage": {},
        "/intel/procfs/cpu/*/iowait_percentage": {},
        "/intel/procfs/cpu/*/irq_percentage": {},
        "/intel/procfs/cpu/*/nice_percentage": {},
        "/intel/procfs/cpu/*/softirq_percentage": {},
        "/intel/procfs/cpu/*/steal_percentage": {},
        "/intel/procfs/cpu/*/system_percentage": {},
        "/intel/procfs/cpu/*/user_percentage": {},
        "/intel/procfs/filesystem/*/inodes_free": {},
        "/intel/procfs/filesystem/*/inodes_reserved": {},
        "/intel/procfs/filesystem/*/inodes_used": {},
        "/intel/procfs/filesystem/*/space_free": {},
        "/intel/procfs/filesystem/*/space_reserved": {},
        "/intel/procfs/filesystem/*/space_used": {},
        "/intel/procfs/filesystem/*/inodes_percent_free": {},
        "/intel/procfs/filesystem/*/inodes_percent_reserved": {},
        "/intel/procfs/filesystem/*/inodes_percent_used": {},
        "/intel/procfs/filesystem/*/space_percent_free": {},
        "/intel/procfs/filesystem/*/space_percent_reserved": {},
        "/intel/procfs/filesystem/*/space_percent_used": {},
        "/intel/procfs/filesystem/*/device_name": {},
        "/intel/procfs/filesystem/*/device_type": {},
        "/intel/procfs/disk/*/merged_read": {},
        "/intel/procfs/disk/*/merged_write": {},
        "/intel/procfs/disk/*/octets_read": {},
        "/intel/procfs/disk/*/octets_write": {},
        "/intel/procfs/disk/*/ops_read": {},
        "/intel/procfs/disk/*/ops_write": {},
        "/intel/procfs/disk/*/time_read": {},
        "/intel/procfs/disk/*/time_write": {},
        "/intel/procfs/iface/*/bytes_recv": {},
        "/intel/procfs/iface/*/bytes_sent": {},
        "/intel/procfs/iface/*/compressed_recv": {},
        "/intel/procfs/iface/*/compressed_sent": {},
        "/intel/procfs/iface/*/drop_recv": {},
        "/intel/procfs/iface/*/drop_sent": {},
        "/intel/procfs/iface/*/errs_recv": {},
        "/intel/procfs/iface/*/errs_sent": {},
        "/intel/procfs/iface/*/fifo_recv": {},
        "/intel/procfs/iface/*/fifo_sent": {},
        "/intel/procfs/iface/*/frame_recv": {},
        "/intel/procfs/iface/*/frame_sent": {},
        "/intel/procfs/iface/*/multicast_recv": {},
        "/intel/procfs/iface/*/multicast_sent": {},
        "/intel/procfs/iface/*/packets_recv": {},
        "/intel/procfs/iface/*/packets_sent": {},
        "/intel/procfs/load/min1": {},
        "/intel/procfs/load/min5": {},
        "/intel/procfs/load/min15": {},
        "/intel/procfs/load/min1_rel": {},
        "/intel/procfs/load/min5_rel": {},
        "/intel/procfs/load/min15_rel": {},
        "/intel/procfs/load/runnable_scheduling": {},
        "/intel/procfs/load/existing_scheduling": {},
        "/intel/procfs/meminfo/buffers": {},
        "/intel/procfs/meminfo/cached": {},
        "/intel/procfs/meminfo/mem_free": {},
        "/intel/procfs/meminfo/mem_used": {},
        "/intel/procfs/processes/dead": {},
        "/intel/procfs/processes/parked": {},
        "/intel/procfs/processes/running": {},
        "/intel/procfs/processes/sleeping": {},
        "/intel/procfs/processes/stopped": {},
        "/intel/procfs/processes/tracing": {},
        "/intel/procfs/processes/waiting": {},
        "/intel/procfs/processes/wakekill": {},
        "/intel/procfs/processes/waking": {},
        "/intel/procfs/processes/zombie": {},
        "/intel/procfs/swap/all/cached_bytes": {},
        "/intel/procfs/swap/all/free_bytes": {},
        "/intel/procfs/swap/io/in_pages_per_sec": {},
        "/intel/procfs/swap/io/out_pages_per_sec": {},
        "/intel/procfs/swap/all/used_bytes": {}
      },
      "publish": [{
        "plugin_name": "heka",
        "config": {
          "host": "localhost",
          "port": {{ hindsight_heka_tcp_port.cont }}
        }
      }]
    }
  }
}
@ -1,5 +0,0 @@
log_level: {{ snap.log_level }}
control:
  plugin_load_timeout: 15
  plugin_trust_level: 0
  auto_discover_path: /etc/snap/auto
File diff suppressed because it is too large
@ -1,69 +0,0 @@
dsl_version: 0.1.0
service:
  name: heka
  kind: DaemonSet
  containers:
    - name: heka
      image: heka
      volumes:
        - name: docker-sock
          type: host
          path: /run/docker.sock
        - name: mysql-logs
          path: "/var/log/ccp/mysql"
          type: host
          readOnly: True
        - name: rabbitmq-logs
          path: "/var/log/ccp/rabbitmq"
          type: host
          readOnly: True
        - name: keystone-logs
          path: "/var/log/ccp/keystone"
          type: host
          readOnly: True
        - name: horizon-logs
          path: "/var/log/ccp/horizon"
          type: host
          readOnly: True
      daemon:
        command: hekad --config=/etc/heka
        dependencies:
          - elasticsearch
        files:
          - heka-global.toml
          - heka-elasticsearch.toml
          - heka-mariadb.toml
          - heka-openstack.toml
          - heka-rabbitmq.toml
          - heka-ovs.toml
          - heka-dockerlogs.toml
          - heka-keystone.toml
          - heka-horizon.toml
files:
  heka-global.toml:
    path: /etc/heka/heka-global.toml
    content: heka-global.toml.j2
  heka-elasticsearch.toml:
    path: /etc/heka/heka-elasticsearch.toml
    content: heka-elasticsearch.toml.j2
  heka-mariadb.toml:
    path: /etc/heka/heka-mariadb.toml
    content: heka-mariadb.toml.j2
  heka-openstack.toml:
    path: /etc/heka/heka-openstack.toml
    content: heka-openstack.toml.j2
  heka-rabbitmq.toml:
    path: /etc/heka/heka-rabbitmq.toml
    content: heka-rabbitmq.toml.j2
  heka-ovs.toml:
    path: /etc/heka/heka-ovs.toml
    content: heka-ovs.toml.j2
  heka-dockerlogs.toml:
    path: /etc/heka/heka-dockerlogs.toml
    content: heka-dockerlogs.toml
  heka-keystone.toml:
    path: /etc/heka/heka-keystone.toml
    content: heka-keystone.toml.j2
  heka-horizon.toml:
    path: /etc/heka/heka-horizon.toml
    content: heka-horizon.toml.j2
@ -1,43 +0,0 @@
dsl_version: 0.1.0
service:
  name: influxdb
  ports:
    - {{ influxdb.port }}
  containers:
    - name: influxdb
      image: influxdb
      daemon:
        command: influxd -config /etc/influxdb/influxdb.conf
        files:
          - influxdb.conf
# {% if grafana is defined and grafana.enable %}
      post:
        - name: stacklight-grafana-configure
          command: /opt/ccp/bin/grafana-configure.sh
          type: single
          dependencies:
            - grafana
          files:
            - grafana-configure.sh
            - kubernetes-dashboard
            - system-dashboard
# {% endif %}
      volumes:
        - name: influxdb-data
          type: empty-dir
          path: /var/lib/influxdb
files:
  influxdb.conf:
    path: /etc/influxdb/influxdb.conf
    content: influxdb.conf.j2
    perm: "0600"
  grafana-configure.sh:
    path: /opt/ccp/bin/grafana-configure.sh
    content: grafana-configure.sh.j2
    perm: "0755"
  kubernetes-dashboard:
    path: /tmp/kubernetes.dashboard.json
    content: kubernetes.dashboard.json
  system-dashboard:
    path: /tmp/system.dashboard.json
    content: system.dashboard.json
@ -1,18 +0,0 @@
dsl_version: 0.1.0
service:
  name: kibana
  ports:
    - {{ kibana.port }}
  containers:
    - name: kibana
      image: kibana
      daemon:
        command: /opt/kibana/bin/kibana
        dependencies:
          - elasticsearch
        files:
          - kibana.yml
files:
  kibana.yml:
    path: /opt/kibana/config/kibana.yml
    content: kibana.yml.j2
@ -1,94 +0,0 @@
dsl_version: 0.1.0
service:
  name: stacklight-collector
  kind: DaemonSet
  containers:
    - name: hindsight
      image: hindsight
      pre:
        - name: service-bootstrap
          type: local
          command: /opt/ccp/bin/bootstrap-hindsight.sh /var/lib/hindsight
      daemon:
        command: /usr/bin/hindsight /etc/hindsight/hindsight.cfg
        files:
          - hindsight.cfg
          - heka-tcp.cfg
          - prune-input.cfg
          - influxdb-tcp.cfg
          - kubelet-stats.cfg
      volumes:
        - name: hindsight
          type: empty-dir
          path: /var/lib/hindsight
        - name: stacklight-alarms
          type: empty-dir
          path: /opt/ccp/lua/modules/stacklight_alarms
    - name: snap
      image: snap
      privileged: true
      daemon:
        command: snapd --config /etc/snap/snap.conf
        files:
          - snap.conf
          - snap-task.json
      volumes:
        - name: proc
          type: host
          path: /proc
          mount-path: /host-proc
        - name: dev
          type: host
          path: /dev
          mount-path: /host-dev
    - name: alarm-manager
      image: alarm-manager
      daemon:
        command: /opt/ccp/bin/alarm-manager.py -w /etc/alarm-manager
        files:
          - alarms.yaml
          - lua-cfg-template.j2
      volumes:
        - name: hindsight
          type: empty-dir
          path: /var/lib/hindsight
        - name: stacklight-alarms
          type: empty-dir
          path: /opt/ccp/lua/modules/stacklight_alarms
files:
  hindsight.cfg:
    path: /etc/hindsight/hindsight.cfg
    content: hindsight.cfg.j2
    perm: "0600"
  heka-tcp.cfg:
    path: /var/lib/hindsight/run/input/heka_tcp.cfg
    content: hindsight_heka_tcp.cfg.j2
    perm: "0600"
  prune-input.cfg:
    path: /var/lib/hindsight/run/input/prune_input.cfg
    content: hindsight_prune_input.cfg
    perm: "0600"
  influxdb-tcp.cfg:
    path: /var/lib/hindsight/run/output/influxdb_tcp.cfg
    content: hindsight_influxdb_tcp.cfg.j2
    perm: "0600"
  kubelet-stats.cfg:
    path: /var/lib/hindsight/run/input/kubelet_stats.cfg
    content: hindsight_kubelet_stats.cfg.j2
    perm: "0600"
  snap.conf:
    path: /etc/snap/snap.conf
    content: snap.conf.j2
    perm: "0600"
  snap-task.json:
    path: /etc/snap/auto/task.json
    content: snap-task.json.j2
    perm: "0600"
  alarms.yaml:
    path: /etc/alarm-manager/alarms.yaml
    content: alarms.yaml
    perm: "0600"
  lua-cfg-template.j2:
    path: /etc/alarm-manager/templates/alarm_manager_lua_config_template.cfg.j2
    content: alarm_manager_lua_config_template.cfg.j2
    perm: "0600"
@ -1,89 +0,0 @@
|
|||
#!/usr/bin/python3
# Copyright 2016 Mirantis, Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
#
#      http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
#

import argparse
import glob
import json
import os
import sys


class Action(argparse.Action):
    def __call__(self, parser, namespace, values, option_string=None):
        path = values
        if os.path.isdir(path):
            path = "{}/*.json".format(path)
        elif os.path.isfile(path):
            pass
        else:
            raise ValueError("'{}': no such file or directory".format(path))
        setattr(namespace, self.dest, path)


parser = argparse.ArgumentParser(
    formatter_class=argparse.RawDescriptionHelpFormatter,
    description="""
Format JSON dashboard files with ordered keys.

Remove sections:
    templating.list[].current
    templating.list[].options

Override the time entry to {"from": "now-1h", "to": "now"}, enable
sharedCrosshair and increment the version.

If a DIRECTORY is provided, all files with the '.json' suffix will be
modified.

WARNING: this script modifies all manipulated files.""")
parser.add_argument('path',
                    action=Action,
                    help="Path to a JSON file or to a directory "
                         "containing .json files")
path = parser.parse_args().path

for f in glob.glob(path):
    print('Processing {}...'.format(f))
    absf = os.path.abspath(f)
    with open(absf) as _in:
        data = json.load(_in)
    dashboard = data.get('dashboard')
    if not dashboard:
        print('Malformed JSON: no "dashboard" key')
        sys.exit(1)
    for k, v in dashboard.items():
        if k == 'annotations':
            for anno in v.get('list', []):
                anno['datasource'] = 'CCP InfluxDB'
        if k == 'templating':
            for o in v.get('list', []):
                if o['type'] == 'query':
                    o['options'] = []
                    o['current'] = {}
                    o['refresh'] = 1

    dashboard['time'] = {'from': 'now-1h', 'to': 'now'}
    dashboard['sharedCrosshair'] = True
    dashboard['refresh'] = '1m'
    dashboard['id'] = None
    dashboard['version'] = dashboard.get('version', 0) + 1

    with open(absf, 'w') as out:
        json.dump(data, out, indent=2, sort_keys=True)

    print('Done processing {}.'.format(f))
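The script's core normalisation can be exercised in isolation, without touching any files. The sketch below mirrors its per-dashboard mutations in a standalone `normalize` helper (the helper name and the sample dashboard are made up for illustration):

```python
# In-memory exercise of the dashboard normalisation performed by the script
# above: "query"-type templating variables lose their cached options/current
# selection, the time window is pinned, and the version counter is bumped.
import json


def normalize(dashboard):
    """Apply the same mutations the script applies to each dashboard."""
    for variable in dashboard.get('templating', {}).get('list', []):
        if variable['type'] == 'query':
            variable['options'] = []
            variable['current'] = {}
            variable['refresh'] = 1
    dashboard['time'] = {'from': 'now-1h', 'to': 'now'}
    dashboard['sharedCrosshair'] = True
    dashboard['id'] = None
    dashboard['version'] = dashboard.get('version', 0) + 1
    return dashboard


sample = {
    'templating': {'list': [{'type': 'query',
                             'options': [{'text': 'stale'}],
                             'current': {'text': 'stale'}}]},
    'version': 3,
}
print(json.dumps(normalize(sample), indent=2, sort_keys=True))
```

Dropping `options` and `current` keeps volatile, environment-specific state out of version control, which is why the script warns that it rewrites files in place.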
@ -1,5 +0,0 @@
#!/bin/bash
set -ex

workdir=$(dirname $0)
yamllint -c $workdir/yamllint.yaml $(find . -not -path '*/\.*' -type f -name '*.yaml')
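The `find` expression above skips everything under dot-directories (such as `.tox` or `.git`), so only tracked YAML files reach yamllint. A quick check of the pattern against a throwaway tree (the temporary file names are made up):

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/conf" "$tmp/.tox"
touch "$tmp/conf/service.yaml" "$tmp/.tox/cache.yaml"
# Same pattern as in yamllint.sh: '*/\.*' excludes paths under dot-directories.
found=$(cd "$tmp" && find . -not -path '*/\.*' -type f -name '*.yaml')
echo "$found"
rm -rf "$tmp"
```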
@ -1,21 +0,0 @@
extends: default

rules:
  braces:
    max-spaces-inside: 1
  comments:
    level: error
  comments-indentation:
    level: warning
  document-end:
    present: no
  document-start:
    level: error
    present: no
  empty-lines:
    max: 1
    max-start: 0
    max-end: 0
  line-length:
    level: warning
    max: 120
27 tox.ini
@ -1,27 +0,0 @@
[tox]
minversion = 1.6
envlist = linters,bashate,py34,py27,pep8
skipsdist = True

[testenv:pep8]
commands = flake8 {posargs}

[testenv:venv]
commands = {posargs}

[testenv:linters]
deps = yamllint
commands =
    {toxinidir}/tools/yamllint.sh

[testenv:bashate]
deps = bashate>=0.2
whitelist_externals = bash
commands = bash -c "find {toxinidir} -type f -name '*.sh' -not -path '*/.tox/*' -print0 | xargs -0 bashate -v"

[flake8]
# E123, E125 skipped as they are invalid PEP-8.
show-source = True
ignore = E123,E125,H102
builtins = _
exclude = .venv,.git,.tox,dist,doc,*openstack/common*,*lib/python*,*egg,build