Commit Graph

79 Commits

Author SHA1 Message Date
Martin Chacon Piza 37ccf62158 Revert "Use external Zookeeper in Local mode"
This reverts commit d156980a52.

Change-Id: I8a0bbf34db9d25503e50d4ef9d9da466f229fc78
2021-05-27 16:15:52 +02:00
Martin Chacon Piza d156980a52 Use external Zookeeper in Local mode
By default, Storm uses an in-process ZooKeeper when local mode
is chosen.

The in-process ZooKeeper does not trim ZooKeeper transaction logs
and snapshots.

A constructor makes it possible to use an external ZooKeeper in
local mode. [1][2]

[1] https://issues.apache.org/jira/browse/STORM-213
[2] https://github.com/apache/storm/pull/137/files

Change-Id: I4c6e6f1dbacf093f94b566abc7251f4e33eff93f
2021-04-07 20:53:07 +02:00
Martin Chacon Piza 0ee0e79a73 Add mariadb-connector-j and bump version to 2.4.0
mariadb-connector-j is:

- Well maintained.
- Compatible (v2.7.0) with MySQL Server 8.0.x
- LGPL Licensed

Change-Id: I44830b47a1de4ae69f635327a2da1802a67d92d2
Story: 2008236
Task: 41079
2020-10-08 16:45:14 +02:00
bandorf e980ea71e0 Enable monasca-thresh to set topology.max.spout.pending f. Storm
topology.max.spout.pending limits the number of concurrent
entries sent from the spout to the worker(s). However, this requires
a unique id to be used when sending messages (emit).

Change-Id: I907a4574b80e7c3347ba6a9f12c7836767dc3dd7
Story: 2005471
Task: 30550
2019-07-17 12:03:38 +02:00
Thomas Bechtold 8c976339a7 Use storm version 1.1.3
monasca-api uses storm 1.1.3 [1] so adjust the version here to be in
sync with monasca-api.

[1]
https://git.openstack.org/cgit/openstack/monasca-api/tree/devstack/settings#n109

Story: 2003031
Task: 29275

Change-Id: I0f30e1353a40fb379fa7c389686f65226d5c222b
2019-03-15 15:03:24 +00:00
Zuul 5f76cf1668 Merge "Make Thresh logging compatable with Storm 1.1.1" 2018-04-10 15:28:04 +00:00
Zuul f689816936 Merge "Added license informaton for third-party libraries" 2018-03-06 15:14:57 +00:00
Kenan Karamehmedovic 344adcde36 Added license informaton for third-party libraries
Task:  6353
Story: 2001540

Depends-On: https://review.openstack.org/544813

Change-Id: Iff0d2f130defb4d26cbf047bed44f73f89bbcee8
2018-02-19 11:46:00 +00:00
Witold Bedyk 15f9962fcb Remove mysql-connector and bump version to 2.3.0
MySQL Connector is released under GPLv2 license which restricts the
distribution of the consuming project [1]. This change removes MySQL
Connector and leaves Drizzle JDBC which is licensed under BSD.

[1] https://governance.openstack.org/tc/reference/licensing.html

Story: 2001522
Task: 6324
Change-Id: I4c39ebc290475820b5ba3ab54c36198ca9069abe
Depends-On: https://review.openstack.org/541366
2018-02-09 15:54:47 +01:00
Craig Bryant 106088887a Log as error failure to send to Kafka
Exception on failure to send to Kafka was only being logged
at the debug level. Increased to error level as this is a major
failure in the Threshold Engine functionality

Change-Id: I131d6d7a20cd0e907334cf5d0ff6fac342e8f320
2018-01-18 10:47:30 -07:00
Witold Bedyk be22e54fd7 Upper pom version to 2.2.0
Change-Id: I59e3d60f606791cc15bb0f2ee17d871e15aa754d
2017-12-20 14:32:08 +01:00
Monasca CI e8001abfac Make Thresh logging compatable with Storm 1.1.1
The newest version of Storm (1.1.1) looks for a log4j2 file
instead of a logback file

Change-Id: I31b1739e1e42b91c31e1cdd43b539da2b030933a
2017-12-04 16:34:28 -07:00
Witold Bedyk 86b8634c9c Change version to 2.1.1
Depends-On: Ib3da5c9e1f6e5e2d6f77269129bd769179bfd3be
Change-Id: I6e4b73525b3c32c2bdd04c63ae73bcb4c50b5447
2017-02-15 13:48:30 +00:00
Witold Bedyk 31a7b69b59 Change version to 2.1.0
Change-Id: Ic867645c7c77954bde27cecc64ec01057013f99a
2016-11-23 17:06:49 +01:00
Michael James Hoppal a62c27a165 Bump drizzle driver version to support millisecond resolution
Change-Id: I671d4e081191bd72bbb8bac97ff3a27812dc50dc
2016-08-15 15:45:42 -06:00
Craig Bryant 2acdd58dc3 Implement the Last Function
The Alarm state is driven by the measurement with the newest
timestamp. Use the value even if the measurement is older than the
oldest bucket. This ensures that a measurement received while the
Threshold Engine is stopped is still used when the Threshold Engine
is started.

Never evaluate subAlarm with function Last except on receiving of
a measurement.
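
A minimal sketch of the "last" behaviour described above, not the project's actual implementation. The class and function names are hypothetical, and the comparison is assumed to be a plain `>` against the threshold:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastValue:
    """Keeps only the measurement with the newest timestamp (hypothetical helper)."""
    timestamp: Optional[float] = None
    value: Optional[float] = None

    def add(self, timestamp: float, value: float) -> bool:
        """Record a measurement; return True if it became the newest one.

        No check against the oldest bucket is made, so an old measurement
        is still accepted as long as it is newer than the stored one.
        """
        if self.timestamp is None or timestamp > self.timestamp:
            self.timestamp = timestamp
            self.value = value
            return True
        return False


def evaluate_last(last: LastValue, threshold: float) -> str:
    """Evaluate 'last(m) > threshold'; meant to be called only on receipt of a measurement."""
    if last.value is None:
        return "UNDETERMINED"
    return "ALARM" if last.value > threshold else "OK"
```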

Add tests to ensure this works.

The change is dependent on the monasca-common change and the
change to monasca-api to add the state field to sub_alarm.

Change-Id: Ib5123ed035018757a50d9ebeb7335fbca48054f2
Implements: Blueprint last-value
2016-08-02 12:05:37 -06:00
Michael James Hoppal 865816dd78 Add millisecond resolution to alarms.
At the moment we are using the MySQL built-in function
NOW(), which returns second resolution.

Change-Id: I1192abb5aab3a9110721cc68f5a1d16a38f77c10
2016-07-13 08:14:35 -06:00
Craig Bryant 3927da1697 Save Measurements that arrive before their SubAlarms
Thresh creates an Alarm when a new Measurement matches an
AlarmDefinition. The previous Thresh code just discarded the
Measurement if it arrived before the newly created SubAlarm,
which was likely to occur. This code saves a Measurement that
does not match an existing SubAlarm in the expectation that the
SubAlarm will arrive very soon. It then adds the Measurement
to the SubAlarm. If the measurement would cause the SubAlarm to
transition to the ALARM state, that happens.

This is more important for deterministic alarms because they will get
fewer Measurements and ignoring the first one may prevent an Alarm's
state going to ALARM when it should
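
The buffering idea can be sketched roughly as follows. This is an illustration under assumptions, not the Thresh code: the class, its methods, and the use of a plain key to match a Measurement to its SubAlarm are all hypothetical, and expiry of saved Measurements is not modelled:

```python
from collections import defaultdict


class PendingMeasurements:
    """Holds measurements whose SubAlarm has not arrived yet (hypothetical helper)."""

    def __init__(self):
        self._pending = defaultdict(list)  # key -> measurements waiting for a SubAlarm
        self._sub_alarms = {}              # key -> measurements attached to a SubAlarm

    def on_measurement(self, key, value):
        """Attach the value to its SubAlarm, or save it if the SubAlarm is unknown."""
        if key in self._sub_alarms:
            self._sub_alarms[key].append(value)
        else:
            self._pending[key].append(value)

    def on_sub_alarm_created(self, key):
        """Register a new SubAlarm and hand it any measurements saved for it."""
        saved = self._pending.pop(key, [])
        self._sub_alarms[key] = list(saved)
        return self._sub_alarms[key]
```

The values returned by `on_sub_alarm_created` would then be evaluated as usual, so a saved Measurement can still drive the SubAlarm to ALARM.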

Change-Id: I08e9e481ad55862ba602eba5a68eb371b1d35bbc
2016-06-27 08:39:02 -06:00
Craig Bryant 0d80a987db Treat empty windows as OK for deterministic alarms
Using the standard case of count(log_message) > 1, getting no
log_message measurements should be treated as OK. However, the old code
uses the emptyWindowObservationThreshold for both deterministic and
non-deterministic alarms which means that there must be 3 empty windows
before the deterministic alarm transitions to OK.

This change causes the evaluation of the Alarm to treat an empty window
as OK for deterministic alarms. So, for count(log_message) > 1, getting no
measurements in a window will transition the alarm back to OK
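
A rough sketch of the rule, assuming the function is called only after at least one empty window has been observed. The function name and parameters are hypothetical, the default of 3 is taken from the example above, and the UNDETERMINED result for non-deterministic alarms is an assumption rather than something this message states:

```python
def state_after_empty_windows(current_state, empty_windows, deterministic,
                              empty_window_threshold=3):
    """Return the alarm state after `empty_windows` consecutive empty windows."""
    if deterministic:
        # A single empty window is enough: no measurements means OK.
        return "OK"
    if empty_windows >= empty_window_threshold:
        # Assumed behaviour for non-deterministic alarms.
        return "UNDETERMINED"
    return current_state
```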

Change-Id: I19a04bf78f907b23ef583409f2def54771c07d72
2016-06-20 12:30:42 -06:00
Craig Bryant d5d14ecdcd Clone the currentValues property in duplicate method
It is possible for the currentValues property to change which can cause
java.util.ConcurrentModificationException. Fix by cloning currentValues
before the SubAlarm gets emitted into storm
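
The idea of copying mutable state before handing an object off can be illustrated as below. This is a Python analogue with hypothetical names, not the Java code of the change:

```python
from dataclasses import dataclass, field


@dataclass
class SubAlarm:
    """Simplified stand-in for a SubAlarm (hypothetical fields)."""
    id: str
    state: str = "OK"
    current_values: list = field(default_factory=list)

    def duplicate(self):
        """Return a copy whose current_values list is independent of the original."""
        return SubAlarm(self.id, self.state, list(self.current_values))
```

Because the duplicate owns its own list, later changes to the original's values cannot affect the copy that was emitted.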

Change-Id: I555beffafe0208c0d256732517af401938876d3d
2016-06-09 16:42:47 -06:00
Craig Bryant 0e72d867ec Change to use Storm 1.0.0 instead of 0.9.x
Storm classes changed from starting with backtype to org.apache

Since this is a major backwards incompatible change, increment the
jar version

Copy some Stream classes from monasca-common. They were only used for
monasca-thresh anyway and having them in a separate repo made it
harder to make this change. A later review will remove these classes
from monasca-common

Need to have an explicit dependency on commons-codec

Change-Id: I36db83ce7fdea02ae4df267cf0820e49dcdf3001
2016-06-09 14:14:23 -06:00
Tomasz Trębski 080b11dc54 (Non)deterministic alarm processing
The 'deterministic' flag in an alarm expression tells monasca-thresh
whether the alarm may return to the UNDETERMINED state.

A 'deterministic' alarm never transitions to the UNDETERMINED state,
even if no measurements are received for a long time. By default, all
alarms are assumed to be 'non-deterministic', which means that they can
transition to the UNDETERMINED state.

Implements: blueprint alarmonlogs
Depends-On: Ia42f9a1be37c31416bdac341b092fe527f860c16
Change-Id: Ibe0839123a15494ad45b809e68600c0acef3d330
2016-06-07 12:00:58 +02:00
Brad Klein 4c2ac9a5e3 Include metric value in alarm notification message.
With commit 4e333d5fe4 notification
messages look like '...with the values: []', this fixes that.

Change-Id: I8e96b4aa5c77f74a9d3dd00a5647fbfde5fde9b2
Closes-Bug: #1554718
2016-03-11 10:58:31 -07:00
Craig Bryant 4e333d5fe4 Duplicate the SubAlarm before emitting it
This prevents Storm from throwing a ConcurrentModificationException if
the SubAlarm's state changes soon after the emit

Change-Id: Idc0de8a0ef6d13bce800e4e8a4e13e43cdf1c010
Closes-Bug: #1548999
2016-02-24 10:29:12 -07:00
Jenkins 78a8a1d0f1 Merge "Pass link and lifecycle state in state transitions" 2016-01-27 21:36:52 +00:00
Craig Bryant 6f3286bc0e Treat match-by of null as []
The API sometimes sends null for match-by when it should send []. Make the
Threshold Engine more tolerant by treating null as []
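
A small sketch of the tolerance described above. The function name and the `match_by` JSON key are assumptions made for illustration:

```python
import json


def normalize_match_by(match_by):
    """Treat a missing or null match-by as an empty list."""
    return [] if match_by is None else list(match_by)


# Hypothetical event payload in which the API sent null instead of [].
event = json.loads('{"match_by": null}')
match_by = normalize_match_by(event.get("match_by"))
```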

Change-Id: Idf29e58c27a2c0ba531d041a144e8c5f35b6be46
2016-01-26 11:55:18 -07:00
Ryan Brandt c65fba06b0 Pass link and lifecycle state in state transitions
Requires changes in monasca-api, monasca-common to use

Change-Id: Ibf592a5e333f348895df6c681c23d0a34c115045
2016-01-19 14:28:19 -07:00
Deklan Dieterly 7febc99a69 Upgrade to Kafka 0.8.2.2
Upgrade Kafka to current stable release - 0.8.2.2.
Upgrade Scala version to 2.11.

Change-Id: I113997fc1c3124bc1073cb261d6b6f873c6fc6b2
2016-01-04 15:39:24 -07:00
Craig Bryant 96f9b442f3 Evaluate SubAlarms immediately if possible
Some SubAlarm expressions can be evaluated immediately. If the
expression is MAX(m) > 10, a single measurement of m > 10 will cause
the SubAlarm to transition to the ALARM state regardless of any other
measurement of m that is received. However if the operator is < or <=,
MAX can't be immediately evaluated since a following measurement
could be larger than the one stored.  COUNT also can be evaluated
immediately if the operator is > or >= since it never decreases. MIN
is the opposite and can be immediately evaluated if the operator is
< or <=. AVG and SUM can't be evaluated until the end of the
evaluation window since the average or sum could go up or down
depending on the measurements received and whether or not they are
negative.
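
The rules in the paragraph above can be condensed into a small predicate. This is a sketch with a hypothetical function name, not the SubAlarmStats code; it assumes `>=` behaves like `>` for MAX:

```python
def can_evaluate_immediately(function, operator):
    """Return True if one measurement can be enough to decide the ALARM state."""
    function = function.upper()
    if function in ("MAX", "COUNT"):
        # MAX and COUNT never decrease within a window.
        return operator in (">", ">=")
    if function == "MIN":
        # MIN never increases within a window.
        return operator in ("<", "<=")
    # AVG and SUM can move in either direction until the window ends.
    return False
```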

Also see if the sliding window for a SubAlarm can be slid when a
metric is received for the SubAlarm. This could allow the SubAlarm
to be evaluated faster than waiting for the tick tuple since that
is only received every 60 seconds.

Add unit tests for immediate SubAlarm evaluation.

Add unit tests for previously untested parts of SubAlarmStats

Change-Id: I989a82328fa4ccc04b49d203f70a1adc9fa4d3bb
2015-11-02 14:58:39 -07:00
Craig Bryant c3568930f6 Change last URL from stackforge to openstack
Also remove a trailing space from run_maven.sh

Change-Id: I35b929a91e6dcbccd30d11263e0c3bf673e21040
2015-10-19 11:49:48 -06:00
venkatamahesh 8892699305 Change repositories from stackforge to openstack
Change-Id: I1579ccd3803a1d2ca6173ce517cfc28350e15d05
2015-10-19 09:15:07 +05:30
Tomasz Trębski 7a595cf420 Dependencies updated
Following changes made because monasca-common
was modified:

- removed unnecessary dependencies
- updated hikari version
- added javax.el-api

Change-Id: I176008a258411500bf14ba4a26258bdce90476db
2015-09-23 06:37:16 +00:00
Jenkins 84eddb58da Merge "Added a whitelist for restricting the StatsD metrics" 2015-09-17 20:52:10 +00:00
Michael James Hoppal f43dfb5918 Add the drizzle driver to pom
Allows the end user to choose between the Drizzle and MySQL
connectors.

Change-Id: If74b239824d35ccbf9a5fd2f2cae6dbd0efb40a0
2015-09-04 16:10:52 -06:00
Jenkins 0d7a75c304 Merge "Add support for drizzle jdbc connector" 2015-09-03 22:13:41 +00:00
Roland Hochmuth 4334a8e44a Add support for drizzle jdbc connector
MySQL jdbc connector returns an Integer when querying period and periods.
Drizzle jdbc connector returns a Long. Adding appropriate conversions.

Change-Id: Ie96b10347dbd52b4e0e267f5fbb7cf3d6d6eafff
2015-09-02 11:58:38 -06:00
Tomasz Trębski 622339f6bc Hibernate support added
- added ORM support with Hibernate
- rewritten two mysql repositories to use ORM

Change-Id: I22e342ca57b4cc62b12a44cdf503ce068b9b67b5
2015-08-31 11:45:39 +02:00
Dexter Fryar afc22b56a1 Added a whitelist for restricting the StatsD metrics
A whitelist and metric map for the metrics that are
sent by Storm / Threshold Engine to the Monasca
StatsD agent/daemon.

Also relates to:
  https://github.com/hpcloud-mon/ansible-monasca-thresh/pull/14

=======

/etc/monasca/thresh-config.yml

```
statsdConfig:
  host: localhost
  port: 8125
  debugmetrics: false
  dimensions: !!map
    service : monitoring
    component : storm
  whitelist: !!seq
    - aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream
    - aggregation-bolt.execute-count.filtering-bolt_default
    - aggregation-bolt.execute-count.system_tick
    - filtering-bolt.execute-count.event-bolt_metric-alarm-events
    - filtering-bolt.execute-count.metrics-spout_default
    - thresholding-bolt.execute-count.aggregation-bolt_default
    - thresholding-bolt.execute-count.event-bolt_alarm-definition-events
    - system.memory_heap.committedBytes
    - system.memory_nonHeap.committedBytes
    - system.newWorkerEvent
    - system.startTimeSecs
    - system.GC_ConcurrentMarkSweep.timeMs
  metricmap: !!map
    aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream :
      monasca_threshold.aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream
    aggregation-bolt.execute-count.filtering-bolt_default :
      monasca_threshold.aggregation-bolt.execute-count.filtering-bolt_default
    aggregation-bolt.execute-count.system_tick :
      monasca_threshold.aggregation-bolt.execute-count.system_tick
    filtering-bolt.execute-count.event-bolt_metric-alarm-events :
      monasca_threshold.filtering-bolt.execute-count.event-bolt_metric-alarm-events
    filtering-bolt.execute-count.metrics-spout_default :
      monasca_threshold.filtering-bolt.execute-count.metrics-spout_default
    thresholding-bolt.execute-count.aggregation-bolt_default :
      monasca_threshold.thresholding-bolt.execute-count.aggregation-bolt_default
    thresholding-bolt.execute-count.event-bolt_alarm-definition-events :
      monasca_threshold.thresholding-bolt.execute-count.event-bolt_alarm-definition-events
    system.memory_heap.committedBytes :
      monasca_threshold.system.memory_heap.committedBytes
    system.memory_nonHeap.committedBytes :
      monasca_threshold.system.memory_nonHeap.committedBytes
    system.newWorkerEvent :
      monasca_threshold.system.newWorkerEvent
    system.startTimeSecs :
      monasca_threshold.system.startTimeSecs
    system.GC_ConcurrentMarkSweep.timeMs :
      monasca_threshold.system.GC_ConcurrentMarkSweep.timeMs
```

host: IP address or hostname of the Monasca Agent whose StatsD daemon will consume
      the metrics produced by Storm / Threshold Engine

port: UDP port on which the Monasca Agent's StatsD daemon listens for
      the metrics produced by Storm / Threshold Engine

dimensions: A map of key/value pairs that will be passed along as dimensions for each metric

whitelist: A list of metrics in the native name that Storm presents

metricmap: A mapping from the native Storm metric name to a user-defined name.  The
           user-defined name is what will appear in the Monasca data store.  A metric
           that is listed in the whitelist but has no mapping is published under its
           native name. The 12 metrics whitelisted/mapped above correspond to the
           Monasca health dashboard, which is defined in Grafana.

           https://github.com/hpcloud-mon/grafana/blob/master/src/app/dashboards/monasca.json
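
How whitelist and metricmap interact can be sketched as follows. The function name is hypothetical, and dropping metrics that are not whitelisted is an assumption based on the word "restricting" in the title:

```python
def published_name(native_name, whitelist, metricmap):
    """Return the name to publish a Storm metric under, or None to drop it."""
    if native_name not in whitelist:
        return None
    # Fall back to the native name when no mapping is configured.
    return metricmap.get(native_name, native_name)


whitelist = {"system.newWorkerEvent", "system.startTimeSecs"}
metricmap = {"system.newWorkerEvent": "monasca_threshold.system.newWorkerEvent"}
```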

Change-Id: I7bcefd03d02714ac42efd9b2d9cadb77907fa17e
2015-08-28 17:16:36 -05:00
Craig Bryant 237c752e6a Fix problem where dimensions were null
This caused issues for the MetricDefinitionAndTenantIdMatcher when
it was comparing dimensions. Changed it to replace null with an
empty set since that is what the rest of the code is expecting

Change-Id: I3dbec749f29604ef49d89d4a8ec1f6d882305957
2015-08-04 08:53:22 -06:00
Roland Hochmuth fb9b6888c1 Modified query in getAlarmedMetrics for performance
Change-Id: I3520ea7154a450c5cbb5d4aecf152e1521435907
2015-07-30 21:59:32 -06:00
Craig Bryant c400895872 Fix NullPointerExceptions in MetricFilteringBolt
This happened because MetricDefinitionAndTenantIdMatcher wasn't handling
the same Alarm Definition being added more than once. That occurs because
multiple MetricFilteringBolts use the same
MetricDefinitionAndTenantIdMatcher. The matcher now checks whether the
Alarm Definition is already present before adding it.

Change-Id: I9f382e8da5193b60a64dbe40c9fcf321fc47766f
2015-06-05 10:52:54 -06:00
Craig Bryant 9c4bd6cc99 Simplify the check for whether a metric matches
The old way was more efficient if there were few dimensions on
the metric and lots of Alarm Definitions with different
dimension sets for a given tenant id and metric name, but that
is probably unlikely. The more common case will be one or two
alarm definitions per tenant id and metric name and more
dimensions on the metric. The new algorithm is faster for that
case since it doesn't create every possible combination of
dimensions like the old algorithm
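
One way to match without enumerating dimension combinations is a per-definition subset check. The message does not spell out the new algorithm, so this is an assumed illustration with hypothetical names, not the code in MetricDefinitionAndTenantIdMatcher:

```python
def matching_definitions(metric_dimensions, definitions):
    """Return ids of alarm definitions whose dimensions all appear on the metric.

    `definitions` maps a definition id to its required dimensions for one
    tenant id and metric name.
    """
    metric_items = metric_dimensions.items()
    return [
        def_id
        for def_id, required in definitions.items()
        # Dict item views are set-like, so <= is a subset test.
        if required.items() <= metric_items
    ]
```

The cost grows with the number of alarm definitions rather than with the number of dimension subsets on the metric, which fits the common case described above.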

Change-Id: I183570d52c61f0a2932cf37c5c659a6c529b4bbb
2015-05-29 15:46:41 -06:00
Ryan Brandt 5b33c539bf Add new state_updated_at field to alarm
Adjust the database interaction for the new state_updated_at field
in the alarm table

Change-Id: I941f9bfdc64a43820dbbe3a4c047cc64b93335a7
2015-05-13 15:50:01 -06:00
Ryan Brandt c6f025016d Clean up orphaned alarms on alarm definition delete
Change-Id: I8dea800768989b80f1fe810f1c9572ed439d0133
2015-05-08 12:31:22 -06:00
Jenkins 1f0888d6fa Merge "Bump the version to 1.1.0" 2015-04-29 03:39:30 +00:00
Craig Bryant 88ce39abdb Bump the version to 1.1.0
Change-Id: Id63a620b4d3c301567e462b3d04a1ddc8a8322b2
2015-04-28 21:28:24 -06:00
Ryan Brandt 0f249a28cb Allow unicode in events
Change the events deserialization to handle UTF-8 encoding

Change-Id: I73c1a50df5fe365b1ed7d047f58d0e7f67f51d40
2015-04-21 14:14:14 -06:00
Craig Bryant 8f28398d07 Only the jar and sample config in the deb
Remove control scripts from deb

Update sample config file to be more current

Change-Id: If2b11dd1cab807f58b4b23a0a1933fb179032964
2015-04-20 11:22:47 -06:00
Craig Bryant 6bdef9f492 AlarmStateTransitionedEvent timestamp now in ms
This will ensure a unique timestamp. Influx will only keep one
entry with the same timestamp

Change-Id: Ibf1001fea9328a6541381d344221b86e39996e1d
2015-04-14 11:13:43 -06:00
Craig Bryant e3ac4b0857 Remove warnings
Remove unused methods, unused imports, unused private constants

Change-Id: Ie900b295cc5410fa9039649e868228d1d3de78ee
2015-04-07 16:37:21 -06:00