By default, Storm uses an in-process ZooKeeper when local mode is
chosen.
The in-process ZooKeeper doesn't trim the ZooKeeper transaction
logs and snapshots.
There is a constructor which makes it possible to use an external
ZooKeeper when local mode is used. [1][2]
[1] https://issues.apache.org/jira/browse/STORM-213
[2] https://github.com/apache/storm/pull/137/files
Change-Id: I4c6e6f1dbacf093f94b566abc7251f4e33eff93f
mariadb-connector-j is:
- Well maintained.
- Compatible (v2.7.0) with MySQL Server 8.0.x
- LGPL Licensed
Change-Id: I44830b47a1de4ae69f635327a2da1802a67d92d2
Story: 2008236
Task: 41079
topology.max.spout.pending limits the number of concurrent entries
sent from the spout to the worker(s). However, this requires using
a unique id when sending messages (emit).
Change-Id: I907a4574b80e7c3347ba6a9f12c7836767dc3dd7
Story: 2005471
Task: 30550
MySQL Connector is released under GPLv2 license which restricts the
distribution of the consuming project [1]. This change removes MySQL
Connector and leaves Drizzle JDBC which is licensed under BSD.
[1] https://governance.openstack.org/tc/reference/licensing.html
Story: 2001522
Task: 6324
Change-Id: I4c39ebc290475820b5ba3ab54c36198ca9069abe
Depends-On: https://review.openstack.org/541366
Exceptions on failure to send to Kafka were only being logged
at the debug level. Increased to error level as this is a major
failure in the Threshold Engine functionality.
Change-Id: I131d6d7a20cd0e907334cf5d0ff6fac342e8f320
The Alarm state is driven by the measurement with the newest
timestamp. Use the value even if the measurement is older than the
oldest bucket. This ensures that a measurement received while the
Threshold Engine is stopped will be used when the Threshold Engine
is started.
Never evaluate a SubAlarm with function Last except on receipt of
a measurement.
Add tests to ensure this works.
The change is dependent on the monasca-common change and the
change to monasca-api to add the state field to sub_alarm.
Change-Id: Ib5123ed035018757a50d9ebeb7335fbca48054f2
Implements: Blueprint last-value
At the moment we are using the MySQL built-in function
NOW(), which returns a value with second resolution.
Change-Id: I1192abb5aab3a9110721cc68f5a1d16a38f77c10
Thresh creates an Alarm when a new Measurement matches an
AlarmDefinition. The previous Thresh code just discarded the
Measurement if it arrived before the newly created SubAlarm,
which was likely to occur. This code saves a Measurement that
does not match an existing SubAlarm in the expectation that the
SubAlarm will arrive very soon. It then adds the Measurement
to the SubAlarm. If the measurement would cause the SubAlarm to
transition to the ALARM state, that happens.
This is more important for deterministic alarms because they will get
fewer Measurements, and ignoring the first one may prevent an Alarm's
state from going to ALARM when it should.
Change-Id: I08e9e481ad55862ba602eba5a68eb371b1d35bbc
Using the standard case of count(log_message) > 1, getting no
log_message measurements should be treated as OK. However, the old code
uses the emptyWindowObservationThreshold for both deterministic and
non-deterministic alarms which means that there must be 3 empty windows
before the deterministic alarm transitions to OK.
This change causes the evaluation of the Alarm to treat an empty window
as OK for deterministic alarms. So, for count(log_message) > 1, getting
no measurements in a window will transition the alarm back to OK.
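
A minimal sketch of the rule described above, not the actual monasca-thresh
code. The class, enum and method names are hypothetical, and the assumption
that a non-deterministic alarm moves to UNDETERMINED once the number of empty
windows reaches emptyWindowObservationThreshold is an illustration only.

```java
// Hypothetical illustration of empty-window handling; names are not from monasca-thresh.
class EmptyWindowEvaluation {
    enum AlarmState { OK, ALARM, UNDETERMINED }

    // Returns the state after observing `emptyWindows` consecutive windows
    // that contained no measurements.
    static AlarmState onEmptyWindows(boolean deterministic,
                                     int emptyWindows,
                                     int emptyWindowObservationThreshold,
                                     AlarmState current) {
        if (emptyWindows <= 0) {
            return current; // nothing to decide, the window had data
        }
        if (deterministic) {
            // Deterministic alarms treat a single empty window as OK.
            return AlarmState.OK;
        }
        // Assumption: non-deterministic alarms wait for the threshold.
        return emptyWindows >= emptyWindowObservationThreshold
                ? AlarmState.UNDETERMINED
                : current;
    }
}
```

With a threshold of 3, a deterministic alarm in ALARM goes back to OK after
one empty window, while a non-deterministic alarm keeps its state until the
third.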
Change-Id: I19a04bf78f907b23ef583409f2def54771c07d72
It is possible for the currentValues property to change, which can cause
a java.util.ConcurrentModificationException. Fix by cloning currentValues
before the SubAlarm gets emitted into Storm.
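
A minimal sketch of the defensive-copy idea, not the actual monasca-thresh
code. The class name SubAlarmSnapshot and its methods are hypothetical; only
the currentValues name comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a SubAlarm that owns a mutable list of values.
class SubAlarmSnapshot {
    private final List<Double> currentValues = new ArrayList<>();

    void addValue(double value) {
        currentValues.add(value);
    }

    // Returns an independent copy. A consumer iterating over the copy is
    // not affected when the original list changes afterwards.
    List<Double> copyOfCurrentValues() {
        return new ArrayList<>(currentValues);
    }
}
```

The copy is taken by the code that owns the list, before the object is handed
over, so the receiver never iterates over a list that is still being modified.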
Change-Id: I555beffafe0208c0d256732517af401938876d3d
Storm classes changed from starting with backtype to org.apache.
Since this is a major backwards-incompatible change, increment the
jar version.
Copy some Stream classes from monasca-common. They were only used by
monasca-thresh anyway, and having them in a separate repo made it
harder to make this change. A later review will remove these classes
from monasca-common.
An explicit dependency on commons-codec is needed.
Change-Id: I36db83ce7fdea02ae4df267cf0820e49dcdf3001
Making 'deterministic' part of alarm expressions allows
monasca-thresh to determine whether a given alarm can go
back to the UNDETERMINED state.
'deterministic' means that the alarm will never transition
to the UNDETERMINED state, even if no measurements are
received for long enough. By default, all alarms are assumed
to be 'non-deterministic', which means that they can
transition to the UNDETERMINED state.
Implements: blueprint alarmonlogs
Depends-On: Ia42f9a1be37c31416bdac341b092fe527f860c16
Change-Id: Ibe0839123a15494ad45b809e68600c0acef3d330
Since commit 4e333d5fe4, notification messages look like
'...with the values: []'. This fixes that.
Change-Id: I8e96b4aa5c77f74a9d3dd00a5647fbfde5fde9b2
Closes-Bug: #1554718
This prevents Storm from throwing a ConcurrentModificationException if
the SubAlarm's state changes soon after the emit
Change-Id: Idc0de8a0ef6d13bce800e4e8a4e13e43cdf1c010
Closes-Bug: #1548999
The API sometimes sends null for match-by when it should send []. Make the
Threshold Engine more tolerant by treating null as []
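
A minimal sketch of the tolerance described above, assuming match-by is
represented as a list of strings. The class and method names are hypothetical
and not taken from the Threshold Engine.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical helper; not part of monasca-thresh.
class MatchByNormalizer {
    // Treats a missing (null) match-by list the same as an empty one.
    static List<String> normalize(List<String> matchBy) {
        return matchBy == null ? Collections.<String>emptyList() : matchBy;
    }
}
```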
Change-Id: Idf29e58c27a2c0ba531d041a144e8c5f35b6be46
Some SubAlarm expressions can be evaluated immediately. If the
expression is MAX(m) > 10, a single measurement of m > 10 will cause
the SubAlarm to transition to the ALARM state regardless of any other
measurement of m that is received. However, if the operator is < or <=,
MAX can't be immediately evaluated since a following measurement
could be larger than the one stored. COUNT also can be evaluated
immediately if the operator is > or >= since it never decreases. MIN
is the opposite and can be immediately evaluated if the operator is
< or <=. AVG and SUM can't be evaluated until the end of the
evaluation window since the average or sum could go up or down
depending on the measurements received and whether or not they are
negative.
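
The rules above can be sketched as a small decision function. This is an
illustration only: the enum and method names are hypothetical and are not the
ones used in SubAlarmStats.

```java
// Hypothetical illustration of which function/operator pairs can be
// evaluated as soon as a measurement arrives.
class ImmediateEvaluation {
    enum Function { MAX, MIN, COUNT, AVG, SUM }
    enum Operator { LT, LTE, GT, GTE }

    static boolean canEvaluateImmediately(Function function, Operator operator) {
        boolean greater = operator == Operator.GT || operator == Operator.GTE;
        switch (function) {
            case MAX:   // MAX never decreases within a window
            case COUNT: // COUNT never decreases within a window
                return greater;
            case MIN:   // MIN never increases within a window
                return !greater;
            default:    // AVG and SUM can move in either direction
                return false;
        }
    }
}
```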
Also see if the sliding window for a SubAlarm can be slid when a
metric is received for the SubAlarm. This could allow the SubAlarm
to be evaluated faster than waiting for the tick tuple since that
is only received every 60 seconds.
Add unit tests for immediate SubAlarm evaluation.
Add unit tests for previously untested parts of SubAlarmStats
Change-Id: I989a82328fa4ccc04b49d203f70a1adc9fa4d3bb
The following changes were made because monasca-common
was modified:
- removed unnecessary dependencies
- updated hikari version
- added javax.el-api
Change-Id: I176008a258411500bf14ba4a26258bdce90476db
The MySQL JDBC connector returns an Integer when querying period and period.
The Drizzle JDBC connector returns a Long. Add appropriate conversions.
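
One way to make such a conversion driver-independent is to go through
java.lang.Number, sketched below. This is an assumption about the approach,
not the conversion actually used in this change, and the class and method
names are hypothetical.

```java
// Hypothetical helper; not part of monasca-thresh.
class JdbcNumbers {
    // Accepts an Integer, a Long or any other Number returned by a JDBC
    // driver and converts it to an int.
    static int toInt(Object columnValue) {
        return ((Number) columnValue).intValue();
    }
}
```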
Change-Id: Ie96b10347dbd52b4e0e267f5fbb7cf3d6d6eafff
A whitelist and metric map for the metrics that are
sent by Storm / Threshold Engine to the Monasca
StatsD agent/daemon.
Also relates to:
https://github.com/hpcloud-mon/ansible-monasca-thresh/pull/14
=======
/etc/monasca/thresh-config.yml
```
statsdConfig:
  host: localhost
  port: 8125
  debugmetrics: false
  dimensions: !!map
    service : monitoring
    component : storm
  whitelist: !!seq
    - aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream
    - aggregation-bolt.execute-count.filtering-bolt_default
    - aggregation-bolt.execute-count.system_tick
    - filtering-bolt.execute-count.event-bolt_metric-alarm-events
    - filtering-bolt.execute-count.metrics-spout_default
    - thresholding-bolt.execute-count.aggregation-bolt_default
    - thresholding-bolt.execute-count.event-bolt_alarm-definition-events
    - system.memory_heap.committedBytes
    - system.memory_nonHeap.committedBytes
    - system.newWorkerEvent
    - system.startTimeSecs
    - system.GC_ConcurrentMarkSweep.timeMs
  metricmap: !!map
    aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream :
      monasca_threshold.aggregation-bolt.execute-count.filtering-bolt_alarm-creation-stream
    aggregation-bolt.execute-count.filtering-bolt_default :
      monasca_threshold.aggregation-bolt.execute-count.filtering-bolt_default
    aggregation-bolt.execute-count.system_tick :
      monasca_threshold.aggregation-bolt.execute-count.system_tick
    filtering-bolt.execute-count.event-bolt_metric-alarm-events :
      monasca_threshold.filtering-bolt.execute-count.event-bolt_metric-alarm-events
    filtering-bolt.execute-count.metrics-spout_default :
      monasca_threshold.filtering-bolt.execute-count.metrics-spout_default
    thresholding-bolt.execute-count.aggregation-bolt_default :
      monasca_threshold.thresholding-bolt.execute-count.aggregation-bolt_default
    thresholding-bolt.execute-count.event-bolt_alarm-definition-events :
      monasca_threshold.thresholding-bolt.execute-count.event-bolt_alarm-definition-events
    system.memory_heap.committedBytes :
      monasca_threshold.system.memory_heap.committedBytes
    system.memory_nonHeap.committedBytes :
      monasca_threshold.system.memory_nonHeap.committedBytes
    system.newWorkerEvent :
      monasca_threshold.system.newWorkerEvent
    system.startTimeSecs :
      monasca_threshold.system.startTimeSecs
    system.GC_ConcurrentMarkSweep.timeMs :
      monasca_threshold.system.GC_ConcurrentMarkSweep.timeMs
```
host: IP address or host name of the Monasca Agent whose StatsD daemon will consume
the metrics produced by Storm / Threshold Engine
port: UDP port on which the Monasca Agent's StatsD daemon listens for
the metrics produced by Storm / Threshold Engine
dimensions: A map of key/value pairs that will be passed along as dimensions for each metric
whitelist: A list of metrics, given by the native names that Storm presents
metricmap: A mapping from the native Storm metric name to a user-defined name. The
user-defined name is what will appear in the Monasca data store. If a metric is
listed in the whitelist but has no mapping, it will be published
with the native name. The 12 metrics whitelisted/mapped above correspond to the
Monasca health dashboard, which is defined in Grafana.
https://github.com/hpcloud-mon/grafana/blob/master/src/app/dashboards/monasca.json
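
A minimal sketch of how the whitelist and metricmap described above could
interact. The class and method names are hypothetical, and treating a metric
that is missing from the whitelist as "not published" is an assumption based
on the whitelist's name.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration; not the class used by the Threshold Engine.
class StatsdNameResolver {
    // Returns the name to publish under, or empty if the metric is not
    // whitelisted. A whitelisted metric without a mapping keeps its
    // native Storm name.
    static Optional<String> publishedName(String nativeName,
                                          Set<String> whitelist,
                                          Map<String, String> metricMap) {
        if (!whitelist.contains(nativeName)) {
            return Optional.empty();
        }
        return Optional.of(metricMap.getOrDefault(nativeName, nativeName));
    }
}
```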
Change-Id: I7bcefd03d02714ac42efd9b2d9cadb77907fa17e
This caused issues for the MetricDefinitionAndTenantIdMatcher when
it was comparing dimensions. Changed it to replace null with an
empty set since that is what the rest of the code is expecting
Change-Id: I3dbec749f29604ef49d89d4a8ec1f6d882305957
This happened because MetricDefinitionAndTenantIdMatcher wasn't handling
the same Alarm Definition being added more than once. That occurs because
multiple MetricFilteringBolts use the same
MetricDefinitionAndTenantIdMatcher. The matcher now checks whether the
Alarm Definition is already there before adding it.
Change-Id: I9f382e8da5193b60a64dbe40c9fcf321fc47766f
The old way was more efficient if there were few dimensions on
the metric and lots of Alarm Definitions with different
dimension sets for a given tenant id and metric name, but that
is probably unlikely. The more common case will be one or two
alarm definitions per tenant id and metric name and more
dimensions on the metric. The new algorithm is faster for that
case since it doesn't create every possible combination of
dimensions as the old algorithm did.
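
The message above does not spell out the new algorithm. One approach that
avoids enumerating dimension combinations is a direct subset check per Alarm
Definition, sketched below as an assumption. The class and method names are
hypothetical.

```java
import java.util.Map;

// Hypothetical illustration; not the actual MetricDefinitionAndTenantIdMatcher.
class DimensionMatcher {
    // True when every dimension required by an alarm definition is present
    // on the metric with the same value. The cost grows with the number of
    // required dimensions, not with the number of dimension combinations.
    static boolean matches(Map<String, String> metricDimensions,
                           Map<String, String> requiredDimensions) {
        return metricDimensions.entrySet().containsAll(requiredDimensions.entrySet());
    }
}
```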
Change-Id: I183570d52c61f0a2932cf37c5c659a6c529b4bbb