Commit Graph

17 Commits

Author SHA1 Message Date
Craig Bryant 106088887a Log as error failure to send to Kafka
Exception on failure to send to Kafka was only being logged
at the debug level. Increased to error level as this is a major
failure in the Threshold Engine functionality

Change-Id: I131d6d7a20cd0e907334cf5d0ff6fac342e8f320
2018-01-18 10:47:30 -07:00
Craig Bryant 2acdd58dc3 Implement the Last Function
The Alarm state is driven by the last measurement with the newest
timestamp. Use the value even if the measurement is older than the
oldest bucket. This ensures the measurement will be used when the
Threshold Engine is started if the measurement was received while the
Threshold Engine was stopped

Never evaluate a subAlarm with function Last except on receipt of
a measurement.

Add tests to ensure this works.

The change is dependent on the monasca-common change and the
change to monasca-api to add the state field to sub_alarm.

Change-Id: Ib5123ed035018757a50d9ebeb7335fbca48054f2
Implements: Blueprint last-value
2016-08-02 12:05:37 -06:00
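
The Last behavior described in commit 2acdd58dc3 can be sketched roughly as follows. This is an illustration in Python, not the project's Java code: the `LastTracker` class, the `evaluate` helper, and the greater-than comparison are all assumptions made for the example. The point it shows is that the newest measurement wins regardless of how old it is, and that evaluation only changes when a measurement arrives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LastTracker:
    """Hypothetical stand-in for a sub-alarm that uses the Last function."""
    value: Optional[float] = None
    timestamp_ms: Optional[int] = None

    def add_measurement(self, value: float, timestamp_ms: int) -> bool:
        """Keep the measurement if it is the newest seen so far.

        There is deliberately no "too old for the window" check here:
        a measurement is used even if it predates the oldest bucket.
        Returns True when the measurement replaced the stored one.
        """
        if self.timestamp_ms is not None and timestamp_ms < self.timestamp_ms:
            return False
        self.value = value
        self.timestamp_ms = timestamp_ms
        return True


def evaluate(tracker: LastTracker, threshold: float) -> str:
    """Assumed greater-than operator; only called after a measurement arrives."""
    if tracker.value is None:
        return "UNDETERMINED"
    return "ALARM" if tracker.value > threshold else "OK"


tracker = LastTracker()
tracker.add_measurement(5.0, timestamp_ms=1000)
tracker.add_measurement(1.0, timestamp_ms=500)  # older, so it is ignored
state = evaluate(tracker, threshold=3.0)
```

In this sketch the out-of-order measurement at 500 ms does not displace the one at 1000 ms, so the state stays driven by the value 5.0.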
Michael James Hoppal 865816dd78 Add millisecond resolution to alarms.
At the moment we are using the MySQL built-in function
NOW(), which returns second resolution.

Change-Id: I1192abb5aab3a9110721cc68f5a1d16a38f77c10
2016-07-13 08:14:35 -06:00
Craig Bryant 0e72d867ec Change to use Storm 1.0.0 instead of 0.9.x
Storm classes changed from starting with backtype to org.apache

Since this is a major backwards incompatible change, increment the
jar version

Copy some Stream classes from monasca-common. They were only used for
monasca-thresh anyways and having them in a separate repo made it
harder to make this change. A later review will remove these classes
from monasca-common

Need to have an explicit dependency on commons-codec

Change-Id: I36db83ce7fdea02ae4df267cf0820e49dcdf3001
2016-06-09 14:14:23 -06:00
Ryan Brandt c65fba06b0 Pass link and lifecycle state in state transitions
Requires changes in monasca-api, monasca-common to use

Change-Id: Ibf592a5e333f348895df6c681c23d0a34c115045
2016-01-19 14:28:19 -07:00
Ryan Brandt c6f025016d Clean up orphaned alarms on alarm definition delete
Change-Id: I8dea800768989b80f1fe810f1c9572ed439d0133
2015-05-08 12:31:22 -06:00
Craig Bryant 6bdef9f492 AlarmStateTransitionedEvent timestamp now in ms
This will ensure a unique timestamp. Influx will only keep one
entry with the same timestamp

Change-Id: Ibf1001fea9328a6541381d344221b86e39996e1d
2015-04-14 11:13:43 -06:00
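
The reasoning in commit 6bdef9f492 can be illustrated with a small Python sketch. The `stored_entries` function is a hypothetical model of a store that keeps one entry per timestamp, which is how the commit message describes Influx; the timestamps and event labels are invented for the example. Millisecond resolution makes collisions far less likely, although two events in the same millisecond would still collide.

```python
def stored_entries(events):
    """Simulate a store that keeps only one entry per timestamp."""
    store = {}
    for timestamp, event in events:
        # A later event with the same timestamp overwrites the earlier one.
        store[timestamp] = event
    return store


# Two state transitions 250 ms apart, within the same second.
t1_ms = 1_428_000_000_100
t2_ms = t1_ms + 250
events_ms = [(t1_ms, "OK -> ALARM"), (t2_ms, "ALARM -> OK")]

# The same events truncated to second resolution.
events_s = [(ts // 1000, event) for ts, event in events_ms]

by_second = stored_entries(events_s)  # the second event replaces the first
by_ms = stored_entries(events_ms)     # both events survive
```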
Michael James Hoppal 35d3976342 Adding subalarm values to alarm state transitions
Change-Id: Ifeb8fa8e203adfab8ec84ec3aad13835dc2c62ea
2015-02-24 09:19:09 -07:00
Craig Bryant 46f2f5f3df Improve performance of Alarm Creation
The AlarmCreationBolt now caches AlarmDefinitions and Alarms for
quicker evaluation of incoming metrics. Incoming metrics end up in one
of these buckets:

1. Fits into an existing Alarm
2. Causes a new Alarm to be created
3. Already exists in an existing Alarm

All of these require the analysis of existing Alarms. I tried writing SQL
to do this analysis but it just wasn't fast enough so instead I added
the caching of the Alarm Definitions and Alarms.

The AlarmCreationBolt now needs to process Alarm deletion messages, so
that stream from the EventProcessingBolt had to be hooked up to the
AlarmCreationBolt.

The AlarmCreationBolt used to incorrectly handle
Alarm Definition Updated events.

Improved the queries in AlarmDAOImpl to be more efficient by using
fewer queries instead of multiple queries per Alarm. However, the
AlarmDAOImplTest now requires a real MySQL instance since the h2
emulator doesn't understand "group_concat". Mark that test as
only run for target integration-test

Turn on tests for target integration-test

Previous code was not reusing Metrics from existing Alarms
every time it should have. Added a test for this case

Changed info messages to debug to lessen normal logging.

Added more tests of existing and new functionality

Added some timing code for debug

Removed unused code

Added more debug logging code

Added reference to API doc for Alarm Definitions in README

Change-Id: Ied9841ecde7608b9eb1eb9c110b73b079ede71bc
2015-01-09 12:35:56 -07:00
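
The three outcomes listed in commit 46f2f5f3df can be sketched with an in-memory cache. This is a heavily simplified Python illustration, not the AlarmCreationBolt itself: the `AlarmCache` class, the `metric` helper, and the rule that metrics sharing the same match-by dimension values belong to the same alarm are assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A metric is modeled as (name, frozen set of dimension pairs).
Metric = Tuple[str, FrozenSet[Tuple[str, str]]]


def metric(name: str, **dims: str) -> Metric:
    return (name, frozenset(dims.items()))


class AlarmCache:
    """Hypothetical cache of the alarms for a single alarm definition."""

    def __init__(self, match_by: List[str]) -> None:
        self.match_by = match_by
        # Maps match-by dimension values to the metrics of one alarm.
        self.alarms: Dict[Tuple, Set[Metric]] = {}

    def classify(self, m: Metric) -> str:
        dims = dict(m[1])
        key = tuple(dims.get(d) for d in self.match_by)
        metrics = self.alarms.get(key)
        if metrics is None:
            self.alarms[key] = {m}
            return "new_alarm"                # causes a new Alarm
        if m in metrics:
            return "already_in_alarm"         # already exists in an Alarm
        metrics.add(m)
        return "added_to_existing_alarm"      # fits into an existing Alarm


cache = AlarmCache(match_by=["hostname"])
cpu = metric("cpu.idle_perc", hostname="h1")
mem = metric("mem.free_mb", hostname="h1")
results = [cache.classify(cpu), cache.classify(mem), cache.classify(cpu)]
```

Every lookup here is a dictionary or set operation, which is the kind of saving the commit attributes to caching over repeated SQL analysis.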
Craig Bryant 50dabd0847 Messages not distributed across Kafka partitions
Change AlarmStateTransitionNotification Kafka message key to be
the message count instead of a constant to get a good
distribution across Kafka partitions

JAH-926 event and AlarmStateTransition messages are not
distributed across Kafka partitions

Change-Id: I0291168635f2311585de011d9003f6af06120b86
2014-12-11 16:38:30 -07:00
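
The effect described in commit 50dabd0847 can be demonstrated with a small Python sketch. It assumes a partitioner that hashes the message key modulo the partition count; `zlib.crc32` stands in for whatever hash the real Kafka client uses, and the function names and the partition count of 4 are invented for the example.

```python
import zlib
from collections import Counter


def partition_for(key: str, num_partitions: int) -> int:
    """Assumed key-hash partitioner; crc32 is only a stand-in hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def distribution(keys, num_partitions: int) -> Counter:
    """Count how many messages land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)


# A constant key sends every message to the same partition.
constant = distribution(["A"] * 1000, num_partitions=4)

# Using the message count as the key spreads messages across partitions.
counted = distribution([str(i) for i in range(1000)], num_partitions=4)
```

With a constant key `constant` contains a single partition, while `counted` covers more than one.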
Victor Ion Munteanu 04e92ad9cc Fixes a typo, lack of explicit modifier, duplication of ;;
Change-Id: Idf42a1dd7707d258bee379082eb4acb5d35ba8b2
2014-11-14 17:03:04 +01:00
Craig Bryant 3cfa7c54cc Handle AlarmDefinitionUpdatedEvents
The ThresholdEngine can now handle AlarmDefinition updates where the
associated Metrics don't change. So, the only things that can
change are operator, function, threshold, period, periods and the boolean
operators linking AlarmSubExpressions together. The number of
AlarmSubExpressions and anything else like matchBy can't change.

Tie the SubAlarms to the AlarmSubExpressions in the
Alarm Definition by the ID of the AlarmSubExpressions

Move the logic that handles a SubExpression update into SubAlarmStats

Route the update messages to the proper bolts on the proper streams

Fix a problem in AlarmDAOImpl where the AlarmedMetrics didn't have
their dimensions set properly

Rewrite ThresholdingEngineAlarmTest to make it clearer how it works. Also,
test the new functionality

SubExpression needs to be registered for Serialization or weird things
happen where sometimes it is serialized properly and sometimes the
id is null

Fix the ThresholdingEngineTest so all three tests can run. Topology
has to be torn down and rebuilt each time

Change-Id: I8b37d7d4791463a6e704d10d75e35332818d3a68
2014-10-21 13:23:58 -06:00
Craig Bryant adf36b0ae5 Use new monasca-common package names
Change-Id: I1fd7497b5ef7f983c4b3957a185030a6cacef987
2014-10-13 17:15:12 -06:00
Craig Bryant 032f068f82 Add severity to AlarmStateTransitionedEvent
Had to add severity to AlarmDefinition and read from database

Handle severity change in AlarmDefinitionUpdateEvent

Simplified AlarmDefinitionDAOImpl

Change-Id: Id619272fd605f5e27187f5a0db29bac820e51cbc
2014-10-10 13:35:57 -06:00
Craig Bryant c3f15d8c33 Get Alarm updating to work properly. Previously, the state would
not change back to the real state

Get Alarm Deletion to work properly. Previously, it would not clean
up the Alarms or Sub Alarms

Get the Alarm Definition Deletion Event to work correctly. Previously
it wasn't handled

Get the Alarm Definition Updated Event to work correctly, at least
for name, description and severity

Finish implementing some TODOs in AlarmCreationBolt and add tests

Fixed some code that could handle Alarm Definition Updates in
the MetricAggregationBolt but the right calls aren't made in
the rest of the code, yet

Removed some unused or duplicated test java files

Turn the tests back on as they now run

Change-Id: Idb6c92d35e2273601411ca9c0f7d7ba45c61ad55
2014-10-09 17:11:25 -06:00
Craig Bryant 0743e0a8e3 Changes for Alarmed Metrics
This code creates Alarms correctly but can't delete them or
Alarm Definitions. It also does not handle Alarm Definition
updates.

This code is dependent on the new monasca-common changes that
have not been merged.

MetricFilteringBolt reads AlarmDefinitions on startup.
MetricFilteringBolt sends the message that a new MetricDefinition
was found for an AlarmDefinition.

MetricAggregationBolt gets SubAlarm messages for SubAlarm creation
instead of getting SubAlarms from the database

Created the AlarmCreationBolt which will handle AlarmCreation when
the metrics that fill an alarm are seen. Also, will add Alarmed Metrics
to an existing alarm if they match.

Temporarily renamed sub_alarm table to sub_alarms table to not conflict
with API table.

Using the BeanMapper to directly map an AlarmDefinition caused an issue
with the older version of guava used by storm so had to hack around that.

Allow more than one AlarmCreationBolt because it routes by
AlarmDefinitionId

Change the routing for MetricFilteringBolts so they always get the same
tenantId + Metric name. Prevents multiple messages from going to the
AlarmCreationBolt

Removed MetricDefinitionDAO and all related classes since they are
not used anymore

Make it so there is only one AlarmCreationBolt because there is
no code protection for ensuring dimensions aren't created twice
for concurrent creates

Tests are turned off for now

Change-Id: I6747d789bbe0f7f723e19060bf19c3fe318c1e3e
2014-09-30 13:00:02 -06:00
Craig Bryant f72caef55f Change packages to monasca.thresh
The diffs are going to be very hard to read on this change, but
basically the change is from com.hpcloud.mon to monasca.thresh.

Since monasca-common packages haven't changed, yet, they are still
referenced as com.hpcloud.mon, but everything else has changed to
monasca.thresh.

The init script has also been changed to invoke the new package path.

Change-Id: I5aad9302a8f5b11ed1b9874160f0d7a9882c5b2d
2014-08-05 22:03:47 -06:00