monasca-transform/tests/unit/test_resources/metrics_pre_hourly_data
Ashwin Agate 00b874a6b3 Two stage transformation
Breaking down the aggregation into two stages.

The first stage is implemented as a Spark Streaming job
that aggregates raw metrics at a configurable
time interval (defaulting to 10 minutes) and writes
the intermediate aggregated data, or instance usage
data, to a new "metrics_pre_hourly" kafka topic.
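
A minimal sketch of the first stage, assuming the old
pyspark.streaming.kafka direct stream API; the topic names and the
aggregate_metrics/publish_to_kafka helpers are hypothetical
stand-ins for the project's actual transformation code:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="pre_hourly_transform")
    # 600-second batches match the default 10-minute aggregation interval
    ssc = StreamingContext(sc, 600)

    raw_metrics = KafkaUtils.createDirectStream(
        ssc, ["metrics"], {"metadata.broker.list": "localhost:9092"})

    def aggregate_and_publish(rdd):
        # hypothetical helper: rolls raw metrics up into instance usage data
        instance_usage = aggregate_metrics(rdd)
        # hypothetical helper: publishes the intermediate aggregated data
        # to the kafka topic consumed by the second stage
        publish_to_kafka(instance_usage, topic="metrics_pre_hourly")

    raw_metrics.foreachRDD(aggregate_and_publish)
    ssc.start()
    ssc.awaitTermination()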

The second stage is implemented
as a batch job using the Spark Streaming createRDD
direct stream batch API, and is triggered by the
first stage only when the first stage runs at the
top of the hour.
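
A minimal sketch of the second stage, assuming the pyspark
KafkaUtils.createRDD batch API; the offset bounds shown here would
come from the kafka offsets table, and aggregate_to_hour is a
hypothetical helper:

    from pyspark import SparkContext
    from pyspark.streaming.kafka import KafkaUtils, OffsetRange

    sc = SparkContext(appName="hourly_transform")

    # read exactly the pre-hourly records published since the last
    # hourly run; real offsets come from the kafka offsets table
    offsets = [OffsetRange("metrics_pre_hourly", 0,
                           fromOffset=0, untilOffset=1000)]
    pre_hourly = KafkaUtils.createRDD(
        sc, {"metadata.broker.list": "localhost:9092"}, offsets)

    # hypothetical helper: rolls 10-minute instance usage records
    # up into the final hourly aggregates
    hourly_metrics = aggregate_to_hour(pre_hourly)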

Also enhanced the kafka offsets table to keep track
of offsets from both stages, along with the streaming
batch time, the time the version row was last
updated, and a revision number. By default it keeps
the last 10 revisions of the offsets for each
application.
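
An illustrative sketch of the row shape and revision retention
described above; every field name here is an assumption, not the
actual table definition:

    # one offsets row per application and revision; fields are illustrative
    offset_revision = {
        "app_name": "pre_hourly_stage",   # which stage wrote the offsets
        "topic": "metrics_pre_hourly",
        "partition": 0,
        "from_offset": 0,
        "until_offset": 1000,
        "batch_time": "2016-06-28 13:40:00",    # streaming batch time
        "last_updated": "2016-06-28 13:47:50",  # when this version row changed
        "revision": 1,                          # 1 = most recent
    }

    MAX_REVISIONS = 10  # default: keep the last 10 revisions per application

    def prune_revisions(rows, app_name):
        # keep only the newest MAX_REVISIONS revisions for one application
        kept = sorted((r for r in rows if r["app_name"] == app_name),
                      key=lambda r: r["revision"])
        return kept[:MAX_REVISIONS]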

Change-Id: Ib2bf7df6b32ca27c89442a23283a89fea802d146
2016-06-28 13:47:50 +00:00
__init__.py Two stage transformation 2016-06-28 13:47:50 +00:00
data_provider.py Two stage transformation 2016-06-28 13:47:50 +00:00
metrics_pre_hourly_data.txt Two stage transformation 2016-06-28 13:47:50 +00:00