monasca-transform/tests/unit/test_resources/metrics_pre_hourly_data
Ashwin Agate 00b874a6b3 Two stage transformation
Breaking down the aggregation into two stages.

The first stage is implemented as a Spark Streaming job
that aggregates raw metrics at a configurable
time interval (defaulting to 10 minutes) and writes
the intermediate aggregated data, or instance usage
data, to a new "metrics_pre_hourly" kafka topic.
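
A minimal sketch of the first stage, assuming the old
pyspark.streaming.kafka direct stream API; the topic names and the
aggregate_metrics/publish_to_kafka helpers are hypothetical
stand-ins for the project's actual transformation code:

    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="pre_hourly_transform")
    # 600-second batches match the default 10-minute aggregation interval
    ssc = StreamingContext(sc, 600)

    raw_metrics = KafkaUtils.createDirectStream(
        ssc, ["metrics"], {"metadata.broker.list": "localhost:9092"})

    def aggregate_and_publish(rdd):
        # hypothetical helper: rolls raw metrics up into instance usage data
        instance_usage = aggregate_metrics(rdd)
        # hypothetical helper: publishes the intermediate aggregated data
        # to the kafka topic consumed by the second stage
        publish_to_kafka(instance_usage, topic="metrics_pre_hourly")

    raw_metrics.foreachRDD(aggregate_and_publish)
    ssc.start()
    ssc.awaitTermination()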

The second stage is implemented
as a batch job using the Spark Streaming createRDD
direct stream batch API, and is triggered by the
first stage only when the first stage runs at the
top of the hour.
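
A minimal sketch of the second stage, assuming the pyspark
KafkaUtils.createRDD batch API; the offset bounds shown here would
come from the kafka offsets table, and aggregate_to_hour is a
hypothetical helper:

    from pyspark import SparkContext
    from pyspark.streaming.kafka import KafkaUtils, OffsetRange

    sc = SparkContext(appName="hourly_transform")

    # read exactly the pre-hourly records published since the last
    # hourly run; real offsets come from the kafka offsets table
    offsets = [OffsetRange("metrics_pre_hourly", 0,
                           fromOffset=0, untilOffset=1000)]
    pre_hourly = KafkaUtils.createRDD(
        sc, {"metadata.broker.list": "localhost:9092"}, offsets)

    # hypothetical helper: rolls 10-minute instance usage records
    # up into the final hourly aggregates
    hourly_metrics = aggregate_to_hour(pre_hourly)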

Also enhanced the kafka offsets table to keep track
of offsets from both stages, along with the streaming
batch time, the time the version row was last
updated, and a revision number. By default it keeps
the last 10 revisions of the offsets for each
application.
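
An illustrative sketch of the row shape and revision retention
described above; every field name here is an assumption, not the
actual table definition:

    # one offsets row per application and revision; fields are illustrative
    offset_revision = {
        "app_name": "pre_hourly_stage",   # which stage wrote the offsets
        "topic": "metrics_pre_hourly",
        "partition": 0,
        "from_offset": 0,
        "until_offset": 1000,
        "batch_time": "2016-06-28 13:40:00",    # streaming batch time
        "last_updated": "2016-06-28 13:47:50",  # when this version row changed
        "revision": 1,                          # 1 = most recent
    }

    MAX_REVISIONS = 10  # default: keep the last 10 revisions per application

    def prune_revisions(rows, app_name):
        # keep only the newest MAX_REVISIONS revisions for one application
        kept = sorted((r for r in rows if r["app_name"] == app_name),
                      key=lambda r: r["revision"])
        return kept[:MAX_REVISIONS]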

Change-Id: Ib2bf7df6b32ca27c89442a23283a89fea802d146
2016-06-28 13:47:50 +00:00
__init__.py Two stage transformation 2016-06-28 13:47:50 +00:00
data_provider.py Two stage transformation 2016-06-28 13:47:50 +00:00
metrics_pre_hourly_data.txt Two stage transformation 2016-06-28 13:47:50 +00:00