Commit Graph

23 Commits

Author SHA1 Message Date
Witek Bedyk 811acd76c9 Remove project content on master branch
This is step 2b of repository deprecation process as described in [1].
Project deprecation has been announced here [2].

[1] https://docs.openstack.org/project-team-guide/repository.html#step-2b-remove-project-content
[2] http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016814.html

Depends-On: https://review.opendev.org/751983
Change-Id: I83bb2821d64a4dddd569ff9939aa78d271834f08
2020-09-15 10:12:44 +02:00
Witold Bedyk 8257dd9f42 Fix PEP8 tests for Python 3
* replace long with int
* replace execfile with exec

Change-Id: If98949fa5f49091fbf11c95a65302a4f844538c9
Story: 2003240
Task: 26827
2018-10-01 11:21:23 +02:00
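
The two substitutions named in 8257dd9f42 can be illustrated with a minimal sketch. The helper `exec_file`, the temporary settings file and the variable names are hypothetical and are not taken from the repository.

```python
import os
import tempfile

# Python 2: big = long("12345678901234567890")
# Python 3 has a single arbitrary-precision integer type, so int() suffices.
big = int("12345678901234567890")


# Python 2: execfile(path, namespace)
# Python 3: read the file, compile it, and pass the code object to exec().
def exec_file(path, namespace):
    with open(path) as handle:
        code = compile(handle.read(), path, "exec")
    exec(code, namespace)
    return namespace


# Hypothetical settings file, written to a temporary location for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as tmp:
    tmp.write("answer = 6 * 7\n")
settings = exec_file(tmp.name, {})
os.remove(tmp.name)
```

Compiling with the real path keeps file names in tracebacks meaningful, which a bare `exec(open(path).read())` would lose.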
Ashwin Agate fbad704cc2 Remove service_id from pre-transform spec
Remove unused field service_id from pre-transform spec.
service_id's original purpose was to identify the service
that is generating the metric, but this information should
be provided by the source, as a dimension rather
than assigning its value in pre-transform spec.

Change-Id: I223eb2296df438b139e3d9b5aaf4b1b679f70797
Depends-on: I81a35e048e6bd5649c6b3031ac2722be6a309088
Story: 2001815
Task: 12556
2018-07-17 15:39:24 -07:00
Ashwin Agate 0cf08c45c5 Cleanup pre transform and transform specs
* Removed unused fields "event_status", "event_version",
  "record_type", "mount", "device", "pod_name", "container_name",
  "app", "interface", "deployment" and "daemon_set"
  from record_store data. It is no longer necessary to add
  new dimension, meta or value_meta fields to record store data;
  instead, use special notation, e.g. "dimension#", to refer to any
  dimension field in the incoming metric.

* Refactored to eliminate the need to add any new metric.dimensions
  field in multiple places, e.g. to the record store and
  instance usage dataframe schema and to all generic
  aggregation components. Added a new Map type column
  called "extra_data_map" to store any new fields in the
  instance usage data format. The Map type column eliminates the
  need to add new columns to instance usage data.

* Allow users to define any fields in "meta",
  "metric.dimensions" and "metric.value_meta" fields
  for aggregation in "aggregation_group_by_list" or
  "setter_group_by_list" using "dimensions#{$field_name}"
  or "meta#{$field_name}" or "value_meta#{$field_name}"

* Updated generic aggregation components and data formats docs.

Change-Id: I81a35e048e6bd5649c6b3031ac2722be6a309088
Story: 2001815
Task: 19605
2018-05-29 16:35:33 -07:00
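
The "section#field" notation described in 0cf08c45c5 could be resolved roughly as follows. This is only a sketch under stated assumptions: the function `resolve_field`, the envelope layout (a "metric" map holding "dimensions" and "value_meta", next to a "meta" map) and the sample values are illustrative and do not reproduce the project's implementation.

```python
def resolve_field(metric_envelope, field_spec):
    """Look up a group-by field; 'section#name' reaches into a nested map."""
    metric = metric_envelope.get("metric", {})
    # Assumed layout: dimensions and value_meta live under "metric",
    # meta sits beside it at the top level of the envelope.
    sections = {
        "dimensions": metric.get("dimensions", {}),
        "meta": metric_envelope.get("meta", {}),
        "value_meta": metric.get("value_meta", {}),
    }
    if "#" in field_spec:
        section, name = field_spec.split("#", 1)
        if section not in sections:
            raise ValueError("unknown section: %s" % section)
        # Missing fields resolve to None (an assumption of this sketch).
        return sections[section].get(name)
    # Plain names are assumed to be top-level fields.
    return metric_envelope.get(field_spec)


# Sample envelope with made-up values.
envelope = {
    "metric": {
        "name": "example.metric",
        "dimensions": {"pod_name": "web-1"},
        "value_meta": {"unit": "s"},
    },
    "meta": {"tenantId": "abc"},
}
group_by = ["dimensions#pod_name", "meta#tenantId", "value_meta#unit"]
group_key = tuple(resolve_field(envelope, spec) for spec in group_by)
```

Because the section is parsed from the spec at lookup time, a new dimension needs no schema change, which is the point the commit message makes about "extra_data_map".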
Amir Mofakhar 37d4f09057 Update pep8 checks
* set the maximum line length to 100
* cleaned up the code for pep8

Change-Id: Iab260a4e77584aae31c0596f39146dd5092b807a
Signed-off-by: Amir Mofakhar <amofakhar@op5.com>
2018-04-18 10:05:00 +02:00
Ashwin Agate 022bd11a4d Switch to using Spark version 2.2.0
The following changes were required:

1.)
By default the pre-built distribution
for Spark 2.2.0 is compiled with Scala 2.11.
monasca-transform requires Spark compiled with
Scala 2.10 since we use spark streaming to
pull data from Kafka and the version of Kafka
is compatible with Scala 2.10.
The recommended way is to compile Spark
with Scala 2.10, but for the purposes of the devstack
plugin, changes were made to pull the required jars
from mvn directly.
(see SPARK_JARS and SPARK_JAVA_LIB variables in
settings)
All jars get moved to
<SPARK_HOME>/assembly/target/assembly/
target/scala_2.10/jars/
Note: <SPARK_HOME>/jars gets renamed
to <SPARK_HOME>/jars_original.
spark-submit defaults to assembly location
if <SPARK_HOME>/jars directory is missing.

2.) Updated start up scripts for spark
worker and spark master with a new env variable
SPARK_SCALA_VERSION=2.10. Also updated
PYTHONPATH variable to add new
py4j-0.10.4-src.zip file

3.) Some changes to replace deprecated pyspark
function calls which were removed in Spark 2.0

Change-Id: I8f8393bb91307d55f156b2ebf45225a16ae9d8f4
2017-08-21 11:18:22 -07:00
Jenkins b852f7141f Merge "Started adding kubernetes metrics aggregation" 2017-03-03 19:18:47 +00:00
Flint Calvin d8f283c378 Started adding kubernetes metrics aggregation
There is a desire to use Monasca Transform to aggregate
kubernetes metrics.  This change is a start in that
direction.  The transformation specs in the tests
folder now include some representative aggregations
of some kubernetes metrics.

This commit also includes some changes to get
first_attempt_at_spark_test.py working again
after being moved from the unit test folder to the
functional test folder.

Change-Id: I038ecaf42e67d5c994980991232a2a8428a4f4e3
2017-03-02 15:38:25 +00:00
agatea 1579d8b9e5 Reuse existing spark sql context
Prevent creating a new spark sql context object with every batch.
Profiling of java heap for the driver indicated that there is a
steady increase (~12MB over 5 days) of
org.apache.spark.sql.execution.metric.LongSQLMetricValue
and org.apache.spark.sql.execution.ui.SQLTaskMetrics with
each batch execution. These are used by the spark streaming
ui and were not being garbage collected.
See https://issues.apache.org/jira/browse/SPARK-17381
for a similar issue.
This change, along with setting
spark.sql.ui.retainedExecutions to a low number in
spark-defaults.conf, will reduce the gradual increase in heap
size.
Also made a change to catch an unhandled MemberNotJoined exception
because of which the transform service thread went into
an unresponsive state.

Change-Id: Ibf244cbfc00a90ada66f492b473719c25fa17fd2
2017-02-27 14:06:53 -08:00
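
The idea behind 1579d8b9e5, creating the SQL context once and reusing it for every batch, can be sketched without Spark. `FakeSQLContext`, `get_sql_context` and `process_batch` are hypothetical stand-ins chosen for illustration; the real change operates on a Spark SQL context inside the streaming driver.

```python
class FakeSQLContext:
    """Stand-in for a Spark SQL context; counts how many are created."""

    instances = 0

    def __init__(self, spark_context):
        FakeSQLContext.instances += 1
        self.spark_context = spark_context


_sql_context = None  # module-level cache shared by all batches


def get_sql_context(spark_context, factory=FakeSQLContext):
    """Create the context on first use, then return the cached object."""
    global _sql_context
    if _sql_context is None:
        _sql_context = factory(spark_context)
    return _sql_context


def process_batch(spark_context, records):
    # Each batch asks for the context instead of constructing a new one.
    context = get_sql_context(spark_context)
    return context, len(records)


# Three batches share a single context object.
contexts = [process_batch("sc", batch)[0] for batch in ([1, 2], [3], [4, 5, 6])]
```

With one long-lived context, per-execution UI metrics accumulate in a single place, so capping them with spark.sql.ui.retainedExecutions bounds their growth.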
Ashwin Agate c189feeb8b Delete hourly offsets from offsets table
Pre Hourly processor fails if offsets recorded in
kafka_offsets table no longer exist in kafka.
This change deletes the offsets from kafka_offsets
table, so that the pre hourly processor can resume
processing with the next run.

Change-Id: I017c271e630fdf6de05a73b3bfcb14f5ed18615f
2017-01-09 19:35:51 +00:00
Jenkins ef08eea0ce Merge "Add configurable amnesty period for late metrics" 2016-11-24 09:12:34 +00:00
David C Kennedy 26e53336d4 Add configurable amnesty period for late metrics
Added a configuration option to allow the pre-hourly transformation to be
done at a specified period past the hour.  This includes a check to
ensure that, if processing has not yet been done for the hour and is
overdue, it is done at the earliest opportunity.

Change-Id: I8882f3089ca748ce435b4e9a92196a72a0a8e63f
2016-11-22 13:03:52 +00:00
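
The amnesty check described in 26e53336d4 might look like the sketch below. The function name, its parameters and the rule "run once the amnesty period has passed and nothing has been processed since the top of the current hour" are assumptions made for illustration, not the project's code.

```python
from datetime import datetime, timedelta


def is_time_to_run_pre_hourly(now, last_processed, amnesty_minutes):
    """Decide whether pre-hourly processing should start at time `now`."""
    current_hour = now.replace(minute=0, second=0, microsecond=0)
    # Already processed during this hour: nothing to do until the next one.
    if last_processed is not None and last_processed >= current_hour:
        return False
    # Wait out the amnesty period so late metrics can still arrive;
    # once it has passed, any later check runs processing immediately.
    return now >= current_hour + timedelta(minutes=amnesty_minutes)


top_of_hour = datetime(2016, 11, 22, 13, 0)
too_early = is_time_to_run_pre_hourly(
    top_of_hour + timedelta(minutes=5), None, 10)
on_time = is_time_to_run_pre_hourly(
    top_of_hour + timedelta(minutes=10), None, 10)
overdue = is_time_to_run_pre_hourly(
    top_of_hour + timedelta(minutes=45),
    top_of_hour - timedelta(minutes=50), 10)
already_done = is_time_to_run_pre_hourly(
    top_of_hour + timedelta(minutes=45),
    top_of_hour + timedelta(minutes=12), 10)
```

Here `overdue` covers the case where the amnesty minute was missed: the check still succeeds later in the hour, so processing runs at the first opportunity.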
Ashwin Agate c3fcd61f93 Remove unique metric count aggregation
Removing unique metric count aggregation
since it causes data from Kafka to be pulled
and processed twice and leads to an unnecessary
increase in the overall time required to process
a batch.

Change-Id: I2046f95709232979dfd590d5293c803cac05bbb2
2016-11-22 12:58:15 +00:00
Ashwin Agate 1c65ca011b Validate metrics before publishing to kafka
Validate monasca metrics using monasca-common
validate library (requires monasca-common >= 1.1.0)

Change-Id: Iea784edbb3b57db57e6a90d1fc557b2c386c3713
2016-09-30 19:07:02 +00:00
Flint Calvin 0ea79c0305 Added aggregation results to application log
Made changes such that debug-level log entries are written to
the application log noting which aggregated metrics are submitted
during pre-hourly and hourly processing.

Change-Id: I64c6a18233614fe680aa0b084570ee7885f316e5
2016-09-23 18:24:20 +00:00
Flint Calvin 0365bfa5bb Modifications to include processing_meta in pre-hourly metrics.
Change-Id: I3464008cf8695864b75cbbbfd6570db5defa8cb5
2016-08-16 22:27:04 +00:00
Ashwin Agate 90b20bfd41 Change to monasca-common simport
Use monasca-common simport library

Closes-Bug: #1596331

Change-Id: I695d6db9c5c49c0120e73b76ea75f7a30222419d
2016-07-09 19:04:19 +00:00
Flint Calvin 1c3a7989e7 Added some bulletproofing to catch invalid configuration
entries for caching levels.  Also changed the calculate_rate
component to use values from instance usage if available (rather
than using 'all').

Change-Id: Ibdbc8d57c2566de76051c9277f9c75225546d4d7
2016-07-07 17:49:11 +00:00
darfed 9c95206bac Corrected log file name
The log file was being duplicated at monasca-transform.log and
monasca_transform.log.  Fixed this to be set simply at
monasca-transform.log.

Change-Id: I6a63737c569b06a271e11b880675edadfbdcc250
2016-06-30 21:59:19 +01:00
Ashwin Agate 00b874a6b3 Two stage transformation
Breaking down the aggregation into two stages.

The first stage aggregates raw metrics frequently and is
implemented as a Spark Streaming job which
aggregates metrics at a configurable time interval
(defaults to 10 minutes) and writes the intermediate
aggregated data, or instance usage data,
to a new "metrics_pre_hourly" kafka topic.

The second stage is implemented
as a batch job using Spark Streaming createRDD
direct stream batch API, which is triggered by the
first stage only when the first stage runs at the
top of the hour.

Also enhanced the kafka offsets table to keep track
of offsets from the two stages along with the streaming
batch time, the last time the version row got updated
and the revision number. By default it should keep
the last 10 revisions of the offsets for each
application.

Change-Id: Ib2bf7df6b32ca27c89442a23283a89fea802d146
2016-06-28 13:47:50 +00:00
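
The two-stage flow in 00b874a6b3 can be pictured with a small in-memory sketch. It assumes a plain sum as the aggregation and uses lists of (timestamp, value) pairs in place of Spark and Kafka; `first_stage`, `second_stage` and the sample data are hypothetical.

```python
from collections import defaultdict


def bucket_start(timestamp, width_seconds):
    """Round a Unix timestamp down to the start of its bucket."""
    return timestamp - (timestamp % width_seconds)


def first_stage(raw_metrics, interval_seconds=600):
    """Sum raw (timestamp, value) pairs into 10-minute intermediate rows."""
    totals = defaultdict(float)
    for timestamp, value in raw_metrics:
        totals[bucket_start(timestamp, interval_seconds)] += value
    return sorted(totals.items())


def second_stage(intermediate, hour_seconds=3600):
    """Roll the intermediate rows up into hourly totals."""
    totals = defaultdict(float)
    for timestamp, value in intermediate:
        totals[bucket_start(timestamp, hour_seconds)] += value
    return sorted(totals.items())


raw = [(0, 1.0), (30, 2.0), (650, 4.0), (3599, 8.0), (3600, 16.0)]
pre_hourly = first_stage(raw)     # stands in for the "metrics_pre_hourly" topic
hourly = second_stage(pre_hourly)
```

The second stage only ever reads the much smaller intermediate rows, which is why it can run as an hourly batch rather than continuously.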
Flint Calvin c7aabb6927 Added aggregation for vm.mem.used_mb and
swiftlm.diskusage.host.val.size.  Also renamed disk.allocation to
vm.disk.allocation and resolved a problem with resource_id not
being found for certain aggregations.

Change-Id: Iad82d149e7a04ed1e0ecfe936b90acfff1dca13e
2016-06-13 22:38:46 +00:00
Flint Calvin e4ade60711 Implemented aggregation for disk.allocation. Also set the
apache download source to use the archive site to ensure that
the dependency package does not disappear.  Also brought the
vagrant environment in line with monasca-api (i.e., use the
same values for private network, add substitution for kafka
brokers ip address to the conf).  Also parameterised
dependency sources (i.e., added settings to parameterise the
maven and apache repositories for the devstack plugin).

Change-Id: If9f0e2ed16bbfcd62152d29e5c7c86f5d555f9aa
2016-06-01 15:56:26 +00:00
Ashwin Agate 8f61dd95a9 monasca-transform initial commit
The monasca-transform is a new component in Monasca that
aggregates and transforms metrics.

monasca-transform is a Spark based data driven aggregation
engine which collects, groups and aggregates existing individual
Monasca metrics according to business requirements and publishes
new transformed (derived) metrics to the Monasca Kafka queue.

Since the new transformed metrics are published as any other
metric in Monasca, alarms can be set and triggered on the
transformed metric, just like any other metric.

Co-Authored-By: Flint Calvin <flint.calvin@hp.com>
Co-Authored-By: David Charles Kennedy <david.c.kennedy@hpe.com>
Co-Authored-By: Ashwin Agate <ashwin.agate@hp.com>

Implements: blueprint monasca-transform

Change-Id: I0e67ac7a4c9a5627ddaf698855df086d55a52d26
2016-05-26 00:10:37 +00:00