monasca-transform

Commit Graph

Author	SHA1	Message	Date
Witek Bedyk	811acd76c9	Remove project content on master branch This is step 2b of repository deprecation process as described in [1]. Project deprecation has been anounced here [2]. [1] https://docs.openstack.org/project-team-guide/repository.html#step-2b-remove-project-content [2] http://lists.openstack.org/pipermail/openstack-discuss/2020-August/016814.html Depends-On: https://review.opendev.org/751983 Change-Id: I83bb2821d64a4dddd569ff9939aa78d271834f08	2020-09-15 10:12:44 +02:00
Ashwin Agate	0cf08c45c5	Cleanup pre transform and transform specs * Removed unused fields "event_status", "event_version", "record_type", "mount", "device", "pod_name", "container_name" "app", "interface", "deployment" and "daemon_set" from record_store data. Now it is not required to add new dimension, meta or value_meta fields to record store data instead use special notation, e.g. "dimension#" to refer to any dimension field in the incoming metric. * Refactor and eliminate need to add any new metric.dimensions field in multiple places e.g. add to record store and instance usage dataframe schema and in all generic aggregation components. Added a new Map type column called "extra_data_map" to store any new fields, in instance usage data format. Map type column eliminates the need to add new columns to instance usage data. * Allow users to define any fields in "meta", "metric.dimensions" and "metric.value_meta" fields for aggregation in "aggregation_group_by_list" or "setter_group_by_list" using "dimensions#{$field_name}" or "meta#{$field_name}" or "value_meta#{$field_name}" * Updated generic aggregation components and data formats docs. Change-Id: I81a35e048e6bd5649c6b3031ac2722be6a309088 Story: 2001815 Task: 19605	2018-05-29 16:35:33 -07:00
Amir Mofakhar	37d4f09057	Update pep8 checks * set the maximum line length to 100 * cleaned up the codes for pep8 Change-Id: Iab260a4e77584aae31c0596f39146dd5092b807a Signed-off-by: Amir Mofakhar <amofakhar@op5.com>	2018-04-18 10:05:00 +02:00
Ashwin Agate	3589dd0820	Set region in metric meta from config file - set region in 'meta' section of aggregated metric from config file. Change-Id: I82b00e9d88704835537cfd0285678b94bcc0f3a8	2018-03-27 16:47:46 -07:00
Ashwin Agate	022bd11a4d	Switch to using Spark version 2.2.0 Following changes were required: 1.) By default the pre-built distribution for Spark 2.2.0 is compiled with Scala 2.11. monasca-transform requires Spark compiled with Scala 2.10 since we use spark streaming to pull data from Kafka and the version of Kafka is compatible with Scala 2.10. The recommended way is to compile Spark with Scala 2.10, but for purposes of devstack plugin made changes to pull the required jars from mvn directly. (see SPARK_JARS and SPARK_JAVA_LIB variables in settings) All jars get moved to <SPARK_HOME>/assembly/target/assembly/ target/scala_2.10/jars/ Note: <SPARK_HOME>/jars gets renamed to <SPARK_HOME>/jars_original. spark-submit defaults to assembly location if <SPARK_HOME>/jars directory is missing. 2.) Updated start up scripts for spark worker and spark master with a new env variable SPARK_SCALA_VERSIOn=2.10. Also updated PYTHONPATH variable to add new py4j-0.10.4-src.zip file 3.) Some changes to adhere to deprecated pyspark function calls which were removed in Spark 2.0 Change-Id: I8f8393bb91307d55f156b2ebf45225a16ae9d8f4	2017-08-21 11:18:22 -07:00
Flint Calvin	d8f283c378	Started adding kubernetes metrics aggregation There is a desire to use Monasca Transform to aggregate kubernetes metrics. This change is a start in that direction. The transformation specs in the tests folder now include some representative aggregations of some kubernetes metrics. This commit also includes some changes to get first_attempt_at_spark_test.py working again after being moved from the unit test folder to the functional test folder. Change-Id: I038ecaf42e67d5c994980991232a2a8428a4f4e3	2017-03-02 15:38:25 +00:00
Cao Xuan Hoang	6cba31c0dd	Replaced e.message with str(e) For logging the exception message: e.message has been deprecated. The preferred way is to call str(e). More details: https://www.python.org/dev/peps/pep-0352/ Change-Id: I27b6a7b1f5e336df3cd618684cedfd01c840c99f	2017-01-19 02:00:43 +00:00
David C Kennedy	8a6e619f72	Populate the project id for kafka publish This needs to be the admin project id so for devstack this needs to be written to the configuration file once the users/projects etc are created and identifiable. Add a similar process to the refresh script. Correct the configuration property name to 'project' rather than using the old nomencature 'tenant'. Change-Id: Ib9970ffacf5ee0f7f006722038a1db8024c1385e	2016-11-11 15:24:22 +00:00
Ashwin Agate	1c65ca011b	Validate metrics before publishing to kafka Validate monasca metrics using monasca-common validate library (requires monasca-common >= 1.1.0) Change-Id: Iea784edbb3b57db57e6a90d1fc557b2c386c3713	2016-09-30 19:07:02 +00:00
Flint Calvin	4edad0286a	Eliminated ceiling function for utilization metrics Removed the calls to the ceiling function on utilization metrics aggregation such that they now are exact values (i.e., not rounded up to the next integral value). Change-Id: I9813b94acb051f6754da2d559090318010f86e57	2016-09-09 21:07:29 +00:00
Flint Calvin	3cdb0d1687	Made corrections such that swiftlm.diskusage.rate_agg is now correctly based on swiftlm.diskusage.host.val.avail (instead of incorrectly being based on swiftlm.diskusage.host.val.size). Change-Id: If17853e166c050cefbf390791a8696ce520fca96	2016-09-06 20:52:49 +00:00
Flint Calvin	eff0e74c50	Eliminated processing_meta from hourly metrics. Change-Id: Ibcb00cae12d87e93546bddf14a796662425f7167	2016-08-18 23:25:06 +00:00
Flint Calvin	0365bfa5bb	Modifications to include processing_meta in pre-hourly metrics. Change-Id: I3464008cf8695864b75cbbbfd6570db5defa8cb5	2016-08-16 22:27:04 +00:00
Flint Calvin	615e52d5cd	Modifications to make rate calculations work with two-stage aggregation. Change-Id: I8c7b6112a04ba378ba1911a342cb97e8c388ebc6	2016-08-09 16:33:34 +00:00
Jenkins	beae384c2d	Merge "add test cases for fetch_quantity_prehourly_instance_usage"	2016-07-28 14:51:21 +00:00
Michael Dong	acba1782ad	add test cases for fetch_quantity_prehourly_instance_usage Removed dependency of yaml module. Change-Id: I87ce80d420bc75ddbef2c8454f088be25f9ff908	2016-07-25 12:19:24 -07:00
Jenkins	aa7addbc84	Merge "timestamp parsed into utc instead of localtime(which was default) Closes Bug:#160531"	2016-07-25 15:28:03 +00:00
Michael Dong	c498564929	add 'string' to all firstrecord_timestamp and lastrecord_timestamp to fix bug Closes-Bug:#1603529 Change-Id: I5585fc5c376e3220d79d22be3394e3b6ad0e6214	2016-07-19 13:22:51 -07:00
Flint Calvin	c7128b0136	Added filter capability for transform specs. Change-Id: Ie5b456039c9810da19c1699cc7d5a44277496843	2016-07-18 22:24:05 +00:00
Michael Dong	bbf977a501	timestamp parsed into utc instead of localtime(which was default) Closes Bug:#160531 Change-Id: I74454eda759af5dfaf33b58b4ea18142350129b0	2016-07-15 11:46:30 -07:00
Flint Calvin	1c3a7989e7	Added some bulletproofing to catch invalid configuration entries for caching levels. Also changed the calculate_rate component to use values from instance usage if available (rather than using 'all'). Change-Id: Ibdbc8d57c2566de76051c9277f9c75225546d4d7	2016-07-07 17:49:11 +00:00
Ashwin Agate	00b874a6b3	Two stage transformation Breaking down the aggregation into two stages. The first stage aggregates raw metrics frequently and is implemented as a Spark Streaming job which aggregates metrics at a configurable time interval (defaults to 10 minutes) and writes the intermediate aggregated data, or instance usage data to new "metrics_pre_hourly" kafka topic. The second stage is implemented as a batch job using Spark Streaming createRDD direct stream batch API, which is triggered by the first stage only when first stage runs at the top of the hour. Also enhanced kafka offsets table to keep track of offsets from two stages along with streaming batch time, last time version row got updated and revision number. By default it should keep last 10 revisions to the offsets for each application. Change-Id: Ib2bf7df6b32ca27c89442a23283a89fea802d146	2016-06-28 13:47:50 +00:00
Flint Calvin	d8e73f3bde	Added several Swift aggregations (including a new usage component for calculating rate changes). Also fixed some pep8 issues. Change-Id: I46685d39ace663595aa524f04d8d35a71c9432c3	2016-06-21 19:44:02 +00:00
Ashwin Agate	8f61dd95a9	monasca-transform initial commit The monasca-transform is a new component in Monasca that aggregates and transforms metrics. monasca-transform is a Spark based data driven aggregation engine which collects, groups and aggregates existing individual Monasca metrics according to business requirements and publishes new transformed (derived) metrics to the Monasca Kafka queue. Since the new transformed metrics are published as any other metric in Monasca, alarms can be set and triggered on the transformed metric, just like any other metric. Co-Authored-By: Flint Calvin <flint.calvin@hp.com> Co-Authored-By: David Charles Kennedy <david.c.kennedy@hpe.com> Co-Authored-By: Ashwin Agate <ashwin.agate@hp.com> Implements: blueprint monasca-transform Change-Id: I0e67ac7a4c9a5627ddaf698855df086d55a52d26	2016-05-26 00:10:37 +00:00

24 Commits