* Removed unused fields "event_status", "event_version",
"record_type", "mount", "device", "pod_name", "container_name",
"app", "interface", "deployment" and "daemon_set"
from record_store data. It is no longer necessary to add
new dimension, meta or value_meta fields to record store data;
instead, use the special notation, e.g. "dimension#", to refer to any
dimension field in the incoming metric.
* Refactored to eliminate the need to add any new metric.dimensions
field in multiple places, e.g. to the record store and
instance usage dataframe schemas and to all generic
aggregation components. Added a new Map type column
called "extra_data_map" to store any new fields in the
instance usage data format. The Map type column eliminates the
need to add new columns to instance usage data.
* Allow users to define any fields in "meta",
"metric.dimensions" and "metric.value_meta" fields
for aggregation in "aggregation_group_by_list" or
"setter_group_by_list" using "dimensions#{$field_name}",
"meta#{$field_name}" or "value_meta#{$field_name}".
* Updated generic aggregation components and data formats docs.
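The "extra_data_map" and "dimensions#" ideas above can be illustrated with a minimal, self-contained Python sketch. Plain dicts stand in for Spark rows, and the field names, spec names and the aggregate() helper are simplified assumptions for illustration, not the actual monasca-transform schema or API:

```python
# Sketch: records carry a generic "extra_data_map", so new incoming
# metric fields need no schema change; a group-by list can reference
# them with the "dimensions#<name>" notation at aggregation time.
from collections import defaultdict

def resolve_field(record, field):
    # "dimensions#pod_name" is looked up in the record's extra_data_map;
    # plain names are looked up as ordinary top-level fields.
    if "#" in field:
        return record["extra_data_map"].get(field)
    return record.get(field)

def aggregate(records, group_by_list, value_field="quantity"):
    # Sum value_field per group key built from the group-by list.
    groups = defaultdict(float)
    for rec in records:
        key = tuple(resolve_field(rec, f) for f in group_by_list)
        groups[key] += rec[value_field]
    return dict(groups)

records = [
    {"metric_name": "pod.cpu.total_time", "quantity": 2.0,
     "extra_data_map": {"dimensions#pod_name": "web-1"}},
    {"metric_name": "pod.cpu.total_time", "quantity": 3.0,
     "extra_data_map": {"dimensions#pod_name": "web-1"}},
    {"metric_name": "pod.cpu.total_time", "quantity": 5.0,
     "extra_data_map": {"dimensions#pod_name": "db-1"}},
]

result = aggregate(records, ["metric_name", "dimensions#pod_name"])
```

Because the per-metric fields live in one Map-typed column, adding a new dimension to an incoming metric changes only the data, not the dataframe schema or the aggregation components.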
Change-Id: I81a35e048e6bd5649c6b3031ac2722be6a309088
Story: 2001815
Task: 19605
With this change the pre-hourly processor, which does the
hourly aggregation (second stage) and writes the
final aggregated metrics to the metrics topic in Kafka,
now accounts for any early arriving metrics.
This change, along with two previous changes
to the pre-hourly processor that added
1.) configurable late metrics slack time
(https://review.openstack.org/#/c/394497/), and
2.) batch filtering
(https://review.openstack.org/#/c/363100/),
ensures that all late arriving or early
arriving metrics for an hour are aggregated
appropriately.
Also improved the MySQL offsets component so that
the deletion of excess revisions is called only once.
Change-Id: I919cddf343821fe52ad6a1d4170362311f84c0e4
There is a desire to use Monasca Transform to aggregate
Kubernetes metrics. This change is a start in that
direction. The transformation specs in the tests
folder now include some representative aggregations
of some Kubernetes metrics.
This commit also includes some changes to get
first_attempt_at_spark_test.py working again
after being moved from the unit test folder to the
functional test folder.
Change-Id: I038ecaf42e67d5c994980991232a2a8428a4f4e3
Added a configuration option to allow the pre-hourly transformation to
be done at a specified period past the hour. This includes a check to
ensure that if processing has not yet been done for the hour and is
overdue, it is done at the earliest opportunity.
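The kind of check involved can be sketched as follows. The function name, arguments and exact scheduling rules here are illustrative assumptions, not the actual monasca-transform configuration or code:

```python
# Sketch: decide whether the pre-hourly transformation should run now,
# given a configured minutes-past-the-hour offset and the time of the
# last run. Illustrative logic only.
from datetime import datetime, timedelta

def should_run_pre_hourly(now, last_run, minutes_past_hour=10):
    hour_start = now.replace(minute=0, second=0, microsecond=0)
    scheduled = hour_start + timedelta(minutes=minutes_past_hour)
    prev_scheduled = scheduled - timedelta(hours=1)
    if last_run is not None and last_run >= scheduled:
        return False  # already ran at/after this hour's scheduled time
    if now >= scheduled:
        return True   # at/past the configured period past the hour
    # Before this hour's slot: run early only if the previous
    # hour's slot was missed (i.e. processing is overdue).
    return last_run is None or last_run < prev_scheduled
```

With minutes_past_hour=10 this runs once per hour at HH:10, and an instance that missed its slot (e.g. the service was down) runs as soon as it is next consulted rather than waiting for the following HH:10.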
Change-Id: I8882f3089ca748ce435b4e9a92196a72a0a8e63f
This needs to be the admin project id, so for devstack it needs to be
written to the configuration file once the users/projects etc. are
created and identifiable.
Add a similar process to the refresh script.
Correct the configuration property name to 'project' rather than using
the old nomenclature 'tenant'.
Change-Id: Ib9970ffacf5ee0f7f006722038a1db8024c1385e
Made changes such that debug-level log entries are written to
the application log noting which aggregated metrics are submitted
during pre-hourly and hourly processing.
Change-Id: I64c6a18233614fe680aa0b084570ee7885f316e5
Add properties to the conf file to allow configuration of
SSL for the database connection. This is done for both the Python
and Java connection strings.
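A hedged sketch of what such properties might look like in the service conf file; the section and property names here are illustrative assumptions, not the exact ones added by this change:

```ini
[database]
# illustrative property names only
use_ssl = True
ca_file = /path/to/ssl_certs/ca-chain.pem
```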
Change-Id: I4c3d25c3f8f12eae801a6a818bf4ac7acd93d2dc
The log file was being duplicated at monasca-transform.log and
monasca_transform.log. Fixed this to be set simply at
monasca-transform.log.
Change-Id: I6a63737c569b06a271e11b880675edadfbdcc250
Breaking down the aggregation into two stages.
The first stage aggregates raw metrics frequently and is
implemented as a Spark Streaming job which
aggregates metrics at a configurable time interval
(defaults to 10 minutes) and writes the intermediate
aggregated data, or instance usage data,
to a new "metrics_pre_hourly" Kafka topic.
The second stage is implemented
as a batch job using the Spark Streaming createRDD
direct stream batch API, which is triggered by the
first stage only when the first stage runs at the
top of the hour.
Also enhanced the Kafka offsets table to keep track
of offsets from the two stages along with the streaming
batch time, the last time the version row was updated,
and the revision number. By default it keeps the
last 10 revisions of the offsets for each
application.
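The revision-keeping behaviour described above can be sketched with a small stand-alone example, with SQLite standing in for MySQL. The table layout and column names are simplified assumptions for illustration, not the actual monasca-transform schema:

```python
# Sketch: an offsets table that keeps a bounded history of revisions
# per application, pruning everything older than the newest
# MAX_REVISIONS rows in a single DELETE statement.
import sqlite3

MAX_REVISIONS = 10

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE kafka_offsets (
        app_name TEXT,
        revision INTEGER,
        topic TEXT,
        until_offset INTEGER,
        batch_time TEXT,
        last_updated TEXT
    )""")

def add_revision(app, revision, topic, until_offset, batch_time):
    conn.execute(
        "INSERT INTO kafka_offsets VALUES (?, ?, ?, ?, ?, ?)",
        (app, revision, topic, until_offset, batch_time, batch_time))
    # Delete excess revisions once, keeping only the newest N for this app.
    conn.execute("""
        DELETE FROM kafka_offsets
        WHERE app_name = ?
          AND revision NOT IN (
              SELECT revision FROM kafka_offsets
              WHERE app_name = ?
              ORDER BY revision DESC LIMIT ?)""",
        (app, app, MAX_REVISIONS))

for rev in range(1, 16):   # 15 revisions written; only the last 10 survive
    add_revision("pre_hourly_processor", rev, "metrics_pre_hourly",
                 rev * 100, "2018-01-01 %02d:00:00" % rev)

rows = conn.execute(
    "SELECT MIN(revision), MAX(revision), COUNT(*) "
    "FROM kafka_offsets").fetchone()
```

Keeping a short revision history lets the processor recover or replay from a recent known-good offset without the table growing without bound.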
Change-Id: Ib2bf7df6b32ca27c89442a23283a89fea802d146
Spark could feasibly be installed in any location, so we should
allow SPARK_HOME to be specified in the conf file and that
value used in the spark-submit carried out in the transform
service invocation.
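For illustration, such a property might look roughly like this in the conf file; the section and property names are assumptions, not necessarily the exact ones used by the service:

```ini
[service]
# illustrative: points the transform service's spark-submit
# at the Spark install location
spark_home = /opt/spark/current
```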
Change-Id: I4d25ccaa0e271eeb783d186666cdc8aaf131097c
monasca-transform is a new component in Monasca that
aggregates and transforms metrics.
monasca-transform is a Spark based data driven aggregation
engine which collects, groups and aggregates existing individual
Monasca metrics according to business requirements and publishes
new transformed (derived) metrics to the Monasca Kafka queue.
Since the new transformed metrics are published like any other
metric in Monasca, alarms can be set and triggered on the
transformed metrics, just like on any other metric.
Co-Authored-By: Flint Calvin <flint.calvin@hp.com>
Co-Authored-By: David Charles Kennedy <david.c.kennedy@hpe.com>
Co-Authored-By: Ashwin Agate <ashwin.agate@hp.com>
Implements: blueprint monasca-transform
Change-Id: I0e67ac7a4c9a5627ddaf698855df086d55a52d26