openstack-helm-infra

Commit Graph

Author	SHA1	Message	Date
Vladimir Kozhukalov	ae91cf3fc3	Use deploy-env role for all deployment jobs To make it easier to maintain the jobs all experimental jobs (those which are not run in check and gate pipelines) are moved to a separate file. They will be revised later to use the same deploy-env role. Also many charts use Openstack images for testing this PR adds 2023.1 Ubuntu Focal overrides for all these charts. Change-Id: I4a6fb998c7eb1026b3c05ddd69f62531137b6e51	2023-09-22 15:02:07 -05:00
Terekhin, Alexey (at4945)	aa3efe9715	Adding the feature to launch Prometheus process with custom script This change adds feature to launch Prometheus process using a custom script which should be stored in override values. Because the known issue https://github.com/prometheus/prometheus/issues/6934 is still open many years, we are going to struggle with growing WAL files using our custom downstream wrapper script which stops Prometheus process and deletes WALs. This solution can not fit all customers because completely kills wal cached data but it is ok for our purposes. Such way I just added the feature to use another custom script to launch Prometheus and left original functionality by default. Default/custom mode are defined in 'values.yaml' as the body of the custom launcher script. Change-Id: Ie02ea1d6a7de5c676e2e96f3dcd6aca172af4afb	2022-12-29 16:09:22 -08:00
Brian Haley	f31cfb2ef9	support image registries with authentication Based on spec in openstack-helm repo, support-OCI-image-registry-with-authentication-turned-on.rst Each Helm chart can configure an OCI image registry and credentials to use. A Kubernetes secret is then created with these info. Service Accounts then specify an imagePullSecret specifying the Secret with creds for the registry. Then any pod using one of these ServiceAccounts may pull images from an authenticated container registry. Change-Id: Iebda4c7a861aa13db921328776b20c14ba346269	2022-07-20 14:28:47 -05:00
Gage Hugo	711d5706dd	Update default image value for prometheus This change updates the default image value in the prometheus chart from newton to wallaby for the helm_test image. Change-Id: I0f70734a8455661f7705baeed3cafbaf529c56a8	2022-04-28 17:23:04 +00:00
Gage Hugo	22e50a5569	Update htk requirements This change updates the helm-toolkit path in each chart as part of the move to helm v3. This is due to a lack of helm serve. Change-Id: I011e282616bf0b5a5c72c1db185c70d8c721695e	2021-10-06 01:02:28 +00:00
Thiago Brito	5a0ba49d50	Prepending library/ to docker official images This will ease mirroring capabilities for the docker official images. Signed-off-by: Thiago Brito <thiago.brito@windriver.com> Change-Id: I0f9177b0b83e4fad599ae0c3f3820202bf1d450d	2021-06-02 15:04:38 -03:00
Lo, Chi (cl566n)	3b030aa40d	Removed hard-coded value for backendPort This change will retrieve the backend port from values.yaml instead of a hard-coded value. Change-Id: I27630d3ead2c8a517f4fe8577e8396776010f9a8	2021-04-15 13:33:22 -07:00
DeJaeger, Darren (dd118r)	be2584fd7c	Adjust Prometheus http readiness probe path from /status to /-/ready Prometheus documentation shows that /-/ready can be used to check that it is ready to service traffic (i.e. respond to queries) [0]. I've witnessed cases where Prometheus's readiness probe is passing during initial deployment using /status, which in turn triggers its helm test to start. Said helm test then fails because /status is not a reliable indicator that Prometheus is actually ready to serve traffic and the helm test is performing actions that require it to be properly up and ready. [0]: https://prometheus.io/docs/prometheus/latest/management_api/ Change-Id: Iab22d0c986d680663fbe8e84d6c0d89b03dc6428	2021-04-13 13:17:49 -04:00
Lo, Chi (cl566n)	1892fca645	Enable TLS for Prometheus This patchset enabled TLS path for Prometheus when it acts as a server. Note that TLS is not directly terminated at Prometheus. TLS is terminated at apache proxy which in turn route request to Prometheus. Change-Id: I0db366b6237a34da2e9a31345d96ae8f63815fa2	2021-03-17 17:06:07 -07:00
Smith, David (ds3330)	96b751465a	Upgrade Prometheus to v2.25 change/Remove deprecated flags The flag storage.tsdb.retention is deprecated and generates warnings on startup storage.tsdb.retention.time is the new flag. storage.tsdb.wal-compression is now set as the default in v2.20 and above and is no longer needed Change-Id: I66f861a354a3cdde69a712ca5fd8a1d1a1eca60a	2021-03-16 18:19:49 +00:00
Smith, David (ds3330)	1934d32cdd	Fix spacing inconsistencies with flags Change-Id: I83676f62a4cfc7d8e20145a72f28eeab5ef4cc8d	2021-01-06 00:16:16 +00:00
Smith, David (ds3330)	9d9aaa8948	Fix spacing inconsistencies with flags Change-Id: Ia8f7437071a8865f1470412ad616b67a38142719	2020-10-21 13:44:07 +00:00
Steven Fitzpatrick	cdd0f33d0c	Revert "Prometheus: Render Rules as Templates" This reverts commit `fb7fc87d23`. I first submitted that as a way to add dynamic capability to the prometheus rules (they infamously don't support ENV variable substitution there). However, this can be done easily with another solution, and would clean up the prometheus chart values significantly. Change-Id: Ibec512d92490798ae5522468b915b49e7746806a	2020-10-06 15:21:18 +00:00
Steven Fitzpatrick	f4bdb713c1	Prometheus: Add configurable readiness/liveness Probes This change adds probes to the prometheus statefulset using the HTK probe generation functions Change-Id: I249d662dd0d23dd964f7118af94c733bbdc5db92	2020-10-05 19:28:00 +00:00
Andrii Ostapenko	1532958c80	Change helm-toolkit dependency version to ">= 0.1.0" Since we introduced chart version check in gates, requirements are not satisfied with strict check of 0.1.0 Change-Id: I15950b735b4f8566bc0018fe4f4ea9ba729235fc Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>	2020-09-24 12:19:28 -05:00
Mohammed Naser	c7a45f166f	Run chart-testing on all charts Added chart lint in zuul CI to enhance the stability for charts. Fixed some lint errors in the current charts. Change-Id: I9df4024c7ccf8b3510e665fc07ba0f38871fcbdb	2020-09-11 18:02:38 +03:00
Smith, David (ds3330)	9027d1337f	Allow the storage.tsdb.wal-compression flag to be available Change-Id: I609414330f0c8a65b6c0d3409bded09fcff0bbe0	2020-08-26 15:42:24 +00:00
Zuul	10fd77b6e4	Merge "Update alertmanager include snmp_notifier function"	2020-08-11 06:16:10 +00:00
Steven Fitzpatrick	fb7fc87d23	Prometheus: Render Rules as Templates This change allows us to substitute values into our rules files. Example: - alert: my_region_is_down expr: up{region="{{ $my_region }}"} == 0 To support this change, rule annotations that used the expansion {{ $labels.foo }} had to be surrounded with "{{` ... `}}" to render correctly. Change-Id: Ia7ac891de8261acca62105a3e2636bd747a5fbea	2020-08-10 18:16:35 +00:00
Xiaoguang(William) Zhang	7c94deae43	Update alertmanager include snmp_notifier function Change-Id: I5aedbdcdbba397a9fddde19a0898cb91de08553a	2020-08-07 12:25:33 -04:00
Phil Sphicas	5d8cf965c1	Prometheus: Allow input of TLS client creds in values.yaml Some scrape targets require the use of TLS client certificates, which are specified as filenames as part of the tls_config. This change allows these client certs and keys to be provided, stores them in a secret, and mounts them in the pod under /tls_configs. Example: tls_configs: kubernetes-etcd: ca.pem: | -----BEGIN CERTIFICATE----- -----END CERTIFICATE----- crt.pem: | -----BEGIN CERTIFICATE----- -----END CERTIFICATE----- key.pem: | -----BEGIN RSA PRIVATE KEY----- -----END RSA PRIVATE KEY----- conf: prometheus: scrape_configs: template: | scrape_configs: - job_name: kubernetes-etcd scheme: https tls_config: ca_file: /tls_configs/kubernetes-etcd.ca.pem cert_file: /tls_configs/kubernetes-etcd.cert.pem key_file: /tls_configs/kubernetes-etcd.key.pem Change-Id: I963c65dc39f1b5110b091296b93e2de9cdd980a4	2020-07-31 16:31:52 +00:00
willxz	c97c592216	Change for alertmanager v0.20 - Update alertmanger and prometheus discovery port from 6783 to 9094 - Update to support fqdn for discovery hostname - Add one test alert to Prometheus to test alert pipeline - update container name from alertmanger to prometheus-alertmanager Change-Id: Iec5e758e4b576dff01e84591a2440d030d5ff3c4	2020-07-22 17:39:09 -04:00
Zuul	b51c473175	Merge "Add user account to be used for federated metric collection."	2020-07-14 19:06:26 +00:00
Smith, David (ds3330)	a4fc3f7d78	Add user account to be used for federated metric collection. Add federated user account for with consolidated metrics Change-Id: I8a5e9aca0a0b29b672c8427b6491ff92797c5146	2020-07-14 13:33:35 +00:00
KHIYANI, RAHUL (rk0850)	b400a6c41d	Add missing security context to promethues and postgresql pods/containers This updates the chart to include the pod security context on the pod template. This also adds the container security context to set readOnlyRootFilesystem flag to true Change-Id: Icb7a9de4d98bac1f0bcf6181b6e88695f4b09709	2020-07-07 21:20:36 +00:00
Andrii Ostapenko	824f168efc	Undo octal-values restriction together with corresponding code Unrestrict octal values rule since benefits of file modes readability exceed possible issues with yaml 1.2 adoption in future k8s versions. These issues will be addressed when/if they occur. Also ensure osh-infra is a required project for lint job, that matters when running job against another project. Change-Id: Ic5e327cf40c4b09c90738baff56419a6cef132da Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>	2020-07-07 15:42:53 +00:00
Andrii Ostapenko	83e27e600c	Enable key-duplicates and octal-values yamllint checks With corresponding code changes. Change-Id: I11cde8971b3effbb6eb2b69a7d31ecf12140434e	2020-06-17 13:14:30 -05:00
Andrii Ostapenko	dfb32ccf60	Enable yamllint rules for templates - braces - brackets - colons - commas - comments - comments-indentation - document-start - hyphens - indentation With corresponding code changes. Also idempotency fix for lint script. Change-Id: Ibe5281cbb4ad7970e92f3d1f921abb1efc89dc3b	2020-06-17 13:13:53 -05:00
Andrii Ostapenko	8f24a74bc7	Introduces templates linting This commit rewrites lint job to make template linting available. Currently yamllint is run in warning mode against all templates rendered with default values. Duplicates detected and issues will be addressed in subsequent commits. Also all y*ml files are added for linting and corresponding code changes are made. For non-templates warning rules are disabled to improve readability. Chart and requirements yamls are also modified in the name of consistency. Change-Id: Ife6727c5721a00c65902340d95b7edb0a9c77365	2020-06-11 23:29:42 -05:00
Andrii Ostapenko	731a6b4cfa	Enable yamllint checks - document-end - document-start - empty-lines - hyphens - indentation - key-duplicates - new-line-at-end-of-file - new-lines - octal-values with corresponding code adjustment. Change-Id: I92d6aa20df82aa0fe198f8ccd535cfcaf613f43a	2020-05-29 19:49:05 +00:00
Andrii Ostapenko	67d1409a74	Enable yamllint checks - brackets - braces - colon - commas with corresponding code adjustment. Change-Id: I8d294cfa8f358431bee6ecb97396dae66f955b86	2020-05-21 14:04:23 +00:00
diwakar thyagaraj	163c5aa780	Enable Apparmor to all osh-infra test pods Also Changed container names to static. Change-Id: I51f53b480d18aaa38a9707429f01052ee122e7e9 Signed-off-by: diwakar thyagaraj <diwakar.chitoor.thyagaraj@att.com>	2020-05-19 15:36:07 +00:00
Zuul	e53d28718d	Merge "Remove OSH Authors copyright"	2020-05-12 20:00:38 +00:00
diwakar thyagaraj	64ac469eb6	Enable Apparmor to Prometheus-init-containers Change-Id: Ibea27338437c9c039b10bff02a28d60d3f5cf4b1 Signed-off-by: diwakar thyagaraj <diwakar.chitoor.thyagaraj@att.com>	2020-05-08 17:24:54 +00:00
Gage Hugo	d14d826b26	Remove OSH Authors copyright The current copyright refers to a non-existent group "openstack helm authors" with often out-of-date references that are confusing when adding a new file to the repo. This change removes all references to this copyright by the non-existent group and any blank lines underneath. Change-Id: I1882738cf9757c5350a8533876fd37b5920b5235	2020-05-07 02:11:15 +00:00
Zuul	01aa16620b	Merge "Prometheus: Status Alerts Scalar/Vector Conversion"	2020-02-18 17:35:43 +00:00
Zuul	57ad8ad603	Merge "Prometheus: Ceph Alerts Scalar/Vector Conversion"	2020-02-18 17:35:42 +00:00
Zuul	3c7a9de243	Merge "Prometheus: Node Alerts Scalar/Vector Conversion"	2020-02-18 17:29:48 +00:00
dt241s@att.com	8bd4a2624a	[FIX] Add apparmor to prometheus. This also fixes Elasticsearch apparmor Jobs. Change-Id: I8f2a9aa12beffe3ca394a2e9dd00aba7e5292f29	2020-02-14 23:13:38 +00:00
Steven Fitzpatrick	a41262e459	Prometheus: Node Alerts Scalar/Vector Conversion This change converts alert expressions which relied on instant vectors to use range aggregate functions instead - For just the 'basic_linux' rules. Change-Id: I30d6ab71d747b297f522bbeb12b8f4dbfce1eefe Co-Authored-By: Meghan Heisler <mkheisler93@gmail.com>	2020-02-11 15:14:40 +00:00
Steven Fitzpatrick	f37865d6a0	Prometheus: Ceph Alerts Scalar/Vector Conversion This change updates the prometheus alerting rules to use ranged vectors in their expressions, to avoid situations where missed scrapes would cause scalar metrics to "go stale" - resetting the alert timer. Only the ceph alerts are affected by this change. Change-Id: Ib47866d12616aaa808e6a09c58aa4352e338a152 Co-Authored-By: Meghan Heisler <mkheisler93@gmail.com>	2020-02-11 15:14:35 +00:00
Steven Fitzpatrick	d408bed90d	Prometheus: Status Alerts Scalar/Vector Conversion This change converts alert expressions which relied on instant vectors to use range aggregate functions instead. Change-Id: I4df757f961524bed23b6a6ad361779c1749ca2c5 Co-Authored-By: Meghan Heisler <mkheisler93@gmail.com>	2020-02-11 15:14:27 +00:00
Zuul	cc399a08ed	Merge "Fix incorrect prometheus alert names in nagios"	2020-01-15 23:43:05 +00:00
Zuul	c2ece6a45a	Merge "Support for local storage"	2020-01-09 23:18:16 +00:00
Smruti Soumitra Khuntia	2ac08b59b4	Support for local storage This change adds a means of introducing new storage classes and local persistent volumes. Change-Id: I340c75f3d0a1678f3149f3cf62e4ab104823cc49 Co-Authored-By: Steven Fitzpatrick <steven.fitzpatrick@att.com>	2020-01-09 10:24:31 -06:00
Tin Lam	c199addf3c	Update apiVersion This patch set updates and tests the apiVersion for rbac.authorization.k8s.io from v1beta1 to v1 in preparation for its removal in k8s 1.20. Change-Id: I4e68db1f75ff72eee55ecec93bd59c68c179c627 Signed-off-by: Tin Lam <tin@irrational.io>	2020-01-09 08:59:48 +00:00
Steve Wilkerson	ddd5a74319	Prometheus: Add feature-gate support in deployment scripts This updates the deployment scripts for Prometheus to leverage the feature gate functionality rather than bash generation of the list of override files to use for alerting rules Change-Id: Ie497ae930f7cc4db690a4ddc812a92e4491cde93 Signed-off-by: Steve Wilkerson <sw5822@att.com>	2020-01-07 22:06:19 +00:00
Steven Fitzpatrick	4fdcff593c	Fix incorrect prometheus alert names in nagios I noticed some nagios service checks were checking prometheus alerts which did not exist in our default prometheus configuration. In one case a prometheus alert did not match the naming convention of similar alerts. One nagios service check, ceph_monitor_clock_skew_high, does not have a corresponding alert at all, so I've changed it to check the node_ntmp_clock_skew_high alert, where a node has the label ceph-mon="enabled". Change-Id: I2ebf9a4954190b8e2caefc8a61270e28bf24d9fa	2020-01-03 10:30:08 -06:00
Steve Wilkerson	fbd34421f2	Prometheus: Update chart to support federation This updates the Prometheus chart to support federation. This moves to defining the Prometheus configuration file via a template in the values.yaml file instead of through raw yaml. This allows for overriding the chart's default configuration wholesale, as this would be required for a hierarchical federated setup. This also strips out all of the default rules defined in the chart for the same reason. There are example rules defined for the various aspects of OSH's infrastructure in the prometheus/values_overrides directory that are executed as part of the normal CI jobs. This also adds a nonvoting federated-monitoring job that vets out the ability to federate prometheus in a hierarchical fashion with extremely basic overrides Change-Id: I0f121ad5e4f80be4c790dc869955c6b299ca9f26 Signed-off-by: Steve Wilkerson <sw5822@att.com>	2019-11-21 12:39:56 +00:00
Steve Wilkerson	c1555920e5	Update podManagementPolicy for Prometheus and Alertmanager This updates the podManagementPolicy to 'Parallel' for Prometheus and Alertmanager, as there's no need to handle deploying these two services in a sequential manner Change-Id: I2f33b9651bed20c4cb2e0c477ae2227cbf9310cf Signed-off-by: Steve Wilkerson <sw5822@att.com>	2019-11-20 21:37:55 +00:00

166 Commits