Commit Graph

86 Commits

Author SHA1 Message Date
Sergiy Markin 13c1d8cd38 [backups] Add throttling of remote backups
This PS adds the ability to limit (throttle) the number of
simultaneously uploaded backups while keeping the logic on the client
side, using flag files on the remote side. The main idea is to cap
the number of simultaneous remote backup upload sessions.
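
A minimal sketch of flag-file throttling, assuming a local directory stands in for the remote side. The names `try_acquire_slot`, `release_slot`, and the `.flag` suffix are hypothetical and are not taken from the chart's scripts.

```python
from pathlib import Path


def try_acquire_slot(flag_dir: Path, owner: str, max_sessions: int) -> bool:
    """Create a flag file for `owner` if fewer than `max_sessions` flags exist.

    Note: the count-then-create sequence is not atomic, so two clients
    could both pass the check at the same moment. This sketch ignores that.
    """
    flag_dir.mkdir(parents=True, exist_ok=True)
    active = [p for p in flag_dir.iterdir() if p.suffix == ".flag"]
    if len(active) >= max_sessions:
        return False
    (flag_dir / f"{owner}.flag").touch()
    return True


def release_slot(flag_dir: Path, owner: str) -> None:
    """Remove the flag file once the upload has finished (or failed)."""
    (flag_dir / f"{owner}.flag").unlink(missing_ok=True)
```

A client would call `try_acquire_slot` before uploading, wait and retry while it returns `False`, and call `release_slot` afterwards.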

Change-Id: I5464004d4febfbe20df9cd41ca62ceb9fd6f0c0d
2023-12-18 20:39:45 +00:00
Sergiy Markin 4a95f75b6b [backups] Added staggered backups
This PS adds staggered backups by adding anti-affinity rules to the
backup cronjobs. The rules can be followed across several namespaces to
decrease the load on the remote backup destination server, making sure
that only one backup upload is in progress at any moment.

Change-Id: If49791f866a73a08fb98fa0e0b4854042d079c66
2023-12-05 04:10:22 +00:00
Markin, Sergiy (sm515x) ed7e58f4b1 [postgres] Update postgres to 14.5
Updated postgres binary version to 14.5.

Also replaced deprecated config item wal_keep_segments with wal_keep_size.

Change-Id: Ie86850f8ebb8bfaae4ba5457409d3920b230ce9c
2022-09-19 19:20:23 -05:00
Markin, Sergiy (sm515x) 5c4056ad34 [DATABASE] Add verify databases backup
HTK - added verify_databases_backup_in_directory function that is
going to be defined inside mariadb/postgresql/etcd charts.

Mariadb chart - added verify_databases_backup_archives function
implementation.

Added mariadb-verify container to mariadb-backup cronjob to run
verification process.

Added remote backup verification process - comparison of local and remote file md5 hashes.
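
A minimal sketch of comparing a local archive against a remote md5 hash, assuming the remote hash is already available as a hex string. The function names are hypothetical, and the chart itself may perform this check differently.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex md5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archives_match(local_path, remote_md5):
    """True when the local file's md5 equals the hash reported for the remote copy."""
    return md5_of_file(local_path) == remote_md5.strip().lower()
```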

PostgreSQL chart - added empty implementation of verify_databases_backup_archives() function. The actual implementation is left for future work.

Change-Id: I361cdb92c66b0b27539997d697adfd1e93c9a29d
2022-09-09 01:41:00 +00:00
Brian Haley f31cfb2ef9 support image registries with authentication
Based on spec in openstack-helm repo,
support-OCI-image-registry-with-authentication-turned-on.rst

Each Helm chart can configure an OCI image registry and
credentials to use. A Kubernetes secret is then created with this
information. Service Accounts then specify an imagePullSecret referencing
the Secret with creds for the registry. Then any pod using one
of these ServiceAccounts may pull images from an authenticated
container registry.

Change-Id: Iebda4c7a861aa13db921328776b20c14ba346269
2022-07-20 14:28:47 -05:00
Schubert Anselme 753a32c33d Migrate CronJob resources to batch/v1 and PodDisruptionBudget resources to policy/v1
This change updates the following charts to migrate CronJob resources to the batch/v1 API version, available since v1.21 [0],
and to migrate PodDisruptionBudget resources to the policy/v1 API version, also available since v1.21 [1].

This also uplifts the ingress controller to 1.1.3.

- ceph-client (CronJob)
- cert-rotation (CronJob)
- elasticsearch (CronJob)
- mariadb (CronJob & PodDisruptionBudget)
- postgresql (CronJob)

0: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#cronjob-v125
1: https://kubernetes.io/docs/reference/using-api/deprecation-guide/#poddisruptionbudget-v125

Change-Id: Ia6189b98a86b3f7575dc4678bb3a0cce69562c93
2022-05-10 15:12:53 -04:00
Gage Hugo bc5bad42b4 Fix invalid fields in values for postgresql
The postgresql chart currently fails to run when deployed with
helm v3 due to invalid fields defined in values.yaml that are
more strictly enforced. This change removes these invalid values
to allow deploying the postgresql chart with helm v3.

Change-Id: Iabd3cfa77da618026ceb2dfdffd5d2a0b1519d93
2022-03-22 17:00:53 -05:00
Lo, Chi (cl566n) 2fc1ce4a14 Removing -x from database backup script
The set -x produced 6 identical log strings every time the
log_backup_error_exit function was called. Prometheus uses
the occurrence and number of certain logs over a period of time to
evaluate whether a database backup has failed. Only one log should be
generated when a particular database backup scenario fails.

Upon discussion with database backup and restore SME, it is
recommended to remove the set -x once and for all.

Change-Id: I846b5c16908f04ac40ee8f4d87d3b7df86036512
2022-02-23 16:42:29 -08:00
Sophie Huang 25d1eedc59 Postgresql: Enhance postgresql backup
Pick up the helm-toolkit DB backup enhancement in postgresql
to add capability to retry uploading backup to remote server.
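
A minimal sketch of retrying a failed upload, assuming the upload is a callable that returns `True` on success. `upload_with_retries` and its parameters are hypothetical and do not mirror the helm-toolkit script.

```python
import time
from typing import Callable


def upload_with_retries(
    upload: Callable[[], bool],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `upload` until it succeeds or `max_attempts` is exhausted."""
    for attempt in range(1, max_attempts + 1):
        if upload():
            return True
        # Wait between attempts, but not after the final failure.
        if attempt < max_attempts:
            sleep(delay_seconds)
    return False
```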

Change-Id: I041d83211f08a8d0c9c22a66e16e6b7652bfc7d9
2022-01-25 20:58:27 +00:00
Gage Hugo 22e50a5569 Update htk requirements
This change updates the helm-toolkit path in each chart as part
of the move to helm v3. This is due to a lack of helm serve.

Change-Id: I011e282616bf0b5a5c72c1db185c70d8c721695e
2021-10-06 01:02:28 +00:00
Sean Eagan b1a247e7f5 Helm 3 - Fix Job labels
If labels are not specified on a Job, kubernetes defaults them
to include the labels of their underlying Pod template. Helm 3
injects metadata into all resources [0] including a
`app.kubernetes.io/managed-by: Helm` label. Thus when kubernetes
sees a Job's labels they are no longer empty and thus do not get
defaulted to the underlying Pod template's labels. This is a
problem since Job labels are depended on by
- Armada pre-upgrade delete hooks
- Armada wait logic configurations
- kubernetes-entrypoint dependencies

Thus for each Job template this adds labels matching the
underlying Pod template to retain the same labels that were
present with Helm 2.

[0]: https://github.com/helm/helm/pull/7649

Change-Id: I3b6b25fcc6a1af4d56f3e2b335615074e2f04b6d
2021-09-30 16:01:31 -05:00
Thiago Brito 5a0ba49d50 Prepending library/ to docker official images
This will ease mirroring capabilities for the docker official images.

Signed-off-by: Thiago Brito <thiago.brito@windriver.com>
Change-Id: I0f9177b0b83e4fad599ae0c3f3820202bf1d450d
2021-06-02 15:04:38 -03:00
anthony.bellino ce9d420ee5 Add tls to Postgresql
This PS provides the capability to enable tls for the
Postgresql chart.

Change-Id: Ie1ebd693dbf23f98bef832e3c57defe3a4e026bd
2021-02-08 16:52:01 +00:00
Apurva Gokani 25aa369025 postgres archive cleanup script
This change adds a cleanup mechanism for the archive through the following steps:
1) add archive_cleanup.sh under /tmp directory
2) through the start.sh this script will be triggered
3) It runs every hour, checking utilization of archive dir
4) If it is above threshold it deletes half of old files
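
Steps 3 and 4 above can be sketched as follows. This is an illustration only: the real archive_cleanup.sh is a shell script, the hourly loop is omitted, and the function names and the use of modification time to decide which files are "old" are assumptions.

```python
import shutil
from pathlib import Path


def usage_percent(path) -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def delete_oldest_half(archive_dir: Path) -> list:
    """Delete the older half of the files (by modification time) and return them."""
    files = sorted(
        (p for p in archive_dir.iterdir() if p.is_file()),
        key=lambda p: p.stat().st_mtime,
    )
    doomed = files[: len(files) // 2]
    for p in doomed:
        p.unlink()
    return doomed


def cleanup_if_needed(archive_dir: Path, threshold_percent: float, usage=usage_percent) -> list:
    """Run one check: delete the oldest half only when usage exceeds the threshold."""
    if usage(archive_dir) <= threshold_percent:
        return []
    return delete_oldest_half(archive_dir)
```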

Change-Id: I918284b0aa5a698a6028b9807fcbf6559ef0ff45
2021-01-14 16:21:14 +00:00
Phil Sphicas 20288319af postgresql: Revert "Add default reject rule ..."
This reverts commit 982e3754a5.
"Add default reject rule end in Postgres pg_hba.conf to ensure all
connections must be explicitly allowed."

The original commit introduced a breaking change when installing with
the chart defaults - before, all remote connections with md5 auth were
allowed, and after the change, only explicit users are allowed.

This is fully overridable, but the original defaults are more
conservative.

Change-Id: Ib297e480bccd3ac7c0cf15985b3def2c8b3e889e
2020-10-23 17:50:50 +00:00
Phil Sphicas c43331d67a postgresql: Optimize restart behavior
* add preStop hook to trigger Fast Shutdown
* disable readiness probe by default

When Kubernetes terminates a pod, the container runtime typically sends
a SIGTERM signal to pid 1 in each container [0]. PostgreSQL interprets
SIGTERM as a request to do a "Smart Shutdown" [1]. This can take minutes
(often exhausting the termination grace period), and during this time,
new connections are not being serviced.

Now that postgresql has a single replica, this behavior is undesirable.
If we kill the pod (e.g. in an upgrade), we probably want it to come
back as soon as possible.

This change adds a preStop hook that sends a SIGINT to postgresql in
order to trigger a "Fast Shutdown". In addition, the readiness probe is
disabled by default, since it adds no value in a single-replica
scenario.
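
A minimal sketch of triggering a "Fast Shutdown" by signal, assuming the postmaster PID is read from the first line of `postmaster.pid` in the data directory. The function names are hypothetical, and the chart's preStop hook may be implemented differently.

```python
import os
import signal
from pathlib import Path

# PostgreSQL shutdown modes and the signals that request them.
SHUTDOWN_SIGNALS = {
    "smart": signal.SIGTERM,
    "fast": signal.SIGINT,
    "immediate": signal.SIGQUIT,
}


def read_postmaster_pid(data_dir: Path) -> int:
    """Return the PID stored on the first line of postmaster.pid."""
    first_line = (data_dir / "postmaster.pid").read_text().splitlines()[0]
    return int(first_line.strip())


def request_shutdown(data_dir: Path, mode: str = "fast", kill=os.kill) -> int:
    """Send the signal for `mode` to the postmaster and return its PID."""
    pid = read_postmaster_pid(data_dir)
    kill(pid, SHUTDOWN_SIGNALS[mode])
    return pid
```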

0: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
1: https://www.postgresql.org/docs/9.6/server-shutdown.html

Change-Id: Ib5f3d2a49e55332604c91f9a011e87d78947dbef
2020-10-23 07:41:57 +00:00
Phil Sphicas a10699c4e0 postgresql: Allow probe tweaking
Uses the standard helm-toolkit macros for liveness and readiness probes,
allowing them to be enabled or disabled, and params to be overridden.

The existing hard-coded settings are preserved as the chart defaults.

Change-Id: Idd063e6b8721126c88fa22c459f93812151d7b64
2020-10-23 06:52:45 +00:00
Chris Wedgwood da1117e257 [PostgreSQL] Use explicit entrypoint for prometheus exporter
It appears having `args:` without `command:` causes some combinations
of kubernetes & container runtimes to not work as expected.

Change-Id: Id9d692632066de410ca5f13bbfe13d1899b93819
2020-10-11 13:53:34 +00:00
Apurva Gokani 85cbd6f04b adding archiving to postgres
To safeguard postgres from WAL files clogging up the
pg_xlog directory, this change does the following:
1) Adds postgres archiving to move the WAL files to a different directory.
2) Makes sure that the archive is on a different Persistent Volume.

Change-Id: I59bc76f27384d4f3836ef609855afcc33a7b99d0
2020-10-08 13:14:03 -05:00
Andrii Ostapenko 1532958c80 Change helm-toolkit dependency version to ">= 0.1.0"
Since we introduced chart version check in gates, requirements are not
satisfied with strict check of 0.1.0

Change-Id: I15950b735b4f8566bc0018fe4f4ea9ba729235fc
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-09-24 12:19:28 -05:00
Mohammed Naser c7a45f166f Run chart-testing on all charts
Added chart lint in zuul CI to enhance the stability for charts.
Fixed some lint errors in the current charts.

Change-Id: I9df4024c7ccf8b3510e665fc07ba0f38871fcbdb
2020-09-11 18:02:38 +03:00
Gnana Lakshmi Kilambhi (gk118g) 982e3754a5 Add default reject rule at the end in Postgres pg_hba.conf to ensure all connections must be explicitly allowed.
A default reject rule is added at the end of pg_hba.conf to ensure all connections must be explicitly allowed.
The following dependent users are added to the list of allowed connections:
1. postgresql-admin
2. postgres
3. psql_exporter

Change-Id: Ic7bd19e5eb4745b91d94d5a88851280054459547
2020-09-03 12:53:17 +00:00
anthony.bellino 96369491cb Patroni exclusion for Postgres
This PS removes the HA clustering support previously provided
by Patroni.

Change-Id: I03ed11282413a454062ab34b8594ba60ac2175aa
2020-08-31 18:02:37 +00:00
Parsons, Cliff (cp769u) 233197fc0b Add capability to backup only a single database
This PS adds the capability to Mariadb and Postgresql to back up a
single database (as an optional parameter to the backup script).

Change-Id: I9bc1eb0173063638b2cf58465c063f602ed20bc1
2020-08-18 18:30:31 +00:00
diwakar thyagaraj acf6276f49 Add Application armor to Postgresql-backup pods
Change-Id: Idb4d214803bb98f1846154bb27d571f44ca74dba
Signed-off-by: diwakar thyagaraj <diwakar.chitoor.thyagaraj@att.com>
2020-08-14 18:23:02 +00:00
Parsons, Cliff (cp769u) c10de970c3 Fix postgresql backup cronjob deployment issues
There are a couple of issues that need fixing:
1) "backoffLimit" and "activeDeadlineSeconds" attributes are placed in
the CronJob part of the cron-job-backup-postgres.yaml, but should be
placed in the Job template part.
2) The backup cronjob had two names in the values.yaml
"backup_postgresql" and "postgresql_backup" in various places. It should
be "postgresql_backup" in all of those places so that the CronJob can be
deployed correctly.

Change-Id: Ifd1c7c03ee947763ac073e55c6d74c211615c343
2020-07-29 22:39:59 +00:00
KHIYANI, RAHUL (rk0850) b400a6c41d Add missing security context to prometheus and postgresql pods/containers
This updates the chart to include the pod security context
on the pod template.

This also adds the container security context to set
readOnlyRootFilesystem flag to true

Change-Id: Icb7a9de4d98bac1f0bcf6181b6e88695f4b09709
2020-07-07 21:20:36 +00:00
Andrii Ostapenko 824f168efc Undo octal-values restriction together with corresponding code
Unrestrict octal values rule since benefits of file modes readability
exceed possible issues with yaml 1.2 adoption in future k8s versions.
These issues will be addressed when/if they occur.

Also ensure osh-infra is a required project for lint job, that matters
when running job against another project.

Change-Id: Ic5e327cf40c4b09c90738baff56419a6cef132da
Signed-off-by: Andrii Ostapenko <andrii.ostapenko@att.com>
2020-07-07 15:42:53 +00:00
Cliff Parsons 4964ea2a76 Fix drop databases issue in Postgresql restore
Recently, the Postgresql backups were modified to generate drop database
commands (--clean pg_dumpall option). Also for single database restore,
a DROP DATABASE command was added before the restore so that the
database could be restored without duplicate rows. However, if there are
existing database connections (by the applications or other users), then
the drop database commands will fail. So for the duration of the restore
database operation, the databases being restored need to have their
existing connections dropped and new connections prevented until the
database(s) are restored; then connections should be re-allowed.

Also found a problem with psql returning 0 (success code) even though
there were errors during its execution. The solution is to check the
output for errors and if there are any, dump out the log file for the
user to see and let the user know there are errors.
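
A minimal sketch of checking psql output for errors instead of relying on the exit code, assuming error lines contain `ERROR:` or `FATAL:`. The helper name is hypothetical, and the restore script's actual check may differ.

```python
from typing import List


def find_errors(log_text: str) -> List[str]:
    """Return the log lines that report an error, in their original order."""
    return [
        line
        for line in log_text.splitlines()
        if "ERROR:" in line or "FATAL:" in line
    ]
```

If the returned list is non-empty, the caller would print the log file and tell the user that the restore hit errors.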

Lastly, a problem was found with the single database restoration, where
the database dump for a single database was being incorrectly extracted
from the psql dump file, resulting in the database not being restored
correctly (most of the db being wiped out). This patchset fixes that
issue as well.

Change-Id: I4db3f6ac7e9fe7cce6a432dfba056e17ad1e3f06
2020-06-30 19:39:00 +00:00
Cliff Parsons 1da7a5b0f8 Fix problems with DB utilities in HTK and Postgresql
This PS fixes:
1) Removes printing of the word "Done" after the restore/list command
   executes, which is not needed and clutters the output.
2) Fixes problem with list_tables related to command output.
3) Fixes parameter ordering problem with list_rows and list_schema
4) Adds the missing menu/parameter parsing code for list_schema
5) Fixes backup-restore secret and handling of PD_DUMPALL_OPTIONS.
6) Fixes single db restore, which wasn't dropping the database, and
   ended up adding duplicate rows.
7) Fixes cronjob deficiencies - added security context and init containers,
   fixed backup related service account related typos.
8) Fixes get_schema so that it only finds the table requested, rather
   than other tables that also start with the same substring.
9) Fixes swift endpoint issue where it sometimes returns the wrong
   endpoint, due to bad grep command.
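
Item 8 above concerns matching a table name exactly rather than by prefix. A minimal sketch of that idea, assuming the dump contains `CREATE TABLE` statements with an optional schema prefix; `create_table_pattern` is a hypothetical helper, not the script's actual code.

```python
import re


def create_table_pattern(table: str) -> "re.Pattern":
    """Build a pattern matching CREATE TABLE for exactly `table`.

    The name must be followed by whitespace and "(", so a table such as
    "users_audit" does not match when "users" is requested.
    """
    return re.compile(
        r"^CREATE TABLE (?:\w+\.)?" + re.escape(table) + r"\s*\(",
        re.MULTILINE,
    )
```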

Change-Id: I0e3ab81732db031cb6e162b622efaf77bbc7ec25
2020-06-24 19:16:04 +00:00
Andrii Ostapenko 83e27e600c Enable key-duplicates and octal-values yamllint checks
With corresponding code changes.

Change-Id: I11cde8971b3effbb6eb2b69a7d31ecf12140434e
2020-06-17 13:14:30 -05:00
Andrii Ostapenko dfb32ccf60 Enable yamllint rules for templates
- braces
- brackets
- colons
- commas
- comments
- comments-indentation
- document-start
- hyphens
- indentation

With corresponding code changes.

Also idempotency fix for lint script.

Change-Id: Ibe5281cbb4ad7970e92f3d1f921abb1efc89dc3b
2020-06-17 13:13:53 -05:00
Andrii Ostapenko 8f24a74bc7 Introduces templates linting
This commit rewrites lint job to make template linting available.
Currently yamllint is run in warning mode against all templates
rendered with default values. Duplicates detected and issues will be
addressed in subsequent commits.

Also all y*ml files are added for linting and corresponding code changes
are made. For non-templates warning rules are disabled to improve
readability. Chart and requirements yamls are also modified in the name
of consistency.

Change-Id: Ife6727c5721a00c65902340d95b7edb0a9c77365
2020-06-11 23:29:42 -05:00
Parsons, Cliff (cp769u) 9b6f5b267f Add backup/restore configuration secret
This patchset adds a secret containing the backup/restore configuration
for Postgresql, in case it is needed for invoking a backup/restore
operation from a different application or from a different namespace
(like from a utility container). Default is to not produce the secret.

Change-Id: I273fe169e7ee533c3fe04ad33c97af64b29bc16f
2020-06-04 20:06:37 +00:00
Cliff Parsons a9ddbd9e46 Add capability to retrieve rows from databases
Adding the capability to retrieve a list of tables, list of rows,
and the table schema information from a given database backup
archive file, for the purpose of manual database table/row
restoration and also for just viewing.

This is added to the HTK _restore_main.sh.tpl and is integrated
into the Postgresql restore script (Mariadb will be done later).

Change-Id: I729ecf7a720f1847a431de7e149cec6841ec67b8
2020-06-02 19:02:37 +00:00
Andrii Ostapenko 731a6b4cfa Enable yamllint checks
- document-end
- document-start
- empty-lines
- hyphens
- indentation
- key-duplicates
- new-line-at-end-of-file
- new-lines
- octal-values

with corresponding code adjustment.

Change-Id: I92d6aa20df82aa0fe198f8ccd535cfcaf613f43a
2020-05-29 19:49:05 +00:00
Parsons, Cliff (cp769u) 5a2babd514 Backup/restore enhancements
This patchset introduces a framework that all OSH-based database
systems can use to back up and restore their databases. The framework
is refactored from the Postgresql backup and restore logic. This will
prevent a lot of code duplication in the backup and restore scripts
across each cluster.

In the process, some improvements needed to be made:
1) Removing the need for 2 separate containers to do the backup
   and restore work to a remote gateway. This simplifies the design
   and enables a higher level of robustness.
2) Adding separate "days to keep" config value for remote backup files,
   as there may be different requirements for the remote files than the
   local backup files.
3) Adding capability to send Storage_Policy when creating the remote
   RGW swift container.
4) Making coding style improvement for readability and maintainability.
5) Fixing a deployment bug that occurs when remote backup is disabled.

Change-Id: I3a3482ad67320e89f04305b17da79abf7ad6eb45
2020-05-13 16:34:21 +00:00
Gage Hugo d14d826b26 Remove OSH Authors copyright
The current copyright refers to a non-existent group
"openstack helm authors" with often out-of-date references that
are confusing when adding a new file to the repo.

This change removes all references to this copyright by the
non-existent group and any blank lines underneath.

Change-Id: I1882738cf9757c5350a8533876fd37b5920b5235
2020-05-07 02:11:15 +00:00
diwakar thyagaraj 7c5479fb83 Enable Apparmor to postgresql init containers
Change-Id: If679428710dbb8c9c8a5da4248c48e05a2fb0844
Signed-off-by: diwakar thyagaraj <diwakar.chitoor.thyagaraj@att.com>
2020-05-06 01:55:12 +00:00
Cliff Parsons 382d113a87 Postgresql backup/restore enhancements
1) Added a new backup container for accessing RGW via Openstack Swift API.
2) Modified the backup script so that tarballed databases can be sent to the RGW.
3) Added new script to send the database backup to the RGW.
4) Modified the restore script so that databases can be retrieved from the RGW.
5) Added new script to retrieve the database backups from the RGW.

Change-Id: Id17a8fcb63f5614ea038c58acdc256fb4e05f434
2020-04-22 22:31:48 +00:00
Phil Sphicas 3860dedef3 postgresql: Add metadata labels to CronJob
This change adds the same helm-toolkit-generated metadata labels to
the CronJob itself that are applied to the Jobs it creates.

Change-Id: I888ca6f25c97e3deb6710e2e6be5a87a6133604b
2020-03-16 18:22:14 -07:00
Andrii Ostapenko e0e9e623a3 Remove extra securityContext in postgresql backup cron job
Change-Id: I0a55f06fe93f7ab0852621fd9927542d87d1be7e
2020-03-13 04:24:46 +00:00
dt241s@att.com 8caf00b12c [FIX] Fix Apparmor for postgresql-openstack-create-user
Change-Id: I4ae738636bff152a57bf292786d3855384e3529b
2020-02-25 18:58:16 +00:00
Zuul d8c937f608 Merge "Enable Docker default Apparmor for Postgresql and prometheus-postgresql."
2020-02-18 20:58:17 +00:00
Radhika Pai c884ec439b Postgresql_exporter: Adding queries.yaml file
This change enables postgresql-exporter to push additional metrics
(like replication_lag) which are derived using a SQL query against Postgres DB.

(Co-Author: Steven Fitzpatrick)

Change-Id: I78dc433a3782b48155ab293cb5afe90b3bc0ef1f
2020-02-17 19:26:29 -06:00
dt241s@att.com f633555f16 Enable Docker default Apparmor for Postgresql and prometheus-postgresql.
Change-Id: I013ca5f99e5032c44f0d679e467da9e928c02a6b
2020-02-17 23:01:06 +00:00
Cliff Parsons c18ee59aff Fix postgresql database backup issue
Currently postgresql database backup job will fail due to not having
correct permissions on the mounted PVC. This patchset corrects the
permissions on the PVC mount so that the backup pods can write to the
/var/backup directory structure.

Another problem was that pg_dumpall was not able to get the correct
password from the admin_user.conf. This may be due to the extra lines
in the file, so this patchset reads it differently in order to find
the password. This was a change to the backup and restore scripts.

Also there are a number of small corrections made to the error handling
for both backup and restore scripts, to be consistent with the MariaDB
backup/restore scripts.

Change-Id: Ica361764c591099e16d03a0988f73c6976583ceb
2020-02-10 17:38:10 +00:00
Koffi Nogbe 914ea2bd60 Add audit database user for audit purposes
This commit adds an audit user to the postgresql database which
will have only SELECT privileges on the postgresql database tables.
This is accomplished by setting up audit user creation parameters
in the Patroni bootstrap environment settings, according to (1).

(1) https://patroni.readthedocs.io/en/latest/ENVIRONMENT.html

Change-Id: Idf1cd90b5d093f12fa4a3c5c794d4b5bbc6c8831
2020-01-28 16:48:29 +00:00
Doug Aaser cf7b8dbb3d Add explicit admin user to Patroni
In this PS we explicitly define the admin user rather than letting
patroni use the default username and password.

Change-Id: I9885314902c3a60e709f96e2850a719ff9586b3d
2020-01-24 21:14:32 +00:00
Tin Lam a43ae25226 Postgresql egress netpol
This patch set puts in place a default Kubernetes egress network
policy for the postgresql database chart.

Change-Id: I6caa917faf23becc3a1c09b47f457b8b2db996e4
Signed-off-by: Tin Lam <tin@irrational.io>
2020-01-09 18:50:36 +00:00