32 Commits

Author SHA1 Message Date
Ghanshyam Mann e06f50cb06 Retire Tripleo: remove repo content
TripleO project is retiring
- https://review.opendev.org/c/openstack/governance/+/905145

This commit removes the content of this project repo.

Change-Id: I73df79a8698625815ea4e3099904da448a49887e
2024-02-24 11:42:30 -08:00
Takashi Kajinami 8427725125 Pacemaker: Replace hiera by lookup (2)
The hiera function is deprecated and does not work with the latest
hieradata version 5. It should be replaced by the new lookup
function[1].

[1] https://puppet.com/docs/puppet/7/hiera_automatic.html

With the lookup function, we can define value type and merge behavior,
but these are kept default at this moment to limit scope of this change
to just simple replacement. Adding value type might be useful to make
sure the value is in expected type (especially when a boolean value is
expected), but we will revisit that later.

example:
lookup(<NAME>, [<VALUE TYPE>], [<MERGE BEHAVIOR>], [<DEFAULT VALUE>])

This covers the remaining manifests to set up pacemaker resource.

Change-Id: I749b979a7333f68a646f36afa912603b1af0a943
2022-09-08 02:29:49 +09:00
Zuul 13fe297288 Merge "Redis: Share the same base class" 2022-08-03 23:49:36 +00:00
Takashi Kajinami b1097f1907 Redis: Share the same base class
... instead of maintaining the complete independent implementations for
the same service.

Change-Id: I7213b66bd8a4feecef15e760771fff6cf960826e
2022-07-02 12:19:32 +09:00
Takashi Kajinami ef041632ea Remove implementations for Docker support
... because Docker support has been removed from tht and these are no
longer used.

Depends-on: https://review.opendev.org/843755
Change-Id: I5719d06464ba2c1d37898b44f70ac5521ceaaf7e
2022-06-20 17:29:07 +09:00
Cédric Jeanneret e91aac2822 Add missing "z" flag for specific mounts
Depending on the host history, some directory content may not have
the correct SELinux type. This has been seen with the OVN
service, during a Queens -> Train FFU:

while the /var/lib/openvswitch/ovn directory had the correct
container_file_t type, some files in this location were typed with
openvswitch_var_lib_t, leading to errors during the deploy part of the
upgrade (after the OS upgrade, when the deploy is running on the cleaned
host).
The specific issue depends on the actual files with the wrong label, but
usually it involves a container crash/error, leading to a deploy error,
and a manual intervention in order to correct the SELinux type in the
location.

This situation may arise when the host was first deployed on Queens,
which used Docker. For the record, the Docker daemon was then configured
with SELinux support disabled, so it did not care about labels. The
situation is different with Podman: SELinux is fully supported at all
levels of the OS, which leads to the issue.

For the record, tripleo-heat-templates as well as tripleo-ansible are
setting the "setype: container_file_t" on the directories, but we don't
use the "recurse: true" in order to avoid performance issues - some
locations might be huge, and it would take too much time to relabel
everything via ansible.

This patch aims to converge all the mounts to the same options, and
ensure no SELinux denial can prevent the actual container startup and
function.
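
As an illustration only (this is not code from the patch), the convergence
could be sketched in Python as a helper that appends the shared "z" option
to a bind-mount string when it is missing. The helper name with_z_flag and
the container-side path used in the usage note are assumptions made for
the example:

```python
def with_z_flag(mount: str) -> str:
    """Return the bind-mount string with the shared SELinux 'z' option set.

    Hypothetical helper: expects 'source:target' or 'source:target:options'.
    """
    parts = mount.split(":", 2)
    if len(parts) < 2:
        raise ValueError(f"not a bind mount: {mount!r}")
    if len(parts) == 2:
        # No options at all yet, so 'z' becomes the only option.
        return f"{mount}:z"
    source, target, raw_options = parts
    options = raw_options.split(",")
    if "z" not in options:
        options.append("z")
    return ":".join([source, target, ",".join(options)])
```

For example, with_z_flag("/var/lib/openvswitch/ovn:/var/lib/openvswitch/ovn:rw")
yields the same mount with "rw,z" as options, and calling it again leaves
the string unchanged.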

Change-Id: Ic3e427156fc82c524c763d1896937fcc3c49fabb
Closes-Bug: #1943459
2021-09-14 12:59:31 +02:00
Takashi Kajinami f08d83de05 Fix lint errors with the latest lint packages
This change fixes the lint errors detected since we removed pins of
lint packages.
Note that this change also replaces absolute name used to call
the tripleo::stunnel::service_proxy resource type, which is not yet
detected by the latest lint rules.

Closes-Bug: #1928079
Change-Id: I12ba801db92cb3df1d05f14f4c150ac765f0b874
2021-05-11 22:17:37 +09:00
Michele Baldessari d185cbf032 Allow OCF resources to be created with --force
While moving to running pcs commands on the host and off short-lived
containers, we are confronted with the issue that pcs usually checks
for the resource agent's existence on the host before creating it.
Since we'd rather avoid installing the needed resource agents on the
host (as it is inside a container), we allow a new 'force_ocf' parameter
to be passed to those situations where we might need it.

Depends-On: I20eb78a061a334b20f6b2274591c5d313a0af532

Related-Bug: #1863442
Change-Id: If9048196b5c03e3cfaba72f043b7f7275568bdc4
2020-05-08 08:12:28 +00:00
Takashi Kajinami 5f77bc71ac Remove unnecessary usage of hiera
We don't need to use hiera if the parameter is actually implemented
in the class.

Change-Id: Ia916707eaecb7a6d48f992ff2112fe8507544ee1
2020-04-21 23:30:39 +09:00
Michele Baldessari 06c4aa7446 Log stdout of HA containers
When podman dropped the journald log-driver we rushed to move to the supported
k8s-file driver. This had the side effect of us losing the stdout logs of the
HA containers.

In fact previously we were easily able to troubleshoot haproxy startup failures
just by looking in the journal. These days instead if haproxy fails to start we
have no traces whatsoever in the logs, because when a container fails it gets
stopped by pacemaker (and consequently removed) and no logs on the system are
available any longer.

Tested as follows:
1) Redeploy a previously deployed overcloud that did not have the patch
and observe that we now log the startup of HA bundles in /var/log/containers/stdouts/*bundle.log

[root@controller-0 stdouts]# ls -l *bundle.log |grep -v -e init -e restart
-rw-------. 1 root root   16032 Apr 14 14:13 openstack-cinder-volume.log
-rw-------. 1 root root   19515 Apr 14 14:00 haproxy-bundle.log
-rw-------. 1 root root   10509 Apr 14 14:03 ovn-dbs-bundle.log
-rw-------. 1 root root    6451 Apr 14 14:00 redis-bundle.log

2) Deploy a composable HA overcloud from scratch with the patch above
and observe that we obtain the stdout on disk.

Note that most HA containers log to their usual on-host files just
fine, we are mainly missing haproxy logs and/or the kolla startup only
of the HA containers.

Closes-Bug: #1872734

Change-Id: I4270b398366e90206adffe32f812632b50df615b
2020-04-15 20:10:03 +00:00
Alex Schultz a566d6b9b8 Add check for bootstrap_node for downcase
Downcase in puppet 6.14 throws an error if the input to it is Undef. We
can avoid this by checking for a value before trying to downcase.

See context https://review.rdoproject.org/r/#/c/26297/

Change-Id: Ib2e97060523a4198a14949a15c9171b56928699c
2020-04-07 14:51:41 -06:00
Michele Baldessari d766eb81a3 Make the bundle user configurable via hiera
Allow every bundle's --user option to be overridden, as some of them might
prefer switching to a non-root user when possible.
The ovn-dbs bundle is a bit special because it never specified any user.
Hence we default that user to undef and do not set anything.

Tested as follows:
1. deployed an overcloud
2. patched it with this change
3. redeployed and then observed that no HA container has restarted at all
4. verified cinder-volume runs with root by default:
USER  PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root    1  0.0  0.0   4204   716 ?        Ss   09:01   0:00 dumb-init --single-child -- /bin/bash /usr/local/bin/kolla_start
root    7  0.7  0.7 912976 145760 ?       S    09:01   1:04 /usr/bin/python3 /usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf
root   71  0.1  0.6 925800 124640 ?       S    09:01   0:14 /usr/bin/python3 /usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf
5. added 'tripleo::profile::pacemaker::cinder::volume_bundle::bundle_user: cinder' to
   the templates and redeployed
6. Observed that cinder-volume got restarted and now runs with cinder
   user:
USER   PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
cinder   1  0.0  0.0   4204   804 ?        Ss   12:23   0:00 dumb-init --single-child -- /bin/bash /usr/local/bin/kolla_start
cinder   7  2.1  0.7 912976 145432 ?       S    12:23   0:04 /usr/bin/python3 /usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf
cinder  64  0.3  0.5 919908 118452 ?       S    12:23   0:00 /usr/bin/python3 /usr/bin/cinder-volume --config-file /usr/share/cinder/cinder-dist.conf --config-file /etc/cinder/cinder.conf

Change-Id: I985d0d192ef3accf7fdd31503348de80713fded4
2020-01-13 11:40:32 +01:00
Tobias Urdin 1523a4b804 Convert all class usage to relative names
Change-Id: Ib2ed745b682cf12f9469a5a64451adcabec400af
2019-12-08 23:23:25 +01:00
Michele Baldessari bad716070a Switch HA containers to k8s-file log-driver and make it a parameter
Currently in puppet-tripleo for the HA container we hardcode the following:
 options => "--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS${tls_priorities_real}",

Since at least podman had some changes in terms of supported driver
backends (and bugs) it's best if we make this configurable. While we're
at it we should also switch to k8s-file as a driver when podman is being
used which is what all other containers are using. When docker is the
default container_cli we will stick to journald as usual.

Tested this on a Train environment and successfully verified that
we still see the correct logs in /var/log/containers/.../...

Change-Id: I5b1483826f816d11a064a937d59f9a8f468315a5
Closes-Bug: #1853517
2019-11-22 11:36:37 +01:00
Michele Baldessari f1a593b642 Initial support for tls_priorities
We add initial support for being able to specify tls priorities in
pacemaker. For bundles this will happen via an env variable because
pacemaker_remote is started normally as a process and there is no
sourcing of /etc/sysconfig/pacemaker.

Tested on both Queens and Stein, via a deploy and a redeploy against an
existing cloud. Observed that:
A) We got PCMK_tls_priorities inside /etc/sysconfig/pacemaker with the
value that was passed in THT
B) Containers had the following env variable set:
  "PCMK_tls_priorities=normal",

The '-e' addition is a noop in case the PCMK_tls_priorities is unset
so that we do not change the signature of the resources and hence do
not needlessly restart the HA resource.
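
A minimal Python sketch of that noop behaviour, under stated assumptions:
the function name bundle_options is hypothetical, and the base option
string is only illustrative. The point is that an unset value contributes
nothing, so the resulting options string (and hence the resource
signature) is identical to what it was before:

```python
# Illustrative base options; the real string lives in the Puppet manifests.
BASE_OPTIONS = "--user=root -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS"


def bundle_options(tls_priorities=None):
    """Build the container options, adding the '-e' only when a value is set."""
    if tls_priorities:
        suffix = f" -e PCMK_tls_priorities={tls_priorities}"
    else:
        # Unset: append nothing, so the options string does not change.
        suffix = ""
    return BASE_OPTIONS + suffix
```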

Depends-On: I1971810f6a90f244ed5ced972a5fe7fde29dde86
Change-Id: I703b5a429f48063474aace85bc45d948f5c91435
2019-07-27 07:59:45 +00:00
Damien Ciabrini afebff58fb redis HA: allow SELinux relabel for /var/run/redis
/var/run/redis is bind-mounted from the host, and on every reboot
that directory is recreated with default context for the host.

Configure the bind-mount so that /var/run/redis is relabelled
with a container context every time the redis container is started,
so that kolla can copy its config file and update the owner and
attributes as expected without SELinux denials.

Change-Id: Iaa8a99eb9ced21fb6c7c87c5b56dec55383af9a9
Partial-Bug: #1826554
2019-04-29 18:50:43 +02:00
Sofer Athlan-Guyot 48b1775e35 Extra variables to reprovision pacemaker cluster one node at a time.
For the upgrade we have to re-provision the controller cluster, one
node at a time.

Using extra override variable set in hiera we are able to specify to
pacemaker which nodes should be added to the cluster.

Change-Id: I2f6ef4679265718fbbe8726ee6c81832bc468f3e
Implements: blueprint upgrades-with-os
2019-02-12 10:20:48 +01:00
Zuul 71a8722eb3 Merge "Remove the duplicated word" 2019-02-06 16:45:40 +00:00
Michele Baldessari 736d69dad9 Add retries to HA bundles
The retry is needed in a composable HA environment because two nodes
might be modifying the CIB at the same time and so we need to retry more
than once to get the freshest CIB, modify it and push it back. Currently
all HA resources have it but we did not add it in the bundles. While it
is a rare race, we should still plug it.
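
The race can be sketched with a toy model in Python. Everything here is
hypothetical (FakeCib, StaleCibError, add_resource_with_retries, and the
before_push hook that simulates a second node); it is not how the pcs
provider is implemented, only an illustration of why fetching the CIB
again on each try resolves the conflict:

```python
class StaleCibError(Exception):
    """Raised when a push is based on an outdated CIB version."""


class FakeCib:
    """Toy stand-in for the cluster CIB with a version counter."""

    def __init__(self):
        self.version = 0
        self.resources = []

    def snapshot(self):
        return self.version, list(self.resources)

    def push(self, base_version, resources):
        if base_version != self.version:
            raise StaleCibError()
        self.resources = list(resources)
        self.version += 1


def add_resource_with_retries(cib, name, tries=20, before_push=None):
    """Fetch the freshest CIB, modify it and push it back, retrying on conflict."""
    for attempt in range(1, tries + 1):
        base_version, resources = cib.snapshot()
        if before_push is not None:
            before_push(attempt)  # hook used to simulate another node
        if name not in resources:
            resources.append(name)
        try:
            cib.push(base_version, resources)
            return attempt
        except StaleCibError:
            continue  # someone else changed the CIB; start over
    raise RuntimeError(f"could not update the CIB after {tries} tries")
```

With tries=1 the same conflict would surface as an error, which is the
situation the bundles were in before this change.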

Change-Id: Ib9d9c76c83f103e329a9c575ae5c110d5ad3c048
Closes-Bug: #1809223
2019-01-04 12:51:51 +00:00
Michele Baldessari 177d951be3 Allow the container backend to be configurable
We added a container backend in puppet-pacemaker via
Ia4a7b58d14d80e85d51e98acec1aad2ba90b69de. Let's now
let tripleo override it when needed.

Tested this via some hiera keys overrides and it works correctly.

Change-Id: I610923327462b901840131316a4984c8fe98faaa
2018-11-15 20:41:24 +01:00
zhulingjie de01705140 Remove the duplicated word
Change-Id: Ib16459b1f7c8ef2c0e6612c660e31bfb5f2c74fc
2018-09-15 14:37:32 +00:00
Michele Baldessari f2484a0bf9 Fix up property names in case of mixed case hostnames
When deploying a stack that contains mixed-case hostnames
the following error might be triggered:
Debug: try 15/20: /usr/sbin/pcs -f
/var/lib/pacemaker/cib/puppet-cib-backup20180405-8-1sqw3dc property set
--node TEST-STACK34-controller-1 redis-role=true
Debug: Error: Error: unable to set attribute redis-role
Could not map name=TEST-STACK34-controller-1 to a UUID
while the name in the cluster is test-stack34-controller-1

This used to work pre-bundles because we used the facter provided
$::hostname variable which was lower-cased for us. With bundles we
switched to setting cluster properties from the service bootstrap nodes
and so we used the '<service>_short_node_names' hiera key which might
contain mixed-case hostnames.

In order to fix this we just downcase() the short_node_names hiera
string that we loop on so we can get the same behaviour we had on bare
metal.
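
The mismatch and the fix can be illustrated with a short Python sketch.
The helper name downcase_node_names is hypothetical, and the set lookup is
only a stand-in for the case-sensitive name matching done by the cluster;
the hostnames are the ones from the error above:

```python
def downcase_node_names(short_node_names):
    """Lower-case every short node name, mirroring what downcase() does."""
    return [name.lower() for name in short_node_names]


# Node names as known to the cluster (always lower case).
cluster_nodes = {"test-stack34-controller-1"}

# Names as they may come out of the '<service>_short_node_names' hiera key.
hiera_names = ["TEST-STACK34-controller-1"]

# Names that would fail to map before the fix, and after it.
unmapped_before = [n for n in hiera_names if n not in cluster_nodes]
unmapped_after = [
    n for n in downcase_node_names(hiera_names) if n not in cluster_nodes
]
```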

Tested on an env with mixed-case hostnames:
[root@uppercaseovercloud-controller-0 keystone]# hiera -c /etc/puppet/hiera.yaml rabbitmq_short_node_names
["UPPERCASEOverCloud-controller-0",
 "UPPERCASEOverCloud-controller-1",
 "UPPERCASEOverCloud-controller-2"]

Cluster pcs properties were set correctly:
[root@uppercaseovercloud-controller-0 keystone]# pcs property |grep rabbitmq
 uppercaseovercloud-controller-0: galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@uppercaseovercloud-controller-0
 uppercaseovercloud-controller-1: galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@uppercaseovercloud-controller-1
 uppercaseovercloud-controller-2: galera-role=true haproxy-role=true rabbitmq-role=true redis-role=true rmq-node-attr-last-known-rabbitmq=rabbit@uppercaseovercloud-controller-2

Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Depends-On: Ie240b8a4217827dd8ade82479a828817d63143ba
Closes-bug: #1773219
Change-Id: I5bd49c4a1b13b2310f8a1173aa6b86abfa5dab3d
2018-05-28 10:28:14 +02:00
Jiri Stransky d8d86cfe68 Conventional log directories for pacemaker bundles
Use /var/log/containers/<service> instead of /var/log/<service>, as
the rest of the containerized services.

Change-Id: Id5760c16260de991ff95168c76186edc113752c8
Depends-On: Icb311984104eac16cd391d75613517f62ccf6696
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Closes-Bug: #1731969
2018-03-19 12:55:12 +00:00
Damien Ciabrini 32cce5f150 Fix Redis TLS setup, including replication traffic
This patch reverts the revert of Redis TLS [1], and fixes the
encryption of Redis replication traffic for HA deployments.

In order to encrypt replication traffic, Redis is configured to
drive outgoing replication traffic to a stunnel endpoint on
<localhost:port_xxx>. Stunnel then manages the encryption up to
the peer Redis master.

Likewise, slave Redis nodes advertise themselves as coming from
<localhost:port_yyy> in order to let the Master initiate a connection
to the Slave over its own stunnel endpoint, should it need to.

Each redis node is assigned a unique replication port, and has
dedicated stunnels to each one of its peers. This port mapping
info is used by the redis resource agent to manage A/P failover.

The regular Redis port is unchanged, so Redis clients (OpenStack
services, HAproxy, CLI, firewall) are not impacted by this change.
Only SELinux needs to be adapted.

[1] I37501c4c983c87e3a38841272eb176ebbe626a65

Change-Id: I6cc818973fab25b4cd6f7a0d040aaa05a35c5bb1
Related-bug: #1737707
2018-02-09 09:18:19 +00:00
Alex Schultz 3c58543678 Revert "Revert "Set meta container-attribute-target=host attribute""
This reverts commit 1681d3bceb. 

NOTE: This needs to be tested against scenario004-containers before merging.

This is needed because when we run bundles we actually
want to store attributes on a per-node basis and not on a per-bundle
basis. By activating this attribute pacemaker will pass
some extra OCF_RESKEY_CRM_meta attributes that will help us in this
decision.

We can merge this once we have packages for pacemaker and
resource-agents releases that contain the necessary fixes.

Proper pacemaker and resource-agents are now in the repo [1] so
we can merge it and backport it to pike.

[1] https://buildlogs.centos.org/centos/7/cloud/x86_64/openstack-pike/

Closes-Bug: #1713007

Change-Id: Ie968470126833939c19223f04db29556e550673d
2017-10-30 16:12:46 +00:00
John Trowbridge 1681d3bceb Revert "Set meta container-attribute-target=host attribute"
This patch broke the containers scenario004 test because it relies on a
newer mariadb container than has actually passed CI at this time.

To revert this revert, we need to make sure we test
scenario004-containers against that patch.

This reverts commit 6bcb011723.

Closes-Bug: 1721497

Change-Id: I34c7c388eed94db1735c45e26661a0af8cdce8e9
2017-10-06 13:03:04 +00:00
Michele Baldessari 6bcb011723 Set meta container-attribute-target=host attribute
This is needed because when we run bundles we actually
want to store attributes on a per-node basis and not on a per-bundle
basis. By activating this attribute pacemaker will pass
some extra OCF_RESKEY_CRM_meta attributes that will help us in this
decision.

We can merge this once we have packages for pacemaker and
resource-agents releases that contain the necessary fixes.

Proper pacemaker and resource-agents are now in the repo [1] so
we can merge it and backport it to pike.

[1] https://buildlogs.centos.org/centos/7/cloud/x86_64/openstack-pike/

Closes-Bug: #1713007

Change-Id: I0dd06e953b4c81f217d0f4199b2337e4c3358086
2017-09-28 14:05:21 +02:00
Michele Baldessari 1da0b51ecc Fix up the control-port for rabbitmq bundles
Mistakenly this was set to 3121 which is the same port that pacemaker
remote uses. Move this to 3122 which was the plan all along.

Also fix a wrong port comment in redis and mysql at the same time.

Change-Id: Iccca6a53a769570443091577c7d86f47119d9cbb
2017-07-21 10:46:48 +02:00
Martin André 1e90178298 Leverage kolla config_files to copy config into containers
This solves a problem with bind-mounts when the containers are holding
file descriptors open.

At the same time this makes the template more robust to puppet changes
since new config files will be available in the containers without
needing to update the templates.

Closes-Bug: #1698323
Change-Id: I857c94ba5f7f064d7c58df621ec5d477654b9166
Depends-On: I78dcec741a941dc21adba33ba33a6dc6ff1d217c
2017-07-12 09:56:56 +00:00
Steve Baker 94f13e6608 Ensure hiera step value is an integer
The step is typically set with the hieradata setting an integer value:

  {"step": 1}

However it would be useful for the value to be a string so that
substitutions are possible, for example:

  {"step": "%{::step}"}

This change ensures the step parameter defaults to an integer by
calling Integer(hiera('step'))

This change was made by manually removing the undef defaults from
fluentd.pp, uchiwa.pp, and sensu.pp then bulk updating with:

    find ./ -type f -print0 |xargs -0 sed -i "s/= hiera('step')/= Integer(hiera('step'))/"

Change-Id: I8a47ca53a7dea8391103abcb8960a97036a6f5b3
2017-06-14 14:31:52 +12:00
Michele Baldessari b10adec303 Make sure the resource bundles use a location_rule
In composable HA we bind resources to nodes that have special
node properties. We need to do this also for bundle resources
otherwise there is a potential race where the bundle might be
started on nodes where it is not supposed to during a small
window of time.

Tested with the depends-on and correctly obtained a containerized
composable HA deployment:

Docker container set: rabbitmq-bundle
[192.168.24.1:8787/tripleoupstream/centos-binary-rabbitmq:latest]
  rabbitmq-bundle-0    (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-rabbit-0
  rabbitmq-bundle-1    (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-rabbit-1
  rabbitmq-bundle-2    (ocf::heartbeat:rabbitmq-cluster):      Started overcloud-rabbit-2
Docker container set: galera-bundle
[192.168.24.1:8787/tripleoupstream/centos-binary-mariadb:latest]
  galera-bundle-0      (ocf::heartbeat:galera):        Master overcloud-galera-0
  galera-bundle-1      (ocf::heartbeat:galera):        Master overcloud-galera-1
  galera-bundle-2      (ocf::heartbeat:galera):        Master overcloud-galera-2
Docker container set: redis-bundle
[192.168.24.1:8787/tripleoupstream/centos-binary-redis:latest]
  redis-bundle-0       (ocf::heartbeat:redis): Master overcloud-controller-0
  redis-bundle-1       (ocf::heartbeat:redis): Slave overcloud-controller-1
  redis-bundle-2       (ocf::heartbeat:redis): Slave overcloud-controller-2
ip-192.168.24.11       (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
ip-10.0.0.7    (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
ip-172.16.2.11 (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
ip-172.16.2.9  (ocf::heartbeat:IPaddr2):       Started overcloud-controller-0
ip-172.16.1.6  (ocf::heartbeat:IPaddr2):       Started overcloud-controller-1
ip-172.16.3.7  (ocf::heartbeat:IPaddr2):       Started overcloud-controller-2
Docker container set: haproxy-bundle
[192.168.24.1:8787/tripleoupstream/centos-binary-haproxy:latest]
  haproxy-bundle-docker-0      (ocf::heartbeat:docker):        Started overcloud-controller-0
  haproxy-bundle-docker-1      (ocf::heartbeat:docker):        Started overcloud-controller-1
  haproxy-bundle-docker-2      (ocf::heartbeat:docker):        Started overcloud-controller-2

Depends-On: I44449861cbfe56304b8829c9ca10fd648353b3ae
Change-Id: I48fb490040497ba08cae19937159c0efdf99e3f8
2017-06-09 21:18:27 +02:00
Damien 8b5b0b3f15 Puppet module to deploy Redis bundle for HA
This module is used by tripleo-heat-templates to configure and deploy
Kolla-based Redis containers managed by pacemaker.

We use short-lived containers that call pcs via puppet to create
the needed pacemaker resources, properties and constraints.

Co-Authored-By: Michele Baldessari <michele@acksyn.org>
Partial-Bug: #1692924

Depends-On: I44fbd7f89ab22b72e8d3fc0a0e3fe54a9418a60f
Depends-On: Ie9b7e7d2a3cec4b121915a17c1e809e4ec950e7f

Change-Id: Ia1131611d15670190b7b6654f72e6290bf7f8b9e
2017-05-25 14:34:37 +02:00