From c3ccb92d64e7ee724d7d2f28d138894805d41629 Mon Sep 17 00:00:00 2001
From: Michele Baldessari
Date: Fri, 20 Jul 2018 19:44:09 +0200
Subject: [PATCH] Configure keepalived before rabbitmq

Sometimes an undercloud could fail to install with the following error:
2018-05-29 12:53:17,588 INFO: May 29 12:53:08 foo.int.bar systemd[1]: Starting RabbitMQ broker...
2018-05-29 12:53:17,588 INFO: May 29 12:53:11 foo.int.bar rabbitmq-server[14327]: ERROR: epmd error for host foo: address (cannot connect to host/port)
2018-05-29 12:53:17,588 INFO: May 29 12:53:11 foo.int.bar systemd[1]: rabbitmq-server.service: main process exited, code=exited, status=1/FAILURE

1) The hostname of the box is foo.int.bar foo and in the hosts file we
have the following entry:
192.168.248.2 192.168.248.2 foo.int.bar foo

Note: 192.168.248.2 is a VIP managed by keepalived because we configured
this undercloud to be an SSL one so we have:
undercloud_public_host = 192.168.248.2

2) At this stage we see rabbitmq-server being started:
Jan 27 06:46:31 foo.int.bar systemd[1]: Starting Flexible Branding Service...
Jan 27 06:46:31 foo.int.bar systemd[1]: epmd@0.0.0.0.socket failed to listen on sockets: Address already in use
Jan 27 06:46:31 foo.int.bar systemd[1]: Failed to listen on Erlang Port Mapper Daemon Activation Socket.
Jan 27 06:46:31 foo.int.bar systemd[1]: Unit epmd@0.0.0.0.socket entered failed state.
Jan 27 06:46:31 foo.int.bar systemd[1]: Starting Erlang Port Mapper Daemon Activation Socket.
Jan 27 06:46:31 foo.int.bar systemd[1]: Starting RabbitMQ broker...
Jan 27 06:46:34 foo.int.bar rabbitmq-server[14532]: ERROR: epmd error for host foo: address (cannot connect to host/port)

Now epmd might have already been up (and normally the failed message is
not particularly concerning).
But the real problem is that we are trying to connect to foo which maps
to a VIP, but the VIP gets started only later by keepalived:

3) Jan 27 07:02:30 foo.int.bar Keepalived_vrrp[914]: VRRP_Instance(42) Sending/queueing gratuitous ARPs on br-ctlplane for 192.168.248.2
Jan 27 07:02:30 foo.int.bar Keepalived_vrrp[914]: Sending gratuitous ARP on br-ctlplane for 192.168.248.2

Let's make sure keepalived is up and running before rabbitmq in order to
fix this.

Change-Id: I010102b01e41610838c836a743a07be1965944d6
Closes-Bug: #1782814
---
 elements/puppet-stack-config/puppet-stack-config.pp | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/elements/puppet-stack-config/puppet-stack-config.pp b/elements/puppet-stack-config/puppet-stack-config.pp
index d9298a6a0..26fde943e 100644
--- a/elements/puppet-stack-config/puppet-stack-config.pp
+++ b/elements/puppet-stack-config/puppet-stack-config.pp
@@ -80,6 +80,12 @@ if hiera('tripleo::haproxy::service_certificate', undef) {
     enable_load_balancer => true,
   }
   include ::tripleo::keepalived
+  # NOTE: The following is required because we need to make sure that keepalived
+  # is up and running before rabbitmq. The reason is that when the undercloud is
+  # with ssl the hostname is configured to one of the VIPs so rabbit will try to
+  # connect to it at startup and if the VIP is not up it will fail (LP#1782814)
+  Class['::tripleo::keepalived'] -> Class['::rabbitmq']
+
   # NOTE: This is required because the haproxy configuration should be changed
   # before any keystone operations are triggered. Without this, it will try to
   # access the new endpoints that point to haproxy even if haproxy hasn't