Commit Graph

13 Commits

Author SHA1 Message Date
Jiri Stransky 7a438651af Remove obsolete code for handling Pacemakerized resource restarts
Remove scripts and templates which dealt with Pacemaker and its
resource restarts before we moved to containerized deployments. These
should all now be unused.

Many environments had this mapping:

    OS::TripleO::Tasks::ControllerPreConfig: OS::Heat::None
    OS::TripleO::Tasks::ControllerPostConfig: OS::Heat::None
    OS::TripleO::Tasks::ControllerPostPuppetRestart: ../../extraconfig/tasks/post_puppet_pacemaker_restart.yaml

The ControllerPostPuppetRestart is only ever referenced from
ControllerPostConfig, so if ControllerPostConfig is OS::Heat::None, it
doesn't matter what ControllerPostPuppetRestart is mapped to.

Change-Id: Ibca72affb3d55cf62e5dfb52fe56b3b1c8b12ee0
Closes-Bug: #1794720
2018-10-11 10:41:15 +02:00
Alan Bishop 5a89ea21f2 Maintain ceph-osd package only on nodes hosting CephOSD service
The ceph-osd package is only required on nodes hosting the CephOSD
service, but the package's presence on other nodes may interfere with
software updates. That's because some distros distribute Ceph software
in different channels, and not all nodes have access to the ceph-osd
channel.

There are two parts to the fix, and the first is an enhancement to the
yum update process. The process detects when the ceph-osd package is not
required, and removes the package from the node.

The second part takes ceph-osd out of the default list of packages
needed by puppet-ceph. The ceph-osd package is listed only on the nodes
hosting the CephOSD service.

Closes-Bug: #1713292
Change-Id: I7a581518ed25cf5f264abfaabfcf2041363a065b
2017-09-05 22:55:29 +00:00
marios 839c0b1116 Adds check for existing yum process during the legacy minor update
Checks for an existing /var/run/yum.pid and exit 1 with an error
message saying why.

Change-Id: I374eeb4164a8007ae67fea2796eac109fffdef97
Closes-Bug: 1704131
2017-07-13 16:25:52 +03:00
Alex Schultz abf4444d90 Ignore case for bootstrap node checks
The bootstrap_nodeid can have capital letters while the hostname may
not. In puppet we use downcase for this comparison, so let's follow a
similar pattern for scripts from THT.

Change-Id: I8a0bec4a6f3ed0b4f2289cbe7023344fb284edf7
Closes-Bug: #16998201
2017-06-15 13:59:31 -06:00
Michele Baldessari 4923f5c499 Initial VIP ipv6 minor update code
To test this change we deployed a stock master with ipv6 which created a bunch
of ipv6 with /64 netmask:
[root@overcloud-controller-0 ~]# pcs resource show ip-fd00.fd00.fd00.2000..18
 Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=64
  Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
              stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)

Then we update the THT folder with this patch and upload the new scripts on the undercloud via:
openstack overcloud deploy --update-plan-only ....

Then we kick off the minor update workflow:
openstack overcloud update stack -i overcloud

Once the controller-0 node (bootstrap node for pacemaker) is completed we have the
correct VIP configuration:
[root@overcloud-controller-0 heat-config-script]# pcs resource show ip-fd00.fd00.fd00.2000..18
 Resource: ip-fd00.fd00.fd00.2000..18 (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: ip=fd00:fd00:fd00:2000::18 cidr_netmask=128 nic=vlan20 lvs_ipv6_addrlabel=true lvs_ipv6_addrlabel_value=99
  Operations: start interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-start-interval-0s)
              stop interval=0s timeout=20s (ip-fd00.fd00.fd00.2000..18-stop-interval-0s)
              monitor interval=10s timeout=20s (ip-fd00.fd00.fd00.2000..18-monitor-interval-10s)

Also verified that running the script a second time does not alter the
(already fixed) VIPs.

Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>

Change-Id: I765cd5c9b57134dff61f67ce726bf88af90f8090
2017-05-02 20:01:21 +02:00
marios 25983882c2 Add special case upgrade from openvswitch 2.5.0-14
In [1] we removed the previously used special case upgrade code.
However we have since discovered that for openvswitch 2.5.0-14
the special case is still required with an extra flag to prevent
the restart.  This adds the upgrade code back into the minor
update and 'manual upgrade' scripts for compute/swift. The
review at If998704b3c4199bbae8a1d068c31a71763f5c8a2 is adding
this logic for the ansible upgrade steps.

Related-Bug: 1669714
[1] https://review.openstack.org/#/q/59e5f9597eb37f69045e470eb457b878728477d7
Change-Id: I3e5899e2d831b89745b2f37e61ff69dbf83ff595
2017-03-31 18:38:33 +03:00
marios afcb6e01f3 Make the openvswitch 2.4->2.5 upgrade more robust
In I9b1f0eaa0d36a28e20b507bec6a4e9b3af1781ae and
I11fcf688982ceda5eef7afc8904afae44300c2d9 we added a manual step
for upgrading openvswitch in order to specify the --nopostun
as discussed in the bug below.

This change adds a minor update to make this workaround more
robust. It removes any existing rpms that may be around from
an earlier run, and also checks that the rpms installed are
at least newer than the version we are on.

This also refactors the code into a common definition in the
pacemaker_common_functions.sh which is included even for the
heredocs generating upgrade scripts during init. Thanks
Sofer Athlan-Guyot and Jirka Stransky for help with that.

Change-Id: Idc863de7b5a8c116c990ee8c1472cfe377836d37
Related-Bug: 1635205
2016-12-14 19:15:11 +02:00
marios a7af5b90e4 Fixup the start of swift services
Seems the conditional has changed and we should pickup the
tripleo::profile::base::swift::storage::enable_swift_storage
hiera data.

After controller nodes are upgraded the swift services were down
even though there was no stand-alone swift node (the current
conditional was failing as that hiera isn't set any more)

Closes-Bug: 1638821
Change-Id: Id1383c1e54f9cae13fd375e90da525230e5d23eb
2016-11-03 07:33:40 +00:00
marios fb25385d34 Rework the pacemaker_common_functions for M..N upgrades
For N we cannot assume services are managed by pacemaker.
This adds functions to check if a service is systemd or
pcmk managed and start/stops it accordingly. For pcmk,
only stop/disable on bootstrap node for example, whereas
systemd should stop/start on all controllers.

There is also an equivalent change to the check_resource
which has been reworked to allow both pcmk and systemd.

Implements: blueprint overcloud-upgrades-workflow-mitaka-to-newton
Change-Id: Ic8252736781dc906b3aef8fc756eb8b2f3bb1f02
2016-09-17 04:46:24 +00:00
Raoul Scarazzini 05b2a200ca Filter for local nodes in check_resource function
While having extra customizations inside a TripleO deployed
Pacemaker environment, say you have instance HA with
pacemaker_remoted or you need to configure an external arbitrator
for something, then the status of the resources for remote nodes
is "Stopped".
This leads to failures while, for example, scaling up.
This fixes the way status is checked, filtering just local nodes.

Co-Authored-By: Giulio Fidente <gfidente@redhat.com>
Change-Id: I8dc25f5d7031c265858afd5a266fda5315ae37a0
2016-04-04 10:41:37 +02:00
marios 9727a27d00 Fixup systemctl_swift stop/start during the controller upgrade
During the controller upgrade in
major_upgrade_controller_pacemaker_1.sh we use systemctl to stop
all swift services and then start them again in _pacemaker_2.sh

In the case of stand-alone swift nodes the deployer may have
used the ControllerEnableSwiftStorage: false so that only the
swift-proxy service is left on controllers (wrt swift). The
systemctl_swift function used during upgrades is changed to factor
this in.

Change-Id: Ib22005123429f250324df389855d0dccd2343feb
2016-03-09 15:43:40 +02:00
marios 31b884a2a8 Moves the swift start/stop into the common_functions.sh file
Since swift isn't managed by pacemaker we need to manually (systemctl)
stop and start the swift services. This moves the duplicate blocks for
start/stop into a common function (we already include that
pacemaker_common_functions.sh here so may as well)

Change-Id: Ic4f23212594c1bf9edc39143bf60c7f6d648fd1d
2016-03-02 18:31:51 +02:00
Giulio Fidente 9e5bc65938 Split pacemaker common check_service function out of _restart.sh
Also split out echo_error function to DRY the error output code and
allow changing the way we report errors in a single place.

Change-Id: I448bf0eb49390f03155335736bb4ab4e979db128
Co-Authored-By: Jiri Stransky <jistr@redhat.com>
2016-01-26 15:21:24 +01:00