Make redeploy idempotent

Rerunning the overcloud deploy command with no changes restarts a
truckload of containers (first seen via
https://bugzilla.redhat.com/show_bug.cgi?id=1612960). There are really
three separate issues here. Below is the list of all the containers that
may restart needlessly (at least those I have observed in my tests):
A) cron category:
ceilometer_agent_notification cinder_api cinder_api_cron cinder_scheduler
heat_api heat_api_cfn heat_api_cron heat_engine keystone keystone_cron
logrotate_crond nova_api nova_api_cron nova_conductor nova_consoleauth
nova_metadata nova_scheduler nova_vnc_proxy openstack-cinder-volume-docker-0
panko_api

These end up being restarted because the config volume for the container contains
a cron file, and cron files are generated with a timestamp inside:
$ cat /var/lib/config-data/puppet-generated/keystone/var/spool/cron/keystone
...
 # HEADER: This file was autogenerated at 2018-08-07 11:44:57 +0000 by puppet.
...

The timestamp is unfortunately hard coded into puppet in both the cron provider and the parsedfile
provider:
https://github.com/puppetlabs/puppet/blob/master/lib/puppet/provider/cron/crontab.rb#L127
https://github.com/puppetlabs/puppet/blob/master/lib/puppet/provider/parsedfile.rb#L104

We fix this by repiping tar into 'tar xO' and using sed to delete any
line that starts with # and contains HEADER.
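
As a rough illustration of why dropping the header lines stabilizes the
checksum, here is a minimal Python sketch. It is not the code in this
change (which pipes tar output through shell tools); the function name
checksum_without_headers and the sample crontab entry are made up for
the example.

```python
import hashlib
import re

# A line that starts with '#' and contains HEADER somewhere after it,
# mirroring the pattern used in the shell pipeline of this change.
HEADER_RE = re.compile(rb"^#.*HEADER.*")


def checksum_without_headers(data: bytes) -> str:
    """Return the md5 of data with puppet HEADER comment lines removed."""
    kept = [line for line in data.splitlines(keepends=True)
            if not HEADER_RE.match(line)]
    return hashlib.md5(b"".join(kept)).hexdigest()


# Two hypothetical crontabs that differ only in the generated timestamp.
first = (b"# HEADER: This file was autogenerated at 2018-08-07 11:44:57 +0000 by puppet.\n"
         b"1 0 * * * keystone-manage token_flush\n")
second = (b"# HEADER: This file was autogenerated at 2018-08-08 09:12:03 +0000 by puppet.\n"
          b"1 0 * * * keystone-manage token_flush\n")

# The raw digests differ, the filtered ones do not.
print(hashlib.md5(first).hexdigest() == hashlib.md5(second).hexdigest())
print(checksum_without_headers(first) == checksum_without_headers(second))
```

Ordinary comment lines without HEADER are kept, so a real edit to a
crontab comment still changes the checksum.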

B) swift category:
swift_account_auditor swift_account_reaper swift_account_replicator
swift_account_server swift_container_auditor swift_container_replicator
swift_container_server swift_container_updater swift_object_auditor
swift_object_expirer swift_object_replicator swift_object_server
swift_object_updater swift_proxy swift_rsync

The swift containers restart because when recalculating the md5 over the
/var/lib/config-data/puppet-generated/swift folder we also include:
B.1) /etc/swift/backups/... which is a folder that collects backups of the ring files over time
B.2) /etc/swift/*.gz, as the *.gz files appear to change over time

We just add parameters to the tar command to exclude those files, as
we do not need to trigger a restart if those files change:
--exclude='*/etc/swift/backups/*' --exclude='*/etc/swift/*.gz'
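
The effect of the excludes can be sketched in Python as follows. This is
only an approximation of the idea, not the implementation in this change:
tree_checksum and write are hypothetical helpers, the file names and
contents are invented, and the patterns are modelled on the tar excludes
used here. fnmatch lets '*' cross '/', which is assumed to be close
enough to how tar treats these exclude patterns.

```python
import fnmatch
import hashlib
import os
import tempfile

# Patterns modelled on the tar --exclude options used in this change.
EXCLUDES = (
    "*/etc/swift/backups/*",
    "*/etc/swift/*.ring.gz",
    "*/etc/swift/*.builder",
    "*/etc/libvirt/passwd.db",
)


def tree_checksum(root, excludes=EXCLUDES):
    """md5 over relative paths and contents of files under root, skipping excludes."""
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # walk in a stable order so the digest is reproducible
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if any(fnmatch.fnmatchcase(path, pat) for pat in excludes):
                continue
            digest.update(os.path.relpath(path, root).encode())
            with open(path, "rb") as handle:
                digest.update(handle.read())
    return digest.hexdigest()


def write(root, relpath, data):
    """Create or overwrite root/relpath with data (test helper)."""
    path = os.path.join(root, relpath)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as handle:
        handle.write(data)


with tempfile.TemporaryDirectory() as root:
    write(root, "etc/swift/swift.conf", b"[swift-hash]\n")
    write(root, "etc/swift/object.ring.gz", b"ring v1")
    before = tree_checksum(root)

    # A rebalanced ring and a new backup appear; the real config is untouched.
    write(root, "etc/swift/object.ring.gz", b"ring v2")
    write(root, "etc/swift/backups/1533642297.object.ring.gz", b"ring v1")
    unchanged = tree_checksum(root) == before

    # A change to a file that is not excluded still alters the checksum.
    write(root, "etc/swift/swift.conf", b"[swift-hash]\nchanged\n")
    changed = tree_checksum(root) != before

print(unchanged, changed)
```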

C) libvirt category:
nova_compute nova_libvirt nova_migration_target nova_virtlogd

This one seems to be due to the fact that the /etc/libvirt/passwd.db file contains a timestamp, and
even when we disable a user and passwd.db does not exist, it gets
created:
[root@compute-1 nova_libvirt]# git diff cb2441bb1caf7572ccfd870561dcc29d7819ba04..0c7441f30926b111603ce4d4b60c6000fe49d290 .

passwd.db changes do not need to trigger a restart of the container, so
we can safely exclude this file from any md5 calculation.

Part C) was: Co-Authored-By: Martin Schupper <mschuppe@redhat.com>

We only partial-bug this one because we want a cleaner fix where
exceptions to the files being checksummed will be specified in the tht
service files.

Partial-Bug: #1786065

Tested as follows:
./overcloud_deploy.sh
tripleo-ansible-inventory --static-yaml-inventory inv.yaml
ansible -f1 -i inv.yaml  -m shell --become -a "docker ps --format=\"{{ '{{' }}.Names{{ '}}' }}: {{ '{{' }}.CreatedAt{{ '}}' }}\" | sort" overcloud > before
./overcloud_deploy.sh
ansible -f1 -i inv.yaml  -m shell --become -a "docker ps --format=\"{{ '{{' }}.Names{{ '}}' }}: {{ '{{' }}.CreatedAt{{ '}}' }}\" | sort" overcloud > after
diff -u before after | wc -l
0
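
When the diff is not empty, it helps to see which containers were
recreated. The following Python sketch compares two such listings. It
assumes each host's output starts with an ansible header line of the
form "host | STATUS | rc=0 >>" followed by "name: CreatedAt" lines;
parse_listing, restarted_containers and the sample data are all
hypothetical.

```python
def parse_listing(text):
    """Map (host, container name) to the CreatedAt string."""
    created = {}
    host = None
    for line in text.splitlines():
        if line.rstrip().endswith(">>") and " | " in line:
            # Assumed ansible ad-hoc header, e.g. "controller-0 | CHANGED | rc=0 >>"
            host = line.split(" | ", 1)[0]
        elif ": " in line:
            # Split on the first ": " only, CreatedAt itself contains colons.
            name, created_at = line.split(": ", 1)
            created[(host, name)] = created_at
    return created


def restarted_containers(before_text, after_text):
    """Return (host, name) pairs that are new or whose CreatedAt changed."""
    before = parse_listing(before_text)
    after = parse_listing(after_text)
    return sorted(key for key, created_at in after.items()
                  if before.get(key) != created_at)


before_text = """controller-0 | CHANGED | rc=0 >>
keystone: 2018-08-08 10:00:00 +0000 UTC
swift_proxy: 2018-08-08 10:01:00 +0000 UTC
"""
after_text = """controller-0 | CHANGED | rc=0 >>
keystone: 2018-08-08 12:30:00 +0000 UTC
swift_proxy: 2018-08-08 10:01:00 +0000 UTC
"""

print(restarted_containers(before_text, after_text))
```

With the made-up sample above only keystone on controller-0 is reported.
An empty list corresponds to the empty diff shown in the test run.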

Change-Id: I10f5cacd9fee94d804ebcdffd0125676f5a209c4
Michele Baldessari 2018-08-08 21:04:53 +02:00
parent 83a21f3563
commit 42c3f18051
1 changed files with 11 additions and 2 deletions

@@ -267,8 +267,17 @@ with open(sh_script, 'w') as script_file:
 # Write a checksum of the config-data dir, this is used as a
 # salt to trigger container restart when the config changes
-tar -c -f - /var/lib/config-data/${NAME} --mtime='1970-01-01' | md5sum | awk '{print $1}' > /var/lib/config-data/${NAME}.md5sum
-tar -c -f - /var/lib/config-data/puppet-generated/${NAME} --mtime='1970-01-01' | md5sum | awk '{print $1}' > /var/lib/config-data/puppet-generated/${NAME}.md5sum
+# We need to exclude the swift rings and their backup as those change over time and
+# containers do not need to restart if they change
+EXCLUDE=--exclude='*/etc/swift/backups/*'\ --exclude='*/etc/swift/*.ring.gz'\ --exclude='*/etc/swift/*.builder'\ --exclude='*/etc/libvirt/passwd.db'
+# We need to repipe the tar command through 'tar xO' to force text
+# output because otherwise the sed command cannot work. The sed is
+# needed because puppet puts timestamps as comments in cron and
+# parsedfile resources, hence triggering a change at every redeploy
+tar -c --mtime='1970-01-01' $EXCLUDE -f - /var/lib/config-data/${NAME} | tar xO | \
+sed '/^#.*HEADER.*/d' | md5sum | awk '{print $1}' > /var/lib/config-data/${NAME}.md5sum
+tar -c --mtime='1970-01-01' $EXCLUDE -f - /var/lib/config-data/puppet-generated/${NAME} --mtime='1970-01-01' | tar xO \
+| sed '/^#.*HEADER.*/d' | md5sum | awk '{print $1}' > /var/lib/config-data/puppet-generated/${NAME}.md5sum
 fi
 """)