Incorrect condition in Podman part prevented the retrieval
of facts of all the containers when no names were provided.
Closes-Bug: #2058492
Change-Id: I6d7f7ca0523eb17c7d9a9b93d2037bf77f2c2a47
Signed-off-by: Martin Hiner <martin.hiner@tietoevry.com>
Changes name of ansible module kolla_docker to
kolla_container.
Change-Id: I13c676ed0378aa721a21a1300f6054658ad12bc7
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
docker_restart_policy: no causes systemd units to not get created
and we use it in CI to disable restarts on services.
Introducing oneshot policy to not create systemd unit for oneshot
containers (those that are running bootstrap tasks, like db
bootstrap and don't need a systemd unit), but still create systemd
units for long lived containers but with Restart=No.
Change-Id: I9e0d656f19143ec2fcad7d6d345b2c9387551604
This change adds basic deployment based on Podman
container manager as an alternative to Docker.
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Signed-off-by: Martin Hiner <m.hiner@partner.samsung.com>
Signed-off-by: Petr Tuma <p.tuma@partner.samsung.com>
Change-Id: I2b52964906ba8b19b8b1098717b9423ab954fa3d
Depends-On: Ie4b4c1cf8fe6e7ce41eaa703b423dedcb41e3afc
Currently clustering steps are very static, if for a reason first
node in the inventory fails and gets re-introduced - K-A will create
a second empty cluster on that node.
This patch changes the approach and checks if cluster exists, if it
does - chooses a donor for the new node from currently running
node set.
Also it fixes node replacement - it removes old node from cluster
(that has the same ip address as newly provisioned node).
Closes-Bug: #1875223
Change-Id: Ia025283e38ea7c3bd37c7a70d03f6b46c68f4456
Fourth part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
which was suggested to be split into smaller patches.
This commit refactors select methods from DockerWorker class
into ContainerWorker class. New class contains Docker independent
methods also used in Podman introduction and is inteded as a
parent class for specific worker classes.
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I2dd5920410dda053f2dfedc4e2666c56b1a7095a
This patch fixes kolla_docker module
as it did not take into account common_options
parameter. From patchset it's visible that module's
default values are used always - even if user overrided
some param in common_options dict.
Closes-Bug: #2003079
Change-Id: I677fde708dd004decaff4bd39f2173d8d81052fb
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.
THis change adds container_engine to module parameters
so when we introduce podman, kolla_toolbox can be used
for both engines.
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: Ic2093aa9341a0cb36df8f340cf290d62437504ad
Second part of patchset:
https://review.opendev.org/c/openstack/kolla-ansible/+/799229/
in which was suggested to split patch into smaller ones.
This change adds container_engine variable to kolla_container_facts
module, this prepares module to be used with docker and podman as well
without further changes in roles.
Signed-off-by: Ivan Halomi <i.halomi@partner.samsung.com>
Co-authored-by: Martin Hiner <m.hiner@partner.samsung.com>
Change-Id: I9e8fa30646844ab4a288555f3aafdda345b3a118
Moved the DockerWorker class from module file into its separate file
in module_utils directory for future extension.
Unit tests changed accordingly.
Signed-off-by: Ivan Halomi <ivan.halomi@tietoevry.com>
Co-authored-by: Martin Hiner <martin.hiner@tietoevry.com>
Change-Id: Ia2a471a9a2805e13b2c20dbf8a7297c23231aae3
This change bumps up max supported Ansible version
to 4.x (ansible-core 2.11.x) and minimum to 2.10.
Change-Id: I8b9212934dfab3831986e8db55671baee32f4bbd
This is required to support Debian Bullseye (11) - need to set
nova-libvirt to use 'host' CgroupnsMode.
Change-Id: I40213d4092fa325bcf37bb1fb4437ab125fe328b
The proposed approach allows for checking whether config
files are current, e.g. cases when the deployment was aborted after
config files were generated but before they were injected into the
containers which lead to old config staying in containers.
After this patch we can do:
kolla-ansible genconfig
kolla-ansible deploy-containers
and it would do what we expected rather than being a noop
in the second part.
We also lose the need to have notifies
and whens in config and handler sections respectively.
This is optimised in a separate patch.
Future work:
- optimise for large files
- could we get away with comparing timestamps and sizes?
container's should have a newer timestamp due to copy,
could also preserve it
Change-Id: I1d26e48e1958f13b854d8afded4bfba5021a2dec
Closes-Bug: #1848775
Depends-On: https://review.opendev.org/c/openstack/kolla/+/773257
Co-Authored-By: Mark Goddard <mark@stackhpc.com>
Keepalived and haproxy cooperate to provide control plane HA in
kolla-ansible deployments.
Certain care should be exerted to avoid prolonged availability
loss during reconfigurations and upgrades.
This patch aims to provide this care.
There is nothing special about keepalived upgrade compared to
reconfig, hence it is simplified to run the same code as for
deploy.
The broken logic of safe upgrade is replaced by common handler
code which's goal is to ensure we down current master only after
we have backups ready.
This change introduces a switch to kolla_docker module that allows
to ignore missing containers (as they are logically stopped).
ignore_missing is the switch's name.
All tests are included.
Change-Id: I22ddec5f7ee4a7d3d502649a158a7e005fe29c48
W503 and W504 are incompatible and we need to choose one of them.
Existing codes follows W503, so we disable W504.
Change-Id: Ic745e956dd332eb0fa49b93c1e6acb12f8a7f26c
The repo is Python 3 now, so update hacking to version 3.0 which
supports Python 3.
Fix problems found by updated hacking version.
Remove hacking and friends from lower-constraints, they are not needed
during installation.
Change-Id: I7ef5ac8a89e94f5da97780198619b6facc86ecfe
Kolla-Ansible Ceph deployment mechanism has been deprecated in Train [1].
This change removes the Ansible code and associated CI jobs.
[1]: https://review.opendev.org/669214
Change-Id: Ie2167f02ad2f525d3b0f553e2c047516acf55bc2
This is to fix the duplicated words issue like
"Other services that are are out of scope of this".
Change-Id: Ie4882dbb64d6e8774888b97895af20ba3855f0f8
Adds support for configuration of the Docker client timeout via
'docker_client_timeout'.
This change also increases the default timeout to 120 seconds, as we
sometimes see timeouts in CI and heavily loaded or underpowered
environments. Increasing 'docker_client_timeout' further may be helpful
in cases where Docker reports 'Read timed out'.
Change-Id: I73745771078cb2c0ebae2b1d87ba2c4c12958d82
Closes-Bug: #1809844
* Deploy services using kolla-ansible deploy
* Reconfigure the image for one or more services to use an invalid
* config
* Deploy/reconfigure services using kolla-ansible reconfigure
The invalid config could be a wrong docker registry, wrong image name,
wrong tag, etc.
The restart handler for the service fails, and the old container is
left running.
The restart handler for the service fails, and the old container is
stopped and removed. This leaves the service in a broken state.
This change fixes the issue by pulling the image if necessary prior to
stopping and removing the container.
Change-Id: I85b2a1b224d4c4d85c32c4922a2cd2c41171a1dc
Closes-Bug: #1852572
This role can be used by other roles to register RabbitMQ resources.
Currently support is provided for creating virtual hosts and users.
Change-Id: Ie1774a10b4d629508584af679b8aa9e372847804
Partially Implements: blueprint support-nova-cells
Depends-On: https://review.opendev.org/684742
Since
70b515bf12
was merged, we implicitly require Docker API version 1.25
(https://docs.docker.com/engine/api/v1.25/) to support passing
environment variables to docker exec. The version of docker we deployed
before the Docker CE upgrade was 1.12.0, which is Docker API version
1.24, and so does not support this. We get the following error:
Setting environment for exec is not supported in API < 1.25
This change modifies the kolla_toolbox module to use the new JSON
method for parsing Ansible's output when Docker API 1.25 is available,
falling back to the old regex-based method otherwise.
This change can be reverted when we require a minimum Docker API version
of 1.25+.
Change-Id: Ie671624ecca5b43d7bd8fbd959d701d9e21d66b3
Closes-Bug: #1845681
The kolla_toolbox Ansible module executes as-hoc ansible commands in the
kolla_toolbox container, and parses the output to make it look as if
ansible-playbook executed the command. Currently however, this module
sometimes fails to catch failures of the underlying command, and also
sometimes shows tasks as 'ok' when the underlying command was changed.
This has been tested both before and after the upgrade to ansible 2.8.
This change fixes this issue by configuring ansible to emit output in
JSON format, to make parsing simpler. We can now pick up errors and
changes, and signal them to the caller.
This change also adds an ansible playbook, tests/test-kolla-toolbox.yml,
that can be executed to test the module. It's not currently integrated
with any CI jobs.
Note that this change cannot be backported as the JSON output callback
plugin was added in Ansible 2.5.
Change-Id: I8236dd4165f760c819ca972b75cbebc62015fada
Closes-Bug: #1844114
In order to orchestrate smooth transition to fluentd 0.14.x
aka 1.0 stable branch aka td-agent 3
from td-agent repository - use image labels (fluentd_version
and fluentd_binary).
Depends-On: https://review.opendev.org/676411
Change-Id: Iab8518c34ef876056c6abcdb5f2e9fc9f1f7dbdd
- add support for sha256 in bslurp module
- change sha1 to sha256 in ceph-mon ansible role
Depends-On: https://review.opendev.org/655623
Change-Id: I25e28d150f2a8d4a7f87bb119d9fb1c46cfe926f
Closes-Bug: #1826327
1) ceph-nfs (ganesha-ceph) - use NFSv4 only
This is recommended upstream.
v3 and UDP require portmapper (aka rpcbind) which we
do not want, except where Ubuntu ganesha version (2.6)
forces it by requiring enabled UDP, see [1].
The issue has been fixed in 2.8, included in CentOS.
Additionally disable v3 helper protocols and kerberos
to avoid meaningless warnings.
2) ceph-nfs (ganesha-ceph) - do not export host dbus
It is not in use. This avoids the temptation to try
handling it on host.
3) Properly handle ceph services deploy and upgrade
Upgrade runs deploy.
The order has been corrected - nfs goes after mds.
Additionally upgrade takes care of rgw for keystone
(for swift emulation).
4) Enhance ceph keyring module with error detection
Now it does not blindly try to create a keyring after
any failure. This used to hide real issue.
5) Retry ceph admin keyring update until cluster works
Reordering deployment caused issue with ceph cluster not being
fully operational before taking actions on it.
6) CI: Remove osd df from collected logs as it may hang CI
Hangs are caused by healthy MON and no healthy MGR.
A descriptive note is left in its place.
7) CI: Add 5s timeout to ceph informational commands
This decreases the timeout from the default 300s.
[1] https://review.opendev.org/669315
Change-Id: I1cf0ad10b80552f503898e723f0c4bd00a38f143
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
Docker has no restart policy named 'never'. It has 'no'.
This has bitten us already (see [1]) and might bite us again whenever
we want to change the restart policy to 'no'.
This patch makes our docker integration honor all valid restart policies
and only valid restart policies.
All relevant docker restart policy usages are patched as well.
I added some FIXMEs around which are relevant to kolla-ansible docker
integration. They are not fixed in here to not alter behavior.
[1] https://review.opendev.org/667363
Change-Id: I1c9764fb9bbda08a71186091aced67433ad4e3d6
Signed-off-by: Radosław Piliszek <radoslaw.piliszek@gmail.com>
It is possible to reference undefined variable in kolla-docker module if
DockerWorker object initialization fail, so the current behaviour will
crash the playbook with the unwanted error message :
UnboundLocalError: local variable 'dw' referenced before assignment
Change-Id: Ic8d26b11f93255220888b5406f8ab4a6f81736c2
Closes-Bug: #1819361
By default, docker containers inherit ulimit from limits of docker
deamon. On CentOS 7, docker daemon default NOFILE is 1048576.
It can found in /usr/lib/systemd/system/docker.service.
The big limit will cause many problem. we should control it in
production environment.
Change-Id: Iab962446a94ef092977728259d9818b86cfa7f68
Nova services may reasonably expect cell databases to exist when they
start. The current cell setup tasks in kolla run after the nova
containers have started, meaning that cells may or may not exist in the
database when they start, depending on timing. In particular, we are
seeing issues in kolla CI currently with jobs timing out waiting for
nova compute services to start. The following error is seen in the nova
logs of these jobs, which may or may not be relevant:
No cells are configured, unable to continue
This change creates the cell0 and cell1 databases prior to starting nova
services.
In order to do this, we must create new containers in which to run the
nova-manage commands, because the nova-api container may not yet exist.
This required adding support to the kolla_docker module for specifying a
command for the container to run that overrides the image's command.
We also add the standard output and error to the module's result when a
non-detached container is run. A secondary benefit of this is that the
output of bootstrap containers is now displayed in the Ansible output if
the bootstrapping command fails, which will help with debugging.
Change-Id: I2c1e991064f9f588f398ccbabda94f69dc285e61
Closes-Bug: #1808575
This change adds support to comfigure tty,
it was enabled by default but a recent patch
removed it. Some services such as Karaf in opendaylight
requires a TTY during startup.
Closes-Bug: #1806662
Change-Id: Ia4335523b727d0e45505cbb1efb40ccf04c27db7
With a pseudo terminal, service is not treated as a daemon
and signals would not work as expected.
Change-Id: I16aa29a7924df51659d973a81d8005ae3d86f57b
Related-Bug: #1799642
move the merge_yaml and merge_config module's DOCUMENTATION and
EXAMPLES into action_plugins.
Change-Id: I84c5b94afb870fc9a25838782389f7b1f8b882fd
Closes-Bug: #1799236
This commit is to apply resource-constraints only to few OpenStack services.
Commit to apply constraints to other services will be made in coming commits.
Partially-Implements: blueprint resource-constraints
Change-Id: Icafa54baca24d2de64238222a5677b9d8b90e2aa