Commit Graph

29 Commits

Author SHA1 Message Date
Dmitriy Rabotyagov 2272de8f0c Fix linter issues
With the update of ansible-lint to version >=6.0.0, many new linters
were added that are enabled by default. To comply with the linter
rules, we're applying changes to the role.

This is a follow-up change to [1].

[1] https://review.opendev.org/c/openstack/openstack-ansible-lxc_hosts/+/888180

Change-Id: I2564e3dcb2efad8f6a2ed21bec61668c1b6f6209
2023-08-22 13:24:46 +02:00
Dmitriy Rabotyagov 3d8e3690ba Replace ifupdown with native ip-link
We also leverage systemd-networkd for managing lxc-net and replace
the custom service template for the lxc-dnsmasq service with our
systemd-service role. These changes are tightly coupled, so
it's quite hard to split them into different patchsets.

Depends-On: https://review.opendev.org/c/openstack/ansible-role-systemd_service/+/861350
Change-Id: I5ac99e2b6c6e6ccd9da18ae68e1f8801f95f4f4e
2022-11-11 09:57:56 +01:00
Dmitriy Rabotyagov 337ddf8780 Replace systemd-mount template with role
To reduce role complexity, we replace the separately maintained template
with the systemd_mount role, which is widely used across OSA.

Depends-On: https://review.opendev.org/c/openstack/openstack-ansible/+/836945
Change-Id: I23632f9c145be334b1d19067352f8b82114a1209
2022-04-07 11:40:09 +00:00
Jonathan Rosser c844e21a6e Ansible systemd module can reload units without specifying a service
Remove an old workaround for ansible <2.4

Change-Id: Iafa0ae54538be2690a813c05fadb472c15a01b5a
2022-02-02 03:59:07 -05:00
Jonathan Rosser 59abc5a288 Remove support for gentoo
OpenStack-Ansible does not maintain support for deploying on gentoo,
so we can simplify this ansible role.

Change-Id: If2a63a2743714745e0f0b0eea2ee3d5b8d4c9a35
2021-02-17 19:14:55 +00:00
Dmitriy Rabotyagov a6476c3f5b Increase amount of MaxSessions
To run a larger number of ansible forks, we need to increase
the ssh MaxSessions parameter for lxc hosts, since
all connections to lxc containers go through the hosts.

Depends-On: https://review.opendev.org/758399
Change-Id: Ib3e850ba79658a42995cd782a11342aca6858342
2020-10-15 13:05:11 +00:00
Matthew Thode 32d0a30c35 add gentoo support
Change-Id: Ieb1df06e6581601215851d78fb932a9d1e99e183
2019-02-22 19:43:59 -06:00
Mohammed Naser f2ac427403 prep: remove old machinectl workarounds
There are a few manual workarounds in place to work around old
versions of machinectl. However, we don't actually leverage them,
and they seem to be causing a dbus restart, which
causes extra problems.

This patch removes those workarounds in order to prevent restarting
dbus which causes the system to start timing out on systemd-logind.

Change-Id: I86483225754a5b1c6030ef21e2c0cdf2cd908c3b
Closes-Bug: #1807405
2018-12-07 10:37:21 -05:00
Jesse Pretorius 4f1db03d96 Make apt key import for Ubuntu a uniform process
In https://review.openstack.org/588962 the implementation
of the apt key store copy into the container was changed
for bionic, but left alone for xenial. This patch makes
the approach uniform across both distributions.

Change-Id: I79f49fd02be3bbee5f22cdde000b19578167e3ca
2018-08-25 21:18:42 +01:00
zhulingjie eeb21321f4 Remove the unnecessary space
Change-Id: If2badfdcbcab2fac3bb2250bf5e97bd139573e58
2018-07-11 23:16:36 -04:00
Jean-Philippe Evrard 15d4a21f4a Fix usage of "|" for tests
With the more recent versions of ansible, we should now use
"is" instead of the "|" sign for the tests.

This should fix it.
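
As a rough illustration of the kind of rewrite this change describes, the sketch below converts filter-style tests such as `result | failed` into `result is failed`. The function name and the list of test names are hypothetical and chosen for illustration; this is not the tooling used for the actual patch.

```python
import re

# Test names commonly written in the old filter style. This list is an
# illustrative assumption, not an exhaustive inventory.
LEGACY_TESTS = ("succeeded", "success", "failed", "changed", "skipped")

# Match "| <test>" only when <test> is a whole word, so real filters
# such as "| default(...)" are left alone.
_PATTERN = re.compile(r"\s*\|\s*(" + "|".join(LEGACY_TESTS) + r")\b")


def use_is_for_tests(expression: str) -> str:
    """Rewrite 'result | failed' style tests to 'result is failed'."""
    return _PATTERN.sub(r" is \1", expression)


print(use_is_for_tests("install_result | success"))
```

A plain regular expression like this would miss tests that take arguments or appear inside more complex expressions, so any real conversion should be reviewed by hand.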

Change-Id: I7ba6ca7d7c8a9bbaf85933370d0ced9931f9a34b
2018-07-12 17:03:19 +02:00
Zuul 3d5f38f23c Merge "Add Bionic testing"
2018-05-14 20:59:11 +00:00
Jean-Philippe Evrard 2910c5ad60 Add Bionic testing
Now that bionic testing is added into the tests repos, we can
start testing it in the repo.

cgmanager isn't in bionic, and therefore is removed

The service module isn't in bionic, and therefore it's been renamed to
"systemd".

The apparmor setup we were doing was breaking the required apparmor
profiles. While this worked in xenial, it breaks bionic. To fix this
we're just disabling the apparmor profiles instead of trying to
augment them through block file changes.

Depends-On: https://review.openstack.org/#/c/566959/
Change-Id: Ie4bca80d0dba7b0da0b5829b91cd6d815894aeaa
Co-Authored-By: Kevin Carter <kevin.carter@rackspace.com>
2018-05-14 21:04:09 +02:00
Kevin Carter bf9a79d05e Add mount options for better machinectl performance
The machinectl default options, while functional, could be tuned for
better overall performance. This change adds several options which will
ensure container workloads are using the least amount of storage with the
best possible performance.

For more information on the options being used see
 * https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs(5)#MOUNT_OPTIONS

All of the "machines" mount procedures have been moved into a unified
volume task file. This was done to ensure a consistent experience across
our supported distros. To ensure any new options are non-disruptive, the
mount handler has been changed to use "reload-or-restart", which will first
try to reload a mount instead of restarting it.

Change-Id: Ia962fd4c5bb2a73ddd884d3bb3837c47b43d6903
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-05-05 17:43:40 +00:00
Kevin Carter 7e98da3d0f Convert lxc_hosts role to use simple download URL
For a very long time we've been parsing and using the lxc images as
provided by upstream lxc. While these images are functional, they are by
no means optimal. In general they're quite a bit larger than they need
to be and contain a lot of little sharp edges that have cut us over
the years. This change removes all of the lxc image cache parsing and
meta-data linking and simply downloads the rootfs from a given url. To
maintain compatibility with the legacy images, a script has been created
to parse the image index and return the legacy image url.
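
The index-parsing step mentioned above could look roughly like the following sketch. It is not the script from this change. The semicolon-separated line layout (`distro;release;arch;variant;build;path`), the `rootfs.tar.xz` file name, the function name, and the `images.example.org` base URL are all assumptions made for illustration.

```python
from typing import Optional


def legacy_image_url(index_text: str, base_url: str, distro: str,
                     release: str, arch: str,
                     variant: str = "default") -> Optional[str]:
    """Return the rootfs URL of the newest matching build, or None."""
    newest = None  # (build, path) of the most recent match so far
    for line in index_text.splitlines():
        fields = line.strip().split(";")
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        d, r, a, v, build, path = fields
        if (d, r, a, v) != (distro, release, arch, variant):
            continue
        # Build stamps are assumed to sort chronologically as strings.
        if newest is None or build > newest[0]:
            newest = (build, path)
    if newest is None:
        return None
    return base_url.rstrip("/") + "/" + newest[1].strip("/") + "/rootfs.tar.xz"


SAMPLE_INDEX = """\
ubuntu;xenial;amd64;default;20180320_03:49;/images/ubuntu/xenial/amd64/default/20180320_03:49/
ubuntu;xenial;amd64;default;20180321_03:49;/images/ubuntu/xenial/amd64/default/20180321_03:49/
centos;7;amd64;default;20180321_02:16;/images/centos/7/amd64/default/20180321_02:16/
"""

print(legacy_image_url(SAMPLE_INDEX, "https://images.example.org",
                       "ubuntu", "xenial", "amd64"))
```

Returning `None` when nothing matches leaves it to the caller to decide whether to fail or fall back to a deployer-supplied URL.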

The result of this change:

* Access to a smaller, more optimal base image which is well known by the
  corresponding communities.

* Deployers now have the ability to set and forget the download url for an
  internal image instead of having to create a cache infrastructure
  compatible with the lxc download template.

* Any rootfs tarball will work as an image.

* Fewer tasks are executed and less memory is consumed resulting in faster
  deployment times.

* The base cache has a uniform meta-data setup giving all container
  types the same access to config, devices, and templating.

Change-Id: I1775e775bbb7fe86bdffdd8296c2cff5ebc5bac8
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-21 23:52:53 +00:00
Kevin Carter 44409262d2 Use local container meta-data
The current lxc meta-data process is one where we download an archive
from the upstream lxc images and store it locally on the host. While the
archive is small, this is a process that can break due to transient
networking issues and is an external dependency that we don't need.

The meta-data for the containers we build is all the same between
distros so it's easy to replicate and maintain as a local dependency.
This change creates a templates meta-data folder and stores our
required meta-data items within it. With this change we'll ensure
all containers are built with the same capabilities without requiring
access to an upstream repo and will improve the general speed of
deployment due to the task simplification and removal of an external
dependency.

Change-Id: I999d7068ce05645c477408fbd40556427c202a40
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-03-19 15:07:26 +00:00
Kevin Carter f179f21a66 Clean-up old systemd prep and allow machinectl to grow
The machinectl cache is currently set to 16G by default. If
multiple container images are imported into the cache, this may be too
small. This change sets the cache to "64G" by default, giving
the cache more room to grow.

This change also disables the quota system once the limit has been set.
The option `lxc_host_machine_quota_disabled` has been added to disable or
enable the quota system as needed. This is done after the default limit has
been set so an adequately sized sparse file can be created should it not
already exist.

> More documentation can be seen here [0] with regard to the set-limit
  option.

Because we support both modern and older systemd, the cache prep tasks
for old systemd have been updated so that deployers using earlier
versions of systemd can benefit from the ability to grow an existing
cache via playbook run.

[0] https://www.freedesktop.org/software/systemd/man/machinectl.html#set-limit%20%5BNAME%5D%20BYTES

Closes-Bug: #1745361
Change-Id: I85fefc6ce186bb6808ac37a9ea79a50e29671115
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2018-02-12 15:30:14 +00:00
Jean-Philippe Evrard 1ac0533d37 Fix ansible linting issues
These issues can show up when we bump ansible-lint to a more
recent version.

Change-Id: Ic8fcc374543419dbc7e299040e293fb78bc932fc
2017-12-07 13:37:02 +00:00
Kevin Carter 53a6cce9ed Use handlers to restart services and move dnsmasq to a unit file
These changes further optimise the lxc_host role so that it's using more
of the built in modules and making better use of handlers.

Moving the dnsmasq process to a unit file gives operators the ability to
restart the dnsmasq process if there's an issue with the service. It
also ensures the service stays running as systemd will take better care
of the service by isolating it within a specific cgroup, ensuring good
reporting and memory management, and providing the ability to recover
from failures in an automated way.

Closes-Bug: #1518485
Change-Id: I42d0caa3b12e70a3601c30051eefc067e81a71bb
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2017-11-01 15:19:22 -05:00
Kevin Carter 076493d014 Update role for efficiency and to make better use of modules
The LXC host role can be tuned up for better overall efficiency.

Highlights:
* Move async wait to a later position for role performance. The
  async wait we're doing can be moved elsewhere in the role so
  that we're able to do more in parallel. This change simply moves
  the async wait to a position just before it's required.
* Move container creation tasks into their own sub-files which are
  accessed using dynamic routing.
* Several syntactic items were cleaned up.
* All of the basic cache cleanup has been moved to handlers.

Closes-Bug: #1718979
Change-Id: I26eae11be8f7d5b691fbccd3d2fe1cfb21b8cf55
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2017-11-01 05:15:04 +00:00
Markos Chandras 82406ad958 systemd: Set a higher DefaultTasksMax value
systemd 228 introduced DefaultTasksMax which is used to control
the default TasksMax= setting for services and scopes running on the
system. (TasksMax= is the primary setting that exposes the "pids"
cgroup controller on systemd and was introduced in the previous
systemd release.) The setting now defaults to 512, which means
services that are not explicitly configured otherwise will only
be able to create 512 processes or threads at maximum, from this
version on. However, the 512 limit seems too strict and sometimes
leads to failures like the following one on busy containers

==> opensuse422: fatal: [container3]: FAILED! => {"changed": false, "cmd": "/usr/sbin/rabbitmqctl -q -n '' list_user_permissions guest", "failed": true, "msg": "/usr/sbin/rabbitmqctl: fork: retry: No child processes\n/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: Resource temporarily unavailable\n/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: No child processes\n/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: No child processes\nFailed to create thread: Resource temporarily unavailable (11)\r\nAborted (core dumped)", "rc": 134, "stderr": "/usr/sbin/rabbitmqctl: fork: retry: No child processes\n/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: Resource temporarily unavailable\n/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: No child processes\n/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: No child processes\nFailed to create thread: Resource temporarily unavailable (11)\r\nAborted (core dumped)\n", "stderr_lines": ["/usr/sbin/rabbitmqctl: fork: retry: No child processes", "/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: Resource temporarily unavailable", "/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: No child processes", "/usr/lib64/rabbitmq/lib/rabbitmq_server-3.6.6//sbin/rabbitmq-env: fork: retry: No child processes", "Failed to create thread: Resource temporarily unavailable (11)", "Aborted (core dumped)"], "stdout": "", "stdout_lines": []}

and with messages in the kernel log such as

[ 2925.999021] cgroup: fork rejected by pids controller in /init.scope/lxc/container1
[ 3083.704049] cgroup: fork rejected by pids controller in /init.scope/lxc/container2

As we can see, even though /init.scope/lxc/container1 has pids.max set to 'max', /init.scope
has pids.max set to 512, and in cgroups the lowest boundary always
applies:

~> cat /sys/fs/cgroup/pids/init.scope/lxc/container1/pids.max
max
~> cat /sys/fs/cgroup/pids/init.scope/pids.max
512

As a result, the 512 limit is enforced.
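
The hierarchy rule described above can be modelled with a small sketch: the effective limit for a cgroup is the smallest `pids.max` found on the path from the root down to that cgroup. The function name and the dictionary standing in for the cgroup filesystem are hypothetical; the values mirror the `cat` output shown above.

```python
import math


def effective_pids_max(limits, cgroup):
    """Return the tightest pids.max on the path from the root to cgroup.

    `limits` maps cgroup paths to the raw contents of their pids.max
    file; paths missing from the mapping are treated as "max".
    """
    effective = math.inf
    parts = [p for p in cgroup.split("/") if p]
    for depth in range(1, len(parts) + 1):
        path = "/" + "/".join(parts[:depth])
        raw = limits.get(path, "max")
        value = math.inf if raw == "max" else int(raw)
        effective = min(effective, value)
    return effective


limits = {
    "/init.scope": "512",
    "/init.scope/lxc/container1": "max",
}
print(effective_pids_max(limits, "/init.scope/lxc/container1"))  # 512
```

Raising the value at `/init.scope` in this model raises the effective limit for every container beneath it, which is why a higher default at that level is enough.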

As such, we add a new variable to make this limit configurable. The
default limit has now been increased to 8192.

Change-Id: I8b4143aac84d4c795cab9c0d978c9a97ebea1793
2017-06-22 08:51:20 +01:00
Jesse Pretorius f37dffd33c Remove unnecessary handler
The base container is not created using the LXC
tooling, so the handler is now superfluous.

Change-Id: I2e025fab3df4980579ec318a7676a4a0832576b1
2017-05-02 14:39:06 +01:00
Andy McCrae 1eddafec76 Remove Trusty support from lxc_hosts role
Change-Id: I90afc3bbec9eaaaeef76efc5e0a3ca2e9cf87ef4
Implements: blueprint trusty-removal
2016-12-15 13:13:45 +00:00
Jimmy McCrory 3f0c8f642f Ensure apparmor is running before reloading
The 'Reload apparmor' handler can fail if the apparmor service is not
already in a running state. Add an additional handler to ensure that
apparmor is started and enabled on boot.

Change-Id: If2752d69beb2c646a64f2ca02ce39a0d4161a5b5
2016-11-04 09:25:13 -07:00
Qin Wang 75d32df1dc removed redundant handler and flushed handler right away
Loading the lxc-openstack profile into apparmor is done by reloading the service,
so the redundant lxc-openstack load handler is removed.
The reload handler is flushed right away in case execution is interrupted.

Change-Id: I7a0e9d886808e0949a0e8301c6a5ea2994c6cd49
closes-bug: 1620757
2016-09-08 20:25:04 +00:00
Jimmy McCrory 6c12d17fed Fix generation of LXC hostnames
The LXC download template sets hostnames within containers by an
in-place string replacement of 'LXC_NAME' in /etc/hosts and
/etc/hostname with the given container name.

Create the base cache container image with the name 'LXC_NAME' so that
this in-place text replacement happens and containers are created
with the expected hostnames.
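
The placeholder mechanism described above amounts to a plain string substitution. The sketch below models it on in-memory file contents; the function name and the sample file contents are assumptions for illustration and are not taken from the LXC download template.

```python
def render_hostname_files(files, container_name, placeholder="LXC_NAME"):
    """Return a copy of `files` with the placeholder replaced by the name."""
    return {
        path: text.replace(placeholder, container_name)
        for path, text in files.items()
    }


# Assumed contents of the base cache image, built with the name 'LXC_NAME'.
cache = {
    "/etc/hostname": "LXC_NAME\n",
    "/etc/hosts": "127.0.0.1 localhost\n127.0.1.1 LXC_NAME\n",
}

rendered = render_hostname_files(cache, "container1")
print(rendered["/etc/hosts"])
```

If the base image had been built under any other name, the placeholder would not appear in these files and the substitution would silently do nothing, which is the failure this change addresses.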

Change-Id: I851f29d8feebc41e9bcbc1866bba1782c6727d6a
2016-05-04 14:14:59 -07:00
Kevin Carter f5542103b3 Changed for lxc-host setup/build for multi-distro
This change updates the lxc-host setup role to build the lxc cache using the
download template based on default images found here: [0]. These images are
upstream builds from the greater LXC/D community.

This update adds support for Ubuntu 14.04, 16.04 and RHEL/CentOS 7 container
types, and the cache will be generated from the host operating system.

[0] - https://images.linuxcontainers.org/

Change-Id: Ie13be2322d28178760481c59805101d6aeef4f36
Co-Authored-By: Jesse Pretorius <jesse.pretorius@rackspace.co.uk>
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2016-05-03 08:49:54 -05:00
Jesse Pretorius 47de991138 Fix apparmor profile load handler
The apparmor load handler has an incorrectly spelled profile name.

Change-Id: Ie38288702a5f92388e9253b3ed220f7459ea8fa4
2016-03-04 23:44:48 +00:00
Kevin Carter eb9f3d858b IRR for lxc_host
This change moves the lxc_host role out of the main
repository and into its own standalone repository.

Items within this change:
  * The role has been updated to ensure it runs standalone.
  * Tests added to the role within tox.
  * Functional tests added to the role that can either be run
    via the run_tests.sh script or using tox.
  * dev requirements have been updated for testing usecases.
  * Docs added to both the README.rst file as well as the docs
    folder.

Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
2015-11-03 04:22:57 -06:00