Enable quota system and set qgroups

This change sets up the machinectl quota system and qgroups when
they're enabled and available. It resolves an issue where
machinectl-based containers using a loopback file system spam dmesg
with the following:

* BTRFS error (device loop0): could not find root $INT

While various upstream sources say this error is benign[0], it raises
an inconsistency flag within the host system and is suspected to be the
cause of the inconsistent read-only/full-FS issues we've seen in the
integrated gate. Once the qgroups are properly set up, the system will
clear the inconsistency flag and the message spam will stop.

* BTRFS info (device loop0): qgroup scan completed (inconsistency flag cleared)

To resolve this issue the quota system is being enabled by default
and unlimited qgroups are being set up to ensure we're not running
into file system limitations. This change essentially acknowledges
the built-in quota system and provides the ability to define specific
quota (qgroup) options as necessary. While many deployers may never
use these options or this tooling, the role will now properly set
everything up should it ever be needed.
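
One way to check the resulting state on a host is a pair of read-only
Ansible tasks like the sketch below. They are not part of this change. The
sketch assumes the `btrfs` tool is installed and that `/var/lib/machines` is
a mounted BTRFS file system. The task names and the `qgroup_report` variable
are hypothetical.

```yaml
# Hypothetical verification tasks; not part of the role.
- name: Show qgroup limits on the machines mount
  # -r prints the referenced-size limit, -e prints the exclusive-size limit.
  command: "btrfs qgroup show -r -e /var/lib/machines"
  changed_when: false
  # Do not fail the play when quotas are disabled or unavailable.
  failed_when: false
  register: qgroup_report

- name: Print the qgroup report
  debug:
    var: qgroup_report.stdout_lines
```

With the default value of "none", the limit columns should indicate that no
limit is set.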

[0] https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1651435
Closes-Bug: #1753790
Change-Id: I34a41ac8a9fe4419254284c83f4600efee274c04
Signed-off-by: Kevin Carter <kevin.carter@rackspace.com>
Kevin Carter 2018-05-14 21:53:21 -05:00 committed by Kevin Carter (cloudnull)
parent 3d5f38f23c
commit 2971b787ac
5 changed files with 80 additions and 17 deletions

@@ -42,8 +42,17 @@ lxc_host_machine_volume_size: |-
 {%- endfor -%}
 {{ mounts[0] }}
-# Disable the machinctl quota system.
-lxc_host_machine_quota_disabled: true
+# Enable or Disable the BTRFS quota system for the "/var/lib/machines" mount
+# point. More information on the BTRFS quota system can be found here:
+# * https://btrfs.wiki.kernel.org/index.php/Quota_support
+lxc_host_machine_quota_disabled: false
+# Set the default qgroup limits used for file system quotas. The default is
+# "none". See the following documentation for more information:
+# * https://btrfs.wiki.kernel.org/index.php/Manpage/btrfs-qgroup
+lxc_host_machine_qgroup_space_limit: none
+lxc_host_machine_qgroup_compression_limit: none
 # DefaultTasksMax systemd value. It's not recommended to change this value as it
 # could prevent new processes from starting on busy containers.

@@ -0,0 +1,13 @@
+---
+features:
+- An option to disable the ``machinectl`` quota system has been changed. The
+variable ``lxc_host_machine_quota_disabled`` is a Boolean with a default of
+**false**. When this option is set to **true** it will disable the
+``machinectl`` quota system.
+- The options ``lxc_host_machine_qgroup_space_limit`` and
+``lxc_host_machine_qgroup_compression_limit`` have been added allowing a
+deployer to set **qgroup** limits as they see fit. The default value for
+these options is "none" which is effectively **unlimited**. These options
+accept any nominal size value followed by the single letter type, example
+``64G``. These options are only effective when the option
+``lxc_host_machine_quota_disabled`` is set to **false**.
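
As an illustration of the options described in these notes, a deployer
override might look like the following. This is a sketch only: where the
override lives depends on the deployment, and the `64G` value is an arbitrary
example.

```yaml
# Example override values (illustrative only).
# Keep the quota system enabled so the qgroup limits take effect.
lxc_host_machine_quota_disabled: false
# Passed by the role to "btrfs qgroup limit -e" (exclusive space limit).
lxc_host_machine_qgroup_space_limit: 64G
# Passed by the role to "btrfs qgroup limit -c"; "none" leaves it unlimited.
lxc_host_machine_qgroup_compression_limit: none
```

Setting `lxc_host_machine_quota_disabled` to true instead would make the role
skip the qgroup limit tasks entirely.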

@@ -50,6 +50,24 @@
 - include_tasks: "lxc_cache_preparation_systemd_{{ (systemd_version.stdout_lines[0].split()[-1] | int > 219) | ternary('new', 'old') }}.yml"
+- name: Set the qgroup limits
+block:
+- name: Set the qgroup size|compression limits on machines
+command: "btrfs qgroup limit {{ item }} {{ lxc_image_cache_path }}"
+changed_when: false
+with_items:
+- "-e {{ lxc_host_machine_qgroup_space_limit }}"
+- "-c {{ lxc_host_machine_qgroup_compression_limit }}"
+when:
+- not lxc_host_machine_quota_disabled
+rescue:
+- name: Notice regarding quota system
+debug:
+msg: >-
+There was an error processing the setup of qgroups. Check the system
+to ensure they're available otherwise disable the quota system by
+setting `lxc_host_machine_quota_disabled` to true.
 - block:
 - name: Generate apt keys from LXC host for the container cache
 shell: "apt-key exportall"

@@ -24,3 +24,21 @@
 retries: 3
 delay: 10
 until: cache_download|success
+- name: Set the qgroup limits
+block:
+- name: Set the qgroup size|compression limits on machines
+command: "btrfs qgroup limit {{ item }} /var/lib/lxc/{{ lxc_container_base_name }}"
+changed_when: false
+with_items:
+- "-e {{ lxc_host_machine_qgroup_space_limit }}"
+- "-c {{ lxc_host_machine_qgroup_compression_limit }}"
+when:
+- not lxc_host_machine_quota_disabled
+rescue:
+- name: Notice regarding quota system
+debug:
+msg: >-
+There was an error processing the setup of qgroups. Check the system
+to ensure they're available otherwise disable the quota system by
+setting `lxc_host_machine_quota_disabled` to true.

@@ -61,22 +61,27 @@
 - meta: flush_handlers
-- name: Disable the machinectl quota system
-command: "btrfs quota {{ lxc_host_machine_quota_disabled | bool | ternary('disable', 'enable') }} /var/lib/machines"
-args:
-executable: /bin/bash
-failed_when: false
-register: machines_create
-tags:
-- skip_ansible_lint
+- name: Update quota system and group limits
+block:
+- name: Disable|Enable the machinectl quota system
+command: "btrfs quota {{ lxc_host_machine_quota_disabled | bool | ternary('disable', 'enable') }} /var/lib/machines"
+changed_when: false
-- name: Notice quota system was not disabled
-debug:
-msg: >-
-The machinectl quota system could not be disabled. This typically
-means it is already off or not available on the system.
-when:
-- machines_create.rc != 0
+- name: Set the qgroup size|compression limits on machines
+command: "btrfs qgroup limit {{ item }} /var/lib/machines"
+changed_when: false
+with_items:
+- "-e {{ lxc_host_machine_qgroup_space_limit }}"
+- "-c {{ lxc_host_machine_qgroup_compression_limit }}"
+when:
+- not lxc_host_machine_quota_disabled | bool
+rescue:
+- name: Notice regarding quota system
+debug:
+msg: >-
+The machinectl quota system could not be setup. Check the system for
+quota system availability otherwise disable it by setting
+`lxc_host_machine_quota_disabled` to true.
 # NOTE(cloudnull): Because the machines mount may be a manually created sparse
 # file we run an online resize to ensure the machines mount is