Commit Graph

37 Commits

Ian Wienand be992b3bb6
infra-prod: run job against linaro
We have access to manage the linaro cloud, but we don't want to
completely own the host as it has been configured with kolla-ansible;
so we don't want to take over things like name resolution, iptables
rules, docker installation, etc.

But we would like to manage some parts of it, like rolling out our
root users, some cron jobs, etc.  While we could just log in and do
these things, it doesn't feel very openinfra.

This allows us to have a group "unmanaged" that skips the base jobs.
The base playbook is updated to skip these hosts.

For now, we add a cloud-linaro prod job that just does nothing so we
can validate the whole thing.  When it's working, I plan to add a few
things as discussed above.

Change-Id: Ie8de70cbac7ffb9d727a06a349c3d2a3b3aa0b40
2023-03-15 12:00:25 +11:00
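A minimal sketch of the pattern described above; only the "unmanaged" group and the do-nothing cloud-linaro job come from the commit, the play layout and host names are assumptions:

```yaml
# base.yaml (sketch): base plays skip hosts in the "unmanaged" group
- name: Base server configuration
  hosts: "all:!unmanaged"
  roles:
    - base/server

# playbooks/service-cloud-linaro.yaml (hypothetical name): a no-op play
# against the linaro cloud, used to validate the wiring end to end
- name: Manage selected parts of the linaro cloud
  hosts: cloud-linaro
  tasks:
    - name: Placeholder until root users, cron jobs, etc. are added
      ansible.builtin.debug:
        msg: "cloud-linaro managed subset goes here"
```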
Monty Taylor d93a661ae4 Run iptables in service playbooks instead of base
It's the only part of base that's important to run when we run a
service. Run it in the service playbooks and get rid of the
dependency on infra-prod-base.

Continue running it in base so that new nodes are brought up
with iptables in place.

Bump the timeout for the mirror job, because the iptables addition
seems to have just bumped it over the edge.

Change-Id: I4608216f7a59cfa96d3bdb191edd9bc7bb9cca39
2020-06-04 07:44:22 -05:00
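The pattern reads roughly like this (a sketch; the playbook name and the mirror role are assumptions, only the iptables role itself comes from the commit):

```yaml
# service-mirror.yaml (sketch): the service playbook applies iptables
# itself, so it no longer depends on infra-prod-base having run first.
- name: Configure mirror
  hosts: mirror
  roles:
    - iptables   # keep firewall rules in place whenever the service deploys
    - mirror
```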
Monty Taylor e8716e742e Move base roles into a base subdir
If we move these into a subdir, it cleans up the number of things
we have to match files on.

Stop running disable-puppet-agent in base. We run it in run-puppet
which should be fine.

Change-Id: Ia16adb96b11d25a097490882c4c59a50a0b7b23d
2020-05-27 16:28:37 -05:00
Monty Taylor 67212c3ef2 Clean up base playbook
We're going to try using this in some other organizations, so
simplify things.

Add in a flush of handlers so that we don't have to split plays.
Remove the kubernetes group; it isn't actually a thing right now.

Change-Id: I26b21aa8ffca1ac5112136831aa7664d5c3becac
2020-05-27 16:28:37 -05:00
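The flush-handlers trick mentioned above can be sketched as a single play (role names are assumptions):

```yaml
- name: Base server configuration
  hosts: "all:!disabled"
  tasks:
    - name: Set up package repositories
      ansible.builtin.include_role:
        name: base/repos          # may notify an apt-cache-update handler

    - name: Fire pending handlers now instead of splitting into two plays
      ansible.builtin.meta: flush_handlers

    - name: Configure the server itself
      ansible.builtin.include_role:
        name: base/server
```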
Monty Taylor 8193a01742 Stop running mcollective
The puppet-agent package has added mcollective as a service that
just gets run, so we have a ton of servers, basically anything
that has been rebooted since the package update, that are running
an mcollective daemon that is trying and failing to connect to
the mcollective system that does not exist.

Add a disabling of the mcollective unit to the disable-puppet-agent
role. Also - add disable-puppet-agent everywhere. We'll re-remove
it later, but let's make sure we've got this turned off on all
of our non-puppet hosts now too.

Change-Id: I5c6235e88e5aac3fee85f58d762023c793635f42
2020-05-05 15:00:04 -05:00
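A sketch of the added task in the disable-puppet-agent role (the module choice and exact unit name are assumptions):

```yaml
# roles/disable-puppet-agent/tasks/main.yaml (sketch)
- name: Disable the mcollective daemon shipped with puppet-agent
  ansible.builtin.systemd:
    name: mcollective
    state: stopped
    enabled: false
```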
James E. Blair 8ad300927e Split the base playbook into services
This is a first step toward making smaller playbooks which can be
run by Zuul in CD.

Zuul should be able to handle missing projects now, so remove it
from the puppet_git playbook and move it into puppet.

Make the base playbook be merely the base roles.

Make service playbooks for each service.

Remove the run-docker job because it's covered by service jobs.

Stop testing in testinfra that puppet is installed. That test only
passes accidentally: the non-puppeted hosts happen to be bionic nodes,
and we don't install puppet on bionic. Instead, we can now rely on
actually *running* puppet when it's important, such as in the
eavesdrop job. Also remove the installation of puppet on the nodes in
the base job, since that only exercises a synthetic test of installing
puppet on nodes we don't use.

Don't run remote_puppet_git on gitea for now - it's too slow. A
followup patch will rework gitea project creation to not take hours.

Change-Id: Ibb78341c2c6be28005cea73542e829d8f7cfab08
2019-05-19 07:31:00 -05:00
Ian Wienand afd907c16d letsencrypt support
This change contains the roles and testing for deploying certificates
on hosts using letsencrypt with domain authentication.

From a top level, the process is implemented in the roles as follows:

1) letsencrypt-acme-sh-install

   This role installs the acme.sh tool on hosts in the letsencrypt
   group, along with a small custom driver script to help parse output
   that is used by later roles.

2) letsencrypt-request-certs

   This role runs on each host, and reads a host variable describing
   the certificates required.  It uses the acme.sh tool (via the
   driver) to request the certificates from letsencrypt.  It populates
   a global Ansible variable with the authentication TXT records
   required.

   If the certificate exists on the host and is not within the renewal
   period, it should do nothing.

3) letsencrypt-install-txt-record

   This role runs on the adns server.  It installs the TXT records
   generated in step 2 to the acme.opendev.org domain and then
   refreshes the server.  Hosts wanting certificates will have
   pre-provisioned CNAME records for _acme-challenge.host.opendev.org
   pointing to acme.opendev.org.

4) letsencrypt-create-certs

   This role runs on each host, reading the same variable as in step
   2.  However this time the acme.sh tool is run to authenticate and
   create the certificates, which should now work correctly via the
   TXT records from step 3.  After this, the host will have the
   full certificate material.

Testing is added via testinfra.  For testing purposes requests are
made to the staging letsencrypt servers and a self-signed certificate
is provisioned in step 4 (as the authentication is not available
during CI).  We test that the DNS TXT records are created locally on
the CI adns server, however.

Related-Spec: https://review.openstack.org/587283

Change-Id: I1f66da614751a29cc565b37cdc9ff34d70fdfd3f
2019-04-02 15:31:41 +11:00
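Put together, the four roles run in the order described; a sketch of a driving playbook (the role names are from the commit, the play layout is an assumption):

```yaml
- name: Install acme.sh and request certificates
  hosts: letsencrypt
  roles:
    - letsencrypt-acme-sh-install
    - letsencrypt-request-certs      # populates the TXT-record variable

- name: Publish the authentication TXT records
  hosts: adns
  roles:
    - letsencrypt-install-txt-record # writes acme.opendev.org records

- name: Create the certificates now that DNS can answer the challenge
  hosts: letsencrypt
  roles:
    - letsencrypt-create-certs
```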
Ian Wienand 490df68885 Use adns group in base.yaml
Although we only have adns01, for testing purposes it would be handy
to have another adns server in testinfra (this way, we can write tests
for letsencrypt paths that don't try and execute on the existing dns
testing paths).

Change-Id: Ie1968660c110bdb626df637f182f1f39598e59ac
2019-03-27 14:21:29 +11:00
James E. Blair 287eecd9d2 Run zuul-preview
Change-Id: Ib72e2bd29d1061822e0c16c201445115a5e5c58f
2019-02-25 13:14:51 -08:00
James E. Blair 4b031f9f24 Run an haproxy load balancer for gitea
This runs an haproxy which is strikingly similar to the one we
currently run for git.openstack.org, but it is run in a docker
container.

Change-Id: I647ae8c02eb2cd4f3db2b203d61a181f7eb632d2
2019-02-22 12:54:04 -08:00
James E. Blair 67cda2c7df Deploy gitea with docker-compose
This deploys a shared-nothing gitea server using docker-compose.
It includes a mariadb server.

Change-Id: I58aff016c7108c69dfc5f2ebd46667c4117ba5da
2019-02-18 08:46:40 -08:00
Zuul c820963613 Merge "Install kubectl on bridge" 2019-02-11 22:02:36 +00:00
James E. Blair 94d404a535 Install kubectl on bridge
With a snap package.  Because apparently that's how that's done.

Change-Id: I0462cc062c2706509215158bca99e7a2ad58675a
2019-02-11 10:16:58 -08:00
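A sketch of the task; the kubectl snap needs classic confinement, and the module name here is an assumption:

```yaml
- name: Install kubectl on bridge
  hosts: bridge
  tasks:
    - name: Install the kubectl snap
      community.general.snap:
        name: kubectl
        classic: true
```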
Monty Taylor 07edd9d297 Add opendev kubernetes nodes to ansible inventory
We want our base ansible roles to run on these nodes. However,
k8s-on-openstack manages firewall rules via openstack security
groups, so we don't want to run those there.

There was a discussion about making a minimal set of roles that
are run by default and then a group containing servers that got
the full set ... but that would require a duplicate entry for 99%
of our servers in the inventory, while the "only run a subset" is
the exception case.

Change-Id: I2cbf364305f758cecf11df41398d3d2c05222fda
2019-02-08 16:49:01 +00:00
James E. Blair 7610682b6f Configure .kube/config on bridge
Add the gitea k8s cluster to root's .kube/config file on bridge.

The default context does not exist in order to force us to explicitly
specify a context for all commands (so that we do not inadvertently
deploy something on the wrong k8s cluster).

Change-Id: I53368c76e6f5b3ab45b1982e9a977f9ce9f08581
2019-02-06 15:43:19 -08:00
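A sketch of the resulting .kube/config fragment; with no current-context set, every command must name its target explicitly (the context and user names here are hypothetical):

```yaml
contexts:
  - name: gitea
    context:
      cluster: gitea
      user: gitea-admin
current-context: ""
# so every invocation is explicit, e.g.:
#   kubectl --context gitea get pods
```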
James E. Blair 12709a1c8b Run a docker registry for CI
Change-Id: If9669bb3286e25bb16ab09373e823b914b645f26
2019-02-01 10:12:51 -08:00
Ian Wienand f07bf2a507 Import install-docker role
This is a role for installing docker on our control-plane servers.

It is based on install-docker from zuul-jobs.

Basic testinfra tests are added; because docker fiddles the iptables
rules in magic ways, the firewall testing is moved out of the base
tests and modified to partially match our base firewall configuration.

Change-Id: Ia4de5032789ff0f2b07d4f93c0c52cf94aa9c25c
2018-12-14 11:30:47 -08:00
James E. Blair 6368113ec9 Add kube config to nodepool servers
This adds connection information for an experimental kubernetes
cluster hosted in vexxhost-sjc1 to the nodepool servers.

Change-Id: Ie7aad841df1779ddba69315ddd9e0ae96a1c8c53
2018-11-28 16:24:53 -08:00
James E. Blair dae1a0351c Configure opendev nameservers using ansible
Change-Id: Ie6430053159bf5a09b2c002ad6a4f84334a5bca3
2018-11-02 13:49:38 -07:00
James E. Blair 90e6088881 Configure adns1.opendev.org server via ansible
Change-Id: Ib4d3cd7501a276bff62e3bc0998d93c41f3ab185
2018-11-02 13:49:38 -07:00
James E. Blair 310bc5696c Remove !ci-backup play
Let's abandon the idea that we'll treat the backup server specially.
As long as we allow *any* automated remote access via ansible, we
have opened the door to potential compromise of the backup systems
if bridge is compromised.  Rather than pretending that this separation
gives us any benefit, remove it.

Change-Id: I751060dc05918c440374e80ffb483d948f048f36
2018-09-07 11:00:22 -07:00
James E. Blair e4382df8af Move the !ci-backup play next to the rest of the servers
This is fundamental stuff, it should stay together.

Change-Id: I160512e02197f41a7ac5b63b844e62e1cf703e2e
2018-09-07 10:58:32 -07:00
James E. Blair 2eee43e627 Name plays in playbooks
In run_all, we start a bunch of plays in sequence, but it's difficult
to tell what they're doing until you see the tasks.  Name the plays
themselves to produce a better narrative structure.

Change-Id: I0597eab2c06c6963601dec689714c38101a4d470
2018-09-07 10:51:56 -07:00
James E. Blair c34860d166 Add a run-nodepool job
Change-Id: I9d0721a7db7f355683895fca5a2a5f152d147034
2018-09-05 15:52:36 -07:00
Clark Boylan 09288c7c37 Manage clouds.yaml files in ansible
This manages the clouds.yaml files in ansible so that we can get them
updated automatically on bridge.openstack.org (which does not puppet).

Co-Authored-By: James E. Blair <jeblair@redhat.com>
Depends-On: https://review.openstack.org/598378
Change-Id: I2071f2593f57024bc985e18eaf1ffbf6f3d38140
2018-09-04 08:49:00 -07:00
Monty Taylor eb086094a8 Install limestone CA on hosts using openstacksdk
In order to talk to limestone clouds we need to configure a custom CA.
Do this in ansible instead of puppet.

A followup should add writing out clouds.yaml files.

Change-Id: I355df1efb31feb31e039040da4ca6088ea632b7e
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2018-08-31 12:17:35 -07:00
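On Debian/Ubuntu hosts this kind of CA rollout typically looks like the sketch below; the file names and handler are assumptions:

```yaml
# tasks/main.yaml (sketch)
- name: Install the limestone CA certificate
  ansible.builtin.copy:
    src: limestone-ca.crt
    dest: /usr/local/share/ca-certificates/limestone-ca.crt
    mode: "0644"
  notify: update ca certificates

# handlers/main.yaml (sketch)
- name: update ca certificates
  ansible.builtin.command: update-ca-certificates
```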
Ian Wienand 7bd8117e2a Add install-puppet to base playbook
This role manages puppet on the host and should be applied to all
puppet hosts.

The dependent change is required to get the "puppet" group into the
generated inventory for the system-config-run-base test.

Change-Id: I0e18c53836baca743d32abf1bb4b7a3f63c025bb
Depends-On: https://review.openstack.org/596994
2018-08-28 17:54:09 +10:00
James E. Blair 3d166f99f6 Add unbound role
Add it to the base playbook and add a testinfra test for it.

Change-Id: Id5098f33aac213e6add6f061684d0214dc99ab5b
2018-08-27 13:29:18 -07:00
James E. Blair dceb09d886 Add snmpd role and add it to base
Change-Id: I00bf872e8504efb26d20832f1da82da8cfe87258
2018-08-27 07:34:36 -07:00
David Shrewsbury b3b698c6ff Add timezone role
Contains a handler to restart crond when tz is changed. Cron service
name differs across distros.

Removes the puppet-timezone usage.

Change-Id: I4e45d0e0ed37214ac491f373ff2d37750e720718
2018-08-27 07:34:28 -07:00
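A sketch of the handler wiring, with the distro-dependent service name pulled from a variable (variable and value names are assumptions):

```yaml
# tasks/main.yaml (sketch)
- name: Set the system timezone
  community.general.timezone:
    name: Etc/UTC
  notify: restart cron

# handlers/main.yaml (sketch)
- name: restart cron
  ansible.builtin.service:
    name: "{{ crond_service_name }}"  # "cron" on Debian/Ubuntu, "crond" on CentOS
    state: restarted
```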
Monty Taylor 15663daaf7 Add iptables role
Co-Authored-By: James E. Blair <corvus@inaugust.com>
Change-Id: Id8b347483affd710759f9b225bfadb3ce851333c
Depends-On: https://review.openstack.org/596503
2018-08-27 14:33:32 +00:00
Monty Taylor bab6fcad3c
Remove base.yaml things from openstack_project::server
Now that we've got base server stuff rewritten in ansible, remove the
old puppet versions.

Depends-On: https://review.openstack.org/588326
Change-Id: I5c82fe6fd25b9ddaa77747db377ffa7e8bf23c7b
2018-08-16 17:25:10 -05:00
Monty Taylor 0d1f235fce
Add exim config for firehose and storyboard
In order to get puppet out of the business of mucking with exim and
fighting ansible, finish moving the config to ansible.

This introduces a storyboard group that we can use to apply the exim
config across both servers. It also splits the base playbook so that we
can avoid running exim on the backup servers. And we set
purge_apt_sources the same as was set in puppet. We should probably
remove it though, since none of us have any clue why it's here.

Change-Id: I43ee891a9c1beead7f97808208829b01a0a7ced6
2018-08-15 15:11:48 -05:00
Monty Taylor 4cca3f8d2a
Add lists exim config to ansible
The mailing list servers have a more complex exim config. Put the
routers and transports into ansible variables.

While we're doing it, give the role variables an exim_ prefix - since 'routers'
as a global variable might be a little broad.

iteritems isn't a thing in python3, only items.

We need to escape the exim config containing ${if or{{ - the {{
looks like jinja, so wrap it in a {% raw %} block.

Getting the yaml indentation right for things here is non-trivial. Make
them strings instead.

Add a README.rst file - and use the zuul:rolevar construct in it,
because it's nice.

Change-Id: Ieccfce99a1d278440c5baa207479a1887898298e
2018-08-15 15:11:48 -05:00
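The escaping problem can be illustrated with a hypothetical exim router in a Jinja template; without the raw block, Jinja would try to parse the doubled braces:

```jinja
{% raw %}
smarthost:
  driver = manualroute
  route_list = * ${if or{{eq{$domain}{example.org}}} {smarthost.example.org}{}}
{% endraw %}
```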
Monty Taylor a4a134815c
Add exim role to base playbook
We want email to work.

Add a default value so that integration tests work - and update the
template so that if the value in the alias mapping is empty we don't
write out a half-formed alias.

Enable the epel repo on CentOS nodes in base-repos. This is done in
install_puppet.sh, but install_puppet.sh doesn't get run on ansible-only
nodes.

Change-Id: I68ad9f66c3b8672d9642c7764e50adac9cafdaf9
2018-08-13 09:20:36 -05:00
Monty Taylor d587307aaf
Make integration tests work
Split base playbook into two plays

The update apt-cache handler from base-repos needs to fire before we run
base-server. Split into two plays so that the handler will fire.

Fix use of first_found

For include_vars, using the lookup version of first_found requires being
explicit about the path to search in as well. We also need to use query
together with loop to get skip to work right.

Extract the list of file locations we search for distro- and
platform-specific variables into a variable so that we can reuse it
instead of copy-pasting.

The vim package is vim-nox on ubuntu and vim-minimal on debian.

ntpdate only needs to be enabled on boot, it does not need to be
immediately started. At least, that's what the old puppet was doing and
trying to start it immediately breaks centos integration tests.

emacs-nox is emacs23-nox on trusty.

Change-Id: If3db276a5f6a8f76d7ce8635da8d2cbc316af341
Depends-On: https://review.openstack.org/588326
2018-08-10 12:12:32 -05:00
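The first_found fix described above can be sketched like this; the explicit paths, query, and skip usage are what the commit describes, while the file names are assumptions:

```yaml
- name: Include distro- and platform-specific variables
  ansible.builtin.include_vars: "{{ item }}"
  loop: "{{ query('first_found', distro_lookup_params) }}"
  vars:
    distro_lookup_params:
      files:
        - "{{ ansible_distribution }}.{{ ansible_distribution_release }}.yaml"
        - "{{ ansible_distribution }}.yaml"
        - "{{ ansible_os_family }}.yaml"
        - "default.yaml"
      paths:
        - "{{ role_path }}/vars"
      skip: true    # don't fail when no file matches
```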
Monty Taylor 0bb4232586 Add base playbooks and roles to bootstrap a new server
We want to launch a new bastion host to run ansible on. Because we're
working on the transition to ansible, it seems like being able to do
that without needing puppet would be nice. This gets user management,
base repo setup and whatnot installed. It doesn't remove them from the
existing puppet, nor does it change the way we're calling anything that
currently exists.

Add bridge.openstack.org to the disabled group so that we don't try to
run puppet on it.

Change-Id: I3165423753009c639d9d2e2ed7d9adbe70360932
2018-08-01 14:57:44 -07:00