200 Commits

Author SHA1 Message Date
Monty Taylor 83ced7f6e6 Split inventory into multiple dirs and move hostvars
Make inventory/service for service-specific things, including the
groups.yaml group definitions, and inventory/base for hostvars
related to the base system, including the list of hosts.

Move the existing host_vars into inventory/service, since most of
them are likely service-specific. Move group_vars/all.yaml into
base/group_vars as almost all of it is related to base things,
with the exception of the gerrit public key.

A followup patch will move host-specific values into equivalent
files in inventory/base.

This should let us override hostvars in gate jobs. It should also
allow us to do better file matchers - and to be able to organize
our playbooks more if we want to.

Depends-On: https://review.opendev.org/731583
Change-Id: Iddf57b5be47c2e9de16b83a1bc83bee25db995cf
2020-06-04 07:44:36 -05:00
Zuul 3f61433c59 Merge "Generate ssl check list directly from letsencrypt variables" 2020-05-28 23:31:11 +00:00
Monty Taylor 67212c3ef2 Clean up base playbook
We're going to try using this in some other organizations, so
simplify things.

Add in a flush handlers so that we don't have to split plays.
Remove kubernetes group, this isn't actually a thing right now.

Change-Id: I26b21aa8ffca1ac5112136831aa7664d5c3becac
2020-05-27 16:28:37 -05:00
Clark Boylan eb22e01f31 Add support for multiple jvbs behind meetpad
The jitsi video bridge (jvb) appears to be the main component we'll need
to scale up to handle more users on meetpad. Start preliminary
ansiblification of scale out jvb hosts.

Note this requires each new jvb to run on a separate host as the jvb
docker images seem to rely on $HOSTNAME to uniquely identify each jvb.

Change-Id: If6d055b6ec163d4a9d912bee9a9912f5a7b58125
2020-05-20 13:41:30 -07:00
James E. Blair 085856e318 Add iptables_extra_allowed_groups
This adds a new variable for the iptables role that allows us to
indicate all members of an ansible inventory group should have
iptables rules added.

It also removes the unused zuul-executor-opendev group, and some
unused variables related to the snmp rule.

Also, collect the generated iptables rules for debugging.

Change-Id: I48746a6527848a45a4debf62fd833527cc392398
Depends-On: https://review.opendev.org/728952
2020-05-20 13:18:29 -07:00
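
The group expansion described in the commit above can be pictured roughly as follows. This is a minimal Python sketch, not the iptables role's actual template: the entry shape (`group`, `protocol`, `port`), the inventory mappings, and the emitted rule format are all assumptions made for illustration.

```python
def expand_allowed_groups(extra_allowed_groups, groups, host_ips):
    """Return one iptables rule string per (group member, allowed port).

    All three argument shapes are hypothetical:
      extra_allowed_groups: list of {"group", "protocol", "port"} dicts
      groups: mapping of inventory group name -> list of host names
      host_ips: mapping of host name -> IPv4 address
    """
    rules = []
    for entry in extra_allowed_groups:
        # Unknown groups expand to no rules rather than raising.
        for host in groups.get(entry["group"], []):
            rules.append(
                "-A INPUT -p {proto} -m {proto} -s {ip} "
                "--dport {port} -j ACCEPT".format(
                    proto=entry["protocol"],
                    ip=host_ips[host],
                    port=entry["port"],
                )
            )
    return rules
```

Adding a host to the inventory group would then be enough to open the port for it on the next run, with no per-host rule to maintain.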
Ian Wienand c9215801f0 Generate ssl check list directly from letsencrypt variables
This autogenerates the list of ssl domains for the ssl-cert-check tool
directly from the letsencrypt list.

The first step is the install-certcheck role that replaces the
puppet-ssl_cert_check module that does the same.  The reason for this
is so that during gate testing we can test this on the test
bridge.openstack.org server, and avoid adding another node as a
requirement for this test.

letsencrypt-request-certs is updated to set a fact
letsencrypt_certcheck_domains for each host that is generating a
certificate.  As described in the comments, this defaults to the first
host specified for the certificate and the listening port can be
indicated (if set, this new port value is stripped when generating
certs as it is not necessary for certificate generation).

The new letsencrypt-config-certcheck role runs and iterates all
letsencrypt hosts to build the final list of domains that should be
checked.  This is then extended with the
letsencrypt_certcheck_additional_domains value that covers any hosts
using certificates not provisioned by letsencrypt using this
mechanism.

These additional domains are pre-populated from the openstack.org
domains in the extant check file, minus those openstack.org domain
certificates we are generating via letsencrypt (see
letsencrypt-create-certs/handlers/main.yaml).  Additionally, we
update some of the certificate variables in host_vars that are
listening on ports other than 443.

As mentioned, bridge.openstack.org is placed in the new certcheck
group for gate testing, so the tool and config file will be deployed
to it.  For production, cacti is added to the group, which is where
the tool currently runs.  The extant puppet installation is disabled,
pending removal in a follow-on change.

Change-Id: Idbe084f13f3684021e8efd9ac69b63fe31484606
2020-05-20 14:27:14 +10:00
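
The aggregation described in the commit above could look something like this. It is a minimal Python sketch rather than the roles' actual Ansible tasks: the `host:port` spelling for a non-default port and the `host port` output line format are assumptions, while the variable names `letsencrypt_certcheck_domains` and `letsencrypt_certcheck_additional_domains` come from the commit message.

```python
def split_port(name, default_port=443):
    """Split 'host[:port]' into (host, port); the syntax is an assumption."""
    host, sep, port = name.partition(":")
    return host, (int(port) if sep else default_port)


def cert_names(names):
    """Names as they would be requested for a certificate: ports stripped."""
    return [split_port(n)[0] for n in names]


def build_certcheck_list(hostvars, additional_domains):
    """Collect check entries from every letsencrypt host, then extend them
    with the additional (non-letsencrypt) domains, dropping duplicates."""
    entries = []
    for host in sorted(hostvars):
        entries.extend(hostvars[host].get("letsencrypt_certcheck_domains", []))
    entries.extend(additional_domains)

    seen, lines = set(), []
    for name in entries:
        pair = split_port(name)
        if pair not in seen:
            seen.add(pair)
            lines.append("%s %d" % pair)
    return lines
```

Here `hostvars` stands in for the per-host facts set by letsencrypt-request-certs, and `additional_domains` for the value of `letsencrypt_certcheck_additional_domains`.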
Ian Wienand 45201f3d66 Remove puppet mirror support
Remove the separate "mirror_opendev" group and rename it to just
"mirror".  Update various parts to reflect that change.

We no longer deploy any mirror hosts with puppet, so remove the various
configuration files.

Depends-On: https://review.opendev.org/728345
Change-Id: Ia982fe9cb4357447989664f033df976b528aaf84
2020-05-16 10:14:25 +10:00
Zuul 99f809ccc5 Merge "Use zuul checkouts of ansible roles from other repos" 2020-05-07 18:41:21 +00:00
Zuul 502ddff9b3 Merge "Test zuul-executor on focal" 2020-05-07 17:53:20 +00:00
Monty Taylor 39495ffdd5 Test zuul-executor on focal
We want to replace the current executors with focal executors.
Make sure zuul-executor can run there.

Kubic is apparently the new source for libcontainers stuff:

  https://podman.io/getting-started/installation.html

Use only timesyncd on focal

ntp and timesyncd have a hard conflict with each other. Our test
images install ntp. Remove it and just stay with timesyncd.

Change-Id: I0126f7c77d92deb91711f38a19384a9319955cf5
2020-05-06 18:00:29 -05:00
Monty Taylor 4b9d1a88bd Use zuul checkouts of ansible roles from other repos
We have two standalone roles, puppet and cloud-launcher, but we
currently install them with galaxy so depends-on patches don't
work. We also install them every time we run anything, even if
we don't need them for the playbook in question.

Add two roles, one to install a set of ansible roles needed by
the host in question, and the other to encapsulate the sequence
of running puppet, which now includes installing the puppet
role, installing puppet, disabling the puppet agent and then
running puppet.

As a followup, we'll do the same thing with the puppet modules,
so that we aren't cloning and rsyncing ALL of the puppet modules
all the time no matter what.

Change-Id: I69a2e99e869ee39a3da573af421b18ad93056d5b
2020-04-30 12:39:12 -05:00
Monty Taylor e0619f17f1 Run nodepool launchers with ansible and containers
We don't run start in prod normally but we do need to run
it in the gate.

Change-Id: Iec50684280409eb978bf5638bf74ae16fad8aa26
2020-04-30 17:37:22 +00:00
Monty Taylor 8d7075b02f Run zookeeper cluster in nodepool jobs
Rather than running a local zookeeper, just run a real zookeeper.
Also, get rid of nb01-test and just use nb04 - what could possibly
go wrong?

Dynamically write zookeeper host information to nodepool.yaml

So that we can run an actual zk using the new zk role on hosts in
ansible inventory, we need to write out the ip addresses of the
hosts that we build in zuul. This means having the info baked in
to the file in project-config isn't going to work.

We can do this in prod too, it shouldn't hurt anything.

Increase timeout for run-service-nodepool

We need to fix the playbook, but we'll do that after we get the
puppet gone.

Change-Id: Ib01d461ae2c5cec3c31ec5105a41b1a99ff9d84a
2020-04-29 16:18:25 -05:00
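
Writing the zookeeper host information dynamically, as the commit above describes, amounts to turning inventory addresses into a config fragment. The following is a minimal Python sketch under stated assumptions: the `zookeeper-servers` key with `host` and `port` entries is assumed to match what nodepool.yaml expects, and the function names and input mappings are hypothetical. The real change does this with Ansible, not Python.

```python
def zookeeper_servers(zk_hosts, host_ips, port=2181):
    """Build the server list from inventory host names and their addresses."""
    return [{"host": host_ips[name], "port": port} for name in zk_hosts]


def render_zookeeper_block(servers):
    """Render the list as a YAML fragment without needing a YAML library."""
    lines = ["zookeeper-servers:"]
    for server in servers:
        lines.append("  - host: %s" % server["host"])
        lines.append("    port: %d" % server["port"])
    return "\n".join(lines) + "\n"
```

Because the addresses are looked up at run time, the same template serves both the gate, where hosts are built by zuul, and production.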
Zuul b21a8e58cf Merge "Run Zuul using Ansible and Containers" 2020-04-24 16:31:42 +00:00
Monty Taylor f0b77485ec Run Zuul using Ansible and Containers
Zuul is publishing lovely container images, so we should
go ahead and start using them.

We can't use containers for zuul-executor because of the
docker->bubblewrap->AFS issue, so install from pip there.

Don't start any of the containers by default, which should
let us safely roll this out and then do a rolling restart.
For things (like web or mergers) where it's safe to do so,
a followup change will swap the flag.

Change-Id: I37dcce3a67477ad3b2c36f2fd3657af18bc25c40
2020-04-24 09:18:44 -05:00
Monty Taylor d5c68c5131 Split codesearch into its own playbook
Make a service playbook, manifest and jobs for codesearch.

Remove openstack_project::server - it doesn't do anything.

Change-Id: I44c140de4ae0b283940f8e23e8c47af983934471
2020-04-21 13:18:28 -05:00
Monty Taylor 711295e918 Remove old etherpad.openstack.org
Once the DNS is swapped over to point at etherpad.opendev.org
we can delete the old stuff.

Change-Id: I626dd22b22a23619fcf460533336f1ddfec615d9
2020-04-19 10:58:46 -05:00
James E. Blair 42574b2b37 Run ZK from containers
Migration plan:
* add zk* to emergency
* copy data files on each node to a safe place for DR backup
* make a json data backup: zk-shell localhost:2181 --run-once 'mirror / json://!tmp!zookeeper-backup.json/'
* manually run a modified playbook to set up the docker infra without starting containers
* rolling restart; for each node:
  * stop zk
  * split data and log files and move them to new locations
  * remove zk packages
  * start zk containers
* remove from emergency; land this change.

Change-Id: Ic06c9cf9604402aa8eb4bb79238021c14c5d9563
2020-04-17 08:43:09 -07:00
Zuul 135a6a721e Merge "Back up a single gitea backend" 2020-04-14 20:33:27 +00:00
Monty Taylor 2ee77458a8 Back up a single gitea backend
We need to keep at least one of these databases.

Change-Id: Ic734498fbada70856f62de972d7863df472966e5
2020-04-13 08:53:16 -05:00
Monty Taylor 428c423548 Turn backup server back off
Change-Id: I988d6391672053e87722b2f0a10e98c0fa783c40
2020-04-10 13:46:29 -05:00
Monty Taylor 59679d009b Run ansible on the backup server
We need to pulse on the backup server to register etherpad.opendev.org.

Change-Id: Iaec41b1183373bd832dae70af4ae04dfb5bde263
2020-04-10 13:46:29 -05:00
Monty Taylor ca5549fc6c Add review and etherpad to backup group
We should probably back these up.

Change-Id: I1e174273faefacea98ebece7a90a1baf93d52245
2020-04-10 13:46:25 -05:00
Monty Taylor b23515c623 Make a new dockerized etherpad.opendev.org
Upstream likes building the settings file into the image, but that's
less exciting, let's bind-mount ours in.

Depends-On: https://review.opendev.org/717491/
Change-Id: Ia1894d884ef2a84e1282345b77fe07bf8898f367
2020-04-07 11:10:57 -05:00
Zuul 1fd2e226ab Merge "Remove inventory references to <static|files>.openstack.org" 2020-03-31 21:47:47 +00:00
Ian Wienand 476c3ac6f2 Remove inventory references to <static|files>.openstack.org
These hosts have been removed; remove the old references and
unnecessary groups, add the new host to cacti.

Change-Id: Ibcfd78a37e20e514c190ef801c2d44320c8b3f74
Story: #2006598
2020-04-01 07:49:02 +11:00
Zuul 70e2828ce4 Merge "Remove files from letsencrypt group" 2020-03-31 07:39:36 +00:00
Zuul ce3a064133 Merge "Add meetpad server" 2020-03-27 14:44:30 +00:00
Monty Taylor a72ad58d5a Remove files from letsencrypt group
It got missed in an earlier cleanup.

Change-Id: If795fcb6637492518fe2ca2cd37ca6cb41afb101
2020-03-26 07:19:37 -05:00
Ian Wienand f55580fbf0 Remove files02.openstack.org and related puppet
All this has moved to static.opendev.org; the server can now be
removed.

Change-Id: I8ca5d7a206e950c28bb8372a85b6a62d6b9ba00c
2020-03-26 10:36:13 +11:00
James E. Blair 8b093dacd5 Add meetpad server
Depends-On: https://review.opendev.org/714189
Change-Id: I5863aaa805a18f9085ee01c3205b0f9ad602922d
2020-03-25 07:44:24 -07:00
Monty Taylor d3c8c1077b Switch to running gerrit via ansible+containers
This should be mostly a no-op - but we will need to do a shutdown
in emergency mode.

Tell the gerrit role to not run compose up when run as part of
remote_puppet_git.

Change-Id: Id45376c2697656a12afeacf317b6f26c85c08dad
2020-03-19 17:21:39 -05:00
Ian Wienand b1bfee423b nodepool-builder: Add webserver
This adds the webserver that serves the logs and generated images.

Change-Id: I230f5291e0bd928af2e00966d76c3f385b749cb6
2020-03-11 09:16:31 +11:00
Ian Wienand 281425a44d Add initial Ansible for nodepool hosts
This is a start at ansible-deployed nodepool environments.

We rename the minimal-nodepool element to nodepool-base-legacy, and
keep running that for the old nodes.

The groups are updated so that only the .openstack.org hosts will run
puppet.  Essentially they should remain unchanged.

We start a nodepool-base element that will replace the current
puppet-<openstackci|nodepool> deployment parts.  For step one, this
grabs project-config and links in the elements and config file.

A testing host is added for gate testing which should trigger these
roles.  This will build into a full deployment test of the builder
container.

Change-Id: If0eb9f02763535bf200062c51a8a0f8793b1e1aa
Depends-On: https://review.opendev.org/#/c/710700/
2020-03-06 14:02:52 +11:00
Monty Taylor 083cbf2911 Get LE certs for review.o.o
We have LE dns entries for review.o.o, but we're not actually
requesting the cert. Go ahead and request it - it'll make the
apache config easier to sort out.

Get the openstack.org certs for review-dev while we're at it.

Change-Id: I91d06c97993ba37204bd1fc326ae823e1b9c0c1a
Depends-On: https://review.opendev.org/707267
Depends-On: https://review.opendev.org/707255
2020-02-11 17:01:43 -06:00
Monty Taylor cc619fe589 Add review-dev01.opendev.org
Add a new review-dev server on the opendev domain with LE support
enabled.

Depends-On: https://review.opendev.org/705661
Change-Id: Ie32124cd617e9986602301f230e83bb138524fdf
2020-02-05 09:58:25 -06:00
Ian Wienand c3c96d3797 Add Linaro US cloud
Add the credentials for the newly provisioned us.linaro.cloud cloud

Change-Id: I0b81a8eeabec4e0b00258dc4e499c1d449b21681
2020-01-22 06:44:01 +11:00
Ian Wienand f5b5ee9336 Add roles for a basic static server
Basic implementation of the opendev static server, described in

 https://docs.opendev.org/opendev/infra-specs/latest/specs/retire-static.html

Change-Id: Ie1b92f06b71aa6069fe831b26ba1cc272ce4562c
Story: #2006598
Task:  #37757
2020-01-16 14:10:08 +11:00
Monty Taylor 6f3a2792cc Switch to ansible on review-dev
The review-dev service playbook should do everything now that
the puppet did. Update how we're running things.

Change-Id: I70303c48328ea6713c24bf9c6f63d4808d30b95c
2020-01-14 12:04:15 -06:00
Clark Boylan 3deef00ba9 Manage insecure-ci-registry cert with LE
This adds a new handler to restart the zuul registry to pick up the new
cert. We may want to consider updating zuul registry to accept a reload
of ssl config without restarting the service.

Depends-On: https://review.opendev.org/702050
Change-Id: I23f6bea68285bc7cb0d12224235eaa16f0d07986
2020-01-13 15:20:20 -08:00
Clark Boylan 3981c02322 Provision LE cert for zuul.opendev.org
This provisions the cert but does not use it yet. We will do the
switchover once the cert is confirmed to be in place.

Depends-On: https://review.opendev.org/701819
Change-Id: I04fee48b9a79758527d8f9e8128c0fa915cd133e
2020-01-09 11:36:41 -08:00
Clark Boylan f7a305afbf Manage opendev.org with LE on all giteas
This catches up gitea02-07 with 01 managing ssl certs with LE.

Change-Id: I06228edca2204c5c57ebc5cb60b9d1308a393058
2019-11-18 12:47:08 -08:00
Clark Boylan 5392f8a27c Manage opendev.org cert with LE
This is the first step in managing the opendev.org cert with LE. We
modify gitea01.opendev.org only to request the cert so that if this
breaks the other 7 giteas can continue to serve opendev.org. When we are
happy with the results we can merge the followup change to update the
other 7 giteas.

Depends-On: https://review.opendev.org/694182
Change-Id: I9587b8c2896975aa0148cc3d9b37f325a0be8970
2019-11-18 12:07:10 -08:00
James E. Blair b5d37bfaa2 Remove arm64ci (3/2)
This cloud no longer exists.

Change-Id: Iec9d98c7bcf3c4cd3f9853bb059e3ee2efc31e87
Depends-On: https://review.opendev.org/686761
2019-10-04 09:20:33 -07:00
Ian Wienand 376915e17a run_all.sh : add backup playbook
The backup roles have been debugged and are ready to run.

A note is added about having the backup server in a default disabled
state.  This was discussed at an infra meeting where consensus was to
keep it disabled [1].

[1] http://eavesdrop.openstack.org/meetings/infra/2019/infra.2019-06-11-19.01.log.html#l-184

Change-Id: I2a3d2d08a9d1514bf6bdcf15bc5bc95689f3020f
2019-08-09 16:43:55 +10:00
Ian Wienand 78dc3e6ffd Add review-dev as a new backup client
Opt in review-dev to be a client for the new backup server

Change-Id: Ie24855a0df9f8d8d83588ae2f7221415a6535fd5
2019-08-08 13:55:33 +10:00
Ian Wienand 734aaee327 Add vexxhost backup server
This is a new backup server for use with the roles in
I9bf74df351e056791ed817180436617048224d2c

Restrict the puppet group to only the openstack.org servers as this
new server doesn't need puppet.

Depends-On: https://review.opendev.org/674549
Change-Id: Ia8e2e01f579ed9475830c159bf266b63bed52c36
2019-08-05 19:00:29 +10:00
James E. Blair 48cafd19f8 Add LE cert for logs.opendev.org to static
This can be used in an apache vhost later, but should be fine to
merge now.

Depends-On: https://review.opendev.org/673902
Change-Id: Ic2cb7585433351ec1bdabd88915fa1ca07da44e7
2019-07-31 13:00:50 -07:00
Clark Boylan ffcd1791bf Cleanup nodepool builder clouds.yaml
We ended up running into a problem with nodepool built control plane
images (has to do with boot from volume not allowing us to delete images
that are in use by a nova instance). We have decided to clean this up
and go back to not doing this until we can do it more properly.

Note this isn't a revert because having a group for access to control
plane clouds does seem like a good idea in general and I believe there
have been changes we'd have to resolve in the clouds.yaml files anyway.

Depends-On: https://review.opendev.org/#/c/665012/
Change-Id: I5e72928ec2dec37afa9c8567eff30eb6e9c04f1d
2019-07-22 13:55:29 -07:00
Ian Wienand ece14bbfb0 Add mirror-update01.opendev.org server
Add the new mirror-update server as a follow-on to
I525ac18b55f0e11b0a541b51fa97ee5d6512bf70.

Also ensure that the new mirror server isn't in the puppet groups by
only matching the openstack.org one.

Also remove it from the afsadmin group.  This group is only used for
keytabs stored on bridge.o.o.  I don't think that we need a group for
the keytabs -- a keytab should only ever be in use on one host at a
time, so we are better off keeping the keytabs in a specific host_var
for the host they are used on, rather than being in a group and
possibly deployed on servers where they are not used.

Depends-On: https://review.opendev.org/668610
Change-Id: Icda92bb234adc00f6718c1c656e8f069ce2704c4
2019-07-02 17:34:09 +10:00