From 1533f03a30b43b8053fe3b5df3e3f2f63375c55e Mon Sep 17 00:00:00 2001
From: Ian Wienand
Date: Mon, 23 Sep 2019 17:33:27 +1000
Subject: [PATCH] Spec to retire static.openstack.org

Change-Id: Ic5557750ee6c52def01c8d362b8d9e7563cc0f8a
---
 doc/source/index.rst    |   1 +
 specs/retire-static.rst | 378 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 379 insertions(+)
 create mode 100644 specs/retire-static.rst

diff --git a/doc/source/index.rst b/doc/source/index.rst
index 9f07393..c18e258 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -44,6 +44,7 @@ permits.
    specs/translation_check_site
    specs/wiki_modernization
    specs/letsencrypt
+   specs/retire-static
 
 Help Wanted
 ===========
diff --git a/specs/retire-static.rst b/specs/retire-static.rst
new file mode 100644
index 0000000..dcd1c88
--- /dev/null
+++ b/specs/retire-static.rst
@@ -0,0 +1,378 @@
+::
+
+  Copyright 2019 Red Hat Inc.
+
+  This work is licensed under a Creative Commons Attribution 3.0
+  Unported License.
+  http://creativecommons.org/licenses/by/3.0/legalcode
+
+..
+  This template should be in ReSTructured text. Please do not delete
+  any of the sections in this template. If you have nothing to say
+  for a whole section, just write: "None". For help with syntax, see
+  http://sphinx-doc.org/rest.html To test out your formatting, see
+  http://www.tele3.cz/jbar/rest/rest.html
+
+===========================
+Retire static.openstack.org
+===========================
+
+Include the URL of your StoryBoard story:
+
+https://storyboard.openstack.org/#!/story/2006598
+
+Move the services provided by ``static.openstack.org`` into less
+centralised approaches more consistent with modern deployment trends.
+
+Problem Description
+===================
+
+The ``static.openstack.org`` host is a monolithic server providing
+various hosting services via a large amount of volume-attached
+storage.
+
+The immediate problem is that it is currently running Ubuntu Trusty, which
+is reaching the end of its supported life.
+
+The secondary problems are twofold:
+
+Firstly, we would like to move the various publishing and hosting
+operations from centralised volumes on a single server to our AFS
+distributed file-system.
+
+Secondly, we would like to make the hosting portion more OpenDev
+compatible; this means avoiding working on legacy deployment methods
+(i.e. puppet) and integrating with our general idea of a "whitebox"
+service that can be used by many different projects.
+
+Thus we propose breaking up the services it offers to utilise more
+modern infrastructure alternatives and retiring the host.
+
+Proposed Change
+===============
+
+We can break the services down as follows:
+
+Log storage
+  Legacy log storage (~14 TB)
+
+Redirects
+  Apache service redirects a number of legacy URLs to new locations
+
+Static site serving
+  100 GB attached partition holding various static sites (i.e. plain
+  HTML publishing, no middleware, etc)
+
+Tarball
+  512 GB partition which holds and publishes release tarballs for all
+  projects.
+
+
+Alternatives
+------------
+
+``apt-get dist-upgrade`` the host to a more recent distribution, fix
+any puppet issues and ignore it until the next time it needs updating.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  TBD
+
+Gerrit Topic
+------------
+
+Use Gerrit topic "static-services" for all patches related to this spec.
+
+.. code-block:: bash
+
+    git-review -t static-services
+
+Work Items
+----------
+
+Log storage
+~~~~~~~~~~~
+
+OpenDev CI logs have been moved to various object-storage backends
+provided by donors. The existing logs will age out per our existing
+old-log cleanup jobs.
+
+Since logs were always ephemeral there should be no issues with old
+links. For clarity we will remove (rather than redirect) the
+``logs.openstack.org`` DNS entry so there is no confusion that logs
+might still live there.
+
+Work items:
+
+* remove ``logs.openstack.org`` DNS entries after old log entries
+  have cleared out
+
+Legacy redirects
+~~~~~~~~~~~~~~~~
+
+The following do straight redirects from their config hostnames to
+``docs.openstack.org``
+
+* 50-cinder.openstack.org.conf
+* 50-devstack.org.conf
+* 50-glance.openstack.org.conf
+* 50-horizon.openstack.org.conf
+* 50-keystone.openstack.org.conf
+* 50-nova.openstack.org.conf
+* 50-swift.openstack.org.conf
+
+The following have slightly different semantics
+
+* 50-ci.openstack.org.conf
+
+  * ``/nodepool``, ``/shade``, ``/zuul``, etc all to docs; see
+    https://opendev.org/opendev/system-config/src/branch/master/modules/openstack_project/templates/ci.vhost.erb
+
+* 50-qa.openstack.org.conf
+
+  * currently redirects to a broken link
+    https://docs.openstack.org/developer/qa
+
+The following redirects to ``openstack.org``
+
+* 50-summit.openstack.org.conf
+
+Clearly there is a need for a generic ability to redirect various URLs
+as things change over time.
+
+We will use a single containerised ``haproxy`` instance to handle
+redirects for the OpenDev project. Although initially it will simply
+be handling 302 redirects, it is imagined that future services can use
+it for its availability or load-balancing services as well. Note
+that ``gitea`` services also have their own load-balancer; although it
+reuses all the deployment mechanisms, the production service is kept
+separate to maintain isolation between what is probably the most important
+service (code) and more informational services.
+
+Proof-of-concept reviews are provided at:
+
+ * https://review.opendev.org/677903 : make haproxy role more generic
+ * https://review.opendev.org/678159 : add a service load balancer
+
+The work items consist of:
+
+ * approval of the above reviews
+ * starting the production host
+ * iterating the extant DNS records and pointing them to the new
+   load-balancer
+
+OpenDev infrastructure migration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+We wish to provide new services only using our latest deployment
+methods, to avoid introducing even more legacy services and to provide
+a basis for the migration process to OpenDev services.
+
+Although ``files02.openstack.org`` has an existing role as a webserver
+serving content from the ``/openstack.org`` AFS mount, it is
+configured using legacy puppet. Thus a new server will be provisioned
+using our Ansible environment, rather than adding more hosts to legacy
+configuration.
+
+This server should be a "whitebox" server that is capable of serving a
+range of domains that OpenDev would like to serve. However, its role
+will only be to serve static directories on AFS volumes. After
+this process, there will be numerous examples of SSL certificate
+generation, vhost configuration, AFS volume setup and publishing jobs
+for any other projects to copy and implement.
+
+Initially this server needs to serve https sites for the replacement
+services; namely
+
+ * governance.openstack.org
+ * specs.openstack.org
+ * security.openstack.org
+ * service-types.openstack.org
+ * releases.openstack.org
+ * tarballs.openstack.org
+
+Currently, SSL certificates are manually provisioned and entered into
+puppet secret data, from where they are deployed to the host. We wish to
+use automatically renewing letsencrypt certificates per our other
+infrastructure, utilising our DNS based authentication.
+However, since ``openstack.org`` remains administered by external teams in
+RAX's proprietary environment, we will make an exception and set up DNS
+validation records manually for these legacy sites until a full
+migration of ``openstack.org`` to OpenDev infrastructure is possible.
+Other domains will use OpenDev nameservers, which support automated
+DNS validation renewals.
+
+We will have the new server provisioned and ready before we begin the
+steps of migrating publishing locations. This means we can debug any
+setup issues outside production, and effect a zero-downtime cutover
+when the sites are ready.
+
+Work items are as follows:
+
+* Write roles and tests to provision a new ``static01.opendev.org``
+  server which will be limited to running Apache and serving AFS
+  directories.
+* Create the server
+* Create CNAME ``static.opendev.org`` which will be the main service
+  hostname, to provide for easier server replacement or other updates
+  in the future.
+* Pre-provision https certificates for the above listed services
+
+  * Using the RAX web interface for name services and the openstack
+    infra permissions, set up
+    ``_acme-challenge..openstack.org`` records as a CNAME to
+    ``acme.opendev.org``.
+  * Each site should have a separate certificate provisioned. The
+    configuration would be something like
+
+    .. code-block:: yaml
+
+       letsencrypt_certs:
+         governance-openstack-org:
+           - governance.openstack.org
+         specs-openstack-org:
+           - specs.openstack.org
+         # and so on for the remaining sites
+  * Debug any failures; however the theory is (taking one example):
+    the existing letsencrypt roles should request a certificate for
+    ``governance.openstack.org`` on ``static01.opendev.org`` and
+    receive the authentication key, which is placed in a TXT record in
+    ``acme.opendev.org``. The certificate creation will trigger
+    a lookup of ``_acme-challenge.governance.openstack.org`` which
+    will be a CNAME to ``acme.opendev.org``, which contains the
+    correct TXT record.
+    The certificate is issued on ``static01.opendev.org``.
+
+* Preconfigure the vhost configuration for the above sites (using
+  previously provisioned keys for SSL)
+* Confirm correct operation of the sites with dummy content.
+
+Static hosting
+~~~~~~~~~~~~~~
+
+A number of jobs publish directly to ``/srv/static`` on the server.
+These are then served by Apache as static websites.
+
+In general, we want these jobs to publish to our AFS volumes. By
+publishing to AFS we remove the central point of failure of a single
+server and its attached disks (mitigated by multiple AFS servers and
+replicas).
+
+The AFS volumes are then served by ``static01.opendev.org`` which has
+a dedicated role as an AFS to HTTP bridge.
+
+The sites in question are:
+
+* 50-governance.openstack.org.conf
+  * https://governance.openstack.org
+  * main source -> https://opendev.org/openstack/governance-website
+  * published via https://opendev.org/openstack/project-config/src/branch/master/zuul.d/projects.yaml#L2298
+  * aliases ``/srv/static/``
+
+* 50-security.openstack.org.conf
+  * https://security.openstack.org
+  * single repo source -> https://opendev.org/openstack/ossa
+  * deployed by publish-security job -> https://opendev.org/openstack/project-config/src/branch/master/zuul.d/jobs.yaml#L739
+
+* 50-service-types.openstack.org.conf
+  * https://service-types.openstack.org
+  * single repo -> https://opendev.org/openstack/service-types-authority
+  * https://opendev.org/openstack/project-config/src/branch/master/zuul.d/jobs.yaml#L551
+
+* 50-specs.openstack.org.conf
+  * https://specs.openstack.org
+  * various spec repos; published by ``openstack-spec-jobs`` to subdirectories
+
+* 50-releases.openstack.org.conf
+  * https://releases.openstack.org
+  * generated by -> https://opendev.org/openstack/releases/
+  * note: generates ``.htaccess`` with constraints links, used widely by pip
+  * publish-tox-jobs-static : https://opendev.org/openstack/project-config/src/branch/master/zuul.d/jobs.yaml#L685
+
+* 
50-tarballs.openstack.org.conf
+  * https://tarballs.openstack.org
+  * every project's release jobs
+
+The extant AFS layout has volumes for each project. Thus we will
+continue this theme and an admin will create one volume for each of
+the above static sites; e.g.
+
+* /afs/openstack.org/project/governance.openstack.org (~200 MB)
+* /afs/openstack.org/project/security.openstack.org (100 MB)
+* /afs/openstack.org/project/service-types.openstack.org (520 KB)
+* /afs/openstack.org/project/specs.openstack.org (currently 706 MB)
+* /afs/openstack.org/project/releases.openstack.org (currently 57 MB)
+* /afs/openstack.org/project/tarballs.openstack.org (currently 134 GB)
+
+The work items are as follows:
+
+* Create the volumes for each site as described above
+* Migrate the extant data to the new volumes. It is impractical to
+  recreate all the sites as it would require triggering jobs on many,
+  often infrequently updated, repos.
+* Publishing jobs will be updated to use AFS publishing to these new
+  locations. During the transition period, we can publish to both
+  locations.
+* Update the site configuration on ``static01.opendev.org`` to serve
+  the site from the new location
+* We should be able to fully test the new sites at this point with
+  manual host entries. Ensure:
+  * https certificates are working correctly
+  * old links remain consistent
+
+* For each site, move to production by updating the CNAME entries in
+  the ``openstack.org`` domain for the main server to point to
+  ``static.opendev.org`` (note, not the server directly,
+  i.e. ``static01.opendev.org``, to give us flexibility in managing
+  the backend service with server replacements or load-balancing in
+  the future). Per prior testing, this should be transparent.
+* Remove the old publishing jobs
+
+Repositories
+------------
+
+Unlikely to require new repositories
+
+Servers
+-------
+
+* A new http server for serving AFS content
+* A load-balancer server is suggested to host the haproxy container
+
+DNS Entries
+-----------
+
+Quite a few DNS entries will need to be updated as described above
+
+Documentation
+-------------
+
+Developers should largely not care where the results are published.
+
+Small doc updates for any new services.
+
+A guide to setting up jobs, host configuration, etc. for publishing
+static data for other projects may be useful.
+
+Security
+--------
+
+N/A
+
+Testing
+-------
+
+Since all updates are replacements, we can confirm that the new sites
+are operational before putting them into production. Any DNS switches
+can be essentially zero impact.
+
+
+Dependencies
+============
+
+N/A at this time