Add spec for active-active
This specification contains a high-level description of a proposed
architecture for handling an active-active topology within Octavia.

Moved Distributor to new document.
Captured the comments from Mitaka mid-cycle.
Updated active-active-topology per latest comments.
Major update to active-active-distributor per latest comments.
More updates per comments.

Change-Id: Ifc2d618a979fd0eb822f2cba4b759ab6ade7793f
Co-Authored-By: Eran Raichstein <eranra@il.ibm.com>
Co-Authored-By: Dean Lorenz <dean@il.ibm.com>
Co-Authored-By: Stephen Balukoff <stephen@balukoff.com>
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License.

  http://creativecommons.org/licenses/by/3.0/legalcode

.. attention::
  Please review the active-active topology blueprint first ("Active-Active,
  N+1 Amphorae Setup",
  https://review.openstack.org/#/c/234639/7/specs/version1/active-active-topology.rst).


=================================================
Distributor for Active-Active, N+1 Amphorae Setup
=================================================

https://blueprints.launchpad.net/octavia/+spec/active-active-topology

This blueprint describes how Octavia implements a *Distributor* to support the
*active-active* loadbalancer (LB) solution, as described in the blueprint
linked above. It presents the high-level Distributor design and suggests
high-level code changes to the current code base to realize this design.

In a nutshell, in an *active-active* topology, an *Amphora Cluster* of two
or more active Amphorae collectively provide the loadbalancing service.
It is designed as a 2-step loadbalancing process; first, a lightweight
*distribution* of VIP traffic over an Amphora Cluster; then, full-featured
loadbalancing of traffic over the back-end members. Since a single
loadbalancing service, which is addressable by a single VIP address, is
served by several Amphorae at the same time, there is a need to distribute
incoming requests among these Amphorae -- that is the role of the
*Distributor*.

This blueprint uses terminology defined in the Octavia glossary when available,
and defines new terms to describe new components and features as necessary.

.. _P2:

**Note:** Items marked with [P2]_ refer to lower priority features to be
designed / implemented only after initial release.

Problem description
===================

* Octavia shall implement a Distributor to support the active-active
  topology.

* The operator should be able to select and configure the Distributor
  (e.g., through an Octavia configuration file or [P2]_ through a flavor
  framework).

* Octavia shall support a pluggable design for the Distributor, allowing
  different implementations. In particular, the Distributor shall be
  abstracted through a *driver*, similarly to the current support of
  Amphora implementations.

* Octavia shall support different provisioning types for the Distributor;
  including VM-based (the default, similar to current Amphorae),
  [P2]_ container-based, and [P2]_ external (vendor-specific) hardware.

* The operator shall be able to configure the distribution policies,
  including affinity and availability (see below for details).

Architecture
============

High-level Topology Description
-------------------------------

* The following diagram illustrates the Distributor's role in an active-active
  topology:

::

    Front-End Back-End
    Internet Networks Networks
    (world) (tenants) (tenants)
    ║ A B C A B C
    ┌──╨───┐floating IP ║ ║ ║ ┌────────┬──────────┬────┐ ║ ║ ║
    │ ├─ to VIP ──►╢◄──────║───────║──┤f.e. IPs│ Amphorae │b.e.├►╜ ║ ║
    │ │ LB A ║ ║ ║ └──┬─────┤ of │ IPs│ ║ ║
    │ │ ║ ║ ║ │VIP A│ Tenant A ├────┘ ║ ║
    │ GW │ ║ ║ ║ └─────┴──────────┘ ║ ║
    │Router│floating IP ║ ║ ║ ┌────────┬──────────┬────┐ ║ ║
    │ ├─ to VIP ───║──────►╟◄──────║──┤f.e. IPs│ Amphorae │b.e.├──►╜ ║
    │ │ LB B ║ ║ ║ └──┬─────┤ of │ IPs│ ║
    │ │ ║ ║ ║ │VIP B│ Tenant B ├────┘ ║
    │ │ ║ ║ ║ └─────┴──────────┘ ║
    │ │floating IP ║ ║ ║ ┌────────┬──────────┬────┐ ║
    │ ├─ to VIP ───║───────║──────►╢◄─┤f.e. IPs│ Amphorae │b.e.├────►╜
    └──────┘ LB C ║ ║ ║ └──┬─────┤ of │ IPs│
    ║ ║ ║ │VIP C│ Tenant C ├────┘
    arp─►╢ arp─►╢ arp─►╢ └─────┴──────────┘
    ┌─┴─┐ ║┌─┴─┐ ║┌─┴─┐ ║
    │VIP│┌►╜│VIP│┌►╜│VIP│┌►╜
    ├───┴┴┐ ├───┴┴┐ ├───┴┴┐
    │IP A │ │IP B │ │IP C │
    ┌┴─────┴─┴─────┴─┴─────┴┐
    │ │
    │ Distributor │
    │ (multi-tenant) │
    └───────────────────────┘

* In the above diagram, several tenants (A, B, C, ...) share the
  Distributor, yet the Amphorae, and the front- and back-end (tenant)
  networks are not shared between tenants. (See also "Distributor Sharing"
  below.) Note that in the initial code implementing the distributor, the
  distributor will not be shared between tenants, until tests verifying the
  security of a shared distributor can be implemented.

* The Distributor acts as a (one-legged) router, listening on each
  load balancer's VIP and forwarding to one of its Amphorae.

* Each load balancer's VIP is advertised and answered by the Distributor.
  An ``arp`` request for any of the VIP addresses is answered by the
  Distributor, hence any traffic sent for each VIP is received by the
  Distributor (and forwarded to an appropriate Amphora).

* ARP is disabled on all the Amphorae for the VIP interface.

* The Distributor distributes the traffic of each VIP to an Amphora in the
  corresponding load balancer Cluster.

* An example of high-level data flow:

  1. Internet clients access a tenant service through an externally visible
     floating-IP (IPv4 or IPv6).

  2. The GW router maps the floating IP into a loadbalancer's internal VIP on
     the tenant's front-end network.

  3. (1st packet to VIP only) the GW sends an ``arp`` request on the VIP
     (tenant front-end) network. The Distributor answers the ``arp`` request
     with its own MAC address on this network (all the Amphorae on the network
     can serve the VIP, but do not answer the ``arp``).

  4. The GW router forwards the client request to the Distributor.

  5. The Distributor forwards the packet to one of the Amphorae on the
     tenant's front-end network (distributed according to some policy,
     as described below), without changing the destination IP (i.e., still
     using the VIP).

  6. The Amphora accepts the packet and continues the flow on the tenant's
     back-end network as for other Octavia loadbalancer topologies (non
     active-active).

  7. The outgoing response packets from the Amphora are forwarded directly
     to the GW router (that is, they do not pass through the Distributor).

Affinity of Flows to Amphorae
-----------------------------

- Affinity is required to make sure related packets are forwarded to the
  same Amphora. At minimum, since TCP connections are terminated at the
  Amphora, all packets that belong to the same flow must be sent to the
  same Amphora. Enhanced affinity levels can be used to make sure that flows
  with similar attributes are always sent to the same Amphora; this may be
  desired to achieve better performance (see discussion below).

- [P2]_ The Distributor shall support different modes of client-to-Amphora
  affinity. The operator should be able to select and configure the desired
  affinity level.

- Since the Distributor is L3 and the "heavy lifting" is expected to be
  done by the Amphorae, this specification proposes implementing two
  practical affinity alternatives. Other affinity alternatives may be
  implemented at a later time.

  *Source IP and source port*
    In this mode, the Distributor must always send packets from the same
    combination of Source IP and Source port to the same Amphora. Since
    the Target IP and Target Port are fixed per Listener, this mode implies
    that all packets from the same TCP flow are sent to the same Amphora.
    This is the minimal affinity mode, as without it TCP connections will
    break.

    *Note*: related flows (e.g., parallel client calls from the same HTML
    page) will typically be distributed to different Amphorae; however,
    these should still be routed to the same back-end. This could be
    guaranteed by using cookies and/or by synchronizing the stick-tables.
    Also, the Amphorae in the Cluster could be configured to use the same
    hashing parameters (avoid any random seed) to ensure all make similar
    decisions.

  *Source IP* (default)
    In this mode, the Distributor must always send packets from the same
    source IP to the same Amphora, regardless of port. This mode allows TLS
    session reuse (e.g., through session ids), where an abbreviated
    handshake can be used to improve latency and computation time.

    The main disadvantage of sending all traffic from the same source IP to
    the same Amphora is that it might lead to poor load distribution for
    large workloads that have the same source IP (e.g., a workload behind a
    single NAT or proxy).

  **Note on TLS implications**:
  In some (typical) TLS sessions, the additional load incurred for each new
  session is significantly larger than the load incurred for each new
  request or connection on the same session; namely, the total load on each
  Amphora will be more affected by the number of different source IPs it
  serves than by the number of connections. Moreover, since the total load
  on the Cluster incurred by all the connections depends on the level of
  session reuse, spreading a single source IP over multiple Amphorae
  *increases* the overall load on the Cluster. Thus, a Distributor that
  uniformly spreads traffic without affinity per source IP (e.g., uses
  per-flow affinity only) might cause an increase in overall load on the
  Cluster that is proportional to the number of Amphorae. For example, in a
  scale-out scenario (where a new Amphora is spawned to share the total
  load), moving some flows to the new Amphora might increase the overall
  Cluster load, negating the benefit of scaling-out.

  Session reuse helps with the certificate exchange phase. Improvements
  in performance with the certificate exchange depend on the type of keys
  used, and are greatest with RSA. Session reuse may be less important with
  other schemes; shared TLS session tickets are another mechanism that may
  circumvent the problem; also, upcoming versions of HA-Proxy may be able
  to obviate this problem by synchronizing TLS state between Amphorae
  (similar to the stick-table protocol).

- Per the agreement at the Mitaka mid-cycle, the default affinity shall be
  based on source-IP only and a consistent hashing function (see below)
  shall be used to distribute flows in a predictable manner; however,
  abstraction will be used to allow other implementations at a later time.
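
The two affinity modes above differ only in which header fields feed the
Distributor's hash. A minimal, illustrative sketch of the mapping (Python;
the helper name and the plain modulo step are assumptions for illustration
only -- the reference implementation uses hash-based bucket selection in OVS,
and a plain modulo breaks down when the cluster is resized, as discussed
under "Handling changes in Cluster size" below)::

    import hashlib

    def select_bucket(src_ip, src_port, n_buckets, affinity="source_ip"):
        """Map a flow to one of n_buckets according to the affinity mode."""
        if affinity == "source_ip":
            key = src_ip                        # default; TLS-session friendly
        else:                                   # "source_ip_and_port"
            key = "%s:%s" % (src_ip, src_port)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return int(digest, 16) % n_buckets

With a fixed number of buckets this keeps all packets of a flow (and, in the
default mode, all flows of a client) on the same Amphora.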

Forwarding with OVS and OpenFlow Rules
--------------------------------------

* The reference implementation of the Distributor shall use OVS for
  forwarding and configure the Distributor through OpenFlow rules.

  - OpenFlow rules can be implemented by a software switch (e.g., OVS) that
    can run on a VM. Thus, the switch can be created and managed by Octavia
    similarly to the creation and management of Amphora VMs.

  - OpenFlow rules are supported by several HW switches, so the same
    control plane can be used for both SW and HW implementations.

* Outline of Rules (see the sketch after this list)

  - A ``group`` with the ``select`` method is used to distribute IP traffic
    over multiple Amphorae. There is one ``bucket`` per Amphora -- adding
    an Amphora adds a new ``bucket`` and deleting an Amphora removes the
    corresponding ``bucket``.

  - The ``select`` method supports (OpenFlow v1.5) hash-based selection
    of the ``bucket``. The hash can be set up to use different fields,
    including source IP only (the default) and source IP and source port.

  - All buckets route traffic back on the in-port (i.e., no forwarding
    between ports). This ensures that the same front-end network is used
    (i.e., the Distributor does not route between front-end networks;
    therefore, it does not mix traffic of different tenants).

  - The ``bucket`` actions do a re-write of the outgoing packets. It
    supports re-write of the destination MAC to that of the specific
    Amphora and re-write of the source MAC to that of the Distributor
    interface (together these MAC re-writes provide L3 routing
    functionality).

    *Note:* alternative re-write rules can be used to support other
    forwarding mechanisms.

  - OpenFlow rules are also used to answer ``arp`` requests on the VIP.
    ``arp`` requests for each VIP are captured, re-written as ``arp``
    replies with the MAC address of the particular front-end interface and
    sent back on the in-port. Again, there is no routing between interfaces.

* Handling Amphora failure

  - The initial implementation will assume a fixed size for each cluster (no
    elasticity). The hashing will be "consistent" by virtue of never
    changing the number of ``buckets``. If the cluster size is changed on
    the fly (there should not be an API to do so) then there are no
    guarantees on shuffling.

  - If an Amphora fails then remapping cannot be avoided -- all flows of
    the failed Amphora must be remapped to a different one. Rather than
    mapping these flows to other active Amphorae in the cluster, the reference
    implementation will map all flows to the cluster's *standby* Amphora (i.e.,
    the "+1" Amphora in this "N+1" cluster). This ensures that the cluster
    size does not change. The only change in the OpenFlow rules would be to
    replace the MAC of the failed Amphora with that of the standby Amphora
    (see the sketch after this list).

  - This implementation is very similar to Active-Standby fail-over. There
    will be a standby Amphora that can serve traffic in case of failure.
    The differences from Active-Standby are that a single Amphora acts as a
    standby for multiple ones; fail-over re-routing is handled through the
    Distributor (rather than by VRRP); and a whole cluster of Amphorae is
    active concurrently, to enable support of large workloads.

  - The Health Manager will trigger re-creation of a failed Amphora. Once the
    Amphora is ready it becomes the new *standby* (no changes to the OpenFlow
    rules).

  - [P2]_ Handle concurrent failure of more than a single Amphora.

* Handling Distributor failover

  - To handle the event of a Distributor failover caused by a catastrophic
    failure of a Distributor, and in order to preserve the client to Amphora
    affinity when the Distributor is replaced, the Amphora registration
    process with the Distributor should preserve positional information. This
    should ensure that when a new Distributor is created, Amphorae will be
    assigned to the same buckets to which they were previously assigned.

  - In the reference implementation, we propose making the Distributor API
    return the complete list of Amphorae MAC addresses with positional
    information each time an Amphora is registered or unregistered.
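
For illustration, the rules outlined above could be pushed with ``ovs-ofctl``
roughly as follows. This is a sketch only: the bridge name, group id, MAC and
IP addresses, and the exact group-property syntax are assumptions rather than
the reference implementation (which would drive OVS from the Distributor
agent)::

    import subprocess

    DIST_MAC = "fa:16:3e:00:00:01"                       # example addresses
    AMPHORA_MACS = ["fa:16:3e:00:00:11", "fa:16:3e:00:00:12"]
    STANDBY_MAC = "fa:16:3e:00:00:99"                    # the "+1" Amphora
    VIP = "203.0.113.10"

    def ofctl(*args):
        subprocess.check_call(("ovs-ofctl", "-O", "OpenFlow15") + args)

    def buckets(macs):
        # One bucket per Amphora: rewrite the destination MAC to the Amphora,
        # the source MAC to the Distributor port, and send the packet back
        # out the port it arrived on.
        return ",".join(
            "bucket=bucket_id:%d,actions=set_field:%s->eth_dst,"
            "set_field:%s->eth_src,in_port" % (i, mac, DIST_MAC)
            for i, mac in enumerate(macs))

    # Select group hashing on source IP only (the default affinity).
    ofctl("add-group", "br-vip-a",
          "group_id=1,type=select,selection_method=hash,fields(ip_src),"
          + buckets(AMPHORA_MACS))

    # All IP traffic addressed to the VIP goes through the group.
    ofctl("add-flow", "br-vip-a",
          "priority=10,ip,nw_dst=%s,actions=group:1" % VIP)

    # Amphora failure: only the failed bucket's destination MAC changes to
    # the standby's MAC, so the other buckets (and client affinity) are kept.
    ofctl("mod-group", "br-vip-a",
          "group_id=1,type=select,selection_method=hash,fields(ip_src),"
          + buckets([STANDBY_MAC, AMPHORA_MACS[1]]))

The ``arp`` responder rules for the VIP would be installed on the same bridge
in a similar way.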

Proposed change
===============

**Note:** These are changes on top of the changes described in the
"Active-Active, N+1 Amphorae Setup" blueprint (see
https://blueprints.launchpad.net/octavia/+spec/active-active-topology).

* Create a flow for the creation of an Amphora cluster with N active Amphorae
  and one extra standby Amphora. Set up the Amphora roles accordingly.

* Support the creation, connection, and configuration of the various
  networks and interfaces as described in the `high-level topology` diagram.
  The Distributor shall have a separate interface for each loadbalancer and
  shall not allow any routing between different ports. In particular, when
  a loadbalancer is created the Distributor should:

  - Attach the Distributor to the loadbalancer's front-end network by
    adding a VIP port to the Distributor (the LB VIP Neutron port).

  - Configure OpenFlow rules: create a group with the desired cluster size
    and with the given Amphora MACs; create rules to answer ``arp``
    requests for the VIP address.

  **Notes:**
  [P2]_ It is desirable that the Distributor be considered as a router by
  Neutron (to handle port security, network forwarding without ``arp``
  spoofing, etc.). This may require changes to Neutron and may also mean
  that Octavia will be a privileged user of Neutron.

  The Distributor needs to support IPv6 NDP.

  [P2]_ If the Distributor is implemented as a container then hot-plugging
  a port for each VIP might not be possible.

  If DVR is used then routing rules must be used to forward external
  traffic to the Distributor rather than relying on ``arp``. In particular,
  DVR messes up ``noarp`` settings.

* Support Amphora failure recovery

  - Modify the HM and failure recovery flows to add tasks to notify the ACM
    when the ACTIVE-ACTIVE topology is in use. If an active Amphora fails then
    it needs to be decommissioned on the Distributor and replaced with
    the standby.

  - Failed Amphorae should be recreated as a standby (in the new
    IN_CLUSTER_STANDBY role). The standby Amphora should also be monitored and
    recovered on failure.

* Distributor driver and Distributor image

  - The Distributor should be supported similarly to an Amphora; namely, have
    its own abstract driver.

  - The Distributor image (for the reference implementation) should include a
    recent version of OVS (>1.5) that supports hash-based bucket selection.
    As is done for Amphorae, the Distributor image should be installed with
    public keys to allow secure configuration by the Octavia controller.

  - The reference implementation shall spawn a new Distributor VM as needed.
    It shall monitor its health and handle recovery using heartbeats sent to
    the health monitor in a similar fashion to how this is done presently with
    Amphorae. [P2]_ Spawn a new Distributor if the number of VIPs exceeds a
    given limit (to limit the number of Neutron ports attached to one
    Distributor). [P2]_ Add configuration options and/or an Operator API to
    allow the operator to request a dedicated Distributor for a VIP (or per
    tenant).

* Define a REST API for Distributor configuration (no SSH API).
  See below for details.

* Create a data model for the Distributor.

Alternatives
------------

TBD

Data model impact
-----------------

Add table ``distributor`` with the following columns:

* id ``(sa.String(36), nullable=False)``
  ID of the Distributor instance.

* compute_id ``(sa.String(36), nullable=True)``
  ID of the compute node running the Distributor.

* lb_network_ip ``(sa.String(64), nullable=True)``
  IP of the Distributor on the management network.

* status ``(sa.String(36), nullable=True)``
  Provisioning status.

* vip_port_ids (list of ``sa.String(36)``)
  List of Neutron port IDs.
  New VIFs may be plugged into the Distributor when a new LB is created. We
  may need to store the Neutron port IDs in order to support
  fail-over from one Distributor instance to another.

Add table ``distributor_health`` with the following columns:

* distributor_id ``(sa.String(36), nullable=False)``
  ID of the Distributor instance.

* last_update ``(sa.DateTime, nullable=False)``
  Last time a distributor heartbeat was received by a health monitor.

* busy ``(sa.Boolean, nullable=False)``
  Field indicating a create / delete or other action is being conducted on
  the distributor instance (i.e., to prevent a race condition when multiple
  health managers are in use).

Add table ``amphora_registration`` with the following columns. This describes
which Amphorae are registered with which Distributors and in which order:

* lb_id ``(sa.String(36), nullable=False)``
  ID of the load balancer.

* distributor_id ``(sa.String(36), nullable=False)``
  ID of the Distributor instance.

* amphora_id ``(sa.String(36), nullable=False)``
  ID of the Amphora instance.

* position ``(sa.Integer, nullable=True)``
  Order in which Amphorae are registered with the Distributor.
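
For illustration, these tables could be expressed with SQLAlchemy roughly as
follows (a sketch only; real models would follow the conventions of the
existing Octavia data model, e.g. common base classes and an association
table for ``vip_port_ids``)::

    import sqlalchemy as sa
    from sqlalchemy.ext.declarative import declarative_base

    Base = declarative_base()

    class Distributor(Base):
        __tablename__ = 'distributor'
        id = sa.Column(sa.String(36), primary_key=True)
        compute_id = sa.Column(sa.String(36), nullable=True)
        lb_network_ip = sa.Column(sa.String(64), nullable=True)
        status = sa.Column(sa.String(36), nullable=True)

    class AmphoraRegistration(Base):
        __tablename__ = 'amphora_registration'
        lb_id = sa.Column(sa.String(36), primary_key=True)
        distributor_id = sa.Column(sa.String(36), primary_key=True)
        amphora_id = sa.Column(sa.String(36), primary_key=True)
        # Bucket position; preserved across Distributor failover so that
        # Amphorae map back to the same buckets.
        position = sa.Column(sa.Integer, nullable=True)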

REST API Impact
---------------

The Distributor will be running its own REST API server. This API will be
secured using two-way SSL authentication, and will use certificate rotation
in the same way this is done with Amphorae today.

The following API calls will be supported:

1. Post VIP Plug

   Adding a VIP network interface to the Distributor involves tasks which run
   outside the Distributor itself. Once these are complete, the Distributor
   must be configured to use the new interface. This is a REST call, similar
   to what is currently done for Amphorae when connecting to a new member
   network.

   `lb_id`
     An identifier for the particular loadbalancer/VIP. Used for subsequent
     register/unregister of Amphorae.

   `vip_address`
     The IP of the VIP (i.e., the IP for which to answer ``arp`` requests).

   `subnet_cidr`
     Netmask for the VIP's subnet.

   `gateway`
     Gateway that outbound packets from the VIP IP address should use.

   `mac_address`
     MAC address of the new interface corresponding to the VIP.

   `vrrp_ip`
     In the case of an HA Distributor, this contains the IP address that will
     be used in setting up the allowed address pairs relationship. (See
     Amphora VIP plugging under the ACTIVE-STANDBY topology for an example
     of how this is used.)

   `host_routes`
     List of routes that should be added when the VIP is plugged.

   `alg_extras`
     Extra arguments related to the algorithm that will be used to distribute
     requests to the Amphorae that are part of this load balancer
     configuration. This consists of an algorithm name and affinity type. In
     the initial release of ACTIVE-ACTIVE, the only valid algorithm will be
     *hash*, and the affinity type may be ``Source_IP`` or [P2]_
     ``Source_IP_AND_port``.

2. Pre VIP unplug

   Removing a VIP network interface will involve several tasks on the
   Distributor to gracefully roll back the OVS configuration and other
   details that were set up when the VIP was plugged in.

   `lb_id`
     ID of the VIP's loadbalancer that will be unplugged.

3. Register Amphorae

   This adds Amphorae to the configuration for a given load balancer. The
   Distributor should respond with a new list of all Amphorae registered with
   the Distributor, with positional information (see the example request
   after this list).

   `lb_id`
     ID of the loadbalancer with which the Amphorae will be registered.

   `amphorae`
     List of Amphorae MAC addresses and an (optional) position argument in
     which they should be registered.

4. Unregister Amphorae

   This removes Amphorae from the configuration for a given load balancer.
   The Distributor should respond with a new list of all Amphorae registered
   with the Distributor, with positional information.

   `lb_id`
     ID of the loadbalancer from which the Amphorae will be unregistered.

   `amphorae`
     List of Amphorae MAC addresses that should be unregistered from the
     Distributor.
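
For illustration, a Register Amphorae call might look roughly like the
following. The URL layout, port, and field names are assumptions for the
sketch only; the actual contract would be fixed when the Distributor API is
defined in detail::

    import requests

    lb_id = "9a7a366e-0011-4c3c-a474-1d11fb3e0de1"   # example UUID

    resp = requests.put(
        "https://192.0.2.10:9443/loadbalancer/%s/register_amphorae" % lb_id,
        cert=("controller-client.pem", "controller-client.key"),
        verify="distributor-server-ca.pem",
        json={"amphorae": [
            {"mac_address": "fa:16:3e:00:00:11", "position": 0},
            {"mac_address": "fa:16:3e:00:00:12", "position": 1},
        ]})
    # Expected response body: the complete ordered registration list, e.g.
    # [{"mac_address": "fa:16:3e:00:00:11", "position": 0}, ...]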

Security impact
---------------

The Distributor is designed to be multi-tenant by default. (Note that the
first reference implementation will not be multi-tenant until tests can be
developed to verify the security of a multi-tenant reference distributor.)
Although each tenant has its own front-end network, the Distributor is
connected to all of them, which might allow leaks between these networks. The
rationale is two-fold: first, the Distributor should be considered a trusted
infrastructure component; second, all traffic is external traffic before it
reaches the Amphora. Note that the GW router has exactly the same attributes;
in other words, logically, we can consider the Distributor to be an extension
of the GW (or even use the GW hardware to implement the Distributor).

This approach might not be considered secure enough for some cases, such as
when LBaaS is used for internal tier-to-tier communication inside a tenant
network. Some tenants may want their loadbalancer's VIP to remain private and
their front-end network to be isolated. In these cases, in order to
accomplish active-active for this tenant we would need separate dedicated
Distributor instance(s).

Notifications impact
--------------------

Other end user impact
---------------------

Performance Impact
------------------

Other deployer impact
---------------------

Developer impact
----------------

Implementation
==============

Assignee(s)
-----------

Work Items
----------

Dependencies
============


Testing
=======

* Unit tests with tox.
* Functional tests with tox.


Documentation Impact
====================

Further Discussion
==================

.. Note::
  This section captures some background, ideas, concerns, and remarks that
  were raised by various people. Some of the items here can be considered for
  future/alternative design and some will hopefully make their way into, yet
  to be written, related blueprints (e.g., auto-scaled topology).

[P2]_ Handling changes in Cluster size (manual or auto-scaled)
----------------------------------------------------------------

- The Distributor shall support different mechanisms for preserving affinity
  of flows to Amphorae following a *change in the size* of the Amphorae
  Cluster.

- The goal is to minimize shuffling of the client-to-Amphora mapping during
  cluster size changes:

  * When an Amphora is removed from the Cluster (e.g., due to failure or
    a scale-down action), all its flows are broken; however, flows to other
    Amphorae should not be affected. Also, if a drain method is used to empty
    the Amphora of client flows (in the case of a graceful removal), this
    should prevent disruption.

  * When an Amphora is *added* to the Cluster (e.g., recovery of a failed
    Amphora), some new flows should be distributed to the new Amphora;
    however, most flows should still go to the same Amphora they were
    distributed to before the new Amphora was added. For example, if the
    affinity of flows to Amphorae is per source IP and a new Amphora was just
    added, then the Distributor should forward packets from this IP to only
    one of two Amphorae: either the same Amphora as before or the
    Amphora that was added.

  Using a simple hash to maintain affinity does not meet this goal.

  For example, suppose we maintain affinity (for a fixed cluster size) using
  a hash (for randomizing key distribution) as
  `chosen_amphora_id = hash(source_ip # source_port) mod number_of_amphorae`
  (where ``#`` denotes concatenation). When a new Amphora is added or
  removed, the number of Amphorae changes; thus, a different Amphora will be
  chosen for most flows.

- Below are a couple of ways to tackle this shuffling problem.

  *Consistent Hashing*
    Consistent hashing is a hashing mechanism (regardless of whether the key
    is based on IP or IP/port) that preserves most hash mappings during
    changes in the size of the Amphorae Cluster. In particular, for a cluster
    with N Amphorae that grows to N+1 Amphorae, a consistent hashing function
    ensures that, with high probability, only 1/N of input flows will be
    re-hashed (more precisely, K/N keys will be rehashed). Note that, even
    with consistent hashing, some flows will be remapped and there is only
    a statistical bound on the number of remapped flows.

    The "classic" consistent hashing algorithm maps both server IDs and
    keys to hash values and selects for each key the server with the
    closest hash value to the key hash value. Lookup generally requires
    O(log N) to search for the "closest" server. Achieving good
    distribution requires multiple hashes per server (~10s) -- although
    these can be pre-computed there is an ~10s*N memory footprint. Other
    algorithms (e.g., Google's Maglev) have better performance, but provide
    weaker guarantees.

    There are several consistent hashing libraries available. None are
    supported in OVS.

    * Ketama https://github.com/RJ/ketama

    * OpenStack Swift http://docs.openstack.org/developer/swift/ring.html

    * Amazon Dynamo
      http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

    We should also strongly consider making any consistent hashing algorithm
    we develop available to all OpenStack components by making it part of an
    Oslo library.

  *Rendezvous hashing*
    This method provides similar properties to Consistent Hashing (i.e., a
    hashing function that remaps only 1/N of keys when a cluster with N
    Amphorae grows to N+1 Amphorae); a minimal sketch appears after this
    list.

    For each server ID, the algorithm concatenates the key and server ID and
    computes a hash. The server with the largest hash is chosen. This
    approach requires O(N) for each lookup, but is much simpler to
    implement and has virtually no memory footprint. Through search-tree
    encoding of the server IDs it is possible to achieve O(log N) lookup,
    but implementation is harder and distribution is not as good. Another
    feature is that more than one server can be chosen (e.g., the two largest
    values) to handle larger loads -- not directly useful for the
    Distributor use case.

  *Hybrid, Permutation-based approach*
    This is an alternative implementation of consistent hashing that may be
    simpler to implement. Keys are hashed to a set of buckets; each bucket
    is pre-mapped to a random permutation of the server IDs. Lookup is by
    computing a hash of the key to obtain a bucket and then going over the
    permutation selecting the first server. If a server is marked as "down"
    the next server in the list is chosen. This approach is similar to
    Rendezvous hashing if each key is directly pre-mapped to a random
    permutation (and like it allows more than one server selection). If the
    number of failed servers is small then lookup is about O(1); memory is
    O(N * #buckets), where the granularity of distribution is improved by
    increasing the number of buckets. The permutation-based approach is
    useful to support clusters of fixed size that need to handle a few
    nodes going down and then coming back up. If there is an assumption on
    the number of failures then memory can be reduced to O(max_failures *
    #buckets). This approach seems to suit the Distributor Active-Active
    use-case for non-elastic workloads.

- Flow tracking is required, even with the above hash functions, to handle
  the (relatively few) remapped flows. If an existing flow is remapped, its
  TCP connection would break. This is acceptable when an Amphora goes down
  and its flows are mapped to a new one. On the other hand, it may be
  unacceptable when an Amphora is added to the cluster and 1/N of existing
  flows are remapped. The Distributor may support different modes, as
  follows.

  *None / Stateless*
    In this mode, the Distributor applies its most recent forwarding rules,
    regardless of previous state. Some existing flows might be remapped to a
    different Amphora and would be broken. The client would have to recover
    and establish a connection with the new Amphora (it would still be
    mapped to the same back-end, if possible). Combined with consistent (or
    similar) hashing, this may be good enough for many web applications
    that are built for failure anyway, and can restore their state upon
    reconnect.

  *Full flow Tracking*
    In this mode, the Distributor tracks existing flows to provide full
    affinity, i.e., only new flows can be remapped to different Amphorae.
    Linux connection tracking may be used (e.g., through IPTables or
    through OpenFlow); however, this might not scale well. Alternatively,
    the Distributor can use an independent mechanism similar to HA-Proxy
    stick-tables to track the flows. Note that the Distributor only needs to
    track the mapping per source IP and source port (unlike Linux connection
    tracking, which follows the TCP state and related connections).

  *Use Ryu*
    Ryu is a well supported and tested python binding for issuing OpenFlow
    commands. Especially since Neutron recently moved to using this for
    many of the things it does, using this in the Distributor might make
    sense for Octavia as well.
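
As a concrete illustration of the Rendezvous hashing alternative described
above, a minimal sketch (Python; the key format and hash choice are
assumptions for illustration only)::

    import hashlib

    def rendezvous_choose(flow_key, amphora_ids):
        """Return the Amphora whose hash(flow_key, amphora_id) is largest.

        Removing an Amphora only remaps the keys that had chosen it; all
        other keys keep their previous winner, which is the affinity
        property needed when the Cluster changes by one node.
        """
        def score(amphora_id):
            data = ("%s|%s" % (flow_key, amphora_id)).encode("utf-8")
            return int(hashlib.sha256(data).hexdigest(), 16)
        return max(amphora_ids, key=score)

    # e.g. rendezvous_choose("198.51.100.7", ["amp-1", "amp-2", "amp-3"])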

Forwarding Data-path Implementation Alternatives
------------------------------------------------

The current design uses L2 forwarding based only on L3 parameters and uses
Direct Return routing (one-legged). The rationale behind this approach is
to keep the Distributor as light as possible and have the Amphorae do the
bulk of the work. This allows one (or a few) Distributor instance(s) to
serve all traffic even for very large workloads. Other approaches are
possible.

2-legged Router
^^^^^^^^^^^^^^^

- The Distributor acts as a router, being in-path in both directions.

- New network between the Distributor and Amphorae -- only the Distributor
  is on the VIP subnet.

- No need to use MAC forwarding -- use routing rules.

LVS
^^^

Use LVS for the Distributor.

DNS
^^^

Use DNS for the Distributor.

- Use DNS to map to particular Amphorae. Distribution will be of the
  domain name rather than the VIP.

- No problem with per-flow affinity, as a client will use the same IP for an
  entire TCP connection.

- Need a different public IP for each Amphora (no VIP).

Pure SDN
^^^^^^^^

- Implement the OpenFlow rules directly in the network, without a
  Distributor instance.

- If the network infrastructure supports this then the Distributor can
  become more robust and very lightweight, making it practical to have a
  dedicated Distributor per VIP (only the rules will be dedicated, as the
  network and SDN controller are shared resources).

Distributor Sharing
-------------------

- The initial implementation of the Distributor will not be shared between
  tenants until tests can be written to verify the security of this solution.

- The implementation should support different Distributor sharing and
  cardinality configurations. This includes a single shared Distributor,
  multiple dedicated Distributors, and multiple shared Distributors. In
  particular, an abstraction layer should be used and the data model should
  include an association between the load balancer and the Distributor.

- A shared Distributor uses the least amount of resources, but may not meet
  isolation requirements (performance and/or security) or might become a
  bottleneck.

Distributor High-Availability
-----------------------------

- The Distributor should be highly available (as this is one of the
  motivations for the active-active topology). Once the initial active-active
  functionality is delivered, developing a highly available Distributor
  should take a high priority.

- A mechanism similar to the VRRP used for ACTIVE-STANDBY topology Amphorae
  can be used.

- Since the Distributor is stateless (for fixed cluster sizes and if no
  connection tracking is used) it is possible to set up an active-active
  configuration and advertise more than one Distributor (e.g., for ECMP).

- As a first step, the initial implementation will use a single Distributor
  instance (i.e., it will not be highly available). The Health Manager will
  monitor the Distributor's health and initiate recovery if needed.

- The implementation should support plugging in a hardware-based
  implementation of the Distributor that may have its own high-availability
  support.

- In order to preserve client-to-Amphora affinity in the case of a failover,
  a VRRP-like HA Distributor has several options. We could potentially push
  Amphora registrations to the standby Distributor with the position
  arguments specified, in order to guarantee the active and standby
  Distributors always have the same configuration. Or, we could invent and
  utilize a synchronization protocol between the active and standby
  Distributors. This will be explored and decided when an HA Distributor
  specification is written and approved.

References
==========

.. [1] https://blueprints.launchpad.net/octavia/+spec/base-image
.. [2] https://blueprints.launchpad.net/octavia/+spec/controller-worker
.. [3] https://blueprints.launchpad.net/octavia/+spec/amphora-driver-interface
.. [4] https://blueprints.launchpad.net/octavia/+spec/controller
.. [5] https://blueprints.launchpad.net/octavia/+spec/operator-api
.. [6] doc/main/api/haproxy-amphora-api.rst
.. [7] https://blueprints.launchpad.net/octavia/+spec/active-active-topology

..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License.

  http://creativecommons.org/licenses/by/3.0/legalcode


=================================
Active-Active, N+1 Amphorae Setup
=================================

https://blueprints.launchpad.net/octavia/+spec/active-active-topology

This blueprint describes how Octavia implements an *active-active*
loadbalancer (LB) solution that is highly available through redundant
Amphorae. It presents the high-level service topology and suggests
high-level code changes to the current code base to realize this scenario.
In a nutshell, an *Amphora Cluster* of two or more active Amphorae
collectively provide the loadbalancing service.

The Amphora Cluster shall be managed by an *Amphora Cluster Manager* (ACM).
The ACM shall provide an abstraction that allows different types of
active-active features (e.g., failure recovery, elasticity, etc.). The
initial implementation shall not rely on external services, but the
abstraction shall allow for interaction with external ACMs (to be developed
later).

This blueprint uses terminology defined in the Octavia glossary when
available, and defines new terms to describe new components and features as
necessary.

.. _P2:

**Note:** Items marked with [P2]_ refer to lower priority features to be
designed / implemented only after initial release.


Problem description
===================

A tenant should be able to start a highly available loadbalancer for the
tenant's back-end services as follows:

* The operator should be able to configure an active-active topology
  through an Octavia configuration file or [P2]_ through a Neutron flavor,
  which the loadbalancer shall support. Octavia shall support active-active
  topologies in addition to the topologies that it currently supports.

* In an active-active topology, a cluster of two or more Amphorae shall
  host a replicated configuration of the load-balancing services. Octavia
  will manage this *Amphora Cluster* as a highly-available service using a
  pool of active resources.

* The Amphora Cluster shall provide the load-balancing services and support
  the configurations that are supported by a single Amphora topology,
  including L7 load-balancing, SSL termination, etc.

* The active-active topology shall support various Amphora types and
  implementations; including virtual machines, [P2]_ containers, and
  bare-metal servers.

* The operator should be able to configure the high-availability
  requirements for the active-active load-balancing services. The operator
  shall be able to specify the number of healthy Amphorae that must exist
  in the load-balancing Amphora Cluster. If the number of healthy Amphorae
  drops under the desired number, Octavia shall automatically and
  seamlessly create and configure a new Amphora and add it to the Amphora
  Cluster. [P2]_ The operator should be further able to define that the
  Amphora Cluster shall be allocated on separate physical resources.

* An Amphora Cluster will collectively act to serve as a single logical
  loadbalancer as defined in the Octavia glossary. Octavia will seamlessly
  distribute incoming external traffic among the Amphorae in the Amphora
  Cluster. To that end, Octavia will employ a *Distributor* component that
  will forward external traffic towards the managed Amphora instances.
  Conceptually, the Distributor provides an extra level of load-balancing
  for an active-active Octavia application, albeit a simplified one.
  Octavia should be able to support several Distributor implementations
  (e.g., software-based and hardware-based) and different affinity models
  (at minimum, flow-affinity should be supported to allow TCP connectivity
  between clients and Amphorae).

* The detailed design of the Distributor component will be described in a
  separate document (see "Distributor for Active-Active, N+1 Amphorae
  Setup", active-active-distributor.rst).


High-level Topology Description
-------------------------------

Single Tenant
~~~~~~~~~~~~~

* The following diagram illustrates the active-active topology:

::

    Front-End Back-End
    Internet Network Network
    (world) (tenant) (tenant)
    ║ ║ ║
    ┌─╨────┐ floating IP ║ ║ ┌────────┐
    │Router│ to LB VIP ║ ┌────┬─────────┬────┐ ║ │ Tenant │
    │ GW ├──────────────►╫◄─┤ IP │ Amphora │ IP ├─►╫◄─┤Service │
    └──────┘ ║ └┬───┤ (1) │back│ ║ │ (1) │
    ║ │VIP├─┬──────┬┴────┘ ║ └────────┘
    ║ └───┘ │ MGMT │ ║ ┌────────┐
    ╓◄───────────────────║─────────┤ IP │ ║ │ Tenant │
    ║ ┌─────────┬────┐ ║ └──────┘ ╟◄─┤Service │
    ║ │ Distri- │ IP├►╢ ║ │ (2) │
    ║ │ butor ├───┬┘ ║ ┌────┬─────────┬────┐ ║ └────────┘
    ║ └─┬──────┬┤VIP│ ╟◄─┤ IP │ Amphora │ IP ├─►╢ ┌────────┐
    ║ │ MGMT │└─┬─┘ ║ └┬───┤ (2) │back│ ║ │ Tenant │
    ╟◄────┤ IP │ └arp►╢ │VIP├─┬──────┬┴────┘ ╟◄─┤Service │
    ║ └──────┘ ║ └───┘ │ MGMT │ ║ │ (3) │
    ╟◄───────────────────║─────────┤ IP │ ║ └────────┘
    ║ ┌───────────────┐ ║ └──────┘ ║
    ║ │ Octavia LBaaS │ ║ ••• ║ •
    ╟◄─┤ Controller │ ║ ┌────┬─────────┬────┐ ║ •
    ║ └┬─────────────┬┘ ╙◄─┤ IP │ Amphora │ IP ├─►╢
    ║ │ Amphora │ └┬───┤ (k) │back│ ║ ┌────────┐
    ║ │ Cluster Mgr.│ │VIP├─┬──────┬┴────┘ ║ │ Tenant │
    ║ └─────────────┘ └───┘ │ MGMT │ ╙◄─┤Service │
    ╟◄─────────────────────────────┤ IP │ │ (m) │
    ║ └──────┘ └────────┘
    ║
    Management Amphora Cluster Back-end Pool
    Network 1..k 1..m

* An example of high-level data-flow:

  1. Internet clients access a tenant service through an externally visible
     floating-IP (IPv4 or IPv6).

  2. If IPv4, a gateway router maps the floating IP into a loadbalancer's
     internal VIP on the tenant's front-end network.

  3. The (multi-tenant) Distributor receives incoming requests to the
     loadbalancer's VIP. It acts as a one-legged direct return LB,
     answering ``arp`` requests for the loadbalancer's VIP (see Distributor
     spec.).

  4. The Distributor distributes incoming connections over the tenant's
     Amphora Cluster, by forwarding each new connection opened with a
     loadbalancer's VIP to a front-end MAC address of an Amphora in the
     Amphora Cluster (layer-2 forwarding). *Note*: the Distributor may
     implement other forwarding schemes to support more complex routing
     mechanisms, such as DVR (see Distributor spec.).

  5. An Amphora receives the connection and accepts traffic addressed to
     the loadbalancer's VIP. The front-end IPs of the Amphorae are
     allocated on the tenant's front-end network. Each Amphora accepts VIP
     traffic, but does not answer ``arp`` requests for the VIP address.

  6. The Amphora load-balances the incoming connections to the back-end
     pool of tenant servers, by forwarding each external request to a
     member on the tenant network. The Amphora also performs SSL
     termination if configured.

  7. Outgoing traffic traverses from the back-end pool members, through
     the Amphora and directly to the gateway (i.e., not through the
     Distributor).

Multi-tenant Support
~~~~~~~~~~~~~~~~~~~~

* The following diagram illustrates the active-active topology with
  multiple tenants:

::

    Front-End Back-End
    Internet Networks Networks
    (world) (tenant) (tenant)
    ║ B A A
    ║ floating IP ║ ║ ║ ┌────────┐
    ┌─╨────┐ to LB VIP A ║ ║ ┌────┬─────────┬────┐ ║ │Tenant A│
    │Router├───────────────║─►╫◄─┤A IP│ Amphora │A IP├─►╫◄─┤Service │
    │ GW ├──────────────►╢ ║ └┬───┤ (1) │back│ ║ │ (1) │
    └──────┘ floating IP ║ ║ │VIP├─┬──────┬┴────┘ ║ └────────┘
    to LB VIP B ║ ║ └───┘ │ MGMT │ ║ ┌────────┐
    ╓◄───────────────────║──║─────────┤ IP │ ║ │Tenant A│
    ║ ║ ║ └──────┘ ╟◄─┤Service │
    M B A ┌────┬─────────┬────┐ ║ │ (2) │
    ║ ║ ╟◄─┤A IP│ Amphora │A IP├─►╢ └────────┘
    ║ ║ ║ └┬───┤ (2) │back│ ║ ┌────────┐
    ║ ║ ║ │VIP├─┬──────┬┴────┘ ║ │Tenant A│
    ║ ║ ║ └───┘ │ MGMT │ ╟◄─┤Service │
    ╟◄───────────────────║──║─────────┤ IP │ ║ │ (3) │
    ║ ║ ║ └──────┘ ║ └────────┘
    ║ B A ••• B •
    ║ ┌─────────┬────┐ ║ ║ ┌────┬─────────┬────┐ ║ •
    ║ │ │IP A├─╢─►╫◄─┤A IP│ Amphora │A IP├─►╢ ┌────────┐
    ║ │ ├───┬┘ ║ ║ └┬───┤ (k) │back│ ║ │Tenant A│
    ║ │ Distri- │VIP├─arp►╜ │VIP├─┬──────┬┴────┘ ╙◄─┤Service │
    ║ │ butor ├───┘ ║ └───┘ │ MGMT │ │ (m) │
    ╟◄─ │ │ ─────║────────────┤ IP │ └────────┘
    ║ │ ├────┐ ║ └──────┘
    ║ │ │IP B├►╢ tenant A
    ║ │ ├───┬┘ ║ = = = = = = = = = = = = = = = = = = = = =
    ║ │ │VIP│ ║ ┌────┬─────────┬────┐ B tenant B
    ║ └─┬──────┬┴─┬─┘ ╟◄────┤B IP│ Amphora │B IP├─►╢ ┌────────┐
    ║ │ MGMT │ └arp►╢ └┬───┤ (1) │back│ ║ │Tenant B│
    ╟◄────┤ IP │ ║ │VIP├─┬──────┬┴────┘ ╟◄─┤Service │
    ║ └──────┘ ║ └───┘ │ MGMT │ ║ │ (1) │
    ╟◄───────────────────║────────────┤ IP │ ║ └────────┘
    ║ ┌───────────────┐ ║ └──────┘ ║
    M │ Octavia LBaaS │ B ••• B •
    ╟◄─┤ Controller │ ║ ┌────┬─────────┬────┐ ║ •
    ║ └┬─────────────┬┘ ╙◄────┤B IP│ Amphora │B IP├─►╢
    ║ │ Amphora │ └┬───┤ (q) │back│ ║ ┌────────┐
    ║ │ Cluster Mgr.│ │VIP├─┬──────┬┴────┘ ║ │Tenant B│
    ║ └─────────────┘ └───┘ │ MGMT │ ╙◄─┤Service │
    ╟◄────────────────────────────────┤ IP │ │ (r) │
    ║ └──────┘ └────────┘
    ║
    Management Amphora Clusters Back-end Pool
    Network A(1..k), B(1..q) A(1..m),B(1..r)

* Both tenants A and B share the Distributor, but each has a different
  front-end network. The Distributor listens on both loadbalancers' VIPs
  and forwards to either A's or B's Amphorae.

* The Amphorae and the back-end (tenant) networks are not shared between
  tenants.


Problem Details
---------------

* Octavia should support different Distributor implementations, similar
  to its support for different Amphora types. The operator should be able
  to configure different types of algorithms for the Distributor. All
  algorithms should provide flow-affinity to allow TLS termination at the
  Amphora. See the Distributor spec. for details.

* The Octavia controller shall seamlessly configure any newly created Amphora
  ([P2]_ including peer state synchronization, such as stick-tables, if
  needed) and shall reconfigure the other solution components (e.g.,
  Neutron) as needed. The controller shall further manage all Amphora
  life-cycle events.

* Since it is impractical at scale for peer state synchronization to occur
  between all Amphorae that are part of a single load balancer, the Amphorae
  of a single load balancer configuration need to be divided into smaller
  peer groups (consisting of 2 or 3 Amphorae) with which they should
  synchronize state information (a trivial partition sketch follows this
  list).
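
A trivial sketch of how such peer groups could be derived from a load
balancer's Amphora list (the group size and the simple slicing policy are
illustrative assumptions only)::

    def peer_groups(amphora_ids, group_size=3):
        """Split the Cluster into small groups that replicate state
        (e.g., HAProxy stick-table peers) only among themselves."""
        return [amphora_ids[i:i + group_size]
                for i in range(0, len(amphora_ids), group_size)]

    # peer_groups(["amp-%d" % i for i in range(7)])
    # -> [['amp-0', 'amp-1', 'amp-2'], ['amp-3', 'amp-4', 'amp-5'], ['amp-6']]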

Proposed change
===============


Required changes
----------------

The active-active loadbalancers require the following high-level changes:


Amphora related changes
~~~~~~~~~~~~~~~~~~~~~~~

* Update the Amphora image to support the active-active topology. The
  front-end still has both a unique IP (to allow direct addressing on the
  front-end network) and a VIP; however, it should not answer ARP requests
  for the VIP address (all Amphorae in a single Amphora Cluster concurrently
  serve the same VIP). Amphorae should continue to have a management IP on
  the LB Network so Octavia can configure them. Amphorae should also
  generally support hot-plugging interfaces into back-end tenant networks as
  they do in the current implementation. [P2]_ Finally, the Amphora
  configuration may need to be changed to randomize the member list, in
  order to prevent synchronized decisions by all Amphorae in the Amphora
  Cluster.

* Extend the data model to support active-active Amphorae. This is somewhat
  similar to active-passive (VRRP) support. Each Amphora needs to store its
  IP and port on its front-end network (similar to ha_ip and ha_port_id
  in the current model) and its role should indicate it is in a cluster.

  The provisioning status should be interpreted as referring to an Amphora
  only and not the load-balancing service. The status of the load balancer
  should correspond to the number of ``ONLINE`` Amphorae in the Cluster.
  If all Amphorae are ``ONLINE``, the load balancer is also ``ONLINE``. If a
  small number of Amphorae are not ``ONLINE``, then the load balancer is
  ``DEGRADED``. If enough Amphorae are not ``ONLINE`` (past a threshold),
  then the load balancer is ``DOWN`` (a sketch of one possible mapping
  appears after this list).

* Rework some of the controller worker flows to support creation and
  deletion of Amphorae by the ACM in an asynchronous manner. The compute
  node may be created/deleted independently of the corresponding Amphora
  flow, triggered as events by the ACM logic (e.g., node update). The flows
  do not need much change (beyond those implied by the changes in the data
  model), since the post-creation/pre-deletion configuration of each
  Amphora is unchanged. This is also similar to the failure recovery flow,
  where a recovery flow is triggered asynchronously.

* Create a flow (or task) for the controller worker for (de-)registration
  of Amphorae with the Distributor. The Distributor has to be aware of the
  current ``ONLINE`` Amphorae, to which it can forward traffic. [P2]_ The
  Distributor can do very basic monitoring of the Amphorae health (primarily
  to make sure network connectivity between the Distributor and Amphorae is
  working). Monitoring pool member health will remain the purview of the
  pool health monitors.

* All the Amphorae in the Amphora Cluster shall replicate the same
  listeners, pools, and TLS configuration, as they do now. We assume all
  Amphorae in the Amphora Cluster can perform exactly the same
  load-balancing decisions and can be treated as equivalent by the
  Distributor (except for affinity considerations).

* Extend the Amphora (REST) API and/or *Plug VIP* task to allow disabling
  of ``arp`` on the VIP (one possible mechanism is sketched after this
  list).

* In order to prevent losing session_persistence data in the event of an
  Amphora failure, the Amphorae will need to be configured to share
  session_persistence data (via stick tables) with a subset of other
  Amphorae that are part of the same load balancer configuration (i.e., a
  peer group).
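
For the load balancer status rule described above, one possible mapping (the
threshold value is an assumption for illustration; the real value would be an
operator configuration option)::

    def cluster_operating_status(amphora_statuses, down_threshold=0.5):
        """Derive the load balancer status from its Amphorae's statuses."""
        total = len(amphora_statuses)
        online = sum(1 for s in amphora_statuses if s == 'ONLINE')
        if online == total:
            return 'ONLINE'
        if online >= total * down_threshold:
            return 'DEGRADED'
        return 'DOWN'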
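
For the ``arp``-disabled VIP plug mentioned above, one possible mechanism is
the usual direct-return technique of holding the VIP on a non-advertising
interface while tightening the kernel's ARP behaviour. A sketch (the
interface choice and the exact mechanism are assumptions for illustration
only)::

    import subprocess

    def plug_vip_without_arp(vip_cidr):
        """Accept traffic for the VIP but never answer ARP for it."""
        run = subprocess.check_call
        # Reply to ARP only for addresses configured on the receiving
        # interface, and prefer the primary address when sourcing ARP.
        run(["sysctl", "-w", "net.ipv4.conf.all.arp_ignore=1"])
        run(["sysctl", "-w", "net.ipv4.conf.all.arp_announce=2"])
        # Keep the unique front-end IP on the front-end interface; hold the
        # VIP on loopback so the Amphora accepts it without advertising it.
        run(["ip", "addr", "add", vip_cidr, "dev", "lo"])

    # e.g. plug_vip_without_arp("203.0.113.10/32")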
|
||||
|
||||

Amphora Cluster Manager driver for the active-active topology (*new*)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Add an active-active topology to the topology types.

* Add a new driver to support creation/deletion of an Amphora Cluster via
  an ACM. This will re-use existing controller-worker flows as much as
  possible. The reference ACM will call the existing drivers to create
  compute nodes for the Amphorae and configure them.

* The ACM shall orchestrate creation and deletion of Amphora instances to
  meet the availability requirements. Amphora failover will utilize the
  existing health monitor flows, with hooks to notify the ACM when the
  ACTIVE-ACTIVE topology is used. [P2]_ The ACM shall handle graceful
  Amphora removal via draining (delay actual removal until existing
  connections are terminated or some timeout has passed).

* Change the flow of LB creation. The ACM driver shall create an Amphora
  Cluster instance for each new loadbalancer. It should maintain the
  desired number of Amphorae in the Cluster and meet the high-availability
  configuration given by the operator (a rough reconciliation sketch
  appears at the end of this list). *Note*: base functionality is already
  supported by the Health Manager; it may be enough to support a fixed or
  dynamic cluster size. In any case, existing flows to manage the Amphora
  life cycle will be re-used in the reference ACM driver.

* The ACM shall be responsible for providing health, performance, and
  life-cycle management at the Cluster level rather than at the Amphora
  level. Maintaining the loadbalancer status (as described above) as some
  function of the collective status of all Amphorae in the Cluster is one
  example. Other examples include tracking configuration changes, providing
  Cluster statistics, monitoring and maintaining compute nodes for the
  Cluster, etc. The ACM abstraction would also support pluggable ACM
  implementations that may provide more advanced capabilities (e.g.,
  elasticity, AZ-aware availability, etc.). The reference ACM driver will
  re-use existing components and/or code which currently handle health,
  life-cycle, etc. management for other load balancer topologies.

* New data model for an Amphora Cluster, which has a one-to-one mapping
  with the loadbalancer. This defines the common properties of the Amphora
  Cluster (e.g., id, minimum size, desired size, etc.) and additional
  properties for the specific implementation.

* Add configuration file options to support configuration of an
  active-active Amphora Cluster, and add default configuration (example
  options are sketched at the end of this list). [P2]_ Add an Operator API.

* Add or update documentation for new components added and for new or
  changed functionality.

* Communication between the ACM and Distributors should be secured using
  two-way SSL certificate authentication, much the same way this is
  accomplished between other Octavia controller components and Amphorae
  today.
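
As a rough illustration of the reconciliation responsibility described above
(maintain the desired number of Amphorae, with a cooldown to avoid
thrashing), consider the sketch below. All names are hypothetical, and a real
driver would re-use the existing Amphora creation/deletion flows:

.. code-block:: python

    import time

    def reconcile_cluster(cluster, acm_driver):
        """One pass of a hypothetical ACM reconciliation loop (sketch)."""
        now = time.time()
        if now - cluster.last_change < cluster.cooldown:
            return  # still inside the cooldown window; avoid thrashing

        online = [a for a in cluster.amphorae if a.status == 'ONLINE']
        missing = cluster.desired_size - len(online)
        if missing > 0:
            # Re-use the existing Amphora creation flow via the driver.
            for _ in range(missing):
                acm_driver.add_amphora(cluster)
            cluster.last_change = now
        elif missing < 0:
            # [P2] graceful draining before removing surplus Amphorae.
            for amphora in online[cluster.desired_size:]:
                acm_driver.remove_amphora(cluster, amphora)
            cluster.last_change = now

The configuration file options could be exposed through oslo.config like
other Octavia settings; the option names, group, and defaults below are
placeholders only, not a committed interface:

.. code-block:: python

    from oslo_config import cfg

    # Placeholder option names -- shown only to illustrate the shape of
    # the configuration.
    cluster_opts = [
        cfg.IntOpt('cluster_desired_size', default=3,
                   help='Number of Amphorae to keep in an active-active '
                        'Amphora Cluster.'),
        cfg.IntOpt('cluster_min_size', default=2,
                   help='Minimum number of ACTIVE Amphorae for the Cluster '
                        'to be considered ACTIVE.'),
        cfg.IntOpt('cluster_cooldown', default=60,
                   help='Cooldown period (seconds) between successive '
                        'add/remove Amphora operations.'),
    ]

    cfg.CONF.register_opts(cluster_opts, group='active_active_cluster')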

Network driver changes
~~~~~~~~~~~~~~~~~~~~~~

* Support the creation, connection, and configuration of the various
  networks and interfaces as described in the 'high-level topology' diagram.

* Adding a new loadbalancer requires attaching the Distributor to the
  loadbalancer's front-end network, adding a VIP port to the Distributor,
  and configuring the Distributor to answer ``arp`` requests for the VIP.
  The Distributor shall have a separate interface for each loadbalancer and
  shall not allow any routing between different ports; in particular,
  Amphorae of different tenants must not be able to communicate with each
  other. In the reference implementation, this will be accomplished by using
  separate OVS bridges per load balancer.

* Adding a new Amphora requires attaching it to the front-end and back-end
  networks (similar to the current implementation), adding the VIP (but with
  ``arp`` disabled), and registering the Amphora with the Distributor. The
  tenant's front-end and back-end networks must allow attachment of
  dynamically created Amphorae by involving the ACM (e.g., when the health
  monitor replaces a failed Amphora). ([P2]_ Extend the LBaaS API to allow
  specifying an address range for new Amphorae, e.g., a subnet pool.)

Amphora health-monitoring support
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Modify the Health Manager to manage the health of an Amphora Cluster
  through the ACM; namely, forward Amphora health change events to the ACM,
  so it can decide when the Amphora Cluster is considered to be in a healthy
  state (a sketch of such a hook follows below). This should be done in
  addition to managing the health of each Amphora. [P2]_ Monitor the
  Amphorae also on their front-end network (i.e., from the Distributor).
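
A minimal sketch of the hook mentioned above, assuming a hypothetical ACM
notification interface (``acm_client``); the existing per-Amphora health
update path would otherwise remain unchanged:

.. code-block:: python

    def on_amphora_health_update(amphora, healthy, acm_client):
        """Forward an Amphora health change to the ACM (sketch).

        ``acm_client`` is a hypothetical interface to the Amphora Cluster
        Manager; only Amphorae belonging to an ACTIVE_ACTIVE load balancer
        are reported.
        """
        lb = amphora.load_balancer
        if lb is not None and lb.topology == 'ACTIVE_ACTIVE':
            acm_client.report_amphora_health(lb.id, amphora.id, healthy)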

Distributor support
~~~~~~~~~~~~~~~~~~~

* **Note:** as mentioned above, the detailed design of the Distributor
  component is described in a separate document. Some design considerations
  are highlighted below.

* The Distributor should be supported similarly to an Amphora; namely, it
  should have its own abstract driver.

* For a reference implementation, add support for a Distributor image.

* Define a REST API for Distributor configuration (no SSH API). The API
  shall support the following (a rough driver-interface sketch mirroring
  these operations appears at the end of this section):

  - Add and remove a VIP (loadbalancer) and specify distribution parameters
    (e.g., affinity, algorithm, etc.).

  - Registration and de-registration of Amphorae.

  - Status.

  - [P2]_ Macro-level statistics.

* Spawn Distributors (if using on-demand Distributor compute nodes) and/or
  attach to existing ones as needed. Manage the health and life cycle of the
  Distributor(s). Create, connect, and configure Distributor networks as
  necessary.

* Create a data model for the Distributor.

* Add a Distributor driver and flows to (re-)configure the Distributor on
  creation/destruction of a loadbalancer (add/remove the loadbalancer VIP)
  and [P2]_ configure the distribution algorithm for the loadbalancer's
  Amphora Cluster.

* Add flows to Octavia to (re-)configure the Distributor on adding/removing
  Amphorae from the Amphora Cluster.
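
To make the pluggable Distributor driver concrete, the following abstract
interface sketch mirrors the REST operations listed above. The class and
method names are illustrative assumptions; the authoritative definition
belongs to the Distributor specification:

.. code-block:: python

    import abc

    import six

    @six.add_metaclass(abc.ABCMeta)
    class DistributorDriverSketch(object):
        """Hypothetical Distributor driver interface (illustrative only)."""

        @abc.abstractmethod
        def add_vip(self, loadbalancer, distribution_params):
            """Plug a VIP and set distribution parameters (affinity, etc.)."""

        @abc.abstractmethod
        def remove_vip(self, loadbalancer):
            """Remove the VIP of a deleted load balancer."""

        @abc.abstractmethod
        def register_amphora(self, loadbalancer, amphora):
            """Start forwarding VIP traffic to this Amphora."""

        @abc.abstractmethod
        def unregister_amphora(self, loadbalancer, amphora):
            """Stop forwarding VIP traffic to this Amphora."""

        @abc.abstractmethod
        def get_status(self, loadbalancer):
            """Return the Distributor's view of the load balancer status."""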

Packaging
~~~~~~~~~

* Extend the Octavia installation scripts to create an image for the
  Distributor.

Alternatives
------------

* Use external services to manage the cluster directly.

  This utilizes functionality that already exists in OpenStack (e.g., Heat
  and Ceilometer) rather than replicating it. This approach would also
  benefit from future extensions to these services. On the other hand, it
  adds undesirable dependencies on other projects (and their corresponding
  teams), complicates handling of failures, and requires defensive coding
  around service calls. Furthermore, these services cannot handle the
  LB-specific control configuration.

* Implement a nested Octavia.

  Use another layer of Octavia to distribute traffic across the Amphora
  Cluster (i.e., the Amphorae in the Cluster are back-end members of
  another Octavia instance). This approach has the potential to provide
  greater flexibility (e.g., provide NAT and/or more complex distribution
  algorithms). It also potentially reuses existing code. However, we do
  not want the Distributor to proxy connections, so HAProxy cannot be
  used. Furthermore, this approach might significantly increase the
  overhead of the solution.

Data model impact
-----------------

* loadbalancer table

  - `cluster_id`: associated Amphora Cluster (no changes to the table; 1:1
    relationship from the Cluster data model)

* lb_topology table

  - new value: ``ACTIVE_ACTIVE``

* amphora_role table

  - new value: ``IN_CLUSTER``

* Distributor table (*new*): Distributor information, similar to Amphora.
  See the Distributor spec.

* Cluster table (*new*): an extension to the loadbalancer (i.e., one-to-one
  mapping to a load balancer); a sketch of a possible model follows this
  list.

  - `id` (primary key)

  - `cluster_name`: identifier of the Cluster instance for the Amphora
    Cluster Manager

  - `desired_size`: required number of Amphorae in the Cluster. Octavia will
    create this many active-active Amphorae in the Amphora Cluster.

  - `min_size`: the number of ``ACTIVE`` Amphorae in the Cluster must be
    above this number for the Amphora Cluster status to be ``ACTIVE``

  - `cooldown`: cooldown period between successive add/remove Amphora
    operations (to avoid thrashing)

  - `load_balancer_id`: 1:1 relationship to the loadbalancer

  - `distributor_id`: N:1 relationship to the Distributor (supports multiple
    Distributors)

  - `provisioning_status`

  - `operating_status`

  - `enabled`

  - `cluster_type`: type of Amphora Cluster implementation
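
The Cluster table could be rendered as a SQLAlchemy model roughly as follows.
The column types, lengths, and referenced table names are placeholders for
illustration, not the final migration:

.. code-block:: python

    import sqlalchemy as sa
    from sqlalchemy.ext import declarative

    Base = declarative.declarative_base()

    class Cluster(Base):
        """Sketch of the proposed Cluster table (not the final schema)."""

        __tablename__ = 'cluster'

        id = sa.Column(sa.String(36), primary_key=True)
        cluster_name = sa.Column(sa.String(255))
        desired_size = sa.Column(sa.Integer, nullable=False)
        min_size = sa.Column(sa.Integer, nullable=False)
        cooldown = sa.Column(sa.Integer, nullable=False)
        # 1:1 with the load balancer, hence the unique constraint.
        load_balancer_id = sa.Column(sa.String(36),
                                     sa.ForeignKey('load_balancer.id'),
                                     unique=True)
        # N:1 -- several Clusters may share one Distributor.
        distributor_id = sa.Column(sa.String(36),
                                   sa.ForeignKey('distributor.id'))
        provisioning_status = sa.Column(sa.String(16))
        operating_status = sa.Column(sa.String(16))
        enabled = sa.Column(sa.Boolean(), nullable=False)
        cluster_type = sa.Column(sa.String(36))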

REST API Impact
---------------

* Distributor REST API -- This is a new internal API that will be secured
  via two-way SSL certificate authentication. See the Distributor spec.

* Amphora REST API -- support configuration of disabling ``arp`` on the VIP.

* [P2]_ LBaaS API -- support configuration of the desired availability,
  perhaps by selecting a flavor (e.g., gold is a minimum of 4 Amphorae,
  platinum is a minimum of 10 Amphorae).

* Operator API --

  - Topology to use

  - Cluster type

  - Default availability parameters for the Amphora Cluster

Security impact
---------------

* See the Distributor spec for Distributor-related security impact.

Notifications impact
--------------------

None.


Other end user impact
---------------------

None.


Performance Impact
------------------

ACTIVE-ACTIVE should be able to deliver significantly higher performance than
the SINGLE or ACTIVE-STANDBY topology. It will consume more resources to
deliver this higher performance.

Other deployer impact
---------------------

The reference ACM becomes a new process that is part of the Octavia control
components (like the controller worker, health manager, and housekeeper). If
the reference implementation is used, a new Distributor image will need to be
created and stored in Glance, much the same way the Amphora image is created
and stored today.


Developer impact
----------------

None.

Implementation
==============

Assignee(s)
-----------

@TODO


Work Items
----------

@TODO


Dependencies
============

@TODO


Testing
=======

* Unit tests with tox.
* Functional tests with tox.
* Scenario tests.

Documentation Impact
====================

Need to document all new APIs and API changes, new ACTIVE-ACTIVE topology
design and features, and new instructions for operators seeking to deploy
Octavia with ACTIVE-ACTIVE topology.

References
==========

.. [1] https://blueprints.launchpad.net/octavia/+spec/base-image
.. [2] https://blueprints.launchpad.net/octavia/+spec/controller-worker
.. [3] https://blueprints.launchpad.net/octavia/+spec/amphora-driver-interface
.. [4] https://blueprints.launchpad.net/octavia/+spec/controller
.. [5] https://blueprints.launchpad.net/octavia/+spec/operator-api
.. [6] doc/main/api/haproxy-amphora-api.rst
.. [7] https://blueprints.launchpad.net/octavia/+spec/active-active-topology