Merge "Neutron Routed Networks"

2016-06-02 21:49:14 +00:00 · 2016-06-02 21:49:14 +00:00 · 4fc9ec210d
parent 06559388c6 67377ed376
commit 4fc9ec210d
1 changed files with 329 additions and 0 deletions
--- a/specs/newton/approved/neutron-routed-networks.rst
+++ b/specs/newton/approved/neutron-routed-networks.rst
@ -0,0 +1,329 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=======================
+Neutron Routed Networks
+=======================
+
+https://blueprints.launchpad.net/nova/+spec/neutron-routed-networks
+
+In Neutron, there is a priority effort to support routed networks.  A routed
+network, in this context, is a physical network infrastructure that implements
+scaled networks by routing instead of large L2 broadcast domains.  For example,
+deployers may have routers at each top-of-rack.  Instead of a single VLAN
+covering the deployment, each rack would have its own VLAN and the routers will
+provide reachability to the rest of the racks over L3.  `Operators want Neutron
+to model this like a single network`__.  This has implications for Nova
+scheduling and possibly migration.
+
+__ operators-rfe_
+
+
+Problem description
+===================
+
+`Neutron has a spec`__ for how this will be handled in there.  Each L2 network
+is referred to as a segment.  Other terminology is in discussion in the spec.
+
+__ neutron-spec_
+
+For Nova, this has a couple of specific implications.  First, IP subnets will
+have affinity to particular network segments.  Second, compute hosts will have
+L2 reachability to (typically) only one segment within an network.  This means
+that IP addresses assigned to ports are constrained to a potentially small
+subset of compute hosts.
+
+Currently, Nova requires an IP address on a port.  If that requirement were
+kept, and that IP address is constrained to a small subset of compute hosts,
+then the scheduler would have to constrain scheduling to that subset.  This is
+a pretty severe artificial constraint on the scheduler.  To avoid it, Neutron
+needs to be able to leave the IP address unassigned until after the port is
+bound to a host.  After host binding, Nova can still fail the build for a
+deferred IP port if an IP is still not allocated.
+
+A related but much less severe constraint is that of IP availability across
+segments.  Some segments might be exhausted and that should be considered by
+the scheduler.  This is a resource that is under the control of Neutron and
+hence will need `a resource provider created`__ to manage it for the Nova
+scheduler.
+
+__ resource-providers-spec_
+
+For move operations involving the scheduler (e.g. live migrations), the VM
+already has an IP address.  For that IP address to continue to work, the VM
+must be migrated to another host with reachability to the same network segment.
+Forced move operations that bypass the scheduler may cause a failure at binding
+time if the segment is not available on the new host.
+
+Use Cases
+----------
+
+In the following use cases, there is an assumption that all segments in Neutron
+can be associated with one or more aggregates in Nova via the `proposed new`__
+`openstack resource-pool create` and `openstack resource-pool add aggregate``
+commands and associated REST API.
+
+__ generic-resource-pools_
+
+#. User has a port without a binding to a segment and provides it to nova boot.
+   Such a port will not have an IP address until after the scheduler places the
+   instance and the port gets bound to that host.  Then, Neutron can assign an
+   IP address from a segment which that compute host can reach.
+
+   In this use case, the scheduler must take into consideration the
+   availability of IP addresses in each of the segments.  For example, there
+   could be some segments in the network which are out of addresses completely.
+
+   - A similar use case is to add an additional port to an existing instance.
+     In this case, the segment and IP address of the new port will be set when
+     the new port is bound to the compute host.  Since the port was unbound to
+     begin with, there should be no restriction.
+
+     Binding may fail in this case if all of the segments available to the host
+     are out of IP addresses.
+
+#. User has a port that has an IP address and thus is effectively attached to a
+   segment (but not bound to a host).  He/She provides it to nova boot.  Nova
+   will ask Neutron for the segment to which the port is bound by getting the
+   details of the port.  Given that segment, the scheduler should place the
+   instance on a compute host belonging to the corresponding aggregate.
+
+   - A similar use case is to add an additional port to an existing instance.
+     In this case, the segment of the new port must match a segment available
+     to the instance's host.  If not, adding the port to the instance should
+     fail.
+
+#. User calls Nova boot and passes a network id.  The Nova scheduler will call
+   Neutron to create a port, will place the instance, and then will call
+   Neutron to update the port with binding details.  Neutron will use the host
+   binding to set the segment and allocate the IP.
+
+#. Any move operation calling out the scheduler.  In this case, the port
+   already has an IP address.  That IP address is only viable in the same
+   segment.  The scheduler must only consider target hosts that belong to the
+   same segment (or aggregate).
+
+Proposed change
+===============
+
+Neutron will be a resource provider as described in the `generic resource pools
+specification`__ and its dependencies.   I imagine that Neutron will create and
+maintain aggregates corresponding to its segments so that Nova has the same
+mapping as Neutron does of hosts to segments.
+
+__ generic-resource-pools_
+
+Next, Neutron creates a resource_pool for each of the segments.  The pool has a
+resource class (e.g. "IPV4_ADDRESS" or "IPV6_ADDRESS") in common with other
+resource pools but each pool is specific to a segment id.  The linkage is set
+by setting the UUID of the resource pool equal to the UUID of the segment in
+Neutron.  Resource pools are linked to the host aggregates.
+
+The resource pool has a record in an inventories table for IPs as a resource
+class.  It effectively gives the capacity of the pool from Nova's perspective::
+
+  capacity = (total - reserved) * allocation_ratio
+
+Neutron will call Nova's REST API to set "total" to the size of the allocation
+pool(s) on the subnets.  This will remain mostly static but could change if the
+allocation pool is updated in a subnet-update call.  The allocation_ratio will
+always be 1.0 in this use case.
+
+Neutron sets reserved to the total number of addresses which are consumed
+outside of Nova's purview.  This includes overhead stuff like dhcp and dns
+consumed from the subnets' allocation pool which Neutron shares with Nova.
+This is expected to remain mostly constant but might change a little more often
+than the total if new overhead ports are allocated in Neutron.
+
+The allocations table indicates how much of the capacity has been consumed by
+Nova.
+
+There can be a race to consume IP resources for any given segment.  In current
+Nova, the claim is made on the compute node after scheduling is done.   This
+can result in a race to consume IPs if the IP resource is getting low.  With
+the claim being made by the compute node, a failure to collect the claim can be
+very costly since the compute node has already started the process of claiming
+and consuming other resources.
+
+To reduce the cost of a failed claim this spec depends on `John G's spec`__ for
+pre-allocating before scheduling and moving the port update to the conductor.
+
+__ prep-for-network-aware-scheduling_
+
+Regarding the use cases where the user has a port and brings that port to Nova
+to create an instance (or to add it to an existing instance), they appear the
+same at first::
+
+  nova boot --nic port_id=$PORT_ID
+
+Nova will make a call to Neutron to get or create a port and will receive the
+details of the port in the response.  In those details, Neutron will include
+the segment_id of the each fixed_ip on the port if it is bound to a segment.
+This segment_id will be used to lookup the resource provider for IP addresses
+on the segment.
+
+For Nova to allow deferring IP allocation on a port, a new attribute will be
+added to the Neutron port called ip_allocation.  It will have one of three
+values:  "immediate," "deferred," or "none."  Ports with "immediate"
+ip_allocation act like ports do today:  it is expected that an IP will be
+allocated on port create.  Ports with "deferred" ip_allocation will have an IP
+address allocated on port update when host binding information is provided.
+Ports with "none" in ip_allocation are not intended to have an IP address
+allocation at all.  It is beyond the scope of this patch to handle ports with
+"none."
+
+Alternatives
+------------
+
+One alternative was considered around trying to eliminate races for IP resource
+between Nova and Neutron.  It involved significantly more active maintenance of
+the reserved field on the resource provider and required that the the
+allocation was conditionally recorded depending on the scenario.
+
+This method was rejected in favor of the current proposal for its complexity.
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Security impact
+---------------
+
+None
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+#. Users who create a port with Neutron and bring it to Nova will notice that
+   the port doesn't have an IP address when the network is routed.
+
+#. Operators will notice the use of host aggregates which correspond to
+   Neutron segments and their corresponding resource providers.
+
+Performance Impact
+------------------
+
+The preceding spec to `prepare Nova for network aware`__ has some performance
+effects that should be noted here although this spec does not add to those.  It
+moves port get/create to before the scheduler which adds some overhead.  It
+also moves the port update to the conductor which will significantly reduce the
+overhead involved when port update fails due to exhausted IP address resources.
+
+__ prep-for-network-aware-scheduling_
+
+Other deployer impact
+---------------------
+
+Since this work is co-dependent on work in Neutron, there are some upgrade
+considerations.  If routed networks are not used in Neutron then there is no
+problem.  Existing networks and new non-routed networks will still work the way
+they do today.  Since routed networks are an optional new feature, this will
+only affect operators who wish to take advantage of it.
+
+The best thing for operators to do will be to upgrade both services before
+attempting to configure a routed provider network.  However, I'll discuss the
+implications of rolling upgrades.
+
+Consider if the Neutron API is upgraded and Nova is not.  Neutron will not have
+the generic resource provider API endpoint available.  Neutron will need to
+handle this gracefully taking advantage of microversioning in the Nova API.
+Neutron will poll infrequently to discover when Nova has been upgraded and will
+make use of the API when it becomes available.
+
+In the meantime, it will be possible to create routed networks in Neutron but
+scheduling will not be IP resource aware.  So, if segments run out of
+addresses, boot failures will happen when a VM is scheduled to these segments
+when Nova attempts to create a port and that fails.
+
+Finally, the deferred IP allocation use case will not work because Nova will
+refuse to use a port without an IP address until it has been upgraded.  The use
+cases that don't involve deferred IP allocation will work until the above IP
+exhaustion problem is encountered.
+
+If Nova is upgrade and Neutron is not, then there is no problem because routed
+provider networks and deferred IP address ports are not possible.
+
+Developer impact
+----------------
+
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+* `Miguel Lavalle <https://launchpad.net/~minsel>`_
+* `Carl Baldwin <https://launchpad.net/~carl-baldwin>`_
+
+Work Items
+----------
+
+* Get segment_id, if available, from the port in the pre-schedule phase on the
+  conductor.  Use that segment_id to look up the resource provider for IP
+  address.
+
+* Allow deferred or no IP addresses on ports by looking at the ip_allocation
+  attribute on the port.
+
+* Neutron to curate host aggregates and resource pools within Nova.  (This is
+  Neutron acting as a client to the Nova API, isn't it?  So, it isn't really a
+  Nova work item.)
+
+Dependencies
+============
+
+This is co-dependent on the `Neutron spec`__ mentioned above.  Also depends on
+the `resource providers`__ which has merged in Nova and the newly created `spec
+to prepare for network aware scheduling`__.
+
+__ neutron-spec_
+__ resource-providers-spec_
+__ prep-for-network-aware-scheduling_
+
+
+Testing
+=======
+
+All new functionality will be covered with unit tests.  We'll be looking to
+create a multi-node job to run on Neutron and Nova which tests out routed
+networks.  It will include tests specifically for the use cases mentioned in
+this spec.
+
+
+Documentation Impact
+====================
+
+The OpenStack Administrator Guide will be updated.
+
+
+References
+==========
+
+.. _operators-rfe: https://bugs.launchpad.net/neutron/+bug/1458890
+.. _neutron-spec: https://review.openstack.org/#/c/225384/
+.. _prep-for-network-aware-scheduling: https://review.openstack.org/#/c/313001/
+.. _resource-providers-spec: https://review.openstack.org/#/c/225546/10/specs/mitaka/approved/resource-providers.rst
+.. _generic-resource-pools: https://review.openstack.org/#/c/300176/16/specs/newton/approved/generic-resource-pools.rst
+
+
+History
+=======
+
+None