Architecture Design Guide - network related content

This is incomplete and some information will need to be changed and updated by those with more technical knowledge. Implements: blueprint archguide-mitaka-reorg Partial-Bug: #1548182 Change-Id: I5acc55878e27c19ad81eefee637494652ace2ed9 Co-Authored-By: Victor Howard <victor.r.howard@gmail.com>
2016-03-28 11:13:26 -04:00 · 2016-03-28 11:13:26 -04:00 · cbb699f997
parent a82acdff52
commit cbb699f997
2 changed files with 388 additions and 7 deletions
--- a/doc/arch-design-draft/source/capacity-planning-scaling.rst
+++ b/doc/arch-design-draft/source/capacity-planning-scaling.rst
@ -186,11 +186,6 @@ resources servicing requests between proxy servers and storage nodes.
 For this reason, the network architecture used for access to storage
 nodes and proxy servers should make use of a design which is scalable.

-
-Network
-~~~~~~~
-.. TODO(unassigned): consolidate and update existing network sub-chapters.
-
 Compute resource design
 ~~~~~~~~~~~~~~~~~~~~~~~

--- a/doc/arch-design-draft/source/operator-requirements-network-design.rst
+++ b/doc/arch-design-draft/source/operator-requirements-network-design.rst
@ -1,6 +1,263 @@
-==============
+==========
+Networking
+==========
+
+OpenStack clouds generally have multiple network segments, with each
+segment providing access to particular resources. The network segments
+themselves also require network communication paths that should be
+separated from the other networks. When designing network services for a
+general purpose cloud, plan for either a physical or logical separation
+of network segments used by operators and tenants. Additional network
+segments can also be created for access to internal services such as the
+message bus and database used by various systems. Segregating these
+services onto separate networks helps to protect sensitive data and
+unauthorized access.
+
+Choose a networking service based on the requirements of your instances.
+The architecture and design of your cloud will impact whether you choose
+OpenStack Networking (neutron) or legacy networking (nova-network).
+
+Networking (neutron)
+~~~~~~~~~~~~~~~~~~~~
+
+OpenStack Networking (neutron) is a first class networking service that gives
+full control over creation of virtual network resources to tenants. This is
+often accomplished in the form of tunneling protocols that establish
+encapsulated communication paths over existing network infrastructure in order
+to segment tenant traffic. This method varies depending on the specific
+implementation, but some of the more common methods include tunneling over
+GRE, encapsulating with VXLAN, and VLAN tags.
+
+We recommend you design at least three network segments. The first segment
+should be a public network, used to access REST APIs by tenants and operators.
+The controller nodes and swift proxies are the only devices connecting to this
+network segment. In some cases, this public network might also be serviced by
+hardware load balancers and other network devices.
+
+The second segment is used by administrators to manage hardware resources.
+Configuration management tools also utilize this segment for deploying
+software and services onto new hardware. In some cases, this network
+segment is also used for internal services, including the message bus
+and database services. The second segment needs to communicate with every
+hardware node. Due to the highly sensitive nature of this network segment,
+it needs to be secured from unauthorized access.
+
+The third network segment is used by applications and consumers to access the
+physical network, and for users to access applications. This network is
+segregated from the one used to access the cloud APIs and is not capable
+of communicating directly with the hardware resources in the cloud.
+Communication on this network segment is required by compute resource
+nodes and network gateway services that allow application data to access the
+physical network from outside the cloud.
+
+Legacy networking (nova-network)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The legacy networking (nova-network) service is primarily a layer-2 networking
+service. It functions in two modes: flat networking mode and VLAN mode. In a
+flat network mode, all network hardware nodes and devices throughout the cloud
+are connected to a single layer-2 network segment that provides access to
+application data.
+
+However, when the network devices in the cloud support segmentation using
+VLANs, legacy networking can operate in the second mode. In this design model,
+each tenant within the cloud is assigned a network subnet which is mapped to
+a VLAN on the physical network. It is especially important to remember that
+the maximum number of VLANs that can be used within a spanning tree domain
+is 4096. This places a hard limit on the amount of growth possible within the
+data center. Consequently, when designing a general purpose cloud intended to
+support multiple tenants, we recommend the use of legacy networking with
+VLANs, and not in flat network mode.
+
+Layer-2 architecture advantages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+A network designed on layer-2 protocols has advantages over a network designed
+on layer-3 protocols. In spite of the difficulties of using a bridge to perform
+the network role of a router, many vendors, customers, and service providers
+choose to use Ethernet in as many parts of their networks as possible. The
+benefits of selecting a layer-2 design are:
+
+* Ethernet frames contain all the essentials for networking. These include, but
+  are not limited to, globally unique source addresses, globally unique
+  destination addresses, and error control.
+
+* Ethernet frames contain all the essentials for networking. These include,
+  but are not limited to, globally unique source addresses, globally unique
+  destination addresses, and error control.
+
+* Ethernet frames can carry any kind of packet. Networking at layer-2 is
+  independent of the layer-3 protocol.
+
+* Adding more layers to the Ethernet frame only slows the networking process
+  down. This is known as nodal processing delay.
+
+* You can add adjunct networking features, for example class of service (CoS)
+  or multicasting, to Ethernet as readily as IP networks.
+
+* VLANs are an easy mechanism for isolating networks.
+
+Most information starts and ends inside Ethernet frames. Today this applies
+to data, voice, and video. The concept is that the network will benefit more
+from the advantages of Ethernet if the transfer of information from a source
+to a destination is in the form of Ethernet frames.
+
+Although it is not a substitute for IP networking, networking at layer-2 can
+be a powerful adjunct to IP networking.
+
+Layer-2 Ethernet usage has these additional advantages over layer-3 IP network
+usage:
+
+* Speed
+* Reduced overhead of the IP hierarchy.
+* No need to keep track of address configuration as systems move around.
+
+Whereas the simplicity of layer-2 protocols might work well in a data center
+with hundreds of physical machines, cloud data centers have the additional
+burden of needing to keep track of all virtual machine addresses and
+networks. In these data centers, it is not uncommon for one physical node
+to support 30-40 instances.
+
+.. Important::
+
+   Networking at the frame level says nothing about the presence or
+   absence of IP addresses at the packet level. Almost all ports, links, and
+   devices on a network of LAN switches still have IP addresses, as do all the
+   source and destination hosts. There are many reasons for the continued need
+   for IP addressing. The largest one is the need to manage the network. A
+   device or link without an IP address is usually invisible to most
+   management applications. Utilities including remote access for diagnostics,
+   file transfer of configurations and software, and similar applications
+   cannot run without IP addresses as well as MAC addresses.
+
+Layer-2 architecture limitations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Layer-2 network architectures have some limitations that become noticeable when
+used outside of traditional data centers.
+
+* Number of VLANs is limited to 4096.
+* The number of MACs stored in switch tables is limited.
+* You must accommodate the need to maintain a set of layer-4 devices to handle
+  traffic control.
+* MLAG, often used for switch redundancy, is a proprietary solution that does
+  not scale beyond two devices and forces vendor lock-in.
+* It can be difficult to troubleshoot a network without IP addresses and ICMP.
+* Configuring ARP can be complicated on a large layer-2 networks.
+* All network devices need to be aware of all MACs, even instance MACs, so
+  there is constant churn in MAC tables and network state changes as instances
+  start and stop.
+* Migrating MACs (instance migration) to different physical locations are a
+  potential problem if you do not set ARP table timeouts properly.
+
+It is important to know that layer-2 has a very limited set of network
+management tools. It is difficult to control traffic as it does not have
+mechanisms to manage the network or shape the traffic. Network
+troubleshooting is also troublesome, in part because network devices have
+no IP addresses. As a result, there is no reasonable way to check network
+delay.
+
+In a layer-2 network all devices are aware of all MACs, even those that belong
+to instances. The network state information in the backbone changes whenever an
+instance starts or stops. Because of this, there is far too much churn in the
+MAC tables on the backbone switches.
+
+Furthermore, on large layer-2 networks, configuring ARP learning can be
+complicated. The setting for the MAC address timer on switches is critical
+and, if set incorrectly, can cause significant performance problems. So when
+migrating MACs to different physical locations to support instance migration,
+problems may arise. As an example, the Cisco default MAC address timer is
+extremely long. As such, the network information maintained in the switches
+could be out of sync with the new location of the instance.
+
+Layer-3 architecture advantages
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In layer-3 networking, routing takes instance MAC and IP addresses out of the
+network core, reducing state churn. The only time there would be a routing
+state change is in the case of a Top of Rack (ToR) switch failure or a link
+failure in the backbone itself. Other advantages of using a layer-3
+architecture include:
+
+* Layer-3 networks provide the same level of resiliency and scalability
+  as the Internet.
+
+* Controlling traffic with routing metrics is straightforward.
+
+* You can configure layer-3 to useˇBGPˇconfederation for scalability. This
+  way core routers have state proportional to the number of racks, not to the
+  number of servers or instances.
+
+* There are a variety of well tested tools, such as ICMP, to monitor and
+  manage traffic.
+
+* Layer-3 architectures enable the use of Quality of Service (QoS) to manage
+  network performance.
+
+Layer-3 architecture limitations
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The main limitation of layer 3 is that there is no built-in isolation mechanism
+comparable to the VLANs in layer-2 networks. Furthermore, the hierarchical
+nature of IP addresses means that an instance is on the same subnet as its
+physical host, making migration out of the subnet difficult. For these reasons,
+network virtualization needs to use IPencapsulation and software at the end
+hosts. This is for isolation and the separation of the addressing in the
+virtual layer from the addressing in the physical layer. Other potential
+disadvantages of layer 3 include the need to design an IP addressing scheme
+rather than relying on the switches to keep track of the MAC addresses
+automatically, and to configure the interior gateway routing protocol in the
+switches.
+
 Network design
-==============
+~~~~~~~~~~~~~~
+
+There are many reasons an OpenStack network has complex requirements. However,
+one main factor is the many components that interact at different levels of the
+system stack, adding complexity. Data flows are also complex. Data in an
+OpenStack cloud moves both between instances across the network (also known as
+East-West), as well as in and out of the system (also known as North-South).
+Physical server nodes have network requirements that are independent of
+instance network requirements; these you must isolate from the core network
+to account for scalability. We recommend functionally separating the networks
+for security purposes and tuning performance through traffic shaping.
+
+You must consider a number of important general technical and business factors
+when planning and designing an OpenStack network. These include:
+
+* A requirement for vendor independence. To avoid hardware or software vendor
+  lock-in, the design should not rely on specific features of a vendors router
+  or switch.
+* A requirement to massively scale the ecosystem to support millions of end
+  users.
+* A requirement to support indeterminate platforms and applications.
+* A requirement to design for cost efficient operations to take advantage of
+  massive scale.
+* A requirement to ensure that there is no single point of failure in the
+  cloud ecosystem.
+* A requirement for high availability architecture to meet customer SLA
+  requirements.
+* A requirement to be tolerant of rack level failure.
+* A requirement to maximize flexibility to architect future production
+  environments.
+
+Bearing in mind these considerations, we recommend the following:
+
+* Layer-3 designs are preferable to layer-2 architectures.
+* Design a dense multi-path network core to support multi-directional
+  scaling and flexibility.
+* Use hierarchical addressing because it is the only viable option to scale
+  network ecosystem.
+* Use virtual networking to isolate instance service network traffic from the
+  management and internal network traffic.
+* Isolate virtual networks using encapsulation technologies.
+* Use traffic shaping for performance tuning.
+* Use eBGP to connect to the Internet up-link.
+* Use iBGP to flatten the internal traffic on the layer-3 mesh.
+* Determine the most effective configuration for block storage network.
+
+Operator considerations
+-----------------------

 The network design for an OpenStack cluster includes decisions regarding
 the interconnect needs within the cluster, the need to allow clients to
@ -56,3 +313,132 @@ correctly, failover traffic could overwhelm other ports or network
 links and create a cascading failure scenario. In this case,
 traffic that fails over to one link overwhelms that link and then
 moves to the subsequent links until all network traffic stops.
+
+Additional considerations
+-------------------------
+
+There are several further considerations when designing a network-focused
+OpenStack cloud. Redundant networking: ToR switch high availability risk
+analysis. In most cases, it is much more economical to use a single switch
+with a small pool of spare switches to replace failed units than it is to
+outfit an entire data center with redundant switches. Applications should
+tolerate rack level outages without affecting normal operations since network
+and compute resources are easily provisioned and plentiful.
+
+Research indicates the mean time between failures (MTBF) on switches is
+between 100,000 and 200,000 hours. This number is dependent on the ambient
+temperature of the switch in the data center. When properly cooled and
+maintained, this translates to between 11 and 22 years before failure. Even
+in the worst case of poor ventilation and high ambient temperatures in the data
+center, the MTBF is still 2-3 years.
+
+Reference
+https://www.garrettcom.com/techsupport/papers/ethernet_switch_reliability.pdf
+for further information.
+
+Legacy networking (nova-network)
+OpenStack Networking
+Simple, single agent
+Complex, multiple agents
+Flat or VLAN
+Flat, VLAN, Overlays, L2-L3, SDN
+No plug-in support
+Plug-in support for 3rd parties
+No multi-tier topologies
+Multi-tier topologies
+
+Preparing for the future: IPv6 support
+--------------------------------------
+
+One of the most important networking topics today is the exhaustion of
+IPv4 addresses. As of late 2015, ICANN announced that the the final
+IPv4 address blocks have been fully assigned. Because of this, IPv6
+protocol has become the future of network focused applications. IPv6
+increases the address space significantly, fixes long standing issues
+in the IPv4 protocol, and will become essential for network focused
+applications in the future.
+
+OpenStack Networking, when configured for it, supports IPv6. To enable
+IPv6, create an IPv6 subnet in Networking and use IPv6 prefixes when
+creating security groups.
+
+Asymmetric links
+----------------
+
+When designing a network architecture, the traffic patterns of an
+application heavily influence the allocation of total bandwidth and
+the number of links that you use to send and receive traffic. Applications
+that provide file storage for customers allocate bandwidth and links to
+favor incoming traffic; whereas video streaming applications allocate
+bandwidth and links to favor outgoing traffic.
+
+Performance
+-----------
+
+It is important to analyze the applications tolerance for latency and
+jitter when designing an environment to support network focused
+applications. Certain applications, for example VoIP, are less tolerant
+of latency and jitter. When latency and jitter are issues, certain
+applications may require tuning of QoS parameters and network device
+queues to ensure that they queue for transmit immediately or guarantee
+minimum bandwidth. Since OpenStack currently does not support these functions,
+consider carefully your selected network plug-in.
+
+The location of a service may also impact the application or consumer
+experience. If an application serves differing content to different users,
+it must properly direct connections to those specific locations. Where
+appropriate, use a multi-site installation for these situations.
+
+You can implement networking in two separate ways. Legacy networking
+(nova-network) provides a flat DHCP network with a single broadcast domain.
+This implementation does not support tenant isolation networks or advanced
+plug-ins, but it is currently the only way to implement a distributed
+layer-3 (L3) agent using the multi host configuration. OpenStack Networking
+(neutron) is the official networking implementation and provides a pluggable
+architecture that supports a large variety of network methods. Some of these
+include a layer-2 only provider network model, external device plug-ins, or
+even OpenFlow controllers.
+
+Networking at large scales becomes a set of boundary questions. The
+determination of how large a layer-2 domain must be is based on the
+amount of nodes within the domain and the amount of broadcast traffic
+that passes between instances. Breaking layer-2 boundaries may require
+the implementation of overlay networks and tunnels. This decision is a
+balancing act between the need for a smaller overhead or a need for a smaller
+domain.
+
+When selecting network devices, be aware that making a decision based on the
+greatest port density often comes with a drawback. Aggregation switches and
+routers have not all kept pace with Top of Rack switches and may induce
+bottlenecks on north-south traffic. As a result, it may be possible for
+massive amounts of downstream network utilization to impact upstream network
+devices, impacting service to the cloud. Since OpenStack does not currently
+provide a mechanism for traffic shaping or rate limiting, it is necessary to
+implement these features at the network hardware level.
+
+Tunable networking components
+-----------------------------
+
+Consider configurable networking components related to an OpenStack
+architecture design when designing for network intensive workloads
+that include MTU and QoS. Some workloads require a larger MTU than normal
+due to the transfer of large blocks of data. When providing network
+service for applications such as video streaming or storage replication,
+we recommend that you configure both OpenStack hardware nodes and the
+supporting network equipment for jumbo frames where possible. This
+allows for better use of available bandwidth. Configure jumbo frames across the
+complete path the packets traverse. If one network component is not capable of
+handling jumbo frames then the entire path reverts to the default MTU.
+
+Quality of Service (QoS) also has a great impact on network intensive workloads
+as it provides instant service to packets which have a higher priority due to
+the impact of poor network performance. In applications such as Voice over IP
+(VoIP), differentiated services code points are a near requirement for proper
+operation. You can also use QoS in the opposite direction for mixed workloads
+to prevent low priority but high bandwidth applications, for example backup
+services, video conferencing, or file sharing, from blocking bandwidth that is
+needed for the proper operation of other workloads. It is possible to tag file
+storage traffic as a lower class, such as best effort or scavenger, to allow
+the higher priority traffic through. In cases where regions within a cloud
+might be geographically distributed it may also be necessary to plan
+accordingly to implement WAN optimization to combat latency or packet loss.