Merge "Convert Object Storage files to RST"

This commit is contained in:
Jenkins 2015-06-12 10:13:13 +00:00 committed by Gerrit Code Review
commit ca92d0caf4
11 changed files with 380 additions and 3 deletions

[7 binary image files added (figures for the Object Storage pages), between 10 KiB and 61 KiB each; content not shown.]

View File

@@ -8,13 +8,12 @@ Contents
.. toctree::
   :maxdepth: 2

-  objectstorage_characteristics.rst
-  .. TODO (karenb)
+  objectstorage_intro.rst
+  objectstorage_features.rst
+  objectstorage_characteristics.rst
   objectstorage_components.rst
-  .. TODO (karenb)
+  objectstorage_ringbuilder.rst
   objectstorage_arch.rst
   objectstorage_replication.rst

View File

@@ -0,0 +1,283 @@
==========
Components
==========
The components that enable Object Storage to deliver high availability,
high durability, and high concurrency are:
- **Proxy servers.** Handle all of the incoming API requests.
- **Rings.** Map logical names of data to locations on particular
disks.
- **Zones.** Isolate data from other zones. A failure in one zone
doesn't impact the rest of the cluster because data is replicated
across zones.
- **Accounts and containers.** Each account and container is an
individual database that is distributed across the cluster. An
account database contains the list of containers in that account. A
container database contains the list of objects in that container.
- **Objects.** The data itself.
- **Partitions.** A partition stores objects, account databases, and
container databases and helps manage locations where data lives in
the cluster.
|
.. _objectstorage-building-blocks-figure:
**Object Storage building blocks**
.. figure:: figures/objectstorage-buildingblocks.png
|
Proxy servers
-------------
Proxy servers are the public face of Object Storage and handle all of
the incoming API requests. Once a proxy server receives a request, it
determines the storage node based on the object's URL, for example,
https://swift.example.com/v1/account/container/object. Proxy servers
also coordinate responses, handle failures, and coordinate timestamps.
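For illustration only, the following sketch (not the actual proxy server
code) shows how the account, container, and object names can be pulled
out of such a URL:

.. code-block:: python

   from urllib.parse import urlparse

   def split_swift_path(url):
       """Split a Swift-style object URL into (version, account, container, object).

       Illustrative helper only; container and object may be absent for
       account- or container-level requests.
       """
       segments = urlparse(url).path.lstrip('/').split('/', 3)
       version, account = segments[0], segments[1]
       container = segments[2] if len(segments) > 2 else None
       obj = segments[3] if len(segments) > 3 else None
       return version, account, container, obj

   print(split_swift_path('https://swift.example.com/v1/account/container/object'))
   # ('v1', 'account', 'container', 'object')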
Proxy servers use a shared-nothing architecture and can be scaled as
needed based on projected workloads. A minimum of two proxy servers
should be deployed for redundancy. If one proxy server fails, the others
take over.
For more information concerning proxy server configuration, please see
the `Configuration
Reference <http://docs.openstack.org/trunk/config-reference/content/proxy-server-configuration.html>`__.
Rings
-----
A ring represents a mapping between the names of entities stored on disk
and their physical locations. There are separate rings for accounts,
containers, and objects. When other components need to perform any
operation on an object, container, or account, they need to interact
with the appropriate ring to determine their location in the cluster.
The ring maintains this mapping using zones, devices, partitions, and
replicas. Each partition in the ring is replicated, by default, three
times across the cluster, and partition locations are stored in the
mapping maintained by the ring. The ring is also responsible for
determining which devices are used for handoff in failure scenarios.
Data can be isolated into zones in the ring. Each partition replica is
guaranteed to reside in a different zone. A zone could represent a
drive, a server, a cabinet, a switch, or even a data center.
The partitions of the ring are equally divided among all of the devices
in the Object Storage installation. When partitions need to be moved
around (for example, if a device is added to the cluster), the ring
ensures that a minimum number of partitions are moved at a time, and
only one replica of a partition is moved at a time.
You can use weights to balance the distribution of partitions on drives
across the cluster. This can be useful, for example, when differently
sized drives are used in a cluster.
The ring is used by the proxy server and several background processes
(like replication).
|
.. _objectstorage-ring-figure:
**The ring**
.. figure:: figures/objectstorage-ring.png
|
These rings are externally managed: the server processes themselves do
not modify the rings; instead, they are given new rings modified by
other tools.
The ring uses a configurable number of bits from an MD5 hash for a path
as a partition index that designates a device. The number of bits kept
from the hash is known as the partition power, and 2 to the partition
power indicates the partition count. Partitioning the full MD5 hash ring
allows other parts of the cluster to work in batches of items at once,
which ends up being either more efficient or at least less complex than
working with each item separately or the entire cluster all at once.
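For illustration, the following sketch derives a partition number by
keeping the top ``part_power`` bits of an MD5 hash of the path. The real
ring also mixes a per-cluster hash path prefix and suffix into the hash;
that detail is omitted here:

.. code-block:: python

   import hashlib

   def partition_for(account, container=None, obj=None, part_power=10):
       """Map a path to a partition by keeping the top `part_power` bits of its MD5 hash."""
       path = '/' + '/'.join(p for p in (account, container, obj) if p)
       digest = hashlib.md5(path.encode('utf-8')).digest()
       # Take the first 4 bytes of the hash and shift away all but `part_power` bits.
       return int.from_bytes(digest[:4], 'big') >> (32 - part_power)

   # With part_power=10 there are 2**10 = 1024 partitions, numbered 0-1023.
   print(partition_for('AUTH_test', 'photos', 'cat.jpg'))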
Another configurable value is the replica count, which indicates how
many of the partition-device assignments make up a single ring. For a
given partition number, each replica's device will not be in the same
zone as any other replica's device. Zones can be used to group devices
based on physical locations, power separations, network separations, or
any other attribute that would improve the availability of multiple
replicas at the same time.
Zones
-----
Object Storage allows configuring zones in order to isolate failure
boundaries. Each data replica resides in a separate zone, if possible.
At the smallest level, a zone could be a single drive or a grouping of a
few drives. If there were five object storage servers, then each server
would represent its own zone. Larger deployments would have an entire
rack (or multiple racks) of object servers, each representing a zone.
The goal of zones is to allow the cluster to tolerate significant
outages of storage servers without losing all replicas of the data.
As mentioned earlier, everything in Object Storage is stored, by
default, three times. Swift will place each replica
"as-uniquely-as-possible" to ensure both high availability and high
durability. This means that when choosing a replica location, Object
Storage chooses a server in an unused zone before an unused server in a
zone that already has a replica of the data.
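The placement rule can be sketched as follows. This is not the ring
builder's actual algorithm; it only illustrates preferring an unused
zone, and then an unused server, for the next replica:

.. code-block:: python

   def pick_next_device(candidates, already_placed):
       """Prefer an unused zone, then an unused server, for the next replica.

       Each device is a dict such as {'zone': 1, 'server': 'obj1', 'dev': 'sdb'}.
       """
       used_zones = {d['zone'] for d in already_placed}
       used_servers = {d['server'] for d in already_placed}
       # False sorts before True, so devices in unused zones and servers win.
       return min(
           candidates,
           key=lambda d: (d['zone'] in used_zones, d['server'] in used_servers),
       )

   devices = [
       {'zone': 1, 'server': 'obj1', 'dev': 'sdb'},
       {'zone': 1, 'server': 'obj2', 'dev': 'sdb'},
       {'zone': 2, 'server': 'obj3', 'dev': 'sdb'},
   ]
   placed = [devices[0]]
   print(pick_next_device(devices[1:], placed))  # picks the zone-2 device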
|
.. _objectstorage-zones-figure:
**Zones**
.. figure:: figures/objectstorage-zones.png
|
When a disk fails, replica data is automatically distributed to the
other zones to ensure there are three copies of the data.
Accounts and containers
-----------------------
Each account and container is an individual SQLite database that is
distributed across the cluster. An account database contains the list of
containers in that account. A container database contains the list of
objects in that container.
|
.. _objectstorage-accountscontainers-figure:
**Accounts and containers**
.. figure:: figures/objectstorage-accountscontainers.png
|
To keep track of object data locations, each account in the system has a
database that references all of its containers, and each container
database references each object.
Partitions
----------
A partition is a collection of stored data, including account databases,
container databases, and objects. Partitions are core to the replication
system.
Think of a partition as a bin moving throughout a fulfillment center
warehouse. Individual orders get thrown into the bin. The system treats
that bin as a cohesive entity as it moves throughout the system. A bin
is easier to deal with than many little things. It makes for fewer
moving parts throughout the system.
System replicators and object uploads/downloads operate on partitions.
As the system scales up, its behavior continues to be predictable
because the number of partitions is a fixed number.
Implementing a partition is conceptually simple: a partition is just a
directory sitting on a disk with a corresponding hash table of what it
contains.
|
.. _objectstorage-partitions-figure:
**Partitions**
.. figure:: figures/objectstorage-partitions.png
|
Replicators
-----------
In order to ensure that there are three copies of the data everywhere,
replicators continuously examine each partition. For each local
partition, the replicator compares it against the replicated copies in
the other zones to see if there are any differences.
The replicator knows whether replication needs to take place by
examining hashes. A hash file is created for each partition, containing
hashes of each directory in the partition. For a given partition, the
hash files of each of the partition's copies are compared. If the hashes
differ, it is time to replicate, and the directory that needs to be
replicated is copied over.
This is where partitions come in handy. With fewer things in the system,
larger chunks of data are transferred around (rather than over lots of
little TCP connections, which would be inefficient) and there is a
consistent number of hashes to compare.
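The comparison can be sketched as follows. This is a simplification of
what the replicator does; it only conveys the idea of hashing each
directory in a partition and syncing the directories whose hashes
differ:

.. code-block:: python

   import hashlib
   import os

   def directory_hashes(partition_path):
       """Hash the file names under each directory of a partition (simplified)."""
       hashes = {}
       for entry in sorted(os.listdir(partition_path)):
           entry_path = os.path.join(partition_path, entry)
           if not os.path.isdir(entry_path):
               continue
           md5 = hashlib.md5()
           for root, _dirs, files in os.walk(entry_path):
               for name in sorted(files):
                   md5.update(name.encode('utf-8'))
           hashes[entry] = md5.hexdigest()
       return hashes

   def directories_to_sync(local_hashes, remote_hashes):
       """Directories whose hashes differ between two replicas need replication."""
       return [d for d, h in local_hashes.items() if remote_hashes.get(d) != h]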
The cluster eventually reaches a consistent state in which the newest
data takes priority.
|
.. _objectstorage-replication-figure:
**Replication**
.. figure:: figures/objectstorage-replication.png
|
If a zone goes down, one of the nodes containing a replica notices and
proactively copies data to a handoff location.
Use cases
---------
The following sections show use cases for object uploads and downloads
and introduce the components.
Upload
~~~~~~
A client uses the REST API to make an HTTP request to PUT an object into
an existing container. The cluster receives the request. First, the
system must figure out where the data is going to go. To do this, the
account name, container name, and object name are all used to determine
the partition where this object should live.
Then a lookup in the ring figures out which storage nodes contain the
partitions in question.
The data is then sent to each storage node where it is placed in the
appropriate partition. At least two of the three writes must be
successful before the client is notified that the upload was successful.
Next, the container database is updated asynchronously to reflect that
there is a new object in it.
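The write path can be sketched as follows. This is illustrative only;
``fake_put`` stands in for the real PUT request to a storage node, and
in a real cluster failed writes are retried against handoff nodes:

.. code-block:: python

   def quorum_write(nodes, put_object, replica_count=3):
       """Report success once a majority of the replica writes succeed."""
       quorum = replica_count // 2 + 1  # two of three with the default replica count
       successes = 0
       for node in nodes:
           try:
               put_object(node)
               successes += 1
           except OSError:
               continue  # a real proxy would retry against a handoff node
       return successes >= quorum

   def fake_put(node):
       """Simulate one failed storage node out of three."""
       if node == 'node2':
           raise OSError('disk failure')

   print(quorum_write(['node1', 'node2', 'node3'], fake_put))  # True: 2 of 3 writes succeeded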
|
.. _objectstorage-usecase-figure:
**Object Storage in use**
.. figure:: figures/objectstorage-usecase.png
|
Download
~~~~~~~~
A request comes in for an account/container/object. Using the same
consistent hashing, the partition name is generated. A lookup in the
ring reveals which storage nodes contain that partition. A request is
made to one of the storage nodes to fetch the object and, if that fails,
requests are made to the other nodes.
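The read path can be sketched in the same spirit, with ``get_object``
standing in for the real GET request to a storage node:

.. code-block:: python

   def fetch_object(nodes, get_object):
       """Try each node that holds a replica; return the first successful read."""
       for node in nodes:
           try:
               return get_object(node)
           except OSError:
               continue  # fall back to the next replica on failure
       raise RuntimeError('object unavailable on all replicas')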

View File

@@ -0,0 +1,72 @@
=====================
Features and benefits
=====================
+-----------------------------+--------------------------------------------------+
| Features | Benefits |
+=============================+==================================================+
| Leverages commodity | No lock-in, lower price/GB. |
| hardware | |
+-----------------------------+--------------------------------------------------+
| HDD/node failure agnostic | Self-healing, reliable, data redundancy protects |
| | from failures. |
+-----------------------------+--------------------------------------------------+
| Unlimited storage | Large and flat namespace, highly scalable |
| | read/write access, able to serve content |
| | directly from storage system. |
+-----------------------------+--------------------------------------------------+
| Multi-dimensional | Scale-out architecture: Scale vertically and |
| scalability | horizontally-distributed storage. Backs up |
| | and archives large amounts of data with |
| | linear performance. |
+-----------------------------+--------------------------------------------------+
| Account/container/object | No nesting, not a traditional file system: |
| structure | Optimized for scale, it scales to multiple |
| | petabytes and billions of objects. |
+-----------------------------+--------------------------------------------------+
| Built-in replication | A configurable number of accounts, containers |
| 3✕ + data redundancy | and object copies for high availability. |
| (compared with 2✕ on RAID) | |
+-----------------------------+--------------------------------------------------+
| Easily add capacity (unlike | Elastic data scaling with ease |
| RAID resize) | |
+-----------------------------+--------------------------------------------------+
| No central database | Higher performance, no bottlenecks |
+-----------------------------+--------------------------------------------------+
| RAID not required | Handle many small, random reads and writes |
| | efficiently |
+-----------------------------+--------------------------------------------------+
| Built-in management | Account management: Create, add, verify, |
| utilities | and delete users; Container management: Upload, |
| | download, and verify; Monitoring: Capacity, |
| | host, network, log trawling, and cluster health. |
+-----------------------------+--------------------------------------------------+
| Drive auditing | Detect drive failures preempting data corruption |
+-----------------------------+--------------------------------------------------+
| Expiring objects | Users can set an expiration time or a TTL on an |
| | object to control access |
+-----------------------------+--------------------------------------------------+
| Direct object access | Enable direct browser access to content, such as |
| | for a control panel |
+-----------------------------+--------------------------------------------------+
| Realtime visibility into | Know what users are requesting. |
| client requests | |
+-----------------------------+--------------------------------------------------+
| Supports S3 API | Utilize tools that were designed for the popular |
| | S3 API. |
+-----------------------------+--------------------------------------------------+
| Restrict containers per | Limit access to control usage by user. |
| account | |
+-----------------------------+--------------------------------------------------+
| Support for NetApp, | Unified support for block volumes using a |
| Nexenta, SolidFire | variety of storage systems. |
+-----------------------------+--------------------------------------------------+
| Snapshot and backup API for | Data protection and recovery for VM data. |
| block volumes | |
+-----------------------------+--------------------------------------------------+
| Standalone volume API | Separate endpoint and API for integration with |
| available | other compute systems. |
+-----------------------------+--------------------------------------------------+
| Integration with Compute | Fully integrated with Compute for attaching |
| | block volumes and reporting on usage. |
+-----------------------------+--------------------------------------------------+

View File

@@ -0,0 +1,23 @@
==============================
Introduction to Object Storage
==============================
OpenStack Object Storage (code-named swift) is open source software for
creating redundant, scalable data storage using clusters of standardized
servers to store petabytes of accessible data. It is a long-term storage
system for large amounts of static data that can be retrieved,
leveraged, and updated. Object Storage uses a distributed architecture
with no central point of control, providing greater scalability,
redundancy, and permanence. Objects are written to multiple hardware
devices, with the OpenStack software responsible for ensuring data
replication and integrity across the cluster. Storage clusters scale
horizontally by adding new nodes. Should a node fail, OpenStack works to
replicate its content from other active nodes. Because OpenStack uses
software logic to ensure data replication and distribution across
different devices, inexpensive commodity hard drives and servers can be
used in lieu of more expensive equipment.
Object Storage is ideal for cost-effective, scale-out storage. It
provides a fully distributed, API-accessible storage platform that can
be integrated directly into applications or used for backup, archiving,
and data retention.