Updating userdoc overview

* formatted text for 79 chars
* minor grammar fixes

Change-Id: Ib68ebfc0cfd2d9cdb987b60f39347a96c5873741
Partial-Bug: 1490687
Michael McCune 2015-08-31 18:49:12 -04:00
parent 6b7e3a0271
commit c2ea4fa572
1 changed file with 31 additions and 29 deletions

@@ -4,49 +4,50 @@ Getting Started
Clusters
--------
A cluster deployed by Sahara consists of node groups. Node groups vary by
A cluster deployed by sahara consists of node groups. Node groups vary by
their role, parameters and number of machines. The picture below
illustrates an example of a Hadoop cluster consisting of 3 node groups each having a
different role (set of processes).
illustrates an example of a Hadoop cluster consisting of 3 node groups each
having a different role (set of processes).
.. image:: ../images/hadoop-cluster-example.jpg
Node group parameters include Hadoop parameters like `io.sort.mb` or
`mapred.child.java.opts`, and several infrastructure parameters like the flavor
for VMs or storage location (ephemeral drive or Cinder volume).
Node group parameters include Hadoop parameters like ``io.sort.mb`` or
``mapred.child.java.opts``, and several infrastructure parameters like the
flavor for VMs or storage location (ephemeral drive or cinder volume).
A cluster is characterized by its node groups and its parameters. Like a node
group, a cluster has Hadoop and infrastructure parameters. An
example of a cluster-wide Hadoop parameter is `dfs.replication`. For
group, a cluster has data processing framework and infrastructure parameters.
An example of a cluster-wide Hadoop parameter is ``dfs.replication``. For
infrastructure, an example could be image which will be used to launch cluster
VMs.
Templates
---------
In order to simplify cluster provisioning Sahara employs the concept of templates.
There are two kinds of templates: node group templates and cluster templates. The
former is used to create node groups, the latter - clusters. Essentially
templates have the very same parameters as corresponding entities. Their aim
is to remove the burden of specifying all of the required parameters each time a user
wants to launch a cluster.
In order to simplify cluster provisioning sahara employs the concept of
templates. There are two kinds of templates: node group templates and
cluster templates. The former is used to create node groups, the latter
- clusters. Essentially templates have the very same parameters as
corresponding entities. Their aim is to remove the burden of specifying all
of the required parameters each time a user wants to launch a cluster.
In the REST interface, templates have extended functionality. First you can
specify node-scoped parameters here, they will work as a defaults for node
groups. Also with the REST interface, during cluster creation a user can override
template parameters for both cluster and node groups.
specify node-scoped parameters here, they will work as defaults for node
groups. Also with the REST interface, during cluster creation a user can
override template parameters for both cluster and node groups.
Provisioning Plugins
--------------------
A provisioning plugin is a component responsible for provisioning a Hadoop
cluster. Generally each plugin is capable of provisioning a specific Hadoop
distribution. Also the plugin can install management and/or monitoring tools for
a cluster.
A provisioning plugin is a component responsible for provisioning a data
processing cluster. Generally each plugin is capable of provisioning a
specific data processing framework or Hadoop distribution. Also the plugin
can install management and/or monitoring tools for a cluster.
Since Hadoop parameters vary depending on distribution and the Hadoop version,
templates are always plugin and Hadoop version specific. A template cannot
be used if the plugin/Hadoop versions are different than the ones they were created for.
Since framework configuration parameters vary depending on the distribution
and the version, templates are always plugin and version specific. A template
cannot be used if the plugin, or framework, versions are different than the
ones they were created for.
You may find the list of available plugins on that page: :doc:`plugins`
@@ -54,13 +55,14 @@ Image Registry
--------------
OpenStack starts VMs based on a pre-built image with an installed OS. The image
requirements for Sahara depend on the plugin and Hadoop version. Some plugins
require just a basic cloud image and will install Hadoop on the VMs from scratch. Some
plugins might require images with pre-installed Hadoop.
requirements for sahara depend on the plugin and data processing framework
version. Some plugins require just a basic cloud image and will install the
framework on the VMs from scratch. Some plugins might require images with
pre-installed frameworks or Hadoop distributions.
The Sahara Image Registry is a feature which helps filter out images during
cluster creation. See :doc:`registering_image` for details on how to
work with Image Registry.
cluster creation. See :doc:`registering_image` for details on how to work
with Image Registry.
Features
--------