Merge "Small doc fixes found during doc day"

Zuul 2018-02-08 21:23:52 +00:00 committed by Gerrit Code Review
commit fbd7abb754
3 changed files with 29 additions and 27 deletions

View File

@@ -12,9 +12,9 @@ OpenStack. It is worth mentioning that Amazon has provided Hadoop for
 several years as Amazon Elastic MapReduce (EMR) service.
 
 Sahara aims to provide users with a simple means to provision Hadoop, Spark,
-and Storm clusters by specifying several parameters such as the version,
-cluster topology, hardware node details and more. After a user fills in all
-the parameters, sahara deploys the cluster in a few minutes. Also sahara
+and Storm clusters by specifying several parameters such as the framework
+version, cluster topology, hardware node details and more. After a user fills
+in all the parameters, sahara deploys the cluster in a few minutes. Also sahara
 provides means to scale an already provisioned cluster by adding or removing
 worker nodes on demand.
 
@@ -53,6 +53,8 @@ The sahara product communicates with the following OpenStack services:
   are used to work with OpenStack, limiting a user's abilities in sahara to
   their OpenStack privileges.
 * Compute (nova) - used to provision VMs for data processing clusters.
+* Bare metal (ironic) - used to provision Baremetal nodes for data processing
+  clusters.
 * Orchestration (heat) - used to provision and orchestrate the deployment of
   data processing clusters.
 * Image (glance) - stores VM images, each image containing an operating system
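
Because identity (keystone) fronts all of these services, a client first
establishes a session there and then talks to sahara with it. A minimal
sketch using keystoneauth1; the endpoint and credentials are placeholders,
not values from this document::

    # Authenticate against keystone so that later requests to sahara carry
    # the user's OpenStack privileges. All values here are placeholders.
    from keystoneauth1 import session
    from keystoneauth1.identity import v3

    auth = v3.Password(auth_url='http://controller:5000/v3',
                       username='demo', password='secret',
                       project_name='demo',
                       user_domain_id='default',
                       project_domain_id='default')
    sess = session.Session(auth=auth)
    token = sess.get_token()   # used as X-Auth-Token in the sketches below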
@@ -90,8 +92,6 @@ For fast cluster provisioning a generic workflow will be as following:
   * for base images without a pre-installed framework, sahara will support
     pluggable deployment engines that integrate with vendor tooling.
 
-  * you can download prepared up-to-date images from
-    http://sahara-files.mirantis.com/images/upstream/
 
 * define cluster configuration, including cluster size, topology, and
   framework parameters (for example, heap size):
@@ -99,8 +99,8 @@ For fast cluster provisioning a generic workflow will be as following:
   * to ease the configuration of such parameters, configurable templates
     are provided.
 
-* provision the cluster; sahara will provision VMs, install and configure
-  the data processing framework.
+* provision the cluster; sahara will provision nodes (VMs or baremetal),
+  install and configure the data processing framework.
 * perform operations on the cluster; add or remove nodes.
 * terminate the cluster when it is no longer needed.
 
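
As a rough illustration of the workflow above, the sketch below drives
sahara's REST API directly; the v1.1 paths and payload fields follow the
API as generally documented, while the endpoint, port, IDs and names are
placeholder assumptions::

    # Hypothetical walk through the provisioning workflow.
    import requests

    SAHARA = 'http://controller:8386/v1.1/<project-id>'   # placeholder
    HEADERS = {'X-Auth-Token': '<token>',
               'Content-Type': 'application/json'}

    # provision a cluster from a pre-configured cluster template
    resp = requests.post(SAHARA + '/clusters', headers=HEADERS, json={
        'name': 'my-cluster',
        'plugin_name': 'vanilla',                # chosen framework
        'hadoop_version': '2.7.1',               # chosen framework version
        'cluster_template_id': '<template-id>',
        'default_image_id': '<registered-image-id>',
        'user_keypair_id': 'my-keypair',         # for instance access
    })
    cluster_id = resp.json()['cluster']['id']

    # scale on demand by resizing a node group
    requests.put(SAHARA + '/clusters/' + cluster_id, headers=HEADERS,
                 json={'resize_node_groups': [{'name': 'worker',
                                               'count': 5}]})

    # terminate the cluster when it is no longer needed
    requests.delete(SAHARA + '/clusters/' + cluster_id, headers=HEADERS)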
@@ -118,7 +118,8 @@ For analytics as a service, a generic workflow will be as following:
   * all cluster provisioning and job execution will happen transparently
     to the user.
 
-  * cluster will be removed automatically after job completion.
+  * if using a transient cluster, it will be removed automatically after job
+    completion.
 
 * get the results of computations (for example, from swift).
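
A comparable sketch for the analytics-as-a-service flow, using sahara's EDP
endpoints (paths assumed from API v1.1; every ID shown is a placeholder)::

    # Hypothetical job run: execute a predefined job, then poll its status;
    # sahara handles cluster provisioning behind the scenes.
    import requests

    SAHARA = 'http://controller:8386/v1.1/<project-id>'   # placeholder
    HEADERS = {'X-Auth-Token': '<token>',
               'Content-Type': 'application/json'}

    run = requests.post(SAHARA + '/jobs/<job-id>/execute', headers=HEADERS,
                        json={
                            'cluster_id': '<cluster-id>',
                            'input_id': '<swift-data-source-id>',
                            'output_id': '<swift-data-source-id>',
                            'job_configs': {'configs': {}, 'args': []},
                        }).json()

    # poll until finished, then fetch results from the swift output location
    status = requests.get(
        SAHARA + '/job-executions/' + run['job_execution']['id'],
        headers=HEADERS).json()['job_execution']['info']['status']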
@@ -129,28 +130,28 @@ While provisioning clusters through sahara, the user operates on three types
 of entities: Node Group Templates, Cluster Templates and Clusters.
 
 A Node Group Template describes a group of nodes within cluster. It contains
-a list of hadoop processes that will be launched on each instance in a group.
+a list of processes that will be launched on each instance in a group.
 Also a Node Group Template may provide node scoped configurations for those
 processes. This kind of template encapsulates hardware parameters (flavor)
-for the node VM and configuration for data processing framework processes
+for the node instance and configuration for data processing framework processes
 running on the node.
 
 A Cluster Template is designed to bring Node Group Templates together to
 form a Cluster. A Cluster Template defines what Node Groups will be included
-and how many instances will be created in each. Some data processing framework
+and how many instances will be created for each. Some data processing framework
 configurations can not be applied to a single node, but to a whole Cluster.
 A user can specify these kinds of configurations in a Cluster Template. Sahara
 enables users to specify which processes should be added to an anti-affinity
 group within a Cluster Template. If a process is included into an
-anti-affinity group, it means that VMs where this process is going to be
+anti-affinity group, it means that instances where this process is going to be
 launched should be scheduled to different hardware hosts.
 
-The Cluster entity represents a collection of VM instances that all have the
-same data processing framework installed. It is mainly characterized by a VM
+The Cluster entity represents a collection of instances that all have the
+same data processing framework installed. It is mainly characterized by an
 image with a pre-installed framework which will be used for cluster
 deployment. Users may choose one of the pre-configured Cluster Templates to
-start a Cluster. To get access to VMs after a Cluster has started, the user
-should specify a keypair.
+start a Cluster. To get access to instances after a Cluster has started, the
+user should specify a keypair.
 
 Sahara provides several constraints on cluster framework topology. You can see
 all constraints in the documentation for the appropriate plugin.
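
To make the two template entities concrete, here is an illustrative pair of
payloads; the field names follow sahara's JSON API, while the plugin,
flavor, process names and counts are placeholder assumptions::

    # Illustrative Node Group Template and Cluster Template bodies for the
    # entities described above; the values are placeholders.
    node_group_template = {
        'name': 'worker-template',
        'plugin_name': 'vanilla',
        'hadoop_version': '2.7.1',
        'flavor_id': '2',                    # hardware parameters per node
        'node_processes': ['datanode',       # launched on every instance
                           'nodemanager'],
        'node_configs': {},                  # node-scoped configuration
    }

    cluster_template = {
        'name': 'my-cluster-template',
        'plugin_name': 'vanilla',
        'hadoop_version': '2.7.1',
        'node_groups': [
            {'name': 'master',
             'node_group_template_id': '<master-template-id>',
             'count': 1},
            {'name': 'worker',
             'node_group_template_id': '<worker-template-id>',
             'count': 3},                    # instances created per group
        ],
        'anti_affinity': ['datanode'],       # spread across hardware hosts
        'cluster_configs': {},               # cluster-wide configuration
    }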

View File

@@ -16,7 +16,7 @@ Launching a cluster via the sahara UI
 Registering an Image
 --------------------
 
-1) Navigate to the "Project" dashboard, then the "Data Processing" tab, then
+1) Navigate to the "Project" dashboard, then to the "Data Processing" tab, then
    click on the "Clusters" panel and finally the "Image Registry" tab.
 
 2) From that page, click on the "Register Image" button at the top right
@@ -33,7 +33,7 @@ Registering an Image
 Create Node Group Templates
 ---------------------------
 
-1) Navigate to the "Project" dashboard, then the "Data Processing" tab, then
+1) Navigate to the "Project" dashboard, then to the "Data Processing" tab, then
    click on the "Clusters" panel and then the "Node Group Templates" tab.
 
 2) From that page, click on the "Create Template" button at the top right
@@ -57,7 +57,7 @@ Create Node Group Templates
 Create a Cluster Template
 -------------------------
 
-1) Navigate to the "Project" dashboard, then the "Data Processing" tab, then
+1) Navigate to the "Project" dashboard, then to the "Data Processing" tab, then
    click on the "Clusters" panel and finally the "Cluster Templates" tab.
 
 2) From that page, click on the "Create Template" button at the top right
@@ -87,7 +87,7 @@ Create a Cluster Template
 Launching a Cluster
 -------------------
 
-1) Navigate to the "Project" dashboard, then the "Data Processing" tab, then
+1) Navigate to the "Project" dashboard, then to the "Data Processing" tab, then
    click on the "Clusters" panel and lastly, click on the "Clusters" tab.
 
 2) Click on the "Launch Cluster" button at the top right

View File

@@ -13,13 +13,13 @@ having a different role (set of processes).
 Node group parameters include Hadoop parameters like ``io.sort.mb`` or
 ``mapred.child.java.opts``, and several infrastructure parameters like the
-flavor for VMs or storage location (ephemeral drive or cinder volume).
+flavor for instances or storage location (ephemeral drive or cinder volume).
 
 A cluster is characterized by its node groups and its parameters. Like a node
 group, a cluster has data processing framework and infrastructure parameters.
 An example of a cluster-wide Hadoop parameter is ``dfs.replication``. For
 infrastructure, an example could be image which will be used to launch cluster
-VMs.
+instances.
 
 Templates
 ---------
 
@@ -32,14 +32,15 @@ corresponding entities. Their aim is to remove the burden of specifying all
 of the required parameters each time a user wants to launch a cluster.
 
 In the REST interface, templates have extended functionality. First you can
-specify node-scoped parameters here, they will work as defaults for node
+specify node-scoped parameters, they will work as defaults for node
 groups. Also with the REST interface, during cluster creation a user can
 override template parameters for both cluster and node groups.
 
 Templates are portable - they can be exported to JSON files and imported
-later either on the same deployment or on another one. To import an exported
+either on the same deployment or on another one. To import an exported
 template, replace the placeholder values with appropriate ones. This can be
-accomplished easily through the CLI or UI, or be done manually.
+accomplished easily through the CLI or UI, or manually editing the template
+file.
 
 Provisioning Plugins
 --------------------
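
The manual route can be as simple as a search-and-replace over the exported
JSON file; a sketch follows (the placeholder token format is an assumption,
not one defined by sahara)::

    # Hypothetical manual import: load an exported template, substitute
    # deployment-specific values for its placeholders, write it back.
    import json

    with open('worker-template.json') as f:
        raw = f.read()

    for placeholder, value in {'<flavor-id>': '2',
                               '<floating-ip-pool>': 'public'}.items():
        raw = raw.replace(placeholder, value)

    template = json.loads(raw)           # check the result is still JSON
    with open('worker-template.imported.json', 'w') as f:
        json.dump(template, f, indent=2)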
@@ -62,7 +63,7 @@ Image Registry
 OpenStack starts VMs based on a pre-built image with an installed OS. The image
 requirements for sahara depend on the plugin and data processing framework
 version. Some plugins require just a basic cloud image and will install the
-framework on the VMs from scratch. Some plugins might require images with
+framework on the instance from scratch. Some plugins might require images with
 pre-installed frameworks or Hadoop distributions.
 
 The Sahara Image Registry is a feature which helps filter out images during
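
Registration and tagging can be sketched against the same REST surface; the
v1.1 paths are assumed, and the image ID, username and tags below are
placeholders::

    # Hypothetical image registration: record the OS user the image boots
    # with, then tag it so the registry can filter by plugin and version.
    import requests

    SAHARA = 'http://controller:8386/v1.1/<project-id>'   # placeholder
    HEADERS = {'X-Auth-Token': '<token>',
               'Content-Type': 'application/json'}
    IMAGE = '<glance-image-id>'

    requests.post(SAHARA + '/images/' + IMAGE, headers=HEADERS,
                  json={'username': 'ubuntu',
                        'description': 'base image for vanilla 2.7.1'})

    requests.post(SAHARA + '/images/' + IMAGE + '/tag', headers=HEADERS,
                  json={'tags': ['vanilla', '2.7.1']})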
@@ -72,5 +73,5 @@ with Image Registry.
 Features
 --------
 
-Sahara has several interesting features. The full list could be found there:
+Sahara has several interesting features. The full list could be found here:
 :doc:`features`