Merge "Updating Sahara docs"

Jenkins 2015-09-29 14:40:35 +00:00 committed by Gerrit Code Review
commit 2d5e04197c
6 changed files with 108 additions and 32 deletions

View File

@@ -15,6 +15,7 @@ It is vendor-agnostic and currently supports the following distributions:
- Hortonworks Data Platform (HDP)
- Cloudera Hadoop Distribution (CDH)
- Apache Spark
- MapR
Sahara can install Hadoop clusters on demand.
The user must populate several parameters
@@ -23,7 +24,7 @@ and Sahara will deploy this cluster in a few minutes.
It can also scale the cluster by adding or removing nodes as needed.
- For Sahara usage guidelines, read the User Guide section of the
`Sahara documentation <http://sahara.readthedocs.org/en/stable-juno/>`_.
`Sahara documentation <http://sahara.readthedocs.org/en/stable-kilo/>`_.
- The list of prebuilt images is available here: :ref:`sahara-images-ops`.
The images are usually

View File

@@ -9,8 +9,10 @@ Sahara uses auto-security groups for opening required ports for each plugin.
To learn about the open ports for each plugin, please refer to the official
documentation:
* `Apache Hadoop <https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/ClusterSetup.html#Web_Interfaces>`_
* Hortonworks Hadoop
`Version 2.0 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_reference/content/reference_chap2.html>`_
and `Version 2.2 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_HDP_Reference_Guide/content/reference_chap2.html>`_
* `Spark <https://spark.apache.org/docs/1.2.0/security.html>`_
`Version 2.2 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_HDP_Reference_Guide/content/reference_chap2.html>`_
and `Version 2.3 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_HDP_Reference_Guide/content/reference_chap2.html>`_
* `Apache Spark <https://spark.apache.org/docs/1.3.1/security.html>`_
* `Cloudera <http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_ports_cdh5.html>`_
* `MapR <http://doc.mapr.com/display/MapR40x/Configuring+MapR+Security>`_

View File

@@ -25,7 +25,7 @@ Target component
* Specify the ``User Name`` value for this OS.
* Set the following values: ``Plugin=vanilla``, ``Version=2.4.1``.
* Set the following values: ``Plugin=vanilla``, ``Version=2.6.0``.
* Click `Add plugin tags` and `Done`.
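If you prefer the command line, a roughly equivalent image registration is sketched below. It assumes the python-saharaclient CLI; the image ID and user name are placeholders, and the exact command names and options may differ between client releases.

.. code-block:: console

   # Register the Glance image with Sahara and set the in-image user name
   $ sahara image-register --id <IMAGE_ID> --username <USER_NAME>

   # Tag the image so the vanilla 2.6.0 plugin can use it
   $ sahara image-add-tag --id <IMAGE_ID> --tag vanilla
   $ sahara image-add-tag --id <IMAGE_ID> --tag 2.6.0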
@@ -36,7 +36,7 @@ Target component
* Click `Create Template`.
* In the `Create Node Group template` dialog, set the following values:
``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.4.1``.
``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.6.0``.
Click `Create` to proceed.
* In the second `Create Node Group template` dialog, set the following values:
@@ -54,7 +54,7 @@ Target component
* Click `Create Template`.
* In the `Create Node Group template` dialog, set the following values:
``Plugin name=Vanilla Apache Hadoop`` and ``Hadoop version=2.4.1``.
``Plugin name=Vanilla Apache Hadoop`` and ``Hadoop version=2.6.0``.
Click `Create` to proceed.
* In the second `Create Node Group template` dialog, set the following values:
@@ -71,7 +71,7 @@ Target component
* Click `Create Template`.
* In the `Create Cluster Template` dialog, set the following values:
``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.4.1``.
``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.6.0``.
Click `Create` to proceed.
* In the second `Create Cluster Template` dialog, set the following values:
@@ -91,7 +91,7 @@ Target component
* Click `Launch Cluster`.
* In the `Launch Cluster` dialog, set the following values:
``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.4.1``.
``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.6.0``.
Click `Create` to proceed.
* In the second `Launch Cluster` dialog, set ``Cluster Name=vanilla2-cluster``.
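For reference only, launching a similar cluster from the command line might look like the sketch below; the ``--json`` workflow and the JSON field names are assumptions about the kilo-era saharaclient, and all IDs are placeholders.

.. code-block:: console

   $ cat > cluster_create.json <<EOF
   {
       "name": "vanilla2-cluster",
       "plugin_name": "vanilla",
       "hadoop_version": "2.6.0",
       "cluster_template_id": "<CLUSTER_TEMPLATE_ID>",
       "default_image_id": "<IMAGE_ID>",
       "user_keypair_id": "<KEYPAIR_NAME>",
       "neutron_management_network": "<PRIVATE_NETWORK_ID>"
   }
   EOF
   $ sahara cluster-create --json cluster_create.json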
@@ -118,4 +118,4 @@ Target component
For more information, see
`Sahara documentation <http://sahara.readthedocs.org/en/stable-juno/>`_.
`Sahara documentation <http://sahara.readthedocs.org/en/stable-kilo/>`_.

View File

@@ -6,15 +6,19 @@ Sahara Images
Prepared images can be downloaded from the following locations:
* `Ubuntu 14.04 for Vanilla Hadoop 1.2.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-1.2.1-ubuntu-14.04.qcow2>`_
* `CentOS 6.6 for Vanilla Hadoop 1.2.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-1.2.1-centos-6.6.qcow2>`_
* `Ubuntu 14.04 for Vanilla Hadoop 2.4.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-2.4.1-ubuntu-14.04.qcow2>`_
* `CentOS 6.6 for Vanilla Hadoop 2.4.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-2.4.1-centos-6.6.qcow2>`_
* `CentOS 6.6 for HDP 2.0.6 <http://sahara-files.mirantis.com/mos61/sahara-juno-hdp-2.0.6-centos-6.6.qcow2>`_
* `CentOS 6.6 for HDP 2.2.0 <http://sahara-files.mirantis.com/mos61/sahara-juno-hdp-2.2.0-centos-6.6.qcow2>`_
* `Ubuntu 12.04 for CDH 5 <http://sahara-files.mirantis.com/mos61/sahara-juno-cdh-5-ubuntu-12.04.qcow2>`_
* `CentOS 6.6 for CDH 5 <http://sahara-files.mirantis.com/mos61/sahara-juno-cdh-5-centos-6.6.qcow2>`_
* `Ubuntu 14.04 for Spark 1.0.0 <http://sahara-files.mirantis.com/mos61/sahara-juno-spark-1.0.0-ubuntu-14.04.qcow2>`_
* `Ubuntu 14.04 for Vanilla Hadoop 2.6.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2>`_
* `CentOS 6.6 for Vanilla Hadoop 2.6.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-centos-6.6.qcow2>`_
* `CentOS 6.6 for HDP 2.2 <http://sahara-files.mirantis.com/mos70/sahara-kilo-ambari-2.2-centos-6.6.qcow2>`_
* `CentOS 6.6 for HDP 2.3 <http://sahara-files.mirantis.com/mos70/sahara-kilo-ambari-2.2-centos-6.6.qcow2>`_
* `Ubuntu 12.04 for CDH 5.4.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-cdh-5.4.0-ubuntu-12.04.qcow2>`_
* `CentOS 6.6 for CDH 5.4.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-cdh-5.4.0-centos-6.6.qcow2>`_
* `Ubuntu 14.04 for Spark 1.3.1 <http://sahara-files.mirantis.com/mos70/sahara-kilo-spark-1.3.1-ubuntu-14.04.qcow2>`_
* `Ubuntu 14.04 for MapR 4.0.2 <http://sahara-files.mirantis.com/mos70/sahara-kilo-mapr-4.0.2-ubuntu-14.04.qcow2>`_
* `CentOS 6.6 for MapR 4.0.2 <http://sahara-files.mirantis.com/mos70/sahara-kilo-mapr-4.0.2-centos-6.6.qcow2>`_
.. note::
The same image is used for both the HDP 2.2 and HDP 2.3 installations.
The default username for these images depends on the distribution:
@@ -30,11 +34,11 @@ The default username for these images depends on the distribution:
You can find the MD5 checksum of an image by adding the ``.md5`` suffix
to the image URL, for example
http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-2.4.1-ubuntu-14.04.qcow2.md5.
http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2.md5.
To check the downloaded image against its MD5 hash, run:
.. code-block:: console
$ md5sum -c sahara-juno-vanilla-2.4.1-ubuntu-14.04.qcow2.md5
$ md5sum -c sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2.md5
..
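Putting the pieces together, downloading an image, verifying its checksum, and uploading it to Glance could look like the following sketch; the Glance CLI options shown are assumptions and may differ in your environment.

.. code-block:: console

   # Download the image and its checksum file
   $ wget http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2
   $ wget http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2.md5

   # Verify the download
   $ md5sum -c sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2.md5

   # Upload the verified image to Glance
   $ glance image-create --name sahara-kilo-vanilla-2.6.0-ubuntu-14.04 \
       --disk-format qcow2 --container-format bare \
       --file sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2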

View File

@@ -7,7 +7,7 @@ Planning a Sahara Deployment
:ref:`Sahara<sahara-term>` enables users
to easily provision and manage Apache Hadoop clusters
in an OpenStack environment.
Sahara works with either Release 1.x or 2.x of Hadoop.
Sahara supports only the 2.x releases of Hadoop.
The Sahara control processes run on the Controller node.
The entire Hadoop cluster runs in VMs
@@ -17,16 +17,15 @@ A typical set-up is:
- One VM that runs management and monitoring processes (Apache Ambari,
Cloudera Manager, Ganglia, Nagios)
- One VM that serves as the Hadoop master node
to run JobTracker (ResourceManager for Hadoop Release 2.x) and NameNode.
to run ResourceManager and NameNode.
- Many VMs that serve as Hadoop worker nodes,
each of which runs TaskTracker (NodeManager for Hadoop Release 2.x)
and DataNodes.
each of which runs a NodeManager and a DataNode.
You must have exactly one instance of each management and master processes
running in the environment. Other than that,
You must have exactly one instance of each management and master
process running in the environment. Other than that,
you are free to use other configurations.
For example, you can run the TaskTracker/NodeManager and DataNodes
in the same VM that runs JobTracker/ResourceManager and NameNode;
For example, you can run the NodeManager and DataNodes
in the same VM that runs ResourceManager and NameNode;
such a configuration may not produce performance levels
that are acceptable for a production environment,
but it works for evaluation and demonstration purposes.
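To make the layout described above concrete, a cluster template for the typical one-master, several-workers configuration could be sketched roughly as follows; the field and process names are illustrative assumptions based on the vanilla plugin, the flavor IDs are placeholders, and the ``--json`` client workflow may differ between releases.

.. code-block:: console

   $ cat > vanilla_cluster_template.json <<EOF
   {
       "name": "vanilla-example",
       "plugin_name": "vanilla",
       "hadoop_version": "2.6.0",
       "node_groups": [
           {"name": "master", "flavor_id": "<FLAVOR_ID>", "count": 1,
            "node_processes": ["namenode", "resourcemanager",
                               "historyserver", "oozie"]},
           {"name": "worker", "flavor_id": "<FLAVOR_ID>", "count": 3,
            "node_processes": ["datanode", "nodemanager"]}
       ]
   }
   EOF
   $ sahara cluster-template-create --json vanilla_cluster_template.json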
@@ -34,6 +33,11 @@ You could also run DataNodes and TaskTrackers in separate VMs.
Sahara can use either :ref:`swift-object-storage-term` or :ref:`ceph-term`
for object storage.
.. note:: If you have configured the Swift public URL with SSL,
Sahara will only work with the prepared
:ref:`Sahara images<sahara-images-ops>`.
Special steps are required to implement data locality for Swift;
see `Data-locality <http://docs.openstack.org/developer/sahara/userdoc/features.html#data-locality>`_
for details.
@@ -64,10 +68,28 @@ is hidden in the OpenStack Dashboard.
In either case, Sahara assigns a floating IP to each VM it spawns,
so be sure to allocate enough floating IPs.
However, if you have a limited number of floating IPs or special security
policies, you may not be able to provide access to all instances. In
this case, you can use the instances that have access as proxy gateways.
To enable this functionality, set the **is_proxy_gateway** parameter to `true`
for the node group you want to use as a proxy. Sahara will communicate with all
other cluster instances through the instances of this node group.
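For illustration, a node group template that acts as a proxy gateway might be defined roughly as follows; apart from **is_proxy_gateway**, the field values are placeholders and assumptions about your plugin and network setup, and the ``--json`` client workflow may differ between releases.

.. code-block:: console

   $ cat > proxy_worker_template.json <<EOF
   {
       "name": "proxy-worker",
       "plugin_name": "vanilla",
       "hadoop_version": "2.6.0",
       "flavor_id": "<FLAVOR_ID>",
       "node_processes": ["datanode", "nodemanager"],
       "floating_ip_pool": "<EXTERNAL_NETWORK_ID>",
       "is_proxy_gateway": true
   }
   EOF
   $ sahara node-group-template-create --json proxy_worker_template.json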
.. note:: If **use_floating_ips** is set to `true` and the cluster
contains a node group that is used as a proxy, the requirement
to provision a pool of floating IPs applies only to the
proxy node group. Sahara accesses the other instances through
the proxy instances using the private network.
.. note:: The Cloudera Hadoop plugin does not support access
to Cloudera Manager through a proxy node. Therefore,
only the nodes that run Cloudera Manager can be
assigned as proxy gateways.
**Security Groups**
Sahara can create and configure security groups separately for each cluster
depending on a provisioning plugin and Hadoop version.
Sahara can create and configure security groups separately for each
cluster, depending on the provisioning plugin and Hadoop version.
:ref:`Security Groups<security-groups-term>`
**VM Flavor Requirements**
@@ -106,6 +128,53 @@ Guide <quickstart-guide>` will have Sahara working poorly.
Be sure that communication between virtual machines is not blocked.
**Default templates**
Sahara bundles default templates that define simple clusters for the supported
plugins. These templates are already added to the Sahara database; therefore,
you do not need to create them.
**Supported default templates for plugins**
Here is an overview of the supported default templates for each plugin:
* Vanilla Apache Hadoop 2.6.0:
Two node groups are created for this plugin. The first one, named
vanilla-2-master, contains all management Hadoop components: NameNode,
HistoryServer, and ResourceManager. It also includes the Oozie server
required to run Hadoop jobs. The second one, named vanilla-2-worker, contains
the components required for data storage and processing: NodeManager and DataNode.
A cluster template named vanilla-2 is also provided for this plugin; it
contains 1 master and 3 worker nodes.
* Cloudera Hadoop Distribution (CDH) 5.4.0:
Three node groups are created for this plugin. The first one, named
cdh-5-master, contains all management Hadoop components: NameNode,
HistoryServer, and ResourceManager. It also includes the Oozie server
required to run Hadoop jobs. The second one, named cdh-5-manager, contains
the Cloudera Management component that provides a UI to manage the Hadoop
cluster. The third one, named cdh-5-worker, contains the components required
for data storage and processing: NodeManager and DataNode.
A cluster template named cdh-5 is also provided for this plugin; it contains
1 manager, 1 master, and 3 worker nodes.
* Hortonworks Data Platform (HDP) 2.2:
Two node groups are also created for this plugin. The first one, named
hdp-2-2-master, contains all management Hadoop components: Ambari, NameNode,
MapReduce HistoryServer, ResourceManager, YARN Timeline Server, and
ZooKeeper. It also includes the Oozie server required to run Hadoop jobs.
The second one, named hdp-2-2-worker, contains the components required for
data storage and processing: NodeManager and DataNode.
A cluster template named hdp-2-2 is also provided for this plugin; it
contains 1 master and 4 worker nodes.
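To confirm that these default templates are present in your environment, you can list them with the saharaclient CLI; the command names shown here are assumptions and may differ between releases.

.. code-block:: console

   $ sahara node-group-template-list
   $ sahara cluster-template-list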
For additional information about using Sahara to run
Apache Hadoop, see the
`Sahara documentation <http://docs.openstack.org/developer/sahara/overview.html>`_.

View File

@@ -6,7 +6,7 @@ Sahara (formerly known as "Savanna")
Sahara is the OpenStack service
that provisions an Apache Hadoop cluster on top of OpenStack.
Sahara currently supports Vanilla Apache Hadoop, Hortonworks Data Platform
(HDP), Cloudera Hadoop Distribution (CDH) and Apache Spark.
(HDP), Cloudera Hadoop Distribution (CDH), Apache Spark, and MapR.
To enable Sahara in your OpenStack environment
that is deployed using Fuel,