Merge "Updating Sahara docs"

commit 2d5e04197c
@@ -15,6 +15,7 @@ It is vendor-agnostic and currently supports the following distributions:
 - Hortonworks Data Platform (HDP)
 - Cloudera Hadoop Distribution (CDH)
 - Apache Spark
+- MapR

 Sahara can install Hadoop clusters on demand.
 The user must populate several parameters
@@ -23,7 +24,7 @@ and Sahara will deploy this cluster in a few minutes.
 It can also scale the cluster by adding or removing nodes as needed.

 - For Sahara usage guidelines, read the User Guide section of the
-  `Sahara documentation <http://sahara.readthedocs.org/en/stable-juno/>`_.
+  `Sahara documentation <http://sahara.readthedocs.org/en/stable-kilo/>`_.

 - The list of prebuilt images is available here: :ref:`sahara-images-ops`.
   The images are usually
@@ -9,8 +9,10 @@ Sahara uses auto-security groups for opening required ports for each plugin.
 To learn about the open ports for each plugin, please refer to the official
 documentation:

 * `Apache Hadoop <https://hadoop.apache.org/docs/r2.6.0/hadoop-project-dist/hadoop-common/ClusterSetup.html#Web_Interfaces>`_
 * Hortonworks Hadoop
-  `Version 2.0 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.0.6.0/bk_reference/content/reference_chap2.html>`_
-  and `Version 2.2 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_HDP_Reference_Guide/content/reference_chap2.html>`_
-* `Spark <https://spark.apache.org/docs/1.2.0/security.html>`_
+  `Version 2.2 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.4/bk_HDP_Reference_Guide/content/reference_chap2.html>`_
+  and `Version 2.3 <http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.0/bk_HDP_Reference_Guide/content/reference_chap2.html>`_
+* `Apache Spark <https://spark.apache.org/docs/1.3.1/security.html>`_
+* `Cloudera <http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_ports_cdh5.html>`_
+* `MapR <http://doc.mapr.com/display/MapR40x/Configuring+MapR+Security>`_
@@ -25,7 +25,7 @@ Target component

 * Specify the ``User Name`` value for this OS.

-* Set the following values: ``Plugin=vanilla``, ``Version=2.4.1``.
+* Set the following values: ``Plugin=vanilla``, ``Version=2.6.0``.

 * Click `Add plugin tags` and `Done`.
@@ -36,7 +36,7 @@ Target component
 * Click `Create Template`.

 * In the `Create Node Group template` dialog, set the following values:
-  ``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.4.1``.
+  ``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.6.0``.
   Click `Create` to proceed.

 * In the second `Create Node Group template` dialog, set the following values:
@@ -54,7 +54,7 @@ Target component
 * Click `Create Template`.

 * In the `Create Node Group template` dialog, set the following values:
-  ``Plugin name=Vanilla Apache Hadoop`` and ``Hadoop version=2.4.1``.
+  ``Plugin name=Vanilla Apache Hadoop`` and ``Hadoop version=2.6.0``.
   Click `Create` to proceed.

 * In the second `Create Node Group template` dialog, set the following values:
@@ -71,7 +71,7 @@ Target component
 * Click `Create Template`.

 * In the `Create Cluster Template` dialog, set the following values:
-  ``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.4.1``.
+  ``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.6.0``.
   Click `Create` to proceed.

 * In the second `Create Cluster Template` dialog, set the following values:
@@ -91,7 +91,7 @@ Target component
 * Click `Launch Cluster`.

 * In the `Launch Cluster` dialog, set the following values:
-  ``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.4.1``.
+  ``Plugin name=Vanilla Apache Hadoop``, ``Hadoop version=2.6.0``.
   Click `Create` to proceed.

 * In the second `Launch Cluster` dialog, set ``Cluster Name=vanilla2-cluster``.
@@ -118,4 +118,4 @@ Target component


 For more information, see
-`Sahara documentation <http://sahara.readthedocs.org/en/stable-juno/>`_.
+`Sahara documentation <http://sahara.readthedocs.org/en/stable-kilo/>`_.
@@ -6,15 +6,19 @@ Sahara Images

 Prepared images can be downloaded from the following locations:

-* `Ubuntu 14.04 for Vanilla Hadoop 1.2.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-1.2.1-ubuntu-14.04.qcow2>`_
-* `CentOS 6.6 for Vanilla Hadoop 1.2.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-1.2.1-centos-6.6.qcow2>`_
-* `Ubuntu 14.04 for Vanilla Hadoop 2.4.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-2.4.1-ubuntu-14.04.qcow2>`_
-* `CentOS 6.6 for Vanilla Hadoop 2.4.1 <http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-2.4.1-centos-6.6.qcow2>`_
-* `CentOS 6.6 for HDP 2.0.6 <http://sahara-files.mirantis.com/mos61/sahara-juno-hdp-2.0.6-centos-6.6.qcow2>`_
-* `CentOS 6.6 for HDP 2.2.0 <http://sahara-files.mirantis.com/mos61/sahara-juno-hdp-2.2.0-centos-6.6.qcow2>`_
-* `Ubuntu 12.04 for CDH 5 <http://sahara-files.mirantis.com/mos61/sahara-juno-cdh-5-ubuntu-12.04.qcow2>`_
-* `CentOS 6.6 for CDH 5 <http://sahara-files.mirantis.com/mos61/sahara-juno-cdh-5-centos-6.6.qcow2>`_
-* `Ubuntu 14.04 for Spark 1.0.0 <http://sahara-files.mirantis.com/mos61/sahara-juno-spark-1.0.0-ubuntu-14.04.qcow2>`_
+* `Ubuntu 14.04 for Vanilla Hadoop 2.6.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2>`_
+* `CentOS 6.6 for Vanilla Hadoop 2.6.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-centos-6.6.qcow2>`_
+* `CentOS 6.6 for HDP 2.2 <http://sahara-files.mirantis.com/mos70/sahara-kilo-ambari-2.2-centos-6.6.qcow2>`_
+* `CentOS 6.6 for HDP 2.3 <http://sahara-files.mirantis.com/mos70/sahara-kilo-ambari-2.2-centos-6.6.qcow2>`_
+* `Ubuntu 12.04 for CDH 5.4.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-cdh-5.4.0-ubuntu-12.04.qcow2>`_
+* `CentOS 6.6 for CDH 5.4.0 <http://sahara-files.mirantis.com/mos70/sahara-kilo-cdh-5.4.0-centos-6.6.qcow2>`_
+* `Ubuntu 14.04 for Spark 1.3.1 <http://sahara-files.mirantis.com/mos70/sahara-kilo-spark-1.3.1-ubuntu-14.04.qcow2>`_
+* `Ubuntu 14.04 for MapR 4.0.2 <http://sahara-files.mirantis.com/mos70/sahara-kilo-mapr-4.0.2-ubuntu-14.04.qcow2>`_
+* `CentOS 6.6 for MapR 4.0.2 <http://sahara-files.mirantis.com/mos70/sahara-kilo-mapr-4.0.2-centos-6.6.qcow2>`_
+
+.. note::
+
+   The HDP 2.2 and HDP 2.3 installations use the same image.
+
 The default username for these images depends on the distribution:
@@ -30,11 +34,11 @@ The default username for these images depends on the distribution:

 You can find the MD5 checksum of an image by adding the ``.md5`` suffix
 to the image URL, for example
-http://sahara-files.mirantis.com/mos61/sahara-juno-vanilla-2.4.1-ubuntu-14.04.qcow2.md5.
+http://sahara-files.mirantis.com/mos70/sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2.md5.

 To check an image file against its MD5 hash, run:

 .. code-block:: console

-    $ md5sum -c sahara-juno-vanilla-2.4.1-ubuntu-14.04.qcow2.md5
+    $ md5sum -c sahara-kilo-vanilla-2.6.0-ubuntu-14.04.qcow2.md5
 ..
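The ``md5sum -c`` check shown in this hunk can be exercised locally with any file; the ``.md5`` files served by the mirror use the standard ``<hash>  <filename>`` format that ``md5sum`` itself emits. The file names below are placeholders, not real images:

```shell
# Sketch of the checksum-verification flow: create a sample file, record
# its MD5 in the same "<hash>  <filename>" format the mirror's .md5 files
# use, then verify it with md5sum -c.
echo "sample image payload" > sample-image.qcow2
md5sum sample-image.qcow2 > sample-image.qcow2.md5
md5sum -c sample-image.qcow2.md5   # prints "sample-image.qcow2: OK"
```

If the file were corrupted in transit, ``md5sum -c`` would instead report ``FAILED`` and exit nonzero.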
@@ -7,7 +7,7 @@ Planning a Sahara Deployment
 :ref:`Sahara<sahara-term>` enables users
 to easily provision and manage Apache Hadoop clusters
 in an OpenStack environment.
-Sahara works with either Release 1.x or 2.x of Hadoop.
+Sahara supports only the 2.x release of Hadoop.

 The Sahara control processes run on the Controller node.
 The entire Hadoop cluster runs in VMs
@@ -17,16 +17,15 @@ A typical set-up is:
 - One VM that runs management and monitoring processes (Apache Ambari,
   Cloudera Manager, Ganglia, Nagios)
 - One VM that serves as the Hadoop master node
-  to run JobTracker (ResourceManager for Hadoop Release 2.x) and NameNode.
+  to run ResourceManager and NameNode.
 - Many VMs that serve as Hadoop worker nodes,
-  each of which runs TaskTracker (NodeManager for Hadoop Release 2.x)
-  and DataNodes.
+  each of which runs NodeManager and DataNodes.

-You must have exactly one instance of each management and master processes
-running in the environment. Other than that,
+You must have exactly one instance of each management and master
+process running in the environment. Other than that,
 you are free to use other configurations.
-For example, you can run the TaskTracker/NodeManager and DataNodes
-in the same VM that runs JobTracker/ResourceManager and NameNode;
+For example, you can run the NodeManager and DataNodes
+in the same VM that runs ResourceManager and NameNode;
 such a configuration may not produce performance levels
 that are acceptable for a production environment
 but it works for evaluation and demonstration purposes.
@@ -34,6 +33,11 @@ You could also run DataNodes and TaskTrackers in separate VMs.

 Sahara can use either :ref:`swift-object-storage-term` or :ref:`ceph-term`
 for object storage.

+.. note:: If you have configured the Swift public URL with SSL,
+   Sahara will only work with the prepared
+   :ref:`Sahara images<sahara-images-ops>`.
+
 Special steps are required to implement data locality for Swift;
 see `Data-locality <http://docs.openstack.org/developer/sahara/userdoc/features.html#data-locality>`_
 for details.
@@ -64,10 +68,28 @@ is hidden in the OpenStack Dashboard.
 In either case, Sahara assigns a floating IP to each VM it spawns
 so be sure to allocate enough floating IPs.

+However, if you have a limited number of floating IPs or special security
+policies, you may not be able to provide access to all instances. In this
+case, you can use the instances that do have access as proxy gateways. To
+enable this functionality, set the **is_proxy_gateway** parameter to `true`
+for the node group you want to use as a proxy. Sahara will communicate with
+all other cluster instances through the instances of this node group.
+
+.. note:: If **use_floating_ips** is set to `true` and the cluster
+   contains a node group that is used as a proxy, the requirement
+   to provision a pool of floating IPs applies only to the
+   proxy node group. Sahara accesses the other instances through
+   the proxy instances using the private network.
+
+.. note:: The Cloudera Hadoop plugin does not support access to
+   Cloudera Manager through a proxy node. Therefore, you can
+   assign as proxy gateways only the nodes on which Cloudera
+   Manager runs.
+
 **Security Groups**

-Sahara can create and configure security groups separately for each cluster
-depending on a provisioning plugin and Hadoop version.
+Sahara can create and configure security groups separately for each
+cluster depending on the provisioning plugin and Hadoop version.
 :ref:`Security Groups<security-groups-term>`

 **VM Flavor Requirements**
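The **is_proxy_gateway** setting added in the hunk above is a field of a Sahara node group template. A minimal sketch of such a template follows; all names, IDs, and values are hypothetical illustrations, and only ``is_proxy_gateway`` is the setting under discussion:

```shell
# Hypothetical node group template marking the group as a proxy gateway.
# All names, the flavor ID, and the floating IP pool are placeholders.
# The resulting file could be supplied to the Sahara API or client when
# creating the template (the exact command depends on the client version).
cat > proxy-ng.json <<'EOF'
{
    "name": "proxy-gateway",
    "plugin_name": "vanilla",
    "hadoop_version": "2.6.0",
    "node_processes": ["namenode", "resourcemanager"],
    "flavor_id": "2",
    "floating_ip_pool": "public",
    "is_proxy_gateway": true
}
EOF
```

Per the note above, only this group would then need floating IPs; the remaining node groups are reached over the private network.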
@@ -106,6 +128,53 @@ Guide <quickstart-guide>` will have Sahara working poorly.

 Be sure that communication between virtual machines is not blocked.

+**Default templates**
+
+Sahara bundles default templates that define simple clusters for the supported
+plugins. These templates are already added to the Sahara database, so you
+do not need to create them.
+
+**Supported default templates for plugins**
+
+Below is an overview of the default templates for each plugin:
+
+* Vanilla Apache Hadoop 2.6.0:
+
+  Two node groups are created for this plugin. The first, named
+  vanilla-2-master, contains all the management Hadoop components - NameNode,
+  HistoryServer, and ResourceManager. It also includes the Oozie server
+  required to run Hadoop jobs. The second, named vanilla-2-worker, contains
+  the components required for data storage and processing - NodeManager and
+  DataNode.
+
+  A cluster template, named vanilla-2, is also provided for this plugin.
+  It contains 1 master and 3 worker nodes.
+
+* Cloudera Hadoop Distribution (CDH) 5.4.0:
+
+  Three node groups are created for this plugin. The first, named
+  cdh-5-master, contains all the management Hadoop components - NameNode,
+  HistoryServer, and ResourceManager. It also includes the Oozie server
+  required to run Hadoop jobs. The second, named cdh-5-manager, contains the
+  Cloudera Management component that provides a UI to manage the Hadoop
+  cluster. The third, named cdh-5-worker, contains the components required
+  for data storage and processing - NodeManager and DataNode.
+
+  A cluster template, named cdh-5, is also provided for this plugin.
+  It contains 1 manager, 1 master, and 3 worker nodes.
+
+* Hortonworks Data Platform (HDP) 2.2:
+
+  Two node groups are created for this plugin as well. The first, named
+  hdp-2-2-master, contains all the management Hadoop components - Ambari,
+  NameNode, MapReduce HistoryServer, ResourceManager, YARN Timeline Server,
+  and ZooKeeper. It also includes the Oozie server required to run Hadoop
+  jobs. The second, named hdp-2-2-worker, contains the components required
+  for data storage and processing - NodeManager and DataNode.
+
+  A cluster template, named hdp-2-2, is also provided for this plugin.
+  It contains 1 master and 4 worker nodes.
+

 For additional information about using Sahara to run
 Apache Hadoop, see the
 `Sahara documentation <http://docs.openstack.org/developer/sahara/overview.html>`_.
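The default vanilla-2 templates described in the hunk above amount to a simple cluster shape. The following JSON is an illustrative sketch of that shape (field names modeled on Sahara's cluster-template format), not the literal bundled template:

```shell
# Illustrative sketch of the bundled vanilla-2 cluster template described
# above: 1 vanilla-2-master node plus 3 vanilla-2-worker nodes. Field
# names follow Sahara's cluster-template JSON; values are for
# demonstration only.
cat > vanilla-2-template.json <<'EOF'
{
    "name": "vanilla-2",
    "plugin_name": "vanilla",
    "hadoop_version": "2.6.0",
    "node_groups": [
        {"name": "vanilla-2-master", "count": 1},
        {"name": "vanilla-2-worker", "count": 3}
    ]
}
EOF
```

The cdh-5 and hdp-2-2 defaults differ only in their node group names and counts (cdh-5 adds a separate manager group; hdp-2-2 uses 4 workers).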
@@ -6,7 +6,7 @@ Sahara (formerly known as "Savanna")
 Sahara is the OpenStack service
 that provisions an Apache Hadoop cluster on top of OpenStack.
 Sahara currently supports Vanilla Apache Hadoop, Hortonworks Data Platform
-(HDP), Cloudera Hadoop Distribution (CDH) and Apache Spark.
+(HDP), Cloudera Hadoop Distribution (CDH), Apache Spark, and MapR.

 To enable Sahara in your OpenStack environment
 that is deployed using Fuel,