Fix patching problems, version conflicts, classpath issues, etc.
ALSO: Switch the Hadoop libraries used by the Spark standalone plugin to
Hadoop 2.7.3. The version was previously 2.6.5, chosen to match Cloudera's
so-called "Hadoop 2.6.0", but that match is not actually
necessary.
Change-Id: Iafafb64fd60a1ae585375a68173c84fbb82c7e1f
There is not much difference between using CDH 5.4 and 5.5
for Spark, since both use Hadoop 2.6 as a base.
Also remove a redundant case when installing packages for CDH.
Change-Id: Ie24bb72365352edb22d94a461df0a0af6cd71806
Closes-bug: 1686400
Building CDH images with versions below 5.5.0 is no longer supported;
remove the obsolete code.
Also, add Ambari usage information to the sahara-image-create
command.
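A hedged usage sketch for building an Ambari image; the entry-point name
and the -p (plugin) and -i (base distro) switches are assumptions based
on the usual sahara-image-elements conventions, not confirmed by this
commit:

```shell
# Hypothetical invocation: build an image for the Ambari plugin.
# -p selects the Sahara plugin, -i the base distro (both assumed).
./diskimage-create.sh -p ambari -i centos7
```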
Change-Id: I6fffe25ee9daf651355611be675137babb67e2a8
Add spark element to vanilla images.
It enables running Spark jobs on vanilla
clusters.
bp spark-jobs-for-vanilla-hadoop
Change-Id: Ie5eb9ec10b0052c9d1f6284b312edfee0ddba4f0
Update the Spark element to use the existing hadoop-cloudera element for HDFS
for Spark versions > 1.0, instead of the ad-hoc cloudera-cdh one. For Spark 1.0.2,
CDH4 via the old hadoop-cdh element is used, since a precompiled binary for CDH5
is not available.
This change also makes it possible to specify an arbitrary Spark version via the
new -s commandline switch, reducing the amount of code for supporting future
versions of Spark. The defaults for Spark are 1.3.1 and CDH 5.3, a combination
that works well in our deployments.
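A minimal sketch of the new -s switch described above; the script name
and the -p plugin switch are assumptions, and the version values shown
are the defaults stated in this commit (Spark 1.3.1 on CDH 5.3):

```shell
# Build a Spark image with an explicitly chosen Spark version.
# Omitting -s would fall back to the default (1.3.1, per this change).
./diskimage-create.sh -p spark -s 1.3.1
```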
A small change is needed in the cloudera element: when creating a Spark image,
only the HDFS packages have to be installed.
README files have been updated to clarify that default versions are tested, while
other combinations are not. A reference to the SparkPlugin wiki page was added
to point to a table of supported versions.
Change-Id: Ifc2a0c8729981e1e1df79b556a4c2e6bd1ba893a
Implements: blueprint support-spark-1-3
Depends-On: I8fa482b6d1d6abaa6633aec309a3ba826a8b7ebb
Use the cache-url script (in the element cache-url) to download and
cache resources which might be expensive (mostly because of their size)
to fetch every time.
As the shared cache ($DIB_IMAGE_CACHE) is available only when running
the root.d elements, move the download phases to root.d scripts.
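The download phase described above can be sketched as a root.d script;
the URL and file names here are placeholders, and the exact cache-url
calling convention (URL then destination path) is an assumption about
the cache-url element's helper:

```shell
# Sketch of a root.d script using cache-url to fetch a large artifact.
# $DIB_IMAGE_CACHE is the shared cache, available only in root.d phases.
# cache-url downloads the URL once and reuses the cached copy afterwards.
cache-url "https://example.org/spark-dist.tgz" \
    "$DIB_IMAGE_CACHE/spark-dist.tgz"
```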
Change-Id: Iec3e0f92e62c4c9542487a3c228ba8f9e884e5dd