Commit Graph

7 Commits

Author SHA1 Message Date
Jeremy Freudberg 7910521a7e Totally rewrite s3_hadoop
Remedies patch application problems, version conflicts, classpath issues, etc.

ALSO: Switch the Hadoop libraries used on the Spark standalone plugin to
Hadoop 2.7.3. The version was previously 2.6.5, chosen to match Cloudera's
so-called "Hadoop 2.6.0", but in fact this match is not at all
necessary...

Change-Id: Iafafb64fd60a1ae585375a68173c84fbb82c7e1f
2018-07-02 11:51:28 -04:00
Telles Nobrega 05085a81db Adding Spark 2.2.0
Add the newest version of Spark.

Change-Id: Ib2894d5d93d3ecfd17e0fb1eba4d687a97406027
2017-12-13 09:55:41 -03:00
Vitaly Gridnev 972ce02c76 Fix Spark and CDH refinements
There is actually not much difference between using CDH 5.4 and 5.5
for Spark, because both use Hadoop 2.6 as a base.
Also remove a redundant case in the CDH package installation.

Change-Id: Ie24bb72365352edb22d94a461df0a0af6cd71806
Closes-bug: 1686400
2017-06-07 20:51:34 +00:00
Shu Yingya 20126fbde6 Remove some code for older builder versions
Building CDH images under version 5.5.0 is no longer supported,
so remove the now-useless code.
Also, add Ambari usage info to the sahara-image-create
command.

Change-Id: I6fffe25ee9daf651355611be675137babb67e2a8
2017-04-14 11:24:05 +08:00
Mikhail Lelyakin b7d83da204 Merge Vanilla and Spark plugins
Add the spark element to vanilla images.
It enables running Spark jobs on vanilla
clusters.

bp spark-jobs-for-vanilla-hadoop
Change-Id: Ie5eb9ec10b0052c9d1f6284b312edfee0ddba4f0
2016-09-07 12:12:58 +00:00
Daniele Venzano f376b0f480 Use Cloudera element for Spark HDFS
Update the Spark element to use the existing hadoop-cloudera element for HDFS
for Spark versions > 1.0, instead of the ad-hoc cloudera-cdh one. For Spark 1.0.2,
CDH4 via the old hadoop-cdh element is used, since a precompiled binary for CDH5
is not available.

This change also makes it possible to specify an arbitrary Spark version via the
new -s commandline switch, reducing the amount of code for supporting future
versions of Spark. The defaults for Spark are 1.3.1 and CDH 5.3, a combination
that works well in our deployments.

A small change is needed in the cloudera element: when creating a Spark image,
only the HDFS packages have to be installed.

README files have been updated to clarify that default versions are tested, while
other combinations are not. A reference to the SparkPlugin wiki page was added
to point to a table of supported versions.

Change-Id: Ifc2a0c8729981e1e1df79b556a4c2e6bd1ba893a
Implements: blueprint support-spark-1-3
Depends-On: I8fa482b6d1d6abaa6633aec309a3ba826a8b7ebb
2015-07-10 14:12:37 +00:00
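The commit above introduces a `-s` switch for selecting an arbitrary Spark version, defaulting to Spark 1.3.1 with CDH 5.3. A minimal sketch of how such a switch might be parsed with `getopts` — the `parse_args` helper and variable names are illustrative assumptions, not the actual script:

```shell
# Sketch: default versions from the commit message, overridable via -s.
# This is NOT the real diskimage-create code, only the pattern it describes.
SPARK_VERSION="1.3.1"
CDH_VERSION="5.3"

parse_args() {
    # Hypothetical helper: "-s <version>" overrides the default Spark version.
    local OPTIND opt
    while getopts "s:" opt; do
        case "$opt" in
            s) SPARK_VERSION="$OPTARG" ;;
        esac
    done
}
```

With no `-s` argument the tested default combination is kept, which matches the commit's note that only the defaults are verified.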
Pino Toscano f2fac65cc4 Start caching large-sized resources
Use the cache-url script (in the element cache-url) to download and
cache resources which might be expensive (mostly because of their size)
to fetch every time.
As the shared cache ($DIB_IMAGE_CACHE) is available only when running
the root.d elements, move the download phases to root.d scripts.

Change-Id: Iec3e0f92e62c4c9542487a3c228ba8f9e884e5dd
2015-04-03 15:45:16 +02:00
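The caching commit above moves downloads into root.d scripts so they can reuse the shared `$DIB_IMAGE_CACHE`. A minimal sketch of that fetch-once-then-reuse pattern — `fetch_cached` is a hypothetical stand-in for the real `cache-url` script, and `cp` stands in for the actual download so the sketch stays offline:

```shell
# Shared cache location; diskimage-builder exposes DIB_IMAGE_CACHE to
# root.d scripts, so a default is only a fallback for this sketch.
DIB_IMAGE_CACHE="${DIB_IMAGE_CACHE:-/tmp/dib-cache}"

fetch_cached() {
    # Copy $1 to $2, reusing a cached copy under $DIB_IMAGE_CACHE if present.
    local url="$1" dest="$2"
    local cached="$DIB_IMAGE_CACHE/$(basename "$url")"
    mkdir -p "$DIB_IMAGE_CACHE"
    if [ ! -f "$cached" ]; then
        # The real element fetches over the network here (what cache-url does);
        # cp keeps this illustration self-contained.
        cp "$url" "$cached"
    fi
    cp "$cached" "$dest"
}
```

On a second run the expensive fetch is skipped because the file already sits in the cache, which is exactly why large resources are worth routing through it.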