sahara-extra

Commit Graph


Author

SHA1

Message

Date

Ray Zhang

ef3c1ab4d5

Adds lazy seek

Can make queries tens of times faster.

While debugging a Presto query performance issue, I observed that
seeking in sahara-extra is expensive and sometimes even unnecessary.
The best way to avoid the overhead of unnecessary seek calls
is to seek only when the client really needs the data.
After this change, the same Presto query runs 30 times faster.
Both the Presto and S3 clients have added similar changes too.

Change-Id: I8586af0d481fd08d48620e699467280f7b93150a

2016-10-05 13:26:54 -07:00
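
The lazy seek described in commit ef3c1ab4d5 can be illustrated with a minimal sketch. This is not the code from that commit: CountingStore and LazySeekInputStream are hypothetical names, and an in-memory byte array stands in for the remote object store so that the number of costly "open at offset" calls can be counted. The idea shown is that seek() only records the requested offset, and the backing stream is repositioned on the first read() that actually needs data.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class LazySeekDemo {
    public static void main(String[] args) throws IOException {
        CountingStore store =
            new CountingStore("0123456789".getBytes(StandardCharsets.US_ASCII));
        try (LazySeekInputStream stream = new LazySeekInputStream(store)) {
            stream.seek(5);
            stream.seek(2);
            stream.seek(7); // three seeks so far, none of them contacts the store
            System.out.println("requests after seeks: " + store.rangeRequests);
            System.out.println("byte read: " + (char) stream.read());
            System.out.println("requests after read: " + store.rangeRequests);
        }
    }
}

/** Hypothetical stand-in for a remote object store; counts costly open calls. */
class CountingStore {
    private final byte[] data;
    int rangeRequests = 0;

    CountingStore(byte[] data) {
        this.data = data;
    }

    long length() {
        return data.length;
    }

    /** Simulates opening a remote stream positioned at the given offset. */
    InputStream openAt(long offset) {
        rangeRequests++;
        int start = (int) offset;
        return new ByteArrayInputStream(data, start, data.length - start);
    }
}

/** Records seek targets and contacts the store only when a read needs data. */
class LazySeekInputStream extends InputStream {
    private final CountingStore store;
    private InputStream in;       // open backing stream, or null if none yet
    private long streamPos = -1;  // offset of the next byte the backing stream returns
    private long targetPos = 0;   // offset the caller last asked for

    LazySeekInputStream(CountingStore store) {
        this.store = store;
    }

    /** Cheap: validates and remembers the offset; no request is issued. */
    void seek(long pos) throws IOException {
        if (pos < 0 || pos > store.length()) {
            throw new IOException("Invalid seek offset: " + pos);
        }
        targetPos = pos;
    }

    long getPos() {
        return targetPos;
    }

    @Override
    public int read() throws IOException {
        if (targetPos >= store.length()) {
            return -1; // end of object; nothing to fetch
        }
        if (in == null || streamPos != targetPos) {
            // The deferred seek is performed here, on first real use.
            if (in != null) {
                in.close();
            }
            in = store.openAt(targetPos);
            streamPos = targetPos;
        }
        int b = in.read();
        if (b >= 0) {
            streamPos++;
            targetPos++;
        }
        return b;
    }

    @Override
    public void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;
        }
    }
}
```

In this sketch, repeated seeks without an intervening read cost nothing, and sequential reads reuse the already open stream. A real filesystem client would also need buffered reads and error handling, which are omitted here.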

Nadya Privalova

24e66b7dda

Sources for hadoop-patch

Sources were obtained from https://issues.apache.org/jira/secure/attachment/12583703/HADOOP-8545-033.patch
by running the "patch" command. All files related to Hadoop-common were skipped during patching.

Changes made after patching:
* updated pom.xml to use the hadoop-core 1.1.2 dependency
* removed the dependency on Hadoop 2.x in code (@Override and isDirectory() -> isDir())
* removed Hadoop 2.x tests

There are no unit tests, only integration tests.

Change-Id: I8d7c2f544d14f79597fcdefe27ecae0d43b6df9e

2013-08-23 16:32:14 +04:00

2 Commits