Commit Graph

2 Commits

Author SHA1 Message Date
Ray Zhang ef3c1ab4d5 Adds the lazy seek
Can make queries tens of times faster.

While debugging a Presto query performance issue, I observed that
seeking in Sahara-extra is expensive and sometimes unnecessary.
The best way to avoid the overhead and the unnecessary seek calls
is to seek only when the client actually needs the data.
After this change, the same query in Presto runs 30 times faster.
The Presto and S3 clients have made similar changes.

Change-Id: I8586af0d481fd08d48620e699467280f7b93150a
2016-10-05 13:26:54 -07:00
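
The lazy seek idea described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the Sahara-extra code: `RangeOpener` and `LazySeekInputStream` are hypothetical names, and the sketch assumes the backend can open a stream at an arbitrary byte offset. `seek()` only records the requested position, and the costly remote open happens on the first `read()` that needs data.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical backend: each call stands for one costly remote request. */
interface RangeOpener {
    InputStream openAt(long offset) throws IOException;
}

/** Sketch of a stream that defers seeking until data is actually read. */
class LazySeekInputStream extends InputStream {
    private final RangeOpener opener;
    private InputStream in;      // currently open remote stream, or null
    private long streamPos = 0;  // position of the open remote stream
    private long targetPos = 0;  // position the caller last asked for

    LazySeekInputStream(RangeOpener opener) {
        this.opener = opener;
    }

    /** Records the target position only; no remote call is made here. */
    void seek(long pos) throws IOException {
        if (pos < 0) {
            throw new IOException("negative seek position: " + pos);
        }
        targetPos = pos;
    }

    long getPos() {
        return targetPos;
    }

    /** Performs the real seek, and only if the positions differ. */
    private void seekIfNeeded() throws IOException {
        if (in != null && streamPos == targetPos) {
            return; // already at the right place, nothing to do
        }
        if (in != null) {
            in.close();
        }
        // Simplification: always reopen rather than skipping forward.
        in = opener.openAt(targetPos);
        streamPos = targetPos;
    }

    @Override
    public int read() throws IOException {
        seekIfNeeded();
        int b = in.read();
        if (b >= 0) {
            streamPos++;
            targetPos = streamPos;
        }
        return b;
    }

    @Override
    public void close() throws IOException {
        if (in != null) {
            in.close();
            in = null;
        }
    }
}
```

With this shape, a client that calls `seek()` several times before reading pays for at most one remote open, and a seek to the current position costs nothing.
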
Nadya Privalova 24e66b7dda Sources for hadoop-patch
Sources were obtained from https://issues.apache.org/jira/secure/attachment/12583703/HADOOP-8545-033.patch
by running the "patch" command. All files related to Hadoop-common were skipped during patching.

Changes made after patching:
* pom.xml was updated to use the hadoop-core 1.1.2 dependency
* removed dependencies on Hadoop 2.x in code (@Override and isDirectory() -> isDir())
* removed Hadoop 2.x tests

There are no unit tests, only integration tests.

Change-Id: I8d7c2f544d14f79597fcdefe27ecae0d43b6df9e
2013-08-23 16:32:14 +04:00