ef3c1ab4d5
Makes the query tens of times faster. While debugging a Presto query performance issue, I observed that seeking in Sahara-extra is expensive and sometimes unnecessary. The best way to avoid the overhead of unnecessary seek calls is to seek only when the client actually needs the data. After this change, the same Presto query ran 30 times faster. Both the Presto and S3 clients have made similar changes. Change-Id: I8586af0d481fd08d48620e699467280f7b93150a
Repository contents:
- common-artifacts
- edp-adapt-for-oozie
- edp-adapt-for-spark
- edp-examples
- hadoop-swiftfs
- tools
- .gitignore
- .gitreview
- .mailmap
- CONTRIBUTING.rst
- HACKING.rst
- LICENSE
- MANIFEST.in
- README.rst
- requirements.txt
- setup.cfg
- setup.py
- test-requirements.txt
- tox.ini
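The lazy-seek idea from the commit message at the top of this page (seek only when the client really needs the data) can be sketched as follows. This is a minimal illustration in Python, not the code from this repository: the class name `LazySeekReader` and the counter `real_seeks` are hypothetical, and an in-memory `io.BytesIO` stands in for a remote object store where each real seek would be expensive.

```python
import io


class LazySeekReader:
    """Hypothetical wrapper that defers seek() until data is actually read."""

    def __init__(self, raw):
        self._raw = raw
        self._target = raw.tell()  # position the caller believes it is at
        self.real_seeks = 0        # seeks actually issued to the backend

    def seek(self, pos):
        # Only record the requested position; no backend call is made yet.
        self._target = pos

    def tell(self):
        return self._target

    def read(self, n=-1):
        # Reposition the backend only if it is not already where we need it.
        if self._raw.tell() != self._target:
            self._raw.seek(self._target)
            self.real_seeks += 1
        data = self._raw.read(n)
        self._target += len(data)
        return data


reader = LazySeekReader(io.BytesIO(b"0123456789"))
reader.seek(7)
reader.seek(2)
reader.seek(4)          # three requests, still no backend seek
chunk = reader.read(3)  # one backend seek happens here
```

In this sketch, repeated seeks with no read in between cost nothing, and a read that continues from the current backend position skips the seek entirely.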
README.rst
OpenStack Data Processing ("Sahara") extra repo
Sahara-extra is the place for Sahara components that are not included in the main Sahara repository.
Here is the list of components:
- Sources for the Swift filesystem implementation for Hadoop: https://github.com/openstack/sahara-extra/blob/master/hadoop-swiftfs/README.rst
- Sources for the main function wrapper adapted for Oozie: https://github.com/openstack/sahara-extra/blob/master/edp-adapt-for-oozie/README.rst
- Sources for the main function wrapper adapted for Spark: https://github.com/openstack/sahara-extra/blob/master/edp-adapt-for-spark/README.rst
- Diskimage-builder elements moved to the new repo: https://github.com/openstack/sahara-image-elements
Tools for building artifacts are located in the tools directory.