3. Background
• 300+ PB of data stored in Hadoop/HDFS-based clusters
• Running more queries and getting results faster improves
the productivity of analysts, data scientists, and engineers
• MapReduce and Hive are designed for large-scale,
reliable computation
• External projects were too nascent or did not meet our
requirements for flexibility and scale
Thursday, 24 April, 14
5. Key points for low latency
• In-memory parallel computing
• Pipelining
• Data-local computation
• Data cache
• Dynamic compilation of part of the plan to bytecode
• Careful use of memory and data structures
• BlinkDB-like approximate queries
• Traditional SQL optimization
• GC control
7. In-memory parallel computing
select c1.rank, count(*) from dim.city c1 join dim.city c2 on c1.id = c2.id
where c1.id > 10 group by c1.rank limit 10;
14. Data-local computation
• Select acceptable nodes (at least 10 nodes by
default)
– Nodes with the same address as the data
– If not enough, add nodes in the same rack
– If not enough, randomly select nodes in other racks
• Select the node with the smallest number of
assignments (pending tasks)
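The selection steps above can be sketched roughly as follows. This is a minimal illustration, not Presto's actual scheduler code: the `Node` class, its fields, and the method names are hypothetical, and racks are assumed to be plain string labels.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

class NodeSelectorSketch {
    // Hypothetical worker description: host address, rack label, queued tasks.
    static final class Node {
        final String address;
        final String rack;
        final int pendingTasks;

        Node(String address, String rack, int pendingTasks) {
            this.address = address;
            this.rack = rack;
            this.pendingTasks = pendingTasks;
        }
    }

    // Build the candidate set: same address first, then same rack,
    // then random nodes from other racks until minCandidates is reached.
    static List<Node> candidates(List<Node> all, String dataAddress, String dataRack,
                                 int minCandidates, Random random) {
        Set<Node> chosen = new LinkedHashSet<>();
        for (Node node : all) {
            if (node.address.equals(dataAddress)) {
                chosen.add(node);
            }
        }
        if (chosen.size() < minCandidates) {
            for (Node node : all) {
                if (node.rack.equals(dataRack)) {
                    chosen.add(node);
                }
            }
        }
        if (chosen.size() < minCandidates) {
            List<Node> rest = new ArrayList<>(all);
            rest.removeAll(chosen);
            Collections.shuffle(rest, random);
            for (Node node : rest) {
                if (chosen.size() >= minCandidates) {
                    break;
                }
                chosen.add(node);
            }
        }
        return new ArrayList<>(chosen);
    }

    // Among the candidates, pick the node with the fewest pending tasks.
    // Throws NoSuchElementException if the candidate list is empty.
    static Node pick(List<Node> candidates) {
        return Collections.min(candidates, Comparator.comparingInt((Node n) -> n.pendingTasks));
    }
}
```

With the default of 10 acceptable nodes, a call would look like `pick(candidates(allNodes, address, rack, 10, new Random()))`.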
15. Data cache
• Google Guava LoadingCache
• Cached Objects
– Hive metadata: databases, tables, partitions
– Bytecode classes:
FilterAndProjectOperatorFactoryFactory,
ScanFilterAndProjectOperatorFactoryFactory
– functions
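The load-on-miss behaviour of such a cache can be sketched with the standard library alone. This is a stand-in for illustration only: it uses `ConcurrentHashMap.computeIfAbsent` rather than Guava's `LoadingCache`, it has no eviction or expiry, and the class name `LoadingCacheSketch` is hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

class LoadingCacheSketch<K, V> {
    private final ConcurrentHashMap<K, V> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;
    private final AtomicInteger loads = new AtomicInteger();

    LoadingCacheSketch(Function<K, V> loader) {
        this.loader = loader;
    }

    // Return the cached value, invoking the loader only on the first request for a key.
    V get(K key) {
        return entries.computeIfAbsent(key, k -> {
            loads.incrementAndGet();
            return loader.apply(k);
        });
    }

    // Number of times the loader has run, i.e. the number of cache misses so far.
    int loadCount() {
        return loads.get();
    }
}
```

In this sketch the loader would be the expensive step, such as a metadata lookup or generating a class for an expression; repeated requests for the same key skip it.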
16. Dynamic compilation of the plan to bytecode
• Presto dynamically compiles FilterAndProjectOperator
and ScanFilterAndProjectOperator to bytecode,
which lets the JIT optimize it and generate native
machine code
• How much does it speed things up?
• ScanFilterAndProjectOperator
17. Careful use of memory and data structures
• Slice
– Unsafe#copyMemory
– 20% to 30% speedup in ORCFile write performance
• ThreadLocalRandom
– Thread-local seed instead of a shared AtomicLong
– 100% speedup
• ListenableFuture
– Async Callback
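The ThreadLocalRandom point can be illustrated with a small sketch. The class and method names here are hypothetical, and no timing is measured: the code only shows the usage pattern in which each thread draws from its own generator state instead of contending on the single AtomicLong seed inside a shared `java.util.Random`.

```java
import java.util.concurrent.ThreadLocalRandom;

class RandomSketch {
    // Pick a bucket in [0, bucketCount). ThreadLocalRandom.current() returns the
    // calling thread's own generator, so concurrent callers share no seed state.
    static int nextBucket(int bucketCount) {
        return ThreadLocalRandom.current().nextInt(bucketCount);
    }

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += nextBucket(16);
            }
            System.out.println(Thread.currentThread().getName() + " sum=" + sum);
        };
        Thread first = new Thread(work, "worker-1");
        Thread second = new Thread(work, "worker-2");
        first.start();
        second.start();
        first.join();
        second.join();
    }
}
```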
20. GC control
• A JDK 1.7 bug
• When the code cache fills up, there is a chance that the JIT
might stop compiling bytecode to native code.
• By forcing classes to unload from the perm gen,
we let the code cache evictor make room before
the cache fills up.
• System.gc()
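One way to apply this idea is to watch code cache usage and request a full collection before it fills. The sketch below is an assumption-laden illustration, not Presto's code: the class name, the percentage threshold, and matching memory pools by a name starting with "Code" are all choices made here (pool names differ between JVM versions).

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class CodeCacheGcSketch {
    // True when used/max has reached thresholdPercent; false if max is undefined (-1).
    static boolean overThreshold(long used, long max, int thresholdPercent) {
        return max > 0 && used * 100 >= max * thresholdPercent;
    }

    // Check the JIT code cache pools and call System.gc() if any is nearly full.
    // Returns true when a collection was requested.
    static boolean collectIfCodeCacheNearlyFull(int thresholdPercent) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: code cache pools have names beginning with "Code".
            if (!pool.getName().startsWith("Code")) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage != null && overThreshold(usage.getUsed(), usage.getMax(), thresholdPercent)) {
                System.gc();
                return true;
            }
        }
        return false;
    }
}
```

A background thread could call `collectIfCodeCacheNearlyFull` periodically; the threshold value to use is not given in the slides.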
21. TPCH benchmark test
• Run presto-main/src/test/java/com/facebook/presto/benchmark/BenchmarkSuite.java
• Part of the result is shown below
22. What we do
• Support Kerberos authentication
• Implicit type coercion
• Support reading LZO-compressed tables
• Implement useful functions
• Fix planning issue when using DISTINCT aggregations
in HAVING clause
• https://github.com/MTDATA/presto/commits/mt-0.60