February 2014 HUG : Pig On Tez
1. Pig on Tez
PRESENTED BY
Cheolsoo Park, Netflix
Rohini Palaniswamy, Yahoo!
The Apache Software Foundation
2. Apache Pig on Tez Team
Name                 Role                                         Company
Cheolsoo Park        VP, Apache Pig                               Netflix
Daniel Dai           Apache Pig PMC                               Hortonworks
Mark Wagner          Apache Pig Committer                         LinkedIn
Olga Natkovich       Apache Pig PMC, Pig on Tez Project Manager   Yahoo!
Rohini Palaniswamy   Apache Pig PMC                               Yahoo!
Alex Bain            Apache Pig Contributor                       LinkedIn
3. Agenda
Overview
Pig and Hive
Pig on Tez
Why Tez?
Benefits of Tez
Design
Operator DAGs
Performance
Known Issues
Where are we?
What next?
4. Pig Overview
Apache top-level project for ETL on Hadoop.
Pig Latin - a procedural scripting language that translates a sequence of data-processing
steps into MapReduce jobs.
Easy to write, read, and reuse, and very extensible.
Feature parity with SQL (FILTER BY, CROSS, JOIN (OUTER, INNER), ORDER BY, LIMIT, RANK, ROLLUP,
CUBE), custom Loaders and Storers, user-defined functions (Java and non-Java), nested
FOREACH, streaming, macros, and much more
PAGEVIEWS = LOAD '/data/pageviews' AS (user, url);
GRP = GROUP PAGEVIEWS BY user;
CNT = FOREACH GRP GENERATE group, COUNT(url) AS numvisits;
STORE CNT INTO '/data/visited' USING PigStorage(',');
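As an illustration of what this script computes, here is a single-process Python sketch of the same group-and-count (illustrative only; Pig compiles these steps into MapReduce jobs rather than running anything like this):

```python
from collections import defaultdict

def count_visits(pageviews):
    """pageviews: iterable of (user, url) tuples -> {user: visit count}.
    Mirrors GROUP ... BY user followed by COUNT(url)."""
    counts = defaultdict(int)
    for user, url in pageviews:
        counts[user] += 1
    return dict(counts)

print(count_visits([("alice", "/a"), ("alice", "/b"), ("bob", "/a")]))
# {'alice': 2, 'bob': 1}
```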
5. Pig and Hive
                     Pig                                     Hive
Language             Pig Latin - procedural                  SQL - declarative
Features             Feature rich; new operators and         Limited to SQL operators
                     constructs are easy to add, e.g.
                     nested FOREACH, switch case,
                     macros, scalars
Developer code       Load/StoreFunc, Algebraic and           StorageHandler, Java UDFs
                     Accumulator Java UDFs, non-Java
                     UDFs (Jython, Python, JavaScript,
                     Groovy, JRuby), custom partitioners
Complex processing   Well suited; multi-query works well     Not a good fit
                     with 1000s of lines of Pig script
Server               Client only; can work with the Hive     Requires a Metastore server, and
                     Metastore via HCatLoader/Storer         data must be registered in it;
                                                             HiveServer2 for JDBC
6. Pig and Hive - Continued
                       Pig                                   Hive
Tez as execution       Planned for 0.14                      Planned for Hive 0.13
engine
ORCFile support        Patch available; currently            From Hive 0.12 onwards;
                       through HCatLoader                    huge performance gains
Vectorization          No; maybe in the future               Yes; huge performance gains
Transactions           No                                    Yes; in the works
Cost-based optimizer   No                                    Yes; in the works
JDBC support,          No                                    Yes; HiveServer2 with
BI tool integration                                          MicroStrategy/Tableau
Area of application    Pipeline processing                   Interactive analytics/
                       language standard                     reporting platform
7. Why Tez?
Built on top of YARN
- Multi-tenancy (queues, capacity management)
- Resource allocation
DAG execution framework
- A more natural fit than MR for Pig and Hive, since their execution plans are DAGs.
- Better than running a DAG of MR jobs that pass data between jobs through HDFS as an intermediate store.
- Different types of edges: ONE_ONE, BROADCAST, SHUFFLE
Flexible Input-Processor-Output runtime model
- Custom vertex processors, e.g. Map Processor, Reduce Processor, Pig Processor
- Custom inputs, e.g. MRFileInput (input to a map), ShuffledMergedInput (input to a reduce)
- Custom outputs, e.g. OnFileSortedOutput (output of a map), MRFileOutput (output of a reduce)
- Multiple inputs and outputs
Highly extensible
Security
Support from the Tez and Hive communities
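The edge types above can be pictured with a minimal sketch (a toy Python data model of Tez concepts, not the actual Tez Java API; the vertex names are hypothetical):

```python
# Toy model of a Tez-style DAG with typed edges (illustrative only).
ONE_ONE, BROADCAST, SHUFFLE = "ONE_ONE", "BROADCAST", "SHUFFLE"

class Vertex:
    def __init__(self, name):
        self.name = name
        self.outputs = []   # list of (edge_type, downstream Vertex)

def connect(src, dst, edge_type):
    src.outputs.append((edge_type, dst))

# E.g. a sampled order-by plan: the sampler broadcasts a partition map to
# every sort task, while the loader passes records through a 1-1 edge.
load, sample, sort_v = Vertex("load"), Vertex("sample"), Vertex("sort")
connect(load, sort_v, ONE_ONE)       # same task index passes data through
connect(sample, sort_v, BROADCAST)   # every sort task gets the sample map
print([(e, v.name) for e, v in load.outputs])  # [('ONE_ONE', 'sort')]
```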
8. Why Tez? – As an end user
Better Performance
Reduced Resource Usage (Containers/Memory/CPU)
Reduced Network I/O
Reduced Namenode and Datanode load
9. Benefits of Tez
Features and their benefits:

No intermediate data storage:
- Less pressure on the Namenode: fewer calls for listing and getting block locations, smaller namespace usage, cuts down on GC
- Less pressure on the Datanodes: cuts down on network IO for both writing and reading, and saves space since there is no 3x replication
- Eliminates the extra step of map reads from HDFS in every intermediate job of the DAG, saving capacity by eliminating the need for map task containers

Single AM for the whole DAG:
- Saves capacity: for a 5-stage MR job, 5 AM containers would be launched
- Eliminates the queue and resource contention faced in MR by jobs started after the previous job in the DAG completes
10. Benefits of Tez - Continued
Features and their benefits:

Container reuse:
- Reduced launch overhead: container request and release, resource localization, and JVM launch time
- Reduced network IO: reduce tasks can be launched on the same node as the map, and 1-1 edge tasks can be launched on the same node

Vertex caching:
- Memory structures such as small tables used for joins can be cached in the JVM and reused by the next task on container reuse, giving a significant performance speedup

Custom inputs and outputs:
- Using unsorted input and output where possible saves a lot of CPU and increases performance

Dynamic reducer estimation:
- Saves capacity: the number of reducers can be based on data size instead of being fixed
11. Pig on Tez - Design
Logical Plan --(LogToPhyTranslationVisitor)--> Physical Plan
Physical Plan --(TezCompiler)--> Tez Plan --> Tez Execution Engine
Physical Plan --(MRCompiler)--> MR Plan --> MR Execution Engine
12. Pig on Tez – Join
l = LOAD 'left' AS (x, y);
r = LOAD 'right' AS (x, z);
j = JOIN l BY x, r BY x;

[Diagram] MR plan: a single job reads both the left and right splits ("Load L and R") with one configuration per job and joins in the reduce. Tez plan: separate "Load L" and "Load R" vertices, each with its own configuration per input, feed a single Join vertex.
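The join itself, once both inputs have been shuffled on x, amounts to a hash join. A minimal single-process Python sketch of that step (illustrative only, not Pig's implementation):

```python
from collections import defaultdict

def hash_join(left, right):
    """left: (x, y) rows, right: (x, z) rows -> (x, y, z) rows.
    Builds a hash table on the left keys, then probes with the right rows."""
    by_key = defaultdict(list)
    for x, y in left:
        by_key[x].append(y)
    return [(x, y, z) for x, z in right for y in by_key.get(x, [])]

print(hash_join([(1, "a"), (2, "b")], [(1, "c"), (3, "d")]))
# [(1, 'a', 'c')]
```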
13. Pig on Tez – Split + Group-by
f = LOAD 'foo' AS (x, y, z);
g1 = GROUP f BY y;
g2 = GROUP f BY z;
j = JOIN g1 BY group, g2 BY group;

[Diagram] MR plan: "Load foo" writes multiple outputs to HDFS; separate "Group by y" and "Group by z" jobs read them back; a final job loads g1 and g2 and joins. Tez plan: a single "Load foo" vertex splits and multiplexes its output to the "Group by y" and "Group by z" vertices, which de-multiplex and feed the Join vertex directly, with a reduce following a reduce and no HDFS in between.
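The multiplex/de-multiplex idea can be sketched in a few lines of Python (illustrative only; Pig's actual operators differ): each record is tagged with the group-by it feeds, so both groupings ride through a single shuffle.

```python
from collections import defaultdict

def split_group(rows):
    """rows: (x, y, z) tuples -> (groups by y, groups by z) in one pass."""
    tagged = []
    for row in rows:
        tagged.append((0, row[1], row))   # stream 0: GROUP f BY y
        tagged.append((1, row[2], row))   # stream 1: GROUP f BY z
    g1, g2 = defaultdict(list), defaultdict(list)
    for tag, key, row in tagged:          # de-multiplex after one "shuffle"
        (g1 if tag == 0 else g2)[key].append(row)
    return dict(g1), dict(g2)
```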
14. Pig on Tez – Order-by
f = LOAD 'foo' AS (x, y);
o = ORDER f BY x;

[Diagram] MR plan: a "Load & Sample" job writes to HDFS, an "Aggregate" job builds the sample map and stages it on the distributed cache, and a final job partitions and sorts. Tez plan: a Sample vertex feeds an Aggregate vertex that broadcasts the sample map to the Partition and Sort vertices, while the loaded input is passed through via a 1-1 edge.
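The sample-based ordering above can be sketched in Python (an illustrative single-process model, not Pig's implementation): sample the keys, derive range-partition cut points, route each record to its range, and sort each partition locally.

```python
import random

def range_partition_sort(keys, num_parts=3, sample_size=20):
    """Globally sort keys by range-partitioning on sampled cut points."""
    sample = sorted(random.sample(keys, min(sample_size, len(keys))))
    # Cut points split the sampled key space into num_parts ranges.
    cuts = [sample[len(sample) * i // num_parts] for i in range(1, num_parts)]
    parts = [[] for _ in range(num_parts)]
    for k in keys:
        idx = sum(k >= c for c in cuts)   # which range this key falls in
        parts[idx].append(k)
    for p in parts:
        p.sort()                          # each "reducer" sorts locally
    return [k for p in parts for k in p]  # concatenation is globally sorted
```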
15. Pig on Tez – Skewed join
l = LOAD 'left' AS (x, y);
r = LOAD 'right' AS (x, z);
j = JOIN l BY x, r BY x USING 'skewed';

[Diagram] MR plan: a "Load & Sample" job samples L and writes to HDFS, an "Aggregate" job builds the sample map and stages it on the distributed cache, and a final job partitions L and R and joins. Tez plan: a "Sample L" vertex feeds an Aggregate vertex that broadcasts the sample map to the "Partition L" and "Partition R" vertices feeding the Join vertex, while the input is passed through via a 1-1 edge.
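The skew-handling idea can be sketched as follows (illustrative Python with a hypothetical threshold parameter; Pig's actual sampler and partitioner differ): keys whose sampled frequency on the left exceeds a threshold are spread over several reducers, and the matching right-side rows are replicated to each of those reducers.

```python
from collections import Counter

def skewed_partitions(left_keys, num_reducers, threshold):
    """Return {hot_key: [reducer ids]} from a sample of left-side keys.
    Non-hot keys are absent and would be hash-partitioned as usual."""
    freq = Counter(left_keys)
    plan = {}
    for key, n in freq.items():
        if n > threshold:                              # skewed key
            width = min(num_reducers, -(-n // threshold))  # ceil(n/threshold)
            plan[key] = list(range(width))             # spread across reducers
    return plan

print(skewed_partitions([1, 1, 1, 1, 2], num_reducers=4, threshold=2))
# {1: [0, 1]}
```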
16. Performance numbers

[Bar chart: time in seconds (0-5000) for MR vs Tez, speedups in parentheses]
- Replicated Join (2.8x)
- Join + Groupby (1.5x)
- Join + Groupby + Orderby (1.5x)
- 3 way Split + Join + Groupby + Orderby (2.6x)
17. Factors affecting performance
Number of stages in the DAG
- The higher the number of stages in the DAG, the better Tez performs relative to MR.
Cluster/queue capacity
- The more congested a queue is, the better Tez performs relative to MR, thanks to container reuse.
Size of intermediate output
- The larger the intermediate output, the better Tez performs relative to MR, thanks to reduced HDFS usage.
Size of data in the job
- For smaller data and more stages, Tez performs better relative to MR, since launch overhead is a larger fraction of total time for small jobs.
Vertex caching
18. Container usage
Query                                    MR     Tez    Savings  Tez with container reuse
Replicated Join                          7563   7562   1        180
Join + Groupby + Orderby                 7655   7603   52       180
Join + Groupby + Orderby                 7663   7609   54       180
3 way Split + Join + Groupby + Order by  622    563    59       180

Note: the cluster had 25 nodes with 180 containers (1.5 GB each), which Tez reused repeatedly across tasks.
19. Known issues
Container reuse will have issues when there are:
- Static variables in LoadFunc, StoreFunc, or UDFs
- Memory leaks in LoadFunc, StoreFunc, or UDFs
With single-DAG execution of the whole script, AM retries can be very costly until
Tez supports checkpointing and resuming.
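The static-variable hazard can be illustrated with a toy Python analogue (class-level state standing in for Java statics; MyLoadFunc is hypothetical, not a Pig class):

```python
# With container reuse, state stashed in "static" storage survives into the
# next task that runs in the same container; under MR each task got a fresh
# JVM, so code like this used to be safe.
class MyLoadFunc:
    seen = []                        # class-level state: shared across tasks

    def load(self, record):
        MyLoadFunc.seen.append(record)   # leaks into the next task
        return record

task1 = MyLoadFunc()
task1.load("a")
task2 = MyLoadFunc()                 # a "new task" in the reused container
print(task2.seen)                    # ['a'] -- stale state from task 1
```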
20. Where are we?
Major operators
Split, Union
Group-by, Distinct, Limit
Order-by
Load, Store, Filter-by, Foreach
Hash join, Replicated join, Skewed join, Merge join
UDFs (Java and non-Java)
Streaming
Multi-query on and off
Macros
Scalars
95% of e2e tests pass for finished features.
21. What next?
Feature Parity with MR
Local mode
Port all unit and e2e tests
Support for remaining Operators
CROSS, RANK, CUBE, ROLLUP
Support for native MapReduce (low priority)
Merge tez branch with trunk
Stability
Handling failures
Testing and tuning for large data and DAGs with > 10 stages
Usability
Counters
Progress Information
Log information and debuggability
22. What next? – Performance Improvements
› Dynamic reducer estimation
› Better memory management
› Calculate input splits in the AM and let Tez combine input splits for pig.maxCombinedSplitSize
› Vertex grouping to write data directly into one output directory from multiple vertices in the case of a union
› Use unsorted shuffle in Union, Orderby, Skewed Join, etc. to improve performance
› Shared edges for multiple outputs when the same data has to go to multiple downstream vertices, e.g. multi-query off, skewed-join sample aggregation output
› HDFS caching
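Dynamic reducer estimation boils down to sizing parallelism from observed output bytes rather than a fixed count. A minimal sketch (illustrative Python; the parameter names and defaults are assumptions, loosely modeled on Pig's bytes-per-reducer heuristic):

```python
import math

def estimate_reducers(output_bytes, bytes_per_reducer=1 << 30, max_reducers=999):
    """Pick reducer parallelism from intermediate output size (sketch).
    Assumed defaults: ~1 GB per reducer, capped at 999 reducers."""
    return max(1, min(max_reducers, math.ceil(output_bytes / bytes_per_reducer)))

print(estimate_reducers(5 * (1 << 30)))  # 5 reducers for ~5 GB of output
```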