1
What's new in Hadoop 3.0
Heiko Loewe
@loeweh
2
Tributes
• Zhe Zhang, Erasure Coding
• Akira Ajisaka, Script Rewrite
• Junping Du, Yarn Timeline Service v2
• And many others
3
Hadoop 3.0 Roadmap
• Hadoop 3.x Releases
• Planned for hadoop-3.0.0
Classpath isolation on by default HADOOP-11656
• hadoop-3.0.0-alpha1
– HADOOP
• Move to JDK8+
• Shell script rewrite HADOOP-9902
• Move default ports out of ephemeral range HDFS-9427
– HDFS
• Removal of hftp in favor of webhdfs HDFS-5570
• Support for more than two standby NameNodes HDFS-6440
• Support for Erasure Codes in HDFS HDFS-7285
• Intra-datanode balancer HDFS-1312
– YARN
• YARN Timeline Service v.2 YARN-2928
– MAPREDUCE
• Derive heap size or mapreduce.*.memory.mb automatically MAPREDUCE-5785
4
Release / Features / 2.X
Release  Feature
2.0      NameNode High-Availability
2.2      Federation, Snapshots, NFS v3 Mount
2.3      Heterogeneous Storage (Phase 1), In-Memory Caching
2.4      Rolling Upgrades, POSIX ACLs
2.5      Extended Attributes
2.6      Hot Swap Volumes, Heterogeneous Storage (Phase 2), Transparent Encryption
2.7      Files with variable Block Length, Inotify
2.8      Docker Containers in Linux, YARN ATS 1.5, changing resources of allocated YARN Containers
2.9      TBD
5
What's Apache Hadoop 3?
[Figure: release timeline, 2009 to 2016, with four lines of development: branch-1 (formerly branch-0.20), branch-0.23, branch-2 (Hadoop2) and trunk (Hadoop3)]
Releases marked: 0.20.1, 0.20.205, 0.21.0, 0.22.0, 0.23.0, 0.23.11 (final), 1.0.0, 1.1.0, 1.2.1 (stable), 2.0.0-alpha, 2.1.0-beta, 2.2.0, 2.3.0, 2.4.0, 2.5.0, 2.6.0, 2.7.0
Features annotated: Security, New append, NameNode Federation, YARN, NameNode HA, HDFS Snapshots, NFSv3 support, Windows, Heterogeneous storage, HDFS in-memory caching, HDFS ACLs, HDFS Rolling Upgrades, Application History Server, RM Automatic Failover, YARN Rolling Upgrades, Transparent Encryption, Archival Storage, Drop JDK6 support, Truncate API
Hadoop 3 and Hadoop 2 diverged in 2011 (5 years ago!)
Hadoop 1 is end-of-life (EOL)
Source: Akira Ajisaka
6
Break compatibility
A major version bump is the opportunity to clean up the code
• Deprecated APIs can be removed only when the major version changes
– @Public and @Stable Java API
– REST API
– Metrics/JMX
– CLI
– Environment variables
• Wire-compatibility can be broken
– 2.X client cannot talk to 3.X server and vice versa
• Compatibility Guide:
– Apache Hadoop 3.0.0-alpha1 – Apache Hadoop Compatibility
7
Erasure Coding
8
Traditional Hadoop
• HDFS inherits 3-way replication from Google File System
• Simple, scalable and robust
• 200% storage overhead
• Secondary replicas rarely accessed
9
Erasure Coding Saves Storage
• Simplified Example: storing 2 bits
• Same data durability
– can lose any 1 bit
• Pro: Half the storage overhead
• Cons: Slower recovery
Replication: 1 0 stored as 1 0 + 1 0 = 2 extra bits
XOR Coding: 1 0 stored as 1 0 + (1 ⊕ 0 = 1) = 1 extra bit
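The two-bit example can be replayed in a few lines. This is only an illustrative sketch of XOR parity, not HDFS code; the function names `xor_parity` and `recover` are made up for this example.

```python
def xor_parity(bits):
    """Return the single parity bit: the XOR of all data bits."""
    parity = 0
    for b in bits:
        parity ^= b
    return parity


def recover(bits_with_gap, parity):
    """Rebuild the one lost bit (marked None) from the survivors and the parity."""
    missing = [i for i, b in enumerate(bits_with_gap) if b is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost bit")
    value = parity
    for b in bits_with_gap:
        if b is not None:
            value ^= b  # XOR-ing out the survivors leaves the lost bit
    repaired = list(bits_with_gap)
    repaired[missing[0]] = value
    return repaired


data = [1, 0]
parity = xor_parity(data)            # 1 XOR 0
print(recover([None, 0], parity))    # first bit lost, rebuilt from 0 and parity
```

Recovery has to read every surviving bit plus the parity, which is the "slower recovery" cost named on the slide; a replica only needs one copy to be read.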
10
Erasure Coding with m Data, n Parity
Reed Solomon Coding
11
Durability and Efficiency
Data Durability = How many simultaneous failures can be tolerated?
Storage Efficiency = What fraction of the storage holds useful data?
Scheme                  Data Durability  Storage Efficiency
Single Replica          0                100%
3-way Replication       2                33%
XOR with 6 data cells   1                86%
RS (6,3)                3                67%
RS (10,4)               4                71%
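The numbers in the table follow from two simple ratios. The sketch below recomputes them; the helper names are invented for illustration and the formulas are the standard ones (n copies survive n-1 losses; a code with k data and m parity units survives m losses and stores k useful units out of k+m).

```python
def replication(copies):
    """n-way replication: tolerates n-1 failures, 1/n of raw storage is useful."""
    return copies - 1, 1 / copies


def erasure_code(data_units, parity_units):
    """k data + m parity units: tolerates m failures, k/(k+m) is useful."""
    return parity_units, data_units / (data_units + parity_units)


schemes = {
    "Single Replica": replication(1),
    "3-way Replication": replication(3),
    "XOR with 6 data cells": erasure_code(6, 1),
    "RS (6,3)": erasure_code(6, 3),
    "RS (10,4)": erasure_code(10, 4),
}

for name, (tolerated, efficiency) in schemes.items():
    print(f"{name:22s} durability={tolerated}  efficiency={efficiency:.0%}")
```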
12
Contiguous Layout
13
Striped Layout
14
Other Implementation
15
HDFS Roadmap
16
Erasure Coding: Current Status
• Phase 1: striping layout
– C = 64KB (default)
– Works for small files
– No data locality
– Available on trunk
• Phase 2: contiguous layout
– C = 128MB (= HDFS block size)
– Does not work for small files
– Data locality
– Now in progress (HDFS-8030)
[Figure: incoming data is cut into cells of size C, which are spread across DataNode 1, DataNode 2, DataNode 3, DataNode 4, DataNode 5, ...]
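How the cell size C shapes the striping layout can be sketched as a round-robin split. This is a simplified illustration, not the HDFS implementation: parity cells are left out, and the function name `stripe` and the sample inputs are assumptions made for the example.

```python
def stripe(data: bytes, cell_size: int, data_nodes: int):
    """Cut data into cells of cell_size bytes and deal them round-robin to the nodes."""
    per_node = [[] for _ in range(data_nodes)]
    for offset in range(0, len(data), cell_size):
        cell_index = offset // cell_size
        per_node[cell_index % data_nodes].append(data[offset:offset + cell_size])
    return per_node


# Four 2-byte cells over three nodes: the fourth cell wraps around to node 0.
print(stripe(b"abcdefgh", cell_size=2, data_nodes=3))

# A tiny file with a 64KB cell occupies a single cell on the first node only.
print(stripe(b"xy", cell_size=64 * 1024, data_nodes=6))
```

With a small C even small files are spread over several nodes, but consecutive bytes of one file live on different nodes, which is why the striping layout gives up data locality.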
17
Name Server Changes
Mapping Logical and Storage Blocks
Too Many Storage Blocks?
Hierarchical Naming Protocol:
18
Erasure Coding: Write files using (6,3)-Reed-Solomon
• Write data to 9 DNs in parallel
[Figure: the ECClient turns incoming data into 6 data blocks (DN1 to DN6) and 3 parity blocks (DN7 to DN9)]
19
Erasure Coding: Read files
• Read data from 6 DNs in parallel
[Figure: the ECClient reads from DN1 to DN6, out of DN1 to DN9]
20
Erasure Coding: Read files when DN fails
• Read data from any 6 of the remaining DNs in parallel
[Figure: DN6 has failed (×); the ECClient reads from other DNs, including DN7 (labels shown: DN1, DN6, DN7, DN9)]
21
hdfs erasure command
[loewe@loewe hadoop-3.0.0-alpha1]$ bin/hdfs erasurecode -help
Usage: hdfs erasurecode [generic options]
[-getPolicy <path>]
[-help [cmd ...]]
[-listPolicies]
[-setPolicy [-p <policyName>] <path>]
-getPolicy <path> :
Get erasure coding policy information about at specified path
-help [cmd ...] :
Displays help for given command or all commands if none is specified.
-listPolicies :
Get the list of erasure coding policies supported
-setPolicy [-p <policyName>] <path> :
Set a specified erasure coding policy to a directory
Options :
-p <policyName> erasure coding policy name to encode files. If not passed the
default policy will be used
<path> Path to a directory. Under this directory files will be
encoded using specified erasure coding policy
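For scripting, the help text above maps onto three command lines. The sketch below only assembles argument lists from the flags printed in that help output; the function name `erasurecode_argv`, the path `/data/cold` and the policy name `EXAMPLE-POLICY` are hypothetical placeholders (real policy names should be taken from `-listPolicies`).

```python
def erasurecode_argv(action, path=None, policy=None):
    """Build an argv list for 'hdfs erasurecode' using only the flags in the help text."""
    argv = ["hdfs", "erasurecode"]
    if action == "listPolicies":
        return argv + ["-listPolicies"]
    if path is None:
        raise ValueError(f"{action} needs a path")
    if action == "getPolicy":
        return argv + ["-getPolicy", path]
    if action == "setPolicy":
        argv.append("-setPolicy")
        if policy is not None:       # without -p the default policy is used
            argv += ["-p", policy]
        return argv + [path]
    raise ValueError(f"unknown action: {action}")


# Hypothetical directory and policy name, for illustration only.
print(" ".join(erasurecode_argv("setPolicy", "/data/cold", "EXAMPLE-POLICY")))
print(" ".join(erasurecode_argv("getPolicy", "/data/cold")))
```

The resulting list can be handed to `subprocess.run` on a machine where the `hdfs` command is on the PATH.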
22
Acceleration with Intel ISA-L
• 1 legacy coder
– From Facebook’s HDFS-RAID project
• 2 new coders
– Pure Java — code improvement over HDFS-RAID
– Native coder with Intel’s Intelligent Storage Acceleration Library (ISA-L)
23
Benchmarks
24
Benchmarks
25
Benchmarks
26
Benchmarks
27
Yarn
Timeline Service v2
28
First, A bit of Vision…
• The evolution of Hadoop started with YARN
• YARN's evolution will continue to drive Hadoop forward
• Hadoop 3 will still use YARN, but with many improvements
Hadoop 3
29
Several important trends in age of Hadoop 3.0 +
[Figure: YARN and other platform services]
• Platform services: Storage, Resource Management, Security, Service Discovery, Management, Monitoring, Alerts, Governance
• Workloads: IoT assembly (Kafka, Storm, HBase, Solr); MR, Tez, Spark, …; innovating frameworks: Flink, DL (TensorFlow), etc.
• Various environments: On Premise, Private Cloud, Public Cloud
30
Yarn Architecture
[Figure: YARN architecture; labels shown: Yarn, Resource, Database, Scheduler, Application, Timeline Service]
31
YARN Process Flow - Walkthrough
[Figure: a grid of NodeManagers hosting two applications: AM 1 with Containers 1.1 to 1.3 and AM 2 with Containers 2.1 to 2.4; also shown are Client2, the YARN Scheduler, the YARN Timeline Service and a Client (Query)]
32
Why Timeline Service v2
• Scalability and reliability challenges
– Single instance of Timeline Server
– Storage (single local LevelDB instance)
• Usability
– Flow
– Metrics and configuration as first-class citizens
– Metrics aggregation up the entity hierarchy
33
Highlights
v.1                                   v.2
Single writer/reader Timeline Server  Distributed writer/collector architecture
Single local LevelDB storage*         Scalable storage (HBase)
v.1 entity model                      New v.2 entity model
No aggregation                        Metrics aggregation
REST API                              Richer query REST API
34
Architecture
• Separation of writers (“collectors”) and readers
• Distributed collectors: one collector for each app
• Dedicated RM collector for RM-generated data
• Collector discovery via RM
• Pluggable storage with HBase as default storage
35
Distributed Collectors and Readers
36
New Entity Model
• Flows and flow runs as parents of YARN application entities
• First-class configuration (key-value pairs)
• First-class metrics (single-value or time series)
• Designed to handle multi-cluster environments out of the box
37
What is a flow
• A flow is a group of YARN applications that are launched as parts of a logical app
• Oozie, Scalding, Pig, etc.
– name: “frequent_visitor_stat”
– run id: 1466097809000
– version: “b9b9068”
38
Metrics Aggregation
• Application level
– Rolls up sub-application metrics
– Performed in real time in the collectors
in memory
• Flow run level
– Rolls up app level metrics
– Performed in HBase region servers via
coprocessors
• Offline aggregation (TBD)
– Rolls up on user, queue, and flow offline
periodically
– Phoenix tables
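The two roll-up levels can be pictured with plain sums. This is a toy model and not the Timeline Service v.2 code, where the application-level step runs in the collectors and the flow-run step in HBase coprocessors; the class names, the application ids and the metric name `cpu_ms` are invented for the example, while the flow name, run id and version are the ones from the flow slide.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(frozen=True)
class FlowRun:
    name: str
    run_id: int
    version: str


@dataclass
class App:
    app_id: str
    flow_run: FlowRun
    # One dict of metric name -> value per container (the sub-application level).
    container_metrics: list = field(default_factory=list)


def app_level(app):
    """Roll container metrics up to one total per metric for the application."""
    totals = defaultdict(int)
    for metrics in app.container_metrics:
        for name, value in metrics.items():
            totals[name] += value
    return dict(totals)


def flow_run_level(apps):
    """Roll application totals up to one total per metric for each flow run."""
    totals = defaultdict(lambda: defaultdict(int))
    for app in apps:
        for name, value in app_level(app).items():
            totals[app.flow_run][name] += value
    return {run: dict(metrics) for run, metrics in totals.items()}


run = FlowRun("frequent_visitor_stat", 1466097809000, "b9b9068")
apps = [
    App("app_1", run, [{"cpu_ms": 100}, {"cpu_ms": 250}]),
    App("app_2", run, [{"cpu_ms": 50}]),
]
print(flow_run_level(apps))
```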
39
More Cloud Friendly
• Elastic
– Dynamic Resource Configuration
• YARN-291
• Allows tuning an NM's resources up or down at runtime
– Graceful decommissioning of NodeManagers
• YARN-914
• Drains a node that’s being decommissioned to allow running containers to finish
• Efficient
– Support for container resizing
• YARN-1197
• Allows applications to change the size of an existing container
40
More Cloud Friendly (Contd.)
• Isolation
– Embrace container technology to achieve better isolation
– Resource isolation support for disk and network
• YARN-2619 (disk), YARN-2140 (network)
• Containers get a fair share of disk and network resources using Cgroups
– Docker support in LinuxContainerExecutor
• YARN-3611
• Support for launching Docker containers alongside regular processes
• Packaging and resource isolation
• Operation
– Container upgrades (YARN-4726)
• “Do an upgrade of my Spark / HBase apps with minimal impact to end-users”
– Work-preserving AM restart
• MAPREDUCE-6608
41
Task level native optimization (MAPREDUCE-2841)
• Add a native implementation of the map output collector
– Sort, Spill and IFile serialization
• Prerequisites
– Built with -Pnative option
– Custom writable types and comparators are not supported
• Setting
<property>
  <name>mapreduce.job.map.output.collector.class</name>
  <value>org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator</value>
</property>
42
Benchmark
• Release Note in the issue:
– "For shuffle-intensive jobs this may provide speed-ups of 30% or more."
• Benchmarked with 3 slaves (m3.xlarge)
– CentOS 7.2
– 3.0.0-SNAPSHOT (revision 5865fe2b)
• A very shuffle-intensive wordcount job
– Input: 2.6GB (compressed)
– Shuffle: 14GB
– Output: 10GB
43
Updated Web UI
44
Setup Timeline Service v2
• Set up the HBase cluster (1.1.x)
– Add the timeline service jar to HBase
– Install the flow run coprocessor
– Create tables via TimelineSchemaCreator utility
• Configure the YARN cluster
– Enable Timeline Service v.2
– Add hbase-site.xml for the timeline collector and readers
– Start the timeline reader daemon
45
Shell Script rewrite
46
Directory Structure
47
bin/hadoop Command
48
bin/hdfs Command
49
Shell Script Rewrite (HADOOP-9902)
• Hadoop and Shell Script
– Launching daemons
– Hadoop CLI
• Difficult to understand
– Which env var is the correct one to set an option?
• java classpath?
• java.library.path?
• GC options?
– How do we add the option to the env var?
– We have to read almost all the shell scripts!
• New CLI is not completely backward compatible
– Hadoop 3: bin/hdfs namenode -format
– Hadoop 2: bin/hadoop namenode -format
Apache Hadoop 3.0.0-alpha1 – Apache Hadoop Compatibility
50
After rewriting the scripts ...
• Easy to understand
• Because shell API doc is available
– Shelldoc maker generates docs
from the scripts
– Similar to JavaDoc
Public API
Documentation by Akira Ajisaka build (trunk):
http://aajisaka.github.io/hadoop-project/hadoop-project-dist/hadoop-common/UnixShellAPI.html
51
.hadoop-env and .hadooprc (HADOOP-11353, HADOOP-13045)
• Very similar to .bashrc
– Read the API doc
– Create your own ~/.hadoopXX
• .hadoop-env : hadoop-env.sh for each user
• .hadooprc : called after shell env vars are configured
• And that's all :)
• ex.) Set additional classpath (.hadooprc)
hadoop_add_classpath /path/to/my/jar
52
--debug option is available
• Useful for troubleshooting
$ hadoop --debug version
DEBUG: hadoop_parse_args: processing version
DEBUG: hadoop_parse: asking caller to skip 1
DEBUG: HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
( snip )
DEBUG: Applying the user's .hadooprc
DEBUG: Initial CLASSPATH=/path/to/my/jar
DEBUG: Initialize CLASSPATH
( snip )
DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/*
CLASSPATH was overwritten!! (before HADOOP-13045)
53
Many new features, bug fixes, improvements
• 'hadoop distch' to change the ownership and permissions on many
files via MapReduce job
• 'hadoop jnipath' to print java.library.path
• 'hadoop --daemon' instead of hadoop-daemon.sh
– ex.) hdfs --daemon status namenode
– The return code for status is LSB-compatible
– hadoop-daemon(s).sh are now deprecated
• .out files are now appended (not overwritten)
– Allows external log rotation
• and many more
– see https://issues.apache.org/jira/browse/HADOOP-9902
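An LSB-compatible return code makes the status check easy to script. The sketch below maps the exit codes defined for the `status` action of LSB init scripts to text; the helper names are made up, and `daemon_status` assumes a Hadoop 3 `hdfs` command on the PATH, so it is defined here but not called.

```python
import subprocess

# Exit codes of the "status" action as defined by the Linux Standard Base.
LSB_STATUS = {
    0: "running",
    1: "dead, but pid file exists",
    2: "dead, but lock file exists",
    3: "not running",
    4: "status unknown",
}


def describe_status(return_code):
    """Translate an LSB status exit code into a short description."""
    return LSB_STATUS.get(return_code, f"reserved or unrecognised code {return_code}")


def daemon_status(daemon):
    """Run 'hdfs --daemon status <daemon>' and describe its exit code."""
    result = subprocess.run(["hdfs", "--daemon", "status", daemon])
    return describe_status(result.returncode)


print(describe_status(0))
print(describe_status(3))
```

On a cluster node, `daemon_status("namenode")` would run the same command as the example on the slide.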
54
Derive heap size or mapreduce.*.memory.mb
automatically (MAPREDUCE-5785)
• In Hadoop 2, two similar properties must be set :(
– mapreduce.{map,reduce}.memory.mb
• The amount of memory to request from the scheduler for each task (ex. 2048)
– mapreduce.{map,reduce}.java.opts
• Java options for YARN containers (ex. -Xmx2G)
• In Hadoop 3, either is enough
– .java.opts is derived from .memory.mb and vice versa
• .java.opts = .memory.mb * mapreduce.job.heap.memory-mb.ratio
• .memory.mb = .java.opts / mapreduce.job.heap.memory-mb.ratio
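The two formulas can be tried out directly. This is a sketch of the arithmetic only, not the MapReduce implementation: the ratio of 0.8 is assumed here as the value of mapreduce.job.heap.memory-mb.ratio, rounding up to whole megabytes is also an assumption, and the function names are invented.

```python
import math

HEAP_RATIO = 0.8  # assumed value of mapreduce.job.heap.memory-mb.ratio


def heap_mb_from_container(memory_mb, ratio=HEAP_RATIO):
    """.java.opts heap (MB) derived from .memory.mb: memory.mb * ratio."""
    return math.ceil(memory_mb * ratio)


def container_mb_from_heap(heap_mb, ratio=HEAP_RATIO):
    """.memory.mb derived from the -Xmx heap (MB): heap / ratio."""
    return math.ceil(heap_mb / ratio)


# Only mapreduce.map.memory.mb=2048 is set: heap to pass as -Xmx, in MB.
print(heap_mb_from_container(2048))

# Only -Xmx1024m is set: container size to request from the scheduler, in MB.
print(container_mb_from_heap(1024))
```

The gap between the two numbers is headroom for JVM memory outside the heap, which is why the container request is larger than the heap.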
55
Intra Data-Node Balancer (HDFS-1312)
• Due to activities like deletes, the volumes of a DataNode may become unevenly filled
DataNode
Block Placing
Policies
56
Intra Data-Node Balancer (HDFS-1312)
• Offline scripts existed to rebalance a DataNode
• HDFS-1312 introduces an online process that rebalances the volumes of a DataNode
• 'hdfs diskbalancer' command
57
Multiple Name Nodes
58
Support more than two NameNodes (HDFS-6440)
• Hadoop 2 supports only 2 NameNodes
– 1 active and 1 standby
• Hadoop 3 supports 2 or more standby NameNodes
– provides additional fault-tolerance
– prevents multiple standby NNs from checkpointing at the same time
– the number of standbys should be kept small because of block reports (typically 3 or 5 NameNodes)
59
Old Layout
[Figure: two NameNodes (one active, one standby), each paired with a ZK Failover Controller, three ZooKeeper nodes, and fencing between the NameNodes]
• ZK Failover Controller: monitors the health of the NN, participates in the election of the active NN, coordinates the transition process, and fences the other NN if it wins the election
• ZooKeeper: the information about which NN is active is kept here
60
New Layout
[Figure: one active NameNode and two standby NameNodes (extendable to 4, 5, …), each paired with a ZK Failover Controller, three ZooKeeper nodes, and fencing between the NameNodes]
• ZK Failover Controller: monitors the health of the NN, participates in the election of the active NN, coordinates the transition process, and fences the other NN if it wins the election
• ZooKeeper: the information about which NN is active is kept here
61
All Features Supported
• Checkpointing with NFS
• Checkpoint with Quorum Journal Daemon
• Manual Failover
• Automatic Failover
62
Incompatible changes
• Many deprecated APIs will be removed
– hftp/hsftp/s3 -> webhdfs/s3{n,a}
– Metrics v1
– org.apache.hadoop.Records
– and more
• Improved CLI output
– 'mapred job -list' shows the job name as well
– 'hadoop fs -du' shows the raw disk usage, and aligned more unix-like
– and more
• Search 'Incompatible change' flag
– https://s.apache.org/sMO4
63
Bump up the versions of the libraries
• Drop JDK7 support (HADOOP-11858)
• Dependency Hell
– Tomcat
– Jetty
– Jersey
– Guava
– Log4J
– Jackson
– And many more
[Slide background: the complete classpath of a 3.0.0-SNAPSHOT installation under /usr/local/hadoop/share/hadoop, a wall of well over a hundred jar paths from common/lib, hdfs/lib, mapreduce and yarn/lib, including guava-11.0.2.jar, jackson-core-asl-1.9.13.jar, jersey-core-1.9.jar, jetty-util-6.1.26.jar, log4j-1.2.17.jar, protobuf-java-2.5.0.jar, netty-3.6.2.Final.jar and zookeeper-3.4.6.jar]
64
Classpath isolation (HADOOP-11656)
• Relaxing the "dependency hell"
– Separate client and server jars
– Client jar does not pull any third party dependencies
• If the isolation is done ...
– We can safely upgrade the libraries in server code
– In branch-2, the upgrade is incompatible :(
65
Follow me on Twitter: @loeweh

Contenu connexe

Tendances

Hadoop Meetup Jan 2019 - Overview of Ozone
Hadoop Meetup Jan 2019 - Overview of OzoneHadoop Meetup Jan 2019 - Overview of Ozone
Hadoop Meetup Jan 2019 - Overview of OzoneErik Krogen
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementDataWorks Summit/Hadoop Summit
 
Ozone: An Object Store in HDFS
Ozone: An Object Store in HDFSOzone: An Object Store in HDFS
Ozone: An Object Store in HDFSDataWorks Summit
 
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsTuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsDataWorks Summit
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldDataWorks Summit
 
Moving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudMoving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudDataWorks Summit/Hadoop Summit
 
Unified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaUnified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaDataWorks Summit
 
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5Chris Nauroth
 
Ozone - Evolution of hdfs scalability
Ozone - Evolution of hdfs scalabilityOzone - Evolution of hdfs scalability
Ozone - Evolution of hdfs scalabilityDinesh Chitlangia
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseDataWorks Summit/Hadoop Summit
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARNDataWorks Summit
 
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSDataWorks Summit
 
[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan
[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan
[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil GovindanNewton Alex
 
Hadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldHadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldUwe Printz
 

Tendances (20)

Hadoop Meetup Jan 2019 - Overview of Ozone
Hadoop Meetup Jan 2019 - Overview of OzoneHadoop Meetup Jan 2019 - Overview of Ozone
Hadoop Meetup Jan 2019 - Overview of Ozone
 
Achieving 100k Queries per Hour on Hive on Tez
Achieving 100k Queries per Hour on Hive on TezAchieving 100k Queries per Hour on Hive on Tez
Achieving 100k Queries per Hour on Hive on Tez
 
Taming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop ManagementTaming the Elephant: Efficient and Effective Apache Hadoop Management
Taming the Elephant: Efficient and Effective Apache Hadoop Management
 
Ozone: An Object Store in HDFS
Ozone: An Object Store in HDFSOzone: An Object Store in HDFS
Ozone: An Object Store in HDFS
 
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agentsTuning Apache Ambari performance for Big Data at scale with 3000 agents
Tuning Apache Ambari performance for Big Data at scale with 3000 agents
 
Hadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the FieldHadoop Operations - Best Practices from the Field
Hadoop Operations - Best Practices from the Field
 
Apache HBase: State of the Union
Apache HBase: State of the UnionApache HBase: State of the Union
Apache HBase: State of the Union
 
Apache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduceApache Hadoop 3.0 What's new in YARN and MapReduce
Apache Hadoop 3.0 What's new in YARN and MapReduce
 
Apache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and FutureApache Hadoop YARN: Past, Present and Future
Apache Hadoop YARN: Past, Present and Future
 
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS HadoopBreaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
Breaking the 1 Million OPS/SEC Barrier in HOPS Hadoop
 
Moving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloudMoving towards enterprise ready Hadoop clusters on the cloud
Moving towards enterprise ready Hadoop clusters on the cloud
 
Unified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache SamzaUnified Batch & Stream Processing with Apache Samza
Unified Batch & Stream Processing with Apache Samza
 
Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5Hadoop operations-2015-hadoop-summit-san-jose-v5
Hadoop operations-2015-hadoop-summit-san-jose-v5
 
Ozone - Evolution of hdfs scalability
Ozone - Evolution of hdfs scalabilityOzone - Evolution of hdfs scalability
Ozone - Evolution of hdfs scalability
 
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBaseApache Phoenix and HBase: Past, Present and Future of SQL over HBase
Apache Phoenix and HBase: Past, Present and Future of SQL over HBase
 
Get most out of Spark on YARN
Get most out of Spark on YARNGet most out of Spark on YARN
Get most out of Spark on YARN
 
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
HBaseCon 2015: Apache Phoenix - The Evolution of a Relational Database Layer ...
 
HDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFSHDFS Tiered Storage: Mounting Object Stores in HDFS
HDFS Tiered Storage: Mounting Object Stores in HDFS
 
[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan
[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan
[Hadoop Meetup] Tensorflow on Apache Hadoop YARN - Sunil Govindan
 
Hadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the fieldHadoop Operations - Best practices from the field
Hadoop Operations - Best practices from the field
 

What's new in hadoop 3.0

  • 1. What's new in Hadoop 3.0 (Heiko Loewe, @loeweh)
  • 2. Tributes
      • Zhe Zhang, Erasure Coding
      • Akira Ajisaka, Script Rewrite
      • Junping Du, YARN Timeline Service v2
      • And many others
  • 3. Hadoop 3.0 Roadmap
      • Hadoop 3.x Releases
      • Planned for hadoop-3.0.0
          – Classpath isolation on by default (HADOOP-11656)
      • hadoop-3.0.0-alpha1
          – HADOOP
              • Move to JDK8+
              • Shell script rewrite (HADOOP-9902)
              • Move default ports out of ephemeral range (HDFS-9427)
          – HDFS
              • Removal of hftp in favor of webhdfs (HDFS-5570)
              • Support for more than two standby NameNodes (HDFS-6440)
              • Support for Erasure Codes in HDFS (HDFS-7285)
              • Intra-datanode balancer (HDFS-1312)
          – YARN
              • YARN Timeline Service v.2 (YARN-2928)
          – MAPREDUCE
              • Derive heap size or mapreduce.*.memory.mb automatically (MAPREDUCE-5785)
  • 4. Release / Features / 2.X
      Release  Feature
      2.0      NameNode High-Availability
      2.2      Federation, Snapshots, NFS v3 Mount
      2.3      Heterogeneous Storage (Phase 1), In-Memory Caching
      2.4      Rolling Upgrades, POSIX ACL
      2.5      Extended Attributes
      2.6      Hot Swap Volumes, Heterogeneous Storage (Phase 2), Transparent Encryption
      2.7      Files w/ variable Block Length, Inotify
      2.8      Docker Container in Linux, YARN ATS 1.5, changing resources on allocated YARN Container
      2.9      TBD
  • 5. What's Apache Hadoop 3?
      [Figure: release timeline from 2009 to 2016 showing branch-1 (branch-0.20, Hadoop1, EOL), branch-0.23, branch-2 (Hadoop2) and trunk (Hadoop3) with their release milestones.]
      • Hadoop 3 and 2 diverged in 2011 (5 years ago!)
      Source: Akira Ajisaka
  • 6. Break compatibility
      A major version bump is the opportunity to clean up the code
      • Deprecated APIs can be removed only when the major version changes
          – @Public and @Stable Java API
          – REST API
          – Metrics/JMX
          – CLI
          – Environment variables
      • Wire compatibility can be broken
          – A 2.X client cannot talk to a 3.X server and vice versa
      • Compatibility Guide:
          – Apache Hadoop 3.0.0-alpha1 – Apache Hadoop Compatibility
  • 8. Traditional Hadoop
      • HDFS inherits 3-way replication from Google File System
      • Simple, scalable and robust
      • 200% storage overhead
      • Secondary replicas rarely accessed
  • 9. Erasure Coding Saves Storage
      • Simplified example: storing 2 bits (1 and 0)
          – Replication: store 1 0 twice, i.e. 2 extra bits
          – XOR coding: store 1 0 plus the parity bit 1 ⊕ 0 = 1, i.e. 1 extra bit
      • Same data durability: can lose any 1 bit
      • Pro: half the storage overhead
      • Con: slower recovery
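
The XOR example on this slide can be sketched in a few lines of Python. This is only an illustration of single-parity coding under the slide's assumptions (two data bits, one parity bit); the function names `xor_parity` and `recover` are made up for this sketch and are not part of any Hadoop API.

```python
def xor_parity(bits):
    """Return the single parity bit for a list of data bits (each 0 or 1)."""
    parity = 0
    for b in bits:
        parity ^= b
    return parity


def recover(surviving_bits, parity):
    """Rebuild the one missing data bit from the surviving bits and the parity."""
    return xor_parity(surviving_bits) ^ parity


data = [1, 0]                  # the two data bits from the slide
parity = xor_parity(data)      # one extra bit instead of two replicas

lost_index = 0                 # pretend the first data bit was lost
survivors = [b for i, b in enumerate(data) if i != lost_index]
rebuilt = recover(survivors, parity)

print("parity bit:", parity)
print("rebuilt bit:", rebuilt)
```

Recovery has to read every surviving bit plus the parity, which is the "slower recovery" cost the slide mentions; a replica can simply be copied.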
  • 10. Erasure Coding with m Data, n Parity: Reed-Solomon Coding
  • 11. Durability and Efficiency
      Data Durability = how many simultaneous failures can be tolerated?
      Storage Efficiency = what portion of the storage holds useful data?
      Scheme                  Data Durability  Storage Efficiency
      Single Replica          0                100%
      3-way Replication       2                33%
      XOR with 6 data cells   1                86%
      RS (6,3)                3                67%
      RS (10,4)               4                71%
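
The numbers in this table follow from simple arithmetic, sketched below. The helper names `replication` and `erasure` are hypothetical; the sketch assumes that n replicas tolerate n - 1 losses and that a code with d data cells and p parity cells tolerates p losses.

```python
def replication(copies):
    """n full copies: tolerate n - 1 failures, 1/n of raw storage is useful data."""
    return copies - 1, 1 / copies


def erasure(data_cells, parity_cells):
    """d data + p parity cells: tolerate p failures, d / (d + p) is useful data."""
    return parity_cells, data_cells / (data_cells + parity_cells)


schemes = {
    "Single Replica": replication(1),
    "3-way Replication": replication(3),
    "XOR with 6 data cells": erasure(6, 1),
    "RS (6,3)": erasure(6, 3),
    "RS (10,4)": erasure(10, 4),
}

for name, (tolerated, efficiency) in schemes.items():
    print(f"{name}: tolerates {tolerated} failures, efficiency {efficiency:.0%}")
```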
  • 16. Erasure Coding: Current Status
      • Phase 1: striping layout
          – Cell size C = 64KB (default)
          – Works for small files
          – No data locality
          – Available on trunk
      • Phase 2: contiguous layout
          – C = 128MB (= HDFS block size)
          – Does not work for small files
          – Data locality
          – Now in progress (HDFS-8030)
      [Figure: incoming data is cut into cells of size C and spread across DataNode 1, DataNode 2, DataNode 3, DataNode 4, DataNode 5, and so on.]
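
To make the striping layout concrete, here is a simplified model of how a file offset could map to a data block when cells are assigned round-robin. It assumes 6 data blocks per group (as in the RS (6,3) slides) and the 64KB default cell size; `locate` is a hypothetical helper and the real HDFS client logic is more involved.

```python
CELL_SIZE = 64 * 1024   # default cell size C from the slide
DATA_BLOCKS = 6         # assumption: RS (6,3), so 6 data blocks per block group


def locate(offset):
    """Map a logical file offset to (data block index, offset inside that block).

    Cells are assumed to be assigned round-robin: cell 0 goes to block 0,
    cell 1 to block 1, ..., cell 6 wraps around to block 0 again.
    """
    cell_index = offset // CELL_SIZE
    stripe = cell_index // DATA_BLOCKS        # how many full stripes precede this cell
    block = cell_index % DATA_BLOCKS          # which data block holds the cell
    offset_in_block = stripe * CELL_SIZE + offset % CELL_SIZE
    return block, offset_in_block


for off in (0, CELL_SIZE, 6 * CELL_SIZE + 10):
    print(off, "->", locate(off))
```

Because even a small file is spread over several DataNodes this way, a reader cannot rely on one local replica, which matches the "no data locality" point for Phase 1.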
  • 17. Name Server Changes
      • Mapping logical and storage blocks
      • Too many storage blocks? Hierarchical naming protocol:
  • 18. Erasure Coding: Write files using (6,3)-Reed-Solomon
      • Write data to 9 DNs in parallel
      [Figure: ECClient turns the incoming data into 6 data blocks (DN1 to DN6) and 3 parity blocks (DN7 to DN9).]
  • 19. Erasure Coding: Read files
      • Read data from 6 DNs in parallel
      [Figure: ECClient and DataNodes DN1 to DN9.]
  • 20. Erasure Coding: Read files when DN fails
      • Read data from arbitrary 6 DNs in parallel
      [Figure: ECClient and DataNodes DN1 to DN9, with DN6 marked as failed (×).]
  • 21. hdfs erasure command
      [loewe@loewe hadoop-3.0.0-alpha1]$ bin/hdfs erasurecode -help
      Usage: hdfs erasurecode [generic options]
          [-getPolicy <path>]
          [-help [cmd ...]]
          [-listPolicies]
          [-setPolicy [-p <policyName>] <path>]

      -getPolicy <path> :
        Get erasure coding policy information about at specified path

      -help [cmd ...] :
        Displays help for given command or all commands if none is specified.

      -listPolicies :
        Get the list of erasure coding policies supported

      -setPolicy [-p <policyName>] <path> :
        Set a specified erasure coding policy to a directory
        Options :
          -p <policyName>  erasure coding policy name to encode files. If not
                           passed the default policy will be used
          <path>           Path to a directory. Under this directory files will
                           be encoded using specified erasure coding policy
  • 22. Acceleration with Intel ISA-L
      • 1 legacy coder
          – From Facebook's HDFS-RAID project
      • 2 new coders
          – Pure Java: code improvement over HDFS-RAID
          – Native coder with Intel's Intelligent Storage Acceleration Library (ISA-L)
  • 28. First, a bit of vision…
      • The evolution of Hadoop started with YARN
      • YARN's evolution will continue to drive Hadoop forward
      • Hadoop 3 will still use YARN, but with a lot of improvements
  • 29. Several important trends in the age of Hadoop 3.0+
      • YARN and other platform services: storage, resource management, security, service discovery, management, monitoring, alerts
      • IoT Assembly, Kafka, Storm, HBase, Solr, Governance, MR, Tez, Spark, …
      • Innovating frameworks: Flink, DL (TensorFlow), etc.
      • Various environments: on premise, private cloud, public cloud
  • 31. YARN Process Flow - Walkthrough
      [Figure: a grid of NodeManagers hosting AM 1 with containers 1.1 to 1.3 and AM2 with containers 2.1 to 2.4, together with Client2, the YARN scheduler, the YARN Timeline Service and a query client.]
  • 32. Why Timeline Service v2
      • Scalability and reliability challenges
          – Single instance of Timeline Server
          – Storage (single local LevelDB instance)
      • Usability
          – Flow
          – Metrics and configuration as first-class citizens
          – Metrics aggregation up the entity hierarchy
  • 33. Highlights
      v.1                                     v.2
      Single writer/reader Timeline Server    Distributed writer/collector architecture
      Single local LevelDB storage*           Scalable storage (HBase)
      v.1 entity model                        New v.2 entity model
      No aggregation                          Metrics aggregation
      REST API                                Richer query REST API
  • 34. Architecture
      • Separation of writers ("collectors") and readers
      • Distributed collectors: one collector for each app
      • Dedicated RM collector for RM-generated data
      • Collector discovery via RM
      • Pluggable storage with HBase as default storage
  • 36. New Entity Model
      • Flows and flow runs as parents of YARN application entities
      • First-class configuration (key-value pairs)
      • First-class metrics (single-value or time series)
      • Designed to handle multi-cluster environments out of the box
  • 37. What is a flow
      • A flow is a group of YARN applications that are launched as parts of a logical app
      • Oozie, Scalding, Pig, etc.
          – name: "frequent_visitor_stat"
          – run id: 1466097809000
          – version: "b9b9068"
  • 38. Metrics Aggregation
      • Application level
          – Rolls up sub-application metrics
          – Performed in real time in the collectors in memory
      • Flow run level
          – Rolls up app level metrics
          – Performed in HBase region servers via coprocessors
      • Offline aggregation (TBD)
          – Rolls up on user, queue, and flow offline periodically
          – Phoenix tables
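
The two roll-up levels can be pictured with a small sketch. This is not the Timeline Service code: `roll_up` is a hypothetical helper, the metric names are invented, summation is assumed as the aggregation operation, and sub-application metrics are modelled as one dict per container.

```python
from collections import Counter


def roll_up(children):
    """Sum same-named metrics across child entities (assumed aggregation: sum)."""
    total = Counter()
    for metrics in children:
        total.update(metrics)   # Counter.update adds the values per key
    return dict(total)


# Hypothetical per-container metrics for two applications of one flow run
containers_app1 = [
    {"cpu_ms": 120, "bytes_read": 4096},
    {"cpu_ms": 80, "bytes_read": 1024},
]
containers_app2 = [{"cpu_ms": 50, "bytes_read": 2048}]

# Application level: roll up sub-application (container) metrics
app1 = roll_up(containers_app1)
app2 = roll_up(containers_app2)

# Flow run level: roll up the application-level metrics
flow_run = roll_up([app1, app2])

print("app1:", app1)
print("flow run:", flow_run)
```

In the design described on the slide the first step happens in memory in the collectors and the second in HBase coprocessors; the sketch only shows the shape of the computation.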
  • 39. More Cloud Friendly
      • Elastic
          – Dynamic Resource Configuration (YARN-291)
              • Allows tuning an NM's resources down or up at runtime
          – Graceful decommissioning of NodeManagers (YARN-914)
              • Drains a node that's being decommissioned to allow running containers to finish
      • Efficient
          – Support for container resizing (YARN-1197)
              • Allows applications to change the size of an existing container
  • 40. More Cloud Friendly (Contd.)
      • Isolation
          – Embrace container technology to achieve better isolation
          – Resource isolation support for disk and network
              • YARN-2619 (disk), YARN-2140 (network)
              • Containers get a fair share of disk and network resources using cgroups
          – Docker support in LinuxContainerExecutor
              • YARN-3611
              • Support to launch Docker containers alongside processes
              • Packaging and resource isolation
      • Operation
          – Container upgrades (YARN-4726)
              • "Do an upgrade of my Spark / HBase apps with minimal impact to end-users"
          – Work-preserving AM restart
              • MAPREDUCE-6608
  • 41. Task level native optimization (MAPREDUCE-2841)
      • Add a native implementation of the map output collector
          – Sort, Spill and IFile serialization
      • Prerequisites
          – Built with -Pnative option
          – Custom writable types and comparators are not supported
      • Setting
          <property>
            <name>mapreduce.job.map.output.collector.class</name>
            <value>org.apache.hadoop.mapred.nativetask.NativeMapOutputCollectorDelegator</value>
          </property>
  • 42. Benchmark
      • Release Note in the issue:
          – "For shuffle-intensive jobs this may provide speed-ups of 30% or more."
      • Benchmarked with 3 slaves (m3.xlarge)
          – CentOS 7.2
          – 3.0.0-SNAPSHOT (revision 5865fe2b)
      • A very shuffle-intensive wordcount job
          – Input: 2.6GB (compressed)
          – Shuffle: 14GB
          – Output: 10GB
  • 44. Setup Timeline Service v2
      • Set up the HBase cluster (1.1.x)
          – Add the timeline service jar to HBase
          – Install the flow run coprocessor
          – Create tables via TimelineSchemaCreator utility
      • Configure the YARN cluster
          – Enable Timeline Service v.2
          – Add hbase-site.xml for the timeline collector and readers
          – Start the timeline reader daemon
  • 49. Shell Script Rewrite (HADOOP-9902)
      • Hadoop and Shell Script
          – Launching daemons
          – Hadoop CLI
      • Difficult to understand
          – What is the correct env var to set an option?
              • java classpath?
              • java.library.path?
              • GC options?
          – How to add the option to the env var
          – We have to read almost all the shell scripts!
      • New CLI is not completely backward compatible
          – Hadoop 3: bin/hdfs namenode -format
          – Hadoop 2: bin/hadoop namenode -format
      Apache Hadoop 3.0.0-alpha1 – Apache Hadoop Compatibility
  • 50. After rewriting the scripts ...
      • Easy to understand
      • Because shell API doc is available
          – Shelldoc maker generates docs from the scripts
          – Similar to JavaDoc
      Public API Documentation by Akira Ajisaka, build (trunk): http://aajisaka.github.io/hadoop-project/hadoop-project-dist/hadoop-common/UnixShellAPI.html
  • 51. .hadoop-env and .hadooprc (HADOOP-11353, HADOOP-13045)
      • Very similar to .bashrc
          – Read the API doc
          – Create your own ~/.hadoopXX
              • hadoop-env : hadoop-env.sh for each user
              • hadooprc : called after shell env vars are configured
          – And that's all :)
      • ex.) Set additional classpath (.hadooprc)
          hadoop_add_classpath /path/to/my/jar
  • 52. --debug option is available
      • Useful for troubleshooting
      $ hadoop --debug version
      DEBUG: hadoop_parse_args: processing version
      DEBUG: hadoop_parse: asking caller to skip 1
      DEBUG: HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
      (snip)
      DEBUG: Applying the user's .hadooprc
      DEBUG: Initial CLASSPATH=/path/to/my/jar
      DEBUG: Initialize CLASSPATH
      (snip)
      DEBUG: Initial CLASSPATH=/usr/local/hadoop/share/hadoop/common/lib/*
      CLASSPATH was overwritten!! (before HADOOP-13045)
  • 53. Many new features, bug fixes, improvements
      • 'hadoop distch' to change the ownership and permissions on many files via MapReduce job
      • 'hadoop jnipath' to print java.library.path
      • 'hadoop --daemon' instead of hadoop-daemon.sh
          – ex.) hdfs --daemon status namenode
          – The return code for status is LSB-compatible
          – hadoop-daemon(s).sh are now deprecated
      • .out files are now appended (not overwritten)
          – Allows external log rotation
      • and many more
          – see https://issues.apache.org/jira/browse/HADOOP-9902
  • 54. Derive heap size or mapreduce.*.memory.mb automatically (MAPREDUCE-5785)
      • In Hadoop 2, two similar properties must be set :(
          – mapreduce.{map,reduce}.memory.mb
              • The amount of memory to request from the scheduler for each task (ex. 2048)
          – mapreduce.{map,reduce}.java.opts
              • Java options for YARN containers (ex. -Xmx2G)
      • In Hadoop 3, either is enough
          – .java.opts is derived from .memory.mb and vice versa
              • .java.opts = .memory.mb * mapreduce.job.heap.memory-mb.ratio
              • .memory.mb = .java.opts / mapreduce.job.heap.memory-mb.ratio
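
The two formulas on this slide can be tried out with a small calculation. This is a sketch only: the ratio of 0.8 and the rounding up are assumptions made for the example, and `heap_from_container` / `container_from_heap` are hypothetical helper names, not Hadoop functions.

```python
import math

HEAP_RATIO = 0.8  # assumed value for mapreduce.job.heap.memory-mb.ratio


def heap_from_container(memory_mb, ratio=HEAP_RATIO):
    """Derive the JVM heap (-Xmx, in MB) from the container size."""
    return math.ceil(memory_mb * ratio)      # rounding up is an assumption


def container_from_heap(heap_mb, ratio=HEAP_RATIO):
    """Derive the container size (in MB) from a configured JVM heap."""
    return math.ceil(heap_mb / ratio)        # rounding up is an assumption


# Only mapreduce.map.memory.mb=2048 is set: derive the heap for -Xmx
derived_heap = heap_from_container(2048)
print(f"-Xmx{derived_heap}m")

# Only -Xmx2G is set in mapreduce.map.java.opts: derive the container request
derived_container = container_from_heap(2048)
print(f"mapreduce.map.memory.mb={derived_container}")
```

The gap between the two values is headroom for non-heap JVM memory inside the container.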
  • 55. Intra Data-Node Balancer (HDFS-1312)
      • Due to activities such as deletes, the volumes of a DataNode may become unevenly filled
      [Figure: DataNode block placement policies.]
  • 56. Intra Data-Node Balancer (HDFS-1312)
      • Offline scripts existed to rebalance a DataNode
      • HDFS-1312 introduces an online process that rebalances the volumes of a DataNode
      • "hdfs diskbalancer" command
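
What "unevenly filled" means can be expressed numerically. The sketch below is a simplified model, not the disk balancer's planner: it assumes the goal is for every volume to reach the node-wide used/capacity ratio, and `volume_deviation` and the sample volumes are hypothetical.

```python
def volume_deviation(volumes):
    """Return, per volume, how far its fill ratio is from the node-wide ideal.

    volumes maps a volume name to (used_gb, capacity_gb).
    A positive value means the volume is under-filled and could receive data,
    a negative value means it is over-filled and should give data away.
    """
    total_used = sum(used for used, _ in volumes.values())
    total_capacity = sum(cap for _, cap in volumes.values())
    ideal = total_used / total_capacity
    return {name: ideal - used / cap for name, (used, cap) in volumes.items()}


# Hypothetical DataNode with two equally sized disks, one much fuller
volumes = {"disk1": (100, 200), "disk2": (50, 200)}
deviation = volume_deviation(volumes)

for name, value in deviation.items():
    print(f"{name}: {value:+.3f}")
```

Moving data from volumes with negative values to those with positive values until all values are close to zero is the kind of work an intra-DataNode balancer has to plan.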
  • 58. Support more than two NameNodes (HDFS-6440)
      • Hadoop 2 supports only 2 NameNodes
          – 1 active and 1 standby
      • Hadoop 3 supports 2 or more standby NameNodes
          – Provides additional fault tolerance
          – Avoids multiple standby NNs checkpointing at the same time
          – The number of standbys should be small because of block reports (typically 3 or 5 NameNodes)
  • 59. Old Layout
      [Figure: two NameNodes, each with its own ZK Failover Controller, a three-node ZooKeeper ensemble, and fencing between the NameNodes.]
      • ZK Failover Controller
          – Monitors the health of the NN
          – Participates in the election of the active NN
          – Coordinates the transition process
          – Fences the other NN if it wins the election
      • ZooKeeper
          – The information about which NN is active is kept here
  • 60. New Layout
      [Figure: one active NameNode and several standby NameNodes (up to 4 or 5 in total), each with its own ZK Failover Controller, a three-node ZooKeeper ensemble, and fencing between the NameNodes.]
      • ZK Failover Controller
          – Monitors the health of the NN
          – Participates in the election of the active NN
          – Coordinates the transition process
          – Fences the other NN if it wins the election
      • ZooKeeper
          – The information about which NN is active is kept here
  • 61. All features supported
      • Checkpointing with NFS
      • Checkpointing with Quorum Journal Daemon
      • Manual Failover
      • Automatic Failover
  • 62. Incompatible changes
      • Many deprecated APIs will be removed
          – hftp/hsftp/s3 -> webhdfs/s3{n,a}
          – Metrics v1
          – org.apache.hadoop.Records
          – and more
      • Improved CLI output
          – 'mapred job -list' shows the job name as well
          – 'hadoop fs -du' shows the raw disk usage, and its output is aligned in a more Unix-like way
          – and more
      • Search for the 'Incompatible change' flag
          – https://s.apache.org/sMO4
  • 63. 63 Bump up the versions of the libraries • Drop JDK7 support (HADOOP-11858) • Dependency Hell – Tomcat – Jetty – Jersey – Guava – Log4J – Jackson – And many more [Slide background: the full Hadoop 3.0.0-SNAPSHOT runtime classpath, a colon-separated list of jars under /usr/local/hadoop/share/hadoop/{common,hdfs,mapreduce,yarn}, including guava-11.0.2.jar, jackson-core-asl-1.9.13.jar, jersey-core-1.9.jar, jetty-util-6.1.26.jar, log4j-1.2.17.jar, netty-3.6.2.Final.jar and netty-all-4.1.0.Beta5.jar]
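The "dependency hell" behind the classpath on the 'Bump up the versions' slide can be made concrete with a short sketch that groups the jars on a classpath by artifact name and reports any name that shows up in more than one version. The helper names (`parse_classpath`, `find_conflicts`) and the jar-name pattern are assumptions made for illustration, and the `/opt/app/lib/guava-19.0.jar` entry is a hypothetical application jar, not something taken from the slide.

```python
import posixpath
import re
from collections import defaultdict

# Assumed naming pattern: <artifact>-<version>.jar, where the version starts
# with a digit. Jars without a version (e.g. tools.jar) are skipped.
JAR_RE = re.compile(r"^(?P<name>.+?)-(?P<version>\d[\w.\-]*)\.jar$")


def parse_classpath(classpath: str) -> dict:
    """Map each artifact name to the set of versions found on the classpath."""
    versions = defaultdict(set)
    for entry in classpath.split(":"):
        match = JAR_RE.match(posixpath.basename(entry.strip()))
        if match:  # directories and unversioned jars do not match
            versions[match.group("name")].add(match.group("version"))
    return versions


def find_conflicts(classpath: str) -> dict:
    """Return only the artifacts that appear with more than one version."""
    return {
        name: sorted(found)
        for name, found in parse_classpath(classpath).items()
        if len(found) > 1
    }


if __name__ == "__main__":
    sample = ":".join([
        "/opt/app/lib/guava-19.0.jar",  # hypothetical application dependency
        "/usr/local/hadoop/share/hadoop/common/lib/guava-11.0.2.jar",
        "/usr/local/hadoop/share/hadoop/common/lib/log4j-1.2.17.jar",
        "/usr/local/hadoop/share/hadoop/hdfs",
    ])
    print(find_conflicts(sample))
```

An application that brings its own newer Guava next to the one Hadoop ships ends up with both on one classpath, which is the kind of clash the client/server jar separation is meant to avoid.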
  • 64. 64 Classpath isolation (HADOOP-11656) • Relaxing the "dependency hell" – Separate client and server jars – Client jar does not pull in any third-party dependencies • Once the isolation is done ... – We can safely upgrade the libraries in server code – In branch-2, such an upgrade is incompatible :(
  • 65. 65 Follow me on Twitter: @loeweh