Presentation slides for the HBase Meetup on the night before Strata/HW @ Google in Chelsea
Copyright © 2014 NTT DATA Corporation 
HTrace: 
Tracing in HBase and HDFS 
10/15/2014 
Masatake Iwasaki
What is HTrace? 
https://github.com/cloudera/htrace 
 
Tracing tool for parallel distributed systems,
like Google's Dapper

Effective for finding bottlenecks
Effective for code analysis
Low overhead
Tracing spans 
A span represents a traced unit of processing and the time it took
Spans have parent-child relationships
Tracing info is passed along with each RPC
 
[Diagram, time running left to right; all spans share trace id 12345]
Span A: node 1, parent: root
  RPC
Span B: node 2, parent: A
  RPC (two calls)
Span C: node 3, parent: B
Span D: parent: B
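The parent-child relationships in the diagram can be sketched in plain Java. This is an illustrative model only, not HTrace's own classes: `SimpleSpan`, its fields, and the `ROOT_PARENT` marker are hypothetical names chosen for this example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified span model; not the HTrace Span interface.
final class SimpleSpan {
    static final long ROOT_PARENT = 0L;  // assumed marker meaning "no parent"
    private static final AtomicLong NEXT_ID = new AtomicLong(1);

    final long traceId;
    final long spanId;
    final long parentId;
    final String description;

    private SimpleSpan(long traceId, long parentId, String description) {
        this.traceId = traceId;
        this.spanId = NEXT_ID.getAndIncrement();
        this.parentId = parentId;
        this.description = description;
    }

    // A root span opens a trace and has no parent.
    static SimpleSpan root(long traceId, String description) {
        return new SimpleSpan(traceId, ROOT_PARENT, description);
    }

    // A child span keeps the trace id and records its parent's span id.
    SimpleSpan child(String description) {
        return new SimpleSpan(traceId, spanId, description);
    }

    boolean isRoot() {
        return parentId == ROOT_PARENT;
    }
}

final class SpanTreeDemo {
    public static void main(String[] args) {
        SimpleSpan a = SimpleSpan.root(12345L, "A");
        SimpleSpan b = a.child("B");
        SimpleSpan c = b.child("C");
        SimpleSpan d = b.child("D");
        for (SimpleSpan s : new SimpleSpan[] {a, b, c, d}) {
            System.out.println(s.description + " trace=" + s.traceId
                + " span=" + s.spanId + " parent=" + s.parentId);
        }
    }
}
```

Every span carries the same trace id, so a collector can regroup spans emitted by different nodes into one tree by following the parent ids.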
Starting root span 
org.htrace.Trace#startSpan creates a span
A new trace ID is assigned when a root span starts
FsShell shell = new FsShell();
conf.setQuietMode(false);
shell.setConf(conf);
int res = 0;
// Load the configured span receivers before any span is started
SpanReceiverHost.getInstance(new HdfsConfiguration());
TraceScope ts = null;
try {
  // Root span; Sampler.ALWAYS traces every invocation
  ts = Trace.startSpan("FsShell", Sampler.ALWAYS);
  res = ToolRunner.run(shell, argv);
} finally {
  shell.close();
  if (ts != null) ts.close();
}
Starting passive span 
Start a child span only when there is a parent span
Used for custom tracing spans on the server side

if (Trace.isTracing()) {
  traceScope = Trace.startSpan(method.getName());
}
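The passive pattern can be sketched with a thread-local stack of open spans. This is a minimal stand-in written for this example, not HTrace's implementation: `MiniTrace`, `Scope`, `FINISHED`, and `handle` are hypothetical names, and spans are represented by plain strings.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical miniature tracer used only to illustrate passive spans.
final class MiniTrace {
    // Open span names for the current thread, innermost first.
    private static final ThreadLocal<Deque<String>> OPEN =
        ThreadLocal.withInitial(ArrayDeque::new);
    // Names of finished spans in completion order.
    static final List<String> FINISHED =
        Collections.synchronizedList(new ArrayList<>());

    private MiniTrace() {}

    // True only while some span is open on this thread.
    static boolean isTracing() {
        return !OPEN.get().isEmpty();
    }

    static Scope startSpan(String name) {
        OPEN.get().push(name);
        return new Scope(name);
    }

    static final class Scope implements AutoCloseable {
        private final String name;

        private Scope(String name) {
            this.name = name;
        }

        @Override
        public void close() {
            OPEN.get().pop();      // assumes scopes are closed innermost first
            FINISHED.add(name);
        }
    }

    // Passive span: adds a child only if the caller is already being traced.
    static void handle(String method) {
        Scope scope = isTracing() ? startSpan(method) : null;
        try {
            // ... the real work of the method would run here ...
        } finally {
            if (scope != null) scope.close();
        }
    }
}
```

Calling `MiniTrace.handle("m")` outside any span records nothing, while the same call inside `try (MiniTrace.Scope s = MiniTrace.startSpan("root")) { ... }` records a span for `m`. Untraced requests therefore cost only the `isTracing()` check.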
Passing tracing info along with RPC 
The RPC header has an optional field for tracing
An RPC carrying tracing info starts a span on the server side
message RequestHeader {
  optional uint32 call_id = 1;
  optional RPCTInfo trace_info = 2;
  optional string method_name = 3;
  ...
}

message RPCTInfo {
  optional int64 trace_id = 1;
  optional int64 parent_id = 2;
}
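How the optional header field drives server-side behavior can be sketched as follows. The classes here are hand-written stand-ins for the generated protobuf types, and `RpcTraceInfo`, `RpcRequestHeader`, and `RpcTracingSketch` are hypothetical names; only the field names mirror the messages above.

```java
import java.util.Optional;

// Stand-in for RPCTInfo: identifies the trace and the caller's span.
final class RpcTraceInfo {
    final long traceId;
    final long parentId;

    RpcTraceInfo(long traceId, long parentId) {
        this.traceId = traceId;
        this.parentId = parentId;
    }
}

// Stand-in for RequestHeader with an optional trace_info field.
final class RpcRequestHeader {
    final int callId;
    final String methodName;
    final Optional<RpcTraceInfo> traceInfo;

    RpcRequestHeader(int callId, String methodName, Optional<RpcTraceInfo> traceInfo) {
        this.callId = callId;
        this.methodName = methodName;
        this.traceInfo = traceInfo;
    }
}

final class RpcTracingSketch {
    private RpcTracingSketch() {}

    // Client side: attach trace info only when a span is active (non-null).
    static RpcRequestHeader buildHeader(int callId, String method, RpcTraceInfo active) {
        return new RpcRequestHeader(callId, method, Optional.ofNullable(active));
    }

    // Server side: start a child span only if the header carried trace info.
    static String serverSide(RpcRequestHeader header) {
        return header.traceInfo
            .map(t -> "span " + header.methodName
                + " trace=" + t.traceId + " parent=" + t.parentId)
            .orElse("no span for " + header.methodName);
    }
}
```

Because the field is optional, untraced calls pay no extra bytes on the wire, and servers that receive no trace info simply skip span creation.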
 
Span receivers 
Each process loads a receiver module
Receivers receive spans from an in-process queue
Receivers send spans to the collector asynchronously
 
[Diagram: a client and two servers communicate via RPC; each process hosts a SpanReceiver that sends its tracing spans to a shared Collector/Sink]
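The queue-then-send design can be sketched with a blocking queue and a background thread. This is an assumption-laden illustration rather than any actual HTrace receiver: `QueueingReceiver` is a hypothetical class, spans are plain strings, and a `List` stands in for the remote collector.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical receiver: buffers spans in-process, forwards them on its own thread.
final class QueueingReceiver implements AutoCloseable {
    // An empty Optional is used as the "stop" signal for the sender thread.
    private final BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();
    private final List<String> collector;  // stands in for a remote collector/sink
    private final Thread sender;

    QueueingReceiver(List<String> collector) {
        this.collector = collector;
        this.sender = new Thread(this::drain, "span-sender");
        this.sender.start();
    }

    // Called on the traced thread; it only enqueues, so it returns quickly.
    void receiveSpan(String span) {
        queue.add(Optional.of(span));
    }

    // Runs on the sender thread; forwards spans until the stop signal arrives.
    private void drain() {
        try {
            while (true) {
                Optional<String> next = queue.take();
                if (!next.isPresent()) {
                    return;
                }
                collector.add(next.get());
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Flushes everything queued so far, then stops the sender thread.
    @Override
    public void close() {
        queue.add(Optional.empty());
        try {
            sender.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Keeping the send off the traced thread is what lets a slow or unreachable collector delay span delivery without slowing the request being traced.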
Passing tracing info between threads 
The ongoing tracing span is stored in a ThreadLocal
Tracing info must therefore be passed explicitly between threads

if (header.hasTraceInfo()) {
  // If the incoming RPC included tracing info, always continue the trace
  TraceInfo parentSpan = new TraceInfo(header.getTraceInfo().getTraceId(),
                                       header.getTraceInfo().getParentId());
  // detach() releases the span from this thread so another thread can pick it up
  traceSpan = Trace.startSpan(rpcRequest.toString(), parentSpan).detach();
}
Call call = new Call(header.getCallId(), header.getRetryCount(),
    rpcRequest, this, ProtoUtil.convert(header.getRpcKind()),
    header.getClientId().toByteArray(), traceSpan);

...

// Later, on the handler thread that processes the call
if (call.traceSpan != null) {
  traceScope = Trace.continueSpan(call.traceSpan);
}
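The detach-and-continue handoff can be sketched with a plain ThreadLocal and an executor. This is an analogy to the calls above, not the HTrace API: `HandoffSketch` and its methods are hypothetical, and a span is represented by a string.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration of moving a thread-local span to another thread.
final class HandoffSketch {
    private static final ThreadLocal<String> CURRENT = new ThreadLocal<>();

    private HandoffSketch() {}

    static void startSpan(String span) {
        CURRENT.set(span);
    }

    static String currentSpan() {
        return CURRENT.get();
    }

    // Detach: clear the span from this thread and return it for handoff.
    static String detach() {
        String span = CURRENT.get();
        CURRENT.remove();
        return span;
    }

    // Continue: make a span received from another thread current here.
    static void continueSpan(String span) {
        CURRENT.set(span);
    }

    // Starts a span on the calling thread, then continues it on a handler thread.
    // Returns what the handler saw before and after continuing the span.
    static String handOff(String spanName) {
        startSpan(spanName);
        final String detached = detach();
        ExecutorService handler = Executors.newSingleThreadExecutor();
        try {
            return handler.submit(() -> {
                // The thread-local does not follow the task to this thread.
                String before = String.valueOf(currentSpan());
                continueSpan(detached);
                try {
                    return before + " -> " + currentSpan();
                } finally {
                    CURRENT.remove();
                }
            }).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            handler.shutdown();
        }
    }
}
```

The handler thread sees no span until the detached one is explicitly continued, which is why the span travels inside the `Call` object in the excerpt above.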
 
JIRAs 
Already available in 
HBase (HBASE-6449)
HDFS (HDFS-5274)

Working on
YARN (YARN-1418)
Configurations 
Setting the receiver implementation class turns on tracing
Each receiver implementation has its own additional settings

<property>
  <name>hbase.trace.spanreceiver.classes</name>
  <value>org.htrace.impl.HBaseSpanReceiver</value>
</property>
<property>
  <name>hbase.htrace.hbase.collector-quorum</name>
  <value>127.0.0.1</value>
</property>
 
Tracing from HBase shell 
The trace command starts and stops a tracing span
# The tracing configuration is needed on the client node

$ hbase shell
  trace 'start'
  create 'test', 'f'
  trace 'stop'
Example: Creating table in HBase 
Example: Tracing a put of a 200 MB file to HDFS
Example: Tracing a get of a 200 MB file from HDFS
Example: Zipkin UI
Todo 
Adding finer-grained tracing spans
Sampling and filtering spans
Dynamic reconfiguration (HDFS-6956)
Sink and viewer with fewer dependencies
