HiTune sharing


             Xiao Zhu
             1/29/2013
HiTune is...
–   a Hadoop performance analyzer
–   developed by Intel
–   based on Chukwa
–   https://github.com/intel-hadoop/HiTune
–   Contact: jason.dai@intel.com, jie.huang@intel.com
–   Has 3 parts:
–   1) Tracker
–   2) Aggregation Engine
–   3) Analysis Engine


Example of HiTune Output

[Three slides of example output charts; images not available in this transcript.]

Chukwa is...
– an open source data collection system
  for monitoring large distributed
  systems.
– built on HDFS and the Map/Reduce
  framework.
– http://incubator.apache.org/chukwa/

–   Has many parts, including:
–   1) Agent
–   2) Collector
–   3) DemuxManager
–   4) Other processes for logging and
    archiving

HiTune is based on Chukwa
  HiTune part             Relation              Chukwa part
  Tracker                 is partly based on    Agent
  Aggregation Engine      is based on           Collector
  Analysis Engine         is partly based on    Demux Manager



We tend to call these parts by their right-hand (Chukwa) names, and when we
refer to HiTune, we mean HiTune and Chukwa together.

Some HiTune parts are built directly on Chukwa components,
but others are implemented by modifying Chukwa or adding new components.

The HiTune release includes the Chukwa patches and a patched Chukwa binary.
So when you deploy HiTune, I do not suggest deploying Chukwa manually
first (though you can), because HiTune already includes it.
The Tracker includes the HiTune Java agent part and the Chukwa agent part.
The Analysis Engine includes the HiTune script part and the Chukwa Demux part.

See the following data flow for an explanation of these parts.




HiTune/Chukwa System Basic Structure
HiTune/Chukwa itself needs to be set up on a standalone (separate) Hadoop
cluster. We call it the ‘Chukwa Cluster’, and the target cluster is
called the ‘Hadoop Cluster’.

     Hadoop Cluster                         Chukwa Cluster

       HiTune Agents                          Collectors
       Workload                               Demux
       Map/Reduce                             Map/Reduce
       HDFS                                   HDFS

                       User’s Computer
                           Excel
HiTune/Chukwa Process and Data Flow

     1. The HiTune agents (Java agent part) are invoked by the JVM on
     every node in the Hadoop cluster when the workload starts. This part
     collects system status and Hadoop logs and saves them to local
     storage.
HiTune/Chukwa Process and Data Flow

     2. The Agent (Chukwa agent part) process periodically checks the
     Java agent output and sends new data to (one of) the
     Collector(s).
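
The periodic check in step 2 can be pictured as reading a local file from a remembered offset. The following is only a minimal sketch of that idea, not the actual Chukwa agent code; the function name `read_new_data` and the file name `agent_output.log` are made up for illustration.

```python
import os
import tempfile


def read_new_data(path, offset):
    """Return (new_bytes, new_offset) for data appended to `path` since `offset`."""
    size = os.path.getsize(path)
    if size < offset:
        # The file was truncated or replaced, so start again from the beginning.
        offset = 0
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)


def demo():
    # Simulate the Java agent appending to a local file between two polls.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "agent_output.log")  # hypothetical file name
        with open(path, "wb") as f:
            f.write(b"sample-1\n")
        first, offset = read_new_data(path, 0)
        with open(path, "ab") as f:
            f.write(b"sample-2\n")
        second, offset = read_new_data(path, offset)
        return first, second, offset
```

Each call returns only the bytes written since the previous call, which is what lets the agent forward new data without resending old data.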
HiTune/Chukwa Process and Data Flow

     3. The Collector(s) write data to HDFS on the Chukwa Cluster. When a
     Collector has received 64 MB of data, or a given time interval has
     passed, it packs the received data into data packages (.done files).
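
The "64 MB or time interval" rule in step 3 is a size-or-age rotation policy. Below is a small sketch of such a policy under stated assumptions: the 300-second interval, the class name `SinkFile`, and the `.tmp` working extension are all hypothetical, and only the 64 MB threshold and the `.done` suffix come from the slide.

```python
import os

ROTATE_BYTES = 64 * 1024 * 1024  # 64 MB threshold from the slide
ROTATE_SECONDS = 300.0           # assumed value; the slide only says "a given time interval"


class SinkFile:
    """Tracks how much data a collector has written to its current file."""

    def __init__(self, opened_at):
        self.opened_at = opened_at
        self.bytes_written = 0

    def write(self, chunk):
        # Only the size is tracked here; a real collector would write to HDFS.
        self.bytes_written += len(chunk)

    def should_rotate(self, now):
        # Rotate when the file is big enough OR old enough.
        too_big = self.bytes_written >= ROTATE_BYTES
        too_old = (now - self.opened_at) >= ROTATE_SECONDS
        return too_big or too_old


def done_name(name):
    """Name of the closed data package for a working file (hypothetical naming)."""
    base, _ext = os.path.splitext(name)
    return base + ".done"
```

The time limit matters for quiet clusters: without it, a collector that receives little data would never close a package, and nothing would ever reach the Demux step.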
HiTune/Chukwa Process and Data Flow

     4. The Demux Manager checks for data packages in the Collector output
     directory on HDFS every 20 seconds. If it finds .done files, it starts
     a Map/Reduce job to analyze them (this may take a long time to
     finish).
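
The polling behaviour in step 4 can be sketched as a simple loop. This is an illustration only, not the Demux Manager's real code: `list_dir` and `start_job` are hypothetical callables standing in for the HDFS directory listing and the Map/Reduce job submission.

```python
import time

POLL_SECONDS = 20  # polling interval stated on the slide


def find_done_files(names):
    """Return the names that are closed data packages (ending in .done), sorted."""
    return sorted(n for n in names if n.endswith(".done"))


def poll_once(list_dir, start_job):
    """One polling pass: list the directory and start a job if packages exist."""
    done = find_done_files(list_dir())
    if done:
        start_job(done)
    return done


def run(list_dir, start_job, max_polls, sleep=time.sleep):
    """Poll a fixed number of times, waiting POLL_SECONDS between passes."""
    for _ in range(max_polls):
        poll_once(list_dir, start_job)
        sleep(POLL_SECONDS)
```

Files that are still being written do not end in .done, so the loop ignores them until the collector closes them.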
HiTune/Chukwa Process and Data Flow

     4. (Cont.) After Demux finishes, the user must run a HiTune script.
     This script runs Map/Reduce to produce the final output (.csv files).
     This may also take a long time to finish, but it is faster than the
     Demux step.
HiTune/Chukwa Process and Data Flow

     5. The user manually fetches the final output from hdfs://.JOBS/, then
     loads the output (.csv files) into the HiTune Excel template to view
     the results. Charts, summaries, etc. are computed by Excel.
HiTune/Chukwa Process and Data Flow
• Yes, if you want, you can deploy Chukwa on the Hadoop cluster itself.

• Doing so makes management and maintenance
  harder, but it is theoretically feasible.
Why such structure?
• Using Hadoop for MapReduce processing of
  logs is somewhat troublesome.
• Logs are generated incrementally across many
  machines, but Hadoop MapReduce works best
  on a small number of large files.
• HDFS doesn't currently support appends,
  making it difficult to keep the distributed copy
  fresh.

Why such structure?
• Chukwa is devoted to bridging that gap
  between logs and MapReduce.
• Chukwa is a scalable distributed monitoring
  and analysis system, particularly for logs from
  Hadoop and other large systems.
• Through the processing done by agents and collectors,
  large, incrementally appended, distributed logs are
  transformed into large data chunks, which are
  suitable for Map/Reduce.
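
A rough calculation shows why a small number of large files suits Map/Reduce better. The sketch below assumes a 64 MB HDFS block size and the usual behaviour of file-based input formats, where every non-empty file produces at least one input split; the file counts are invented for illustration, and real split computation has more details than this estimate.

```python
import math

BLOCK_BYTES = 64 * 1024 * 1024  # assumed HDFS block size for this illustration


def estimate_splits(file_sizes, block_bytes=BLOCK_BYTES):
    """Estimate map input splits: each non-empty file yields at least one split."""
    return sum(math.ceil(size / block_bytes) for size in file_sizes if size > 0)


small_logs = [1 * 1024 * 1024] * 6400   # 6400 log files of 1 MB each
big_chunks = [64 * 1024 * 1024] * 100   # the same 6400 MB packed into 100 chunks

print(estimate_splits(small_logs))  # 6400
print(estimate_splits(big_chunks))  # 100
```

The data volume is identical in both cases, but the packed form needs far fewer map tasks, so much less time goes to per-task startup.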

Why such structure?
• The overhead is mainly caused by the
  agents, since only the agents run on the Hadoop
  Cluster.
• According to the HiTune paper, the overhead
  is less than 2%.
• See these papers:
•   Dai, Jinquan, et al. "HiTune: Dataflow-based performance analysis for big data
    cloud." Proc. of the 2011 USENIX ATC (2011): 87-100. (Available on the HiTune GitHub page,
    https://github.com/intel-hadoop/HiTune)
•   Boulon, Jerome, et al. "Chukwa, a large-scale monitoring system." Proceedings of
    CCA. Vol. 8. 2008.

Current HiTune version: 0.9
• Best supports Hadoop 0.2
• Based on Chukwa 0.4
• Can support Hadoop 0.2+, but some options need
  to be changed, and some metrics will be
  missing. (The current IDH uses Hadoop 1.0+.)
• Usually requires a long time to complete
  aggregation and analysis. It is better to deploy it on
  a fast cluster.
Questions?
Backup
HiTune troubleshooting
• Troubleshooting HiTune is usually painful.
• Check these logs: Hadoop cluster logs
  (task tracker logs, job tracker logs, namenode
  logs, datanode logs), Chukwa logs (most important!:
  agent logs, collector logs, demux
  logs), HiTune logs (script outputs).
• If there are no errors or warnings in the logs, check
  the outputs on disk and on HDFS.
• HiTuneStatusCheck.sh is not reliable. Check the
  logs yourself.
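
Checking many logs by hand is tedious, so a small helper for a first pass over errors and warnings may help. This is a sketch that assumes log4j-style level tokens (WARN, ERROR, FATAL) appear in each log line; the function name `scan_log_lines` and the sample lines in the usage note are hypothetical, not real HiTune or Chukwa output.

```python
import re

# Assumed log4j-style severity tokens; adjust if the logs use another format.
LEVEL_RE = re.compile(r"\b(ERROR|FATAL|WARN)\b")


def scan_log_lines(lines):
    """Return (line_number, level, text) for every line with a warning or error."""
    hits = []
    for number, line in enumerate(lines, start=1):
        match = LEVEL_RE.search(line)
        if match:
            hits.append((number, match.group(1), line.rstrip("\n")))
    return hits
```

For example, `scan_log_lines(open(path))` on a collector log gives the line numbers to read first. An empty result does not prove the run was healthy, so the outputs on disk and HDFS still need checking.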
HiTune/Chukwa Process and Data Flow

     6. Later, Chukwa groups and archives the data used on the Chukwa
     Cluster HDFS to save space, but we will not discuss that here.

Contenu connexe

Tendances

Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performanceDataWorks Summit
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopGERARDO BARBERENA
 
Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14jijukjoseph
 
Hadoop Distributed file system.pdf
Hadoop Distributed file system.pdfHadoop Distributed file system.pdf
Hadoop Distributed file system.pdfvishal choudhary
 
Hadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job PerformanceHadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job PerformanceCloudera, Inc.
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingImpetus Technologies
 
Hadoop Installation presentation
Hadoop Installation presentationHadoop Installation presentation
Hadoop Installation presentationpuneet yadav
 
Hadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep InsightHadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep InsightHanborq Inc.
 
EclipseCon Keynote: Apache Hadoop - An Introduction
EclipseCon Keynote: Apache Hadoop - An IntroductionEclipseCon Keynote: Apache Hadoop - An Introduction
EclipseCon Keynote: Apache Hadoop - An IntroductionCloudera, Inc.
 
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your ApplicationHadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your ApplicationYahoo Developer Network
 
Meethadoop
MeethadoopMeethadoop
MeethadoopIIIT-H
 
Big data interview questions and answers
Big data interview questions and answersBig data interview questions and answers
Big data interview questions and answersKalyan Hadoop
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.Kyong-Ha Lee
 
Hypertable Distilled by edydkim.github.com
Hypertable Distilled by edydkim.github.comHypertable Distilled by edydkim.github.com
Hypertable Distilled by edydkim.github.comEdward D. Kim
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoopVarun Narang
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introductionrajsandhu1989
 
#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta
#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta
#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Guptavdmchallenge
 

Tendances (20)

Optimizing MapReduce Job performance
Optimizing MapReduce Job performanceOptimizing MapReduce Job performance
Optimizing MapReduce Job performance
 
Introduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to HadoopIntroduccion a Hadoop / Introduction to Hadoop
Introduccion a Hadoop / Introduction to Hadoop
 
Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14Hadoop single node installation on ubuntu 14
Hadoop single node installation on ubuntu 14
 
Hadoop Distributed file system.pdf
Hadoop Distributed file system.pdfHadoop Distributed file system.pdf
Hadoop Distributed file system.pdf
 
Hadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job PerformanceHadoop Summit 2012 | Optimizing MapReduce Job Performance
Hadoop Summit 2012 | Optimizing MapReduce Job Performance
 
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop ConsultingAdvanced Hadoop Tuning and Optimization - Hadoop Consulting
Advanced Hadoop Tuning and Optimization - Hadoop Consulting
 
Hadoop Installation presentation
Hadoop Installation presentationHadoop Installation presentation
Hadoop Installation presentation
 
Hadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep InsightHadoop MapReduce Introduction and Deep Insight
Hadoop MapReduce Introduction and Deep Insight
 
Hadoop ppt2
Hadoop ppt2Hadoop ppt2
Hadoop ppt2
 
EclipseCon Keynote: Apache Hadoop - An Introduction
EclipseCon Keynote: Apache Hadoop - An IntroductionEclipseCon Keynote: Apache Hadoop - An Introduction
EclipseCon Keynote: Apache Hadoop - An Introduction
 
Hadoop 1.x vs 2
Hadoop 1.x vs 2Hadoop 1.x vs 2
Hadoop 1.x vs 2
 
HDFS Internals
HDFS InternalsHDFS Internals
HDFS Internals
 
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your ApplicationHadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
Hadoop Summit 2010 Tuning Hadoop To Deliver Performance To Your Application
 
Meethadoop
MeethadoopMeethadoop
Meethadoop
 
Big data interview questions and answers
Big data interview questions and answersBig data interview questions and answers
Big data interview questions and answers
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.
 
Hypertable Distilled by edydkim.github.com
Hypertable Distilled by edydkim.github.comHypertable Distilled by edydkim.github.com
Hypertable Distilled by edydkim.github.com
 
Seminar_Report_hadoop
Seminar_Report_hadoopSeminar_Report_hadoop
Seminar_Report_hadoop
 
Hadoop and Mapreduce Introduction
Hadoop and Mapreduce IntroductionHadoop and Mapreduce Introduction
Hadoop and Mapreduce Introduction
 
#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta
#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta
#VirtualDesignMaster 3 Challenge 2 - Harshvardhan Gupta
 

En vedette

xkcd viewer report
xkcd viewer reportxkcd viewer report
xkcd viewer reportZx MYS
 
Stay Anonymous app report
Stay Anonymous app reportStay Anonymous app report
Stay Anonymous app reportZx MYS
 
Shopping buddy report
Shopping buddy reportShopping buddy report
Shopping buddy reportZx MYS
 
Bookio report
Bookio reportBookio report
Bookio reportZx MYS
 
Universal login
Universal loginUniversal login
Universal loginZx MYS
 
a Google Glass app presentation
a Google Glass app presentationa Google Glass app presentation
a Google Glass app presentationZx MYS
 
Event Coordinator
Event CoordinatorEvent Coordinator
Event CoordinatorZx MYS
 
Sketch of the ZXFS
Sketch of the ZXFSSketch of the ZXFS
Sketch of the ZXFSZx MYS
 
中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)
中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)
中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)Zx MYS
 

En vedette (9)

xkcd viewer report
xkcd viewer reportxkcd viewer report
xkcd viewer report
 
Stay Anonymous app report
Stay Anonymous app reportStay Anonymous app report
Stay Anonymous app report
 
Shopping buddy report
Shopping buddy reportShopping buddy report
Shopping buddy report
 
Bookio report
Bookio reportBookio report
Bookio report
 
Universal login
Universal loginUniversal login
Universal login
 
a Google Glass app presentation
a Google Glass app presentationa Google Glass app presentation
a Google Glass app presentation
 
Event Coordinator
Event CoordinatorEvent Coordinator
Event Coordinator
 
Sketch of the ZXFS
Sketch of the ZXFSSketch of the ZXFS
Sketch of the ZXFS
 
中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)
中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)
中国愤青群体心理研究 Chinese FenQin(angry youth) mentality (Chinese)
 

Similaire à Hi tune sharing

Hadoop installation by santosh nage
Hadoop installation by santosh nageHadoop installation by santosh nage
Hadoop installation by santosh nageSantosh Nage
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component rebeccatho
 
Hadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaperHadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaperDavid Luan
 
project report on hadoop
project report on hadoopproject report on hadoop
project report on hadoopManoj Jangalva
 
Distributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxDistributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxUttara University
 
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...Big Data Spain
 
Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)Edureka!
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduceDerek Chen
 

Similaire à Hi tune sharing (20)

Hadoop installation by santosh nage
Hadoop installation by santosh nageHadoop installation by santosh nage
Hadoop installation by santosh nage
 
2.1-HADOOP.pdf
2.1-HADOOP.pdf2.1-HADOOP.pdf
2.1-HADOOP.pdf
 
Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component Introduction to Hadoop and Hadoop component
Introduction to Hadoop and Hadoop component
 
Hadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaperHadoop with Lustre WhitePaper
Hadoop with Lustre WhitePaper
 
Hadoop overview.pdf
Hadoop overview.pdfHadoop overview.pdf
Hadoop overview.pdf
 
project report on hadoop
project report on hadoopproject report on hadoop
project report on hadoop
 
Distributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptxDistributed Systems Hadoop.pptx
Distributed Systems Hadoop.pptx
 
hadoop
hadoophadoop
hadoop
 
hadoop
hadoophadoop
hadoop
 
Instant hadoop of your own
Instant hadoop of your ownInstant hadoop of your own
Instant hadoop of your own
 
Hadoop basics
Hadoop basicsHadoop basics
Hadoop basics
 
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
Coordinating the Many Tools of Big Data - Apache HCatalog, Apache Pig and Apa...
 
Unit 1
Unit 1Unit 1
Unit 1
 
Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)Hadoop Adminstration with Latest Release (2.0)
Hadoop Adminstration with Latest Release (2.0)
 
Unit 5
Unit  5Unit  5
Unit 5
 
Big data and hadoop
Big data and hadoopBig data and hadoop
Big data and hadoop
 
Introduction to HDFS and MapReduce
Introduction to HDFS and MapReduceIntroduction to HDFS and MapReduce
Introduction to HDFS and MapReduce
 
Cppt Hadoop
Cppt HadoopCppt Hadoop
Cppt Hadoop
 
Cppt
CpptCppt
Cppt
 
Cppt
CpptCppt
Cppt
 

Plus de Zx MYS

Camevent
CameventCamevent
CameventZx MYS
 
iBoard presentation
iBoard presentationiBoard presentation
iBoard presentationZx MYS
 
Delicious – A Recipe Share App
Delicious – A Recipe Share AppDelicious – A Recipe Share App
Delicious – A Recipe Share AppZx MYS
 
Oculus presentation
Oculus presentationOculus presentation
Oculus presentationZx MYS
 
Cloud-based smart classroom
Cloud-based smart classroomCloud-based smart classroom
Cloud-based smart classroomZx MYS
 
Carrier pigeon presentation
Carrier pigeon presentationCarrier pigeon presentation
Carrier pigeon presentationZx MYS
 
Columbia connect project rep
Columbia connect project repColumbia connect project rep
Columbia connect project repZx MYS
 

Plus de Zx MYS (7)

Camevent
CameventCamevent
Camevent
 
iBoard presentation
iBoard presentationiBoard presentation
iBoard presentation
 
Delicious – A Recipe Share App
Delicious – A Recipe Share AppDelicious – A Recipe Share App
Delicious – A Recipe Share App
 
Oculus presentation
Oculus presentationOculus presentation
Oculus presentation
 
Cloud-based smart classroom
Cloud-based smart classroomCloud-based smart classroom
Cloud-based smart classroom
 
Carrier pigeon presentation
Carrier pigeon presentationCarrier pigeon presentation
Carrier pigeon presentation
 
Columbia connect project rep
Columbia connect project repColumbia connect project rep
Columbia connect project rep
 

Dernier

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 

Dernier (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments

HiTune sharing

  • 1. HiTune sharing. Xiao Zhu, 1/29/2013
  • 2. HiTune is...
      – a Hadoop performance analyzer
      – developed by Intel
      – based on Chukwa
      – https://github.com/intel-hadoop/HiTune
      – Contact: jason.dai@intel.com, jie.huang@intel.com
      – Has 3 parts: 1) Tracker, 2) Aggregation Engine, 3) Analysis Engine
  • 3. Example of HiTune Output
  • 4. Example of HiTune Output
  • 5. Example of HiTune Output
  • 6. Chukwa is...
      – an open source data collection system for monitoring large distributed systems
      – based on HDFS and the Map/Reduce framework
      – http://incubator.apache.org/chukwa/
      – Has many parts, including: 1) Agent, 2) Collector, 3) DemuxManager, 4) other processes for logging and archiving
  • 7. HiTune is based on Chukwa
      HiTune part           Relation              Chukwa part
      Tracker               is partly based on    Agent
      Aggregation Engine    is based on           Collector
      Analysis Engine       is partly based on    Demux Manager
      We tend to call these parts by their right-side (Chukwa) names, and when we refer to HiTune we mean HiTune and Chukwa together. Some parts are simply built on Chukwa components; others are implemented by modifying Chukwa or adding new components. The HiTune release contains the Chukwa patches and a patched Chukwa binary, so when you deploy HiTune I do not suggest deploying Chukwa manually first (though you can), because HiTune already includes it.
  • 8. HiTune is based on Chukwa (continued). The Tracker includes a HiTune Java agent part and a Chukwa agent part. The Analysis Engine includes a HiTune script part and a Chukwa Demux part. See the data flow that follows for explanations of these parts.
  • 9. HiTune/Chukwa System Basic Structure. HiTune/Chukwa itself needs to be set up on its own standalone Hadoop cluster. We call that cluster the ‘Chukwa Cluster’, and the target cluster being analyzed the ‘Hadoop Cluster’.
      [Diagram: Hadoop Cluster (Workload, HiTune Agents, Map/Reduce, HDFS); Chukwa Cluster (Collectors, Demux, Map/Reduce, HDFS); Excel on the user's computer]
  • 10. HiTune/Chukwa Process and Data Flow. 1. The HiTune agents (the Java agent part) are invoked by the JVM on every node in the Hadoop cluster when the workload starts. This part collects system status and Hadoop logs and saves them to local storage.
  • 11. HiTune/Chukwa Process and Data Flow. 2. The Agent process (the Chukwa agent part) periodically checks the Java agent output and sends new data to (one of) the Collector(s).
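The "check periodically and send only new data" behavior in step 2 can be illustrated with a byte offset that is remembered between checks. This is a minimal sketch of the idea, not the Chukwa agent's actual code: the function name `ship_new_data` and the `send` callback are hypothetical, and a real agent would send over the network to a Collector rather than call a local function.

```python
from typing import Callable


def ship_new_data(path: str, offset: int, send: Callable[[bytes], None]) -> int:
    """Send whatever was appended to `path` since `offset`; return the new offset.

    `send` is a hypothetical stand-in for "deliver to a Collector".
    """
    with open(path, "rb") as fh:
        fh.seek(offset)      # skip the part that was already shipped
        data = fh.read()     # everything appended since the last check
    if data:
        send(data)
    return offset + len(data)
```

Calling this function on a timer, and storing the returned offset each time, gives the periodic behavior described on the slide.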
  • 12. HiTune/Chukwa Process and Data Flow. 3. The Collector(s) write data to HDFS on the Chukwa Cluster. When a Collector has received 64 MB of data, or a given time interval has passed, it packs the received data into a data package (a .done file).
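The "64 MB or a time interval, whichever comes first" rule in step 3 can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class `RotatingSink` is hypothetical, the 300-second interval is an assumed placeholder (the slide only says "a given time interval"), and nothing here writes to HDFS.

```python
import time

ROTATE_BYTES = 64 * 1024 * 1024  # size threshold from the slide (64 MB)
ROTATE_SECONDS = 300.0           # assumed value; the slide gives no number


class RotatingSink:
    """Toy model of a sink that closes a package on size or on age."""

    def __init__(self, max_bytes=ROTATE_BYTES, max_age=ROTATE_SECONDS,
                 clock=time.monotonic):
        self.max_bytes = max_bytes
        self.max_age = max_age
        self.clock = clock
        self.closed = []               # sizes of finished (".done") packages
        self._size = 0
        self._opened_at = clock()

    def write(self, chunk: bytes) -> None:
        """Account for a received chunk, then check the rotation conditions."""
        self._size += len(chunk)
        self._maybe_rotate()

    def tick(self) -> None:
        """Call periodically so an idle sink still rotates on age."""
        self._maybe_rotate()

    def _maybe_rotate(self) -> None:
        now = self.clock()
        too_big = self._size >= self.max_bytes
        too_old = now - self._opened_at >= self.max_age
        if not (too_big or too_old):
            return
        if self._size > 0:             # never emit an empty package
            self.closed.append(self._size)
        self._size = 0
        self._opened_at = now
```

The clock is injectable so the age condition can be exercised without waiting; with the defaults it uses `time.monotonic`.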
  • 13. HiTune/Chukwa Process and Data Flow. 4. The Demux Manager checks the Collector output directory on HDFS for data packages every 20 seconds. If it finds .done files, it starts a Map/Reduce job to analyze them (this may take a long time to finish).
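The polling behavior in step 4 can be sketched as follows. This is a hedged illustration, not the Demux Manager's implementation: it scans a local directory with `pathlib` as a stand-in for the Collector output directory on HDFS, and the names `find_done_files`, `poll_once`, `run`, and the `process` callback (standing in for "start a Map/Reduce job") are all hypothetical.

```python
import time
from pathlib import Path

POLL_SECONDS = 20  # polling interval stated on the slide


def find_done_files(sink_dir: Path) -> list:
    """Return finished data packages (*.done) in a stable order."""
    return sorted(p for p in sink_dir.glob("*.done") if p.is_file())


def poll_once(sink_dir: Path, process, seen: set) -> list:
    """Hand any not-yet-seen .done files to `process`; return them."""
    new = [p for p in find_done_files(sink_dir) if p not in seen]
    if new:
        process(new)       # stand-in for launching the analysis job
        seen.update(new)
    return new


def run(sink_dir: Path, process, interval: float = POLL_SECONDS,
        max_polls=None) -> None:
    """Poll forever, or `max_polls` times when a limit is given."""
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        poll_once(sink_dir, process, seen)
        polls += 1
        time.sleep(interval)
```

Files that are still being written (anything without the `.done` suffix) are ignored, which is why the package is only picked up after the Collector closes it.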
  • 14. HiTune/Chukwa Process and Data Flow. 4. (Cont.) After Demux finishes, the user has to run a HiTune script. This script runs Map/Reduce to produce the final output (.csv files). (This may also take a long time to finish, but it is faster than 3.)
  • 15. HiTune/Chukwa Process and Data Flow. 5. The user manually fetches the final output from hdfs://.JOBS/ and then loads the output (.csv files) into the HiTune Excel template to see the results. Charts, summaries, etc. are computed by Excel.
  • 16. HiTune/Chukwa Process and Data Flow
      • Yes, if you want to, you can deploy Chukwa on the Hadoop cluster itself.
      • Doing so makes management and maintenance harder, but it is theoretically feasible.
  • 17. Why such a structure?
      • Using Hadoop for MapReduce processing of logs is somewhat troublesome.
      • Logs are generated incrementally across many machines, but Hadoop MapReduce works best on a small number of large files.
      • HDFS doesn't currently support appends, making it difficult to keep the distributed copy fresh.
  • 18. Why such a structure?
      • Chukwa is devoted to bridging that gap between logs and MapReduce.
      • Chukwa is a scalable distributed monitoring and analysis system, particularly for logs from Hadoop and other large systems.
      • Through the processing done by agents and collectors, large, appended, distributed logs are transformed into large data chunks, which are suitable for Map/Reduce.
  • 19. Why such a structure?
      • The overhead is mainly caused by the agents, since only the agents run on the Hadoop Cluster.
      • According to the HiTune paper, the overhead is less than 2%.
      • See these papers:
          – Dai, Jinquan, et al. "HiTune: Dataflow-based performance analysis for big data cloud." Proc. of the 2011 USENIX ATC (2011): 87-100. (Available on the HiTune GitHub: https://github.com/intel-hadoop/HiTune)
          – Boulon, Jerome, et al. "Chukwa, a large-scale monitoring system." Proceedings of CCA. Vol. 8. 2008.
  • 20. Current HiTune version: 0.9
      • Supports Hadoop 0.2 best.
      • Based on Chukwa 0.4.
      • Can support Hadoop 0.2+, but some options need to be changed and some metrics will be missing. (The current IDH uses Hadoop 1.0+.)
      • Aggregation and analysis usually take a long time to complete, so it is better to deploy it on a fast cluster.
  • 23. HiTune troubleshooting
      • Troubleshooting HiTune is usually painful.
      • You need to check these logs: Hadoop cluster logs (task tracker, job tracker, namenode, and datanode logs); Chukwa logs (agent, collector, and demux logs), which are the most important; and HiTune logs (script outputs).
      • If there are no errors or warnings in the logs, check the outputs on disk and on HDFS.
      • HiTuneStatusCheck.sh is not reliable. Check the logs yourself.
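Checking many logs by hand for errors and warnings can be partly automated. The sketch below is an assumption-laden helper, not part of HiTune: the function name `scan_logs` is hypothetical, it assumes log4j-style level names (ERROR, FATAL, WARN) appear literally in the log lines, it only looks at files ending in `.log`, and the directories to scan must be supplied by the user.

```python
import re
from pathlib import Path

# Assumed log4j-style level names; adjust if the logs use a different layout.
PROBLEM = re.compile(r"\b(ERROR|FATAL|WARN(?:ING)?)\b")


def scan_logs(log_dirs):
    """Return (path, line number, line) for every error or warning line found."""
    hits = []
    for d in log_dirs:
        for path in sorted(Path(d).rglob("*.log")):
            # errors="replace" keeps the scan going past undecodable bytes
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, 1):
                    if PROBLEM.search(line):
                        hits.append((str(path), lineno, line.rstrip()))
    return hits
```

An empty result only means no matching lines were found; as the slide says, the next step is then to inspect the outputs on disk and on HDFS.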
  • 24. HiTune/Chukwa Process and Data Flow. 6. Later, Chukwa groups and archives the data used on the Chukwa Cluster HDFS to save space, but we will not discuss that here.