6. Chukwa is...
– an open source data collection system
for monitoring large distributed
systems.
– based on HDFS and Map/Reduce
framework.
– http://incubator.apache.org/chukwa/
– Has many parts, including:
– 1) Agent
– 2) Collector
– 3) DemuxManager
– 4) Other processes for logging and archiving
7. HiTune is based on Chukwa
– Tracker: partly based on the Chukwa Agent
– Aggregation Engine: based on the Chukwa Collector
– Analysis Engine: partly based on the Chukwa Demux Manager
We tend to call those parts by the Chukwa-side names (Agent, Collector, Demux Manager), and when we
refer to HiTune, we mean HiTune and Chukwa together.
Some of the parts are simply built upon Chukwa components,
while others are implemented by modifying Chukwa or adding new components.
You will find Chukwa patches and a patched Chukwa binary in the HiTune release.
So when you deploy HiTune, I do not suggest deploying Chukwa
manually first (though you can), because HiTune already includes it.
8. HiTune is based on Chukwa
The Tracker includes a HiTune Java agent part and a Chukwa agent part.
The Analysis Engine includes a HiTune script part and a Chukwa Demux part.
See the following data flow for explanations of those parts.
9. HiTune/Chukwa System Basic
Structure
HiTune/Chukwa itself needs to be set up on a standalone Hadoop
cluster. We name this the ‘Chukwa Cluster’, and the target cluster
being monitored is named the ‘Hadoop Cluster’.
[Diagram: Hadoop Cluster (Workload, HiTune Agents, Map/Reduce, HDFS) → Chukwa Cluster (Collectors, Demux, Map/Reduce, HDFS) → Excel on the User’s Computer]
10. HiTune/Chukwa Process and Data Flow
1. HiTune agents (the Java agent part) are invoked by the JVM when
the workload starts on every node in the Hadoop cluster. This part
collects system status and Hadoop logs and saves them on local
storage.
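As a rough illustration of the local logging in step 1, the sketch below samples the system load and appends it as one record per line to a local file. All names and the output path are hypothetical and only stand in for what the real Java agent does:

```python
import json
import os
import time

def sample_status():
    """Collect a minimal system-status snapshot (illustrative only)."""
    load1, load5, load15 = os.getloadavg()
    return {"ts": time.time(), "load1": load1, "load5": load5, "load15": load15}

def append_sample(path):
    """Append one JSON-encoded sample per line, log-style, to local storage."""
    with open(path, "a") as f:
        f.write(json.dumps(sample_status()) + "\n")
```

The real agent also captures Hadoop logs; this sketch shows only the periodic-sampling shape of the idea.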
11. HiTune/Chukwa Process and Data Flow
2. The Agent (Chukwa agent part) process checks the Java agent
output periodically and sends new data to (one of) the
Collector(s).
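The “check periodically and send new data” behavior in step 2 amounts to offset-based tailing: remember how far into the file you have read, and pick up only the bytes appended since. This Python fragment is illustrative only; the function name and checkpointing scheme are assumptions, not Chukwa’s API:

```python
import os

def read_new_data(path, offset):
    """Return the bytes appended to `path` since `offset`, plus the new offset.
    Mimics how an agent periodically picks up only fresh log data."""
    size = os.path.getsize(path)
    if size <= offset:
        return b"", offset  # nothing new since the last check
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)
```

The agent would then send `data` to one of the collectors and persist the new offset as its checkpoint (not shown).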
12. HiTune/Chukwa Process and Data Flow
3. Collector(s) put the data onto HDFS on the Chukwa Cluster. When a
collector has received 64 MB of data or a given time interval has
passed, it packs the received data into data packages (.done files).
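The seal-on-size-or-time rule in step 3 can be sketched as a small policy object. The 64 MB threshold comes from the text; the default time interval here is an assumption, since the text only says “a given time interval”:

```python
import time

class FlushPolicy:
    """Decide when a collector should seal its sink into a .done package:
    after max_bytes have accumulated or max_age seconds have passed,
    whichever comes first."""

    def __init__(self, max_bytes=64 * 1024 * 1024, max_age=300.0, now=time.time):
        # max_age default is illustrative; `now` is injectable for testing.
        self.max_bytes, self.max_age, self.now = max_bytes, max_age, now
        self.bytes, self.opened = 0, now()

    def add(self, nbytes):
        self.bytes += nbytes

    def should_flush(self):
        return (self.bytes >= self.max_bytes
                or self.now() - self.opened >= self.max_age)

    def flush(self):
        # Start a fresh package: reset the byte count and the open timestamp.
        self.bytes, self.opened = 0, self.now()
```

The time-based trigger matters because a quiet cluster would otherwise never reach the size threshold, and its data would sit unsealed indefinitely.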
13. HiTune/Chukwa Process and Data Flow
4. The Demux Manager checks for data packages in the Collector output
directory on HDFS every 20 seconds. If it finds .done files, it starts
a Map/Reduce procedure to analyze them (this may take a long time to
finish).
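The 20-second polling in step 4 boils down to scanning a directory for .done files. A minimal sketch follows; the function name is made up, and it scans a local directory where the real Demux Manager would list a directory on HDFS:

```python
import os

def find_done_files(directory):
    """List data packages ready for Demux: files whose names end in .done.
    The Demux Manager would run a check like this every 20 seconds and
    launch a Map/Reduce job over whatever it finds."""
    return sorted(
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.endswith(".done")
    )
```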
14. HiTune/Chukwa Process and Data Flow
4. (Cont.) After Demux finishes, the user is required to run a HiTune
script. This script runs Map/Reduce to produce the final output
(.csv files). (This may also take a long time, but it is faster than
the Demux run above.)
15. HiTune/Chukwa Process and Data Flow
5. The user gets the final output from hdfs://.JOBS/ manually, then
applies the output (.csv files) to the HiTune Excel template to see
the results. Graphs, summaries, etc. are computed by Excel.
16. HiTune/Chukwa Process and Data Flow
• Yes, if you want, you can deploy Chukwa on the Hadoop cluster itself.
• Doing so will add difficulties to management and
maintenance, but it is theoretically feasible.
17. Why such structure?
• Using Hadoop for MapReduce processing of
logs is somewhat troublesome.
• Logs are generated incrementally across many
machines, but Hadoop MapReduce works best
on a small number of large files.
• HDFS doesn't currently support appends,
making it difficult to keep the distributed copy
fresh.
18. Why such structure?
• Chukwa is devoted to bridging that gap
between logs and MapReduce.
• Chukwa is a scalable distributed monitoring
and analysis system, particularly for logs from
Hadoop and other large systems.
• Through the processing performed by agents and
collectors, large, appended, distributed logs are
transformed into large data chunks, which are
suitable for Map/Reduce.
19. Why such structure?
• The overhead is mainly caused by the
agents, since only the agents run on the Hadoop
Cluster.
• According to the HiTune paper, the overhead
is less than 2%.
• See those papers:
• Dai, Jinquan, et al. "Hitune: Dataflow-based performance analysis for big data
cloud." Proc. of the 2011 USENIX ATC (2011): 87-100. (Available on HiTune Github
https://github.com/intel-hadoop/HiTune)
• Boulon, Jerome, et al. "Chukwa, a large-scale monitoring system." Proceedings of
CCA. Vol. 8. 2008.
20. Current HiTune version: 0.9
• Supports Hadoop 0.20 best
• Based on Chukwa 0.4
• Can support Hadoop 0.20+, but some options need
to be changed, and some metrics will be
missing. (The current IDH uses Hadoop 1.0+.)
• Usually requires a long time to complete
aggregating and analyzing; better to deploy it on
a fast cluster.
23. HiTune troubleshooting
• Troubleshooting HiTune is usually painful.
• You need to check these logs: Hadoop cluster logs
(TaskTracker, JobTracker, NameNode, and
DataNode logs), (most important!) Chukwa
logs (agent, collector, and demux
logs), and HiTune logs (script outputs).
• If there is no error or warning in the logs, check
the outputs on disk and on HDFS.
• HiTuneStatusCheck.sh is not reliable. Check the
logs yourself.
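When reading agent, collector, and demux logs by hand, a simple marker scan helps triage. This Python sketch is a generic helper of my own, not part of HiTune or Chukwa:

```python
def scan_log(lines, markers=("ERROR", "WARN", "FATAL")):
    """Return (line_number, line) pairs whose text contains any of the
    given severity markers, so the suspicious lines surface first."""
    return [(i, line) for i, line in enumerate(lines, 1)
            if any(m in line for m in markers)]
```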
24. HiTune/Chukwa Process and Data Flow
6. Later, Chukwa will group and archive the data on the Chukwa
Cluster HDFS to save space, but we will not discuss that here.