"Hadoop and NoSQL: Scalable Back-end Clusters Orchestration in Real-world Systems" was presented in CloudCon2012: BIT’s 1st Annual World Congress of Cloud Computing 2012 will be held from August 28-30, 2012 in Dalian, China
1. Hadoop and NoSQL: Scalable Back-end Clusters Orchestration in Real-world Systems
CloudCon 2012, Dalian, China
Ruo Ando
NICT: National Institute of Information and Communications Technology, Tokyo, Japan
2. Agenda: Scalable Back-end Clusters Orchestration for Real-world Systems (large-scale network monitoring)
■Hadoop and NoSQL: Scalable Back-end Clusters Orchestration in Real-world Systems
Hadoop and NoSQL are usually used together, partly because key-value data formats (such as JSON) are well suited to exchanging data between MongoDB and HDFS. These technologies are deployed in a network monitoring system and a large-scale testbed at a national research institute in Japan.
■What is orchestration for? – large-scale network monitoring
With the rapid growth of botnets and file-sharing networks, network traffic monitoring logs have become "big data". Today's large-scale network monitoring needs scalable clusters for traffic logging and data processing.
■Background – Internet traffic explosion
Some statistics are shown about mobile phone traffic and the "gigabyte club".
■Real-world systems – large-scale DHT network crawling
To test the performance of our system, we crawled the DHT (BitTorrent) network. Our system obtained information on over 10,000,000 nodes in 24 hours. In addition, a ranking of countries by DHT network popularity is generated on our HDFS cluster.
■Architecture overview
We use everything available for constructing high-speed, scalable clusters (hypervisor, NoSQL, HDFS, Scala, etc.).
■MapReduce and traffic logs
For aggregating and sorting traffic logs, we have programmed a two-stage MapReduce.
■Results and demos
■Conclusion
3. NICT: National Institute of Information and Communications Technology, Tokyo, Japan
Solar observatory
Large-scale testbeds
Large-scale network emulation for analyzing cyber incidents (DDoS, botnets)
We have over 140,000 passive monitors in the darknet for analyzing botnets.
Darknet monitoring for malware analysis
4. StarBED: A Large-Scale Network Experiment Environment in NICT
• Developers always want to evaluate their new technologies in realistic situations, and developers of Internet technologies are no exception. The general experimental issues for Internet technologies are efficiency and scalability. StarBED makes it possible to evaluate these factors in realistic situations.
• Actual computers and network equipment are required if we want to evaluate software for the real Internet. StarBED provides many actual computers and the switches that connect them, so we can reproduce close-to-reality situations with the same kind of equipment used on the Internet. Developers who want to evaluate a real implementation have to use actual equipment.

group   # of nodes   disk      introduced
F       168          SATA x4   2006
H       240          SATA x2   2009
I       192          SATA x4   2011
J       96           SATA x4   2011
Other   500
total   960

There are about 1,000 servers. StarBED collaborates with other testbed projects such as DETER and PlanetLab in the US.
Groups I, J, K, L – Model: Cisco UCS C200 M2; CPU: Intel 6-core Xeon X5670 x 2; Memory: 48.0 GB; Disk: SATA 500 GB x 2; Network (on-board): dual Gigabit Ethernet
5. Real-world systems: monitoring the BitTorrent network – handling massive DHT crawling
Invisibility (and thus unstoppability) encourages illegal adoption of the DHT network.
In October 2010, a New York judge ordered LimeWire to shut down its file-sharing software: a US federal court ruled that LimeWire's service was being used for infringement of copyrighted content. Soon after, a new version of LimeWire called LPE (LimeWire Pirate Edition) was released as a resurrection by anonymous creators.
Estimates of BitTorrent's share of all Internet traffic:
① "55%" – CableLabs: about half of the upstream traffic of cable TV networks.
② "35%" – CacheLogic: "LIVEWIRE – File-sharing network thrives beneath the radar".
③ "60%" – documents at www.sans.edu: "It is estimated that more than 60% of the traffic on the internet is peer-to-peer."
6. Architecture Overview
The parser and translator are parallelized with Scala.
Virtual machines and data nodes can be added for scaling out.
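As an illustration only (the deck does not show the parser's code), here is a minimal sketch of parallelizing log parsing with Scala's parallel collections; the field layout (ip, port, timestamp, count) follows the CSV samples on slide 12, and the names LogEntry/parseLine are assumptions:

    // Sketch: parse traffic-log lines in parallel with Scala parallel
    // collections. Field layout follows the CSV samples later in the deck;
    // all names here are hypothetical, not the deck's actual code.
    case class LogEntry(ip: String, port: Int, timestamp: String, count: Int)

    def parseLine(line: String): LogEntry = {
      val Array(ip, port, ts, n) = line.split(",")
      LogEntry(ip, port.toInt, ts, n.toInt)
    }

    def parseAll(lines: Seq[String]): Seq[LogEntry] =
      lines.par.map(parseLine).seq  // .par spreads the parsing across cores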
8. Demo: visualizing propagation of DHT crawling
We have crawled more than 10,000,000 peers in the DHT network in 24 hours.
SQL databases (MySQL or PostgreSQL) cannot handle 4,000,000 peers in 3 hours!
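For scale, the sustained write rates these figures imply (back-of-the-envelope arithmetic only, derived from the two numbers above):

    // Average insert rates implied by the demo numbers.
    val perSec24h = 10000000.0 / (24 * 3600)  // ~116 peers/sec over 24 hours
    val perSec3h  = 4000000.0 / (3 * 3600)    // ~370 peers/sec over 3 hours
    println(f"$perSec24h%.0f/s over 24h, $perSec3h%.0f/s over 3h")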
9. DHT crawler and MapReduce
For a DHT network of this scale, we can never run too many crawlers. Without HDFS, processing one day of data takes 7 days.
(The full country ranking produced by this pipeline appears on slide 17.)
[Figure: multiple DHT crawlers dump data into a key-value store (<key> = node ID, <value> = data such as address and port); Map tasks, a shuffle, and a Reduce task aggregate the dumps. Scale out!]
The number of Map jobs should be increased in proportion to the number of DHT crawlers.
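A sketch of the key-value layout named in the figure (node ID as key; address, port, etc. as value); the case class, field names, and the choice of a concurrent map are assumptions for illustration:

    // Sketch of the figure's key-value layout:
    // <key> = node ID, <value> = contact data (address, port, ...).
    case class NodeInfo(address: String, port: Int)

    // concurrent map so that many crawlers can dump at once (assumption)
    val store = scala.collection.concurrent.TrieMap.empty[String, NodeInfo]

    // each crawler dumps what it learns; duplicates collapse on node ID
    def dump(nodeId: String, info: NodeInfo): Unit = store(nodeId) = info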
10. Scaling DHT crawlers out!
FIND_NODE is used to obtain the contact information of a node given its ID. The response carries a key "nodes" whose value is the compact node info of the target node, or of the K (8) closest nodes in the queried node's routing table.
arguments: {"id" : "<querying node's id>", "target" : "<id of target node>"}
response: {"id" : "<queried node's id>", "nodes" : "<compact node info>"}
[Figure: a hypervisor hosts multiple DHT crawlers querying the DHT network.]
Because the returned node info and the K (8) contacts are effectively randomly distributed, we can obtain up to 8^N peers after N query rounds.
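A minimal sketch of what such a query looks like on the wire (bencoding per BEP 5, which defines find_node). It assumes ASCII-safe IDs for brevity (real node IDs are 20 raw bytes) and a fixed transaction id "aa":

    // Bencode a KRPC find_node query (BEP 5). ASCII-safe ids assumed for
    // brevity; real node IDs are 20 raw bytes. Transaction id fixed to "aa".
    def bstr(s: String): String = s"${s.length}:$s"

    def findNodeQuery(myId: String, targetId: String): String = {
      // dict keys must appear in sorted order: "id" < "target"; a < q < t < y
      val args = s"d${bstr("id")}${bstr(myId)}${bstr("target")}${bstr(targetId)}e"
      s"d${bstr("a")}$args${bstr("q")}${bstr("find_node")}${bstr("t")}${bstr("aa")}${bstr("y")}${bstr("q")}e"
    }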
11. Rapid propagation of the DHT gossip protocol: N^M
[Chart 1: cumulative number of peers, 0 to 12,000,000, over 26 hours]
[Chart 2: hourly difference (log scale, 10,000 to 1,000,000) over the same 26 hours]
Applying the gossip protocol, DHT crawling has a propagation speed of N^M (N = 5-8). In the first 4 hours we can obtain more than 4,000,000 peers; after 5 hours the hourly increase (Δ) becomes stable.
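To ground the N^M figure: if each queried node returns up to K = 8 fresh contacts (the upper end of the slide's N = 5-8), coverage after M hops is bounded by N^M. A two-line check under that idealized assumption (overlap between contact lists ignored):

    // Idealized fan-out: each contacted node yields up to 8 fresh peers,
    // so after m hops we can reach at most 8^m nodes (overlap ignored).
    val reachable = (1 to 9).map(m => m -> math.pow(8, m).toLong)
    // 8^8 = 16,777,216 — already above the 10,000,000 peers observed in 24h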
12. Visualization & ranking
77.221.39.201,6881,2011/9/25 23:57:43,1
87.97.210.128,62845,2011/9/25 23:56:32,1
188.40.33.212,6881,2011/9/25 23:33:58,1
188.232.9.21,49924,2011/9/25 23:37:02,1
Traffic logs are parsed into XML, specifically KML (Keyhole Markup Language). For each IP address, GeoIP retrieves the domain name and location info (country, city, lat/lng); a KML movie is then generated from the timestamps and locations. Strings are tokenized and aggregated, and the ranking is computed on HDFS.
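A sketch of emitting one KML placemark per log line. The GeoIP lookup is stubbed (lookup and Geo are hypothetical names, since the deck does not show that code); the KML element names are the standard ones:

    // Sketch: one KML <Placemark> per peer. GeoIP is stubbed; `lookup`
    // is a hypothetical helper returning (country, city, lat, lng).
    case class Geo(country: String, city: String, lat: Double, lng: Double)
    def lookup(ip: String): Geo = Geo("??", "??", 0.0, 0.0)  // stub

    def placemark(ip: String, when: String): String = {
      val g = lookup(ip)
      s"""<Placemark>
         |  <name>${g.city}, ${g.country}</name>
         |  <TimeStamp><when>$when</when></TimeStamp>
         |  <Point><coordinates>${g.lng},${g.lat},0</coordinates></Point>
         |</Placemark>""".stripMargin
    }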
13. Two-Stage MapReduce: count and sorting
[Figure: Input → Map → Reduce1 (frequency count for each word) → Map → Reduce2 (sorting according to Reduce1) → Output]
MapReduce is an algorithm well suited to coping with big data:
map(key1, value1) -> list<key2, value2>
reduce(key2, list<value2>) -> list<value3>
Ranking (sorting) needs a second Map phase.
MapReduce: Simplified Data Processing on Large Clusters. Jeffrey Dean and Sanjay Ghemawat. OSDI '04: Sixth Symposium on Operating System Design and Implementation, San Francisco, CA, December 2004.
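The two signatures above, written out as Scala function types (a direct transcription of the paper's contract, nothing more):

    // The MapReduce contract (Dean & Ghemawat), as Scala function types.
    type Mapper[K1, V1, K2, V2] = (K1, V1) => List[(K2, V2)]
    type Reducer[K2, V2, V3]    = (K2, List[V2]) => List[V3]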
14. Map Phase
*.0.194.107,h116-0-194-107.catv02.itscom.jp
*.28.27.107,c-76-28-27-107.hsd1.ct.comcast.net
*.40.239.181,c-68-40-239-181.hsd1.mi.comcast.net
*.253.44.184,pool-96-253-44-184.prvdri.fios.verizon.net
*.27.170.168,cpc11-stok15-2-0-cust167.1-4.cable.virginmedia.com
*.22.23.81,cpc2-stkn10-0-0-cust848.11-2.cable.virginmedia.com
*.0.194.107  hsd1  comcast  hsd1  comcast  verizon  virginmedia
1  1  1  1  1  1  1
In the Map phase, each log line is tokenized into words, and each word is assigned "1", giving the key-value pair {word, 1}. Map jobs are easier to multiply than Reduce jobs.
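A minimal sketch of this Map step (plain Scala standing in for the Hadoop job; splitting on commas, dots, and hyphens is an assumption about the real tokenizer):

    // Map step sketch: tokenize each reverse-DNS log line, emit (word, 1).
    // The delimiter set [,.-] is an assumption about the real tokenizer.
    def mapPhase(lines: Seq[String]): Seq[(String, Int)] =
      lines.flatMap(_.split("[,.\\-]")).filter(_.nonEmpty).map(w => (w, 1))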
15. Reduce Phase
*.0.194.107  hsd1  comcast  hsd1  comcast  verizon  virginmedia
1  1  1  1  1  1  1
The Reduce job counts the frequency of each word by summing the 1s:
{hsd1, 2} / {comcast, 2} / {verizon, 1}
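And the matching Reduce step, continuing the sketch above:

    // Reduce step sketch: sum the 1s for each word.
    def reducePhase(pairs: Seq[(String, Int)]): Map[String, Int] =
      pairs.groupBy(_._1).map { case (w, ps) => w -> ps.map(_._2).sum }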
16. Sorting and ranking
*.0.194.107  hsd1  comcast  hsd1  comcast  verizon  hsd1
1  1  1  1  1  1  1
→ ① hsd1: 3   ② comcast: 2   ③ verizon: 1
Sorting and ranking form the second Reduce phase: words with their frequencies are sorted in the shuffle.
@list1 = reverse sort { (split(/\s/, $a))[1] <=> (split(/\s/, $b))[1] } @list1;
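The same ranking step as the Perl one-liner above, in Scala for comparison (a sketch, not the deck's code):

    // Second-stage ranking sketch: order words by descending frequency.
    def rank(counts: Map[String, Int]): Seq[(String, Int)] =
      counts.toSeq.sortBy { case (_, n) => -n }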
17. Example: # of nodes ranking in one day

RANK  Country        # of nodes  Region           Domain
1     Russia         1,488,056   Russia           RU
2     United States  1,177,766   North America    US
3     China          815,934     East Asia        CN
4     UK             414,282     West Europe      GB
5     Canada         408,592     North America    CA
6     Ukraine        399,054     East Europe      UA
7     France         394,005     West Europe      FR
8     India          309,008     South Asia       IN
9     Taiwan         296,856     East Asia        TW
10    Brazil         271,417     South America    BR
11    Japan          262,678     East Asia        JP
12    Romania        233,536     East Europe      RO
13    Bulgaria       226,885     East Europe      BG
14    South Korea    217,409     East Asia        KR
15    Australia      216,250     Oceania          AU
16    Poland         184,087     East Europe      PL
17    Sweden         183,465     North Europe     SE
18    Thailand       183,008     South East Asia  TH
19    Italy          177,932     West Europe      IT
20    Spain          172,969     West Europe      ES
18. All cities except the US

N/A                  978,457
1   Moscow           285,097 (RU:1)
2   Beijing          240,419 (CN:3)
3   Seoul            180,186 (KR)
4   Taipei           161,498 (TW:9)
5   Kiev             117,392 (UA:6)
6   Saint Petersburg  94,560
7   Bucharest         79,336
8   Sofia             78,445 (BG:13)
9   Central District  65,635 (HK)
10  Bangkok           62,882 (TH:18)
11  Delhi             62,563 (IN:8)
12  Tokyo             54,531 (JP:11)
13  London            53,514 (GB:4)
14  Guangzhou         52,981 (CN:3)
15  Athens            52,656 (3,680,000: 1.4%)
16  Budapest          52,031

These peers were all contacted from a single point in Tokyo within 24 hours; propagation in the DHT network goes beyond border control.
Z. N. J. Peterson, M. Gondree, and R. Beverly. A position paper on data sovereignty: the importance of geolocating data in the cloud. 3rd USENIX Workshop on Hot Topics in Cloud Computing (HotCloud '11), June 2011.
19. Rank 3: China – 815,934 nodes, East Asia, CN

name       # of peers  population (x10,000)  city name (local)
Beijing    240,419     1,755                 北京
Guangzhou   52,981     1,004                 広州
Shanghai    27,399     1,921                 上海
Jinan       26,281       569                 済南
Chengdu     18,835     1,059                 成都
Shenyang    18,566       776                 瀋陽
Tianjin     18,460     1,228                 天津
Hebei       17,414         -                 河北
Wuhan       15,239       910                 武漢
Hangzhou    12,997       796                 杭州
Harbin      10,848       987                 ハルビン
Changchun   10,411       751                 長春
Nanning     10,318       648                 南寧
Qingdao     10,257       757                 青島

For comparison, Japan:
Tokyo       54,531     1,318                 東京
Osaka        7,430       886                 大阪
Yokohama     6,983       369                 横浜

Beijing is the largest city, with about 240,000 peers, second only to Moscow. In China, BitTorrent seems to be popular alongside many domestic file-sharing systems (BitComet is a popular client in Asia). Tokyo and Guangzhou have almost the same number of peers, about 50,000.
20. Demo 2: (almost) real-time monitoring of peers in Japan
In this movie there are four colors, according to the number of files located at each point. The traffic log is translated into XML, namely KML (Keyhole Markup Language).
The movie can be generated one day later: aggregation and translation of 24 hours of data completes in 16 hours.
Spying the World from your Laptop: Identifying and Profiling Content Providers and Big Downloaders in BitTorrent. 3rd USENIX Workshop on Large-Scale Exploits and Emergent Threats (LEET '10), 2010.
21. Conclusion: Scalable Back-end Clusters Orchestration for Real-world Systems (large-scale network monitoring)
■Hadoop and NoSQL: Scalable Back-end Clusters Orchestration in Real-world Systems
Hadoop and NoSQL are usually used together, partly because key-value data formats (such as JSON) are well suited to exchanging data between MongoDB and HDFS. These technologies are deployed in a network monitoring system and a large-scale testbed at a national research institute in Japan.
■What is orchestration for? – large-scale network monitoring
With the rapid growth of botnets and file-sharing networks, network traffic monitoring logs have become "big data". Today's large-scale network monitoring needs scalable clusters for traffic logging and data processing.
■Background – Internet traffic explosion
Some statistics were shown about mobile phone traffic and the "gigabyte club".
■Real-world systems – large-scale DHT network crawling
To test the performance of our system, we crawled the DHT (BitTorrent) network. Our system obtained information on over 10,000,000 nodes in 24 hours. In addition, a ranking of countries by DHT network popularity is generated on our HDFS cluster.
■Architecture overview
We use everything available for constructing high-speed, scalable clusters (hypervisor, NoSQL, HDFS, Scala, etc.).
■MapReduce and traffic logs
For aggregating and sorting traffic logs, we have programmed a two-stage MapReduce.