SlideShare une entreprise Scribd logo
1  sur  10
Télécharger pour lire hors ligne
International Journal of Engineering Research and Development
e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com
Volume 10, Issue 5 (May 2014), PP.26-35
26
Can Modern Interconnects Improve the Performance of
Hadoop Cluster? Performance evaluation of
Hadoop on SSD and HDD with IPoIB
Piyush Saxena
M.Tech (Computer Science and Engineering), Amity School of Engineering and Technology,
Amity University, Noida, India +91-9451427546
Abstract:- In today‟s world where Internet is most required and where pentabytes of data is produced per hour,
there is a drastic need to speed up the performance and throughput of the cloud system. Traditional cloud
systems were not able to give the performance that the storage devices like SSD and HDD were meant to
deliver.
In the last paper we showed that the hadoop on SSD and HDD did not showed much difference in performance
as these were traditionally connected to the processing system that acts as a hindrance to the system. Another
reason that could be spotted with the pattern of data access was that there was less of Random Access Memory
with low caching resources available. To these issues, another set of experiments were conducted using a highly
improved connecting method than the conventional 10 GigE and by implementing Distributed shared memory
that can make the access patterns much faster. The improved methods that were considered for the test purposes
were IPoIB and RDMA-IB.
In this paper we will also present that Modern Interconnects used in Hadoop (MapReduce) with SSD can
outperform the traditional Interconnecting technique like 10 GigE networks. In addition, we also demonstrate
that the use of sockets or conventional TCP/IP applications can be still used with new technology and with
improved throughput and less latency when IBoIP is used.
Keywords:- Hadoop, HDFS, SSD, HDD, HiBench, Benchmarking, 10 GigE, IPoIB, RDMA-IB.
I. INTRODUCTION TO HADOOP AND HDFS
In today‟s digital age, a big measure of data is been processed on the internet. Allotting optimal data
processing with advantageous response duration acts the output to the requests by the consumer. There are
frequent users that assay to enter the alike data above the web and it is a challenging task for the server to deliver
optimal result. The large amount of data the internet has to deal with every day has made conventional solutions
extremely uneconomical. There are difficulties like processing large documents split into many disaffiliated sub-
tasks that are segmented with the available nodes, and processed in parallel. Due to this, MapReduce and Hadoop
came into existence.
Hadoop is a free-of-cost, programming architecture that is java-based and supports the processing of
large amounts of applications on systems that have thousands of nodes and involves multiple pentabytes of data.
The Hadoop Distributed File System helps faster data transfer rates between the nodes and makes the cluster to
persist functioning performances uninterrupted in case of node failure. This system actually lowers the risk of
complete system failure even when a significant no. of nodes are in-operative.[2]
Hadoop was motivated by MapReduce (Fig.1) that was introduced by Google, a software framework in which an
application is broken down into numerous small parts. Any of these parts (also called fragments or blocks) can be
run on any node in the cluster.[3]
MapReduce-based studies have been actively carried out for the efficient processing of big data on
hadoop. Hadoop runs on clusters of computers that can handle large amounts of data and support distributed
applications.[4] In the last few years, lots of research has been carried out to improve the performance of hadoop.
One of the hindrances is the performance issues of the storage device used as it is connected to the system by a
slower connecting interface like Bus. Even the difference in the Devices used for storage creates the hindrance.[5]
The performance of the Hadoop system is also bound on the type of workload that we consider. This is
why we consider HiBench as the standard model for testing Hadoop Distributed File System (HDFS). In this
paper, we try to study and evaluate the performance of Hadoop Distributed File System on a Hadoop Cluster
system that contains flash memory based SSD (Solid State Drive) and Hard Disk Drive by optimizing each
parameter on HiBench.
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
27
Technology has advanced fast, and datasets have grown even faster as it is easier to generate and
incarcerate data. The large Big Data, are a warehouse of information. The primary challenge in the investigation
of Big Data is to conquer the I/O blockage present on modern systems.[6] Lethargic I/O systems overpower the
very use of having high end processors. They cannot provide data fast adequate to utilize all of the accessible
processing power. Outcome of this is wastage of power and increases in the price of in commission large clusters.
An approach is the use of Modern interconnects like IPoIB and RDMA-IB in place of Traditional Interconnects
like 10 GigE.
Fig 1.) Hadoop MapReduce Architecture
II. TRADITIONAL INTERCONNECT 10 GIGE NETWORK
10 gigabit Ethernet is a communication methodology that can give data transfer speeds up to 10 billion
bits per second. 10 gigabit Ethernet is also known as 10GE, 10GbE or 10 GigE.
It supports full duplex connections that can be connected by network switches and shared medium operation with
CSMA/CD.[7] It can work properly with the existing protocols. Since the 10 GigE works in full-duplex method,
it doesn‟t need Carrier Sense Multiple Access/Collision Detection protocols that is extremely important as this
improves the efficiency and the speed of 10 Gb Ethernet as it can be easily deployed in the existing network, thus
giving a cost-efficient methodology that support high-speed, low-latency requirements.[8]
10-Gigabit Ethernet offers distances between physical locations up to 40 kilometers over a single-mode fiber and
multi-mode fiber systems.
Technically 10 Gb Ethernet is a Layer 1 and Layer 2 protocol that follows the Ethernet attributes like
Media Access Control (MAC) protocol, the Ethernet frame format, & min and max frame size. This technology
supports both LAN and WAN standards. (Fig 2.)
Issues faced in deploying 10 Gb Ethernet are due to the costs of fibre channels, but the benefits received are very
large.
III. MODERN INTERCONNECT IPOIB NETWORK
Infini Band (IB) [9] is a uniform organization regional Network that is applicable in HPC and data
centre environments Infini Band Technology has high speed data transfer at a very low latency time. To allow the
legacy IP based applications over Internet Protocol based apps over InfiniBand in Data Centers. Internet Protocol
over InfiniBand protocol uses an interface on top of Infini band „Verbs‟ Layer that allows the applications
running on sockets to use host based TCP/IP protocol stack that is converted into native InfiniBand Verbs that
looks invisible to the application. Sockets Direct Protocol (SDP) is a development of the sockets based boundary
interface, allows the process to bypass the TCP/IP protocol stack and translate socket based packets into the verbs
layer RDMA operations, still maintaining TCP streaming socket symbolism. [10]
SDP has the benefits of trespassing software layer that is required in IPoIB. The results of this are SDP
has better latency and performance than IPoIB.
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
28
The uses of the InfiniBand are in modern computing and high performance computing. The benefits of
IBoIP is reducing communication latency as well as providing higher available bandwidth to clients in the local
DCN.
The administration of network load is of concern in the new networking technologies. Quality of Service
provisioning could be used to control the traffic for intra-network loads that can have main concern over the input
data stream (IDS). The traffic loads and flexibility in fine tuning of the performance of the network is also a bottle
neck for the system wide performance. Such technology would boost the performance of traditional data centers
that still work on Ethernet Best Effort Service with low or no requirement for modifying the conventional socket
applications.
It is important to evaluate the behavior of the H/W level QOS provisioning for InfiniBand network with
applications on the optimized socket based protocols. This yields in a step to use of this new technology to
harness high-speed interconnects for existing Internet applications.
In this paper, we will analyze and see the performance improvements in case of modern interconnects
like IPoIB in comparison of the traditional Bus interconnects or 10 GigE hardware.
InfiniBand is a prominent cluster interconnecting technology with very low latency and very high
performance. Native InfiniBand verbs is the lowest software layer of the InfiniBand network that allows direct
user-level access to IB Host Channel Adapter (HCA) resources by omitting the Operating System. At the IB
verbs level, a queue pairing form is used for message underneath both Send/Receive and RDMA semantics.
InfiniBand needs the user to register the buffer before using it for communication.
InfiniBand HCAs has 2 ports that can operate as 4X InfiniBand or 10-GigE. The architecture of HCA
includes a stateless offload engine for network interface card (NIC) based protocol processing.
Sockets Direct Protocol was designed originally for InfiniBand that has now been redefined as a
transport –agnostic protocol for RDMA network based fabrics. It was made known to improve and progress the
performance of sockets by using the RDMA protocol of the InfiniBand network. SDP is a byte-stream protocol
that is built on TCP stream socket connotations. SDP uses a protocol switch inside the operating system kernel
that clearly alternates between kernel TCP/IP stack above IB (IPoIB) along with the SDP above IB (which
sidesteps the kernel TCP/IP stack) [11].
SDP acquires bi-form layouts of data interchange. In the buffered-copy arrangement, the socket data is
duplicated in a preregistered buffer foregoing the network transfer. In the zero-copy arrangement, the consumer
buffer is lucidly registered for broadcasting to bypass data reproduction. (Fig 2.)
IV. MODERN INTERCONNECT RDMA-IB
InfiniBand Host Channel Adapters (HCA) and further network equipments can be approached by the
upper layer software using an interface called Verbs. The verbs interface is a low level communication interface
that follows the Queue Pair (or communication end-points) model.
Queue pairs are required to establish a channel between the two communicating entities. Each queue pair
has a certain number of work queue elements. Upper-level software places a work request on the queue pair that
is then processed by the HCA. When a work element is completed, it is placed in the completion queue. Upper
level software can detect completion by polling the completion queue. Verbs that are used to transfer data are
completely OS-bypassed. (Fig 2.)
Fig 2.) Various Interconnect Technologies and architecture
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
29
V. TEST BED SYSTEM USED FOR THE ANALYSIS
4 node all 1U servers (Quanta Stack) with 2 Intel Xeon X5670 CPU‟s, each one has 6 cores that is equal
to 12 physical cores with 96GB of Memory and Ethernet/ Infiniband as network. Storage Device 100 GB SSD
and 2 TB HDD.
VI. CLASSIFICATION OF MICRO BENCHMARK WORKLOADS
1.) Sort: It is a representation of a large subset of real world MapReduce jobs that is transforming data from
one representation to another. Sort requires an Input Output bound system resource utilization with the data
access patterns as equal quantities of data access. The input data is generated using the RandomTextWriter
program contained in the Hadoop distribution. Time taken by Reduce stage is twice the time taken by Map stage.
(Fig.3.1) [12]
Fig 3.1) MapReduce for SORT workload
2.) Word Count: It is also a representation of a large subset of real world MapReduce jobs that is
transforming data by extracting a small amount of interesting data from a large data set. Word Count requires a
CPU bound system resource utilization with the data access patterns as reducing quantities of data access. The
input data is generated using the RandomTextWriter program contained in the Hadoop distribution. Time taken by
Reduce stage is nearly the same as the time taken by Map stage. (Fig.3.2)
Fig 3.2) MapReduce for WORD COUNT Workload
3.) TeraSort: It sorts 10 billion 100-byte records generated by the TeraGen program contained in the
Hadoop distribution. TeraSort requires CPU bound system resource utilization during Map stage and Input
Output bound system resource utilization during Reduce stage with the data access patterns as reducing and then
growing quantities of data access. Time taken by Reduce stage is 1.5 times the time taken by Map stage. (Fig.3.3)
Fig 3.3) MapReduce for TERA SORT workload
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
30
VII. PERFORMANCE EVALUATION OF SSD AND HDD ON 10GIGE AND IPOIB
For the Performance evaluation and Analysis of the performance of SSD and HDD the considered work
loads are Sort, Word Count and Tera Sort on two different workloads viz. 10 GigE and IBoIP. The size of data
taken for all the workloads is 6550021992 bytes that is 6.1001GB of data. [13][14][15] (X-Axis -> Percentage ;
Y-Axis -> Time in sec)
1) Sort Work Load: Since Sort has an Input Output bound resource utilization it is easily observed that
SSD (Fig.4.1) buffers the data much earlier and at a faster rate than HDD (Fig.5.1) that tends to buffer at a
constant speed. Due to this reason the SSD had an earlier chance to start off with the Reduce phase as compared
to the HDD. It can also be inference from the graduated behavior of the graph that HDD works in a much
stabilized manner as compared to the SDD. Over all SSD finishes off its job with the processors 39seconds
earlier than the HDD. This proves that the SSD works much faster than HDD in the scenario of Sort Workload.
Fig 4.1.) SORT workload on Solid State Drive 10GigE (Blue -> Map Phase; Red -> Reduce Phase)
Fig 4.2.) SORT workload on Solid State Drive IPoIB (Blue -> Map Phase; Red -> Reduce Phase)
As of the performance change between the Modern Interconnect using IPoIB comapred to the
Traditional Interconnect using 10 GigE can be analysed from the benchmarking results of SSD and HDD used on
both types of interconnects. The analysis is as follows:
For SSD: the level of improvement is an average of 45% with precise improvement of 44% in Map
Phase and 46% in Reduce Phase. The Map phase completed at 113sec in case of IBoIP as compared to 212sec in
case of 10GigE. The Reduce phase completed at 455sec in IPoIB as compared to 852sec in case of 10GigE. The
reduce phase started from 30% of map phase.
For HDD: the level of improvement is an average of 27% with precise improvement of 26% in Map
Phase and 27% in Reduce Phase. The Map phase completed at 196sec in case of IBoIP as compared to 265sec in
case of 10GigE. The Reduce phase completed at 699sec in IPoIB as compared to 953sec in case of 10GigE. The
reduce phase started from 28% of map phase.
International Journal of Engineering Research and Development
e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com
Volume 10, Issue 5 (May 2014), PP.26-35
31
Fig 5.1) SORT workload on Hard Disk Drive 10GigE (Blue -> Map Phase; Red -> Reduce Phase)
Fig 5.2) SORT workload on Hard Disk Drive IPoIB (Blue -> Map Phase; Red -> Reduce Phase)
2) Word Count Work Load: Since Sort has a CPU bound resource utilization it is easily observed that
SSD (Fig.6.1) and HDD (Fig.7.1) both buffers approximately at the same rate but with a little variation in the
speed as SSD buffers about 3 seconds faster than HDD. Due to this reason the SSD had an earlier chance to start
off with the Reduce phase at 47 seconds as compared to the HDD that starts at 49 seconds. It can also be inferred
from the abrupt behavior of the graph that HDD takes a longer time in the reduce phase as compared to the SSd
that takes less time. Over all SSD finishes off its job with the processors 6seconds earlier than the HDD that is not
a very major time difference. But, still this proves that the SSD works faster than HDD in the scenario of Word
Count Workload.
Fig 6.1) WORD COUNT Workload on Solid State Device 10GigE
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
32
Fig 6.2) WORD COUNT Workload on Solid State Device IPoIB
As of the performance change between the Modern Interconnect using IPoIB comapred to the
Traditional Interconnect using 10 GigE can be analysed from the benchmarking of SSD and HDD. The analysis is
as follows:
For SSD: the level of improvement is an average of 46% with precise improvement of 45% in Map
Phase and 47% in Reduce Phase. The Map phase completed at 43sec in case of IBoIP as compared to 79sec in
case of 10GigE. The Reduce phase completed at 50sec in IPoIB as compared to 94sec in case of 10GigE. The
reduce phase started from 44% of map phase.
For HDD: the level of improvement is an average of 29% with precise improvement of 26% in Map Phase and
33% in Reduce Phase. The Map phase completed at 64sec in case of IBoIP as compared to 86sec in case of
10GigE. The Reduce phase completed at 67sec in IPoIB as compared to 100sec in case of 10GigE. The reduce
phase started from 65% of map phase.
Fig 7.1) WORD COUNT Workload on Hard Disk Drive 10GigE
3) TeraSort Work Load: Since Sort has a CPU bound system resource utilization during Map stage and
Input Output bound system resource utilization during Reduce stage it is easily observed that SSD (Fig.8.1)
buffers the data much earlier 19Sec and at a faster rate than HDD (Fig.9.1) 21Sec that tends to buffer at an abrupt
speed. Due to this reason the SSD had an earlier chance to start off with the Reduce at 23sec as compared to the
HDD that starts at 24 second. It can also be observed that the reduce phase for SSD and HDD takes equal amount
of time i.e. process is independent of SSD or HDD & dependent on processor. SSD finishes its job 1second
earlier than the HDD that is not a negligible difference. But, still this proves that the SSD has lower latency than
HDD in the scenario of Tera Sort Workload
International Journal of Engineering Research and Development
e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com
Volume 10, Issue 5 (May 2014), PP.26-35
33
Fig 7.2) WORD COUNT Workload on Hard Disk Drive IPoIB
Fig 8.1) TERASORT workload on Solid State Drive 10GigE
Fig 8.2) TERASORT workload on Solid State Drive IPoIB
As of the performance change between the Modern Interconnect using IPoIB comapred to the
Traditional Interconnect using 10 GigE can be analysed from the benchmarking results of SSD and HDD used on
both types of interconnects. The analysis is as follows:
For SSD: the level of improvement is an average of 44% with precise improvement of 41% in Map
Phase and 46% in Reduce Phase. The Map phase completed at 14sec in case of IBoIP as compared to 28sec in
case of 10GigE. The Reduce phase completed at 25sec in IPoIB as compared to 46sec in case of 10GigE. The
reduce phase started from 98% of map phase.
For HDD: the level of improvement is an average of 25% with precise improvement of 24% in Map
Phase and 26% in Reduce Phase. The Map phase completed at 16sec in case of IBoIP as compared to 21sec in
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
34
case of 10GigE. The Reduce phase completed at 34sec in IPoIB as compared to 46sec in case of 10GigE. The
reduce phase started from 100% of map phase.
Fig 9.1) TERASORT workload on Hard Disk Drive 10GigE
Fig 9.2) TERASORT workload on Hard Disk Drive IPoIB
VIII. CONCLUSION AND FUTURE SCOPE
From the above results and analysis the performance of SSD and HDD is nearly the same for the same
Interconnect used, but positive results can be seen for better performance of SSD than HDD with use of IPoIB
(Fig 10). Also the difference in the performance is very visible and drastic. So, an observation that can be
monitored is that the Map phase in any of the workload is performing well until the random access memory is not
consumed or the interconnect technology of the network used is of very high throughput and low latency. This
concludes that there is a need to involve a Distributed Shared Memory (DSM) or the need to improve the
Interconnect technology for networking from the traditional 10GigE to IBoIP to improve the performance of the
SSD and HDD and get better significant results [16]. Another connection technique like InfiniBand using RDMA
is used to connect then better performance in terms of latency, speed of access and fault tolerance can be achieved
[10].
Fig 10) Comparison of performances of the Interconnect Technologies
Can Modern Interconnects Improve the Performance of Hadoop Cluster?...
35
In the future, a model to have DSM as a part should be used to be implemented with the use of
InfiniBand on RDMA that supports the use of Verbs and with the technology of Optical Fibers to achieve faster
performances and Remote Dynamic Memory Access. [17][18][19] The modern interconnects like IPoIB and
RDMA-IB has got a lot of potential and their powers need to be researched and harnessed on in the future.[20]
REFERENCES
[1] Hadoop Home: http://hadoop.apache.org/
[2] Jacky Wu, "Hadoop HDFS & MapReduce", Help Guidelines LSA Lab, NTHU, Taiwan 2013.8.7.
[3] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System", SOSP‟03,
October 19–22, 2003, Bolton Landing, New York, USA. Copyright 2003 ACM.
[4] Piyush Saxena, Satyajit Padhy, Praveen Kumar, “Optimizing Parallel Data Processing With Dynamic
Resource Allocation”, International Conference on Reliability, Infocom Technologies and
Optimization.,pp. 735-739, Jan. 29-31, 2013.
[5] Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters",
OSDI‟04, 2004, Bolton Landing, New York, USA. Copyright 2004 ACM.
[6] “Can High-Performance Interconnects Benefit Hadoop Distributed File System? ”, Lab Resources of
Network-Based Computing Laboratory, Department of Computer Science and Engineering, The Ohio
State University, USA.
[7] N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang, H. Subramoni, C. Murthy, and D.
K. Panda, “High Performance RDMA-based Design of HDFS over InfiniBand”, Research Resources of
Department of Computer Science and Engineering and IBM T.J Watson Research Center, The Ohio
State University Yorktown Heights, NY. November 10-16, 2012, Salt Lake City, Utah, USA.
[8] Open Fabrics Enterprise Distribution, http://www.openfabrics.org/.
[9] X. Ding, S. Jiang, F. Chen, K. Davis, and X. Zhang. DiskSeen: Exploiting Disk Layout and Access
History to Enhance I/O Prefetch. In Proceedings of USENIX07, 2007.
[10] InfiniBand Trade Association Home: http://www.infinibandta.org/
[11] C. Gniady, Y. C. Hu, and Y.-H. Lu. Program Counter Based Techniques for Dynamic Power
Management. In Proceedings of the 10th International Symposium on High Performance Computer
Architecture, HPCA ‟04, Washington, DC, USA, 2004. IEEE Computer Society.
[12] Shengsheng Huang, Jie Huang, Jinquan Dai, Tao Xie, and Bo Huang, "The HiBench Benchmark Suite:
Characterization of the MapReduce-Based Data Analysis", ICDE Workshops'10, Oct. 2010, 2010
IEEE.
[13] Lan Yi, “Experience with HiBench: From Micro-Benchmarks toward End-to-End Pipelines”, WBDB
2013 Workshop Presentation, Intel China Software Center, 2013.07.16.
[14] Dominique Heger, “Hadoop Performance Tuning - A Pragmatic & Iterative Approach”, Research
details by DHTechnologies ‐ www.dhtusa.com, 2013.
[15] Jason Dai, “Toward Efficient Provisioning and Performance Tuning for Hadoop”, Apache Asia
Roadshow 2010, Intel China Software Center, June 2010.
[16] Remote Direct Memory Access : http://en.wikipedia.org/wiki/Remote_direct_memory_access
[17] Liang Ming , Dan Feng, Fang Wang, Qi Chen, Yang Li, Yong Wan, Jun Zhou, "A Performance
Enhanced User-space Remote Procedure Call on InfiniBand*", Photonics and Optolectronics Meetings
(POEM)., 2011.
[18] Fan Liang, Chen Feng, Xiaoyi Lu, Zhiwei Xu, "Performance Benefits of DataMPI:A Case Study with
BigDataBench", ACM SOFT BPOE ‟14, Mar 1, 2014, Salt Lake City, Utah, USA.
[19] Xiaoyi Lu, Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Hari Subramoni, Hao Wang, and
Dhabaleswar K. (DK) Panda, "High-Performance Design of Hadoop RPC with RDMA over
InfiniBand", National Science Foundation grants #OCI-0926691, #OCI-1148371 and #CCF-1213084,
2013 IEEE.
[20] K. Gupta, R. Jain, H. Pucha, P. Sarkar, and D. Subhraveti, “Scaling Highly-Parallel Data-Intensive
Supercomputing Applications on a Parallel Clustered Filesystem,” in The SC10 Storage Challenge.
Piyush Saxena Pursuing Master of Technology in Computer Science and Engineering from
Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India,
Area of Interest: Cloud Computing, Data Mining and Warehousing and Soft Computing.

Contenu connexe

Tendances

HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODinside-BigData.com
 
Accelerated Any-Scale Solutions from DDN
Accelerated Any-Scale Solutions from DDNAccelerated Any-Scale Solutions from DDN
Accelerated Any-Scale Solutions from DDNinside-BigData.com
 
Leveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformationLeveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformationJohn Archer
 
Emulex Presents Why I/O is Strategic Global Survey Results
Emulex Presents Why I/O is Strategic Global Survey ResultsEmulex Presents Why I/O is Strategic Global Survey Results
Emulex Presents Why I/O is Strategic Global Survey ResultsEmulex Corporation
 
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisHadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisOW2
 
PLX Technology Company Overview
PLX Technology Company OverviewPLX Technology Company Overview
PLX Technology Company OverviewPLX Technology
 
Delivering on the Hadoop/HBase Integrated Architecture
Delivering on the Hadoop/HBase Integrated ArchitectureDelivering on the Hadoop/HBase Integrated Architecture
Delivering on the Hadoop/HBase Integrated ArchitectureDataWorks Summit
 
Attaching cloud storage to a campus grid using parrot, chirp, and hadoop
Attaching cloud storage to a campus grid using parrot, chirp, and hadoopAttaching cloud storage to a campus grid using parrot, chirp, and hadoop
Attaching cloud storage to a campus grid using parrot, chirp, and hadoopJoão Gabriel Lima
 
Design and evaluation of a proxy cache for
Design and evaluation of a proxy cache forDesign and evaluation of a proxy cache for
Design and evaluation of a proxy cache foringenioustech
 
Mellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshopMellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshopGanesan Narayanasamy
 
Dell First Out the Blocks with 25GbE Servers
Dell First Out the Blocks with 25GbE ServersDell First Out the Blocks with 25GbE Servers
Dell First Out the Blocks with 25GbE ServersIT Brand Pulse
 
Covid-19 Response Capability with Power Systems
Covid-19 Response Capability with Power SystemsCovid-19 Response Capability with Power Systems
Covid-19 Response Capability with Power SystemsGanesan Narayanasamy
 
Data Science at Scale on MPP databases - Use Cases & Open Source Tools
Data Science at Scale on MPP databases - Use Cases & Open Source ToolsData Science at Scale on MPP databases - Use Cases & Open Source Tools
Data Science at Scale on MPP databases - Use Cases & Open Source ToolsEsther Vasiete
 
Nebula - The Future Internet Architecture
Nebula - The Future Internet ArchitectureNebula - The Future Internet Architecture
Nebula - The Future Internet ArchitectureRanjan Dhar
 
HUG Ireland Event - HPCC Presentation Slides
HUG Ireland Event - HPCC Presentation SlidesHUG Ireland Event - HPCC Presentation Slides
HUG Ireland Event - HPCC Presentation SlidesJohn Mulhall
 
Pioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSCPioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSCinside-BigData.com
 
Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015PivotalOpenSourceHub
 

Tendances (20)

HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPODHPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
HPC at Scale Enabled by DDN A3i and NVIDIA SuperPOD
 
Accelerated Any-Scale Solutions from DDN
Accelerated Any-Scale Solutions from DDNAccelerated Any-Scale Solutions from DDN
Accelerated Any-Scale Solutions from DDN
 
Leveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformationLeveraging IoT as part of your digital transformation
Leveraging IoT as part of your digital transformation
 
Emulex Presents Why I/O is Strategic Global Survey Results
Emulex Presents Why I/O is Strategic Global Survey ResultsEmulex Presents Why I/O is Strategic Global Survey Results
Emulex Presents Why I/O is Strategic Global Survey Results
 
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, ParisHadoop's Role in the Big Data Architecture, OW2con'12, Paris
Hadoop's Role in the Big Data Architecture, OW2con'12, Paris
 
PLX Technology Company Overview
PLX Technology Company OverviewPLX Technology Company Overview
PLX Technology Company Overview
 
Delivering on the Hadoop/HBase Integrated Architecture
Delivering on the Hadoop/HBase Integrated ArchitectureDelivering on the Hadoop/HBase Integrated Architecture
Delivering on the Hadoop/HBase Integrated Architecture
 
Attaching cloud storage to a campus grid using parrot, chirp, and hadoop
Attaching cloud storage to a campus grid using parrot, chirp, and hadoopAttaching cloud storage to a campus grid using parrot, chirp, and hadoop
Attaching cloud storage to a campus grid using parrot, chirp, and hadoop
 
Design and evaluation of a proxy cache for
Design and evaluation of a proxy cache forDesign and evaluation of a proxy cache for
Design and evaluation of a proxy cache for
 
Mellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshopMellnox Interconnect presentation in OpenPOWER Brazil workshop
Mellnox Interconnect presentation in OpenPOWER Brazil workshop
 
Dell First Out the Blocks with 25GbE Servers
Dell First Out the Blocks with 25GbE ServersDell First Out the Blocks with 25GbE Servers
Dell First Out the Blocks with 25GbE Servers
 
Covid-19 Response Capability with Power Systems
Covid-19 Response Capability with Power SystemsCovid-19 Response Capability with Power Systems
Covid-19 Response Capability with Power Systems
 
Iccns2008 Cp15
Iccns2008 Cp15Iccns2008 Cp15
Iccns2008 Cp15
 
Data Science at Scale on MPP databases - Use Cases & Open Source Tools
Data Science at Scale on MPP databases - Use Cases & Open Source ToolsData Science at Scale on MPP databases - Use Cases & Open Source Tools
Data Science at Scale on MPP databases - Use Cases & Open Source Tools
 
OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar OpenPOWER/POWER9 AI webinar
OpenPOWER/POWER9 AI webinar
 
Nebula - The Future Internet Architecture
Nebula - The Future Internet ArchitectureNebula - The Future Internet Architecture
Nebula - The Future Internet Architecture
 
HUG Ireland Event - HPCC Presentation Slides
HUG Ireland Event - HPCC Presentation SlidesHUG Ireland Event - HPCC Presentation Slides
HUG Ireland Event - HPCC Presentation Slides
 
Pioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSCPioneering and Democratizing Scalable HPC+AI at PSC
Pioneering and Democratizing Scalable HPC+AI at PSC
 
Managing IPv6 Deployments
Managing IPv6 DeploymentsManaging IPv6 Deployments
Managing IPv6 Deployments
 
Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015Greenplum Database Open Source December 2015
Greenplum Database Open Source December 2015
 

En vedette

2004 ASME Power Conference Multi-Pressure Surface Condenser Performance Eval...
2004 ASME Power Conference Multi-Pressure Surface Condenser  Performance Eval...2004 ASME Power Conference Multi-Pressure Surface Condenser  Performance Eval...
2004 ASME Power Conference Multi-Pressure Surface Condenser Performance Eval...Komandur Sunder Raj, P.E.
 
Recommendation - Darryl Wear
Recommendation - Darryl WearRecommendation - Darryl Wear
Recommendation - Darryl WearLauren Davis
 
March BG DIG: Marketing Planning 101
March BG DIG: Marketing Planning 101March BG DIG: Marketing Planning 101
March BG DIG: Marketing Planning 101Werkshop Marketing
 
WES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLE
WES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLEWES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLE
WES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLEMarijo Miljkovic
 
Pre-School Model - Day Care Center
Pre-School Model - Day Care CenterPre-School Model - Day Care Center
Pre-School Model - Day Care CenterManish Poddar
 

En vedette (8)

De paseo con sofia
De paseo con sofiaDe paseo con sofia
De paseo con sofia
 
2004 ASME Power Conference Multi-Pressure Surface Condenser Performance Eval...
2004 ASME Power Conference Multi-Pressure Surface Condenser  Performance Eval...2004 ASME Power Conference Multi-Pressure Surface Condenser  Performance Eval...
2004 ASME Power Conference Multi-Pressure Surface Condenser Performance Eval...
 
Recommendation - Darryl Wear
Recommendation - Darryl WearRecommendation - Darryl Wear
Recommendation - Darryl Wear
 
PR 101 - David Green
PR 101 - David GreenPR 101 - David Green
PR 101 - David Green
 
March BG DIG: Marketing Planning 101
March BG DIG: Marketing Planning 101March BG DIG: Marketing Planning 101
March BG DIG: Marketing Planning 101
 
WES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLE
WES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLEWES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLE
WES2015 - ADVANCED TYPE OF EJECTOR REFRIGERATION CYCLE
 
TELEVISION PPT
TELEVISION PPTTELEVISION PPT
TELEVISION PPT
 
Pre-School Model - Day Care Center
Pre-School Model - Day Care CenterPre-School Model - Day Care Center
Pre-School Model - Day Care Center
 

Similaire à International Journal of Engineering Research and Development

InfiniBand in the Enterprise Data Center.pdf
InfiniBand in the Enterprise Data Center.pdfInfiniBand in the Enterprise Data Center.pdf
InfiniBand in the Enterprise Data Center.pdfbui thequan
 
Internet of Things and Hadoop
Internet of Things and HadoopInternet of Things and Hadoop
Internet of Things and Hadoopaziksa
 
Industry Brief: Tectonic Shift - HPC Networks Converge
Industry Brief: Tectonic Shift - HPC Networks ConvergeIndustry Brief: Tectonic Shift - HPC Networks Converge
Industry Brief: Tectonic Shift - HPC Networks ConvergeIT Brand Pulse
 
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...IJCSES Journal
 
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...ijcses
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcscpconf
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...DataWorks Summit/Hadoop Summit
 
Mellanox High Performance Networks for Ceph
Mellanox High Performance Networks for CephMellanox High Performance Networks for Ceph
Mellanox High Performance Networks for CephMellanox Technologies
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCEcsandit
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersDataWorks Summit
 
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdfth1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdfTarekHassan840678
 
Monetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersMonetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersDataWorks Summit
 
Accelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesAccelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesIntel® Software
 
Ceph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance NetworksCeph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance NetworksCeph Community
 
Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks Ceph Community
 
IRJET- A Study of Comparatively Analysis for HDFS and Google File System ...
IRJET-  	  A Study of Comparatively Analysis for HDFS and Google File System ...IRJET-  	  A Study of Comparatively Analysis for HDFS and Google File System ...
IRJET- A Study of Comparatively Analysis for HDFS and Google File System ...IRJET Journal
 
Performance Evaluation of Soft RoCE over 1 Gigabit Ethernet
Performance Evaluation of Soft RoCE over 1 Gigabit EthernetPerformance Evaluation of Soft RoCE over 1 Gigabit Ethernet
Performance Evaluation of Soft RoCE over 1 Gigabit EthernetIOSR Journals
 
Big Data Benchmarking with RDMA solutions
Big Data Benchmarking with RDMA solutions Big Data Benchmarking with RDMA solutions
Big Data Benchmarking with RDMA solutions Mellanox Technologies
 

Similaire à International Journal of Engineering Research and Development (20)

InfiniBand in the Enterprise Data Center.pdf
InfiniBand in the Enterprise Data Center.pdfInfiniBand in the Enterprise Data Center.pdf
InfiniBand in the Enterprise Data Center.pdf
 
Internet of Things and Hadoop
Internet of Things and HadoopInternet of Things and Hadoop
Internet of Things and Hadoop
 
Industry Brief: Tectonic Shift - HPC Networks Converge
Industry Brief: Tectonic Shift - HPC Networks ConvergeIndustry Brief: Tectonic Shift - HPC Networks Converge
Industry Brief: Tectonic Shift - HPC Networks Converge
 
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
 
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
 
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
Accelerating Apache Hadoop through High-Performance Networking and I/O Techno...
 
Mellanox High Performance Networks for Ceph
Mellanox High Performance Networks for CephMellanox High Performance Networks for Ceph
Mellanox High Performance Networks for Ceph
 
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCENETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
 
Monitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service ProvidersMonitizing Big Data at Telecom Service Providers
Monitizing Big Data at Telecom Service Providers
 
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdfth1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
th1330-1410effectenbeurszaal4-3v2-140424180955-phpapp01 (1).pdf
 
Monetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service ProvidersMonetizing Big Data at Telecom Service Providers
Monetizing Big Data at Telecom Service Providers
 
Accelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing TechnologiesAccelerate Big Data Processing with High-Performance Computing Technologies
Accelerate Big Data Processing with High-Performance Computing Technologies
 
Interconnect your future
Interconnect your futureInterconnect your future
Interconnect your future
 
Ceph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance NetworksCeph Day New York 2014: Ceph over High Performance Networks
Ceph Day New York 2014: Ceph over High Performance Networks
 
Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks Ceph Day London 2014 - Ceph Over High-Performance Networks
Ceph Day London 2014 - Ceph Over High-Performance Networks
 
InfiniBand
InfiniBandInfiniBand
InfiniBand
 
IRJET- A Study of Comparatively Analysis for HDFS and Google File System ...
IRJET-  	  A Study of Comparatively Analysis for HDFS and Google File System ...IRJET-  	  A Study of Comparatively Analysis for HDFS and Google File System ...
IRJET- A Study of Comparatively Analysis for HDFS and Google File System ...
 
Performance Evaluation of Soft RoCE over 1 Gigabit Ethernet
Performance Evaluation of Soft RoCE over 1 Gigabit EthernetPerformance Evaluation of Soft RoCE over 1 Gigabit Ethernet
Performance Evaluation of Soft RoCE over 1 Gigabit Ethernet
 
Big Data Benchmarking with RDMA solutions
Big Data Benchmarking with RDMA solutions Big Data Benchmarking with RDMA solutions
Big Data Benchmarking with RDMA solutions
 

Plus de IJERD Editor

A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksA Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksIJERD Editor
 
MEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEMEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEIJERD Editor
 
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...IJERD Editor
 
Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’IJERD Editor
 
Reducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignReducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignIJERD Editor
 
Router 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationRouter 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationIJERD Editor
 
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...IJERD Editor
 
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRMitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRIJERD Editor
 
Study on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingStudy on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingIJERD Editor
 
Spyware triggering system by particular string value
Spyware triggering system by particular string valueSpyware triggering system by particular string value
Spyware triggering system by particular string valueIJERD Editor
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...IJERD Editor
 
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeSecure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeIJERD Editor
 
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...IJERD Editor
 
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraGesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraIJERD Editor
 
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...IJERD Editor
 
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...IJERD Editor
 
Moon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingMoon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingIJERD Editor
 
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...IJERD Editor
 
Importance of Measurements in Smart Grid
Importance of Measurements in Smart GridImportance of Measurements in Smart Grid
Importance of Measurements in Smart GridIJERD Editor
 
Study of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderStudy of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderIJERD Editor
 

Plus de IJERD Editor (20)

A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service AttacksA Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
A Novel Method for Prevention of Bandwidth Distributed Denial of Service Attacks
 
MEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACEMEMS MICROPHONE INTERFACE
MEMS MICROPHONE INTERFACE
 
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...Influence of tensile behaviour of slab on the structural Behaviour of shear c...
Influence of tensile behaviour of slab on the structural Behaviour of shear c...
 
Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’Gold prospecting using Remote Sensing ‘A case study of Sudan’
Gold prospecting using Remote Sensing ‘A case study of Sudan’
 
Reducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding DesignReducing Corrosion Rate by Welding Design
Reducing Corrosion Rate by Welding Design
 
Router 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and VerificationRouter 1X3 – RTL Design and Verification
Router 1X3 – RTL Design and Verification
 
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
Active Power Exchange in Distributed Power-Flow Controller (DPFC) At Third Ha...
 
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVRMitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
Mitigation of Voltage Sag/Swell with Fuzzy Control Reduced Rating DVR
 
Study on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive ManufacturingStudy on the Fused Deposition Modelling In Additive Manufacturing
Study on the Fused Deposition Modelling In Additive Manufacturing
 
Spyware triggering system by particular string value
Spyware triggering system by particular string valueSpyware triggering system by particular string value
Spyware triggering system by particular string value
 
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
A Blind Steganalysis on JPEG Gray Level Image Based on Statistical Features a...
 
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid SchemeSecure Image Transmission for Cloud Storage System Using Hybrid Scheme
Secure Image Transmission for Cloud Storage System Using Hybrid Scheme
 
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
Application of Buckley-Leverett Equation in Modeling the Radius of Invasion i...
 
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web CameraGesture Gaming on the World Wide Web Using an Ordinary Web Camera
Gesture Gaming on the World Wide Web Using an Ordinary Web Camera
 
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
Hardware Analysis of Resonant Frequency Converter Using Isolated Circuits And...
 
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
Simulated Analysis of Resonant Frequency Converter Using Different Tank Circu...
 
Moon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF DxingMoon-bounce: A Boon for VHF Dxing
Moon-bounce: A Boon for VHF Dxing
 
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
“MS-Extractor: An Innovative Approach to Extract Microsatellites on „Y‟ Chrom...
 
Importance of Measurements in Smart Grid
Importance of Measurements in Smart GridImportance of Measurements in Smart Grid
Importance of Measurements in Smart Grid
 
Study of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powderStudy of Macro level Properties of SCC using GGBS and Lime stone powder
Study of Macro level Properties of SCC using GGBS and Lime stone powder
 

Dernier

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationSafe Software
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 

Dernier (20)

Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 

International Journal of Engineering Research and Development

  • 1. International Journal of Engineering Research and Development e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com Volume 10, Issue 5 (May 2014), PP.26-35 26 Can Modern Interconnects Improve the Performance of Hadoop Cluster? Performance evaluation of Hadoop on SSD and HDD with IPoIB Piyush Saxena M.Tech (Computer Science and Engineering), Amity School of Engineering and Technology, Amity University, Noida, India +91-9451427546 Abstract:- In today‟s world where Internet is most required and where pentabytes of data is produced per hour, there is a drastic need to speed up the performance and throughput of the cloud system. Traditional cloud systems were not able to give the performance that the storage devices like SSD and HDD were meant to deliver. In the last paper we showed that the hadoop on SSD and HDD did not showed much difference in performance as these were traditionally connected to the processing system that acts as a hindrance to the system. Another reason that could be spotted with the pattern of data access was that there was less of Random Access Memory with low caching resources available. To these issues, another set of experiments were conducted using a highly improved connecting method than the conventional 10 GigE and by implementing Distributed shared memory that can make the access patterns much faster. The improved methods that were considered for the test purposes were IPoIB and RDMA-IB. In this paper we will also present that Modern Interconnects used in Hadoop (MapReduce) with SSD can outperform the traditional Interconnecting technique like 10 GigE networks. In addition, we also demonstrate that the use of sockets or conventional TCP/IP applications can be still used with new technology and with improved throughput and less latency when IBoIP is used. Keywords:- Hadoop, HDFS, SSD, HDD, HiBench, Benchmarking, 10 GigE, IPoIB, RDMA-IB. I. INTRODUCTION TO HADOOP AND HDFS In today‟s digital age, a big measure of data is been processed on the internet. Allotting optimal data processing with advantageous response duration acts the output to the requests by the consumer. There are frequent users that assay to enter the alike data above the web and it is a challenging task for the server to deliver optimal result. The large amount of data the internet has to deal with every day has made conventional solutions extremely uneconomical. There are difficulties like processing large documents split into many disaffiliated sub- tasks that are segmented with the available nodes, and processed in parallel. Due to this, MapReduce and Hadoop came into existence. Hadoop is a free-of-cost, programming architecture that is java-based and supports the processing of large amounts of applications on systems that have thousands of nodes and involves multiple pentabytes of data. The Hadoop Distributed File System helps faster data transfer rates between the nodes and makes the cluster to persist functioning performances uninterrupted in case of node failure. This system actually lowers the risk of complete system failure even when a significant no. of nodes are in-operative.[2] Hadoop was motivated by MapReduce (Fig.1) that was introduced by Google, a software framework in which an application is broken down into numerous small parts. Any of these parts (also called fragments or blocks) can be run on any node in the cluster.[3] MapReduce-based studies have been actively carried out for the efficient processing of big data on hadoop. Hadoop runs on clusters of computers that can handle large amounts of data and support distributed applications.[4] In the last few years, lots of research has been carried out to improve the performance of hadoop. One of the hindrances is the performance issues of the storage device used as it is connected to the system by a slower connecting interface like Bus. Even the difference in the Devices used for storage creates the hindrance.[5] The performance of the Hadoop system is also bound on the type of workload that we consider. This is why we consider HiBench as the standard model for testing Hadoop Distributed File System (HDFS). In this paper, we try to study and evaluate the performance of Hadoop Distributed File System on a Hadoop Cluster system that contains flash memory based SSD (Solid State Drive) and Hard Disk Drive by optimizing each parameter on HiBench.
  • 2. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 27 Technology has advanced fast, and datasets have grown even faster as it is easier to generate and incarcerate data. The large Big Data, are a warehouse of information. The primary challenge in the investigation of Big Data is to conquer the I/O blockage present on modern systems.[6] Lethargic I/O systems overpower the very use of having high end processors. They cannot provide data fast adequate to utilize all of the accessible processing power. Outcome of this is wastage of power and increases in the price of in commission large clusters. An approach is the use of Modern interconnects like IPoIB and RDMA-IB in place of Traditional Interconnects like 10 GigE. Fig 1.) Hadoop MapReduce Architecture II. TRADITIONAL INTERCONNECT 10 GIGE NETWORK 10 gigabit Ethernet is a communication methodology that can give data transfer speeds up to 10 billion bits per second. 10 gigabit Ethernet is also known as 10GE, 10GbE or 10 GigE. It supports full duplex connections that can be connected by network switches and shared medium operation with CSMA/CD.[7] It can work properly with the existing protocols. Since the 10 GigE works in full-duplex method, it doesn‟t need Carrier Sense Multiple Access/Collision Detection protocols that is extremely important as this improves the efficiency and the speed of 10 Gb Ethernet as it can be easily deployed in the existing network, thus giving a cost-efficient methodology that support high-speed, low-latency requirements.[8] 10-Gigabit Ethernet offers distances between physical locations up to 40 kilometers over a single-mode fiber and multi-mode fiber systems. Technically 10 Gb Ethernet is a Layer 1 and Layer 2 protocol that follows the Ethernet attributes like Media Access Control (MAC) protocol, the Ethernet frame format, & min and max frame size. This technology supports both LAN and WAN standards. (Fig 2.) Issues faced in deploying 10 Gb Ethernet are due to the costs of fibre channels, but the benefits received are very large. III. MODERN INTERCONNECT IPOIB NETWORK Infini Band (IB) [9] is a uniform organization regional Network that is applicable in HPC and data centre environments Infini Band Technology has high speed data transfer at a very low latency time. To allow the legacy IP based applications over Internet Protocol based apps over InfiniBand in Data Centers. Internet Protocol over InfiniBand protocol uses an interface on top of Infini band „Verbs‟ Layer that allows the applications running on sockets to use host based TCP/IP protocol stack that is converted into native InfiniBand Verbs that looks invisible to the application. Sockets Direct Protocol (SDP) is a development of the sockets based boundary interface, allows the process to bypass the TCP/IP protocol stack and translate socket based packets into the verbs layer RDMA operations, still maintaining TCP streaming socket symbolism. [10] SDP has the benefits of trespassing software layer that is required in IPoIB. The results of this are SDP has better latency and performance than IPoIB.
  • 3. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 28 The uses of the InfiniBand are in modern computing and high performance computing. The benefits of IBoIP is reducing communication latency as well as providing higher available bandwidth to clients in the local DCN. The administration of network load is of concern in the new networking technologies. Quality of Service provisioning could be used to control the traffic for intra-network loads that can have main concern over the input data stream (IDS). The traffic loads and flexibility in fine tuning of the performance of the network is also a bottle neck for the system wide performance. Such technology would boost the performance of traditional data centers that still work on Ethernet Best Effort Service with low or no requirement for modifying the conventional socket applications. It is important to evaluate the behavior of the H/W level QOS provisioning for InfiniBand network with applications on the optimized socket based protocols. This yields in a step to use of this new technology to harness high-speed interconnects for existing Internet applications. In this paper, we will analyze and see the performance improvements in case of modern interconnects like IPoIB in comparison of the traditional Bus interconnects or 10 GigE hardware. InfiniBand is a prominent cluster interconnecting technology with very low latency and very high performance. Native InfiniBand verbs is the lowest software layer of the InfiniBand network that allows direct user-level access to IB Host Channel Adapter (HCA) resources by omitting the Operating System. At the IB verbs level, a queue pairing form is used for message underneath both Send/Receive and RDMA semantics. InfiniBand needs the user to register the buffer before using it for communication. InfiniBand HCAs has 2 ports that can operate as 4X InfiniBand or 10-GigE. The architecture of HCA includes a stateless offload engine for network interface card (NIC) based protocol processing. Sockets Direct Protocol was designed originally for InfiniBand that has now been redefined as a transport –agnostic protocol for RDMA network based fabrics. It was made known to improve and progress the performance of sockets by using the RDMA protocol of the InfiniBand network. SDP is a byte-stream protocol that is built on TCP stream socket connotations. SDP uses a protocol switch inside the operating system kernel that clearly alternates between kernel TCP/IP stack above IB (IPoIB) along with the SDP above IB (which sidesteps the kernel TCP/IP stack) [11]. SDP acquires bi-form layouts of data interchange. In the buffered-copy arrangement, the socket data is duplicated in a preregistered buffer foregoing the network transfer. In the zero-copy arrangement, the consumer buffer is lucidly registered for broadcasting to bypass data reproduction. (Fig 2.) IV. MODERN INTERCONNECT RDMA-IB InfiniBand Host Channel Adapters (HCA) and further network equipments can be approached by the upper layer software using an interface called Verbs. The verbs interface is a low level communication interface that follows the Queue Pair (or communication end-points) model. Queue pairs are required to establish a channel between the two communicating entities. Each queue pair has a certain number of work queue elements. Upper-level software places a work request on the queue pair that is then processed by the HCA. When a work element is completed, it is placed in the completion queue. Upper level software can detect completion by polling the completion queue. Verbs that are used to transfer data are completely OS-bypassed. (Fig 2.) Fig 2.) Various Interconnect Technologies and architecture
  • 4. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 29 V. TEST BED SYSTEM USED FOR THE ANALYSIS 4 node all 1U servers (Quanta Stack) with 2 Intel Xeon X5670 CPU‟s, each one has 6 cores that is equal to 12 physical cores with 96GB of Memory and Ethernet/ Infiniband as network. Storage Device 100 GB SSD and 2 TB HDD. VI. CLASSIFICATION OF MICRO BENCHMARK WORKLOADS 1.) Sort: It is a representation of a large subset of real world MapReduce jobs that is transforming data from one representation to another. Sort requires an Input Output bound system resource utilization with the data access patterns as equal quantities of data access. The input data is generated using the RandomTextWriter program contained in the Hadoop distribution. Time taken by Reduce stage is twice the time taken by Map stage. (Fig.3.1) [12] Fig 3.1) MapReduce for SORT workload 2.) Word Count: It is also a representation of a large subset of real world MapReduce jobs that is transforming data by extracting a small amount of interesting data from a large data set. Word Count requires a CPU bound system resource utilization with the data access patterns as reducing quantities of data access. The input data is generated using the RandomTextWriter program contained in the Hadoop distribution. Time taken by Reduce stage is nearly the same as the time taken by Map stage. (Fig.3.2) Fig 3.2) MapReduce for WORD COUNT Workload 3.) TeraSort: It sorts 10 billion 100-byte records generated by the TeraGen program contained in the Hadoop distribution. TeraSort requires CPU bound system resource utilization during Map stage and Input Output bound system resource utilization during Reduce stage with the data access patterns as reducing and then growing quantities of data access. Time taken by Reduce stage is 1.5 times the time taken by Map stage. (Fig.3.3) Fig 3.3) MapReduce for TERA SORT workload
  • 5. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 30 VII. PERFORMANCE EVALUATION OF SSD AND HDD ON 10GIGE AND IPOIB For the Performance evaluation and Analysis of the performance of SSD and HDD the considered work loads are Sort, Word Count and Tera Sort on two different workloads viz. 10 GigE and IBoIP. The size of data taken for all the workloads is 6550021992 bytes that is 6.1001GB of data. [13][14][15] (X-Axis -> Percentage ; Y-Axis -> Time in sec) 1) Sort Work Load: Since Sort has an Input Output bound resource utilization it is easily observed that SSD (Fig.4.1) buffers the data much earlier and at a faster rate than HDD (Fig.5.1) that tends to buffer at a constant speed. Due to this reason the SSD had an earlier chance to start off with the Reduce phase as compared to the HDD. It can also be inference from the graduated behavior of the graph that HDD works in a much stabilized manner as compared to the SDD. Over all SSD finishes off its job with the processors 39seconds earlier than the HDD. This proves that the SSD works much faster than HDD in the scenario of Sort Workload. Fig 4.1.) SORT workload on Solid State Drive 10GigE (Blue -> Map Phase; Red -> Reduce Phase) Fig 4.2.) SORT workload on Solid State Drive IPoIB (Blue -> Map Phase; Red -> Reduce Phase) As of the performance change between the Modern Interconnect using IPoIB comapred to the Traditional Interconnect using 10 GigE can be analysed from the benchmarking results of SSD and HDD used on both types of interconnects. The analysis is as follows: For SSD: the level of improvement is an average of 45% with precise improvement of 44% in Map Phase and 46% in Reduce Phase. The Map phase completed at 113sec in case of IBoIP as compared to 212sec in case of 10GigE. The Reduce phase completed at 455sec in IPoIB as compared to 852sec in case of 10GigE. The reduce phase started from 30% of map phase. For HDD: the level of improvement is an average of 27% with precise improvement of 26% in Map Phase and 27% in Reduce Phase. The Map phase completed at 196sec in case of IBoIP as compared to 265sec in case of 10GigE. The Reduce phase completed at 699sec in IPoIB as compared to 953sec in case of 10GigE. The reduce phase started from 28% of map phase.
  • 6. International Journal of Engineering Research and Development e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com Volume 10, Issue 5 (May 2014), PP.26-35 31 Fig 5.1) SORT workload on Hard Disk Drive 10GigE (Blue -> Map Phase; Red -> Reduce Phase) Fig 5.2) SORT workload on Hard Disk Drive IPoIB (Blue -> Map Phase; Red -> Reduce Phase) 2) Word Count Work Load: Since Sort has a CPU bound resource utilization it is easily observed that SSD (Fig.6.1) and HDD (Fig.7.1) both buffers approximately at the same rate but with a little variation in the speed as SSD buffers about 3 seconds faster than HDD. Due to this reason the SSD had an earlier chance to start off with the Reduce phase at 47 seconds as compared to the HDD that starts at 49 seconds. It can also be inferred from the abrupt behavior of the graph that HDD takes a longer time in the reduce phase as compared to the SSd that takes less time. Over all SSD finishes off its job with the processors 6seconds earlier than the HDD that is not a very major time difference. But, still this proves that the SSD works faster than HDD in the scenario of Word Count Workload. Fig 6.1) WORD COUNT Workload on Solid State Device 10GigE
  • 7. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 32 Fig 6.2) WORD COUNT Workload on Solid State Device IPoIB As of the performance change between the Modern Interconnect using IPoIB comapred to the Traditional Interconnect using 10 GigE can be analysed from the benchmarking of SSD and HDD. The analysis is as follows: For SSD: the level of improvement is an average of 46% with precise improvement of 45% in Map Phase and 47% in Reduce Phase. The Map phase completed at 43sec in case of IBoIP as compared to 79sec in case of 10GigE. The Reduce phase completed at 50sec in IPoIB as compared to 94sec in case of 10GigE. The reduce phase started from 44% of map phase. For HDD: the level of improvement is an average of 29% with precise improvement of 26% in Map Phase and 33% in Reduce Phase. The Map phase completed at 64sec in case of IBoIP as compared to 86sec in case of 10GigE. The Reduce phase completed at 67sec in IPoIB as compared to 100sec in case of 10GigE. The reduce phase started from 65% of map phase. Fig 7.1) WORD COUNT Workload on Hard Disk Drive 10GigE 3) TeraSort Work Load: Since Sort has a CPU bound system resource utilization during Map stage and Input Output bound system resource utilization during Reduce stage it is easily observed that SSD (Fig.8.1) buffers the data much earlier 19Sec and at a faster rate than HDD (Fig.9.1) 21Sec that tends to buffer at an abrupt speed. Due to this reason the SSD had an earlier chance to start off with the Reduce at 23sec as compared to the HDD that starts at 24 second. It can also be observed that the reduce phase for SSD and HDD takes equal amount of time i.e. process is independent of SSD or HDD & dependent on processor. SSD finishes its job 1second earlier than the HDD that is not a negligible difference. But, still this proves that the SSD has lower latency than HDD in the scenario of Tera Sort Workload
  • 8. International Journal of Engineering Research and Development e-ISSN: 2278-067X, p-ISSN: 2278-800X, www.ijerd.com Volume 10, Issue 5 (May 2014), PP.26-35 33 Fig 7.2) WORD COUNT Workload on Hard Disk Drive IPoIB Fig 8.1) TERASORT workload on Solid State Drive 10GigE Fig 8.2) TERASORT workload on Solid State Drive IPoIB As of the performance change between the Modern Interconnect using IPoIB comapred to the Traditional Interconnect using 10 GigE can be analysed from the benchmarking results of SSD and HDD used on both types of interconnects. The analysis is as follows: For SSD: the level of improvement is an average of 44% with precise improvement of 41% in Map Phase and 46% in Reduce Phase. The Map phase completed at 14sec in case of IBoIP as compared to 28sec in case of 10GigE. The Reduce phase completed at 25sec in IPoIB as compared to 46sec in case of 10GigE. The reduce phase started from 98% of map phase. For HDD: the level of improvement is an average of 25% with precise improvement of 24% in Map Phase and 26% in Reduce Phase. The Map phase completed at 16sec in case of IBoIP as compared to 21sec in
  • 9. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 34 case of 10GigE. The Reduce phase completed at 34sec in IPoIB as compared to 46sec in case of 10GigE. The reduce phase started from 100% of map phase. Fig 9.1) TERASORT workload on Hard Disk Drive 10GigE Fig 9.2) TERASORT workload on Hard Disk Drive IPoIB VIII. CONCLUSION AND FUTURE SCOPE From the above results and analysis the performance of SSD and HDD is nearly the same for the same Interconnect used, but positive results can be seen for better performance of SSD than HDD with use of IPoIB (Fig 10). Also the difference in the performance is very visible and drastic. So, an observation that can be monitored is that the Map phase in any of the workload is performing well until the random access memory is not consumed or the interconnect technology of the network used is of very high throughput and low latency. This concludes that there is a need to involve a Distributed Shared Memory (DSM) or the need to improve the Interconnect technology for networking from the traditional 10GigE to IBoIP to improve the performance of the SSD and HDD and get better significant results [16]. Another connection technique like InfiniBand using RDMA is used to connect then better performance in terms of latency, speed of access and fault tolerance can be achieved [10]. Fig 10) Comparison of performances of the Interconnect Technologies
  • 10. Can Modern Interconnects Improve the Performance of Hadoop Cluster?... 35 In the future, a model to have DSM as a part should be used to be implemented with the use of InfiniBand on RDMA that supports the use of Verbs and with the technology of Optical Fibers to achieve faster performances and Remote Dynamic Memory Access. [17][18][19] The modern interconnects like IPoIB and RDMA-IB has got a lot of potential and their powers need to be researched and harnessed on in the future.[20] REFERENCES [1] Hadoop Home: http://hadoop.apache.org/ [2] Jacky Wu, "Hadoop HDFS & MapReduce", Help Guidelines LSA Lab, NTHU, Taiwan 2013.8.7. [3] Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung, "The Google File System", SOSP‟03, October 19–22, 2003, Bolton Landing, New York, USA. Copyright 2003 ACM. [4] Piyush Saxena, Satyajit Padhy, Praveen Kumar, “Optimizing Parallel Data Processing With Dynamic Resource Allocation”, International Conference on Reliability, Infocom Technologies and Optimization.,pp. 735-739, Jan. 29-31, 2013. [5] Jeffrey Dean and Sanjay Ghemawat, "MapReduce: Simplified Data Processing on Large Clusters", OSDI‟04, 2004, Bolton Landing, New York, USA. Copyright 2004 ACM. [6] “Can High-Performance Interconnects Benefit Hadoop Distributed File System? ”, Lab Resources of Network-Based Computing Laboratory, Department of Computer Science and Engineering, The Ohio State University, USA. [7] N. S. Islam, M. W. Rahman, J. Jose, R. Rajachandrasekar, H. Wang, H. Subramoni, C. Murthy, and D. K. Panda, “High Performance RDMA-based Design of HDFS over InfiniBand”, Research Resources of Department of Computer Science and Engineering and IBM T.J Watson Research Center, The Ohio State University Yorktown Heights, NY. November 10-16, 2012, Salt Lake City, Utah, USA. [8] Open Fabrics Enterprise Distribution, http://www.openfabrics.org/. [9] X. Ding, S. Jiang, F. Chen, K. Davis, and X. Zhang. DiskSeen: Exploiting Disk Layout and Access History to Enhance I/O Prefetch. In Proceedings of USENIX07, 2007. [10] InfiniBand Trade Association Home: http://www.infinibandta.org/ [11] C. Gniady, Y. C. Hu, and Y.-H. Lu. Program Counter Based Techniques for Dynamic Power Management. In Proceedings of the 10th International Symposium on High Performance Computer Architecture, HPCA ‟04, Washington, DC, USA, 2004. IEEE Computer Society. [12] Shengsheng Huang, Jie Huang, Jinquan Dai, Tao Xie, and Bo Huang, "The HiBench Benchmark Suite: Characterization of the MapReduce-Based Data Analysis", ICDE Workshops'10, Oct. 2010, 2010 IEEE. [13] Lan Yi, “Experience with HiBench: From Micro-Benchmarks toward End-to-End Pipelines”, WBDB 2013 Workshop Presentation, Intel China Software Center, 2013.07.16. [14] Dominique Heger, “Hadoop Performance Tuning - A Pragmatic & Iterative Approach”, Research details by DHTechnologies ‐ www.dhtusa.com, 2013. [15] Jason Dai, “Toward Efficient Provisioning and Performance Tuning for Hadoop”, Apache Asia Roadshow 2010, Intel China Software Center, June 2010. [16] Remote Direct Memory Access : http://en.wikipedia.org/wiki/Remote_direct_memory_access [17] Liang Ming , Dan Feng, Fang Wang, Qi Chen, Yang Li, Yong Wan, Jun Zhou, "A Performance Enhanced User-space Remote Procedure Call on InfiniBand*", Photonics and Optolectronics Meetings (POEM)., 2011. [18] Fan Liang, Chen Feng, Xiaoyi Lu, Zhiwei Xu, "Performance Benefits of DataMPI:A Case Study with BigDataBench", ACM SOFT BPOE ‟14, Mar 1, 2014, Salt Lake City, Utah, USA. [19] Xiaoyi Lu, Nusrat S. Islam, Md. Wasi-ur-Rahman, Jithin Jose, Hari Subramoni, Hao Wang, and Dhabaleswar K. (DK) Panda, "High-Performance Design of Hadoop RPC with RDMA over InfiniBand", National Science Foundation grants #OCI-0926691, #OCI-1148371 and #CCF-1213084, 2013 IEEE. [20] K. Gupta, R. Jain, H. Pucha, P. Sarkar, and D. Subhraveti, “Scaling Highly-Parallel Data-Intensive Supercomputing Applications on a Parallel Clustered Filesystem,” in The SC10 Storage Challenge. Piyush Saxena Pursuing Master of Technology in Computer Science and Engineering from Amity School of Engineering and Technology, Amity University Uttar Pradesh, Noida, India, Area of Interest: Cloud Computing, Data Mining and Warehousing and Soft Computing.