Unit 3 intro.pptx
1. HADOOP
• Using the solution provided by Google, Doug Cutting and his team developed an open
source project called Hadoop.
• Hadoop runs applications using the MapReduce algorithm, where the data is split into
pieces that are processed in parallel with one another.
• In short, Hadoop is used to develop applications that can perform complete statistical
analysis on huge amounts of data.
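The MapReduce idea mentioned above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on a word-count task, not Hadoop's actual API; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for the example.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(mapped_chunks):
    """Group all emitted values by key, between the map and reduce steps."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # Each chunk is mapped independently, so the map calls can run in parallel.
    with ThreadPoolExecutor() as pool:
        mapped = list(pool.map(map_phase, chunks))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input: two chunks standing in for pieces of a large dataset.
chunks = ["big data needs big storage", "hadoop stores big data"]
counts = word_count(chunks)
```

Here a thread pool on one machine stands in for the cluster; in Hadoop the same pattern is spread across many machines.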
2. HADOOP
• Hadoop is an Apache open source framework written in Java that allows distributed
processing of large datasets across clusters of computers using simple programming
models.
• A Hadoop application works in an environment that provides distributed storage
and computation across clusters of computers.
• Hadoop is designed to scale up from a single server to thousands of machines, each
offering local computation and storage.
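To make "distributed storage across a cluster" concrete, the sketch below splits a file into fixed-size blocks and assigns each block to several machines. It is a simplified illustration under stated assumptions: the block size, replica count, node names, and round-robin placement rule are all hypothetical choices for the example and do not describe Hadoop's real placement policy.

```python
def split_into_blocks(data: bytes, block_size: int):
    """Cut the data into consecutive blocks of at most block_size bytes."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_blocks(num_blocks: int, nodes: list, replicas: int):
    """Assign each block to `replicas` consecutive nodes, round-robin.

    Assumed placement rule for illustration only.
    """
    if replicas > len(nodes):
        raise ValueError("not enough nodes for the requested number of replicas")
    placement = {}
    for block_id in range(num_blocks):
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replicas)
        ]
    return placement


# Hypothetical example: a 10-byte "file", 4-byte blocks, 4 nodes, 2 copies each.
data = b"x" * 10
blocks = split_into_blocks(data, block_size=4)
nodes = ["node1", "node2", "node3", "node4"]
placement = place_blocks(len(blocks), nodes, replicas=2)
```

Adding more entries to nodes spreads the same blocks over more machines without changing the code, which is the sense in which the design scales from one server to many.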
5. Basics of Hadoop
• Hadoop is an open source software framework for storing data and running applications on
clusters of commodity hardware.
• It provides massive storage for any kind of data, enormous processing power, and the ability to
handle virtually limitless concurrent tasks or jobs.
• Just as data resides in the local file system of a personal computer, in Hadoop data resides in a
distributed file system called the Hadoop Distributed File System (HDFS).
• The processing model is based on the "data locality" concept, wherein computational logic is sent to
the cluster nodes (servers) containing the data.
• This computational logic is nothing but a compiled version of a program written in a high-level
language such as Java.
• Such a program processes data stored in Hadoop HDFS.
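The data locality idea above can be sketched as a small scheduling routine: for each block of data, prefer to run the task on a node that already stores that block, and only fall back to another node when none is available. This is a minimal sketch, not Hadoop's scheduler; the function name schedule_tasks, the block and node names, and the fallback rule are assumptions made for illustration.

```python
def schedule_tasks(block_locations: dict, free_nodes: set):
    """Choose a node for each block's task, preferring nodes that hold the block.

    Returns {block: (node, "local" or "remote")}. For simplicity, a node may
    be given more than one task.
    """
    assignments = {}
    for block, holders in block_locations.items():
        local_candidates = [node for node in holders if node in free_nodes]
        if local_candidates:
            # The program is sent to the data: no block transfer is needed.
            assignments[block] = (local_candidates[0], "local")
        else:
            # No holder is free, so the block would have to travel over the network.
            assignments[block] = (sorted(free_nodes)[0], "remote")
    return assignments


# Hypothetical cluster state.
block_locations = {
    "block-0": ["node1", "node2"],
    "block-1": ["node3"],
    "block-2": ["node4"],
}
free_nodes = {"node1", "node3", "node5"}
assignments = schedule_tasks(block_locations, free_nodes)
```

In this example only block-2 is processed remotely, because its single holder (node4) is not free. Every other task runs where its data already lives, which keeps network traffic low.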
6. Advantages and Disadvantages of Hadoop
Advantages
• Varied data sources
• Cost-effective
• Performance
• Fault-tolerant
• Highly available
• Low network traffic
• High throughput
• Open source
• Scalable
• Ease of use
• Compatibility
• Multiple languages supported
Disadvantages
• Issue with small files
• Vulnerable by nature
• Processing overhead
• Supports only batch processing
• Iterative processing
• Security