Efficient Analysis of Big Data Using
Map Reduce Framework
Presented by
Rajshekhar
(1BY14SCS15)
Under the guidance of
Guru Prasad S
Outline
Abstract
Introduction
Goals & Challenges Analyzing Data
Applications
HDFS
Big Data Analytics
MapReduce
Conclusion
References
Abstract
• Data now streams from daily life: from phones, credit cards, televisions, and computers (especially from the Internet).
• The data flows fast: an estimated five exabytes of data are generated every day.
• This huge collection of data is known as big data: data that is too diverse, fast-changing, and massive.
• It is difficult for the current computing infrastructure to handle big data.
• To overcome this drawback, Google introduced the “MapReduce” framework.
Introduction
• Big data involves large and complex datasets that can be structured, semi-structured, or unstructured.
• Big data is so large that it is difficult to process with traditional database and other software techniques. How can such large datasets be explored and analyzed?
• Analyzing big data is one of the challenges facing researchers and academicians, and it needs special analysis techniques.
• Hadoop MapReduce is a technique for analyzing big data. Hadoop programs perform two distinct tasks: MAP and REDUCE.
3 C’s in Big Data
Goals and Challenges
Goals:
The main goals of high-dimensional data analysis are:
• Developing effective methods that can accurately predict future observations.
• Exploring the hidden structures of each subpopulation of the data.
• Extracting important common features across many subpopulations.
Goals and Challenges (continued)
Challenges:
A. Meeting the need for speed.
B. Understanding the data.
C. Addressing data quality.
D. Displaying meaningful results.
E. Dealing with outliers.
Applications
The Aadhaar project by the Government of India uses Hadoop.
New applications that are becoming possible in the Big Data era include:
A. Personalized services.
B. Internet security.
C. Personalized medicine.
HDFS (Hadoop Distributed File System)
• Designed to hold very large amounts of data (petabytes or even zettabytes) and to provide high-throughput access to this information.
Characteristics:
• Fault tolerant.
• Runs on commodity hardware.
• Able to handle large datasets.
• Master/slave paradigm.
• Write-once file access only.
HDFS components:
• NameNode.
• DataNode.
• Secondary NameNode.
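The roles of the components listed above can be illustrated with a toy in-memory model. This is only a sketch under stated assumptions: the names `ToyNameNode` and `ToyDataNode`, the tiny block size, and the round-robin replica placement are hypothetical and are not the HDFS API or its real placement policy.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataNode:
    """Hypothetical stand-in for a DataNode: stores block contents by block id."""
    name: str
    blocks: dict = field(default_factory=dict)


class ToyNameNode:
    """Hypothetical stand-in for the NameNode: keeps metadata only, never file data."""

    def __init__(self, datanodes, block_size=4, replication=2):
        self.datanodes = datanodes
        self.block_size = block_size      # real HDFS blocks are far larger
        self.replication = replication
        self.block_map = {}               # filename -> [(block_id, [datanode names])]

    def write(self, filename, data):
        # Write-once: refuse to overwrite an existing file.
        if filename in self.block_map:
            raise FileExistsError(filename)
        placements = []
        for i in range(0, len(data), self.block_size):
            index = i // self.block_size
            block_id = f"{filename}#{index}"
            # Round-robin placement (an assumption, not the real HDFS policy).
            targets = [
                self.datanodes[(index + r) % len(self.datanodes)]
                for r in range(self.replication)
            ]
            for node in targets:
                node.blocks[block_id] = data[i:i + self.block_size]
            placements.append((block_id, [node.name for node in targets]))
        self.block_map[filename] = placements

    def read(self, filename, failed=()):
        # Fault tolerance: use any replica that lives on a healthy node.
        by_name = {node.name: node for node in self.datanodes}
        parts = []
        for block_id, holders in self.block_map[filename]:
            healthy = [h for h in holders if h not in failed]
            if not healthy:
                raise IOError(f"all replicas of {block_id} are unavailable")
            parts.append(by_name[healthy[0]].blocks[block_id])
        return "".join(parts)


nodes = [ToyDataNode("dn1"), ToyDataNode("dn2"), ToyDataNode("dn3")]
namenode = ToyNameNode(nodes)
namenode.write("log.txt", "hello big data")
print(namenode.read("log.txt", failed={"dn1"}))
```

Because every block is stored on two of the three toy nodes, the file can still be read in full when any single node is marked as failed.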
HDFS (continued)
Fig: HDFS architecture
Big Data Analytics
• “The process of collecting, organizing, and analyzing large sets of data” to discover patterns and other useful information.
• It also helps identify which data is important.
• Big data analysts basically want the knowledge that comes from analyzing the data.
MapReduce
• Invented by Google.
• A programming model for processing large datasets distributed across a large cluster.
• MapReduce is the heart of Hadoop.
• Uses the concept of divide and conquer.
• Two methods: map() and reduce().
• map(): sorting and filtering.
• reduce(): counting and producing the result.
MapReduce (continued)
Fig: MapReduce architecture
MapReduce algorithms:
• MapReduce is a programming model designed for processing large volumes of data in parallel by dividing the work into a set of independent tasks.
For example, Twitter data was processed on different servers, partitioned by month.
Hadoop provides an open-source implementation of MapReduce.
• A job is a combination of two Java functions: Mapper() and Reducer().
• Example: checking the popularity of words in text.
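Partitioning work into independent tasks, as in the by-month example above, might look like the following sketch. The sample tweets are invented for illustration, the helper names `partition_by_month` and `count_words` are hypothetical, and threads stand in for the separate servers of a real cluster.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical sample records: (ISO date, tweet text).
tweets = [
    ("2014-01-03", "hadoop is fun"),
    ("2014-01-19", "big data everywhere"),
    ("2014-02-07", "mapreduce in action"),
    ("2014-03-11", "hdfs stores blocks"),
    ("2014-03-30", "reduce the noise"),
]


def partition_by_month(records):
    """Group records by their YYYY-MM prefix so each group is an independent task."""
    groups = defaultdict(list)
    for date, text in records:
        groups[date[:7]].append(text)
    return dict(groups)


def count_words(task):
    """Independent task: count the words in one month's tweets."""
    month, texts = task
    return month, sum(len(text.split()) for text in texts)


partitions = partition_by_month(tweets)
# Threads stand in for the separate servers of a real cluster.
with ThreadPoolExecutor(max_workers=3) as pool:
    words_per_month = dict(pool.map(count_words, partitions.items()))
print(words_per_month)
```

No task reads another month's data, so the partitions can be processed in any order or all at once.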
MapReduce algorithms (continued)
The Mapper function maps the split files and provides input to the Reducer.
Mapper(filename, file-contents):
    for each word in file-contents:
        emit(word, 1)
The Reducer function combines the input provided by the Mapper and produces the output.
Reducer(word, values):
    sum = 0
    for each value in values:
        sum = sum + value
    emit(word, sum)
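The Mapper and Reducer pseudocode above can be turned into a small runnable simulation. This is a single-process sketch, not Hadoop's Java API: the `shuffle` helper stands in for the grouping the framework performs between the two phases, the input splits are invented, and lowercasing words is an added assumption.

```python
from collections import defaultdict


def mapper(filename, contents):
    """Emit a (word, 1) pair for every word in one input split."""
    for word in contents.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for word, count in pairs:
        grouped[word].append(count)
    return grouped


def reducer(word, values):
    """Sum the counts collected for one word."""
    return word, sum(values)


# Hypothetical input splits: filename -> contents.
splits = {
    "part-1.txt": "big data needs big tools",
    "part-2.txt": "map and reduce tame big data",
}

pairs = [pair for name, text in splits.items() for pair in mapper(name, text)]
counts = dict(reducer(word, values) for word, values in shuffle(pairs).items())
print(counts["big"], counts["data"], counts["map"])
```

The word with the highest count is the most "popular" one in the input, which is the use case named on the previous slide.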
Conclusion
• MapReduce is simple but provides good scalability and fault tolerance for massive data processing.
• Analysis tools like MapReduce over Hadoop can support faster advances in many scientific disciplines and improve the profitability and success of many enterprises.
• MapReduce has received a lot of attention in many fields, including data mining, information retrieval, image retrieval, and pattern recognition.
References
[1] Hadoop, “Powered by Hadoop”, http://wiki.apache.org/hadoop/Poweredby.
[2] Hadoop Tutorial, Yahoo Inc., https://developer.yahoo.com/hadoop/tutorial/index.html.
[3] Apache: Apache Hadoop, http://hadoop.apache.org
[4] Hadoop Distributed File System (HDFS), http://hortonworks.com/hadoop/hdfs/
[5] Jianqing Fan, Fang Han and Han Liu, Challenges of Big Data analysis, National Science Review, Advance Access published February 2014.
[6] Hadoop MapReduce, http://hadooptutorial.wikispaces.com/MapReduce
[7] Jens Dittrich and Jorge-Arnulfo Quiané-Ruiz, Efficient Big Data Processing in Hadoop MapReduce.
End of Presentation.