Sqoop Tutorial
What’s in it for you?
Need for Sqoop
What is Sqoop?
Sqoop Features
Sqoop Architecture
Sqoop Import
Sqoop Export
Sqoop Processing
Demo on Sqoop
Need for Sqoop
Processing huge volumes of data requires loading data from diverse sources into Hadoop clusters.
This process of loading data from heterogeneous sources comes with a set of challenges.
Challenges
1. Maintaining data consistency
2. Ensuring efficient utilization of resources
3. Loading bulk data to Hadoop was not possible
4. Loading data using scripts was slow
Solution
Sqoop overcame these challenges of the traditional approach and made it easy to load bulk data from an RDBMS into Hadoop.
What is Sqoop?
Sqoop is a tool for transferring bulk data between Hadoop and external datastores such as relational databases (for example, MS SQL Server and MySQL).
SQOOP = SQL + HADOOP
Import: data moves from the RDBMS into Hadoop.
Export: data moves from Hadoop back to the RDBMS.
Sqoop Features
1. Parallel import/export: Sqoop uses the YARN framework to import and export data, which provides fault tolerance on top of parallelism.
2. Import results of SQL query: Sqoop can import the result returned by an SQL query into HDFS.
3. Connectors for all major RDBMSs: Sqoop provides connectors for multiple relational database management systems (RDBMS), such as MySQL and MS SQL Server.
4. Kerberos security integration: Sqoop supports Kerberos, a computer network authentication protocol that allows nodes communicating over a non-secure network to prove their identity to one another in a secure manner.
5. Full and incremental load: Sqoop can load a whole table or parts of a table with a single command, so it supports both full and incremental loads.
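The query import and incremental load features can be sketched as command lines. The Python snippet below only builds and prints the argument lists; it does not run Sqoop. The host, database, user, table, and directory names are hypothetical, and authentication (for example `-P` or `--password-file`) is left out for brevity. It assumes Sqoop 1.x flag names.

```python
import shlex


def query_import_args(connect, username, query, split_by, target_dir):
    """Build an argv list for importing the result of a free-form SQL query."""
    # Sqoop replaces the $CONDITIONS token with a per-mapper range predicate,
    # so a free-form query is expected to contain it.
    if "$CONDITIONS" not in query:
        raise ValueError("free-form query must contain $CONDITIONS")
    return [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "--query", query,
        "--split-by", split_by,
        "--target-dir", target_dir,
    ]


def incremental_import_args(connect, username, table, check_column, last_value):
    """Build an argv list that imports only rows added since last_value."""
    return [
        "sqoop", "import",
        "--connect", connect,
        "--username", username,
        "--table", table,
        "--incremental", "append",
        "--check-column", check_column,
        "--last-value", str(last_value),
    ]


# Hypothetical connection details, for illustration only.
CONNECT = "jdbc:mysql://db.example.com/sales"

query_cmd = query_import_args(
    CONNECT, "etl_user",
    "SELECT id, total FROM orders WHERE $CONDITIONS",
    "id", "/data/orders_query",
)
incr_cmd = incremental_import_args(CONNECT, "etl_user", "orders", "id", 1000)

print(shlex.join(query_cmd))
print(shlex.join(incr_cmd))
```

A full load is the same table import without the three `--incremental`, `--check-column`, and `--last-value` arguments.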
Sqoop Architecture
Client: submits the import or export command.
Connectors: Sqoop fetches data from different datastores (relational databases, enterprise data warehouses, document-based systems) through a connector for each type: a connector for RDBMS, a connector for data warehouses, and a connector for document-based systems. Connectors help in working with a range of popular databases.
Map tasks (import): multiple mappers perform map tasks to load the data into HDFS, HBase, or Hive.
Map tasks (export): similarly, multiple map tasks export the data from HDFS to the RDBMS when the Sqoop export command is used.
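As a toy illustration of the connector idea, the sketch below picks a connector from the scheme of the JDBC connect string. This is not Sqoop's actual lookup code: the registry and its labels are invented for the example, and the `override` parameter only loosely mirrors naming a class explicitly on the command line.

```python
# Hypothetical registry: JDBC sub-protocol -> connector label (illustrative names only).
CONNECTORS = {
    "mysql": "RDBMS connector (MySQL)",
    "sqlserver": "RDBMS connector (MS SQL Server)",
}


def pick_connector(jdbc_uri, override=None):
    """Return the connector label for a JDBC connect string."""
    if override is not None:
        # An explicitly named connector wins over scheme-based lookup.
        return override
    if not jdbc_uri.startswith("jdbc:"):
        raise ValueError("not a JDBC connect string: " + jdbc_uri)
    # "jdbc:mysql://host/db" -> ["jdbc", "mysql", "//host/db"]
    scheme = jdbc_uri.split(":", 2)[1]
    if scheme not in CONNECTORS:
        raise LookupError("no connector registered for scheme: " + scheme)
    return CONNECTORS[scheme]


print(pick_connector("jdbc:mysql://db.example.com/sales"))
print(pick_connector("jdbc:sqlserver://db.example.com:1433;databaseName=sales"))
```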
Sqoop Import
Diagram: the Sqoop job reads from the RDBMS data store (folders) and runs map tasks in the Hadoop cluster that write to HDFS storage.
1. Gathers metadata: Sqoop introspects the database to gather metadata (primary key information).
2. Submits map-only job: Sqoop divides the input dataset into splits and uses individual map tasks to push the splits to HDFS.
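The splitting step can be sketched as follows, assuming an integer primary key whose minimum and maximum values are already known from the metadata step. This is a simplified model of range splitting, not Sqoop's own splitter; the column name and key range are hypothetical.

```python
def split_ranges(lo, hi, num_mappers):
    """Divide the inclusive key range [lo, hi] into contiguous per-mapper ranges."""
    if hi < lo:
        raise ValueError("empty key range")
    total = hi - lo + 1
    n = min(num_mappers, total)      # never create more splits than keys
    base, extra = divmod(total, n)   # the first `extra` splits get one more key
    ranges = []
    start = lo
    for i in range(n):
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def split_predicates(column, lo, hi, num_mappers):
    """Return one WHERE predicate per map task."""
    return [
        f"{column} >= {a} AND {column} <= {b}"
        for a, b in split_ranges(lo, hi, num_mappers)
    ]


# Hypothetical table with ids 1..100, imported by 4 map tasks.
for predicate in split_predicates("id", 1, 100, 4):
    print(predicate)
```

Each map task then reads only the rows matching its predicate, so the tasks can run in parallel without overlapping.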
Sqoop Export
Diagram: the Sqoop job runs map tasks in the Hadoop cluster that read from HDFS storage and write to the RDBMS data store (folders).
1. Gathers metadata: Sqoop introspects the database to gather metadata (primary key information).
2. Submits map-only job: Sqoop divides the input dataset into splits and uses individual map tasks to push the splits to the RDBMS. In this way Sqoop exports Hadoop files back to RDBMS tables.
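The export flow can be imitated locally. In the sketch below an in-memory SQLite database stands in for the RDBMS and two Python lists stand in for HDFS file splits; each call to `export_split` plays the role of one map task. The table, columns, and records are made up, and real Sqoop runs these tasks in parallel over JDBC rather than sequentially.

```python
import sqlite3


def export_split(conn, table, columns, lines, delimiter=","):
    """Insert one split of delimited text records into the target table."""
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    rows = [line.rstrip("\n").split(delimiter) for line in lines]
    conn.executemany(sql, rows)
    return len(rows)


# Stand-in for the RDBMS: the target table must already exist before export.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT)")

# Stand-in for two HDFS splits, one per map task.
splits = [
    ["1,Ann\n", "2,Bob\n"],
    ["3,Cy\n"],
]
exported = sum(export_split(conn, "employees", ["id", "name"], s) for s in splits)
conn.commit()

print(exported, "rows exported")
```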
Sqoop Import
$ sqoop import (generic args) (import args)
$ sqoop-import (generic args) (import args)
Argument                              Description
--connect <jdbc-uri>                  Specify JDBC connect string
--connection-manager <class-name>     Specify connection manager class to use
--driver <class-name>                 Manually specify JDBC driver class to use
--hadoop-mapred-home <dir>            Override $HADOOP_MAPRED_HOME
--username <username>                 Set authentication username
--help                                Print usage instructions
Sqoop Export
$ sqoop export (generic args) (export args)
$ sqoop-export (generic args) (export args)
Argument                              Description
--connect <jdbc-uri>                  Specify JDBC connect string
--connection-manager <class-name>     Specify connection manager class to use
--driver <class-name>                 Manually specify JDBC driver class to use
--hadoop-mapred-home <dir>            Override $HADOOP_MAPRED_HOME
--username <username>                 Set authentication username
--help                                Print usage instructions
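The import and export synopses share the same shape: the tool name, then generic Hadoop arguments, then tool arguments. The helper below is a minimal sketch that assembles such a command line without running it. The helper name, job name, connection string, user, tables, and directories are all hypothetical. Arguments such as `--table`, `--target-dir`, `--num-mappers`, and `--export-dir` are not in the tables above and are assumed from Sqoop 1.x.

```python
import shlex


def sqoop_argv(tool, generic_args, tool_args):
    """Assemble `sqoop <tool> (generic args) (tool args)` as an argv list."""
    if tool not in ("import", "export"):
        raise ValueError("unsupported tool: " + tool)
    argv = ["sqoop", tool]
    # Generic Hadoop options such as -D key=value go before the tool arguments.
    for key, value in generic_args.items():
        argv += ["-D", f"{key}={value}"]
    # Tool arguments are emitted as --name value pairs, in insertion order.
    for name, value in tool_args.items():
        argv += [f"--{name}", str(value)]
    return argv


# Hypothetical connection details shared by both commands.
common = {"connect": "jdbc:mysql://db.example.com/sales", "username": "etl_user"}

import_cmd = sqoop_argv(
    "import",
    {"mapreduce.job.name": "orders-import"},
    {**common, "table": "orders", "target-dir": "/data/orders", "num-mappers": 4},
)
export_cmd = sqoop_argv(
    "export",
    {},
    {**common, "table": "orders_summary", "export-dir": "/data/orders_summary"},
)

print(shlex.join(import_cmd))
print(shlex.join(export_cmd))
```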
Sqoop Processing
1. Sqoop runs in the Hadoop cluster.
2. It imports data from RDBMS / NoSQL databases to HDFS.
3. It uses mappers to slice the incoming data into multiple splits and load the data into HDFS.
4. It exports data back into the RDBMS while making sure that the schema of the data in the database is maintained.
Demo on Sqoop
Sqoop Hadoop Tutorial | Apache Sqoop Tutorial | Sqoop Import Data From MySQL to HDFS | Simplilearn

