Duration: 3 days
Format: Instructor-led Classroom Training
Hadoop (2.0) Development
Description
Companies around the world today find it increasingly difficult to organize and
manage large volumes of data. Hadoop has emerged as the most efficient data
platform for companies working with big data, and is an integral part of storing,
handling and retrieving enormous amounts of data in a variety of applications.
Hadoop helps to run deep analytics which cannot be effectively handled by a
database engine.
Big enterprises around the world have found Hadoop to be a game changer in their
Big Data management, and as more companies embrace this powerful technology
the demand for Hadoop Developers is also growing. By learning how to harness the
power of Hadoop 2.0 to manipulate, analyse and perform computations on Big
Data, you will be paving the way for an enriching and financially rewarding career as
an expert Hadoop developer.
Our three-day Hadoop 2.0 Developer training course will teach you the technical
aspects of Apache Hadoop™, and you will obtain a deeper understanding of the
power of Hadoop™. Our experienced trainers will guide you through the
development of applications and analyses of Big Data, and you will be able to
comprehend the key concepts required to create robust big data processing
applications. Successful candidates will earn the credential of Hadoop Professional,
and will be capable of handling and analysing terabyte-scale data successfully
using MapReduce.
Prerequisites
There are no prerequisites for taking the
course. Prior knowledge of Hadoop is not
required, but a basic knowledge of
software development using Java, other
programming languages and databases
will be helpful.
Who Can Attend
• Architects and developers who design,
develop and maintain Hadoop-based
solutions
• Data Analysts, BI Analysts, BI Developers,
SAS Developers and related profiles who
analyze Big Data in a Hadoop environment
• Consultants who are actively involved in a
Hadoop Project
• Experienced Java software engineers who
need to understand and develop Java
MapReduce applications for Hadoop 2.0.
Course Structure
Phase 1: Hadoop 2.0 Fundamentals (12 Hours)
Big Data
Introduction to Big Data, Big Data in Advertising, Banking,
Telecom, eCommerce, Healthcare and Defense.
Processing options including Hadoop
Hadoop
Introduction, How Hadoop Works, HDFS, MapReduce and YARN
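To give a feel for the MapReduce model covered here, the word-count flow can be sketched in plain Java (the class name and sample data below are ours, and no Hadoop dependency is used): map emits (word, 1) pairs, the shuffle groups them by key, and reduce sums the counts.

```java
import java.util.*;
import java.util.stream.*;

// Plain-Java sketch of the MapReduce word-count flow (illustrative only).
public class WordCountSketch {
    // "Map" phase: one (word, 1) pair per token in the input line.
    static List<Map.Entry<String, Integer>> map(String line) {
        return Arrays.stream(line.toLowerCase().split("\\s+"))
                .filter(w -> !w.isEmpty())
                .map(w -> Map.entry(w, 1))
                .collect(Collectors.toList());
    }

    // "Shuffle" + "Reduce": group pairs by key and sum the values.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        return pairs.stream().collect(Collectors.groupingBy(
                Map.Entry::getKey, Collectors.summingInt(Map.Entry::getValue)));
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> mapped = new ArrayList<>();
        for (String line : List.of("big data big hadoop", "hadoop big"))
            mapped.addAll(map(line));
        System.out.println(reduce(mapped).get("big")); // 3
    }
}
```

In a real Hadoop job the same two functions are written as Mapper and Reducer subclasses, and the framework handles the shuffle and the distribution across the cluster.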
Hadoop Ecosystem
Sqoop, Oozie, Pig, Hive, Flume
Hadoop Hands On
Running HDFS commands, running a MapReduce program,
Sqoop Import and Sqoop Export
Creating and Querying Hive tables
Evaluation Test
Bonus:
Setting up Hadoop 1.0 on a single-node cluster (manual)
Setting up Hadoop 2.0 on a single-node cluster (manual)
Multi-node setup walkthrough (manual)
Phase 2: Hadoop Development (8 hours)
Advanced MapReduce
MapReduce Code Walkthrough, ToolRunner, MRUnit, Distributed
Cache, Combiner, Partitioner, Setup and Cleanup
methods, Using the Java API to access HDFS
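Of the topics above, the partitioner is the one that decides which reducer receives each key. As a rough plain-Java sketch (the class name is ours), Hadoop's default HashPartitioner masks off the sign bit of the key's hash code and takes it modulo the number of reduce tasks, which guarantees that every occurrence of a key lands on the same reducer:

```java
// Plain-Java sketch of Hadoop's default HashPartitioner logic.
public class PartitionerSketch {
    static int partition(String key, int numReducers) {
        // Mask the sign bit so the result is never negative, then mod.
        return (key.hashCode() & Integer.MAX_VALUE) % numReducers;
    }

    public static void main(String[] args) {
        // Same key, same reducer -- a precondition for correct aggregation.
        System.out.println(partition("hadoop", 4) == partition("hadoop", 4)); // true
    }
}
```

In an actual job this logic lives in a Partitioner subclass registered via Job.setPartitionerClass.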
Joins Using MapReduce
Map and Reduce Side joins
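The reduce-side join can be pictured in plain Java (the class name and sample data are illustrative; no Hadoop dependency): mappers tag each record with its source table, the shuffle groups records by join key, and the reducer pairs the tagged values per key.

```java
import java.util.*;

// Plain-Java sketch of a reduce-side join: "U:" tags user records,
// "O:" tags order records; the reducer cross-pairs them per join key.
public class ReduceSideJoinSketch {
    static List<String> reduceJoin(Map<Integer, List<String>> grouped) {
        List<String> joined = new ArrayList<>();
        for (var e : grouped.entrySet()) {
            List<String> users = new ArrayList<>(), orders = new ArrayList<>();
            for (String v : e.getValue())           // split values by source tag
                (v.startsWith("U:") ? users : orders).add(v.substring(2));
            for (String u : users)                  // pair users with orders
                for (String o : orders)
                    joined.add(e.getKey() + "," + u + "," + o);
        }
        return joined;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> grouped = new TreeMap<>();
        grouped.put(1, List.of("U:alice", "O:book", "O:pen"));
        grouped.put(2, List.of("U:bob")); // no orders -> no output row
        System.out.println(reduceJoin(grouped)); // [1,alice,book, 1,alice,pen]
    }
}
```

A map-side join avoids this shuffle entirely by loading the smaller table into memory on each mapper (typically via the Distributed Cache), at the cost of requiring one side to fit in memory.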
Custom Types
Input and Output Types in MapReduce, Custom
Input and Output Data types, Multiple Reducer MR program,
Zero Reducer Mapper Program
Advanced MapReduce Hands On
MRUnit hands on, Distributed Cache hands on,
Partitioner hands on, Combiner hands on
Accessing files using the HDFS API hands on
Map Side joins hands on and Reduce Side joins hands on
MapReduce Design Patterns:
- Searching
- Sorting
- Filtering
- Inverted Index
- TF-IDF
- Word Co-occurrence
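One of these patterns, the inverted index, can be sketched in plain Java (document IDs and text below are made up; no Hadoop dependency): map emits (word, docId) pairs and reduce collects the set of documents containing each word.

```java
import java.util.*;

// Plain-Java sketch of the inverted-index MapReduce pattern.
public class InvertedIndexSketch {
    static Map<String, Set<Integer>> index(Map<Integer, String> docs) {
        Map<String, Set<Integer>> idx = new TreeMap<>();
        for (var doc : docs.entrySet())
            // "Map": emit (word, docId); "Reduce": union per word.
            for (String word : doc.getValue().toLowerCase().split("\\s+"))
                idx.computeIfAbsent(word, k -> new TreeSet<>()).add(doc.getKey());
        return idx;
    }

    public static void main(String[] args) {
        var idx = index(Map.of(1, "big data", 2, "big hadoop"));
        System.out.println(idx.get("big")); // [1, 2]
    }
}
```

The same emit-then-group shape underlies the other patterns in the list; only the key, value and reduce function change.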
MapReduce Design Patterns Hands On:
Distributed Grep, Bloom Filters, Average Calculation,
Standard Deviation, Map Side joins and Reduce Side joins
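As a taste of one of these exercises, a minimal Bloom filter can be sketched in plain Java (the bit-array size and hash choices here are arbitrary, and the class name is ours). It answers "possibly present" or "definitely absent" — false positives are possible, false negatives are not — which lets a join cheaply skip records whose key cannot exist on the other side.

```java
import java.util.BitSet;

// Minimal Bloom filter sketch with two illustrative hash functions.
public class BloomSketch {
    private static final int SIZE = 1024;
    private final BitSet bits = new BitSet(SIZE);

    private int h1(String s) { return Math.floorMod(s.hashCode(), SIZE); }
    private int h2(String s) { return Math.floorMod(s.hashCode() * 31 + 7, SIZE); }

    void add(String s) { bits.set(h1(s)); bits.set(h2(s)); }

    // True may be a false positive; false is always correct.
    boolean mightContain(String s) { return bits.get(h1(s)) && bits.get(h2(s)); }

    public static void main(String[] args) {
        BloomSketch joinKeys = new BloomSketch();
        joinKeys.add("hadoop");
        System.out.println(joinKeys.mightContain("hadoop")); // true
    }
}
```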
Evaluation Test (30 marks)
Phase 3: Other Hadoop Development
Aspects: Pig, Hive, Oozie and Impala (8 hours)
Pig
Hive
Oozie
Impala
Evaluation Test
Benefits
From the course:
• Understand Big Data and the various types of data stored in Hadoop
• Understand the fundamentals of MapReduce, Hadoop Distributed File System
(HDFS), YARN, and how to write MapReduce code
• Learn best practices and considerations for Hadoop development, debugging
techniques and implementation of workflows and common algorithms
• Learn how to leverage Hadoop frameworks like Apache Pig™, Apache Hive™,
Sqoop, Flume, Oozie and other projects from the Apache Hadoop™ Ecosystem
• Understand optimal hardware configurations and network considerations for
building out, maintaining and monitoring your Hadoop cluster
• Learn advanced Hadoop™ API topics required for real-world data analysis
• Understand the path to ROI with Hadoop
From the workshop:
• High quality training from an industry expert
• 3 Days of hands-on experience and practical exercises
• Earn 24 PDUs
• Hard copy of courseware
• 50% interactive and hands-on training exercises using HDFS, Pig, Hive, HBase,
key MapReduce components and features, and more
To know more about the next available workshop in your country, please visit this
link: http://www.knowledgehut.com/short?v=o2voDedi
www.knowledgehut.com support@knowledgehut.com