2. What is Big Data?
What is Hadoop?
Need of Hadoop
Challenges with Big Data
i. Storage
ii. Processing
Comparison with Other Technologies
Hadoop Ecosystem components
3. HDFS (Hadoop Distributed File System)
• Features of HDFS
• Configuring Block size
• HDFS Architecture (5 Daemons)
Name Node
Data Node
Job Tracker
Task Tracker
Secondary Name node
• Replication in Hadoop
• Configuring Custom Replication
• Fault Tolerance in Hadoop
• HDFS Commands
4. MAP REDUCE
• Map Reduce Architecture
• Processing Daemons of Hadoop
• Job Tracker (Roles and Responsibilities)
• Task Tracker (Roles and Responsibilities)
• Input split
• Input split vs Block size
• Data Types in Map Reduce
• Map Reduce Programming Model
• Driver Code
• Mapper Code
• Reducer Code
• Combiner in Map Reduce
• Partitioner in Map Reduce
• File input formats
• File output formats
• Compression Techniques in Map Reduce
• Joins in Map Reduce
6. Relational Operators in Pig
• COGROUP
• CROSS
• DISTINCT
• FILTER
• FOREACH
• GROUP
• JOIN (INNER)
• JOIN (OUTER)
• LIMIT
• LOAD
• ORDER
• SAMPLE
• SPLIT
• STORE
• UNION
7. Diagnostic Operators in Pig
• Describe
• Dump
• Explain
• Illustrate
Eval Functions in Pig
• AVG
• CONCAT
• COUNT
• DIFF
• IS EMPTY
• MAX
• MIN
• SIZE
• SUM
• TOKENIZE
• Writing Custom UDFs in Pig
8. HIVE
• Introduction
• Hive Architecture
• Hive Metastore
• Hive Query Language
• Difference between HQL and SQL
• Hive Built in Functions
• Hive UDF (user-defined functions)
• Hive UDAF (user-defined aggregate functions)
• Hive UDTF (user-defined table-generating functions)
• Hive SerDe
• Hive & HBase Integration
• Hive Working with Unstructured Data
• Hive Working With XML Data
• Hive Working With JSON Data
• Hive Working With URLs And Weblog Data
• Hive – JSON – SerDe
• Loading Data From Local Files To Hive Tables
• Loading Data From HDFS Files To Hive Tables
• Table Types
• Internal (Managed) Tables
• External Tables
• Partitioned Tables
• Non-Partitioned Tables
• Dynamic Partitions In Hive
• Bucketing in Hive
• Hive Unions
• Hive Joins
• Multi Table / File Inserts
• Inserting Into Local Files
• Inserting Into HDFS Files
• Array Operations In Hive
10. SQOOP (SQL + HADOOP)
• Introduction to Sqoop
• SQOOP Import
• SQOOP Export
• Importing Data From RDBMS to HDFS
• Importing Data From RDBMS to HIVE
• Importing Data From RDBMS to HBASE
• Exporting From HBASE to RDBMS
• Exporting From HIVE to RDBMS
• Exporting From HDFS to RDBMS
• Transformations While Importing / Exporting
• Defining SQOOP Jobs
11. NOSQL
• What is “Not only SQL”?
• NOSQL Advantages
• What is the problem with RDBMS for Large Data Scaling Systems?
• Types of NOSQL & Purposes
• Key Value Store
• Columnar Store
• Document Store
• Graph Store
• Introduction to Cassandra – NOSQL Database
• Introduction to MongoDB and CouchDB Databases
• Introduction to Neo4j – NOSQL Database
• Integration of NOSQL Databases with Hadoop
12. HBASE
• Introduction to Bigtable
• What is NOSQL and a columnar store Database
• HBASE Introduction
• HBase use cases
• HBase basics
• Column families
• Scans
• HBase Architecture
• Thrift
• Map Reduce Integration
• Map Reduce Over HBase
• HBase data Modeling
• HBase Schema design
• HBase CRUD operations
• Hive & HBase integration
• HBase storage handlers
13. FLUME
• Introduction to FLUME
• What is a streaming file?
• FLUME Architecture
• FLUME Nodes & FLUME Master
• FLUME Logical & Physical Nodes
• FLUME Agents & FLUME Collector
KAFKA
• Introduction to KAFKA
• KAFKA Architecture
• Kafka components
• BROKER
• Topics
• Producers
• Consumers
• Configurations
14. OOZIE
• Introduction to OOZIE
• OOZIE as a scheduler
• OOZIE as a Workflow designer
• Scheduling jobs (OOZIE Code)
• Defining Dependencies between jobs (OOZIE Code Examples)
• Conditionally controlling jobs (OOZIE Code Examples)
• Defining parallel jobs (OOZIE Code Examples)
YARN
• YARN Architecture
• Resource Manager
• Application Master
• Node Manager
• MR vs. YARN
15. IMPALA
• What is Impala?
• Impala for query processing
• HIVE vs Impala
• Use cases with Impala
MONGODB
• Introduction to MongoDB
• Features of MongoDB
• MongoDB Basic operations
Additional benefits from NBITS
• Course Material
• Sample resumes and Fine tuning of Resume
• Interview Questions
• Mock Interviews by Real time Consultants
• Certification Questions
• Job Assistance