5. wwww.edureka.co/big-data-and-hadoop
Pig & Hive
Tools like Pig and Hive, which are built on top of Hadoop, offer high-level languages for working with data.
If you want to write a MapReduce program, you can use Pig and its language, Pig Latin, which requires no
knowledge of Java.
If you want to view data in HDFS in a readable form, you can use Hive, which likewise requires no
knowledge of Java.
But why Pig?
Pig simplifies complex MapReduce programs by letting you express them in Pig Latin.
Additionally, if you want to write your own MapReduce code, you can do so in other languages (e.g. Perl, Python,
Ruby, C).
But the most attractive features of Pig are:
Roughly 10 lines of Pig Latin can do the work of about 200 lines of Java
Built-in operations like:
Join
Group
Filter
Sort
and more…
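To make the built-in operations concrete, here is a minimal plain-Python sketch of a filter, join, group, and sort pipeline. It is not Pig itself: the sample data and field names are hypothetical, and the Pig Latin statements in the comments are only rough equivalents.

```python
from collections import defaultdict

# Hypothetical sample data: (user, url, response_ms) and (user, country)
visits = [
    ("alice", "/home", 90),
    ("bob", "/cart", 340),
    ("alice", "/cart", 560),
    ("carol", "/home", 80),
    ("bob", "/home", 410),
]
users = [("alice", "US"), ("bob", "IN"), ("carol", "DE")]

# Filter: keep slow requests
# (roughly: slow = FILTER visits BY ms > 100;)
slow = [v for v in visits if v[2] > 100]

# Join: attach each user's country
# (roughly: joined = JOIN slow BY user, users BY user;)
country_of = dict(users)
joined = [
    (user, url, ms, country_of[user])
    for user, url, ms in slow
    if user in country_of
]

# Group: collect the joined rows per country
# (roughly: grouped = GROUP joined BY country;)
grouped = defaultdict(list)
for row in joined:
    grouped[row[3]].append(row)

# Count the rows in each group
counts = [(country, len(rows)) for country, rows in grouped.items()]

# Sort: largest count first, ties broken by country name
# (roughly: ranked = ORDER counts BY n DESC;)
ranked = sorted(counts, key=lambda c: (-c[1], c[0]))
print(ranked)
```

Each step takes the result of the previous one, which is the "series of steps" style that Pig Latin scripts follow.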
Why Pig?
Provides common data operations (filters, joins, ordering, etc.) and nested
data types (tuples, bags, and maps) that are missing from MapReduce.
It is open source and actively supported by a community of developers.
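As a rough illustration of the nested data types, the sketch below models a tuple, a bag, and a map with ordinary Python values. The Python containers are only stand-ins chosen for this example, and the sample values are hypothetical; the comments show how Pig writes each type.

```python
# Tuple: an ordered set of fields. Pig notation: (alice,/cart,560)
visit = ("alice", "/cart", 560)

# Bag: a collection of tuples. Pig notation: {(alice,/cart,560),(alice,/home,90)}
# A Python list stands in for the bag here.
bag = [("alice", "/cart", 560), ("alice", "/home", 90)]

# Map: key/value pairs. Pig notation: [browser#firefox,os#linux]
props = {"browser": "firefox", "os": "linux"}

# Types can nest: one record holding a user name, a bag of visits, and a map
record = ("alice", bag, props)

# Work on the nested bag, e.g. total response time across the user's visits
total_ms = sum(ms for _, _, ms in record[1])
print(total_ms)
```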
Can take any data: structured, semi-structured, and unstructured
Easy to learn, easy to read and write: similar to SQL, reads like a series of steps
Extensible by UDFs (User Defined Functions): Java, Python, JavaScript, Ruby
Java not required: an ad-hoc way of creating and executing MapReduce jobs on very large data sets
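Since UDFs can be written in Python, here is a minimal sketch of what one might look like. The function `extract_domain` and its schema string are hypothetical. The `pig_util` import is an assumption about the helper module Pig provides for Python UDFs; the fallback decorator is a stand-in so the file also runs as ordinary Python outside Pig.

```python
try:
    # Assumption: available when the file is run by Pig as a Python UDF script
    from pig_util import outputSchema
except ImportError:
    # Stand-in decorator for running outside Pig: just records the schema string
    def outputSchema(schema):
        def decorate(func):
            func.outputSchema = schema
            return func
        return decorate


@outputSchema("domain:chararray")
def extract_domain(url):
    """Hypothetical UDF: return the lower-cased host part of a URL."""
    if not url:
        return None
    # Drop the scheme ("http://") if there is one, then cut at the first "/"
    without_scheme = url.split("://", 1)[-1]
    return without_scheme.split("/", 1)[0].lower()


print(extract_domain("http://Example.com/cart?id=7"))
```

In a Pig script, a file like this would typically be registered with something like `REGISTER 'udfs.py' USING streaming_python AS myfuncs;` and then called as `myfuncs.extract_domain(url)`; the file name and alias are hypothetical.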
Features of Hive
You can use Hive to read and write files on Hadoop and run your reports from a BI tool.
Hive applications:
Predictive Modeling & Hypothesis Testing
Document Indexing
Customer-facing Business Intelligence
Log Processing
Data Mining
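Taking log processing as an example, the sketch below shows the kind of SQL-style report query that Hive lets you write. It is not Hive: Python's built-in `sqlite3` is used purely as a stand-in so the example runs anywhere, and the table `access_log`, its columns, and the rows are hypothetical. In Hive, the table would be defined over files stored in Hadoop rather than filled with `INSERT` statements like this.

```python
import sqlite3

# In-memory SQLite database standing in for a Hive table (assumption)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE access_log (user_id TEXT, url TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO access_log VALUES (?, ?, ?)",
    [
        ("alice", "/home", 200),
        ("bob", "/cart", 500),
        ("carol", "/cart", 503),
        ("bob", "/pay", 500),
        ("alice", "/pay", 200),
    ],
)

# Report: which URLs produced the most server errors?
rows = conn.execute(
    """
    SELECT url, COUNT(*) AS errors
    FROM access_log
    WHERE status >= 500
    GROUP BY url
    ORDER BY errors DESC, url
    """
).fetchall()
conn.close()

print(rows)
```

A BI tool would issue a query of this shape and render the result as a report.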