SlideShare une entreprise Scribd logo
1  sur  9
Télécharger pour lire hors ligne
Pig
Relational Operators - I:
Order, Distinct, Limit,
GroupBy
Pig Relational Operator: ORDER
 This helps to sort the data based on Ascending or
Descending manner.
1. Sorting for numerical fields are based on numerically.
2. Sorting for chararray fields are based on lexically.
3. Sorting for bytearray fields are based on lexically.
4. Nulls are considered to be smaller than other values. Therefore
it will always come first or last during ascending or descending
the results.
Let’s perform this with the help of an example;
grunt> dataTransaction = Load ‘/home/hduser/datasets/store.csv’
using PigStorage(‘,’) AS
(Product_Name:chararray,CustomerName:chararray,Transaction_I
D:byearray,TransAmt1:bytearray,TransAmt2:bytearray,
TransAmt3:bytearray, Place:chararray, Department:chararray);
Rupak Roy
Pig Relational Operator: ORDER
grunt> orderbyName = ORDER dataTransaction by
Name;
Example 2:
grunt> orderbyNameNsymbol= order datatransaction
by date, symbol;
Example 3:
grunt> desc= order datatransaction by close desc,
open;
Here close column will have descending order and
since we didn’t mentioned any order for open column
it will take ascending order by default.
Rupak Roy
Pig Relational Operator: LIMIT
 Limit simply limits the number of records to display.
 For example, if we have 1 million rows and columns
and if we dump the results for testing or any purpose it
will take a lot of time to finish displaying the results one
by one which is very time consuming process, so to
make sure our required script is working it is better to
view some results rather than displaying the whole
results.
 However Pig will still read all the records even we limit
the display of results by (assume) 20 records, but it will
also display 20 different records each time even we
use the same limit query. We can overcome this issue
by using ORDER operator immediately after the limit
operator and will guarantee the same 20 records
each time when we use the same limit query.
Rupak Roy
Pig Relational Operator: LIMIT
grunt> Trecords= LIMIT dataTransaction 20;
grunt> dump Trecords;
grunt> dump Trecords; #it will display 20 different records.
#to overcome the limit issue
grunt> order= order dataTransaction by $0;
grunt> Trecords= Limit order 20;
grunt> dump Trecords;
Again we will test for the same results;
grunt> Trecrods = Limit order 20;
Rupak Roy
Pig Relational Operator: DISTINCT
 This operator simply removes the duplicate data.
grunt> RD = DISTINCT dataTransaction;
Note: Distinct operator makes use of a combiner
or we can say semi-reducer between the map
phase and reducer phase to remove the
duplicates.
Rupak Roy
Pig Relational Operator: GROUP
 Group operator is one of the important functions for grouping
the data from a large pool of datasets.
grunt> grouping = GROUP dataTransaction by Place;
grunt> describe grouping;
grunt>cnt = foreach grouping GENERATE group, COUNT( $1);
grunt> dump cnt;
By applying this to our dataset store.csv we can find from how
many places each customer purchase similar products.
Note: to verify this result, load the dataset in excel (since this is a
subset of a dataset and will be suitable to view it in excel). Use the
filter function then select only Christy Britain, we will see she have
purchased the similar products from 6 different places.
Rupak Roy
 Group on multiple keys:
grunt> grouped = GROUP dataTransaction by
(Place, Department);
Pig Relational Operator: GROUP
Rupak Roy
Next
 More into advanced relational operators
like foreach, filter, join and more.
Rupak Roy

Contenu connexe

Tendances

1.7 data reduction
1.7 data reduction1.7 data reduction
1.7 data reductionKrish_ver2
 
Data mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updatedData mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updatedYugal Kumar
 
Map reduce in BIG DATA
Map reduce in BIG DATAMap reduce in BIG DATA
Map reduce in BIG DATAGauravBiswas9
 
Issues in knowledge representation
Issues in knowledge representationIssues in knowledge representation
Issues in knowledge representationSravanthi Emani
 
Forms of learning in ai
Forms of learning in aiForms of learning in ai
Forms of learning in aiRobert Antony
 
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Data Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessingData Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessing
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessingSalah Amean
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kambererror007
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Simplilearn
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notesMohit Saini
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY pptsravya raju
 
Decision Trees
Decision TreesDecision Trees
Decision TreesStudent
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent methodSanghyuk Chun
 

Tendances (20)

Decision trees
Decision treesDecision trees
Decision trees
 
3. mining frequent patterns
3. mining frequent patterns3. mining frequent patterns
3. mining frequent patterns
 
Data Analytics Life Cycle
Data Analytics Life CycleData Analytics Life Cycle
Data Analytics Life Cycle
 
Frame buffer
Frame bufferFrame buffer
Frame buffer
 
1.7 data reduction
1.7 data reduction1.7 data reduction
1.7 data reduction
 
Planning
PlanningPlanning
Planning
 
Data mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updatedData mining and data warehouse lab manual updated
Data mining and data warehouse lab manual updated
 
Region based segmentation
Region based segmentationRegion based segmentation
Region based segmentation
 
Map reduce in BIG DATA
Map reduce in BIG DATAMap reduce in BIG DATA
Map reduce in BIG DATA
 
Issues in knowledge representation
Issues in knowledge representationIssues in knowledge representation
Issues in knowledge representation
 
Forms of learning in ai
Forms of learning in aiForms of learning in ai
Forms of learning in ai
 
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
Data Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessingData Mining:  Concepts and Techniques (3rd ed.)- Chapter 3 preprocessing
Data Mining: Concepts and Techniques (3rd ed.) - Chapter 3 preprocessing
 
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & KamberChapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
Chapter - 6 Data Mining Concepts and Techniques 2nd Ed slides Han & Kamber
 
Map Reduce
Map ReduceMap Reduce
Map Reduce
 
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
Support Vector Machine - How Support Vector Machine works | SVM in Machine Le...
 
Big data lecture notes
Big data lecture notesBig data lecture notes
Big data lecture notes
 
Hadoop
HadoopHadoop
Hadoop
 
HADOOP TECHNOLOGY ppt
HADOOP  TECHNOLOGY pptHADOOP  TECHNOLOGY ppt
HADOOP TECHNOLOGY ppt
 
Decision Trees
Decision TreesDecision Trees
Decision Trees
 
Gradient descent method
Gradient descent methodGradient descent method
Gradient descent method
 

Similaire à Apache PIG Relational Operations

Apache Pig Relational Operators - II
Apache Pig Relational Operators - II Apache Pig Relational Operators - II
Apache Pig Relational Operators - II Rupak Roy
 
TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...
TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...
TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...tdc-globalcode
 
Ramda, a functional JavaScript library
Ramda, a functional JavaScript libraryRamda, a functional JavaScript library
Ramda, a functional JavaScript libraryDerek Willian Stavis
 
Session 04 -Pig Continued
Session 04 -Pig ContinuedSession 04 -Pig Continued
Session 04 -Pig ContinuedAnandMHadoop
 
Compliance as Code with terraform-compliance
Compliance as Code with terraform-complianceCompliance as Code with terraform-compliance
Compliance as Code with terraform-complianceEmre Erkunt
 
Session 04 pig - slides
Session 04   pig - slidesSession 04   pig - slides
Session 04 pig - slidesAnandMHadoop
 
Passing Parameters using File and Command Line
Passing Parameters using File and Command LinePassing Parameters using File and Command Line
Passing Parameters using File and Command LineRupak Roy
 
ACADILD:: HADOOP LESSON
ACADILD:: HADOOP LESSON ACADILD:: HADOOP LESSON
ACADILD:: HADOOP LESSON Padma shree. T
 
Get expertise with mongo db
Get expertise with mongo dbGet expertise with mongo db
Get expertise with mongo dbAmit Thakkar
 
No more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in productionNo more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in productionChetan Khatri
 
Hive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TDHive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TDSATOSHI TAGOMORI
 
Multiple regression with R
Multiple regression with RMultiple regression with R
Multiple regression with RJerome Gomes
 
Wprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache HadoopWprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache HadoopSages
 
Pig Introduction to Pig
Pig Introduction to PigPig Introduction to Pig
Pig Introduction to PigChris Wilkes
 
Dive into Spark Streaming
Dive into Spark StreamingDive into Spark Streaming
Dive into Spark StreamingGerard Maas
 

Similaire à Apache PIG Relational Operations (20)

Apache Pig Relational Operators - II
Apache Pig Relational Operators - II Apache Pig Relational Operators - II
Apache Pig Relational Operators - II
 
Pig latin
Pig latinPig latin
Pig latin
 
TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...
TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...
TDC2016POA | Trilha Programacao Funcional - Ramda JS como alternativa a under...
 
Ramda, a functional JavaScript library
Ramda, a functional JavaScript libraryRamda, a functional JavaScript library
Ramda, a functional JavaScript library
 
R Get Started II
R Get Started IIR Get Started II
R Get Started II
 
Session 04 -Pig Continued
Session 04 -Pig ContinuedSession 04 -Pig Continued
Session 04 -Pig Continued
 
Compliance as Code with terraform-compliance
Compliance as Code with terraform-complianceCompliance as Code with terraform-compliance
Compliance as Code with terraform-compliance
 
Session 04 pig - slides
Session 04   pig - slidesSession 04   pig - slides
Session 04 pig - slides
 
Passing Parameters using File and Command Line
Passing Parameters using File and Command LinePassing Parameters using File and Command Line
Passing Parameters using File and Command Line
 
ACADILD:: HADOOP LESSON
ACADILD:: HADOOP LESSON ACADILD:: HADOOP LESSON
ACADILD:: HADOOP LESSON
 
Get expertise with mongo db
Get expertise with mongo dbGet expertise with mongo db
Get expertise with mongo db
 
No more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in productionNo more struggles with Apache Spark workloads in production
No more struggles with Apache Spark workloads in production
 
Lecture 2 part 3
Lecture 2 part 3Lecture 2 part 3
Lecture 2 part 3
 
pig intro.pdf
pig intro.pdfpig intro.pdf
pig intro.pdf
 
Hive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TDHive dirty/beautiful hacks in TD
Hive dirty/beautiful hacks in TD
 
Multiple regression with R
Multiple regression with RMultiple regression with R
Multiple regression with R
 
Wprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache HadoopWprowadzenie do technologi Big Data i Apache Hadoop
Wprowadzenie do technologi Big Data i Apache Hadoop
 
Pig Introduction to Pig
Pig Introduction to PigPig Introduction to Pig
Pig Introduction to Pig
 
Dive into Spark Streaming
Dive into Spark StreamingDive into Spark Streaming
Dive into Spark Streaming
 
Hadoop Interview Questions and Answers
Hadoop Interview Questions and AnswersHadoop Interview Questions and Answers
Hadoop Interview Questions and Answers
 

Plus de Rupak Roy

Hierarchical Clustering - Text Mining/NLP
Hierarchical Clustering - Text Mining/NLPHierarchical Clustering - Text Mining/NLP
Hierarchical Clustering - Text Mining/NLPRupak Roy
 
Clustering K means and Hierarchical - NLP
Clustering K means and Hierarchical - NLPClustering K means and Hierarchical - NLP
Clustering K means and Hierarchical - NLPRupak Roy
 
Network Analysis - NLP
Network Analysis  - NLPNetwork Analysis  - NLP
Network Analysis - NLPRupak Roy
 
Topic Modeling - NLP
Topic Modeling - NLPTopic Modeling - NLP
Topic Modeling - NLPRupak Roy
 
Sentiment Analysis Practical Steps
Sentiment Analysis Practical StepsSentiment Analysis Practical Steps
Sentiment Analysis Practical StepsRupak Roy
 
NLP - Sentiment Analysis
NLP - Sentiment AnalysisNLP - Sentiment Analysis
NLP - Sentiment AnalysisRupak Roy
 
Text Mining using Regular Expressions
Text Mining using Regular ExpressionsText Mining using Regular Expressions
Text Mining using Regular ExpressionsRupak Roy
 
Introduction to Text Mining
Introduction to Text Mining Introduction to Text Mining
Introduction to Text Mining Rupak Roy
 
Apache Hbase Architecture
Apache Hbase ArchitectureApache Hbase Architecture
Apache Hbase ArchitectureRupak Roy
 
Introduction to Hbase
Introduction to Hbase Introduction to Hbase
Introduction to Hbase Rupak Roy
 
Apache Hive Table Partition and HQL
Apache Hive Table Partition and HQLApache Hive Table Partition and HQL
Apache Hive Table Partition and HQLRupak Roy
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Rupak Roy
 
Introductive to Hive
Introductive to Hive Introductive to Hive
Introductive to Hive Rupak Roy
 
Scoop Job, import and export to RDBMS
Scoop Job, import and export to RDBMSScoop Job, import and export to RDBMS
Scoop Job, import and export to RDBMSRupak Roy
 
Apache Scoop - Import with Append mode and Last Modified mode
Apache Scoop - Import with Append mode and Last Modified mode Apache Scoop - Import with Append mode and Last Modified mode
Apache Scoop - Import with Append mode and Last Modified mode Rupak Roy
 
Introduction to scoop and its functions
Introduction to scoop and its functionsIntroduction to scoop and its functions
Introduction to scoop and its functionsRupak Roy
 
Introduction to Flume
Introduction to FlumeIntroduction to Flume
Introduction to FlumeRupak Roy
 
Apache PIG casting, reference
Apache PIG casting, referenceApache PIG casting, reference
Apache PIG casting, referenceRupak Roy
 
Pig Latin, Data Model with Load and Store Functions
Pig Latin, Data Model with Load and Store FunctionsPig Latin, Data Model with Load and Store Functions
Pig Latin, Data Model with Load and Store FunctionsRupak Roy
 
Introduction to PIG components
Introduction to PIG components Introduction to PIG components
Introduction to PIG components Rupak Roy
 

Plus de Rupak Roy (20)

Hierarchical Clustering - Text Mining/NLP
Hierarchical Clustering - Text Mining/NLPHierarchical Clustering - Text Mining/NLP
Hierarchical Clustering - Text Mining/NLP
 
Clustering K means and Hierarchical - NLP
Clustering K means and Hierarchical - NLPClustering K means and Hierarchical - NLP
Clustering K means and Hierarchical - NLP
 
Network Analysis - NLP
Network Analysis  - NLPNetwork Analysis  - NLP
Network Analysis - NLP
 
Topic Modeling - NLP
Topic Modeling - NLPTopic Modeling - NLP
Topic Modeling - NLP
 
Sentiment Analysis Practical Steps
Sentiment Analysis Practical StepsSentiment Analysis Practical Steps
Sentiment Analysis Practical Steps
 
NLP - Sentiment Analysis
NLP - Sentiment AnalysisNLP - Sentiment Analysis
NLP - Sentiment Analysis
 
Text Mining using Regular Expressions
Text Mining using Regular ExpressionsText Mining using Regular Expressions
Text Mining using Regular Expressions
 
Introduction to Text Mining
Introduction to Text Mining Introduction to Text Mining
Introduction to Text Mining
 
Apache Hbase Architecture
Apache Hbase ArchitectureApache Hbase Architecture
Apache Hbase Architecture
 
Introduction to Hbase
Introduction to Hbase Introduction to Hbase
Introduction to Hbase
 
Apache Hive Table Partition and HQL
Apache Hive Table Partition and HQLApache Hive Table Partition and HQL
Apache Hive Table Partition and HQL
 
Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export Installing Apache Hive, internal and external table, import-export
Installing Apache Hive, internal and external table, import-export
 
Introductive to Hive
Introductive to Hive Introductive to Hive
Introductive to Hive
 
Scoop Job, import and export to RDBMS
Scoop Job, import and export to RDBMSScoop Job, import and export to RDBMS
Scoop Job, import and export to RDBMS
 
Apache Scoop - Import with Append mode and Last Modified mode
Apache Scoop - Import with Append mode and Last Modified mode Apache Scoop - Import with Append mode and Last Modified mode
Apache Scoop - Import with Append mode and Last Modified mode
 
Introduction to scoop and its functions
Introduction to scoop and its functionsIntroduction to scoop and its functions
Introduction to scoop and its functions
 
Introduction to Flume
Introduction to FlumeIntroduction to Flume
Introduction to Flume
 
Apache PIG casting, reference
Apache PIG casting, referenceApache PIG casting, reference
Apache PIG casting, reference
 
Pig Latin, Data Model with Load and Store Functions
Pig Latin, Data Model with Load and Store FunctionsPig Latin, Data Model with Load and Store Functions
Pig Latin, Data Model with Load and Store Functions
 
Introduction to PIG components
Introduction to PIG components Introduction to PIG components
Introduction to PIG components
 

Dernier

ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701bronxfugly43
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Jisc
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...Nguyen Thanh Tu Collection
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsMebane Rash
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxJisc
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxEsquimalt MFRC
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin ClassesCeline George
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.christianmathematics
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSCeline George
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfSherif Taha
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Association for Project Management
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17Celine George
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsKarakKing
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxVishalSingh1417
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseAnaAcapella
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentationcamerronhm
 
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdfssuserdda66b
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and ModificationsMJDuyan
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsTechSoup
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxheathfieldcps1
 

Dernier (20)

ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701ComPTIA Overview | Comptia Security+ Book SY0-701
ComPTIA Overview | Comptia Security+ Book SY0-701
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
TỔNG ÔN TẬP THI VÀO LỚP 10 MÔN TIẾNG ANH NĂM HỌC 2023 - 2024 CÓ ĐÁP ÁN (NGỮ Â...
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Towards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptxTowards a code of practice for AI in AT.pptx
Towards a code of practice for AI in AT.pptx
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17  How to Extend Models Using Mixin ClassesMixin Classes in Odoo 17  How to Extend Models Using Mixin Classes
Mixin Classes in Odoo 17 How to Extend Models Using Mixin Classes
 
This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.This PowerPoint helps students to consider the concept of infinity.
This PowerPoint helps students to consider the concept of infinity.
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Salient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functionsSalient Features of India constitution especially power and functions
Salient Features of India constitution especially power and functions
 
Unit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptxUnit-V; Pricing (Pharma Marketing Management).pptx
Unit-V; Pricing (Pharma Marketing Management).pptx
 
Spellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please PractiseSpellings Wk 3 English CAPS CARES Please Practise
Spellings Wk 3 English CAPS CARES Please Practise
 
SOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning PresentationSOC 101 Demonstration of Learning Presentation
SOC 101 Demonstration of Learning Presentation
 
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdfVishram Singh - Textbook of Anatomy  Upper Limb and Thorax.. Volume 1 (1).pdf
Vishram Singh - Textbook of Anatomy Upper Limb and Thorax.. Volume 1 (1).pdf
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
The basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptxThe basics of sentences session 3pptx.pptx
The basics of sentences session 3pptx.pptx
 

Apache PIG Relational Operations

  • 1. Pig Relational Operators - I: Order, Distinct, Limit, GroupBy
  • 2. Pig Relational Operator: ORDER  This helps to sort the data based on Ascending or Descending manner. 1. Sorting for numerical fields are based on numerically. 2. Sorting for chararray fields are based on lexically. 3. Sorting for bytearray fields are based on lexically. 4. Nulls are considered to be smaller than other values. Therefore it will always come first or last during ascending or descending the results. Let’s perform this with the help of an example; grunt> dataTransaction = Load ‘/home/hduser/datasets/store.csv’ using PigStorage(‘,’) AS (Product_Name:chararray,CustomerName:chararray,Transaction_I D:byearray,TransAmt1:bytearray,TransAmt2:bytearray, TransAmt3:bytearray, Place:chararray, Department:chararray); Rupak Roy
  • 3. Pig Relational Operator: ORDER grunt> orderbyName = ORDER dataTransaction by Name; Example 2: grunt> orderbyNameNsymbol= order datatransaction by date, symbol; Example 3: grunt> desc= order datatransaction by close desc, open; Here close column will have descending order and since we didn’t mentioned any order for open column it will take ascending order by default. Rupak Roy
  • 4. Pig Relational Operator: LIMIT  Limit simply limits the number of records to display.  For example, if we have 1 million rows and columns and if we dump the results for testing or any purpose it will take a lot of time to finish displaying the results one by one which is very time consuming process, so to make sure our required script is working it is better to view some results rather than displaying the whole results.  However Pig will still read all the records even we limit the display of results by (assume) 20 records, but it will also display 20 different records each time even we use the same limit query. We can overcome this issue by using ORDER operator immediately after the limit operator and will guarantee the same 20 records each time when we use the same limit query. Rupak Roy
  • 5. Pig Relational Operator: LIMIT grunt> Trecords= LIMIT dataTransaction 20; grunt> dump Trecords; grunt> dump Trecords; #it will display 20 different records. #to overcome the limit issue grunt> order= order dataTransaction by $0; grunt> Trecords= Limit order 20; grunt> dump Trecords; Again we will test for the same results; grunt> Trecrods = Limit order 20; Rupak Roy
  • 6. Pig Relational Operator: DISTINCT  This operator simply removes the duplicate data. grunt> RD = DISTINCT dataTransaction; Note: Distinct operator makes use of a combiner or we can say semi-reducer between the map phase and reducer phase to remove the duplicates. Rupak Roy
  • 7. Pig Relational Operator: GROUP  Group operator is one of the important functions for grouping the data from a large pool of datasets. grunt> grouping = GROUP dataTransaction by Place; grunt> describe grouping; grunt>cnt = foreach grouping GENERATE group, COUNT( $1); grunt> dump cnt; By applying this to our dataset store.csv we can find from how many places each customer purchase similar products. Note: to verify this result, load the dataset in excel (since this is a subset of a dataset and will be suitable to view it in excel). Use the filter function then select only Christy Britain, we will see she have purchased the similar products from 6 different places. Rupak Roy
  • 8.  Group on multiple keys: grunt> grouped = GROUP dataTransaction by (Place, Department); Pig Relational Operator: GROUP Rupak Roy
  • 9. Next  More into advanced relational operators like foreach, filter, join and more. Rupak Roy