Soumettre la recherche
Mettre en ligne
IRJET- Analysis of Boston’s Crime Data using Apache Pig
•
0 j'aime
•
24 vues
IRJET Journal
Suivre
https://www.irjet.net/archives/V6/i10/IRJET-V6I10140.pdf
Lire moins
Lire la suite
Ingénierie
Signaler
Partager
Signaler
Partager
1 sur 7
Télécharger maintenant
Télécharger pour lire hors ligne
Recommandé
Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)
PyData
Pig Experience
Pig Experience
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...
ijcses
A sql implementation on the map reduce framework
A sql implementation on the map reduce framework
eldariof
Web Oriented FIM for large scale dataset using Hadoop
Web Oriented FIM for large scale dataset using Hadoop
dbpublications
Big data
Big data
Abilash Mavila
Hadoop Summit San Jose 2014: Costing Your Big Data Operations
Hadoop Summit San Jose 2014: Costing Your Big Data Operations
Sumeet Singh
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
Hitendra Kumar
Recommandé
Python in an Evolving Enterprise System (PyData SV 2013)
Python in an Evolving Enterprise System (PyData SV 2013)
PyData
Pig Experience
Pig Experience
Tilani Gunawardena PhD(UNIBAS), BSc(Pera), FHEA(UK), CEng, MIESL
A comparative survey based on processing network traffic data using hadoop pi...
A comparative survey based on processing network traffic data using hadoop pi...
ijcses
A sql implementation on the map reduce framework
A sql implementation on the map reduce framework
eldariof
Web Oriented FIM for large scale dataset using Hadoop
Web Oriented FIM for large scale dataset using Hadoop
dbpublications
Big data
Big data
Abilash Mavila
Hadoop Summit San Jose 2014: Costing Your Big Data Operations
Hadoop Summit San Jose 2014: Costing Your Big Data Operations
Sumeet Singh
Hadoop a Natural Choice for Data Intensive Log Processing
Hadoop a Natural Choice for Data Intensive Log Processing
Hitendra Kumar
Large Scale Data With Hadoop
Large Scale Data With Hadoop
guest27e6764
November 2013 HUG: Compute Capacity Calculator
November 2013 HUG: Compute Capacity Calculator
Yahoo Developer Network
Big Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and Storing
IRJET Journal
Hadoop Report
Hadoop Report
Nishant Gandhi
Big Data Summer training presentation
Big Data Summer training presentation
HarshitaKamboj
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
Mahantesh Angadi
Hadoop MapReduce Framework
Hadoop MapReduce Framework
Edureka!
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
Pietro Michiardi
Hadoop technology doc
Hadoop technology doc
tipanagiriharika
Big data
Big data
rajsandhu1989
Hadoop tutorial
Hadoop tutorial
Aamir Ameen
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Abdul Nasir
Map Reduce
Map Reduce
Rahul Agarwal
Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview
Senthil Kumar
Neo4j vs giraph
Neo4j vs giraph
Nishant Gandhi
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Edureka!
Big data and hadoop
Big data and hadoop
Chanchal Tripathi
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
BigDataEverywhere
Introduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig Fundamentals
Skillspeed
Mansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analytics
MansiChowkkar
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET Journal
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
cscpconf
Contenu connexe
Tendances
Large Scale Data With Hadoop
Large Scale Data With Hadoop
guest27e6764
November 2013 HUG: Compute Capacity Calculator
November 2013 HUG: Compute Capacity Calculator
Yahoo Developer Network
Big Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and Storing
IRJET Journal
Hadoop Report
Hadoop Report
Nishant Gandhi
Big Data Summer training presentation
Big Data Summer training presentation
HarshitaKamboj
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
Mahantesh Angadi
Hadoop MapReduce Framework
Hadoop MapReduce Framework
Edureka!
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
Pietro Michiardi
Hadoop technology doc
Hadoop technology doc
tipanagiriharika
Big data
Big data
rajsandhu1989
Hadoop tutorial
Hadoop tutorial
Aamir Ameen
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Abdul Nasir
Map Reduce
Map Reduce
Rahul Agarwal
Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview
Senthil Kumar
Neo4j vs giraph
Neo4j vs giraph
Nishant Gandhi
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Edureka!
Big data and hadoop
Big data and hadoop
Chanchal Tripathi
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
BigDataEverywhere
Introduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig Fundamentals
Skillspeed
Mansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analytics
MansiChowkkar
Tendances
(20)
Large Scale Data With Hadoop
Large Scale Data With Hadoop
November 2013 HUG: Compute Capacity Calculator
November 2013 HUG: Compute Capacity Calculator
Big Data with Hadoop – For Data Management, Processing and Storing
Big Data with Hadoop – For Data Management, Processing and Storing
Hadoop Report
Hadoop Report
Big Data Summer training presentation
Big Data Summer training presentation
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
BIGDATA- Survey on Scheduling Methods in Hadoop MapReduce
Hadoop MapReduce Framework
Hadoop MapReduce Framework
High-level Programming Languages: Apache Pig and Pig Latin
High-level Programming Languages: Apache Pig and Pig Latin
Hadoop technology doc
Hadoop technology doc
Big data
Big data
Hadoop tutorial
Hadoop tutorial
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Hadoop Distriubted File System (HDFS) presentation 27- 5-2015
Map Reduce
Map Reduce
Hadoop Ecosystem Architecture Overview
Hadoop Ecosystem Architecture Overview
Neo4j vs giraph
Neo4j vs giraph
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Pig Tutorial | Twitter Case Study | Apache Pig Script and Commands | Edureka
Big data and hadoop
Big data and hadoop
Big Data Everywhere Chicago: SQL on Hadoop
Big Data Everywhere Chicago: SQL on Hadoop
Introduction to Pig | Pig Architecture | Pig Fundamentals
Introduction to Pig | Pig Architecture | Pig Fundamentals
Mansi chowkkar programming_in_data_analytics
Mansi chowkkar programming_in_data_analytics
Similaire à IRJET- Analysis of Boston’s Crime Data using Apache Pig
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET Journal
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
cscpconf
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
csandit
Sentiment Analysis using Big Data
Sentiment Analysis using Big Data
Rajat Mittal
Map reduce advantages over parallel databases report
Map reduce advantages over parallel databases report
Ahmad El Tawil
Big data Big Analytics
Big data Big Analytics
Ajay Ohri
App Engine Application for Detecting Similar Files in Google Drive
App Engine Application for Detecting Similar Files in Google Drive
IRJET Journal
Survey Paper on Big Data and Hadoop
Survey Paper on Big Data and Hadoop
IRJET Journal
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET Journal
IRJET- ALPYNE - A Grid Computing Framework
IRJET- ALPYNE - A Grid Computing Framework
IRJET Journal
Log Analysis Engine with Integration of Hadoop and Spark
Log Analysis Engine with Integration of Hadoop and Spark
IRJET Journal
The Study of the Large Scale Twitter on Machine Learning
The Study of the Large Scale Twitter on Machine Learning
IRJET Journal
Unstructured Datasets Analysis: Thesaurus Model
Unstructured Datasets Analysis: Thesaurus Model
Editor IJCATR
Nagarjuna_Damarla_Resume
Nagarjuna_Damarla_Resume
Nag Arjun
IRJET- Big Data-A Review Study with Comparitive Analysis of Hadoop
IRJET- Big Data-A Review Study with Comparitive Analysis of Hadoop
IRJET Journal
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
IJCSES Journal
Data scientist a perfect job
Data scientist a perfect job
Sidharth Raj Agarwal
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET Journal
IJSRED-V2I3P43
IJSRED-V2I3P43
IJSRED
Nagarjuna_Damarla
Nagarjuna_Damarla
Nag Arjun
Similaire à IRJET- Analysis of Boston’s Crime Data using Apache Pig
(20)
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
NETWORK TRAFFIC ANALYSIS: HADOOP PIG VS TYPICAL MAPREDUCE
Sentiment Analysis using Big Data
Sentiment Analysis using Big Data
Map reduce advantages over parallel databases report
Map reduce advantages over parallel databases report
Big data Big Analytics
Big data Big Analytics
App Engine Application for Detecting Similar Files in Google Drive
App Engine Application for Detecting Similar Files in Google Drive
Survey Paper on Big Data and Hadoop
Survey Paper on Big Data and Hadoop
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- Analysis for EnhancedForecastof Expense Movement in Stock Exchange
IRJET- ALPYNE - A Grid Computing Framework
IRJET- ALPYNE - A Grid Computing Framework
Log Analysis Engine with Integration of Hadoop and Spark
Log Analysis Engine with Integration of Hadoop and Spark
The Study of the Large Scale Twitter on Machine Learning
The Study of the Large Scale Twitter on Machine Learning
Unstructured Datasets Analysis: Thesaurus Model
Unstructured Datasets Analysis: Thesaurus Model
Nagarjuna_Damarla_Resume
Nagarjuna_Damarla_Resume
IRJET- Big Data-A Review Study with Comparitive Analysis of Hadoop
IRJET- Big Data-A Review Study with Comparitive Analysis of Hadoop
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
A Comparative Survey Based on Processing Network Traffic Data Using Hadoop Pi...
Data scientist a perfect job
Data scientist a perfect job
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IRJET- Comparatively Analysis on K-Means++ and Mini Batch K-Means Clustering ...
IJSRED-V2I3P43
IJSRED-V2I3P43
Nagarjuna_Damarla
Nagarjuna_Damarla
Plus de IRJET Journal
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
IRJET Journal
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
IRJET Journal
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
IRJET Journal
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
IRJET Journal
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
IRJET Journal
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
IRJET Journal
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
IRJET Journal
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
IRJET Journal
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
IRJET Journal
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
IRJET Journal
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
IRJET Journal
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
IRJET Journal
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
IRJET Journal
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
IRJET Journal
React based fullstack edtech web application
React based fullstack edtech web application
IRJET Journal
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
IRJET Journal
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
IRJET Journal
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
IRJET Journal
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
IRJET Journal
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
IRJET Journal
Plus de IRJET Journal
(20)
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
TUNNELING IN HIMALAYAS WITH NATM METHOD: A SPECIAL REFERENCES TO SUNGAL TUNNE...
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
STUDY THE EFFECT OF RESPONSE REDUCTION FACTOR ON RC FRAMED STRUCTURE
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
A COMPARATIVE ANALYSIS OF RCC ELEMENT OF SLAB WITH STARK STEEL (HYSD STEEL) A...
Effect of Camber and Angles of Attack on Airfoil Characteristics
Effect of Camber and Angles of Attack on Airfoil Characteristics
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
A Review on the Progress and Challenges of Aluminum-Based Metal Matrix Compos...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Dynamic Urban Transit Optimization: A Graph Neural Network Approach for Real-...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
Structural Analysis and Design of Multi-Storey Symmetric and Asymmetric Shape...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A Review of “Seismic Response of RC Structures Having Plan and Vertical Irreg...
A REVIEW ON MACHINE LEARNING IN ADAS
A REVIEW ON MACHINE LEARNING IN ADAS
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
Long Term Trend Analysis of Precipitation and Temperature for Asosa district,...
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
P.E.B. Framed Structure Design and Analysis Using STAAD Pro
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
A Review on Innovative Fiber Integration for Enhanced Reinforcement of Concre...
Survey Paper on Cloud-Based Secured Healthcare System
Survey Paper on Cloud-Based Secured Healthcare System
Review on studies and research on widening of existing concrete bridges
Review on studies and research on widening of existing concrete bridges
React based fullstack edtech web application
React based fullstack edtech web application
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A Comprehensive Review of Integrating IoT and Blockchain Technologies in the ...
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
A REVIEW ON THE PERFORMANCE OF COCONUT FIBRE REINFORCED CONCRETE.
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Optimizing Business Management Process Workflows: The Dynamic Influence of Mi...
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Multistoried and Multi Bay Steel Building Frame by using Seismic Design
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Cost Optimization of Construction Using Plastic Waste as a Sustainable Constr...
Dernier
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
ranjana rawat
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Christo Ananth
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
9953056974 Low Rate Call Girls In Saket, Delhi NCR
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Call Girls in Nagpur High Profile
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
rknatarajan
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
9953056974 Low Rate Call Girls In Saket, Delhi NCR
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
roncy bisnoi
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
JiananWang21
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
Call Girls in Nagpur High Profile Call Girls
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
ankushspencer015
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
DineshKumar4165
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
mulugeta48
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
ManishPatel169454
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
KreezheaRecto
Online banking management system project.pdf
Online banking management system project.pdf
Kamal Acharya
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
rknatarajan
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
rknatarajan
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
sanyuktamishra911
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
dharasingh5698
NFPA 5000 2024 standard .
NFPA 5000 2024 standard .
DerechoLaboralIndivi
Dernier
(20)
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
The Most Attractive Pune Call Girls Budhwar Peth 8250192130 Will You Miss Thi...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call for Papers - International Journal of Intelligent Systems and Applicatio...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Call Now ≽ 9953056974 ≼🔝 Call Girls In New Ashok Nagar ≼🔝 Delhi door step de...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
Top Rated Pune Call Girls Budhwar Peth ⟟ 6297143586 ⟟ Call Me For Genuine Se...
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSIS
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls in Ramesh Nagar Delhi 💯 Call Us 🔝9953056974 🔝 Escort Service
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
Call Girls Walvekar Nagar Call Me 7737669865 Budget Friendly No Advance Booking
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
(INDIRA) Call Girl Bhosari Call Now 8617697112 Bhosari Escorts 24x7
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdf
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
PVC VS. FIBERGLASS (FRP) GRAVITY SEWER - UNI BELL
Double rodded leveling 1 pdf activity 01
Double rodded leveling 1 pdf activity 01
Online banking management system project.pdf
Online banking management system project.pdf
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-IFLUID PROPERTIES & FLOW CHARACTERISTICS
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
UNIT-V FMM.HYDRAULIC TURBINE - Construction and working
KubeKraft presentation @CloudNativeHooghly
KubeKraft presentation @CloudNativeHooghly
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
NFPA 5000 2024 standard .
NFPA 5000 2024 standard .
IRJET- Analysis of Boston’s Crime Data using Apache Pig
1.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 785 Analysis of Boston’s Crime Data using Apache Pig V.C Chandra Kishore1, Aditya Shekhar Camarushy2 1New Horizon College of Engineering, Bengaluru, Karnataka, India 2Manipal Institute of Technology, Manipal, Karnataka, India -----------------------------------------------------------------------***------------------------------------------------------------------------ Abstract— Big Data is a technology development happened to the hyperbolic rate of data growth, advanced new data varieties and parallel advancements in technology stake. The types of data could be structured, unstructured or semi- structured, making if difficult to analyse using the standard data management strategies. Hadoop is a framework for the analysis and transformation of incredibly giant data sets using MapReduce. Pig is an apache open source project which is used to reduce the difficulties of coding Map Reduce applications using Pig Latin which is an SQL like query language. In this paper we introduce the Big data analytics tools like apache pig and also work on a analysing a crime data set of Boston. Keywords-Big Data, Hadoop, Apache Pig, analysis, crime data I. INTRODUCTION The term Big Data refers to high volume of data that can't be processed or managed by traditional database systems due to its size and complexities. In today’s world data is generated from various sources, the size of this data is increasing at a rapid phase and is exceeding the size of an Exabyte. [1] Moreover with data being generated the computer systems are much faster yet analysing of such massive data sets is a challenge. [2] The data could be generated from various sources across internet; social media is just one of them. The size of Big data is massive and it keeps increasing, 90% of all data today was generated in the last two years. To Illustrate this consider the data generated around us every minute of the day [3] SOURCE AMOUNT OF DATA Uber Riders take 45,787.54 trips Google conducts 3,607,080 searches Netflix users stream 69,444 HRS of video Amazon makes $258,751.90 in sales Fig. 1 How much data does the world generate every minute This is a huge amount of data that is being generate every minute. This data that is generated is used for making better decisions at a company, in other words Netflix has seen 10% decrees in the users watch time after the new update of their app which tells them to review their new features: all this is made in order to get the most profits possible for any business. II. TYPES OF BIG DATA Big data can be explained in accordance to the 3 V’s mentioned by Gartner (2012) [4] as velocity, variety and volume and this is explained in detail below: 1. Velocity The Velocity is the speed at which the information is created, stored and analysed. The speed at which the information is generated is unbelievable for each minute we tend to transfer a hundred hours of video on YouTube, over two hundred million emails are sent, around twenty million photos are viewed and nearly three hundred thousand tweets are sent. The huge data generated from various sources continues to grow. 2. Volume It refers to unimaginable amounts of data that's being generated every second from social media, mobile phones, cars, sensors, videos. etc. The amount of information within the world doubles every 2 years. Self-driving cars alone generates two Petabyte of data per annum. [5] Whether the information is considered as huge information or not, relies upon volume of the information. It would be tough for us to manage such Brobdingnagian amounts of data in the past however with the current decrease in storage prices, superior storage solutions like Hadoop and Scala it is not an issue. 3. Variety It refers to the varied formats within which data is formed. In the past all the information that we possessed was structured data which was neatly fitted in columns and rows however today, ninety percent of the information that's generated an the organisation is unstructured data. In general, the various forms of data/information are:
2.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 786 1. Structured data: Relational data 2. Semi-structured data: XML, E-mail data 3. Unstructured data: Word, PDF, Text, Media Logs. III. HADOOP Hadoop is an open source, distributed framework developed and maintained by the Apache Foundation which is written in java. Hadoop’s [6] MapReduce processes the information and HDFS stores the giant datasets in a cluster. It is employed to handle giant and sophisticated data which can be structured, unstructured or semi-structured that doesn't match into tables. HDFS (Hadoop Distributed File System)-used for storage. HDFS supports a conventional hierarchal file organization. A user or an application will produce directories and store files within these directories. The filing system namespace hierarchy is analogous to most different existing file systems [7]. HDFS has the many nodes namely: Name node, Secondary name node, Data node and the Map Reduce which is used for processing has Job Tracker and Task Tracker. a) PIG Apache Pig is a platform for analysing giant data sets that consists of a high-level language for expressing data analysis programs, including infrastructure for evaluating these programs. [8] It is designed to reduce the complexities to code a map-reduce application. 10 lines of code in Pig is equivalent to 100 lines of code in Java, It is used to load the data, apply various filters on it and finally dump the data in a format as per users requirement. Moreover, to analyse data using Pig, users need to write scripts using Pig Latin language. It is a Procedural Data Flow Language used by Researchers and Programmers but does not have a dedicated metadata database. b) Hive After gathering the data into HDFS they are then analysed using Hive queries. Hive [9] data warehouse software system which facilitates querying and managing massive datasets located in distributed storage. Initially it was developed by Facebook later the Apache Foundation took it and developed it further as open source. It is used by many companies, one such example is Amazon where hive is used in their Elastic MapReduce. To make it clear Hive is not a relational database, it is designed for Online Transaction Processing. It stores schema in the database and processed data into HDFS also it provides SQL type language called HiveQL. IV. PIG ARCHITECTURE AND COMPONENTS Pig architecture consists of Pig Latin Interpreter and it'll be executed on consumer/client Machine. It uses Pig Latin scripts and it converts the script into a series of map-reduce jobs[10]. Later it executes the map-reduce jobs and saves the output result into HDFS. In between, it performs completely different operations like Parse, Compile, Optimize and plan the Execution of the data. Apache Pig has a component referred to Pig Engine which takes the Pig Latin Scripts as an input and converts those scripts into Map-reduce Jobs. The various Apache Pig Components are: Fig. 2 Architecture of Pig - Parser It is used for checking the syntax of the script, type verifying and other various checks. The output of this component will be a directed acyclic graph (DAG), which represents the PL Statements and operators. The DAG and the operators if the script is represented as nodes, data flow is represented as edges. - Optimizer The DAG generated from the parser is then passed to the optimizer which performs various logical optimizations such as projections. - Compiler In this the compiler compiles the previously optimized logical plan into a series of Map-reduce Jobs for execution. - Execution engine Finally, these Jobs are then submitted Hadoop in a order and are executed on Hadoop to produce the desired results.
3.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 787 Pig Execution Modes Pig will run in 2 execution modes. The modes depend on where the Pig script goes to run and where the info is residing. The data may be kept on a single machine, i.e. Local File System or it may be kept in a distributed system like typical Hadoop Cluster environment. The execution modes of Pig are: - Local Mode Here the files are present and processed from local host and local file system. There is no need for Hadoop or HDFS. This mode is usually used for testing purpose. Once executed it provides output on top of LFS. Pig runs in a single JVM and accesses the native/local file system. - MapReduce mode In this mode of execution, we work with data that is present in the HDFS using Pig. Here we use Pig Latin scripts to process the data and a Map-reduce job is invoked in the back-end. V. PIG LATIN It is a programming language which is used to create programs that run on Hadoop Pig Latin data model is fully nested kind and supports advanced non-atomic data varieties like maps and tuple. Below is a Pig Latin data model. Fig. 3 Pig Latin Data model 1. Atom It is a single value any data type in Pig Latin. It is in the form of string and can be used as a string or a number. example: 'Raju' or 39 2. Tuple It is a record consisting of an ordered set of fields and these fields can be of any type. It is similar to that of a row in RDBMS example: (Chandra, 20) 3. Bag It is a collection of unordered tuples In a bag each tuple could have any number of fields It is represented by { } example: {(Anju, 20),(Manu, 22)} 4. Map It is a set of key-value pairs where the key has to be unique and the value could be of any type. It is denoted by [ ] example: [name#John, age#22] To start MapReduce mode of execution: pig -x MapReduce To start local mode of execution: pig -x local or pig
4.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 788 Data types Fig. 4 Data Types of Pig Latin VI. PIG COMMANDS 1. Loading the data from LFS/HDFS to Pig To put a file into Hadoop hadoop fs -put file /user/cloudera To switch to local execution mode in PIG pig -x local To load the data file into PIG bag2 = load '/home/cloudera/file' using PigStorage(',') as (id:int,name:chararray); To see the content of the bag dump bag2; 2. Utility commands Clear is used to clear the screen of the Grunt shell. grunt> clear Help is used to view the list of pig commands and properties grunt> help Dump is used to view the results on the screen grunt> dump rel_name Quit command is to quit the Grunt shell grunt> quit Describe is used to view the schema grunt> describe rel_name 3. Filtering There are three operators used for filtering they are: - FILTER Operator - FOREACH Operator - Distinct Operator Filter is used to select specific tuples from a relation based on requirement. It is similar to where statement in MySQL. Syntax grunt> Rel_name = FILTER Rel2_name BY (condition); example - filters the emp whose experience is more than 10 years a = filter file by exp>10; dump a; Syntax grunt> Rel_name = FOREACH Rel2_name GENERATE (required data); example - generate each emp name and emp office grunt> a = foreach file generate name,office; grunt> dump a; For each is used to generate data transformation based on columns. It is similar to select statement in MySQL.
5.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 789 VII. PROPOSED METHODOLOGY Steps to be followed are: 1. Collect the crime data of Boston from various trusted web sources. 2. After collection of the data sets load the crime data of Boston using Hadoop command line. 3. Store the data in Hadoop Distributed File System as It is appropriate for those applications which have massive data sets. 4. The crime data of Boston is then processed by the Map- reduce which is the processing engine of Hadoop. 5. Analyse the data with the help of other Big Data tools which works on top of Hadoop and process the data Fig. 5 Methodology for analysing the data VIII. RESEARCH QUESTIONS 1. Find the top 3 reporting areas in Boston. 2. Find the top 5 STREETS safest streets in-terms of number of crimes registered. 3. Find the top 3 most active crime hours of the day 4. Find the number of crimes happened on Saturday and Sunday of the week. 5. Count the top 3 offence code groups in Boston. 6. Find the year which had the highest number of crimes in the past 3 years IX. EXPERIMENTAL FINDINGS 1. Find the top 3 STREETS safest streets in-terms of number of crimes registered. a = group crime by REPORTING_AREA; b = foreach a generate group, COUNT_STAR (crime.REPORTING_AREA) as no; c = order b by no desc; d = limit c 3; Fig. 6 Output and explanation of Query 1 2. Find the top 5 STREETS safest streets in-terms of number of crimes registered. a = group crime by STREET; b = foreach a generate group, COUNT_STAR (crime.STREET) as num; c = order b by num asc; d = limit c 5;
6.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 790 Fig. 7 Output and explanation of Query 2 3. Find the top 3 most active crime hours of the day e= foreach crime generate SUBSTRING (OCCURRED_ON_DATE,9,11); f = group e by times; g = foreach f generate group as time,COUNT_STAR (e.times) as no; res = limit (order g by no desc ) 3; Fig. 8 Output and explanation of Query 3 4. Find the number of crimes happened on Saturday and Sunday of the week. a = group crime by DAY_OF_WEEK; b = foreach a generate group,COUNT_STAR (crime.INCIDENT_NUMBER) as no; c = filter b by group=='Saturday' or group=='Sunday'; Fig. 9 Output and explanation of Query 4 5. Count the top 3 offence code groups in Boston. a = group crime by OFFENSE_CODE_GROUP; b = foreach a generate group,COUNT_STAR (crime.OFFENSE_CODE_GROUP) as no; c = limit (order b by no desc) 5;
7.
International Research Journal
of Engineering and Technology (IRJET) e-ISSN: 2395-0056 Volume: 06 Issue: 10 | Oct 2019 www.irjet.net p-ISSN: 2395-0072 © 2019, IRJET | Impact Factor value: 7.34 | ISO 9001:2008 Certified Journal | Page 791 Fig. 10 Output and explanation of Query 5 6. Find the year which had the highest number of crimes in the past 3 years a = group crime by YEAR; b = foreach a generate group,COUNT_STAR (crime.INCIDENT_NUMBER) as no; c = filter b by group=='2018' or group=='2017' or group =='2016'; d = limit (order c by no desc) 1; Fig. 11 Output and explanation of Query 6 X. CONCLUSION Hadoop MapReduce is currently preferred for large-scale data analytics. Big-data analytics with pig and hive shows the important problems faced by customers and helps the companies to rectify these issues. This paper discusses the concepts of Hadoop and provides explanation for basic commands to analysing the crime data of Boston. The insights gathered from this study could be used to improve the current situation of Boston with regards to the crimes. REFERENCES [1] Peter Charles, Nathan Good, Laheem Lamar Jordan, how- much-info-2003 [2] Xu R, Wunsch D. Clustering. Hoboken: Wiley-IEEE Press; 2009. [3] How much data does the world generate every minute” Available [https://www.iflscience.com/technology/how-much- data-does-the-world-generate-every-minute/] [4] Mark van Rijmenam “ Self driving cars create 2 petabytes data” from [datafloq.com](http://datafloq.com/) [5] [http://hadoop.apache.org/](http://hadoop.apache.org /) [6] [https://hadoop.apache.org/docs/r1.2.1/hdfs_design.ht ml](https://hadoop.apache.org/docs/r1.2.1/hdfs_desig n.html) [7] [https://pig.apache.org/](https://pig.apache.org/) [8] [https://hive.apache.org/](https://hive.apache.org/) [9] http://a4academics.com/tutorials/83-hadoop/837- hadoop-pig
Télécharger maintenant