Yelp Dataset Challenge:
Business Analysis
Based on Location and Category
GROUP - I :
KEYUR MANDANI
MIKAELIAN OVANES
HEMANTH REDDY
Table of contents
• Introduction
• Cluster Configuration
• Agenda
• Flowchart
• Specifications
• Implementation
• Visualization
• GitHub
• References
What is Yelp?
--Yelp is a user-driven Web 2.0 service that offers honest,
current insights on local businesses.
--Yelp allows users from anywhere in the world to rate
and review any business.
--Yelp's revenue comes from selling ads and sponsored listings
to small businesses.
--A Harvard Business School study published in 2011 found that
each additional star in a Yelp rating raised a business's revenue
by 5-9 percent.
Microsoft Azure HDInsight Cluster
Configuration
• Operating System: Linux
• Nodes: 4
• Worker Nodes: 4 nodes, 16 cores, 14 GB RAM, 200 GB SSD
• Head Nodes: 2 nodes, 8 cores, 14 GB RAM, 200 GB SSD
Tools Used
• Microsoft Azure HDInsight cluster: Hadoop environment
• Power BI for data visualization
• Amazon AWS S3: store data online and fetch it into HDFS
• Jsonprettyprinter: format unstructured data into structured data
• Mapping tools at Batchgeo.com
Agenda
Analyze the Yelp Academic Dataset from
various business perspectives, including
business location, category, time of year,
user ratings and user reviews.
Dataset Details
Data source: Yelp Academic Dataset
Data size: 1.98 GB
File format: JSON
Number of files: 3
PROCESS FLOW
• Downloaded data from the Yelp website
• Uploaded files to the HDInsight cluster using SSH
• Converted JSON files to .CSV files using Serialization/Deserialization (SerDe)
• Used HiveQL to retrieve data and create tables
• Exported data to Excel
• Dashboard: data visualization
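The JSON-to-CSV conversion in the flow can also be sketched locally in Python. This is not the SerDe-based method the deck describes. It is a minimal standalone illustration, and it assumes each line of a dataset file holds one JSON object. The function name `json_lines_to_csv`, the sample records and the column names are hypothetical.

```python
import csv
import io
import json


def json_lines_to_csv(lines, columns):
    """Convert an iterable of JSON-object strings into CSV text.

    Assumption: one JSON object per line. Keys not listed in `columns`
    are dropped; missing keys become empty cells.
    """
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        writer.writerow({col: record.get(col, "") for col in columns})
    return out.getvalue()


# Hypothetical sample records, shaped like flat business entries.
sample = [
    '{"business_id": "b1", "city": "Las Vegas", "state": "NV", "stars": 4.5, "review_count": 120}',
    '{"business_id": "b2", "city": "Phoenix", "state": "AZ", "stars": 3.0, "review_count": 15}',
]
csv_text = json_lines_to_csv(
    sample, ["business_id", "city", "state", "stars", "review_count"]
)
print(csv_text)
```

Nested fields such as category lists would need flattening before they fit into CSV columns; the sketch only handles flat keys.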
Raw JSON Data
Upload JSON Files to HDInsight Cluster Using SSH
Download file: wget -O 'FileDestination/Filename' 'URL'
Move file to HDFS: hdfs dfs -put filename 'File Destination Path'
Downloading the JSON SerDe File for Hive
Create Table with SerDe (JsonSerde)
NOTE: When creating a table with Hive-JsonSerde,
the class path for the SerDe needs to be specified
with the table.
Query To Display Review Count on Specific Time of Year
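The deck ran this query in HiveQL, and the query text itself is not reproduced in the slides. As a rough Python equivalent of counting reviews by time of year, the sketch below groups reviews by calendar month. It assumes each review carries an ISO-formatted date string (YYYY-MM-DD); `review_count_by_month` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date


def review_count_by_month(review_dates):
    """Count reviews per calendar month (1-12) across all years."""
    counts = Counter(date.fromisoformat(d).month for d in review_dates)
    # Return every month so that months without reviews show up as 0.
    return {month: counts.get(month, 0) for month in range(1, 13)}


# Hypothetical review dates.
sample_dates = ["2014-01-15", "2014-01-30", "2014-07-04", "2015-12-25"]
monthly = review_count_by_month(sample_dates)
print(monthly)
```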
Average Rating and Average Review
Total Reviews by Business Category in Selected States
Average Rating by Business Category in US
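One way to compute an average rating per category is sketched below in Python. It is an illustration, not the HiveQL the deck used. It assumes each business record has a `stars` value and a `categories` list; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def average_rating_by_category(businesses):
    """Return {category: mean star rating} over all businesses in it."""
    totals = defaultdict(lambda: [0.0, 0])  # category -> [sum of stars, count]
    for business in businesses:
        for category in business.get("categories", []):
            totals[category][0] += business["stars"]
            totals[category][1] += 1
    return {cat: total / count for cat, (total, count) in totals.items()}


# Hypothetical business records.
sample_businesses = [
    {"categories": ["Restaurants", "Bars"], "stars": 4.0},
    {"categories": ["Restaurants"], "stars": 3.0},
    {"categories": ["Bars"], "stars": 5.0},
]
averages = average_rating_by_category(sample_businesses)
print(averages)
```

A business listed under several categories contributes its rating to each of them, so the per-category averages are not independent.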
Average Rating for Businesses in Arizona
Total Number of Reviews for Businesses in Arizona
Businesses in Las Vegas Mapped by Longitude and Latitude
Using batchgeo.com
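Before mapping, the businesses have to be narrowed down to the area of interest. The Python sketch below selects records inside a latitude/longitude bounding box. The box limits are rough illustrative values around Las Vegas, not figures from the deck, and the field names `latitude` and `longitude`, the function name and the sample records are assumptions.

```python
def businesses_in_box(businesses, lat_min, lat_max, lon_min, lon_max):
    """Keep businesses whose coordinates fall inside the bounding box."""
    return [
        b
        for b in businesses
        if lat_min <= b["latitude"] <= lat_max
        and lon_min <= b["longitude"] <= lon_max
    ]


# Hypothetical records; coordinates are approximate city centres.
sample_points = [
    {"name": "b1", "latitude": 36.17, "longitude": -115.14},  # Las Vegas
    {"name": "b2", "latitude": 33.45, "longitude": -112.07},  # Phoenix
]
# Illustrative bounding box around Las Vegas (assumed values).
in_vegas = businesses_in_box(sample_points, 35.9, 36.4, -115.4, -114.9)
print([b["name"] for b in in_vegas])
```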
Project Scope
Natural Language Processing:
Based on the positive and negative words in a user's
review, we can predict the rating that user will give.
Bluemix’s Natural Language Classifier can be used for this.
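To make the idea concrete, here is a toy Python sketch that maps positive and negative word counts to a 1-5 star guess. It does not use Bluemix's Natural Language Classifier or any trained model. The word lists, the scoring formula and the function name `predict_stars` are all illustrative assumptions.

```python
import re

# Tiny illustrative lexicons (assumed, not taken from the deck).
POSITIVE = {"great", "good", "excellent", "friendly", "delicious", "love"}
NEGATIVE = {"bad", "terrible", "rude", "slow", "awful", "dirty"}


def predict_stars(review_text):
    """Guess a 1-5 star rating from positive and negative word counts."""
    words = re.findall(r"[a-z']+", review_text.lower())
    pos = sum(word in POSITIVE for word in words)
    neg = sum(word in NEGATIVE for word in words)
    if pos + neg == 0:
        return 3  # no sentiment words found: fall back to a neutral rating
    score = (pos - neg) / (pos + neg)  # ranges from -1.0 to 1.0
    return round(3 + 2 * score)  # map the score onto 1..5 stars


print(predict_stars("Great food and friendly staff"))
print(predict_stars("Terrible service, rude and slow"))
```

A real classifier would be trained on labelled reviews and would handle negation ("not good"), which this word-count approach ignores.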
References
• GitHub Repository Link: https://github.com/Keyur-Mandani/CIS520-01-G-I.git
• SlideShare Link:
• Dataset: https://www.yelp.com/dataset_challenge/dataset
• SerDe Source: http://code.google.com/p/archive/hive-json-serde-0.2.jar
References from Class Lab Work
• Azure HDInsight Hadoop Linux Cluster Getting Started Article
• www.tutorialpoints.com/hive
