BIG DATA PROJECT – ROLLER COASTER TYCOON DATA ANALYSIS
Priyansh Gupta – 19CSU225
Radhika Mongia – 19CSU233
Rashi Gupta – 19CSU241
OBJECTIVE
The objective of the project is to analyze the Roller Coaster dataset
and answer a set of queries using Hive.
DATA FORMAT: ALL THE FIELDS ARE COMMA-DELIMITED
1. park_id int,
2. theme string,
3. rollercoaster_type string,
4. custom_design int,
5. excitement double,
6. excitement_rating string,
7. intensity double,
8. intensity_rating string,
9. nausea double,
10. nausea_rating string,
11. max_speed double,
12. avg_speed double,
13. ride_time int,
14. ride_length int,
15. max_pos_gs double,
16. max_neg_gs double,
17. max_lateral_gs double,
18. total_air_time double,
19. drops int,
20. highest_drop_height int,
21. inversions int
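The field list above maps directly onto a Hive table definition. The statement below is a minimal sketch rather than the authors' actual DDL, which the slides do not show. It assumes a plain comma-delimited text file with no quoted fields, and it assumes the CSV begins with a header row; if it does not, the tblproperties clause should be dropped.

```sql
-- Sketch of a table matching the 21 fields listed above.
-- Assumption: comma-delimited text, no quoted or escaped commas.
-- Assumption: the first line of the CSV is a header, skipped via tblproperties.
create table if not exists rollercoaster (
  park_id              int,
  theme                string,
  rollercoaster_type   string,
  custom_design        int,
  excitement           double,
  excitement_rating    string,
  intensity            double,
  intensity_rating     string,
  nausea               double,
  nausea_rating        string,
  max_speed            double,
  avg_speed            double,
  ride_time            int,
  ride_length          int,
  max_pos_gs           double,
  max_neg_gs           double,
  max_lateral_gs       double,
  total_air_time       double,
  drops                int,
  highest_drop_height  int,
  inversions           int
)
row format delimited
fields terminated by ','
stored as textfile
tblproperties ('skip.header.line.count' = '1');
```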
DATA SET
● Load the data into the Hive table from the local file system
load data local inpath '/home/cloudera/Desktop/rollercoasters.csv'
overwrite into table rollercoaster;
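After loading, a quick sanity check can confirm that the columns were parsed as expected. The statements below are only a suggested check, not part of the original slides.

```sql
-- Show the column names and types Hive has registered for the table.
describe rollercoaster;

-- Count the loaded rows and look at a small sample.
select count(*) from rollercoaster;
select * from rollercoaster limit 5;
```

If numeric columns come back as NULL in the sample, the delimiter or a header row is the usual suspect.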
Question 1
● Count the roller coaster entries for each combination of excitement rating and nausea rating, and also print the theme name
select theme, excitement_rating, nausea_rating, count(rollercoaster_type)
from rollercoaster
group by excitement_rating, nausea_rating, theme;
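Note that count(rollercoaster_type) counts every row with a non-null type, so the same type is counted once per ride. If the question is read as asking for the number of different roller coaster types in each group, a variant along these lines may be closer (the alias distinct_types is introduced here for illustration):

```sql
-- Variant: count each roller coaster type only once per group.
select theme,
       excitement_rating,
       nausea_rating,
       count(distinct rollercoaster_type) as distinct_types
from rollercoaster
group by theme, excitement_rating, nausea_rating;
```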
Question 2
● Count the roller coasters, grouped by excitement level and drop height,
a) where the excitement level is the highest ('Very High') and highest_drop_height > 50
b) where the excitement level is 'High' and highest_drop_height > 50, also printing the park_id.
a) select excitement_rating, highest_drop_height, count(rollercoaster_type)
from rollercoaster
where excitement_rating = 'Very High' and highest_drop_height > 50
group by excitement_rating, highest_drop_height;
b) select park_id, excitement_rating, highest_drop_height, count(rollercoaster_type)
from rollercoaster
where excitement_rating = 'High' and highest_drop_height > 50
group by park_id, excitement_rating, highest_drop_height;
Question 3
a) Find the rollercoaster_type, excitement level, intensity level and nausea level of the row where total_air_time is maximum, and
b) Find the total_air_time of the rows whose excitement level, intensity level and nausea level match those of the row where total_air_time is maximum.
a) select distinct rollercoaster_type, excitement_rating, intensity_rating, nausea_rating
from rollercoaster as r1
where r1.total_air_time in (select max(r2.total_air_time) from rollercoaster as r2);
b) select r1.total_air_time
from rollercoaster as r1
inner join (select distinct excitement_rating, intensity_rating, nausea_rating
from rollercoaster as rc
where rc.total_air_time in (select max(total_air_time) from rollercoaster)) as t
on t.excitement_rating = r1.excitement_rating
and t.intensity_rating = r1.intensity_rating
and t.nausea_rating = r1.nausea_rating;
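The queries for this question rely on an IN subquery in the WHERE clause, which older Hive releases do not accept (to the best of our knowledge, support arrived in Hive 0.13). As a hedged alternative, part (a) can be written as an equi-join against a one-row derived table holding the maximum. The aliases r, m and max_air are introduced here for illustration.

```sql
-- Alternative sketch for part (a): join each row against the single
-- maximum total_air_time instead of using an IN subquery.
select distinct r.rollercoaster_type,
       r.excitement_rating,
       r.intensity_rating,
       r.nausea_rating
from rollercoaster r
join (select max(total_air_time) as max_air from rollercoaster) m
  on r.total_air_time = m.max_air;
```

The same pattern should carry over to the avg_speed queries in Questions 4 and 9.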
Question 4
a) Find the rollercoaster_type, excitement level, intensity level and nausea level of the row where avg_speed is maximum, and
b) Compare the max_speed of the rows whose excitement level, intensity level and nausea level match those of the row where avg_speed is maximum.
a) select rollercoaster_type, excitement_rating, intensity_rating, nausea_rating
from rollercoaster r1
where r1.avg_speed in (select max(r2.avg_speed) from rollercoaster r2);
b) select r3.max_speed
from rollercoaster as r3
inner join (select distinct excitement_rating, intensity_rating, nausea_rating
from rollercoaster r1
where r1.avg_speed in (select max(r2.avg_speed) from rollercoaster r2)) as t
on r3.intensity_rating = t.intensity_rating
and r3.excitement_rating = t.excitement_rating
and r3.nausea_rating = t.nausea_rating;
Question 5
● Find the park_id and rollercoaster_type where the number of drops is greater than 10, grouping together entries that have the same excitement level.
select x.park_id, x.rollercoaster_type
from (select park_id, rollercoaster_type, excitement_rating
from rollercoaster
where drops > 10
group by park_id, rollercoaster_type, excitement_rating) as x;
Question 6
● Group rollercoaster_type by custom_design where the excitement level is 'High'.
select custom_design, rollercoaster_type
from rollercoaster
where excitement_rating = 'High'
group by custom_design, rollercoaster_type;
Question 7
● If ride_length is greater than 2000 and max_speed is greater than 50, what are the levels of excitement and nausea?
select distinct excitement_rating, nausea_rating
from rollercoaster
where ride_length > 2000 and max_speed > 50;
Question 8
● Park names (theme) where at least 2 rides have a 'High' excitement level.
select x.theme
from (select theme, count(excitement_rating)
from rollercoaster
where excitement_rating = 'High'
group by theme
having count(excitement_rating) >= 2) as x;
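The outer select here only strips the count column, so the derived table is not strictly needed. A shorter form that should return the same themes, offered as a sketch rather than a replacement for the authors' query:

```sql
-- Simplified sketch: filter, group and apply the threshold in one query block.
select theme
from rollercoaster
where excitement_rating = 'High'
group by theme
having count(*) >= 2;
```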
Question 9
● Which roller coaster ride has both the highest excitement level and the highest avg_speed?
select rollercoaster_type
from rollercoaster r1
where excitement_rating = 'Very High'
and r1.avg_speed in (select max(r2.avg_speed) from rollercoaster as r2);
Question 10
● Names of the roller coasters where total_air_time is greater than 5 but the excitement level is still not 'Very High'.
select rollercoaster_type
from rollercoaster
where total_air_time > 5 and excitement_rating <> 'Very High';
Question 11
● If ride_length is greater than 2000, find the avg_speed and excitement level, grouping the excitement level by avg_speed where avg_speed > 10.
select excitement_rating, avg_speed
from rollercoaster
where ride_length > 2000 and avg_speed > 10
group by excitement_rating, avg_speed;
Question 12
● When max_pos_gs > 3 and max_neg_gs > -2, find the names of the roller coasters whose intensity is greater than their excitement.
select rollercoaster_type
from rollercoaster
where max_pos_gs > 3 and max_neg_gs > -2 and intensity > excitement;
Question 13
● When max_pos_gs >= 3 and max_neg_gs >= -2, count the roller coasters, grouped by intensity level,
a) comparing the intensity level with the excitement level (the query counts the rows where intensity is greater than excitement), and
b) find the same when the conditions max_pos_gs >= 4 and max_neg_gs >= 1 are not true.
a) select count(distinct rollercoaster_type), intensity_rating
from rollercoaster
where max_pos_gs >= 3 and max_neg_gs >= -2 and intensity > excitement
group by intensity_rating;
b) select count(distinct rollercoaster_type), intensity_rating
from rollercoaster
where max_pos_gs < 4 and max_neg_gs < 1 and intensity > excitement
group by intensity_rating;
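Query (b) above negates each condition separately, so it keeps only the rows where both conditions fail. If the question is instead read as "the combined condition is not true", De Morgan's law gives NOT (A AND B) = (NOT A) OR (NOT B), which also admits rows where only one of the two fails. A sketch under that reading (the alias coaster_types is introduced here for illustration):

```sql
-- Alternative reading of part (b): negate the combined condition.
-- not (a and b) matches rows where max_pos_gs < 4 or max_neg_gs < 1.
select intensity_rating,
       count(distinct rollercoaster_type) as coaster_types
from rollercoaster
where not (max_pos_gs >= 4 and max_neg_gs >= 1)
  and intensity > excitement
group by intensity_rating;
```

Which reading is intended depends on the original assignment wording, which the slides do not settle.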
Question 14
● When the nausea level is 'Low', what values does the excitement level take?
select distinct excitement_rating
from rollercoaster
where nausea_rating = 'Low';
Question 15
● Group rollercoaster_type by custom_design where the intensity level is 'Very High' and ride_length is greater than 2000.
select custom_design, rollercoaster_type
from rollercoaster
where intensity_rating = 'Very High' and ride_length > 2000
group by custom_design, rollercoaster_type;
THANK YOU
