introduction to data science

bhavesh lande, Data Science Engineer at home
Introduction to Data Science
Week 1
www.swaraadyasolutions.co.in
Agenda
• Defining Data Science
• What Does a Data Science Professional Do?
• Data Science in Business
• Use Cases for Data Science
• Installation of R and RStudio
Defining Data Science
• Data Science deals with the science and algorithms
related to data.
• Data is generated from a wide variety of sources.
• One report says, “Every day, approximately 2 quintillion bytes
of data is generated. If it grows at this pace, then by the
next 3 years, it is expected that 2MB of data will be
created every second for every individual on this planet.”
• The last 2 years have witnessed the creation of 90% of the
world's data.
• Data comes from two kinds of sources:
• Structured
• Unstructured
• Structured sources hold information that fits a relational
database.
• E.g. ATM transactions and flight tickets, which SQL can query
and update.
• Unstructured data comes from tweets and comments on social
media and from audio and video files, which SQL cannot process
directly.
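The difference between the two kinds of sources can be sketched in a few lines of Python. This is only an illustration: the table name, the account numbers, the amounts and the sample tweet are all made up. It uses the standard-library sqlite3 module as a stand-in for a relational database.

```python
import sqlite3

# Hypothetical ATM transactions: structured data with fixed columns and types.
transactions = [
    (1, "ACC-001", 200.0),
    (2, "ACC-002", 50.0),
    (3, "ACC-001", 120.0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE atm_transactions (id INTEGER PRIMARY KEY, account TEXT, amount REAL)"
)
conn.executemany("INSERT INTO atm_transactions VALUES (?, ?, ?)", transactions)

# Structured data can be aggregated directly with SQL.
totals = dict(
    conn.execute(
        "SELECT account, SUM(amount) FROM atm_transactions GROUP BY account"
    ).fetchall()
)
conn.close()

# Unstructured text has no columns to query; it must be processed first.
tweet = "Loving the new electric bike! #commute"
hashtags = [word for word in tweet.split() if word.startswith("#")]

print(totals)
print(hashtags)
```

The SQL query works because every row has the same shape. The tweet needs custom text processing before anything can be counted or stored in a table.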
Definition
“Data Science is a broad field which is an assembly of scientific techniques,
methods and processes used to clean the data and then extract some useful
patterns and insights in the form of visualizations.”
• Visualizations are crucial for making important business decisions and coming up
with strategies that are instrumental to an organization’s well-being.
History
In 1997, C. F. Jeff Wu at the University of Michigan stated that the concepts below
should be studied under the phrase Data Science.
• Data Collection
• Data Modeling
• Data Analysis
Role of Data Science on Statistics
• Statistics
• Mathematics
• Computer Science
• Data Analysis
• Critical Thinking
• Problem Solving
• Machine Learning
• Data Visualization
Data Science??
In 2012, Harvard Business Review called data scientist “the sexiest job of the
21st century”.
Statistics
• Statistics is the branch of mathematics that deals with data collection,
categorization, interpretation and presentation.
• These techniques help with processing and analyzing data at a
large scale.
Statistics Techniques To Deal with Data
• Data Collection
– Collecting relevant data/information
– Primary data includes surveys, observations and experiments.
– Secondary data includes internal records and government-published data.
• Data Categorization and Classification
– Data is organized to get some insights.
For example, we have data on the heights of 10 people:
160cm, 165cm, 155cm, 190cm, 177cm, 181cm, 179cm, 185cm, 159cm, 173cm
This data in an ordered array will look like:
155cm, 159cm, 160cm, 165cm, 173cm, 177cm, 179cm, 181cm, 185cm, 190cm
The ordered array shows that 155cm is the shortest height while 190cm is the tallest.
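The ordering step can be reproduced with a short Python sketch. The heights are the ten values from the slide; the variable names are chosen here for illustration.

```python
# Heights of the 10 people from the example, in centimetres.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]

# Sorting turns the raw observations into an ordered array.
ordered = sorted(heights_cm)

# In an ordered array the extremes sit at the two ends.
shortest, tallest = ordered[0], ordered[-1]

print(ordered)
print(f"shortest: {shortest}cm, tallest: {tallest}cm")
```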
Statistics Techniques To Deal with Data
• Data Classification
– Assembly of relevant facts/data into different categories/groups according to their features.
– Factors are:
• Geographical
• Chronological (on the basis of time)
• Qualitative
• Quantitative
• Data Presentation
– Includes frequency distributions shown as histograms.
– For example, assume you are looking for prospective clients for your new
product, which is an electric bike.
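A frequency distribution can be sketched in Python as follows. The electric bike example gives no data, so this sketch reuses the ten heights from the earlier slide instead. The bin width of 10 cm is an assumption made here, and the text bars are only a stand-in for a real histogram.

```python
from collections import Counter

# Heights from the earlier example, in centimetres.
heights_cm = [160, 165, 155, 190, 177, 181, 179, 185, 159, 173]

BIN_WIDTH = 10  # assumed bin width, not taken from the slides


def bin_start(value, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return (value // width) * width


# Count how many observations fall into each bin.
frequency = Counter(bin_start(h) for h in heights_cm)

# Print a simple text histogram, one row per bin.
for start in sorted(frequency):
    count = frequency[start]
    print(f"{start}-{start + BIN_WIDTH - 1} cm | {'#' * count} ({count})")
```

The same counting approach would apply to any attribute of prospective clients, such as age groups, once that data is available.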
Applications
• Data Science has many real-world applications.
• Recommender Systems
– Content based – keeps track of each user's watching habits.
– Collaborative based – recognizes users with similar tastes.
• Voice and Image Recognition
• Spam and Fraud Detection
• Many more…
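The collaborative idea of recognizing users with similar tastes can be sketched in Python. The user names, titles and ratings below are invented, and cosine similarity is just one common way to compare users; the slides do not prescribe a method.

```python
from math import sqrt

# Hypothetical ratings: user -> {title: rating on a 1-5 scale}.
ratings = {
    "asha": {"Film A": 5, "Film B": 4, "Film C": 1},
    "ben": {"Film A": 4, "Film B": 5, "Film D": 4},
    "chloe": {"Film A": 1, "Film C": 5, "Film E": 4},
}


def cosine_similarity(a, b):
    """Cosine similarity of two rating dicts; unrated titles count as 0."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[title] * b[title] for title in shared)
    norm_a = sqrt(sum(value * value for value in a.values()))
    norm_b = sqrt(sum(value * value for value in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user):
    """Find the most similar other user and suggest titles the user has not rated."""
    others = [name for name in ratings if name != user]
    nearest = max(
        others, key=lambda name: cosine_similarity(ratings[user], ratings[name])
    )
    unseen = set(ratings[nearest]) - set(ratings[user])
    return nearest, sorted(unseen)


print(recommend("asha"))
```

A content-based system would instead compare the features of the titles a user has already watched, without looking at other users.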
Data Scientists and Their Role
• Data Scientist is a Rockstar!!!
• A Data Scientist is an individual who has the power and freedom to
experiment with tons of different kinds of data.
• Based on knowledge of:
– Mathematics
– Problem solving
– Critical thinking
– Careful analysis
• Anyone who wants to carry this “tag” should be well-versed in a lot
of concepts.
Some of them are:
• Mathematics
• Statistics
• Problem-solving
• Data wrangling or data munging
• Coding prowess in both R and Python
• SQL
• Hadoop
• Machine learning and AI
• Data visualization
• Communication skills
Data Analyst v/s Data Scientist
• A Data Analyst mostly converts data into a structured
format so that it can be processed further.
• The focus is more on Data Mining and Data Auditing.
• Data mining involves retrieving information from large databases with the help of SQL to
extract new data/information.
• Data auditing involves checking the nature and quality of the data and figuring out whether
it is good enough to yield useful insights.
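A simple data audit can be sketched in Python. The records and field names are hypothetical, and the two checks shown (missing values and exact duplicate rows) are only examples of what an audit might cover.

```python
# Hypothetical customer records; None and "" stand for missing values.
records = [
    {"id": 1, "age": 34, "city": "Pune"},
    {"id": 2, "age": None, "city": "Mumbai"},
    {"id": 3, "age": 29, "city": ""},
    {"id": 2, "age": None, "city": "Mumbai"},  # exact duplicate of the second row
]


def audit(rows):
    """Count missing values per field and exact duplicate rows."""
    missing = {}
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                missing[field] = missing.get(field, 0) + 1

    seen = set()
    duplicates = 0
    for row in rows:
        # Build a hashable key from the row's values in a fixed field order.
        key = tuple(row[field] for field in sorted(row))
        if key in seen:
            duplicates += 1
        seen.add(key)

    return {"rows": len(rows), "missing": missing, "duplicates": duplicates}


report = audit(records)
print(report)
```

A report like this helps decide whether the data is usable as it stands or needs cleaning first.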
Data Analyst v/s Data Scientist
• A Data Scientist takes the clean data and tries to gain meaningful
insights from it.
• A classification or regression algorithm is used to build a model
that is robust enough to yield business insights, with the help of
visualization tools.
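Building a regression model from clean data can be sketched in plain Python. The advertising spend and sales figures are invented, and ordinary least squares with a single input is only one simple choice of regression algorithm.

```python
# Hypothetical clean data: advertising spend (x) and units sold (y).
spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [3.0, 5.0, 7.0, 9.0, 11.0]


def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Use the fitted line to estimate y for a new x."""
    return slope * x + intercept


slope, intercept = fit_line(spend, sales)
print(f"sales = {slope:.2f} * spend + {intercept:.2f}")
print(predict(6.0, slope, intercept))
```

In practice the fitted line and the data points would be plotted together, which is where the visualization tools come in.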
Are There Enough Skilled Data Scientists In The Industry?
• According to a survey conducted by IBM, the demand for data
scientists will soar by 28% by 2020.
• That includes all jobs which require machine learning, big data,
expertise in visualization tools like Tableau and Power BI, and
knowledge of data analysis.
• This demand is spread across industries looking for such professionals,
such as the finance, insurance, professional services, and IT sectors.
A candidate who is always thirsty for new challenges and loves problem-solving
of any kind is capable of becoming a skilled data scientist.
Such a candidate likes observing and defining a problem from different angles and
perspectives.
Coding is a daily hustle that this candidate loves, not because the problem demands
it, but because it is so interesting to come up with new findings
and insights and then make a cute little story out of them!
Data Science Effects
How Can Data Science Help A Business/Company Grow?
• Data Science has been alive in the IT industry for a long time.
• The sudden increase in the amount of data pushed companies to adopt it, slowly and steadily, as the norm.
• There are numerous areas in which this emerging discipline can help an organization grow and achieve
new heights:
• Business logistics, including supply chain optimization
• Finance
• Health and wellness
• Education and electronic teaching
• Climate and energy
Popular Data Processing Tools in Data Science
• Jupyter – open source tool to create and share notebook documents.
• RStudio – development environment for R programming, available in an open source edition.
• SAS – analytics tool.
• Apache Spark – open source framework that specializes in cluster computing.
• Microsoft Excel – spreadsheet.
• SQL – query language for relational databases.
• Tableau – data visualization tool used for representing data as charts.
• Power BI – business intelligence tool developed by Microsoft.
What Does a Data Science Professional Do?
Installation of R and RStudio
Conclusion/Endnotes
• Data Science is turning out to be one of the fastest growing fields in the US and India.
• Today, it has a foothold in weather forecasting, sales prediction, fraud and spam detection, pattern recognition, taxi fare
prediction, sentiment analysis, and neural networks.
• The future of data science is going to be dominated by Artificial Intelligence and Automation.
• These two heavyweights are capable of changing the current market scenario into something that data scientists describe
as the “age of revolution”.
• Machines are enriching themselves with new concepts and technology every passing second, which is making them smarter
and sharper than humans.
• Looking at the current market scenario, data science is slowly and gradually making its
way into businesses and enterprises.