What is data science?

Bohitesh Misra, PMP
Vice President - IT & BI at Simpa Networks

What is data science and why it is important now?

Author – Bohitesh Misra (bohitesh.misra@gmail.com), September 2017
Data Science!
Fundamentally, in layman's terms, data scientists collect data from various
sources, clean it, organize it and shape it so that it can be analyzed. The
data is typically separated into a training set and a testing set, so that an
algorithm or model developed using statistics can be fitted on one and
assessed on the other before it is applied to whatever area or sector it
suits. Data mining helps end users extract useful business information from
large databases.
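The training and testing split mentioned above can be sketched in a few lines. This is only a minimal illustration using the Python standard library; the function name `train_test_split`, the 25% test fraction, the seed and the sample records are all assumptions made for the example, not something taken from the article.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(rows)
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical records: (hours_studied, passed_exam)
records = [(hours, hours >= 5) for hours in range(1, 21)]
train, test = train_test_split(records, test_fraction=0.25)
print(len(train), len(test))
```

The model is fitted on `train` only, and `test` is held back so that its performance is measured on data it has never seen.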
Asking the right questions
Asking the right questions is extremely important, and hence good
communication skills are essential for data scientists. With the advent of
technology and the internet, we now have instant access to data and the
tools to test our interpretations, so decisions can be made quickly.
Data scientist
Data scientists use their data and analytical skills to find and interpret rich
data sources; manage large volumes of data; merge data sources; ensure
consistency of datasets; create visualizations that aid in understanding data;
build mathematical models using the data; and present and communicate
insights and findings to business decision makers.
"Data scientist" has become a popular buzzword, with Harvard Business
Review dubbing it "The Sexiest Job of the 21st Century" and McKinsey &
Company projecting that the United States alone could face a shortage of
1.5 million managers and analysts with the know-how to use big data analysis.
Statistical models
How does data mining work? It works much the same way a human being does:
it uses historical information to learn for the future. Mathematical
foundations such as linear algebra, probability, statistics and calculus, and
techniques such as regression, clustering and predictive analysis, are
indispensable in data science. Python and R are the preferred programming
languages, with packages and libraries built specifically for data science
that let newcomers learn programming and start applying it quickly. I began
with R and use its basic libraries for text and data mining.
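As a small illustration of learning from historical information, the sketch below fits a straight line to past observations by ordinary least squares and uses it to predict a new value. It is written in plain Python rather than with a data science library, and the names `fit_line` and `predict` as well as the ad-spend figures are hypothetical, chosen only for the example.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical history: monthly ad spend (thousands) vs. units sold
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
units_sold = [12.0, 14.0, 16.0, 18.0, 20.0]
model = fit_line(ad_spend, units_sold)
print(predict(model, 6.0))
```

Real data is rarely this tidy, so in practice the fitted line would be checked against held-out observations before its predictions are trusted.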
Data Cleaning
It is often said that about 80% of a data scientist's work is data cleaning.
Data is sometimes available in convenient formats such as CSV and XLS, but
very little data is available in a form that a program can use directly. APIs,
web scraping and SQL come to the rescue of data scientists. Spark and
MapReduce are used to clean and analyze large, distributed datasets.
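To make the cleaning step concrete, here is a minimal sketch that tidies a small CSV extract using only the Python standard library. The column names, the sample rows and the cleaning rules (trim whitespace, normalize capitalization, drop rows with a missing or non-numeric age, remove duplicates) are assumptions made for illustration; real projects need rules suited to their own data.

```python
import csv
import io

# Hypothetical raw extract with stray spaces, gaps and a duplicate row.
RAW_CSV = """name,age,city
 Alice ,34,Delhi
Bob,,Mumbai
alice,34,delhi
Carol,n/a,Pune
Dev,29, Chennai
"""


def clean_rows(text):
    """Return rows that are trimmed, normalized, complete and de-duplicated."""
    cleaned, seen = [], set()
    for row in csv.DictReader(io.StringIO(text)):
        name = (row.get("name") or "").strip().title()
        city = (row.get("city") or "").strip().title()
        age_text = (row.get("age") or "").strip()
        # Skip rows with a missing field or an age that is not a whole number.
        if not name or not city or not age_text.isdigit():
            continue
        record = (name, int(age_text), city)
        if record in seen:
            continue  # same person already kept
        seen.add(record)
        cleaned.append({"name": name, "age": record[1], "city": city})
    return cleaned


for row in clean_rows(RAW_CSV):
    print(row)
```

Only two of the five raw rows survive here, which hints at why cleaning takes up so much of the work.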
It’s everywhere!
Data-driven solutions are being used everywhere, from e-commerce websites
and social networking sites to financial visualization and interpretation.
Companies have increasingly adopted data-driven practices over the
last few years. In fact, it would be difficult to find a sector in which data
science cannot be used to make better decisions, and companies are gradually
realizing this and adopting it.
Want to learn it?
I came across data science, decided it was the right fit for me, and recently
completed the Executive Management Programme at the Indian Institute of
Technology Delhi in the same subject. Getting started with data science is
easy and convenient, thanks to the large number of MOOCs and eBooks
available for free online.
I urge you to think about how it may be applied to you. In your business,
you can gather data in the form of customer reviews and opinions to make
better data-driven decisions. Even in everyday life, you can use the data from
movie review sites to choose your next movie.
Data science for Startups
Startups critically need a data strategy around the collection, storage and
usage of large volumes of data, so that the data serves the startup's core
selling point and can also open up additional monetisation avenues in the
future.
A common case is a recommendation engine, which can benefit from
all kinds of information about the users: age, gender, purchases, offerings and
discounts. Designing the platform in a way that improves information
collection from its users results in a large database that can be used to
manage discount deals better and to improve advertising or even the user
experience on the platform.
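A very simple recommendation engine can be sketched from purchase data alone by counting which items are bought together. The example below is one possible approach, not a description of any particular product: it ignores the other signals mentioned above (age, gender, discounts), and the function names and the sample baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_co_purchase_counts(baskets):
    """Count, for every item, how often each other item shares a basket with it."""
    counts = {}
    for basket in baskets:
        # set() ignores repeated items within one basket.
        for a, b in combinations(sorted(set(basket)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def recommend(counts, item, top_n=2):
    """Return up to top_n items most often bought with item (ties broken by name)."""
    ranked = sorted(counts.get(item, {}).items(),
                    key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:top_n]]


# Hypothetical purchase history, one list per order
baskets = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger", "headphones"],
    ["laptop", "mouse"],
    ["phone", "case", "headphones"],
]
counts = build_co_purchase_counts(baskets)
print(recommend(counts, "phone"))
```

The more purchase history the platform collects, the more reliable these counts become, which is why designing for data collection from the start matters.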
A clear data strategy can provide startups with additional revenue
opportunities and a competitive advantage.
