Data Science
Dr. Sunil Kr Pandey
Professor & Director (IT & UG)
Institute of Technology & Science
Mohan Nagar, Ghaziabad
Evolution of Databases
Data, data everywhere…
There's certainly a lot of it!
[Chart on a logarithmic scale, with markers at 1 Petabyte, 1 Exabyte and 1 Zettabyte]
1 Petabyte == 1000 TB; 1 TB = 1000 GB
Data produced each year:
(2002) 5 EB
(2006) 161 EB
(2009) 800 EB
(2011) 1.8 ZB
(2015) 8.0 ZB
Human brain's capacity: 14 PB
100 years of HD video + audio: 60 PB
Other value shown on the chart: 120 PB
References
(brain) 14 PB: http://www.quora.com/Neuroscience-1/How-much-data-can-the-human-brain-store
(2002) 5 EB: http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm
(2006) 161 EB: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf
(2009) 800 EB: http://www.emc.com/collateral/analyst-reports/idc-digital-universe-are-you-ready.pdf
(2011) 1.8 ZB: http://www.emc.com/leadership/programs/digital-universe.htm
(2015) 8 ZB: http://www.emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf
(life in video) 60 PB: in 4320p resolution, extrapolated from 16MB for 1:21 of 640x480 video (w/sound); almost certainly a gross overestimate, as sleep can be compressed significantly!
Data has become a resource that needs to be carefully stored, processed,
analyzed, visualized and presented securely wherever it is required.
Growing Need for Analytics
DATA HARNESSING
• Data is generated. Companies store each piece of information generated during the business operations and customer interactions.
• Data is analyzed. Learning from the data is used in the decision making and process optimization.
DATA VOLUMES (in Trillion GB)
2010: 1.2
2012: 2.4
2015: 7.9
DID YOU KNOW?
Generation of Large Amount of Data from Business Transactions
• Number of transactions every year: 4 Billion
• Number of Stores: 900
• Number of SKUs: 10,000 to 1 lakh (100,000)
Growth in Data Volume 2010-2025 (Projections)
Year    Data Volume in Zettabytes
2010    2
2011    5
2012    6.5
2013    9
2014    12.5
2015    15.5
2016    18
2017    26
2018    33
2019    41
2020    50.5
2021    64.5
2022    79.5
2023    101
2024    129.5
2025    175
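To get a feel for how steep the projected curve is, the sketch below computes year-over-year growth from the figures in the table above. The variable and function names are illustrative only; the values are copied from the table.

```python
# Projected data volume in zettabytes, copied from the table above.
volume_zb = {
    2010: 2, 2011: 5, 2012: 6.5, 2013: 9, 2014: 12.5, 2015: 15.5,
    2016: 18, 2017: 26, 2018: 33, 2019: 41, 2020: 50.5, 2021: 64.5,
    2022: 79.5, 2023: 101, 2024: 129.5, 2025: 175,
}


def yearly_growth(series):
    """Return {year: percent growth over the previous year}."""
    years = sorted(series)
    return {
        year: (series[year] / series[prev] - 1) * 100
        for prev, year in zip(years, years[1:])
    }


growth = yearly_growth(volume_zb)
overall_factor = volume_zb[2025] / volume_zb[2010]

for year, pct in growth.items():
    print(f"{year}: {pct:+.1f}%")
print(f"2010 to 2025: x{overall_factor:.1f}")
```

The growth is uneven from year to year, so a single average rate hides a lot of detail.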
Fourth Paradigm of Science
Turing award winner Jim Gray imagined data science as a "fourth paradigm" of science:
• Thousands of years: Empirical (अनुभवजन्य)
• Few hundreds of years: Theoretical (सैद्धांतिक)
• Last fifty years: Computational (गणनात्मक), “Query the world”
• Last twenty years: eScience (Data Science), “Download the world”
What is Data Science
• Data Science is a multi-disciplinary field that uses scientific
methods, processes, algorithms and systems to
extract knowledge and insights from structured and
unstructured data.
• Data Science is a "concept to unify statistics, data analysis,
machine learning and their related methods" in order to
"understand and analyze actual phenomena" with data. It
employs techniques and theories drawn from many fields within
the context of mathematics, statistics, computer science,
and information science.
• The availability of high-capacity networks, low-cost computers and
storage devices as well as the widespread adoption of hardware
virtualization, service-oriented
architecture and autonomic and utility computing has led to growth
in cloud computing.
Data Science – A Visual Definition
Data Science : A Definition
Data Science is the science which uses computer science, statistics and
machine learning, visualization and human-computer interactions to:
1. Collect
2. Clean
3. Integrate
4. Analyze
5. Visualize
6. Interact
with data to create data products.
The objective of Data Science is to “Turn Data into Data Products”.
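The first five steps in the list above can be sketched as a tiny pipeline. This is a minimal illustration using only the Python standard library. The records, store codes, the city "Delhi" and all function names are made up for the example, and step 6 (Interact) is left out because it needs a user interface.

```python
from collections import defaultdict

# 1. Collect: raw records from two made-up sources.
raw_sales = [
    {"store": "S1", "amount": "120.5"},
    {"store": "S2", "amount": "80"},
    {"store": "S1", "amount": ""},        # missing value
    {"store": "s2 ", "amount": "40.0"},   # inconsistent key formatting
]
store_city = {"S1": "Ghaziabad", "S2": "Delhi"}


# 2. Clean: normalize store codes, drop rows without a usable amount.
def clean(rows):
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        cleaned.append(
            {"store": row["store"].strip().upper(), "amount": float(amount)}
        )
    return cleaned


# 3. Integrate: attach the city from the second source.
def integrate(rows, cities):
    return [{**row, "city": cities[row["store"]]} for row in rows]


# 4. Analyze: total sales per city.
def analyze(rows):
    totals = defaultdict(float)
    for row in rows:
        totals[row["city"]] += row["amount"]
    return dict(totals)


# 5. Visualize: a plain-text bar chart, one '#' per 20 units.
def visualize(totals):
    return [
        f"{city:<10} {'#' * int(total // 20)} {total:.1f}"
        for city, total in sorted(totals.items())
    ]


totals = analyze(integrate(clean(raw_sales), store_city))
for line in visualize(totals):
    print(line)
```

The printed chart stands in for the "data product" here. In practice each step would usually be a separate tool or service.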
Traditionally, the data that we had was mostly structured and small in size,
and could be analyzed using simple BI tools. Today, by contrast, most data
is unstructured or semi-structured. The data trends shown in the slide's
image indicated that by 2020, more than 80% of the data would be unstructured.
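One way to picture the difference between the three kinds of data is how much work it takes to read a value out of each. The sketch below is illustrative only; the sample strings and field names are invented.

```python
import csv
import io
import json
import re

# Structured: fixed columns, every row has the same fields.
structured = "customer_id,amount\n101,250\n102,90\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, but fields may vary between records.
semi_structured = (
    '{"customer_id": 101, "tags": ["loyal"], "address": {"city": "Ghaziabad"}}'
)
record = json.loads(semi_structured)

# Unstructured: free text; any structure has to be extracted from it.
unstructured = "Customer 101 called to say the delivery was late again."
customer_ids = re.findall(r"Customer (\d+)", unstructured)

print(rows[0]["amount"], record["address"]["city"], customer_ids)
```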
Data Science Team
•Business Analyst
•Data & Analytics Manager
•Data Analyst
•Database Administrator
•Data Scientist
•Statistician
•Data Engineer
•Data Architect
Role of Business Analyst
What is Analytics?
Data on its own is useless unless you can make sense of it!
The scientific process of transforming data into insight for making
better decisions, offering new opportunities for a competitive
advantage
Types of Analytics
• Descriptive Analytics: Mining data to provide business insights. What has happened?
• Predictive Analytics: Predicting the future based on historical patterns. What could happen?
• Prescriptive Analytics: Enabling smart decisions based on data. What should we do?
Types of Analytics
• Prescriptive Analytics: advice on possible outcomes
• Predictive Analytics: understanding the future
• Descriptive Analytics: insight into the past
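The three types can be illustrated on one small series. This is only a sketch: the sales figures, the straight-line forecast and the ordering rule with its safety stock are assumptions chosen for the example, not methods taken from the slides.

```python
from statistics import mean

# Hypothetical monthly unit sales, oldest first.
sales = [100, 110, 125, 130, 145, 150]

# Descriptive: what has happened?
average_sales = mean(sales)


# Predictive: what could happen?
# Fit a least-squares line through the history and extrapolate one month.
def linear_forecast(values):
    n = len(values)
    xs = range(n)
    x_mean, y_mean = mean(xs), mean(values)
    slope = sum(
        (x - x_mean) * (y - y_mean) for x, y in zip(xs, values)
    ) / sum((x - x_mean) ** 2 for x in xs)
    intercept = y_mean - slope * x_mean
    return intercept + slope * n


forecast = linear_forecast(sales)


# Prescriptive: what should we do?
# Order enough to cover the forecast plus a safety margin.
def recommend_order(forecast_units, stock_on_hand, safety_stock=10):
    return max(0, round(forecast_units) + safety_stock - stock_on_hand)


order = recommend_order(forecast, stock_on_hand=40)

print(f"average so far: {average_sales:.1f}")
print(f"forecast next month: {forecast:.1f}")
print(f"recommended order: {order}")
```

Each stage builds on the previous one: the recommendation is only as good as the forecast, and the forecast is only as good as the history behind it.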
Why do airline prices
change every hour?
How do grocery cashiers
know to hand you coupons
you might actually use?
How does Netflix
frequently recommend
just the right movie?
Business Intelligence (BI) vs. Data Science
Features | Business Intelligence (BI) | Data Science
Data Sources | Structured (usually SQL, often Data Warehouse) | Both structured and unstructured (logs, cloud data, SQL, NoSQL, text)
Approach | Statistics and Visualization | Statistics, Machine Learning, Graph Analysis, Natural Language Processing (NLP)
Focus | Past and Present | Present and Future
Tools | Pentaho, Microsoft BI, QlikView, R | RapidMiner, BigML, Weka, R
Scope of Business Intelligence techniques employed in 2018.
Interest for “Data Science” term since December 2013 (source: Google Trends)
Hype bag-of-words. Let’s not focus on buzzwords, but on what the
underlying technologies can actually solve.
Lifecycle of Data Science
Contrast: Databases
Aspect | Databases | Data Science
Data Value | “Precious” | “Cheap”
Data Volume | Modest | Massive
Examples | Bank records, Personnel records, Census, Medical records | Online clicks, GPS logs, Tweets, Building sensor readings
Priorities | Consistency, Error recovery, Auditability | Speed, Availability, Query richness
Structured | Strongly (Schema) | Weakly or none (Text)
Properties | Transactions, ACID* | CAP* theorem (2/3), eventual consistency
Realizations | SQL | NoSQL: MongoDB, CouchDB, HBase, Cassandra, Riak, Memcached, Apache River, …
* ACID = Atomicity, Consistency, Isolation and Durability
* CAP = Consistency, Availability, Partition Tolerance
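The "Transactions, ACID" side of the contrast can be shown with a small sketch of atomicity, using Python's built-in sqlite3 module. The accounts table, the names and the transfer rule are invented for illustration. The point is only that a failed transfer leaves no partial update behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic unit."""
    # Used as a context manager, the connection commits on success
    # and rolls back if an exception is raised inside the block.
    with conn:
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?",
            (amount, src),
        )
        (balance,) = conn.execute(
            "SELECT balance FROM accounts WHERE name = ?", (src,)
        ).fetchone()
        if balance < 0:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?",
            (amount, dst),
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()

transfer(conn, "alice", "bob", 30)       # succeeds
try:
    transfer(conn, "alice", "bob", 500)  # fails and is rolled back
except ValueError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

Many NoSQL systems relax this kind of guarantee in exchange for availability and speed, which is the trade-off the table summarizes.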
Contrast: Machine Learning
Data Science
• Explore many models, build and tune hybrids
• Understand empirical properties of models
• Develop/use tools that can handle massive datasets
• Take action!
Machine Learning
• Develop new (individual) models
• Prove mathematical properties of models
• Improve/validate on a few, relatively clean, small datasets
• Publish a paper
the companies are expanding as fast as the data!
The first war: Terminology
• Analyzing data has a long history!
• There have been many terms that have been used to describe such
endeavors:
• Statistics
• Artificial Intelligence
• Machine learning
• Data analytics
• Since I happen to work in a “Data Science” program perhaps I may be
allowed the indulgence of using that terminology…
The Case for Business Analytics
• The Business environment today is
more complex than ever before.
• Businesses are expected to be
diligently responsive to the
increasing demands of customers,
various stakeholders and even
regulators.
• Organizations have been turning to
the use of analytics.
• More than 83% of Global CIOs
surveyed by IBM in 2010 singled out
Business Intelligence and Analytics
as one of their visionary plans for
enhancing competitiveness.
In most cases the primary objective of
an organization that seeks to turn to
analytics is:
• Revenue/Profit growth
• Optimize expenditure
SOLUTION
BUSINESS NEED
GOAL
Data Analysis Has Been Around for a While…
R.A. Fisher
Howard Dresner
Peter Luhn
W.E. Deming
Experiments, observations, and numerical simulations in many
areas of science and business are currently generating terabytes of
data, and in some cases are on the verge of generating petabytes
and beyond. Analyses of the information contained in these data
sets have already led to major breakthroughs in fields ranging from
genomics to astronomy and high-energy physics and to the
development of new information-based industries.
- Frontiers in Massive Data Analysis, National Research Council of the National Academies
Given a large mass of data, we can by judicious selection
construct perfectly plausible unassailable theories—all of
which, some of which, or none of which may be right.
- Paul Arnold Srere
The ability to take data—to be able to understand it, to process it, to
extract value from it, to visualize it, to communicate it—that’s going
to be a hugely important skill in the next decades, not only at the
professional level but even at the educational level for elementary
school kids, for high school kids, for college kids. Because now we
really do have essentially free and ubiquitous data. So the
complimentary scarce factor is the ability to understand that data
and extract value from it.
-Hal Varian, Google's Chief Economist, http://www.mckinsey.com/insights/innovation/hal_varian_on_how_the_web_challenges_managers
My personal goal: Getting students to be able to
think critically about data.
What is Big Data?
There are many examples of "data", but what makes some of it “big”? The classic
definition revolves around the three V’s: volume, velocity, and variety.
• Volume: There is just a lot of it being generated all the time. Things get
interesting and “big” when you can’t fit it all on one computer anymore.
Why? There are many ideas here such as MapReduce, Hadoop, etc. that all
revolve around being able to process data that goes from Terabytes, to
Petabytes, to Exabytes.
• Velocity: Data is being generated very quickly. Can you even store it all? If
not, then what do you get rid of and what do you keep?
• Variety: The data types you mention all take different shapes. What does it
mean to store them so that you can play with or compare them?
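The MapReduce idea mentioned under Volume can be sketched in a few lines. This is a single-process toy, not Hadoop: the function names and input strings are made up, and in a real cluster the map and reduce calls would run on different machines, with the framework doing the shuffle.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Emit (word, 1) pairs for one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key, as the framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values seen for one key."""
    return key, sum(values)


# Each string stands in for a chunk of data stored on a different machine.
splits = ["big data is big", "data science uses data"]

mapped = chain.from_iterable(map_phase(split) for split in splits)
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Because each split is mapped independently and each key is reduced independently, the work can be spread over as many machines as the data requires.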
BIG DATA: Data that is TOO LARGE & TOO COMPLEX for conventional data tools to capture, store and analyze.
• Shares traded on US Stock Markets each day: 7 Billion
• Data generated in one flight from NY to London: 10 Terabytes
• Number of tweets per day on Twitter: 400 Million
• Number of ‘Likes’ each day on Facebook: 3 Billion
The 3V’s of Big Data
VOLUME VARIETY VELOCITY
90% OF THE WORLD’S DATA WAS GENERATED IN THE LAST TWO YEARS
Big Data Everywhere!
Is Big Data the same as Data Science?
• Are Big Data and Data Science the same thing?
• I wouldn't say so...
• Data Science can be done on small data sets.
• And not everything done using Big Data would necessarily be called Data Science.
• But there certainly is a substantial overlap!
[Diagram labels: Big Data, Data Science]
Perspective Of Big Data's Growth
• Worldwide Big Data market revenues for software and services are projected to
increase from $42B in 2018 to $103B in 2027, attaining a Compound Annual
Growth Rate (CAGR) of 10.48% according to Wikibon.
• According to an Accenture study, 79% of enterprise executives agree that
companies that do not embrace Big Data will lose their competitive position and
could face extinction. Even more, 83%, have pursued Big Data projects to seize a
competitive edge.
• Forrester predicts the global Big Data software market will be worth $31B this
year, growing 14% from the previous year. The entire global software market is
forecast to be worth $628B in revenue, with $302B from applications.
• 59% of executives say Big Data at their company would be improved through the
use of AI according to PwC.
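The compound annual growth rate in the Wikibon projection can be checked with the standard CAGR formula, (end / start) ** (1 / years) - 1. The helper name below is illustrative; the inputs are the figures quoted above.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    return (end_value / start_value) ** (1 / years) - 1


# Wikibon projection quoted above: $42B in 2018 to $103B in 2027 (9 years).
rate = cagr(42, 103, 2027 - 2018)
print(f"{rate:.2%}")
```

The result comes out close to the quoted figure, so the projection is at least internally consistent.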
Future Trends
Tech & industries to watch in the near future:
• Progressive Web Apps (PWAs): A mixture of mobile and web apps.
• Blockchain & Fintech: Meta-model building, reliable trading & credit scoring.
• Healthcare: Diagnosis by Medical Imaging (Computer vision & ML).
• AR/VR: Sport Analysis, Business Cards (Image Tracking), Real-Life Gaming (Hado).
• AI Speech Assistants, smarter Chat-bot integrations.
• Smart Supply Chain: Digital twins (IoT Sensors).
• 5G: Big data, Mobile cloud computing, scalable IoT & Network function virtualisation (NFV).
• 3D Printing: Prefabrication efficiency, Defect detection, Predictive ML maintenance.
• Dark Data: Information that is yet to become available in digital format.
• Quantum Computing: Cutting data processing times into fractions.
Thank You!
Dr. Sunil Kr Pandey
Professor & Director (IT & UG)
Institute of Technology & Science
Mohan Nagar, Ghaziabad
Email: sunilpandey@its.edu.in

How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17Celine George
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...DhatriParmar
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQuiz Club NITW
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfPatidar M
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfVanessa Camilleri
 

Dernier (20)

INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptxINCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
INCLUSIVE EDUCATION PRACTICES FOR TEACHERS AND TRAINERS.pptx
 
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITWQ-Factor General Quiz-7th April 2024, Quiz Club NITW
Q-Factor General Quiz-7th April 2024, Quiz Club NITW
 
prashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Professionprashanth updated resume 2024 for Teaching Profession
prashanth updated resume 2024 for Teaching Profession
 
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptxQ4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
Q4-PPT-Music9_Lesson-1-Romantic-Opera.pptx
 
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
31 ĐỀ THI THỬ VÀO LỚP 10 - TIẾNG ANH - FORM MỚI 2025 - 40 CÂU HỎI - BÙI VĂN V...
 
Paradigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTAParadigm shift in nursing research by RS MEHTA
Paradigm shift in nursing research by RS MEHTA
 
How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17How to Fix XML SyntaxError in Odoo the 17
How to Fix XML SyntaxError in Odoo the 17
 
Congestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentationCongestive Cardiac Failure..presentation
Congestive Cardiac Failure..presentation
 
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptxDecoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
Decoding the Tweet _ Practical Criticism in the Age of Hashtag.pptx
 
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptxDIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
DIFFERENT BASKETRY IN THE PHILIPPINES PPT.pptx
 
Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17Tree View Decoration Attribute in the Odoo 17
Tree View Decoration Attribute in the Odoo 17
 
Expanded definition: technical and operational
Expanded definition: technical and operationalExpanded definition: technical and operational
Expanded definition: technical and operational
 
ARTERIAL BLOOD GAS ANALYSIS........pptx
ARTERIAL BLOOD  GAS ANALYSIS........pptxARTERIAL BLOOD  GAS ANALYSIS........pptx
ARTERIAL BLOOD GAS ANALYSIS........pptx
 
4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx4.11.24 Mass Incarceration and the New Jim Crow.pptx
4.11.24 Mass Incarceration and the New Jim Crow.pptx
 
MS4 level being good citizen -imperative- (1) (1).pdf
MS4 level   being good citizen -imperative- (1) (1).pdfMS4 level   being good citizen -imperative- (1) (1).pdf
MS4 level being good citizen -imperative- (1) (1).pdf
 
How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17How to Manage Buy 3 Get 1 Free in Odoo 17
How to Manage Buy 3 Get 1 Free in Odoo 17
 
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
Beauty Amidst the Bytes_ Unearthing Unexpected Advantages of the Digital Wast...
 
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITWQ-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
Q-Factor HISPOL Quiz-6th April 2024, Quiz Club NITW
 
Active Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdfActive Learning Strategies (in short ALS).pdf
Active Learning Strategies (in short ALS).pdf
 
ICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdfICS2208 Lecture6 Notes for SL spaces.pdf
ICS2208 Lecture6 Notes for SL spaces.pdf
 

Data Science - An emerging Stream of Science with its Spreading Reach & Impact

  • 1. Data Science Dr. Sunil Kr Pandey Professor & Director (IT & UG) Institute of Technology & Science Mohan Nagar, Ghaziabad
  • 2.
  • 3.
  • 5. Data, data everywhere… There's certainly a lot of it! Data produced each year (logarithmic scale): 5 EB (2002), 161 EB (2006), 800 EB (2009), 1.8 ZB (2011), 8.0 ZB (2015). For comparison: human brain's capacity, 14 PB; 100 years of HD video + audio, 60 PB. Also marked on the chart: 120 PB. Units: 1 TB = 1000 GB; 1 Petabyte == 1000 TB. References: (brain) 14 PB: http://www.quora.com/Neuroscience-1/How-much-data-can-the-human-brain-store; (2002) 5 EB: http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm; (2006) 161 EB: http://www.emc.com/collateral/analyst-reports/expanding-digital-idc-white-paper.pdf; (2009) 800 EB: http://www.emc.com/collateral/analyst-reports/idc-digital-universe-are-you-ready.pdf; (2011) 1.8 ZB: http://www.emc.com/leadership/programs/digital-universe.htm; (2015) 8 ZB: http://www.emc.com/collateral/analyst-reports/idc-extracting-value-from-chaos-ar.pdf; (life in video) 60 PB: in 4320p resolution, extrapolated from 16 MB for 1:21 of 640x480 video (w/sound), almost certainly a gross overestimate, as sleep can be compressed significantly!
  • 6. Data has become a resource that needs to be carefully stored, processed, analyzed, visualized, and securely presented wherever it is required.
  • 7. Growing Need for Analytics. Data harnessing: • Data is generated: companies store each piece of information generated during business operations and customer interactions. • Data is analyzed: learning from the data is used in decision making and process optimization. Data volumes (in trillion GB): 1.2 (2010), 2.4 (2012), 7.9 (2015). Did you know? Generation of a large amount of data from business transactions: 4 billion transactions every year; 900 stores; 10,000 to 1 lakh (100,000) SKUs.
  • 8. Growth in Data Volume 2010-2025 (Projections). Data volume in zettabytes by year: 2010: 2; 2011: 5; 2012: 6.5; 2013: 9; 2014: 12.5; 2015: 15.5; 2016: 18; 2017: 26; 2018: 33; 2019: 41; 2020: 50.5; 2021: 64.5; 2022: 79.5; 2023: 101; 2024: 129.5; 2025: 175.
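The growth implied by the yearly figures on this slide can be worked out directly. The following is a minimal Python sketch, not part of the original deck: the numbers are copied from the slide's table, while the names `volumes` and `yoy_growth` are illustrative choices.

```python
# Projected data volume in zettabytes, copied from the slide's table.
volumes = {
    2010: 2, 2011: 5, 2012: 6.5, 2013: 9, 2014: 12.5, 2015: 15.5,
    2016: 18, 2017: 26, 2018: 33, 2019: 41, 2020: 50.5, 2021: 64.5,
    2022: 79.5, 2023: 101, 2024: 129.5, 2025: 175,
}


def yoy_growth(series):
    """Return the fractional growth of each year relative to the previous year."""
    years = sorted(series)
    return {
        year: series[year] / series[prev] - 1
        for prev, year in zip(years, years[1:])
    }


growth = yoy_growth(volumes)
fastest_year = max(growth, key=growth.get)          # year with the largest jump
overall_multiple = volumes[2025] / volumes[2010]    # 2025 volume relative to 2010

for year, rate in growth.items():
    print(f"{year}: {rate:+.1%}")
print(f"Fastest growth: {fastest_year}")
print(f"2025 volume is {overall_multiple:g}x the 2010 volume")
```

Year-over-year rates make it easier to see that the projected growth is uneven, which a single start-to-end comparison hides.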
  • 9.
  • 10. Fourth Paradigm of Science. Turing Award winner Jim Gray imagined data science as a "fourth paradigm" of science: • Empirical: thousands of years • Theoretical: last few hundred years • Computational: last fifty years (“Query the world”) • eScience (Data Science): last twenty years (“Download the world”)
  • 11.
  • 12. What is Data Science? • Data Science is a multi-disciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data. • Data Science is a "concept to unify statistics, data analysis, machine learning and their related methods" in order to "understand and analyze actual phenomena" with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, computer science, and information science. • The availability of high-capacity networks, low-cost computers and storage devices, as well as the widespread adoption of hardware virtualization, service-oriented architecture, and autonomic and utility computing, has led to growth in cloud computing.
  • 13. Data Science – A Visual Definition
  • 14. Data Science: A Definition. Data Science is the science that uses computer science, statistics, machine learning, visualization and human-computer interaction to: 1. Collect 2. Clean 3. Integrate 4. Analyze 5. Visualize 6. Interact with data to create data products. The objective of Data Science is to “Turn Data into Data Products”.
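The first five steps of this definition can be sketched as a tiny pipeline. This is an illustrative Python sketch only: the records, the region lookup, and every function name are hypothetical, and the sixth step (interaction) is left out because it needs a user interface.

```python
from statistics import mean


def collect():
    """Collect: return raw records from two hypothetical sources."""
    web_orders = [
        {"id": 1, "spend": "120.5"},
        {"id": 2, "spend": ""},        # missing value, removed in clean()
        {"id": 3, "spend": "80"},
    ]
    customer_regions = {1: "north", 2: "south", 3: "north"}
    return web_orders, customer_regions


def clean(records):
    """Clean: drop records with missing spend and convert text to numbers."""
    return [
        {"id": r["id"], "spend": float(r["spend"])}
        for r in records
        if r["spend"].strip()
    ]


def integrate(records, regions):
    """Integrate: attach the region from the second source to each record."""
    return [{**r, "region": regions[r["id"]]} for r in records if r["id"] in regions]


def analyze(rows):
    """Analyze: average spend per region."""
    by_region = {}
    for row in rows:
        by_region.setdefault(row["region"], []).append(row["spend"])
    return {region: mean(values) for region, values in by_region.items()}


def visualize(summary):
    """Visualize: render a plain-text bar chart, one '#' per 10 units."""
    return "\n".join(
        f"{region:<6} {'#' * round(avg / 10)} {avg:.2f}"
        for region, avg in sorted(summary.items())
    )


orders, regions = collect()
summary = analyze(integrate(clean(orders), regions))
print(visualize(summary))
```

Keeping each step as a separate function mirrors the list on the slide and lets any one stage be swapped out without touching the others.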
  • 15. Traditionally, the data that we had was mostly structured and small in size, so it could be analyzed using simple BI tools. Today, by contrast, most data is unstructured or semi-structured. Data trends indicate that by 2020, more than 80% of data will be unstructured.
  • 16. Data Science Team • Business Analyst • Data & Analytics Manager • Data Analyst • Database Administrator • Data Scientist • Statistician • Data Engineer • Data Architect
  • 17.
  • 18.
  • 19. Role of Business Analyst
  • 20.
  • 21.
  • 22. What is Analytics? Data on its own is useless unless you can make sense of it! Analytics is the scientific process of transforming data into insight for making better decisions, offering new opportunities for a competitive advantage.
  • 23.
  • 24. Types of Analytics • Descriptive analytics: mining data to provide business insights. What has happened? • Predictive analytics: predicting the future based on historical patterns. What could happen? • Prescriptive analytics: enabling smart decisions based on data. What should we do?
  • 25. Types of Analytics • Descriptive Analytics: insight into the past • Predictive Analytics: understanding the future • Prescriptive Analytics: advice on possible outcomes. Examples: Why do airline prices change every hour? How do grocery cashiers know to hand you coupons you might actually use? How does Netflix frequently recommend just the right movie?
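The three types of analytics can be contrasted on one small data set. The sketch below is a hedged illustration in Python: the monthly sales figures are invented, the least-squares trend line is just one simple choice of predictive model, and the restocking rule is a hypothetical stand-in for a prescriptive step.

```python
from statistics import mean

# Hypothetical monthly sales figures, invented purely for illustration.
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def descriptive(history):
    """Descriptive: summarize what has happened."""
    return {"mean": mean(history), "min": min(history), "max": max(history)}


def predictive(history):
    """Predictive: fit a least-squares line and extrapolate one period ahead."""
    n = len(history)
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(history)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n


def prescriptive(forecast, stock_on_hand):
    """Prescriptive: recommend an action, here how many units to reorder."""
    return max(0.0, forecast - stock_on_hand)


summary = descriptive(sales)
forecast = predictive(sales)
reorder = prescriptive(forecast, stock_on_hand=100.0)
print(summary, forecast, reorder)
```

Each function answers one of the slide's questions: what has happened, what could happen, and what should we do.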
  • 26. Business Intelligence (BI) vs. Data Science, by feature. Data sources: BI: structured (usually SQL, often a data warehouse); Data Science: both structured and unstructured (logs, cloud data, SQL, NoSQL, text). Approach: BI: statistics and visualization; Data Science: statistics, machine learning, graph analysis, Natural Language Processing (NLP). Focus: BI: past and present; Data Science: present and future. Tools: BI: Pentaho, Microsoft BI, QlikView, R; Data Science: RapidMiner, BigML, Weka, R.
  • 28. Interest in the term “Data Science” since December 2013 (source: Google Trends). Hype bag-of-words. Let’s not focus on buzzwords, but on what the underlying technologies can actually solve.
  • 29. Lifecycle of Data Science
  • 30. Contrast: Databases vs. Data Science. Data value: Databases: “precious”; Data Science: “cheap”. Data volume: Databases: modest; Data Science: massive. Examples: Databases: bank records, personnel records, census, medical records; Data Science: online clicks, GPS logs, tweets, building sensor readings. Priorities: Databases: consistency, error recovery, auditability; Data Science: speed, availability, query richness. Structured: Databases: strongly (schema); Data Science: weakly or none (text). Properties: Databases: transactions, ACID; Data Science: CAP theorem (2 of 3), eventual consistency. Realizations: Databases: SQL; Data Science: NoSQL: MongoDB, CouchDB, HBase, Cassandra, Riak, Memcached, Apache River, … ACID = Atomicity, Consistency, Isolation and Durability. CAP = Consistency, Availability, Partition Tolerance.
  • 31. Contrast: Machine Learning vs. Data Science. Data Science: explore many models, build and tune hybrids; understand empirical properties of models; develop/use tools that can handle massive datasets; take action! Machine Learning: develop new (individual) models; prove mathematical properties of models; improve/validate on a few, relatively clean, small datasets; publish a paper.
  • 32. The companies are expanding as fast as the data!
  • 33. The first war: Terminology • Analyzing data has a long history! • There have been many terms that have been used to describe such endeavors: • Statistics • Artificial Intelligence • Machine learning • Data analytics • Since I happen to work in a “Data Science” program perhaps I may be allowed the indulgence of using that terminology…
  • 34. The Case for Business Analytics • The business environment today is more complex than ever before. • Businesses are expected to be diligently responsive to the increasing demands of customers, various stakeholders and even regulators. • Organizations have been turning to the use of analytics. • More than 83% of global CIOs surveyed by IBM in 2010 singled out Business Intelligence and Analytics as one of their visionary plans for enhancing competitiveness. In most cases the primary objective of an organization that turns to analytics is: • Revenue/profit growth • Optimized expenditure
  • 35. Data Analysis Has Been Around for a While… R.A. Fisher Howard Dresner Peter Luhn W.E. Deming
  • 36. Experiments, observations, and numerical simulations in many areas of science and business are currently generating terabytes of data, and in some cases are on the verge of generating petabytes and beyond. Analyses of the information contained in these data sets have already led to major breakthroughs in fields ranging from genomics to astronomy and high-energy physics and to the development of new information-based industries. - Frontiers in Massive Data Analysis, National Research Council of the National Academies Given a large mass of data, we can by judicious selection construct perfectly plausible unassailable theories—all of which, some of which, or none of which may be right. - Paul Arnold Srere
  • 37. The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill in the next decades, not only at the professional level but even at the educational level for elementary school kids, for high school kids, for college kids. Because now we really do have essentially free and ubiquitous data. So the complimentary scarce factor is the ability to understand that data and extract value from it. -Hal Varian, Google's Chief Economist, http://www.mckinsey.com/insights/innovation/hal_varian_on_how_the_web_challenges_managers My personal goal: Getting students to be able to think critically about data.
  • 38. What is Big Data? There are many examples of "data", but what makes some of it “big”? The classic definition revolves around the three V’s: volume, velocity, and variety. • Volume: There is just a lot of it being generated all the time. Things get interesting, and “big”, when you can’t fit it all on one computer anymore. Why? There are many ideas here, such as MapReduce, Hadoop, etc., that all revolve around being able to process data that goes from terabytes, to petabytes, to exabytes. • Velocity: Data is being generated very quickly. Can you even store it all? If not, then what do you get rid of and what do you keep? • Variety: The data types you mention all take different shapes. What does it mean to store them so that you can play with or compare them?
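The MapReduce idea mentioned under Volume can be illustrated on a single machine. The following Python sketch is a toy word count, assuming plain in-memory lists in place of a real cluster; the function names are illustrative and are not Hadoop's API.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return reduce_phase(shuffle(pairs))


counts = word_count(["big data is big", "data science uses data"])
print(counts)
```

Because each document is mapped independently and each key is reduced independently, both phases could in principle run on different machines, which is what lets the approach scale past one computer.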
  • 39. Big Data Everywhere! BIG DATA: data that is TOO LARGE & TOO COMPLEX for conventional data tools to capture, store and analyze. The 3V’s of Big Data: VOLUME, VARIETY, VELOCITY. • Shares traded on US stock markets each day: 7 billion • Data generated in one flight from NY to London: 10 terabytes • Number of tweets per day on Twitter: 400 million • Number of ‘Likes’ each day on Facebook: 3 billion. 90% of the world’s data was generated in the last two years. (www.imarticus.org)
  • 40.
  • 41. Is Big Data the same as Data Science? • Are Big Data and Data Science the same thing? • I wouldn't say so... • Data Science can be done on small data sets. • And not everything done using Big Data would necessarily be called Data Science. [Diagram: Big Data and Data Science]
  • 42. Is Big Data the same as Data Science? • Are Big Data and Data Science the same thing? • I wouldn't say so... • Data Science can be done on small data sets. • And not everything done using Big Data would necessarily be called Data Science. • But there certainly is a substantial overlap! [Diagram: Big Data and Data Science]
  • 43. Perspective of Big Data's Growth • Worldwide Big Data market revenues for software and services are projected to increase from $42B in 2018 to $103B in 2027, attaining a Compound Annual Growth Rate (CAGR) of 10.48%, according to Wikibon. • According to an Accenture study, 79% of enterprise executives agree that companies that do not embrace Big Data will lose their competitive position and could face extinction. Even more, 83%, have pursued Big Data projects to seize a competitive edge. • Forrester predicts the global Big Data software market will be worth $31B this year, growing 14% from the previous year. The entire global software market is forecast to be worth $628B in revenue, with $302B from applications. • 59% of executives say Big Data at their company would be improved through the use of AI, according to PwC.
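The CAGR quoted from Wikibon follows from the two revenue figures on this slide using the standard formula CAGR = (end / start)^(1 / years) - 1. A minimal Python check, where the function name `cagr` is an illustrative choice:

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide: $42B in 2018 growing to $103B in 2027 (9 years).
rate = cagr(42, 103, 2027 - 2018)
print(f"{rate:.2%}")
```

The result agrees with the quoted 10.48% to two decimal places.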
  • 44.
  • 45.
  • 46.
  • 47.
  • 48.
  • 49.
  • 50.
  • 51. Future Trends. Tech & industries to watch in the near future: • Progressive Web Apps (PWAs): a mixture of mobile and web apps. • Blockchain & Fintech: meta-model building, reliable trading & credit scoring. • Healthcare: diagnosis by medical imaging (computer vision & ML). • AR/VR: sport analysis, business cards (image tracking), real-life gaming (Hado). • AI speech assistants, smarter chatbot integrations. • Smart Supply Chain: digital twins (IoT sensors). • 5G: big data, mobile cloud computing, scalable IoT & network function virtualisation (NFV). • 3D Printing: prefabrication efficiency, defect detection, predictive ML maintenance. • Dark Data: information that is yet to become available in digital format. • Quantum Computing: cutting data processing times into fractions.
  • 52.
  • 53. Thank You! Dr. Sunil Kr Pandey Professor & Director (IT & UG) Institute of Technology & Science Mohan Nagar, Ghaziabad Email: sunilpandey@its.edu.in