eScience 2050
A Look back from the midpoint of the century
Dennis Gannon, Professor Emeritus
School of Informatics, Computing and Engineering
Indiana University
& Microsoft Research (retired)
The Problem with Predictions
• Our vision of the future may look too much like today
Some are obviously way off. Others not so much!
Extrapolating eScience from 1965 to 2019
• My first computer program.
One Fortran line per card.
High school was fun!
• How might I have predicted computing
50 years ahead?
Programs and data stored on
a paper tape a mile long!!
Where was computing in the 1960s?
• 1961 - The programming language FORTRAN IV was created
• 1964 - IBM introduced its System/360.
• 1965 - Gordon Moore creates “Moore’s Law”
• 1967 - IBM created the first floppy disk.
• 1969 - The ARPAnet (which became Internet).
• Mostly used for email until 1980.
• 1970 - Edgar Codd proposed the relational database model.
• Not implemented until 1974.
The 30 years after 2019
• The cloud and supercomputing merge.
• Quantum computing as a service in the
cloud.
• DNA data storage in the cloud.
• Neuromorphic computing.
• The explosion of AI as an eScience enabler.
Huh?
The Cloud Supercomputer Convergence
• The converged supercloud merges best of both
• Support for thousands of concurrent interactive users
• Planet scale data resources supporting on-line personal agents and science gateways
• Capable of launching multiple exascale parallel computations
• By 2019 moving there already
• Google TPU accelerators and Microsoft Azure mesh network of FPGA engines
• US NSF Supercomputer centers already exploring cloud tech such as Kubernetes
[Figures: Google TPUs; Azure FPGA mesh on top of servers]
Quantum in the Cloud
• In 2018 small programmable quantum processors
• Attached to clouds from IBM and Google
• Programmable!
• 200 stable qubits became
real in 2030.
DNA Storage and Neuromorphic computing
• Long history of research in DNA
• Longevity and density amazing
• By 2019 able to fully automate
storage and retrieval
• UW+MSR working on microfluidics
devices.
• New ways in which DNA encoded data
could be searched and structured in
ways like relational databases
• By 2050 the standard for long term
cloud storage.
• Neuromorphic research moving fast
• In 2019 Intel released “Pohoiki Beach”
– a 64-Loihi Chip Neuromorphic system
capable of simulating eight million
neurons
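To make the density claim for DNA storage concrete, here is a minimal sketch of the basic encoding idea: each nucleotide can carry two bits, so one byte maps to four bases. The mapping (A=00, C=01, G=10, T=11) and the function names are assumptions for illustration only. Real systems add error correction and avoid problematic sequences such as long runs of one base, which this sketch ignores.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def decode(strand: str) -> bytes:
    """Invert encode(); the strand length must be a multiple of four."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


if __name__ == "__main__":
    strand = encode(b"eScience")
    print(strand, decode(strand))
```

At two bits per base, a strand of n bases holds n/4 bytes before any redundancy is added.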
ML & AI become a standard tool of eScience
• Long history of Machine Learning in eScience
• By 2017 Generative Neural Networks
• GANs and VAEs: used to generate fakes.
Also useful for Science
• Applications in Astronomy, Biology,
Cosmology, …
From Shahar Harel and Kira Radinsky
“Prototype-Based Compound Discovery
using Deep Generative Models”
Mustafa Mustafa, et al. “Creating Virtual Universes
Using Generative Adversarial Networks”
Synthetic Galaxies
The rise of Probabilistic Programming Languages
• An important new tool for eScience
• To make Bayesian inference about random behaviors that give rise to
experimental outcomes
• inferring the masses of subatomic particles based on the results of collider
experiments,
• or inferring the distribution of dark matter from the gravitational lensing effects on
nearby galaxies,
• or unravelling complex models of gene expression that manifest as disease
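[Figure: PPL inference compiler. The simulation code takes random draws X1, X2, X3, …, Xn and produces outcomes y1, y2, y3, …, yn, modeling P( Y | X ). The compiled inference code takes the observed results y1, y2, y3, …, yn and infers X1, X2, X3, …, Xn, modeling P( X | Y ).]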
Check out Gen from MIT and PyProb from Oxford
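The step from simulation code, P( Y | X ), to inference, P( X | Y ), can be sketched in plain Python with importance sampling. This is not how Gen or PyProb work internally; it is a hand-written stand-in using only the standard library. The Gaussian noise model, the uniform prior on [-10, 10], and all function names are assumptions made for the example.

```python
import math
import random


def simulate(x, rng):
    """Forward simulation code: sample an outcome y from P(Y | X = x).
    Assumed model: y is x plus unit-variance Gaussian noise."""
    return x + rng.gauss(0.0, 1.0)


def log_likelihood(y, x):
    """log P(y | x) for the same assumed unit-variance Gaussian model."""
    return -0.5 * (y - x) ** 2 - 0.5 * math.log(2.0 * math.pi)


def posterior_mean(observed, n_samples=20000, seed=0):
    """Estimate E[X | Y = observed] by importance sampling from the prior."""
    rng = random.Random(seed)
    # Assumed prior: X is uniform on [-10, 10].
    xs = [rng.uniform(-10.0, 10.0) for _ in range(n_samples)]
    log_w = [sum(log_likelihood(y, x) for y in observed) for x in xs]
    top = max(log_w)  # subtract the maximum for numerical stability
    weights = [math.exp(lw - top) for lw in log_w]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


if __name__ == "__main__":
    rng = random.Random(1)
    data = [simulate(3.0, rng) for _ in range(50)]  # hidden "true" x is 3.0
    print(posterior_mean(data))
```

A probabilistic programming language automates this inversion, so the scientist writes only the forward simulation.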
When does Machine Learning
Become AI?
• Deep Neural Nets allow us to magnify our
senses.
• Look at thousands of things faster and see details
better
• Do language translation in real time
• Reinforcement Learning works for closed
world games like Go, chess and Pong.
Huh?
• “Deep Learning Isn't a dangerous Magic Genie. It’s just math.”
• Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence
• What about bots and “smart speakers”?
From Alexa to a tool for eScience Research
• 2019 Smart speaker bots are really dumb
• but they work … for some things.
• How about for eScience tasks?
• Alexa, please find the metadata associated with
experiment 32 and then compare it to the NIH
standard. What are the important differences?
• Please read the recent papers on the evolution of dark
matter halos. Are there simulation codes available?
• Recall that I asked you to look for any work in the
biology community that uses methods similar to that
astrophysics work. Any results yet?
Raj Reddy’s “Cognition Amplifier”
• I call it Research Assistant.
• Lives in the cloud and catalogs all my papers,
notes, codes, experimental results.
• “Knows” what topics are important and seeks out related research.
• Creates smart summaries that encapsulate key ideas
• Monitors ongoing computational experimental workflows
• Proposes new experiments to test my theories
• Verifies mathematical analysis
A Toy Example
• Basically a metasearch tool
• Does voice to text
• Text parsed via cloud service
• Analysis extracts key actions from parsed text
• Search from Bing, Wikipedia or arXiv
• Wikipedia summary rendered as voice by Amazon Lex.
[Figure: data-flow diagram. Components: Google Voice Kit, Analysis Engine, Amazon Lex, Web browser, Wikipedia API, Bing search, Cornell arXiv. The arrows between them are labeled voice, text, and parsed text.]
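The analysis step of the toy example, which extracts a key action from the text and routes it to a search backend, might look roughly like the sketch below. It is not the author's implementation. The keyword rules, the filler-word list, and the stub searchers are all hypothetical, and no real Bing, Wikipedia, or arXiv API is called.

```python
import re

BACKENDS = ("bing", "wikipedia", "arxiv")
# Hypothetical filler words stripped from the request to leave the query.
FILLER = {"please", "search", "find", "look", "up", "for", "on", "in", "the", "about"}


def extract_action(text):
    """Return (backend, query) for a transcribed request; defaults to Bing."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    backend = next((w for w in words if w in BACKENDS), "bing")
    query = " ".join(w for w in words if w not in FILLER and w not in BACKENDS)
    return backend, query


def dispatch(text, searchers):
    """Route the request to the matching searcher callable."""
    backend, query = extract_action(text)
    return searchers[backend](query)


if __name__ == "__main__":
    # Stub searchers stand in for the real services.
    stubs = {
        name: (lambda q, name=name: f"[{name}] results for '{q}'")
        for name in BACKENDS
    }
    print(dispatch("Please search arXiv for dark matter halos", stubs))
```

In a fuller version, each stub would be replaced by a client for the corresponding service, and the result text would be passed on to the speech component.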
How close are we to making this real in 2019?
• Need a Personal Knowledge Graph
• Document oriented … Wikidata-like
• Google KG key to search
• Start with “general science” KG
• RA auto extends it as you use it.
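A personal knowledge graph of this kind can be sketched as a small store of subject, predicate, object triples that starts from general-science facts and grows as documents are cataloged. The class, the predicate names, and the sample entries below are hypothetical. In particular, how a real Research Assistant would extract topics from a document is not specified in the talk, so here the topics are simply passed in.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Minimal triple store: subject -> set of (predicate, object) pairs."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, predicate, obj):
        self._edges[subject].add((predicate, obj))

    def objects(self, subject, predicate):
        """Sorted objects linked from subject by predicate."""
        return sorted(o for p, o in self._edges.get(subject, ()) if p == predicate)

    def extend_from_document(self, doc_id, topics):
        """Link a cataloged document to its topics in both directions."""
        for topic in topics:
            self.add(doc_id, "mentions", topic)
            self.add(topic, "mentioned_in", doc_id)


# Seed with a "general science" fact, then extend it with a personal document.
kg = KnowledgeGraph()
kg.add("dark matter halo", "subclass_of", "astrophysical object")
kg.extend_from_document("my_notes_2019", ["dark matter halo", "N-body simulation"])
print(kg.objects("dark matter halo", "mentioned_in"))
```

Storing both directions of each link keeps lookups from a topic to its documents as cheap as lookups from a document to its topics.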
How about deep analysis of research papers?
• Text “comprehension” requires
• Strong language model
• Have that now with BERT transformers
• Knowledge graph
• Abductive inference engine
• Big Challenges being worked on
now
• Relating technical diagrams and
equations to text.
• Can you classify documents by the
content of the mathematics used?
• Can you derive theory from
observation?
The Allen Institute's Aristo system recently got an “A” on the N.Y.
Regents 8th grade science exams
Progress …
• Hanalyzer (short for high-throughput analyzer) uses natural language processing
to automatically extract a semantic network from all PubMed papers relevant to a
specific scientist
• Eureqa (now part of DataRobot) does automatic AI-based time series analysis, and
DataRobot is a tool for automatically building ML models given only data.
• Michael Schmidt and Hod Lipson, Distilling Free-Form Natural Laws from
Experimental Data. (SCIENCE VOL 324 3 APRIL 2009)
Conclusions
• The range of the tech revolution between 1960 and 2020 was huge.
• It made eScience possible
• The developments from 2020 to 2050 will be just as surprising.
• Clouds and Supercomputers merge
• Quantum becomes a standard attached accelerator
• DNA storage changes the dynamics of data science
• AI becomes our research assistant.
• eScience becomes Science.