Lessons from COVID-19:
How Are Data Science and AI Changing Future Biomedical Research?
Jake Y. Chen, PhD, FACMI
Professor & Chief Bioinformatics Officer, Informatics Institute
The University of Alabama at Birmingham, USA
jakechen@uab.edu | aimed-lab.org
October 30th 2020
Outline
• COVID-19: can we estimate the pandemic numbers?
• COVID-19 Big Data, Data Science, and AI
• Data science: team-based approach at UAB
• Modeling, simulations, and predictions
• Final thoughts: data-driven medicine
On 3/10, I gave the first UAB-wide COVID-19 talk
The world was alarmed while the US had <1,000 cases
My COVID-19 prediction relied on inferring the world’s response from China’s accomplishment
Figure: cumulative cases on a log scale (10 to 100k), China vs. other countries, ~1 month delay
Jake’s prediction (if contained):
• 3/20: 87k
• 4/1: 130-170k
• 5/1: ~300k
• June: season ends
Total affected:
• <500K worldwide, excluding China
Total deaths:
• <15k worldwide, excluding China
Warm weather appears to curb transmission
Analysis Done on March 10th, 2020; Data source: JHU
Why was I so wrong in my prediction?
“Infectious diseases do not respect national boundaries…The United
States should be investing efforts and funds to strengthen the
health structures in countries around the world.”
“Lessons from SARS” (2003) Science: Vol. 300, Issue 5620, pp. 701
COVID-19: Where are we now?
1/21 to 4/11: 500,000 cases
10/21 to 10/28: 500,000 cases
Cases in the past week: 500,000
Doubling time since the start: 15.2 days
Doubling time in the past two months: 109 days
A very conservative look into the near future
10.6 M cases by 12/1/2020
13 M cases by 1/1/2021
20 M cases by March 2021
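The doubling times and projections above can be reproduced in spirit with a few lines of arithmetic. The sketch below assumes plain exponential growth between two cumulative counts; the function names and the example counts are hypothetical and are not the speaker's actual calculation.

```python
import math

def doubling_time(count_start, count_end, days_elapsed):
    """Days needed for a cumulative count to double, assuming exponential growth."""
    if count_start <= 0 or count_end <= count_start:
        raise ValueError("counts must be positive and increasing")
    return days_elapsed * math.log(2) / math.log(count_end / count_start)

def project_cases(count_now, days_ahead, doubling_days):
    """Extrapolate a cumulative count forward at a fixed doubling time."""
    return count_now * 2 ** (days_ahead / doubling_days)

# Hypothetical counts: 1,000 -> 4,000 cases in 30 days is two doublings.
example_td = doubling_time(1_000, 4_000, 30)

# With the slide's 109-day doubling time, any count doubles after 109 days.
example_projection = project_cases(1_000_000, 109, 109)

print(f"doubling time: {example_td:.1f} days")
print(f"projected count: {example_projection:,.0f}")
```

A longer doubling time means slower growth, which is why a projection based on the recent 109-day figure is far more conservative than one based on the 15.2-day figure since the start.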
COVID-19 Infection Fatality Ratio (IFR)
US vs. other countries over time
COVID-19 Infection Fatality Ratio (IFR)
A historic comparison between the US and China
Age group      Chinese CDC (2/11/2020)   US CDC (9/10/2020)
0-19 years     0.1% (1 in 1,000)         0.003% (1 in 33,333)
20-49 years    0.26% (1 in 375)          0.02% (1 in 5,000)
50-69 years    2.45% (1 in 40)           0.5% (1 in 200)
70-79 years    8% (1 in 12)              5.5% (1 in 18)
80+ years      14.8% (1 in 7)            NA
Source: China CDC Weekly, Vol. 2, No. 8, pp 113-122; US CDC
https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html
Asymptomatic COVID-19 patients
• “40-45% of COVID-19 patients may be asymptomatic”
• D.P. Oran and E.J. Topol, “Prevalence of Asymptomatic SARS-CoV-2 infection”, Annals of Internal Medicine, 9/1/2020
• “20% of people who become infected with SARS-CoV-2 and remain
asymptomatic throughout infection”
• 94 studies
• D. Buitrago-Garcia et al., “Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis”, PLOS Medicine, 9/22/2020
COVID-19 Tests Performed Vary by Country
COVID-19 PCR diagnostic test has a False Negative Rate of 25-50% (days 5-16)
Ann Intern Med. 2020 Aug 18;173(4):262-267. doi: 10.7326/M20-1495
Is 10% COVID infection worldwide true?
“WHO’s best estimates indicate roughly 1 in 10 people worldwide may have been infected by the coronavirus.”
-- Dr. Michael Ryan, WHO COVID-19 Executive Board, 10/5/2020
10% of world population (7.8 B) = 780 M, or ~23 x 35 M confirmed cases (10/4/2020)

Population
  USA: 331 M
  India: 1.4 B
COVID-19 tests performed
  USA: 142 M
  India: 101 M
Cumulative cases
  USA: 8.4 M
  India: 7.8 M
% positives given tests
  USA: 8.4 / 142 = 5.9%
  India: 7.8 / 101 = 7.7%
Adjust for asymptomatic
  USA: 5.9% / 80% = 7.4%
  India: 7.7% / 80% = 9.6%
Adjust further for false negative (FNR = 30%) tests
  USA: 7.4% * 70% + (1 - 7.4%) * 30% = 29.3%
  India: 9.6% * 70% + (1 - 9.6%) * 30% = 29.1%
Adjusted actual cases
  USA: 142 M * 29.3% = 42 M
  India: 101 M * 29.1% = 29 M
Adjust for un-tested population
  USA: 42 M + (331 M - 142 M) * (8.4 M / 331 M) = 42 M + 4.8 M = 47 M
  India: 29 M + (1.4 B - 101 M) * (7.8 M / 101 M) = 29 M + 101 M = 130 M
My unpublished worksheet, suggesting that roughly 10% of the US and Indian
populations may have already been infected
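The worksheet steps can be traced with a short script. This is only a sketch under stated assumptions: the function names are hypothetical, inputs are in millions as shown on the slide, and the false-negative step is coded exactly as the slide's expression is written. Evaluated literally from unrounded inputs, that expression gives about 33% for the USA rather than the 29.3% shown, so the last step below starts from the slide's own adjusted case counts (42 M and 29 M) instead of recomputing them.

```python
def positivity(cases_m, tests_m):
    """Share of performed tests that were positive."""
    return cases_m / tests_m

def adjust_asymptomatic(rate, symptomatic_share=0.80):
    """Scale up, assuming only the symptomatic 80% were captured."""
    return rate / symptomatic_share

def adjust_false_negative(rate, fnr=0.30):
    """The slide's expression, as written: rate * (1 - FNR) + (1 - rate) * FNR."""
    return rate * (1 - fnr) + (1 - rate) * fnr

def add_untested(adjusted_cases_m, population_m, tests_m, background_rate):
    """Add expected cases among people who were never tested."""
    return adjusted_cases_m + (population_m - tests_m) * background_rate

# USA inputs from the slide (millions).
usa_pos = positivity(8.4, 142)
usa_asym = adjust_asymptomatic(usa_pos)
usa_fn = adjust_false_negative(usa_asym)
usa_total = add_untested(42, 331, 142, 8.4 / 331)

# India inputs from the slide (millions); the slide uses cases/tests as the background rate.
india_total = add_untested(29, 1400, 101, 7.8 / 101)

print(f"USA positivity {usa_pos:.1%}, asymptomatic-adjusted {usa_asym:.1%}")
print(f"USA false-negative expression as written: {usa_fn:.1%}")
print(f"USA total ~{usa_total:.0f} M, India total ~{india_total:.0f} M")
```

Note that the two columns use different background rates for the untested population (cases per capita for the USA, cases per test for India), which is one reason the India estimate grows so much in the final step.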
Data Science Lessons Learned
1. Sound data science is built on sound data collection and curation
• Epidemiology and public health data collections
2. Ensure that data are of high quality and trustworthy before using them
3. Valid real-world assumptions come before the “theoretical beauty” of any model
Outline
• COVID-19: can we estimate the pandemic numbers?
• COVID-19 Big Data, Data Science, and AI
• Data science: team-based approach at UAB
• Modeling, simulations, and predictions
• Final thoughts: data-driven medicine
Global sharing of COVID-19 information critical
for biomedical research and healthcare
The Lancet, Volume 395, Issue 10224, p. 537, February 22, 2020
Growth of COVID-19 Papers in PubMed
N = 65,685 as of 10/28/2020
• 160% of all UAB PubMed publications since 1981
• 50% of all cancer-related PubMed publications in 2020
• 3 years of all infectious disease publications (2017-19)
• 16% of all papers contain the word “data” in the title/abstract
COVID-19 Paper Trends by Topic Themes
Mechanism (N=7070)
Transmission (N=1820)
Diagnosis (N=10235)
Treatment (N=14787)
Prevention (N=21972)
Forecasting (N=945)
Trend categories: Specialized, Peaked, Stable/Growing
407 Papers (0.6%) on “Big Data” or “AI”
Weekly publications World Map
COVID-19 Big Data Source
COVID-19 Biomedical Data at a Glance
• ClinicalTrials.gov: 3,768
• Nucleotide sequences: 36,288
• Gene Expression Omnibus samples: 595
• GitHub repositories: 78,309
• DrugBank targets: 81
• PubChem compounds: 1154
• N3C Clinical Data
• CORD-19 Curated Literature
AI helping fight against COVID-19
• Detecting outbreaks, tracing contacts and shaping public health
responses
• Screening for people who might be infected
• Facilitating earlier diagnosis
• Predicting risk of deterioration and poor outcomes
• Augmenting remote monitoring and virtual care
• Developing potential treatments and vaccines
• Assisting hospital responses
PMID: 33111999
PMID: 33110297
PMID: 33100403
Individual Patient Fatality Prediction using AI
PMID: 33102426
Robot-nurse and AI radiologist
Criticisms of Black-box AI in the era of COVID-19
• Risk of cyber-attack
  • E.g., input attacks that lead to wrong results (criminal offense) are difficult to discern
• Risk of systematic bias affecting the patient’s healthcare
  • Sampling bias, e.g., against minorities or rare disorders or gender/age
  • Humans have an “automation” bias (i.e., they assume machines are free of the biases in human decisions)
  • Machines are not held as responsible as human physicians
• Risk of mismatch
  • Correlation instead of causation
  • Unable to discern “meta-risk” (false risk) from real risk
PMID: 33110296
Data Science Lessons Learned
1. A global community of open data sharing is critical to fighting future major diseases
2. AI is increasingly relevant for biomedical research and selected healthcare applications
3. AI is not mature enough for broad-scale adoption in the US
Outline
• COVID-19: can we estimate the pandemic numbers?
• COVID-19 Big Data, Data Science, and AI
• Data science: team-based approach at UAB
• Modeling, simulations, and predictions
• Final thoughts: data-driven medicine
Can we speed up team data science development?
1. Ideas & Team Formation
2. Prototyping
3. Pilot Studies
4. System Building
5. Real-world Solutions
Feedback, decomposition, testing, refinement, integration, and redeployment
We developed a new COVID-UBRITE infrastructure
Amy Wang, MD
Matt Wyatt
Jelai Wang
http://covid19.ubrite.org/
© UAB. All Rights Reserved.
…the hackathon was a significant factor in the project's success
…scheduling the hackathon earlier in the tool development process would have been even more beneficial.
Many student participants began as COVID-19
“knowledge curators” during lockdown
Idea generation/filtering for the Hackathon
2020 COVID-19 Data Science Hackathon:
An Overview
116 MS Teams users, 90 registrants, 40 registrants, 10 teams, 3 winners
Pre-hackathon Course (5/11 - 6/5): “Programming with Biological Data” (Rosenberg, Basu, & Wang)
Bootcamps (6/9-12): “UBRITE for COVID-19 Hackathon” (Chen, Wang, Jelai & Wyatt)
Two-day Hackathon (6/15-16): “COVID-19 Data-driven Medicine” (Wang, Jelai, Wyatt & Chen)
Hackathon Symposium (6/19): “COVID-19 Hackathon Showcase” (Wang)
Selected Team Projects Performed by Contestants
• Modeling the COVID-19 Epidemic with CNN and Conway’s Game of Life
• County-level social determinants of COVID-19 health
• Genetic epidemiology analysis of SARS-CoV-2
• A Network-based SEIDR model in simulating the outbreak of COVID-19
Post-hackathon Survey Results
What do you like about the hackathon? What is your overall experience?
http://discovery.informatics.uab.edu/PAGER-COV/
11,835 PAGs (pathways, annotated gene lists,
and gene signatures)
19,996,993 PAG-to-PAG relationships stored in
the database
Data Science Lessons Learned
1. There are three pillars to successful team-based data science: people, technology, and science
2. Data curation and data processing take the majority of the time, yet they are essential to problem-solving
3. To be prepared for national impact in a rapidly evolving environment, a person/team needs to already be an expert on a selected topic
Outline
• COVID-19: can we estimate the pandemic numbers?
• COVID-19 Big Data, Data Science, and AI
• Data science: team-based approach at UAB
• Modeling, simulations, and predictions
• Final thoughts: data-driven medicine
Are COVID-19 Health Policies Evidence-based?
Policy indices of COVID-19 government responses
https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md
The Oxford COVID-19 Government Response Tracker
(OxCGRT) provides a systematic measure across
governments and across time to understand how
government responses have evolved over the full
period of the disease’s spread.
US geographical variations of CHI
Figure axes: CHI and reported daily cases vs. days after 1st recorded case
Source: University of Oxford
https://www.bsg.ox.ac.uk/research/publications/variation-us-states-responses-covid-19
Containment and Health Index (CHI)
Drawn according to state government’s party affiliations
Source: University of Oxford
https://www.bsg.ox.ac.uk/research/publications/variation-us-states-responses-covid-19
Mask Wearing: a central policy mandate difference
between Republican- and Democrat-controlled states
Salem, IN (9/19)
Simulating the impact of three health policies
Scenario: Current projection
• Mask use in the population? Assumes mask use continues at currently observed rates
• When are mandates removed? Assumes that the gradual easing of social distancing mandates continues
• Threshold at which mandates are re-imposed? Assumes that mandates will be re-imposed for six weeks if daily deaths reach 8 per million
• What mandates? Educational facilities closed; non-essential businesses closed; people ordered to stay at home; large gatherings banned

Scenario: Mandates easing
• Mask use in the population? Assumes mask use continues at currently observed rates
• When are mandates removed? Assumes that the gradual easing of social distancing mandates continues
• Threshold at which mandates are re-imposed? Assumes that mandates are never re-imposed
• What mandates? Not applicable

Scenario: Universal masks
• Mask use in the population? Assumes mask use rises to 95% within 7 days
• When are mandates removed? Assumes that the gradual easing of social distancing mandates continues
• Threshold at which mandates are re-imposed? Assumes that mandates will be re-imposed for six weeks if daily deaths reach 8 per million
• What mandates? Educational facilities closed; non-essential businesses closed; people ordered to stay at home; large gatherings banned
Source: UW Institute for Health Metrics and Evaluation https://covid19.healthdata.org
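The re-imposition rule in the scenarios above (mandates return for six weeks once daily deaths reach 8 per million) can be written as a simple trigger. The sketch below is an illustration only and not the IHME implementation; the function name and the example death series are hypothetical.

```python
def mandate_active(daily_deaths, population, threshold_per_million=8.0, duration_days=42):
    """Return one boolean per day: True while re-imposed mandates are in force.

    Mandates switch on the day deaths per million reach the threshold and
    stay on for `duration_days` (six weeks) before the rule is checked again.
    """
    active = []
    remaining = 0
    for deaths in daily_deaths:
        per_million = deaths / population * 1_000_000
        if remaining == 0 and per_million >= threshold_per_million:
            remaining = duration_days
        active.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return active

# Hypothetical series for a population of 1 million: the threshold is hit on day 2.
example = mandate_active([5, 8, 3] + [0] * 45, population=1_000_000)
print(example.index(True), sum(example))
```

In the "mandates easing" scenario the trigger is simply never applied, which is equivalent to setting the threshold to infinity.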
Public mask use over time in the US
% of population who say they would wear a mask in public
Source: https://covid19.healthdata.org/united-states-of-america?view=total-deaths&tab=trend
Singapore, China, etc
Change in mobility over time in the US
Mobility estimated from cell phone proximity data
Source: https://covid19.healthdata.org/united-states-of-america?view=social-distancing&tab=trend
Projection: mask use, testing, isolation,
and contact tracing all together
Universal masks could save 240K lives, while mandates
easing could cost 350K more lives (by 1/1/2021)
Source: https://covid19.healthdata.org/global?view=total-deaths&tab=trend
The Network SEIR Model
susceptible (S), exposed (E), infected (I),
deceased (D), and recovered (R)
Network hierarchy: family level, community level, county level, state level (4,908,621)
Family size avg.: 2.52±1.5; community size avg.: 3000±1000; county size avg.: 35000±10000
β: a susceptible-infected contact results in a new exposure
σ: an exposed person becomes infective
θ: the group-to-group contact rate
Confirmed case propagation in the community network of Jefferson County, AL
Map snapshots: 3/14/20, 3/28/20, 4/11/20, 4/25/20, 5/9/20, 5/23/20, 6/6/20, 6/14/20
Legend (confirmed cases): 0, 1-9, 10-49, 50-99, 100-249, 250-499, 500+
Population: 658,573
Quarantine: community
Average size: 3000
The first infected case (1 case) in Jefferson County, Alabama was reported on 3/13/20
Developing a parameterized network SEIR model to
predict Jefferson county infections on 1/1/2021
April 4th, Stay-at-home order
April 30th, Reopened
May 25th, Protest
Model β σ θ
stay_home 0.033 0.353 0.0006
reopen 0.039 0.359 0.003
protest 0.043 0.376 0.002
Parameter  Interpretation  Range
β  A susceptible-infected contact results in a new exposure  [0, 0.15]
σ  An exposed person becomes infective  [0.3, 0.4]
θ  The group-to-group contact rate  [0, 0.01]
Define Model for Jefferson County, AL
• Total population: 658,573
• Community size: 3,084±941
• Community: 219
1. Model construction
2. Model parameter fitting
3. Model parameterization
4. 1/1/2021 prediction
Figure labels: 35,000; 43,000; 48,000 (time axis from 10/30/2020 to 1/1/2021)
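To make the roles of β, σ, and θ concrete, here is a minimal discrete-time sketch of a network SEIR step. It is not the authors' model: the coupling term (θ times the infective prevalence of neighboring communities), the removal rate `gamma`, and the two-community example are all assumptions made for illustration, and deaths are lumped into the removed compartment.

```python
def seir_step(S, E, I, R, beta, sigma, gamma, external=0.0):
    """Advance one community by one day.

    beta: rate at which a susceptible-infected contact creates a new exposure
    sigma: rate at which an exposed person becomes infective
    gamma: removal rate (assumed here; not given on the slides)
    external: extra exposure pressure arriving from neighboring communities
    """
    N = S + E + I + R
    new_exposed = min(S, S * (beta * I / N + external))
    new_infective = sigma * E
    new_removed = gamma * I
    return (S - new_exposed,
            E + new_exposed - new_infective,
            I + new_infective - new_removed,
            R + new_removed)

def network_step(states, neighbors, beta, sigma, gamma, theta):
    """Advance every community; theta scales group-to-group contact."""
    prevalence = [I / (S + E + I + R) for S, E, I, R in states]
    updated = []
    for i, state in enumerate(states):
        pressure = theta * sum(prevalence[j] for j in neighbors.get(i, []))
        updated.append(seir_step(*state, beta, sigma, gamma, external=pressure))
    return updated

# Two hypothetical communities of 3,000 people; one seed case in the first.
states = [(2999.0, 0.0, 1.0, 0.0), (3000.0, 0.0, 0.0, 0.0)]
neighbors = {0: [1], 1: [0]}

# "stay_home" parameters from the table above; gamma = 0.1 is an assumption.
for _ in range(30):
    states = network_step(states, neighbors, beta=0.033, sigma=0.353, gamma=0.1, theta=0.0006)

total_population = sum(sum(s) for s in states)
print(f"total population after 30 days: {total_population:.1f}")
```

Swapping in the "reopen" or "protest" rows of the parameter table changes only the three rate arguments, which is what makes a policy period comparable to another in this kind of model.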
Data Science Lessons Learned
1. Consider discretizing qualitative traits such as health policy so that they can be studied quantitatively
2. Mathematical/engineering models can better explain mechanisms while also giving predictions
3. Governmental health policy can directly save or cost human lives!
Outline
• COVID-19: can we estimate the pandemic numbers?
• COVID-19 Big Data, Data Science, and AI
• Data science: team-based approach at UAB
• Modeling, simulations, and predictions
• Final thoughts: data-driven medicine
Symptoms: physiological changes, discomfort, pain, wounds, bleeding, …
Spontaneity: spiritual rituals, religion, herbal medicine, trials and errors, natural healing, …
Nature-inspired remedies

Human Diseases: cellular abnormality, subcellular changes, genetic variation, altered reactions, phenotypic changes, organ damage, …
Experimental Tools: animal models, microscopy, X-ray crystallography, neuroimaging, radioactive tracers, surgery, …
Physician-led hospital care

Individual Abnormality: genetic basis of the disease, mutations, inflammation, oxidative stress, cytokines/hormones, biochemical changes, …
Data science + AI: genomic medicine, precision medicine, CRISPR, network medicine, deep learning, robo-surgery, AI-enabled drug discovery, …
Data-driven robo-doc-assisted healthcare
Acknowledgment
Zongliang Yue, PhD
Eric Zhang,
Sunny Khurana
Nishant Batra
Son Do Hai Dang
Thanh Nguyen, PhD
The AI.Med Lab
Univ. WI Madison
U54TR002731
U01CA223976
R01AI134023
R01AR073850
UAB Informatics Institute
Jim Cimino, MD
Amy Wang, MD
Wayne Liang, MD
Jelai Wang
Matt Wyatt
Geoff Gordon
Nafisa Ajala
Heather Watts
UAB CS
Da Yan, PhD
Rubio Pique Jose
UAB BME
Jay Zhang, MD/PhD
Clark Xu

Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactdawncurless
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDThiyagu K
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 

Dernier (20)

Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..Sports & Fitness Value Added Course FY..
Sports & Fitness Value Added Course FY..
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Disha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdfDisha NEET Physics Guide for classes 11 and 12.pdf
Disha NEET Physics Guide for classes 11 and 12.pdf
 
The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...Russian Call Girls in Andheri Airport Mumbai WhatsApp  9167673311 💞 Full Nigh...
Russian Call Girls in Andheri Airport Mumbai WhatsApp 9167673311 💞 Full Nigh...
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
Accessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impactAccessible design: Minimum effort, maximum impact
Accessible design: Minimum effort, maximum impact
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Advance Mobile Application Development class 07
Advance Mobile Application Development class 07Advance Mobile Application Development class 07
Advance Mobile Application Development class 07
 
Measures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SDMeasures of Dispersion and Variability: Range, QD, AD and SD
Measures of Dispersion and Variability: Range, QD, AD and SD
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptxINDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
INDIA QUIZ 2024 RLAC DELHI UNIVERSITY.pptx
 

Lessons from COVID-19: How Are Data Science and AI Changing Future Biomedical Research?

  • 1. Lessons from COVID-19: How Are Data Science and AI Changing Future Biomedical Research. Jake Y. Chen, PhD, FACMI; Professor & Chief Bioinformatics Officer, Informatics Institute, The University of Alabama at Birmingham, USA. jakechen@uab.edu | aimed-lab.org. October 30th, 2020
  • 3. Outline • COVID-19: can we estimate the pandemic numbers? • COVID-19 Big Data, Data Science, and AI • Data science: team-based approach at UAB • Modeling, simulations, and predictions • Final thoughts: data-driven medicine
  • 4. On 3/10, I gave the first UAB-wide COVID-19 talk
  • 5. The world was alarmed while the US had fewer than 1,000 cases
  • 6. My COVID-19 prediction relied on inferring the world’s response from China’s accomplishment. [Figure: case counts on a log scale (10 to 100k) for China vs. other countries, with a ~1 month delay.] Jake’s prediction (if contained): • 3/20: 87k • 4/1: 130-170k • 5/1: ~300k • June: season ends. Total affected: <500K worldwide excluding China. Total deaths: <15k worldwide excluding China. Warm weather appears to curb transmission. Analysis done on March 10th, 2020; data source: JHU
  • 7. Why was I so wrong in my prediction? “Infectious diseases do not respect national boundaries…The United States should be investing efforts and funds to strengthen the health structures in countries around the world.” Source: “Lessons from SARS” (2003), Science, Vol. 300, Issue 5620, p. 701
  • 8. COVID-19: Where are we now? 1/21 to 4/11: 500,000 cases; 10/21 to 10/28: 500,000 cases. • 500,000: cases in the past week • 15.2 days: doubling time since the start • 109 days: doubling time in the past two months
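A doubling time such as those quoted on slide 8 can be estimated from two case counts under an assumed exponential growth model. The sketch below is only an illustration of that standard formula; the function name and the example figures are hypothetical and are not the data behind the slide.

```python
import math


def doubling_time(days_elapsed, cases_start, cases_end):
    """Days needed for the case count to double, assuming exponential growth.

    Uses T = t * ln(2) / ln(N_end / N_start).
    """
    if cases_start <= 0 or cases_end <= cases_start:
        raise ValueError("need 0 < cases_start < cases_end")
    return days_elapsed * math.log(2) / math.log(cases_end / cases_start)


# Hypothetical illustration (not the slide's data):
# growth from 1,000 to 8,000 cases over 30 days is three doublings.
example = doubling_time(30, 1_000, 8_000)
print(round(example, 1))  # 10.0
```

A long doubling time over a recent window, compared with the doubling time since the start, indicates that growth has slowed relative to the early phase.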
  • 9. A very conservative look into the near future • 10.6 M cases by 12/1/2020 • 13 M cases by 1/1/2021 • 20 M cases by March 2021
  • 10. COVID-19 Infection Fatality Ratio (IFR): US vs. other countries over time
  • 11. COVID-19 Infection Fatality Ratio (IFR): a historic comparison between the US and China
      Age group | Chinese CDC (2/11/2020) | US CDC (9/10/2020)
      0-19 years | 0.1% (1 in 1,000) | 0.003% (1 in 33,333)
      20-49 years | 0.26% (1 in 375) | 0.02% (1 in 5,000)
      50-69 years | 2.45% (1 in 40) | 0.5% (1 in 200)
      70-79 years | 8% (1 in 12) | 5.5% (1 in 18)
      80+ years | 14.8% (1 in 7) | NA
      Source: China CDC Weekly, Vol. 2, No. 8, pp. 113-122; US CDC https://www.cdc.gov/coronavirus/2019-ncov/hcp/planning-scenarios.html
  • 12. Asymptomatic COVID-19 patients • “40-45% of COVID-19 patients may be asymptomatic” (D.P. Oran and E.J. Topol, “Prevalence of Asymptomatic SARS-CoV-2 Infection”, Annals of Internal Medicine, 9/1/2020) • About 20% of people who become infected with SARS-CoV-2 remain asymptomatic throughout infection, based on 94 studies (D. Buitrago-Garcia et al., “Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis”, PLOS Medicine, 9/22/2020)
  • 13. The Number of COVID-19 Tests Performed Varies by Country
  • 14. The COVID-19 PCR diagnostic test has a false negative rate of 25-50% (days 5-16). Source: Ann Intern Med. 2020 Aug 18;173(4):262-267. doi: 10.7326/M20-1495
  • 15. Is a 10% COVID-19 infection rate worldwide true? “WHO’s best estimates indicate roughly 1 in 10 people worldwide may have been infected by the coronavirus.” (Dr. Michael Ryan, WHO COVID-19 Executive Board, 10/5/2020). 10% of the world population (7.8 B) = 780 M, or roughly 22 times the 35 M confirmed cases (10/4/2020)
  • 16. My unpublished worksheet showing that 10% of the entire USA and India populations may have already been infected
      Quantity | USA | India
      Population | 331 M | 1.4 B
      COVID-19 tests performed | 142 M | 101 M
      Cumulative cases | 8.4 M | 7.8 M
      % positives given tests | 8.4 / 142 = 5.9% | 7.8 / 101 = 7.7%
      Adjust for asymptomatic | 5.9% / 80% = 7.4% | 7.7% / 80% = 9.6%
      Adjust further for false-negative tests (FNR = 30%) | 7.4% * 70% + (1 - 7.4%) * 30% = 29.3% | 9.6% * 70% + (1 - 9.6%) * 30% = 29.1%
      Adjusted actual cases | 142 M * 29.3% = 42 M | 101 M * 29.1% = 29 M
      Adjust for untested population | 42 M + (331 M - 142 M) * (8.4 M / 331 M) = 42 M + 4.8 M = 47 M | 29 M + (1.4 B - 101 M) * (7.8 M / 101 M) = 29 M + 101 M = 130 M
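The first, second, and last rows of the slide 16 worksheet can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, all quantities are in millions, and the false-negative-adjusted positive rate (29.3% and 29.1%) is passed in as given on the worksheet rather than recomputed. The rate applied to the untested population also follows the worksheet, which uses cases per capita for the USA and cases per test for India.

```python
def positivity(cases_m, tests_m):
    """Share of tests that were positive."""
    return cases_m / tests_m


def adjust_for_asymptomatic(rate, symptomatic_share=0.80):
    """Scale up the positive rate, assuming only the symptomatic share is detected."""
    return rate / symptomatic_share


def total_infected(tests_m, adjusted_rate, population_m, untested_rate):
    """Infections among the tested plus extrapolated infections among the untested."""
    among_tested = tests_m * adjusted_rate
    among_untested = (population_m - tests_m) * untested_rate
    return among_tested + among_untested


# Inputs copied from the worksheet (millions).
usa_positivity = positivity(8.4, 142)                   # about 0.059
usa_adjusted = adjust_for_asymptomatic(usa_positivity)  # about 0.074
usa_total = total_infected(142, 0.293, 331, 8.4 / 331)
india_total = total_infected(101, 0.291, 1400, 7.8 / 101)

print(f"USA: {usa_total:.1f} M, India: {india_total:.1f} M")
```

Without the worksheet's intermediate rounding, the totals come out near 46 M for the USA and 130 M for India, close to the rounded 47 M and 130 M on the slide.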
  • 17. Data Science Lessons Learned 1. Sound data science is built on sound data collection and curation • Epidemiology and public health data collections 2. Ensure data quality and trustworthiness before using the data 3. Valid real-world assumptions come before the “theoretical beauty” of any model
  • 18. Outline • COVID-19: can we estimate the pandemic numbers? • COVID-19 Big Data, Data Science, and AI • Data science: team-based approach at UAB • Modeling, simulations, and predictions • Final thoughts: data-driven medicine
  • 19. Global sharing of COVID-19 information is critical for biomedical research and healthcare. Source: The Lancet, Volume 395, Issue 10224, p. 537, February 22, 2020
  • 20. Growth of COVID-19 Papers in PubMed: N = 65,685 as of 10/28/2020 • 160% of all UAB PubMed publications since 1981 • 50% of all cancer-related PubMed publications in 2020 • 3 years of all infectious disease publications (2017-19) • 16% of the papers contain the word “data” in the title/abstract
  • 21. COVID-19 Paper Trends by Topic Themes: Mechanism (N=7070); Transmission (N=1820); Diagnosis (N=10235); Treatment (N=14787); Prevention (N=21972); Forecasting (N=945). Trend categories: Specialized, Peaked, Stable/Growing
  • 22. 407 Papers (0.6%) on “Big Data” or “AI” [Figure panels: weekly publications; world map]
  • 23. COVID-19 Big Data Sources
  • 24. COVID-19 Biomedical Data at a Glance: 3,768 ClinicalTrials.gov; 36,288 nucleotide sequences; 595 Gene Expression Omnibus samples; N3C clinical data; CORD-19 curated literature; 78,309 GitHub repositories; 81 DrugBank targets; 1,154 PubChem compounds
  • 26. AI is helping the fight against COVID-19 • Detecting outbreaks, tracing contacts, and shaping public health responses • Screening for people who might be infected • Facilitating earlier diagnosis • Predicting risk of deterioration and poor outcomes • Augmenting remote monitoring and virtual care • Developing potential treatments and vaccines • Assisting hospital responses. Source: PMID: 33111999
  • 29. Individual Patient Fatality Prediction using AI. Source: PMID: 33102426
  • 30. Robot-nurse and AI radiologist
  • 31. Criticisms of Black-box AI in the era of COVID-19 • Risk of cyber-attack • E.g., input attacks that lead to wrong results (a criminal offense) are difficult to discern • Risk of systematic bias affecting the patient’s healthcare • Sampling bias, e.g., against minorities, rare disorders, or gender/age groups • Humans have an “automation bias” (i.e., they assume machines are free of the biases in human decisions) • Machines are not held as responsible as human physicians • Risk of mismatch • Correlation instead of causation • Unable to discern “meta-risk” (false risk) from real risk. Source: PMID: 33110296
  • 32. Data Science Lessons Learned 1. A global community of open data sharing is critical to fighting future major diseases 2. AI is increasingly relevant for biomedical research and selected healthcare applications 3. AI is not mature enough for broad-scale adoption in the US
  • 33. Outline • COVID-19: can we estimate the pandemic numbers? • COVID-19 Big Data, Data Science, and AI • Data science: team-based approach at UAB • Modeling, simulations, and predictions • Final thoughts: data-driven medicine
  • 34. Can we speed up team data science development? 1. Ideas & Team Formation 2. Prototyping 3. Pilot Studies 4. System Building 5. Real-world Solutions • Feedback, decomposition, testing, refinement, integration, and redeployment
  • 35. We developed a new COVID-UBRITE infrastructure (Amy Wang, MD; Matt Wyatt; Jelai Wang): http://covid19.ubrite.org/
  • 36. …the hackathon was a significant factor in the project's success. …scheduling the hackathon earlier in the tool development process would have been even more beneficial.
  • 37. Many student participants began as COVID-19 “knowledge curators” during lockdown
  • 39. 2020 COVID-19 Data Science Hackathon: An Overview. Participation figures (in slide order): 116 MS Teams users, 90 registrants, 40 registrants, 10 teams, 3 winners. • 5/11 to 6/5: Pre-hackathon Course “Programming with Biological Data” (Rosenberg, Basu, & Wang) • 6/9-12: Bootcamps “UBRITE for COVID-19 Hackathon” (Chen, Wang, Jelai & Wyatt) • 6/15-16: Two-day Hackathon “COVID-19 Data-driven Medicine” (Wang, Jelai, Wyatt & Chen) • 6/19: Hackathon Symposium “COVID-19 Hackathon Showcase” (Wang)
  • 40. Selected Team Projects Performed by Contestants • Modeling the COVID-19 Epidemic with CNN and Conway’s Game of Life • County-level social determinants of COVID-19 health • Genetic epidemiology analysis of SARS-CoV-2 • A Network-based SEIDR model in simulating the outbreak of COVID-19
  • 41. Post-hackathon Survey Results: What do you like about the hackathon? What is your overall experience?
  • 42. PAGER-COV (http://discovery.informatics.uab.edu/PAGER-COV/): 11,835 PAGs (pathways, annotated gene lists, and gene signatures) and 19,996,993 PAG-to-PAG relationships stored in the database
  • 43. Data Science Lessons Learned 1. There are three pillars to successful team-based data science: people, technology, and science 2. Data curation and data processing take the majority of the time, yet they are essential to problem-solving 3. To be prepared for national impact in a rapidly evolving environment, a person or team needs to already be an expert on a selected topic
  • 44. Outline • COVID-19: can we estimate the pandemic numbers? • COVID-19 Big Data, Data Science, and AI • Data science: team-based approach at UAB • Modeling, simulations, and predictions • Final thoughts: data-driven medicine
  • 45. Are COVID-19 Health Policies Evidence-based?
  • 46. Policy indices of COVID-19 government responses. The Oxford COVID-19 Government Response Tracker (OxCGRT) provides a systematic measure across governments and across time to understand how government responses have evolved over the full period of the disease’s spread. [Figure: Containment and Health Index (CHI).] Source: https://github.com/OxCGRT/covid-policy-tracker/blob/master/documentation/codebook.md
  • 47. US geographical variations of CHI. [Figure: reported daily cases and CHI vs. days after the first recorded case.] Source: University of Oxford https://www.bsg.ox.ac.uk/research/publications/variation-us-states-responses-covid-19
  • 48. Containment and Health Index (CHI), drawn according to state governments’ party affiliations. Source: University of Oxford https://www.bsg.ox.ac.uk/research/publications/variation-us-states-responses-covid-19
  • 49. Mask wearing: a central difference in policy mandates between Republican- and Democrat-controlled states
  • 51. Simulating the impact of three health policies
      Scenario | Mask use in the population? | When are mandates removed? | Threshold at which mandates are re-imposed? | What mandates?
      Current projection | Assumes mask use continues at currently observed rates | Assumes that the gradual easing of social distancing mandates continues | Assumes that mandates will be re-imposed for six weeks if daily deaths reach 8 per million | Educational facilities closed; non-essential businesses closed; people ordered to stay at home; large gatherings banned
      Mandates easing | Assumes mask use continues at currently observed rates | Assumes that the gradual easing of social distancing mandates continues | Assumes that mandates are never re-imposed | Not applicable
      Universal masks | Assumes mask use rises to 95% within 7 days | Assumes that the gradual easing of social distancing mandates continues | Assumes that mandates will be re-imposed for six weeks if daily deaths reach 8 per million | Educational facilities closed; non-essential businesses closed; people ordered to stay at home; large gatherings banned
      Source: UW Institute for Health Metrics and Evaluation https://covid19.healthdata.org
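The re-imposition rule in the scenarios above (mandates return for six weeks once daily deaths reach 8 per million) can be expressed as a simple trigger. The sketch below is a hypothetical illustration of that rule only, not IHME's implementation; the function name, the input series, and the choice to re-check the threshold only after a six-week period ends are all assumptions.

```python
def mandate_schedule(daily_deaths_per_million, threshold=8.0, duration_days=42):
    """Return one boolean per day: True while re-imposed mandates are in force."""
    active = []
    remaining = 0  # days left in the current mandate period
    for deaths in daily_deaths_per_million:
        # Trigger a new six-week period only when no period is running.
        if remaining == 0 and deaths >= threshold:
            remaining = duration_days
        active.append(remaining > 0)
        if remaining > 0:
            remaining -= 1
    return active


# Hypothetical series: the threshold is first reached on day index 2.
series = [1, 5, 8, 9, 2] + [0] * 50
schedule = mandate_schedule(series)
print(sum(schedule))  # 42
```

Under the "mandates easing" scenario the equivalent would be a schedule that is always False, since mandates are never re-imposed.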
  • 52. Public mask use over time in the US: % of population who say they would wear a mask in public. [Figure annotation: Singapore, China, etc.] Source: https://covid19.healthdata.org/united-states-of-america?view=total-deaths&tab=trend
  • 53. Change in mobility over time in the US: mobility estimated from cell phone proximity data. Projection: mask use, testing, isolation, and contact tracing all together. Source: https://covid19.healthdata.org/united-states-of-america?view=social-distancing&tab=trend
  • 54. Universal masks could save 240K lives, while mandates easing could cost 350K more lives (by 1/1/2021). Source: https://covid19.healthdata.org/global?view=total-deaths&tab=trend
  • 55. The Network SEIR Model: susceptible (S), exposed (E), infected (I), deceased (D), and recovered (R). [Figure: nested contact network at the family, community, county, and state levels; 4,908,621 at the state level.] Family size avg.: 2.52±1.5; community size avg.: 3000±1000; county size avg.: 35000±10000
  • 56. Confirmed case propagation in the community network of Jefferson County, AL. [Figure: map snapshots dated 3/14/20, 3/28/20, 4/11/20, 4/25/20, 5/9/20, 5/23/20, 6/6/20, and 6/14/20; confirmed-case color scale: 0, 1-9, 10-49, 50-99, 100-249, 250-499, 500+.] Parameters: β, a susceptible-infected contact results in a new exposure; σ, an exposed person becomes infective; θ, the group-to-group contact rate. Population: 658,573. Quarantine: community. Average size: 3000. The first infected case (1 case) in Jefferson County, Alabama was reported on 3/13/20
  • 57. Developing a parameterized network SEIR model to predict Jefferson County infections on 1/1/2021. Steps: 1. Model construction 2. Model parameter fitting 3. Model parameterization 4. 1/1/2021 prediction
      Define model for Jefferson County, AL: total population 658,573; community size 3,084±941; communities: 219
      Parameter | Interpretation | Range
      β | A susceptible-infected contact results in a new exposure | [0, 0.15]
      σ | An exposed person becomes infective | [0.3, 0.4]
      θ | The group-to-group contact rate | [0, 0.01]
      Events: April 4th, stay-at-home order; April 30th, reopened; May 25th, protest
      Model | β | σ | θ
      stay_home | 0.033 | 0.353 | 0.0006
      reopen | 0.039 | 0.359 | 0.003
      protest | 0.043 | 0.376 | 0.002
      [Figure: projected infections from 10/30/2020 to 1/1/2021, with labeled values 48,000, 43,000, and 35,000.]
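To make the roles of β, σ, and θ concrete, here is a minimal sketch of a community-coupled SEIR update in discrete daily steps. It is not the network SEIR/SEIDR model from the slides: the coupling form, the merged "removed" compartment (recovered and deceased together), the removal rate `gamma`, and the three-community setup are all assumptions made for illustration. Only the β, σ, and θ values are taken from the "stay_home" row on slide 57.

```python
def step(state, beta, sigma, theta, gamma):
    """Advance every community by one day.

    state is a list of (S, E, I, R) tuples, one per non-empty community.
    """
    prevalence = [i / (s + e + i + r) for (s, e, i, r) in state]
    total_prevalence = sum(prevalence)
    next_state = []
    for k, (s, e, i, r) in enumerate(state):
        # Within-community pressure plus theta-weighted pressure from other communities.
        pressure = prevalence[k] + theta * (total_prevalence - prevalence[k])
        new_exposed = min(s, beta * pressure * s)  # S -> E
        new_infective = sigma * e                  # E -> I
        new_removed = gamma * i                    # I -> R (hypothetical rate)
        next_state.append((
            s - new_exposed,
            e + new_exposed - new_infective,
            i + new_infective - new_removed,
            r + new_removed,
        ))
    return next_state


def simulate(state, days, beta, sigma, theta, gamma):
    """Run the daily update for the given number of days."""
    for _ in range(days):
        state = step(state, beta, sigma, theta, gamma)
    return state


# Three hypothetical communities of 3,000 people; one starts with a single case.
communities = [(2999.0, 0.0, 1.0, 0.0), (3000.0, 0.0, 0.0, 0.0), (3000.0, 0.0, 0.0, 0.0)]
final = simulate(communities, days=30, beta=0.033, sigma=0.353, theta=0.0006, gamma=0.1)
total = sum(sum(community) for community in final)
print(round(total))  # population is conserved: 9000
```

In this sketch θ is the only route by which infection reaches a community with no cases of its own, which mirrors its description as the group-to-group contact rate.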
  • 58. Data Science Lessons Learned 1. Consider discretizing qualitative traits such as health policy so they can be studied quantitatively 2. Mathematical/engineering models can better explain mechanisms while also giving predictions 3. Governmental health policy can directly save or cost human lives!
  • 59. Outline • COVID-19: can we estimate the pandemic numbers? • COVID-19 Big Data, Data Science, and AI • Data science: team-based approach at UAB • Modeling, simulations, and predictions • Final thoughts: data-driven medicine
  • 61. Human Diseases: cellular abnormality, subcellular changes, genetic variation, altered reactions, phenotypic changes, organ damage, … Experimental Tools: animal models, microscopy, X-ray crystallography, neuroimaging, radioactive tracers, surgery, … Physician-led hospital care
  • 62. Individual Abnormality: genetic basis of the disease, mutations, inflammation, oxidative stress, cytokines/hormones, biochemical changes, … Data science + AI: genomic medicine, precision medicine, CRISPR, network medicine, deep learning, robo-surgery, AI-enabled drug discovery, … Data-driven robo-doc-assisted healthcare
  • 63. Acknowledgment: Zongliang Yue, PhD Eric Zhang, Sunny Khurana Nishant Batra Son Do Hai Dang Thanh Nguyen, PhD The AI.Med Lab Univ. WI Madison U54TR002731 U01CA223976 R01AI134023 R01AR073850 UAB Informatics Institute Jim Cimino, MD Amy Wang, MD Wayne Liang, MD Jelai Wang Matt Wyatt Geoff Gordon Nafisa Ajala Heather Watts UAB CS Da Yan, PhD Rubio Pique Jose UAB BME Jay Zhang, MD/PhD Clark Xu

Editor's notes

  1. Hospital capacity need. Second paper: improved. How the infected cases propagate
  2. To provide you a better understanding of the N-SEIDR, we show the