Health data science
Why study data science?
What is health data science?
• Data-driven solutions to complex real-world health problems
• Or deriving knowledge from unstructured and messy data
• It is an interdisciplinary field: biostatistics, computer science,
epidemiology, public health, mathematics, etc.
But basically…
Real-life health data science examples
• HIV:
• Visualising the pattern of early HIV transmission within the mucosal barrier
• COVID-19:
• What can predict covid-19 neutralisation activity?
• Can we predict covid-19 vaccine efficacy?
Early HIV transmission
dynamics
Background
• Early HIV transmission events can occur during vaginal or anal sex
• We want to investigate whether the mucosal barrier (within the vaginal tissue) is
effective in blocking HIV transmission
If the mucosal barrier is good at preventing viral
transmission, this is what we expect to see
If the mucosal barrier is not good at preventing
transmission, multiple viruses can be found
(random infection)
If the mucosal barrier is not good at preventing
transmission, multiple viruses can be found
(clustered infection)
Animal experiment
Data
Data Visualisation
Many viral variants are still visible:
no evidence that the vaginal tissue
is effective in blocking viral entry
Need a formal method
• How can we say (formally) whether infection is spatially clustered or not?
• Mantel test (or Mantel and Valand) -> relates a matrix of
“geographical” distance to a matrix of “biological” distance
• So we first need to define the “geographical” matrix and the “biological” matrix
“Geographical” distance
• Euclidean distance
$d_{i,j} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}$
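A minimal sketch of building the “geographical” matrix from (x, y) coordinates. The function name `pairwise_euclidean` and the three site coordinates are made up for illustration; they are not from the experiment.

```python
import math

def pairwise_euclidean(points):
    """Return the symmetric matrix of Euclidean distances between (x, y) points."""
    n = len(points)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            dist[i][j] = dist[j][i] = d
    return dist

# Hypothetical coordinates of three sampling sites (illustrative only).
sites = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
geo = pairwise_euclidean(sites)
```

The diagonal stays at zero because each site is at distance zero from itself.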
“Biological” distance
• Morisita-Horn index of overlap
$MH = \dfrac{2 \sum_i \frac{n_{1i}\, n_{2i}}{N_1 N_2}}{\sum_i \left( \frac{n_{1i}^2}{N_1^2} + \frac{n_{2i}^2}{N_2^2} \right)}$
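A small sketch of the Morisita-Horn index in its standard form. It assumes that n1i and n2i are the counts of variant i in samples 1 and 2, and that N1 and N2 are the sample totals; the slide does not spell this out. The function name `morisita_horn` and the example counts are hypothetical.

```python
def morisita_horn(counts_1, counts_2):
    """Morisita-Horn overlap between two samples of variant counts.

    Both lists must list the same variants in the same order.
    """
    if len(counts_1) != len(counts_2):
        raise ValueError("both samples must cover the same variants")
    total_1 = sum(counts_1)
    total_2 = sum(counts_2)
    # Numerator term: shared abundance, scaled by both sample sizes.
    cross = sum(a * b for a, b in zip(counts_1, counts_2)) / (total_1 * total_2)
    # Denominator terms: within-sample concentration of each sample.
    self_1 = sum(a * a for a in counts_1) / total_1 ** 2
    self_2 = sum(b * b for b in counts_2) / total_2 ** 2
    return 2 * cross / (self_1 + self_2)

# Made-up counts: same composition at different depth, then no shared variants.
same_mix = morisita_horn([1, 1], [10, 10])
disjoint = morisita_horn([10, 0], [0, 10])
```

Identical composition gives an overlap of 1 regardless of sample size, and samples with no variant in common give 0.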
“Biological” distance
• Similarity between 1 and 2 = 0.98
• Similarity between 1 and 3 = 0.46
Mantel Test (or Mantel and Valand)
• Tests the association between two matrices
• The Mantel quantity (Zm) is given by:
$Z_m = \sum_i \sum_j g_{ij}\, b_{ij}$
• Basic idea -> permutation test
• Randomly permute the rows and columns (in the same order) of one of the two matrices
• Store the value of Zm for each permutation, then compare the observed Zm against these values
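The permutation idea can be sketched as follows. This is an illustration, not the analysis code behind the slides. It assumes both matrices are distances, so clustering shows up as a large Zm and the p-value is taken from the upper tail. The names `mantel_statistic` and `mantel_test` and the toy matrices are hypothetical.

```python
import random

def mantel_statistic(g, b):
    """Zm: sum over all i, j of g[i][j] * b[i][j]."""
    n = len(g)
    return sum(g[i][j] * b[i][j] for i in range(n) for j in range(n))

def mantel_test(g, b, n_perm=999, seed=0):
    """Return (observed Zm, one-sided permutation p-value)."""
    rng = random.Random(seed)
    n = len(g)
    observed = mantel_statistic(g, b)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)
        # Relabel the items of one matrix: rows and columns move together.
        permuted = [[b[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if mantel_statistic(g, permuted) >= observed:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return observed, (hits + 1) / (n_perm + 1)

# Toy example: six sites on a line, with biological distance equal to
# geographical distance (perfect spatial structure).
positions = [0, 1, 2, 3, 4, 5]
geo = [[abs(a - b) for b in positions] for a in positions]
bio = [row[:] for row in geo]
observed, p_value = mantel_test(geo, bio)
```

Permuting rows and columns together keeps each matrix a valid distance matrix while breaking any link between the two.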
Low p-values: infection is clustered locally
within the vaginal tissue
What can predict covid-19
viral neutralisation activity?
Background
• Neutralising antibody (NAb): an antibody that defends the host against
a specific pathogen
• Data: 41 convalescent adults; several immunological
parameters measured (13 parameters in total)
• Goal: in those 41 recovered patients, find which
immunological parameters can be used to predict NAb
Methods
• Data visualisation is very important in data science
• First step: plot the correlation matrix for the whole dataset
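As a rough sketch of this first step, the correlation matrix can be computed with NumPy. The data below are random placeholders with the same shape as the study (41 patients, 13 parameters); they are not the real measurements.

```python
import numpy as np

rng = np.random.default_rng(0)
n_patients, n_params = 41, 13

# Placeholder data: one row per patient, one column per parameter.
data = rng.normal(size=(n_patients, n_params))
# Make column 1 track column 0 so that one strong correlation is visible.
data[:, 1] = data[:, 0] + 0.1 * rng.normal(size=n_patients)

# rowvar=False tells NumPy that variables are in columns.
corr = np.corrcoef(data, rowvar=False)
```

The result is a 13 x 13 symmetric matrix with ones on the diagonal, ready to be drawn as a heatmap.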
Microneutralization is positively correlated
with SARS-CoV-2 RBD
Microneutralization is negatively correlated
with CCR6+CXCR3-
OK, not very informative:
too many things are correlated with microneutralization
Methods
• The correlation matrix shows that NAb is correlated with many things
• Next step: Can I find some hidden features in this dataset?
• Method: principal component analysis (PCA)
The main focus is microneutralization
If the angle between microneut and another variable is less
than 90°, then it’s a positive association
If the angle between microneut and another variable is greater
than 90°, then it’s a negative association
For instance, higher ELISA S trimer gives higher
microneutralization level (less than 90°)
For instance, higher CCR6+CXCR3- gives lower
microneutralization level (more than 90°)
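The angle rule can be illustrated with a small PCA sketch. It assumes a biplot built from the first two principal components of standardised data, where the reading is only approximate unless those two components carry most of the variance. The function names and the synthetic three-variable dataset are made up.

```python
import numpy as np

def biplot_arrows(data):
    """Return one 2-D arrow per variable from a PCA of the standardised data."""
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    # Rows of vt are principal axes; scale by singular values, keep the first two.
    return (vt[:2] * s[:2, None]).T

def angle_between(u, v):
    """Angle in degrees between two arrows."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

# Synthetic data: variables 0 and 1 move together, variable 2 moves the opposite way.
rng = np.random.default_rng(1)
signal = rng.normal(size=41)
data = np.column_stack([
    signal + 0.1 * rng.normal(size=41),
    signal + 0.1 * rng.normal(size=41),
    -signal + 0.1 * rng.normal(size=41),
])
arrows = biplot_arrows(data)
angle_pos = angle_between(arrows[0], arrows[1])  # expected below 90 degrees
angle_neg = angle_between(arrows[0], arrows[2])  # expected above 90 degrees
```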
Methods
• PCA visualisation is more informative than the correlation matrix
• But we still cannot pick out a single variable that predicts NAb
• Next step: I want to pick just one thing to predict NAb
• Method: multiple linear regression with a backward model selection
strategy
• The idea is to run a linear regression with all the variables, then iteratively
remove the least significant predictor until all the remaining predictors are significant
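A sketch of backward selection, under stated simplifications. The p-values use a normal approximation instead of the t-distribution, the cut-off `alpha=0.05` is an assumption, and the data are synthetic. The function names are hypothetical and this is not the model fitted in the study.

```python
import numpy as np
from statistics import NormalDist

def ols_p_values(X, y):
    """Two-sided p-values for OLS coefficients, intercept first (normal approximation)."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    dof = len(y) - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    z = beta / np.sqrt(np.diag(cov))
    return [2 * (1 - NormalDist().cdf(abs(v))) for v in z]

def backward_select(X, y, names, alpha=0.05):
    """Drop the least significant predictor until all remaining ones pass alpha."""
    keep = list(range(X.shape[1]))
    while keep:
        p = ols_p_values(X[:, keep], y)[1:]  # skip the intercept
        worst = int(np.argmax(p))
        if p[worst] <= alpha:
            break
        keep.pop(worst)
    return [names[k] for k in keep]

# Synthetic data: only x0 and x2 truly drive the outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(41, 5))
y = 3.0 * X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.normal(size=41)
names = ["x0", "x1", "x2", "x3", "x4"]
selected = backward_select(X, y, names)
```

Removing one predictor per round matters because p-values change each time the model is refitted.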
Two main things are highly predictive of NAb
Predicting covid-19
vaccine efficacy
Background
• At the end of the phase 2 trial, we get the immunogenicity data
(measuring the amount of antibody)
• Given the data from phase 2 trial (antibody data), can we predict
what the efficacy of the vaccine will be?
• Training dataset: efficacy and antibody data from all available vaccines
Methods
• The first step is always to visualise your data, so why don’t we plot
efficacy against antibody first?
High antibody = high efficacy
Low antibody = low efficacy
Can we simply use a classification method based on the
level of antibody?
Methods
• The model is a distribution-free binary classification model, based on
a threshold level of antibody
• The lower your antibody level, the higher your chance of being infected,
so the vaccine efficacy will be lower
• The higher your antibody level, the lower your chance of being infected,
so the vaccine efficacy will be higher
• We want to know what this antibody threshold is
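One way to read the threshold idea, as a sketch only: treat everyone at or above the threshold as protected, take the protected share as the predicted efficacy, and choose the threshold that best matches observed efficacy across vaccines. The fitting rule (squared error over a grid), the function names and all numbers below are assumptions made for illustration.

```python
def predicted_efficacy(antibody_levels, threshold):
    """Share of individuals whose antibody level is at or above the threshold."""
    protected = sum(1 for level in antibody_levels if level >= threshold)
    return protected / len(antibody_levels)

def fit_threshold(training, candidates):
    """Pick the candidate threshold with the smallest squared error.

    training: list of (antibody_levels, observed_efficacy) pairs, one per vaccine.
    """
    def loss(threshold):
        return sum(
            (predicted_efficacy(levels, threshold) - efficacy) ** 2
            for levels, efficacy in training
        )
    return min(candidates, key=loss)

# Made-up training data for two hypothetical vaccines.
training = [
    ([0.1, 0.3, 0.6, 1.2], 0.50),
    ([0.4, 0.9, 1.5, 2.0], 0.75),
]
candidates = [0.2, 0.5, 1.0]
threshold = fit_threshold(training, candidates)
```

No distribution is assumed for the antibody levels; only the observed values are counted, which is what makes the approach distribution-free.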
We normalised the antibody level to that of convalescent patients
(the mean for convalescent patients is one)
Covaxin data came out a bit later, so we used Covaxin to
validate our ‘classifier’ model
Using our classifier, as long as we have antibody data (from a
phase 2 trial), we can predict the efficacy of any vaccine
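The normalisation to convalescent patients mentioned above can be sketched as a division by the convalescent mean. An arithmetic mean is assumed here because the slide only says “mean”. The function name and the titre values are hypothetical.

```python
def normalise_to_convalescent(titres, convalescent_titres):
    """Express titres as multiples of the mean convalescent titre."""
    reference = sum(convalescent_titres) / len(convalescent_titres)
    return [t / reference for t in titres]

# Made-up titres: the convalescent mean is 100, so 200 becomes 2.0.
convalescent = [50.0, 150.0]
vaccinated = normalise_to_convalescent([100.0, 200.0], convalescent)
rescaled_convalescent = normalise_to_convalescent(convalescent, convalescent)
```

After rescaling, the convalescent group has a mean of one, so different vaccines can be compared on a common scale.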
CureVac mRNA vaccine failure – why???
Simple data visualisation can help to answer this
Because it used a lower dose than Pfizer and Moderna
