Summary of Epidemic Model
Optimization Work
HIV
• HIV is a sexually-transmitted disease found
across the globe.
• Over 37.9 million people living with HIV (PLHIV)
worldwide and over 35 million deaths so far
• Sub-Saharan Africa accounts for 25.7 million
PLHIV, over 65% of current worldwide estimates
Interventions
• Thankfully, HIV can be treated
• An intervention is a measure taken in order to limit the spread of HIV
• Interventions we deal with:
• ART – Antiretroviral Therapy
• VMMC – Voluntary Medical Male Circumcision
• PrEP – Pre-Exposure Prophylaxis
• HCT – Home Counseling/Testing
• ANC/PMTCT – Antenatal Care/Prevention of Mother-to-Child Transmission
• They are expensive!
Motivation
• Money isn’t infinite
• Treating diseases like HIV is expensive, and if money is distributed
inefficiently among treatment programs, lives may be lost as a result
• Our main goal is to develop a framework for intervention optimization
which, for a given budget, minimizes the effect of HIV.
Formulation
• Given parameters $x_1, x_2, \dots, x_n$ and budget $B$:
$$\min \; \mathrm{DALY}(x_1, x_2, \dots, x_n)$$
$$\text{s.t.} \quad \mathrm{Cost}(x_1, x_2, \dots, x_n) < B$$
Framework
EMOD Details
• Disease- and region-specific stochastic disease model.
• HIV
• South Africa & Kenya
• EMOD simulates a population over discrete timesteps and mimics the behavior
of an endemic disease.
• EMOD can be likened to a massive Markov Chain - agents move from state
to state based on their current status and internal probabilities
• These probabilities are represented by the parameters which we will change.
• EMOD uses calibrated datasets to run ‘sub’ simulations
• Each ‘sub’ simulation is slightly different and generates one set of output files
• We average the outputs of the ‘sub’ simulations to obtain the result for a single parameter set
• It takes 2 minutes to evaluate a single set of input parameters
Cost Calculation
• Cost is the total amount of money spent on interventions past our
threshold year, 𝑡0
• For each intervention:
• $\text{Intervention Cost} = N_i \cdot C_i \cdot df^{\,t - t_0}$
• $N_i$ is the number of times this intervention is given in year $i$.
• $C_i$ is the intervention cost.
• $df$ is the decay factor, normally 0.97
• Used to mitigate uncertainty caused by randomness
• Total cost is the sum of each intervention cost.
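The cost rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slide gives the per-term formula, so summing the terms over years, counting years from $t_0$ onward, and the names `counts_by_year` and `unit_cost` are all assumptions made here, and the numbers are made up.

```python
def intervention_cost(counts_by_year, unit_cost, t0, df=0.97):
    """Discounted cost of one intervention.

    counts_by_year: dict mapping year t -> number of times the intervention
                    was given in that year (hypothetical input format).
    unit_cost:      cost of giving the intervention once.
    Years before the threshold year t0 are ignored (assumption).
    """
    return sum(
        n * unit_cost * df ** (t - t0)
        for t, n in counts_by_year.items()
        if t >= t0
    )


def total_cost(interventions, t0, df=0.97):
    """Total cost: the sum of the discounted cost of each intervention."""
    return sum(intervention_cost(c, u, t0, df) for c, u in interventions)


# Made-up numbers, purely for illustration.
art = ({2019: 50, 2020: 100, 2021: 100}, 10.0)  # the 2019 entry precedes t0
prep = ({2020: 20}, 5.0)
example_total = total_cost([art, prep], t0=2020)
```

With these inputs the 2020 terms are undiscounted and the 2021 term is multiplied by 0.97 once.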
DALY Calculation
• DALYs (disability-adjusted life years) measure the amount of
time lost due to HIV.
• We begin counting DALYs after a threshold year, denoted 𝑡0
• For each person who died due to HIV:
• $\text{DALYs} = (L_{\text{Expected}} - L_{\text{Realized}}) \cdot df^{\,t - t_0}$
• $L_{\text{Expected}}$ is the expected lifespan
• $L_{\text{Realized}}$ is the actual lifespan
• $df$ is the decay factor, normally 0.97
• Used to mitigate uncertainty caused by randomness
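A minimal sketch of the DALY count, assuming that $t$ is the year of death, that the lifespan difference is discounted as a whole, and that contributions are summed over everyone who died of HIV from year $t_0$ onward. The tuple layout and the numbers are hypothetical.

```python
def discounted_dalys(deaths, t0, df=0.97):
    """Sum of discounted life-years lost.

    deaths: iterable of (year_of_death, expected_lifespan, realized_lifespan)
            tuples, one per HIV death (hypothetical input format).
    Deaths before the threshold year t0 are not counted (assumption).
    """
    return sum(
        (expected - realized) * df ** (year - t0)
        for year, expected, realized in deaths
        if year >= t0
    )


# Made-up records: the 2018 death precedes t0 and is skipped.
example_deaths = [(2020, 70, 40), (2022, 70, 50), (2018, 70, 30)]
example_dalys = discounted_dalys(example_deaths, t0=2020)
```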
Main Algorithm
• Initial sampling of parameter space to obtain ‘training’ data
• While True:
• Construct a surrogate model using training data
• Find the optimal allocation policy using surrogate model
• Simulate hypothesized optimal
• If simulation output satisfies our stopping criteria, break
• Augment training data using new data
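The loop above can be sketched as follows. Everything concrete here is a stand-in chosen for illustration: `toy_simulator` replaces an EMOD run, a per-dimension quadratic least-squares fit replaces the surrogate model, random candidate search replaces the allocation search, and the stopping rule (surrogate prediction close to the simulated value) is a hypothetical criterion, since the slides do not state one.

```python
import numpy as np


def toy_simulator(x):
    """Cheap stand-in for an expensive simulation (not EMOD)."""
    return float(np.sum((x - 0.3) ** 2))


def features(X):
    """Columns [1, x_d, x_d**2] for every dimension d (stand-in surrogate basis)."""
    X = np.atleast_2d(X)
    return np.hstack([np.ones((X.shape[0], 1)), X, X ** 2])


def fit_surrogate(X, y):
    """Least-squares fit; returns a function that predicts outputs for new points."""
    coef, *_ = np.linalg.lstsq(features(X), y, rcond=None)
    return lambda Xq: features(Xq) @ coef


def optimize_on_surrogate(surrogate, rng, dim, n_candidates=5000):
    """Pick the random candidate with the lowest predicted output."""
    candidates = rng.random((n_candidates, dim))
    return candidates[np.argmin(surrogate(candidates))]


def main_loop(dim=3, n_init=20, tol=1e-3, max_iter=10, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.random((n_init, dim))                      # initial sampling
    y = np.array([toy_simulator(x) for x in X])        # 'training' data
    for _ in range(max_iter):
        surrogate = fit_surrogate(X, y)                # construct surrogate
        x_best = optimize_on_surrogate(surrogate, rng, dim)
        y_sim = toy_simulator(x_best)                  # simulate hypothesized optimum
        if abs(float(surrogate(x_best)[0]) - y_sim) < tol:
            break                                      # hypothetical stopping rule
        X = np.vstack([X, x_best])                     # augment training data
        y = np.append(y, y_sim)
    return x_best, y_sim


x_opt, y_opt = main_loop()
```

The expensive simulator is called only once per iteration after the initial sample; all other evaluations go through the surrogate.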
Optimal Allocation Policy Algorithm
1. Evaluate 1M random parameter sets with a surrogate model
2. While True:
   1. Throw out those which violate the constraint (Cost > TargetCost)
   2. Select the top 1/15th of the viable points according to DALY count
      1. Break if only a single point is selected
   3. Perturb the selected points into 10 new points
3. Repeat step 1 and the previous loop 5 times, return the parameter
   set with the least DALY count
1M evaluations at 2 min/evaluation = 2M mins ≈ 4 years!
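A hedged sketch of the search procedure above. Assumptions made for this illustration: each selected point is perturbed into 10 children with Gaussian noise of width `sigma`, children are clipped back to the $[0, 1]$ bounds, the best feasible point seen in any round is the one returned, and `toy_daly` and `toy_cost` are made-up stand-ins for the surrogate models. A smaller sample than 1M is used so the example runs quickly.

```python
import numpy as np


def optimal_allocation(daly_fn, cost_fn, target_cost, dim, rng,
                       n_samples=20000, n_children=10, sigma=0.02, n_restarts=5):
    best_x, best_daly = None, np.inf
    for _ in range(n_restarts):                          # step 3: repeat 5 times
        pts = rng.random((n_samples, dim))               # step 1: random parameter sets
        while True:
            pts = pts[cost_fn(pts) <= target_cost]       # 2.1: drop constraint violators
            if len(pts) == 0:
                break
            dalys = daly_fn(pts)
            order = np.argsort(dalys)
            if dalys[order[0]] < best_daly:              # remember the best feasible point
                best_x, best_daly = pts[order[0]].copy(), float(dalys[order[0]])
            n_keep = max(1, len(pts) // 15)              # 2.2: top 1/15th by DALY count
            if n_keep == 1:
                break                                    # 2.2.1: a single point remains
            parents = pts[order[:n_keep]]
            children = np.repeat(parents, n_children, axis=0)   # 2.3: 10 children each
            children += rng.normal(0.0, sigma, children.shape)
            pts = np.clip(children, 0.0, 1.0)            # keep parameters in bounds
    return best_x, best_daly


def toy_daly(X):
    """Made-up surrogate: DALYs fall as parameters approach 0.8."""
    return np.sum((X - 0.8) ** 2, axis=1)


def toy_cost(X):
    """Made-up surrogate: cost is the sum of the parameters."""
    return X.sum(axis=1)


rng = np.random.default_rng(0)
best_x, best_daly = optimal_allocation(toy_daly, toy_cost, target_cost=1.5, dim=3, rng=rng)

# The arithmetic on the slide: 1M direct evaluations at 2 minutes each, in years.
years_with_direct_simulation = 1_000_000 * 2 / (60 * 24 * 365)
```

Because each round keeps 1/15th of the points and creates 10 children per survivor, the population shrinks by roughly a third per round, so the inner loop always terminates.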
Tensor Train Motivation
• In order to work on higher dimension problems, we need a new
method for constructing surrogate models
• This is where tensor trains come in:
• Generalization of SVD to higher dimensions
• The version of tensor trains I used is not the original version.
Tensor Train Formulation
• Tensor trains try to represent a multivariate function as a
product of univariate polynomials with matrix coefficients
• $f(x_1, x_2, \dots, x_n) = G_0^T(x_0) \cdot G_1(x_1) \cdots G_{n-1}(x_{n-1}) \cdot G_n(x_n)$
• $G_i(x_i) = \sum_{j=0}^{k} b_j(x_i) \cdot C_{ij}$
• $k$ is the order which we are using for our polynomial approximations
• $C_{ij}$ are the coefficient matrices to the polynomial terms
• $b_i(x)$ is the $i$th basis function (Hermite polynomials, Taylor polynomials, etc.)
• Each $G_i$ is referred to as a ‘cart’
• Together, they form a ‘train,’ hence, Tensor Train
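To make the cart definition concrete, here is a small sketch that evaluates one cart with NumPy. The monomial basis $b_j(x) = x^j$ and the storage layout (one coefficient matrix per basis term, stacked along the first axis) are choices made for this example, not necessarily those of the original implementation.

```python
import numpy as np


def monomial_basis(x, k):
    """Return [b_0(x), ..., b_k(x)] for the monomial basis b_j(x) = x**j."""
    return np.array([x ** j for j in range(k + 1)], dtype=float)


def evaluate_cart(C, x):
    """G_i(x_i) = sum_j b_j(x_i) * C_ij.

    C has shape (k + 1, r_left, r_right): C[j] is the matrix C_ij.
    The result is an r_left x r_right matrix.
    """
    k = C.shape[0] - 1
    return np.tensordot(monomial_basis(x, k), C, axes=1)


# A second-order cart with 2 x 2 coefficient matrices (made-up values).
C = np.zeros((3, 2, 2))
C[0] = np.eye(2)              # constant term
C[2] = [[0, 1], [0, 0]]       # x**2 term
G = evaluate_cart(C, 3.0)     # identity + 9 * [[0, 1], [0, 0]]
```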
Tensor Train Shape
• To help understand the train, it may be useful to have a visual
rendering:
• In this case, each univariate function is estimated by a third-order polynomial.
• We want to solve for the coefficient matrices.
• The main ‘issue’ arises when attempting to solve a cart in the middle of the train, which is difficult to isolate.
Fitting Process
• We will fit the model iteratively.
• Our objective is to manipulate terms into the least squares form 𝐴𝑥 = 𝑏,
where 𝐴 is some representation of our current cart, 𝑥 is some form of our
unknown coefficient matrices, and 𝑏 is our outputs.
• To explain fitting the model, we begin with the example of a single point 𝑥,
and the corresponding output 𝑦
We assume $k$ is the order of the polynomial estimation and $b_i(x_i)$ is the $i$th basis function evaluated at $x_i$
• For each component of $x$, generate $b_0(x_i)$ through $b_k(x_i)$
• Iterate over each cart, assume we are at cart $G_i(x_i)$:
• “Fix” on the selected cart, evaluate the other carts at their respective $x$’s
We are fixing G2
Z Term
• We now compute three important terms: 𝑈, 𝑉, and 𝑍.
• 𝑈 is the product of all computed carts LEFT of the fixed cart
• 𝑉 is the product of all computed carts RIGHT of the fixed cart
• For our FIXED cart, 𝐺𝑖(𝑥𝑖), we evaluate 𝑍:
• $Z = \sum_{j=0}^{k} b_j(x_i) \cdot C_{ij}$, where each $C_{ij}$ is taken as a set of UNKNOWNS
(together, shape $r_1 \times r_2 \times (k + 1)$).
• Each entry of $Z$ contains $k + 1$ unknowns.
Vectorization Trick
• We now apply a Kronecker identity:
• $U^T Z V = Y$
• $(V^T \otimes U^T)\,\mathrm{vec}(Z) = \mathrm{vec}(Y) = Y$
• This equation has the following shape: [figure]
We reformat this as: [figure]
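The Kronecker identity used above, $\mathrm{vec}(AXB) = (B^T \otimes A)\,\mathrm{vec}(X)$ with $A = U^T$ and $B = V$, can be checked numerically. In this sketch $U$ and $V$ are random column vectors standing in for the products of the neighbouring carts, and $\mathrm{vec}$ stacks columns, which is why `order="F"` is used.

```python
import numpy as np

rng = np.random.default_rng(0)
r1, r2 = 3, 4
U = rng.normal(size=(r1, 1))   # stand-in: U^T is the 1 x r1 product of the left carts
V = rng.normal(size=(r2, 1))   # stand-in: r2 x 1 product of the right carts
Z = rng.normal(size=(r1, r2))  # the fixed cart evaluated at x_i

# Left-hand side: the scalar model output U^T Z V.
lhs = (U.T @ Z @ V).item()

# Right-hand side: (V^T kron U^T) vec(Z), with vec() stacking the columns of Z.
vec_Z = Z.flatten(order="F")
rhs = (np.kron(V.T, U.T) @ vec_Z).item()
```

Using row-major flattening instead (NumPy's default) would require swapping the two Kronecker factors.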
Least Squares
• Now we have an equation with known values on the left and
purely unknowns on the right, which can be solved very simply with
least squares.
• This method can be generalized as well:
• With additional sample points, both the column vector and 𝑌 grow in the first
dimension.
• Since we are always solving for the same set of unknowns, least squares still
works!
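A sketch of the least-squares solve for one fixed cart with many sample points. Here the left and right products are random stand-ins (in the real fit they come from evaluating the other carts), the basis is assumed to be monomial, and the outputs are generated from a known cart so the recovered coefficients can be compared against it.

```python
import numpy as np

rng = np.random.default_rng(1)
k, r1, r2, n_points = 2, 2, 3, 200

C_true = rng.normal(size=(k + 1, r1, r2))   # coefficient matrices we pretend not to know
x = rng.random(n_points)                    # input of the fixed cart, one per sample
U = rng.normal(size=(n_points, r1))         # row p: product of the carts left of the fixed cart
V = rng.normal(size=(n_points, r2))         # row p: product of the carts right of the fixed cart

B = np.vander(x, k + 1, increasing=True)    # B[p, j] = b_j(x_p) = x_p ** j

# Known part, one row per sample: b(x_p) kron (v_p^T kron u_p^T).
A = np.stack([np.kron(B[p], np.kron(V[p], U[p])) for p in range(n_points)])

# Outputs Y_p = u_p^T Z_p v_p generated from the true cart.
Z = np.einsum("pj,jab->pab", B, C_true)
Y = np.einsum("pa,pab,pb->p", U, Z, V)

# Unknown part: [vec(C_0); ...; vec(C_k)] with column-stacking vec().
coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
C_fit = coef.reshape(k + 1, r2, r1).transpose(0, 2, 1)
```

Each additional sample adds one row to `A` and one entry to `Y`, while the number of unknowns stays at $(k + 1) \cdot r_1 \cdot r_2$ (18 here), so at least that many samples are needed for a unique solution.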
Single Point Evaluation
When using the train to evaluate a single parameter set $x$,
we go through the following process:
1. Iterate through each train cart; assume the current cart is cart $i$:
   1. Using the $x_i$ corresponding to the current cart, generate $b_0(x_i)$ through $b_k(x_i)$.
   2. Compute $G_i(x_i) = \sum_{j=0}^{k} b_j(x_i) \cdot C_{ij}$
2. Once each cart has been evaluated, multiply the carts together to
   produce $Y$.
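The evaluation steps above can be sketched as follows. The monomial basis and the hand-built first-order train, which represents $f(x_1, x_2, x_3) = x_1 + x_2 + x_3$ with inner ranks of 2, are chosen only so that the result is easy to verify.

```python
import numpy as np


def evaluate_train(carts, x):
    """Evaluate a train at one parameter set x.

    carts[i] has shape (k + 1, r_left, r_right); the first cart has r_left = 1
    and the last has r_right = 1, so the full product is a 1 x 1 matrix.
    """
    result = np.eye(1)
    for C, xi in zip(carts, x):
        k = C.shape[0] - 1
        basis = np.array([xi ** j for j in range(k + 1)])  # b_0(x_i) .. b_k(x_i)
        G = np.tensordot(basis, C, axes=1)                 # G_i(x_i) = sum_j b_j(x_i) C_ij
        result = result @ G                                # multiply the carts together
    return float(result[0, 0])


# G_1 = [x_1, 1], G_2 = [[1, 0], [x_2, 1]], G_3 = [1, x_3]^T  =>  product = x_1 + x_2 + x_3
cart1 = np.array([[[0.0, 1.0]], [[1.0, 0.0]]])                           # shape (2, 1, 2)
cart2 = np.array([[[1.0, 0.0], [0.0, 1.0]], [[0.0, 0.0], [1.0, 0.0]]])   # shape (2, 2, 2)
cart3 = np.array([[[1.0], [0.0]], [[0.0], [1.0]]])                       # shape (2, 2, 1)

y = evaluate_train([cart1, cart2, cart3], [0.2, 0.5, 0.1])
```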
Visual Representation
Advantages/Potential Issues
• Advantages
• Massively dimension-reducing
• Able to model complex interactions between variables
• Generalizable to higher dimensions
• Customizable (more on that later)
• Disadvantages
• Requires a minimum quantity of points to begin solving; otherwise the
least-squares system is underdetermined
• Sometimes unstable
• Results may not adhere to physical limitations
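To put rough numbers on the dimension reduction and on the minimum-points requirement, the sketch below counts coefficients. The rank of 4 is a hypothetical choice (the slides do not state the ranks used); the comparison is against a full tensor-product polynomial of the same order, which needs $(k + 1)^n$ coefficients.

```python
def train_unknowns(n_vars, k, rank):
    """Coefficients in a train with n_vars carts of order k and equal inner ranks."""
    if n_vars == 1:
        return k + 1
    end_carts = 2 * rank * (k + 1)                        # 1 x rank and rank x 1 carts
    middle_carts = (n_vars - 2) * rank * rank * (k + 1)   # rank x rank carts
    return end_carts + middle_carts


def full_polynomial_unknowns(n_vars, k):
    """Coefficients in a full tensor-product polynomial of order k per variable."""
    return (k + 1) ** n_vars


# 11 parameters, third-order polynomials, hypothetical rank 4.
n_train = train_unknowns(11, 3, 4)
n_full = full_polynomial_unknowns(11, 3)
largest_cart = 4 * 4 * (3 + 1)   # unknowns in one middle-cart solve
```

Under these assumptions a single middle-cart solve has 64 unknowns, so fewer than 64 training points would leave that system underdetermined.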
South Africa
Problem Description
• 11 parameters total:
• 8 for ART
• 1 for VMMC
• 1 for PrEP
• 1 for HCT
• 10 range from 0-1 and one ranges from 0-0.2
• Bounds built into the optimization algorithm
Parameter Name | Parameter Description | Default Value | Range
PresProb | Probability a symptomatic individual will get an HIV test | 0.94 | [0-1]
Child_6w | Probability that a 6-week-old child presents for HIV testing | 0.337 | [0-1]
TestUptake | Increased rate of testing after 2009 | 0.9 | [0-1]
Staging | Probability someone begins ART Staging from a positive diagnosis | 0.837 | [0-1]
PreART | Probability someone continues PreART, instead of being LTFU (lost to follow-up) | 0.75 | [0-1]
FastART | Probability someone immediately begins taking ART from the first stage of testing after 2016 | 0.95 | [0-1]
On_ART | Probability that someone who has dropped out of ART is willing to re-enroll in ART | 0.9 | [0-1]
KeepART | Probability someone keeps using ART after 2016 | 0.9 | [0-1]
ARTInterrupted | Probability ART is interrupted due to unforeseen events | 0.2 | [0-0.2]
PrEP | Probability someone with a negative HIV test begins to use PrEP | 0.15 | [0-1]
VMMC | VMMC Coverage | 0.8 | [0-1]
Results
• I began training with 1000 training points, which took 2 days to run
• Surrogate models were within 1% error on the training data
• They ran into some trouble with ‘unexplored’ regions, but the algorithm was
able to update the models
[Figure: Parameter sets chosen by the optimization algorithm vs. randomly selected points (all evaluated by EMOD)]
Varying Price Points
• The algorithm was able to generate different investment
recommendations for different price points – $1M to $3M
• Algorithm was able to locate and cluster around local minima
Cost $1,113,055.47 $1,465,936.00 $1,978,418.95 $2,513,331.67 $2,985,608.23
DALY 2,470.75 2,461.45 2,417.12 2,442.63 2,416.71
Important Parameters
• I sorted each simulation that I ran into one of 4 categories:
• Cheap – below $1.5M
• Expensive – above $1.5M
• Good – below 7K DALYs
• Bad – above 7K DALYs
• Calculated mean & STD for each parameter
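The grouping described above can be sketched with NumPy boolean masks. The thresholds are the ones listed ($1.5M and 7K DALYs); the array layout, the handling of values exactly on a threshold, the use of the population standard deviation, and the four example rows are assumptions made for this illustration.

```python
import numpy as np


def summarize_by_category(params, cost, daly, cost_cut=1.5e6, daly_cut=7e3):
    """Mean and STD of each parameter within each cost/DALY category.

    params: array of shape (n_simulations, n_parameters)
    cost, daly: arrays of shape (n_simulations,)
    Values exactly on a threshold count as Expensive / Bad (assumption).
    Empty categories are omitted from the result.
    """
    cheap = cost < cost_cut
    good = daly < daly_cut
    masks = {
        "Cheap & Good": cheap & good,
        "Cheap & Bad": cheap & ~good,
        "Expensive & Good": ~cheap & good,
        "Expensive & Bad": ~cheap & ~good,
    }
    return {
        name: (params[mask].mean(axis=0), params[mask].std(axis=0))
        for name, mask in masks.items()
        if mask.any()
    }


# Four made-up simulations with two parameters each.
params = np.array([[0.9, 0.1], [0.7, 0.3], [0.2, 0.5], [0.4, 0.6]])
cost = np.array([1.0e6, 1.2e6, 0.8e6, 3.0e6])
daly = np.array([4.0e3, 5.0e3, 2.0e4, 2.2e4])
summary = summarize_by_category(params, cost, daly)
```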
Cheap & Good
Parameter Name Mean STD
KeepART 0.96 0.10
TestUptake 0.85 0.27
FastART 0.71 0.38
PresProb 0.79 0.34
Child_6w 0.50 0.46
Staging 0.37 0.50
PreART 1.00 0.00
On_ART 1.00 0.00
VMMC 0.31 0.34
PrEP 0.00 0.00
ARTInterrupted 0.03 0.06
Cost $1.04M $40.5K
DALY 4.68K 1.07K
Cheap & Bad
Parameter Name Mean STD
KeepART 0.46 0.29
TestUptake 0.50 0.29
FastART 0.49 0.31
PresProb 0.50 0.29
Child_6w 0.46 0.29
Staging 0.49 0.31
PreART 0.51 0.30
On_ART 0.47 0.27
VMMC 0.46 0.31
PrEP 0.04 0.03
ARTInterrupted 0.10 0.06
Cost $849K $173K
DALY 23K 5.47K
Expensive & Good
Parameter Name Mean STD
KeepART 0.93 0.07
TestUptake 0.47 0.30
FastART 0.62 0.35
PresProb 0.70 0.30
Child_6w 0.49 0.38
Staging 0.71 0.23
PreART 0.57 0.34
On_ART 0.98 0.02
VMMC 0.59 0.35
PrEP 0.36 0.35
ARTInterrupted 0.12 0.07
Cost $3.66M $2.46M
DALY 4.8K 1.03K
Expensive & Bad
Parameter Name Mean STD
KeepART 0.50 0.29
TestUptake 0.50 0.29
FastART 0.50 0.29
PresProb 0.50 0.29
Child_6w 0.50 0.29
Staging 0.51 0.29
PreART 0.50 0.29
On_ART 0.50 0.29
VMMC 0.49 0.29
PrEP 0.53 0.26
ARTInterrupted 0.10 0.06
Cost $4.27M $1.85M
DALY 20K 5.42K
Kenya
42-dim Problem Description
• I took a new model (Kenya) and searched for every parameter I could
find:
• 12 parameters for PrEP
• 2 parameters for ANC
• 4 parameters for HCT/testing
• 6 parameters for ART
• 18 parameters for VMMC
Results
Model Type | Many Degs of Freedom | Few Degs of Freedom | Many Degs of Freedom + Preconditioning
# Negative Points (1M random samples) | 2547 | 761 | 0
# Points Over 100K DALYs | 1395 | 157 | 0
Kenya Results
• Cost was bounded at $3M
• Expected DALYs were now overestimated at 10K, while DALYs actually ranged from ~1800-2500
• Cost ranged from 2.4M to 3M
• Each point was within the allowed range.
• Cost was accurate to within $100K, probably because the expected Cost was in the middle of the output
range
• No clear local minima/groups of points
Future Work
• Using gradient descent to solve for optima instead of random
sampling.
• Account for ‘hidden costs’ of optimization
• Application to an accurate epidemiological problem
• The parameters chosen are not necessarily accurate to real-world capabilities,
and represent a proof of concept
• Testing work on other problems
• Rank–1 update instead of re-solving least squares?

EMOD_Optimization_Presentation.pptx

  • 1. Summary of Epidemic Model Optimization Work
  • 2. HIV • HIV is a sexually-transmitted disease found across the globe. • Over 37.9 million people living with HIV (PLHIV) worldwide and over 35 million deaths so far • Sub-Saharan Africa accounts for 25.7 million PLHIV, over 65% of current worldwide estimates
  • 3. Interventions • Thankfully, HIV can be treated • An intervention is a measure taken in order to limit the spread of HIV • Interventions we deal with: • ART – Anti Retroviral Therapy • VMMC – Voluntary Male Medical Circumcision • PrEP – Pre Exposure Prophylaxis • HCT – Home Counciling/Testing • ANC/PMTCT – AnteNatal Care/Prevention of Mother To Child Transmission • They are expensive!
  • 4. Motivation • Money isn’t infinite  • Treating diseases like HIV is expensive, and if money is distributed among treatment programs inefficiently, lives may be lost in the inefficiency • Our main goal is to develop a framework for intervention optimization which, for a given budget, minimizes the effect of HIV.
  • 5. Formulation • Given parameters 𝑥1, 𝑥2, … , 𝑥𝑛 and budget 𝐵 min 𝐷𝐴𝐿𝑌(𝑥1, 𝑥2, … , 𝑥𝑛) 𝑠. 𝑡. 𝐶𝑜𝑠𝑡 𝑥1, 𝑥2, … , 𝑥𝑛 < 𝐵
  • 7. EMOD Details • Disease- and region- specific stochastic disease model. • HIV • South Africa & Kenya • EMOD simulates a population over discrete timesteps, mimics the behavior of an endemic disease. • EMOD can be likened to a massive Markov Chain - agents move from state to state based on their current status and internal probabilities • These probabilities are represented by the parameters which we will change. • EMOD uses calibrated datasets to run ‘sub’ simulations • Each ‘sub’ simulation is slightly different and generates one set of output files • We average the outputs for ‘sub’ simulations to correspond to a single parameter set • It takes 2 minutes to evaluate a single set of input parameters
  • 8. Cost Calculation • Cost is the total amount of money spent on interventions past our threshold year, 𝑡0 • For each intervention: • 𝐼𝑛𝑡𝑒𝑟𝑣𝑒𝑛𝑡𝑖𝑜𝑛 𝐶𝑜𝑠𝑡 = 𝑁𝑖 ∗ 𝐶𝑖 ∗ 𝑑𝑓 𝑡 −𝑡0 • 𝑁𝑖 is the number of times this intervention is given in year 𝑖. • 𝐶 is the intervention cost. • 𝑑𝑓 is the decay factor, normally 0.97 • Used to mitigate uncertainty caused by randomness • Total cost is the sum of each intervention cost.
  • 9. DALY Calculation • DALYs stand for disability-adjusted life years, they are the amount of time lost due to HIV. • We begin counting DALYs after a threshold year, denoted 𝑡0 • For each person who died due to HIV: • 𝐷𝐴𝐿𝑌𝑠 = 𝐿𝐸𝑥𝑝𝑒𝑐𝑡𝑒𝑑 − 𝐿𝑅𝑒𝑎𝑙𝑖𝑧𝑒𝑑 ∗ 𝑑𝑓 𝑡 −𝑡0 • 𝐿𝐸𝑥𝑝𝑒𝑐𝑡𝑒𝑑 is the expected lifespan • 𝐿𝑅𝑒𝑎𝑙𝑖𝑧𝑒𝑑 is the actual lifespan • 𝑑𝑓 is the decay factor, normally 0.97 • Used to mitigate uncertainty caused by randomness
  • 10. Main Algorithm • Initial sampling of parameter space to obtain ‘training’ data • While True: • Construct a surrogate model using training data • Find the optimal allocation policy using surrogate model • Simulate hypothesized optimal • If simulation output satisfies our stopping criteria, break • Augment training data using new data
  • 11.
  • 12. Optimal Allocation Policy Algorithm 1. Evaluate 1M random parameter sets with a surrogate model 2. While True 1. Throw out those which violate the constraint (Cost > TargetCost) 2. Select the top 1/15th of the viable points according to DALY count 1. Break if only a single point is selected 3. Perturb the selected points into 10 new points 3. Repeat step 1 and the previous loop 5 times, return the parameter set with the least DALY count 1M evaluations at 2 min/evaluation = 2M mins ≈ 4 years!
  • 13.
  • 14. Tensor Train Motivation • In order to work on higher dimension problems, we need a new method for constructing surrogate models • This is where tensor trains come in: • Generalization of SVD to higher dimensions • The version of tensor trains I used is not the original version.
  • 15. Tensor Train Formulation • Tensor trains try to represent a multivariate function as a product of univariate polynomials with matrix coefficients • 𝑓 𝑥1, 𝑥2, … , 𝑥𝑛 = 𝐺0 𝑇 𝑥0 ∗ 𝐺1(𝑥1) ∗ … ∗ 𝐺𝑛−1(𝑥𝑛−1) ∗ 𝐺𝑛(𝑥𝑛) • 𝐺𝑖 𝑥𝑖 = 𝑗=0 𝑘 𝑏𝑗 𝑥𝑖 ∗ 𝐶𝑖𝑗 • 𝑘 is the order which we are using for our polynomial approximations • 𝐶𝑖𝑗 are the coefficient matrices to the polynomial terms • 𝑏𝑖(𝑥) is the 𝑖𝑡ℎ basis function (Hermite polynomials, Taylor polynomials, etc.) • Each 𝐺𝑖 is referred to as a ’cart’ • Together, they form a ‘train,’ hence, Tensor Train
  • 16. Tensor Train Shape • To help understand the train, it may be useful to have a visual rendering: In this case, each univariate function is estimated by a third-order polynomial We want to solve for the coefficient matrices The main ‘issue’ arises when attempting to solve a cart in the middle of the train – difficult to isolate
  • 17. Fitting Process • We will fit the model iteratively. • Our objective is to manipulate terms into the least squares form 𝐴𝑥 = 𝑏, where 𝐴 is some representation of our current cart, 𝑥 is some form of our unknown coefficient matrices, and 𝑏 is our outputs. • To explain fitting the model, we begin with the example of a single point 𝑥, and the corresponding output 𝑦 We assume 𝑘 is the order of the polynomial estimation, 𝑏𝑖(𝑥𝑖) is the 𝑖𝑡ℎ basis function evaluated at 𝑥𝑖 • For each component of 𝑥, generate 𝑏0(𝑥𝑖) through 𝑏𝑘(𝑥𝑖) • Iterate over each cart, assume we are at cart 𝐺𝑖(𝑥𝑖): • ”Fix” on the selected cart, evaluate the other carts at their respective 𝑥’s
  • 19. Z Term • We now compute three important terms: 𝑈, 𝑉, and 𝑍. • 𝑈 is the product of all computed carts LEFT of the fixed cart • 𝑉 is the product of all computed carts RIGHT of the fixed cart • For our FIXED cart, 𝐺𝑖(𝑥𝑖), we evaluate 𝑍: • 𝑍 = 𝑗=0 𝑘 𝑏𝑗 𝑥𝑖 ∗ 𝐶𝑖𝑗, where each 𝐶𝑖𝑗 is taken as a set of UNKNOWNS (shape 𝑟1 ∗ 𝑟2 ∗ 𝑘). • Each entry of 𝑍 contains 𝑘 + 1 unknowns.
  • 20. Vectorization Trick • We now apply a Kronecker identity: • $U^T Z V = Y$ • $(V^T \otimes U^T) \, \operatorname{vec}(Z) = \operatorname{vec}(Y) = Y$ • This equation has the following shape:
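The Kronecker identity on this slide can be checked numerically. The sketch below is only an illustration: it assumes $U$ and $V$ are column vectors of lengths $r_1$ and $r_2$ (so that $Y$ is a scalar), uses random stand-in values, and takes vec to mean column stacking. The sizes `r1` and `r2` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
r1, r2 = 3, 4                    # hypothetical ranks on either side of the fixed cart

U = rng.normal(size=(r1, 1))     # stand-in for the product of carts left of the fixed cart
V = rng.normal(size=(r2, 1))     # stand-in for the product of carts right of the fixed cart
Z = rng.normal(size=(r1, r2))    # stand-in for the fixed cart evaluated at x_i

# Left-hand side: the scalar U^T Z V.
lhs = (U.T @ Z @ V).item()

# Right-hand side: (V^T kron U^T) vec(Z), with vec stacking the columns of Z.
row = np.kron(V.T, U.T)          # shape (1, r1 * r2), built only from known values
vec_Z = Z.reshape(-1, order="F") # column-major flattening, length r1 * r2
rhs = (row @ vec_Z).item()

assert np.isclose(lhs, rhs)
```

The point of the rearrangement is that `row` contains only known quantities, while `vec_Z` collects everything that depends on the unknown coefficient matrices.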
  • 22. Least Squares • Now we have an equation with known values on the left and purely unknowns on the right, which can be solved very simply with least squares. • This method can be generalized as well: • With additional sample points, both the column vector and 𝑌 grow in the first dimension. • Since we are always solving for the same set of unknowns, least squares still works!
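A minimal sketch of how one cart might be solved from many sample points, under stated assumptions: the function name `fit_cart`, the array layouts, and the monomial basis are illustrative choices, not the author's code. Each sample contributes one row $b \otimes V^T \otimes U^T$, and the unknown vector stacks $\operatorname{vec}(C_{i0})$ through $\operatorname{vec}(C_{ik})$.

```python
import numpy as np

def fit_cart(U, V, B, y):
    """Solve for one cart's coefficient matrices by least squares.

    U: (n, r1) left products, one row per sample point.
    V: (n, r2) right products, one row per sample point.
    B: (n, k + 1) basis values b_0(x_i)..b_k(x_i) for each sample.
    y: (n,) observed outputs.
    Returns C with shape (k + 1, r1, r2).
    """
    n, r1 = U.shape
    r2 = V.shape[1]
    # One row per sample: b kron (V^T kron U^T), matching column-stacked vec(C_ij).
    A = np.stack([np.kron(B[s], np.kron(V[s], U[s])) for s in range(n)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    # Undo the column stacking: (k + 1, r2, r1) -> (k + 1, r1, r2).
    return coef.reshape(B.shape[1], r2, r1).transpose(0, 2, 1)

# Synthetic check with hypothetical sizes: recover a known cart from noiseless data.
rng = np.random.default_rng(1)
n, k, r1, r2 = 60, 2, 2, 3
C_true = rng.normal(size=(k + 1, r1, r2))
x = rng.uniform(size=n)
B = np.vander(x, k + 1, increasing=True)   # monomial basis 1, x, x^2 (an assumption)
U = rng.normal(size=(n, r1))
V = rng.normal(size=(n, r2))
y = np.einsum("sa,sj,jab,sb->s", U, B, C_true, V)   # y_s = U_s^T Z_s V_s
C_fit = fit_cart(U, V, B, y)
```

Here the system has $(k+1) \, r_1 r_2 = 18$ unknowns and 60 rows. With fewer sample points than unknowns the system is underdetermined, and `np.linalg.lstsq` returns the minimum-norm solution rather than the true coefficients.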
  • 23. Single Point Evaluation • When using the train to evaluate a single parameter set $x$, we go through the following process:
    1. Iterate through each cart of the train; assume the current cart is cart $i$:
       a. Using the $x_i$ corresponding to the current cart, generate $b_0(x_i)$ through $b_k(x_i)$.
       b. Compute $G_i(x_i) = \sum_{j=0}^{k} b_j(x_i) \, C_{ij}$.
    2. Once each cart has been evaluated, multiply the carts together to produce $Y$.
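The evaluation procedure on this slide can be sketched in a few lines. This is an illustration under assumptions: the basis is taken to be monomials $1, x, \dots, x^k$ (the slides also mention Hermite and Taylor polynomials), each cart's coefficients are stored as an array of shape `(k + 1, r_left, r_right)`, and the first and last ranks are 1 so the product is a scalar. The names `basis`, `eval_cart`, and `eval_train` are hypothetical.

```python
import numpy as np

def basis(x, k):
    """Basis values b_0(x)..b_k(x); monomials 1, x, ..., x^k are an assumed choice."""
    return np.array([x ** j for j in range(k + 1)])

def eval_cart(C, x):
    """G_i(x_i) = sum_j b_j(x_i) * C_ij, with C of shape (k + 1, r_left, r_right)."""
    k = C.shape[0] - 1
    return np.tensordot(basis(x, k), C, axes=1)

def eval_train(carts, x):
    """Evaluate every cart at its own x_i, then multiply the carts left to right."""
    out = np.eye(1)
    for C, xi in zip(carts, x):
        out = out @ eval_cart(C, xi)
    return out.item()   # boundary ranks of 1 make the product a 1 x 1 matrix

# Toy train: 3 inputs, order k = 3, internal rank 2 (all values hypothetical).
rng = np.random.default_rng(0)
ranks = [1, 2, 2, 1]
carts = [rng.normal(size=(4, ranks[i], ranks[i + 1])) for i in range(3)]
y = eval_train(carts, [0.2, 0.5, 0.9])
```

Evaluation costs only a handful of small matrix products per point, which is what makes the train attractive as a surrogate for a simulation that takes minutes per run.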
  • 25. Advantages/Potential Issues • Advantages • Massively dimension-reducing • Able to model complex interactions between variables • Generalizable to higher dimensions • Customizable (more on that later) • Disadvantages • Requires a minimum number of points to begin solving; otherwise the system is underdetermined • Sometimes unstable • Results may not adhere to physical limitations
  • 27. Problem Description • 11 parameters total: • 8 for ART • 1 for VMMC • 1 for PrEP • 1 for HCT • 10 range from 0-1 and one ranges from 0-0.2 • Bounds built into the optimization algorithm
  • 28.
    Parameter Name | Parameter Description | Default Value | Range
    PresProb | Probability a symptomatic individual will get an HIV test | 0.94 | [0-1]
    Child_6w | Probability that a 6-week-old child presents for HIV testing | 0.337 | [0-1]
    TestUptake | Increased rate of testing after 2009 | 0.9 | [0-1]
    Staging | Probability someone begins ART Staging from a positive diagnosis | 0.837 | [0-1]
    PreART | Probability someone continues PreART, instead of being LTFU | 0.75 | [0-1]
    FastART | Probability someone immediately begins taking ART from the first stage of testing after 2016 | 0.95 | [0-1]
    On_ART | Probability that someone who has dropped out of ART is willing to re-enroll in ART | 0.9 | [0-1]
    KeepART | Probability someone keeps using ART after 2016 | 0.9 | [0-1]
    ARTInterrupted | Probability ART is interrupted due to unforeseen events | 0.2 | [0-0.2]
    PrEP | Probability someone with a negative HIV test begins to use PrEP | 0.15 | [0-1]
    VMMC | VMMC Coverage | 0.8 | [0-1]
  • 29. Results • I began training with 1000 training points, which took 2 days to run • Surrogate models were within 1% error on the training data • The models ran into some trouble in ‘unexplored’ regions, but the algorithm was able to update them • [Figure: parameter sets chosen by the optimization algorithm vs. randomly selected points (all evaluated by EMOD)]
  • 30. Varying Price Points • The algorithm was able to generate different investment recommendations for different price points, from $1M to $3M • The algorithm was able to locate and cluster around local minima
    Cost | $1,113,055.47 | $1,465,936.00 | $1,978,418.95 | $2,513,331.67 | $2,985,608.23
    DALY | 2,470.75 | 2,461.45 | 2,417.12 | 2,442.63 | 2,416.71
  • 31. Important Parameters • I sorted each simulation that I ran into one of 4 categories: • Cheap – below $1.5M • Expensive – above $1.5M • Good – below 7K DALYs • Bad – above 7K DALYs • Calculated mean & STD for each parameter
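The sorting and summary step described on this slide could look roughly like the sketch below. Only the two cutoffs come from the slide; the run records, the function names, the use of the population standard deviation, and the handling of values exactly at a cutoff are assumptions made for illustration.

```python
from statistics import mean, pstdev

COST_CUTOFF = 1_500_000   # dollars, from the slide
DALY_CUTOFF = 7_000       # DALYs, from the slide

def category(cost, daly):
    """Label a run; values exactly at a cutoff fall on the Expensive/Bad side (assumed)."""
    price = "Cheap" if cost < COST_CUTOFF else "Expensive"
    quality = "Good" if daly < DALY_CUTOFF else "Bad"
    return f"{price} & {quality}"

def summarize(runs, names):
    """Group runs by category, then compute (mean, std) of each named field."""
    groups = {}
    for run in runs:
        groups.setdefault(category(run["Cost"], run["DALY"]), []).append(run)
    return {
        label: {n: (mean(r[n] for r in rs), pstdev([r[n] for r in rs])) for n in names}
        for label, rs in groups.items()
    }

# Hypothetical runs, not EMOD output.
runs = [
    {"KeepART": 0.9, "PrEP": 0.0, "Cost": 1.0e6, "DALY": 4500},
    {"KeepART": 1.0, "PrEP": 0.0, "Cost": 1.1e6, "DALY": 5000},
    {"KeepART": 0.4, "PrEP": 0.5, "Cost": 4.0e6, "DALY": 21000},
]
stats = summarize(runs, ["KeepART", "PrEP"])
```

A parameter with a high mean and a small standard deviation inside a ‘Good’ group is one that the good runs agree on, which is how the tables on the following slides can be read.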
  • 32. Cheap & Good
    Parameter Name | Mean | STD
    KeepART | 0.96 | 0.10
    TestUptake | 0.85 | 0.27
    FastART | 0.71 | 0.38
    PresProb | 0.79 | 0.34
    Child_6w | 0.50 | 0.46
    Staging | 0.37 | 0.50
    PreART | 1.00 | 0.00
    On_ART | 1.00 | 0.00
    VMMC | 0.31 | 0.34
    PrEP | 0.00 | 0.00
    ARTInterrupted | 0.03 | 0.06
    Cost | $1.04M | $40.5K
    DALY | 4.68K | 1.07K
  • 33. Cheap & Bad
    Parameter Name | Mean | STD
    KeepART | 0.46 | 0.29
    TestUptake | 0.50 | 0.29
    FastART | 0.49 | 0.31
    PresProb | 0.50 | 0.29
    Child_6w | 0.46 | 0.29
    Staging | 0.49 | 0.31
    PreART | 0.51 | 0.30
    On_ART | 0.47 | 0.27
    VMMC | 0.46 | 0.31
    PrEP | 0.04 | 0.03
    ARTInterrupted | 0.10 | 0.06
    Cost | $849K | $173K
    DALY | 23K | 5.47K
  • 34. Expensive & Good
    Parameter Name | Mean | STD
    KeepART | 0.93 | 0.07
    TestUptake | 0.47 | 0.30
    FastART | 0.62 | 0.35
    PresProb | 0.70 | 0.30
    Child_6w | 0.49 | 0.38
    Staging | 0.71 | 0.23
    PreART | 0.57 | 0.34
    On_ART | 0.98 | 0.02
    VMMC | 0.59 | 0.35
    PrEP | 0.36 | 0.35
    ARTInterrupted | 0.12 | 0.07
    Cost | $3.66M | $2.46M
    DALY | 4.8K | 1.03K
  • 35. Expensive & Bad
    Parameter Name | Mean | STD
    KeepART | 0.50 | 0.29
    TestUptake | 0.50 | 0.29
    FastART | 0.50 | 0.29
    PresProb | 0.50 | 0.29
    Child_6w | 0.50 | 0.29
    Staging | 0.51 | 0.29
    PreART | 0.50 | 0.29
    On_ART | 0.50 | 0.29
    VMMC | 0.49 | 0.29
    PrEP | 0.53 | 0.26
    ARTInterrupted | 0.10 | 0.06
    Cost | $4.27M | $1.85M
    DALY | 20K | 5.42K
  • 36. Kenya
  • 37. 42-dim Problem Description • I took a new model (Kenya) and searched for every parameter I could find: • 12 parameters for PrEP • 2 parameters for ANC • 4 parameters for HCT/testing • 6 parameters for ART • 18 parameters for VMMC
  • 38. Results
    Model Type | Many Degrees of Freedom | Few Degrees of Freedom | Many Degrees of Freedom + Preconditioning
    # Negative Points (1M random samples) | 2547 | 761 | 0
    # Points Over 100K DALYs | 1395 | 157 | 0
  • 39. Kenya Results • Cost was bounded at $3M • Expected DALYs were now overestimated at 10K, while actual DALYs ranged from ~1,800 to 2,500 • Cost ranged from $2.4M to $3M • Each point was within the allowed range. • Cost was accurate to within $100K, probably because the expected Cost was in the middle of the output range • No clear local minima or groups of points
  • 40. Future Work • Use gradient descent to solve for optima instead of random sampling. • Account for ‘hidden costs’ of optimization • Application to an accurate epidemiological problem • The parameters chosen are not necessarily accurate to real-world capabilities, and represent a proof of concept • Testing work on other problems • Rank-1 update instead of re-solving least squares?