An Analysis of the Accuracies of AC50 Estimates from Dose-Response Curve Models
1. An Analysis of the Accuracies of AC50 Estimates of Dose-Response Curve Modeling Equations
Mitas Ray
2. Background
The goal of the High Throughput Screening Initiative is to transform
traditional toxicological testing, which relies on animals such as rodents
and suffers from very high costs and very low throughput, into non-rodent,
cell-based assays that use technological advances to achieve much higher
throughput at much lower cost (National Toxicology Program 2012). In a
cytotoxicity assay, quantitative high throughput screening (qHTS)
produced robust and reproducible data (Xia et al. 2008). The Hill
equation, first formulated by A.V. Hill in 1910, is used to describe
dose-response data quantitatively (Hill 1910). An alternative model
proposed by Dr. K.R. Shockley is also used for dose-response model
fitting.
3. Background (cont.)
In this model, the response at a given concentration is described by four
parameters: the minimum response, the maximum response, the AC50 (the
concentration at which 50% of the maximal response is achieved), and a
parameter that determines how wide or narrow the function is (Shockley
2012). The AC50 parameter is fit on a log2-transformed scale to data
generated by a Hill equation.
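As a hedged sketch, a Hill model with a log2-transformed AC50 can be
written as below. The symbol names (R_0, R_P, h, theta) are assumptions
chosen for illustration and may differ from the notation used in
Shockley (2012).

```latex
% Assumed notation: R_0 = minimum response, R_P = maximum response,
% h = shape parameter, \theta = \log_2(\mathrm{AC50}).
f(x) = R_0 + \frac{R_P - R_0}{1 + 2^{-h\,(\log_2 x - \theta)}},
\qquad \theta = \log_2(\mathrm{AC50})
% Because 2^{-h(\log_2 x - \theta)} = (\mathrm{AC50}/x)^{h}, this is the
% same curve as the usual Hill form
% f(x) = R_0 + (R_P - R_0) / \bigl(1 + (\mathrm{AC50}/x)^{h}\bigr).
```

At x = AC50 the exponent is zero, so the response is halfway between the
minimum and the maximum.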
Similarly, a logistic 4-parameter model was proposed by Mr. C. Ritz and
Mr. J.C. Streibig for model fitting. Its four parameters are the upper
and lower limits of the response, the AC50, and the slope (Ritz, Streibig
2005).
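For reference, the four-parameter log-logistic function as parameterized
in the drc package (its LL.4 function) can be written as below. The
symbol names b, c, d and e follow that package and are assumed to match
the ones intended in the description above.

```latex
% drc-style parameterization: d = upper limit, c = lower limit,
% e = AC50 (midpoint), b = slope.
f(x) = c + \frac{d - c}{1 + \exp\{\, b\,(\log x - \log e) \,\}}
% With this sign convention, a response that increases with dose
% corresponds to a negative b.
```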
4. Problem
Analyzing data like those produced by a cell-based assay is essential to
interpreting what the data mean. However, it is unclear which method
should be used for the analysis. Both equations, the alternative model
and the standard logistic 4-parameter model, were written to fit curves
that follow the Hill equation. In terms of the accuracy of the AC50
parameter estimates obtained from fitting each equation, which equation
better fits a data set generated from the Hill equation itself?
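The claim that both equations describe the same Hill curve can be checked
numerically. The sketch below is illustrative only: the function names,
the concentrations and the AC50 of 50 are assumptions, and the
log-logistic form uses the drc-style parameterization with a negative
slope for an increasing response.

```python
import math

def hill(conc, bottom, top, ac50, slope):
    """Hill curve: response rises from bottom to top as conc increases."""
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** slope)

def hill_log2(conc, bottom, top, log2_ac50, slope):
    """Same curve, with the AC50 carried on the log2 scale."""
    exponent = -slope * (math.log2(conc) - log2_ac50)
    return bottom + (top - bottom) / (1.0 + 2.0 ** exponent)

def log_logistic4(conc, b, c, d, e):
    """Four-parameter log-logistic form: lower limit c, upper limit d, midpoint e."""
    return c + (d - c) / (1.0 + math.exp(b * (math.log(conc) - math.log(e))))

# Hypothetical values for illustration; not taken from the study.
concs = [1.0, 10.0, 25.0, 50.0, 75.0, 100.0]
ac50 = 50.0

rows = [
    (
        x,
        hill(x, 0.0, 100.0, ac50, 4.0),
        hill_log2(x, 0.0, 100.0, math.log2(ac50), 4.0),
        log_logistic4(x, -4.0, 0.0, 100.0, ac50),
    )
    for x in concs
]

for x, y_hill, y_log2, y_ll4 in rows:
    print(f"{x:6.1f}  {y_hill:10.6f}  {y_log2:10.6f}  {y_ll4:10.6f}")
```

The three columns agree to floating-point precision, so any difference in
AC50 accuracy between the two models comes from how the parameters are
estimated, not from the shape of the curve.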
5. Goals/Hypothesis
The goal of the experiment is to determine which method is better for
fitting a dose-response curve to a data set similar to one expected from
a cell-based assay, that is, one generated from a Hill equation. Because
of its log2 transformation, the alternative model seems to account for
error in the estimation of the AC50 parameter more effectively than the
standard logistic 4-parameter equation. I therefore hypothesize that if
the standard deviation of the normal error added to the expected values
from a Hill equation is increased, while the other parameters of the Hill
equation (maximum value, minimum value and slope) are held constant, then
the alternative model's estimates of the AC50 parameter will become more
accurate relative to those of the standard logistic 4-parameter
equation.
6. Methods
Using the programming language R (Chambers, 2003) and the add-on package
drc, which is used for bioassay analysis, I ran simulations to test both
the alternative model and the standard logistic 4-parameter model. For
the Hill equation used to generate the test data, I held the maximum
response at 100 percent of positive control, the minimum response at 0
percent, and the slope at 4 over a range of concentration values. Normal
error was added by drawing each response from a normal distribution with
the expected value as its mean and a manipulated standard deviation. The
standard deviation was varied from 0 to 10 in integer increments. For
each standard deviation, both the standard logistic 4-parameter equation
and the alternative model were fitted, and the accuracy of the AC50
parameter estimate, measured as its standard error, was recorded for
nine trials per standard deviation. The average of the nine trials was
taken to represent the accuracy of the AC50 parameter estimates for that
standard deviation.
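The data-generation step described above can be sketched as follows. This
is a minimal illustration in Python rather than the R code used in the
study, and it covers only the simulation of noisy responses, not the
model fitting. The AC50 of 50, the fifteen evenly spaced concentrations
and the random seed are assumptions made for the example.

```python
import random

def hill(conc, bottom=0.0, top=100.0, ac50=50.0, slope=4.0):
    """Expected response: minimum 0, maximum 100, slope 4 (AC50 is assumed)."""
    return bottom + (top - bottom) / (1.0 + (ac50 / conc) ** slope)

def simulate_responses(concs, sd, rng):
    """Draw each response from a normal distribution centred on the Hill value."""
    return [rng.gauss(hill(x), sd) for x in concs]

# Hypothetical design: fifteen concentrations evenly spaced from 1 to 100.
concs = [1.0 + i * (99.0 / 14.0) for i in range(15)]
rng = random.Random(2013)  # fixed seed so the sketch is repeatable

n_trials = 9
datasets = {
    sd: [simulate_responses(concs, float(sd), rng) for _ in range(n_trials)]
    for sd in range(0, 11)  # standard deviations 0, 1, ..., 10
}

print(len(datasets), "standard deviations,", n_trials, "trials each")
```

Each entry of `datasets` would then be passed to both model-fitting
routines, and the standard error of the fitted AC50 would be recorded for
each trial.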
7. Methods (cont.)
After the averages were calculated for all eleven standard deviations,
a separate plot was created for each equation, and an appropriate
regression model was fitted to each plot. This allowed the accuracy of
the AC50 parameter estimates to be projected for higher standard
deviations. An intersection between the two regression equations would
indicate that up to a certain standard deviation one of the
dose-response model equations gives a more accurate AC50 parameter
estimate, but that beyond that standard deviation the other model
equation gives the more accurate estimate. This allowed me to test my
hypothesis, as I was able to see directly the relationship between the
increasing standard deviation and the accuracies of the AC50 parameter
estimates.
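The regression step can be sketched with an ordinary least-squares line
fitted to the average AC50 standard errors. The data below are the Avg
rows of Tables 1 and 2; the helper name `fit_line` is hypothetical, and
the sketch is in Python rather than the software used to produce the
figures.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

std_devs = list(range(11))  # 0, 1, ..., 10

# Avg rows of Table 1 (logistic 4-parameter) and Table 2 (alternative model).
avg_logistic = [0.0000, 0.1079, 0.3295, 0.5569, 0.6647, 0.8804,
                0.9835, 1.2204, 1.3000, 1.4968, 1.5706]
avg_alternative = [0.0000, 0.1716, 0.2527, 0.7473, 0.8298, 0.9024,
                   0.9901, 1.2355, 1.3139, 1.5072, 1.5832]

slope_l, intercept_l = fit_line(std_devs, avg_logistic)
slope_a, intercept_a = fit_line(std_devs, avg_alternative)

print(f"logistic:    y = {slope_l:.4f}x + {intercept_l:.4f}")
print(f"alternative: y = {slope_a:.4f}x + {intercept_a:.4f}")
```

Rounded to four decimals, the fitted coefficients agree with the
regression lines reported in Figures 2 and 3.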
8. Figure 1
Hill Function
[Plot: x-axis Dose (0 to 100); y-axis Response (0 to 100)]
This is a model Hill function that was used to simulate data that tested
the two models. The blue curve is the Hill function without any normal
error, whereas the red points represent the points of the Hill function
with normal error of a varying standard deviation. In this case, the
standard deviation is 4.
9. Table 1
AC50 standard error, by trial and standard deviation
Std Dev:  0       1       2       3       4       5       6       7       8       9       10
Trial 1:  0.0000  0.1088  0.1847  0.5053  0.8776  0.9027  1.1688  1.0962  1.3599  1.5815  1.6782
Trial 2:  0.0000  0.1072  0.1886  0.4541  0.5759  0.9152  1.0264  1.1318  1.3191  1.7698  1.5294
Trial 3:  0.0000  0.1101  0.4842  0.3856  0.6051  0.8325  0.9367  1.2728  1.2626  1.6200  1.4427
Trial 4:  0.0000  0.1056  0.2116  0.7112  0.4365  0.8944  0.8799  1.2474  1.2025  1.2606  1.5382
Trial 5:  0.0000  0.1141  0.3656  0.7554  0.7739  0.9451  0.9895  1.2467  1.2676  1.3560  1.4801
Trial 6:  0.0000  0.1073  0.4706  0.3840  0.8740  0.8744  0.9654  1.2351  1.2962  1.4443  1.7880
Trial 7:  0.0000  0.0996  0.3445  0.7796  0.7056  0.8458  1.0201  1.2479  1.1218  1.4215  1.7174
Trial 8:  0.0000  0.1103  0.2306  0.6018  0.7786  0.8568  0.9834  1.3414  1.4883  1.4464  1.3557
Trial 9:  0.0000  0.1083  0.4852  0.4351  0.3554  0.8568  0.8808  1.1643  1.3816  1.5715  1.6060
Avg:      0.0000  0.1079  0.3295  0.5569  0.6647  0.8804  0.9835  1.2204  1.3000  1.4968  1.5706
This table lists the AC50 standard errors for nine trials of the logistic
4-parameter model. The average AC50 standard error is at the bottom of
the column for each standard deviation.
10. Table 2
AC50 standard error, by trial and standard deviation
Std Dev:  0       1       2       3       4       5       6       7       8       9       10
Trial 1:  0.0000  0.1778  0.2642  0.7292  0.9814  0.9416  1.1568  1.0942  1.3594  1.5925  1.6806
Trial 2:  0.0000  0.1708  0.2499  0.5934  0.8911  0.9219  1.0475  1.1660  1.3362  1.7678  1.5206
Trial 3:  0.0000  0.1715  0.2870  0.6008  0.6630  0.8358  0.9663  1.2747  1.2784  1.6158  1.4857
Trial 4:  0.0000  0.1727  0.2304  0.8109  0.6125  0.8984  0.8828  1.2430  1.2125  1.2861  1.5528
Trial 5:  0.0000  0.1772  0.2408  0.9933  0.9187  0.9573  0.9931  1.3091  1.2610  1.3928  1.4876
Trial 6:  0.0000  0.1708  0.2758  0.4578  1.1011  0.8820  0.9902  1.2389  1.3340  1.4736  1.8113
Trial 7:  0.0000  0.1513  0.2241  1.0280  0.8663  0.9039  1.0032  1.2661  1.1197  1.4390  1.7095
Trial 8:  0.0000  0.1853  0.2462  0.8098  0.7801  0.8903  1.0225  1.3510  1.5024  1.4391  1.3618
Trial 9:  0.0000  0.1665  0.2557  0.7028  0.6538  0.8903  0.8483  1.1767  1.4214  1.5578  1.6386
Avg:      0.0000  0.1716  0.2527  0.7473  0.8298  0.9024  0.9901  1.2355  1.3139  1.5072  1.5832
This table charts the AC50 standard errors for nine trials on the
alternative model. The average AC50 standard error is at the bottom of
the column for each standard deviation.
11. Figure 2
Avg AC50 Standard Error vs. Std Dev
[Plot: x-axis Standard Deviation (0 to 12); y-axis Avg AC50 Standard
Error (0 to 1.8); fitted line y = 0.1633x + 0.0116]
This graph shows the logistic 4-parameter model AC50 standard error
results. The graph plots AC50 standard error versus standard deviation.
More importantly, the linear regression is given as y = 0.1633x + 0.0116.
12. Figure 3
Avg AC50 Standard Error vs. Std Dev
[Plot: x-axis Standard Deviation (0 to 12); y-axis Avg AC50 Standard
Error (0 to 1.8); fitted line y = 0.1598x + 0.0677]
This graph shows the alternative model AC50 standard error results.
The graph plots AC50 standard error versus standard deviation. More
importantly, the linear regression is given as y = 0.1598x + 0.0677.
13. Discussion
The research conducted in this project led to a better understanding of
the accuracy of two models, the logistic 4-parameter model and the
alternative model, in estimating the AC50 parameter of a Hill function
with normal error. This is a critical step in determining which model is
best for analyzing data from high throughput screening (HTS) cell-based
assays. A future goal of this project is to simulate more than ten sets
of data in order to obtain more stable results for the accuracy of the
parameter estimates. Another future goal is to move beyond the accuracy
of the AC50 parameter estimates of the two models and analyze the
accuracy of all the parameters and, finally, of the model itself. Other
settings, such as the range of concentrations per chemical, would also
be varied. An important constraint to consider, however, is that HTS
data typically contain fifteen data points or fewer. More methods would
then be analyzed in head-to-head comparisons under this constraint to
determine which statistical method is best for analyzing HTS data.
14. Conclusion
The two graphs presented above indicate that for smaller standard
deviations the standard logistic 4-parameter model is more accurate for
estimating the AC50 parameter. However, the regression lines of the two
graphs intersect at a standard deviation of 15.743. This indicates that
at a standard deviation of 16 and beyond, the accuracy of the AC50
parameter estimates from the alternative model will surpass that of the
standard logistic 4-parameter model.
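The crossover point follows from setting the two regression lines equal
and solving for the standard deviation. The sketch below uses the
coefficients as printed in Figures 2 and 3, which are rounded to four
decimals; the helper name is hypothetical, and small differences from the
value quoted above are to be expected if unrounded coefficients were used
there.

```python
def intersection_x(slope_1, intercept_1, slope_2, intercept_2):
    """x at which slope_1*x + intercept_1 equals slope_2*x + intercept_2."""
    if slope_1 == slope_2:
        raise ValueError("parallel lines do not intersect")
    return (intercept_2 - intercept_1) / (slope_1 - slope_2)

# Rounded coefficients from Figure 2 (logistic) and Figure 3 (alternative).
crossover = intersection_x(0.1633, 0.0116, 0.1598, 0.0677)
print(f"lines cross near a standard deviation of {crossover:.2f}")
```

With these rounded coefficients the lines cross at a standard deviation
of roughly 16, well outside the simulated range of 0 to 10, so the
crossover is an extrapolation rather than an observed result.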
15. References
Hill A.V. 1910. The possible effects of the aggregation of the molecules of hemoglobin on
its dissociation curves. J Physiol 40
Chambers J. 2003. What Is R? The R Project for Statistical Computing. R-project.
Web. 14 Feb. 2013. <http://www.r-project.org/>.
National Toxicology Program. 2012. "Toxicology Testing in the 21st Century": A New
Strategy. High Throughput Screening Initiative. National Institute of Health. Web. 5
Sept. 2012.
<http://ntp.niehs.nih.gov/?objectid=06002ADB-F1F6-975E-73B25B4E3F2A41CB>.
Ritz C., Streibig J.C. 2005. Bioassay analysis using R. J Stat Softw 12
Shockley K.R. 2012. A Three-Stage Algorithm to Make Toxicologically Relevant Activity
Calls from Quantitative High Throughput Screening Data. Environmental Health
Perspectives 120
Xia M., et al. 2008. Compound Cytotoxicity Profiling Using Quantitative High-Throughput
Screening. Environmental Health Perspectives 116
16. Acknowledgements
This research was conducted in the Biostatistics Branch at the National
Institute of Environmental Health Sciences, NIH, DHHS, Research
Triangle Park, NC 27709. Many thanks to Dr. Kissling and Dr. Shockley
for their continued encouragement and guidance throughout this
project.