Data mining in pharmacovigilance
1. DATA MINING IN
PHARMACOVIGILANCE
Dr. Bhaswat S. Chakraborty
Sr. VP & Chair, R&D Core Committee
Cadila Pharmaceuticals Ltd., Ahmedabad
Presented at Indian Pharmacological Society
Meeting, Ahmedabad, October 5, 2013
3. PREMATURE APPROVAL,
INCOMPLETE SAFETY PROFILE?
Many drugs whose complete safety profile is still unknown have been approved
In some cases, drugs are approved despite identification of serious adverse events (SAEs) in premarketing trials
Alosetron hydrochloride – ischemic colitis
Grepafloxacin hydrochloride – QT prolongation and deaths
Rofecoxib – heart attack and stroke (long-term, high-dosage use)
They were all subsequently withdrawn from the market because of these SAEs
In currently marketed drugs, black box warnings (SAEs caused by prescription drugs) are very common
5. PHARMACOVIGILANCE
(PV)
Monitoring, evaluation and
implementation of drug safety
Detection and quantitation of adverse drug reactions (ADRs) that are
novel or partially known in their clinical nature, severity or frequency:
previously unknown
known hazard with ↑ frequency or ↑ severity
7. PHARMACOVIGILANCE
DATABASES
PV is usually practiced by agencies and pharmaceutical companies by focusing on signal detection (SD) in large databases
These databases are very large, e.g.,
USFDA database, AERS: > 6.2 million records
WHO database, VIGIBASE: > 7.2 million records
GSK database, OCEANS: > 2 million records
According to one study, the highest power for finding a true signal is achieved by combining those databases with the most drug-specific data.
Also, early safety SD should involve the use of multiple large global databases
Reliance on a single database may reduce statistical power and diversity of ADRs
Hammond IW et al. (2007). Expert Opin Drug Saf. 6:713-21
8. DESIRABLE ATTRIBUTES OF AE
DATABASE SOFTWARE
Should be well integrated with clinical data management software
User friendly
Individual reports management features
Easy to query
Line listing of the entire database, or part of it, is possible and easy
Data extraction is easy, with desirable filters
May also keep track of postmarketing Rx utility and complaints data
9. DATA MINING
Getting something useful from lots and lots and lots of data
Although it might appear so, the methodology is not linear,
as it involves building and assessing models, carrying out
simultaneous as well as serial steps
10. DRUG TOXIC SIGNALS
WHO: “reported information on a possible causal
relationship between an adverse event and a
drug, the relationship being unknown or
incompletely documented previously.”
More than a single report is needed
Suggests a drug-ADR (D-R) association (doesn't establish causality)
An alert from any available source
Generated from pre- or post-marketing data
Data mining, especially of post-marketing safety databases
11. SIGNAL DETECTION
Comes originally from electronics engineering
In signal detection theory, a receiver operating characteristic (ROC) curve plots the true positive rate (true positives out of the positives) against the false positive rate (false positives out of the negatives) at various threshold settings
Sensitivity (true positive rate) is high when the false negative rate is low
Specificity (true negative rate) is high when the false positive rate is low
12. Increasing the threshold would mean fewer false positives (and more false negatives). The actual shape of the curve is determined by the overlap of the two distributions.
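The threshold trade-off above can be sketched numerically. This is a minimal illustration, not taken from the slides: the function name `roc_points` and all scores are hypothetical, chosen only so that the "signal" and "noise" distributions overlap.

```python
def roc_points(signal_scores, noise_scores, thresholds):
    """Return (threshold, true_positive_rate, false_positive_rate) tuples."""
    points = []
    for t in thresholds:
        tp = sum(s >= t for s in signal_scores)  # true signals flagged at this threshold
        fp = sum(s >= t for s in noise_scores)   # noise flagged by mistake
        points.append((t, tp / len(signal_scores), fp / len(noise_scores)))
    return points


# Hypothetical, overlapping score distributions (illustration only)
signal_scores = [2.1, 2.8, 3.5, 4.0, 4.6]
noise_scores = [0.5, 1.2, 1.9, 2.5, 3.1]

points = roc_points(signal_scores, noise_scores, [1.0, 2.0, 3.0, 4.0])
for t, tpr, fpr in points:
    print(f"threshold={t:.1f}  sensitivity={tpr:.2f}  false positive rate={fpr:.2f}")
```

As the threshold rises, the false positive rate falls, and so does the sensitivity, because more true signals are missed.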
13. GOALS FOR ADR SIGNALS
Low false positive signals
Drug-ADR association should be real
Low false negative signals
Should not miss any Drug-ADR signal
Early detection of signals is desirable
False discovery rate → 0
Association
Bupropion – seizures
Olanzapine – thrombosis
Pergolide – increased libido
Risperidone – diabetes mellitus
Terbinafine – stomatitis
Rosiglitazone – liver function abnormalities
Dis-association
Isotretinoin – suicide
Source: LAREB
14. DATA MINING & SD PROTOCOL
Report collection
Database cleaning
Quantitative assessment
Qualitative assessment
Evaluation
Communication
Gavali, Kulkarni, Kumar and Chakraborty (2009), Ind J Pharmacol, 41, 162-166
15. DATA DISPLAY & MINING METHODS IN PV
No. of reports   Target R   Other R   Total
Target D         a          b         nTD
Other D          c          d         nOD
Total            nTA        nOA       n
Methods for Mining
Reporting Ratio (RR): E(a) = nTD × nTA/n
Proportional Reporting Ratio (PRR): E(a) = nTD × c/nOD
Odds Ratio (OR): E(a) = b × c/d
Need to accommodate uncertainty, especially if a is small
Bayesian approaches provide a way to do this
Basic approach: possible signal when R = a/E(a) is “large”
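The three expected counts above, and the corresponding ratios a/E(a), can be computed directly from the 2×2 table. The sketch below is illustrative only: the function names and the cell counts are hypothetical, and it assumes c and d are non-zero.

```python
def expected_counts(a, b, c, d):
    """Expected value of cell a under each method, from the 2x2 table cells."""
    n_td = a + b          # total reports for the target drug
    n_od = c + d          # total reports for all other drugs
    n_ta = a + c          # total reports of the target reaction
    n = a + b + c + d     # all reports
    return {
        "RR": n_td * n_ta / n,
        "PRR": n_td * c / n_od,
        "OR": b * c / d,
    }


def disproportionality(a, b, c, d):
    """Observed-to-expected ratio a/E(a) for each method."""
    return {name: a / e for name, e in expected_counts(a, b, c, d).items()}


# Hypothetical counts (illustration only)
a, b, c, d = 20, 180, 100, 9700
ratios = disproportionality(a, b, c, d)
for name, value in ratios.items():
    print(f"{name}: a/E(a) = {value:.2f}")
```

With these definitions, a/E(a) for PRR equals [a/(a + b)] / [c/(c + d)] and a/E(a) for OR equals (a × d)/(b × c), the familiar closed forms.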
16. CRITERIA FOR A TOXIC DISPROPORTIONAL ADR
PRR = [a/(a + b)] / [c/(c + d)]
ROR = (a/b) / (c/d)
χ² = (Observed − Expected)² / Expected
A significant disproportional signal is detected when χ² is ≥ 4.0 and the rest (PRR, ROR) are ≥ 2.0
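A minimal sketch of how these criteria might be checked for one drug-reaction pair. Two assumptions are made that the slide does not state: χ² is taken as the Pearson statistic summed over all four cells of the 2×2 table (no continuity correction), and "the rest" is read as PRR and ROR. Function names and counts are hypothetical; b, c and d must be non-zero.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square: sum of (observed - expected)^2 / expected over the 4 cells."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n  # count expected under independence
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


def is_disproportional(a, b, c, d, chi2_min=4.0, ratio_min=2.0):
    """Assumed reading of the criteria: chi-square >= 4.0, PRR >= 2.0 and ROR >= 2.0."""
    prr = (a / (a + b)) / (c / (c + d))
    ror = (a / b) / (c / d)
    return chi_square_2x2(a, b, c, d) >= chi2_min and prr >= ratio_min and ror >= ratio_min


# Hypothetical counts (illustration only)
print(is_disproportional(20, 180, 100, 9700))  # strongly over-reported pair
print(is_disproportional(2, 198, 98, 9702))    # reported at the background rate
```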
18. BAYESIAN STATISTICS
IN SD
Pr(R|D) / Pr(R) = Pr(R,D) / [Pr(R) × Pr(D)]
where Pr(R|D) is the posterior probability of observing a specific adverse event R given that a specific drug D is the suspect drug.
Pr(R) and Pr(D) are prior probabilities of observing R and D in the entire database.
Pr(R,D) is the joint probability that both R and D were observed together (co-occurring) in the database.
19. MULTI-ITEM GAMMA POISSON
SHRINKER (MGPS)
It ranks drug-event combinations according to how ‘interestingly large’ the number of reports of that R-D combination is, compared with what would be expected if the drug and event were statistically independent.
Unlike the Information Component (IC), the MGPS technique gives an overall ranking of R-D combinations
IC gives a kind of non-relative measure for each R-D combination
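The idea of shrinking and then ranking can be illustrated with a deliberately simplified sketch. This is not MGPS itself: it assumes a single gamma prior with hypothetical parameters `alpha` and `beta` (prior mean 1) instead of the fitted two-component gamma mixture, and it uses the posterior mean of the reporting ratio. All counts are hypothetical.

```python
def shrunk_ratio(observed, expected, alpha=1.0, beta=1.0):
    """Posterior mean of the reporting ratio under a Gamma(alpha, beta) prior
    and a Poisson likelihood: (observed + alpha) / (expected + beta)."""
    return (observed + alpha) / (expected + beta)


# Hypothetical (observed, expected) counts per drug-event pair
pairs = {
    "drug A - event X": (1, 0.05),   # raw ratio 20, but based on a single report
    "drug B - event Y": (40, 4.0),   # raw ratio 10, well supported
    "drug C - event Z": (3, 2.5),    # raw ratio 1.2
}

ranked = sorted(
    ((name, shrunk_ratio(a, e)) for name, (a, e) in pairs.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: shrunk ratio = {score:.2f}")
```

The single-report pair has the largest raw ratio, yet after shrinkage it ranks below the well-supported pair, which is the behaviour the ranking is meant to deliver for small counts.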
22. BAYESIAN CONFIDENCE
PROPAGATION NEURAL
NETWORK (BCPNN)
The Uppsala Monitoring Centre (UMC) for WHO
databases uses BCPNN architecture for SD
Neural networks are highly organized & efficient
Give simple probabilistic interpretation of network
weights
Analogous to a living neuron with its multiple
dendrites and single axon
BCPNN calculates cell counts for all potential R-D
combinations in the database, not just those
appearing in at least one report
Done with two fully interconnected layers
One for all drugs and one for all adverse events
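The cell-count step can be sketched as follows. This only illustrates counting every potential drug-event combination, including those never reported together; it is not the BCPNN implementation, and the function name and reports are hypothetical.

```python
from itertools import product


def cell_counts(reports):
    """Count co-reports for every potential (drug, event) pair, zeros included.

    reports: list of (set_of_drugs, set_of_events), one entry per report.
    """
    drugs = sorted({drug for report_drugs, _ in reports for drug in report_drugs})
    events = sorted({event for _, report_events in reports for event in report_events})
    # Start every drug x event cell at zero, mirroring two fully connected layers
    counts = {pair: 0 for pair in product(drugs, events)}
    for report_drugs, report_events in reports:
        for pair in product(report_drugs, report_events):
            counts[pair] += 1
    return counts


# Hypothetical reports (illustration only)
reports = [
    ({"captopril"}, {"cough"}),
    ({"captopril", "digoxin"}, {"cough"}),
    ({"digoxin"}, {"rash"}),
]
counts = cell_counts(reports)
for pair, count in sorted(counts.items()):
    print(pair, count)
```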
23. INFORMATION
COMPONENT (IC)
IC is used to decide whether the joint probability of a D-R pair differs from that expected if D & R were independent.
This makes sense because if the events are independent, knowledge of one variable contributes no new information about the other and does not reduce the uncertainty about it (e.g., knowing D does not reduce the uncertainty about R)
IC = log2 {Pr(R,D) / [Pr(R) × Pr(D)]}
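A minimal numerical sketch of the IC formula, using raw relative frequencies as probability estimates. This is a simplification: the BCPNN works with Bayesian estimates rather than raw frequencies. The function name and counts are hypothetical, and the co-report count must be greater than zero.

```python
import math


def information_component(n_rd, n_r, n_d, n_total):
    """IC = log2 of Pr(R,D) / (Pr(R) * Pr(D)), probabilities estimated as count / total."""
    p_rd = n_rd / n_total   # joint probability of reaction and drug
    p_r = n_r / n_total     # prior probability of the reaction
    p_d = n_d / n_total     # prior probability of the drug
    return math.log2(p_rd / (p_r * p_d))


# Hypothetical counts in a database of 1000 reports
ic_independent = information_component(n_rd=10, n_r=100, n_d=100, n_total=1000)
ic_associated = information_component(n_rd=40, n_r=100, n_d=100, n_total=1000)
print(f"independent pair: IC = {ic_independent:.2f}")
print(f"associated pair:  IC = {ic_associated:.2f}")
```

When the pair is reported exactly as often as independence predicts, the IC is zero; when it is reported four times as often, the IC is 2.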
24. POSITIVE IC AND TIME SCANS
If the Pr of co-occurrence of R & D is the same as the product of the individual Pr of R & D, the Bayesian likelihood estimator Pr(R,D) / [Pr(R) × Pr(D)] will be equal to 1
This means equal prior and posterior probabilities
log2 1 = 0, therefore IC = 0
However, when the posterior probability Pr(R|D) exceeds the prior probability Pr(R), the IC becomes positive
An IC whose 95% CI lower bound is > 0 and that increases with sequential time scans is a positive stable signal
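The time-scan rule can be written as a small check. The sketch assumes that IC estimates and their 95% CI lower bounds have already been computed for each scan, that "increases" means non-decreasing from one scan to the next, and that the lower bound needs to be above zero at the latest scan. These readings, the function name and the values are assumptions for illustration.

```python
def is_positive_stable_signal(scans):
    """scans: chronological list of (ic, lower_95) tuples, one per time scan."""
    latest_lower_bound = scans[-1][1]
    increasing = all(
        later[0] >= earlier[0] for earlier, later in zip(scans, scans[1:])
    )
    return latest_lower_bound > 0 and increasing


# Hypothetical scans (illustration only)
rising = [(-0.5, -2.0), (0.8, -0.1), (1.6, 0.4)]
fading = [(1.2, 0.3), (0.6, 0.1)]
print(is_positive_stable_signal(rising))
print(is_positive_stable_signal(fading))
```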
25. CAPTOPRIL AND COUGH
The diagram shows the IC for the drug-ADR association. Error bars: 95% CI.
R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
26. A well-known signal: suprofen and back pain. The diagram shows the IC for the drug-ADR association. Error bars: 95% CI.
R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
27. The development from 1973 to 1990 of the IC for the drug azapropazone vs. the photosensitivity reaction, with 95% CI.
R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
28. CHARACTERISTICS OF
IC
The preceding diagrams show how the IC for a D-R association (e.g., suprofen-back pain) varies over a span of time (e.g., 1983 – 1990)
The cumulative probability function for IC being greater than zero [Pr(IC>0)] develops over time. This association is seen with 80% certainty after Q1 1984.
29. DIGOXIN & RASH: AN INTERESTING CASE
Although the overall IC was negative, when examined across age groups, increasing age was associated with a positive IC.
R. Orre et al. (2000) Computational Statistics & Data Analysis 34, 473-493
30. PACLITAXEL-TACHYCARDIA
Change of IC from 1970 to 2010 for the association of paclitaxel-tachycardia. The IC is plotted from 1970 to 2010 at five-year intervals with 95% CI
Singhal & Chakraborty. Unpublished data
31. DOCETAXEL - FLUSHING
Change of IC from 1970 to 2010 for the association of docetaxel-flushing.
Singhal & Chakraborty. Unpublished data
[Figure: E(IC) (y-axis, −2 to 7) plotted against time in years (x-axis), in periods 1970-1975, 1976-1980, 1981-1985, 1986-1990, 1991-1995, 1996-2000, 2001-2005 and 2006-2010]
32. CONCLUDING REMARKS
Statistical data mining for drug-adverse reaction associations offers a useful, non-invasive and sophisticated tool for detecting unknown or incompletely documented signals
Mainly proportional reporting ratios (PRR) and Bayesian data mining, including Empirical Bayesian Screening (EBS) & Bayesian Confidence Propagation Neural Network (BCPNN), are used
PRRs and EBS are comparable; EBS has an advantage with D-R combinations in very small numbers, but it is based on relative ranking
BCPNN provides an IC (a kind of threshold) for signaling that applies to any D-R cell irrespective of ranking
The signals do not establish causality; they only indicate a very strong association between D & R
With all methods of data mining (especially PRR, EBS & BCPNN), the quality & size of the database are very important (they can amplify or dilute a signal)
Alosetron is indicated only for women with severe diarrhea-predominant irritable bowel syndrome (IBS). Grepafloxacin hydrochloride (trade name Raxar, Glaxo Wellcome) is an oral broad-spectrum fluoroquinolone antibiotic that was used to treat bacterial infections. Rofecoxib is a nonsteroidal anti-inflammatory drug (NSAID) that has now been withdrawn over safety concerns.
Proportional Reporting Ratio (PRR); Reporting Odds Ratio (ROR);
Neural networks are self-organising, suited to parallel computation, computationally efficient and provide a simple probabilistic interpretation of network weights.[3] Computational efficiency may be particularly advantageous with this programme because the BCPNN starts by calculating cell counts for all potential drug-adverse event combinations in the database, not just those that appear together in at least one report. This is accomplished with two fully interconnected layers, one for all drugs and one for all adverse events.
Azapropazone is a non-steroidal anti-inflammatory drug used in acute gout, ankylosing spondylitis & rheumatoid arthritis