Applying Semantic Web Mining to Analyze Global Research Activity and Render Business Intelligence
1. Applying Semantic Web Mining to Analyze Global Research Activity and Render Business Intelligence
David Cocker
MDCPartners Belgium
Clinical Trial Disclosures
Bethesda
2. The views and opinions expressed in the following PowerPoint slides are those of the individual presenter and
should not be attributed to Drug Information Association, Inc. (“DIA”), its directors, officers, employees, volunteers,
members, chapters, councils, Special Interest Area Communities or affiliates, or any organization with which the
presenter is employed or affiliated.
These PowerPoint slides are the intellectual property of the individual presenter and are protected under the
copyright laws of the United States of America and other countries. Used by permission. All rights reserved. Drug
Information Association, DIA and DIA logo are registered trademarks or trademarks of Drug Information
Association Inc. All other trademarks are the property of their respective owners.
You can use my slides if you want…
Drug Information Association
www.diahome.org
3. Basic flow of this presentation
• What is business intelligence?
• What is semantic web mining?
• How can you use the data from the web to augment strategy?
• More disclosure means more data to evaluate!
• Output data treatments
4. What is business intelligence?
• Data which is useful to a commercial company
• Supports an assumption(s)
• Monitors strategy
• Helps notice a change in homeostasis
• GET QUICKER TO THE TRUTH
Make better choices
6. Decision Engineering: when do you need the info?
Interdependence and complexity, creating greater uncertainties, systemic risk and a less predictable future.
These changes have led to reduced warning times and compressed decision cycles.
[Diagram: Decision Lifecycle, showing a Planning Phase and an Implementation Phase. Labels: Scientific Question, Requirements, Specification, Design, Alignment, Security, Quality Assistance, Retention, Execution & Monitoring, Rapid Response to Change]
7. How can you use the data from the web to augment strategy?
15. Inferred data: source CT.gov
Argentina, Buenos Aires
• Hospital Britanico-Buenos Aires: Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1280AEB
• Hospital Italiano de Cordoba: Cordoba, Argentina, X5004 FJE
• UAI Hosp. Universitario: Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1437BZL
• Sanatorio Allende: Cordoba, Argentina, X5000JHQ
• Sanatorio Otamendi: Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1115AAB
• Instituto del Corazon Denton A. Cooley: Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1416A
• HIGA Hospital Interzonal General de Agudos Oscar Allende: Mar del Plata, Buenos Aires, Argentina, 07600
• Instituto de Cardiologia J.F. Cabral: Corrientes, Argentina, W3400AMZ
• Clinica Independencia Munro: Munro, Buenos Aires, Argentina, 01605
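A minimal sketch of how site records like these could be turned into inferred data, here a count of sites per city. The pairing of each site with its address is read off the slide and is an assumption, as are the names SITES, city_of and sites_per_city; the slide does not describe the actual mining pipeline.

```python
from collections import Counter

# (site name, location string) pairs as read from the slide.
# The site-to-address pairing is an assumption made for illustration.
SITES = [
    ("Hospital Britanico-Buenos Aires",
     "Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1280AEB"),
    ("Hospital Italiano de Cordoba", "Cordoba, Argentina, X5004 FJE"),
    ("UAI Hosp. Universitario",
     "Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1437BZL"),
    ("Sanatorio Allende", "Cordoba, Argentina, X5000JHQ"),
    ("Sanatorio Otamendi",
     "Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1115AAB"),
    ("Instituto del Corazon Denton A. Cooley",
     "Ciudad Autonoma de Buenos Aires, Buenos Aires, Argentina, C1416A"),
    ("HIGA Hospital Interzonal General de Agudos Oscar Allende",
     "Mar del Plata, Buenos Aires, Argentina, 07600"),
    ("Instituto de Cardiologia J.F. Cabral", "Corrientes, Argentina, W3400AMZ"),
    ("Clinica Independencia Munro", "Munro, Buenos Aires, Argentina, 01605"),
]


def city_of(location: str) -> str:
    """Return the first comma-separated field of a location string."""
    return location.split(",")[0].strip()


def sites_per_city(sites):
    """Count how many sites fall in each city."""
    return Counter(city_of(location) for _, location in sites)


counts = sites_per_city(SITES)
for city, n in counts.most_common():
    print(f"{city}: {n}")
```

Taking the first field as the city is the simplest possible rule; real registry records would need more careful normalisation of spelling and administrative levels.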
16. Inferred data with added information from PubMed
17. Where are the new trials going?
Diabetes
Oncology
???
???
18. Enrolment statistics over Diabetes, Blue chip, ignoring the economic blocks North America, Europe, Europe West, Japan
[Chart: trial count (0 to 60) by economic block, x-axis 1 to 10. Economic blocks: Africa, Asia, Central America, China, Europe East, India, Middle East, Russia, South America, Southern Hemisphere]
22. Organization Ranking System
Score components shown:
• Sponsor activity
• Drug research sponsored
• Investigator Institutional score
• Therapeutic Academic score
Descriptions shown:
• Number of trials active, recruiting and completed, for a given organisation or department
• Specific drug or drug class research as an active sponsor
• Individuals found in trial registries, regulatory websites & publishers on clinical trial activity
• Impact assessment of TA-targeted publications
• Ranked assessment of the organisation by weight of publications pertaining to the therapy area
Example values shown: 60, 1, 302, 30
23. Calculation of scores
TA specific Journal categories:
• Top 10%: High Impact Journal (HIJ)
• 11 – 40%: Medium Impact Journal (MIJ)
• 41 – 100%: Low Impact Journal (LIJ)
Person Academic score
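The journal categories above can be sketched as a simple percentile rule. This is an illustration only: it assumes journals in a therapy area are ranked by impact with rank 1 as the highest, and that anything above the top 10% and up to 40% counts as medium. The function name journal_tier is hypothetical, and the slide does not say how the tiers are weighted into the Person Academic score, so no weighting is shown.

```python
def journal_tier(rank: int, total: int) -> str:
    """Classify a journal by its impact rank within a therapy area.

    rank  -- 1-based position in the impact ranking (1 = highest impact)
    total -- number of ranked journals in the therapy area
    Integer arithmetic is used so the 10% and 40% boundaries are exact.
    """
    if total < 1 or not 1 <= rank <= total:
        raise ValueError("rank must lie between 1 and total")
    if rank * 10 <= total:        # top 10%
        return "HIJ"
    if rank * 10 <= 4 * total:    # above 10%, up to 40%
        return "MIJ"
    return "LIJ"                  # remaining journals


# Hypothetical therapy area with 50 ranked journals.
for rank in (1, 5, 6, 20, 21, 50):
    print(rank, journal_tier(rank, 50))
```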
24. Snips of data sources
Information rendered: KOL profile
• Top centre in DMT2
• Expert in Diabetes
• Expert in regulatory review
• On review panel
• Participated in trials
• Speaker @ American Diabetes Association
• Company disclosure $$
• Sponsor conducting trial @
• GLP-1
• Works with ABC Pharma Co. on new GLP-1
Callouts:
• Well he knows his stuff
• Diabetes. And works for hospital X
• OH, he's worked for Sponsor on xyz wonder drug last year.
• And he gets money from XYZ Pharma Co. as a speaker
• He's been an investigator for a while now
• He's speaking next week at the big US event
• He works for ABC Pharma Co. too on their GLP-1
Data Source examples: PubMed, clinicaltrials.gov, Corporate databases, BioMonitor, Google, EMEA (Competent authorities), Drug databases
25. Key Components of KOL Systems going forward
• Transparent data capture procedure
• Inferred data not enough
• New mechanisms to catch & update information
• Real-time
• Objective, measurable components (=accepted)
• Compliant with codes of conduct
• Portable data
• Globally accessible
• Metrics
• Dashboard approach
• Robust privacy, secure technologies, opted in/out
[Side scale: Objective to Subjective]
26. Where will my patients come from?
[Map: New York, Allentown, Philadelphia, Wilmington, Baltimore and Washington, with three 50-mile radius markers]
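A 50-mile catchment like the one drawn on the map can be approximated with great-circle distances. The sketch below is an assumption-laden illustration: the coordinates are approximate city-centre values supplied here, not taken from the slide, the slide does not say which cities the circles are centred on, and the names CITIES, haversine_miles and within_radius are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

# Approximate (latitude, longitude) in degrees; illustrative values only.
CITIES = {
    "New York": (40.7128, -74.0060),
    "Allentown": (40.6023, -75.4714),
    "Philadelphia": (39.9526, -75.1652),
    "Wilmington": (39.7447, -75.5484),
    "Baltimore": (39.2904, -76.6122),
    "Washington": (38.9072, -77.0369),
}


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def within_radius(centre, radius_miles=50.0):
    """Names of the other cities within radius_miles of the centre city."""
    origin = CITIES[centre]
    return sorted(
        name for name, position in CITIES.items()
        if name != centre and haversine_miles(origin, position) <= radius_miles
    )


print(within_radius("Philadelphia"))
print(within_radius("Washington"))
```

Straight-line distance ignores travel time, so a real referral model would likely replace it with drive-time estimates.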
27. Making choices about trial placement
Current environment
Time – Cost – Quality
Commercial risk analysis
Investment choices
28. Add population statistics to measure patient referral
Debrecen-Budapest = 216 km
Debrecen-Szeged = 220 km
Ovarian Cancer
Alzheimer's
Budapest (2009)
- City population: ▲ 1,712,210
- Density: 3,241.5/km2
- Urban: ▲ 2,503,205
- Metro: ▲ 3,271,110
Szeged
- Population: 169,678
- Density: 604.2/km2 (1,564.9/sq mi)
Debrecen
- Population: 206,225
- Density: 442.53/km2 (1,146.1/sq mi)
30. Critical emerging regions
Where will I be doing my oncology trials in 2012?
[Figure labels: .4% (leading digits lost), 19.7%, 6.3%, 7.4%]
[Chart: Regional Site Utilisation (1998 - 2008); y-axis 0 to 0.8, x-axis 1996 to 2010]
[Chart: Site by trial ratio (1999 – 2007); y-axis 7 to 11, x-axis 1999 to 2007]
Source: MDCpartners
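The slide does not define the site by trial ratio. One plausible reading, assumed here, is the average number of sites per trial in a given year, that is, the number of distinct (trial, site) pairs divided by the number of distinct trials. The sketch below uses made-up sample records and a hypothetical function name purely to show the arithmetic.

```python
from collections import defaultdict


def site_by_trial_ratio(records):
    """Average sites per trial for each year.

    records -- iterable of (year, trial_id, site_id) tuples
    Returns {year: distinct (trial, site) pairs / distinct trials}.
    """
    trials = defaultdict(set)
    pairs = defaultdict(set)
    for year, trial_id, site_id in records:
        trials[year].add(trial_id)
        pairs[year].add((trial_id, site_id))
    return {year: len(pairs[year]) / len(trials[year])
            for year in sorted(trials)}


# Hypothetical records, not real registry data.
SAMPLE = [
    (1999, "T1", "S1"),
    (1999, "T1", "S2"),
    (1999, "T2", "S1"),
    (2000, "T3", "S1"),
    (2000, "T3", "S2"),
    (2000, "T3", "S3"),
]

ratios = site_by_trial_ratio(SAMPLE)
print(ratios)
```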
31. Leverage information
• Can we leverage these expanding public data sources?
• Can advances in technology optimise information gathering?
• If we have the information, can we apply it at the right time to enhance efficiency?