A session from SemTech SF in June 2012
Abstract: As access to a richer set of knowledge and research continues to be critical to the healthcare community, users of healthcare and life science solutions are demanding the same level of discoverability, integration, and innovation from their professional tools that they enjoy in their personal applications. Through the Smart Content initiative, Elsevier seeks to semantically enrich its diverse offerings of health sciences content, both to improve the performance of existing online resources and to enable the creation of the next generation of digital products. In this session, Alan Yagoda will discuss Elsevier’s efforts in developing Smart Content capabilities to power a new portfolio of strategic product offerings. The journey into smarter search and discovery resulted in a new infrastructure with a rich set of semantic capabilities, including the development of a standardized medical taxonomy called EMMeT (Elsevier’s Merged Medical Taxonomy), indexing and content enrichment, and linked data services.
1. Elsevier Health Sciences
Smart Content Drives Smart Applications
The Future Of Using Knowledge In Healthcare
SemTechBiz 2012 Conference, June 5, 2012
Alan Yagoda, VP, Business Technology
a.yagoda@elsevier.com
@alanyagoda
2. About Elsevier
Elsevier is the largest Science, Technical and Medical
Publisher in the world. In the area of Health Sciences,
Elsevier publishes leading brands including The Lancet,
Braunwald’s Heart Disease, Gray’s Anatomy, and the Netter
Atlases among others. In addition, Elsevier produces leading
online clinical support tools and products including:
• MD Consult
• Procedures Consult
• Mosby’s Nursing Consult
• CPMRC Nursing Care Plans
• Gold Standard Drug Database
• MEDai Analytics for Managed Care Plans
Elsevier Proprietary and Confidential
4. The Challenge: Getting doctors the right information to make
the best decisions and provide the best clinical care
Trusted:
Authoritative medical and surgical content from Elsevier.
Comprehensive:
Integrated Medline and 3rd party journal content.
Speed To Answer:
Fast discoverability of the most relevant answers and
more intuitive searching.
5. Introducing Smart Content
6. Taxonomy-Powered Content = Smart Content
Content with applied taxonomy
Content today with structured XML
Copyright 2011 Outsell Gilbane Services, Inc.
http://www.outsellinc.com
http://gilbane.com/xml/2009/11/what-is-smart-content.html#ixzz0hnuRhaBc
7. Smart Content At Elsevier
Smart Content Applications
Better discovery through semantic search & navigation
• Faceted search & browse
• Ontology-driven navigation
• Task-specific results
• Personalized/localized results
• Question answering
• Link to evidence-based content
Better understanding through analysis and visualization
• Tag clouds
• Heatmaps
• Streamgraphs
• Scatterplots
• Time series
• Animations
New knowledge through aggregation and synthesis
• Topic pages
• Social network maps
• Geolocation maps
• Data mashups
• Text mining reports
[Diagram labels: Elsevier content (text, tables, images); entities, concepts and relationships; Elsevier knowledge organization systems; linked data from partners and the Web]
8. Making Smart Content Work in the Clinical Setting
Content types: Clinical Trials, Clinical Summaries, Journals, Guidelines, Procedural Videos, Drug Info, Patient Ed, Books
Elsevier Merged Medical Taxonomy (EMMeT)
• 250K+ Core Clinical Concepts
• 1M+ Synonyms
• 1M+ Hierarchical Relationships
• 1M+ Ontological Relationships
EMMeT Concept Mapping: Elsevier, Custom, UMLS
• Vast amounts of content made easily discoverable
• Specialty-specific navigation
• Dynamic clinical summary creation
• Meaningful related content recommendations
9. Introducing EMMeT (Elsevier Merged Medical Taxonomy)
Medical Name: Malignant Neoplasm of the Breast
Consumer Friendly Name: Breast Cancer
Synonyms
• Malignant Tumor of Breast
• Malignant Breast Neoplasm
• Breast Ca
Codes
• ICD9 – 174.9
• MeSH – D001943
• SNOMED-CT – 190121004
Semantic Type/Group: Neoplastic Process/Disease
Parent Terms
• Breast Disorders
• Cancer of the Thorax
• Mammary Neoplasms
• More…
Children Terms
• Breast Sarcoma
• Familial Breast Cancer
• Malignant lymphoma of the Breast
• Malignant Neoplasm of the breast outer quadrant
• More…
Semantic Relationships
• Symptoms: Breast Lump, Nipple Retraction, …
• Diagnostic Procedures: Mammography, Breast Biopsy, …
• Treatment Procedures: Chemotherapy, Mastectomy, …
• Medications: Tamoxifen, Doxorubicin, …
• Risk Factors: Family History, Genetics, Predisposition, …
• Prevention: Screening, Preemptive Mastectomy, …
• Complications: Metastatic Cancer, …
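As a rough illustration of the concept record this slide describes (names, synonyms, codes, parent and children terms, semantic relationships), the sketch below models one entry as a plain Python data class. The class name, field names, and the `labels` helper are assumptions made for this example; the slides do not describe how EMMeT is actually stored.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Concept:
    """Illustrative concept record; field names are assumptions, not EMMeT's schema."""
    medical_name: str
    consumer_name: str
    synonyms: List[str] = field(default_factory=list)
    codes: Dict[str, str] = field(default_factory=dict)  # coding system -> code
    semantic_type: str = ""
    parents: List[str] = field(default_factory=list)
    children: List[str] = field(default_factory=list)
    # Semantic relationships keyed by relationship type, e.g. "Symptoms".
    relationships: Dict[str, List[str]] = field(default_factory=dict)

    def labels(self) -> Set[str]:
        """All names and synonyms, lowercased, for matching free text to the concept."""
        return {
            name.lower()
            for name in (self.medical_name, self.consumer_name, *self.synonyms)
        }


# Values copied from the slide's Breast Cancer example (lists shortened).
breast_cancer = Concept(
    medical_name="Malignant Neoplasm of the Breast",
    consumer_name="Breast Cancer",
    synonyms=["Malignant Tumor of Breast", "Malignant Breast Neoplasm", "Breast Ca"],
    codes={"ICD9": "174.9", "MeSH": "D001943", "SNOMED-CT": "190121004"},
    semantic_type="Neoplastic Process/Disease",
    parents=["Breast Disorders", "Cancer of the Thorax", "Mammary Neoplasms"],
    children=["Breast Sarcoma", "Familial Breast Cancer"],
    relationships={
        "Symptoms": ["Breast Lump", "Nipple Retraction"],
        "Medications": ["Tamoxifen", "Doxorubicin"],
    },
)

# A shorthand synonym resolves to the same concept as the formal medical name.
print("breast ca" in breast_cancer.labels())
print(breast_cancer.codes["MeSH"])
```

Keeping every surface form on the concept itself is what lets a search for a consumer term, a clinical term, or a code land on the same node of the taxonomy.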
10. Automated Indexing: Weighted Tags for Better Search
Article-level SMART Content tags help
confirm relevance and provide a topical
overview about a piece of content.
Paragraph-level SMART Content tags
uncover highly-relevant information not
necessarily evident from the title or
abstract alone.
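To make the idea of weighted article-level and paragraph-level tags concrete, here is a deliberately naive sketch. It assumes a tiny hand-written vocabulary and uses relative mention counts as weights; this is a stand-in chosen for illustration, not the indexing method used in the Smart Content pipeline, and `VOCAB`, `tag_paragraph`, and `tag_article` are hypothetical names.

```python
import re
from collections import Counter
from typing import Dict, List

# Hypothetical mini-vocabulary: concept name -> surface forms to look for.
VOCAB: Dict[str, List[str]] = {
    "Breast Cancer": ["breast cancer", "malignant neoplasm of the breast"],
    "Tamoxifen": ["tamoxifen"],
    "Mammography": ["mammography", "mammogram"],
}


def tag_paragraph(text: str) -> Dict[str, float]:
    """Return concept -> weight for one paragraph.

    Weights are each concept's share of all vocabulary mentions in the
    paragraph, so they sum to 1.0 (or the result is empty if nothing matches).
    """
    lowered = text.lower()
    counts: Counter = Counter()
    for concept, forms in VOCAB.items():
        hits = sum(
            len(re.findall(r"\b" + re.escape(form) + r"\b", lowered))
            for form in forms
        )
        if hits:
            counts[concept] = hits
    total = sum(counts.values())
    return {concept: n / total for concept, n in counts.items()}


def tag_article(paragraphs: List[str]) -> Dict[str, float]:
    """Average the paragraph-level weights into article-level weights."""
    totals: Counter = Counter()
    for paragraph in paragraphs:
        totals.update(tag_paragraph(paragraph))
    n = len(paragraphs) or 1
    return {concept: weight / n for concept, weight in totals.items()}


paragraphs = [
    "Breast cancer screening relies on mammography.",
    "Tamoxifen reduced recurrence in the treatment arm; tamoxifen was well tolerated.",
]

for i, paragraph in enumerate(paragraphs):
    print(i, tag_paragraph(paragraph))
print("article", tag_article(paragraphs))
```

In this toy input the second paragraph is tagged only with Tamoxifen, which is the kind of detail a title or abstract about screening would not reveal, while the article-level weights still give a topical overview of the whole piece.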
16. Represent Enhancements and Vocabularies In RDF Satellites
Creation of Satellite Standards
• Linked data compliant RDF representing metadata objects
• Leverage common namespaces from dct, pav, rdf, skos
• Taxonomies in SKOS to enhance portability in the linked data world
• Subject tagging against a vocabulary representing extracted knowledge
• Concept URIs that can be equated to URIs in linked data
Delivery Infrastructure
• Product-specific indexes generating RDF “Smart Tags”
• Data pipeline transformations for building semantic warehouse
• Exposed through linked data delivery services
Example RDF Statements
Tags from a taxonomy for a given document
Document sections relevant to a given concept
Document sections providing answers to a given question
Genes mentioned in a given document
Documents supporting or disputing conclusions of a given document
Concepts in the areas of expertise for a given author
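The first kind of statement listed above, tags from a taxonomy for a given document, can be sketched with the namespaces the slide names. The example below hand-writes N-Triples lines using `skos` and `dct` terms; the namespace URIs are the standard ones, but the `example.org` base URIs, the identifiers, and the helper functions are hypothetical, and the actual satellite format is not shown in the slides.

```python
from typing import Iterable, List

# Standard vocabulary namespaces.
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCT = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical base URIs, used only for this illustration.
CONCEPT_BASE = "http://example.org/taxonomy/"
DOC_BASE = "http://example.org/content/"


def iri(value: str) -> str:
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def literal(text: str, lang: str = "en") -> str:
    """Language-tagged literal with minimal escaping (backslash and double quote)."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"@' + lang


def concept_triples(concept_id: str, pref_label: str,
                    alt_labels: Iterable[str],
                    broader_ids: Iterable[str]) -> List[str]:
    """SKOS description of one concept: type, labels, and parent links."""
    subject = iri(CONCEPT_BASE + concept_id)
    lines = [
        f"{subject} {iri(RDF_TYPE)} {iri(SKOS + 'Concept')} .",
        f"{subject} {iri(SKOS + 'prefLabel')} {literal(pref_label)} .",
    ]
    lines += [f"{subject} {iri(SKOS + 'altLabel')} {literal(a)} ." for a in alt_labels]
    lines += [
        f"{subject} {iri(SKOS + 'broader')} {iri(CONCEPT_BASE + b)} ."
        for b in broader_ids
    ]
    return lines


def tag_triples(doc_id: str, concept_ids: Iterable[str]) -> List[str]:
    """One dct:subject statement per taxonomy tag on a document."""
    subject = iri(DOC_BASE + doc_id)
    return [
        f"{subject} {iri(DCT + 'subject')} {iri(CONCEPT_BASE + c)} ."
        for c in concept_ids
    ]


statements = concept_triples(
    "breast-cancer", "Breast Cancer",
    alt_labels=["Malignant Neoplasm of the Breast", "Breast Ca"],
    broader_ids=["breast-disorders"],
) + tag_triples("article-1", ["breast-cancer"])

print("\n".join(statements))
```

Because the tag points at a concept URI rather than a text label, the same URI can later be equated with URIs in other linked data sets, which is the portability argument the slide makes for SKOS.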
17. LDR Semantic Infrastructure
[Architecture diagram; recoverable labels follow]
• Linked Data Loader (REST)
• Linked Data Space Services: Vocab & Annotation Satellites, Linked Data Satellites, Asset Data Satellites, 3rd Party RDF Vocab
• Smart Content Indexing Pipeline: Tagging and Indexing, EMMeT Vocabulary, Ontology Svcs, RDF Generation
• Content Services: Elsevier Content (Concepts, Chapters, Articles, Guidelines, etc.), 3rd Party Content, Content Instit.
• Linked Data Pipeline Services (Hadoop): SKOS Generation, Semantic Interlinking, RDF Validation, Reasoning, Transform, Extract; N-Quads, JSON, …
• AWS Cloud Management; Network
• Discovery Services (Semantic Knowledgebase): Amazon S3, MongoDB NoSQL, SOLR/SIREn, Virtuoso Triplestore
• Product-specific: Smart Content Atom Feed, Search Index, Discovery Svc API (REST), Ontology Service, SPARQL
• Access & Entitlements, Admin & Monitoring, Analytics, Alerts
18. Elsevier Smart Content In Action
Applications powered by Smart Content:
– Semantic search for practitioners and medical researchers
– Expose medical taxonomies in SKOS
– Crossref collaboration of scholarly publishers and funding agencies
– Lancet application mashups on specialty health topics
– SciVerse applications
– Clinical Decision Support Drug Research
19. LDR API Access To Article Metadata
20. Trend Analysis Of Special Health Topics
21. Comprehensive Drug Research
• Moving world-class content online to Point of Care.
• Extracted knowledge is linked for further enrichment.
• Information is condensed, immediate and actionable.
22. Linking Patient Data To Evidence-Based Research
– Discover knowledge from research relevant to a patient profile
– Alerts on FDA Announcements.
23. SciVerse Widgets Powered by Smart Content
Article search on ScienceDirect results in related
specialty content recommendations available from
The Lancet Journal.
24. Smart content is a bridge to the future of publishing
• Smart content allows publishers to create new products
and services through structuring content for better
discovery, insight and utility
– The value is in the structure
– Creating that structure is hard work
– The kind of hard work that publishers have
traditionally focused on
• New consumer Internet businesses are using open
source software and the cloud to add structure to content
today… quickly and on the cheap
• Publishers and societies both large and small can use
the same techniques to follow suit
25. Thank you.
Alan Yagoda
a.yagoda@elsevier.com