Our Embase expert, Ian Crowlesmith, showed us:
- The main indexing principles for Embase
- How Embase drug and disease indexing can optimize your searches and results.
Embase for biomedical answers - Indexing - webinar - 21 Nov 2012
1. Welcome to our Embase webinar!
Embase for biomedical searching.
Indexing and retrieval
Your host: Ann-Marie Roche Your presenter: Ian Crowlesmith
2. Need to know
• Webinar control panel:
• "chat" or "ask a question" for questions and comments
• Option for full screen view
• Q&A after presentation
3. Review past Embase webinars on our info site.
NEW webinar schedule by end of 2012.
4. Agenda
• Anatomy of an index
• Principles and history (in brief)
• Focus on drugs: in-depth indexing
- drugs and diseases
- subheadings
- trade names, manufacturers, CAS numbers
• Indexing topics A-Z: some highlights
- major terms
- Emtree / backposting
- check tags & limits
- study types & topic terms
- mapping from MEDLINE
- automatic indexing
- medical devices
- numerical indexing
Reference: Embase Indexing guide
See: http://www.embase.com/info/what-is-embase/emtree
11. DRUG TERMS AND SUBHEADINGS
drug of major focus (A term)
drug subheading
disease treated
adverse drug reactions
comparator drug
route by which drug was
administered
other drug subheadings
12. DRUG TERMS AND SUBHEADINGS
A aclidinium bromide drug of major focus (A term)
drug subheading
disease treated
adverse drug reactions
comparator drug
route by which drug was
administered
other drug subheadings
13. DRUG TERMS AND SUBHEADINGS
A aclidinium bromide drug of major focus (A term)
drug therapy drug subheading
chronic obstructive lung disease disease treated
adverse drug reactions
comparator drug
route by which drug was
administered
other drug subheadings
14. DRUG TERMS AND SUBHEADINGS
A aclidinium bromide drug of major focus (A term)
drug therapy drug subheading
chronic obstructive lung disease disease treated
adverse drug reaction
coughing adverse drug reactions
diarrhea
ECG abnormality
headache
pruritus
rhinopharyngitis
tooth pain
comparator drug
route by which drug was
administered
other drug subheadings
15. DRUG TERMS AND SUBHEADINGS
A aclidinium bromide drug of major focus (A term)
drug therapy drug subheading
chronic obstructive lung disease disease treated
adverse drug reaction
coughing adverse drug reactions
diarrhea
ECG abnormality
headache
pruritus
rhinopharyngitis
tooth pain
drug comparison
formoterol fumarate comparator drug
route by which drug was
administered
other drug subheadings
16. DRUG TERMS AND SUBHEADINGS
A aclidinium bromide drug of major focus (A term)
drug therapy drug subheading
chronic obstructive lung disease disease treated
adverse drug reaction
coughing adverse drug reactions
diarrhea
ECG abnormality
headache
pruritus
rhinopharyngitis
tooth pain
drug comparison
formoterol fumarate comparator drug
(route of drug administration)
inhalational drug administration route by which drug was
administered
other drug subheadings
17. DRUG TERMS AND SUBHEADINGS
A aclidinium bromide drug of major focus (A term)
drug therapy drug subheading
chronic obstructive lung disease disease treated
adverse drug reaction
coughing adverse drug reactions
diarrhea
ECG abnormality
headache
pruritus
rhinopharyngitis
tooth pain
drug comparison
formoterol fumarate comparator drug
(route of drug administration)
inhalational drug administration route by which drug was
administered
other subheadings
clinical trial other drug subheadings
drug dose
18. DRUG TERMS AND SUBHEADINGS
B formoterol fumarate drug with minor focus (B term)
drug therapy drug subheading
chronic obstructive lung disease disease treated
adverse drug reaction
coughing adverse drug reactions
diarrhea
headache
pruritus
rhinopharyngitis
tooth pain
drug comparison
aclidinium bromide comparator drug
(route of drug administration)
inhalational drug administration route by which drug was
administered
other subheadings
clinical trial other drug subheadings
19. DISEASES AND OTHER TERMS
disease with major focus (A)
disease subheading
drugs treating disease
study type check tags
human study type check tag
sex and age check tags
other check tags
minor focus (B) terms
20. DISEASES AND OTHER TERMS
A chronic obstructive lung disease disease with major focus (A)
disease subheading
drugs treating disease
study type check tags
human study type check tag
sex and age check tags
other check tags
minor focus (B) terms
21. DISEASES AND OTHER TERMS
A chronic obstructive lung disease disease with major focus (A)
drug therapy disease subheading
aclidinium bromide, formoterol* drugs treating disease
study type check tags
* i.e. formoterol fumarate
human study type check tag
sex and age check tags
other check tags
minor focus (B) terms
22. DISEASES AND OTHER TERMS
A chronic obstructive lung disease disease with major focus (A)
drug therapy disease subheading
aclidinium bromide, formoterol* drugs treating disease
B randomized controlled trial study type check tags
B controlled study
B crossover procedure * i.e. formoterol fumarate
B double blind procedure
B phase 2 clinical trial
B multicenter study
human study type check tag
sex and age check tags
other check tags
minor focus (B) terms
23. DISEASES AND OTHER TERMS
A chronic obstructive lung disease disease with major focus (A)
drug therapy disease subheading
aclidinium bromide, formoterol* drugs treating disease
B randomized controlled trial study type check tags
B controlled study
B crossover procedure * i.e. formoterol fumarate
B double blind procedure
B phase 2 clinical trial
B multicenter study
B major clinical study human study type check tag
sex and age check tags
other check tags
minor focus (B) terms
24. DISEASES AND OTHER TERMS
A chronic obstructive lung disease disease with major focus (A)
drug therapy disease subheading
aclidinium bromide, formoterol* drugs treating disease
B randomized controlled trial study type check tags
B controlled study
B crossover procedure * i.e. formoterol fumarate
B double blind procedure
B phase 2 clinical trial
B multicenter study
B major clinical study human study type check tag
B adult sex and age check tags
B female
B male
B human other check tags
minor focus (B) terms
25. DISEASES AND OTHER TERMS
A chronic obstructive lung disease disease with major focus (A)
drug therapy disease subheading
aclidinium bromide, formoterol* drugs treating disease
B randomized controlled trial study type check tags
B controlled study
B crossover procedure * i.e. formoterol fumarate
B double blind procedure
B phase 2 clinical trial
B multicenter study
B major clinical study human study type check tag
B adult sex and age check tags
B female
B male
B human other check tags
B bronchodilatation minor focus (B) terms
B forced expiratory volume
B forced vital capacity
B powder inhaler
B drug dose comparison
B evening dosage
B morning dosage
26. DRUG TRADE NAMES WITH LINKED
MANUFACTURER NAMES
drug trade name & drug manufacturer
name & country code
MEDICAL DEVICE TRADE NAMES WITH LINKED
MANUFACTURER NAMES
device trade name & device
manufacturer name & country code (for
2 devices)
CLINICAL TRIAL NUMBERS
clinical trial number repository & clinical
trial number
27. DRUG TRADE NAMES WITH LINKED
MANUFACTURER NAMES
foradil drug trade name & drug manufacturer
Novartis CHE name & country code
MEDICAL DEVICE TRADE NAMES WITH LINKED
MANUFACTURER NAMES
device trade name & device
manufacturer name & country code (for
2 devices)
CLINICAL TRIAL NUMBERS
clinical trial number repository & clinical
trial number
28. DRUG TRADE NAMES WITH LINKED
MANUFACTURER NAMES
foradil drug trade name & drug manufacturer
Novartis CHE name & country code
MEDICAL DEVICE TRADE NAMES WITH LINKED
MANUFACTURER NAMES
Genuair device trade name & device
Almirall ESP manufacturer name & country code (for
Aerolizer 2 devices)
Novartis CHE
CLINICAL TRIAL NUMBERS
clinical trial number repository & clinical
trial number
29. DRUG TRADE NAMES WITH LINKED
MANUFACTURER NAMES
foradil drug trade name & drug manufacturer
Novartis CHE name & country code
MEDICAL DEVICE TRADE NAMES WITH LINKED
MANUFACTURER NAMES
Genuair device trade name & device
Almirall ESP manufacturer name & country code (for
Aerolizer 2 devices)
Novartis CHE
CLINICAL TRIAL NUMBERS
ClinicalTrials.gov clinical trial number repository & clinical
NCT01120093 trial number
31. Embase indexing principles
1. TRANSLATE
To bring the semantic richness of medical terminology
within your grasp: mapping many synonyms to a single
(natural language) preferred terminology
2. EXPAND
To expose and summarize the information in biomedical
articles beyond title and abstract: discovering in-depth
data about drugs, diseases and medical devices
3. FOCUS
To identify the key concepts hidden within those articles –
what they are really about – providing you with a toolkit to
find answers beginning with comprehensive searches
32. Embase history of indexing
Timeline:
1 (1947) Excerpta Medica: first 9 independent abstract journals with natural language indexing
2 (1963) Controlled vocabulary: indexing unified into a controlled vocabulary, with synonyms
3 (1987) Emtree + subheadings: tree structure added based on MeSH, the birth of Emtree
4 (1990) Item types: introduction of 8 item types (aka publication types)
5 (1993) Additional check tags (RCTs): extension of check tags with the first of several new EBM terms
6 (2009) Automatic indexing: conference abstracts & In-process records automatically indexed
7 (2012) Medical devices: Emtree extended with over 600 new medical device terms
33. Drugs & diseases: in depth indexing
Embase Indexing Guide, section 5.3.3
“Drug terms are index terms used for all drugs and chemicals: not only
therapeutic drugs, but also endogenous compounds, laboratory chemicals
and environmental chemicals or toxins. It is important to realise that “drugs”
as described in Embase may refer to any chemical entity.”
All significant mentions are indexed *
New drugs as candidate terms *
CAS registry numbers generated
All generic drug names in Emtree
Emtree updated 3x per year *
Indexed with subheadings *
Key subheadings indexed as triples
Drug trade names also indexed
Also drug manufacturer names
* applies to disease terms as well as drugs
35. Drugs & diseases: finding new drugs
Step 1: identifying the drug category
36. Drugs & diseases: finding new drugs
Step 2: searching using the drug category
37. Drugs & diseases: finding new drugs
Step 2a: candidate terms – “not found” in Emtree
38. Drugs & diseases: finding new drugs
Step 3: identifying trade names (codes) & manufacturers
From drug index for this article
39. Drugs & diseases: finding new drugs
Step 3a: more information on trade names / lab codes
40. Drugs & diseases: finding new drugs
Step 4: identifying chemical structures
41. Drugs & diseases: finding new drugs
Step 5: PMID → comparison with PubMed indexing
42. Drugs & diseases: drug subheadings
5 Key subheadings: ADR, drug therapy, drug combination, drug comparison, drug interaction
12 Other subheadings: clinical trial, drug administration, drug analysis, drug concentration,
drug development, drug dose, drug toxicity, endogenous compound, pharmaceutics,
pharmacoeconomics, pharmacokinetics, pharmacology
Plus: 47 routes of drug administration
43. Drugs & diseases: drug subheadings
Scope notes
Reference for all scope notes: Embase Indexing guide
See: http://www.embase.com/info/what-is-embase/emtree
46. Disease subheadings
Key subheadings: side effect, drug therapy
12 Other subheadings: complication, congenital disorder, diagnosis, disease management,
drug resistance, epidemiology, etiology, prevention, radiotherapy, rehabilitation, surgery, therapy
47. Drug & disease subheadings
Floating subheadings
Question: how can I search subheadings on their own, without
specifying in advance what they are linked to?
e.g. pharmacokinetics (searched using subheading /dd_pk)
48. Agenda
• Anatomy of an index
• Principles and history (in brief)
• Focus on drugs: in-depth indexing
- drugs and diseases
- subheadings
- trade names, manufacturers, CAS numbers
• Indexing topics A-Z: some highlights
- major terms
- Emtree / backposting
- check tags & limits
- study types & topic terms
- mapping from MEDLINE
- automatic indexing
- medical devices
- numerical indexing
Reference: Embase Indexing guide
See: http://www.embase.com/info/what-is-embase/emtree
51. Indexing with major terms
History
[Chart: diazepam indexed as a major term (%), 0% to 100%, by publication year, 1980 to 2010]
52. Emtree & backposting
Emtree is now updated 3x each year
Terms added and changed are listed:
http://www.embase.com/info/what-is-embase/emtree
New terms backposted to past years:
a) if indexed earlier as candidate terms
b) if previous PTs are assigned as synonyms
54. Check tags
Embase Indexing Guide, section 5.3.2 & Appendix 2
“Check tags … represent a special group of general terms whose definitions
are described by scope notes, and which are assigned by indexers using a
check list to ensure the highest possible consistency of indexing.”
Category: Examples
Item types: article, review, letter, editorial, conference abstract
Human study types: human, major clinical study, case report, human experiment, human cell
Animal study types: nonhuman, animal model, animal experiment, animal cell
Sex and age: male, female, newborn, child, adolescent, adult, aged
Clinical trials & EBM: randomized controlled trial, meta analysis, double blind procedure, systematic review, diagnostic test accuracy study
Reference (incl. scope notes): Embase Indexing guide
See: http://www.embase.com/info/what-is-embase/emtree
56. Other limits
Animal limit
animal:de
OR
'invertebrate'/exp OR 'amphibia'/exp OR 'fish'/exp OR 'boreoeutheria'/exp OR 'afrotheria'/exp OR
'dermoptera'/exp OR 'glires'/exp OR 'scandentia'/exp OR 'sauropsid'/exp OR 'laurasiatheria'/exp OR
'ungulate'/exp OR 'reptile'/exp OR 'cercopithecidae'/exp OR 'marsupial'/exp OR 'monotremate'/exp OR
'prosimian'/exp OR 'tarsiiform'/exp OR 'hylobatidae'/exp OR 'xenarthra'/exp OR 'platyrrhini'/exp OR
'chimpanzee'/exp OR 'gorilla'/exp OR 'orang utan'/exp OR 'homo neanderthalensis'/exp OR
'cephalochordata'/exp OR 'hyperotreti'/exp OR 'urochordata'/exp OR 'ambulacraria'/exp OR
'coelomata'/exp OR 'protostomia'/exp OR 'pseudocoelomata'/exp OR 'coelenterate'/exp OR
'mesozoa'/exp OR 'placozoa'/exp OR 'porifera'/exp OR 'juvenile animal'/exp OR 'male animal'/exp OR
'female animal'/exp
OR
'primate'/de OR 'haplorhini'/de OR 'mammal'/de OR 'catarrhini'/de OR 'simian'/de OR 'ape'/de OR
'amniote'/de OR 'tetrapod'/de OR 'vertebrate'/de OR 'chordata'/de OR 'deuterostomia'/de OR
'bilateria'/de OR 'therian'/de OR 'hominid'/de OR 'euarchontoglires'/de OR 'placental mammals'/de
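The animal limit above is a mechanical combination of three parts: the `animal:de` field search, a list of terms searched with explosion (`/exp`), and a list of broad terms searched exactly (`/de`). A minimal Python sketch of assembling such a query string follows. The helper names `tagged` and `build_limit` are hypothetical, only a handful of the terms from the slide are included, and the result is just a string to paste into a search box, not a call to any Embase API.

```python
def tagged(term: str, suffix: str) -> str:
    """Quote an Emtree term and append a search suffix such as exp or de."""
    return f"'{term}'/{suffix}"


def build_limit(field_search: str, exploded: list[str], exact: list[str]) -> str:
    """Join a field search, exploded terms and exact terms with OR."""
    parts = [field_search]
    parts += [tagged(t, "exp") for t in exploded]
    parts += [tagged(t, "de") for t in exact]
    return " OR ".join(parts)


# A shortened subset of the terms listed on the slide (illustration only).
EXPLODED = ["invertebrate", "amphibia", "fish"]
EXACT = ["primate", "mammal", "vertebrate"]

query = build_limit("animal:de", EXPLODED, EXACT)
print(query)
```

Broad terms such as 'mammal' appear in the exact list rather than the exploded list, presumably so that exploding them does not also pull in records indexed with human terms.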
57. Study types and topic terms
From Emtree facet “Types of article or study”
Include many more terms than check tags
Now include “Topic terms” (from 2011)
Topic terms: 10 terms ending with (topic)
Topic terms were introduced in 2011 to differentiate
between study types, indexed when the article IS the
primary report for an RCT (for example), and articles
in which that term is only a topic that is discussed.
58. MEDLINE records in Embase
Over 9m records in Embase are licensed from NLM
They currently come from over 2500 active journals
… distributed over all years (1940 – present)
280,000 recs /yr
1974
59. Mapping from MEDLINE
MEDLINE indexing is mapped to Emtree terms
See white paper: Coverage of MEDLINE on Embase
at: http://www.embase.com/info/whitepapers-and-downloads
“When MeSH terms are mapped to Emtree, subheadings are mapped to Embase subheadings. Since not all MeSH subheadings have an exact Emtree equivalent, some of them generate Emtree terms rather than subheadings.”
61. Drug Indexing: Emtree vs PubMed
“2-12 times more information is retrieved from Embase”
See white paper: Differences between Emtree and MeSH
at: http://www.embase.com/info/whitepapers-and-downloads
The next step …
Limit to major focus
Use check tags
Apply search limits
Refine search using filters
62. Automatic indexing with Emtree
Why index automatically?
1. When data is limited to TI, AB (conference abstracts)
2. When data is provisional (articles in press, in process recs)
How is it done?
1. Term recognition (incl. mapping synonyms => PTs)
2. Morphological variants (e.g. plural => singular)
3. Term-specific rules (e.g. human)
Not included:
• subheadings
• candidate terms
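The first two steps on this slide (term recognition with synonym mapping, and morphological variants) can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the tiny dictionaries stand in for Emtree, the function names are hypothetical, the plural rule is deliberately naive, and nothing here reflects how the Embase production system is actually implemented.

```python
# Hypothetical miniature thesaurus: synonym -> preferred term (PT).
# The two mappings below are the ones mentioned in this webinar.
SYNONYM_TO_PT = {
    "kalydeco": "ivacaftor",
    "foradil": "formoterol fumarate",
}
PREFERRED_TERMS = {"ivacaftor", "formoterol fumarate", "headache", "powder inhaler"}


def singular(word: str) -> str:
    """Very naive morphological normalisation: strip a trailing plural 's'."""
    if word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word


def auto_index(text: str) -> list[str]:
    """Return the preferred terms recognised in a title or abstract."""
    words = [singular(w.strip(".,;:()").lower()) for w in text.split()]
    found = []
    # Try two-word phrases first, then single words.
    for size in (2, 1):
        for i in range(len(words) - size + 1):
            phrase = " ".join(words[i:i + size])
            term = SYNONYM_TO_PT.get(phrase, phrase)
            if term in PREFERRED_TERMS and term not in found:
                found.append(term)
    return found


print(auto_index("Kalydeco and Foradil: headaches with powder inhalers."))
```

Consistent with the "Not included" list on the slide, the sketch assigns neither subheadings nor candidate terms: anything outside the term lists is simply ignored.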
63. Medical devices
Expanded from 900 to 1500 devices (2012)
Medical devices indexed in 1.37 million records
More devices in 2013
64. Numerical indexing
CAS registry numbers: 873054-44-5:rn (ivacaftor)
Molecular sequence numbers: GU372418:ms (a 16s-rRNA gene)
Clinical trial registration numbers: NCT00568503:cn (QAX028 trial)
MSNs: 25,000 recs / yr
CTNs: 10,000 recs / yr
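As a small illustration of the three numerical fields above, the following Python sketch picks a field label for an identifier. The function names are hypothetical. The CAS check uses the standard CAS check-digit rule (digits weighted 1, 2, 3, ... from the right, sum modulo 10), the trial number pattern assumes the ClinicalTrials.gov format of "NCT" plus eight digits, and the fallback to `:ms` is purely an assumption made for the example.

```python
import re


def cas_is_valid(cas: str) -> bool:
    """Check the format and check digit of a CAS registry number."""
    if not re.fullmatch(r"\d{2,7}-\d{2}-\d", cas):
        return False
    digits = cas.replace("-", "")
    body, check = digits[:-1], int(digits[-1])
    # Weight the digits 1, 2, 3, ... starting from the right-hand end of the body.
    total = sum(int(d) * w for w, d in enumerate(reversed(body), start=1))
    return total % 10 == check


def numeric_search(identifier: str) -> str:
    """Append the field label used on the slide (:rn, :cn or :ms)."""
    if cas_is_valid(identifier):
        return f"{identifier}:rn"
    if re.fullmatch(r"NCT\d{8}", identifier):
        return f"{identifier}:cn"
    # Assumption: anything else is treated as a molecular sequence number.
    return f"{identifier}:ms"


for value in ("873054-44-5", "NCT00568503", "GU372418"):
    print(numeric_search(value))
```

Validating the check digit before searching is a cheap way to catch a mistyped CAS number, which would otherwise just return zero results.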
67. Summary
Embase for biomedical searching.
Indexing and retrieval
Translate Expand Focus
Every record in Embase is indexed to help you identify the
records you need to find biomedical answers.
Embase indexing focuses on drugs, diseases and medical
devices: these concepts are indexed in-depth.
Essential support is provided by tools such as Emtree backposting,
check tags, major focus terms and numerical indexing.
68. • A Q&A will be sent by email.
• For more information and questions, please contact
bdtraining@elsevier.com
• This is our last Embase webinar for 2012. A new
schedule will be available soon for 2013.
• Go to http://www.embase.com/info/embase-webinars
for all Embase webinars and archives.
Please fill out the survey that appears on your screen after leaving the webinar.
Editor's notes
Good afternoon / good morning. In this webinar I would like to tell you about how we index Embase, and how you can make best use of our indexing to find the information you are looking for. I’ll begin by showing you how we index a typical article, using an example that has most of the features that I want to tell you about. Then I’ll follow that up with a few words about our approach to indexing, and show you how Embase indexing has evolved over time. The heart of this webinar will be about how Embase indexes drugs and diseases, what we call “in-depth indexing”. After that I will describe a number of indexing tools Embase offers outside the area of drugs and diseases, which I hope will help you to find the information you are searching for. I may not have time to address all these highlights, and I have greyed out a few topics that I don’t expect to have time for. However, if you download this webinar later from the Embase info site, you will find all these topics included. I do want to emphasize that the Embase approach to indexing is described in an Indexing Guide that you can find on our Info site at the URL shown at the bottom of this page. This is a synopsis of the manual used by our indexers, but still includes quite a bit of detail I won’t have time to describe in the 30 minutes or so allotted to this presentation. However, if you have any questions about these aspects I will be very happy to try and answer them during the webinar.
First of all, I will show you how we index a typical article. This is an article we indexed recently, with the concepts highlighted that in one way or another are represented in the Embase indexing. The concepts on this page come from the title and abstract …
… but if you look further into the article, for example the Methods section, you see there are quite a few index concepts that we pick up here as well …
And from the heart of the article, from tables as well as text, and from figures as well, our indexers identify information that we think is important to index, such as (here) information about adverse drug events. In general, Embase indexers are seeking to identify important original information that is presented in the article, rather than hypotheses or ideas that might be mentioned in the discussion. For review articles, our indexers summarize the key points in the review, and other types of article such as letters to the editor may have a mix of original and non-original information that is indexed.
For this article, the indexing process resulted in all the terms shown here, divided as you see into Drug terms, Disease terms, Other terms and Additional information. At the right hand side of each of the blue bars you can “open” the drug and disease terms, which is what I will show you next.
Here first of all is the opened drug indexing. What you see is that each of the drug terms, here identified in the light blue bars, is qualified by a number of other terms. We call these subheadings, which may be “key subheadings” or “other subheadings”. Key subheadings have additional information shown in the middle column, which we call triple links. I’ll be showing more examples of this shortly.
Here is the corresponding disease indexing. In this case it is quite lengthy, because each of the so-called disease terms is actually a side effect of one or more drug terms. OK, what I will now show you is how our indexers go about identifying all these index terms.
What you see here is an empty index form such as our indexers use. The job of the indexers is to fill in the blanks, and they move from one topic to the next while scanning the full text of the article, but paying a little more attention to key components such as title, abstract, methods and tables, as we have seen. I won’t show all of the index terms identified, but will focus on just a couple of drugs and diseases. Here we go!
First the important drugs. Notice that this aclidinium bromide is identified as a major focus term, or “A” term.
This is followed by the key drug subheadings and ancillary information, the triple links. Here, chronic obstructive lung disease is identified as the disease that is being treated by aclidinium bromide. Please note that the major focus is assigned to the drug term, not to the subheading. This is different from the approach taken by MEDLINE.
The next subheading is adverse drug reaction, and the triple linked information shows a range of side effects derived from the table which I showed a few minutes ago.
This is followed by another key subheading: drug comparison …
… and another, the route of drug administration (which is actually indexed directly as a subheading) …
… and finally a set of other subheadings, here clinical trial and drug dose.
The minor focus terms, or B terms, are indexed in exactly the same way. Important to note is that the side effects may actually reflect the absence of a side effect. If you check the original table, you can see that this was the case for tooth pain and diarrhea. The key thing is that the side effect was looked for. Therefore its absence is an important aspect of the article.
The disease terms are indexed in an exactly complementary manner …
Here is the disease …
… linked of course to aclidinium bromide and formoterol – this is the inverse of the triple links we saw before. This indexing is not actually done twice, but it is presented to you, the user, from both points of view. We could have shown a similar picture for each of the side effects but I’ll skip that here …
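The idea that a link is indexed once but shown from both the drug side and the disease side can be made concrete with a short Python sketch. The `Triple` type and the two view functions are hypothetical names chosen for illustration, the data is a small subset of the example article's index, and this is not a description of how Embase stores its records internally.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    drug: str
    subheading: str
    linked_term: str


# A few of the links from the example article, stored once.
TRIPLES = [
    Triple("aclidinium bromide", "drug therapy", "chronic obstructive lung disease"),
    Triple("aclidinium bromide", "adverse drug reaction", "headache"),
    Triple("aclidinium bromide", "drug comparison", "formoterol fumarate"),
    Triple("formoterol fumarate", "drug therapy", "chronic obstructive lung disease"),
]


def drug_view(triples):
    """Group links by drug: drug -> subheading -> linked terms."""
    view = defaultdict(lambda: defaultdict(list))
    for t in triples:
        view[t.drug][t.subheading].append(t.linked_term)
    return view


def disease_view(triples):
    """Invert the drug therapy links: disease -> drugs treating it."""
    view = defaultdict(list)
    for t in triples:
        if t.subheading == "drug therapy":
            view[t.linked_term].append(t.drug)
    return view


print(dict(disease_view(TRIPLES)))
```

Both views are derived from the same list, which mirrors the point in the notes: the indexer records each link once, and the display inverts it on demand.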
Beyond the drugs and diseases, Embase indexers pay particular attention to the check tags. This is a list of around 50 concepts that are described by scope notes, and which we ask indexers to pay particular attention to. As you can see, this article is not just a clinical trial (which was identified as a drug subheading), but in fact a phase 2 trial that is also a randomized controlled trial. We are particularly keen that terms like these are identified by our indexing staff.
This is followed by other check tags – here: major clinical study (meaning that more than 50 patients participated) …
… and information about the gender and age of the patients, and the all important “human” tag.
Finally there is a group of other index terms which, although identified here as “minor”, represent important aspects of the main focus of the article – for example dosage information. Everything I have shown you so far is controlled by the Embase thesaurus, Emtree. But there is more!
Here, you see a part of the index form used for indexing drug & device trade names and manufacturers, as well as clinical trial numbers.
One of the drugs is a trade name registered by Novartis in Switzerland. If you check in Emtree, you will find that foradil is a trade name for formoterol fumarate.
Here are a couple of device trade names indexed for this article. As for the drug trade names, these are linked to their manufacturer names and country, if this information is present in the article. Drug trade names have been indexed in Embase since 1974, and device trade names since 1998.
Finally, the clinical trial registration number, which in the Embase.com display links out to ClinicalTrials.gov, as you can see near the bottom of the next slide.
So here once more is the end result. For comparison, you can see here the MEDLINE index for the same record. As expected, given the Embase drugs focus, the Embase index is significantly more detailed, especially for drug-related indexing. This is an important difference between Embase and MEDLINE. What I would like to do now is to drill down into some of these aspects of indexing.
But first, let’s take a high level view of what is happening here. Probably like all databases, Embase indexing serves three main roles: to translate, expand, and focus. What is specific to Embase is how we interpret each of these aspects. For example: under the heading translation, what is special about Embase is the natural language terminology used for indexing, and the very large number of synonyms; expansion stands for the in-depth indexing of drugs, diseases and in fact also medical devices, which is derived from the full text of the article; and focus refers to the specific tools that Embase makes available to help you drill down to the articles you need.
Beyond these principles, it is helpful to bear in mind the actual history of indexing at Embase, which I will highlight in this slide. When Embase, or Excerpta Medica as it was known then, was founded in 1947, there was already a focus on natural language indexing. About 15 years later, the indexing was consolidated into a single controlled vocabulary, initially with 25,000 terms and 50,000 synonyms. The vocabulary grew quickly, reaching 250,000 preferred terms by 1987. This was when we decided to organize the top 35,000 terms (accounting for over 90% of the uses) into a tree structure called Emtree. Emtree was deliberately modelled on MeSH, which is why even today there are great similarities between the two thesauri. In fact, during the 90’s we incorporated all MeSH terms into Emtree, and continue to update Emtree with new MeSH terms every year. This is why, today, you can confidently search Embase using MeSH terms. But returning to 1987 … the other major change was the introduction, for the first time, of drug and disease subheadings. We chose to implement a core set of subheadings similar but not identical to the MeSH subheadings available in MEDLINE. A few years later we introduced our 8 item types, also known in Embase.com and other platforms as publication types. Our indexing rule is that every record is identified by a single item type. This is somewhat different from the approach taken by MEDLINE. All along we had emphasized the importance of a core set of 40-50 index terms known as check tags, which I mentioned before. In 1993 we began extending the check tags with terms relevant for EBM, beginning with “randomized controlled trial”. More recently, in 2009, we introduced automatic indexing to allow us to begin indexing conference abstracts. We could not have done this manually; and automatic indexing ensures that the key concepts are retrievable via Emtree. The growth of Emtree, and before 1987 the Embase controlled vocabulary, is spread over more than 65 years.
Emtree now counts more than 60,000 preferred terms and nearly 300,000 synonyms – and it is still growing. Most recently, we have focused on medical devices terminology, with over 600 new devices added in 2012. So … to summarize, let me just say that although new features have been introduced in every decade, one feature is timeless. This is that, even though Emtree was only introduced in 1987, you can use Emtree to search right back to the earliest record in 1947. Of course, indexing tools such as subheadings are only applicable back to the time they were first introduced. As you can imagine, it wasn’t possible for us to re-index Embase articles from earlier than 1987 with subheadings!
What I would like to do now is a deep dive into how we index drugs on Embase. The reference here is to the Embase Indexing guide, where you can find much more information than I can tell you here. Perhaps most important is that all “significant mentions” of drug terms are indexed. “Significant” means: whenever there is new information presented about a drug, such as all the side effects. This means that Embase typically indexes articles with many more drugs than PubMed, for example. And if the indexer finds a drug term that is not yet in Emtree, he still indexes it as a so-called candidate term. As far as Emtree is concerned, we want to ensure that new terms are recognised as soon as possible. To do that, we add new generic terms to Emtree proactively, as soon as they are known. And for a couple of years now we have been updating Emtree 3x a year. All drugs and diseases are indexed with one or more subheadings, where possible. And we have also identified 5 subheadings as key subheadings; I will explain what that means shortly. Last but not least, and completely separate from Emtree, we index drug trade names and manufacturer names. Many of these trade names are synonyms in Emtree and can be mapped to generic drug names. But regardless of this mapping, you can search for trade names in their original format in our trade names field.
So let’s look at an example of how this might work, and let’s suppose you are interested in developing new drugs for cystic fibrosis. In this example, I am imagining that you found an article which mentions the treatment of cystic fibrosis with a new drug: kalydeco, a chloride channel stimulator. This might make you wonder what other similar drugs might be under development.
On Emtree, you can quickly identify kalydeco as a synonym of ivacaftor. Ivacaftor was added to Emtree in 2011, and is listed as an “ion transport affecting agent”. From the list of “ion transport affecting agents” it seems there is as yet no term for “chloride channel stimulating agents” (actually, I can divulge that this is coming in the next Emtree release in December). So let’s go with “ion transport affecting agents”. So are there similar drugs under development in this category for treating cystic fibrosis? Even if they are not yet listed in Emtree, you can easily find them. Here’s how you do it! First of all, you can search the entire category by clicking on the link indicated by the red arrow.
The result is shown here: this explosion search with /exp captures all the articles in the entire category. What you can now do is to edit the search to retrieve only records indexed with the category name itself. To do that, you change the explosion to an exact search using /de. Why should you do this? It’s because Embase indexers index using the most specific terms in Emtree. But if the terms they want to index are not yet covered in Emtree, they index the next best thing: its category. In this case, that is “ion transport affecting agent”. As you can see, this already brings the number of results down by nearly three orders of magnitude to 367! Of course, the indexer also indexes the new drug as a candidate term. But as we don’t know what that is, we can’t search it directly. Fortunately there is a way around this. When he indexes a candidate drug term, there is one more thing the indexer has to do. He also indexes the term “unclassified drug” to indicate that he has assigned a candidate term. You can make the results even more specific by searching the disease name as well. The third result looks quite promising, so let’s look at the index. As you can see, quite a few drug names are listed, including ivacaftor, the drug we started with. If you click on this (ivacaftor), you get more information about why it has been indexed in this article. As I mentioned before, the kind of indexing we see here is called “triple indexing”, because the subheading links two other terms … in this case the triples are a drug linked to two other drugs, and a drug linked to a disease. You can’t actually search on triples yet, but you can see in the actual results what is linked with what, as here. Clicking on other terms (vr496) brings up similar information for other drugs. And similarly (vx661) for vx 661. Although you can’t see which drugs are the candidate terms here, it’s quite likely that one or more of the lab codes is in fact a candidate term. For example, if you click on the symbol next to “vx 661” you can check whether the drug is in Emtree.
In this case, “vx 661” is not recognised by Emtree, so it must have been a candidate term
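The narrowing steps in this part of the talk can be written out as query strings. What follows is only a minimal Python sketch. It assumes the quoted-term-plus-suffix form ('term'/exp, 'term'/de) and the Boolean AND operator; the helper names `emtree_term` and `narrow_to_candidates` are hypothetical and not part of any Embase tool, and searching the disease as an explosion is an assumption, since the talk does not say how the disease name was searched.

```python
def emtree_term(term: str, mode: str = "exp") -> str:
    """Format an Emtree term as an explosion (/exp) or exact (/de) search."""
    if mode not in {"exp", "de"}:
        raise ValueError("mode must be 'exp' or 'de'")
    return f"'{term}'/{mode}"


def narrow_to_candidates(category: str, disease: str) -> str:
    """Combine the category name itself, the 'unclassified drug' flag
    and a disease name into one search string (hypothetical helper)."""
    parts = [
        emtree_term(category, "de"),             # category name only, no narrower terms
        emtree_term("unclassified drug", "de"),  # flag set when a candidate term is assigned
        emtree_term(disease, "exp"),             # assumption: disease searched as an explosion
    ]
    return " AND ".join(parts)


# Broad search: the whole category, including all narrower terms
print(emtree_term("ion transport affecting agent"))
# Narrowed search aimed at records carrying candidate drug terms
print(narrow_to_candidates("ion transport affecting agent", "cystic fibrosis"))
```

The point of the sketch is the change of suffix: the same term captures the whole category with /exp, but only records indexed with the category name itself with /de.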
In this same article, the trade names, or lab codes, are also indexed separately, together with their manufacturers. The indexing rule is that all these trade names are also searchable in the descriptor field as drugs. (xx) First of all, the codes highlighted in the red box were all indexed without any change as drug terms in this article. (You can see this by comparing the terms with the drug index at the bottom of the slide.) This means they are either Emtree preferred terms in their own right, or candidate drug terms (like vx 661). (xx) However, the two lab codes on the right don’t seem to have been indexed as drug terms. So what are they?
A quick check in Emtree (now searching with the letters vx) shows that they are in fact the codes for ivacaftor and lumacaftor. What happened in both cases is that the codes were mapped to their generic names, and it is the generic names that appear in the index.
Supplementary slide
Supplementary slide
So far I have talked about how drugs are listed in Emtree, about triple indexing, and about trade names and manufacturer names. What I haven’t really mentioned is the full breadth of subheadings that the indexer has at his disposal, and how to search them. That is the subject of this slide. This is the drug search form in Embase.com. Within this search form, you can find all 17 drug subheadings, together with 47 routes of drug administration.
If, for example, you are interested in “adverse drug reaction”, you can find its scope note here in Emtree. Adverse drug reaction is an interesting one, because it is limited to side effects reported at therapeutic dose ranges. If you were interested in side effects at non-therapeutic levels, you would need to use a different subheading: “drug toxicity”. As well as in Emtree, you can find all of the scope notes in Appendix 3 of the Embase Indexing guide.
So let’s set up a search using this form. Because it’s easy to do, I decided to use the first two subheadings in the list, linked to aspirin. As you see, I can choose to combine the subheadings using AND or OR; I have chosen AND. I have also chosen to search aspirin as a major focus term.
Here are the results. The major focus is represented in the search using /mj. And the adverse drug reaction and clinical trial subheadings are represented by the codes /dd_ae and /dd_ct respectively. The AND logic I chose has resulted in the two subheadings being linked separately to aspirin. I’ve also shown the first record retrieved here: aspirin, or actually its preferred term acetylsalicylic acid, is highlighted because it was one of the search terms. If you click on it (** acetyl …) you can see three triple indexes, with the third one indicating that aspirin is used in this article as drug therapy for a whole range of diseases. Clinical trial, one of the terms we searched, is shown as one of the other subheadings.
You can search the disease subheadings in an analogous way. Here, I have decided to link cystic fibrosis, again searched as a major focus term, with three subheadings: surgery, therapy and drug therapy. The choice of drug therapy is hidden just at the moment because I had to scroll forward in the disease subheadings box to surgery and therapy. In this case, I want to combine these options using OR, because I am interested in any of these modes of therapy. By the way, the subheading “side effect” is the exact mirror image of “adverse drug reaction”: where a drug is linked via the subheading “adverse drug reaction” to a disease, the disease is always linked in the same article by the subheading “side effect” to the causative drug. (**) Anyway, the results of our actual search are shown at the bottom of the slide. As you see, the OR mode of this search resulted in the codes for the three subheadings /dm_dt, /dm_su and /dm_th being grouped together using a comma separator.
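The difference between the AND and OR modes of the search form can be sketched as string building. This is a hedged illustration, not the form's actual logic: `link_subheadings` is a hypothetical helper, the order of suffixes the real form produces may differ, and the code `dm_th` for the therapy subheading is assumed from the `dm_` prefix of the other disease subheading codes.

```python
def link_subheadings(term: str, codes: list[str], logic: str = "AND") -> str:
    """Link subheading codes to a major-focus (/mj) term.

    AND: each subheading is linked separately to the term.
    OR:  the codes are grouped after the term with a comma separator.
    """
    if logic not in {"AND", "OR"}:
        raise ValueError("logic must be 'AND' or 'OR'")
    if not codes:
        raise ValueError("at least one subheading code is required")
    base = f"'{term}'/mj"
    if logic == "AND":
        return " AND ".join(f"{base}/{code}" for code in codes)
    return f"{base}/" + ",".join(codes)


# Drug example from the talk: adverse drug reaction AND clinical trial
print(link_subheadings("acetylsalicylic acid", ["dd_ae", "dd_ct"], "AND"))
# Disease example: drug therapy OR surgery OR therapy (dm_th is an assumed code)
print(link_subheadings("cystic fibrosis", ["dm_dt", "dm_su", "dm_th"], "OR"))
```

With AND, a record must carry every listed subheading on the term; with OR, any one of them is enough, which is why the OR form can be written as a single term with a comma-separated list.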
So, one final point about subheadings: how can you search for subheadings on their own, without specifying in advance what they are linked to? The answer is illustrated here, using “pharmacokinetics” as an example. (**) You could, of course, combine the subheading in an explosion search with a whole category, if you know which drug category you are interested in. (**) But if not, here is how you can do the search: using the LNK field label. (**) A search with “pharmacokinetics” as a normal Emtree term gives quite different results. The 140,000 results of search #3 are articles indexed with this term itself, and not, or at least not necessarily, with the subheading. (**) You could also do an explosion search using pharmacokinetics, as in this final search. (**) As you can see from the Emtree inset, this retrieves records indexed with any one of a great number of terms (only the first 3 are shown here), and even this does not retrieve all of the articles in set #1. I haven’t shown it here, but if you combine sets #1 and #4, the overlap is only about 180,000 records. The lesson here is that you have to be quite careful in how you formulate your search. Embase provides several tools, but it is up to you to decide how to deploy them.
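To make the contrast concrete, the three ways of searching “pharmacokinetics” discussed here can be laid side by side. This is a sketch under assumptions: the `:lnk` field-label form is inferred from the LNK label named in the talk, a "normal Emtree term" search is taken to be /de, and the function names are hypothetical.

```python
def subheading_only(subheading: str) -> str:
    """Search a subheading on its own via the LNK field label (assumed ':lnk' form)."""
    return f"'{subheading}':lnk"


def as_emtree_term(term: str, explode: bool = False) -> str:
    """Search the same word as an Emtree index term, exact (/de) or exploded (/exp)."""
    return f"'{term}'/" + ("exp" if explode else "de")


searches = {
    "subheading, whatever it is linked to": subheading_only("pharmacokinetics"),
    "Emtree term itself": as_emtree_term("pharmacokinetics"),
    "Emtree term plus narrower terms": as_emtree_term("pharmacokinetics", explode=True),
}

for label, query in searches.items():
    print(f"{label}: {query}")
```

The three strings look similar but address different parts of the record: the first matches the subheading attached to any drug, while the other two match articles whose index contains the term, or one of its narrower terms, as a descriptor.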
OK, this covers the third bullet of my agenda: the focus on in-depth drug indexing. Now I would like to touch on a number of other indexing tools which, as I mentioned at the outset, I hope may help you with your search. As I mentioned, I won’t talk about the greyed out topics, but they will be in the copy of this webinar that you can find via our info site.
The first topic is the indexing with major terms. I have shown how to search this, and I probably don’t need to say very much more about it. Here, for example, is the article about cystic fibrosis that we started with. The major terms are all in bold, and as you see “cystic fibrosis” and most of the drug terms are “major” in this article. Considering the title of the article, that is probably not too much of a surprise. (**) Here is a completely different article. In this case there are several drug index terms, but none of them are major concepts. The major concepts now are all disease and other terms. Again, this should not be too much of a surprise given the topic of the article.
For every term, the frequency with which it is indexed as a major focus term will be a little different. For diazepam, for example, it is a little less than 50%.
Actually, this is only half the story. It turns out that for the past 10 years, diazepam has been indexed as a major term only about 10% of the time. This therefore gives you a great power of discrimination if you need to focus your search on the key results. (**) The same pattern applies to many drugs that have been indexed over the past decades. As you can see, a different policy applied to drug indexing in the 1970s and 1980s, but this has been turned around in the last 20 years.
The second topic on my list was Emtree. Emtree is now updated 3 times each year; you can find lists of added and changed terms at the URL shown on this slide. When we make these updates, we also reload Embase.com to update the indexing of older records whose indexing has changed. We call this backposting. (**) Using the example we searched on earlier, we can see in Emtree that ivacaftor was introduced to Emtree in 2011.
However, if you search on ivacaftor you can find records that are indexed with this term as early as 2007. These earlier records have been backposted, and if you look at their indexes they will indeed display the current term, ivacaftor. The value of this to you as a searcher is that you don’t need to worry about name changes in Emtree. You can confidently search using the current version of Emtree, knowing that you will not miss any records that were indexed slightly differently in the past: they will have been backposted.
The next point on our agenda is check tags. The use of check tags is described in the Embase Indexing guide, and you can find a full list, with scope notes, in Appendix 2. As I have mentioned a couple of times before, there are around 50 check tags, and a few examples are shown in this slide.
You can search check tags either as normal Emtree terms, or in some cases as Advanced Limits. The choice is up to you. (**) For example, here is the check tag for male, showing identical results with both approaches.
While on the topic of limits, I just want to mention one limit that is different. This is the limit to animals. Earlier this year, as part of a reorganization of the Organisms facet in Emtree, this term moved to its rightful place at the head of one of the most important groups of organisms. Unfortunately, this means that in Emtree “humans” are defined as animals. But searchers usually want to search animals with the meaning: all animals except humans. To meet this need we have re-engineered the animal limit to cover all animals except humans. The profile that we have used is shown here.
Now, as you’ll recall, the check tags include a group of study types, such as “major clinical study” and “randomized controlled trial”. Another way of finding study types is to use the Filters at the left hand side of the results page, which I mentioned a few moments ago. (**) All of the study types mentioned here for one particular search are taken from the Emtree facet on “Types of article or study”. (**) This is a rich source of terms that you can use to filter your search, and it now includes 10 so-called topic terms, shown here, that I described last month in an Embase webinar on Evidence Based Medicine. If you want to follow up on this, you can access those slides via the Embase info site. This definition of topic terms, by the way, mirrors the definitions of similar terms which the NLM uses to index MEDLINE.
The last topic I want to address here is the indexing of MEDLINE records on Embase. As you can see here, almost 9 million records on Embase.com are licensed from NLM. You can access them using the search shown here, where the records deduplicated against Embase and Embase Classic are NOT-ed out. (**) These records are spread over all the years of Embase, including the early years before 1974 also covered by Embase Classic.
For all these records, we use the MEDLINE indexing, which we map against Emtree in order to ensure that these MEDLINE records are fully retrievable in all searches using Emtree. (**) The details of how we do the mapping are described in the white paper whose URL is shown here. (**) For example, the mapping for MeSH subheadings has to take into account that for some of them there is no equivalent Embase subheading.
A portion of this mapping is shown here, from Appendix 2 of the white paper. As you can see, a mapping to Embase subheadings has been set up for 19 of the MeSH subheadings. For the remaining 64 MeSH subheadings, we generate Emtree terms, as shown for a selection of these terms at the bottom of the slide.
While I am on the topic of MEDLINE, I want to mention how our in-depth drug indexing policy at Embase affects your search, compared with the equivalent search on PubMed. (**) In the white paper described here, there is a table showing equivalent drug searches for 10 top drugs in 2011. The results are clear: you can retrieve 2-12 times more information from Embase. I want to make clear that this does NOT mean that you have to wade through up to 12 times as much information to find the results you need. What it DOES mean is that with Embase you can be much more certain that the result you need is included in your result set, which may not be the case for the equivalent search on MEDLINE. (**) The next step is up to you: you have all the tools at your disposal that I have described in this webinar, and more: for example limiting to major focus terms, using check tags, applying search limits or refining your search using filters.
The main thing to explain about automatic indexing is why we do it. There are two reasons. First, when we began indexing conference abstracts in 2009, we expected to be adding up to 300,000 records a year, and that is exactly what we are now doing. To index all of these records manually would not have been possible for us. However, knowing that there was no full text other than the abstract itself for these records made it reasonable for us to consider indexing them automatically: we knew that there was no hidden full text that we would be missing. Having decided to set up a procedure for automatic indexing, we felt it would also be worthwhile using this technology to index the provisional data in Articles in Press and In Process records. Even though these records may only be in Embase for a matter of days before they are updated by fully manually indexed records, we felt that there were significant advantages in making these records accessible via Emtree searches. This does mean, however, that you need to be aware of a couple of limitations of automatic indexing. We cannot at the moment index subheadings, and as you can imagine we are unable to index candidate terms. Even so, with automatic indexing we can get very close to the power of manual indexing, for example in the term-specific rules that we are applying to ensure the best possible automatic indexing of concepts such as “human”.
During 2012 we have focused especially on the expansion of medical device terminology in Emtree, and the number of terms has grown by around 60% to over 1500. Using these terms, you can retrieve up to 1.37 million records in Embase, or 5% of the entire database. This is just to illustrate that alongside drugs and diseases, medical devices represent a special strength of Embase. (**) Along with the terms themselves, we have overhauled the Emtree hierarchy for devices, as shown here. These categories represent the key classifications that searchers for device information are interested in.
Extra slide
Extra slide
Extra slide
So, to summarize, I would like to bring up one of the slides I began with, where I wanted to draw your attention to three key components of indexing in Embase and in fact in any database: translation, or mapping using the Emtree thesaurus; expansion, or using indexing to expose the content of the full article; and focus, or applying index terminology to limit your search to exactly the results you need. (**) In Embase, we apply these principles in very specific ways: By using an extensive thesaurus with nearly 300,000 synonyms, we ensure that your search terms are effectively translated to our terminology. By indexing drugs, diseases and medical devices in-depth, we expand the range of your search beyond what you can find in PubMed. And by providing a wide range of tools to help you focus your search, we give you effective control over the search results you get. Thank you for your attention, and good luck with your next search in Embase!
We have come to the end of our Embase webinar. Thank you for attending, and thank you Ian for this fascinating overview of Embase and … You will receive a link to the webinar calendar in a follow-up email, so please feel free to register for as many Embase webinars as you wish, and of course send it to your colleagues. When you leave the session, a survey will pop up. Please tell us what you thought of this webinar and so help us to improve future webinars. You will also receive the Q&A by e-mail shortly, with a link to the slides and recording. Many thanks again, good luck with further exploring Embase, and we hope to meet you again soon.