Automating annotations of the cognitive
neuroimaging literature using ATHENA
Riedel MC, Salo T, Hays J, Turner MD, Sutherland MT, Turner
JA, & Laird AR
Neuroinformatics and
Brain Connectivity Lab
Neuroimaging Research
• Increasing in volume and scope
• Embedded in this literature is knowledge capturing a system-level probing of
functional brain organization
• The challenge for cognitive neuroscience is harnessing this knowledge and
translating it into improved neurocognitive models
[Figure: Published Neuroimaging Studies per year, 2000 to 2014; vertical axis from 0 to 14,000 studies]
Cognitive Paradigm Ontology
• Knowledge modeling effort to study the relationship between
brain structure and function
• Seeks to represent stimuli, responses, and instructions that
define conditions of an fMRI experiment in a standardized
format
• System of labels for annotating neuroimaging articles
Cognitive Paradigm Ontology
Behavioral Domain
Paradigm Class
Diagnosis
Instruction
Context
Stimulus Modality
Stimulus Type
Response Modality
Response Type
Action
Cognition
Emotion
Interoception
Perception
Anger
Fear
Happiness
Sadness
n-back
Face Monitor/Discrimination
Classical conditioning
Delay discounting
Film viewing
Go/No-Go
Autism Spectrum Disorders
Bipolar Disorders
Depression
Normal
Schizophrenia
Attend
Count
Detect
Discriminate
Recall
Disease Effects
Drug Effects
Normal Mapping
Auditory
Tactile
Visual
Digits
Faces
Letters
Pictures
Shapes
Hand
None
Oral/Facial
Button Press
None
Speech
Goals
• Develop framework for automated annotations of neuroimaging articles
• Evaluate classifier performance across variable parameters:
• corpus
• feature space
• classification algorithm
• Characterize relationships between labels by assessing similar vocabularies used
for classification
Problem
• Manual annotation is time-consuming; the field is too large to keep up with
• Bias/human error
Classification Features
• Property or characteristic of something being measured
• Related to explanatory variables in linear regression
• Examples:
• Speech recognition: noise ratios, length of sounds, relative power, filter
matches
• Spam detection: email headers, email structure, language, term frequency
• Character recognition: histogram counts of black pixels in horizontal and
vertical direction, number of internal holes, stroke detection
Background: Studies incorporating direct
comparisons across all phases of bipolar
(BP) disorder are needed to elucidate the
pathophysiology of bipolar disorder.
However, functional neuroimaging studies
that differentiate bipolar mood states from
each other and from healthy subjects are
few and have yielded inconsistent
findings.
Feature Spaces
bag-of-words
Cognitive Atlas
bipolar
bipolar disorder
disorder
bipolar
bipolar mood
bipolar mood states
mood states
mood
states
bipolar disorder
mood
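The two feature spaces listed above can be sketched in a few lines of Python. This is a minimal illustration, not the ATHENA implementation: the function names `ngrams` and `vocabulary_terms` and the two-term `TOY_VOCABULARY` are assumptions made for the example, and the real Cognitive Atlas vocabulary is far larger.

```python
import re

def ngrams(text, max_n=3):
    """Return every contiguous word n-gram of length 1..max_n, lowercased."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    grams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams

def vocabulary_terms(text, vocabulary, max_n=3):
    """Keep only the n-grams that appear in a controlled vocabulary."""
    vocab = {term.lower() for term in vocabulary}
    return [gram for gram in ngrams(text, max_n) if gram in vocab]

# Hypothetical stand-in for a handful of Cognitive Atlas terms
TOY_VOCABULARY = {"bipolar disorder", "mood"}

# Bag-of-words: all n-grams of up to three words
bag_of_words = ngrams("bipolar mood states")

# Vocabulary-restricted: everything outside the vocabulary is ignored
atlas_like = vocabulary_terms(
    "bipolar disorder and bipolar mood states", TOY_VOCABULARY
)
```

The bag-of-words call yields the six n-grams shown for "bipolar mood states", while the vocabulary-restricted call keeps only "bipolar disorder" and "mood".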
Classification Procedure
neuroimaging
article
n = 2,633
Behavioral Domain
Context
Diagnosis
Instruction
Paradigm Class
Response Modality
Response Type
Stimulus Modality
Stimulus Type
abstracts-only
full-text
CogPO Labels
corpora
text extraction
bag-of-words
Cognitive Atlas
feature spaces
training/test
dataset splits
k = 5
80%/20%
feature
vectorization
and reduction
f = 1,754
parameter
tuning
k = 2
classification
Bernoulli naïve Bayes
k-nearest neighbors
logistic regression
support vector classifier
cross-validation
100 iterations
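The dataset-splitting step of the procedure above (k = 5, 80%/20%, repeated over many iterations) can be sketched as follows. This is an illustrative sketch under assumptions: the function name `repeated_kfold`, the fixed seed, and the way indices are dealt into folds are choices made here, not details taken from the slides.

```python
import random

def repeated_kfold(n_items, k=5, n_repeats=100, seed=0):
    """Yield (train_indices, test_indices) for repeated k-fold cross-validation."""
    rng = random.Random(seed)
    indices = list(range(n_items))
    for _ in range(n_repeats):
        rng.shuffle(indices)
        # Deal the shuffled indices into k folds of near-equal size
        folds = [indices[i::k] for i in range(k)]
        for held_out in range(k):
            test = folds[held_out]
            train = [
                idx
                for j, fold in enumerate(folds)
                if j != held_out
                for idx in fold
            ]
            yield train, test

# Two repeats are enough for a demonstration; the deck reports 100 iterations
splits = list(repeated_kfold(n_items=2633, k=5, n_repeats=2))
```

With k = 5, each held-out fold contains roughly 20% of the articles and the remaining four folds form the 80% training set. Every article is tested exactly once per repeat.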
Assessing Classifier Performance
• Classifier performance evaluated using F1-score
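• $F_1 = 2 \times \dfrac{\text{precision} \times \text{recall}}{\text{precision} + \text{recall}}$, $\quad \text{precision} = \dfrac{tp}{tp + fp}$, $\quad \text{recall} = \dfrac{tp}{tp + fn}$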
• Ranges from 0 to 1
• F1-scores averaged across labels for overall performance
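As a worked illustration of the definitions above, the sketch below computes precision, recall, and the F1-score for one binary label and then averages F1 across labels. The function names and the toy predictions are assumptions for the example. Returning 0.0 when a denominator is zero is also a convention chosen here, not something stated in the slides.

```python
def precision_recall_f1(y_true, y_pred):
    """Compute precision, recall, and F1 for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumed convention: a zero denominator gives a score of 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def mean_f1(per_label_f1):
    """Average F1 across labels for an overall performance figure."""
    return sum(per_label_f1) / len(per_label_f1)

# Toy example: 2 true positives, 1 false positive, 1 false negative
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In this toy case precision and recall are both 2/3, so the F1-score is also 2/3.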
Classifier Performance
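[Figure: F1-score for each classification algorithm, by corpus (abstracts or full-text) and feature space (bag-of-words or Cognitive Atlas)]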
Representation of Classification Features
• bag-of-words features used to classify each label for Behavioral
Domain and Paradigm Class
• Used distributions of feature representation to calculate correlation
matrix
• Regressed co-occurrence of labels from correlation coefficients
• Performed hierarchical clustering on resulting matrix to assess
similarity of classification features between labels
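The correlation step described above can be sketched with a plain Pearson correlation between per-label feature profiles. Everything in this example is illustrative: the three labels, the four-feature profiles, and the helper names are assumptions, and the co-occurrence regression step is not shown. The `closest_pair` helper only finds the pair of labels that an agglomerative clustering would merge first. It does not build the full dendrogram.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def correlation_matrix(profiles):
    """Return sorted label names and their pairwise correlation matrix."""
    labels = sorted(profiles)
    matrix = [[pearson(profiles[a], profiles[b]) for b in labels] for a in labels]
    return labels, matrix

def closest_pair(labels, matrix):
    """Return the two distinct labels with the highest correlation."""
    best = None
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            if best is None or matrix[i][j] > best[0]:
                best = (matrix[i][j], labels[i], labels[j])
    return best[1], best[2]

# Hypothetical counts of how often each of four features was selected per label
profiles = {
    "Emotion": [90, 80, 5, 10],
    "Emotion_Fear": [85, 75, 10, 5],
    "Language": [5, 10, 95, 70],
}
labels, matrix = correlation_matrix(profiles)
pair = closest_pair(labels, matrix)
```

Converting each correlation r to a distance such as 1 - r would give the input that a hierarchical clustering routine typically expects.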
ParadigmClass_Reward
BehavioralDomain_Cognition_SocialCognition
BehavioralDomain_Cognition_Memory_Explicit
BehavioralDomain_Cognition_Attention
BehavioralDomain_Perception_Vision_Shape
BehavioralDomain_Perception_Vision
BehavioralDomain_Perception_Audition
ParadigmClass_WordGeneration
BehavioralDomain_Cognition_Language_Speech
BehavioralDomain_Cognition_Language_Semantics
BehavioralDomain_Cognition_Language
ParadigmClass_Reading
BehavioralDomain_Action_Execution_Speech
BehavioralDomain_Perception
ParadigmClass_Stroop
ParadigmClass_GoNoGo
BehavioralDomain_Action_Inhibition
ParadigmClass_EmotionInduction
BehavioralDomain_Emotion_Happiness
ParadigmClass_FaceMonitorDiscrimination
BehavioralDomain_Emotion_Fear
BehavioralDomain_Emotion
ParadigmClass_SemanticMonitorDiscrimination
BehavioralDomain_Cognition_Memory_Working
ParadigmClass_PassiveViewing
ParadigmClass_nback
ParadigmClass_DelayedMatchtoSample
BehavioralDomain_Cognition_Reasoning
ParadigmClass_Encoding
ParadigmClass_CuedExplicitRecognitionRecall
BehavioralDomain_Cognition_Memory
ParadigmClass_FingerTappingButtonPress
ParadigmClass_VisuospatialAttention
BehavioralDomain_Action_Execution
BehavioralDomain_Action
BehavioralDomain_Interoception
BehavioralDomain_Cognition
BehavioralDomain_Perception_Vision_Motion
ParadigmClass_Rest
BehavioralDomain_Action_Rest
BehavioralDomain_Perception_Somesthesis
ParadigmClass_PainMonitorDiscrimination
BehavioralDomain_Perception_Somesthesis_Pain
[Figure: dendrogram of the labels listed above, distance axis from 0.1 to 0.9; word clouds for the MEMORY, EMOTION, PAIN, and LANGUAGE groupings]
Conclusions and Future Work
• full-text, bag-of-words performed best
• Cognitive Atlas features outperform bag-of-words when only using text from abstracts
• Anatomical terms dominate features for classification when using bag-of-words
• Test on independent dataset
• Validate by replicating existing meta-analyses
• Specify Cognitive Atlas
• Integrate with existing frameworks
Acknowledgements
External Collaborators
Dr. Angela Laird
Dr. Matthew Sutherland
Dr. Michael Tobia
Dr. Veronica Del Prete
Jessica Bartley
Katherine Bottenhorn
Jessica Flannery
Ranjita Poudel
Taylor Salo
Lauren Hill
Chelsea Greaves
Rosario Pintos Lobo
Laura Ucros
Diamela Arencibia
Jennifer Foreman
Ariel Gonzalez
Neuroinformatics and Brain Connectivity Lab
Jessica Turner
Matthew Turner
NSF 1631325
NSF REAL DRL1420627
NSF CNS 1532061
NIH R01 DA041353
NIH U01 DA041156
NIH K01 DA037819
NIH U54 MD012393
Classifiers
• Bernoulli naïve Bayes
• Trains on binary word occurrence vectors instead of word counts
• logistic regression
• Linear model for classification
• k-nearest neighbors
• Identifies the k nearest articles by distance and uses a majority vote to
determine whether an article is about a label
• support vector machine
• Creates high-dimensional decision hyper-plane
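Of the four classifiers listed above, k-nearest neighbors is the simplest to sketch from scratch. The example below is a minimal illustration, not the configuration used in the study. Euclidean distance, k = 3, and the toy binary term-occurrence vectors are all assumptions.

```python
import math
from collections import Counter

def knn_predict(train_vectors, train_labels, query, k=3):
    """Label a query vector by majority vote among its k nearest neighbors."""
    # Assumed distance metric: Euclidean distance between feature vectors
    distances = sorted(
        (math.dist(vector, query), label)
        for vector, label in zip(train_vectors, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy articles as binary term-occurrence vectors; 1 means "about the label"
train_vectors = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
train_labels = [1, 1, 1, 0, 0]

about_label = knn_predict(train_vectors, train_labels, [1, 0, 1, 0])
not_about_label = knn_predict(train_vectors, train_labels, [0, 0, 1, 1])
```

An odd k avoids tied votes for a binary label, which is why k = 3 is used in this sketch.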

Editor's notes

  1. Thank you for the opportunity to speak here today. I'm going to present some work we have been doing using classification techniques to automatically annotate the neuroimaging literature with labels from a cognitive ontology.
  2. Neuroimaging research has seen an explosion of growth over the past 20 years, indicated here by the number of published neuroimaging studies per year. As we have continued to explore the relationship between brain structure and function, the research has also become increasingly complex.   With such a wealth of information embedded in this literature about brain organization, the challenge for us as neuroscientists is to harness this knowledge in a digestible way that enhances our understanding of neurocognitive models.
  3. The Cognitive Paradigm Ontology, or CogPO, is a step in this direction, providing a discrete set of terms meant to help study the relationship between brain structure and function.   These terms are meant to characterize elements of experimental design like stimuli, responses, and instructions, and can be used to annotate neuroimaging articles to provide concise references about the research in that article.
  4. Briefly, CogPO consists of 9 dimensions. One of these is Behavioral Domain, which describes a mental construct and contains labels such as Action, Cognition, Emotion, Interoception, and Perception. These labels can be more specific, such as Anger, Fear, or Happiness for Emotion.   Then Paradigm Class describes different tasks, such as n-back, delay discounting, and go/no-go. Then there are other types of labels such as Diagnosis, Instruction, Context, and Response and Stimulus Modality and Type.   Thus, CogPO informs cognitive models by making it possible to synthesize articles with similar labels in meta-analyses.
  5. The problem we are currently facing is that with such a large amount of research available, manual annotation of the literature is time-consuming and nearly impossible to keep up with. Add to that the bias and human error associated with manual annotations.   Therefore, we sought to develop a framework for annotating neuroimaging articles in an automated manner using the CogPO labels.   To do this, we wanted to evaluate classification performance by varying three parameters: corpus, feature space, and classification algorithm.   Then we wanted to characterize the relationships between labels by assessing the most frequently used features in the classification process. That is, can we use data-driven approaches to determine whether neuroimagers are using similar vocabularies to describe certain cognitive paradigms?
  6. First, when I talk about classification features, they are a measurable property or characteristic that can be used for classification.   They are similar to the explanatory variables in a linear regression.   Some examples of features in classification are noise ratios and lengths of sound in speech recognition, and email text or headers in spam detection.
  7. We wanted to evaluate the performance of two types of features: terms extracted from the text, which is the bag-of-words approach, and representation of terms defined by the Cognitive Atlas. A description of the Cognitive Atlas deserves more time here, but briefly, it is a vocabulary of about 1700 terms describing concepts, tasks, disorders, and theories in cognitive science. What also makes the Cognitive Atlas unique is the relationships between terms, such as working memory is a KIND OF memory, and a PART OF decision making.   Here is a short example of how the bag-of-words and Cognitive Atlas approaches for defining features differ. Consider this small text. The bag-of-words approach would take the words “bipolar disorder” and break them into “bipolar”, “bipolar disorder”, and “disorder”, and “bipolar mood states” can then be broken into all combinations of three or fewer terms shown here. I should mention that all terms in this text could be used for classification; I'm just focusing on these two examples for illustrative purposes.   Then, in the Cognitive Atlas approach, only terms defined by the Cognitive Atlas are used, so the only terms that would be used for classification would be “bipolar disorder” and “mood”, and all other terms in this text are ignored.
  8. Now I’ll walk you through how we defined our classifiers for each CogPO label.   We utilized a dataset of 2,633 neuroimaging articles that were manually annotated with CogPO labels.   As I mentioned before, we evaluated extracting features from either just the abstracts or the full-text.   Once text was extracted according to the bag-of-words or Cognitive Atlas feature space, we performed 100 iterations of a repeated 5-fold cross-validation procedure. In each iteration, the dataset was split into 5 folds; each fold in turn served as the test dataset, which consisted of 20% of the articles, while the remaining four folds formed the training dataset, which consisted of 80% of the articles.   We then vectorized the features based on frequency of appearance in an article and incorporated the frequency of that feature across all articles. Because the Cognitive Atlas only consisted of 1,754 terms, we reduced the bag-of-words terms using a chi-square test that removes all but the top 1,754 features for a particular label.   Then, in preparation for classification, we performed a 2-fold cross-validation procedure to tune the hyperparameters for classification. Depending on the classification algorithm, this step basically optimizes the cost function and smoothing kernel.   Finally, we used 4 different classification algorithms to generate a classifier for each CogPO label: logistic regression, Bernoulli naïve Bayes, k-nearest neighbors, and support vector machine.
  9. To assess classifier performance, we used F1-scores, which are dependent on precision and recall.   The F1-score can range from 0 to 1, where 1 represents perfect classification.   We calculated the F1-score for each label and averaged across all 100 iterations. Then, to determine which combination of corpus, feature space, and classification algorithm performed best, we averaged across all CogPO labels.
  10. This graph represents overall performance, separated by classification algorithm on the bottom. Abstracts are in blue, full-text is in orange, bag-of-words is represented by circles and Cognitive Atlas by X’s.   Here, the top performer used full-text, bag-of-words, and the logistic regression algorithm.   It's also worth noting that when using the support vector machine algorithm, the Cognitive Atlas approach did not differ that greatly from the bag-of-words approach when using full-text.   And perhaps more interesting, the Cognitive Atlas feature space approach actually outperformed the bag-of-words approach when only using article abstracts. This could be particularly useful for two reasons: 1) the Cognitive Atlas provides a platform for classifying based on an ontology specifically designed for the cognitive sciences, and 2) abstract text is currently more accessible than full-article text, and may provide a means for annotating a larger proportion of the literature.
  11. Since the bag-of-words approach performed the best, we wanted to determine which CogPO labels in Behavioral Domain and Paradigm Class used the same features for classification across iterations. This provides insight into vocabularies used to discuss similar constructs.   We used the distribution of feature representations to calculate a correlation matrix, and corrected for the fact that some labels tend to be assigned together a lot, such as Emotion and Emotion.Fear.   Then we performed hierarchical clustering on the resulting matrix to provide a visual representation of similar labels.
  12. Here is the dendrogram, and just based on visual inspection of the dendrogram, we isolated four clusters of labels. You can see that each cluster contains labels from both Paradigm Class and Behavioral Domains.   The green cluster contains labels related to cognition, perception and language. We created a word cloud of the top 10% of features within the labels associated with this cluster and see dominant terms such as temporal gyrus, anterior cingulate. This cluster seems to be dominated by anatomical terms.   The blue cluster seems primarily related to inhibition, and the resulting word cloud exhibits cingulate cortex, anterior cingulate, “event related”. Now we can see some terms related to task design involved in the classification process.   The purple cluster is pretty large, containing terms related to emotion and memory. The resulting word cloud exhibits terms like working memory, prefrontal cortex, facial expression, and major depressive disorder. Here disorders frequently studied within a domain become prominent in addition to more information about task design.   Finally, the red cluster contains terms related to pain and action. The resulting word cloud contains terms such as reaction time, working memory. This may be a little less informative which isn’t that surprising given the diversity of the labels assigned to the cluster.   We can also see tight groupings of labels related to specific constructs, such as language, emotion, memory, and pain.
  13. While these topics are somewhat subjectively chosen, we generated word clouds for each one.   Within the language topic we can see terms such as superior temporal being dominant, and amygdala, emotion, and fusiform dominant for emotion. The memory topic contains terms like working memory, prefrontal cortex, dorsolateral, and cingulate. Finally the pain topic contains terms such as anterior cingulate and insula.   Again, we can see here that related to specific constructs, anatomical terms really seemed to dominate the features used for classification. But it does demonstrate that within constructs, neuroimagers are discussing similar brain structures!
  14. To wrap everything up, we evaluated classifier performance for CogPO labels using text from either abstracts or the full article, bag-of-words features or Cognitive Atlas terms, and different classification algorithms. We found that the combination of full-text, bag-of-words, and logistic regression performed the best.   The Cognitive Atlas features outperformed the bag-of-words features when only using text from the abstracts.   Anatomical terms dominated the features used for classification when using bag-of-words.   Our future work includes testing on an independent dataset, and validating these classifiers by replicating existing meta-analyses of manually annotated articles.   We would additionally like to assist in the process of fully specifying the relationships between terms in the Cognitive Atlas and seek to integrate these classifiers in existing frameworks.
  15. I would like to thank everyone in the Neuroinformatics and Brain Connectivity Lab for their contributions to this project, especially Taylor Salo. I'd also like to thank our collaborators at Georgia State for project and analysis development. And of course thank you again for inviting me to present our work here today.   I’ll take any questions you have at this time.
  16. Logistic regression – probabilities describing the possible outcomes of a single trial are modeled using the S-shaped logistic function
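The S-shaped logistic function mentioned in the last note can be illustrated with a short sketch. The weights, bias, and feature values below are made up for the example, and the helper name `predict_probability` is an assumption. No fitting procedure is shown, only how a fitted linear model would turn features into a probability.

```python
import math

def logistic(z):
    """Logistic (sigmoid) function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)

def predict_probability(weights, bias, features):
    """Probability that an article carries a label under a linear model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return logistic(z)

# Hypothetical weights for three text features and one article's feature values
weights = [1.5, -2.0, 0.5]
bias = -0.25
features = [1.0, 0.0, 1.0]
probability = predict_probability(weights, bias, features)
```

The output always lies between 0 and 1, equals 0.5 when the linear score is zero, and approaches 0 or 1 as the score becomes strongly negative or positive.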