Finding Ostriches in the Courtroom
Enabling Insight with Linguistic Visualization

Christopher Collins
University of Toronto (to Dec 2009)
University of Ontario Institute of Technology (Jan 2010-)
Target Audience

          General Public     Domain Experts      Language Researchers

          Real-time   Single Document       Linguistic
                      Discrete Corpus       NLP
                      Continuous Corpus     CL
Problem Areas

          Real-time   Single Document       Linguistic
                      Discrete Corpus       NLP
                      Continuous Corpus     CL
Humans have reached
their cognitive capacity.
Information is overwhelming because of
the naïve manner in which it is delivered.
External Cognition
• External cognition is the interaction
  between internal and external
  representations when performing cognitive
  tasks.
• Computational offloading is the extent to
  which external representations can reduce
  the amount of cognitive effort required to
  solve a problem.
  Yvonne Rogers, New Theoretical Approaches for Human-Computer Interaction, 2004.
Document Visualization

Collins, C.; Carpendale, S.; Penn, G.
DocuBurst: Visualizing Document Content using Language Structure.
Proceedings of Eurographics/IEEE VGTC Symposium on Visualization, June 2009.
Many Eyes Tag Cloud
Mihalcea and Tarau, 2004
DocuBurst
          games → game
          taken → take

          absolute,noun,10
          chair,noun,2
          moment,noun,11
          game,noun,30
          reality,noun,3
          take,verb,13
          represent,verb,17
          ...

          WordNet:
          game IS activity
          chair IS furniture
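
The slide above suggests a three-step pipeline: reduce word forms to lemmas, count each lemma by part of speech, then place the counts in WordNet's IS-A hierarchy. A minimal sketch of that idea follows. The `LEMMAS` and `IS_A` tables are tiny hand-written stand-ins for a real lemmatizer and for WordNet, and the function names are hypothetical; none of this is taken from the DocuBurst implementation.

```python
from collections import Counter

# Toy stand-ins (assumptions for illustration only):
# a real system would use a lemmatizer and the WordNet hypernym graph.
LEMMAS = {"games": "game", "taken": "take"}
IS_A = {"game": "activity", "chair": "furniture"}


def count_lemmas(tagged_tokens):
    """Count (lemma, pos) pairs from an iterable of (token, pos) pairs."""
    counts = Counter()
    for token, pos in tagged_tokens:
        word = token.lower()
        lemma = LEMMAS.get(word, word)  # fall back to the word itself
        counts[(lemma, pos)] += 1
    return counts


def roll_up(counts):
    """Add each noun's count to itself and to every ancestor on its IS-A chain."""
    totals = Counter()
    for (lemma, pos), n in counts.items():
        if pos != "noun":
            continue
        node = lemma
        while node is not None:
            totals[node] += n
            node = IS_A.get(node)  # None once the chain ends
    return totals


tokens = [("games", "noun"), ("game", "noun"), ("chair", "noun"), ("taken", "verb")]
counts = count_lemmas(tokens)
totals = roll_up(counts)

print(counts[("game", "noun")])  # 2
print(totals["activity"])        # 2
print(totals["furniture"])       # 1
```

Rolling counts up the hierarchy is what would let a parent concept such as "activity" reflect how often any of its descendants occur in the document.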
U.S. Presidential Debates
Corpus Visualization

• Beyond similarity and clustering
  – How do we discern differences within and between
    document collections?

Collins, C.; Viégas, F.; Wattenberg, M.
Parallel Tag Clouds to Explore and Analyze Faceted Text Corpora.
To appear in Proc. IEEE Symposium on Visual Analytics Science & Technology (VAST), 2009.
Our Data: U.S. Federal Court Decisions




Data from public.resource.org
Visualization Design

[Figure labels: Patent, Invention]

     • Size = significance of
       difference (G² score)
     • Order = alphabetical
     • Edges = word occurs in
       multiple columns
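
The slide names a G2 score but does not give its formula. A common way to score how strongly a word's frequency differs between two corpora is the log-likelihood ratio statistic, and the sketch below assumes that two-corpus formulation. The function name and arguments are illustrative, and the authors' exact variant may differ.

```python
import math


def g2(a, b, n1, n2):
    """Log-likelihood ratio (G2) for one word across two corpora.

    a, b   -- observed count of the word in corpus 1 and corpus 2
    n1, n2 -- total number of word tokens in corpus 1 and corpus 2
    """
    total = a + b
    # Expected counts if the word were equally frequent in both corpora.
    e1 = n1 * total / (n1 + n2)
    e2 = n2 * total / (n1 + n2)
    score = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2.0 * score


# Same relative frequency in both corpora: no difference to report.
print(g2(10, 10, 100, 100))  # 0.0

# A word that occurs only in corpus 1 scores high.
print(round(g2(30, 0, 100, 100), 2))  # 41.59
```

Under this assumption, a larger score means stronger evidence that the word is over- or under-represented in one column, which is what the word size would encode.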
Ostriches in the 7th Circuit
Highfalutin Judge Selya

furculum
immurement
impuissant
Bridging the Linguistic Divide

 Open APIs for data
     NYT, Twitter, Google

 Open APIs for NLP?
     - Summarization
     - Keyword extraction
     - Sentiment analysis

 Toolkits and APIs for Visualization
     Processing, Raphaël, Flare, Flash
Visualization Augments Reading


       www.christophercollins.ca
