1
The NIF format (hands on)
Annotating Strings and Documents using the
NLP Interchange Format
2
Practical session outcomes
• Participants will learn to use the NIF API to annotate strings and documents with the following wrappers:
–OpenNLP
–Stanford Core NLP
–Snowball Stemmer
–DBpedia Spotlight
• Query your corpus using SPARQL
3
NIF Example
4
Snowball Stemmer Wrapper
• A stemming algorithm removes suffixes from words, so that related forms share a common stem.
–CONNECT
• CONNECTED
• CONNECTION
• CONNECTING
• CONNECTIONS
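The CONNECT example can be reproduced with a toy suffix stripper. This is only a minimal sketch of the idea, not the Snowball algorithm itself: the suffix list and the minimum stem length are assumptions chosen to fit the five words on this slide.

```python
# Toy suffix stripping, NOT the Snowball algorithm.
# The suffix list below is an assumption chosen for the slide's example words.
SUFFIXES = ("IONS", "ION", "ING", "ED", "S")  # longest suffixes first


def toy_stem(word):
    """Strip the first matching suffix, keeping a stem of at least 3 letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


for word in ["CONNECT", "CONNECTED", "CONNECTION", "CONNECTING", "CONNECTIONS"]:
    print(word, "->", toy_stem(word))
```

All five words reduce to CONNECT here. A real stemmer applies ordered, language-specific rules and handles many cases this sketch ignores.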
5
Snowball Stemmer Wrapper
java -jar snowball.jar -f text -i 'I am connected.'
• -f is used to define the format
• -i is used to define the input
6
Snowball Stemmer Wrapper
[Slides 6-9: screenshots of the wrapper output. The callouts label the NIF Standard Annotations, the NIF Offset, and the Snowball Stemmer annotations.]
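To make the offset idea concrete, the sketch below computes character offsets for the tokens of 'I am connected.' and turns them into URIs. It assumes the `#char=begin,end` fragment style (RFC 5147) used by NIF 2.0 and reuses the http://yourDomain.org/ prefix from the online service examples; the regular-expression tokenizer and the function name are hypothetical stand-ins, and the wrappers may tokenize differently.

```python
import re


def nif_offset_uris(text, prefix):
    """Return (token, URI) pairs built from character offsets into `text`.

    Assumption: URIs follow the '#char=begin,end' pattern, with `end` exclusive.
    """
    pairs = []
    # Crude tokenizer: runs of word characters, or single punctuation marks.
    for match in re.finditer(r"\w+|[^\w\s]", text):
        uri = f"{prefix}#char={match.start()},{match.end()}"
        pairs.append((match.group(), uri))
    return pairs


for token, uri in nif_offset_uris("I am connected.", "http://yourDomain.org/"):
    print(token, uri)
```

Because each URI encodes where the string starts and ends, annotations produced by different tools for the same text can refer to the same resource.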
10
Annotating Strings: Step-by-step
1. Open the USB stick folder.
2. Decompress the “session-nif.zip” archive.
3. Open the “NIF_DATATHON” folder and decompress “NIF_tutorial_hands_on_jars.zip”.
4. Open a command prompt in the “jar” folder and run the commands from the next slide.
11
Available Wrappers
• To annotate documents, use the local wrappers (USB Stick)
java -jar opennlp.jar -f text -i 'This is a test.' -modelFolder ../model/
java -jar stanford.jar -f text -i 'This is a test.'
java -jar snowball.jar -f text -i 'This is my favorite test.'
java -jar spotlight.jar -f text -i 'Welcome to Germany.' -confidence 0.2
• To annotate small strings, you can try the online services:
– http://spotlight.nlp2rdf.aksw.org/spotlight?f=text&i=Welcome+to+Germany.&t=direct&confidence=0.3&prefix=http://yourDomain.org/
– http://snowball.nlp2rdf.aksw.org/snowball?f=text&i=This+is+my+favorite+test.&t=direct&prefix=http://yourDomain.org/
– http://stanford.nlp2rdf.aksw.org/stanfordcorenlpn?f=text&i=This+is+a+test.&t=direct&prefix=http://yourDomain.org/
– http://opennlp.nlp2rdf.aksw.org/opennlp?f=text&i=This+is+a+test.&t=direct&modelFolder=model&prefix=http://yourDomain.org
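The online service URLs all share one shape: a base address followed by the query parameters f, i, t, and prefix, plus service-specific extras such as confidence. As a hedged sketch, the helper below (a hypothetical name, not part of any wrapper) assembles such a URL with Python's standard library. It only builds the string and makes no request, since the services may no longer be reachable.

```python
from urllib.parse import urlencode


def service_url(base, text, **extra):
    """Build a request URL in the shape shown on the slide.

    f, i and t are taken from the slide; anything else is passed via `extra`.
    """
    params = {"f": "text", "i": text, "t": "direct", **extra}
    # urlencode turns spaces into '+' and percent-encodes reserved characters.
    return f"{base}?{urlencode(params)}"


url = service_url(
    "http://snowball.nlp2rdf.aksw.org/snowball",
    "This is my favorite test.",
    prefix="http://yourDomain.org/",
)
print(url)
```

Unlike the hand-written URLs on the slide, urlencode also percent-encodes the prefix value, which is the safer form when the input text or prefix contains characters such as & or #.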
12
Reading and Writing Files
• Write the results to a file:
--outfile myAnnotatedFile.ttl
• Read a document as input:
--intype file -i /path/myDoc
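The two options can plausibly be combined in one call. The sketch below assembles such a command line as an argument list. It assumes that -f text, --intype file, and --outfile may be used together with snowball.jar, which the slides do not confirm; the function name is hypothetical, and /path/myDoc is the placeholder path from the slide.

```python
import shlex


def wrapper_command(jar, input_path, outfile):
    """Assemble the argument list for a wrapper run that reads and writes files.

    Assumption: the flags from the slides can be combined in a single call.
    """
    return [
        "java", "-jar", jar,
        "-f", "text",
        "--intype", "file", "-i", input_path,
        "--outfile", outfile,
    ]


cmd = wrapper_command("snowball.jar", "/path/myDoc", "myAnnotatedFile.ttl")
# Print the command as it would be typed in a shell.
print(shlex.join(cmd))
# To actually run it from the "jar" folder: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids quoting problems when the input path contains spaces.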
13
POS tagger for multiple languages
• The -modelFolder parameter sets the folder that contains the trained OpenNLP models for POS tagging and tokenization.
• Models for other languages are available on the OpenNLP website:
http://opennlp.sourceforge.net/models-1.5/
14
Example 2: Query a Corpus
15
Querying with Twinkle
Open the “/twinkle” folder and run the command:
java -jar twinkle.jar
16
Querying a Corpus
29
Exercise 3: Querying your own NIF-annotated corpus
30
Querying your own NIF-annotated corpus
1. Annotate your string using one of the wrappers
2. Save your annotated sentence to a file (using “--outfile”)
3. Open Twinkle
4. Query your corpus using Twinkle
31
• Query your annotated corpus:
– nif:Context
– nif:Sentence
– nif:anchorOf
– nif:oliaCategory
– nif:oliaLink
… or practice with the Brown Corpus!
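As a starting point for the exercise, the sketch below generates a SPARQL query that lists resources of one NIF class together with their nif:anchorOf string. It is written as a small Python helper (a hypothetical name) so that the class can be swapped easily, and the printed text can be pasted into Twinkle. It assumes the NIF 2.0 core namespace and that the wrapper output attaches nif:anchorOf to the selected resources; the data file still has to be chosen in Twinkle.

```python
# NIF 2.0 core namespace (assumed to match the wrapper output).
NIF_NS = "http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#"


def anchor_query(nif_class="Sentence", limit=20):
    """Return SPARQL text selecting resources of a NIF class and their strings."""
    return (
        f"PREFIX nif: <{NIF_NS}>\n"
        "SELECT ?s ?text\n"
        "WHERE {\n"
        f"  ?s a nif:{nif_class} ;\n"
        "     nif:anchorOf ?text .\n"
        "}\n"
        f"LIMIT {limit}\n"
    )


print(anchor_query("Sentence"))
```

Replacing nif:anchorOf with nif:oliaCategory or nif:oliaLink in the pattern gives a similar query over the tagging annotations, provided the chosen wrapper emits those properties.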
33
Thank you!
http://site.nlp2rdf.org/
