Using NLP to Accelerate Product Development and Help You Create a Killer App
Max Kaufmann, November 2015
© 2014 International Business Machines Corporation 2
Our Speaker:
Max Kaufmann
NLP Developer
IBM Watson Ecosystem
Max Kaufmann has a B.S. in Linguistics from Grinnell College and an M.S. in Computational
Linguistics from the University of Washington. He has previous experience in academia, where
he has published papers on topics such as the use of machine translation, and in industry, where he
has several NLP-related patents pending. He is currently a member of the IBM Watson
Ecosystem team, where he helps fast-growing companies and entrepreneurially minded
organizations use Watson's NLP capabilities to solve real-world problems.
Agenda:
• What is NLP and why do you need it?
• What are the main problems in NLP?
• How are NLP problems solved?
• Why the Watson Ecosystem is awesome
• Q & A
What is NLP?
− Wikipedia
“Teaching computers to understand human languages”
− Max Kaufmann
Why NLP is hard
• Ambiguity
– I saw her duck with a telescope
Why NLP is hard
• Novelty
– Watson is super duper kewl, I'm so excited to be an
Ecosystem Partner #watson #ibm
#subliminalMessaging
– Omg rmbr that sandwhich I ate for lunch it was
soooo dope #sandwhich #food
#everyoneNeedsToKnowWhatIateforLunch
• Lots of NLP systems are ‘trained’ on text from
newspapers/books.
Why NLP is hard
• World Knowledge
– What’s behind Ginni?
– Required knowledge: who Ginni is, what 'behind' means
– Possible answers: 'a sign', 'the IBM logo', 'multiple signs'
Why should I care about NLP?
• These companies do:
Sources: http://acl2014.org/
Sources: http://altaplana.com/TextAnalytics2014.pdf
What applications involve NLP?
What type of data requires NLP?
What capabilities can NLP add?
What types of problems are there in NLP?
Type 1: Applications
• These are things you all want to do
– Natural Language Generation
– Summarization
– Dialog
– Machine Translation
– Question Answering
– Sentiment Analysis
Type 2: Tasks
• These are usually precursors to applications
– Part of Speech tagging
• Identify which part of speech a word is
– Lemmatizing
• Running → run, ran → run
– Multiword Expression (Idiom) Identification
• ‘Kick the bucket’ can’t become ‘kicking the bucket’
– Parsing
– Word sense disambiguation
• ‘bank’ as in money vs ‘bank’ as in river
– Coreference/Anaphora Resolution
• Sally didn’t know what to do with all the money she made by
becoming a Watson Ecosystem partner
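To make one of these tasks concrete, here is a minimal sketch of lemmatizing. It is a toy, not how any production system (or Watson) does it: the lookup table, the vocabulary, and the suffix rules are all assumptions made up for illustration.

```python
# Toy lemmatizer: irregular forms come from a lookup table, regular forms
# from suffix stripping. Both tables are illustrative assumptions.
IRREGULAR = {"ran": "run", "went": "go", "saw": "see"}
VOCABULARY = {"run", "go", "see", "talk", "give"}


def lemmatize(word: str) -> str:
    """Return the dictionary form of a word, or the lowercased word if unknown."""
    w = word.lower()
    if w in IRREGULAR:
        return IRREGULAR[w]
    if w in VOCABULARY:
        return w
    for suffix in ("ing", "ed", "s"):
        if w.endswith(suffix):
            stem = w[: -len(suffix)]
            # Undo consonant doubling: "running" -> "runn" -> "run"
            if len(stem) >= 2 and stem[-1] == stem[-2] and stem[:-1] in VOCABULARY:
                return stem[:-1]
            if stem in VOCABULARY:
                return stem
            # Restore a dropped final "e": "giving" -> "giv" -> "give"
            if stem + "e" in VOCABULARY:
                return stem + "e"
    return w
```

Even this small sketch needs a vocabulary to decide between "run" and "runn", which hints at why real lemmatizers lean on dictionaries or trained models.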
If I want to tackle NLP on my own, what would a solution look like?
How to Solve NLP Problems
• Sample problem: Relationship Extraction
• Input:
– Today I saw Ginni Rometty give an amazing talk about
Watson. She was a fantastic speaker, I want her to give a talk
at my organization.
• Goal: Extract all relations involving people.
• Solution: Chain a bunch of tasks together until you have
enough information to extract relations
– Going to focus on the pipeline process, not implementation
Sentence Splitting
• Input:
– Today I saw Ginni Rometty give an amazing talk about
Watson. She was a fantastic speaker, I want her to give a talk
at my organization.
• Output:
– Today I saw Ginni Rometty give an amazing talk about
Watson.
– She was a fantastic speaker, I want her to give a talk at
my organization.
• Food for thought:
– How would we split if it said Mrs. Rometty?
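One possible answer to the "Mrs. Rometty" question is to keep a list of abbreviations that end in a period without ending a sentence. The sketch below assumes a tiny, hand-written abbreviation list and whitespace-separated tokens; a real splitter would need far more than this.

```python
import re

# Abbreviations that end in a period but do not end a sentence.
# This short list is an illustrative assumption, not a complete inventory.
ABBREVIATIONS = {"mr.", "mrs.", "ms.", "dr.", "prof."}


def split_sentences(text: str) -> list[str]:
    """Split text on sentence-final punctuation, skipping known abbreviations."""
    sentences, current = [], []
    for token in text.split():
        current.append(token)
        ends_sentence = re.search(r"[.!?]$", token) is not None
        if ends_sentence and token.lower() not in ABBREVIATIONS:
            sentences.append(" ".join(current))
            current = []
    if current:  # keep any trailing text that lacks final punctuation
        sentences.append(" ".join(current))
    return sentences
```

With this rule, "Mrs." no longer triggers a split, but the approach still fails on abbreviations missing from the list and on sentences that genuinely end with one.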
Part of Speech (POS) tagging
• Pop quiz: How many POS tags are there?
– 36 (in the Penn Treebank tag set)
• Today I saw Ginni Rometty give an amazing talk about Watson.
– Today = Adverb
• Today is a great day
– I = Pronoun
– Saw = Verb
• Why not the cutting tool?
– Ginni = Proper noun
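The "saw" question above (verb or cutting tool?) can be sketched with a lookup tagger plus one context rule. The lexicon, the default tag, and the rule are assumptions chosen for this example; real taggers learn such preferences from annotated text.

```python
# Penn Treebank-style tags. The tiny lexicon and the single context rule
# are illustrative assumptions.
LEXICON = {
    "i": ["PRP"],
    "the": ["DT"],
    "a": ["DT"],
    "saw": ["VBD", "NN"],  # past tense of "see", or the cutting tool
    "ginni": ["NNP"],
    "rometty": ["NNP"],
    "rusty": ["JJ"],
}


def tag(tokens):
    """Tag each token, preferring the noun reading after a determiner or adjective."""
    tags = []
    for token in tokens:
        candidates = LEXICON.get(token.lower(), ["NN"])  # unknown words default to noun
        choice = candidates[0]
        if len(candidates) > 1 and tags and tags[-1] in ("DT", "JJ") and "NN" in candidates:
            choice = "NN"
        tags.append(choice)
    return list(zip(tokens, tags))
```

Here `tag(["I", "saw", "Ginni", "Rometty"])` treats "saw" as a verb, while `tag(["the", "rusty", "saw"])` treats it as a noun, because of the preceding adjective.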
Parsing
Key:
N = Noun
NNP = Proper Noun
PRP = Personal Pronoun
DT = Determiner
JJ = Adjective
IN = Preposition
NP, VP, etc. = Noun Phrase, Verb Phrase, etc.
NX, VX, etc. = Incomplete NP, Incomplete VP, etc.
Entity Identification
• Why is Ginni a person, but Watson isn't?
Relationship Extraction
• Today I saw Ginni Rometty give an amazing talk about
Watson. She was a fantastic speaker, I want her to
give a talk at my organization.
• Which is right?
– Give (an amazing talk about Watson, Ginni Rometty)
– Give (an amazing talk, Ginni Rometty)
– Give (talk, Ginni Rometty)
• Extracting speaker (Ginni Rometty)
– How do we know that (she == Ginni Rometty) but (I != Ginni
Rometty)?
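A rough sketch of the pronoun question: resolve a third-person pronoun to the most recently mentioned person, and treat "I" as the author of the text. The `Relation` tuple, the `SPEAKER` placeholder, and the "most recent person" heuristic are all assumptions for illustration; the heuristic ignores gender agreement and sentence structure, so it breaks easily.

```python
from typing import NamedTuple, Optional


class Relation(NamedTuple):
    """A relation in the slide's notation: Give (talk, Ginni Rometty)."""
    predicate: str
    obj: str
    subject: str


THIRD_PERSON = {"she", "her", "he", "him"}


def resolve(pronoun: str, people_seen: list[str]) -> Optional[str]:
    """Map a pronoun to the most recently mentioned person, if any."""
    p = pronoun.lower()
    if p == "i":
        return "SPEAKER"  # the author of the text, never a named person
    if p in THIRD_PERSON and people_seen:
        return people_seen[-1]
    return None


# "She was a fantastic speaker" after "Ginni Rometty" has been mentioned.
people = ["Ginni Rometty"]
relation = Relation("give", "talk", resolve("She", people))
```

Which of the three candidate objects ("an amazing talk about Watson", "an amazing talk", "talk") to store is a design decision that depends on what the application needs to query later.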
Relationship Extraction
• How do we express that the talk hasn’t happened yet?
– What if the sentence was “I want her to give another talk at my
organization”
• “She was a fantastic speaker”
– Was it this time only? Or is this a property of Ginni?
What did we learn?
• NLP is hard
– Language will always surprise you.
• Everything is a pipeline
– Part of Speech tagging → Parsing → Entity Detection →
Relationship extraction
– What if we had identified “Ginni Rometty” as a verb?
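The pipeline idea, and the "Ginni Rometty as a verb" question, can be sketched as a chain of functions where each stage reads the previous stage's output. Everything here is a simplified assumption: the stage names, the dictionary-based document, the hard-coded tagger output, and the omission of the parsing stage.

```python
from functools import reduce


def pos_tag(doc):
    # Hypothetical tagger output; pass "override_tags" to simulate a tagging error.
    default = {"I": "PRP", "saw": "VBD", "Ginni": "NNP", "Rometty": "NNP"}
    tags = doc.get("override_tags", default)
    return {**doc, "tags": [(t, tags.get(t, "NN")) for t in doc["tokens"]]}


def detect_entities(doc):
    # Group consecutive proper nouns into one entity.
    entities, current = [], []
    for token, tag in doc["tags"]:
        if tag == "NNP":
            current.append(token)
        elif current:
            entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return {**doc, "entities": entities}


def extract_relations(doc):
    # Pair every verb with every entity; a stand-in for real relation extraction.
    verbs = [t for t, tag in doc["tags"] if tag.startswith("VB")]
    return {**doc, "relations": [(v, e) for v in verbs for e in doc["entities"]]}


PIPELINE = [pos_tag, detect_entities, extract_relations]


def run(doc):
    """Feed the document through every stage in order."""
    return reduce(lambda d, stage: stage(d), PIPELINE, doc)


tokens = ["I", "saw", "Ginni", "Rometty"]
good = run({"tokens": tokens})
bad = run({"tokens": tokens,
           "override_tags": {"I": "PRP", "saw": "VBD", "Ginni": "VB", "Rometty": "VB"}})
```

In the `good` run the entity "Ginni Rometty" is found and a relation is extracted. In the `bad` run the tagger's mistake means no entity is detected, so no relation comes out either: an early error silently removes everything downstream.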
How do you build applications that deal with language?
• See what progress has been made on your problem
– “basically solved”
– “good progress”
– “here be dragons”
What problems have we basically solved?
• Part of Speech tagging
– I went to the store → Store
• Lemmatizing
– Running → run
• Morphological segmentation
– Running → run + ing
• Sentence Splitting
• Tokenizing
– Breaking sentences into ‘words’
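As a small illustration of tokenizing, the sketch below uses one regular expression to break a sentence into words, hashtags, and punctuation marks. The pattern is an assumption chosen for this example and only covers simple English text.

```python
import re


def tokenize(sentence: str) -> list[str]:
    """Split into words (keeping internal apostrophes), hashtags, and punctuation."""
    return re.findall(r"#?\w+(?:'\w+)*|[^\w\s]", sentence)
```

For instance, `tokenize("I'm so excited #watson")` keeps "I'm" and "#watson" as single tokens, while a final period becomes its own token. Whether "I'm" should instead be split into "I" and "'m" is a convention that differs between tokenizers.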
What problems have we made good progress on?
• Machine Translation
• Parsing
• Search
• Coreference Resolution
• Sentiment Analysis
• Relationship Extraction
• Word Sense Disambiguation
• Idiom Identification
‘Here be Dragons’ Problems
• Summarization
• Natural Language Generation
• Dialog
• Content Recommendation
• Artificial Intelligence
• Question Answering
• Building Ontologies
• Pragmatics
http://www.datacommunitydc.org/blog/2013/05/recommendation-engines-why-you-shouldnt-build-one
More Reasons Why NLP is hard
• Accept that things will go wrong
– Nothing in NLP ever has 100% accuracy
• Accept that NLP numbers are uncomfortable
– 50% accuracy can be very good
– Going from 72% → 74% accuracy can be a HUGE deal
• Embrace Cognitive Computing
– “While they’ll have deep domain expertise, instead of replacing
human experts, cognitive computers will act as a decision
support system and help them make better decisions based
on the best available data, whether in healthcare, finance or
customer service.”
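One way to read the accuracy numbers above is in terms of errors rather than accuracy. The sketch below is an illustration under stated assumptions, not a claim from the talk: it computes the share of remaining errors a gain removes, and estimates end-to-end pipeline accuracy assuming every stage fails independently (a simplification).

```python
def relative_error_reduction(old_acc: float, new_acc: float) -> float:
    """Fraction of the old errors that the new system no longer makes."""
    old_err, new_err = 1 - old_acc, 1 - new_acc
    return (old_err - new_err) / old_err


def end_to_end_accuracy(stage_accuracies) -> float:
    """Accuracy of a pipeline if each stage's errors are independent (an assumption)."""
    result = 1.0
    for acc in stage_accuracies:
        result *= acc
    return result


gain = relative_error_reduction(0.72, 0.74)      # about 1/14 of remaining errors
four_stage = end_to_end_accuracy([0.9] * 4)      # four hypothetical 90% stages
```

Under these assumptions, four stages that are each 90% accurate yield roughly 66% end to end, which is one reason modest-looking numbers can still represent strong systems.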
It's difficult to do NLP if it's not your core business
competency
Are there any companies that can help?
Why our APIs are awesome
• We love feedback!
• Lets you figure out if it works for you
• Easy REST API
• Deploys in Bluemix
• Science
Who has used Watson?
Summary
• NLP is hard to tackle on your own
• But if your application involves users, NLP can provide
huge value
• The best way to get that value without all the hard work is
to become an Ecosystem Partner
What now?
• Explore the Watson APIs:
http://www.ibm.com/smarterplanet/us/en/ibmwatson/developercloud/services-catalog.html
• Apply to become an Ecosystem Partner:
http://www.ibm.com/smarterplanet/us/en/ibmwatson/ecosystem.html
Questions?
