SlideShare a Scribd company logo
1 of 19
Download to read offline
State of the #NLProc
Vsevolod Domkin
Iforum 2017
2017-05-25
About me
* Lisp programmer
* 5+ years of NLP work at Grammarly
* (m8n)ware NLP consultancy
* volunteer at lang-uk (http://lang.org.ua)
* Lecturer on OS, Algortihms, NLProc
(KPI, Projector, UCU)
https://vseloved.github.io
Outline
- Intro to NLProc
- Academic & industrial NLProc:
theory & practice
- Problems, algorithms, tools,
datasets
- Main directions of current and
future development
What is Natural
Language Processing?
Transforming free-form text into
structured data and back
since 1950
What is Natural
Language Processing?
Transforming free-form text into
structured data and back
since 1950
Related, but not quite:
- Compling
- AI
- Data Science
- Regular expressions, Neural Networks...
Roles
- dictionaries, lexicons, thesauri,
ontologies
- labelling (POS, NER, coref, semantic
roles, ...)
- parsing (constituency, dependency,
semantic, abstract, ...)
- modelling (language, topics, ...)
- embedding (word2vec, doc2vec, ...)
Computational
Linguistics
Data
Plenty!
- Dictionaries
- Corpora (Penn Treebank, Brown,
Europarl, Gigaword, ...)
- Shared task resources
Algorithms (classic)
- Reliance on rich linguistic data and
features
- Ngram-based linear models
- Shift-reduce-based parsing
- PCA-based semantics
Deep Learning
“NLP is kind of like a rabbit in the
headlights of the Deep Learning
machine, waiting to be flattened.”
Deep Learning
- Vector-space word & document
representations
- Recurrent neural networks
Haven't really taken the classic tasks
by storm (like in Signal Processing),
but seriously expanded what's possible
DL for NLP Formula
“Embed, encode, attend, predict”
https://explosion.ai/blog/deep-learning-formula-nlp
Tools
Everything, basically, happens in 2
major languages: Python & Java
A number of solid NLProc frameworks:
Stanford, OpenNLP, Spacy, LingPipe
A plethora of academic-quality
targeted tools
word2vec etc.
Main end-user applications:
Industrial
NLProc
Groupping &
labelling
Transformation Dialog
- sentiment analysis
- plagiarism
detection
- classification/
clustering
- parts extractions
- machine translation
- summarization
- error correction
- generation
- information
retrieval
- QA
- conversational
interfaces
Data
None :(
(b/c verticals)
Need to create your own:
- Extract
- Annotate
- Crowdsource
- Generate as by-product
Algorithms
Hybrid approaches
Rule-based ML/DL
- initial data collection
- pre-/post-processing
- power to domain experts
- often needs lots of data
- linguistic features +
word vectors
- simple ngram-based
models
- seq2seq neural network
models
Ex: Gmail smart reply
- Data analysis
- Traditional NLP
processing
- FNN
- LSTM
- Semi-supervised
graph learning
- Rule-based
post-processing
- Engineering
Ex: wit.ai
What's next
Some predictions:
- Bots go bust
- Deep learning goes commodity
- AI is cleantech 2.0 for VCs
- MLaaS dies a second death
- Full stack vertical AI startups
actually work
http://www.bradfordcross.com/blog/2017/3/3/five-ai-startup
-predictions-for-2017
What should work:
- NLP automation
- NLP + IR
- ???

More Related Content

What's hot

Presentation of OpenNLP
Presentation of OpenNLPPresentation of OpenNLP
Presentation of OpenNLPRobert Viseur
 
Natural language processing: feature extraction
Natural language processing: feature extractionNatural language processing: feature extraction
Natural language processing: feature extractionGabriel Hamilton
 
Feature Engineering for NLP
Feature Engineering for NLPFeature Engineering for NLP
Feature Engineering for NLPBill Liu
 
Open nlp presentationss
Open nlp presentationssOpen nlp presentationss
Open nlp presentationssChandan Deb
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Textbutest
 
Webinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior RelevanceWebinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior RelevanceLucidworks
 
Information Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and ToolsInformation Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and ToolsBenjamin Habegger
 
Large Scale Text Processing
Large Scale Text ProcessingLarge Scale Text Processing
Large Scale Text ProcessingSuneel Marthi
 
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...Apache OpenNLP
 
Embracing diversity searching over multiple languages
Embracing diversity  searching over multiple languagesEmbracing diversity  searching over multiple languages
Embracing diversity searching over multiple languagesSuneel Marthi
 
Introduction to python
Introduction to pythonIntroduction to python
Introduction to pythonYi-Fan Chu
 
Natural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine TranslationNatural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine Translationivaderivader
 
From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?Constantin Orasan
 
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf EremyanDataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyanrudolf eremyan
 
Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Lifeng (Aaron) Han
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and ChallengesJens Lehmann
 

What's hot (20)

Presentation of OpenNLP
Presentation of OpenNLPPresentation of OpenNLP
Presentation of OpenNLP
 
Natural language processing: feature extraction
Natural language processing: feature extractionNatural language processing: feature extraction
Natural language processing: feature extraction
 
Feature Engineering for NLP
Feature Engineering for NLPFeature Engineering for NLP
Feature Engineering for NLP
 
Open nlp presentationss
Open nlp presentationssOpen nlp presentationss
Open nlp presentationss
 
Information extraction for Free Text
Information extraction for Free TextInformation extraction for Free Text
Information extraction for Free Text
 
Webinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior RelevanceWebinar: OpenNLP and Solr for Superior Relevance
Webinar: OpenNLP and Solr for Superior Relevance
 
NLP and LSA getting started
NLP and LSA getting startedNLP and LSA getting started
NLP and LSA getting started
 
Information Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and ToolsInformation Extraction from the Web - Algorithms and Tools
Information Extraction from the Web - Algorithms and Tools
 
Large Scale Text Processing
Large Scale Text ProcessingLarge Scale Text Processing
Large Scale Text Processing
 
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...Big Data Spain 2017  - Deriving Actionable Insights from High Volume Media St...
Big Data Spain 2017 - Deriving Actionable Insights from High Volume Media St...
 
AINL 2016: Kravchenko
AINL 2016: KravchenkoAINL 2016: Kravchenko
AINL 2016: Kravchenko
 
Embracing diversity searching over multiple languages
Embracing diversity  searching over multiple languagesEmbracing diversity  searching over multiple languages
Embracing diversity searching over multiple languages
 
OpenNLP demo
OpenNLP demoOpenNLP demo
OpenNLP demo
 
Introduction to python
Introduction to pythonIntroduction to python
Introduction to python
 
NLTK
NLTKNLTK
NLTK
 
Natural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine TranslationNatural Language to Visualization by Neural Machine Translation
Natural Language to Visualization by Neural Machine Translation
 
From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?From TREC to Watson: is open domain question answering a solved problem?
From TREC to Watson: is open domain question answering a solved problem?
 
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf EremyanDataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
DataFest 2017. Introduction to Natural Language Processing by Rudolf Eremyan
 
Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...Apply chinese radicals into neural machine translation: deeper than character...
Apply chinese radicals into neural machine translation: deeper than character...
 
Question Answering - Application and Challenges
Question Answering - Application and ChallengesQuestion Answering - Application and Challenges
Question Answering - Application and Challenges
 

Similar to The State of #NLProc

Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Dhruv Gohil
 
Lacey Liu SDE II Resume
Lacey Liu SDE II ResumeLacey Liu SDE II Resume
Lacey Liu SDE II ResumeLacey (Xi) Liu
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsShreyas Suresh Rao
 
Agile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business Agility
Agile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business AgilityAgile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business Agility
Agile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business AgilityAgileNetwork
 
A Strong Object Recognition Using Lbp, Ltp And Rlbp
A Strong Object Recognition Using Lbp, Ltp And RlbpA Strong Object Recognition Using Lbp, Ltp And Rlbp
A Strong Object Recognition Using Lbp, Ltp And RlbpRikki Wright
 
Sudipta_Mukherjee_Resume-Nov_2022.pdf
Sudipta_Mukherjee_Resume-Nov_2022.pdfSudipta_Mukherjee_Resume-Nov_2022.pdf
Sudipta_Mukherjee_Resume-Nov_2022.pdfSudipta Mukherjee
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and TransformerArvind Devaraj
 
Image Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLImage Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLAmazon Web Services
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsMaxime Lefrançois
 
Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingSeth Grimes
 
Evolution Of Object Oriented Technology
Evolution Of Object Oriented TechnologyEvolution Of Object Oriented Technology
Evolution Of Object Oriented TechnologySharon Roberts
 
Sudipta_Mukherjee_Resume_APR_2023.pdf
Sudipta_Mukherjee_Resume_APR_2023.pdfSudipta_Mukherjee_Resume_APR_2023.pdf
Sudipta_Mukherjee_Resume_APR_2023.pdfsudipto801
 
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
2015 bioinformatics python_introduction_wim_vancriekinge_vfinalProf. Wim Van Criekinge
 
Machine Learning with Spark
Machine Learning with SparkMachine Learning with Spark
Machine Learning with Sparkelephantscale
 
Всеволод Демкин "Natural language processing на практике"
Всеволод Демкин "Natural language processing на практике"Всеволод Демкин "Natural language processing на практике"
Всеволод Демкин "Natural language processing на практике"GeeksLab Odessa
 
NLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inNLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inKumari Naveen
 
lect36-tasks.ppt
lect36-tasks.pptlect36-tasks.ppt
lect36-tasks.pptHaHa501620
 
Top 11 python frameworks for machine learning and deep learning
Top 11 python frameworks for machine learning and deep learningTop 11 python frameworks for machine learning and deep learning
Top 11 python frameworks for machine learning and deep learningThinkTanker Technosoft PVT LTD
 

Similar to The State of #NLProc (20)

Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical Nautral Langauge Processing - Basics / Non Technical
Nautral Langauge Processing - Basics / Non Technical
 
Lacey Liu SDE II Resume
Lacey Liu SDE II ResumeLacey Liu SDE II Resume
Lacey Liu SDE II Resume
 
Natural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application TrendsNatural Language Processing - Research and Application Trends
Natural Language Processing - Research and Application Trends
 
Agile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business Agility
Agile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business AgilityAgile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business Agility
Agile Mumbai 2022 - Vikesh Morye | Transfer Learning for Business Agility
 
A Strong Object Recognition Using Lbp, Ltp And Rlbp
A Strong Object Recognition Using Lbp, Ltp And RlbpA Strong Object Recognition Using Lbp, Ltp And Rlbp
A Strong Object Recognition Using Lbp, Ltp And Rlbp
 
Sudipta_Mukherjee_Resume-Nov_2022.pdf
Sudipta_Mukherjee_Resume-Nov_2022.pdfSudipta_Mukherjee_Resume-Nov_2022.pdf
Sudipta_Mukherjee_Resume-Nov_2022.pdf
 
Deep learning for NLP and Transformer
 Deep learning for NLP  and Transformer Deep learning for NLP  and Transformer
Deep learning for NLP and Transformer
 
Image Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDLImage Recognition on AWS with Apache Spark and BigDL
Image Recognition on AWS with Apache Spark and BigDL
 
Overview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developmentsOverview of the SPARQL-Generate language and latest developments
Overview of the SPARQL-Generate language and latest developments
 
Recent Advances in Natural Language Processing
Recent Advances in Natural Language ProcessingRecent Advances in Natural Language Processing
Recent Advances in Natural Language Processing
 
Evolution Of Object Oriented Technology
Evolution Of Object Oriented TechnologyEvolution Of Object Oriented Technology
Evolution Of Object Oriented Technology
 
Sudipta_Mukherjee_Resume_APR_2023.pdf
Sudipta_Mukherjee_Resume_APR_2023.pdfSudipta_Mukherjee_Resume_APR_2023.pdf
Sudipta_Mukherjee_Resume_APR_2023.pdf
 
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
2015 bioinformatics python_introduction_wim_vancriekinge_vfinal
 
Machine Learning with Spark
Machine Learning with SparkMachine Learning with Spark
Machine Learning with Spark
 
Всеволод Демкин "Natural language processing на практике"
Всеволод Демкин "Natural language processing на практике"Всеволод Демкин "Natural language processing на практике"
Всеволод Демкин "Natural language processing на практике"
 
NLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful inNLP Tasks and Applications.ppt useful in
NLP Tasks and Applications.ppt useful in
 
lect36-tasks.ppt
lect36-tasks.pptlect36-tasks.ppt
lect36-tasks.ppt
 
Top 11 python frameworks for machine learning and deep learning
Top 11 python frameworks for machine learning and deep learningTop 11 python frameworks for machine learning and deep learning
Top 11 python frameworks for machine learning and deep learning
 
Spark meetup TCHUG
Spark meetup TCHUGSpark meetup TCHUG
Spark meetup TCHUG
 
eScience Cluster Arch. Overview
eScience Cluster Arch. OvervieweScience Cluster Arch. Overview
eScience Cluster Arch. Overview
 

More from Vsevolod Dyomkin

Lisp in a Startup: the Good, the Bad, and the Ugly
Lisp in a Startup: the Good, the Bad, and the UglyLisp in a Startup: the Good, the Bad, and the Ugly
Lisp in a Startup: the Good, the Bad, and the UglyVsevolod Dyomkin
 
Loading Multiple Versions of an ASDF System in the Same Lisp Image
Loading Multiple Versions of an ASDF System in the Same Lisp ImageLoading Multiple Versions of an ASDF System in the Same Lisp Image
Loading Multiple Versions of an ASDF System in the Same Lisp ImageVsevolod Dyomkin
 
NLP in the WILD or Building a System for Text Language Identification
NLP in the WILD or Building a System for Text Language IdentificationNLP in the WILD or Building a System for Text Language Identification
NLP in the WILD or Building a System for Text Language IdentificationVsevolod Dyomkin
 
Sugaring Lisp for the 21st Century
Sugaring Lisp for the 21st CenturySugaring Lisp for the 21st Century
Sugaring Lisp for the 21st CenturyVsevolod Dyomkin
 
Lisp как универсальная обертка
Lisp как универсальная оберткаLisp как универсальная обертка
Lisp как универсальная оберткаVsevolod Dyomkin
 
Lisp for Python Programmers
Lisp for Python ProgrammersLisp for Python Programmers
Lisp for Python ProgrammersVsevolod Dyomkin
 
Tedxkyiv communication guidelines
Tedxkyiv communication guidelinesTedxkyiv communication guidelines
Tedxkyiv communication guidelinesVsevolod Dyomkin
 
Новые нереляционные системы хранения данных
Новые нереляционные системы хранения данныхНовые нереляционные системы хранения данных
Новые нереляционные системы хранения данныхVsevolod Dyomkin
 
Чему мы можем научиться у Lisp'а?
Чему мы можем научиться у Lisp'а?Чему мы можем научиться у Lisp'а?
Чему мы можем научиться у Lisp'а?Vsevolod Dyomkin
 
Экосистема Common Lisp
Экосистема Common LispЭкосистема Common Lisp
Экосистема Common LispVsevolod Dyomkin
 

More from Vsevolod Dyomkin (12)

NLP Project Full Circle
NLP Project Full CircleNLP Project Full Circle
NLP Project Full Circle
 
Lisp in a Startup: the Good, the Bad, and the Ugly
Lisp in a Startup: the Good, the Bad, and the UglyLisp in a Startup: the Good, the Bad, and the Ugly
Lisp in a Startup: the Good, the Bad, and the Ugly
 
Loading Multiple Versions of an ASDF System in the Same Lisp Image
Loading Multiple Versions of an ASDF System in the Same Lisp ImageLoading Multiple Versions of an ASDF System in the Same Lisp Image
Loading Multiple Versions of an ASDF System in the Same Lisp Image
 
NLP in the WILD or Building a System for Text Language Identification
NLP in the WILD or Building a System for Text Language IdentificationNLP in the WILD or Building a System for Text Language Identification
NLP in the WILD or Building a System for Text Language Identification
 
Sugaring Lisp for the 21st Century
Sugaring Lisp for the 21st CenturySugaring Lisp for the 21st Century
Sugaring Lisp for the 21st Century
 
CL-NLP
CL-NLPCL-NLP
CL-NLP
 
Lisp как универсальная обертка
Lisp как универсальная оберткаLisp как универсальная обертка
Lisp как универсальная обертка
 
Lisp for Python Programmers
Lisp for Python ProgrammersLisp for Python Programmers
Lisp for Python Programmers
 
Tedxkyiv communication guidelines
Tedxkyiv communication guidelinesTedxkyiv communication guidelines
Tedxkyiv communication guidelines
 
Новые нереляционные системы хранения данных
Новые нереляционные системы хранения данныхНовые нереляционные системы хранения данных
Новые нереляционные системы хранения данных
 
Чему мы можем научиться у Lisp'а?
Чему мы можем научиться у Lisp'а?Чему мы можем научиться у Lisp'а?
Чему мы можем научиться у Lisp'а?
 
Экосистема Common Lisp
Экосистема Common LispЭкосистема Common Lisp
Экосистема Common Lisp
 

Recently uploaded

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024The Digital Insurer
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfOverkill Security
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKJago de Vreede
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Jeffrey Haguewood
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...apidays
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 

Recently uploaded (20)

Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Cyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdfCyberprint. Dark Pink Apt Group [EN].pdf
Cyberprint. Dark Pink Apt Group [EN].pdf
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUKSpring Boot vs Quarkus the ultimate battle - DevoxxUK
Spring Boot vs Quarkus the ultimate battle - DevoxxUK
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

The State of #NLProc

  • 1. State of the #NLProc Vsevolod Domkin Iforum 2017 2017-05-25
  • 2. About me * Lisp programmer * 5+ years of NLP work at Grammarly * (m8n)ware NLP consultancy * volunteer at lang-uk (http://lang.org.ua) * Lecturer on OS, Algortihms, NLProc (KPI, Projector, UCU) https://vseloved.github.io
  • 3. Outline - Intro to NLProc - Academic & industrial NLProc: theory & practice - Problems, algorithms, tools, datasets - Main directions of current and future development
  • 4. What is Natural Language Processing? Transforming free-form text into structured data and back since 1950
  • 5. What is Natural Language Processing? Transforming free-form text into structured data and back since 1950 Related, but not quite: - Compling - AI - Data Science - Regular expressions, Neural Networks...
  • 7. - dictionaries, lexicons, thesauri, ontologies - labelling (POS, NER, coref, semantic roles, ...) - parsing (constituency, dependency, semantic, abstract, ...) - modelling (language, topics, ...) - embedding (word2vec, doc2vec, ...) Computational Linguistics
  • 8. Data Plenty! - Dictionaries - Corpora (Penn Treebank, Brown, Europarl, Gigaword, ...) - Shared task resources
  • 9. Algorithms (classic) - Reliance on rich linguistic data and features - Ngram-based linear models - Shift-reduce-based parsing - PCA-based semantics
  • 10. Deep Learning “NLP is kind of like a rabbit in the headlights of the Deep Learning machine, waiting to be flattened.”
  • 11. Deep Learning - Vector-space word & document representations - Recurrent neural networks Haven't really taken the classic tasks by storm (like in Signal Processing), but seriously expanded what's possible
  • 12. DL for NLP Formula “Embed, encode, attend, predict” https://explosion.ai/blog/deep-learning-formula-nlp
  • 13. Tools Everything, basically, happens in 2 major languages: Python & Java A number of solid NLProc frameworks: Stanford, OpenNLP, Spacy, LingPipe A plethora of academic-quality targeted tools word2vec etc.
  • 14. Main end-user applications: Industrial NLProc Groupping & labelling Transformation Dialog - sentiment analysis - plagiarism detection - classification/ clustering - parts extractions - machine translation - summarization - error correction - generation - information retrieval - QA - conversational interfaces
  • 15. Data None :( (b/c verticals) Need to create your own: - Extract - Annotate - Crowdsource - Generate as by-product
  • 16. Algorithms Hybrid approaches Rule-based ML/DL - initial data collection - pre-/post-processing - power to domain experts - often needs lots of data - linguistic features + word vectors - simple ngram-based models - seq2seq neural network models
  • 17. Ex: Gmail smart reply - Data analysis - Traditional NLP processing - FNN - LSTM - Semi-supervised graph learning - Rule-based post-processing - Engineering
  • 19. What's next Some predictions: - Bots go bust - Deep learning goes commodity - AI is cleantech 2.0 for VCs - MLaaS dies a second death - Full stack vertical AI startups actually work http://www.bradfordcross.com/blog/2017/3/3/five-ai-startup -predictions-for-2017 What should work: - NLP automation - NLP + IR - ???