Understanding human language with Python
Alyona Medelyan
Who am I? 
Alyona Medelyan, aka @zelandiya
▪ In Natural Language Processing since 2000 
▪ PhD in NLP & Machine Learning from Waikato 
▪ Author of the open source keyword extraction algorithm Maui 
▪ Author of the most-cited 2009 journal survey “Mining Meaning with Wikipedia” 
▪ Past: Chief Research Officer at Pingar 
▪ Now: Founder of Entopix, NLP consultancy & software development
Agenda 
State of NLP 
Recap on fiction vs reality: Are we there yet? 
NLP Complexities 
Why is understanding language so complex? 
NLP using Python 
NLTK, Gensim, TextBlob & Co 
Building NLP applications 
A little bit of data science 
Other NLP areas 
And what’s coming next
State of NLP 
Fiction versus Reality
He (KITT) “always had an ego that was easy to bruise and displayed a 
very sensitive, but kind and dryly humorous personality.” - Wikipedia
Android Auto: “hands-free operation through voice commands 
will be emphasized to ensure safe driving”
“by putting this into one's ear one can instantly 
understand anything said in any language” (Hitchhiker Wiki)
Word Lens: “augmented reality translation”
Two girls use Google Translate to call a real Indian restaurant and order in Hindi… 
How did it go? www.youtube.com/watch?v=wxDRburxwz8
The LCARS (or simply library computer) … used sophisticated 
artificial intelligence routines to understand and execute vocal natural 
language commands (From Memory Alpha Wiki)
Let’s try out Google
“Samantha [the OS] proves to be constantly available, always curious and interested, supportive and undemanding”
Siri doesn’t seem to be as “available”
NLP Complexities 
Why is understanding language so complex?
Word segmentation complexities 
▪ 广大发展中国家一致支持这个目标,并提出了各自的期望细节。
▪ The first hot dogs were sold by Charles Feltman on Coney Island in 1870.
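A minimal sketch of why segmentation is hard, assuming a greedy "forward maximum matching" strategy and a toy vocabulary (both are illustrative assumptions, not something shown in the talk). With the spaces removed, the segmenter has to decide between "hot dogs" and "hotdogs" on its own:

```python
def max_match(text, vocabulary):
    """Greedy forward maximum matching: at each position take the longest known word."""
    tokens = []
    longest = max(len(word) for word in vocabulary)
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until a known word is found.
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # A single character is accepted as a fallback for unknown input.
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy vocabulary; a real segmenter would use a large lexicon or a trained model.
vocab = {"the", "first", "hot", "dogs", "hotdogs", "were", "sold"}
tokens = max_match("thefirsthotdogsweresold", vocab)
print(tokens)
```

Because the strategy always prefers the longest match, it picks "hotdogs" over "hot" followed by "dogs". A greedy rule like this cannot tell which reading the writer intended.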
Disambiguation complexities 
Flying planes can be dangerous
NLP using Python 
NLTK, Gensim, TextBlob & Co
What can we do with text?
[Diagram: text in, annotations out: sentiment, keywords, tags, genre, categories, taxonomy terms, entities, names, patterns, biochemical entities, …]
NLTK 
Python platform for NLP
How to get to the core words? 
Remove Stopwords with NLTK 
even the acting in transcendence is solid , with the dreamy 
depp turning in a typically strong performance 
i think that transcendence has a pretty solid acting, with the 
dreamy depp turning in a strong performance as he usually does 
>>> from nltk.corpus import stopwords  # needs the NLTK 'stopwords' corpus
>>> stop = stopwords.words('english')
>>> words = ['the', 'acting', 'in', 'transcendence', 'is',
...          'solid', 'with', 'the', 'dreamy', 'depp']
>>> # keep only the words that are not in the stopword list
>>> print([word for word in words if word not in stop])
['acting', 'transcendence', 'solid', 'dreamy', 'depp']
Getting closer to the meaning: 
Part of Speech tagging with NLTK 
Flying planes can be dangerous 
✓ 
>>> import nltk 
>>> from nltk.tokenize import word_tokenize 
>>> nltk.pos_tag(word_tokenize("Flying planes can be dangerous")) 
[('Flying', 'VBG'), ('planes', 'NNS'), ('can', 'MD'), 
('be', 'VB'), ('dangerous', 'JJ')]
Keyword scoring: 
TFxIDF 
TF: relative frequency of a term t in a document d
IDF: the inverse proportion of documents d in collection D mentioning term t
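The two definitions above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming the common formulation tf × log(N / df) with a natural logarithm. The log base, the toy documents and the function name `tf_idf` are assumptions made for this example, not taken from the slides.

```python
import math
from collections import Counter


def tf_idf(term, document, collection):
    """Score a term: relative frequency in the document times log inverse document proportion."""
    # TF: how often the term occurs in this document, relative to its length.
    tf = Counter(document)[term] / len(document)
    # DF: number of documents in the collection that mention the term.
    df = sum(1 for doc in collection if term in doc)
    if df == 0:
        return 0.0
    # IDF: terms found in every document get log(1) = 0.
    idf = math.log(len(collection) / df)
    return tf * idf


# Toy collection of tokenized documents (made up for illustration).
docs = [
    ["the", "film", "was", "a", "comedy"],
    ["the", "film", "had", "violence"],
    ["the", "film", "was", "long"],
]
score_film = tf_idf("film", docs[0], docs)
score_comedy = tf_idf("comedy", docs[0], docs)
print(score_film, score_comedy)
```

A word such as "film" that appears in every document scores zero, while "comedy", which appears in only one, scores higher. That is the behaviour that makes the score useful for picking keywords.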
TFxIDF with Gensim 
from nltk.corpus import movie_reviews
from gensim import corpora, models

# one list of tokens per review
texts = []
for fileid in movie_reviews.fileids():
    texts.append(list(movie_reviews.words(fileid)))

dictionary = corpora.Dictionary(texts)                  # maps each word to an integer id
corpus = [dictionary.doc2bow(text) for text in texts]   # bag-of-words vector per review
tfidf = models.TfidfModel(corpus)
TFxIDF with Gensim (Results) 
for word in ['film', 'movie', 'comedy', 'violence', 'jolie']:
    my_id = dictionary.token2id.get(word)   # integer id assigned by the Dictionary
    print(word, '\t', tfidf.idfs[my_id])    # inverse document frequency of the word
film 0.190174003903 
movie 0.364013496254 
comedy 1.98564470702 
violence 3.2108967825 
jolie 6.96578428466
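The IDF figures above can be turned back into approximate document counts. A minimal sketch, assuming the IDF was computed as log2(N / df) and that the corpus holds N = 2000 reviews. Both are assumptions made for this example, and `doc_frequency` is a hypothetical helper name.

```python
# IDF values copied from the results listed above.
idfs = {
    "film": 0.190174003903,
    "movie": 0.364013496254,
    "comedy": 1.98564470702,
    "violence": 3.2108967825,
    "jolie": 6.96578428466,
}

N = 2000  # assumed number of reviews in the corpus


def doc_frequency(idf, n_docs):
    """Invert idf = log2(n_docs / df) to estimate df, the number of documents with the word."""
    return round(n_docs / 2 ** idf)


doc_freq = {word: doc_frequency(idf, N) for word, idf in idfs.items()}
for word, count in doc_freq.items():
    print(word, count)
```

Under these assumptions "jolie" appears in about 16 of the reviews, while "film" appears in most of them. This is why "film" tells us little about any single review and "jolie" tells us a lot.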
Where does this text belong? 
Text Categorization with NLTK 
Entertainment 
TVNZ: “Obama and Hangover star trade insults in interview”
Politics 
>>> import nltk
>>> # document_features() turns a document into a feature dict; c is its category label
>>> train_set = [(document_features(d), c) for (d, c) in categorized_documents]
>>> classifier = nltk.NaiveBayesClassifier.train(train_set)
>>> doc_features = document_features(new_document)
>>> category = classifier.classify(doc_features)
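The snippet above relies on a `document_features` helper that the slide does not show. One plausible shape for it is a bag-of-words feature extractor that marks which vocabulary words occur in the document. The sketch below is an assumption, not the talk's actual helper: the second `vocabulary` argument and the toy word list are invented for illustration.

```python
def document_features(document_words, vocabulary):
    """Return a feature dict marking which vocabulary words occur in the document."""
    present = set(document_words)
    return {"contains({})".format(word): (word in present) for word in vocabulary}


# Hypothetical vocabulary; in practice this might be the most frequent words in the training data.
vocabulary = ["obama", "interview", "star", "election"]
headline = "obama and hangover star trade insults in interview".split()

features = document_features(headline, vocabulary)
print(features)
```

Feature dicts of this kind, mapping a feature name to a value, are the input format that NLTK classifiers are trained on and asked to classify.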
Sentiment analysis with TextBlob 
>>> from textblob import TextBlob 
>>> blob = TextBlob("I love this library") 
>>> blob.sentiment 
Sentiment(polarity=0.5, subjectivity=0.6) 
for review in transcendence:   # transcendence: list of review file paths
    with open(review) as f:
        blob = TextBlob(f.read())
    print(review, blob.sentiment.polarity)
../data/transcendence_1star.txt 0.0170799124247 
../data/transcendence_5star.txt 0.0874591503268 
../data/transcendence_8star.txt 0.256845238095 
../data/transcendence_10star.txt 0.304310344828
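A quick sanity check on the numbers above is to ask whether polarity rises with the star rating. The sketch below uses only the four values printed above. The variable names are made up for this example, and four reviews are far too few to draw general conclusions from.

```python
# Star rating -> TextBlob polarity, copied from the output listed above.
polarity_by_stars = {
    1: 0.0170799124247,
    5: 0.0874591503268,
    8: 0.256845238095,
    10: 0.304310344828,
}

# Order the polarities by star rating and compare each one with the next.
stars = sorted(polarity_by_stars)
polarities = [polarity_by_stars[s] for s in stars]
is_monotonic = all(low < high for low, high in zip(polarities, polarities[1:]))

print("polarity increases with star rating:", is_monotonic)
```

For these four reviews the order holds, even though every score is positive, including the 1-star review. The ranking looks more informative here than the absolute values.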
Building NLP applications 
A little bit of data science
Keyword extraction in 3h: 
Understanding a movie review 
…four of the biggest directors in hollywood : quentin 
tarantino , robert rodriguez , … were all directing one big film 
with a big and popular cast ...the second room ( jennifer 
beals ) was better , but lacking in plot ... the bumbling and 
mumbling bellboy … ruins every joke in the film … 
bellboy 
jennifer beals 
four rooms 
beals 
rooms 
tarantino 
madonna 
antonio banderas 
valeria golino 
github.com/zelandiya/KiwiPyCon-NLP-tutorial
Keyword extraction on 2000 movie reviews: 
What makes a successful movie? 
Negative            Positive
van damme           star wars
zeta – jones        disney
smith               war
batman              de niro
de palma            jackie
eddie murphy        alien
killer              jackie chan
tommy lee jones     private ryan
wild west           truman show
mars                ben stiller
murphy              cameron
ship                science fiction
space               cameron diaz
brothers            fiction
de bont             jack
...                 ...
How can NLP help a beer drinker? 
Sweaty Horse Blanket: Processing the Natural Language of Beer 
by Ben Fields 
vimeo.com/96809735
Other NLP areas 
What’s coming next?
Filling the gaps in machine understanding 
… Jack Ruby, who killed J.F.Kennedy's assassin Lee Harvey Oswald. … 
/m/0d3k14 
/m/044sb 
/m/0d3k14 
Freebase
What’s next? 
Vs.
Conclusions: 
Understanding human language with Python 
Are we there yet? 
NLTK: nltk.org 
Gensim: radimrehurek.com/gensim 
scikit-learn: scikit-learn.org/stable 
Theano: deeplearning.net/software/theano 
TextBlob: textblob.readthedocs.org 
@zelandiya #nlproc

KiwiPyCon 2014 talk - Understanding human language with Python

Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdf
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - Infographic
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTV
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 

KiwiPyCon 2014 talk - Understanding human language with Python

  • 1. Understanding human language with Python Alyona Medelyan
  • 2. Who am I? Alyona Medelyan aka @zelandiya ▪ In Natural Language Processing since 2000 ▪ PhD in NLP & Machine Learning from Waikato ▪ Author of the open source keyword extraction algorithm Maui ▪ Author of the most-cited 2009 journal survey “Mining Meaning with Wikipedia” ▪ Past: Chief Research Officer at Pingar ▪ Now: Founder of Entopix, NLP consultancy & software development
  • 3. Agenda State of NLP Recap on fiction vs reality: Are we there yet? NLP Complexities Why is understanding language so complex? NLP using Python NLTK, Gensim, TextBlob & Co Building NLP applications A little bit of data science Other NLP areas And what’s coming next
  • 4. State of NLP Fiction versus Reality
  • 5. He (KITT) “always had an ego that was easy to bruise and displayed a very sensitive, but kind and dryly humorous personality.” - Wikipedia
  • 6. Android Auto: “hands-free operation through voice commands will be emphasized to ensure safe driving”
  • 7. “by putting this into one's ear one can instantly understand anything said in any language” (Hitchhiker Wiki)
  • 9. Two girls use Google Translate to call a real Indian restaurant and order in Hindi… How did it go? www.youtube.com/watch?v=wxDRburxwz8
  • 10. The LCARS (or simply library computer) … used sophisticated artificial intelligence routines to understand and execute vocal natural language commands (From Memory Alpha Wiki)
  • 11. Let’s try out Google
  • 12. “Samantha [the OS] proves to be constantly available, always curious and interested, supportive and undemanding”
  • 13. Siri doesn’t seem to be as “available”
  • 14. NLP Complexities Why is understanding language so complex?
  • 15.
  • 16. Word segmentation complexities ▪ 广大发展中国家一致支持这个目标,并提出了各自的期望细节。 ▪ 广大发展中国家一致支持这个目标,并提出了各自的期望细节。 ▪ The first hot dogs were sold by Charles Feltman on Coney Island in 1870. ▪ The first hot dogs were sold by Charles Feltman on Coney Island in 1870.
  • 17. Disambiguation complexities Flying planes can be dangerous
  • 18. NLP using Python NLTK, Gensim, TextBlob & Co
  • 19. What can we do with text? Extract: sentiment, keywords, tags, genre, categories, taxonomy terms, entities, names, patterns, biochemical entities, …
  • 21. How to get to the core words? Remove Stopwords with NLTK
      even the acting in transcendence is solid , with the dreamy depp turning in a typically strong performance
      i think that transcendence has a pretty solid acting, with the dreamy depp turning in a strong performance as he usually does
      >>> # one-time setup: nltk.download('stopwords')
      >>> from nltk.corpus import stopwords
      >>> stop = stopwords.words('english')
      >>> words = ['the', 'acting', 'in', 'transcendence', 'is', 'solid', 'with', 'the', 'dreamy', 'depp']
      >>> print([word for word in words if word not in stop])
      ['acting', 'transcendence', 'solid', 'dreamy', 'depp']
  • 22. Getting closer to the meaning: Part of Speech tagging with NLTK
      Flying planes can be dangerous ✓
      >>> import nltk
      >>> from nltk.tokenize import word_tokenize
      >>> nltk.pos_tag(word_tokenize("Flying planes can be dangerous"))
      [('Flying', 'VBG'), ('planes', 'NNS'), ('can', 'MD'), ('be', 'VB'), ('dangerous', 'JJ')]
  • 23. Keyword scoring: TFxIDF Relative frequency of a term t in a document d The inverse proportion of documents d in collection D mentioning term t
  • 24. TFxIDF with Gensim
      from nltk.corpus import movie_reviews
      from gensim import corpora, models
      # Collect one token list per review in the NLTK movie_reviews corpus
      texts = []
      for fileid in movie_reviews.fileids():
          texts.append(list(movie_reviews.words(fileid)))
      # Map tokens to ids, convert each review to bag-of-words, fit the TFxIDF model
      dictionary = corpora.Dictionary(texts)
      corpus = [dictionary.doc2bow(text) for text in texts]
      tfidf = models.TfidfModel(corpus)
  • 25. TFxIDF with Gensim (Results)
      for word in ['film', 'movie', 'comedy', 'violence', 'jolie']:
          my_id = dictionary.token2id.get(word)
          print(word, '\t', tfidf.idfs[my_id])
      film 0.190174003903
      movie 0.364013496254
      comedy 1.98564470702
      violence 3.2108967825
      jolie 6.96578428466
  • 26. Where does this text belong? Text Categorization with NLTK
      Entertainment TVNZ: “Obama and Hangover star trade insults in interview” Politics
      >>> train_set = [(document_features(d), c) for (d,c) in categorized_documents]
      >>> classifier = nltk.NaiveBayesClassifier.train(train_set)
      >>> doc_features = document_features(new_document)
      >>> category = classifier.classify(doc_features)
  • 27. Sentiment analysis with TextBlob
      >>> from textblob import TextBlob
      >>> blob = TextBlob("I love this library")
      >>> blob.sentiment
      Sentiment(polarity=0.5, subjectivity=0.6)
      for review in transcendence:
          blob = TextBlob(open(review).read())
          print(review, blob.sentiment.polarity)
      ../data/transcendence_1star.txt 0.0170799124247
      ../data/transcendence_5star.txt 0.0874591503268
      ../data/transcendence_8star.txt 0.256845238095
      ../data/transcendence_10star.txt 0.304310344828
  • 28. Building NLP applications A little bit of data science
  • 29. Keyword extraction in 3h: Understanding a movie review …four of the biggest directors in hollywood : quentin tarantino , robert rodriguez , … were all directing one big film with a big and popular cast ...the second room ( jennifer beals ) was better , but lacking in plot ... the bumbling and mumbling bellboy … ruins every joke in the film … bellboy jennifer beals four rooms beals rooms tarantino madonna antonio banderas valeria golino github.com/zelandiya/KiwiPyCon-NLP-tutorial
  • 30. Keyword extraction on 2000 movie reviews: What makes a successful movie?
      Negative: van damme zeta – jones smith batman de palma eddie murphy killer tommy lee jones wild west mars murphy ship space brothers de bont ...
      Positive: star wars disney war de niro jackie alien jackie chan private ryan truman show ben stiller cameron science fiction cameron diaz fiction jack ...
  • 31. How can NLP help a beer drinker? Sweaty Horse Blanket: Processing the Natural Language of Beer by Ben Fields vimeo.com/96809735
  • 32.
  • 33. Other NLP areas What’s coming next?
  • 34. Filling the gaps in machine understanding … Jack Ruby, who killed J.F.Kennedy's assassin Lee Harvey Oswald. … /m/0d3k14 /m/044sb /m/0d3k14 Freebase
  • 36. Conclusions: Understanding human language with Python NLTK nltk.org Are we there yet? radimrehurek.com/gensim scikit-learn.org/stable deeplearning.net/software/theano textblob.readthedocs.org @zelandiya #nlproc
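
Slide 23 defines TFxIDF in words: the relative frequency of a term in a document, multiplied by the inverse proportion of documents that mention it. The sketch below works that definition through on a tiny made-up collection, using only the standard library. The function names, the toy documents and the base-2 logarithm are assumptions made for illustration. Base 2 is chosen because it appears to match Gensim's default IDF, but the slides do not say which base was used, and Gensim's model may also normalise the resulting vectors, which this sketch does not do.

```python
import math
from collections import Counter


def tf(term, doc):
    """Relative frequency of `term` in one tokenized document."""
    return Counter(doc)[term] / len(doc)


def idf(term, docs):
    """log2(N / df), where df of the N documents mention `term`.

    Returns 0.0 for a term that occurs nowhere (an illustrative choice).
    """
    df = sum(1 for doc in docs if term in doc)
    return math.log2(len(docs) / df) if df else 0.0


def tfidf(term, doc, docs):
    """TFxIDF score of `term` in `doc`, relative to the collection `docs`."""
    return tf(term, doc) * idf(term, docs)


# Toy collection: made-up token lists, not the movie_reviews corpus.
docs = [
    ["a", "solid", "film", "with", "a", "strong", "cast"],
    ["the", "film", "is", "a", "comedy"],
    ["a", "violent", "film"],
    ["a", "comedy", "about", "planes"],
]

for word in ["film", "comedy", "planes"]:
    print(word, round(idf(word, docs), 3), round(tfidf(word, docs[1], docs), 3))
```

As with the Gensim results on slide 25, a word that occurs in most documents ("film") gets a low IDF, while a rarer word ("planes") gets a higher one. A word absent from a document scores zero there regardless of its IDF.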

Editor's notes

  1. Let’s start with fiction. Here we have Knight Rider’s car, which not only communicates with David Hasselhoff’s character, but is also great at dry humour.
  2. I don’t know about humour, but when I was at Google I/O this year, I enjoyed the demo of Android Auto, a system that will soon be integrated into most cars. You can ask by voice for things like opening times, recommendations and, of course, directions.
  3. Back to fiction again: those who have read The Hitchhiker’s Guide to the Galaxy will still remember the Babel Fish, a creature that you insert into your ear for instant translation.
  4. Reality’s answer to that is probably Word Lens, an app that translates short bits of text offline while keeping the font rendering. We showed it off at a party in Germany last month, and one person said “it’s magic!”
  5. Those who’ve seen Star Trek will remember the library computer everybody talks to.
  6. Well, Google actually is pretty much already there. Let’s try it out.
  7. And finally, computers and love. In this recent movie, this guy forms a relationship with an operating system. She is the perfect girlfriend, and she has Scarlett Johansson’s voice.
  8. A guy named Joshua tried to do this with Siri. Unfortunately, she told him that her “Licensing agreement does not cover marriage”.
  9. I would like to finish this talk with a funny story from another event I got a chance to attend several years ago. It was the unconference Foo Camp in the US, and I was brave enough to hold a session. I called it “NLP: Are we there yet?”, by which I meant “Are NLP algorithms good enough to be used in commercial-grade applications?” I was waiting in the room, and the first person to come in was a guy who, I remembered, had introduced himself as the security guide at Burning Man, a festival in Nevada. I asked him whether he was after the NLP that stands for “Natural Language Processing”, … When he left, I was terrified. What happens now? The next person who walked in asked, “Is this Natural Language Processing?” I said, yes, it is, phew! The room filled up quickly, and I was star-struck when Andrew Ng, the Director of the Stanford AI Lab, walked in. The discussion was lively and we concluded that NLP is there! Particularly the data-driven algorithms that use machine learning, such as “deep learning”. I talked today about NLTK, Gensim and TextBlob, but I encourage you to also look into scikit-learn and Theano, libraries that do exactly that.