1. Introducing Priberam Labs:
Machine Learning and Natural Language Processing
André Martins
IST, Lisbon, November 22nd, 2012
André Martins (Priberam/IT) Introducing Priberam Labs IST 22/11/2012
2. Collaborators
Mário Figueiredo, Noah Smith, Pedro Aguiar, Eric Xing, Miguel Almeida.
3. Outline
1 Introduction
What is Priberam?
What are the Priberam Labs?
2 Research at Priberam Labs
3 Master’s Projects
4 Academia Partnerships
6. What is Priberam?
A spin-off from IST founded in 1989
R&D in the area of language technologies
Microsoft Gold Certified Partner, PME Líder, PME Inovadora COTEC
Some of our clients: [client logos shown on the slide]
7. Online Dictionary
(http://www.priberam.pt/dlpo — 1M page-views per day)
8. Grammar Checker
(http://www.flip.pt)
9. Legal Search
(http://www.legix.pt)
12. Newswire Search
(http://www.dn.pt, http://www.jn.pt, http://www.tsf.pt)
[Screenshot with two callouts: question, answer]
16. What are the Priberam Labs?
Every day we deal with challenging and stimulating problems, some of
them unanswered by current scientific knowledge
Our key areas: Natural Language Processing and Machine Learning
Our goals:
advance the state of the art in NLP and ML
incorporate the resulting innovations in new products
promote collaborations with other researchers in academia
18. Our Research Interests
Natural Language Processing
Machine Learning
Structured Prediction
Graphical Models
21. Natural Language Processing
Goal: make machines capable of “understanding” human language.
Information Retrieval
Machine Translation
Syntactic Parsing
Semantic Parsing
Speech Recognition
...
24. The Empirical “Revolution” in NLP
Until the 1980s: rule-based methods were prevalent in AI
Since the mid 1990s: statistical methods, corpus linguistics
Today: emphasis on machine learning and large-scale data processing
“The unreasonable effectiveness of data”, Halevy et al. 2009
37. Machine Learning
Goal: build systems that learn from the data.
Input set X and output set Y
Learn a classifier h : X → Y from a set of labeled examples $\{(x_i, y_i)\}_{i=1}^{N} \subseteq X \times Y$
Given an unseen example x ∈ X, predict y = h(x)
Many approaches: decision trees, neural networks, nearest neighbors,
naive Bayes, logistic regression, support vector machines, ...
Many learning formalisms: supervised, unsupervised, semi-supervised,
weakly-supervised, active, online, reinforcement, ...
Mitchell (1997); Manning and Schütze (1999); Schölkopf and Smola (2002); Bishop (2006)
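The setup above (labeled examples in, classifier h out) can be sketched with one of the listed approaches, nearest neighbors. This is only an illustration: the function name `train_nearest_neighbor` and the toy data are assumptions, not part of the talk.

```python
import math


def train_nearest_neighbor(examples):
    """Build a classifier h: X -> Y from labeled pairs (x_i, y_i)."""
    examples = list(examples)  # "learning" here is just memorizing the training set

    def h(x):
        # Predict the label of the closest training input (Euclidean distance).
        _, label = min(examples, key=lambda pair: math.dist(pair[0], x))
        return label

    return h


# Hypothetical toy training set: 2-D inputs with a polarity label.
training_set = [
    ((0.0, 0.0), "negative"),
    ((0.2, 0.1), "negative"),
    ((1.0, 1.0), "positive"),
    ((0.9, 1.2), "positive"),
]

h = train_nearest_neighbor(training_set)
print(h((0.8, 0.9)))  # unseen example x, prediction y = h(x)
```

The same interface (train on pairs, return a function h) applies to the other approaches on the slide; only the body of `h` changes.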
42. Structured Prediction
Language is structured, complex, and ambiguous.
The input set X is typically structured (a string, an acoustic signal, etc.)
Often: the output set Y is also structured (a string, a parse tree, etc.)
Some problems:
How to decode structured outputs?
How to learn models for structured prediction?
How to learn the structure itself?
Lafferty et al. (2001); Taskar et al. (2003); Altun et al. (2003); Tsochantaridis et al. (2004)
49. Example: Part-of-Speech Tagging
Goal: given a sentence, determine the part-of-speech tag of each word.
Word:        Time   flies         like          an    arrow
Candidates:  Noun   Noun? Verb?   Prep? Verb?   Det   Noun
Tagging:     Noun   Verb          Prep          Det   Noun
Rule-based systems (Brill, 1993)
Hidden Markov models (Brants, 2000)
Conditional random fields (Lafferty et al., 2001)
55. Graphical Models
Inspired by statistical mechanics (Ising, 1925; Potts, 1952)
Applications in coding theory, vision, computational biology, ...
(Tanner, 1981; Pearl, 1988; Kschischang et al., 2001; Koller and Friedman, 2009)
MAP Inference: obtain the most likely configuration.
Graphs without cycles: dynamic programming (Viterbi, 1967)
In general NP-hard!
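For a chain (a graph without cycles), MAP inference by dynamic programming can be sketched as follows, reusing the tagging sentence from the earlier example. All scores below are hand-set, hypothetical log-scores chosen for illustration; they are not estimated from data and do not come from the talk.

```python
def viterbi(words, tags, emission, transition, default=-10.0):
    """Return (best tag sequence, its score) for a chain-structured model."""
    # best[t][tag]: score of the best tagging of words[:t + 1] that ends in `tag`.
    best = [{tag: emission.get((words[0], tag), default) for tag in tags}]
    back = []  # back[t - 1][tag]: best previous tag when words[t] gets `tag`
    for word in words[1:]:
        scores, pointers = {}, {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[-1][p] + transition.get((p, tag), default))
            scores[tag] = (best[-1][prev] + transition.get((prev, tag), default)
                           + emission.get((word, tag), default))
            pointers[tag] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag.
    last = max(tags, key=lambda t: best[-1][t])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


TAGS = ["Noun", "Verb", "Prep", "Det"]
# Hypothetical scores; any pair not listed gets the `default` penalty.
EMISSION = {
    ("time", "Noun"): 0.0, ("time", "Verb"): -2.0,
    ("flies", "Noun"): -1.0, ("flies", "Verb"): -1.0,
    ("like", "Prep"): -1.0, ("like", "Verb"): -1.0,
    ("an", "Det"): 0.0, ("arrow", "Noun"): 0.0,
}
TRANSITION = {
    ("Noun", "Verb"): -0.5, ("Noun", "Noun"): -1.5, ("Noun", "Prep"): -1.0,
    ("Verb", "Prep"): -0.5, ("Verb", "Det"): -0.5, ("Verb", "Verb"): -3.0,
    ("Prep", "Det"): -0.5, ("Det", "Noun"): 0.0,
}

path, score = viterbi("time flies like an arrow".split(), TAGS, EMISSION, TRANSITION)
print(path, score)  # ['Noun', 'Verb', 'Prep', 'Det', 'Noun'] -3.5
```

The cost is linear in the sentence length and quadratic in the number of tags, which is what makes exact decoding feasible on chains; on graphs with cycles this recursion no longer applies.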
61. AD³ Algorithm (Martins et al., 2010a, 2011a)
“Alternating Directions Dual Decomposition.”
An approximate MAP inference algorithm based on an LP relaxation
Fundamental idea: decompose the graph into parts; at each iteration t, solve local subproblems and promote a consensus on the overlaps
Convergence rate O(1/t)
Can tackle combinatorial parts and first-order logic constraints
Code available at: http://www.ark.cs.cmu.edu/AD3
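The "local subproblems plus consensus" idea can be written in the generic consensus form used by alternating-direction methods. This is a textbook-style sketch, not the exact AD³ formulation from the cited papers: the symbols f_s, z_s, u, A_s, λ_s and η are notation assumed here for illustration.

```latex
% Assumed notation: parts s = 1..S; z_s is part s's local copy of its variables;
% u holds the shared variables; A_s selects the entries of u that part s touches;
% f_s is the local score of part s (minus infinity outside its feasible set).
\begin{align*}
  &\max_{z_1,\dots,z_S,\,u} \; \sum_{s=1}^{S} f_s(z_s)
     \quad \text{s.t.} \quad z_s = A_s u, \quad s = 1,\dots,S \\[4pt]
  &L_\eta(z, u, \lambda) = \sum_{s} f_s(z_s)
     - \sum_{s} \lambda_s^\top (z_s - A_s u)
     - \frac{\eta}{2} \sum_{s} \lVert z_s - A_s u \rVert^2 \\[4pt]
  &z_s^{t+1} = \arg\max_{z_s} \; f_s(z_s) - (\lambda_s^{t})^\top (z_s - A_s u^{t})
     - \frac{\eta}{2} \lVert z_s - A_s u^{t} \rVert^2
     && \text{(local subproblems)} \\
  &u^{t+1} = \arg\max_{u} \; L_\eta(z^{t+1}, u, \lambda^{t})
     && \text{(consensus on the overlaps)} \\
  &\lambda_s^{t+1} = \lambda_s^{t} + \eta \, (z_s^{t+1} - A_s u^{t+1})
     && \text{(multiplier update)}
\end{align*}
```

Each z-update involves a single part, so the parts can be solved independently; the quadratic penalty is what pushes the local copies to agree on the shared variables.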
62. Graphs are Everywhere
[Figure panels: Facebook graph, WWW graph, protein folding, image segmentation]
65. Syntactic Parsing
(Chomsky, 1965; Magerman, 1995; Charniak, 1996; Collins, 1999; Klein and Manning, 2003)
She solved the problem with the statistical method.
Grammar:
S --> NP VP
NP --> Pro
NP --> Det N
NP --> Det Nbar
Nbar --> Adj N
VP --> V NP PP
PP --> P NP
Det --> the
Pro --> She
N --> problem
N --> method
V --> solved
P --> with
Adj --> statistical
Parse tree:
(S (NP (Pro She)) (VP (V solved) (NP (Det the) (N problem)) (PP (P with) (NP (Det the) (Nbar (Adj statistical) (N method))))))
66. Syntactic Ambiguity
1 She employed the statistical method (the PP attaches to the verb):
(S (NP She) (VP (V solved) (NP the problem) (PP with the statistical method)))
2 The statistical method was broken (the PP attaches to the noun phrase):
(S (NP She) (VP (V solved) (NP (NP the problem) (PP with the statistical method))))
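The two readings can be checked mechanically by counting derivations. The sketch below uses the grammar from the parsing slide plus two rules, VP --> V NP and NP --> NP PP, which are read off the second tree and are an assumption about the intended grammar; the helper name `count_parses` is hypothetical.

```python
from functools import lru_cache

# Symbols that are not keys of the grammar are treated as words (terminals).
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Pro",), ("Det", "N"), ("Det", "Nbar"), ("NP", "PP")],
    "Nbar": [("Adj", "N")],
    "VP": [("V", "NP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "Det": [("the",)],
    "Pro": [("she",)],
    "N": [("problem",), ("method",)],
    "V": [("solved",)],
    "P": [("with",)],
    "Adj": [("statistical",)],
}


def count_parses(words, grammar=GRAMMAR, start="S"):
    """Count the distinct parse trees of `words` rooted in `start`."""
    words = tuple(w.lower() for w in words)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        # Number of ways `symbol` can produce words[i:j].
        if symbol not in grammar:  # terminal: must match exactly one word
            return 1 if j == i + 1 and words[i] == symbol else 0
        return sum(derive_seq(rhs, i, j) for rhs in grammar[symbol])

    @lru_cache(maxsize=None)
    def derive_seq(rhs, i, j):
        # Number of ways the symbol sequence `rhs` can produce words[i:j].
        if len(rhs) == 1:
            return derive(rhs[0], i, j)
        total = 0
        # The first symbol covers words[i:k]; every remaining symbol needs at least one word.
        for k in range(i + 1, j - len(rhs) + 2):
            left = derive(rhs[0], i, k)
            if left:
                total += left * derive_seq(rhs[1:], k, j)
        return total

    return derive(start, 0, len(words))


sentence = "She solved the problem with the statistical method".split()
print(count_parses(sentence))  # 2: one tree per attachment of the PP
```

The recursion terminates because every symbol must cover at least one word, so the left-recursive rule NP --> NP PP is always applied to a strictly shorter span.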
67. Dependency Syntax
(Pāṇini, 4th century BCE; Tesnière 1959; Hudson 1984; Mel’čuk 1988; Eisner 1996; McDonald et al. 2005; Nivre et al. 2006; Koo et al. 2007)
* She solved the problem with the statistical method
Tree obtained by “lexicalizing” the previous phrase-structure tree.
A lightweight syntactic formalism, without phrases
Grammar functions represented as lexical relationships
68. Turbo Parser (Martins et al., 2009, 2010b, 2011b)
A multilingual statistical dependency parser, which formulates parsing as inference in a graphical model.
Ignores global effects caused by the cycles of the graph
Same idea that underlies turbo decoders (Berrou et al., 1993)
Uses AD³ to solve the relaxation
State-of-the-art accuracies, extremely fast (1,200 words per second)
Code available at: http://www.ark.cs.cmu.edu/TurboParser
70. Ongoing Project: Summarization
Given a set of documents about an event, generate a brief summary.
72. Extractive Summarization
Just extract the most salient sentences.
Reward relevance and coverage, penalize redundancy
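One simple way to trade relevance and coverage against redundancy is greedy selection. The sketch below is an illustrative baseline, not the method used in the project: the scoring (a word counts as salient if it occurs in many sentences), the function name `summarize` and the `redundancy_penalty` parameter are all assumptions.

```python
def summarize(sentences, budget, redundancy_penalty=1.0):
    """Greedily pick up to `budget` sentences, returned in document order."""
    bags = [set(s.lower().split()) for s in sentences]

    # Hypothetical salience score: number of sentences a word appears in.
    doc_freq = {}
    for bag in bags:
        for word in bag:
            doc_freq[word] = doc_freq.get(word, 0) + 1

    chosen, covered = [], set()
    while len(chosen) < budget:
        candidates = [i for i in range(len(sentences)) if i not in chosen]
        if not candidates:
            break

        def gain(i):
            new_words = bags[i] - covered      # coverage: salient words not yet in the summary
            repeated = bags[i] & covered       # redundancy: words the summary already has
            return sum(doc_freq[w] for w in new_words) - redundancy_penalty * len(repeated)

        best = max(candidates, key=gain)
        chosen.append(best)
        covered |= bags[best]

    return [sentences[i] for i in sorted(chosen)]


documents = [
    "the storm hit lisbon on monday",
    "the storm hit lisbon on monday morning",
    "schools closed after the storm",
]
print(summarize(documents, budget=2))
```

On this toy input the near-duplicate first sentence is skipped once the second has been selected, because it adds no new words and only incurs the redundancy penalty.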
74. Compressive Summarization
Jointly extract and compress sentences.
Trade-off between informativeness, length, and grammaticality
75. Released Software
A multilingual part-of-speech tagger (TurboTagger)
A multilingual dependency parser (TurboParser)
An algorithm for approximate inference in graphical models (AD³)
http://www.ark.cs.cmu.edu/TurboParser
http://www.ark.cs.cmu.edu/AD3
77. Master’s Projects
Opinion Mining in Newspapers and Blogs
Text-Driven Forecasting
Recommendation Systems
Weakly Supervised Sentiment Analysis
81. Opinion Mining in Newspapers and Blogs
Build a system that extracts “opinions” from text in natural language.
Examples: opinions of politicians about controversial topics, user
reviews about products, opinions expressed in blogs and Twitter, etc.
Goal: a computer program that extracts opinions and identifies the
opinion holder, the aspect the opinion is about, and the
opinion polarity (positive or negative sentiment)
82. Example: Google Products
[Screenshot with two callouts: opinion snippets, aspects]
87. Text-Driven Forecasting
Example: a movie by a famous director has
premiered. Can we predict its gross revenue
given opinionated text?
“[...] a masterpiece in sheer
awfulness.” — Rotten Tomatoes
Goal: develop ML algorithms for predicting numeric quantities about
an event given a body of text.
Possible applications: predicting the revenue of movies, opinion
polls from blogs, stock volatility from financial reports, the number of
external links given a news article, etc.
94. Recommendation Systems
In many applications (e.g. movie rental systems) users assign ratings to
products according to their taste (from the lowest to the highest score)
These ratings can be seen as entries in a matrix (of N users by M movies)
[Matrix figure: one row per user, one column per movie; unknown entries are marked “?”]
Goal: fill the blanks (matrix completion).
Predict the rating that the ith user will assign to the jth movie based
on similar user/movie profiles: collaborative filtering
Recommend new movies to unseen users
99. Recommendation Systems
Netflix Prize: $1M for whoever improves Netflix’s Cinematch® by more than 10%
Winner: BellKor’s Pragmatic Chaos, 21/9/2009
Data: some entries of the user/movie matrix (training and test splits)
Evaluation metric: root mean squared error (RMSE)
Some possible approaches:
k-nearest neighbors (for some similarity metric)
probabilistic models with latent variables
low-rank matrix factorization
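The last approach and the evaluation metric can be sketched together: fit low-rank user and movie factors by stochastic gradient descent on the observed entries, then score with RMSE. The toy ratings, hyperparameters and function names below are assumptions for illustration; this is not the winning Netflix Prize system.

```python
import math
import random


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


def factorize(ratings, n_users, n_movies, rank=2, lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit R[i][j] ~ dot(P[i], Q[j]) on the observed entries only."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_movies)]
    for _ in range(epochs):
        for (i, j), r in ratings.items():
            err = r - sum(p * q for p, q in zip(P[i], Q[j]))
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                # Gradient step on the squared error with L2 regularization.
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q


def predict(P, Q, i, j):
    return sum(p * q for p, q in zip(P[i], Q[j]))


# Hypothetical 4 users x 4 movies; entries missing from the dict are the "?" cells.
ratings = {
    (0, 0): 5, (0, 1): 5, (0, 2): 1,
    (1, 0): 5, (1, 2): 1, (1, 3): 1,
    (2, 0): 1, (2, 2): 5, (2, 3): 5,
    (3, 1): 1, (3, 2): 5, (3, 3): 5,
}

P, Q = factorize(ratings, n_users=4, n_movies=4)
train_rmse = rmse([predict(P, Q, i, j) for (i, j) in ratings], list(ratings.values()))
print(f"training RMSE: {train_rmse:.3f}")
print(f"predicted rating for user 0, movie 3: {predict(P, Q, 0, 3):.2f}")
```

In a real evaluation the RMSE would be computed on a held-out test split, as in the competition, rather than on the training entries used here.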
106. Weakly Supervised Sentiment Analysis
Classify a product review as positive or negative.
“This camera takes poor quality photos. Yes, it’s slim and
lightweight. Yes, the shutter speed is snappy. But the photos are
of such poor quality that it’s a pretty useless camera.”
— Amazon.com
Data: a set of reviews along with product ratings.
Goal: an algorithm which, given as input a new product review, predicts
its polarity (positive or negative)
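A minimal baseline for this task, assuming star ratings are turned into polarity labels (4 or more stars positive, 2 or fewer negative, 3 discarded) and a bag-of-words naive Bayes model with add-one smoothing. The thresholds, the toy reviews and the function names are assumptions, not part of the project description.

```python
import math
from collections import Counter


def label_from_rating(stars):
    """Hypothetical mapping from a product rating to a polarity label."""
    if stars >= 4:
        return "positive"
    if stars <= 2:
        return "negative"
    return None  # neutral ratings are left out of training


def train_naive_bayes(rated_reviews):
    word_counts = {"positive": Counter(), "negative": Counter()}
    doc_counts = Counter()
    for text, stars in rated_reviews:
        label = label_from_rating(stars)
        if label is None:
            continue
        doc_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["positive"]) | set(word_counts["negative"])
    return word_counts, doc_counts, vocab


def polarity(model, text):
    word_counts, doc_counts, vocab = model
    scores = {}
    for label in word_counts:
        total = sum(word_counts[label].values())
        score = math.log(doc_counts[label] / sum(doc_counts.values()))  # class prior
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


rated_reviews = [
    ("great photos and snappy shutter", 5),
    ("slim lightweight and great quality", 4),
    ("poor quality photos", 1),
    ("useless camera with poor photos", 2),
]
model = train_naive_bayes(rated_reviews)
print(polarity(model, "the photos are of such poor quality"))  # negative
```

A bag-of-words model like this ignores word order, so it cannot see that "But" reverses the praise that precedes it in the camera review above.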
109. Weakly Supervised Sentiment Analysis
Consider a scenario with weak supervision: domain adaptation,
semi-supervised learning, language transfer, etc.
Possible tasks:
Classify movie reviews with a system trained on cellphone reviews
Train a system on English data and use it for reviews in Portuguese
What are the relevant features?
Adjectives? (not always helpful...)
Connective words: but, however, although,...
111. Academia Partnerships
CMU/Portugal
Seminars
Summer School (LxMLS)
Opportunity: Research Internships
113. CMU/Portugal
Dual PhD Program in Language Technologies
Priberam is an industrial partner
See how to apply at: http://www.cmuportugal.org
Note: deadline soon (December 15th)
115. Priberam Machine Learning Lunch Seminars
A series of informal meetings every two weeks at IST (Tuesdays 1PM)
Discussion forum involving different research groups interested in
machine learning
Everyone can attend, no registration needed
Delicious free food!
119. Lisbon Machine Learning School
An annual summer school held since 2011 devoted to ML and NLP
More than 100 participants from around the world (mostly MSc and PhD students)
Priberam Labs co-organizes and is one of the sponsors
Google is the main sponsor
Next year’s topic is Big Data
More information and videos of past lectures: http://lxmls.it.pt
120. Opportunity: Research Internships
We’re offering short-term research internships at Priberam Labs!
Who? MSc/PhD students wanting a short experience in industry
What? A stimulating research environment, connections to the
international ML and NLP research scene
How? Interns will work with us on a research project of their choice
Interested?
labs@priberam.com
121. Thank You!
More information about the Labs: http://labs.priberam.com
(You could be here.)
122. References I
Altun, Y., Tsochantaridis, I., and Hofmann, T. (2003). Hidden Markov support vector
machines. In Proc. of International Conference on Machine Learning.
Berrou, C., Glavieux, A., and Thitimajshima, P. (1993). Near Shannon limit error-correcting
coding and decoding. In Proc. of International Conference on Communications, volume 93,
pages 1064–1070.
Bishop, C. (2006). Pattern recognition and machine learning. Springer New York.
Brants, T. (2000). TnT: a statistical part-of-speech tagger. In Proc. of the Sixth Conference on
Applied Natural Language Processing.
Brill, E. (1993). A Corpus-Based Approach to Language Learning. PhD thesis, University of
Pennsylvania.
Charniak, E. (1996). Tree-bank grammars. In Proc. of the National Conference on Artificial
Intelligence, pages 1031–1036.
Chomsky, N. (1965). Aspects of the Theory of Syntax, volume 119. The MIT press.
Collins, M. (1999). Head-driven statistical models for natural language parsing. PhD thesis,
University of Pennsylvania.
Eisner, J. (1996). Three new probabilistic models for dependency parsing: An exploration. In
Proc. of International Conference on Computational Linguistics, pages 340–345.
Halevy, A., Norvig, P., and Pereira, F. (2009). The unreasonable effectiveness of data.
Intelligent Systems, IEEE, 24(2):8–12.
Hudson, R. (1984). Word grammar. Blackwell Oxford.
123. References II
Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik A Hadrons
and Nuclei, 31(1):253–258.
Klein, D. and Manning, C. (2003). Accurate unlexicalized parsing. In Proc. of Annual Meeting
of the Association for Computational Linguistics, pages 423–430.
Koller, D. and Friedman, N. (2009). Probabilistic Graphical Models: Principles and Techniques.
The MIT Press.
Koo, T., Globerson, A., Carreras, X., and Collins, M. (2007). Structured prediction models via
the matrix-tree theorem. In Empirical Methods for Natural Language Processing.
Kschischang, F. R., Frey, B. J., and Loeliger, H. A. (2001). Factor graphs and the sum-product
algorithm. IEEE Transactions on Information Theory, 47.
Lafferty, J., McCallum, A., and Pereira, F. (2001). Conditional random fields: Probabilistic
models for segmenting and labeling sequence data. In Proc. of International Conference on
Machine Learning.
Magerman, D. (1995). Statistical decision-tree models for parsing. In Proc. of Annual Meeting
of the Association for Computational Linguistics, pages 276–283.
Manning, C. and Schütze, H. (1999). Foundations of Statistical Natural Language Processing.
MIT Press, Cambridge, MA.
Martins, A. F. T., Figueiredo, M. A. T., Aguiar, P. M. Q., Smith, N. A., and Xing, E. P.
(2011a). An Augmented Lagrangian Approach to Constrained MAP Inference. In Proc. of
International Conference on Machine Learning.
124. References III
Martins, A. F. T., Smith, N. A., Aguiar, P. M. Q., and Figueiredo, M. A. T. (2011b). Dual
Decomposition with Many Overlapping Components. In Proc. of Empirical Methods for
Natural Language Processing.
Martins, A. F. T., Smith, N. A., and Xing, E. P. (2009). Concise Integer Linear Programming
Formulations for Dependency Parsing. In Proc. of Annual Meeting of the Association for
Computational Linguistics.
Martins, A. F. T., Smith, N. A., Xing, E. P., Aguiar, P. M. Q., and Figueiredo, M. A. T.
(2010a). Augmented Dual Decomposition for MAP Inference. In Neural Information
Processing Systems: Workshop in Optimization for Machine Learning.
Martins, A. F. T., Smith, N. A., Xing, E. P., Figueiredo, M. A. T., and Aguiar, P. M. Q.
(2010b). Turbo Parsers: Dependency Parsing by Approximate Variational Inference. In Proc.
of Empirical Methods for Natural Language Processing.
McDonald, R. T., Pereira, F., Ribarov, K., and Hajic, J. (2005). Non-projective dependency
parsing using spanning tree algorithms. In Proc. of Empirical Methods for Natural Language
Processing.
Mel’čuk, I. (1988). Dependency syntax: theory and practice. State University of New York Press.
Mitchell, T. (1997). Machine learning. McGraw Hill.
Nivre, J., Hall, J., Nilsson, J., Eryiğit, G., and Marinov, S. (2006). Labeled pseudo-projective
dependency parsing with support vector machines. In Proc. of International Conference on
Natural Language Learning.
Pearl, J. (1988). Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference.
Morgan Kaufmann.
125. References IV
Potts, R. (1952). Some generalized order-disorder transformations. In Proceedings of the
Cambridge Philosophical Society, volume 48, pages 106–109. Cambridge Univ Press.
Schölkopf, B. and Smola, A. J. (2002). Learning with Kernels. The MIT Press, Cambridge, MA.
Tanner, R. (1981). A recursive approach to low complexity codes. IEEE Transactions on
Information Theory, 27(5):533–547.
Taskar, B., Guestrin, C., and Koller, D. (2003). Max-margin Markov networks. In Proc. of
Neural Information Processing Systems.
Tesnière, L. (1959). Éléments de syntaxe structurale. Librairie C. Klincksieck.
Tsochantaridis, I., Hofmann, T., Joachims, T., and Altun, Y. (2004). Support vector machine
learning for interdependent and structured output spaces. In Proc. of International
Conference on Machine Learning.
Viterbi, A. (1967). Error bounds for convolutional codes and an asymptotically optimum
decoding algorithm. IEEE Transactions on Information Theory, 13(2):260–269.