Neural information retrieval and conversational question answering techniques are being used to build intelligent systems like conversational knowledge bases and ticketing systems. However, operationalizing deep learning models presents challenges regarding data needs, online usage, and interpretability. Combining neural models with linear models and term frequency-based approaches can help address these challenges, enabling reliable user experiences through one-shot learning and an editable knowledge base. User behavior like skimming content also requires interfaces that manage expectations and provide hybrid experiences.
Byron Galbraith, Chief Data Scientist, Talla, at MLconf SEA 2017
1. Byron Galbraith, PhD
Co-founder / Chief Data Scientist, Talla
MLConf Seattle
2017.05.19
Neural Information Retrieval & Conversational Question Answering
2. / 29
Intelligent Conversational Service Desk
Conversational Knowledge Base
Conversational Ticketing System
Intelligent Workflows
Talla gets smarter, faster.
Human in the Loop
Stay in control.
8. / 29
Neural Information Retrieval
[Figure: percentage of ACM SIGIR papers related to neural IR by year, based on manual inspection of the paper titles: 2014, 1%; 2015, 4%; 2016, 8%; 2017, 21%.]
Mitra and Craswell (2017)
16. / 29
Neural IR Resources
Mitra and Craswell (2017) Neural Models for Information Retrieval
https://arxiv.org/abs/1705.01509
Mitra and Craswell (2017) Neural Text Embeddings for IR, WSDM 2017 Tutorial
https://www.slideshare.net/BhaskarMitra3/neural-text-embeddings-for-information-retrieval-wsdm-2017
Zhang et al. (2016) Neural Information Retrieval: A Literature Review
https://arxiv.org/abs/1611.06792
Neu-IR Workshop at SIGIR
http://neu-ir.weebly.com/
18. / 29
Conversational Knowledge Base
Goal: Automatically and efficiently answer employees’ requests
Method:
1. Respond with high confidence answer from KB
2. Suggest up to four similar questions from KB
3. Provide easy path to service desk representatives
Enable rep to train
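The three-step method above can be read as a fallback chain: answer when confident, otherwise suggest, otherwise hand off. The sketch below illustrates that reading only. The names `KBEntry` and `respond`, both thresholds, and the Jaccard token overlap used as a similarity score are assumptions for illustration, not details from the talk.

```python
from dataclasses import dataclass


@dataclass
class KBEntry:
    question: str
    answer: str


def tokenize(text):
    """Lowercase and split on whitespace, ignoring question marks."""
    return set(text.lower().replace("?", " ").split())


def similarity(a, b):
    """Jaccard overlap of token sets; a stand-in for a real ranking model."""
    ta, tb = tokenize(a), tokenize(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def respond(query, kb, answer_threshold=0.8, suggest_threshold=0.3,
            max_suggestions=4):
    """Answer, suggest similar questions, or escalate (thresholds are hypothetical)."""
    scored = sorted(
        ((similarity(query, entry.question), entry) for entry in kb),
        key=lambda pair: pair[0],
        reverse=True,
    )
    # Step 1: respond directly when the best match is confident enough.
    if scored and scored[0][0] >= answer_threshold:
        return {"action": "answer", "answer": scored[0][1].answer}
    # Step 2: otherwise offer up to four similar questions from the KB.
    suggestions = [e.question for s, e in scored if s >= suggest_threshold]
    if suggestions:
        return {"action": "suggest", "questions": suggestions[:max_suggestions]}
    # Step 3: nothing useful found, so route to a service desk representative.
    return {"action": "escalate"}


kb = [
    KBEntry("How do I reset my password?", "Use the self-service portal."),
    KBEntry("How do I request vacation?", "Submit the time-off form."),
]
print(respond("How do I reset my password?", kb))
print(respond("reset password", kb))
print(respond("printer is broken", kb))
```

In a real deployment the escalation branch is also where a representative's reply could be written back to the knowledge base as a new question-answer pair.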
19. / 29
Word embeddings are susceptible to out of vocabulary terms
Problem: Out of Vocabulary Terms
Unseen at Training
Skipped for being too rare
Can be highly discriminative
What does cromulent mean?
What does bigly mean?
20. / 29
Word embeddings are susceptible to out of vocabulary terms
Problem: Out of Vocabulary Terms
Unseen at Training
Skipped for being too rare
Can be highly discriminative
What does UNK mean?
What does UNK mean?
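To make the collapse concrete, here is a minimal sketch of how two different questions become identical once rare or unseen terms are mapped to a shared UNK token. The toy corpus, the `min_count` cutoff, and the function names are assumptions for illustration.

```python
from collections import Counter

UNK = "UNK"


def build_vocab(corpus, min_count=2):
    """Keep only tokens seen at least min_count times (hypothetical cutoff)."""
    counts = Counter(tok for sent in corpus for tok in sent.lower().split())
    return {tok for tok, c in counts.items() if c >= min_count}


def encode(text, vocab):
    """Replace every token outside the vocabulary with UNK."""
    tokens = text.lower().rstrip("?").split()
    return [tok if tok in vocab else UNK for tok in tokens]


corpus = [
    "what does this mean",
    "what does that word mean",
    "what does cromulent mean",  # 'cromulent' appears once: skipped as too rare
]
vocab = build_vocab(corpus)

q1 = encode("What does cromulent mean?", vocab)  # rare at training time
q2 = encode("What does bigly mean?", vocab)      # unseen at training time
print(q1)
print(q2)
print(q1 == q2)
```

The one term that distinguishes the two questions is exactly the term the model cannot see, which is why such terms can be highly discriminative.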
21. / 29
OOV can be overcome through ensembling
Solution: Out of Vocabulary Terms
Infer embedding from local context
Ensemble with term frequency methods
Mitra et al. (2016)
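The ensembling idea can be sketched as a linear mixture of an embedding-based score and BM25, weighted by a mixing parameter (called `alpha` here). This is a simplified illustration, not the DESM model itself: the embedding score below is a plain cosine between averaged word vectors, and the toy vectors, documents, and `alpha` value are all assumptions.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Okapi BM25 score of each tokenized document for a tokenized query."""
    n_docs = len(docs)
    avgdl = sum(len(d) for d in docs) / n_docs
    df = Counter(t for d in docs for t in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for t in query:
            if t not in tf:
                continue
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            norm = tf[t] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[t] * (k1 + 1) / norm
        scores.append(score)
    return scores


def embed(tokens, vectors):
    """Average the vectors of known tokens; OOV tokens are silently dropped."""
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return None
    return [sum(v[i] for v in known) / len(known) for i in range(len(known[0]))]


def cosine(u, v):
    if u is None or v is None:
        return 0.0
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mixture(embedding_scores, term_scores, alpha=0.5):
    """alpha * embedding score + (1 - alpha) * BM25 score, per document."""
    return [alpha * e + (1 - alpha) * t
            for e, t in zip(embedding_scores, term_scores)]


# Toy 2-d vectors; 'cromulent' and 'bigly' are deliberately out of vocabulary.
vectors = {"what": [1.0, 0.0], "does": [0.8, 0.2], "mean": [0.5, 0.5],
           "is": [0.9, 0.1], "a": [1.0, 0.0], "word": [0.4, 0.6]}
docs = [["cromulent", "is", "a", "word"], ["bigly", "is", "a", "word"]]
query = ["what", "does", "cromulent", "mean"]

emb = [cosine(embed(query, vectors), embed(d, vectors)) for d in docs]
bm25 = bm25_scores(query, docs)
mixed = mixture(emb, bm25)
print(emb, bm25, mixed)
```

In this toy setup the embedding scores tie because the only distinguishing term is out of vocabulary, while the exact term match in BM25 breaks the tie. The two scores live on different scales, so in practice the mixing weight would need to be tuned on held-out data.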
Table 4: Results of NDCG evaluations under the non-telescoping settings. Both the DESM and the LSA models perform poorly in the presence of random irrelevant documents in the candidate set. The mixture of DESM (IN-OUT) with BM25 achieves the best NDCG. The best NDCG values are highlighted per column in bold and all the statistically significant (p < 0.05) differences with the BM25 baseline are indicated by the asterisk (*).

                                            Explicitly Judged Test Set   Implicit Feedback based Test Set
Model                                       NDCG@1   NDCG@3   NDCG@10    NDCG@1   NDCG@3   NDCG@10
BM25                                        21.44    26.09    37.53      11.68    22.14    33.19
LSA                                         04.61*   04.63*   04.83*     01.97*   03.24*   04.54*
DESM (IN-IN, trained on body text)          06.69*   06.80*   07.39*     03.39*   05.09*   07.13*
DESM (IN-IN, trained on queries)            05.56*   05.59*   06.03*     02.62*   04.06*   05.92*
DESM (IN-OUT, trained on body text)         01.01*   01.16*   01.58*     00.78*   01.12*   02.07*
DESM (IN-OUT, trained on queries)           00.62*   00.58*   00.81*     00.29*   00.39*   01.36*
BM25 + DESM (IN-IN, trained on body text)   21.53    26.16    37.48      11.96    22.58*   33.70*
BM25 + DESM (IN-IN, trained on queries)     21.58    26.20    37.62      11.91    22.47*   33.72*
BM25 + DESM (IN-OUT, trained on body text)  21.47    26.18    37.55      11.83    22.42*   33.60*
BM25 + DESM (IN-OUT, trained on queries)    21.54    26.42*   37.86*     12.22*   22.96*   34.11*
We do not report the results of evaluating the mixture models under the telescoping setup because tuning the α parameter under those settings on the training set results in the best performance from the standalone DESM models. Overall, we conclude that the DESM
The probabilistic model of information retrieval leads to the development of the BM25 ranking feature [35]. The increase in BM25 as term frequency increases is justified according to the 2-Poisson model [15, 36], which makes a distinction between documents about
22. / 29
Deep Learning methods have both training and operational challenges
Problem: Operationalizing Deep Learning
A lot of labeled data required
UX requires online, one-shot learning
Poor interpretability, hard to debug
Performance gain vs model complexity
Model persistence with auto-scaling infrastructure
23. / 29
In this case, Deep Learning is better suited for offline scenarios
Solution: Operationalizing Deep Learning
Use linear models instead for online / nearline
Deep learning for offline and global tasks
e.g. generating new word embeddings
https://xkcd.com/1838/
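As one illustration of why a linear model suits the online path, the sketch below shows a logistic regression that updates from a single labeled example and keeps its weights in a plain, inspectable dictionary. The class name, feature names, and learning rate are assumptions; the talk does not say which linear model Talla uses.

```python
import math


class OnlineLogisticRegression:
    """Logistic regression trained one example at a time with SGD."""

    def __init__(self, learning_rate=0.5):
        self.weights = {}  # feature name -> weight, easy to inspect and edit
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, features):
        z = self.bias + sum(self.weights.get(name, 0.0) * value
                            for name, value in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, features, label):
        """One gradient step on the log loss for a single (features, label) pair."""
        error = self.predict_proba(features) - label
        self.bias -= self.learning_rate * error
        for name, value in features.items():
            current = self.weights.get(name, 0.0)
            self.weights[name] = current - self.learning_rate * error * value


# Hypothetical features describing a (query, candidate answer) pair.
example = {"bm25": 0.9, "embedding_cosine": 0.7}

model = OnlineLogisticRegression()
before = model.predict_proba(example)
model.update(example, label=1)  # a single piece of user feedback
after = model.predict_proba(example)
print(before, after, model.weights)
```

Each update costs time proportional to the number of active features, and the learned weights can be read directly, which speaks to both the online-learning and interpretability concerns raised on the previous slide.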
24. / 29
The user controls the question-answer pairs in the knowledge base
Problem: User-Trained Agent
End users can ad hoc update the knowledge base
New Q&A pairs should be accessible immediately
Real-time, one-shot learning expected
26. / 29
IR-based methods give us the interpretability and speed needed for a reliable UX
Solution: User-Trained Agent
Fully inspectable, editable KB via web interface
Cascade of fast online and nearline models
Linear models and term-frequency features easier
to debug and modify
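A minimal sketch of what an editable, immediately searchable knowledge base could look like, assuming a simple in-memory inverted index. The class `EditableKB` and its methods are hypothetical and are not Talla's implementation; the point is that a newly added pair is retrievable on the very next query, and every result carries the terms that produced the match.

```python
from collections import defaultdict


class EditableKB:
    """In-memory Q&A store backed by an inverted index over question terms."""

    def __init__(self):
        self.entries = {}              # entry id -> (question, answer)
        self.index = defaultdict(set)  # term -> ids of entries containing it
        self._next_id = 0

    @staticmethod
    def _terms(text):
        return {t.strip("?.,!").lower() for t in text.split()} - {""}

    def add(self, question, answer):
        """Index a new pair; it is searchable as soon as this returns."""
        entry_id = self._next_id
        self._next_id += 1
        self.entries[entry_id] = (question, answer)
        for term in self._terms(question):
            self.index[term].add(entry_id)
        return entry_id

    def remove(self, entry_id):
        question, _ = self.entries.pop(entry_id)
        for term in self._terms(question):
            self.index[term].discard(entry_id)

    def search(self, query):
        """Return (answer, matched terms) pairs, most matched terms first."""
        matched = defaultdict(set)
        for term in self._terms(query):
            for entry_id in self.index.get(term, ()):
                matched[entry_id].add(term)
        ranked = sorted(matched.items(), key=lambda kv: (-len(kv[1]), kv[0]))
        return [(self.entries[i][1], sorted(terms)) for i, terms in ranked]


kb = EditableKB()
print(kb.search("What does cromulent mean?"))  # nothing known yet
entry_id = kb.add("What does cromulent mean?", "Acceptable or adequate.")
print(kb.search("cromulent meaning"))          # found right after the edit
kb.remove(entry_id)
print(kb.search("cromulent"))                  # gone right after the removal
```

Because the matched terms are returned with each answer, a wrong result can be traced to the specific terms that caused it and corrected by editing the entry.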