Query Expansion with Locally-Trained Word Embeddings (Neu-IR 2016)
Bhaskar Mitra
Slides presented at the Neu-IR workshop at SIGIR 2016.
1.
Query Expansion with Locally-Trained Word Embeddings
Fernando Diaz, Bhaskar Mitra, Nick Craswell
Microsoft
July 21, 2016
2.
word embedding: discriminatively trained vector representation
3.
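$$
\mathcal{L} = \sum_{t=1}^{T} \underbrace{\omega_{x_t}}_{\text{term weight}} \Bigg[ \underbrace{\sum_{y \in V_t^{c}} \log \sigma\big(\phi(x_t) \cdot \phi(y)\big)}_{\text{observed context}} + \underbrace{\sum_{y \in V_t^{n}} \log \sigma\big(-\phi(x_t) \cdot \phi(y)\big)}_{\text{negative context}} \Bigg]
$$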
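A minimal sketch of how the weighted objective on this slide could be evaluated, assuming the term weight multiplies both the observed-context and the negative-context sums, and that a single embedding function is used for terms and contexts as the slide's notation suggests. The names `weighted_objective`, `positions`, `phi`, and `omega`, and all toy values, are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function sigma(z)."""
    return 1.0 / (1.0 + math.exp(-z))

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def weighted_objective(positions, phi, omega):
    """Sum over positions t of omega[x_t] * (observed term + negative term).

    positions: list of (x_t, observed_context_terms, negative_context_terms)
    phi:       dict term -> embedding vector (list of floats)
    omega:     dict term -> term weight (assumed given)
    """
    total = 0.0
    for x_t, observed, negative in positions:
        obs = sum(math.log(sigmoid(dot(phi[x_t], phi[y]))) for y in observed)
        neg = sum(math.log(sigmoid(-dot(phi[x_t], phi[y]))) for y in negative)
        total += omega[x_t] * (obs + neg)
    return total

# Toy example with hypothetical embeddings and weights.
phi = {"gas": [1.0, 0.0], "tax": [0.0, 1.0]}
omega = {"gas": 2.0, "tax": 1.0}
positions = [("gas", ["tax"], ["tax"])]
objective = weighted_objective(positions, phi, omega)
```

In the toy example the two embeddings are orthogonal, so each log-sigmoid term equals log 0.5 and the weight of 2.0 simply doubles their sum.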
4.
$\omega_{x_t}$ needs to reflect the importance of the term at evaluation time.
5.
$$
\sum_{t=1}^{T} \omega_{x_t = w} \propto p(w \mid C)
$$
6.
what terms are important at query time?
7.
$p(w \mid R)$: probability of the term in the relevant documents.
8.
how different is $p(w \mid R)$ from $p(w \mid C)$?
9.
$$
\mathrm{KL}(R, C)_w = p(w \mid R) \log \frac{p(w \mid R)}{p(w \mid C)}
$$
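As an illustration of the per-term quantity on this slide, the sketch below computes p(w|R) log(p(w|R) / p(w|C)) for each term and ranks terms by it. The function name `term_kl` and the toy distributions are hypothetical, and the code assumes p(w|C) > 0 for every term that appears in R.

```python
import math

def term_kl(p_rel, p_col):
    """Per-term contribution KL(R, C)_w for every term w in p_rel.

    p_rel: dict term -> p(w|R)
    p_col: dict term -> p(w|C), assumed positive wherever p(w|R) > 0
    """
    out = {}
    for w, pr in p_rel.items():
        # A term with zero probability in R contributes nothing.
        out[w] = pr * math.log(pr / p_col[w]) if pr > 0.0 else 0.0
    return out

# Toy distributions (hypothetical values).
p_rel = {"gas": 0.5, "the": 0.5}
p_col = {"gas": 0.125, "the": 0.875}

kl = term_kl(p_rel, p_col)
ranked = sorted(kl, key=kl.get, reverse=True)  # most distinctive terms first
```

Terms that are much more likely in the relevant documents than in the corpus get a large positive contribution, while terms that are more common in the corpus get a negative one.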
10.
[Figure: KL (y-axis, ticks from 0.00 to 0.35) against rank (x-axis)]
11.
how much better can we do if we train with $\sum_{t=1}^{T} \omega_{x_t = w} \propto p(w \mid R)$?
12.
Language Model Scoring
$$
\mathrm{score}(d, q) = \mathrm{KL}(\theta_q, \theta_d)
$$
$\theta_q$: maximum likelihood query language model
$\theta_d$: document language model
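The scoring rule on this slide can be sketched as follows. The helper names `ml_language_model` and `kl_score` are hypothetical, and the sketch assumes each document model is already smoothed so that every query term has non-zero probability; the slide does not say which smoothing is used.

```python
import math
from collections import Counter

def ml_language_model(tokens):
    """Maximum likelihood unigram model: relative term frequencies."""
    counts = Counter(tokens)
    n = len(tokens)
    return {w: c / n for w, c in counts.items()}

def kl_score(theta_q, theta_d):
    """KL(theta_q, theta_d) = sum_w theta_q(w) * log(theta_q(w) / theta_d(w)).

    Assumes theta_d[w] > 0 for every term w in the query model.
    """
    return sum(p * math.log(p / theta_d[w]) for w, p in theta_q.items() if p > 0.0)

# Toy query and two hypothetical, already-smoothed document models.
theta_q = ml_language_model(["gas", "tax"])
theta_d1 = {"gas": 0.5, "tax": 0.5}
theta_d2 = {"gas": 0.25, "tax": 0.25, "the": 0.5}

score_d1 = kl_score(theta_q, theta_d1)
score_d2 = kl_score(theta_q, theta_d2)
```

Because the score is a divergence, a smaller value means the document model is closer to the query model, so `theta_d1` would rank ahead of `theta_d2` here.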
13.
Query Expansion with Word Embeddings
$$
\tilde{\theta}_q = U U^{\mathsf{T}} \theta_q
$$
$U$: $|V| \times k$ term embedding matrix
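A small sketch of the expansion on this slide, written with plain Python lists so that it has no dependencies. The function name `expand_query`, the three-term vocabulary, and the embedding values are hypothetical.

```python
def expand_query(U, theta_q):
    """Compute U U^T theta_q for a |V| x k matrix U given as a list of rows."""
    n_terms, k = len(U), len(U[0])
    # Project the query model into the k-dimensional embedding space.
    z = [sum(U[i][j] * theta_q[i] for i in range(n_terms)) for j in range(k)]
    # Map back to the vocabulary, giving one weight per term.
    return [sum(U[i][j] * z[j] for j in range(k)) for i in range(n_terms)]

# Toy vocabulary: "gas" and "petrol" share an embedding, "cat" is orthogonal.
vocab = ["gas", "petrol", "cat"]
U = [[1.0, 0.0],
     [1.0, 0.0],
     [0.0, 1.0]]
theta_q = [1.0, 0.0, 0.0]  # the query contains only "gas"

expanded = expand_query(U, theta_q)
```

In the toy example the query weight on "gas" spreads to "petrol" because their embeddings coincide, while "cat" stays at zero. The output is not a normalized distribution; any renormalization or truncation step is outside what this slide states.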
14.
Query Expansion with Word Embeddings
$U_{\text{global}}$: embedding trained with $p(w \mid C)$
$U_{\text{local}}$: embedding trained with $p(w \mid R)$
15.
Getting $p(w \mid R)$
$$
p(d) = \frac{\exp(-\mathrm{KL}(\theta_q, \theta_d))}{\sum_{d'} \exp(-\mathrm{KL}(\theta_q, \theta_{d'}))}
$$
16.
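Getting $p(w \mid R)$
$$
p(d) = \frac{\exp(-\mathrm{KL}(\theta_q, \theta_d))}{\sum_{d'} \exp(-\mathrm{KL}(\theta_q, \theta_{d'}))}
$$
$$
\tilde{p}(w \mid R) = \sum_{d} p(w \mid \theta_d)\, p(d)
$$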
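The two steps on this slide can be sketched as below: a softmax over negative KL scores gives p(d), and a p(d)-weighted mixture of document models gives the estimate of p(w|R). The function names, document identifiers, and toy values are hypothetical, and the KL scores are assumed to have been computed beforehand.

```python
import math

def doc_posterior(kl_by_doc):
    """p(d) = exp(-KL_d) / sum over d' of exp(-KL_d')."""
    weights = {d: math.exp(-kl) for d, kl in kl_by_doc.items()}
    z = sum(weights.values())
    return {d: wt / z for d, wt in weights.items()}

def relevance_model(p_d, doc_models):
    """Estimate p(w|R) as the sum over d of p(w|theta_d) * p(d)."""
    p_w = {}
    for d, pd in p_d.items():
        for w, pw in doc_models[d].items():
            p_w[w] = p_w.get(w, 0.0) + pw * pd
    return p_w

# Toy retrieval scores and document language models (hypothetical values).
kl_by_doc = {"d1": 0.0, "d2": math.log(3.0)}
doc_models = {
    "d1": {"gas": 0.5, "tax": 0.5},
    "d2": {"gas": 0.5, "the": 0.5},
}

p_d = doc_posterior(kl_by_doc)
p_w_rel = relevance_model(p_d, doc_models)
```

Documents closer to the query (smaller KL) receive more mass, so their terms dominate the estimated relevance distribution.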
17.
Experiments
18.
Data
        docs        words       queries
trec12  469,949     438,338     150
robust  528,155     665,128     250
web     50,220,423  90,411,624  200
giga    9,875,524   2,645,367   -
wiki    3,225,743   4,726,862   -
19.
Embeddings
• global
  • public embeddings (GloVe, word2vec)
  • word2vec on target corpus
• local: word2vec with documents sampled by p(d)
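The "documents sampled by p(d)" step for the local embedding could look like the sketch below. It covers only the sampling, not the word2vec training itself. The function name `sample_local_corpus`, the choice to sample with replacement, and the toy probabilities are assumptions, not details given in the slides.

```python
import random

def sample_local_corpus(p_d, n_samples, seed=0):
    """Draw document ids with probability proportional to p(d).

    Sampling is with replacement (an assumption), so highly ranked
    documents can appear several times in the training sample.
    """
    rng = random.Random(seed)
    doc_ids = list(p_d)
    weights = [p_d[d] for d in doc_ids]
    return rng.choices(doc_ids, weights=weights, k=n_samples)

# Toy query-specific document distribution (hypothetical values).
p_d = {"d1": 0.7, "d2": 0.2, "d3": 0.1}
sample = sample_local_corpus(p_d, n_samples=1000)
```

The sampled documents would then serve as the training corpus for a query-specific embedding, in place of the full collection used for the global one.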
20.
• ten-fold cross-validation
• metric: NDCG@10
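For reference, NDCG@10 can be computed as in the sketch below. It uses one common formulation (gain 2^rel - 1 with a log2 rank discount); the slides do not say which variant or evaluation tool was used, so treat that choice and the function names as assumptions.

```python
import math

def dcg_at_k(rels, k=10):
    """Discounted cumulative gain over the top-k relevance grades."""
    return sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))

def ndcg_at_k(ranked_rels, judged_rels, k=10):
    """NDCG@k: DCG of the ranking divided by the DCG of an ideal ranking.

    ranked_rels: relevance grades of the returned documents, in rank order
    judged_rels: relevance grades of all judged documents for the query
    """
    ideal = dcg_at_k(sorted(judged_rels, reverse=True), k)
    return dcg_at_k(ranked_rels, k) / ideal if ideal > 0 else 0.0

# Toy example: the single relevant document is ranked first, then second.
perfect = ndcg_at_k([1, 0], [1, 0])
swapped = ndcg_at_k([0, 1], [1, 0])
```

Moving the only relevant document from rank 1 to rank 2 lowers the score from 1.0 to 1 / log2(3), which shows how the metric rewards placing relevant documents early.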
21.
Results
        QL      global                                          local
                wiki+giga                       gnews   target  target  giga    wiki
                50      100     200     300     300     400     400     400     400
trec12  0.514   0.518   0.518   0.530   0.531   0.530   0.545   0.535   0.563*  0.523
robust  0.467   0.470   0.463   0.469   0.468   0.472   0.465   0.475   0.517*  0.476
web     0.216   0.227   0.229   0.230   0.232   0.218   0.216   0.234   0.236   0.258*
22.
[Figure: two panels, global and local; top words by $\tilde{p}(w \mid R)$ (blue: query; red: top words by $p(w \mid R)$)]
23.
Summary
• local embedding provides a stronger representation than global embedding
• potential impact for other topic-specific natural language processing tasks
• future work
  • effectiveness improvements
  • efficiency improvements