Overview of the NTCIR-12
MobileClick-2 Task
Makoto P. Kato (Kyoto U.), Tetsuya Sakai (Waseda U.),
Takehiro Yamamoto (Kyoto U.), Virgil Pavlu (Northeastern U.),
Hajime Morita (Kyoto U.), and Sumio Fujita (Yahoo Japan Corporation)
2
Let's see
the current
mobile search
"NTCIR"
3
What's NTCIR?
Your Search Stats
Clicks: 2
Time: 00:31
"NTCIR"
4
When is
the deadline
of NTCIR?
Your Search Stats
Clicks: 2
Time: 00:29
5
30 sec is
too long for
mobile users
6
Let's do
better!
• Given a query, a set of iUnits, and a set of intents,
generate a two-layered summary
iUnit Summarization Subtask
7
iUnit
A series of evaluation workshops
Designed to enhance IA research
…
NTCIR
Input: Query
Input: iUnit set
Intents
News
Schedule
…
Input: Intents
M-measure
0.5
The NTCIR Workshop is a
series of evaluation
workshops designed to
enhance research in
information access
technologies including
information retrieval,
summarization, extraction,
question answering, etc.
News
Schedule
Tasks
2nd layer
20/Jan./2016: Task Registration Due
06/Jan./2016: Document Set Release
Jan.-May/2016: Dry Run
Mar.-July/2016: Formal Run
01/Aug./2016: Evaluation Results Due
01/Aug./2016: Task overview release
15/Sep./2016: Paper submission Due
01/Nov./2016: All paper Due
09-12/Dec./2016: NTCIR-11 Conference
Output: Two-layered summary
Evaluation metric
designed for mobile
information access
Lay out iUnits so that
any type of user can be immediately satisfied
Challenge
"NTCIR"
8
What's NTCIR?
Home | NTCIR
The NTCIR Workshop is a series of evaluation
workshops designed to enhance research in
information access technologies including
information retrieval, summarization, extraction,
question answering, etc.
NTCIR-12
Held on June 9(Tue)-12(Fri), 2016
at National Center of Sciences, Tokyo, Japan
NTCIR-12 News
NTCIR-12 Schedule
NTCIR-12 Tasks
Your Search Stats
Clicks: 0
Time: 00:03
"NTCIR"
9
When is
the deadline
of NTCIR?
Home | NTCIR
The NTCIR Workshop is a series of evaluation
workshops designed to enhance research in
information access technologies including
information retrieval, summarization, extraction,
question answering, etc.
NTCIR-12
Held on June 9(Tue)-12(Fri), 2016
at National Center of Sciences, Tokyo, Japan
NTCIR-12 News
NTCIR-12 Schedule
NTCIR-12 Tasks
NTCIR-12 Schedule
20/Jan./2016: Task Registration Due
06/Jan./2016: Document Set Release
Jan.-May/2016: Dry Run
Mar.-July/2016: Formal Run
01/Aug./2016: Evaluation Results Due
01/Aug./2016: Task overview release
15/Sep./2016: Paper submission Due
01/Nov./2016: All paper Due
09-12/Dec./2016: NTCIR-11 Conference
Your Search Stats
Clicks: 1
Time: 00:15
Is This Interface So Different from That of the Current Search Engine?
10
Home | NTCIR
The NTCIR Workshop is a series of evaluation
workshops designed to enhance research in
information access technologies including
information retrieval, summarization, extraction,
question answering, etc.
NTCIR-12
Held on June 9(Tue)-12(Fri), 2016
at National Center of Sciences, Tokyo, Japan
NTCIR-12 News
NTCIR-12 Schedule
NTCIR-12 Tasks
No. Using this interface is therefore not unrealistic.
"NTCIR"
11
When is
the deadline
of NTCIR? Home | NTCIR
http://research.nii.ac.jp/ntcir/
The NTCIR Workshop is a series of evaluation
workshops designed to enhance research in
information access technologies including
information retrieval, summarization, extraction,
question answering, etc.
NTCIR-11
http://research.nii.ac.jp/ntcir/
Held on December 9(Tue)-12(Fri), 2014
at National Center of Sciences, Tokyo, Japan
NTCIR-11 News
http://research.nii.ac.jp/ntcir/news
NTCIR-11 Schedule
http://research.nii.ac.jp/ntcir/schedule
NTCIR-11 Tasks
http://research.nii.ac.jp/ntcir/tasks
NTCIR-11 Schedule
http://research.nii.ac.jp/ntcir/schedule
20/Jan./2014: Task Registration Due
06/Jan./2014: Document Set Release
Jan.-May/2014: Dry Run
Mar.-July/2014: Formal Run
01/Aug./2014: Evaluation Results Due
01/Aug./2014: Early draft Task overview release
15/Sep./2014: Draft paper submission Due
01/Nov./2014: All camera-ready paper Due
09-12/Dec./2014: NTCIR-11 Conference
Your Search Score
Clicks: 1
Time: 00:15
Goal of MobileClick
Provide Direct and Immediate
Mobile Information Access
SUBTASKS
12
Two Subtasks
13
Query
Importance of iUnits
Two-layered Summary
iUnit Ranking Subtask
iUnit Summarization Subtask
NTCIR
iUnit
1 A series of evaluation workshops
2 Task Registration Due 20/Jun./2016
3 Designed to enhance IA research
… …
The NTCIR Workshop is a series
of evaluation workshops
designed to enhance research in
information access technologies
including information retrieval,
summarization, extraction,
question answering, etc.
News
Schedule
Tasks
2nd layer
20/Jan./2016:
Task Registration Due
06/Jan./2016:
Document Set Release
Jan.-May/2016:
Dry Run
Mar.-July/2016:
Formal Run
01/Aug./2016:
Evaluation Results Due
01/Aug./2016:
Task overview release
15/Sep./2016:
Paper submission Due
01/Nov./2016:
All paper Due
09-12/Dec./2016:
NTCIR-11 Conference
• Given a query and a set of iUnits,
rank them based on their estimated importance
Note: iUnits are information pieces relevant to a given query
iUnit Ranking Subtask
14
iUnit
A series of evaluation workshops
Designed to enhance IA research
Task Registration Due 20/Jun./2016
NTCIR
Input: Query
Input: iUnit set
iUnit
1 A series of evaluation workshops
2 Task Registration Due 20/Jun./2016
3 Designed to enhance IA research
… …
Output: iUnit list
nDCG
0.5
Predict the importance of strings rather than documents
Challenge
• Given a query, a set of iUnits, and a set of intents,
generate a two-layered summary
iUnit Summarization Subtask
15
iUnit
A series of evaluation workshops
Designed to enhance IA research
…
NTCIR
Input: Query
Input: iUnit set
Intents
News
Schedule
…
Input: Intents
M-measure
0.5
The NTCIR Workshop is a
series of evaluation
workshops designed to
enhance research in
information access
technologies including
information retrieval,
summarization, extraction,
question answering, etc.
News
Schedule
Tasks
2nd layer
20/Jan./2016: Task Registration Due
06/Jan./2016: Document Set Release
Jan.-May/2016: Dry Run
Mar.-July/2016: Formal Run
01/Aug./2016: Evaluation Results Due
01/Aug./2016: Task overview release
15/Sep./2016: Paper submission Due
01/Nov./2016: All paper Due
09-12/Dec./2016: NTCIR-11 Conference
Output: Two-layered summary
Evaluation metric
designed for mobile
information access
Lay out iUnits so that
any type of user can be immediately satisfied
Challenge
Two-layered Summary in Action
16
DATA
17
Overview of Data
18
[Diagram: queries (e.g., “napoleon”) → web search → documents → extraction → iUnits (e.g., “Born on the island of Corsica”, “Defeated at the Battle of Waterloo”) → clustering → intents (e.g., Achievement, Skill, Career); iUnits and intents serve as input to iUnit ranking and iUnit summarization]
• Queries
– 100 English/Japanese queries
– Most of which were ambiguous/underspecified
– Selected from four categories:
celebrity, location, definition, and QA (similar to NTCIR 1CLICK-2)
• Documents
– 500 commercial search engine results for each query
– From which iUnits were extracted
Queries and Documents
19
Examples
CELEBRITY    | LOCATION              | DEFINITION     | QA
hulk hogan   | bank adelanto         | bitcoin        | what is mirror made of
bruno mars   | cafe killeen          | divers disease | how to cook coleslaw
sharon stone | cincinnati art museum | windows 7      | role of animal tail
• Definition
– Atomic information pieces relevant to a given query
• The number of iUnits
– 2,317 (23.8 iUnits per query) for English
– 4,169 (41.7 iUnits per query) for Japanese
iUnits
20
Examples of iUnits for query “Napoleon”
– Born on the island of Corsica
– General of the Army of Italy
– Defeated at the Battle of Waterloo
– One of the most controversial political figures
– Won at the Battle of Wagram
– Established legal equality and religious toleration an innovator
– Baptised as a Catholic
– Absent during Peninsular War
– Cut off European trade with Britain
• https://addons.mozilla.org/ja/firefox/addon/iunit-extractor/
– Useful for nugget extraction, etc.
iUnit Extractor
21
• An intent can be defined as either
– A specific interpretation of an ambiguous query
(“Mac OS” and “car brand” for “jaguar”), or
– An aspect of a faceted query
(“windows 8” and “windows 10” for “windows”)
• Obtained by clustering iUnits
Intents
22
Achievement
Skill
Career
Born on the island of Corsica
Defeated at the Battle of Waterloo
Established legal equality and religious
toleration an innovator
Absent during Peninsular War
iUnits Intents
Clustering
• Queries and their statistics related to our
training and test query sets were provided by
Yahoo Japan Corporation
– Co-Click Queries
Queries that share clicks with the query sets
– Co-topic Queries
Queries that include a query string in the query sets
– Co-Session Queries
Queries that appeared in the same session as the
query sets
• Used by participants for ranking iUnits and
generating two-layered summaries
Yahoo Search Query Data
23
Overview of Data (Repeated)
24
[Diagram: queries (e.g., “napoleon”) → web search → documents → extraction → iUnits (e.g., “Born on the island of Corsica”, “Defeated at the Battle of Waterloo”) → clustering → intents (e.g., Achievement, Skill, Career); iUnits and intents serve as input to iUnit ranking and iUnit summarization]
EVALUATION
25
• Importance of iUnits in terms of an intent
was rated by two assessors on a 5-point scale
– An iUnit is more important if it is more necessary to
more of the users who are interested in the intent
– Inter-rater agreement: 0.556 (weighted kappa)
Per-intent iUnit Importance
Per-intent iUnit importance in terms of intent “Definition”
iUnit                              | Importance
A series of evaluation workshops   | 5
Task Registration Due 20/Jun./2016 | 3
Per-intent iUnit importance in terms of intent “Schedule”
iUnit                              | Importance
A series of evaluation workshops   | 2
Task Registration Due 20/Jun./2016 | 5
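The agreement figure on this slide is a weighted kappa. As a rough illustration of what that statistic measures, the sketch below computes Cohen's weighted kappa for two assessors rating on a 5-point scale. The slide does not say which weighting scheme was used, so the linear weights are an assumption, the function name `weighted_kappa` is hypothetical, and the ratings are made up for the example.

```python
from collections import Counter


def weighted_kappa(ratings_a, ratings_b, categories):
    """Cohen's weighted kappa with linear weights (weighting scheme assumed)."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two equally long, non-empty rating lists")
    k = len(categories)
    if k < 2:
        raise ValueError("need at least two categories")
    index = {c: i for i, c in enumerate(categories)}
    n = len(ratings_a)

    def disagreement(i, j):
        # Linear weight: 0 for identical grades, 1 for the two extremes.
        return abs(i - j) / (k - 1)

    pairs = Counter((index[a], index[b]) for a, b in zip(ratings_a, ratings_b))
    marg_a = Counter(index[a] for a in ratings_a)
    marg_b = Counter(index[b] for b in ratings_b)

    # Observed and chance-expected weighted disagreement.
    observed = sum(disagreement(i, j) * c / n for (i, j), c in pairs.items())
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j] / (n * n)
        for i in range(k)
        for j in range(k)
    )
    if expected == 0:
        return 1.0  # both assessors always used the same single grade
    return 1.0 - observed / expected


if __name__ == "__main__":
    scale = [1, 2, 3, 4, 5]
    assessor_1 = [5, 3, 4, 2, 1, 5]  # made-up ratings
    assessor_2 = [4, 3, 5, 2, 2, 5]
    print(weighted_kappa(assessor_1, assessor_2, scale))
```

Values near 1 indicate close agreement, 0 indicates chance-level agreement, and negative values indicate systematic disagreement.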
• Intent probability was estimated by voting
– 𝑃(𝑖|𝑞): probability of having intent i given q
– 10 assessors voted for one or more intents for a
given query
Intent Probability
27
Intent Voting
Intent     | # of votes
Definition | 4
Schedule   | 3
Tasks      | 3
Intent Probability
Intent     | Prob.
Definition | 0.4
Schedule   | 0.3
Tasks      | 0.3
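The step from vote counts to intent probabilities can be sketched as a simple normalization. This is a minimal sketch, assuming that P(i|q) is each intent's share of all votes cast for the query; the slide's example is consistent with that reading, but the exact procedure is not spelled out here. The function name `intent_probabilities` is hypothetical.

```python
def intent_probabilities(votes):
    """Turn per-intent vote counts into P(i|q) by normalizing over all votes.

    votes: dict mapping intent label -> number of assessor votes.
    Normalizing by the total number of votes is an assumption.
    """
    total = sum(votes.values())
    if total <= 0:
        raise ValueError("at least one vote is required")
    return {intent: count / total for intent, count in votes.items()}


if __name__ == "__main__":
    # Vote counts taken from the slide's example for the query "NTCIR".
    probs = intent_probabilities({"Definition": 4, "Schedule": 3, "Tasks": 3})
    for intent, p in probs.items():
        print(f"{intent}: {p:.1f}")
```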
• Evaluated in the same way as ad-hoc retrieval
Evaluation of iUnit Ranking
28
Per-intent iUnit Importance (in terms of intent “Schedule”)
iUnit                              | Importance
A series of evaluation workshops   | 2
Task Registration Due 20/Jun./2016 | 5
Intent Probability
Intent     | Prob.
Definition | 0.4
Schedule   | 0.3
Global Importance
iUnit                              | Importance
A series of evaluation workshops   | 3.8
Task Registration Due 20/Jun./2016 | 2.5
$G(u) = \sum_{i \in I_q} P(i \mid q)\, g_i(u)$
$P(i \mid q)$: intent probability
$g_i(u)$: per-intent importance
$I_q$: intents for query q
Output: iUnit list
Rank | iUnit
1    | A series of evaluation …
2    | Task Registration Due …
Output annotated with global importance (GI)
Rank | iUnit                    | GI
1    | A series of evaluation … | 3.8
2    | Task Registration Due …  | 2.5
0.87
nDCG@10
Q-measure
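The global importance formula and an nDCG-style score over it can be sketched as follows. This is only an illustration under stated assumptions: the nDCG variant (linear gain, log2 discount) is a common textbook form and may differ from the one used in the official evaluation, the function names are hypothetical, and the per-intent importance for the "Tasks" intent is a made-up value, since the slides do not show it.

```python
import math


def global_importance(per_intent_importance, intent_prob):
    """G(u) = sum over intents i of P(i|q) * g_i(u)."""
    return sum(
        p * per_intent_importance.get(intent, 0.0)
        for intent, p in intent_prob.items()
    )


def dcg(gains, k):
    # Assumed variant: linear gain with a log2(rank + 1) discount.
    return sum(g / math.log2(r + 1) for r, g in enumerate(gains[:k], start=1))


def ndcg(ranked_iunits, gi, k=10):
    """nDCG@k of a ranked iUnit list against global importance values."""
    gains = [gi.get(u, 0.0) for u in ranked_iunits]
    ideal = sorted(gi.values(), reverse=True)
    ideal_dcg = dcg(ideal, k)
    return dcg(gains, k) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    intent_prob = {"Definition": 0.4, "Schedule": 0.3, "Tasks": 0.3}
    # "Definition" and "Schedule" grades follow the slides; "Tasks" is hypothetical.
    workshops = {"Definition": 5, "Schedule": 2, "Tasks": 4}
    print(global_importance(workshops, intent_prob))

    gi = {"workshops": 3.8, "registration": 2.5}
    print(ndcg(["workshops", "registration"], gi))
    print(ndcg(["registration", "workshops"], gi))
```

A run that lists iUnits in decreasing order of global importance scores 1.0 under this sketch; any other order scores lower.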
• Consider single-layered summary evaluation
• U-measure [Sakai and Dou. SIGIR2013]
– Higher if more important iUnits appear earlier
Evaluation of iUnit Summarization (Single-layer Case)
29
Summary: u1 and u2 on the first line, u3 on the second line
Trailtext (reading path): u1 (10 chars), u2 (5 chars), u3 (10 chars)
Create a list of iUnits
by assuming that users
read text from left to right,
from top to bottom
U-measure
$U = \sum_{r=1}^{n} G(u_r)\,\bigl(1 - \mathrm{pos}(u_r)/L\bigr)$
$u_r$: r-th iUnit in the trailtext
$G(u)$: importance of u
$\mathrm{pos}(u)$: offset of u from the beginning
$L$: patience parameter
$n$: number of iUnits in the trailtext
Example: U = G(u1)(1-10/L) + G(u2)(1-15/L) + G(u3)(1-25/L)
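The U-measure computation on this slide can be sketched in a few lines. This is a minimal sketch, assuming that pos(u) is the character offset at which an iUnit ends (which matches the 10, 15, and 25 in the slide's example) and that the decay is clamped at zero once the offset exceeds L, as in the original U-measure definition; the clamp is not shown on the slide. The function name `u_measure` and the value L = 100 are hypothetical.

```python
def u_measure(trailtext, patience):
    """Unnormalized U-measure of one trailtext.

    trailtext: list of (importance, length_in_chars) pairs in reading order.
    patience:  the parameter L; text beyond L characters earns no credit.
    """
    if patience <= 0:
        raise ValueError("patience must be positive")
    score = 0.0
    offset = 0
    for importance, length in trailtext:
        offset += length  # pos(u): assumed to be the end offset of the iUnit
        score += importance * max(0.0, 1.0 - offset / patience)
    return score


if __name__ == "__main__":
    # Lengths from the slide (10, 5, 10 chars); importance values are made up.
    trail = [(1.0, 10), (1.0, 5), (1.0, 10)]
    print(u_measure(trail, patience=100))
```

Moving an important iUnit earlier in the trailtext lowers its offset and therefore raises the score, which is the behaviour the slide describes.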
• M-measure
– Expectation of U-measure over multiple trailtexts
$M = \sum_{\mathbf{t}} P(\mathbf{t})\, U(\mathbf{t})$
• Generate trailtexts by assuming that
– Users read a summary from the top of the first layer
– Users click on an intent if they are interested in it
M-measure
30
𝑃(𝐭): probability of trailtext t
𝑈(𝐭): U-measure of trailtext t
𝑙1
𝑢1 𝑢2
𝑢3
𝑢4
User interested in
Intent 1 (𝑃(𝑖1|𝑞))
User interested in
Intent 2 (𝑃(𝑖2|𝑞))
𝑢1 𝑢2 𝑢3 𝑢4
𝑢1 𝑢2 𝑢3
• Compute the expectation of U-measure
Evaluation of iUnit Summarization (Two-layer Case)
31
𝑙1
𝑙2
𝑢1 𝑢2
𝑢3
𝑢6
𝑢4 𝑢5
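Trailtext (t) (reading path) | P(t)                    | U(t)
t1: u1 u2 u3 u4 u5           | P(t1) = P(i1|q) = 0.75 | 0.44
t2: u1 u2 u3 u6              | P(t2) = P(i2|q) = 0.25 | 0.12
P(t2) = P(i2|q) because trailtext t2 is read
by users interested in i2
M-measure
$M = \sum_{\mathbf{t}} P(\mathbf{t})\, U(\mathbf{t}) = 0.75 \times 0.44 + 0.25 \times 0.12 = 0.36$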
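The expectation in the M-measure can be checked numerically with a short sketch. It reads the slide's numbers as U(t1) = 0.44 and U(t2) = 0.12, with trailtext probabilities 0.75 and 0.25, which gives the 0.36 shown on the slide; that reading is an interpretation of the flattened figure. The function name `m_measure` is hypothetical.

```python
def m_measure(trailtexts):
    """M = sum over trailtexts t of P(t) * U(t).

    trailtexts: list of (probability, u_measure) pairs, one per trailtext.
    """
    total_prob = sum(p for p, _ in trailtexts)
    if total_prob > 1.0 + 1e-9:
        raise ValueError("trailtext probabilities must not sum to more than 1")
    return sum(p * u for p, u in trailtexts)


if __name__ == "__main__":
    # (P(t), U(t)) for the two trailtexts in the slide's example.
    example = [(0.75, 0.44), (0.25, 0.12)]
    print(round(m_measure(example), 2))
```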
RESULTS
32
iUnit Ranking (English)
Submitted runs showed similar performance
(a few statistically significant differences)
iUnit Ranking (Japanese)
UHYG, YJST, and rsrch significantly outperformed the
baseline method
Significant
difference
iUnit Summarization (English)
TITEC and YJST rank highest and are not statistically
distinguishable from each other, but neither significantly
outperformed the best baseline
Significant
difference
iUnit Summarization (Japanese)
YJST and UHYG significantly outperformed the baseline,
and are not statistically distinguishable
Significant
differences
Approaches of Participants
37
Please come to our session! (DAY-3 (Thu) 9:00 – 10:30)
NEW TRIALS
38
MobileClick tool available at https://github.com/mpkato/mobileclick
39
1 line for downloading the data
5 lines to generate baseline results
Q. When can we get our evaluation result?
A. Right after you submit your run!
Leader Board System
40
Got a result Impressive
result!
Submission
Feedback
• Evaluation for test queries started from Nov 2015
– Participants were allowed to submit one run per week
Leader board
Leader Board Timeline
41
[Figure: leader board timeline chart (vertical axis from 0 to 160), annotated with “Test data released”, “Evaluation system released”, and “Consistent growth”]
Latest Submission Statistics
42
Latest Leader Board
43
Possible Effects of Leader Board in NTCIR
MobileClick-1                     | MobileClick-2
No team outperformed the baseline | Statistically significant differences
4 teams participated              | 11 teams participated
14 runs were submitted            | 66 runs were submitted
44
• Goal of MobileClick:
Provide direct and immediate mobile information
access
• Subtasks:
– iUnit ranking
– iUnit summarization
• Results:
– 11 teams submitted 66 runs
– Participants outperformed the baseline in all the subtasks
– Some teams showed significant improvement
• Acknowledgements
– Yahoo Japan Corporation
– Wider Planet
Summary
45
46
47
48
49

More Related Content

Similar to Overview of the NTCIR-12 MobileClick-2 Task

A presentation on Applications of ICT in Research.pptx
A presentation on Applications of ICT in Research.pptxA presentation on Applications of ICT in Research.pptx
A presentation on Applications of ICT in Research.pptxROHITSHARMA779690
 
SR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdf
SR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdfSR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdf
SR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdfHabibAbda
 
OpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro MizunoOpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro Mizunoshintaro mizuno
 
"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016shintaro mizuno
 
Ambient Intelligence Design Process
Ambient Intelligence Design ProcessAmbient Intelligence Design Process
Ambient Intelligence Design ProcessFulvio Corno
 
[ADBIS2022] Insight-based Vocalization of OLAP Sessions
[ADBIS2022] Insight-based Vocalization of OLAP Sessions[ADBIS2022] Insight-based Vocalization of OLAP Sessions
[ADBIS2022] Insight-based Vocalization of OLAP SessionsUniversity of Bologna
 
Discover deep insights with Salesforce Einstein Analytics and Discovery
Discover deep insights with Salesforce Einstein Analytics and DiscoveryDiscover deep insights with Salesforce Einstein Analytics and Discovery
Discover deep insights with Salesforce Einstein Analytics and DiscoveryNew Delhi Salesforce Developer Group
 
Poster ECIS 2016
Poster ECIS 2016Poster ECIS 2016
Poster ECIS 2016Rui Silva
 
TAROT summerschool slides 2013 - Italy
TAROT summerschool slides 2013 - ItalyTAROT summerschool slides 2013 - Italy
TAROT summerschool slides 2013 - ItalyTanja Vos
 
Eduworks kick-off presentation: USAL
Eduworks kick-off presentation: USALEduworks kick-off presentation: USAL
Eduworks kick-off presentation: USALEduworks Network
 
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...IRJET Journal
 
American Drivers Don't Understand Today's Automotive Safety Features
American Drivers Don't Understand Today's Automotive Safety FeaturesAmerican Drivers Don't Understand Today's Automotive Safety Features
American Drivers Don't Understand Today's Automotive Safety FeaturesSebastian James
 
International Cooperation Experiences: Results Achieved, Lessons Learned, and...
International Cooperation Experiences: Results Achieved, Lessons Learned, and...International Cooperation Experiences: Results Achieved, Lessons Learned, and...
International Cooperation Experiences: Results Achieved, Lessons Learned, and...SOFIProject
 
TestOps in the Cloud
TestOps in the CloudTestOps in the Cloud
TestOps in the CloudTEST Huddle
 
20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...
20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...
20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...Manu García Rodríguez
 

Similar to Overview of the NTCIR-12 MobileClick-2 Task (20)

A presentation on Applications of ICT in Research.pptx
A presentation on Applications of ICT in Research.pptxA presentation on Applications of ICT in Research.pptx
A presentation on Applications of ICT in Research.pptx
 
SR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdf
SR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdfSR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdf
SR-R-nKAnwar_PPM_Penulisan_ProposalLPDP.pdf
 
OpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro MizunoOpenStack Day Taiwan 2016 -Shintaro Mizuno
OpenStack Day Taiwan 2016 -Shintaro Mizuno
 
"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016"OpenStack in Japan", from OpenStack Days Taiwan 2016
"OpenStack in Japan", from OpenStack Days Taiwan 2016
 
Ambient Intelligence Design Process
Ambient Intelligence Design ProcessAmbient Intelligence Design Process
Ambient Intelligence Design Process
 
[ADBIS2022] Insight-based Vocalization of OLAP Sessions
[ADBIS2022] Insight-based Vocalization of OLAP Sessions[ADBIS2022] Insight-based Vocalization of OLAP Sessions
[ADBIS2022] Insight-based Vocalization of OLAP Sessions
 
202212APSEC.pptx.pdf
202212APSEC.pptx.pdf202212APSEC.pptx.pdf
202212APSEC.pptx.pdf
 
Discover deep insights with Salesforce Einstein Analytics and Discovery
Discover deep insights with Salesforce Einstein Analytics and DiscoveryDiscover deep insights with Salesforce Einstein Analytics and Discovery
Discover deep insights with Salesforce Einstein Analytics and Discovery
 
00 intro
00 intro00 intro
00 intro
 
Poster ECIS 2016
Poster ECIS 2016Poster ECIS 2016
Poster ECIS 2016
 
Summary of pilot cases. New ways of working. Esa Nykänen, Jari Laarni, Hanna-...
Summary of pilot cases. New ways of working. Esa Nykänen, Jari Laarni, Hanna-...Summary of pilot cases. New ways of working. Esa Nykänen, Jari Laarni, Hanna-...
Summary of pilot cases. New ways of working. Esa Nykänen, Jari Laarni, Hanna-...
 
TAROT summerschool slides 2013 - Italy
TAROT summerschool slides 2013 - ItalyTAROT summerschool slides 2013 - Italy
TAROT summerschool slides 2013 - Italy
 
Eduworks kick-off presentation: USAL
Eduworks kick-off presentation: USALEduworks kick-off presentation: USAL
Eduworks kick-off presentation: USAL
 
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
Bibliometric Analysis on Computer Vision based Anomaly Detection using Deep L...
 
American Drivers Don't Understand Today's Automotive Safety Features
American Drivers Don't Understand Today's Automotive Safety FeaturesAmerican Drivers Don't Understand Today's Automotive Safety Features
American Drivers Don't Understand Today's Automotive Safety Features
 
Project Report
Project ReportProject Report
Project Report
 
International Cooperation Experiences: Results Achieved, Lessons Learned, and...
International Cooperation Experiences: Results Achieved, Lessons Learned, and...International Cooperation Experiences: Results Achieved, Lessons Learned, and...
International Cooperation Experiences: Results Achieved, Lessons Learned, and...
 
TestOps in the Cloud
TestOps in the CloudTestOps in the Cloud
TestOps in the Cloud
 
20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...
20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...
20150620 Meetup U-Qasar - Obtaining an integrated and objective overview of t...
 
Project Report 05_06_13
Project Report 05_06_13Project Report 05_06_13
Project Report 05_06_13
 

More from kt.mako

情報検索とゼロショット学習
情報検索とゼロショット学習情報検索とゼロショット学習
情報検索とゼロショット学習kt.mako
 
Context-guided Learning to Rank Entities
Context-guided Learning to Rank EntitiesContext-guided Learning to Rank Entities
Context-guided Learning to Rank Entitieskt.mako
 
情報アクセス技術のためのテストコレクション作成
情報アクセス技術のためのテストコレクション作成情報アクセス技術のためのテストコレクション作成
情報アクセス技術のためのテストコレクション作成kt.mako
 
筑波大学 図書館情報メディア系 知識獲得システム 研究紹介
筑波大学 図書館情報メディア系 知識獲得システム 研究紹介筑波大学 図書館情報メディア系 知識獲得システム 研究紹介
筑波大学 図書館情報メディア系 知識獲得システム 研究紹介kt.mako
 
DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]
DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]
DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]kt.mako
 
検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法
検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法
検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法kt.mako
 
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...kt.mako
 
情報検索のためのユーザモデル
情報検索のためのユーザモデル情報検索のためのユーザモデル
情報検索のためのユーザモデルkt.mako
 
MobileClick-2 キックオフイベント
MobileClick-2 キックオフイベントMobileClick-2 キックオフイベント
MobileClick-2 キックオフイベントkt.mako
 

More from kt.mako (9)

情報検索とゼロショット学習
情報検索とゼロショット学習情報検索とゼロショット学習
情報検索とゼロショット学習
 
Context-guided Learning to Rank Entities
Context-guided Learning to Rank EntitiesContext-guided Learning to Rank Entities
Context-guided Learning to Rank Entities
 
情報アクセス技術のためのテストコレクション作成
情報アクセス技術のためのテストコレクション作成情報アクセス技術のためのテストコレクション作成
情報アクセス技術のためのテストコレクション作成
 
筑波大学 図書館情報メディア系 知識獲得システム 研究紹介
筑波大学 図書館情報メディア系 知識獲得システム 研究紹介筑波大学 図書館情報メディア系 知識獲得システム 研究紹介
筑波大学 図書館情報メディア系 知識獲得システム 研究紹介
 
DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]
DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]
DEIM2017 私が愛したSIGIR Paper [京都大学 加藤誠]
 
検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法
検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法
検索評価ツールキットNTCIREVALを用いた様々な情報アクセス技術の評価方法
 
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...
Two-layered Summaries for Mobile Search: Does the Evaluation Measure Reflect ...
 
情報検索のためのユーザモデル
情報検索のためのユーザモデル情報検索のためのユーザモデル
情報検索のためのユーザモデル
 
MobileClick-2 キックオフイベント
MobileClick-2 キックオフイベントMobileClick-2 キックオフイベント
MobileClick-2 キックオフイベント
 

Recently uploaded

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 

Recently uploaded (20)

The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...

Overview of the NTCIR-12 MobileClick-2 Task

  • 1. Overview of the NTCIR-12 MobileClick-2 Task Makoto P. Kato (Kyoto U.), Tetsuya Sakai (Waseda U.), Takehiro Yamamoto (Kyoto U.), Virgil Pavlu (Northeastern U.), Hajime Morita (Kyoto U.), and Sumio Fujita (Yahoo Japan Corporation)
  • 3. "NTCIR": What's NTCIR? Your Search Stats: Clicks: 2, Time: 00:31
  • 4. "NTCIR": When is the deadline of NTCIR? Your Search Stats: Clicks: 2, Time: 00:29
  • 5. 30 seconds is too long for mobile users
  • 7. iUnit Summarization Subtask
      • Given a query, a set of iUnits, and a set of intents, generate a two-layered summary
      – Input: Query: NTCIR
      – Input: iUnit set: "A series of evaluation workshops"; "Designed to enhance IA research"; …
      – Input: Intents: News, Schedule, …
      – Output: Two-layered summary
          1st layer: The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc. [News] [Schedule] [Tasks]
          2nd layer: 20/Jan./2016: Task Registration Due; 06/Jan./2016: Document Set Release; Jan.-May/2016: Dry Run; Mar.-July/2016: Formal Run; 01/Aug./2016: Evaluation Results Due; 01/Aug./2016: Task overview release; 15/Sep./2016: Paper submission Due; 01/Nov./2016: All paper Due; 09-12/Dec./2016: NTCIR-11 Conference
      – Evaluation: M-measure (0.5 in this example), an evaluation metric designed for mobile information access
      – Challenge: Lay out iUnits so that any type of user can be immediately satisfied
  • 8. "NTCIR": What's NTCIR?
      Home | NTCIR: The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc.
      NTCIR-12: Held on June 9(Tue)-12(Fri), 2016 at National Center of Sciences, Tokyo, Japan
      NTCIR-12 News | NTCIR-12 Schedule | NTCIR-12 Tasks
      Your Search Stats: Clicks: 0, Time: 00:03
  • 9. "NTCIR": When is the deadline of NTCIR?
      Home | NTCIR: The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc.
      NTCIR-12: Held on June 9(Tue)-12(Fri), 2016 at National Center of Sciences, Tokyo, Japan
      NTCIR-12 News | NTCIR-12 Schedule | NTCIR-12 Tasks
      NTCIR-12 Schedule: 20/Jan./2016: Task Registration Due; 06/Jan./2016: Document Set Release; Jan.-May/2016: Dry Run; Mar.-July/2016: Formal Run; 01/Aug./2016: Evaluation Results Due; 01/Aug./2016: Task overview release; 15/Sep./2016: Paper submission Due; 01/Nov./2016: All paper Due; 09-12/Dec./2016: NTCIR-11 Conference
      Your Search Stats: Clicks: 1, Time: 00:15
  • 10. Is This Interface So Different from That of the Current Search Engine?
      Home | NTCIR: The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc.
      NTCIR-12: Held on June 9(Tue)-12(Fri), 2016 at National Center of Sciences, Tokyo, Japan
      NTCIR-12 News | NTCIR-12 Schedule | NTCIR-12 Tasks
      No. Thus, using this interface is not very unrealistic.
  • 11. "NTCIR": When is the deadline of NTCIR?
      Home | NTCIR (http://research.nii.ac.jp/ntcir/): The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc.
      NTCIR-11 (http://research.nii.ac.jp/ntcir/): Held on December 9(Tue)-12(Fri), 2014 at National Center of Sciences, Tokyo, Japan
      NTCIR-11 News (http://research.nii.ac.jp/ntcir/news) | NTCIR-11 Schedule (http://research.nii.ac.jp/ntcir/schedule) | NTCIR-11 Tasks (http://research.nii.ac.jp/ntcir/tasks)
      NTCIR-11 Schedule (http://research.nii.ac.jp/ntcir/schedule): 20/Jan./2014: Task Registration Due; 06/Jan./2014: Document Set Release; Jan.-May/2014: Dry Run; Mar.-July/2014: Formal Run; 01/Aug./2014: Evaluation Results Due; 01/Aug./2014: Early draft Task overview release; 15/Sep./2014: Draft paper submission Due; 01/Nov./2014: All camera-ready paper Due; 09-12/Dec./2014: NTCIR-11 Conference
      Your Search Score: Clicks: 1, Time: 00:15
      Goal of MobileClick: Provide Direct and Immediate Mobile Information Access
  • 13. Two Subtasks
      – iUnit Ranking Subtask: Query → Importance of iUnits
          Query: NTCIR
          1. A series of evaluation workshops
          2. Task Registration Due 20/Jun./2016
          3. Designed to enhance IA research
          …
      – iUnit Summarization Subtask: Query → Two-layered Summary
          1st layer: The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc. [News] [Schedule] [Tasks]
          2nd layer: 20/Jan./2016: Task Registration Due; 06/Jan./2016: Document Set Release; Jan.-May/2016: Dry Run; Mar.-July/2016: Formal Run; 01/Aug./2016: Evaluation Results Due; 01/Aug./2016: Task overview release; 15/Sep./2016: Paper submission Due; 01/Nov./2016: All paper Due; 09-12/Dec./2016: NTCIR-11 Conference
  • 14. iUnit Ranking Subtask
      • Given a query and a set of iUnits, rank them based on their estimated importance
        Note: iUnits are information pieces relevant to a given query
      – Input: Query: NTCIR
      – Input: iUnit set: "A series of evaluation workshops"; "Designed to enhance IA research"; "Task Registration Due 20/Jun./2016"
      – Output: iUnit list: 1. A series of evaluation workshops; 2. Task Registration Due 20/Jun./2016; 3. Designed to enhance IA research; …
      – Evaluation: nDCG (0.5 in this example)
      – Challenge: Predict the importance of strings rather than documents
  • 15. iUnit Summarization Subtask
      • Given a query, a set of iUnits, and a set of intents, generate a two-layered summary
      – Input: Query: NTCIR
      – Input: iUnit set: "A series of evaluation workshops"; "Designed to enhance IA research"; …
      – Input: Intents: News, Schedule, …
      – Output: Two-layered summary
          1st layer: The NTCIR Workshop is a series of evaluation workshops designed to enhance research in information access technologies including information retrieval, summarization, extraction, question answering, etc. [News] [Schedule] [Tasks]
          2nd layer: 20/Jan./2016: Task Registration Due; 06/Jan./2016: Document Set Release; Jan.-May/2016: Dry Run; Mar.-July/2016: Formal Run; 01/Aug./2016: Evaluation Results Due; 01/Aug./2016: Task overview release; 15/Sep./2016: Paper submission Due; 01/Nov./2016: All paper Due; 09-12/Dec./2016: NTCIR-11 Conference
      – Evaluation: M-measure (0.5 in this example), an evaluation metric designed for mobile information access
      – Challenge: Lay out iUnits so that any type of user can be immediately satisfied
  • 18. Overview of Data
      – Queries (e.g., "napoleon") → Web search → Documents
      – Documents → Extraction → iUnits (e.g., "Born on the island of Corsica", "Defeated at the Battle of Waterloo", "Established legal equality and religious toleration", "an innovator")
      – iUnits → Clustering → Intents (e.g., Achievement, Skill, Career)
      – iUnits are the input to iUnit ranking; iUnits and intents are the input to iUnit summarization
  • 19. Queries and Documents
      • Queries
      – 100 English/Japanese queries
      – Most of which were ambiguous/underspecified
      – Selected from four categories: celebrity, location, definition, and QA (similar to NTCIR 1CLICK-2)
      • Documents
      – 500 commercial search engine results for each query
      – From which iUnits were extracted
      Examples:
          CELEBRITY: hulk hogan; bruno mars; sharon stone
          LOCATION: bank adelanto; cafe killeen; cincinnati art museum
          DEFINITION: bitcoin; divers disease; windows 7
          QA: what is mirror made of; how to cook coleslaw; role of animal tail
  • 20. iUnits
      • Definition
      – Atomic information pieces relevant to a given query
      • The number of iUnits
      – 2,317 (23.8 iUnits per query) for English
      – 4,169 (41.7 iUnits per query) for Japanese
      Examples of iUnits for query “Napoleon”:
          Born on the island of Corsica
          General of the Army of Italy
          Defeated at the Battle of Waterloo
          One of the most controversial political figures
          won at the Battle of Wagram
          Established legal equality and religious toleration
          an innovator
          Baptised as a Catholic
          Absent during Peninsular War
          Cut off European trade with Britain
  • 21. iUnit Extractor
      • https://addons.mozilla.org/ja/firefox/addon/iunit-extractor/
      – Useful for nugget extraction, etc.
  • 22. Intents
      • An intent can be defined as
      – A specific interpretation of an ambiguous query (“Mac OS” and “car brand” for “jaguar”), or
      – An aspect of a faceted query (“windows 8” and “windows 10” for “windows”)
      • Obtained by clustering iUnits
      Example: iUnits ("Born on the island of Corsica", "Defeated at the Battle of Waterloo", "Established legal equality and religious toleration", "an innovator", "Absent during Peninsular War") → Clustering → Intents (Achievement, Skill, Career)
  • 23. Yahoo Search Query Data
      • Queries and their statistics related to our training and test query sets were provided by Yahoo Japan Corporation
      – Co-Click Queries: queries that share clicks with the query sets
      – Co-topic Queries: queries that include a query string in the query sets
      – Co-Session Queries: queries that appeared in the same session as the query sets
      • Used by participants for ranking iUnits and generating two-layered summaries
  • 24. Overview of Data (Repeated)
      – Queries (e.g., "napoleon") → Web search → Documents
      – Documents → Extraction → iUnits (e.g., "Born on the island of Corsica", "Defeated at the Battle of Waterloo", "Established legal equality and religious toleration", "an innovator")
      – iUnits → Clustering → Intents (e.g., Achievement, Skill, Career)
      – iUnits are the input to iUnit ranking; iUnits and intents are the input to iUnit summarization
  • 26. Per-intent iUnit Importance
      • Importance of iUnits in terms of an intent was given by two assessors on a 5-point scale
      – An iUnit is more important if it is more necessary for more users who are interested in the intent
      – The inter-rater agreement: 0.556 (weighted kappa)
      Per-intent iUnit importance in terms of intent “Definition”:
          A series of evaluation workshops: 5
          Task Registration Due 20/Jun./2016: 3
      Per-intent iUnit importance in terms of intent “Schedule”:
          A series of evaluation workshops: 2
          Task Registration Due 20/Jun./2016: 5
  • 27. Intent Probability
      • Intent probability was estimated by voting
      – $P(i|q)$: probability of having intent $i$ given query $q$
      – 10 assessors voted for one or more intents for a given query
      Intent voting (# of votes): Definition 4; Schedule 3; Tasks 3
      Intent probability: Definition 0.4; Schedule 0.3; Tasks 0.3
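The voting step on this slide can be sketched as a simple normalization of vote counts. This is a minimal illustration, assuming that $P(i|q)$ is the share of all votes that an intent received; the slide does not state the exact estimator, and the function name `intent_probabilities` is hypothetical. The vote counts are the ones shown on the slide.

```python
def intent_probabilities(votes):
    """Turn per-intent vote counts into probabilities P(i|q).

    Assumption: each probability is the intent's share of all votes cast
    for the query (the slide only says "estimated by voting").
    """
    total = sum(votes.values())
    if total == 0:
        raise ValueError("at least one vote is required")
    return {intent: count / total for intent, count in votes.items()}


# Vote counts from the slide's example for the query "NTCIR".
votes = {"Definition": 4, "Schedule": 3, "Tasks": 3}
probs = intent_probabilities(votes)
print(probs)
```

With these counts the result matches the probabilities listed on the slide (0.4, 0.3, 0.3).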
  • 28. Evaluation of iUnit Ranking
      • Evaluated in the same way as ad-hoc retrieval
      – Per-intent iUnit importance (in terms of intent “Schedule”): A series of evaluation workshops: 2; Task Registration Due 20/Jun./2016: 5
      – Intent probability: Definition 0.4; Schedule 0.3
      – Global importance: $G(u) = \sum_{i \in I_q} P(i|q)\, g_i(u)$
          $P(i|q)$: intent probability
          $g_i(u)$: per-intent importance
          $I_q$: intents for query $q$
      – Global importance in the example: A series of evaluation workshops: 3.8; Task Registration Due 20/Jun./2016: 2.5
      – Output: iUnit list with global importance (GI): 1. A series of evaluation … (GI 3.8); 2. Task Registration Due … (GI 2.5)
      – Metrics: nDCG@10 and Q-measure (0.87 in this example)
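A minimal sketch of how the global importance and an nDCG score could be computed from the quantities on this slide. Several things here are assumptions rather than the task's official implementation: the function names are hypothetical, the importance values for the "Tasks" intent are invented for illustration (so the resulting numbers differ from the 3.8 and 2.5 shown on the slide), and the nDCG variant uses linear gain with a log2(rank + 1) discount, which is only one common formulation.

```python
import math


def global_importance(intent_probs, per_intent_gain):
    """G(u) = sum over intents i of P(i|q) * g_i(u)."""
    return {
        unit: sum(p * gains.get(intent, 0.0) for intent, p in intent_probs.items())
        for unit, gains in per_intent_gain.items()
    }


def ndcg_at_k(ranked_units, gains, k=10):
    """nDCG@k with linear gain and log2(rank + 1) discount (assumed variant)."""
    dcg = sum(
        gains.get(unit, 0.0) / math.log2(rank + 1)
        for rank, unit in enumerate(ranked_units[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0


intent_probs = {"Definition": 0.4, "Schedule": 0.3, "Tasks": 0.3}
per_intent_gain = {
    # "Definition" and "Schedule" values are from the slides;
    # "Tasks" values are hypothetical.
    "A series of evaluation workshops": {"Definition": 5, "Schedule": 2, "Tasks": 1},
    "Task Registration Due 20/Jun./2016": {"Definition": 3, "Schedule": 5, "Tasks": 2},
}

G = global_importance(intent_probs, per_intent_gain)
best_run = sorted(G, key=G.get, reverse=True)   # ideal ordering
worst_run = list(reversed(best_run))            # reversed ordering
print(G, ndcg_at_k(best_run, G), ndcg_at_k(worst_run, G))
```

Because the global importance folds all intents into one score per iUnit, the ranking can then be scored like an ordinary ad-hoc retrieval run, which is the point the slide makes.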
  • 29. Evaluation of iUnit Summarization (Single-layer Case)
      • Consider single-layered summary evaluation
      • U-measure [Sakai and Dou. SIGIR2013]
      – Higher if more important iUnits appear earlier
      – Create a list of iUnits by assuming that users read text from left to right, from top to bottom
      Summary: $u_1$ (10 chars), $u_2$ (5 chars), $u_3$ (10 chars)
      Trailtext (reading path): $u_1$, $u_2$, $u_3$
      U-measure: $U = \sum_{r \ge 1} G(u_r)\,(1 - \mathrm{pos}(u_r)/L)$
          $u_r$: $r$-th iUnit
          $G(u)$: importance of $u$
          $\mathrm{pos}(u)$: offset of $u$ from the beginning
          $L$: patience parameter
      Example: $U = G(u_1)(1 - 10/L) + G(u_2)(1 - 15/L) + G(u_3)(1 - 25/L)$
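The U-measure example on this slide can be sketched in a few lines. This is an illustration under stated assumptions: `u_measure` is a hypothetical helper, the importance values and the patience parameter `L = 100` are made up, the offset of an iUnit is taken to be the position where it ends (which is what the slide's 10, 15, 25 imply), and the decay is clamped at zero, which the slide does not show.

```python
def u_measure(trailtext, importance, length, patience):
    """U = sum_r G(u_r) * max(0, 1 - pos(u_r) / L).

    pos(u_r) is the character offset at which u_r ends in the trailtext.
    The max(0, ...) clamp is an assumption not shown on the slide.
    """
    score = 0.0
    offset = 0
    for unit in trailtext:
        offset += length[unit]
        score += importance[unit] * max(0.0, 1.0 - offset / patience)
    return score


# Lengths follow the slide (10, 5, 10 chars); importances and L are hypothetical.
length = {"u1": 10, "u2": 5, "u3": 10}
importance = {"u1": 3.0, "u2": 2.0, "u3": 1.0}
L = 100

u_good = u_measure(["u1", "u2", "u3"], importance, length, L)
u_bad = u_measure(["u3", "u2", "u1"], importance, length, L)
print(u_good, u_bad)
```

Placing the most important iUnit first gives the higher score, which is the behaviour the slide describes ("higher if more important iUnits appear earlier").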
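  • 30. M-measure
      • M-measure
      – Expectation of U-measure over multiple trailtexts: $M = \sum_{\mathbf{t}} P(\mathbf{t})\, U(\mathbf{t})$
          $P(\mathbf{t})$: probability of trailtext $\mathbf{t}$
          $U(\mathbf{t})$: U-measure of trailtext $\mathbf{t}$
      • Generate trailtexts by assuming that
      – Users read a summary from the top of the first layer
      – Users click on an intent if they are interested in it
      Example (summary with link $l_1$ and iUnits $u_1$, $u_2$, $u_3$, $u_4$):
          User interested in Intent 1 ($P(i_1|q)$): trailtext $u_1$, $u_2$, $u_3$, $u_4$
          User interested in Intent 2 ($P(i_2|q)$): trailtext $u_1$, $u_2$, $u_3$
  • 31. Evaluation of iUnit Summarization (Two-layer Case)
      • Compute the expectation of U-measure
      Summary elements: links $l_1$, $l_2$; iUnits $u_1$, $u_2$, $u_3$, $u_4$, $u_5$, $u_6$
      Trailtexts (reading paths):
          $\mathbf{t}_1$: $u_1$, $u_2$, $u_3$, $u_4$, $u_5$; $U = 0.44$; $P(\mathbf{t}_1) = P(i_1|q) = 0.75$
          $\mathbf{t}_2$: $u_1$, $u_2$, $u_3$, $u_6$; $U = 0.12$; $P(\mathbf{t}_2) = P(i_2|q) = 0.25$, because trailtext $\mathbf{t}_2$ is read by users interested in $i_2$
      M-measure: $M = \sum_{\mathbf{t}} P(\mathbf{t})\, U(\mathbf{t}) = 0.75 \times 0.44 + 0.25 \times 0.12 = 0.36$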
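The expectation on this slide is a probability-weighted sum of per-trailtext U-measures. A minimal sketch, using the slide's own numbers; the function name `m_measure` and the pair-based input format are assumptions for illustration.

```python
def m_measure(trailtexts):
    """M = sum over trailtexts t of P(t) * U(t).

    `trailtexts` is a list of (probability, u_measure) pairs, one per trailtext.
    """
    return sum(prob * u for prob, u in trailtexts)


# Values from the slide: P(t1) = 0.75 with U = 0.44, P(t2) = 0.25 with U = 0.12.
m = m_measure([(0.75, 0.44), (0.25, 0.12)])
print(round(m, 2))
```

Here the probability of each trailtext is the probability of the intent whose link the user clicks, so a layout that serves the more likely intent well contributes more to the score.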
  • 33. iUnit Ranking (English): Submitted runs showed similar performance (a few statistically significant differences)
  • 34. iUnit Ranking (Japanese): UHYG, YJST, and rsrch significantly outperformed the baseline method [figure annotation: significant difference]
  • 35. iUnit Summarization (English): TITEC and YJST are the top and are not statistically distinguishable, but did not significantly outperform the best baseline [figure annotation: significant difference]
  • 36. iUnit Summarization (Japanese): YJST and UHYG significantly outperformed the baseline, and are not statistically distinguishable [figure annotation: significant differences]
  • 37. Approaches of Participants: Please come to our session! (DAY-3 (Thu) 9:00 – 10:30)
  • 39. MobileClick tool available at https://github.com/mpkato/mobileclick
      – 1 line for downloading the data
      – 5 lines to generate baseline results
  • 40. Leader Board System
      Q. When can we get our evaluation result? A. Right after you submit your run!
      [Diagram: Submission → Leader board → Feedback ("Got a result", "Impressive result!")]
      • Evaluation for test queries started from Nov 2015
      – Participants were allowed to submit one run per week
  • 41. Leader Board Timeline [figure: axis from 0 to 160; annotations: "Test data released", "Evaluation system released", "Consistent growth"]
  • 44. Possible Effects of Leader Board in NTCIR
      MobileClick-1:
      • No team outperformed the baseline
      • 4 teams participated
      • 14 runs were submitted
      MobileClick-2:
      • Statistically significant differences
      • 11 teams participated
      • 66 runs were submitted
  • 45. Summary
      • Goal of MobileClick: Provide direct and immediate mobile information access
      • Subtasks:
      – iUnit ranking
      – iUnit summarization
      • Results:
      – 11 teams submitted 66 runs
      – Participants outperformed the baseline in all the subtasks
      – Some teams showed significant improvement
      • Acknowledgements
      – Yahoo Japan Corporation
      – Wider Planet