Taking up the Gaokao Challenge: An Information Retrieval Approach
1. Taking up the Gaokao Challenge:
An Information Retrieval Approach
Gong Cheng, Weixi Zhu, Ziwei Wang, Jianghui Chen, Yuzhong Qu
National Key Laboratory for Novel Software Technology
Nanjing University, China
Websoft
2. What is Gaokao?
• National Higher Education Entrance Examination, a.k.a. China’s SAT, but
• more difficult
• testing more subjects:
• Chinese literature, Mathematics, English language, and
• Humanities (History, Geography, Political Education) or Natural Sciences (Physics, Chemistry, Biology)
3. Why is Gaokao difficult for the computer?
• Questions in Gaokao challenge QA systems:
• multiple sentences to understand
• domain-specific expressions to parse (e.g., quotes, formulas, maps)
• unspecified, seemingly unbounded resources to search
5. Overview of the Approach
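• Human: Recollecting, Relevance, Knowledge, Drawing, Evidence, Reasoning (diagram labels)
• Computer: Stage 1 (Retrieving Pages), Stage 2 (Ranking and Filtering Pages), Stage 3 (Assessing Options)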
6. Stage 1: Retrieving Pages
• Retrieving Concept Pages
(Step 1/2: leftmost longest title matching)
The Effectiveness of Confucianism chapter of the Xunzi said:
(The King of Zhou) ruled the land and founded seventy-one (feudal) states,
of which fifty-three (governors) were from the Ji family.
It shows that in the Zhou Dynasty, feoffments were mainly granted to relatives.
The King of Zhou would invest those relatives with
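Leftmost longest title matching can be sketched as a greedy scan: at each position, try the longest span first and skip past it once it matches a known page title. The sketch below is only an illustration under stated assumptions. It works on whitespace-separated English words, the `max_len` cap and the sample title list are hypothetical, and the slides do not say how the original system tokenizes text.

```python
import re

def tokenize(text):
    # Lowercase word tokens; punctuation other than hyphens is dropped.
    return re.findall(r"[\w-]+", text.lower())

def match_titles(text, titles, max_len=5):
    """Greedy leftmost-longest matching of page titles against a text."""
    words = tokenize(text)
    known = {" ".join(tokenize(t)) for t in titles}
    found, i = [], 0
    while i < len(words):
        # Try the longest span starting at position i first.
        for n in range(min(max_len, len(words) - i), 0, -1):
            span = " ".join(words[i:i + n])
            if span in known:
                found.append(span)
                i += n  # continue after the matched span
                break
        else:
            i += 1  # no title starts here; move one word to the right
    return found

# Hypothetical title list, for illustration only.
titles = ["Zhou", "Zhou Dynasty", "King of Zhou", "Xunzi"]
print(match_titles("It shows that in the Zhou Dynasty, feoffments "
                   "were mainly granted to relatives.", titles))
# prints ['zhou dynasty']
```

Because longer spans are tried first, "Zhou Dynasty" wins over the shorter title "Zhou" that starts at the same position.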
7. Stage 1: Retrieving Pages
• Retrieving Concept Pages
(Step 2/2: context-based disambiguation)
Feudalism may refer to:
• Feudalism (in China), existed during the Shang and the Zhou dynasty, and was replaced by
centralization of authority during and after the Qin dynasty…
• Feudalism (in Europe), prevailed in the Middle Ages from the 5th to the 15th century…
• Feudalism (in Japan), originated from ritsuryo in the Heian period from the 8th to the 12th
century…
The Effectiveness of Confucianism chapter of the Xunzi said:
(The King of Zhou) ruled the land and founded seventy-one (feudal) states,
of which fifty-three (governors) were from the Ji family.
It shows that in the Zhou Dynasty, feoffments were mainly granted to relatives.
The King of Zhou would invest those relatives with
matching a disambiguation page
Context helps disambiguation.
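One way to picture context-based disambiguation is to compare the question text with the description of each candidate sense and keep the most similar one. The sketch below is an assumption-laden illustration: the bag-of-words cosine similarity and the tiny `STOPWORDS` set are stand-ins, and the slides do not specify the similarity measure the original system uses.

```python
import math
import re
from collections import Counter

# Small illustrative stopword list (an assumption, not from the slides).
STOPWORDS = {"the", "in", "to", "of", "and", "from", "a", "was", "were", "by"}

def bag_of_words(text):
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def disambiguate(context, candidates):
    """Return the candidate title whose description best matches the context."""
    ctx = bag_of_words(context)
    return max(candidates,
               key=lambda title: cosine(ctx, bag_of_words(candidates[title])))

candidates = {
    "Feudalism (in China)": "existed during the Shang and the Zhou dynasty",
    "Feudalism (in Europe)": "prevailed in the Middle Ages from the 5th to the 15th century",
    "Feudalism (in Japan)": "originated from ritsuryo in the Heian period",
}
context = "in the Zhou Dynasty, feoffments were mainly granted to relatives"
print(disambiguate(context, candidates))
# prints Feudalism (in China)
```

Without the stopword filter, frequent words such as "the" would dominate the score, which is why some form of term weighting is needed in practice.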
8. Stage 1: Retrieving Pages
• Retrieving Quote Pages
(exact content match)
The Effectiveness of Confucianism chapter of the Xunzi said:
(The King of Zhou) ruled the land and founded seventy-one (feudal) states,
of which fifty-three (governors) were from the Ji family.
It shows that in the Zhou Dynasty, feoffments were mainly granted to relatives.
The King of Zhou would invest those relatives with
matching the content of a page
a quote
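Exact content matching for quotes can be approximated by a substring search over page contents. In the sketch below, the normalization step (lowercasing, dropping punctuation) and the two sample pages are assumptions added for illustration. The slides only state that a quote is matched against the content of a page.

```python
import re

def normalize(text):
    # Lowercase and drop punctuation so formatting differences do not block a match.
    return " ".join(re.findall(r"\w+", text.lower()))

def find_quote_pages(quote, pages):
    """Return titles of pages whose content contains the quote word for word."""
    needle = f" {normalize(quote)} "
    return [title for title, content in pages.items()
            if needle in f" {normalize(content)} "]

# Toy page contents, invented for this example.
pages = {
    "Xunzi": "The king ruled the land and founded seventy-one states, "
             "of which fifty-three were from the Ji family.",
    "Mencius": "Human nature is good.",
}
print(find_quote_pages("ruled the land and founded seventy-one states", pages))
# prints ['Xunzi']
```

Padding both strings with spaces keeps the match aligned to whole words, so a quote cannot match in the middle of a longer word.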
9. Stage 2: Ranking and Filtering Pages
• Centrality-based Ranking
(centrality = cosine similarity between a page and the center of the retrieved pages)
• Three kinds of vector space:
• words in a page
• links in a page
• categories of a page
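Centrality-based ranking can be sketched as follows: build one vector per retrieved page, sum the vectors to get a center, and rank pages by cosine similarity to that center. The code is a minimal sketch for a single vector space (word counts). Taking the center as the plain sum of the page vectors is an assumption, and the slides do not say how the three vector spaces are combined.

```python
import math
from collections import Counter

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_by_centrality(page_vectors):
    """Rank page titles by cosine similarity to the center of all pages."""
    center = Counter()
    for vec in page_vectors.values():
        center.update(vec)  # adds counts; cosine ignores the overall scale
    scores = {title: cosine(vec, center) for title, vec in page_vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy word-count vectors, invented for this example.
page_vectors = {
    "Fengjian": {"zhou": 2, "feudal": 1},
    "Zhou dynasty": {"zhou": 1, "feudal": 2},
    "Football": {"football": 3},
}
print(rank_by_centrality(page_vectors)[-1])
# prints Football
```

The same function could be applied to link or category vectors by swapping the keys of each page vector.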
10. Stage 2: Ranking and Filtering Pages
• Domain-based Filtering
(within historical categories,
i.e., categories whose names contain the word
history, and their descendant categories)
• Relevance-based Ranking
(relevance to the stem and options)
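Domain-based filtering, as described above, keeps only pages that belong to a historical category, meaning a category whose name contains "history" or any descendant of such a category. A minimal sketch, assuming the category hierarchy is available as a plain dictionary from a category to its direct subcategories (the data structures and sample names are hypothetical):

```python
from collections import deque

def historical_categories(subcategories, keyword="history"):
    """Collect categories whose name contains the keyword, plus all descendants."""
    all_cats = set(subcategories)
    for children in subcategories.values():
        all_cats.update(children)
    seen = {c for c in all_cats if keyword in c.lower()}
    queue = deque(seen)
    while queue:  # breadth-first walk down the category tree
        for child in subcategories.get(queue.popleft(), []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen

def filter_pages(page_categories, allowed):
    """Keep pages that have at least one allowed category."""
    return [page for page, cats in page_categories.items() if allowed & set(cats)]

# Toy category graph and page assignments, invented for this example.
subcategories = {
    "History of China": ["Zhou dynasty"],
    "Zhou dynasty": ["Zhou dynasty kings"],
    "Sports": ["Football"],
}
page_categories = {"Fengjian": ["Zhou dynasty"], "FIFA": ["Football"]}
allowed = historical_categories(subcategories)
print(filter_pages(page_categories, allowed))
# prints ['Fengjian']
```

The `seen` set also guards against cycles, which can occur in real category graphs.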
11. Stage 3: Assessing Options
• truth of an option
= extent to which question/pages can entail it
= extent to which pages from the stem can entail it
+ extent to which pages from it can entail the stem
(entailment = relevance measurement)
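The two-way score above can be sketched in code: one term measures how well pages retrieved from the stem cover the option, the other how well pages retrieved from the option cover the stem. The `relevance` function below is a stand-in (fraction of target words found in the pages). The slides only state that entailment is approximated by a relevance measurement, and the stem, options, and pages are toy data, not the options of the real question.

```python
import re

def words(text):
    return set(re.findall(r"\w+", text.lower()))

def relevance(pages, target):
    """Fraction of the target's words that occur in the pages (stand-in measure)."""
    target_words = words(target)
    if not target_words:
        return 0.0
    page_words = set().union(*(words(p) for p in pages))
    return len(target_words & page_words) / len(target_words)

def truth(stem, stem_pages, option, option_pages):
    # Pages from the stem support the option, plus pages from the option support the stem.
    return relevance(stem_pages, option) + relevance(option_pages, stem)

def best_option(stem, stem_pages, options):
    """options maps each option text to the page texts retrieved from it."""
    return max(options, key=lambda o: truth(stem, stem_pages, o, options[o]))

# Toy data, invented for this example.
stem = "The King of Zhou granted feoffments to relatives"
stem_pages = ["Zhou kings granted hereditary land to relatives and nobles"]
options = {
    "hereditary land": ["Hereditary land was granted by the King of Zhou to relatives"],
    "modern factories": ["Factories use machines"],
}
print(best_option(stem, stem_pages, options))
# prints hereditary land
```

Summing the two directions means an option scores well only if the evidence connects it to the stem from both sides.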
12. Experiments
• Real-life questions in Gaokao or mock Gaokao
• QS-A: 123 questions, answerable based on Wikipedia
• QS-B: 454 questions, out of the scope of Wikipedia
• Correctly answered questions:
• QS-A: 43.09%
• QS-B: 31.28%