These slides were presented at SEMANTiCS 2019 in Karlsruhe, Germany. The corresponding paper can be found here: https://link.springer.com/chapter/10.1007/978-3-030-33220-4_25
QUANT-Question Answering Benchmark Curator
1. QUANT-Question Answering Benchmark Curator
Ria Hari Gusmita, Rricha Jalota, Daniel Vollmers, Jan Reineke, Axel-Cyrille Ngonga Ngomo, and Ricardo Usbeck
March 24, 2020
Gusmita et al QUANT March 24, 2020 1 / 33
2. Outline
1 Motivation
2 Approach
3 Evaluation
4 QALD-specific Analysis
5 Conclusion & Future Work
3. Motivation
Drawback in evaluating question answering systems over knowledge bases
Evaluation is mainly based on benchmark datasets (benchmarks)
Challenge in maintaining high-quality benchmarks
4. Motivation
Challenge in maintaining high-quality benchmarks
Change of the underlying knowledge base
DBpedia 2016-04 → DBpedia 2016-10
http://dbpedia.org/resource/Surfing → http://dbpedia.org/resource/Surfer
http://dbpedia.org/ontology/seatingCapacity → http://dbpedia.org/property/capacity
http://dbpedia.org/property/portrayer → http://dbpedia.org/ontology/portrayer
http://dbpedia.org/property/establishedDate → http://dbpedia.org/ontology/foundingDate
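The renamings listed above can be read as a lookup table from old to new URIs. The following is a minimal Python sketch under that assumption; the helper `migrate_uris` and the example query are hypothetical and only illustrate the idea, they are not QUANT's actual implementation.

```python
# Old (DBpedia 2016-04) -> new (DBpedia 2016-10) URIs, copied from the slide above.
URI_CHANGES = {
    "http://dbpedia.org/resource/Surfing": "http://dbpedia.org/resource/Surfer",
    "http://dbpedia.org/ontology/seatingCapacity": "http://dbpedia.org/property/capacity",
    "http://dbpedia.org/property/portrayer": "http://dbpedia.org/ontology/portrayer",
    "http://dbpedia.org/property/establishedDate": "http://dbpedia.org/ontology/foundingDate",
}


def migrate_uris(query, changes=URI_CHANGES):
    """Replace every full URI written as <old> by <new> (hypothetical helper)."""
    for old, new in changes.items():
        # Matching the angle brackets avoids touching longer URIs that share a prefix.
        query = query.replace("<" + old + ">", "<" + new + ">")
    return query


# Hypothetical query that uses a predicate renamed between the two releases.
old_query = "SELECT ?c WHERE { ?s <http://dbpedia.org/ontology/seatingCapacity> ?c . }"
new_query = migrate_uris(old_query)
print(new_query)
```

A static table like this only covers changes that are already known, which is why the slides go on to describe suggestions derived per query.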
7. Contribution
QUANT, a framework for the intelligent creation and curation of QA benchmarks
Definition
Let $B$, $D$, and $Q$ denote a benchmark, a dataset, and a set of questions, respectively
$S$ represents QUANT’s suggestions
The $i$-th version of a QA benchmark is a pair $B_i = (D_i, Q_i)$
Given a query $q_{ij} \in Q_i$ with zero results on $D_k$ with $k > i$:
$S : q_{ij} \longrightarrow q'_{ij}$ (the suggested query)
QUANT aims
to ensure that queries from $B_i$ can be reused for $B_k$
to speed up the curation process as compared to the existing one
8. What QUANT supports
1 Creation of SPARQL queries
2 The validity of benchmark metadata
3 Spelling and grammatical correctness of questions
10. Approach
Smart suggestions
1 SPARQL suggestion
2 Metadata suggestion
3 Multilingual Questions and Keywords Suggestion
11. Smart suggestion
1. How the SPARQL suggestion module works
13. 1. SPARQL suggestion
Missing prefix
The original SPARQL query
SELECT ?s
WHERE {
  res:New_Delhi dbo:country ?s .
}
The suggested SPARQL query
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX res: <http://dbpedia.org/resource/>
SELECT ?s
WHERE {
  res:New_Delhi dbo:country ?s .
}
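One way to picture the missing-prefix suggestion is to compare the prefixes a query uses with the ones it declares. The Python sketch below does this with regular expressions. The function `add_missing_prefixes` and the small prefix table are assumptions made for illustration, and real SPARQL parsing is more involved than this.

```python
import re

# Small table of well-known namespaces (an illustrative assumption).
KNOWN_PREFIXES = {
    "dbo": "http://dbpedia.org/ontology/",
    "res": "http://dbpedia.org/resource/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
}


def add_missing_prefixes(query, known=KNOWN_PREFIXES):
    """Prepend PREFIX declarations for used but undeclared prefixes (hypothetical helper)."""
    declared = set(re.findall(r"PREFIX\s+([A-Za-z][\w-]*)\s*:", query, flags=re.IGNORECASE))
    # Drop <...> IRIs and "..." literals so that e.g. "http:" is not mistaken for a prefix.
    body = re.sub(r'<[^>]*>|"[^"]*"', " ", query)
    used = set(re.findall(r"(?<![\w?$])([A-Za-z][\w-]*):[A-Za-z_]", body))
    missing = sorted(p for p in used - declared if p in known)
    header = "".join("PREFIX {}: <{}>\n".format(p, known[p]) for p in missing)
    return header + query


original = "SELECT ?s\nWHERE {\n  res:New_Delhi dbo:country ?s .\n}"
fixed = add_missing_prefixes(original)
print(fixed)
```

Running the helper on its own output adds nothing further, since all used prefixes are then declared.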
14. 1. SPARQL suggestion
Predicate change
The original SPARQL query
SELECT ?date
WHERE {
  ?website rdf:type onto:Software .
  ?website onto:releaseDate ?date .
  ?website rdfs:label "DBpedia" .
}
15. 1. SPARQL suggestion
Predicate change
The suggested SPARQL query
SELECT ?date
WHERE {
  ?website rdf:type onto:Software .
  ?website rdfs:label "DBpedia" .
  ?website dbp:latestReleaseDate ?date .
}
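The slides do not say how the replacement predicate is found. One plausible heuristic, shown here purely as an assumption, is to rank the predicates that actually exist for the subject by string similarity to the outdated predicate name. The candidate list and the helper `suggest_predicate` are hypothetical.

```python
from difflib import SequenceMatcher

# Toy stand-in for the predicates found on the subject in the new dataset
# (hypothetical data; a real check would query the SPARQL endpoint).
AVAILABLE_PREDICATES = ["rdf:type", "rdfs:label", "dbp:latestReleaseDate", "dbo:developer"]


def local_name(prefixed):
    """Return the lower-cased part after the prefix, e.g. 'onto:releaseDate' -> 'releasedate'."""
    return prefixed.split(":", 1)[1].lower()


def suggest_predicate(old, available=AVAILABLE_PREDICATES, cutoff=0.6):
    """Return the available predicate most similar to `old`, or None if none is close enough."""
    scored = [
        (SequenceMatcher(None, local_name(old), local_name(p)).ratio(), p)
        for p in available
    ]
    score, best = max(scored)
    return best if score >= cutoff else None


print(suggest_predicate("onto:releaseDate"))
```

The `cutoff` value is arbitrary. When no candidate passes it, the helper returns `None`, which corresponds to the case where only a warning can be given.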
17. 1. SPARQL suggestion
Predicate missing
The original SPARQL query
SELECT ?uri
WHERE {
  ?subject rdfs:label "Tom Hanks" .
  ?subject foaf:homepage ?uri
}
The suggested SPARQL query
The predicate foaf:homepage is missing in ?subject foaf:homepage ?uri
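When no replacement exists, the suggestion is a message rather than a rewritten query. A minimal Python sketch of such a check follows, assuming the triple patterns are already parsed into tuples and that the set of predicates present in the dataset is known. Both the helper `check_predicates` and the `known` set are hypothetical.

```python
def check_predicates(patterns, known_predicates):
    """Return one message per triple pattern whose predicate is not in the dataset."""
    messages = []
    for subject, predicate, obj in patterns:
        if predicate not in known_predicates:
            messages.append(
                "The predicate {p} is missing in {s} {p} {o}".format(
                    s=subject, p=predicate, o=obj
                )
            )
    return messages


# Triple patterns of the query above, written as (subject, predicate, object).
patterns = [
    ("?subject", "rdfs:label", '"Tom Hanks"'),
    ("?subject", "foaf:homepage", "?uri"),
]
known = {"rdfs:label", "rdf:type"}  # hypothetical: foaf:homepage no longer present
messages = check_predicates(patterns, known)
print(messages)
```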
19. 1. SPARQL suggestion
Entity change
The original SPARQL query
SELECT ?uri WHERE
{ ?uri rdf:type yago:CapitalsInEurope }
The suggested SPARQL query
SELECT ?uri WHERE
{ ?uri rdf:type yago:WikicatCapitalsInEurope }
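In the entity change above, the new class name contains the old one. A simple way to look for such candidates, shown here only as an illustrative assumption, is a containment test within the same namespace. The helper `suggest_entity` and the class list are hypothetical.

```python
def suggest_entity(old, candidates):
    """Return candidates in the same namespace whose local name contains the old local name."""
    prefix, name = old.split(":", 1)
    return [
        c
        for c in candidates
        if c != old and c.startswith(prefix + ":") and name in c.split(":", 1)[1]
    ]


# Hypothetical list of classes that exist in the newer dataset.
classes = ["yago:WikicatCapitalsInEurope", "yago:WikicatCapitalsInAsia", "dbo:City"]
suggestions = suggest_entity("yago:CapitalsInEurope", classes)
print(suggestions)
```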
21. 3. Multilingual questions and keywords suggestion
Question with missing keywords and translations
22. 3. Multilingual questions and keywords suggestion
Generated keywords: state, united, states, america, highest, density
Using the Trans Shell tool → suggested keyword translations
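The generated keywords listed above look like the content words of the question. The Python sketch below reproduces that list by removing stop words. The question text and the stop-word list are assumptions, since the slide shows the question only as an image, and QUANT's actual keyword generation may work differently.

```python
import re

# Small illustrative stop-word list (an assumption, not QUANT's actual list).
STOP_WORDS = {"which", "what", "who", "of", "the", "a", "an", "is", "in", "has"}


def extract_keywords(question):
    """Lower-case the question, split it into words, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", question.lower())
    return [t for t in tokens if t not in STOP_WORDS]


# Assumed wording of the question behind the keywords on the slide.
question = "Which state of the United States of America has the highest density?"
keywords = extract_keywords(question)
print(keywords)
```

Each keyword could then be passed to a translation tool to obtain the per-language keyword suggestions.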
23. 3. Multilingual questions and keywords suggestion
Suggested Question Translations
24. Evaluation
Three goals of the evaluation:
1 QUANT vs manual curation
  Graduate students curated 50 questions using QUANT and another 50 questions manually
  23 minutes vs 278 minutes
2 Effectiveness of smart suggestions
  10 expert users were involved in creating a new joint benchmark, called QALD-9, with 653 questions
3 QUANT’s capability to provide a high-quality benchmark dataset
  The inter-rater agreement between each pair of users amounts to 0.83 on average
Group           Inter-rater agreement
1st Two-Users   0.97
2nd Two-Users   0.72
3rd Two-Users   0.88
4th Two-Users   0.77
5th Two-Users   0.96
Average         0.83
25. Evaluation
User acceptance rate in %
[Figure: acceptance rate per user; x-axis: list of users (User 1 to User 10), y-axis: acceptance rate in % (0 to 100)]
QUANT provided 2380 suggestions; the average user acceptance rate is 81%
The four highest acceptance rates are for QALD-7 and QALD-8
26. Evaluation
Number of accepted suggestions from all users
[Figure: number of accepted suggestions per user; x-axis: list of users (User 1 to User 10), y-axis: number of accepted suggestions (0 to 500); series: SPARQL Query, Question Translations, Out of Scope, Onlydbo, Keywords Translations, Hybrid, Answer Type, Aggregation]
Most users accepted the suggestions for the out-of-scope metadata
Keyword and question translation suggestions yielded the second and third highest acceptance rates
27. Evaluation
Number of users who accepted QUANT’s suggestions for each question attribute
[Figure: percentage of users who accepted suggestions per attribute; x-axis: name of attributes (Aggregation, Answer Type, Hybrid, Keywords Translations, Only Dbo, Out of Scope, Question Translations, SPARQL Query), y-axis: number of users who accepted suggestions in %]
83.75% of the users accepted QUANT’s smart suggestions on average
Hybrid and SPARQL suggestions were only accepted by 2 and 5 users respectively.
28. Evaluation
Number of suggestions provided by users
[Figure: number of provided suggestions per user; x-axis: list of users (User 1 to User 10), y-axis: number of provided suggestions (0 to 40); series: SPARQL Query, Question Translations, Out of Scope, Onlydbo, Keywords Translations, Hybrid, Answer Type, Aggregation]
Answer type, onlydbo, out-of-scope, and SPARQL query metadata were the attributes whose values were redefined by users
29. QALD-specific Analysis
There are 1924 questions in total: 1442 training questions and 482 test questions
30. QALD-specific Analysis
Duplicate removal resulted in 655 unique questions
Removing 2 semantically similar questions produced 653 questions
Using QUANT with 10 expert users, we obtained 558 benchmark questions in total → an increase of 110.6% over the size of QALD-8
The new benchmark forms the QALD-9 dataset
[Figure: distribution of unique questions in all QALD versions]
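The duplicate-removal step mentioned above can be pictured as comparing questions after a light normalization. The Python sketch below is only an illustration under that assumption: the helpers `normalize` and `deduplicate` and the sample questions are hypothetical, and the slides do not describe how duplicates were actually detected.

```python
import re


def normalize(question):
    """Lower-case and drop punctuation so that trivial variants compare equal."""
    return " ".join(re.findall(r"[a-z0-9]+", question.lower()))


def deduplicate(questions):
    """Keep the first occurrence of every question, compared after normalization."""
    seen = set()
    unique = []
    for q in questions:
        key = normalize(q)
        if key not in seen:
            seen.add(key)
            unique.append(q)
    return unique


# Hypothetical sample; the first two entries differ only in case and punctuation.
sample = [
    "Who is the mayor of Berlin?",
    "who is the mayor of Berlin",
    "Give me all cosmonauts.",
]
unique_questions = deduplicate(sample)
print(unique_questions)
```

Semantically similar questions with different wording are not caught by this kind of check, which matches the separate manual step on the slide.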
31. Conclusion
QUANT’s evaluation highlights the need for better datasets and their maintenance
QUANT speeds up the curation process by up to 91%
Smart suggestions motivate users to engage in more attribute corrections than if there were no hints
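As a rough cross-check, and assuming that the 23 minutes reported in the evaluation refer to curation with QUANT and the 278 minutes to manual curation, the relative time saving works out as follows:

$$\frac{278 - 23}{278} = \frac{255}{278} \approx 0.917$$

This is in line with the speed-up of up to 91% stated above.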
32. Future Work
There is a need to invest more time into SPARQL suggestions as only 5 users accepted them
We plan to support more file formats based on our internal library
33. Thank you for your attention!
Ria Hari Gusmita
ria.hari.gusmita@uni-paderborn.de
https://github.com/dice-group/QUANT
DICE Group at Paderborn University
https://dice-research.org/team/profiles/gusmita/