Using Neural SPARQL Machines to translate an utterance into a structured query for question answering over the Linked Open Data cloud.
Invited talk at the 6th Leipzig Semantic Web Day (LSWT2018).
Translating Natural Language into SPARQL for Neural Question Answering
1. TRANSLATING NATURAL LANGUAGE INTO SPARQL FOR NEURAL QUESTION ANSWERING
Tommaso Soru
AKSW, University of Leipzig, Germany
6. Leipziger Semantic Web Tag (LSWT2018) – 18.06.2018
2. LINKED OPEN DATA
👍 >10K published datasets
👍 ~150B triples as (s, p, o)
👎 Low accessibility
lod-cloud.net
3. SPARQL QUERY LANGUAGE
SELECT ?x WHERE {
?x a ontology:Person .
?x ontology:birthPlace dbpedia:Leipzig .
}
dbpedia:Walter_Ulbricht
dbpedia:Anita_Berber
dbpedia:Martin_Benno_Schmidt
…
4. NATURAL LANGUAGE TO SPARQL
SELECT ?x WHERE {
?x a ontology:Person .
?x ontology:birthPlace dbpedia:Leipzig .
}
people born in Leipzig
who was born in Leipzig?
Leipzig is the birth place of whom?
5. MODELING NATURAL LANGUAGE
• Model semantics at word and phrase level.
• Be robust to small imperfections (e.g., a missing article).
• Handle question compositionality.
• Work with all human languages.
Language Model using Recurrent Neural Networks!
9. THE GENERATOR
Build question-query pairs from a set of manually-annotated templates.
where was <A> born?
select var_x where brack_open <A> dbo_birthPlace var_x sep_dot brack_close
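The template-filling step can be sketched as follows. This is a minimal illustration, not the NSpM generator itself: the function name `generate_pairs` and the entity list are assumptions, with the two people taken from the earlier query-result slide and the `dbr_` token prefix taken from the interpreter slide.

```python
# Template as shown on the slide: a question pattern and its encoded query pattern.
TEMPLATE = (
    "where was <A> born?",
    "select var_x where brack_open <A> dbo_birthPlace var_x sep_dot brack_close",
)

# Hypothetical entity list: (natural-language label, encoded resource token).
ENTITIES = [
    ("Walter Ulbricht", "dbr_Walter_Ulbricht"),
    ("Anita Berber", "dbr_Anita_Berber"),
]


def generate_pairs(template, entities):
    """Fill the <A> placeholder on both sides of the template."""
    question, query = template
    return [
        (question.replace("<A>", label), query.replace("<A>", token))
        for label, token in entities
    ]


pairs = generate_pairs(TEMPLATE, ENTITIES)
for question, query in pairs:
    print(question, "->", query)
```

Each pair is one training example for the translation model: the question is the source sequence and the encoded query is the target sequence.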
10. CHALLENGE #1: TEMPLATE DISCOVERY
where was <A> born?
select var_x where brack_open <A> dbo_birthPlace var_x sep_dot brack_close
[…] Joe Abercrombie (born 1974) – fantasy writer and film editor, was born in Lancaster and attended LRGS […]
Idea! Mine templates from a large text corpus using entity pairs.
dbpedia:Joe_Abercrombie ontology:birthPlace dbpedia:Lancaster
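One way to picture the mining idea: given a triple from the knowledge base and a sentence that mentions both entities, replace the entity labels with placeholders to get a candidate template. The sketch below is an assumption about how such a step could look, using plain string matching; the function name `mine_template` is hypothetical and the talk does not describe the actual procedure.

```python
import re


def mine_template(sentence, subject_label, object_label):
    """Return a candidate template with <A>/<B> placeholders.

    Returns None when either entity label is missing from the sentence.
    """
    if subject_label not in sentence or object_label not in sentence:
        return None
    template = sentence.replace(subject_label, "<A>").replace(object_label, "<B>")
    # Normalise whitespace left over from the corpus extraction.
    return re.sub(r"\s+", " ", template).strip()


# Sentence and entity pair from the slide; the pair is linked by ontology:birthPlace.
sentence = (
    "Joe Abercrombie (born 1974) – fantasy writer and film "
    "editor, was born in Lancaster and attended LRGS"
)
candidate = mine_template(sentence, "Joe Abercrombie", "Lancaster")
print(candidate)
```

A candidate like this would still need to be trimmed and turned into a question form before it can be paired with a query pattern for `ontology:birthPlace`.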
12. CHALLENGE #2: WORD EXPANSION
How to deal with synonyms and out-of-vocabulary words?
Credits: github.com/ahaas/synonymvis
Distributional Semantics
Similar words are represented by similar vectors (or word embeddings).
Language model handles word disambiguation using context.
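The "similar words, similar vectors" idea can be illustrated with cosine similarity. The three-dimensional vectors below are invented for the example and are not real embeddings; the helper names `cosine` and `nearest` are likewise assumptions.

```python
import math

# Toy vectors, made up for illustration only (real embeddings have hundreds of dimensions).
EMBEDDINGS = {
    "born": [0.9, 0.1, 0.0],
    "birthplace": [0.8, 0.2, 0.1],
    "capital": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word, embeddings):
    """Return the other word whose vector is most similar to the given word."""
    others = (w for w in embeddings if w != word)
    return max(others, key=lambda w: cosine(embeddings[word], embeddings[w]))


print(nearest("born", EMBEDDINGS))
```

With vectors like these, a word that was never seen in a template can still be mapped near a known word, which is the intuition behind handling synonyms and out-of-vocabulary words.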
13. THE INTERPRETER
Sequence interpretation for SPARQL query reconstruction.
select var_x where brack_open var_x rdf_type dbo_Person sep_dot var_x dbo_birthPlace dbr_Leipzig
(missing brack_close)
SELECT ?x WHERE {
?x a ontology:Person .
?x ontology:birthPlace dbpedia:Leipzig
}
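The reconstruction step can be sketched as a token-by-token decoder that also closes any bracket the model left open. The token mapping is inferred from the examples on the slides and is an assumption, as are the function names; the output uses the `dbo:`/`dbr:` prefixes and lowercase keywords rather than the exact layout shown above.

```python
# Assumed mapping from encoded tokens to SPARQL syntax, inferred from the slides.
SYMBOLS = {"brack_open": "{", "brack_close": "}", "sep_dot": ".", "rdf_type": "a"}
PREFIXES = {"dbo_": "dbo:", "dbr_": "dbr:"}


def decode_token(token):
    """Translate one encoded token back into SPARQL syntax."""
    if token in SYMBOLS:
        return SYMBOLS[token]
    if token.startswith("var_"):
        return "?" + token[len("var_"):]
    for encoded, prefix in PREFIXES.items():
        if token.startswith(encoded):
            return prefix + token[len(encoded):]
    return token  # keywords such as select/where pass through unchanged


def interpret(sequence):
    """Decode a token sequence and append any missing closing brackets."""
    tokens = [decode_token(t) for t in sequence.split()]
    missing = tokens.count("{") - tokens.count("}")
    tokens.extend(["}"] * max(missing, 0))
    return " ".join(tokens)


sequence = (
    "select var_x where brack_open var_x rdf_type dbo_Person "
    "sep_dot var_x dbo_birthPlace dbr_Leipzig"
)
print(interpret(sequence))
```

The bracket repair is the simplest possible fix; a fuller interpreter would also need to handle other malformed outputs.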
14. CHALLENGE #3: COMPOSITIONALITY
?x a ontology:Person .
?x dbo:birthPlace dbr:Dresden .
people born in Dresden
+
dbr:Saxony dbo:capital ?x .
what’s the capital of Saxony?
=
?x a ontology:Person .
?x dbo:birthPlace ?y .
dbr:Saxony dbo:capital ?y .
people born in the capital of Saxony
Learn the correct variable assignments in the reconstructed query.
Curriculum Learning
Learn to translate in baby steps.
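A curriculum can be approximated by presenting simple question-query pairs before composed ones. The slides do not say how examples are ordered, so the complexity measure below (number of triple patterns, then question length) and the names `complexity` and `curriculum` are assumptions made for illustration.

```python
def complexity(pair):
    """Rank a pair by triple-pattern count, then by question length in words."""
    question, query = pair
    # In the encoded form, each triple pattern ends with a sep_dot token.
    return (query.split().count("sep_dot"), len(question.split()))


def curriculum(pairs):
    """Order training pairs from simplest to most complex."""
    return sorted(pairs, key=complexity)


# The three examples from the compositionality slide, in encoded form.
training_pairs = [
    (
        "people born in the capital of Saxony",
        "select var_x where brack_open var_x rdf_type dbo_Person sep_dot "
        "var_x dbo_birthPlace var_y sep_dot dbr_Saxony dbo_capital var_y sep_dot brack_close",
    ),
    (
        "people born in Dresden",
        "select var_x where brack_open var_x rdf_type dbo_Person sep_dot "
        "var_x dbo_birthPlace dbr_Dresden sep_dot brack_close",
    ),
    (
        "what's the capital of Saxony?",
        "select var_x where brack_open dbr_Saxony dbo_capital var_x sep_dot brack_close",
    ),
]

for question, _ in curriculum(training_pairs):
    print(question)
```

Under this ordering the model would see the two single-relation questions before the composed one that combines them.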
15. CURRENT STATE
• Non-funded work
• Involving people from these institutes:
• AKSW, University of Leipzig
• HTWK / Leipzig University of Applied Sciences
• Paderborn University
• Bonn University
• DBpedia’s Google Summer of Code 2018
• Looking for partnerships!
16. Tommaso Soru
AKSW Research Group
University of Leipzig
Germany
tsoru@informatik.uni-leipzig.de
http://tommaso-soru.it
🤖 https://github.com/AKSW/NSpM
Thank you.