1. Semantic Parsing and Representation for Software Requirements
2. Semantic Parsing
Definition: process of mapping natural language
text into a formal representation of its meaning.
Ewan forgot the mozzarella in his car.
∃x0 named(x0, ewan, person) ∧
∃x1 mozzarella(x1) ∧
∃x2 car(x2) ∧ of(x2,x0) ∧ in(x1, x2) ∧
∃e event(e) ∧ forget(e) ∧ agent(e, x0) ∧
patient(e, x1)
10 September 2014 2
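The logical form above can be treated as plain data. The following is a minimal sketch, not the method used in the talk: it stores each conjunct as a tuple and searches by brute force for variable bindings that satisfy the formula in a small hand-made model. The model (`FACTS`, `DOMAIN`) and the function name are illustrative assumptions.

```python
from itertools import product

# Conjuncts of the logical form: (predicate, arguments).
# Names listed in VARIABLES are existentially quantified; all others are constants.
ATOMS = [
    ("named", ("x0", "ewan", "person")),
    ("mozzarella", ("x1",)),
    ("car", ("x2",)),
    ("of", ("x2", "x0")),
    ("in", ("x1", "x2")),
    ("event", ("e",)),
    ("forget", ("e",)),
    ("agent", ("e", "x0")),
    ("patient", ("e", "x1")),
]
VARIABLES = ["x0", "x1", "x2", "e"]

# Hypothetical toy model: ground facts over four individuals.
FACTS = {
    ("named", ("p1", "ewan", "person")),
    ("mozzarella", ("m1",)),
    ("car", ("c1",)),
    ("of", ("c1", "p1")),
    ("in", ("m1", "c1")),
    ("event", ("ev1",)),
    ("forget", ("ev1",)),
    ("agent", ("ev1", "p1")),
    ("patient", ("ev1", "m1")),
}
DOMAIN = ["p1", "m1", "c1", "ev1"]


def satisfying_assignments(atoms, variables, facts, domain):
    """Return every variable binding under which all atoms are facts."""
    results = []
    for values in product(domain, repeat=len(variables)):
        binding = dict(zip(variables, values))
        grounded = [
            (pred, tuple(binding.get(arg, arg) for arg in args))
            for pred, args in atoms
        ]
        if all(atom in facts for atom in grounded):
            results.append(binding)
    return results


solutions = satisfying_assignments(ATOMS, VARIABLES, FACTS, DOMAIN)
print(solutions)
```

The formula holds in the model exactly when at least one binding is found, which is what the existential quantifiers require.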
4. Semantic Role Labelling
Mapping to a shallow semantic representation of
predicates and associated semantic arguments
Ewan forgot the mozzarella in his car.
∃x0 named(x0, ewan, person) ∧
∃x1 mozzarella(x1) ∧
∃x2 car(x2) ∧ of(x2,x0) ∧ in(x1, x2) ∧
∃e event(e) ∧ forget(e) ∧ agent(e, x0) ∧
patient(e, x1)
5. Semantic Role Labelling
Mapping to a shallow semantic representation of
predicates and associated semantic arguments
Ewan forgot the mozzarella in his car.
∃x0 named(x0, ewan, per) ∧ male(x0) ∧
∃x1 mozzarella(x1) ∧
∃x2 car(x2) ∧ of(x2,x0) ∧
∃e event(e) ∧ forget(e) ∧ agent(e, x0) ∧
patient(e, x3) ∧ in(e, x2)
[Diagram: arcs over the sentence link "Ewan", "mozzarella" and "car" to the predicate "forget", and "his" to the predicate "car".]
6. Semantic Role Labelling
Mapping to a shallow semantic representation of
predicates and associated semantic arguments
Ewan forgot the mozzarella in his car.
∃x0 named(x0, ewan, per) ∧ male(x0) ∧
∃x1 mozzarella(x1) ∧
∃x2 car(x2) ∧ of(x2,x0) ∧
∃e event(e) ∧ forget(e) ∧ agent(e, x0) ∧
patient(e, x3) ∧ in(e, x2)
Ewan --actor--> forget
mozzarella --theme--> forget
car --location--> forget
his --owner--> car
7. Semantic Role Labelling
Motivation: identify who did what to whom,
where, why and how, etc.
Ewan forgot the mozzarella in his car.
∃x0 named(x0, ewan, per) ∧ male(x0) ∧
∃x1 mozzarella(x1) ∧
∃x2 car(x2) ∧ of(x2,x0) ∧
∃e event(e) ∧ forget(e) ∧ agent(e, x0) ∧
patient(e, x3) ∧ in(e, x2)
Ewan --actor--> forget
mozzarella --theme--> forget
car --location--> forget
his --owner--> car
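The predicate-argument structures on this slide can be held in a very small data structure. The sketch below is illustrative only: the `Frame` class and the `fillers` helper are hypothetical names, and the role labels (actor, theme, location, owner) are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One predicate together with its role-labelled arguments."""
    predicate: str
    roles: dict = field(default_factory=dict)


# Frames for "Ewan forgot the mozzarella in his car."
FRAMES = [
    Frame("forget", {"actor": "Ewan", "theme": "mozzarella", "location": "car"}),
    Frame("car", {"owner": "his"}),
]


def fillers(frames, role):
    """Return (predicate, argument) pairs for every frame that has the role."""
    return [(f.predicate, f.roles[role]) for f in frames if role in f.roles]


print(fillers(FRAMES, "actor"))     # who did it
print(fillers(FRAMES, "location"))  # where
```

Asking for a role across all frames is one simple way to answer the "who did what to whom, where" questions without touching the full logical form.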
8. S-CASE Project
CASE: Computer Assisted Software Engineering
Plethora of software solutions already available
Could be (re)used for rapid prototyping
The role of UEDIN in this project:
Analyse textual requirements of existing
solutions that describe their functionalities
Provide a search interface for finding solutions
9. Functional Requirements
Properties
Discussed by developers and customers
Basis for work plans, implementations, etc.
Examples
“The user must be able to login to his account.”
“The system should store all activities.”
…
10. Pre-processing
Before mapping text to meaning representations:
Tokenization
Part-of-speech tagging and lemmatization
Syntactic dependency parsing
“The user must be able to login to his account.”
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
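The first two pre-processing steps can be illustrated with a toy example. This is a sketch under strong assumptions, not the pipeline used in the project: tokenization is a single regular expression, and the lemma and part-of-speech lookup is a hand-written lexicon (`LEXICON`) that only covers the example sentence. A real system would use trained models, and dependency parsing is not shown here.

```python
import re

# Toy lexicon for the example sentence only: lowercased form -> (lemma, POS).
# Illustrative assumption; tags are copied from the slide.
LEXICON = {
    "the": ("the", "DT"),
    "user": ("user", "NN"),
    "must": ("must", "MD"),
    "be": ("be", "VB"),
    "able": ("able", "JJ"),
    "to": ("to", "TO"),
    "login": ("login", "VB"),
    "his": ("his", "PRP"),
    "account": ("account", "NN"),
    ".": (".", "."),
}


def tokenize(text):
    """Split text into word tokens and individual punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def tag(tokens):
    """Return (token, lemma, POS) triples; unknown words get the tag 'UNK'."""
    return [(tok, *LEXICON.get(tok.lower(), (tok.lower(), "UNK"))) for tok in tokens]


rows = tag(tokenize("The user must be able to login to his account."))
for token, lemma, pos in rows:
    print(f"{token}\t{lemma}\t{pos}")
```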
11. Semantic Analysis
Several steps of analysis are required
Find “predicates” in a sentence
Identify potential arguments
Classify arguments of each predicate
“The user must be able to login to his account.”
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
12. Semantic Analysis—Detailed View
Several steps of analysis are required
Find “predicates” in a sentence
Identify potential arguments
Classify arguments of each predicate
“The user must be able to login to his account.”
Features: assigned part-of-speech, number of children, parent word form
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
13. Semantic Analysis—Detailed View
Several steps of analysis are required
Find “predicates” in a sentence
Identify potential arguments
Classify arguments of each predicate
“The user must be able to login to his account.”
Features: assigned part-of-speech, labelled path to predicate, (other) children of predicate
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
14. Semantic Analysis—Detailed View
Several steps of analysis are required
Find “predicates” in a sentence
Identify potential arguments
Classify arguments of each predicate
“The user must be able to login to his account.”
Features: head word of argument, relative position, labelled dependency
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
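The features named on the detailed-view slides can be computed directly from a dependency tree. The sketch below is an illustration only. The dependency heads and labels in `SENTENCE` are an assumed analysis of the example sentence (the tree from the slides is not reproduced in this transcript), and the function names are hypothetical.

```python
# (form, POS, head, dependency label); head is a 1-based token index, 0 = root.
# The heads and labels are an assumed parse, for illustration only.
SENTENCE = [
    ("The", "DT", 2, "NMOD"),
    ("user", "NN", 3, "SBJ"),
    ("must", "MD", 0, "ROOT"),
    ("be", "VB", 3, "VC"),
    ("able", "JJ", 4, "PRD"),
    ("to", "TO", 5, "AMOD"),
    ("login", "VB", 6, "IM"),
    ("to", "TO", 7, "ADV"),
    ("his", "PRP", 10, "NMOD"),
    ("account", "NN", 8, "PMOD"),
    (".", ".", 3, "P"),
]


def children(idx):
    """Indices of all tokens whose head is idx."""
    return [i for i, tok in enumerate(SENTENCE, 1) if tok[2] == idx]


def ancestors(idx):
    """Token indices from idx up to the root token, inclusive."""
    chain = [idx]
    while SENTENCE[chain[-1] - 1][2] != 0:
        chain.append(SENTENCE[chain[-1] - 1][2])
    return chain


def labelled_path(arg, pred):
    """Dependency labels from arg up ('^') to the common ancestor, then down ('v') to pred."""
    up, down = ancestors(arg), ancestors(pred)
    common = next(i for i in up if i in down)
    steps = [SENTENCE[i - 1][3] + "^" for i in up[:up.index(common)]]
    steps += [SENTENCE[i - 1][3] + "v" for i in reversed(down[:down.index(common)])]
    return " ".join(steps)


def features(idx, pred):
    """Collect the slide's features for token idx relative to predicate pred."""
    form, pos, head, deprel = SENTENCE[idx - 1]
    return {
        "pos": pos,
        "num_children": len(children(idx)),
        "parent_form": SENTENCE[head - 1][0] if head else "ROOT",
        "deprel": deprel,
        "position": "before" if idx < pred else "after" if idx > pred else "same",
        "path_to_pred": labelled_path(idx, pred),
    }


print(features(2, 7))   # "user" relative to the predicate "login"
print(features(10, 7))  # "account" relative to the predicate "login"
```

Feature dictionaries of this kind would typically be passed to a classifier at each of the three steps; the classifier itself is outside the scope of this sketch.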
15. What about Linked Data?
Once we identified all predicates and arguments
We can map them into a structured format
Link with other information and share online
Store in a database for downstream applications
“The user must be able to login to his account.”
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
16. RDF Representation
Storing SRL predicates and arguments
Define one entity per relevant word token
(predicates and arguments can coincide)
Use RDF triples to describe relations
“The user must be able to login to his account.”
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
:x0 a :user;
    :actor_of :e0.
:e0 a :login;
    :has_actor :x0;
    :acts_on :x1.
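The step from labelled arguments to triples can be sketched in a few lines. This is a minimal illustration using only the standard library, not the project's implementation: the triples are the ones shown on the slide, the `to_turtle` helper is a hypothetical name, and the `@prefix` declaration that a complete Turtle document needs is left out, as it is on the slide.

```python
from collections import defaultdict

# Triples from the slide: (subject, predicate, object).
TRIPLES = [
    (":x0", "a", ":user"),
    (":x0", ":actor_of", ":e0"),
    (":e0", "a", ":login"),
    (":e0", ":has_actor", ":x0"),
    (":e0", ":acts_on", ":x1"),
]


def to_turtle(triples):
    """Group triples by subject and emit Turtle-style 'subject p o ; p o .' blocks."""
    by_subject = defaultdict(list)
    for subject, predicate, obj in triples:
        by_subject[subject].append(f"{predicate} {obj}")
    blocks = []
    for subject, pairs in by_subject.items():
        blocks.append(f"{subject} " + " ;\n    ".join(pairs) + " .")
    return "\n".join(blocks)


turtle = to_turtle(TRIPLES)
print(turtle)
```

In practice an RDF library would handle serialization and prefixes; the point here is only that each entity becomes a subject and each semantic role becomes a predicate.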
17. RDF Representation (cont.)
The bigger scheme: what are users, logins, etc.?
Ontology defines classes, relations, restrictions
user is-a actor is-a thingtype is-a concept
ACTOR_OF(x, y) ↔ HAS_ACTOR(y, x)
“The user must be able to login to his account.”
Token:  The  user  must  be  able  to  login  to  his  account  .
Lemma:  the  user  must  be  able  to  login  to  his  account  .
POS:    DT   NN    MD    VB  JJ    TO  VB     TO  PRP  NN       .
:x0 a :user;
    :actor_of :e0.
:e0 a :login;
    :has_actor :x0;
    :acts_on :x1.
18. Advantages for Applications
Ontology defines concept types and relations
Finite set of pre-defined symbols
user is-a actor is-a thingtype is-a concept
subclasses can be exploited for search
ACTOR_OF(x, y) ↔ HAS_ACTOR(y, x)
Axioms for detecting inconsistencies
and inferring missing relations
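Both uses of the ontology mentioned on this slide can be sketched with plain dictionaries. The code below is an illustrative assumption, not the project's OWL reasoning: `SUBCLASS_OF` encodes the is-a chain from the slide, `INVERSE_OF` encodes the ACTOR_OF / HAS_ACTOR axiom, and all function names are hypothetical.

```python
# is-a chain from the slide: user is-a actor is-a thingtype is-a concept
SUBCLASS_OF = {"user": "actor", "actor": "thingtype", "thingtype": "concept"}

# ACTOR_OF(x, y) <-> HAS_ACTOR(y, x)
INVERSE_OF = {"actor_of": "has_actor", "has_actor": "actor_of"}


def superclasses(cls):
    """All classes reachable from cls by following is-a links."""
    found = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        found.append(cls)
    return found


def infer_inverses(triples):
    """Add the inverse triple for every relation with a declared inverse."""
    inferred = set(triples)
    for subject, relation, obj in triples:
        if relation in INVERSE_OF:
            inferred.add((obj, INVERSE_OF[relation], subject))
    return inferred


def instances_of(cls, types):
    """Entities whose asserted type is cls or one of its subclasses."""
    return sorted(e for e, t in types.items() if t == cls or cls in superclasses(t))


types = {"x0": "user", "e0": "login"}
triples = {("x0", "actor_of", "e0")}

print(superclasses("user"))
print(infer_inverses(triples))
print(instances_of("actor", types))
```

A search for "actor" finds the entity typed as "user" because of the subclass chain, and the missing `has_actor` relation is recovered from the inverse axiom.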
19. Putting the Pieces Together
[Diagram: Requirements Document, Implemented Software Component, RDF Triples DB, OWL Ontology, Applications]
20. Conclusions
Semantic parsing is an important prerequisite for
computational natural language understanding
Results of shallow semantic analysis can be
represented in a structured format for
downstream applications
Linked data helps us to connect and share
information on existing software solutions and
makes efficient search possible
Hello and thank you all for coming. My name is Michael Roth, and for the next 20 minutes or so I will be talking about Semantic Parsing to Linked Data. The talk is mostly about parsing, but I will get back to how that relates to linked data towards the end.
So the first question that some of you might be asking yourselves now is: what is “semantic parsing”? Well, let’s start with a definition. “Semantic Parsing is … into …”. Since this is a very general definition, let’s have a look at a very specific example.
It can be quite challenging to deduce a full logical representation from a sentence. For example, when looking at the text, we do not see any of the existential quantifiers from the logical representation and it’s not always clear why they should be used over universal quantifiers. On the representation side, we can see a very mixed granularity of logical predicates: they can denote specific types of entities such as mozzarella but also very general relations such as “in” or “of”.
A more general variant developed in computational linguistics is a task called semantic role labelling. The idea of this task is still to perform some form of semantic parsing, but instead of mapping to logical representations, here we focus on identifying predicate-argument structures that can be observed as linguistic units.
This means that a predicate, rather than being an abstract concept, is simply a word with a specific sense that can be observed in text. Typically, a predicate identified in text has one or more arguments. For example, “car” and “forget” are predicates, and their semantic arguments are those words or spans of words that further specify them in text. We can associate each of them with a specific thematic relation. For example, …
To go one step further than this, approaches to semantic role labelling make use of a predefined set of roles that specify how an argument is related to the predicate.
And the motivation for this is… So in addition to representing the general structure of a sentence, we can now make some inferences about the actual meaning of a sentence.