Senso Comune as a Knowledge Base of Italian language
The Resource and its Development
Tommaso Caselli (1), Isabella Chiari (2), Aldo Gangemi (3), Elisabetta Jezek (4), Alessandro Oltramari (5), Guido Vetere (6), Laure Vieu (7), Fabio Massimo Zanzotto (8)
(1) VU Amsterdam
(2) Università di Roma ’Sapienza’
(3) CNR ISTC
(4) Università di Pavia
(5) Carnegie Mellon University
(6) IBM Italia
(7) CNRS IRIT
(8) Università di Roma ’Tor Vergata’
December 10, 2014
Introduction 
- Senso Comune (www.sensocomune.it) is an open, machine-readable knowledge base of the Italian language.
- Lexical content has been extracted from a monolingual Italian dictionary (De Mauro’s GRADIT), and is continuously enriched through a collaborative online platform.
- Linguistic knowledge is represented by a semasiological model where each sense can be qualified with respect to a small set of ontological categories.
- Senses can be further enriched in many ways and mapped to other dictionaries, such as the Italian version of MultiWordNet, thus qualifying Senso Comune as a linguistic Linked Open Data resource.
General principles 
- (Computational) lexicography should be able to build on the direct witness of native speakers, not only on textual sources.
- The way linguistic meanings relate to ontological categories is tangential.
- Linguistic knowledge belongs to the entire community of speakers; we are therefore committed to keeping the resource as open as possible.
Lexicon and ontology 
To map lexical senses to concepts, Senso Comune adopts a notion of ontological commitment: if the sense S commits to ($\mapsto$) the concept C, then there are entities of type C to which occurrences of S may refer.
Ontological Commitment
$$(S \mapsto C) \Leftrightarrow \exists s, c \mid S(s) \land C(c) \land \mathit{refers\_to}(s, c)$$
A sense may commit to several different ontological categories (e.g. ARTIFACT, INFORMATION).
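The definition of ontological commitment can be read as an existential check over a model. The following is a minimal sketch under stated assumptions: the `ToyModel` class, its field names, the sense label `libro_1` and all occurrence and entity identifiers are hypothetical and are not part of Senso Comune.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class ToyModel:
    """Hypothetical extensional model used only to illustrate the definition."""
    # sense label -> occurrences of that sense, i.e. the s with S(s)
    sense_occurrences: Dict[str, Set[str]] = field(default_factory=dict)
    # concept label -> entities of that type, i.e. the c with C(c)
    concept_entities: Dict[str, Set[str]] = field(default_factory=dict)
    # pairs (occurrence, entity) standing in the refers_to relation
    refers_to: Set[Tuple[str, str]] = field(default_factory=set)

    def commits_to(self, sense: str, concept: str) -> bool:
        """True if some occurrence of `sense` refers to some entity of `concept`."""
        occurrences = self.sense_occurrences.get(sense, set())
        entities = self.concept_entities.get(concept, set())
        return any((s, c) in self.refers_to for s in occurrences for c in entities)


# Made-up data: one sense whose occurrences refer to entities of two categories.
model = ToyModel(
    sense_occurrences={"libro_1": {"occ1", "occ2"}},
    concept_entities={"ARTIFACT": {"e1"}, "INFORMATION": {"e2"}, "EVENT": {"e3"}},
    refers_to={("occ1", "e1"), ("occ2", "e2")},
)

commits_artifact = model.commits_to("libro_1", "ARTIFACT")
commits_information = model.commits_to("libro_1", "INFORMATION")
commits_event = model.commits_to("libro_1", "EVENT")
```

In this toy model the same sense commits to both ARTIFACT and INFORMATION, but not to EVENT, because no occurrence of it refers to an entity of that type.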
Lexicon and ontology, a semiotic approach 
- Senses are semiotic objects whose relationship with real-world entities is mediated by cognitive structures, emotional polarity and social interactions.
- Lexical relations that hold among senses, such as synonymy, do not bear direct ontological import.
- Conversely, ontological axioms, such as equivalence, do not have immediate linguistic side effects.
- If the equivalence of linguistic senses to ontological concepts is desired (e.g. for technical portions of the dictionary), this condition has to be specifically formalized and managed.
Synonymy $\nLeftrightarrow$ Equivalence
$$S \mapsto C \land S' \mapsto C' \land S \approx S' \nRightarrow C \equiv C'$$
$$S \mapsto C \land S' \mapsto C' \land C \equiv C' \nRightarrow S \approx S'$$
Here $\approx$ stands for synonymy between senses and $\equiv$ for equivalence between concepts.
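One practical reading of the two non-implications is that synonymy and concept equivalence have to be stored as separate relations, since neither can be derived from the other. The sketch below illustrates this with abstract placeholder labels (`S1`..`S4`, `C1`..`C3`); the data and function names are hypothetical, and each sense commits to a single concept only to keep the example short.

```python
from itertools import combinations

# Hypothetical toy data, not taken from Senso Comune.
commits = {"S1": "C1", "S2": "C2", "S3": "C3", "S4": "C3"}  # sense -> concept
synonyms = {frozenset({"S1", "S2"})}                        # S1 is a synonym of S2
equivalent = set()                                          # no distinct concepts declared equivalent


def concepts_equivalent(c1, c2):
    """A concept is equivalent to itself or to any concept explicitly paired with it."""
    return c1 == c2 or frozenset({c1, c2}) in equivalent


def synonymy_counterexamples():
    """Synonymous sense pairs whose concepts are not equivalent."""
    result = []
    for pair in synonyms:
        a, b = sorted(pair)
        if not concepts_equivalent(commits[a], commits[b]):
            result.append((a, b))
    return sorted(result)


def equivalence_counterexamples():
    """Sense pairs with equivalent concepts that are not recorded as synonyms."""
    return [
        (a, b)
        for a, b in combinations(sorted(commits), 2)
        if concepts_equivalent(commits[a], commits[b])
        and frozenset({a, b}) not in synonyms
    ]


syn_cx = synonymy_counterexamples()
eq_cx = equivalence_counterexamples()
```

With this data, `S1` and `S2` are synonymous but commit to non-equivalent concepts, while `S3` and `S4` share a concept without being synonyms, so both implications fail.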
Sense classification 
- Senso Comune meanings are classified w.r.t. a small set of categories inspired by DOLCE.
- A tutoring methodology (TMEO) supports the classification process.
Annotation of lexicographic examples and definitions 
Ongoing work in Senso Comune focuses on manual annotation of the 
usage examples associated with the sense definitions of the most 
common verbs in the resource, with the goal of providing Senso Comune 
with corpus-derived verbal frames. The annotation task, which is 
performed through a Web-based tool, is organized in two main subtasks. 
1. The first subtask consists in identifying the constituents that hold a relation with the target verb in the example and annotating them with information about the type of phrase and the grammatical relation.
2. In the second subtask, users are asked to attach a semantic role, an ontological category and the sense definition associated with the argument filler of each frame participant in the instances.
Figure: Annotation of andare a cavallo (riding)
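The two subtasks suggest a record with a syntactic layer and a semantic layer per constituent. The following sketch shows one possible shape for such a record; the class names, field names and all values are assumptions made for illustration and do not reproduce the Web-based tool or the annotation shown in the figure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Constituent:
    """One constituent related to the target verb (field names are hypothetical)."""
    text: str
    # Subtask 1: syntactic information.
    phrase_type: str
    grammatical_relation: str
    # Subtask 2: semantic information, filled in later by the annotator.
    semantic_role: Optional[str] = None
    ontological_category: Optional[str] = None
    sense_definition: Optional[str] = None

    def is_complete(self) -> bool:
        """True once all three subtask-2 fields have been provided."""
        return None not in (
            self.semantic_role,
            self.ontological_category,
            self.sense_definition,
        )


@dataclass
class AnnotatedExample:
    target_verb: str
    example: str
    constituents: List[Constituent] = field(default_factory=list)

    def pending(self) -> List[Constituent]:
        """Constituents that still lack subtask-2 information."""
        return [c for c in self.constituents if not c.is_complete()]


# Illustrative values only; they are placeholders, not real annotations.
example = AnnotatedExample(
    target_verb="andare",
    example="andare a cavallo",
    constituents=[
        Constituent("a cavallo", phrase_type="PP", grammatical_relation="complement")
    ],
)
pending_before = len(example.pending())

filler = example.constituents[0]
filler.semantic_role = "<semantic role>"
filler.ontological_category = "<ontological category>"
filler.sense_definition = "<sense definition of the filler>"
pending_after = len(example.pending())
```

Keeping the subtask-2 fields optional lets a record exist after the first pass and be completed in the second.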
Word Sense Alignment 
To enrich Senso Comune (SC) and make it interoperable with other 
lexical-semantic resources, we conducted Word Sense Alignment (WSA) 
experiments with MultiWordNet (MWN), both manually and automatically 
Figure: Alignment of appartamento
Manual Alignment 
At the time of this writing, 584 SC lemmas (nouns) have been processed for manual alignment, for a total of 6,730 word senses, with 3.64 average word senses for each lemma.
- 2,131 senses could be aligned with at least one MWN synset (31.7%)
- 2,187 MWN synsets could be aligned to at least one SC sense
- 1,093 biunique alignments
Table: SC to MWN
SC senses    MWN synsets    %
1,622        1              76.1
367          2              17.2
108          3              5
25           4              1.1
11           5,7            0.6
Table: MWN to SC
MWN synsets  SC senses      %
1,681        1              76.8
400          2              18.2
85           3              3.8
17           4              0.9
4            5,6            0.3
⇒ Similar granularity, relatively little overlap
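Distributions of this kind can be derived from a flat list of (SC sense, MWN synset) alignment pairs. The sketch below assumes that each table row counts the items aligned to exactly k items on the other side, and that "biunique" means a one-to-one pair; both readings are assumptions, and the identifiers are made up.

```python
from collections import Counter, defaultdict


def alignment_stats(pairs):
    """Return (SC->MWN histogram, MWN->SC histogram, number of one-to-one pairs)."""
    sc_to_mwn = defaultdict(set)
    mwn_to_sc = defaultdict(set)
    for sense, synset in pairs:
        sc_to_mwn[sense].add(synset)
        mwn_to_sc[synset].add(sense)

    # How many senses are aligned to exactly k synsets, and vice versa.
    sc_hist = Counter(len(synsets) for synsets in sc_to_mwn.values())
    mwn_hist = Counter(len(senses) for senses in mwn_to_sc.values())

    # A pair is one-to-one when each side has no other alignment partner.
    biunique = sum(
        1
        for synsets in sc_to_mwn.values()
        if len(synsets) == 1 and len(mwn_to_sc[next(iter(synsets))]) == 1
    )
    return dict(sc_hist), dict(mwn_hist), biunique


# Hypothetical toy alignments.
toy_pairs = [
    ("sc:a1", "mwn:x"),
    ("sc:a2", "mwn:y"),
    ("sc:a2", "mwn:z"),
    ("sc:a3", "mwn:z"),
]
sc_hist, mwn_hist, biunique = alignment_stats(toy_pairs)
```

In the toy data, two senses align to one synset and one sense to two synsets; only the pair `sc:a1`/`mwn:x` is one-to-one.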
Automatic Alignment 
- Lexical Match (overlapping tokens between two sense descriptions), with
  1. Lemmatized version of the original glosses of Senso Comune
  2. Bag-of-words based on synset words, direct hypernyms, nearest synsets, the corresponding Italian synset words from the “Princeton Annotated Gloss Corpus” and Wikipedia glosses from BabelNet
- Sense Similarity (cosine score between the vector representations of sense descriptions)
  1. Vector representations have been obtained by means of Personalized PageRank (PPR), with WN30 and the “Princeton Annotated Gloss Corpus” as knowledge base
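The two scoring functions named above can be sketched in a few lines. This is an illustration only: the slides do not say whether the overlap is a raw count or a normalized score, so a raw count of shared tokens is assumed, and the sparse dictionaries stand in for PPR vectors, which are not computed here. All tokens and weights are made up.

```python
import math


def lexical_overlap(tokens_a, tokens_b):
    """Number of distinct tokens shared by two (lemmatized) sense descriptions."""
    return len(set(tokens_a) & set(tokens_b))


def cosine(u, v):
    """Cosine score between two sparse vectors given as {dimension: weight} dicts."""
    dot = sum(weight * v.get(dim, 0.0) for dim, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty description is treated as dissimilar to everything
    return dot / (norm_u * norm_v)


# Hypothetical lemmatized descriptions and toy vectors.
sc_gloss = ["unità", "abitativo", "edificio"]
mwn_bag = ["edificio", "unità", "stanza", "casa"]
overlap = lexical_overlap(sc_gloss, mwn_bag)

same_direction = cosine({"a": 1.0, "b": 2.0}, {"a": 2.0, "b": 4.0})
orthogonal = cosine({"a": 1.0}, {"b": 1.0})
```

A candidate pair would then be accepted when its score passes some threshold; the thresholds used in the experiments are not given in the slides.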
Evaluation
- Two Gold Standards, one for verbs (350 sense pairs) and one for nouns (166 sense pairs), with Precision (P), Recall (R) and F1 scores
- Best F1 by merging the outputs of the two methods: 0.47 for verbs (P=0.61, R=0.38) and 0.64 for nouns (P=0.67, R=0.61).
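The reported F1 values follow from the precision and recall figures through the harmonic mean. The short check below uses only the numbers quoted above; the function name is chosen for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1_verbs = round(f1_score(0.61, 0.38), 2)  # 0.47
f1_nouns = round(f1_score(0.67, 0.61), 2)  # 0.64
```

The gap between verbs and nouns comes mostly from recall (0.38 against 0.61), since the two precision values are close.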
Conclusion 
- The gap between a “native” Italian dictionary and an English-derivative WordNet may be relevant.
- This should be carefully taken into account when devising techniques and methodologies to construct multilingual resources.
- Our results suggest that more attention should be paid to the semantic peculiarity of each language, i.e. the specific way each language constructs a conceptual view of the world.
