Combining Human and Computational Intelligence
Ilya Zaihrayeu, Pierre Andrews, Juan Pane
Semantic annotation lifecycle
[Diagram labels: User, Context, free text annotations, Semantic search, …, Reasoning]
• Problem 1: help the user find and understand the meaning of semantic annotations
• Problem 2: extract (semantic) annotations from contexts of user resource at publishing
• Problem 3: QoS of semantics-enabled services
• Problem 4: semi-automatic semantification of existing annotations
• What if the users could use semantic annotations instead to leverage semantic technology services?
• Semantic annotation = structure and/or meaning
   4/14/2011                                                                                   2
Index: meaning summarization


• Problem 1: help the user find and understand the meaning of semantic annotations
Meaning summarization: why?
• The right meaning of the words used for an annotation is in the minds of the people using them
• E.g.: Java:
      – an island in Indonesia south of Borneo; one of the world's most densely populated regions
      – a beverage consisting of an infusion of ground coffee beans; "he ordered a cup of coffee"
      – a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine
      (highlighted on the slide: island, beverage, programming language)
• Descriptions are too long for the user to grasp the meaning immediately: too high a barrier to start generating semantic annotations
Meaning summarization: an example
• One word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms
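A minimal sketch of how a short summary per sense could be picked from hypernyms and synonyms. The toy `SENSES` knowledge base, the function names, and the preference order (hypernyms first, then synonyms) are illustrative assumptions, not the algorithm evaluated in this deck.

```python
# Toy knowledge base for the term "java" (hypothetical data, for illustration).
SENSES = {
    "java#1": {"hypernyms": ["island"], "synonyms": []},
    "java#2": {"hypernyms": ["beverage"], "synonyms": ["coffee"]},
    "java#3": {"hypernyms": ["programming language"], "synonyms": []},
}


def one_word_summary(sense_id, used=()):
    """Pick the first hypernym or synonym not already used by another sense."""
    sense = SENSES[sense_id]
    candidates = sense["hypernyms"] + sense["synonyms"]
    for candidate in candidates:
        if candidate not in used:
            return candidate
    return None  # no distinguishing label found


def summarize_all(sense_ids):
    """Give each sense a label that differs from the labels of the others."""
    used = set()
    summaries = {}
    for sense_id in sense_ids:
        label = one_word_summary(sense_id, used)
        summaries[sense_id] = label
        if label is not None:
            used.add(label)
    return summaries


summaries = summarize_all(["java#1", "java#2", "java#3"])
print(summaries)
```

Skipping labels that are already taken is one simple way to keep the summaries of different senses distinguishable from each other.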
Meaning summarization: evaluation results
• Best precision: 63%
• If we talk about java, does the word coffee mean the same as island?
• Discriminating power: 76.4%
Index: gold standard dataset
• In order to evaluate the performance of the algorithms, a gold standard dataset is needed
• Problem 3: QoS of semantics-enabled services?
• Problem 4: semi-automatic semantification of existing annotations
Proposed Approach
Create a gold standard of folksonomy with senses

Pipeline: Tag → Preprocessing → Tokens → Disambiguation → Senses

Dataset:
   # of annotations    4 296
   Unique tags           857
   Unique URLs           644
   Unique users        1 194

Annotator Agreement
   80% Accuracy    81 %    59% Accuracy

Example:
   Tag       javaisland
   Tokens    Java island
             Java is land
             …
   Senses    Java – an island in Indonesia to the south of Borneo
             Island – a land mass that is surrounded by water
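The preprocessing step in the example (javaisland → "Java island" or "Java is land") can be sketched as a vocabulary-driven split. This is a minimal illustration: the toy `VOCAB`, the function names, and the "fewest tokens first" ranking are assumptions, not the preprocessing method used to build the gold standard.

```python
# Toy vocabulary (hypothetical); a real system would use a full lexicon.
VOCAB = {"java", "island", "is", "land"}


def split_tag(tag, vocab=VOCAB):
    """Return every way to split `tag` into a sequence of vocabulary words."""
    tag = tag.lower()
    if not tag:
        return [[]]  # one way to split the empty string: no tokens
    results = []
    for end in range(1, len(tag) + 1):
        head = tag[:end]
        if head in vocab:
            for rest in split_tag(tag[end:], vocab):
                results.append([head] + rest)
    return results


def rank_splits(splits):
    """Assumed heuristic: prefer splits with fewer tokens."""
    return sorted(splits, key=len)


candidates = rank_splits(split_tag("javaisland"))
print(candidates)
```

Each candidate token sequence would then go to disambiguation, where every token is linked to a sense.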
A Platform for Gold Standards of Semantic Annotation Systems
 • Manual validation
 • RDF export
 • Evaluation of
       – Preprocessing
       – WSD
       – BoW Search
       – Convergence
 • Open source: http://sourceforge.net/projects/tags2con/
       – 7 modules
       – 25K lines of code
       – 26% of comments
Delicious RDF Dataset @ LOD cloud




# triples: 85 908
Outlinks to LOD cloud (WN synsets): 651
Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/
Index: QoS for semantic search




• Problem 3: QoS of semantics-enabled services?
Semantic search: why?
• With free text search, the following problems may reduce precision and recall:
      – synonymy problem: searching for “images” should return resources annotated with “picture”
      – polysemy problem: searching for “java” (island) should not return resources annotated with “java” (coffee beverage)
      – specificity gap problem: searching for “animals” should also return resources annotated with “dogs”
• Semantic, meaning-based search can address the problems listed above
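The three problems above can be illustrated with a small sketch of sense-based matching. The sense identifiers, the `IS_A` map, and the function names are hypothetical, and the sketch assumes that queries and annotations have already been disambiguated to senses.

```python
# Hypothetical is-a links between senses: child -> parent.
IS_A = {"dog#1": "animal#1"}

# Resources with annotations already linked to senses (assumed input).
RESOURCES = [
    ("r1", "picture#1"),    # annotated with "picture"
    ("r2", "java#coffee"),  # annotated with "java" (coffee beverage)
    ("r3", "dog#1"),        # annotated with "dogs"
    ("r4", "java#island"),  # annotated with "java" (island)
]


def ancestors(sense):
    """Return all senses reachable by following is-a links upward."""
    found = []
    while sense in IS_A:
        sense = IS_A[sense]
        found.append(sense)
    return found


def matches(query_sense, annotation_sense):
    """Match the same sense or a more specific one."""
    return (annotation_sense == query_sense
            or query_sense in ancestors(annotation_sense))


def semantic_search(query_sense, resources=RESOURCES):
    return [name for name, sense in resources if matches(query_sense, sense)]


# Synonymy: "images" and "picture" share the sense picture#1.
print(semantic_search("picture#1"))
# Polysemy: only the island sense of "java" is returned.
print(semantic_search("java#island"))
# Specificity gap: "animals" also returns resources annotated with "dogs".
print(semantic_search("animal#1"))
```

Matching on sense identifiers rather than strings is what separates the two meanings of "java" and lets synonyms share results.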


Semantics vs Folksonomy
• javaisland: used to build “raw” queries
• java island: used to build BoW queries
• Java(island) island(land): used to build semantic queries, correct and complete
• Semantic search: complete and correct results (the baseline)

Specificity Gap (SG)
[Diagram: the user submits a query linked to “vehicle”; the result resource is annotated with “car” (SG=1) or “taxi” (SG=2)]
• Recall goes down as the specificity gap increases
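A minimal sketch of the specificity gap, assuming it counts the is-a links between the query concept and the annotation concept, as the vehicle, car, taxi diagram suggests. The toy hierarchy, the `max_gap` parameter, and the function names are illustrative assumptions.

```python
# Toy is-a hierarchy from the diagram: child -> parent.
IS_A = {"taxi": "car", "car": "vehicle"}


def specificity_gap(query_concept, annotation_concept):
    """Count is-a links from the annotation up to the query concept.

    Returns None when the annotation is not at or below the query concept.
    """
    gap = 0
    node = annotation_concept
    while node != query_concept:
        if node not in IS_A:
            return None
        node = IS_A[node]
        gap += 1
    return gap


def search(query_concept, annotations, max_gap):
    """Return resources whose annotation lies within `max_gap` links."""
    hits = []
    for resource, concept in annotations.items():
        gap = specificity_gap(query_concept, concept)
        if gap is not None and gap <= max_gap:
            hits.append(resource)
    return sorted(hits)


annotations = {"r_vehicle": "vehicle", "r_car": "car", "r_taxi": "taxi"}
for max_gap in (0, 1, 2):
    print(max_gap, search("vehicle", annotations, max_gap))
```

With `max_gap=0` only the exact concept is found, which mirrors how a search that ignores the hierarchy misses resources annotated with more specific concepts.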
Index: semantic convergence
• Problem 4: semi-automatic semantification of existing annotations
Semantic convergence: Why?
[Two pie charts]

                        Random: programming      “General” domains: cooking,
                        and web domain           travel, education
   With a WN sense            49%                      71%
   Missing sense              36%                      15%
   Cannot decide               6%                       5%
   Abbreviation                5%                       2%
   I don't know                3%                       4%
   Other                       1%                       3%

Examples listed next to the 36% missing sense slice: Ajax, Mac, Apple, CSS, …
Semantic convergence: proposed solution
• Find new senses of terms
      – Find different senses of the same term (word sense)
      – Find synonyms of a term (synonym sets: synsets)
• Place the new synset in the vocabulary is-a hierarchy
• What we improve
      – Better use of Machine Learning techniques
      – The polysemy issue is not considered in the state of the art
      – Missing or “subjective” evaluations in the state of the art
• Evaluation using the Delicious dataset
Convergence Evaluation: Finding Senses
[Diagrams: Tag Collocation and User Collocation graphs over nodes t1–t5, B1–B4 and U1–U2]

                       Precision    Recall
   Tag Collocation        56%         73%
   Random Baseline        42%         29%
   User Collocation       57%         68%
Semantic annotation lifecycle
[Lifecycle diagram repeated with Problems 1 to 4, overlaid with: combining human and computational intelligence]
Conclusions
• We developed and evaluated a meaning summarization algorithm
• We developed a “semantic folksonomy” evaluation platform
• We studied the effect of semantics on social tagging systems:
      – how much can semantics help?
      – how much does the user need to be involved?
      – how can human and computer intelligence be combined in the generation and consumption of semantic annotations?
• We developed and evaluated a knowledge base enrichment algorithm
• We built and used a gold standard dataset for evaluating:
      – Word Sense Disambiguation
      – Tag Preprocessing
      – Semantic Search
      – Semantic Convergence
Integration with the use cases
Publications
  • Semantic Disambiguation in Folksonomy: a Case Study
    Pierre Andrews, Juan Pane, and Ilya Zaihrayeu;
    Advanced Language Technologies for Digital Libraries, Springer’s LNCS
  • Semantic Annotation of Images on Flickr
    Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu;
    ESWC 2011
  • A Classification of Semantic Annotation Systems
    Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu;
    Semantic Web Journal – second review phase
  • Sense Induction in Folksonomies
    Pierre Andrews, Juan Pane, and Ilya Zaihrayeu;
    IJCAI-LHD 2011 – under review
  • Evaluating the Quality of Service in Semantic Annotation Systems
    Ilya Zaihrayeu, Pierre Andrews, and Juan Pane;
    in preparation
WP 2 TIMELINE AND DELIVERABLES
Timeline: months 0 to 36, marked every 6 months (0, 6, 12, 18, 24, 30, 36)

Tasks and partner labels:
   Task 2.1  Designing models (UIBK)
   Task 2.2  Designing methods (UNITN)
   Task 2.3  Research on Information Retrieval (IR) methods for semantic content (ONTO)
   Task 2.4  Models and methods for automatic visual annotation (UTC)

Deliverables:
   D2.1.1: State of the Art and requirements from the use case partners
   D2.1.2: Specification of the model
   D2.2.1: Report on bootstrapping semantic annotations and on reaching consensus in the use of semantics
   D2.2.2+D2.2.3: Report on linking semantic annotations to external sources and on keeping them up-to-date when the underlying semantic model changes
   D2.3.1: Requirements for semantics-aware IR methods
   D2.3.2: Specification for semantics-aware IR methods
   D2.4: Report on the refinement of the proposed models, methods and semantic search
   D2.5: Report on the state of the art, proposed suitable models and methods for automatic visual annotation

Contenu connexe

En vedette

Ringor e catalog 2010-2011 catalog
Ringor e catalog 2010-2011 catalogRingor e catalog 2010-2011 catalog
Ringor e catalog 2010-2011 catalogJay Halbrook
 
Geewa Startup Camp Bratislava
Geewa Startup Camp BratislavaGeewa Startup Camp Bratislava
Geewa Startup Camp BratislavaMiloš Endrle
 
WP8 Dissemination and Exploitation
WP8 Dissemination and ExploitationWP8 Dissemination and Exploitation
WP8 Dissemination and ExploitationINSEMTIVES project
 
WP8 Okenterprise Use Case - Applying Insemtives to Corporate Portals
WP8 Okenterprise Use Case - Applying Insemtives to Corporate PortalsWP8 Okenterprise Use Case - Applying Insemtives to Corporate Portals
WP8 Okenterprise Use Case - Applying Insemtives to Corporate PortalsINSEMTIVES project
 
Social games GDS2011
Social games GDS2011Social games GDS2011
Social games GDS2011Miloš Endrle
 
INSEMTIVES Tutorial ISWC2011 - Session3
INSEMTIVES Tutorial ISWC2011 - Session3INSEMTIVES Tutorial ISWC2011 - Session3
INSEMTIVES Tutorial ISWC2011 - Session3INSEMTIVES project
 

En vedette (8)

Ringor e catalog 2010-2011 catalog
Ringor e catalog 2010-2011 catalogRingor e catalog 2010-2011 catalog
Ringor e catalog 2010-2011 catalog
 
Geewa Startup Camp Bratislava
Geewa Startup Camp BratislavaGeewa Startup Camp Bratislava
Geewa Startup Camp Bratislava
 
WP8 Dissemination and Exploitation
WP8 Dissemination and ExploitationWP8 Dissemination and Exploitation
WP8 Dissemination and Exploitation
 
WP8 Okenterprise Use Case - Applying Insemtives to Corporate Portals
WP8 Okenterprise Use Case - Applying Insemtives to Corporate PortalsWP8 Okenterprise Use Case - Applying Insemtives to Corporate Portals
WP8 Okenterprise Use Case - Applying Insemtives to Corporate Portals
 
Social games GDS2011
Social games GDS2011Social games GDS2011
Social games GDS2011
 
INSEMTIVES Tutorial ISWC2011 - Session3
INSEMTIVES Tutorial ISWC2011 - Session3INSEMTIVES Tutorial ISWC2011 - Session3
INSEMTIVES Tutorial ISWC2011 - Session3
 
Semantic Games
Semantic GamesSemantic Games
Semantic Games
 
WP2 2nd Review
WP2 2nd ReviewWP2 2nd Review
WP2 2nd Review
 

Similaire à UAB 2011- Combining human and computational intelligence

Eswc2012 ss ontologies
Eswc2012 ss ontologiesEswc2012 ss ontologies
Eswc2012 ss ontologiesElena Simperl
 
20120411 travelalliancemcguinnessfinal
20120411 travelalliancemcguinnessfinal20120411 travelalliancemcguinnessfinal
20120411 travelalliancemcguinnessfinalDeborah McGuinness
 
Integrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched dataIntegrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched dataDhaval Thakker
 
Taming digital traces for informal learning dhaval
Taming digital traces for informal learning  dhavalTaming digital traces for informal learning  dhaval
Taming digital traces for informal learning dhavalDhavalkumar Thakker
 
Content is King - ECM in SharePoint 2010 - SharePoint Saturday Denver
Content is King - ECM in SharePoint 2010 - SharePoint Saturday DenverContent is King - ECM in SharePoint 2010 - SharePoint Saturday Denver
Content is King - ECM in SharePoint 2010 - SharePoint Saturday DenverChris McNulty
 
Web3.0 seminar wipro-session2-logicalontological
Web3.0 seminar wipro-session2-logicalontologicalWeb3.0 seminar wipro-session2-logicalontological
Web3.0 seminar wipro-session2-logicalontologicalNagaraju Pappu
 
OpenSearchLab and the Lucene Ecosystem
OpenSearchLab and the Lucene EcosystemOpenSearchLab and the Lucene Ecosystem
OpenSearchLab and the Lucene EcosystemGrant Ingersoll
 
Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical ISSGC Summer School
 
Session 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky NoteSession 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky NoteISSGC Summer School
 
Crowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopCrowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopDataWorks Summit
 
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceHadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceTed Dunning
 
PRISSMA, Towards Mobile Adaptive Presentation of the Web of Data
PRISSMA,Towards Mobile Adaptive Presentation of the Web of DataPRISSMA,Towards Mobile Adaptive Presentation of the Web of Data
PRISSMA, Towards Mobile Adaptive Presentation of the Web of DataLuca Costabello
 
Leveraging Solr and Mahout
Leveraging Solr and MahoutLeveraging Solr and Mahout
Leveraging Solr and MahoutGrant Ingersoll
 
Word Tree Corpus Interface
Word Tree Corpus InterfaceWord Tree Corpus Interface
Word Tree Corpus InterfaceBen Showers
 
Question answer template
Question answer templateQuestion answer template
Question answer templateThanuw Chaks
 
Mesh Labs Introduction June 2012
Mesh Labs Introduction June 2012Mesh Labs Introduction June 2012
Mesh Labs Introduction June 2012Umesh Ramalingachar
 
Developing a digital literacy framework in your school
Developing a digital literacy framework in your schoolDeveloping a digital literacy framework in your school
Developing a digital literacy framework in your schoolEduwebinar
 
Distributed_Database_System
Distributed_Database_SystemDistributed_Database_System
Distributed_Database_SystemPhilip Zhong
 

Similaire à UAB 2011- Combining human and computational intelligence (20)

Eswc2012 ss ontologies
Eswc2012 ss ontologiesEswc2012 ss ontologies
Eswc2012 ss ontologies
 
20120411 travelalliancemcguinnessfinal
20120411 travelalliancemcguinnessfinal20120411 travelalliancemcguinnessfinal
20120411 travelalliancemcguinnessfinal
 
Integrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched dataIntegrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched data
 
Taming digital traces for informal learning dhaval
Taming digital traces for informal learning  dhavalTaming digital traces for informal learning  dhaval
Taming digital traces for informal learning dhaval
 
Content is King - ECM in SharePoint 2010 - SharePoint Saturday Denver
Content is King - ECM in SharePoint 2010 - SharePoint Saturday DenverContent is King - ECM in SharePoint 2010 - SharePoint Saturday Denver
Content is King - ECM in SharePoint 2010 - SharePoint Saturday Denver
 
Web3.0 seminar wipro-session2-logicalontological
Web3.0 seminar wipro-session2-logicalontologicalWeb3.0 seminar wipro-session2-logicalontological
Web3.0 seminar wipro-session2-logicalontological
 
OpenSearchLab and the Lucene Ecosystem
OpenSearchLab and the Lucene EcosystemOpenSearchLab and the Lucene Ecosystem
OpenSearchLab and the Lucene Ecosystem
 
Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical Session 49 - Semantic metadata management practical
Session 49 - Semantic metadata management practical
 
Session 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky NoteSession 49 Practical Semantic Sticky Note
Session 49 Practical Semantic Sticky Note
 
Crowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over HadoopCrowd-Sourced Intelligence Built into Search over Hadoop
Crowd-Sourced Intelligence Built into Search over Hadoop
 
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected IntelligenceHadoop summit EU - Crowd Sourcing Reflected Intelligence
Hadoop summit EU - Crowd Sourcing Reflected Intelligence
 
PRISSMA, Towards Mobile Adaptive Presentation of the Web of Data
PRISSMA,Towards Mobile Adaptive Presentation of the Web of DataPRISSMA,Towards Mobile Adaptive Presentation of the Web of Data
PRISSMA, Towards Mobile Adaptive Presentation of the Web of Data
 
Leveraging Solr and Mahout
Leveraging Solr and MahoutLeveraging Solr and Mahout
Leveraging Solr and Mahout
 
Word Tree Corpus Interface
Word Tree Corpus InterfaceWord Tree Corpus Interface
Word Tree Corpus Interface
 
Hadoop in Education
Hadoop in EducationHadoop in Education
Hadoop in Education
 
Search Computing Overview
Search Computing OverviewSearch Computing Overview
Search Computing Overview
 
Question answer template
Question answer templateQuestion answer template
Question answer template
 
Mesh Labs Introduction June 2012
Mesh Labs Introduction June 2012Mesh Labs Introduction June 2012
Mesh Labs Introduction June 2012
 
Developing a digital literacy framework in your school
Developing a digital literacy framework in your schoolDeveloping a digital literacy framework in your school
Developing a digital literacy framework in your school
 
Distributed_Database_System
Distributed_Database_SystemDistributed_Database_System
Distributed_Database_System
 

Plus de INSEMTIVES project

SemTech 2012 - Making your semantic app addictive: Incentivizing Users
SemTech 2012 - Making your semantic app addictive: Incentivizing UsersSemTech 2012 - Making your semantic app addictive: Incentivizing Users
SemTech 2012 - Making your semantic app addictive: Incentivizing UsersINSEMTIVES project
 
SocInfo2011 - Designing For Motivation
SocInfo2011 - Designing For MotivationSocInfo2011 - Designing For Motivation
SocInfo2011 - Designing For MotivationINSEMTIVES project
 
AAAI2012 - Crowd Sourcing Web Service Annotations
AAAI2012 - Crowd Sourcing Web Service AnnotationsAAAI2012 - Crowd Sourcing Web Service Annotations
AAAI2012 - Crowd Sourcing Web Service AnnotationsINSEMTIVES project
 
SemTech2011 - Employee-of-the-Month' Badge Unlocked
SemTech2011 - Employee-of-the-Month' Badge UnlockedSemTech2011 - Employee-of-the-Month' Badge Unlocked
SemTech2011 - Employee-of-the-Month' Badge UnlockedINSEMTIVES project
 
INSEMTIVES Tutorial ISWC2011 - Session5
INSEMTIVES Tutorial ISWC2011 - Session5INSEMTIVES Tutorial ISWC2011 - Session5
INSEMTIVES Tutorial ISWC2011 - Session5INSEMTIVES project
 
INSEMTIVES Tutorial ISWC2011 - Session4
INSEMTIVES Tutorial ISWC2011 - Session4INSEMTIVES Tutorial ISWC2011 - Session4
INSEMTIVES Tutorial ISWC2011 - Session4INSEMTIVES project
 
INSEMTIVES Tutorial ISWC2011 - Session1
INSEMTIVES Tutorial ISWC2011 - Session1INSEMTIVES Tutorial ISWC2011 - Session1
INSEMTIVES Tutorial ISWC2011 - Session1INSEMTIVES project
 
INSEMTIVES Tutorial ISWC2011 - Session2
INSEMTIVES Tutorial ISWC2011 - Session2INSEMTIVES Tutorial ISWC2011 - Session2
INSEMTIVES Tutorial ISWC2011 - Session2INSEMTIVES project
 
UAB 2011 - Seekda Webservices Portal
UAB 2011 - Seekda Webservices PortalUAB 2011 - Seekda Webservices Portal
UAB 2011 - Seekda Webservices PortalINSEMTIVES project
 
INSEMTIVES year 2 - Dissemination and Community Building
INSEMTIVES year 2  - Dissemination and Community BuildingINSEMTIVES year 2  - Dissemination and Community Building
INSEMTIVES year 2 - Dissemination and Community BuildingINSEMTIVES project
 
INSEMTIVES talk at Semtech2010
INSEMTIVES talk at Semtech2010INSEMTIVES talk at Semtech2010
INSEMTIVES talk at Semtech2010INSEMTIVES project
 

Plus de INSEMTIVES project (17)

SemTech 2012 - Making your semantic app addictive: Incentivizing Users
SemTech 2012 - Making your semantic app addictive: Incentivizing UsersSemTech 2012 - Making your semantic app addictive: Incentivizing Users
SemTech 2012 - Making your semantic app addictive: Incentivizing Users
 
SocInfo2011 - Designing For Motivation
SocInfo2011 - Designing For MotivationSocInfo2011 - Designing For Motivation
SocInfo2011 - Designing For Motivation
 
AAAI2012 - Crowd Sourcing Web Service Annotations
AAAI2012 - Crowd Sourcing Web Service AnnotationsAAAI2012 - Crowd Sourcing Web Service Annotations
AAAI2012 - Crowd Sourcing Web Service Annotations
 
SemTech2011 - Employee-of-the-Month' Badge Unlocked
SemTech2011 - Employee-of-the-Month' Badge UnlockedSemTech2011 - Employee-of-the-Month' Badge Unlocked
SemTech2011 - Employee-of-the-Month' Badge Unlocked
 
INSEMTIVES Tutorial ISWC2011 - Session5
INSEMTIVES Tutorial ISWC2011 - Session5INSEMTIVES Tutorial ISWC2011 - Session5
INSEMTIVES Tutorial ISWC2011 - Session5
 
INSEMTIVES Tutorial ISWC2011 - Session4
INSEMTIVES Tutorial ISWC2011 - Session4INSEMTIVES Tutorial ISWC2011 - Session4
INSEMTIVES Tutorial ISWC2011 - Session4
 
INSEMTIVES Tutorial ISWC2011 - Session1
INSEMTIVES Tutorial ISWC2011 - Session1INSEMTIVES Tutorial ISWC2011 - Session1
INSEMTIVES Tutorial ISWC2011 - Session1
 
INSEMTIVES Tutorial ISWC2011 - Session2
INSEMTIVES Tutorial ISWC2011 - Session2INSEMTIVES Tutorial ISWC2011 - Session2
INSEMTIVES Tutorial ISWC2011 - Session2
 
UAB 2011 - Seekda Webservices Portal
UAB 2011 - Seekda Webservices PortalUAB 2011 - Seekda Webservices Portal
UAB 2011 - Seekda Webservices Portal
 
UAB 2011 - L!nks Showcase
UAB 2011 - L!nks ShowcaseUAB 2011 - L!nks Showcase
UAB 2011 - L!nks Showcase
 
UAB 2011 - Games
UAB 2011 - GamesUAB 2011 - Games
UAB 2011 - Games
 
L!NKS Showcase
L!NKS ShowcaseL!NKS Showcase
L!NKS Showcase
 
Technology - WP3 and WP4
Technology - WP3 and WP4Technology - WP3 and WP4
Technology - WP3 and WP4
 
INSEMTIVES year 2 - Dissemination and Community Building
INSEMTIVES year 2  - Dissemination and Community BuildingINSEMTIVES year 2  - Dissemination and Community Building
INSEMTIVES year 2 - Dissemination and Community Building
 
WP2 1st Review
WP2 1st ReviewWP2 1st Review
WP2 1st Review
 
WP1 1st Review
WP1 1st ReviewWP1 1st Review
WP1 1st Review
 
INSEMTIVES talk at Semtech2010
INSEMTIVES talk at Semtech2010INSEMTIVES talk at Semtech2010
INSEMTIVES talk at Semtech2010
 

UAB 2011 - Combining human and computational intelligence

  • 1. Combining Human and Computational Intelligence Ilya Zaihrayeu, Pierre Andrews, Juan Pane
  • 2. Semantic annotation lifecycle. Problem 1: help the user find and understand the meaning of semantic annotations. Problem 2: extract (semantic) annotations from contexts of user resource at publishing. Problem 3: QoS of semantics-enabled services. Problem 4: semi-automatic semantification of existing annotations. What if the users could use semantic annotations instead to leverage semantic technology services? Semantic annotation = structure and/or meaning. Diagram labels: User, Context, free text annotations, Semantic search, …, Reasoning. 4/14/2011
  • 3. Index: meaning summarization. Problem 1: help the user find and understand the meaning of semantic annotations. Diagram labels: User, Semantic search, …, Reasoning.
  • 4. Meaning summarization: why? • The right meaning of the words being used for the annotation is in the mind of the people using them • E.g.: Java: – (island) an island in Indonesia south of Borneo; one of the world's most densely populated regions – (beverage) a beverage consisting of an infusion of ground coffee beans; “he ordered a cup of coffee” – (programming language) a simple platform-independent object-oriented programming language used for writing applets that are downloaded from the World Wide Web by a client and run on the client's machine • Descriptions are too long for the user to grasp the meaning immediately: too high a barrier to start generating semantic annotations
  • 5. Meaning summarization: an example. One-word summaries are generated from the relations in the knowledge base, sense definitions, synonyms and hypernym terms.
  • 6. Meaning summarization: evaluation results. Best precision: 63%. If we talk about java, does the word coffee mean the same as island? Discriminating power: 76.4%.
  • 7. Index: gold standard dataset. In order to evaluate the performance of the algorithms, a gold standard dataset is needed. Problem 3: QoS of semantics-enabled services? Problem 4: semi-automatic semantification of existing annotations. Diagram labels: User, Semantic search, …, Reasoning.
  • 8. Proposed Approach. Create a gold standard of folksonomy with sense. Pipeline: Tag, Tokens, Senses, with the stages Preprocessing and Disambiguation. Dataset: # of annotations: 4 296; unique tags: 857; unique URLs: 644; unique users: 1 194. Annotator agreement: 80%. Accuracy: 81% and 59%. Example: the tag “javaisland” is split into “java island” or “java is land”, then mapped to senses: Java – an island in Indonesia to the south of Borneo; Island – a land mass that is … surrounded by water.
  • 9. A Platform for Gold Standards of Semantic Annotation Systems • Manual validation • RDF export • Evaluation of – Preprocessing – WSD – BoW Search – Convergence • Open source: 7 modules, 25K lines of code, 26% of comments; http://sourceforge.net/projects/tags2con/
  • 10. Delicious RDF Dataset @ LOD cloud. # triples: 85 908. Outlinks to LOD cloud (WN synsets): 651. Dereferenceable at: http://disi.unitn.it/~knowdive/dataset/delicious/
  • 11. Index: QoS for semantic search. Problem 3: QoS of semantics-enabled services? Diagram labels: User, Semantic search, …, Reasoning.
  • 12. Semantic search: why? • With free-text search, the following problems may reduce precision and recall: – synonymy problem: searching for “images” should return resources annotated with “picture” – polysemy problem: searching for “java” (island) should not return resources annotated with “java” (coffee beverage) – specificity gap problem: searching for “animals” should also return resources annotated with “dogs” • Semantic, meaning-based search can address the problems listed above
  • 13. Semantics vs Folksonomy. “javaisland”: used to build “raw” queries. “java island”: used to build BoW queries. “Java(island) island(land)”: used to build semantic queries, which are correct and complete. Semantic search: complete and correct results (the baseline). Specificity Gap (SG): for the example hierarchy vehicle, car, taxi, the slide marks SG=1 and SG=2. Recall goes down as the specificity gap increases. Diagram labels: User, query, submit, result, resource, annotation, link.
  • 14. Index: semantic convergence. Problem 4: semi-automatic semantification of existing annotations. Diagram labels: User, Semantic search, …, Reasoning.
  • 15. Semantic convergence: Why? [Two pie charts classifying tags as: With a WN sense, Missing sense, Abbreviation, Cannot decide, I don't know, Other. Panels: “Random: programming and web domain” and “‘General’ domains: cooking, travel, education”. Recoverable shares: With a WN sense 49% and 71%; Missing sense 36% and 15%. Example tags shown: Ajax, Mac, Apple, CSS, …]
  • 16. Semantic convergence: proposed solution • Find new senses of terms – Find different senses of the same term (word senses) – Find synonyms of a term (synonym sets, or synsets) • Place the new synset in the is-a hierarchy of the vocabulary • What we improve – Better use of Machine Learning techniques – The polysemy issue, which is not considered in the state of the art – Evaluations, which are missing or “subjective” in the state of the art • Evaluation using the Delicious dataset
  • 17. Convergence Evaluation: Finding Senses. [Diagrams: Tag Collocation and User Collocation graphs over tags t1 to t5, bookmarks B1 to B4 and users U1, U2.] Labels on the slide: Tag Collocation, User Collocation, Random Baseline. Reported precision / recall pairs, in slide order: 56% / 73%, 42% / 29%, 57% / 68%.
  • 18. Semantic annotation lifecycle. Conclusions: combining human and computational intelligence. Problem 1: help the user understand the meaning of semantic annotations? Problem 2: extract (semantic) annotations from contexts of user resource at publishing? Problem 3: QoS of semantics-enabled services? Problem 4: semi-automatic semantification of existing annotations. What if the users could use semantic annotations instead to leverage semantic technology services? Semantic annotation = structure and/or meaning. Diagram labels: User, Context, free text annotations, Semantic search, …, Reasoning.
  • 19. Conclusions • We developed and evaluated a meaning summarization algorithm • We developed a “semantic folksonomy” evaluation platform • We studied the effect of semantics on social tagging systems: – how much can semantics help? – how much does the user need to be involved? – how can human and computer intelligence be combined in the generation and consumption of semantic annotations? • We developed and evaluated a knowledge base enrichment algorithm • We built and used a gold standard dataset for evaluating: – Word Sense Disambiguation – Tag Preprocessing – Semantic Search – Semantic Convergence
  • 20. Integration with the use cases
  • 21. Publications • Semantic Disambiguation in Folksonomy: a Case Study. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu. Advanced Language Technologies for Digital Libraries, Springer LNCS. • Semantic Annotation of Images on Flickr. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu. ESWC 2011. • A Classification of Semantic Annotation Systems. Pierre Andrews, Sergey Kanshin, Juan Pane, and Ilya Zaihrayeu. Semantic Web Journal (second review phase). • Sense Induction in Folksonomies. Pierre Andrews, Juan Pane, and Ilya Zaihrayeu. IJCAI-LHD 2011 (under review). • Evaluating the Quality of Service in Semantic Annotation Systems. Ilya Zaihrayeu, Pierre Andrews, and Juan Pane. In preparation.
  • 22. WP 2 Timeline and Deliverables (months 0, 6, 12, 18, 24, 30, 36) • Task 2.1 Designing models (UIBK) – D2.1.1: State of the Art and requirements from the use case partners – D2.1.2: Specification of the model • Task 2.2 Designing methods (UNITN) – D2.2.1: Report on bootstrapping semantic annotations and on reaching consensus in the use of semantics – D2.2.2+D2.2.3: Report on linking semantic annotations to external sources and on keeping them up-to-date when the underlying semantic model changes • Task 2.3 Research on Information Retrieval (IR) methods for semantic content (ONTO) – D2.3.1: Requirements for semantics-aware IR methods – D2.3.2: Specification for semantics-aware IR methods • Task 2.4 Models and methods for automatic visual annotation (UTC) – D2.5: Report on the state of the art, proposed suitable models and methods for automatic visual annotation • D2.4: Report on the refinement of the proposed models, methods and semantic search
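
Slide 8 shows the tag “javaisland” being split into the candidate token sequences “java island” and “java is land” before disambiguation. The sketch below illustrates one way such a dictionary-driven split could work. The `LEXICON` set and the `split_tag` function are hypothetical names introduced here for illustration; this is not the project's preprocessing algorithm, only a minimal example of the idea under the assumption that a word list is available.

```python
from functools import lru_cache

# Hypothetical toy lexicon (an assumption for this sketch); a real system
# would consult a full vocabulary such as WordNet.
LEXICON = frozenset({"java", "island", "is", "land"})


def split_tag(tag: str) -> list[list[str]]:
    """Return every segmentation of `tag` into words found in LEXICON.

    Segmentations with fewer tokens come first.
    """
    tag = tag.lower()

    @lru_cache(maxsize=None)
    def segment(start: int) -> tuple[tuple[str, ...], ...]:
        # Reaching the end of the tag yields one (empty) segmentation.
        if start == len(tag):
            return ((),)
        found = []
        for end in range(start + 1, len(tag) + 1):
            word = tag[start:end]
            if word in LEXICON:
                # Extend every segmentation of the remaining suffix.
                for rest in segment(end):
                    found.append((word,) + rest)
        return tuple(found)

    return sorted((list(s) for s in segment(0)), key=len)


if __name__ == "__main__":
    for candidate in split_tag("javaisland"):
        print(" ".join(candidate))
```

Each candidate split would then be passed to the disambiguation step, which chooses a sense per token; a tag that cannot be segmented with the lexicon simply yields no candidates.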

Editor's notes

  1. Say how it is different from the Tagora dataset: we have gold-standard preprocessing and disambiguation, with agreement between at least two annotators.
  2. The first platform for building gold standards for the evaluation of concept-based search algorithms, vocabulary convergence algorithms, etc. in folksonomies. The first gold standard dataset produced and published. The first evaluation of a keyword-based search algorithm w.r.t. the gold standard semantic search in a folksonomy. Tag preprocessing algorithm, WSD algorithm, concept-based search algorithm.
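
The evaluation of keyword-based search against a gold standard semantic search (slides 12 and 13) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the `PARENT` is-a table, the `CONCEPT_OF` synonym table, the `RESOURCES` annotations and all function names are hypothetical, and polysemy is deliberately left out. It only shows how the specificity gap can be counted as is-a steps and how recall of keyword search drops when relevant resources are annotated with more specific terms.

```python
# Hypothetical toy vocabulary: concept -> parent concept (is-a hierarchy).
PARENT = {"taxi": "car", "car": "vehicle", "dog": "animal"}
# Hypothetical synonym table: word -> concept it denotes.
CONCEPT_OF = {"picture": "image", "image": "image"}


def concept(word):
    """Map a word to its concept; unknown words stand for themselves."""
    return CONCEPT_OF.get(word, word)


def specificity_gap(query, annotation):
    """Count is-a steps from the annotation's concept up to the query's concept.

    Returns None when the query concept is not reached.
    """
    target = concept(query)
    node = concept(annotation)
    gap = 0
    while node is not None:
        if node == target:
            return gap
        node = PARENT.get(node)
        gap += 1
    return None


def keyword_search(query, resources):
    """Exact string match on tags (the free-text behaviour)."""
    return {r for r, tags in resources.items() if query in tags}


def semantic_search(query, resources):
    """Match resources whose tags denote the query concept or a more specific one."""
    return {
        r
        for r, tags in resources.items()
        if any(specificity_gap(query, t) is not None for t in tags)
    }


def recall(found, relevant):
    """Fraction of relevant resources that were found."""
    return len(found & relevant) / len(relevant) if relevant else 1.0


# Hypothetical annotated resources.
RESOURCES = {"r1": {"vehicle"}, "r2": {"car"}, "r3": {"taxi"}, "r4": {"picture"}}

if __name__ == "__main__":
    relevant = semantic_search("vehicle", RESOURCES)  # treated as the gold standard
    found = keyword_search("vehicle", RESOURCES)
    print("keyword recall:", recall(found, relevant))
```

In this toy setup the keyword query “vehicle” misses the resources tagged “car” (SG=1) and “taxi” (SG=2), and the keyword query “image” misses the resource tagged “picture”, which mirrors the specificity gap and synonymy problems listed on slide 12.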