SlideShare une entreprise Scribd logo
1  sur  20
Télécharger pour lire hors ligne
Università Politecnica delle Marche
                Department of Computer Science, Management and Automation
                Ancona, Italy




         A Semantic-Aided Designer
          for Knowledge Discovery

        Claudia Diamantini, Domenico Potena, Emanuele Storti

                                e.storti@univpm.it



CTS2011, Philadelphia, May 23-27
Introduction

  Data explosion & KDD
  Organizations need methods and technologies to analyze
  huge amounts of data, to support decisional processes

                        Knowledge Discovery
                        in Databases (KDD) is the
                        process of identifying valid, novel,
                        potentially useful patterns in data


                         many steps, iterations
                         interaction
                         user knowledge

CTS2011, May 23-27                A Semantic-Aided Designer for KD
Introduction
  1st generation of IDAs (Intelligent Data Analysis systems):
  ● local frameworks

  ● single-user

  ● predefined set of tools (little extensibility)



  2nd generation: distribution of tools & computational aspects

  Evolution of organizations: distribution of user, collaboration


               domain
                expert           DB / DWH
                                administrator
                                                      How to support the design
                                                      of a KDD project in an
                                                      open, distributed and
                                                      collaborative scenario?
                                    DM
          KDD                    specialists
        specialists

CTS2011, May 23-27                              A Semantic-Aided Designer for KD
Issues

  Heterogeneity & tool distribution
  Many KDD and Data Mining tools available for any domain/task,
  many possible combinations

   Heterogeneous interfaces
    programming languages, OSs,
    transfer protocols,..

   Complex to use
    process design, data preparation,
    precondition satisfaction, I/O interpretation
   tools should be easily and dinamically added in the platform
   they should be accessible, searchable, executable via standard API

   suggestions about the best tool sequences

   support for tool setup and process execution



CTS2011, May 23-27                         A Semantic-Aided Designer for KD
Issues

  User distribution

  Distributed organizations:
  ● multiple branch enterprises
  ● E-Science project




  Collaboration:
  ● source of complexity
  ● distributed computation: several users can

  succeed where a single user is likely to fail

    collaborative design of KDD processes
    tool/process sharing and annotation

    easy join of new partners in Virtual Teams




CTS2011, May 23-27                           A Semantic-Aided Designer for KD
Methodology

  Service Oriented Architecture

  Basic Services                    Support Services

  Services for any KDD task:        Back-end services:
                                    ● access control
  every KDD tool is wrapped
                                    ● data transfer
  as a Web Service, deployed
                                    ● service publishing
  on the publisher's server,
                                    ● UDDI registry
  and published in a common
  repository
                                    High-level functionalities:
                                    ● service discovery

                                    ● interface matchmaking

                                    ● process composition

  C4.5 tool          C4.5 service


CTS2011, May 23-27                     A Semantic-Aided Designer for KD
Methodology

  Semantic descriptors for Basic Services
                     Separation of information           in    3
                     abstraction layers

                     Tools/services are annotated through
                     XML     descriptors:  details  about
                     interfaces and QoS

                     Algorithms are formally described in a
                     KDD ontology, which contains an
                     algorithm taxonomy and high level
                     information   about     their   tasks,
                     methods and functionalities

CTS2011, May 23-27                 A Semantic-Aided Designer for KD
Methodology
         KDD algorithms            ID3



         KDD tools




     KDD services



  Benefits: loose-coupling, reusability
  Support services rely on such layers:
        service discovery
        interface matchmaking
        process composition
CTS2011, May 23-27                       A Semantic-Aided Designer for KD
Methodology

  KDD ontology       Remove missing values                         C4.5
                          algorithm
                                             Labeled Dataset
                                                                 algorithm




     KDD services
                                                          abc   C4.5_v.2.0




  Benefits: loose-coupling, reusability
  Support services rely on such layers:
        service discovery
        interface matchmaking
        process composition
CTS2011, May 23-27                                 A Semantic-Aided Designer for KD
KDDesigner
  A web-based tool aimed at supporting users in collaborative
  KDD process design




CTS2011, May 23-27                   A Semantic-Aided Designer for KD
Service discovery
  Retrieval of KDD services satisfying user requirements




CTS2011, May 23-27                   A Semantic-Aided Designer for KD
Service discovery
  Retrieval of KDD services satisfying user requirements


                                            4




                       1



                                        2

                                        3       KDDONTO

CTS2011, May 23-27                   A Semantic-Aided Designer for KD
Service discovery
  Retrieval of KDD services satisfying user requirements


                                      1



                                              4




                                          2

                                          3
CTS2011, May 23-27                   A Semantic-Aided Designer for KD
Process design




CTS2011, May 23-27   A Semantic-Aided Designer for KD
Interface matchmaking
  Verification of data compatibility in an I/O connection




CTS2011, May 23-27                      A Semantic-Aided Designer for KD
Interface matchmaking
 Matchmaker service checks the validity of the match
 ●
     syntactic compatibility
     comparison between service descriptors
     (I/O primitive datatype and syntax)




                                              same format?
     KDD services                             same primitive   abc
                                               datatype?


 ●
     Output: cost of match
CTS2011, May 23-27                                A Semantic-Aided Designer for KD
Interface matchmaking
 Matchmaker service checks the validity of the match
 ●
     syntactic compatibility
     comparison between service descriptors
     (I/O primitive datatype and syntax)
 ●
     semantic compatibility
     comparison between ontological annotations of the services
     (kind of match between I/O, preconditions/postconditions... and many more)

                                               same concept?
     KDD ontology                     x        subconcept?         y
                                               part-of concept?




     KDD services                                                 abc



 ●
     Output: cost of match
CTS2011, May 23-27                                   A Semantic-Aided Designer for KD
Semi-automatic composition
 KDDComposer: advanced service for composition

 Input
 ● user dataset
 ● a set of requirements

 (max num algorithms,
 computational complexity,
 max cost of match)
 ● user goal (classification,

 regression, ...)


     Output
      A ranked list of abstract processes
     (suggestions about processes useful to solve the user problem)

CTS2011, May 23-27                         A Semantic-Aided Designer for KD
Collaboration
  ● collaborative process edit/annotation (wiki-style)
  ● versioning system

  ● team management and add of new users

  ● manual parameter setting




CTS2011, May 23-27                      A Semantic-Aided Designer for KD
Conclusion

 SOA for KDD
      ●   Basic Services and Support Services
      ●   KDD Designer: a semantic-aided designer for KDD

 Open environment and heterogeneous tools
      ●   different interfaces: need of a common representation
          (service)
      ●   abstraction for an high-level description of tools (algorithm)
      ●   semantics for interoperability and high-level functionalities

 Future work
      ●   extension with new support services
      ●   process export in more workflow languages
      ●   more collaborative features (real-time editor)

CTS2011, May 23-27                           A Semantic-Aided Designer for KD

Contenu connexe

Dernier

+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 

Dernier (20)

Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​Elevate Developer Efficiency & build GenAI Application with Amazon Q​
Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 AmsterdamDEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
DEV meet-up UiPath Document Understanding May 7 2024 Amsterdam
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 

En vedette

How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
ThinkNow
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
Kurio // The Social Media Age(ncy)
 

En vedette (20)

2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot2024 State of Marketing Report – by Hubspot
2024 State of Marketing Report – by Hubspot
 
Everything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPTEverything You Need To Know About ChatGPT
Everything You Need To Know About ChatGPT
 
Product Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage EngineeringsProduct Design Trends in 2024 | Teenage Engineerings
Product Design Trends in 2024 | Teenage Engineerings
 
How Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental HealthHow Race, Age and Gender Shape Attitudes Towards Mental Health
How Race, Age and Gender Shape Attitudes Towards Mental Health
 
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdfAI Trends in Creative Operations 2024 by Artwork Flow.pdf
AI Trends in Creative Operations 2024 by Artwork Flow.pdf
 
Skeleton Culture Code
Skeleton Culture CodeSkeleton Culture Code
Skeleton Culture Code
 
PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024PEPSICO Presentation to CAGNY Conference Feb 2024
PEPSICO Presentation to CAGNY Conference Feb 2024
 
Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)Content Methodology: A Best Practices Report (Webinar)
Content Methodology: A Best Practices Report (Webinar)
 
How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024How to Prepare For a Successful Job Search for 2024
How to Prepare For a Successful Job Search for 2024
 
Social Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie InsightsSocial Media Marketing Trends 2024 // The Global Indie Insights
Social Media Marketing Trends 2024 // The Global Indie Insights
 
Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024Trends In Paid Search: Navigating The Digital Landscape In 2024
Trends In Paid Search: Navigating The Digital Landscape In 2024
 
5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary5 Public speaking tips from TED - Visualized summary
5 Public speaking tips from TED - Visualized summary
 
ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd ChatGPT and the Future of Work - Clark Boyd
ChatGPT and the Future of Work - Clark Boyd
 
Getting into the tech field. what next
Getting into the tech field. what next Getting into the tech field. what next
Getting into the tech field. what next
 
Google's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search IntentGoogle's Just Not That Into You: Understanding Core Updates & Search Intent
Google's Just Not That Into You: Understanding Core Updates & Search Intent
 
How to have difficult conversations
How to have difficult conversations How to have difficult conversations
How to have difficult conversations
 
Introduction to Data Science
Introduction to Data ScienceIntroduction to Data Science
Introduction to Data Science
 
Time Management & Productivity - Best Practices
Time Management & Productivity -  Best PracticesTime Management & Productivity -  Best Practices
Time Management & Productivity - Best Practices
 
The six step guide to practical project management
The six step guide to practical project managementThe six step guide to practical project management
The six step guide to practical project management
 
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
Beginners Guide to TikTok for Search - Rachel Pearson - We are Tilt __ Bright...
 

A Semantic-Aided Designer for Knowledge Discovery

  • 1. Università Politecnica delle Marche Department of Computer Science, Management and Automation Ancona, Italy A Semantic-Aided Designer for Knowledge Discovery Claudia Diamantini, Domenico Potena, Emanuele Storti e.storti@univpm.it CTS2011, Philadelphia, May 23-27
  • 2. Introduction Data explosion & KDD Organizations need methods and technologies to analyze huge amounts of data, to support decisional processes Knowledge Discovery in Databases (KDD) is the process of identifying valid, novel, potentially useful patterns in data  many steps, iterations  interaction  user knowledge CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 3. Introduction 1st generation of IDAs (Intelligent Data Analysis systems): ● local frameworks ● single-user ● predefined set of tools (little extensibility) 2nd generation: distribution of tools & computational aspects Evolution of organizations: distribution of user, collaboration domain expert DB / DWH administrator How to support the design of a KDD project in an open, distributed and collaborative scenario? DM KDD specialists specialists CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 4. Issues Heterogeneity & tool distribution Many KDD and Data Mining tools available for any domain/task, many possible combinations  Heterogeneous interfaces programming languages, OSs, transfer protocols,..  Complex to use process design, data preparation, precondition satisfaction, I/O interpretation  tools should be easily and dinamically added in the platform  they should be accessible, searchable, executable via standard API  suggestions about the best tool sequences  support for tool setup and process execution CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 5. Issues User distribution Distributed organizations: ● multiple branch enterprises ● E-Science project Collaboration: ● source of complexity ● distributed computation: several users can succeed where a single user is likely to fail  collaborative design of KDD processes  tool/process sharing and annotation  easy join of new partners in Virtual Teams CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 6. Methodology Service Oriented Architecture Basic Services Support Services Services for any KDD task: Back-end services: ● access control every KDD tool is wrapped ● data transfer as a Web Service, deployed ● service publishing on the publisher's server, ● UDDI registry and published in a common repository High-level functionalities: ● service discovery ● interface matchmaking ● process composition C4.5 tool C4.5 service CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 7. Methodology Semantic descriptors for Basic Services Separation of information in 3 abstraction layers Tools/services are annotated through XML descriptors: details about interfaces and QoS Algorithms are formally described in a KDD ontology, which contains an algorithm taxonomy and high level information about their tasks, methods and functionalities CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 8. Methodology KDD algorithms ID3 KDD tools KDD services  Benefits: loose-coupling, reusability  Support services rely on such layers:  service discovery  interface matchmaking  process composition CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 9. Methodology KDD ontology Remove missing values C4.5 algorithm Labeled Dataset algorithm KDD services abc C4.5_v.2.0  Benefits: loose-coupling, reusability  Support services rely on such layers:  service discovery  interface matchmaking  process composition CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 10. KDDesigner A web-based tool aimed at supporting users in collaborative KDD process design CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 11. Service discovery Retrieval of KDD services satisfying user requirements CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 12. Service discovery Retrieval of KDD services satisfying user requirements 4 1 2 3 KDDONTO CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 13. Service discovery Retrieval of KDD services satisfying user requirements 1 4 2 3 CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 14. Process design CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 15. Interface matchmaking Verification of data compatibility in an I/O connection CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 16. Interface matchmaking Matchmaker service checks the validity of the match ● syntactic compatibility comparison between service descriptors (I/O primitive datatype and syntax) same format? KDD services same primitive abc datatype? ● Output: cost of match CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 17. Interface matchmaking Matchmaker service checks the validity of the match ● syntactic compatibility comparison between service descriptors (I/O primitive datatype and syntax) ● semantic compatibility comparison between ontological annotations of the services (kind of match between I/O, preconditions/postconditions... and many more) same concept? KDD ontology x subconcept? y part-of concept? KDD services abc ● Output: cost of match CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 18. Semi-automatic composition KDDComposer: advanced service for composition Input ● user dataset ● a set of requirements (max num algorithms, computational complexity, max cost of match) ● user goal (classification, regression, ...) Output A ranked list of abstract processes (suggestions about processes useful to solve the user problem) CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 19. Collaboration ● collaborative process edit/annotation (wiki-style) ● versioning system ● team management and add of new users ● manual parameter setting CTS2011, May 23-27 A Semantic-Aided Designer for KD
  • 20. Conclusion SOA for KDD ● Basic Services and Support Services ● KDD Designer: a semantic-aided designer for KDD Open environment and heterogeneous tools ● different interfaces: need of a common representation (service) ● abstraction for an high-level description of tools (algorithm) ● semantics for interoperability and high-level functionalities Future work ● extension with new support services ● process export in more workflow languages ● more collaborative features (real-time editor) CTS2011, May 23-27 A Semantic-Aided Designer for KD