SlideShare a Scribd company logo
1 of 31
Similarity Search inNon-text Data Pavel Zezula Faculty of Informatics Masaryk University, Brno 4.10.2011 Searching Session NTK 2011
Real-Life MotivationThe social psychology view Any event in the history of organism is, in a sense, unique. Recognition, learning, and judgment presuppose an ability to categorize stimuli and classify situations by similarity. Similarity (proximity, resemblance, communality, representativeness, psychologicaldistance, etc.) is fundamental to theories of perception, learning,  judgment, etc. Searching Session NTK 2011 4.10.2011
Contemporary Networked MediaThe digital data view Almost everything that we see, read, hear, write, measure, or observe can be digital. Users autonomouslycontribute to production of global media and the growth is exponential. Sites like Flickr, YouTube, Facebook host user contributed content for a variety of events. The elements of networked media are related by numerous multi-facet links of similarity. 4.10.2011 Searching Session NTK 2011
Examples with Similarity Does the computer disk of a suspected criminal contain illegal multimedia material? What are the stocks with similar price histories? Which companies advertise their logos in the direct TV transmission of football match? Is it the situation on the web getting close to any of the network attacks which resulted in significant damage in the past?  4.10.2011 Searching Session NTK 2011
Challenge Networked media is getting close to the human “fact-bases”. Similaritydatamanagement is needed to connect, search, filter, merge, relate, rank, cluster, classify, identify, or categorize objects across various collections. WHY? It is the similarity which is in the world revealing. 4.10.2011 Searching Session NTK 2011
Limitations:Data Types We have Attributes Numbers, strings, etc. Text (text-based) Documents, annotations We need Multimedia Image, video, audio Security  Biometrics Medicine EKG, EEG, EMG, EMR, CT, etc. Scientific data Biology, chemistry, physics, life sciences, economics Others Motion, emotion, events, etc. 4.10.2011 Searching Session NTK 2011
Limitations:Models of Similarity We have Simple geometric models, typically vector spaces We need More complex model Non metric models Asymmetric similarity Subjective similarity Context aware similarity Complex similarity Etc. 4.10.2011 Searching Session NTK 2011
Limitations:Queries We have Simple query Nearest neighbor Range We need More query types Reverse NN, distinct NN, similarity join Other similarity-based operations Filtering, classification, event detection, clustering, etc. Similarity algebra May become the basis of a “Similarity Data Management System” 4.10.2011 Searching Session NTK 2011
Limitations:Implementation Strategies We have Centralized or parallel processing We need Scalable and distributed architectures MapReduce like approaches P2P architectures Cloud computing Self-organized architectures Etc. 4.10.2011 Searching Session NTK 2011
4.10.2011 Searching Session NTK 2011 Search Strategy Evolution Scalability ,[object Object]
number of users (queries)
variety of data types
multi-lingual, -feature –modal querieswell established cutting-edge research high Determinism exact match    ► similarity precise	            ► approximate same answer  ► good answer; recommendation fixed query      ► personalized; context aware fixed infrastr.    ► dynamic mapping; mobile dev. peer-to-peer centralized parallel self-organized distributed grade low
Word Cloud of Applications 4.10.2011 Searching Session NTK 2011
Metric Search Grows in Popularity Hanan Samet Foundation of Multidimensional and Metric Data Structures Morgan Kaufmann, 2006 P. Zezula, G. Amato, V. Dohnal, and M. Batko Similarity Search: The Metric Space Approach Springer, 2006 4.10.2011 Searching Session NTK 2011
The MUFIN Approach MUFIN: MUlti-Feature Indexing Network SEARCH data & queries index structure infrastructure Scalability P2P structure Extensibility metric space Tuning of performance Internet / GRID / LAN network independence 4.10.2011 Searching Session NTK 2011
Metric Space an Abstraction of Similarity Metric space:M = (D,d) D– domain distance function d(x,y) x,y,z  D d(x,y) > 0				- non-negativity d(x,y) = 0  x = y			- identity d(x,y) = d(y,x)			- symmetry d(x,y)≤ d(x,z)+ d(z,y)		- triangle inequality 4.10.2011 Searching Session NTK 2011
Peer-to-Peer Indexing Native metric techniques: GHT*, VPT* Transformation techniques: M-CAN, M-Chord 4.10.2011 Searching Session NTK 2011
Image search Image base similar? 4.10.2011 Searching Session NTK 2011
Images and their Descriptors Image level B Descriptor level R G 4.10.2011 Searching Session NTK 2011
100M images + metadata + MPEG-7 VDs http://cophir.isti.cnr.it/ ,[object Object]
Each image contains:
Five MPEG-7 VDs: Scalable Color, Color Structure, Color Layout, Edge Histogram, Homogeneous Texture
Other textual information: title, tags, comments, etc.
Photos have been crawled from the Flickr photo-sharing site.CoPhIR: Content-based PhotoImage Retrieval 4.10.2011 Searching Session NTK 2011
MUFIN SEARCH ENGINE data & queries index structure infrastructure 6 x IBM server x3400 Scalability M-Chord + M-Tree Extensibility COPHIR color structure scalable color color layout edge histogram homogeneous texture Image Search Demohttp://mufin.fi.muni.cz/imgsearch/ 4.10.2011 Searching Session NTK 2011
demos http://mufin.fi.muni.cz/apps.html 4.10.2011 Searching Session NTK 2011
Current Research Activities Image Query Postprocessing Sub-image Searching Remote Biometrics Event Detection in Video Signal Processing 4.10.2011 Searching Session NTK 2011
Query Postprocessing The understanding of similarity is: subjective context-dependent multi-modal Semantic gap Overcoming semantic gap by combining aspects semantics-learning result postprocessing relevance feedback & iterative search Our objectives Large general data collections with various quality of metadata Online searching response times 4.10.2011 Searching Session NTK 2011
Query Postprocessing by Ranking Initial search Ranking Two-phase query evaluation model Search the whole collection by some aspects => candidate set Rank the candidate set – sort by other aspects  Advantages ,[object Object]
  Enables cooperation with userDisadvantages  ,[object Object],4.10.2011 Searching Session NTK 2011

More Related Content

What's hot

Failed queries: a morpho-syntactic analysis based on transaction log files
Failed queries: a morpho-syntactic analysis based on transaction log filesFailed queries: a morpho-syntactic analysis based on transaction log files
Failed queries: a morpho-syntactic analysis based on transaction log filesGiannis Tsakonas
 
Thomas
ThomasThomas
Thomasanesah
 
Myths and challenges in knowledge extraction and analysis from human-generate...
Myths and challenges in knowledge extraction and analysis from human-generate...Myths and challenges in knowledge extraction and analysis from human-generate...
Myths and challenges in knowledge extraction and analysis from human-generate...Marco Brambilla
 
ICDMWorkshopProposal.doc
ICDMWorkshopProposal.docICDMWorkshopProposal.doc
ICDMWorkshopProposal.docbutest
 
Knowledge Engineering and Intelligence Gathering
Knowledge Engineering and Intelligence GatheringKnowledge Engineering and Intelligence Gathering
Knowledge Engineering and Intelligence GatheringNicolae Sfetcu
 
Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010Raphael Troncy
 
Leveraging social media for training object detectors
Leveraging social media for training object detectorsLeveraging social media for training object detectors
Leveraging social media for training object detectorsManish Kumar
 
6528 opensource intelligence as the new introduction in the graduate cybersec...
6528 opensource intelligence as the new introduction in the graduate cybersec...6528 opensource intelligence as the new introduction in the graduate cybersec...
6528 opensource intelligence as the new introduction in the graduate cybersec...Damir Delija
 
"Mass Surveillance" through Distant Reading
"Mass Surveillance" through Distant Reading"Mass Surveillance" through Distant Reading
"Mass Surveillance" through Distant ReadingShalin Hai-Jew
 
Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2Shalin Hai-Jew
 
Machine learning and multimedia information retrieval
Machine learning and multimedia information retrievalMachine learning and multimedia information retrieval
Machine learning and multimedia information retrievalSi Krishan
 
Multidimensional Analysis of Complex Networks
Multidimensional Analysis of Complex NetworksMultidimensional Analysis of Complex Networks
Multidimensional Analysis of Complex NetworksLino Possamai
 

What's hot (13)

Failed queries: a morpho-syntactic analysis based on transaction log files
Failed queries: a morpho-syntactic analysis based on transaction log filesFailed queries: a morpho-syntactic analysis based on transaction log files
Failed queries: a morpho-syntactic analysis based on transaction log files
 
Thomas
ThomasThomas
Thomas
 
Myths and challenges in knowledge extraction and analysis from human-generate...
Myths and challenges in knowledge extraction and analysis from human-generate...Myths and challenges in knowledge extraction and analysis from human-generate...
Myths and challenges in knowledge extraction and analysis from human-generate...
 
ICDMWorkshopProposal.doc
ICDMWorkshopProposal.docICDMWorkshopProposal.doc
ICDMWorkshopProposal.doc
 
Knowledge Engineering and Intelligence Gathering
Knowledge Engineering and Intelligence GatheringKnowledge Engineering and Intelligence Gathering
Knowledge Engineering and Intelligence Gathering
 
Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010Multimedia Semantics - SSMS 2010
Multimedia Semantics - SSMS 2010
 
Leveraging social media for training object detectors
Leveraging social media for training object detectorsLeveraging social media for training object detectors
Leveraging social media for training object detectors
 
CV
CVCV
CV
 
6528 opensource intelligence as the new introduction in the graduate cybersec...
6528 opensource intelligence as the new introduction in the graduate cybersec...6528 opensource intelligence as the new introduction in the graduate cybersec...
6528 opensource intelligence as the new introduction in the graduate cybersec...
 
"Mass Surveillance" through Distant Reading
"Mass Surveillance" through Distant Reading"Mass Surveillance" through Distant Reading
"Mass Surveillance" through Distant Reading
 
Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2Building a Digital Learning Object w/ Articulate Storyline 2
Building a Digital Learning Object w/ Articulate Storyline 2
 
Machine learning and multimedia information retrieval
Machine learning and multimedia information retrievalMachine learning and multimedia information retrieval
Machine learning and multimedia information retrieval
 
Multidimensional Analysis of Complex Networks
Multidimensional Analysis of Complex NetworksMultidimensional Analysis of Complex Networks
Multidimensional Analysis of Complex Networks
 

Viewers also liked

VAT Club: UK - Capital Goods Scheme briefing
VAT Club: UK - Capital Goods Scheme briefingVAT Club: UK - Capital Goods Scheme briefing
VAT Club: UK - Capital Goods Scheme briefingAlex Baulf
 
Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...
Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...
Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...Symposium
 
Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation
Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation
Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation Symposium
 
Southern Gold | ASX:SAU | RIS2014 Broken Hill Investor Presentation
Southern Gold | ASX:SAU | RIS2014 Broken Hill Investor PresentationSouthern Gold | ASX:SAU | RIS2014 Broken Hill Investor Presentation
Southern Gold | ASX:SAU | RIS2014 Broken Hill Investor PresentationSymposium
 
Chesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor Presentation
Chesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor PresentationChesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor Presentation
Chesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor PresentationSymposium
 
Antipodes Gold AXG: April Introduction.
Antipodes Gold AXG: April Introduction.Antipodes Gold AXG: April Introduction.
Antipodes Gold AXG: April Introduction.antipodesgold
 

Viewers also liked (6)

VAT Club: UK - Capital Goods Scheme briefing
VAT Club: UK - Capital Goods Scheme briefingVAT Club: UK - Capital Goods Scheme briefing
VAT Club: UK - Capital Goods Scheme briefing
 
Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...
Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...
Gold Investment Symposium 2012 - Company Presentation - Goldminex Resources L...
 
Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation
Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation
Peel Mining | ASX:PEX | RIS2014 Broken Hill Investor Presentation
 
Southern Gold | ASX:SAU | RIS2014 Broken Hill Investor Presentation
Southern Gold | ASX:SAU | RIS2014 Broken Hill Investor PresentationSouthern Gold | ASX:SAU | RIS2014 Broken Hill Investor Presentation
Southern Gold | ASX:SAU | RIS2014 Broken Hill Investor Presentation
 
Chesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor Presentation
Chesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor PresentationChesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor Presentation
Chesser Resources | ASX:CHZ | RIS2014 Broken Hill Investor Presentation
 
Antipodes Gold AXG: April Introduction.
Antipodes Gold AXG: April Introduction.Antipodes Gold AXG: April Introduction.
Antipodes Gold AXG: April Introduction.
 

Similar to Podobnostní hledání v netextových datech (Pavel Zezula)

Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spacesMounia Lalmas-Roelleke
 
Big social data analytics - social network analysis
Big social data analytics - social network analysis Big social data analytics - social network analysis
Big social data analytics - social network analysis Jari Jussila
 
hsd-faculty-lunch-jan06
hsd-faculty-lunch-jan06hsd-faculty-lunch-jan06
hsd-faculty-lunch-jan06webuploader
 
Ariu - Workshop on Artificial Intelligence and Security - 2011
Ariu - Workshop on Artificial Intelligence and Security - 2011Ariu - Workshop on Artificial Intelligence and Security - 2011
Ariu - Workshop on Artificial Intelligence and Security - 2011Pluribus One
 
IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...
IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...
IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...ISAR Publications
 
Read Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataRead Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataDaniele Di Mitri
 
Integrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched dataIntegrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched dataDhaval Thakker
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010Twente ir-course 20-10-2010
Twente ir-course 20-10-2010Arjen de Vries
 
ICRA: Intelligent Platform for Collaboration and Interaction
ICRA: Intelligent Platform for Collaboration and InteractionICRA: Intelligent Platform for Collaboration and Interaction
ICRA: Intelligent Platform for Collaboration and InteractionLukas Tencer
 
De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012Anna De Liddo
 
Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...Dejan Kovachev
 
NE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISNE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISrathnaarul
 
Workshop summary software assurance and trust
Workshop summary software assurance and trustWorkshop summary software assurance and trust
Workshop summary software assurance and trustfcleary
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachCSCJournals
 
Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...
Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...
Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...Sauvik Das
 
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia CommunitiesIEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia CommunitiesKalman Graffi
 
Gettingstartedwithdigitalcollectionsweb[1]
Gettingstartedwithdigitalcollectionsweb[1]Gettingstartedwithdigitalcollectionsweb[1]
Gettingstartedwithdigitalcollectionsweb[1]guest410707c
 
Facilitating Dialogue - Using Semantic Web Technology for eParticipation
Facilitating Dialogue - Using Semantic Web Technology for eParticipationFacilitating Dialogue - Using Semantic Web Technology for eParticipation
Facilitating Dialogue - Using Semantic Web Technology for eParticipationIMC Technologies
 
Privacy and Cryptographic Security Issues within Mobile Recommender Syste...
Privacy and Cryptographic Security Issues within Mobile     Recommender Syste...Privacy and Cryptographic Security Issues within Mobile     Recommender Syste...
Privacy and Cryptographic Security Issues within Mobile Recommender Syste...Jacob Mack
 

Similar to Podobnostní hledání v netextových datech (Pavel Zezula) (20)

Aggregation for searching complex information spaces
Aggregation for searching complex information spacesAggregation for searching complex information spaces
Aggregation for searching complex information spaces
 
Big social data analytics - social network analysis
Big social data analytics - social network analysis Big social data analytics - social network analysis
Big social data analytics - social network analysis
 
hsd-faculty-lunch-jan06
hsd-faculty-lunch-jan06hsd-faculty-lunch-jan06
hsd-faculty-lunch-jan06
 
Ariu - Workshop on Artificial Intelligence and Security - 2011
Ariu - Workshop on Artificial Intelligence and Security - 2011Ariu - Workshop on Artificial Intelligence and Security - 2011
Ariu - Workshop on Artificial Intelligence and Security - 2011
 
IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...
IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...
IJRET-V1I1P5 - A User Friendly Mobile Search Engine for fast Accessing the Da...
 
Read Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal DataRead Between The Lines: an Annotation Tool for Multimodal Data
Read Between The Lines: an Annotation Tool for Multimodal Data
 
Integrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched dataIntegrating digital traces into a semantic enriched data
Integrating digital traces into a semantic enriched data
 
Twente ir-course 20-10-2010
Twente ir-course 20-10-2010Twente ir-course 20-10-2010
Twente ir-course 20-10-2010
 
ICRA: Intelligent Platform for Collaboration and Interaction
ICRA: Intelligent Platform for Collaboration and InteractionICRA: Intelligent Platform for Collaboration and Interaction
ICRA: Intelligent Platform for Collaboration and Interaction
 
De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012De liddo & Buckingham Shum jurix2012
De liddo & Buckingham Shum jurix2012
 
Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...Enhancing Academic Event Participation with Context-aware and Social Recommen...
Enhancing Academic Event Participation with Context-aware and Social Recommen...
 
NE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSISNE7012- SOCIAL NETWORK ANALYSIS
NE7012- SOCIAL NETWORK ANALYSIS
 
Workshop summary software assurance and trust
Workshop summary software assurance and trustWorkshop summary software assurance and trust
Workshop summary software assurance and trust
 
Content Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional ApproachContent Modelling for Human Action Detection via Multidimensional Approach
Content Modelling for Human Action Detection via Multidimensional Approach
 
Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...
Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...
Epistenet: Facilitating Programmatic Access & Processing of Semantically Rela...
 
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia CommunitiesIEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
IEEE ISM 2008: Kalman Graffi: A Distributed Platform for Multimedia Communities
 
Gettingstartedwithdigitalcollectionsweb[1]
Gettingstartedwithdigitalcollectionsweb[1]Gettingstartedwithdigitalcollectionsweb[1]
Gettingstartedwithdigitalcollectionsweb[1]
 
Facilitating Dialogue - Using Semantic Web Technology for eParticipation
Facilitating Dialogue - Using Semantic Web Technology for eParticipationFacilitating Dialogue - Using Semantic Web Technology for eParticipation
Facilitating Dialogue - Using Semantic Web Technology for eParticipation
 
Privacy and Cryptographic Security Issues within Mobile Recommender Syste...
Privacy and Cryptographic Security Issues within Mobile     Recommender Syste...Privacy and Cryptographic Security Issues within Mobile     Recommender Syste...
Privacy and Cryptographic Security Issues within Mobile Recommender Syste...
 
CSE509 Lecture 5
CSE509 Lecture 5CSE509 Lecture 5
CSE509 Lecture 5
 

More from Národní technická knihovna (NTK)

Overlooked Principles of Strategic Management of Research at a National Level...
Overlooked Principles of Strategic Management of Research at a National Level...Overlooked Principles of Strategic Management of Research at a National Level...
Overlooked Principles of Strategic Management of Research at a National Level...Národní technická knihovna (NTK)
 
Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...
Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...
Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...Národní technická knihovna (NTK)
 
InCites: Practical Aspects and Effective Use (Evangelia A. E. C. Lipitakis, ...
InCites: Practical Aspects and Effective Use  (Evangelia A. E. C. Lipitakis, ...InCites: Practical Aspects and Effective Use  (Evangelia A. E. C. Lipitakis, ...
InCites: Practical Aspects and Effective Use (Evangelia A. E. C. Lipitakis, ...Národní technická knihovna (NTK)
 
Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...
Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...
Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...Národní technická knihovna (NTK)
 
Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...
Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...
Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...Národní technická knihovna (NTK)
 
Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)
Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)
Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)Národní technická knihovna (NTK)
 
Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....
Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....
Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....Národní technická knihovna (NTK)
 
Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)
 Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)  Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)
Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR) Národní technická knihovna (NTK)
 
Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)
 Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)  Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)
Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK) Národní technická knihovna (NTK)
 
Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...
Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...
Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...Národní technická knihovna (NTK)
 
Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...
Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...
Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...Národní technická knihovna (NTK)
 
Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)
Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)
Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)Národní technická knihovna (NTK)
 

More from Národní technická knihovna (NTK) (20)

Overlooked Principles of Strategic Management of Research at a National Level...
Overlooked Principles of Strategic Management of Research at a National Level...Overlooked Principles of Strategic Management of Research at a National Level...
Overlooked Principles of Strategic Management of Research at a National Level...
 
Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...
Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...
Využití bibliometrických ukazatelů v řízení výzkumné instituce (Daniel Münich...
 
InCites: Practical Aspects and Effective Use (Evangelia A. E. C. Lipitakis, ...
InCites: Practical Aspects and Effective Use  (Evangelia A. E. C. Lipitakis, ...InCites: Practical Aspects and Effective Use  (Evangelia A. E. C. Lipitakis, ...
InCites: Practical Aspects and Effective Use (Evangelia A. E. C. Lipitakis, ...
 
Zkušenosti Knihovny Akademie věd ČR (Pavel Míka, AV ČR)
Zkušenosti Knihovny Akademie věd ČR (Pavel Míka, AV ČR)Zkušenosti Knihovny Akademie věd ČR (Pavel Míka, AV ČR)
Zkušenosti Knihovny Akademie věd ČR (Pavel Míka, AV ČR)
 
Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...
Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...
Bibliometrie v Národní technické knihovně: metody, zkušenosti, mise a vize (J...
 
Bibliometrie: přínosy, úskalí (Jiří Jirát, VŠCHT)
Bibliometrie: přínosy, úskalí (Jiří Jirát, VŠCHT)Bibliometrie: přínosy, úskalí (Jiří Jirát, VŠCHT)
Bibliometrie: přínosy, úskalí (Jiří Jirát, VŠCHT)
 
Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...
Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...
Význam indikátorů v institucionálním hodnocení a financování (Jitka Moravcová...
 
Rozhraní VPK
Rozhraní VPKRozhraní VPK
Rozhraní VPK
 
Šmankote, co je to NUŠL? (aktualizovaná verze 2014)
Šmankote, co je to NUŠL? (aktualizovaná verze 2014)Šmankote, co je to NUŠL? (aktualizovaná verze 2014)
Šmankote, co je to NUŠL? (aktualizovaná verze 2014)
 
Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)
Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)
Rešeršní služby v komerčním sektoru (Martin Mlčoch, nezávislý konzultant)
 
Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....
Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....
Speciální informační služby pro zdravotníky v Národní lékařské knihovně (Mgr....
 
Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)
 Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)  Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)
Rešeršní služby v NK ČR (Mgr. Karolína Košťálová, NK ČR)
 
Legislativní rámec rešerší (Mgr. Alena Pavelová, NTK)
Legislativní rámec rešerší (Mgr. Alena Pavelová, NTK) Legislativní rámec rešerší (Mgr. Alena Pavelová, NTK)
Legislativní rámec rešerší (Mgr. Alena Pavelová, NTK)
 
Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)
 Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)  Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)
Model rešeršních služeb v NTK (Bc. Drahomíra Dvořáková, NTK)
 
Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...
Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...
Rešeršní služby Bibliografie dějin Českých zemí v Historickém ústavu AV ČR (M...
 
Novinky ve vyhledávání Seznam .cz (Otakar Smrž)
Novinky ve vyhledávání Seznam .cz (Otakar Smrž)Novinky ve vyhledávání Seznam .cz (Otakar Smrž)
Novinky ve vyhledávání Seznam .cz (Otakar Smrž)
 
Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...
Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...
Co znamená, že Google o nás ví víc než my sami; aneb zaprodáme duši vyhledáva...
 
Vyhledávání hudbou: YouTube trochu jinak (Ondřej Voců)
Vyhledávání hudbou: YouTube trochu jinak (Ondřej Voců)Vyhledávání hudbou: YouTube trochu jinak (Ondřej Voců)
Vyhledávání hudbou: YouTube trochu jinak (Ondřej Voců)
 
Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)
Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)
Co se skrývá za vyhledáváním v katalogu NTK (Kristýna Busch, Eliška Veselá)
 
Kouzlo muzejní noci
Kouzlo muzejní nociKouzlo muzejní noci
Kouzlo muzejní noci
 

Recently uploaded

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfAddepto
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningLars Bell
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESmohitsingh558521
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxLoriGlavin3
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 

Recently uploaded (20)

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Podobnostní hledání v netextových datech (Pavel Zezula)

  • 1. Similarity Search inNon-text Data Pavel Zezula Faculty of Informatics Masaryk University, Brno 4.10.2011 Searching Session NTK 2011
  • 2. Real-Life MotivationThe social psychology view Any event in the history of organism is, in a sense, unique. Recognition, learning, and judgment presuppose an ability to categorize stimuli and classify situations by similarity. Similarity (proximity, resemblance, communality, representativeness, psychologicaldistance, etc.) is fundamental to theories of perception, learning, judgment, etc. Searching Session NTK 2011 4.10.2011
  • 3. Contemporary Networked MediaThe digital data view Almost everything that we see, read, hear, write, measure, or observe can be digital. Users autonomouslycontribute to production of global media and the growth is exponential. Sites like Flickr, YouTube, Facebook host user contributed content for a variety of events. The elements of networked media are related by numerous multi-facet links of similarity. 4.10.2011 Searching Session NTK 2011
  • 4. Examples with Similarity Does the computer disk of a suspected criminal contain illegal multimedia material? What are the stocks with similar price histories? Which companies advertise their logos in the direct TV transmission of football match? Is it the situation on the web getting close to any of the network attacks which resulted in significant damage in the past? 4.10.2011 Searching Session NTK 2011
  • 5. Challenge Networked media is getting close to the human “fact-bases”. Similaritydatamanagement is needed to connect, search, filter, merge, relate, rank, cluster, classify, identify, or categorize objects across various collections. WHY? It is the similarity which is in the world revealing. 4.10.2011 Searching Session NTK 2011
  • 6. Limitations:Data Types We have Attributes Numbers, strings, etc. Text (text-based) Documents, annotations We need Multimedia Image, video, audio Security Biometrics Medicine EKG, EEG, EMG, EMR, CT, etc. Scientific data Biology, chemistry, physics, life sciences, economics Others Motion, emotion, events, etc. 4.10.2011 Searching Session NTK 2011
  • 7. Limitations:Models of Similarity We have Simple geometric models, typically vector spaces We need More complex model Non metric models Asymmetric similarity Subjective similarity Context aware similarity Complex similarity Etc. 4.10.2011 Searching Session NTK 2011
  • 8. Limitations:Queries We have Simple query Nearest neighbor Range We need More query types Reverse NN, distinct NN, similarity join Other similarity-based operations Filtering, classification, event detection, clustering, etc. Similarity algebra May become the basis of a “Similarity Data Management System” 4.10.2011 Searching Session NTK 2011
  • 9. Limitations:Implementation Strategies We have Centralized or parallel processing We need Scalable and distributed architectures MapReduce like approaches P2P architectures Cloud computing Self-organized architectures Etc. 4.10.2011 Searching Session NTK 2011
  • 10.
  • 11. number of users (queries)
  • 13. multi-lingual, -feature –modal querieswell established cutting-edge research high Determinism exact match ► similarity precise ► approximate same answer ► good answer; recommendation fixed query ► personalized; context aware fixed infrastr. ► dynamic mapping; mobile dev. peer-to-peer centralized parallel self-organized distributed grade low
  • 14. Word Cloud of Applications 4.10.2011 Searching Session NTK 2011
  • 15. Metric Search Grows in Popularity Hanan Samet Foundation of Multidimensional and Metric Data Structures Morgan Kaufmann, 2006 P. Zezula, G. Amato, V. Dohnal, and M. Batko Similarity Search: The Metric Space Approach Springer, 2006 4.10.2011 Searching Session NTK 2011
  • 16. The MUFIN Approach MUFIN: MUlti-Feature Indexing Network SEARCH data & queries index structure infrastructure Scalability P2P structure Extensibility metric space Tuning of performance Internet / GRID / LAN network independence 4.10.2011 Searching Session NTK 2011
  • 17. Metric Space an Abstraction of Similarity Metric space:M = (D,d) D– domain distance function d(x,y) x,y,z  D d(x,y) > 0 - non-negativity d(x,y) = 0 x = y - identity d(x,y) = d(y,x) - symmetry d(x,y)≤ d(x,z)+ d(z,y) - triangle inequality 4.10.2011 Searching Session NTK 2011
  • 18. Peer-to-Peer Indexing Native metric techniques: GHT*, VPT* Transformation techniques: M-CAN, M-Chord 4.10.2011 Searching Session NTK 2011
  • 19. Image search Image base similar? 4.10.2011 Searching Session NTK 2011
  • 20. Images and their Descriptors Image level B Descriptor level R G 4.10.2011 Searching Session NTK 2011
  • 21.
  • 23. Five MPEG-7 VDs: Scalable Color, Color Structure, Color Layout, Edge Histogram, Homogeneous Texture
  • 24. Other textual information: title, tags, comments, etc.
  • 25. Photos have been crawled from the Flickr photo-sharing site.CoPhIR: Content-based PhotoImage Retrieval 4.10.2011 Searching Session NTK 2011
  • 26. MUFIN SEARCH ENGINE data & queries index structure infrastructure 6 x IBM server x3400 Scalability M-Chord + M-Tree Extensibility COPHIR color structure scalable color color layout edge histogram homogeneous texture Image Search Demohttp://mufin.fi.muni.cz/imgsearch/ 4.10.2011 Searching Session NTK 2011
  • 28. Current Research Activities Image Query Postprocessing Sub-image Searching Remote Biometrics Event Detection in Video Signal Processing 4.10.2011 Searching Session NTK 2011
  • 29. Query Postprocessing The understanding of similarity is: subjective context-dependent multi-modal Semantic gap Overcoming semantic gap by combining aspects semantics-learning result postprocessing relevance feedback & iterative search Our objectives Large general data collections with various quality of metadata Online searching response times 4.10.2011 Searching Session NTK 2011
  • 30.
  • 31.
  • 32. Sub-image Searching Retrieves all images containing the query image Based on local image descriptors Scale Invariant Feature Transform (SIFT): Descriptor – content of a small neighborhood Locator – coordinates of the neighborhood Scale – importance of the descriptor Image  a set of features, descriptors Task: Find matching pairs (similar features) 4.10.2011 Searching Session NTK 2011 Query Answer:
  • 33. Remote Biometrics: Motivation Most biometrics require the subject’s cooperation Fingerprint, iris, palmprint, handwriting, voice recognition Challenge – recognizing people at a distance Capture devices do not require a close contact with the subject (e.g., surveillance cameras) It can be applied unobtrusively Face and gait recognition at a distance Problems – camera view, lighting, pose Applications – surveillance, security 4.10.2011 Searching Session NTK 2011
  • 34. Remote Biometrics: Approaches Detection, normalization, extraction, recognition Face recognition Methods: Appearance-based – analyze the face as a whole Model-based – compare individual features (e.g., eyes, mouth) MUFIN face recognition demo: http://mufin.fi.muni.cz/faces-feret/ Gait recognition Less likely to be obscured, low resolution suffices Methods are based on shape or dynamics of the person: Appearance-based – analyze person’s silhouettes Model-based – compare features (e.g., trajectory, angular velocity) 4.10.2011 Searching Session NTK 2011
  • 35. Event Detection in Video Video continuous data several aspects image, sound, text, motion, temporal Event defined aspects occurring in given time interval definition of a sample aspect by example or value definition is imprecise – looking for “similar” aspects combination of aspects aggregation function Current approaches annotation-based, learning-based (classifiers) specific domains Example TV news (by image) AND about IRAQ (by text) AND burning vehicles (by image) AND time interval < 1 minute (by temporal) 4.10.2011 Searching Session NTK 2011
  • 36. Signal Processing Vast amount of signals produced: Biomedicine data – ECG, CT Biometric data – personal identification Audio data – audio similarity, recognition Sub-image searching Financial time series – analysis, forecasting Time series streams Demand for a graceful handling of this data flexible reactions to new application needs 4.10.2011 Searching Session NTK 2011
  • 37. Flexible Subsequence Matching Generic engine for rapid development of subsequence matching applications can be used for any class of one-dimensional signals Implementation of various subsequence matching approaches Demo web application User Application Subsequence Matching Layer 4.10.2011 Searching Session NTK 2011
  • 38. Demo application 4.10.2011 Searching Session NTK 2011
  • 39. Face Retrieval Application 10,000 images with people 14,000 faces Face detection – MPEG7 4.10.2011 Searching Session NTK 2011