Oge Marques
Florida Atlantic University
     Boca Raton, FL - USA
    “Image search and retrieval” is not a single problem,
     but rather a collection of related problems that
     look like one.

    10 years after “the end of the early years”,
     research in image search and retrieval still has
     many open problems, challenges, and
     opportunities.
    This is a highly interdisciplinary field, but …

    [Diagram: Visual Information Retrieval at the center, surrounded by
     related fields: Image and Video Processing; (Multimedia) Database
     Systems; Information Retrieval; Machine Learning; Computer Vision;
     Data Mining; Visual data modeling and representation; Human Visual
     Perception]
    There are many things that I believe…
    … but cannot prove
The “big mismatch”
    It’s been 10 years since the “end of the early
     years” [Smeulders et al., 2000]




     ◦  Are the challenges from 2000 still relevant?
     ◦  Are the directions and guidelines from 2000 still
        appropriate?
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Driving forces
        “[…] content-based image retrieval (CBIR) will continue
         to grow in every direction: new audiences, new
         purposes, new styles of use, new modes of interaction,
         larger data sets, and new methods to solve the
         problems.”
    Yes, we have seen many new audiences, new
     purposes, new styles of use, and new modes
     of interaction emerge.

    Each of these usually requires new methods
     to solve the problems that they bring.

    However, not too many researchers see them
     as a driving force (as they should).
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Heritage of computer vision
        “An important obstacle to overcome […] is to realize
         that image retrieval does not entail solving the general
         image understanding problem.”
    I’m afraid I have bad news…
     ◦  Computer vision hasn’t made that much progress
        during the past 10 years.

     ◦  Some classical problems (including image
        understanding) remain unresolved.

     ◦  Similarly, CBIR from a pure computer vision
        perspective didn’t work too well either.
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Influence on computer vision
        “[…] CBIR offers a different look at traditional computer
         vision problems: large data sets, no reliance on strong
         segmentation, and revitalized interest in color image
         processing and invariance.”
    The adoption of large data sets became standard
     practice in computer vision (see Torralba’s work).
    No reliance on strong segmentation (still
     unresolved) led to new areas of research, e.g.,
     automatic region-of-interest (ROI) extraction and
     region-based image retrieval (RBIR).
    Color image processing and color descriptors
     became incredibly popular, useful, and (to some
     degree) effective.
    Invariance is still a huge problem
     ◦  But it’s cheaper than ever to have multiple views.
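As a rough illustration of the color descriptors mentioned above, the sketch below builds a coarse joint RGB histogram and compares two images by histogram intersection. This is one common color descriptor, not necessarily the one the talk had in mind; the toy pixel lists, the bin count, and the function names are assumptions made for the example.

```python
def color_histogram(pixels, bins_per_channel=4):
    """Normalized joint RGB histogram; pixels are (r, g, b) tuples in 0..255."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Map each 0..255 channel value to a bin index in 0..bins_per_channel-1.
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1.0
    total = len(pixels)
    return [h / total for h in hist] if total else hist


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Toy "images" given as flat pixel lists (hypothetical data).
reddish = [(250, 10, 10)] * 3 + [(10, 10, 250)]
also_reddish = [(240, 20, 5)] * 3 + [(20, 5, 240)]
greenish = [(10, 250, 10)] * 4

sim_close = histogram_intersection(color_histogram(reddish), color_histogram(also_reddish))
sim_far = histogram_intersection(color_histogram(reddish), color_histogram(greenish))
```

Note that a global histogram ignores where the colors appear, so it tolerates rotation and translation but says nothing about layout or semantics.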
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Similarity and learning
        “We make a pledge for the importance of human-based
         similarity rather than general similarity. Also,
         the connection between image semantics, image data,
         and query context will have to be made clearer in the
         future.”
        “[…] in order to bring semantics to the user, learning is
         inevitable.”
    Similarity is a tough problem to crack and
     model.

    See for yourself…
    Are these two images similar?
    Are these two images similar?
    Is the second or the third image more similar
     to the first?
    Which image fits better with the first two: the
     third or the fourth?
    Is learning really inevitable?

    Maybe, maybe not, but it sure comes in handy
     in some specific cases…
     ◦  SVM anyone?
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Interaction
        Better visualization options, more control to the user,
         ability to provide feedback […]
    Significant progress on visualization
     interfaces and devices.

    Relevance Feedback: still a very tricky
     tradeoff (effort vs. perceived benefit), but
     more popular than ever (rating, thumbs up/down,
     etc.)
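One classic way to turn thumbs up/down feedback into a better query is a Rocchio-style update, sketched below. The slides do not name a specific relevance feedback method, so this is only an illustrative choice; the feature vectors and the weights alpha, beta, and gamma are assumptions.

```python
def rocchio_update(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query vector toward relevant examples and away from non-relevant ones."""
    def centroid(vectors):
        if not vectors:
            return [0.0] * len(query)
        return [sum(column) / len(vectors) for column in zip(*vectors)]

    pos = centroid(relevant)
    neg = centroid(non_relevant)
    return [alpha * q + beta * p - gamma * n for q, p, n in zip(query, pos, neg)]


# Hypothetical 2-D feature vectors for the query and the rated results.
query = [0.5, 0.5]
thumbs_up = [[1.0, 0.0], [0.8, 0.2]]
thumbs_down = [[0.0, 1.0]]
new_query = rocchio_update(query, thumbs_up, thumbs_down)
```

Each rating costs the user effort, so the tradeoff noted above still applies: the update only helps if a few ratings move the query enough to be noticed.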
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Need for databases
        “The connection between CBIR and database research
         is likely to increase in the future. […] problems like the
         definition of suitable query languages, efficient search
         in high dimensional feature space, search in the
         presence of changing similarity measures are largely
         unsolved […]”
    Very little progress
     ◦  Image search and retrieval has benefited much
        more from document information retrieval than
        from database research.
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  The problem of evaluation
        CBIR could use a reference standard against which new
         algorithms could be evaluated (similar to TREC in the
         field of text retrieval).
        “A comprehensive and publicly available collection of
         images, sorted by class and retrieval purposes,
         together with a protocol to standardize experimental
         practices, will be instrumental in the next phase of
         CBIR.”
    Significant progress on benchmarks,
     standardized datasets, etc.
     ◦  ImageCLEF
     ◦  PASCAL VOC Challenge
     ◦  MSRA dataset
     ◦  SIMPLIcity dataset
     ◦  UCID dataset and ground truth (GT)
     ◦  Accio / SIVAL dataset and GT
     ◦  Caltech 101, Caltech 256
     ◦  LabelMe
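With a dataset and ground truth in hand, retrieval quality is typically summarized with ranking metrics. The sketch below computes precision at k and average precision for one query; the image identifiers and the ground-truth set are made up for the example, and individual benchmarks may define their official measures differently.

```python
def precision_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the top-k results that are relevant."""
    if k <= 0:
        return 0.0
    return sum(1 for image_id in ranked_ids[:k] if image_id in relevant_ids) / k


def average_precision(ranked_ids, relevant_ids):
    """Mean of the precision values at each rank where a relevant image appears."""
    hits, total = 0, 0.0
    for rank, image_id in enumerate(ranked_ids, start=1):
        if image_id in relevant_ids:
            hits += 1
            total += hits / rank
    return total / len(relevant_ids) if relevant_ids else 0.0


# Hypothetical ranking returned by a system, and the ground truth for the query.
ranking = ["img3", "img7", "img1", "img9"]
ground_truth = {"img3", "img1"}

p_at_2 = precision_at_k(ranking, ground_truth, 2)
ap = average_precision(ranking, ground_truth)
```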
    Revisiting the ‘Concluding Remarks’ from
     [Smeulders et al., 2000]:

     ◦  Semantic gap and other sources
        “A critical point in the advancement of CBIR is the
         semantic gap, where the meaning of an image is rarely
         self-evident. […] One way to resolve the semantic gap
         comes from sources outside the image by integrating
         other sources of information about the image in the
         query.”
    The semantic gap problem has not been
     solved (and may never be…)

    What are the alternatives?
     1.  Treat visual similarity and semantic relatedness
         differently
        Examples: ALIPR, Google similarity search, etc.
     2.  Improve both (text-based and visual) search
         methods independently
     3.  Trust the user
        CFIR, collaborative filtering, crowdsourcing, games.
    I postulate that image search and retrieval is
     not a single problem (but, instead, a collection of
     related problems that look like one)

    There are many potential opportunities for
     good solutions to specific problems

    One promising avenue: think about image
     retrieval as added value (e.g., like.com, SPE,
     etc.)
    Google Similarity Search (VisualRank) [Jing &
     Baluja, 2008]
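The core idea of VisualRank, as I understand it, is to run a PageRank-style random walk over a graph whose edge weights are visual similarities between images. The sketch below is a minimal power-iteration version under that assumption; the similarity matrix, damping factor, and iteration count are hypothetical, and the published system computes similarities from local features at a much larger scale.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """PageRank-style scores over a square image-similarity matrix."""
    n = len(similarity)
    # Column sums are used to normalize so each image spreads one unit of "vote".
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            incoming = sum(
                similarity[i][j] / col_sums[j] * rank[j]
                for j in range(n)
                if col_sums[j] > 0
            )
            new_rank.append(damping * incoming + (1.0 - damping) / n)
        rank = new_rank
    return rank


# Hypothetical pairwise similarities: images 0-2 resemble each other, image 3 is an outlier.
sim = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.1],
    [0.1, 0.1, 0.1, 0.0],
]
scores = visual_rank(sim)
best = max(range(len(scores)), key=scores.__getitem__)
```

Images that are similar to many other well-connected images rise to the top, while the outlier receives little more than the uniform teleport share.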



    Google Goggles (mobile visual search)
    Google Goggles understands narrow-domain
     search and retrieval




    Several other apps for iPhone, iPad, and
     Android (e.g., kooaba and Fetch!)
    Web 2.0 has brought about:
     ◦  New data sources
     ◦  New usage patterns
     ◦  New understanding about the users, their needs,
        habits, preferences
     ◦  New opportunities
     ◦  Lots of metadata!

     ◦  A chance to experience a true paradigm shift
        Before: image annotation is tedious, labor-intensive,
         expensive
        After: image annotation is fun!
    Games!
     ◦  Google Image Labeler
     ◦  Games with a purpose (GWAP):
        The ESP Game
        Squigl
        Matchin
    New devices and services…

     ◦  Flickr (b. 2004)
     ◦  YouTube (b. 2005)
     ◦  Flip video cameras (b. 2006)
     ◦  iPhone (b. 2007)
     ◦  iPad (b. 2010)
    New opportunities for narrowing the semantic
     gap
     ◦  From bottom up: (semi-)automatic image
        annotation
     ◦  From top down: using (content / context)
        ontologies
     ◦  Combining top-down and bottom-up

    New fields of research, including:
     ◦  Tag recommendation systems
     ◦  User intentions in image search
    Many opportunities await…
    I believe (but cannot prove…) that successful
     Image Search & Retrieval solutions will:
     •  combine content-based image retrieval (CBIR) with
        metadata (high-level semantic-based image
        retrieval)
     •  only be truly successful in narrow domains
     •  include the user in the loop
      –  Relevance Feedback (RF)
      –  Collaborative efforts (tagging, rating, annotating)
     •  provide friendly, intuitive interfaces
     •  incorporate results and insights from cognitive
        science, particularly human visual attention,
        perception, and memory
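To make the first point concrete, one simple way to combine content-based similarity with metadata is late fusion: score each candidate visually and textually, then blend the two. The sketch below assumes a precomputed visual similarity in [0, 1], uses Jaccard overlap on tags as the metadata score, and picks an arbitrary 50/50 weight; all names and values are illustrative assumptions.

```python
def jaccard(tags_a, tags_b):
    """Tag overlap in [0, 1]."""
    a, b = set(tags_a), set(tags_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def fused_score(visual_similarity, query_tags, image_tags, weight_visual=0.5):
    """Weighted blend of a visual similarity score and a tag-based score."""
    text_similarity = jaccard(query_tags, image_tags)
    return weight_visual * visual_similarity + (1.0 - weight_visual) * text_similarity


# Hypothetical candidates: (file name, visual similarity to the query, user tags).
candidates = [
    ("beach_sunset.jpg", 0.80, ["beach", "sunset"]),
    ("orange_juice.jpg", 0.85, ["drink", "orange"]),
]
query_tags = ["sunset", "beach", "sea"]

ranked = sorted(
    candidates,
    key=lambda c: fused_score(c[1], query_tags, c[2]),
    reverse=True,
)
```

Here the visually closest image (an orange-colored drink) loses to the sunset photo once tags are taken into account, which is the kind of semantic-gap error metadata can help correct.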
Questions?




             omarques@fau.edu

Oge Marques (FAU) - invited talk at WISMA 2010 (Barcelona, May 2010)

 

Dernier (20)

The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptxThe Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
The Role of FIDO in a Cyber Secure Netherlands: FIDO Paris Seminar.pptx
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICESSALESFORCE EDUCATION CLOUD | FEXLE SERVICES
SALESFORCE EDUCATION CLOUD | FEXLE SERVICES
 
What is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdfWhat is DBT - The Ultimate Data Build Tool.pdf
What is DBT - The Ultimate Data Build Tool.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptxPasskey Providers and Enabling Portability: FIDO Paris Seminar.pptx
Passkey Providers and Enabling Portability: FIDO Paris Seminar.pptx
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine TuningDSPy a system for AI to Write Prompts and Do Fine Tuning
DSPy a system for AI to Write Prompts and Do Fine Tuning
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 

Oge Marques (FAU) - invited talk at WISMA 2010 (Barcelona, May 2010)

  • 1. Oge Marques Florida Atlantic University Boca Raton, FL - USA
  • 2.   “Image search and retrieval” is not a problem, but rather a collection of related problems that look like one.   10 years after “the end of the early years”, research in image search and retrieval still has many open problems, challenges, and opportunities.
  • 3.   This is a highly interdisciplinary field, but … [Diagram: Visual Information Retrieval at the center, surrounded by Image and Video Processing, (Multimedia) Database Systems, Information Retrieval, Machine Learning, Computer Vision, Data Mining, Visual data modeling and representation, and Human Visual Perception]
  • 4.   There are many things that I believe…   … but cannot prove
  • 6.   It’s been 10 years since the “end of the early years” [Smeulders et al., 2000] ◦  Are the challenges from 2000 still relevant? ◦  Are the directions and guidelines from 2000 still appropriate?
  • 7.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Driving forces   “[…] content-based image retrieval (CBIR) will continue to grow in every direction: new audiences, new purposes, new styles of use, new modes of interaction, larger data sets, and new methods to solve the problems.”
  • 8.   Yes, we have seen many new audiences, new purposes, new styles of use, and new modes of interaction emerge.   Each of these usually requires new methods to solve the problems that they bring.   However, not too many researchers see them as a driving force (as they should).
  • 9.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Heritage of computer vision   “An important obstacle to overcome […] is to realize that image retrieval does not entail solving the general image understanding problem.”
  • 10.   I’m afraid I have bad news… ◦  Computer vision hasn’t made so much progress during the past 10 years. ◦  Some classical problems (including image understanding) remain unresolved. ◦  Similarly, CBIR from a pure computer vision perspective didn’t work too well either.
  • 11.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Influence on computer vision   “[…] CBIR offers a different look at traditional computer vision problems: large data sets, no reliance on strong segmentation, and revitalized interest in color image processing and invariance.”
  • 12.   The adoption of large data sets became standard practice in computer vision (see Torralba’s work).   No reliance on strong segmentation (still unresolved) has led to new areas of research, e.g., automatic ROI extraction and RBIR.   Color image processing and color descriptors became incredibly popular, useful, and (to some degree) effective.   Invariance is still a huge problem ◦  But it’s cheaper than ever to have multiple views.
  • 13.
  • 14.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Similarity and learning   “We make a pledge for the importance of human-based similarity rather than general similarity. Also, the connection between image semantics, image data, and query context will have to be made clearer in the future.”   “[…] in order to bring semantics to the user, learning is inevitable.”
  • 15.   Similarity is a tough problem to crack and model.   See it for yourself…
  • 16.   Are these two images similar?
  • 17.   Are these two images similar?
  • 18.   Is the second or the third image more similar to the first?
  • 19.   Which image fits better to the first two: the third or the fourth?
  • 20.   Is learning really inevitable?   Maybe, maybe not, but it sure comes in handy in some specific cases… ◦  SVM anyone?
  • 21.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Interaction   Better visualization options, more control to the user, ability to provide feedback […]
  • 22.   Significant progress on visualization interfaces and devices.   Relevance Feedback: still a very tricky tradeoff (effort vs. perceived benefit), but more popular than ever (rating, thumbs up/down, etc.)
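As a rough illustration of how thumbs up/down feedback can be folded back into a query, here is a minimal sketch of a Rocchio-style update. Rocchio is a classical relevance feedback technique from text retrieval and is not named in the talk; the function name, the two-dimensional feature vectors, and the weights alpha, beta, and gamma are all assumptions chosen for illustration.

```python
def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move a query feature vector toward images the user marked as
    relevant and away from images marked as non-relevant.

    All vectors are plain lists of floats with the same length.
    The weights are illustrative defaults, not tuned values.
    """
    dims = len(query)

    def centroid(vectors):
        # Mean vector of a (possibly empty) list of feature vectors.
        if not vectors:
            return [0.0] * dims
        return [sum(v[i] for v in vectors) / len(vectors)
                for i in range(dims)]

    pos = centroid(relevant)
    neg = centroid(non_relevant)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


# Hypothetical 2-D features: one thumbs-up image and one thumbs-down image.
new_query = rocchio_update([0.5, 0.5],
                           relevant=[[1.0, 0.0]],
                           non_relevant=[[0.0, 1.0]])
```

The updated query is then used to re-rank the collection. The effort vs. perceived benefit tradeoff shows up in how many marked examples are needed before the re-ranking visibly improves.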
  • 23.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Need for databases   “The connection between CBIR and database research is likely to increase in the future. […] problems like the definition of suitable query languages, efficient search in high dimensional feature space, search in the presence of changing similarity measures are largely unsolved […]”
  • 24.   Very little progress ◦  Image search and retrieval has benefited much more from document information retrieval than from database research.
  • 25.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  The problem of evaluation   CBIR could use a reference standard against which new algorithms could be evaluated (similar to TREC in the field of text retrieval).   “A comprehensive and publicly available collection of images, sorted by class and retrieval purposes, together with a protocol to standardize experimental practices, will be instrumental in the next phase of CBIR.”
  • 26.   Significant progress on benchmarks, standardized datasets, etc. ◦  ImageCLEF ◦  PASCAL VOC Challenge ◦  MSRA dataset ◦  SIMPLIcity dataset ◦  UCID dataset and ground truth (GT) ◦  Accio / SIVAL dataset and GT ◦  Caltech 101, Caltech 256 ◦  LabelMe
  • 27.   Revisiting the ‘Concluding Remarks’ from [Smeulders et al., 2000]: ◦  Semantic gap and other sources   “A critical point in the advancement of CBIR is the semantic gap, where the meaning of an image is rarely self-evident. […] One way to resolve the semantic gap comes from sources outside the image by integrating other sources of information about the image in the query.”
  • 28.   The semantic gap problem has not been solved (and maybe will never be…)   What are the alternatives? 1.  Treat visual similarity and semantic relatedness differently   Examples: Alipr, Google similarity search, etc. 2.  Improve both (text-based and visual) search methods independently 3.  Trust the user   CFIR, collaborative filtering, crowdsourcing, games.
  • 29.   I postulate that image search and retrieval is not a problem (but, instead, a collection of related problems that look like one)   There are many potential opportunities for good solutions to specific problems   One promising avenue: think about image retrieval as added value (e.g., like.com, SPE, etc.)
  • 30.   Google Similarity Search (VisualRank) [Jing & Baluja, 2008]   Google Goggles (mobile visual search)
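VisualRank [Jing & Baluja, 2008] applies the PageRank idea to a graph whose edges are weighted by visual similarity between images. The following is a toy sketch of that idea only, not the published implementation: the similarity matrix is invented, and the damping factor and iteration count are assumed defaults.

```python
def visual_rank(similarity, damping=0.85, iterations=50):
    """Power iteration on a column-normalized image similarity matrix.

    similarity[i][j] is an assumed non-negative visual similarity
    between images i and j. Returns one score per image.
    """
    n = len(similarity)
    # Column sums, so each image spreads its score over its neighbours.
    col_sums = [sum(similarity[i][j] for i in range(n)) for j in range(n)]
    rank = [1.0 / n] * n
    for _ in range(iterations):
        new_rank = []
        for i in range(n):
            incoming = sum(similarity[i][j] / col_sums[j] * rank[j]
                           for j in range(n) if col_sums[j] > 0)
            new_rank.append(damping * incoming + (1.0 - damping) / n)
        rank = new_rank
    return rank


# Hypothetical similarities: image 0 resembles both other images,
# while images 1 and 2 barely resemble each other.
toy_similarity = [
    [0.0, 0.9, 0.8],
    [0.9, 0.0, 0.1],
    [0.8, 0.1, 0.0],
]
scores = visual_rank(toy_similarity)
```

In this toy graph the image that is similar to many others collects the highest score, which is the intuition behind ranking search results by visual "authority".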
  • 31.   Google Goggles understands narrow-domain search and retrieval   Several other apps for iPhone, iPad, and Android (e.g., kooaba and Fetch!)
  • 32.   The Web 2.0 has brought about: ◦  New data sources ◦  New usage patterns ◦  New understanding about the users, their needs, habits, preferences ◦  New opportunities ◦  Lots of metadata! ◦  A chance to experience a true paradigm shift   Before: image annotation is tedious, labor-intensive, expensive   After: image annotation is fun!
  • 33.   Games! ◦  Google Image Labeler ◦  Games with a purpose (GWAP):   The ESP Game   Squigl   Matchin
  • 34.   New devices and services… ◦  Flickr (b. 2004) ◦  YouTube (b. 2005) ◦  Flip video cameras (b. 2006) ◦  iPhone (b. 2007) ◦  iPad (b. 2010)
  • 35.   New opportunities for narrowing the semantic gap ◦  From bottom up: (semi-)automatic image annotation ◦  From top down: using (content / context) ontologies ◦  Combining top-down and bottom-up   New fields of research, including: ◦  Tag recommendation systems ◦  User intentions in image search
  • 36.   Many opportunities await…
  • 37. –  I believe (but cannot prove…) that successful Image Search & Retrieval solutions will: •  combine content-based image retrieval (CBIR) with metadata (high-level semantic-based image retrieval) •  only be truly successful in narrow domains •  include the user in the loop –  Relevance Feedback (RF) –  Collaborative efforts (tagging, rating, annotating) •  provide friendly, intuitive interfaces •  incorporate results and insights from cognitive science, particularly human visual attention, perception, and memory
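To make the first point concrete, here is a minimal sketch of combining a content-based score with a metadata score. It assumes normalized color histograms compared by histogram intersection and user tags compared by Jaccard overlap, fused with a fixed weight. The data, the field names "hist" and "tags", and the 50/50 weighting are hypothetical and not taken from the talk.

```python
def histogram_intersection(h1, h2):
    """Similarity of two normalized color histograms, in [0, 1]."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def tag_jaccard(tags1, tags2):
    """Overlap between two tag collections, in [0, 1]."""
    a, b = set(tags1), set(tags2)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def fused_score(query, image, weight_visual=0.5):
    """Weighted sum of visual (CBIR) and metadata (tag) similarity."""
    visual = histogram_intersection(query["hist"], image["hist"])
    textual = tag_jaccard(query["tags"], image["tags"])
    return weight_visual * visual + (1.0 - weight_visual) * textual


# Hypothetical query and two candidate images.
query = {"hist": [0.5, 0.3, 0.2], "tags": ["beach", "sunset"]}
collection = {
    "a": {"hist": [0.5, 0.3, 0.2], "tags": ["city"]},
    "b": {"hist": [0.4, 0.4, 0.2], "tags": ["beach", "sunset", "sea"]},
}
ranked = sorted(collection,
                key=lambda name: fused_score(query, collection[name]),
                reverse=True)
```

Here image "a" is a perfect visual match with unrelated tags, while image "b" is a close visual match with overlapping tags, so the fused score places "b" first. A fixed weight is the simplest possible choice; in a narrow domain it could be tuned or adjusted from user feedback.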
  • 38. Questions? omarques@fau.edu