Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine

@shawnmjones
Managed by Triad National Security, LLC, for the U.S. Department of Energy’s NNSA.
Shawn M. Jones & Diane Oyen
Information Sciences (CCS-3)
2022/10/24
LA-UR-XXXXXX

There are few computer vision research papers focused on querying and retrieving abstract, technical drawings
• Technical documents typically contain abstract images
• Many reasons exist to search for abstract images online:
  • protect intellectual property
  • build datasets
  • find evidence for legal cases
  • establish scholarly evidence
  • justify funding through image reuse
Image sources:
• https://commons.wikimedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
• https://commons.wikimedia.org/wiki/File:Carriage-house-2.jpg
• https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png

Major search engines now support reverse image search
Baidu, Bing, Google, Yandex
Screenshot sources:
• https://image.baidu.com
• https://www.bing.com/
• https://images.google.com
• https://yandex.com/images

With each service, a user can upload an image and receive different types of results
• the uploaded query image
• pages-with results
• similar-to results
Uploaded image source: https://commons.wikimedia.org/wiki/File:Adams_The_Tetons_and_the_Snake_River.jpg
Screenshot from: https://www.bing.com

Research Question
When using the reverse image search capability of general web search engines, are natural images more easily discovered than abstract images?

To collect query images, we submitted terms to Wikimedia Commons’ API
• abstract images: “diagram”, “schematic”
• natural images: “photo”, “photograph”
• images collected per term: 100, 100, 100, and 99
Previous studies have shown that Wikipedia content has high retrievability.
Image sources:
• https://commons.wikimedia.org/wiki/File:Galileo_Diagram.jpg
• https://commons.wikimedia.org/wiki/File:Complete_neuron_cell_diagram_en.svg
• https://commons.wikimedia.org/wiki/File:Bicycle_diagram-es.svg
• https://commons.wikimedia.org/wiki/File:Systems_Engineering_V_diagram.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Hvdc_bipolar_schematic.svg
• https://commons.wikimedia.org/wiki/File:Beve_gear_schematic.png
• https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
• https://commons.wikimedia.org/wiki/File:Carriage-house-2.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
• https://commons.wikimedia.org/wiki/File:Frank_W._Micklethwaite_photo_of_downtown_Toronto,_1890_-2.jpg
• https://commons.wikimedia.org/wiki/File:James_Abram_Garfield,_photo_portrait_seated.jpg
• https://commons.wikimedia.org/wiki/File:Wtc-photo.jpg
Image sources:
• https://commons.wikimedia.org/wiki/File:Adams_The_Tetons_and_the_Snake_River.jpg
• https://commons.wikimedia.org/wiki/File:Photographing_sunrise_1745.jpg
• https://commons.wikimedia.org/wiki/File:FEMA_-_5399_-_Photograph_by_Andrea_Booher_taken_on_09-28-2001_in_New_York.jpg
• https://commons.wikimedia.org/wiki/File:Photographing_a_model.jpg
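
The slides do not say which API endpoint or parameters were used to collect the query images. As a rough sketch, the collection step could look like the following, assuming the standard MediaWiki search API on Wikimedia Commons restricted to the File: namespace. The function names, the `SEARCH_TERMS` mapping, and the User-Agent string are hypothetical and chosen only for illustration.

```python
import json
import urllib.parse
import urllib.request

# Public MediaWiki API endpoint for Wikimedia Commons.
COMMONS_API = "https://commons.wikimedia.org/w/api.php"

# Search terms from the slide, grouped by image category.
SEARCH_TERMS = {
    "abstract": ["diagram", "schematic"],
    "natural": ["photo", "photograph"],
}


def build_search_url(term, limit=100):
    """Build a MediaWiki search URL restricted to File: pages."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srnamespace": 6,  # namespace 6 holds File: pages (media)
        "srlimit": limit,  # assumption: about 100 images per term
        "format": "json",
    }
    return COMMONS_API + "?" + urllib.parse.urlencode(params)


def fetch_file_titles(term, limit=100):
    """Return File: page titles matching a term (requires network access)."""
    request = urllib.request.Request(
        build_search_url(term, limit),
        headers={"User-Agent": "query-image-collector-sketch/0.1"},
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [hit["title"] for hit in payload["query"]["search"]]
```

Calling `fetch_file_titles(term)` for each term in `SEARCH_TERMS` would yield candidate File: pages per category. The actual study may have filtered or ordered results differently.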

We then submitted the same image to each reverse image search engine, then repeated the process with the next image, and so on...
Image source: https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
Image source: https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Screenshot source:
https://images.google.com
Screenshot source:
https://www.bing.com/
Screenshot source:
https://image.baidu.com
Screenshot source:

Using ImageHash’s pHash and GoFigure’s VisHash, we evaluated how often the same image existed in the results
• pHash was designed to compare photographs via Discrete Cosine Transforms (DCT).
• VisHash was designed to compare diagrams and technical drawings by finding shapes in the image.
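
To illustrate the DCT idea behind pHash, here is a minimal standard-library sketch. It is not ImageHash's implementation. It assumes the image has already been reduced to a 32×32 grayscale grid, uses an unnormalized DCT-II, keeps the 8×8 lowest-frequency coefficients, and thresholds them at their median to get 64 bits. All function names are hypothetical.

```python
import math
import random
from statistics import median


def dct_low_frequencies(pixels, keep=8):
    """Return the keep x keep lowest-frequency 2D DCT-II coefficients."""
    size = len(pixels)
    # basis[u][x] = cos(pi * (2x + 1) * u / (2 * size)), unnormalized
    basis = [
        [math.cos(math.pi * (2 * x + 1) * u / (2 * size)) for x in range(size)]
        for u in range(keep)
    ]
    coefficients = []
    for v in range(keep):      # vertical frequency
        for u in range(keep):  # horizontal frequency
            total = 0.0
            for y in range(size):
                row = pixels[y]
                for x in range(size):
                    total += row[x] * basis[u][x] * basis[v][y]
            coefficients.append(total)
    return coefficients


def phash_bits(pixels):
    """64-bit perceptual hash: 1 where a coefficient exceeds the median."""
    coefficients = dct_low_frequencies(pixels)
    threshold = median(coefficients)
    return [1 if c > threshold else 0 for c in coefficients]


def hamming_distance(bits_a, bits_b):
    """Number of differing bits; small distances suggest the same image."""
    return sum(a != b for a, b in zip(bits_a, bits_b))


# Toy data: two unrelated pseudo-random "images" (invented for illustration).
rng = random.Random(0)
image_a = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
image_b = [[rng.randrange(256) for _ in range(32)] for _ in range(32)]
print(hamming_distance(phash_bits(image_a), phash_bits(image_a)))  # 0, identical input
print(hamming_distance(phash_bits(image_a), phash_bits(image_b)))
```

A result image would count as "the same" as the query when the distance falls under some threshold. The threshold used in the study is not stated on the slides.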
Uploaded images:
https://commons.wikimedia.org/wiki/File:Manatee_photo.jpg
https://commons.wikimedia.org/wiki/File:Interspiro_DCSC_loop_schematic.png
Screenshots source:

Precision differs based on pages-with or similar-to results, with Yandex performing best
Chart legend: blue = abstract images, green = natural images
Precision@k: What percentage of images in the results are the same as the query image if we stop at k results?
S. M. Jones and D. Oyen. 2022. “Abstract Images Have Different Levels of Retrievability Per Reverse Image Search Engine,” Proceedings of the 2nd Drawings and abstract Imagery: Representation and Analysis (DIRA) Workshop. (Tel Aviv, Israel).
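
Precision@k can be sketched in a few lines. The example assumes each result list has already been reduced to booleans saying whether a result matches the query image (for instance, via a hash comparison). The function name and the toy data are illustrative, not taken from the study.

```python
def precision_at_k(matches, k):
    """Fraction of the first k results that match the query image.

    Missing results (fewer than k returned) count as non-matches,
    which is a design choice of this sketch.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(matches[:k]) / k


# Toy result list: True where the result is the same image as the query.
results = [True, False, True, False, False]
print(precision_at_k(results, 1))  # 1.0
print(precision_at_k(results, 5))  # 0.4
```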

After reviewing 10 pages-with results, Google has a maximum retrievability difference of 54% between images from the photograph and diagram categories
Chart legend: blue = abstract images, green = natural images
Retrievability: Given a query image, was it retrieved within the cutoff c?
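
One way to turn this definition into a number per category is to compute the share of query images found within the cutoff. The sketch below assumes per-query lists of match flags. The function names and the toy data are invented for illustration.

```python
def retrieved_within(matches, cutoff):
    """True if the query image appears among the first `cutoff` results."""
    return any(matches[:cutoff])


def retrievability(all_matches, cutoff):
    """Share of query images retrieved within the cutoff."""
    if not all_matches:
        return 0.0
    hits = sum(retrieved_within(m, cutoff) for m in all_matches)
    return hits / len(all_matches)


# Toy data: one list of match flags per query image.
queries = [[False, True], [False, False, False], [True]]
print(retrievability(queries, 1))  # one of three queries found at rank 1
print(retrievability(queries, 2))  # two of three found within two results
```

Comparing `retrievability` for two categories at the same cutoff gives a per-cutoff difference of the kind reported on the slide.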

For similar-to results, Yandex consistently provides a high MRR (0.8) for natural images
MRR (mean reciprocal rank): How many results, on average, across all queries, must a visitor review before finding the query image again?
Google does well with pages-with results
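
MRR is conventionally computed as the mean of 1/rank of the first matching result, with 0 for queries that never find a match. A minimal sketch, assuming per-query lists of match flags; the function names and the toy data are illustrative.

```python
def reciprocal_rank(matches):
    """1/rank of the first matching result, or 0.0 if none matches."""
    for rank, is_match in enumerate(matches, start=1):
        if is_match:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(all_matches):
    """Average reciprocal rank across all queries."""
    if not all_matches:
        return 0.0
    return sum(reciprocal_rank(m) for m in all_matches) / len(all_matches)


# Toy data: first match at rank 1, rank 2, rank 4, and no results at all.
queries = [[True], [False, True], [False, False, False, True], []]
print(mean_reciprocal_rank(queries))  # 0.4375
```

Under this definition, an MRR near 0.8 means the first matching result usually sits at or close to the top of the list.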

Key Takeaways
• We submitted abstract and natural images from Wikimedia Commons to four major reverse image search engines.
• When they do return results, Bing and Baidu do not perform well.
• Google does not perform well for similar-to results, likely indicating that its definition of similar-to differs from that of other search engines.
• Yandex performs best in all cases.
• Yandex and Google consistently perform better for natural images in pages-with results.
