1. C.BYREGOWDA INSTITUTE OF TECHNOLOGY, KOLAR-563101
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
Technical Seminar on
“TagSense: Approach to automatic image tagging”
Under the guidance of: Mr. Raja A, Asst. Professor, Dept. of CSE, CBIT, Kolar
Presented by: Usha V N, 1CK10CS049
4. Digital pictures are undergoing an explosion
Image retrieval is becoming crucial, and it relies on tags
Human tagging is accurate but slow
Image-based auto-tagging still has many constraints
How can the human tagging ability be approximated?
6. Existing problem of auto-tagging
Automatic image tagging has improved through research in image processing and face recognition, but it:
cannot recognize individuals who are moving fast
can only identify individuals who have well-defined facial features
Examples: Picasa, iPhoto
7. TagSense
Main points of the new automatic image tagging system:
Better than plain image processing / face recognition
Creates tags covering the people, activity and context in a picture
TagSense: A Smartphone-based Approach to Automatic Image Tagging
Leverages multiple sensing dimensions of the smartphone
“Tag” definition: keywords that describe the ongoing scenario/event/occasion during which the picture was taken
“Tag” format: when-where-who-what (a small data-structure sketch follows)
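The when-where-who-what format can be pictured as a small record attached to each photo. Below is a minimal Python sketch (not from the paper; the field names are illustrative, and the example values are the tags shown on the next slide) of such a tag record and how it flattens into keywords:

```python
# Minimal sketch of a when-where-who-what tag record (field names assumed).
from dataclasses import dataclass
from typing import List

@dataclass
class Tag:
    when: str          # e.g. "November 21st afternoon"
    where: List[str]   # e.g. ["Nasher Museum", "indoor"]
    who: List[str]     # e.g. ["Romit", "Sushma"]
    what: List[str]    # e.g. ["standing", "talking"]

    def keywords(self) -> List[str]:
        """Flatten the tag into the keyword list attached to the picture."""
        return [self.when] + self.where + self.who + self.what

tag = Tag("November 21st afternoon", ["Nasher Museum", "indoor"],
          ["Romit", "Sushma"], ["standing", "talking"])
print(", ".join(tag.keywords()))
```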
8. Sensing multiple dimensions
Sensors used: accelerometer, compass, light sensor, camera, microphone, GPS, gyroscope
Basis for comparison with iPhoto and Picasa
Works well under bad lighting conditions, because it does not depend on the physical features of a person’s face
TagSense generated the following tags: November 21st afternoon, Nasher Museum, indoor, Romit, Sushma, Naveen, Souvik, Justin, Vijay, Xuan, standing, talking
9. System Architecture
People in the group enter a common TagSense password on their respective phones
This password acts as a shared session key, ensuring that sensed information is assimilated only from group members (a key-derivation sketch follows)
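A minimal sketch of how a shared group password could act as a session key. The use of PBKDF2 and HMAC here is an assumption for illustration; the slides only state that sensed information is accepted only from phones that know the password:

```python
# Sketch: derive a session key from the group password and use it to
# authenticate group beacons. Salt, iteration count and message format
# are illustrative assumptions, not the TagSense protocol.
import hashlib
import hmac

def session_key(group_password: str, salt: bytes = b"tagsense-session") -> bytes:
    return hashlib.pbkdf2_hmac("sha256", group_password.encode(), salt, 100_000)

def sign_beacon(key: bytes, payload: bytes) -> bytes:
    # A beacon carrying sensed data is accepted only if this MAC verifies,
    # so information is assimilated only from group members.
    return hmac.new(key, payload, hashlib.sha256).digest()

key = session_key("our-group-password")
mac = sign_beacon(key, b"activate-sensors")
```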
10. Example Scenario
Bob’s phone immediately broadcasts an active-sensor beacon, encrypted with the shared key
Phones in the group activate their respective sensors
Once Bob clicks the picture, Bob’s camera sends a beacon with its local timestamp, and the phones record it
11. Example Scenario (contd.)
After a threshold time from the click, the phones deactivate their sensors, perform basic activity recognition on the sensed information, and send the results back to Bob’s phone
Bob’s phone assimilates these per-person activities, and also infers some contextual information from its own sensors (the overall flow is sketched below)
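The scenario on the last two slides can be summarised as a small protocol. The sketch below is a single-process stand-in (class and constant names such as GroupPhone and SENSE_WINDOW_S are illustrative assumptions; real broadcasting, encryption and sensing are omitted):

```python
# Single-process stand-in for the TagSense photo-session protocol.
import time

SENSE_WINDOW_S = 2.0   # assumed threshold time the phones keep sensing after the click

class GroupPhone:
    """Stand-in for a group member's phone; real sensing/networking omitted."""
    def __init__(self, owner):
        self.owner = owner

    def on_activate_beacon(self):
        self.sensing = True            # start the sensors

    def on_click_beacon(self, click_ts):
        self.click_ts = click_ts       # record the camera's local timestamp

    def report(self):
        self.sensing = False           # deactivate sensors, run local recognition
        return {"who": self.owner, "what": "standing"}   # placeholder activity

def photo_session(group):
    for p in group:
        p.on_activate_beacon()         # encrypted active-sensor beacon
    click_ts = time.time()             # Bob clicks the picture
    for p in group:
        p.on_click_beacon(click_ts)
    time.sleep(SENSE_WINDOW_S)         # phones keep sensing for the threshold time
    return [p.report() for p in group] # per-person activities sent back to Bob

print(photo_session([GroupPhone("Alice"), GroupPhone("Eve")]))
```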
12. Tag Generation
(Figure: the tag generation pipeline)
13. Design & Implementation
Who is in the picture?
- includes only those in the camera’s view
3 possible techniques enabled by multi-dimensional sensing:
Accelerometer-based motion signatures
Complementary compass directions
Correlating visual and acceleration
15. Accelerometer-based motion signature (contd.)
(Figure: variance of accelerometer readings from 20 pictures taken at different times and locations, separating people inside the picture from people outside the picture; a threshold sketch follows)
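A minimal sketch of the thresholding idea behind the figure: phones whose accelerometer variance around the click stays below a threshold are treated as posing inside the picture. The threshold value is an assumed placeholder, not a number from the paper:

```python
# Sketch: low accelerometer variance around the click => likely posing in the picture.
import statistics

POSE_VARIANCE_THRESHOLD = 0.05   # assumed value, in (m/s^2)^2

def accel_magnitude(sample):
    x, y, z = sample
    return (x * x + y * y + z * z) ** 0.5

def likely_in_picture(accel_window):
    """accel_window: list of (x, y, z) readings around the picture click."""
    mags = [accel_magnitude(s) for s in accel_window]
    return statistics.variance(mags) < POSE_VARIANCE_THRESHOLD
```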
17. People in the picture are likely to face the camera
Personal Compass Offset (PCO)
Use a posing picture to calibrate the PCO (a short calibration sketch follows)
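One way to picture the PCO calibration, assuming the subject faces the camera in the posing picture: the offset is the difference between the phone’s compass heading and the direction toward the camera. The simple angle arithmetic below is an illustrative sketch, not the paper’s formulation:

```python
# Sketch of Personal Compass Offset (PCO) calibration and later correction.
def personal_compass_offset(phone_heading_deg, camera_heading_deg):
    # In the posing picture the subject faces the camera, i.e. the opposite
    # of the camera's own heading.
    facing_camera = (camera_heading_deg + 180.0) % 360.0
    return (phone_heading_deg - facing_camera) % 360.0

def corrected_heading(phone_heading_deg, pco_deg):
    # For later pictures: subtract the calibrated offset to estimate where the
    # person is actually facing, then compare with the camera's direction.
    return (phone_heading_deg - pco_deg) % 360.0
```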
18. Correlating visual and acceleration
People who move actively, e.g. playing ping-pong, dancing, running
19. TagSense matches the optical velocity with each phone’s accelerometer readings to identify the moving subjects
Basic idea:
1. Take multiple snapshots from the camera
2. Derive the subject’s motion vector from these snapshots
3. Correlate it with the accelerometer measurements recorded by the different phones (a matching sketch follows)
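A minimal sketch of step 3, assuming the optical speed series of a subject and each phone’s acceleration-magnitude series are already aligned and equally long (the real alignment and normalisation are omitted):

```python
# Sketch: pick the phone whose accelerometer series best correlates with a
# subject's optical speed series derived from successive snapshots.
import numpy as np

def best_matching_phone(optical_speed, phone_accel_series):
    """phone_accel_series: dict mapping phone owner -> acceleration-magnitude list."""
    scores = {
        owner: np.corrcoef(optical_speed, accel)[0, 1]
        for owner, accel in phone_accel_series.items()
    }
    return max(scores, key=scores.get)
```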
20. Moving Subjects (contd.)
Extracting motion vectors of people from two successive snapshots (an optical-flow sketch follows)
The optical flow field shows the velocity of each pixel
The motion vectors from the two detected moving objects
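For the optical-flow step, a short sketch using OpenCV’s dense Farneback flow (the file names and the motion threshold are placeholders; the slides do not prescribe this particular implementation):

```python
# Sketch: dense optical flow between two successive snapshots.
import cv2
import numpy as np

prev = cv2.cvtColor(cv2.imread("snapshot1.jpg"), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread("snapshot2.jpg"), cv2.COLOR_BGR2GRAY)

# One (dx, dy) velocity vector per pixel.
flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)
speed = np.linalg.norm(flow, axis=2)   # per-pixel speed in pixels/frame
moving_mask = speed > 2.0              # assumed threshold for "moving" pixels
```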
22. Activity recognition with the aid of mobile phones has been an active area of research lately
Ex: SoundSense, Sensing Meets Mobile Social Networks
The focus of this paper is not on devising new activity recognition schemes
So, the authors start with a limited vocabulary of tags to represent a basic set of activities
23. Usage of Accelerometer
Activities: Standing, Sitting, Walking, Jumping, Biking, Playing
Sitting or Standing: clear signature from the accelerometer
Walking, Jumping, Biking, Playing: accelerometer readings plus location information (a rule-based sketch follows)
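A rule-based sketch of these coarse labels. The variance and displacement thresholds are illustrative assumptions rather than the paper’s classifier:

```python
# Sketch: coarse activity labels from accelerometer variance plus location change.
import statistics

def classify_activity(accel_magnitudes, displacement_m):
    var = statistics.variance(accel_magnitudes)
    if var < 0.05:
        return "sitting/standing"    # clear, low-motion signature
    if displacement_m > 50:
        return "biking"              # large displacement during the window
    if var > 5.0:
        return "jumping/playing"
    return "walking"
```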
24. Usage of Acoustic: Talking, Music, Silence
A photo plus an audio sample from the acoustic sensor makes it easier to differentiate between otherwise similar cases
In the TagSense prototype, the acoustic sensor provides basic information about the ambient sound when the picture is taken (a simple energy-threshold sketch follows)
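A minimal sketch of the ambient-sound tag: a simple energy threshold separates silence from sound, while telling talking from music is only stubbed out (a real classifier would need spectral features, e.g. SoundSense-style ones):

```python
# Sketch: tag the ambient sound at picture time from a short audio sample.
import numpy as np

def ambient_sound_tag(audio_samples, silence_rms=0.01):
    rms = np.sqrt(np.mean(np.square(audio_samples)))
    if rms < silence_rms:
        return "silence"
    # Placeholder: separating "talking" from "music" would need spectral
    # features and is not attempted in this sketch.
    return "talking/music"
```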
25. The location of a picture conveys semantic information about the picture
It also enables location-based photo search
GPS-based location coordinates are suitable for these purposes
TagSense leverages mobile phone sensors and cloud services to approach these goals
It uses the light sensor on the camera phone to detect indoor/outdoor
26. Light intensity was measured at 400 different times, across days and nights, in outdoor and indoor environments
From these measurements it is feasible to compute light intensity thresholds
TagSense compares the light intensity measured (from the camera) at the picture-click against these thresholds and tags the picture as “indoors” or “outdoors” (a threshold sketch follows)
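A sketch of the indoor/outdoor decision, with assumed lux thresholds standing in for the ones computed from the measurements described above:

```python
# Sketch: indoor/outdoor tag from the light reading at picture-click.
def indoor_or_outdoor(lux, is_daytime):
    if is_daytime:
        # Daylight outdoors is far brighter than indoor lighting.
        return "outdoor" if lux > 1000 else "indoor"
    # At night, outdoor scenes are typically darker than lit indoor rooms.
    return "indoor" if lux > 100 else "outdoor"
```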
27. Location + phone compasses combination
Used to tag the backgrounds
Example: California beach + westward-facing camera = infer the ocean in the background (sketched below)
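A sketch of the background inference, with an assumed place-category lookup and a simple westward-heading test standing in for the actual rules:

```python
# Sketch: combine the place category (from location) with the camera heading
# (from the compass) to guess the background.
def background_tag(place_category, camera_heading_deg):
    west_facing = 225 <= camera_heading_deg <= 315
    if place_category == "beach" and west_facing:
        return "ocean"           # e.g. a California beach facing west
    return place_category         # otherwise fall back to the place itself
```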
29. Advantages
Envisioning an alternative opportunity towards
automatic image tagging.
Designing TagSense, an architecture for
coordinating the mobile phone sensors, and
processing the sensed information to tag
images.
30. TagSense does not generate captions and
cannot tag pictures taken in the past.
TagSense requires users to input a group
password at the beginning of a photo session.
The TagSense vocabulary of tags is quite limited
32. Conclusions
Mobile phones are replacing traditional cameras; TagSense leverages this trend to automatically tag pictures with people and their activities
TagSense has somewhat lower precision and comparable fall-out, but significantly higher recall, than iPhoto/Picasa
It uses a limited vocabulary of tags to represent a basic set of activities (what people are doing)
GPS-based location coordinates are used to tell where the picture was taken
33. References
[1] “TagSense: Leveraging Smartphones for Automatic Image Tagging,” IEEE Transactions on Mobile Computing, vol. 13, no. 1, January 2014.
[2] H. Lu et al., “SoundSense: Scalable Sound Sensing for People-Centric Applications on Mobile Phones,” in ACM MobiSys, 2009.
[3] A. Engstrom et al., “Mobile Collaborative Live Video Mixing,” Mobile Multimedia Workshop (with MobileHCI), Sep. 2008.
[4] M. Azizyan et al., “SurroundSense: Mobile Phone Localization via Ambience Fingerprinting,” in ACM MobiCom, 2009.