Low computational cost algorithms for photo clustering and mail signature detection in the cloud
Daniel Manchón
Co-directors: Xavi Giró (UPC) and Omar Pera (Pixable)
1
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
2
Motivation: Photo clustering
3
Motivation: Mail signature detection
4
Motivation: Cloud computing
5
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
6
Pixable internship
- Social photos aggregation
- Photo ranking
- Editorial content
- Contacts feeds
- Owned by Singtel
- Photo storage
- Synchronization across multiple devices
- Support for RAW
- CallerID application
- Multiple contact source support
- Contact backup and synchronization
- SPAM detection
7
Photofeed tasks
• Instagram source (in production)
• Referrals and invitations method
• New Relic integration
• Photo clustering and summarization
• Photo download service (in production)
8
Contactive tasks
• Mail scraping monitoring
• Signature detection
• Identity analysis improvement
• Tooling (in production)
9
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
10
GPI research assistant
• MediaEval 2013 (published paper)
• ICMR SEWM (published paper)
• Pyxel software framework
• MediaEval 2014
11
Multimedia retrieval conference
GPI: Image and Video Processing Group
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
12
Photo Clustering: Intro
Event detection
State of the art: PhotoTOC [Platt et al., PACRIM 2003]
13
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
14
Photo Clustering: Requirements
Photofeed constraints
• User data stored in the Amazon cloud and MongoDB
• Low computing cost
• Easily configurable using a REST API
MediaEval requirements
• Event generation
• Visual and metadata information available
• F1 and NMI as evaluation metrics
• 400k annotated photo dataset
15
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
16
Design
Hi, I’m John. Hi, I’m Emily.
(a) Temporal sorting of each user's photos independently
17
Design
(b) Temporal-based oversegmentation into mini-clusters
PhotoTOC
[Platt et al., PACRIM 2003]
18
Design
(b) Temporal-based oversegmentation into mini-clusters, each modeled by its mean values
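Steps (a) and (b) could be sketched roughly as follows. This is only an illustration under assumptions: the dictionary keys `user` and `taken`, the function name `oversegment`, and the fixed `max_gap` threshold are hypothetical, and PhotoTOC itself uses an adaptive time-gap criterion rather than a constant one.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def oversegment(photos, max_gap=timedelta(hours=1)):
    """Sort each user's photos by time and start a new mini-cluster at every large gap.

    The fixed max_gap is an assumption made for this sketch.
    """
    by_user = defaultdict(list)
    for photo in photos:
        by_user[photo["user"]].append(photo)

    mini_clusters = []
    for user in sorted(by_user):
        # (a) temporal sorting, done independently for each user
        ordered = sorted(by_user[user], key=lambda p: p["taken"])
        current = [ordered[0]]
        # (b) cut whenever two consecutive photos are too far apart in time
        for prev, photo in zip(ordered, ordered[1:]):
            if photo["taken"] - prev["taken"] > max_gap:
                mini_clusters.append(current)
                current = []
            current.append(photo)
        mini_clusters.append(current)
    return mini_clusters


# Toy input reusing the timestamps shown on the slides as if they were single photos.
photos = [
    {"user": "emily", "taken": datetime(2010, 12, 13, 3, 11, 10)},
    {"user": "John", "taken": datetime(2010, 9, 10, 2, 10, 12)},
    {"user": "emily", "taken": datetime(2010, 12, 13, 2, 11, 10)},
    {"user": "emily", "taken": datetime(2010, 12, 14, 23, 11, 10)},
]
clusters = oversegment(photos, max_gap=timedelta(hours=2))
sizes = [len(c) for c in clusters]
```

A single pass over each sorted list keeps the cost at the sort itself, which fits the low-computation constraint.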
19
Username= John
T.taken= 2010-09-10 02:10:12
GPS= (42.1,-10)
tags= live,stage,deerhunter
Username= emily
T.taken= 2010-12-13 02:11:10
GPS= (43,-8.40)
tags= live,deerhunter
Username= emily
T.taken= 2010-12-13 03:11:10
GPS= (no data)
tags= live,stones
Username= emily
T.taken= 2010-12-14 23:11:10
GPS= (43.2,-8.2)
tags= sound, test
Design
(c) Sequential merging of mini-clusters
[Figure: mini-clusters on the time axis t, each summarized by avg(·); "?" marks the pending merge decision]
20
Design
(c) Sequential merging of mini-clusters
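A minimal sketch of step (c), assuming each mini-cluster is summarized by its mean time, its mean GPS position when available, and the union of its tags. The names `summarize`, `sequential_merge` and `should_merge`, the field names, and the use of a tag union are assumptions for illustration; the merge criterion is passed in as a function.

```python
def summarize(mini_cluster):
    """Model a mini-cluster by mean time, mean GPS (if any photo has one) and its tag set."""
    times = [p["time"] for p in mini_cluster]
    gps = [p["gps"] for p in mini_cluster if p["gps"] is not None]
    tags = set()
    for p in mini_cluster:
        tags |= set(p["tags"])
    mean_gps = None
    if gps:
        mean_gps = (sum(g[0] for g in gps) / len(gps),
                    sum(g[1] for g in gps) / len(gps))
    return {"time": sum(times) / len(times), "gps": mean_gps, "tags": tags}


def sequential_merge(mini_clusters, should_merge):
    """Walk the time-ordered mini-clusters once; merge each into the previous event or open a new one."""
    events = []
    for cluster in mini_clusters:
        if events and should_merge(summarize(events[-1]), summarize(cluster)):
            events[-1] = events[-1] + cluster
        else:
            events.append(list(cluster))
    return events


def close_in_time_and_tags(a, b):
    """Hypothetical criterion: under one hour apart and at least one shared tag."""
    return abs(a["time"] - b["time"]) < 3600 and bool(a["tags"] & b["tags"])


# Times are seconds on an arbitrary clock; GPS may be missing, as on the slide.
mini = [
    [{"time": 0.0, "gps": (42.1, -10.0), "tags": ["live", "stage"]}],
    [{"time": 1800.0, "gps": None, "tags": ["live"]}],
    [{"time": 90000.0, "gps": (43.2, -8.2), "tags": ["sound", "test"]}],
]
events = sequential_merge(mini, close_in_time_and_tags)
```

Because each mini-cluster is compared only with the event before it, the pass is linear in the number of mini-clusters.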
21
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
22
Results
$F_1 = 2 \cdot \frac{P \cdot R}{P + R}$
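As a small worked example of the F1 formula, the sketch below computes it from a precision and a recall value. The function name `f1_score` and the zero-division convention are assumptions; the exact way the benchmark derives precision and recall for clusterings is not shown on the slide and is not reproduced here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; returns 0.0 when both are zero (a convention chosen here)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


balanced = f1_score(0.8, 0.8)
skewed = f1_score(0.5, 1.0)
```

With equal precision and recall, F1 equals that common value; with unequal values it falls below their arithmetic mean.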
UPC 3rd place of 12 teams!!!
23
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
24
Mail signature detection: Intro
KEY TOPICS
• Email information extraction
• SPAM detection
• Low computation
State of the art
"Learning to extract signature and reply lines from email"
[Vitor R. Carvalho and William W. Cohen, 2004]
25
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
26
Mail signature detection: Requirements
• Improve the mail scraping service
• Pre-process the input to reduce the execution time
• Adapt the mail scraping service to the Contactive product
[Diagram: User mailbox → ? → MongoDB entries; "?" is annotated "less information, filter only signatures"]
Example MongoDB entry:
id: 89012
name: John Doe
email: j.doe@gmail.com
linkedin Id: 7788455367_e
phone: 789675463
27
Mail scraping service
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
28
Design
2. Problem Definition and Corpus
A signature block is the set of lines, usually in the end of a message, that contain information about the sender,
such as personal name, affiliation, postal address, web address, email address, telephone number, etc. Quotes from
famous persons and creative ASCII drawings are often present in this block also. An example of a signature block
can be seen in last six lines of the email message pictured in Figure 1 (marked with the line label <sig>). Figure 1
also contains six lines of text that were quoted from a preceding message (marked with the line label <reply>). In
this paper we will call such lines reply lines.
<other> From: wcohen@cs.cmu.edu
<other> To: Vitor Carvalho <vitor@cs.cmu.edu>
<other> Subject: Re: Did you try to compile javadoc recently?
<other> Date: 25 Mar 2004 12:05:51 -0500
<other>
<other> Try cvs update –dP, this removes files & directories that have been
<other> deleted from cvs.
<other> - W
<other>
<reply> On Wed, 2004-03-24 at 19:58, Vitor Carvalho wrote:
<reply> > I’ve just checked-out the baseline m3 code and
<reply> > "Ant dist" is working fine, but "ant javadoc" is not.
<reply> > Thanks
<reply> > Vitor
<other>
<sig> ------------------------------------------------------------------
<sig> William W. Cohen “Would you drive a mime
<sig> wcohen@cs.cmu.edu nuts if you played a
<sig> http://www.wcohen.com blank audio tape
<sig> Associate Research Professor full blast?”
<sig> CALD, Carnegie-Mellon University - S. Wright
Figure 1 - Excerpt from a labeled email message
(a) Split off the last K lines of the mail and retrieve their annotations
Last K lines → Ground truth annotations
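Step (a) might look like the sketch below, assuming the annotated corpus stores one `<label> text` pair per line as in the excerpt above. The helper names `last_k_lines` and `split_labels` and the regular expression are hypothetical.

```python
import re

# Matches "<label> text"; the text part may be empty (blank labeled lines).
LABEL_RE = re.compile(r"^<(\w+)>\s?(.*)$")


def last_k_lines(message, k):
    """Return the last k lines of a message (fewer if the message is shorter)."""
    lines = message.splitlines()
    return lines[-k:] if k > 0 else []


def split_labels(labeled_lines):
    """Separate labeled lines into parallel lists of ground-truth labels and raw text."""
    labels, texts = [], []
    for line in labeled_lines:
        match = LABEL_RE.match(line)
        if match is None:
            raise ValueError("line has no <label> prefix: %r" % line)
        labels.append(match.group(1))
        texts.append(match.group(2))
    return labels, texts


message = "\n".join([
    "<other> - W",
    "<other>",
    "<reply> > Thanks",
    "<sig> wcohen@cs.cmu.edu",
])
labels, texts = split_labels(last_k_lines(message, 3))
```

Restricting the classifier to the last K lines is what keeps the input small, since signatures usually sit at the end of a message.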
29
Design
(b) Feature extraction
Lines → N feature patterns
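Step (b) can be illustrated with a handful of regular-expression patterns applied to every line. The five patterns below are illustrative assumptions, not the feature set used in the project or in the cited paper; each line becomes a binary vector of length N.

```python
import re

# Hypothetical feature patterns; N is simply the length of this list.
FEATURE_PATTERNS = [
    ("has_email", re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")),
    ("has_url", re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)),
    ("has_phone", re.compile(r"\+?\d[\d\s().-]{6,}\d")),
    ("is_separator", re.compile(r"^\s*[-_=*]{2,}\s*$")),
    ("is_quoted", re.compile(r"^\s*>")),
]


def extract_features(line):
    """Return one 0/1 value per pattern for a single line."""
    return [1 if pattern.search(line) else 0 for _, pattern in FEATURE_PATTERNS]


def feature_matrix(lines):
    """Stack the per-line vectors into a K x N matrix (list of lists)."""
    return [extract_features(line) for line in lines]


matrix = feature_matrix([
    "wcohen@cs.cmu.edu",
    "http://www.wcohen.com",
    "------------------",
    "> Thanks",
    "phone 789675463",
])
```

Pattern matching per line is cheap and independent across lines, so it can be run on each mailbox without heavy computation.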
30
Design
(c) SVM training and model generation
31
Feature matrix [K×N] + ground-truth vector [K] → SVM training → Model
Design
(c) SVM training and model generation
Lines → pre-process → Features → Model → Classes: Other, Reply, Signature
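To make the training and classification flow concrete, here is a dependency-free sketch of a linear SVM trained by subgradient descent on the hinge loss. It is a simplification under assumptions: it is binary (signature versus everything else) although the slide lists three classes, the hyperparameters and toy features are invented, and the project may well have used an existing SVM library instead.

```python
def train_linear_svm(X, y, epochs=200, lr=0.1, lam=0.01):
    """Fit weights w and bias b on a K x N feature matrix X with labels y in {-1, +1}."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:
                # Inside the margin: step along the hinge-loss subgradient plus L2 shrinkage.
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:
                # Correct with enough margin: only apply the L2 shrinkage.
                w = [wj - lr * lam * wj for wj in w]
    return w, b


def predict(model, X):
    """Label each row +1 (signature) or -1 (not signature)."""
    w, b = model
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b >= 0 else -1 for xi in X]


# Toy data with two hypothetical features per line: [has_email, has_url].
X = [[1, 1], [1, 0], [0, 0], [0, 1]]
y = [1, 1, -1, -1]
model = train_linear_svm(X, y)
predicted = predict(model, X)
```

Once trained, prediction is a single dot product per line, which is the part that runs on every incoming mail.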
32
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
33
Results
$F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$
34
With annotated dataset: F1
Without annotated dataset: manual evaluation on Contactive user base mailboxes
Outline
• Motivation
• Tasks summary
• Pixable internship
• GPI research assistant
• Photo clustering
• Mail signature detection
• Conclusions
• Introduction
• Requirements
• Design
• Results
35
Conclusions
• Academic
  • Papers: MediaEval 2013 and ICMR SEWM published; MediaEval 2014 in preparation.
  • Foundations of the UPC Pyxel framework
• Industrial
  • Contributions to Pixable on production servers:
    • Instagram integration
    • Photofeed Downloader
  • Mail signature detection: successful proof of concept.
  • Work in the USA!
36
Thank you very much!!
Q&A
37
BACKUP SLIDES
38
Design
39
(c) Sequential merging of mini-clusters
Weighted modalities:
● creation (or upload) time
● geolocation
● textual labels
● same user
Design
40
(c) Sequential merging of mini-clusters
Geolocation (d = haversine)
Time stamp (d = L1)
Text labels (d = Jaccard)
Same user (d = boolean)
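The four per-modality distances could be written as below. The function names, the choice of kilometres and seconds as units, and the Earth radius constant are assumptions made for this sketch.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(a, b):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def time_l1(t1, t2):
    """L1 distance between two timestamps expressed in seconds."""
    return abs(t1 - t2)


def jaccard_distance(tags_a, tags_b):
    """1 - |A ∩ B| / |A ∪ B|; two empty tag sets are treated as identical."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)


def same_user_distance(user_a, user_b):
    """Boolean distance: 0 for the same user, 1 otherwise."""
    return 0.0 if user_a == user_b else 1.0


d_geo = haversine_km((42.1, -10.0), (43.0, -8.40))
d_tags = jaccard_distance(["live", "stage", "deerhunter"], ["live", "deerhunter"])
```

A mini-cluster without GPS data simply has no geolocation distance; how that case is weighted is a separate design decision.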
Design
41
(c) Sequential merging of mini-clusters
Design
42
(c) Sequential merging of mini-clusters
Mean and std. deviation learned on pairs of photos within the same training event.
Design
43
(c) Sequential merging of mini-clusters
phi function
Design
44
(c) Sequential merging of mini-clusters
decision threshold
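One possible reading of the last backup slides is sketched below. It assumes, without confirmation from the slides, that the phi function is the standard normal CDF applied to each distance after normalizing it with the learned mean and standard deviation, and that the weighted per-modality scores are averaged and compared with a threshold. All names, weights and statistics are hypothetical.

```python
import math


def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def merge_score(distances, stats, weights):
    """Weighted average of 1 - phi(z) over the modalities that are available.

    distances: {modality: distance}; stats: {modality: (mean, std)} learned on
    same-event training pairs; weights: {modality: weight}.
    """
    total_weight = sum(weights[m] for m in distances)
    score = 0.0
    for modality, d in distances.items():
        mean, std = stats[modality]
        score += weights[modality] * (1.0 - phi((d - mean) / std))
    return score / total_weight


def should_merge(distances, stats, weights, threshold=0.5):
    """Merge two mini-clusters when the fused score reaches the decision threshold."""
    return merge_score(distances, stats, weights) >= threshold


stats = {"time": (3600.0, 1800.0), "tags": (0.5, 0.2)}
weights = {"time": 2.0, "tags": 1.0}
typical = merge_score({"time": 3600.0, "tags": 0.5}, stats, weights)
near = should_merge({"time": 0.0, "tags": 0.0}, stats, weights)
far = should_merge({"time": 86400.0, "tags": 1.0}, stats, weights)
```

Distances smaller than what is typical inside a training event push the score towards 1, larger ones towards 0, so the threshold trades precision against recall.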

Contenu connexe

En vedette

Bebida " Real Peru "- Marketing Informatico - UNFV
Bebida " Real Peru "- Marketing Informatico - UNFVBebida " Real Peru "- Marketing Informatico - UNFV
Bebida " Real Peru "- Marketing Informatico - UNFVMedaly Ventocilla
 
Global Real Estate Institute - Connecting Real Estate Leaders Worldwide
Global Real Estate Institute - Connecting Real Estate Leaders WorldwideGlobal Real Estate Institute - Connecting Real Estate Leaders Worldwide
Global Real Estate Institute - Connecting Real Estate Leaders WorldwideRoy Maybury
 
Dr. Douglas Rosendale
Dr. Douglas Rosendale Dr. Douglas Rosendale
Dr. Douglas Rosendale Investnet
 
Presentación Psico Educa Vet Corp
Presentación Psico Educa Vet CorpPresentación Psico Educa Vet Corp
Presentación Psico Educa Vet CorpJose Rafael Romero
 
B5 - OTHER REFERENCES - PRIOR TO STEINHOFF
B5 - OTHER REFERENCES - PRIOR TO STEINHOFFB5 - OTHER REFERENCES - PRIOR TO STEINHOFF
B5 - OTHER REFERENCES - PRIOR TO STEINHOFFNivera Ishwarlall
 
Pensemos _un__docente__actor__y_constructor__de__innovaciones[1]
Pensemos  _un__docente__actor__y_constructor__de__innovaciones[1]Pensemos  _un__docente__actor__y_constructor__de__innovaciones[1]
Pensemos _un__docente__actor__y_constructor__de__innovaciones[1]Lorena Mariela Rodriguez
 
La gatera de_la_villa_La Gatera de la Villa nº 5
La gatera de_la_villa_La Gatera de la Villa nº 5La gatera de_la_villa_La Gatera de la Villa nº 5
La gatera de_la_villa_La Gatera de la Villa nº 5La Gatera de la Villa
 
Trend One (Web Expansion) Grape Online Strategies 2009 by Nick Sohnemann
Trend One (Web Expansion) Grape Online Strategies 2009 by Nick SohnemannTrend One (Web Expansion) Grape Online Strategies 2009 by Nick Sohnemann
Trend One (Web Expansion) Grape Online Strategies 2009 by Nick SohnemannHUNGRY BOYS Creative agency
 
Español Aeronáutico creado por Lidia Están
Español Aeronáutico creado por Lidia Están Español Aeronáutico creado por Lidia Están
Español Aeronáutico creado por Lidia Están Lidia Mar Est
 
TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.
TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.
TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.CA Technologies
 
Curso de Instrumentación Endoscópica para Enfermería
Curso de Instrumentación Endoscópica para EnfermeríaCurso de Instrumentación Endoscópica para Enfermería
Curso de Instrumentación Endoscópica para EnfermeríaBlog Materno-Infantil
 

En vedette (19)

Bebida " Real Peru "- Marketing Informatico - UNFV
Bebida " Real Peru "- Marketing Informatico - UNFVBebida " Real Peru "- Marketing Informatico - UNFV
Bebida " Real Peru "- Marketing Informatico - UNFV
 
Desarrollo de competencias genéricas y yoga
Desarrollo de competencias genéricas  y yogaDesarrollo de competencias genéricas  y yoga
Desarrollo de competencias genéricas y yoga
 
Global Real Estate Institute - Connecting Real Estate Leaders Worldwide
Global Real Estate Institute - Connecting Real Estate Leaders WorldwideGlobal Real Estate Institute - Connecting Real Estate Leaders Worldwide
Global Real Estate Institute - Connecting Real Estate Leaders Worldwide
 
Dr. Douglas Rosendale
Dr. Douglas Rosendale Dr. Douglas Rosendale
Dr. Douglas Rosendale
 
Comenzar
ComenzarComenzar
Comenzar
 
Presentación Psico Educa Vet Corp
Presentación Psico Educa Vet CorpPresentación Psico Educa Vet Corp
Presentación Psico Educa Vet Corp
 
B5 - OTHER REFERENCES - PRIOR TO STEINHOFF
B5 - OTHER REFERENCES - PRIOR TO STEINHOFFB5 - OTHER REFERENCES - PRIOR TO STEINHOFF
B5 - OTHER REFERENCES - PRIOR TO STEINHOFF
 
Pensemos _un__docente__actor__y_constructor__de__innovaciones[1]
Pensemos  _un__docente__actor__y_constructor__de__innovaciones[1]Pensemos  _un__docente__actor__y_constructor__de__innovaciones[1]
Pensemos _un__docente__actor__y_constructor__de__innovaciones[1]
 
Expo
ExpoExpo
Expo
 
La gatera de_la_villa_La Gatera de la Villa nº 5
La gatera de_la_villa_La Gatera de la Villa nº 5La gatera de_la_villa_La Gatera de la Villa nº 5
La gatera de_la_villa_La Gatera de la Villa nº 5
 
Trend One (Web Expansion) Grape Online Strategies 2009 by Nick Sohnemann
Trend One (Web Expansion) Grape Online Strategies 2009 by Nick SohnemannTrend One (Web Expansion) Grape Online Strategies 2009 by Nick Sohnemann
Trend One (Web Expansion) Grape Online Strategies 2009 by Nick Sohnemann
 
PhD_APC_UPMC_IFPEN_dec1997
PhD_APC_UPMC_IFPEN_dec1997PhD_APC_UPMC_IFPEN_dec1997
PhD_APC_UPMC_IFPEN_dec1997
 
Español Aeronáutico creado por Lidia Están
Español Aeronáutico creado por Lidia Están Español Aeronáutico creado por Lidia Están
Español Aeronáutico creado por Lidia Están
 
TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.
TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.
TechTalk: What is DDVS and How to Make Sense of Data-Driven Service Image.
 
Man Base Datos I
Man Base Datos IMan Base Datos I
Man Base Datos I
 
Proyecto
ProyectoProyecto
Proyecto
 
Pintados de pasi
Pintados de pasiPintados de pasi
Pintados de pasi
 
Curso de Instrumentación Endoscópica para Enfermería
Curso de Instrumentación Endoscópica para EnfermeríaCurso de Instrumentación Endoscópica para Enfermería
Curso de Instrumentación Endoscópica para Enfermería
 
Manual corporativo - MARCA
Manual corporativo - MARCAManual corporativo - MARCA
Manual corporativo - MARCA
 

Similaire à Low computational cost algorithms for photo clustering and mail signature detection in the cloud

SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...South Tyrol Free Software Conference
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design PatternsMongoDB
 
SH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptxSH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptxMongoDB
 
SH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptxSH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptxMongoDB
 
Open Source North - MongoDB Advanced Schema Design Patterns
Open Source North - MongoDB Advanced Schema Design PatternsOpen Source North - MongoDB Advanced Schema Design Patterns
Open Source North - MongoDB Advanced Schema Design PatternsMatthew Kalan
 
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...Gábor Szárnyas
 
Chengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big dataChengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big datajins0618
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Research
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design PatternsMongoDB
 
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And WhentranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And WhenDavid Peyruc
 
SQL Query Optimization: Why Is It So Hard to Get Right?
SQL Query Optimization: Why Is It So Hard to Get Right?SQL Query Optimization: Why Is It So Hard to Get Right?
SQL Query Optimization: Why Is It So Hard to Get Right?Brent Ozar
 
PASS Summit 2010 Keynote David DeWitt
PASS Summit 2010 Keynote David DeWittPASS Summit 2010 Keynote David DeWitt
PASS Summit 2010 Keynote David DeWittGraySystemsLab
 
Neo4j: What's Under the Hood & How Knowing This Can Help You
Neo4j: What's Under the Hood & How Knowing This Can Help You Neo4j: What's Under the Hood & How Knowing This Can Help You
Neo4j: What's Under the Hood & How Knowing This Can Help You Neo4j
 
Keynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter BonczKeynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter BonczIoan Toma
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB
 

Similaire à Low computational cost algorithms for photo clustering and mail signature detection in the cloud (20)

SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
SFScon 21 - Matteo Camilli - Performance assessment of microservices with str...
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design Patterns
 
PCL (Point Cloud Library)
PCL (Point Cloud Library)PCL (Point Cloud Library)
PCL (Point Cloud Library)
 
SH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptxSH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptx
 
SH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptxSH 1 - SES 1 - advanced_schema_design.pptx
SH 1 - SES 1 - advanced_schema_design.pptx
 
Open Source North - MongoDB Advanced Schema Design Patterns
Open Source North - MongoDB Advanced Schema Design PatternsOpen Source North - MongoDB Advanced Schema Design Patterns
Open Source North - MongoDB Advanced Schema Design Patterns
 
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
An early look at the LDBC Social Network Benchmark's Business Intelligence wo...
 
Chengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big dataChengqi zhang graph processing and mining in the era of big data
Chengqi zhang graph processing and mining in the era of big data
 
Complexity metrics and models
Complexity metrics and modelsComplexity metrics and models
Complexity metrics and models
 
Complexity metrics and models
Complexity metrics and modelsComplexity metrics and models
Complexity metrics and models
 
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AIQualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
Qualcomm Webinar: Solving Unsolvable Combinatorial Problems with AI
 
Advanced Schema Design Patterns
Advanced Schema Design PatternsAdvanced Schema Design Patterns
Advanced Schema Design Patterns
 
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And WhentranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
tranSMART Community Meeting 5-7 Nov 13 - Session 2: MongoDB: What, Why And When
 
SQL Query Optimization: Why Is It So Hard to Get Right?
SQL Query Optimization: Why Is It So Hard to Get Right?SQL Query Optimization: Why Is It So Hard to Get Right?
SQL Query Optimization: Why Is It So Hard to Get Right?
 
Effective C++
Effective C++Effective C++
Effective C++
 
PASS Summit 2010 Keynote David DeWitt
PASS Summit 2010 Keynote David DeWittPASS Summit 2010 Keynote David DeWitt
PASS Summit 2010 Keynote David DeWitt
 
computer architecture.
computer architecture.computer architecture.
computer architecture.
 
Neo4j: What's Under the Hood & How Knowing This Can Help You
Neo4j: What's Under the Hood & How Knowing This Can Help You Neo4j: What's Under the Hood & How Knowing This Can Help You
Neo4j: What's Under the Hood & How Knowing This Can Help You
 
Keynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter BonczKeynote IDEAS2013 - Peter Boncz
Keynote IDEAS2013 - Peter Boncz
 
MongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and ImplicationsMongoDB Schema Design: Practical Applications and Implications
MongoDB Schema Design: Practical Applications and Implications
 

Plus de Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...Universitat Politècnica de Catalunya
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoUniversitat Politècnica de Catalunya
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Universitat Politècnica de Catalunya
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosUniversitat Politècnica de Catalunya
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Universitat Politècnica de Catalunya
 
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020
Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...
Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...Universitat Politècnica de Catalunya
 
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020
Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020Universitat Politècnica de Catalunya
 
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...
Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...Universitat Politècnica de Catalunya
 
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020
Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020Universitat Politècnica de Catalunya
 
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)
Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)Universitat Politècnica de Catalunya
 
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...
Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...Universitat Politècnica de Catalunya
 

Plus de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
Deep Generative Learning for All
Deep Generative Learning for AllDeep Generative Learning for All
Deep Generative Learning for All
 
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...
 
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-NietoTowards Sign Language Translation & Production | Xavier Giro-i-Nieto
Towards Sign Language Translation & Production | Xavier Giro-i-Nieto
 
The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021The Transformer - Xavier Giró - UPC Barcelona 2021
The Transformer - Xavier Giró - UPC Barcelona 2021
 
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...
 
Open challenges in sign language translation and production
Open challenges in sign language translation and productionOpen challenges in sign language translation and production
Open challenges in sign language translation and production
 
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in VideosGeneration of Synthetic Referring Expressions for Object Segmentation in Videos
Generation of Synthetic Referring Expressions for Object Segmentation in Videos
 
Discovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in MinecraftDiscovery and Learning of Navigation Goals from Pixels in Minecraft
Discovery and Learning of Navigation Goals from Pixels in Minecraft
 
Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...Learn2Sign : Sign language recognition and translation using human keypoint e...
Learn2Sign : Sign language recognition and translation using human keypoint e...
 
Intepretability / Explainable AI for Deep Neural Networks
Low computational cost algorithms for photo clustering and mail signature detection in the cloud

  • 1. Low computational cost algorithms for photo clustering and mail signature detection in the cloud. Daniel Manchón. Co-directors: Xavi Giró (UPC), Omar Pera (Pixable)
  • 2. Outline: Motivation (current section); Tasks summary (Pixable internship, GPI research assistant); Photo clustering; Mail signature detection; Conclusions. Subsections of the two technical parts: Introduction, Requirements, Design, Results
  • 3. Motivation: Photo clustering
  • 4. Motivation: Mail signature detection
  • 5. Motivation: Cloud computing
  • 6. Outline (current section: Tasks summary, Pixable internship)
  • 7. Pixable internship. Product features shown on the slide: social photos aggregation; photo ranking; editorial content; contacts feeds; owned by Singtel; photo storage; synchronization across multiple devices; support for RAW; CallerID application; multiple contact source support; contact backup and synchronization; SPAM detection
  • 8. Photofeed tasks: Instagram source (in production); referrals and invitations method; "New Relic" integration; photo clustering and summarization; photo download service (in production)
  • 9. Contactive tasks: mail scraping monitoring; signature detection; identity analysis improvement; tooling (in production)
  • 10. Outline (current section: Tasks summary, GPI research assistant)
  • 11. GPI research assistant (GPI: Image and Video Processing Group): Mediaeval 2013 (published paper); ICMR SEWM (published paper; ICMR is a multimedia retrieval conference); Pyxel software framework; Mediaeval 2014
  • 12. Outline (current section: Photo clustering, Introduction)
  • 13. Photo clustering: Introduction. Event detection. State of the art: PhotoTOC [Platt et al., PACRIM 2003]
  • 14. Outline (current section: Photo clustering, Requirements)
  • 15. Photo clustering: Requirements (Photofeed constraints and Mediaeval requirements): user data stored in the Amazon cloud and MongoDB; low computational cost; easily configurable through a REST API; event generation; visual and metadata information available; F1 and NMI as evaluation metrics; dataset of 400k annotated photos
  • 16. Outline (current section: Photo clustering, Design)
  • 17. Design (a): Temporal sorting of each user's photos independently [figure: two example users, John and Emily]
  • 18. Design (b): Temporal-based oversegmentation into mini-clusters, following PhotoTOC [Platt et al., PACRIM 2003]
  • 19. Design (b): Temporal-based oversegmentation into mini-clusters, each modeled by its mean values. Example photo metadata:
      - Username = John; T.taken = 2010-09-10 02:10:12; GPS = (42.1, -10); tags = live, stage, deerhunter
      - Username = emily; T.taken = 2010-12-13 02:11:10; GPS = (43, -8.40); tags = live, deerhunter
      - Username = emily; T.taken = 2010-12-13 03:11:10; GPS = (no data); tags = live, stones
      - Username = emily; T.taken = 2010-12-14 23:11:10; GPS = (43.2, -8.2); tags = sound, test
  • 20. Design (c): Sequential merging of mini-clusters [figure: mini-clusters along the time axis t, each summarized by avg(·), with the merge decision between neighbors marked "?"]
  • 21. Design (c): Sequential merging of mini-clusters
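Steps (a) and (b) above can be illustrated with a minimal sketch. It is not the implementation from the thesis: the fixed two-hour gap threshold, the field names (`user`, `taken`, `gps`, `tags`) and the choice to summarize tags by their union are all assumptions made for illustration (PhotoTOC itself uses an adaptive threshold). The sample records are the three "emily" photos from slide 19.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
# Hypothetical threshold: a gap larger than this starts a new mini-cluster.
MAX_GAP = timedelta(hours=2)

def oversegment(photos, max_gap=MAX_GAP):
    """Sort one user's photos by time and split them at large time gaps."""
    clusters = []
    for photo in sorted(photos, key=lambda p: p["taken"]):
        if clusters and photo["taken"] - clusters[-1][-1]["taken"] <= max_gap:
            clusters[-1].append(photo)
        else:
            clusters.append([photo])
    return clusters

def summarize(cluster):
    """Model a mini-cluster by mean time, mean GPS (missing values skipped) and tag union."""
    times = [(p["taken"] - EPOCH).total_seconds() for p in cluster]
    gps = [p["gps"] for p in cluster if p["gps"] is not None]
    mean_gps = None
    if gps:
        mean_gps = (sum(g[0] for g in gps) / len(gps), sum(g[1] for g in gps) / len(gps))
    return {
        "user": cluster[0]["user"],
        "mean_time": sum(times) / len(times),
        "mean_gps": mean_gps,
        "tags": set().union(*(p["tags"] for p in cluster)),
        "size": len(cluster),
    }

emily = [
    {"user": "emily", "taken": datetime(2010, 12, 13, 2, 11, 10), "gps": (43.0, -8.4), "tags": ["live", "deerhunter"]},
    {"user": "emily", "taken": datetime(2010, 12, 13, 3, 11, 10), "gps": None, "tags": ["live", "stones"]},
    {"user": "emily", "taken": datetime(2010, 12, 14, 23, 11, 10), "gps": (43.2, -8.2), "tags": ["sound", "test"]},
]
clusters = oversegment(emily)
summaries = [summarize(c) for c in clusters]
```

With this threshold the first two photos, taken one hour apart, share a mini-cluster and the third forms its own. The summaries are what a sequential merging step such as (c) would then compare.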
  • 22. Outline (current section: Photo clustering, Results)
  • 23. Results: $F_1 = \frac{2PR}{P + R}$, where $P$ is precision and $R$ is recall. UPC: 3rd place out of 12 teams.
  • 24. Outline (current section: Mail signature detection, Introduction)
  • 25. Mail signature detection: Introduction. Key topics: email information extraction; SPAM detection; low computation. State of the art: "Learning to extract signature and reply lines from email" [Vitor R. Carvalho and William W. Cohen, 2004]
  • 26. Outline (current section: Mail signature detection, Requirements)
  • 27. Mail signature detection: Requirements: improve the mail scraping service; pre-process the input to reduce execution time; adapt the mail scraping service to the Contactive product. [Diagram: user mailbox, a filter that keeps only signatures (less information), the mail scraping service, and the resulting MongoDB entries, e.g. id 89012, name John Doe, email j.doe@gmail.com, linkedin Id 7788455367_e, phone 789675463]
  • 28. Outline (current section: Mail signature detection, Design)
  • 29. Design (a): Split off the last K lines of the mail and retrieve their ground-truth annotations. The slide reproduces an excerpt from Carvalho and Cohen, section "2. Problem Definition and Corpus":
      "A signature block is the set of lines, usually in the end of a message, that contain information about the sender, such as personal name, affiliation, postal address, web address, email address, telephone number, etc. Quotes from famous persons and creative ASCII drawings are often present in this block also. An example of a signature block can be seen in last six lines of the email message pictured in Figure 1 (marked with the line label <sig>). Figure 1 also contains six lines of text that were quoted from a preceding message (marked with the line label <reply>). In this paper we will call such lines reply lines."
      Figure 1 - Excerpt from a labeled email message:
      <other> From: wcohen@cs.cmu.edu
      <other> To: Vitor Carvalho <vitor@cs.cmu.edu>
      <other> Subject: Re: Did you try to compile javadoc recently?
      <other> Date: 25 Mar 2004 12:05:51 -0500
      <other>
      <other> Try cvs update –dP, this removes files & directories that have been
      <other> deleted from cvs.
      <other> - W
      <other>
      <reply> On Wed, 2004-03-24 at 19:58, Vitor Carvalho wrote:
      <reply> > I’ve just checked-out the baseline m3 code and
      <reply> > "Ant dist" is working fine, but "ant javadoc" is not.
      <reply> > Thanks
      <reply> > Vitor
      <other>
      <sig> ------------------------------------------------------------------
      <sig> William W. Cohen “Would you drive a mime
      <sig> wcohen@cs.cmu.edu nuts if you played a
      <sig> http://www.wcohen.com blank audio tape
      <sig> Associate Research Professor full blast?”
      <sig> CALD, Carnegie-Mellon University - S. Wright
  • 30. Design (b): Feature extraction. Each line is matched against N feature patterns [the slide background repeats the paper excerpt from slide 29]
  • 31. Design (c): SVM training and model generation. Feature matrix [K×N] + ground-truth vector [K], fed to SVM training, yields the model [the slide background repeats a cropped copy of the paper excerpt from slide 29]
  • 32. Design (c): SVM training and model generation. Classification pipeline: lines, pre-processing, features, model, classes (Other, Reply, Signature)
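Steps (a) and (b) of the signature detection design can be sketched as follows. The slides do not list the actual feature patterns, so the five regular expressions here (`email`, `url`, `phone`, `separator`, `quoted`) are hypothetical stand-ins, as are the helper names. The sketch only shows how the last K lines of a message become a K×N binary feature matrix of the kind that step (c) feeds to SVM training.

```python
import re

# Hypothetical line-level patterns; the real feature set is not given on the slides.
FEATURE_PATTERNS = [
    ("email", re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")),
    ("url", re.compile(r"(https?://|www\.)\S+", re.IGNORECASE)),
    ("phone", re.compile(r"\+?\d[\d\s().-]{6,}\d")),
    ("separator", re.compile(r"^\s*[-_=*]{2,}\s*$")),
    ("quoted", re.compile(r"^\s*>")),
]

def last_lines(message, k):
    """Return the last k lines of a message (k >= 1; fewer if the message is shorter)."""
    return message.splitlines()[-k:]

def extract_features(lines):
    """Build a K x N binary matrix: one row per line, one column per pattern."""
    return [
        [1 if pattern.search(line) else 0 for _, pattern in FEATURE_PATTERNS]
        for line in lines
    ]

message = "\n".join([
    "Try cvs update, this removes deleted files.",
    "> Thanks",
    "--------------------",
    "William W. Cohen",
    "wcohen@cs.cmu.edu",
    "http://www.wcohen.com",
])
rows = extract_features(last_lines(message, 4))
```

Restricting the work to the last K lines keeps the cost per message bounded, which matches the low-computation requirement stated earlier. Any classifier that accepts a numeric matrix and a label vector could consume `rows`.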
  • 33. Outline (current section: Mail signature detection, Results)
  • 34. Results: $F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$. Two evaluations: with an annotated dataset; without an annotated dataset (manual evaluation on Contactive user base mailboxes)
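The F1 formula above can be computed per line label as in the sketch below. The function name and the toy label sequences are illustrative assumptions. The label names `other`, `reply` and `sig` follow the labeled email on slide 29, and the numbers are not results from the presentation.

```python
def precision_recall_f1(predicted, truth, positive="sig"):
    """Precision, recall and F1 for one class, from two parallel label lists."""
    pairs = list(zip(predicted, truth))
    tp = sum(1 for p, t in pairs if p == positive and t == positive)
    fp = sum(1 for p, t in pairs if p == positive and t != positive)
    fn = sum(1 for p, t in pairs if p != positive and t == positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)

# Toy example: one false positive and one missed signature line.
truth = ["other", "other", "reply", "sig", "sig", "sig"]
predicted = ["other", "sig", "reply", "sig", "sig", "other"]
precision, recall, f1 = precision_recall_f1(predicted, truth)
```

In this toy case two of the three predicted signature lines are correct and two of the three true signature lines are found, so precision, recall and F1 all equal 2/3.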
  • 35. Outline (current section: Conclusions)
  • 36. Conclusions
      - Academic: papers at Mediaeval 2013 and ICMR SEWM, with Mediaeval 2014 in preparation; foundations of the UPC Pyxel framework
      - Industrial: contributions to Pixable running on production servers (Instagram integration, Photofeed Downloader); mail signature detection: successful proof of concept; work in the USA
  • 37. Thank you very much! Q&A
  • 39. Design (c): Sequential merging of mini-clusters. Weighted modalities: creation (or upload) time; geolocation; textual labels; same user
  • 40. Design (c): Sequential merging of mini-clusters. Distance per modality: time stamp (d = L1); geolocation (d = haversine); text labels (d = Jaccard); same user (d = boolean)
  • 42. Design (c): Sequential merging of mini-clusters. Mean and standard deviation learned on pairs of photos within the same training event.
  • 43. Design (c): Sequential merging of mini-clusters. Phi function.
  • 44. Design (c): Sequential merging of mini-clusters. Decision threshold.
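The four per-modality distances named on slide 40 can be sketched in plain Python as below. The function names, the kilometre unit, the Earth radius constant and the convention that two empty tag sets have distance 0 are assumptions. The slides do not specify how the distances are normalized, weighted, passed through the phi function or thresholded, so that part is deliberately left out.

```python
import math

def time_distance(t1, t2):
    """L1 distance between two time stamps given in seconds."""
    return abs(t1 - t2)

def haversine_km(a, b, radius_km=6371.0):
    """Great-circle distance between two (latitude, longitude) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(h)))

def jaccard_distance(tags_a, tags_b):
    """1 minus the ratio of shared tags to all tags; 0.0 when both sets are empty."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0
    return 1.0 - len(a & b) / len(a | b)

def user_distance(user_a, user_b):
    """Boolean distance: 0.0 for the same user, 1.0 otherwise."""
    return 0.0 if user_a == user_b else 1.0

# Tags of the first two example photos on slide 19 share 2 of 3 distinct tags.
d_tags = jaccard_distance(["live", "stage", "deerhunter"], ["live", "deerhunter"])
d_geo = haversine_km((42.1, -10.0), (43.0, -8.4))
```

Each function costs constant time per pair (linear in the number of tags for Jaccard), and sequential merging only compares neighboring mini-clusters, which is in line with the low computational cost goal of the talk.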