SlideShare une entreprise Scribd logo
1  sur  28
Learning visual representations
  for unfamiliar environments

      Kate Saenko, Brian Kulis,
           Trevor Darrell


       UC Berkeley EECS & ICSI
The challenge of large scale visual interaction




   Last decade has proven the superiority of models
   learned from data vs. hand engineered structures!
Large-scale learning
• “Unsupervised”: Learn models from “found data”;
  often exploit multiple modalities (text+image)

                             … The Tote is the perfect example of
                             two handbag design principles that ...
                             The lines of this tote are incredibly
                             sleek, but ... The semi buckles that
                             form the handle attachments are ...
E.g., finding visual senses

  Artifact sense: “telephone”          DICTIONARY

                                1: (n)
                                telephone, phone, telepho
                                ne set (electronic
                                equipment that converts
                                sound into electrical
                                signals that can be
                                transmitted over distances
                                and then converts received
                                signals back into sounds)

                                2: (n)
                                telephone, telephony
                                (transmitting speech at a
                                distance)




                                   [Saenko and Darrell ’09]
                                        4
Large-scale Learning
• “Unsupervised”: Learn models from “found data”;
  often exploit multiple modalities (text+image)

                             … The Tote is the perfect example of
                             two handbag design principles that ...
                             The lines of this tote are incredibly
                             sleek, but ... The semi buckles that
                             form the handle attachments are ...




• Supervised: Crowdsource labels (e.g., ImageNet)
Yet…
• Even the best collection of images from the web and
  strong machine learning methods can often yield poor
  classifiers on in-situ data!



                                                  ?
• Supervised learning assumption: training distribution
  == test distribution
• Unsupervised learning assumption: joint distribution is
  stationary w.r.t. online world and real world

                 Almost never true!               6
“What You Saw Is Not What You Get”


                           SVM:20%
                           NBNN:19%
SVM:54%
NBNN:61%




  The models fail due to domain shift
Examples of visual domain shifts




  digital SLR      webcam         Close-up   Far-away




 amazon.com                         FLICKR    CCTV
                Consumer images
Examples of domain shift:
change in camera, feature type, dimension
       digital SLR                webcam




         SURF                      SIFT




       VQ to 300
                      Different    VQ to
                                   1000
                     dimensions
Solutions?

• Do nothing (poor performance)
• Collect all types of data (impossible)
• Find out what changed (impractical)
• Learn what changed
Prior Work on Domain Adaptation

• Pre-process the data [Daumé ’07] : replicate
  features to also create source- and domain-
  specific versions; re-train learner on new features


• SVM-based methods [Yang’07], [Jiang’08],
  [Duan’09], [Duan’10] : adapt SVM parameters


• Kernel mean matching [Gretton’09] : re-weight
  training data to match test data distribution
Our paradigm: Transform-based
Domain Adaptation
                                        Example: “green” and “blue” domains
Previous methods’ drawbacks
• cannot transfer learned shift
  to new categories
• cannot handle new features
We can do both by learning                                  W
 domain transformations*


 * Saenko, Kulis, Fritz, and Darrell.
 Adapting visual category models to
 new domains. ECCV, 2010
Limitations of symmetric transforms
                            Symmetric assumption fails!

Saenko et al. ECCV10 used
  metric learning:
• symmetric transforms
• same features
                                            W
How do we learn more
 general shifts?
Latest approach*: asymmetric transforms
                                         Asymmetric transform (rotation)

• Metric learning model no
  longer applicable
• We propose to learn
  asymmetric transforms
  – Map from target to source
  – Handle different dimensions


 *Kulis, Saenko, and Darrell, What You
 Saw is Not What You Get: Domain
 Adaptation Using Asymmetric Kernel
 Transforms, CVPR 2011
Latest approach: asymmetric transforms
                                  Asymmetric transform (rotation)

• Metric learning model no
  longer applicable
• We propose to learn
  asymmetric transforms
                                                   W
  – Map from target to source
  – Handle different dimensions
Model Details


                          W




• Learn a linear transformation to map points
 from one domain to another
  – Call this transformation W
  – Matrices of source and target:
Loss Functions


Choose a point x from the
source and y from the
target, and consider inner
product:


Should be “large” for similar
objects and “small” for dissimilar
objects
Loss Functions

• Input to problem includes a collection of m
 loss functions


• General assumption: loss functions depend
 on data only through inner product matrix
Regularized Objective Function

• Minimize a linear combination of sum of loss
 functions and a regularizer:




• We use squared Frobenius norm as a
 regularizer
  – Not restricted to this choice
The Model Has Drawbacks

• A linear transformation may be insufficient
• Cost of optimization grows as the product of
 the dimensionalities of the source and target
 data


• What to do?
Kernelization

• Main idea: run in kernel space
  – Use a non-linear kernel function (e.g., RBF kernel)
    to learn non-linear transformations in input space
  – Resulting optimization is independent of input
    dimensionality
  – Additional assumption necessary: regularizer is a
    spectral function
Kernelization
                            Kernel matrices for source
                            and target



Original Transformation
Learning Problem




                                  New Kernel Problem




Relationship between
original and new problems
at optimality
Summary of approach



 Input                            Input
 space                            space
  1. Multi-Domain Data     2. Generate Constraints, Learn W




                    Test point
                   y1
                                            y2
                                                 Test point
    3. Map via W                 4. Apply to New Categories
Multi-domain dataset
Experimental Setup

• Utilized a standard bag-of-words model
• Also utilize different features in the target domain
   – SURF vs SIFT
   – Different visual word dictionaries


• Baseline for comparing such data: KCCA
Novel-class experiments

                                        Our Method (linear)
                                        Our Method




• Test method’s ability to transfer domain shift to unseen
  classes
• Train transform on half of the classes, test on the other half
Extreme shift example


 Query from target   Nearest neighbors in source using KCCA+KNN




                     Nearest neighbors in source using transformation
Conclusion
• Should not rely on hand-engineered features any
  more than we rely on hand engineered models!


• Learn feature transformation across domains

• Developed a domain adaptation method based on
  regularized non-linear transforms
  – Asymmetric transform achieves best results on more
    extreme shifts
  – Saenko et al ECCV 2010 and Kulis et al CVPR 2011;
    journal version forthcoming

Contenu connexe

Tendances

Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition Intel Nervana
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNNJunho Cho
 
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingwolf
 
Object detection
Object detectionObject detection
Object detectionSomesh Vyas
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...Dongmin Choi
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkNader Karimi
 
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Object Detection Methods using Deep Learning
Object Detection Methods using Deep LearningObject Detection Methods using Deep Learning
Object Detection Methods using Deep LearningSungjoon Choi
 
Recent Object Detection Research & Person Detection
Recent Object Detection Research & Person DetectionRecent Object Detection Research & Person Detection
Recent Object Detection Research & Person DetectionKai-Wen Zhao
 
#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentationMatthew Opala
 
Auro tripathy - Localizing with CNNs
Auro tripathy -  Localizing with CNNsAuro tripathy -  Localizing with CNNs
Auro tripathy - Localizing with CNNsAuro Tripathy
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer VisionDongmin Choi
 
Image Object Detection Pipeline
Image Object Detection PipelineImage Object Detection Pipeline
Image Object Detection PipelineAbhinav Dadhich
 
Focal loss for dense object detection
Focal loss for dense object detectionFocal loss for dense object detection
Focal loss for dense object detectionDaeHeeKim31
 
Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]Dongmin Choi
 

Tendances (20)

Object Detection and Recognition
Object Detection and Recognition Object Detection and Recognition
Object Detection and Recognition
 
150807 Fast R-CNN
150807 Fast R-CNN150807 Fast R-CNN
150807 Fast R-CNN
 
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018
Towards Set Learning and Prediction - Laura Leal-Taixe - UPC Barcelona 2018
 
Shai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble trackingShai Avidan's Support vector tracking and ensemble tracking
Shai Avidan's Support vector tracking and ensemble tracking
 
Object detection
Object detectionObject detection
Object detection
 
How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...How much position information do convolutional neural networks encode? review...
How much position information do convolutional neural networks encode? review...
 
Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018
Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018
Video Object Segmentation - Laura Leal-Taixé - UPC Barcelona 2018
 
Object Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning FrameworkObject Detection Using R-CNN Deep Learning Framework
Object Detection Using R-CNN Deep Learning Framework
 
SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)SSD: Single Shot MultiBox Detector (UPC Reading Group)
SSD: Single Shot MultiBox Detector (UPC Reading Group)
 
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
CNN vs SIFT-based Visual Localization - Laura Leal-Taixé - UPC Barcelona 2018
 
Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018
 
Object Detection Methods using Deep Learning
Object Detection Methods using Deep LearningObject Detection Methods using Deep Learning
Object Detection Methods using Deep Learning
 
Recent Object Detection Research & Person Detection
Recent Object Detection Research & Person DetectionRecent Object Detection Research & Person Detection
Recent Object Detection Research & Person Detection
 
#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation#6 PyData Warsaw: Deep learning for image segmentation
#6 PyData Warsaw: Deep learning for image segmentation
 
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
 
Auro tripathy - Localizing with CNNs
Auro tripathy -  Localizing with CNNsAuro tripathy -  Localizing with CNNs
Auro tripathy - Localizing with CNNs
 
Transformer in Computer Vision
Transformer in Computer VisionTransformer in Computer Vision
Transformer in Computer Vision
 
Image Object Detection Pipeline
Image Object Detection PipelineImage Object Detection Pipeline
Image Object Detection Pipeline
 
Focal loss for dense object detection
Focal loss for dense object detectionFocal loss for dense object detection
Focal loss for dense object detection
 
Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]Review: Incremental Few-shot Instance Segmentation [CDM]
Review: Incremental Few-shot Instance Segmentation [CDM]
 

En vedette

Bucharest Hubb Events Calendar
Bucharest Hubb Events CalendarBucharest Hubb Events Calendar
Bucharest Hubb Events CalendarBucharest Hubb
 
Fcv bio cv_weiss
Fcv bio cv_weissFcv bio cv_weiss
Fcv bio cv_weisszukun
 
Mila pitch report china mail
Mila pitch report china mailMila pitch report china mail
Mila pitch report china mailRobert Halasz
 
Iccv2009 recognition and learning object categories p3 c00 - summary and da...
Iccv2009 recognition and learning object categories   p3 c00 - summary and da...Iccv2009 recognition and learning object categories   p3 c00 - summary and da...
Iccv2009 recognition and learning object categories p3 c00 - summary and da...zukun
 
On3 pitching competition
On3 pitching competitionOn3 pitching competition
On3 pitching competitionMarie Casas
 
ICCV2009: MAP Inference in Discrete Models: Part 3
ICCV2009: MAP Inference in Discrete Models: Part 3ICCV2009: MAP Inference in Discrete Models: Part 3
ICCV2009: MAP Inference in Discrete Models: Part 3zukun
 
Software Outsourcing: Events Calendar
Software Outsourcing: Events CalendarSoftware Outsourcing: Events Calendar
Software Outsourcing: Events CalendarSoftheme
 

En vedette (8)

Bucharest Hubb Events Calendar
Bucharest Hubb Events CalendarBucharest Hubb Events Calendar
Bucharest Hubb Events Calendar
 
Kompany Presentation
Kompany PresentationKompany Presentation
Kompany Presentation
 
Fcv bio cv_weiss
Fcv bio cv_weissFcv bio cv_weiss
Fcv bio cv_weiss
 
Mila pitch report china mail
Mila pitch report china mailMila pitch report china mail
Mila pitch report china mail
 
Iccv2009 recognition and learning object categories p3 c00 - summary and da...
Iccv2009 recognition and learning object categories   p3 c00 - summary and da...Iccv2009 recognition and learning object categories   p3 c00 - summary and da...
Iccv2009 recognition and learning object categories p3 c00 - summary and da...
 
On3 pitching competition
On3 pitching competitionOn3 pitching competition
On3 pitching competition
 
ICCV2009: MAP Inference in Discrete Models: Part 3
ICCV2009: MAP Inference in Discrete Models: Part 3ICCV2009: MAP Inference in Discrete Models: Part 3
ICCV2009: MAP Inference in Discrete Models: Part 3
 
Software Outsourcing: Events Calendar
Software Outsourcing: Events CalendarSoftware Outsourcing: Events Calendar
Software Outsourcing: Events Calendar
 

Similaire à Fcv rep darrell

How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?Tuan Yang
 
17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptxssuser2023c6
 
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural NetworksJunho Cho
 
Machine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University ChhattisgarhMachine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University ChhattisgarhPoorabpatel
 
Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...
Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...
Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...Universitat Politècnica de Catalunya
 
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...Sergey Karayev
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learningatulshah16
 
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Universitat Politècnica de Catalunya
 
Machine Learning: Learning with data
Machine Learning: Learning with dataMachine Learning: Learning with data
Machine Learning: Learning with dataONE Talks
 
One talk Machine Learning
One talk Machine LearningOne talk Machine Learning
One talk Machine LearningONE Talks
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101Felipe Prado
 
DL4J at Workday Meetup
DL4J at Workday MeetupDL4J at Workday Meetup
DL4J at Workday MeetupDavid Kale
 
04 Deep CNN (Ch_01 to Ch_3).pptx
04 Deep CNN (Ch_01 to Ch_3).pptx04 Deep CNN (Ch_01 to Ch_3).pptx
04 Deep CNN (Ch_01 to Ch_3).pptxZainULABIDIN496386
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding捷恩 蔡
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningSujit Pal
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingYu Huang
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonAditya Bhattacharya
 

Similaire à Fcv rep darrell (20)

Dl
DlDl
Dl
 
How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?How Machine Learning Helps Organizations to Work More Efficiently?
How Machine Learning Helps Organizations to Work More Efficiently?
 
17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx17- Kernels and Clustering.pptx
17- Kernels and Clustering.pptx
 
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
151106 Sketch-based 3D Shape Retrievals using Convolutional Neural Networks
 
Machine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University ChhattisgarhMachine Learning workshop by GDSC Amity University Chhattisgarh
Machine Learning workshop by GDSC Amity University Chhattisgarh
 
lec6a.ppt
lec6a.pptlec6a.ppt
lec6a.ppt
 
Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...
Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...
Transfer Learning and Domain Adaptation (D2L3 2017 UPC Deep Learning for Comp...
 
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
Lecture 2.B: Computer Vision Applications - Full Stack Deep Learning - Spring...
 
PPT s11-machine vision-s2
PPT s11-machine vision-s2PPT s11-machine vision-s2
PPT s11-machine vision-s2
 
Sim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement LearningSim-to-Real Transfer in Deep Reinforcement Learning
Sim-to-Real Transfer in Deep Reinforcement Learning
 
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
 
Machine Learning: Learning with data
Machine Learning: Learning with dataMachine Learning: Learning with data
Machine Learning: Learning with data
 
One talk Machine Learning
One talk Machine LearningOne talk Machine Learning
One talk Machine Learning
 
DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101DEF CON 24 - Clarence Chio - machine duping 101
DEF CON 24 - Clarence Chio - machine duping 101
 
DL4J at Workday Meetup
DL4J at Workday MeetupDL4J at Workday Meetup
DL4J at Workday Meetup
 
04 Deep CNN (Ch_01 to Ch_3).pptx
04 Deep CNN (Ch_01 to Ch_3).pptx04 Deep CNN (Ch_01 to Ch_3).pptx
04 Deep CNN (Ch_01 to Ch_3).pptx
 
Scalable image recognition model with deep embedding
Scalable image recognition model with deep embeddingScalable image recognition model with deep embedding
Scalable image recognition model with deep embedding
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object tracking
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
 

Plus de zukun

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009zukun
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVzukun
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Informationzukun
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statisticszukun
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibrationzukun
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionzukun
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluationzukun
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-softwarezukun
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptorszukun
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectorszukun
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-introzukun
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video searchzukun
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video searchzukun
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video searchzukun
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learningzukun
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionzukun
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick startzukun
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysiszukun
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structureszukun
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities zukun
 

Plus de zukun (20)

My lyn tutorial 2009
My lyn tutorial 2009My lyn tutorial 2009
My lyn tutorial 2009
 
ETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCVETHZ CV2012: Tutorial openCV
ETHZ CV2012: Tutorial openCV
 
ETHZ CV2012: Information
ETHZ CV2012: InformationETHZ CV2012: Information
ETHZ CV2012: Information
 
Siwei lyu: natural image statistics
Siwei lyu: natural image statisticsSiwei lyu: natural image statistics
Siwei lyu: natural image statistics
 
Lecture9 camera calibration
Lecture9 camera calibrationLecture9 camera calibration
Lecture9 camera calibration
 
Brunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer visionBrunelli 2008: template matching techniques in computer vision
Brunelli 2008: template matching techniques in computer vision
 
Modern features-part-4-evaluation
Modern features-part-4-evaluationModern features-part-4-evaluation
Modern features-part-4-evaluation
 
Modern features-part-3-software
Modern features-part-3-softwareModern features-part-3-software
Modern features-part-3-software
 
Modern features-part-2-descriptors
Modern features-part-2-descriptorsModern features-part-2-descriptors
Modern features-part-2-descriptors
 
Modern features-part-1-detectors
Modern features-part-1-detectorsModern features-part-1-detectors
Modern features-part-1-detectors
 
Modern features-part-0-intro
Modern features-part-0-introModern features-part-0-intro
Modern features-part-0-intro
 
Lecture 02 internet video search
Lecture 02 internet video searchLecture 02 internet video search
Lecture 02 internet video search
 
Lecture 01 internet video search
Lecture 01 internet video searchLecture 01 internet video search
Lecture 01 internet video search
 
Lecture 03 internet video search
Lecture 03 internet video searchLecture 03 internet video search
Lecture 03 internet video search
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Advances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer visionAdvances in discrete energy minimisation for computer vision
Advances in discrete energy minimisation for computer vision
 
Gephi tutorial: quick start
Gephi tutorial: quick startGephi tutorial: quick start
Gephi tutorial: quick start
 
EM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysisEM algorithm and its application in probabilistic latent semantic analysis
EM algorithm and its application in probabilistic latent semantic analysis
 
Object recognition with pictorial structures
Object recognition with pictorial structuresObject recognition with pictorial structures
Object recognition with pictorial structures
 
Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities Iccv2011 learning spatiotemporal graphs of human activities
Iccv2011 learning spatiotemporal graphs of human activities
 

Dernier

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...apidays
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxRustici Software
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...Zilliz
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyKhushali Kathiriya
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 

Dernier (20)

Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
Apidays Singapore 2024 - Scalable LLM APIs for AI and Generative AI Applicati...
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Corporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptxCorporate and higher education May webinar.pptx
Corporate and higher education May webinar.pptx
 
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ..."I see eyes in my soup": How Delivery Hero implemented the safety system for ...
"I see eyes in my soup": How Delivery Hero implemented the safety system for ...
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

Fcv rep darrell

  • 1. Learning visual representations for unfamiliar environments Kate Saenko, Brian Kulis, Trevor Darrell UC Berkeley EECS & ICSI
  • 2. The challenge of large scale visual interaction Last decade has proven the superiority of models learned from data vs. hand engineered structures!
  • 3. Large-scale learning • “Unsupervised”: Learn models from “found data”; often exploit multiple modalities (text+image) … The Tote is the perfect example of two handbag design principles that ... The lines of this tote are incredibly sleek, but ... The semi buckles that form the handle attachments are ...
  • 4. E.g., finding visual senses Artifact sense: “telephone” DICTIONARY 1: (n) telephone, phone, telepho ne set (electronic equipment that converts sound into electrical signals that can be transmitted over distances and then converts received signals back into sounds) 2: (n) telephone, telephony (transmitting speech at a distance) [Saenko and Darrell ’09] 4
  • 5. Large-scale Learning • “Unsupervised”: Learn models from “found data”; often exploit multiple modalities (text+image) … The Tote is the perfect example of two handbag design principles that ... The lines of this tote are incredibly sleek, but ... The semi buckles that form the handle attachments are ... • Supervised: Crowdsource labels (e.g., ImageNet)
  • 6. Yet… • Even the best collection of images from the web and strong machine learning methods can often yield poor classifiers on in-situ data! ? • Supervised learning assumption: training distribution == test distribution • Unsupervised learning assumption: joint distribution is stationary w.r.t. online world and real world Almost never true! 6
  • 7. “What You Saw Is Not What You Get” SVM:20% NBNN:19% SVM:54% NBNN:61% The models fail due to domain shift
  • 8. Examples of visual domain shifts digital SLR webcam Close-up Far-away amazon.com FLICKR CCTV Consumer images
  • 9. Examples of domain shift: change in camera, feature type, dimension digital SLR webcam SURF SIFT VQ to 300 Different VQ to 1000 dimensions
  • 10. Solutions? • Do nothing (poor performance) • Collect all types of data (impossible) • Find out what changed (impractical) • Learn what changed
  • 11. Prior Work on Domain Adaptation • Pre-process the data [Daumé ’07] : replicate features to also create source- and domain- specific versions; re-train learner on new features • SVM-based methods [Yang’07], [Jiang’08], [Duan’09], [Duan’10] : adapt SVM parameters • Kernel mean matching [Gretton’09] : re-weight training data to match test data distribution
  • 12. Our paradigm: Transform-based Domain Adaptation Example: “green” and “blue” domains Previous methods’ drawbacks • cannot transfer learned shift to new categories • cannot handle new features We can do both by learning W domain transformations* * Saenko, Kulis, Fritz, and Darrell. Adapting visual category models to new domains. ECCV, 2010
  • 13. Limitations of symmetric transforms Symmetric assumption fails! Saenko et al. ECCV10 used metric learning: • symmetric transforms • same features W How do we learn more general shifts?
  • 14. Latest approach*: asymmetric transforms Asymmetric transform (rotation) • Metric learning model no longer applicable • We propose to learn asymmetric transforms – Map from target to source – Handle different dimensions *Kulis, Saenko, and Darrell, What You Saw is Not What You Get: Domain Adaptation Using Asymmetric Kernel Transforms, CVPR 2011
  • 15. Latest approach: asymmetric transforms Asymmetric transform (rotation) • Metric learning model no longer applicable • We propose to learn asymmetric transforms W – Map from target to source – Handle different dimensions
  • 16. Model Details W • Learn a linear transformation to map points from one domain to another – Call this transformation W – Matrices of source and target:
  • 17. Loss Functions Choose a point x from the source and y from the target, and consider inner product: Should be “large” for similar objects and “small” for dissimilar objects
  • 18. Loss Functions • Input to problem includes a collection of m loss functions • General assumption: loss functions depend on data only through inner product matrix
  • 19. Regularized Objective Function • Minimize a linear combination of sum of loss functions and a regularizer: • We use squared Frobenius norm as a regularizer – Not restricted to this choice
  • 20. The Model Has Drawbacks • A linear transformation may be insufficient • Cost of optimization grows as the product of the dimensionalities of the source and target data • What to do?
  • 21. Kernelization • Main idea: run in kernel space – Use a non-linear kernel function (e.g., RBF kernel) to learn non-linear transformations in input space – Resulting optimization is independent of input dimensionality – Additional assumption necessary: regularizer is a spectral function
  • 22. Kernelization Kernel matrices for source and target Original Transformation Learning Problem New Kernel Problem Relationship between original and new problems at optimality
  • 23. Summary of approach Input Input space space 1. Multi-Domain Data 2. Generate Constraints, Learn W Test point y1 y2 Test point 3. Map via W 4. Apply to New Categories
  • 25. Experimental Setup • Utilized a standard bag-of-words model • Also utilize different features in the target domain – SURF vs SIFT – Different visual word dictionaries • Baseline for comparing such data: KCCA
  • 26. Novel-class experiments Our Method (linear) Our Method • Test method’s ability to transfer domain shift to unseen classes • Train transform on half of the classes, test on the other half
  • 27. Extreme shift example Query from target Nearest neighbors in source using KCCA+KNN Nearest neighbors in source using transformation
  • 28. Conclusion • Should not rely on hand-engineered features any more than we rely on hand engineered models! • Learn feature transformation across domains • Developed a domain adaptation method based on regularized non-linear transforms – Asymmetric transform achieves best results on more extreme shifts – Saenko et al ECCV 2010 and Kulis et al CVPR 2011; journal version forthcoming

Notes de l'éditeur

  1. Introduce transformations; show why symmetric transformation isn’t enough
  2. End of Kate’s part
  3. Why is this nontrivial?
  4. Todo: make into bar plot