Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018

•

3 j'aime•1,388 vues

https://telecombcn-dl.github.io/2018-dlcv/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.

Données & analyses

Prof. Laura Leal-Taixé
Multiple object tracking

Dynamic Scene Understanding
Understand every pixel of a video
road
tree
Semantic
segmentation
person
car

Dynamic Scene Understanding
Understand every pixel of a video
person 1
tree
Semantic
segmentation
person 2
person 3
Instance-
based
segmentation
road
car

Dynamic Scene Understanding
Understand every pixel of a video
Semantic
segmentation
Instance-
based
segmentation
Multiple
object
tracking

Multiple object tracking
Goal: detect
and track all
objects in a
scene

Person
detection
Tracking-by-detection
Video sequence

Tracking with network flows
Video sequence
Graphical model
Node

Tracking with network flows
L. Leal-Taixé et al. ICCVW2011
Solving the Minimum Cost Flow Problem

Tracking with Linear Programming
• Objective function
T = arg min
T i,j
C(i, j)f(i, j)
T = arg min
T i,j
C(i, j)f(i, j)
L. Leal-Taixé et al. ICCVW2011
Solving the Minimum Cost Flow Problem

• Objective function
Tracking with Linear Programming
T = arg min
T i,j
C(i, j)f(i, j)
Indicator {0,1}
Costs – what will
drive the tracking
C
C

Measuring bounding box similarity
Classic measures
• Bounding box distance
• Appearance similarity
Proposed measures
Modeling pedestrian
motion and
interactions

Modeling pedestrian interaction
Using a physics-
based model of
crowd motion
Learning an
image-based
motion
context
Learning
appearance and
interactions with
Deep Learning

The effect of undetected pedestrians
Image Optical flow

Learning an image-based motion context
Image Features Estimation of the
velocity
L. Leal-Taixé, M. Fenzi, A. Kuznetsova, B. Rosenhahn, S. Savarese. CVPR 2014.
Hough
forests

Pedestrian velocity estimation
Histogram
of the errors
Optical Flow Social Force
Model
Proposed

Learning appearance and interactions
Linear
Programming
Predicted
associations
t t+1Siamese neural
network
L. Leal-Taixé, C. Canton, K. Schindler. CVPRW 2016

Design
Stacking
10x121x53I2 O2
Localspatio-temporal
features
Final
Matching
Prediction
Gradient
Boosting
Classifier
Contextual
features
Height, width,
distance, …
C1 C2 C3 F4 F5 F6 F7

Improvement directions
• Better appearance models
– Ristani & Tomasi. Features for Multi-Target Multi-Camera Tracking
and Re-Identification. CVPR 2018
• Multi-detector fusion
– Henschel et al. Fusion of Head and Full-Body Detectors for Multi-
Object Tracking. CVPRW 2018.

Can we learn it all?
• Problem 1: dimensionality of the output
– Neural networks can handle fixed sized outputs (tensors)
– Tracking:
• Unknown number of detections per frame
• Unknown length of each trajectory
• Solution 1: see the Set Learning lecture
• Solution 2: recursive

Can we learn it all?
• Solution 2: recursive
– Using a Recurrent Neural Network to predict trajectory properties
– Birth/death, transition à motion model
• Does not work well….
• Is it likely that more data is needed to train such a motion model?
Milan et al. Online Multi-Target Tracking Using Recurrent Neural Networks. AAAI 2017

Can we learn it all?
• Our Solution 2: recursive
– One trajectory at a time, one frame at a time

Can we learn it all?
Girshick Fast R-CNN. ICCV 2015

Can we learn it all?
• We use the detections from
the last frame as proposals
• Where did the detection with
ID 1 go in the next frame?
P. Bergmann. Master Thesis. TUM 2018

Pros and cons
• PRO We can reuse an extremely well-trained regressor
– We get well-positioned bounding boxes
• CON The regressor only shifts the box by a small quantity
– We need to compensate for large camera motions à optical flow
P. Bergmann. Master Thesis. TUM 2018

Surprisingly good results
• Extremely good MOTA improvement

Surprisingly good results
• Quite some ID switches but still good identification overall

Conclusions
• The old assumption that objects have small displacements still
holds and can be used to reach SOA performance
• What is next? Hard cases.
– Long occlusions
– Crowded scenes

Prof. Laura Leal-Taixé
Questions?
Prof. Laura Leal-Taixé

Contenu connexe

Tendances

Object Detection Methods using Deep Learning

Sungjoon Choi

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.

Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018

Universitat Politècnica de Catalunya

Object Detection Using R-CNN Deep Learning Framework

Nader Karimi

How much position information do convolutional neural networks encode? review...

Dongmin Choi

https://github.com/telecombcn-dl/dlmm-2017-dcu Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.

D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)

Universitat Politècnica de Catalunya

Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018

Universitat Politècnica de Catalunya

ViT (Vision Transformer) Review [CDM]

Dongmin Choi

You only look once: Unified, real-time object detection (UPC Reading Group)

Universitat Politècnica de Catalunya

Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...

Universitat Politècnica de Catalunya

Slides by Miriam Bellver from the Computer Vision Reading Group at the Universitat Politecnica de Catalunya about the paper: Lu, Yongxi, Tara Javidi, and Svetlana Lazebnik. "Adaptive Object Detection Using Adjacency and Zoom Prediction." CVPR 2016 Abstract: State-of-the-art object detection systems rely on an accurate set of region proposals. Several recent methods use a neural network architecture to hypothesize promising object locations. While these approaches are computationally efficient, they rely on fixed image regions as anchors for predictions. In this paper we propose to use a search strategy that adaptively directs computational resources to sub-regions likely to contain objects. Compared to methods based on fixed anchor locations, our approach naturally adapts to cases where object instances are sparse and small. Our approach is comparable in terms of accuracy to the state-of-the-art Faster R-CNN approach while using two orders of magnitude fewer anchors on average. Code is publicly available.

Adaptive object detection using adjacency and zoom prediction

Universitat Politècnica de Catalunya

Recent Progress on Object Detection_20170331

Jihong Kang

Object detection - RCNNs vs Retinanet

Rishabh Indoria

Advanced deep learning based object detection methods

Brodmann17

Transformer in Vision

Sangmin Woo

Recurrent Instance Segmentation (UPC Reading Group)

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/dlmm-2017-dcu/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or image captioning.

Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)

Universitat Politècnica de Catalunya

Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018

Universitat Politècnica de Catalunya

http://imatge-upc.github.io/telecombcn-2016-dlcv/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of big annotated data and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which had been addressed until now with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks and Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles and applications of deep learning to computer vision problems, such as image classification, object detection or text captioning.

Deep Learning for Computer Vision: Image Retrieval (UPC 2016)

Universitat Politècnica de Catalunya

Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)

Universitat Politècnica de Catalunya

Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...

Sangmin Woo

Tendances (20)

Object Detection Methods using Deep Learning

Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018

Object Detection Using R-CNN Deep Learning Framework

How much position information do convolutional neural networks encode? review...

D1L5 Visualization (D1L2 Insight@DCU Machine Learning Workshop 2017)

Semantic Segmentation - Míriam Bellver - UPC Barcelona 2018

ViT (Vision Transformer) Review [CDM]

You only look once: Unified, real-time object detection (UPC Reading Group)

Visual Saliency Prediction with Deep Learning - Kevin McGuinness - UPC Barcel...

Adaptive object detection using adjacency and zoom prediction

Recent Progress on Object Detection_20170331

Object detection - RCNNs vs Retinanet

Advanced deep learning based object detection methods

Transformer in Vision

Recurrent Instance Segmentation (UPC Reading Group)

Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)

Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018

Deep Learning for Computer Vision: Image Retrieval (UPC 2016)

Transfer Learning (D2L4 Insight@DCU Machine Learning Workshop 2017)

Recent Breakthroughs in AI + Learning Visual-Linguistic Representation in the...

Similaire à Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018

Deep Neural Networks for Multimodal Learning

Marc Bolaños Solà

A tutorial on deep learning at icml 2013

Philip Zheng

Optical Flow with Semantic Segmentation and Localized Layers

Seval Çapraz

Visualization of Deep Learning

YaminiAlapati1

Real time Traffic Signs Recognition using Deep Learning

IRJET Journal

It is a wearable device. It has a camera, and it detects all living and non living object. This module detects moving object also. It is made with raspberry pi 3, and a camera. One headphone connect with raspberry pi. When this module detects items, it gave a sound output through headphone. Hence the blind man know that item, which is in-front of him or her. We made it in very low budget, and it is very helpful for visually challenged people. And the Blind stick help him to detect obstacles.

Development of wearable object detection system & blind stick for visuall...

Arkadev Kundu

Robust Tracking Via Feature Mapping Method and Support Vector Machine

IRJET Journal

This paper investigates long-term face tracking of a specific person given his/her face image in a single frame as a query in a video stream. Through taking advantage of pre-trained deep learning models on big data, a novel system is developed for accurate video face tracking in the unconstrained environments depicting various people and objects moving in and out of the frame. In the proposed system, we present a detection-verification-tracking method (dubbed as 'DVT') which accomplishes the long-term face tracking task through the collaboration of face detection, face verification, and (short-term) face tracking. An offline trained detector based on cascaded convolutional neural networks localizes all faces appeared in the frames, and an offline trained face verifier based on deep convolutional neural networks and similarity metric learning decides if any face or which face corresponds to the queried person. An online trained tracker follows the face from frame to frame. When validated on a sitcom episode and a TV show, the DVT method outperforms tracking-learning-detection (TLD) and face-TLD in terms of recall and precision. The proposed system is also tested on many other types of videos and shows very promising results.

Long-term Face Tracking in the Wild using Deep Learning

Elaheh Rashedi

Introduction to computer vision with Convoluted Neural Networks

MarcinJedyk

Future semantic segmentation with convolutional LSTM

Kyuri Kim

slide-171212080528.pptx

SharanrajK22MMT1003

The goal of video object tracking system is segmenting a region of interest from a video scene and keeping track of its motion, positioning and occlusion. There are the three steps of video object tracking system those are object detection, object classification and object tracking. Object detection is performed to check existence of objects in video. Then the detected object can be classified in various categories on the basis on their shape, motion, color and texture. Object tracking is performed using monitoring object changes. This paper we are going to take overview of different object detection, object classification and object tracking techniques and also the comparison of different techniques used for various stages of tracking.

Overview Of Video Object Tracking System

Editor IJMTER

Introduction to computer vision

Marcin Jedyk

Lecture 3: RNNs - Full Stack Deep Learning - Spring 2021

Sergey Karayev

Boosting covariance data on Riemannian manifolds has proven to be a convenient strategy in a pedestrian detection context. In this paper we show that the detection performances of the state-of-the-art approach of Tuzel et al. can be greatly improved, from both a computational and a qualitative point of view, by considering practical and theoretical issues, and allowing also the estimation of occlusions in a fine way. The resulting detection system reaches the best performance on the INRIA dataset, setting novel state-of-theart results.

A Re-evaluation of Pedestrian Detection on Riemannian Manifolds

Diego Tosato

Real Time Object Dectection using machine learning

pratik pratyay

IRJET- Real-Time Object Detection using Deep Learning: A Survey

IRJET Journal

The raising number of elderly people urges the research of systems able to monitor and support people inside their domestic environment. An automatic system capturing data about the position of a person in the house, through accelerometers and RGBd cameras can monitor the person activities and produce outputs associating the movements to a given tasks or predicting the set of activities that will be executes. We considered, for the task the classification of the activities a Deep Convolutional Neural Network. We compared two different deep network and analyzed their outputs.

Classification of indoor actions through deep neural networks

Cognitive Robotics and Social Sensing Lab - CNR - ICAR

物件偵測與辨識技術

CHENHuiMei

Deep Learning And Business Models (VNITC 2015-09-13)

Ha Phuong

Similaire à Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018 (20)

Deep Neural Networks for Multimodal Learning

A tutorial on deep learning at icml 2013

Optical Flow with Semantic Segmentation and Localized Layers

Visualization of Deep Learning

Real time Traffic Signs Recognition using Deep Learning

Development of wearable object detection system & blind stick for visuall...

Robust Tracking Via Feature Mapping Method and Support Vector Machine

Long-term Face Tracking in the Wild using Deep Learning

Introduction to computer vision with Convoluted Neural Networks

Future semantic segmentation with convolutional LSTM

slide-171212080528.pptx

Overview Of Video Object Tracking System

Introduction to computer vision

Lecture 3: RNNs - Full Stack Deep Learning - Spring 2021

A Re-evaluation of Pedestrian Detection on Riemannian Manifolds

Real Time Object Dectection using machine learning

IRJET- Real-Time Object Detection using Deep Learning: A Survey

Classification of indoor actions through deep neural networks

物件偵測與辨識技術

Deep Learning And Business Models (VNITC 2015-09-13)

Plus de Universitat Politècnica de Catalunya

This document provides an overview of deep generative learning and summarizes several key generative models including GANs, VAEs, diffusion models, and autoregressive models. It discusses the motivation for generative models and their applications such as image generation, text-to-image synthesis, and enhancing other media like video and speech. Example state-of-the-art models are provided for each application. The document also covers important concepts like the difference between discriminative and generative modeling, sampling techniques, and the training procedures for GANs and VAEs.

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)

Universitat Politècnica de Catalunya

Deep Generative Learning for All

Universitat Politècnica de Catalunya

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...

Universitat Politècnica de Catalunya

Machine translation and computer vision have greatly benefited from the advances in deep learning. A large and diverse amount of textual and visual data have been used to train neural networks whether in a supervised or self-supervised manner. Nevertheless, the convergence of the two fields in sign language translation and production still poses multiple open challenges, like the low video resources, limitations in hand pose estimation, or 3D spatial grounding from poses.

Towards Sign Language Translation & Production | Xavier Giro-i-Nieto

Universitat Politècnica de Catalunya

The Transformer - Xavier Giró - UPC Barcelona 2021

Universitat Politècnica de Catalunya

Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...

Universitat Politècnica de Catalunya

Machine translation and computer vision have greatly benefited of the advances in deep learning. The large and diverse amount of textual and visual data have been used to train neural networks whether in a supervised or self-supervised manner. Nevertheless, the convergence of the two field in sign language translation and production is still poses multiple open challenges, like the low video resources, limitations in hand pose estimation, or 3D spatial grounding from poses. This talk will present these challenges and the How2✌️Sign dataset (https://how2sign.github.io) recorded at CMU in collaboration with UPC, BSC, Gallaudet University and Facebook. https://imatge.upc.edu/web/publications/sign-language-translation-and-production-multimedia-and-multimodal-challenges-all

Open challenges in sign language translation and production

Universitat Politècnica de Catalunya

https://imatge-upc.github.io/synthref/ Integrating computer vision with natural language processing has achieved significant progress over the last years owing to the continuous evolution of deep learning. A novel vision and language task, which is tackled in the present Master thesis is referring video object segmentation, in which a language query defines which instance to segment from a video sequence. One of the biggest challenges for this task is the lack of relatively large annotated datasets since a tremendous amount of time and human effort is required for annotation. Moreover, existing datasets suffer from poor quality annotations in the sense that approximately one out of ten language expressions fails to uniquely describe the target object. The purpose of the present Master thesis is to address these challenges by proposing a novel method for generating synthetic referring expressions for an image (video frame). This method pro- duces synthetic referring expressions by using only the ground-truth annotations of the objects as well as their attributes, which are detected by a state-of-the-art object detection deep neural network. One of the advantages of the proposed method is that its formulation allows its application to any object detection or segmentation dataset. By using the proposed method, the first large-scale dataset with synthetic referring expressions for video object segmentation is created, based on an existing large benchmark dataset for video instance segmentation. A statistical analysis and comparison of the created synthetic dataset with existing ones is also provided in the present Master thesis. The conducted experiments on three different datasets used for referring video object segmentation prove the efficiency of the generated synthetic data. More specifically, the obtained results demonstrate that by pre-training a deep neural network with the proposed synthetic dataset one can improve the ability of the network to generalize across different datasets, without any additional annotation cost. This outcome is even more important taking into account that no additional annotation cost is involved.

Generation of Synthetic Referring Expressions for Object Segmentation in Videos

Universitat Politècnica de Catalunya

Master MATT thesis defense by Juan José Nieto Advised by Víctor Campos and Xavier Giro-i-Nieto. 27th May 2021. Pre-training Reinforcement Learning (RL) agents in a task-agnostic manner has shown promising results. However, previous works still struggle to learn and discover meaningful skills in high-dimensional state-spaces. We approach the problem by leveraging unsupervised skill discovery and self-supervised learning of state representations. In our work, we learn a compact latent representation by making use of variational or contrastive techniques. We demonstrate that both allow learning a set of basic navigation skills by maximizing an information theoretic objective. We assess our method in Minecraft 3D maps with different complexities. Our results show that representations and conditioned policies learned from pixels are enough for toy examples, but do not scale to realistic and complex maps. We also explore alternative rewards and input observations to overcome these limitations. https://imatge.upc.edu/web/publications/discovery-and-learning-navigation-goals-pixels-minecraft

Discovery and Learning of Navigation Goals from Pixels in Minecraft

Universitat Politècnica de Catalunya

Peter Muschick MSc thesis Universitat Pollitecnica de Catalunya, 2020 Sign language recognition and translation has been an active research field in the recent years with most approaches using deep neural networks to extract information from sign language data. This work investigates the mostly disregarded approach of using human keypoint estimation from image and video data with OpenPose in combination with transformer network architecture. Firstly, it was shown that it is possible to recognize individual signs (4.5% word error rate (WER)). Continuous sign language recognition though was more error prone (77.3% WER) and sign language translation was not possible using the proposed methods, which might be due to low accuracy scores of human keypoint estimation by OpenPose and accompanying loss of information or insufficient capacities of the used transformer model. Results may improve with the use of datasets containing higher repetition rates of individual signs or focusing more precisely on keypoint extraction of hands.

Learn2Sign : Sign language recognition and translation using human keypoint e...

Universitat Politècnica de Catalunya

Intepretability / Explainable AI for Deep Neural Networks

Universitat Politècnica de Catalunya

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020

Universitat Politècnica de Catalunya

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...

Universitat Politècnica de Catalunya

Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/dlai-2020/ Deep learning technologies are at the core of the current revolution in artificial intelligence for multimedia data analysis. The convergence of large-scale annotated datasets and affordable GPU hardware has allowed the training of neural networks for data analysis tasks which were previously addressed with hand-crafted features. Architectures such as convolutional neural networks, recurrent neural networks or Q-nets for reinforcement learning have shaped a brand new scenario in signal processing. This course will cover the basic principles of deep learning from both an algorithmic and computational perspectives.

Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...

Universitat Politècnica de Catalunya

https://telecombcn-dl.github.io/drl-2020/ This course presents the principles of reinforcement learning as an artificial intelligence tool based on the interaction of the machine with its environment, with applications to control tasks (eg. robotics, autonomous driving) o decision making (eg. resource optimization in wireless communication networks). It also advances in the development of deep neural networks trained with little or no supervision, both for discriminative and generative tasks, with special attention on multimedia applications (vision, language and speech).

Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020

Universitat Politècnica de Catalunya

Giro-i-Nieto, X. One Perceptron to Rule Them All: Language, Vision, Audio and Speech. In Proceedings of the 2020 International Conference on Multimedia Retrieval (pp. 7-8). Tutorial page: https://imatge.upc.edu/web/publications/one-perceptron-rule-them-all-language-vision-audio-and-speech-tutorial Deep neural networks have boosted the convergence of multimedia data analytics in a unified framework shared by practitioners in natural language, vision and speech. Image captioning, lip reading or video sonorization are some of the first applications of a new and exciting field of research exploiting the generalization properties of deep neural representation. This tutorial will firstly review the basic neural architectures to encode and decode vision, text and audio, to later review the those models that have successfully translated information across modalities.

Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)

Universitat Politècnica de Catalunya

Image segmentation is a classic computer vision task that aims at labeling pixels with semantic classes. These slides provide an overview of the basic approaches applied from the deep learning field to tackle this challenge and presents the basic subtasks (semantic, instance and panoptic segmentation) and related datasets. Presented at the International Summer School on Deep Learning (ISSonDL) 2020 held online and organized by the University of Gdansk (Poland) between the 30th August and 2nd September. http://2020.dl-lab.eu/virtual-summer-school-on-deep-learning/

Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...

Universitat Politècnica de Catalunya

https://imatge-upc.github.io/rvos-mots/ Video object segmentation can be understood as a sequence-to-sequence task that can benefit from the curriculum learning strategies for better and faster training of deep neural networks. This work explores different schedule sampling and frame skipping variations to significantly improve the performance of a recurrent architecture. Our results on the car class of the KITTI-MOTS challenge indicate that, surprisingly, an inverse schedule sampling is a better option than a classic forward one. Also, that a progressive skipping of frames during training is beneficial, but only when training with the ground truth masks instead of the predicted ones.

Curriculum Learning for Recurrent Video Object Segmentation

Universitat Politècnica de Catalunya

Deep neural networks have achieved outstanding results in various applications such as vision, language, audio, speech, or reinforcement learning. These powerful function approximators typically require large amounts of data to be trained, which poses a challenge in the usual case where little labeled data is available. During the last year, multiple solutions have been proposed to leverage this problem, based on the concept of self-supervised learning, which can be understood as a specific case of unsupervised learning. This talk will cover its basic principles and provide examples in the field of multimedia.

Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020

Universitat Politècnica de Catalunya

Plus de Universitat Politècnica de Catalunya (20)

Deep Generative Learning for All - The Gen AI Hype (Spring 2024)

Deep Generative Learning for All

The Transformer in Vision | Xavier Giro | Master in Computer Vision Barcelona...

Towards Sign Language Translation & Production | Xavier Giro-i-Nieto

The Transformer - Xavier Giró - UPC Barcelona 2021

Learning Representations for Sign Language Videos - Xavier Giro - NIST TRECVI...

Open challenges in sign language translation and production

Generation of Synthetic Referring Expressions for Object Segmentation in Videos

Discovery and Learning of Navigation Goals from Pixels in Minecraft

Learn2Sign : Sign language recognition and translation using human keypoint e...

Intepretability / Explainable AI for Deep Neural Networks

Convolutional Neural Networks - Xavier Giro - UPC TelecomBCN Barcelona 2020

Self-Supervised Audio-Visual Learning - Xavier Giro - UPC TelecomBCN Barcelon...

Attention for Deep Learning - Xavier Giro - UPC TelecomBCN Barcelona 2020

Generative Adversarial Networks GAN - Xavier Giro - UPC TelecomBCN Barcelona ...

Q-Learning with a Neural Network - Xavier Giró - UPC Barcelona 2020

Language and Vision with Deep Learning - Xavier Giró - ACM ICMR 2020 (Tutorial)

Image Segmentation with Deep Learning - Xavier Giro & Carles Ventura - ISSonD...

Curriculum Learning for Recurrent Video Object Segmentation

Deep Self-supervised Learning for All - Xavier Giro - X-Europe 2020

Dernier

Building Real-Time Pipelines With FLaNK Timothy Spann, Principal Developer Advocate, Streaming - Cloudera Future of Data meetup, startup grind, AI Camp The combination of Apache Flink, Apache NiFi, and Apache Kafka for building real-time data processing pipelines is extremely powerful, as demonstrated by this case study using the FLaNK-MTA project. The project leverages these technologies to process and analyze real-time data from the New York City Metropolitan Transportation Authority (MTA). FLaNK-MTA demonstrates how to efficiently collect, transform, and analyze high-volume data streams, enabling timely insights and decision-making. Apache NiFi Apache Kafka Apache Flink Apache Iceberg LLM Generative AI Slack Postgresql

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK

Timothy Spann

Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models We are available 24*7 Booking Contact Details :- WhatsApp Chat :- +91-7014168258 If you're looking for India Call girls you've come to the right place. You'll find some of the most beautiful call girls in our location with. These ladies have pleasing personalities, hot figures, and a passion for physical pleasure. Call girls in India Lucknow Many men have booked them for their erotic and soul-mixing performances, which are sure to leave you with unforgettable memories. #K09 Escort Service India is available in the city for men and women of all ages. They can satisfy your sexual needs and will make your experience even more enjoyable and memorable. Whether you're looking for a blow-job, stripping, lovemaking, or other dirty acts, you'll be able to find a match for your tastes and budget. These highly trained professionals will help you have an unforgettable night. One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7014168258 We are available 24*7 all days of the year. Call us — 7014168258 Thank you for Visiting.

Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...

gajnagarg

Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service Available 24/7 Hire Our agency presents a selection of young, charming call girls #J11 available for bookings at Oyo Hotels. Experience high-class escort services at pocket-friendly rates, with our female escorts exuding both beauty and a delightful personality, ready to meet your desires. Whether it's Housewives,#J11 College girls, Russian girls, Muslim girls, or any other preference, we offer a diverse range of options to cater to your tastes. We provide both in-call and out-call services for your convenience. Our in-call location in Kolkata ensures cleanliness, hygiene, and 100% safety, while our out-call services offer doorstep delivery for added ease. We value your time and money, hence we kindly request pic collectors, time-passers, and bargain hunters to refrain from contacting us. Our services feature various packages at competitive rates: One shot: ₹2000/in-call, ₹5000/out-call Two shots with one girl: ₹3500/in-call, ₹6000/out-call Body to body massage with sex: ₹3000/in-call Full night for one person: ₹7000/in-call, ₹10000/out-call Full night for more than 1 person: Contact us at 🔝 8005736733 🔝. for details Operating 24/7, we serve various locations in Kolkata, including Green Park, near metro stations. For premium call girl services in Kolkata 🔝 8005736733 🔝. Thank you for considering us!

Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...

HyderabadDolls

5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed

amy56318795

Klinik_ Apotek Onlin 085657271886 Solusi Menggugurkan Masalah Kehamilan Anda Jual Obat Aborsi Asli KLINIK ABORSI TERPEECAYA _ Jual Obat Aborsi Cytotec Misoprostol Asli 100% Ampuh Hanya 3 Jam Langsung Gugur || OBAT PENGGUGUR KANDUNGAN AMPUH MANJUR OBAT ABORSI OLINE" APOTIK Jual Obat Cytotec, Gastrul, Gynecoside Asli Ampuh. JUAL ” Obat Aborsi Tuntas | Obat Aborsi Manjur | Obat Aborsi Ampuh | Obat Penggugur Janin | Obat Pencegah Kehamilan | Obat Pelancar Haid | Obat terlambat Bulan | Ciri Obat Aborsi Asli | Obat Telat Bulan | Pil Aborsi Asli | Cara Menggugurkan Konten | Cara Aborsi Tuntas | Harga Obat Aborsi Asli | Pil Aborsi | Jual Obat Aborsi Cytotec | Cara Aborsi Sendiri | Cara Aborsi Usia 1 Bulan | Cara Aborsi Usia 2 Tahun | Cara Aborsi Usia 3 Bulan | Obat Aborsi Usia 4 Bulan | Cara Abrasi Usia 5 Bulan | Cara Menggugurkan Konten | Kandungan Obat Penggugur | Cara Menghitung Usia Konten | Cara Mengatasi Terlambat Bulan | Penjual Obat Aborsi Asli | Obat Aborsi Garansi | Kandungan Obat Peluntur | Obat Telat Datang Bulan | Obat Telat Haid | Obat Aborsi Paling Murah | Klinik Jual Obat Aborsi | Jual Pil Cytotec | Apotik Jual Obat Aborsi | Kandungan Dokter Abrasi | Cara Aborsi Cepat | Jual Obat Aborsi Bergaransi | Jual Obat Cytotec Asli | Obat Aborsi Aman Manjur | Obat Misoprostol Cytotec Asli. "APA ITU ABORSI" “Aborsi Adalah dengan membendung hormon yang di perlukan untuk mempertahankan kehamilan yaitu hormon progesteron, karena hormon ini dibendung, maka jalur kehamilan mulai membuka dan leher rahim menjadi melunak,sehingga mengeluarkan darah yang merupakan tanda bahwa obat telah bekerja || maksimal 1 jam obat diminum || PENJELASAN OBAT ABORSI USIA 1 _7 BULAN Pada usia kandungan ini, pasien akan merasakan sakit yang sedikit tidak berlebihan || sekitar 1 jam ||. namun hanya akan terjadi pada saatdarah keluar merupakan pertanda menstruasi. Hal ini dikarenakan pada usiakandungan 3 bulan,janin sudah terbentuk sebesar kepalan tangan orang dewasa. Cara kerja obat aborsi : JUAL OBAT ABORSI AMPUH dosis 3 bulan secara umum sama dengan cara kerja || DOSIS OBAT ABORSI 2 bulan”, hanya berbedanya selain mengisolasijanin juga menghancurkan janin dengan formula methotrexate dikandungdidalamnya. Formula methotrexate ini sangat ampuh untuk menghancurkan janinmenjadi serpihan-serpihan kecil akan sangat berguna pada saat dikeluarkan nanti. APA ALASAN WANITA MELAKUKAN ABORSI? Aborsi di lakukan wanita hamil baik yang sudah menikah maupun belum menikah dengan berbagai alasan , akan tetapi alasan yang utama adalah alasan-alasan non medis (termasuk aborsi sendiri / di sengaja/ buatan] MELAYANI PEMESANAN OBAT ABORSI SETIAP HARI, SIAP KIRIM KESELURUH KOTA BESAR DI INDONESIA DAN LUAR NEGERI. HUBUNGI PEMESANAN LEBIH NYAMAN VIA WA/: 085657271886

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...

ZurliaSoop

Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Service Booking Now open +91- 9332606886 Pune VIP Call Girls is highly qualified professionals. #S10 They will offer only top quality services and never try to scam or drain you of money; in fact, they strive to give complete satisfaction so that each cent spent was worth its worth. Furthermore, these professionals pride themselves in keeping customer details confidential; any information gleaned will never be shared with any third party.$V15 Two shots with one girl: ₹4500/in-call, ₹7500/out-call Body to body massage with : ₹5500/in-call Full night for one person: ₹7000/in-call, ₹12000/out-call •𝐂𝐨𝐥𝐥𝐞𝐠𝐞 𝐆𝐢𝐫𝐥𝐬 •𝐇𝐨𝐮𝐬𝐞𝐰𝐢𝐯𝐞𝐬 •𝐇𝐢𝐠𝐡 𝐏𝐫𝐨𝐟𝐢𝐥𝐞 𝐆𝐢𝐫𝐥𝐬 •𝐑𝐚𝐦𝐩 𝐌𝐨𝐝𝐞𝐥𝐬 •𝐅𝐨𝐫𝐞𝐢𝐠𝐧𝐞𝐫 𝐆𝐢𝐫𝐥𝐬 𝐎𝐮𝐫 𝐞𝐬𝐜𝐨𝐫𝐭 𝐬𝐞𝐫𝐯𝐢𝐜𝐞𝐬 •𝐇𝐉 (𝐇𝐚𝐧𝐝 𝐉𝐨𝐛) •𝐁𝐉 (𝐁𝐥𝐨𝐰𝐣𝐨𝐛) # Completion # (Oral To Completion) # Special Massage # O-Level (Oral ) # Oral With A Noncondom) # COB (Come On Body) # Extraball (Have Many Times) # All Meetings 100%Safe

Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...

kumargunjan9515

Statistics notes ,it includes mean to index numbers

suginr1

Gartner's Data Analytics Maturity Model.pptx

chadhar227

Computer science Sql cheat sheet.pdf.pdf

SayantanBiswas37

Ranking and Scoring Exercises for Research

Rajesh Mondal

Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now Booking Contact Details :- WhatsApp Chat :- +91-7737669865 We offer all types of girls of your choice with space. Our escorts are fully cooperative and understand your needs. All types of call girls like Housewives, College girls,#K09 Russian girls, Muslim girls, Afghani girls, Bengali girls, Working girls, south Indian girls, Punjabi girls, etc. In-Call: — You Can Reach At Our Place in Bangalore Our place Which Is Very Clean Hygienic 100% safe Accommodation. Out-Call: — Service for Out Call You have To Come Pick The Girl From My Place We Also Provide Door-Step Services Hygienic: — Full Ac Neat And Clean Rooms Available In Hotel 24 * 7 Hrs In Bangalore Our Services and Rates: – One Shot — 2500/in call (time ½ hour), 5000/out call Two shot with one girl — 5000/in call (time 1 hour), 6000/out call Body to body massage with sex- 3000/in call (time 1 hour) full night for one person– 8000/in call, 10000/out call (shot limit 4 shot) We are available 24*7 all days of the year

Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now

gargpaaro

Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & Dating Escorts Service CALL GIRL IN Lucknow 9548273370 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL IN #j11 We are Providing :- ● – Private independent collage Going girls . ● – independent Models . ● – House Wife’s . ● – Private Independent House Wife’s ● – Corporate M.N.C Working Profiles . ● – Call Center Girls . ● – Live Band Girls . ●- Foreigners & Many More . Service type: 1.In call 2.out call 3. full Lip to Lip kiss 4.69 5.b-job without Condom 6. Hard Core sex & Much More. 7 Body to Body Touch 8 Kissing 9 Sucking Boobs and More 10 Enjoy by Hand 11 Relax By Oral 12 Sex with Happy Ending • In Call and Out Call Service • 3* 5* 7* Hotels Service • 24 Hours Available • Indian, Russian, Punjabi, Kashmiri Escorts • Real Models, College Girls, House Wife, Also Available • Short Time and Full Time Service Available • Hygienic Full AC Neat and Clean Rooms Avail. In Hotel 24 hours • Daily Escorts Staff Available • Minimum to Maximu m Range Available.c

Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...

HyderabadDolls

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Models We are available 24*7 Booking Contact Details :- WhatsApp Chat :- +91-7014168258 If you're looking for India Call girls you've come to the right place. You'll find some of the most beautiful call girls in our location with. These ladies have pleasing personalities, hot figures, and a passion for physical pleasure. Call girls in India Lucknow Many men have booked them for their erotic and soul-mixing performances, which are sure to leave you with unforgettable memories. #K09 Escort Service India is available in the city for men and women of all ages. They can satisfy your sexual needs and will make your experience even more enjoyable and memorable. Whether you're looking for a blow-job, stripping, lovemaking, or other dirty acts, you'll be able to find a match for your tastes and budget. These highly trained professionals will help you have an unforgettable night. One Shot — 5000/in call (time 1 hour), 6000/out call Two shot with one girl — 8000/in call (time 2 hour), 10000/out call Body to body massage with sex- 8000/in call (time 1 hour) Full night Service for one person– 12000/in call, 13000/out call (shot limit 3-4 shots) Full night Service for more than 1 person — please contact Us —7014168258 We are available 24*7 all days of the year. Call us — 7014168258 Thank you for Visiting.

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...

gajnagarg

+97470301568 Qatar THC Oil and weed in Qatar? Where can I get THC vape in Doha Qatar?WhatsApp +97470301568 Buy Weed, Cocaine, Heroin and Shrooms in France,Germany, Poland Serbia,Romania, Ukraine WhatsApp +97470301568 Buy Weed, Cocaine, Heroin and Shrooms in Dubai UAE Malaysia Oman Kuwait Bahrain Saudi Arabia Qatar Where can I get weed in Qatar? Where can I get THC vape in Doha Qatar?WhatsApp +97470301568 Buy Weed, Cocaine, Heroin and Shrooms in France,Germany, Poland Serbia,Romania, UkraineWhatsApp +97470301568 Buy Weed, Cocaine, Heroin and Shrooms in Dubai UAE Malaysia Oman Kuwait Bahrain Saudi Arabia Qatar WhatsApp+97470301568 WhatsApp +97470301568 Buy Weed, Cocaine, Heroin and Shrooms in Qatar Dubai UAE Malaysia Oman Kuwait Bahrain Saudi Arabia #Singapore #Jordan #Ireland, #Belgium, #United Kingdom, #Iceland, #*Portugal, Spain, China, Japan, Turkey, Canada United States, Morocco, France,Germany, Poland Serbia,Romania, Ukraine, and all countries United Arab Emirates . Our team has succesfully delivered in 26 different countries . All marijuana and Cocaine is double vacuum packed before shipping, making it completely odorless to ensure that it arrives safely to your door. Our distribution crew is expert at making packages that blend in with the rest of the mail. We have also put into place many other security measures to ensure the security of our customers. buy weed Dubai +97470301568buy Weed Qatar #Buy Weed Kuwait #Buy Weed Bahrain #Buy #Weed #Oman #Buy Weed UAE #Buy Weed Abu Dhabi @Buy Weed Doha Qatar #@Buy Weed Ajman #@Buy Weed Online #@Buy Weed UK #@Buy Weed Iceland #*@Buy Weed All Countries Below are the various strains of kush available ; buy hash and Weed in dubai,abu dhabi,sharjah where to buy weed in doha,where can i find weed in jeddah,Can I get weed delivered to Riyadh?,Buy weed Online Jeddah Saudi Arabia,Buy Weed and THC Cannabis Oil online ,QATAR , DOHA buy kush in DOHA , buy kush in DOHAWeed in QATAR # DOHA Buy Weed and THC Cannabis Oil online who delivers at your own location in Qatar Doha ,Kuwait ,Dubai including cannabis / weed,Where can I find weed in Dubai as a tourist?,Is marijuana allowed in Dubai ? How much is medical marijuana in Dubai ?Is weed legal in Dubai ?How to get marijuana in Saudi Arabia Do people in Saudi Arabia smoke weed ? Is Hash legal in Saudi Arabia ? Where is marijuana the most illegal?Can you get weed in Baku? Dubai, United Arab Emirates Canabis smokers - Dubai, Buy Marijuana Products Online in UAE Desertcart ships the Marijuana products in Dubai ,Abu Dhabi, Sharjah, Al Ain, Ajman and more cities in UAE. Get unlimited free shipping in 164+ countries Buy Weed Products Online in Saudi Arabia Order thc Weed in Saudi Arabia Order thc Weed in Saudi Arabia Order thc Weed in Saudi Arabia Order thc Weed in Saudi Arabia Buy Marijuana Saudi Arabia weed in jeddahis cbd legal in saudi arabia cali tins smoke in saudi arabia drug use saudi arabia Buy weed marijuana. White Widow OG #Kush Sensi Star x ak 47 Afghan Ku

+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...

Health

Digital Transformation Playbook by Graham Ware

Graham Ware

Lecture_2_Deep_Learning_Overview-newone1

ranjankumarbehera14

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C 9548273370 Escort Service CALL GIRL IN Lucknow 9548273370 ❤CALL GIRLS IN ESCORT SERVICE❤CALL GIRL IN #j11 We are Providing :- ● – Private independent collage Going girls . ● – independent Models . ● – House Wife’s . ● – Private Independent House Wife’s ● – Corporate M.N.C Working Profiles . ● – Call Center Girls . ● – Live Band Girls . ●- Foreigners & Many More . Service type: 1.In call 2.out call 3. full Lip to Lip kiss 4.69 5.b-job without Condom 6. Hard Core sex & Much More. 7 Body to Body Touch 8 Kissing 9 Sucking Boobs and More 10 Enjoy by Hand 11 Relax By Oral 12 Sex with Happy Ending • In Call and Out Call Service • 3* 5* 7* Hotels Service • 24 Hours Available • Indian, Russian, Punjabi, Kashmiri Escorts • Real Models, College Girls, House Wife, Also Available • Short Time and Full Time Service Available • Hygienic Full AC Neat and Clean Rooms Avail. In Hotel 24 hours • Daily Escorts Staff Available • Minimum to Maximu m Range Available.c

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...

HyderabadDolls

Kings of Saudi Arabia, information about them

eitharjee

Discover how TrafficWave Generator can drive targeted, engaging visitors to your online assets and propel your business to new heights. Unlock the power of free, set-and-forget traffic today. Featuring a user-friendly interface, set-and-forget functionality, and the ability to monetize the traffic with your affiliate links or product offers, TrafficWave Generator is a game-changer for online entrepreneurs, bloggers, and e-commerce store owners alike. Backed by a 30-day money-back guarantee and comprehensive bonus offerings, this software has the potential to transform your traffic-generation and content-creation efforts, unlocking new levels of success in the digital marketing landscape.

TrafficWave Generator Will Instantly drive targeted and engaging traffic back...

SOFTTECHHUB

Aspirational Block Program Block Syaldey District - Almora

GovindSinghDasila

Dernier (20)

DATA SUMMIT 24 Building Real-Time Pipelines With FLaNK

Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...

Jodhpur Park | Call Girls in Kolkata Phone No 8005736733 Elite Escort Service...

5CL-ADBA,5cladba, Chinese supplier, safety is guaranteed

Jual Obat Aborsi Surabaya ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...

Top Call Girls in Balaghat 9332606886Call Girls Advance Cash On Delivery Ser...

Statistics notes ,it includes mean to index numbers

Gartner's Data Analytics Maturity Model.pptx

Computer science Sql cheat sheet.pdf.pdf

Ranking and Scoring Exercises for Research

Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now

Gomti Nagar & best call girls in Lucknow | 9548273370 Independent Escorts & D...

Top profile Call Girls In Chandrapur [ 7014168258 ] Call Me For Genuine Model...

+97470301568>>weed for sale in qatar ,weed for sale in dubai,weed for sale in...

Digital Transformation Playbook by Graham Ware

Lecture_2_Deep_Learning_Overview-newone1

Charbagh + Female Escorts Service in Lucknow | Starting ₹,5K To @25k with A/C...

Kings of Saudi Arabia, information about them

TrafficWave Generator Will Instantly drive targeted and engaging traffic back...

Aspirational Block Program Block Syaldey District - Almora

Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018

1. Prof. Laura Leal-Taixé Multiple object tracking

2. Dynamic Scene Understanding Understand every pixel of a video road tree Semantic segmentation person car

3. Dynamic Scene Understanding Understand every pixel of a video person 1 tree Semantic segmentation person 2 person 3 Instance- based segmentation road car

4. Dynamic Scene Understanding Understand every pixel of a video Semantic segmentation Instance- based segmentation Multiple object tracking

5. Multiple object tracking Goal: detect and track all objects in a scene

6. Person detection Tracking-by-detection Video sequence

7. Person detection Tracking-by-detection Video sequence

8. Tracking with network flows Video sequence Graphical model Node

9. Tracking with network flows L. Leal-Taixé et al. ICCVW2011 Solving the Minimum Cost Flow Problem

10. Tracking with Linear Programming • Objective function T = arg min T i,j C(i, j)f(i, j) T = arg min T i,j C(i, j)f(i, j) L. Leal-Taixé et al. ICCVW2011 Solving the Minimum Cost Flow Problem

11. • Objective function Tracking with Linear Programming T = arg min T i,j C(i, j)f(i, j) Indicator {0,1} Costs – what will drive the tracking C C

12. Measuring bounding box similarity Classic measures • Bounding box distance • Appearance similarity Proposed measures Modeling pedestrian motion and interactions

13. Modeling pedestrian interaction Using a physics- based model of crowd motion Learning an image-based motion context Learning appearance and interactions with Deep Learning

14. Modeling pedestrian interaction Using a physics- based model of crowd motion Learning an image-based motion context Learning appearance and interactions with Deep Learning

15. Modeling pedestrian interaction Using a physics- based model of crowd motion Learning an image-based motion context Learning appearance and interactions with Deep Learning

16. The effect of undetected pedestrians Image Optical flow

17. Learning an image-based motion context Image Features Estimation of the velocity L. Leal-Taixé, M. Fenzi, A. Kuznetsova, B. Rosenhahn, S. Savarese. CVPR 2014. Hough forests

18. Pedestrian velocity estimation Histogram of the errors Optical Flow Social Force Model Proposed

19. Modeling pedestrian interaction Using a physics- based model of crowd motion Learning an image-based motion context Learning appearance and interactions with Deep Learning

20. Learning appearance and interactions Linear Programming Predicted associations t t+1Siamese neural network L. Leal-Taixé, C. Canton, K. Schindler. CVPRW 2016

21. Design Stacking 10x121x53I2 O2 Localspatio-temporal features Final Matching Prediction Gradient Boosting Classifier Contextual features Height, width, distance, … C1 C2 C3 F4 F5 F6 F7

22. Improvement directions • Better appearance models – Ristani & Tomasi. Features for Multi-Target Multi-Camera Tracking and Re-Identification. CVPR 2018 • Multi-detector fusion – Henschel et al. Fusion of Head and Full-Body Detectors for Multi- Object Tracking. CVPRW 2018.

23. Can we learn it all? • Problem 1: dimensionality of the output – Neural networks can handle fixed sized outputs (tensors) – Tracking: • Unknown number of detections per frame • Unknown length of each trajectory • Solution 1: see the Set Learning lecture • Solution 2: recursive

24. Can we learn it all? • Solution 2: recursive – Using a Recurrent Neural Network to predict trajectory properties – Birth/death, transition à motion model • Does not work well…. • Is it likely that more data is needed to train such a motion model? Milan et al. Online Multi-Target Tracking Using Recurrent Neural Networks. AAAI 2017

25. Can we learn it all? • Our Solution 2: recursive – One trajectory at a time, one frame at a time

26. Can we learn it all? Girshick Fast R-CNN. ICCV 2015

27. Can we learn it all? • We use the detections from the last frame as proposals • Where did the detection with ID 1 go in the next frame? P. Bergmann. Master Thesis. TUM 2018

28. Pros and cons • PRO We can reuse an extremely well-trained regressor – We get well-positioned bounding boxes • CON The regressor only shifts the box by a small quantity – We need to compensate for large camera motions à optical flow P. Bergmann. Master Thesis. TUM 2018

29. Surprisingly good results • Extremely good MOTA improvement

30. Surprisingly good results • Quite some ID switches but still good identification overall

31. Conclusions • The old assumption that objects have small displacements still holds and can be used to reach SOA performance • What is next? Hard cases. – Long occlusions – Crowded scenes

32. Prof. Laura Leal-Taixé Questions? Prof. Laura Leal-Taixé

Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018

Similaire à Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018 (20)

Plus de Universitat Politècnica de Catalunya

Plus de Universitat Politècnica de Catalunya (20)

Dernier

Dernier (20)

Multiple Object Tracking - Laura Leal-Taixe - UPC Barcelona 2018