1
STEP: Spatio-Temporal Progressive Learning
for Video Action Detection
Xitong Yang¹,² Xiaodong Yang² Ming-Yu Liu²
Fanyi Xiao²,³ Larry Davis¹ Jan Kautz²
¹University of Maryland, College Park ²NVIDIA ³University of California, Davis
2
About Me (Xitong Yang, 杨希桐)
► Education
► 2016 – Present: Ph.D., University of Maryland, College Park; Prof. Larry Davis
► 2014 – 2016: M.S., University of Rochester; Prof. Jiebo Luo
► 2010 – 2014: B.E., Beijing Institute of Technology
► Internship
► 2018, 2019: NVIDIA; Xiaodong Yang, Ming-Yu Liu, Sifei Liu, Jan Kautz
► 2017: Honda Research Institute; Yi-Ting Chen, Teruhisa Misu
► 2016: PARC East; Sriganesh Madhvanath, Raja Bala
► Research Interest
► Computer vision, video understanding
3
Spatio-temporal Action Detection
[Figure: video frames along the time axis; action label: LongJump]
4
Object Detection
► Two-stage methods
► Fast / Faster R-CNN
► One-stage methods
► SSD
[Figures: Faster R-CNN (Ren et al., NeurIPS 2015); SSD (Liu et al., ECCV 2016)]
5
Object Detection Pipeline
source: https://www.saagie.com/fr/blog/object-detection-part1
► Proposals / Anchors
► Classification: object recognition
► Regression: bounding box refinement
► Post-processing
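A minimal sketch of the post-processing stage, assuming it is greedy non-maximum suppression (NMS) as in common detectors; the slide does not name the method. The names `iou` and `nms` and the 0.5 threshold are illustrative.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_thresh: float = 0.5) -> List[int]:
    """Keep the highest-scoring boxes, dropping any box that overlaps a kept one."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_thresh for j in keep):
            keep.append(i)
    return keep


boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # the second box overlaps the first and is suppressed
```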
6
From Object Detection to Action Detection
► Use optical flow as additional input
► From frame-level prediction to clip-level prediction
► Process long sequences (use 3D CNNs)
► Replicate 2D proposals over time to obtain 3D proposals
[Figures: Two-stream R-CNN (Peng et al., ECCV 2016); Kalogeiton et al., ICCV 2017; I3D + Faster R-CNN (Girdhar et al., 2018)]
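To illustrate the last bullet: copying a 2D proposal unchanged across K frames gives a 3D proposal. The sketch below uses hypothetical helper names and a toy actor that moves 2 pixels per frame (an assumed value) to show how the overlap between the replicated proposal and the actor decays over the clip.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def replicate_over_time(box: Box, num_frames: int) -> List[Box]:
    """Build a 3D proposal by repeating one 2D box on every frame."""
    return [box] * num_frames


def shift(box: Box, dx: float) -> Box:
    """Translate a box horizontally by dx pixels."""
    return (box[0] + dx, box[1], box[2] + dx, box[3])


proposal = (0.0, 0.0, 10.0, 20.0)
tube = replicate_over_time(proposal, 6)
# Toy ground truth: the actor drifts right by 2 px per frame (assumed).
actor_track = [shift(proposal, 2.0 * t) for t in range(6)]
per_frame_iou = [iou(p, g) for p, g in zip(tube, actor_track)]
```

The per-frame IoU starts at 1.0 and drops to 0.0 by the last frame, which is the displacement problem that the Challenges slide raises for long clips.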
8
Challenges
► Extended two-stage methods
✕ Effective temporal modeling
► Spatial displacement over time
✕ Efficient detection
► Thousands of proposals
► Processing long sequences
11
Spatio-TEmporal Progressive Learning
(STEP)
12
What is STEP
► Goals of STEP
✓ Effective temporal modeling
► Adapt to spatial displacement
✓ Efficient detection
► Use a small number of proposals
13
What is STEP
► STEP = progressive learning + spatial refinement + temporal extension
[Figure: Initial Proposal, Refined Tubelet, Extended Tubelet; axes: Step, Time; highlighted in turn: progressive learning, spatial refinement, temporal extension]
16
Our Approach: STEP
[Animation over slides 16-31, along the time axis: s=1: anchors; s=1: temporal extension; s=1: spatial refinement; s=2: temporal extension; s=2: spatial refinement; s=3: temporal extension; s=3: spatial refinement]
32
Our Approach: STEP
► STEP
✓ Effective temporal modeling
► Adaptive temporal extension
✓ Efficient detection
► Use only 11 proposals on UCF101-24 and 34 on AVA
► Progressively increase the sequence length
✓ Generic learning framework for video understanding
► Instantiate with different backbones / refinement schedule
[Figure: Initial Proposal, Refined Tubelet, Extended Tubelet; axes: Step, Time]
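The progressive loop can be sketched as below. This is not the released implementation: it assumes a refinement at every step with an extension between consecutive steps (three refinements and two extensions, matching the S1-S3 and T1-T2 labels on the Model Details diagram). `toy_refine` and `toy_extend` are hypothetical stand-ins for the learned modules, and the padding of 6 frames per side is chosen only to reproduce the K = 6 → 18 → 30 schedule quoted for UCF101-24.

```python
from typing import Callable, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Tubelet = List[Box]                      # one box per frame


def progressive_detect(
    initial: List[Tubelet],
    refine: Callable[[Tubelet, int], Tubelet],
    extend: Callable[[Tubelet], Tubelet],
    num_steps: int = 3,
) -> List[Tubelet]:
    """Alternate spatial refinement and temporal extension over num_steps steps."""
    tubelets = list(initial)
    for step in range(1, num_steps + 1):
        tubelets = [refine(t, step) for t in tubelets]
        if step < num_steps:  # no extension after the final refinement
            tubelets = [extend(t) for t in tubelets]
    return tubelets


def toy_refine(tubelet: Tubelet, step: int) -> Tubelet:
    """Stand-in for the learned regressor: returns the boxes unchanged."""
    return list(tubelet)


def toy_extend(tubelet: Tubelet, pad: int = 6) -> Tubelet:
    """Stand-in for temporal extension: repeats the end boxes on each side."""
    return [tubelet[0]] * pad + list(tubelet) + [tubelet[-1]] * pad


anchors = [[(0.0, 0.0, 10.0, 20.0)] * 6]  # one initial proposal over K = 6 frames
result = progressive_detect(anchors, toy_refine, toy_extend)
```

Swapping in a different `refine`, `extend`, or `num_steps` corresponds to the "different backbones / refinement schedule" bullet above.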
33
Related Work: Iterative Methods in Vision
► Iterative pose estimation (Carreira et al., CVPR 2016)
► Object detection: Grid-CNN (Najibi et al., CVPR 2016)
► Recurrent image generation: DRAW (Gregor et al., ICML 2015)
► Object detection: Cascade R-CNN (Cai et al., CVPR 2018)
34
Model Details
► Spatial refinement
► Two branches for classification & regression
[Diagram labels: Convolutional Features, Proposals, RoI Pool, Temporal Modeling, Global Branch, Local Branch, Classification, Regression]
Action detection = Classification + Regression
Classification:
• Temporal information
• Context
• Interaction
• …
Regression:
• Precise localization
• Bounding box of the actor
• …
35
Model Details
► Temporal extension
► Linear extrapolation / location anticipation
[Figure: symbol labels garbled in extraction]
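A minimal sketch of temporal extension by linear extrapolation. The slide does not give the formula, so this assumes a constant per-frame velocity estimated from the first and last boxes of the current tubelet; the function name and the choice of 6 new frames per side are illustrative.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def extrapolate_tubelet(tubelet: List[Box], num_new: int) -> List[Box]:
    """Extend a tubelet by num_new frames on each side, assuming linear motion."""
    k = len(tubelet)
    if k < 2:
        raise ValueError("need at least two frames to estimate motion")
    first, last = tubelet[0], tubelet[-1]
    # Per-frame change of each coordinate, estimated from the two end boxes.
    velocity = tuple((l - f) / (k - 1) for f, l in zip(first, last))
    before = [
        tuple(f - v * i for f, v in zip(first, velocity))
        for i in range(num_new, 0, -1)
    ]
    after = [
        tuple(l + v * i for l, v in zip(last, velocity))
        for i in range(1, num_new + 1)
    ]
    return before + list(tubelet) + after


# Toy tubelet: a 10x20 box moving right by 1 px per frame over 6 frames.
tubelet = [(float(t), 0.0, float(t) + 10.0, 20.0) for t in range(6)]
extended = extrapolate_tubelet(tubelet, 6)
```

Location anticipation, the alternative named on the slide, would replace the fixed velocity rule with a learned prediction and is not sketched here.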
36
Model Details
► Progressive learning
► Joint training
[Diagram, built up over three slides; labels: Backbone, Proposals, L0, RoI Pool, S1, P1, L1, T1, S2, P2, L2, T2, S3, P3, L3, Classification, Regression; axis: Time]
39
Model Details
► The problem of distribution shift over different steps
► Our training strategies
► Increasing IoU thresholds for 3 steps (0.2 → 0.35 → 0.5)
► Separate header networks for different steps
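A sketch of how increasing IoU thresholds could be applied when labeling training samples at each step. It simplifies a tubelet to a single box, and the function and constant names are hypothetical; only the three threshold values come from the slide.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

STEP_IOU_THRESHOLDS = (0.2, 0.35, 0.5)  # steps 1, 2, 3


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_positives(proposals: List[Box], gt_box: Box, step: int) -> List[bool]:
    """Mark a proposal positive if its IoU with the ground truth meets the step's threshold."""
    thresh = STEP_IOU_THRESHOLDS[step - 1]
    return [iou(p, gt_box) >= thresh for p in proposals]


gt = (0.0, 0.0, 10.0, 10.0)
proposals = [
    (0.0, 0.0, 10.0, 10.0),  # IoU 1.0
    (4.0, 0.0, 14.0, 10.0),  # IoU 60/140, about 0.43
    (5.0, 0.0, 15.0, 10.0),  # IoU 50/150, about 0.33
    (7.0, 0.0, 17.0, 10.0),  # IoU 30/170, about 0.18
]
labels_per_step = [label_positives(proposals, gt, s) for s in (1, 2, 3)]
```

Looser thresholds at early steps keep coarse proposals as positives, while later steps only accept proposals that are already well localized.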
40
Experiments
41
Experiment Setup
► Dataset
► UCF101-24
► A subset of the UCF-101 dataset: videos from 24 action classes with corresponding bounding box annotations.
► AVA
► Complex actions (60 classes) and scenes sourced from movies.
Annotations are provided at 1-second intervals.
► Evaluation
► Frame-mAP at IoU=0.5
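Frame-mAP is commonly computed as the mean over classes of a per-class average precision on frame-level detections. A minimal sketch for one class follows, assuming greedy matching in descending score order and all-point interpolation of the precision-recall curve; the official evaluation protocol of each benchmark may differ, and all names are illustrative.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[int, float, Box]       # (frame_id, score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def frame_ap(dets: List[Detection], gts: Dict[int, List[Box]], thresh: float = 0.5) -> float:
    """Average precision for one class over frame-level detections."""
    num_gt = sum(len(b) for b in gts.values())
    if num_gt == 0:
        return 0.0
    matched = {f: [False] * len(b) for f, b in gts.items()}
    precisions: List[float] = []
    recalls: List[float] = []
    tp = 0
    ranked = sorted(dets, key=lambda d: d[1], reverse=True)
    for rank, (frame_id, _score, box) in enumerate(ranked, start=1):
        best_iou, best_j = 0.0, -1
        for j, g in enumerate(gts.get(frame_id, [])):
            overlap = iou(box, g)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        # A detection is a true positive only for a not-yet-matched ground truth.
        if best_iou >= thresh and not matched[frame_id][best_j]:
            matched[frame_id][best_j] = True
            tp += 1
        precisions.append(tp / rank)
        recalls.append(tp / num_gt)
    # Make precision non-increasing in recall, then integrate.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


gts = {0: [(0.0, 0.0, 10.0, 10.0)], 1: [(0.0, 0.0, 10.0, 10.0)]}
dets = [
    (0, 0.9, (0.0, 0.0, 10.0, 10.0)),    # true positive
    (1, 0.8, (50.0, 50.0, 60.0, 60.0)),  # false positive
    (1, 0.7, (1.0, 0.0, 11.0, 10.0)),    # true positive (IoU 90/110)
]
ap = frame_ap(dets, gts)
```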
42
Qualitative Results: Progressive Learning
[Figures: qualitative examples on UCF101-24 and AVA]
43
Qualitative Results: Progressive Learning
[Figure: qualitative examples across steps]
44
Ablation Study
Panels: Spatial Refinement, Temporal Extension, Number of Proposals
► Improvement obtained by more steps
► Performance saturates after 3 steps
► Improvement obtained by more proposals
► More inference time
► Achieve SOTA using only 11 proposals
[Plot: frame-mAP (%) and seconds per batch vs. number of initial proposals (11, 34, 83, 132); reference line: ACT]
► Long-range temporal context benefits action classification
► Adaptive temporal extension is more effective (and more efficient)
[Plot: frame-mAP vs. step (1, 2, 3), with K = 6 → 18 → 30]
Legend: w/o temporal extension (K = 6); w/o temporal extension (K = 30); w/ temporal extrapolation; w/ temporal anticipation
Data labels at steps 1 / 2 / 3, in order of appearance: 51.5 / 60.7 / 62.6; 53.1 / 61.8 / 63.4; 51.5 / 62.8 / 65.5; 51.5 / 62.5 / 66.7
51
Comparison with SOTA
► UCF101-24
► VGG16 backbone
► Two-stream fusion
► K = 6 → 18 → 30
► AVA (v2.1)
► I3D backbone
► K = 12 → 12 → 36
* RGB + Flow
(Updated result on arxiv: 20.2%)
52
Qualitative Results: UCF101-24
53
Qualitative Results: AVA
54
Conclusion
► Spatio-TEmporal Progressive learning for action detection
► A novel framework for effective temporal modeling on long sequences
► A simple, fully end-to-end action detector (without external human detectors)
► Code: https://github.com/NVlabs/STEP
55
Thanks!
Q & A

Contenu connexe

Tendances

Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...
Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...
Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...Hirokatsu Kataoka
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Wanjin Yu
 
People counting in low density video sequences2
People counting in low density video sequences2People counting in low density video sequences2
People counting in low density video sequences2Ahmed Tememe
 
PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...
PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...
PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...광희 이
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningJui-Hsin (Larry) Lai
 
AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...
AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...
AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...Jui-Hsin (Larry) Lai
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Dongmin Choi
 
보다 유연한 이미지 변환을 하려면?
보다 유연한 이미지 변환을 하려면?보다 유연한 이미지 변환을 하려면?
보다 유연한 이미지 변환을 하려면?광희 이
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in ImagesAnil Kumar Gupta
 
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Dongmin Choi
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: reviewDmytro Mishkin
 
Action Recognitionの歴史と最新動向
Action Recognitionの歴史と最新動向Action Recognitionの歴史と最新動向
Action Recognitionの歴史と最新動向Ohnishi Katsunori
 
Modeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksModeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksNAVER Engineering
 
Real time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimationReal time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimationomid Asudeh
 
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...Nishanth Koganti
 
Dynamic Two-Stage Image Retrieval from Large Multimodal Databases
Dynamic Two-Stage Image Retrieval from Large Multimodal DatabasesDynamic Two-Stage Image Retrieval from Large Multimodal Databases
Dynamic Two-Stage Image Retrieval from Large Multimodal DatabasesKonstantinos Zagoris
 
Unsupervised image to-image translation via pre-trained style gan2 network
Unsupervised image to-image translation via pre-trained style gan2 networkUnsupervised image to-image translation via pre-trained style gan2 network
Unsupervised image to-image translation via pre-trained style gan2 network광희 이
 
A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...
A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...
A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...IRJET Journal
 
Semantic Filtering (An Image Processing Method)
Semantic Filtering (An Image Processing Method)Semantic Filtering (An Image Processing Method)
Semantic Filtering (An Image Processing Method)Seval Çapraz
 

Tendances (20)

Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...
Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...
Extended Co-occurrence HOG with Dense Trajectories for Fine-grained Activity ...
 
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
Human Behavior Understanding: From Human-Oriented Analysis to Action Recognit...
 
People counting in low density video sequences2
People counting in low density video sequences2People counting in low density video sequences2
People counting in low density video sequences2
 
PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...
PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...
PR100: SeedNet: Automatic Seed Generation with Deep Reinforcement Learning fo...
 
Object Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online LearningObject Tracking with Instance Matching and Online Learning
Object Tracking with Instance Matching and Online Learning
 
AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...
AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...
AI+ Remote Sensing: Applying Deep Learning to Image Enhancement, Analytics, a...
 
20211118 AI+ Remote Sensing
20211118 AI+ Remote Sensing20211118 AI+ Remote Sensing
20211118 AI+ Remote Sensing
 
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
Review : Multi-Domain Image Completion for Random Missing Input Data [cdm]
 
보다 유연한 이미지 변환을 하려면?
보다 유연한 이미지 변환을 하려면?보다 유연한 이미지 변환을 하려면?
보다 유연한 이미지 변환을 하려면?
 
Obscenity Detection in Images
Obscenity Detection in ImagesObscenity Detection in Images
Obscenity Detection in Images
 
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
 
Visual Object Tracking: review
Visual Object Tracking: reviewVisual Object Tracking: review
Visual Object Tracking: review
 
Action Recognitionの歴史と最新動向
Action Recognitionの歴史と最新動向Action Recognitionの歴史と最新動向
Action Recognitionの歴史と最新動向
 
Modeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networksModeling perceptual similarity and shift invariance in deep networks
Modeling perceptual similarity and shift invariance in deep networks
 
Real time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimationReal time pedestrian detection, tracking, and distance estimation
Real time pedestrian detection, tracking, and distance estimation
 
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
Bayesian Nonparametric Motor-skill Representations for Efficient Learning of ...
 
Dynamic Two-Stage Image Retrieval from Large Multimodal Databases
Dynamic Two-Stage Image Retrieval from Large Multimodal DatabasesDynamic Two-Stage Image Retrieval from Large Multimodal Databases
Dynamic Two-Stage Image Retrieval from Large Multimodal Databases
 
Unsupervised image to-image translation via pre-trained style gan2 network
Unsupervised image to-image translation via pre-trained style gan2 networkUnsupervised image to-image translation via pre-trained style gan2 network
Unsupervised image to-image translation via pre-trained style gan2 network
 
A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...
A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...
A Survey on Deblur The License Plate Image from Fast Moving Vehicles Using Sp...
 
Semantic Filtering (An Image Processing Method)
Semantic Filtering (An Image Processing Method)Semantic Filtering (An Image Processing Method)
Semantic Filtering (An Image Processing Method)
 

Similaire à Step zhedong

Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)Saimunur Rahman
 
Object Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IObject Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IWanjin Yu
 
VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' BacktrackingVL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' BacktrackingYoungSeok Yoon
 
fully convolutional networks for semantic segmentation
fully convolutional networks for semantic segmentationfully convolutional networks for semantic segmentation
fully convolutional networks for semantic segmentationXinyangLi16
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...MLconf
 
Is424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesIs424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesJing WANG
 
Gunjan insight student conference v2
Gunjan insight student conference v2Gunjan insight student conference v2
Gunjan insight student conference v2Gunjan Kumar
 
Final_Talk_Tool_Team
Final_Talk_Tool_TeamFinal_Talk_Tool_Team
Final_Talk_Tool_TeamMehdi Lamee
 
TRECVID 2016 : Concept Localization
TRECVID 2016 : Concept LocalizationTRECVID 2016 : Concept Localization
TRECVID 2016 : Concept LocalizationGeorge Awad
 
FAST Approaches to Scalable Similarity-based Test Case Prioritization
FAST Approaches to Scalable Similarity-based Test Case PrioritizationFAST Approaches to Scalable Similarity-based Test Case Prioritization
FAST Approaches to Scalable Similarity-based Test Case Prioritizationbrenoafmiranda
 
Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...
Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...
Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...Alex Liberzon
 
Video Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetVideo Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetGiorgio Carbone
 

Similaire à Step zhedong (20)

Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
Reading group - Week 2 - Trajectory Pooled Deep-Convolutional Descriptors (TDD)
 
Deep Learning for Computer Vision: Object Detection (UPC 2016)
Deep Learning for Computer Vision: Object Detection (UPC 2016)Deep Learning for Computer Vision: Object Detection (UPC 2016)
Deep Learning for Computer Vision: Object Detection (UPC 2016)
 
Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018Object Detection - Míriam Bellver - UPC Barcelona 2018
Object Detection - Míriam Bellver - UPC Barcelona 2018
 
Object Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet IObject Detection Beyond Mask R-CNN and RetinaNet I
Object Detection Beyond Mask R-CNN and RetinaNet I
 
VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' BacktrackingVL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
 
D3L4-objects.pdf
D3L4-objects.pdfD3L4-objects.pdf
D3L4-objects.pdf
 
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
Object Detection (D2L5 Insight@DCU Machine Learning Workshop 2017)
 
Thesis presentation
Thesis presentationThesis presentation
Thesis presentation
 
Sequential Query Expansion using Concept Graph
Sequential Query Expansion using Concept GraphSequential Query Expansion using Concept Graph
Sequential Query Expansion using Concept Graph
 
fully convolutional networks for semantic segmentation
fully convolutional networks for semantic segmentationfully convolutional networks for semantic segmentation
fully convolutional networks for semantic segmentation
 
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
Hanjun Dai, PhD Student, School of Computational Science and Engineering, Geo...
 
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
Content-based Image Retrieval - Eva Mohedano - UPC Barcelona 2018
 
Is424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slidesIs424 g1 t9_proposal_slides
Is424 g1 t9_proposal_slides
 
Gunjan insight student conference v2
Gunjan insight student conference v2Gunjan insight student conference v2
Gunjan insight student conference v2
 
Final_Talk_Tool_Team
Final_Talk_Tool_TeamFinal_Talk_Tool_Team
Final_Talk_Tool_Team
 
TRECVID 2016 : Concept Localization
TRECVID 2016 : Concept LocalizationTRECVID 2016 : Concept Localization
TRECVID 2016 : Concept Localization
 
ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)ECCV WS 2012 (Frank)
ECCV WS 2012 (Frank)
 
FAST Approaches to Scalable Similarity-based Test Case Prioritization
FAST Approaches to Scalable Similarity-based Test Case PrioritizationFAST Approaches to Scalable Similarity-based Test Case Prioritization
FAST Approaches to Scalable Similarity-based Test Case Prioritization
 
Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...
Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...
Real time extension for 3D-PTV Lagrangian measurements of turbulent canopy fl...
 
Video Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 datasetVideo Classification: Human Action Recognition on HMDB-51 dataset
Video Classification: Human Action Recognition on HMDB-51 dataset
 

Plus de 哲东 郑

Visual saliency
Visual saliencyVisual saliency
Visual saliency哲东 郑
 
Image Synthesis From Reconfigurable Layout and Style
Image Synthesis From Reconfigurable Layout and StyleImage Synthesis From Reconfigurable Layout and Style
Image Synthesis From Reconfigurable Layout and Style哲东 郑
 
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal RetrievalPolysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval哲东 郑
 
Weijian image retrieval
Weijian image retrievalWeijian image retrieval
Weijian image retrieval哲东 郑
 
Scops self supervised co-part segmentation
Scops self supervised co-part segmentationScops self supervised co-part segmentation
Scops self supervised co-part segmentation哲东 郑
 
Video object detection
Video object detectionVideo object detection
Video object detection哲东 郑
 
C2 ae open set recognition
C2 ae open set recognitionC2 ae open set recognition
C2 ae open set recognition哲东 郑
 
Sota semantic segmentation
Sota semantic segmentationSota semantic segmentation
Sota semantic segmentation哲东 郑
 
Deep randomized embedding
Deep randomized embeddingDeep randomized embedding
Deep randomized embedding哲东 郑
 
Semantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive NormalizationSemantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive Normalization哲东 郑
 
Instance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flowInstance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flow哲东 郑
 
Learning to adapt structured output space for semantic
Learning to adapt structured output space for semanticLearning to adapt structured output space for semantic
Learning to adapt structured output space for semantic哲东 郑
 
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image GenerationUnsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image Generation哲东 郑
 
Graph based global reasoning networks
Graph based global reasoning networks Graph based global reasoning networks
Graph based global reasoning networks 哲东 郑
 
Variational Discriminator Bottleneck
Variational Discriminator BottleneckVariational Discriminator Bottleneck
Variational Discriminator Bottleneck哲东 郑
 
GNorm and Rethinking pre training-ruijie
GNorm and Rethinking pre training-ruijieGNorm and Rethinking pre training-ruijie
GNorm and Rethinking pre training-ruijie哲东 郑
 
Smoothed manifold
Smoothed manifoldSmoothed manifold
Smoothed manifold哲东 郑
 

Plus de 哲东 郑 (20)

Visual saliency
Visual saliencyVisual saliency
Visual saliency
 
Image Synthesis From Reconfigurable Layout and Style
Image Synthesis From Reconfigurable Layout and StyleImage Synthesis From Reconfigurable Layout and Style
Image Synthesis From Reconfigurable Layout and Style
 
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal RetrievalPolysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
 
Weijian image retrieval
Weijian image retrievalWeijian image retrieval
Weijian image retrieval
 
Scops self supervised co-part segmentation
Scops self supervised co-part segmentationScops self supervised co-part segmentation
Scops self supervised co-part segmentation
 
Video object detection
Video object detectionVideo object detection
Video object detection
 
Center nets
Center netsCenter nets
Center nets
 
C2 ae open set recognition
C2 ae open set recognitionC2 ae open set recognition
C2 ae open set recognition
 
Sota semantic segmentation
Sota semantic segmentationSota semantic segmentation
Sota semantic segmentation
 
Deep randomized embedding
Deep randomized embeddingDeep randomized embedding
Deep randomized embedding
 
Semantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive NormalizationSemantic Image Synthesis with Spatially-Adaptive Normalization
Semantic Image Synthesis with Spatially-Adaptive Normalization
 
Instance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flowInstance level facial attributes transfer with geometry-aware flow
Instance level facial attributes transfer with geometry-aware flow
 
Learning to adapt structured output space for semantic
Learning to adapt structured output space for semanticLearning to adapt structured output space for semantic
Learning to adapt structured output space for semantic
 
Unsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image GenerationUnsupervised Learning of Object Landmarks through Conditional Image Generation
Unsupervised Learning of Object Landmarks through Conditional Image Generation
 
Graph based global reasoning networks
Graph based global reasoning networks Graph based global reasoning networks
Graph based global reasoning networks
 
Style gan
Style ganStyle gan
Style gan
 
Vi2vi
Vi2viVi2vi
Vi2vi
 
Variational Discriminator Bottleneck
Variational Discriminator BottleneckVariational Discriminator Bottleneck
Variational Discriminator Bottleneck
 
GNorm and Rethinking pre training-ruijie
GNorm and Rethinking pre training-ruijieGNorm and Rethinking pre training-ruijie
GNorm and Rethinking pre training-ruijie
 
Smoothed manifold
Smoothed manifoldSmoothed manifold
Smoothed manifold
 

Step zhedong