Intelligence Machine Vision Lab
Strictly Confidential
2019 CVPR paper overview
SUALAB
Ho Seong Lee
Contents
• CVPR 2019 Statistics
• 20 papers, a one-page summary each
CVPR 2019 Statistics
• What is CVPR?
• Conference on Computer Vision and Pattern Recognition (CVPR)
• CVPR was first held in 1983 and has been held annually
• CVPR 2019: June 16th – June 20th in Long Beach, CA
CVPR 2019 Statistics
• CVPR 2019 statistics
• The total number of papers grows every year, and this year it grew significantly!
• We can visualize the main topics using paper titles and a simple Python script!
• https://github.com/hoya012/CVPR-Paper-Statistics
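The title-based keyword count can be sketched as follows. This is a minimal illustration, not the script from the linked repository: the stop-word list, the function name `keyword_counts`, and the three sample titles (taken from later slides) are assumptions chosen for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the repository's script may filter differently.
STOPWORDS = {"a", "an", "the", "for", "of", "with", "and",
             "in", "on", "to", "from", "via", "by", "using"}

def keyword_counts(titles):
    """Count how often each non-stop-word appears across paper titles."""
    counts = Counter()
    for title in titles:
        # Lowercase and split on anything that is not a letter or digit.
        words = re.findall(r"[a-z0-9]+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts

titles = [
    "Learning to Synthesize Motion Blur",
    "Semantic Image Synthesis with Spatially-Adaptive Normalization",
    "Image Super-Resolution by Neural Texture Transfer",
]
counts = keyword_counts(titles)
print(counts.most_common(5))
```

Running the same function over the full title lists of two years gives per-keyword counts that can be compared directly.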
[Figure: chart with percentage labels 28.4%, 30%, 29.9%, 29.6%, 25.1%]
CVPR 2019 Statistics
2019 CVPR paper statistics
CVPR 2018 Statistics
2018 CVPR paper statistics
Compared to 2018 Statistics..
CVPR 2018 vs CVPR 2019 Statistics
• Most of the top keywords were maintained
• Image, detection, 3d, object, video, segmentation, adversarial, recognition, visual …
• “graph”, “cloud”, “representation” are about two to three times as frequent
• graph : 15 → 45
• representation: 25 → 48
• cloud: 16 → 35
Before beginning..
• A paper that is not on this list is not necessarily uninteresting.
• Since I mainly study Computer Vision, most of the papers I will discuss today are Computer Vision papers.
• Topics not covered today
• Natural Language Processing
• Reinforcement Learning
• Robotics
• Etc..?
1. Learning to Synthesize Motion Blur (oral)
• Synthesizing a motion blurred image from a pair of unblurred sequential images
• Motion blur is important in cinematography and artistic photography
• Generate a large-scale synthetic training dataset of motion blurred images
Recommended reference: “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation”, 2018 CVPR
2. Semantic Image Synthesis with Spatially-Adaptive Normalization (oral)
• Synthesizing photorealistic images given an input semantic layout
• Spatially-adaptive normalization can keep semantic information
• The model gives the user control over both semantics and style when synthesizing images
Demo Code: https://github.com/NVlabs/SPADE
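The idea of spatially-adaptive normalization can be sketched in NumPy: activations are normalized without learned parameters, then scaled and shifted per pixel by values computed from the semantic layout. This is a simplified sketch, not the NVlabs implementation. The function name `spade` and the weights `w_gamma` and `w_beta` are hypothetical, and a per-pixel linear map stands in for the small convolutional network that a real model would learn.

```python
import numpy as np

def spade(x, seg, w_gamma, w_beta, eps=1e-5):
    """x: (N, C, H, W) activations; seg: (N, K, H, W) one-hot semantic layout.
    w_gamma, w_beta: (C, K) weights of a per-pixel linear map (an assumption
    standing in for learned conv layers)."""
    # Parameter-free normalization per channel over batch and spatial dims.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    std = x.std(axis=(0, 2, 3), keepdims=True)
    x_norm = (x - mean) / (std + eps)
    # Scale and shift vary per pixel, depending on the semantic class there.
    gamma = np.einsum("ck,nkhw->nchw", w_gamma, seg)
    beta = np.einsum("ck,nkhw->nchw", w_beta, seg)
    return gamma * x_norm + beta

rng = np.random.default_rng(0)
x = rng.normal(size=(2, 3, 4, 4))
labels = rng.integers(0, 2, size=(2, 4, 4))      # two semantic classes
seg = np.eye(2)[labels].transpose(0, 3, 1, 2)    # one-hot, (N, K, H, W)
w_gamma = np.array([[1.0, 0.0]] * 3)             # class 0: scale 1, class 1: scale 0
w_beta = np.array([[0.0, 5.0]] * 3)              # class 0: shift 0, class 1: shift 5
out = spade(x, seg, w_gamma, w_beta)
```

Because the scale and shift depend on the layout at each pixel, the semantic information is reintroduced after normalization instead of being washed out by it.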
3. SiCloPe: Silhouette-Based Clothed People (Oral)
• Reconstruct a complete and textured 3D model of a person from a single image
• Use 2D Silhouettes and 3D joints of a body pose to reconstruct 3D mesh
• An effective two-stage 3D shape reconstruction pipeline
• Predicting multi-view 2D silhouettes from single input segmentation
• Deep visual hull based mesh reconstruction technique
Recommended reference: “BodyNet: Volumetric Inference of 3D Human Body Shapes”, 2018 ECCV
4. Im2Pencil: Controllable Pencil Illustration from Photographs
• Propose a controllable photo-to-pencil translation method
• Model the pencil outline (rough, clean) and pencil shading (4 types)
• Create training data pairs from online websites (e.g., Pinterest) using image filtering techniques
Demo Code: https://github.com/Yijunmaverick/Im2Pencil
5. End-to-End Time-Lapse Video Synthesis from a Single Outdoor Image
• End-to-end solution to synthesize a time-lapse video from a single image
• Use time-lapse videos and image sequences during training
• Use only a single image during inference
Input image(single)
6. StoryGAN: A Sequential Conditional GAN for Story Visualization
• Propose a new task called Story Visualization using GAN
• Sequential conditional GAN based StoryGAN
• Story Encoder – stochastic mapping from the story to a low-dimensional embedding vector
• Context Encoder – capture contextual information during sequential image generation
• Two Discriminators – Image Discriminator & Story Discriminator
Context Encoder
7. Image Super-Resolution by Neural Texture Transfer (oral)
• Improve “RefSR” even when irrelevant reference images are provided
• Traditional Single Image Super-Resolution is extremely challenging (ill-posed problem)
• Reference-based SR (RefSR) uses rich textures from HR references, but only works well with similar Ref images
• Adaptively transferring the texture from Ref Images according to their texture similarity
Recommended reference: “CrossNet: An end-to-end reference-based super resolution network using cross-scale warping”, 2018 ECCV
Similar
Different
8. DVC: An End-to-end Deep Video Compression Framework (oral)
• Propose the first end-to-end video compression deep model
• Conventional video compression uses a predictive coding architecture and encodes the corresponding motion information and residual information
• Take advantage of both classical compression and neural networks
• Use learning based optical flow estimation
9. Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search (oral)
• Defend against adversarial attacks using big data and the image manifold
• Assume that adversarial attacks move the image away from the image manifold
• A successful defense mechanism should aim to project the images back onto the image manifold
• Search tens of billions of images for the nearest neighbors (K=50) of the input and use them to make the prediction
• Also propose two novel attack methods to break nearest neighbor defenses
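The nearest-neighbor idea can be illustrated with a toy example. This is only a sketch under strong assumptions: it uses random 2-D features and a uniform majority vote, whereas the paper works with a web-scale image database. The function name `knn_predict` and the toy data are hypothetical.

```python
import numpy as np

def knn_predict(query, db_features, db_labels, k=50):
    """Classify a (possibly perturbed) query by majority vote of its k nearest
    database neighbors, instead of trusting the query's own prediction."""
    dists = np.linalg.norm(db_features - query, axis=1)
    nearest = np.argsort(dists)[:k]          # indices of the k closest entries
    votes = np.bincount(db_labels[nearest])  # vote count per class
    return int(np.argmax(votes))

# Toy database: two well-separated classes in a 2-D feature space.
rng = np.random.default_rng(0)
db_features = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 2)),     # class 0 around (0, 0)
    rng.normal(10.0, 1.0, size=(100, 2)),    # class 1 around (10, 10)
])
db_labels = np.array([0] * 100 + [1] * 100)

query = np.array([1.0, -1.0])                # a slightly shifted class-0 sample
pred = knn_predict(query, db_features, db_labels, k=50)
print(pred)
```

The vote depends on clean database images near the query, which is how the neighbors act as a projection back toward the image manifold.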
10. Bag of Tricks for Image Classification with Convolutional Neural Networks
• Examine a collection of training refinements and empirically evaluate their impact
• Improve ResNet-50’s accuracy from 75.3% to 79.29% on ImageNet with some refinements
• Efficient Training
• FP32 with BS=256 → FP16 with BS=1024 with some techniques
• Training Refinements:
• Cosine Learning Rate Decay / Label Smoothing / Knowledge Distillation / Mixup Training
• Transfer from classification to Object Detection, Semantic Segmentation
Linear scaling LR
LR warmup
Zero γ initialization in BN
No bias decay
Result of Efficient Training
Result of Training refinements
ResNet tweaks
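Three of the listed tricks are simple enough to sketch in plain Python: linear scaling of the learning rate with batch size, warmup followed by cosine decay, and label smoothing. The function names and default values below (base rate 0.1 at batch size 256, smoothing 0.1) are assumptions for illustration, not settings taken from the slide.

```python
import math

def scaled_lr(batch_size, base_lr=0.1, base_batch=256):
    """Linear scaling rule: grow the learning rate in proportion to batch size."""
    return base_lr * batch_size / base_batch

def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then cosine decay down to zero."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1 + math.cos(math.pi * progress))

def smooth_labels(label, num_classes, eps=0.1):
    """Label smoothing: 1 - eps on the true class, eps spread over the others."""
    off_value = eps / (num_classes - 1)
    return [1 - eps if i == label else off_value for i in range(num_classes)]

peak = scaled_lr(1024)                       # learning rate for BS=1024
schedule = [lr_at(s, total_steps=100, warmup_steps=5, peak_lr=peak)
            for s in range(101)]
target = smooth_labels(label=2, num_classes=5)
```

The schedule rises linearly for the first 5 steps, reaches the scaled peak, and then follows half a cosine period down to zero at the final step.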
11. Fully Learnable Group Convolution for Acceleration of Deep Neural Networks
• Automatically learn the group structure during training in an end-to-end manner
• Outperform standard group convolution
• Propose an efficient strategy for index re-ordering
12. ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch (oral)
• Explore how to robustly train object detectors from scratch
• Almost all SOTA detectors are fine-tuned from a pretrained CNN (e.g., ImageNet)
• Classification and detection have different degrees of sensitivity to translation
• The architecture is limited by the classification network (backbone) → inconvenience!
• Find that one of the overlooked points is BatchNorm!
Recommended reference: “DSOD: Learning Deeply Supervised Object Detectors from Scratch”, 2017 ICCV
13. Precise Detection in Densely Packed Scenes
• Propose a method for precise detection in densely packed scenes
• In the real world, there are many applications of object detection (e.g., detecting and counting objects)
• In densely packed scenes, SOTA detectors can’t detect accurately
• Contributions: (1) a layer for estimating the Jaccard index, (2) a novel EM merging unit, (3) release of the SKU-110K dataset
14. SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images
• Present a large-scale dataset and establish a baseline for security inspection X-ray
• 1,059,231 X-ray images in total, containing 8,929 prohibited items in 6 classes
• Propose an approach named class-balanced hierarchical refinement (CHR) and a class-balanced loss function
15. Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression
• Address the weaknesses of IoU and introduce a generalized version (GIoU)
• Intersection over Union (IoU) is the most popular evaluation metric used in object detection
• But, there is a gap between optimizing distance losses and maximizing IoU
• Introducing generalized IoU as both a new loss and a new metric
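For two axis-aligned boxes, GIoU subtracts from IoU the fraction of the smallest enclosing box that is not covered by the union. A minimal sketch follows, assuming boxes are given as (x1, y1, x2, y2) with positive area; the function name `giou` is chosen for illustration.

```python
def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (x1, y1, x2, y2), x1 < x2, y1 < y2."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Intersection area (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of the smallest box C that encloses both A and B.
    c_area = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (c_area - union) / c_area

same = giou((0, 0, 1, 1), (0, 0, 1, 1))      # identical boxes
apart = giou((0, 0, 1, 1), (2, 0, 3, 1))     # disjoint boxes, IoU is 0
overlap = giou((0, 0, 2, 2), (1, 1, 3, 3))   # partial overlap
```

For disjoint boxes IoU is always 0, but GIoU still changes with the distance between them, so `1 - giou(...)` can serve as a regression loss even when there is no overlap.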
16. Bounding Box Regression with Uncertainty for Accurate Object Detection
• Propose novel bounding box regression loss with uncertainty
• Most datasets have ambiguities and labeling noise in bounding box coordinates
• The network learns to predict the localization variance for each coordinate
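One common way to write such a loss for a single coordinate is sketched below. It assumes the network outputs alpha = log(sigma^2) alongside the coordinate estimate and uses a smooth-L1 style data term. This form is an assumption for illustration and may differ from the paper's exact formulation; the name `kl_loss` is hypothetical.

```python
import math

def kl_loss(x_gt, x_pred, alpha):
    """Uncertainty-weighted regression loss for one box coordinate.
    alpha = log(sigma^2) is the predicted log-variance (an assumed
    parameterization for this sketch)."""
    err = abs(x_gt - x_pred)
    # Quadratic near zero, linear for large errors (smooth-L1 style).
    data_term = 0.5 * err ** 2 if err <= 1.0 else err - 0.5
    # A large predicted variance down-weights the error but pays a penalty.
    return math.exp(-alpha) * data_term + 0.5 * alpha

confident = kl_loss(0.0, 3.0, alpha=0.0)           # sigma^2 = 1
uncertain = kl_loss(0.0, 3.0, alpha=math.log(5))   # sigma^2 = 5
print(confident, uncertain)
```

With an error of 3, predicting a larger variance lowers the loss, so the network is encouraged to report high uncertainty on coordinates it cannot localize well, such as ambiguous or noisy box edges.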
17. UPSNet: A Unified Panoptic Segmentation Network (oral)
• Propose a unified panoptic segmentation network(UPSNet)
• Semantic segmentation + Instance segmentation = panoptic segmentation
• Semantic Head + Instance Head + Panoptic head → end-to-end manner
Recommended reference: “Panoptic Segmentation”, 2018 arXiv
Countable objects → things
Uncountable objects → stuff
Deformable Conv Mask R-CNN Parameter-free
18. SFNet: Learning Object-aware Semantic Correspondence (Oral)
• Propose SFNet for the semantic correspondence problem
• Propose to use images annotated with binary foreground masks and synthetic geometric deformations during training
• Manually selecting point correspondences is very expensive!
• Outperform SOTA on standard benchmarks by a significant margin
19. Fast Interactive Object Annotation with Curve-GCN
• Propose an end-to-end fast interactive object annotation tool (Curve-GCN)
• Predict all vertices simultaneously using a Graph Convolutional Network (unlike Polygon-RNN, which predicts them sequentially)
• Human annotator can correct any wrong point and only the neighboring points are affected
Recommended reference: “Efficient Interactive Annotation of Segmentation Datasets with Polygon-RNN++”, 2018 CVPR
Code: https://github.com/fidler-lab/curve-gcn
Correction!
20. FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference
• Propose image-level WSSS method using stochastic inference (dropout)
• Localization maps (CAMs) focus only on small parts of objects → Problem
• FickleNet allows a single network to generate multiple CAMs from a single image
• Does not require any additional training steps and only adds a simple layer
Full Stochastic: Both Training and Inference
Related Post..
• My personal blog has similar overviews
• SIGGRAPH 2018
• NeurIPS 2018
• ICLR 2019
https://hoya012.github.io/
Thank you

Contenu connexe

Tendances

Link prediction
Link predictionLink prediction
Link predictionybenjo
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learningAntonio Rueda-Toicen
 
[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB ImagesDeep Learning JP
 
[DL輪読会]Network Deconvolution
[DL輪読会]Network Deconvolution[DL輪読会]Network Deconvolution
[DL輪読会]Network DeconvolutionDeep Learning JP
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsShunta Saito
 
Image Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningImage Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningPRATHAMESH REGE
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basicsBrodmann17
 
[DL輪読会]Attentive neural processes
[DL輪読会]Attentive neural processes[DL輪読会]Attentive neural processes
[DL輪読会]Attentive neural processesDeep Learning JP
 
文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A Survey文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A SurveyToru Tamaki
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative ModelsMLReview
 
PR 127: FaceNet
PR 127: FaceNetPR 127: FaceNet
PR 127: FaceNetTaeoh Kim
 
[DL輪読会]DropBlock: A regularization method for convolutional networks
[DL輪読会]DropBlock: A regularization method for convolutional networks[DL輪読会]DropBlock: A regularization method for convolutional networks
[DL輪読会]DropBlock: A regularization method for convolutional networksDeep Learning JP
 
文献紹介:YOLO series:v1-v5, X, F, and YOWO
文献紹介:YOLO series:v1-v5, X, F, and YOWO文献紹介:YOLO series:v1-v5, X, F, and YOWO
文献紹介:YOLO series:v1-v5, X, F, and YOWOToru Tamaki
 
PRML勉強会 #4 @筑波大学 発表スライド
PRML勉強会 #4 @筑波大学 発表スライドPRML勉強会 #4 @筑波大学 発表スライド
PRML勉強会 #4 @筑波大学 発表スライドSatoshi Yoshikawa
 
Perception distortion-tradeoff 20181128
Perception distortion-tradeoff 20181128Perception distortion-tradeoff 20181128
Perception distortion-tradeoff 20181128Masakazu Shinoda
 
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"Deep Learning JP
 
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについてMasahiro Suzuki
 
[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks
[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks
[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial NetworksDeep Learning JP
 
Slideshare unsupervised learning of depth and ego motion from video
Slideshare unsupervised learning of depth and ego motion from videoSlideshare unsupervised learning of depth and ego motion from video
Slideshare unsupervised learning of depth and ego motion from videoishii yasunori
 

Tendances (20)

Link prediction
Link predictionLink prediction
Link prediction
 
Image segmentation with deep learning
Image segmentation with deep learningImage segmentation with deep learning
Image segmentation with deep learning
 
[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
[DL輪読会]Pixel2Mesh: Generating 3D Mesh Models from Single RGB Images
 
Deep Semi-Supervised Anomaly Detection
Deep Semi-Supervised Anomaly DetectionDeep Semi-Supervised Anomaly Detection
Deep Semi-Supervised Anomaly Detection
 
[DL輪読会]Network Deconvolution
[DL輪読会]Network Deconvolution[DL輪読会]Network Deconvolution
[DL輪読会]Network Deconvolution
 
A brief introduction to recent segmentation methods
A brief introduction to recent segmentation methodsA brief introduction to recent segmentation methods
A brief introduction to recent segmentation methods
 
Image Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learningImage Recognition Expert System based on deep learning
Image Recognition Expert System based on deep learning
 
Deep learning based object detection basics
Deep learning based object detection basicsDeep learning based object detection basics
Deep learning based object detection basics
 
[DL輪読会]Attentive neural processes
[DL輪読会]Attentive neural processes[DL輪読会]Attentive neural processes
[DL輪読会]Attentive neural processes
 
文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A Survey文献紹介:Image Segmentation Using Deep Learning: A Survey
文献紹介:Image Segmentation Using Deep Learning: A Survey
 
Tutorial on Deep Generative Models
 Tutorial on Deep Generative Models Tutorial on Deep Generative Models
Tutorial on Deep Generative Models
 
PR 127: FaceNet
PR 127: FaceNetPR 127: FaceNet
PR 127: FaceNet
 
[DL輪読会]DropBlock: A regularization method for convolutional networks
[DL輪読会]DropBlock: A regularization method for convolutional networks[DL輪読会]DropBlock: A regularization method for convolutional networks
[DL輪読会]DropBlock: A regularization method for convolutional networks
 
文献紹介:YOLO series:v1-v5, X, F, and YOWO
文献紹介:YOLO series:v1-v5, X, F, and YOWO文献紹介:YOLO series:v1-v5, X, F, and YOWO
文献紹介:YOLO series:v1-v5, X, F, and YOWO
 
PRML勉強会 #4 @筑波大学 発表スライド
PRML勉強会 #4 @筑波大学 発表スライドPRML勉強会 #4 @筑波大学 発表スライド
PRML勉強会 #4 @筑波大学 発表スライド
 
Perception distortion-tradeoff 20181128
Perception distortion-tradeoff 20181128Perception distortion-tradeoff 20181128
Perception distortion-tradeoff 20181128
 
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
【DL輪読会】"Instant Neural Graphics Primitives with a Multiresolution Hash Encoding"
 
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて深層生成モデルと世界モデル,深層生成モデルライブラリPixyzについて
深層生成モデルと世界モデル, 深層生成モデルライブラリPixyzについて
 
[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks
[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks
[DL輪読会]A Style-Based Generator Architecture for Generative Adversarial Networks
 
Slideshare unsupervised learning of depth and ego motion from video
Slideshare unsupervised learning of depth and ego motion from videoSlideshare unsupervised learning of depth and ego motion from video
Slideshare unsupervised learning of depth and ego motion from video
 

Similaire à 2019 cvpr paper_overview

TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverviewMotaz El-Saban
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyNUPUR YADAV
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingYu Huang
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learningpratik pratyay
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureSanghamitra Deb
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learningNAVER Engineering
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...NAVER Engineering
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...Edge AI and Vision Alliance
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsPetteriTeikariPhD
 
Neural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep LearningNeural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep Learningcomifa7406
 
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex sceneKumar Mayank
 
The Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxThe Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxPierre Schaus
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSGanesan Narayanasamy
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IIYu Huang
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonAditya Bhattacharya
 
Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)Behzad Shomali
 

Similaire à 2019 cvpr paper_overview (20)

TechnicalBackgroundOverview
TechnicalBackgroundOverviewTechnicalBackgroundOverview
TechnicalBackgroundOverview
 
Image Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A surveyImage Segmentation Using Deep Learning : A survey
Image Segmentation Using Deep Learning : A survey
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object tracking
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 
Real Time Object Dectection using machine learning
Real Time Object Dectection using machine learningReal Time Object Dectection using machine learning
Real Time Object Dectection using machine learning
 
Computer Vision Landscape : Present and Future
Computer Vision Landscape : Present and FutureComputer Vision Landscape : Present and Future
Computer Vision Landscape : Present and Future
 
Visual geometry with deep learning
Visual geometry with deep learningVisual geometry with deep learning
Visual geometry with deep learning
 
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
[CVPR 2018] Utilizing unlabeled or noisy labeled data (classification, detect...
 
Multimedia Mining
Multimedia Mining Multimedia Mining
Multimedia Mining
 
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
“Understanding DNN-Based Object Detectors,” a Presentation from Au-Zone Techn...
 
Dataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problemsDataset creation for Deep Learning-based Geometric Computer Vision problems
Dataset creation for Deep Learning-based Geometric Computer Vision problems
 
Neural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep LearningNeural Networks for Machine Learning and Deep Learning
Neural Networks for Machine Learning and Deep Learning
 
Moving object detection in complex scene
Moving object detection in complex sceneMoving object detection in complex scene
Moving object detection in complex scene
 
The Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- ReduxThe Concurrent Constraint Programming Research Programmes -- Redux
The Concurrent Constraint Programming Research Programmes -- Redux
 
Deep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUSDeep learning fundamental and Research project on IBM POWER9 system from NUS
Deep learning fundamental and Research project on IBM POWER9 system from NUS
 
PointNet
PointNetPointNet
PointNet
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning II
 
Computer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathonComputer vision-nit-silchar-hackathon
Computer vision-nit-silchar-hackathon
 
Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)Content Based Image Retrieval (CBIR)
Content Based Image Retrieval (CBIR)
 
kanimozhi2019.pdf
kanimozhi2019.pdfkanimozhi2019.pdf
kanimozhi2019.pdf
 

Plus de LEE HOSEONG

Unsupervised anomaly detection using style distillation
Unsupervised anomaly detection using style distillationUnsupervised anomaly detection using style distillation
Unsupervised anomaly detection using style distillationLEE HOSEONG
 
do adversarially robust image net models transfer better
do adversarially robust image net models transfer betterdo adversarially robust image net models transfer better
do adversarially robust image net models transfer betterLEE HOSEONG
 
CNN Architecture A to Z
CNN Architecture A to ZCNN Architecture A to Z
CNN Architecture A to ZLEE HOSEONG
 
carrier of_tricks_for_image_classification
carrier of_tricks_for_image_classificationcarrier of_tricks_for_image_classification
carrier of_tricks_for_image_classificationLEE HOSEONG
 
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen..."The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...
"The Many Faces of Robustness: A Critical Analysis of Out-of-Distribution Gen...LEE HOSEONG
 
Mixed Precision Training Review
Mixed Precision Training ReviewMixed Precision Training Review
Mixed Precision Training ReviewLEE HOSEONG
 
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionMVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection
MVTec AD: A Comprehensive Real-World Dataset for Unsupervised Anomaly DetectionLEE HOSEONG
 
YOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewYOLOv4: optimal speed and accuracy of object detection review
YOLOv4: optimal speed and accuracy of object detection reviewLEE HOSEONG
 
FixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceFixMatch:simplifying semi supervised learning with consistency and confidence
FixMatch:simplifying semi supervised learning with consistency and confidenceLEE HOSEONG
 
"Revisiting self supervised visual representation learning" Paper Review
"Revisiting self supervised visual representation learning" Paper Review"Revisiting self supervised visual representation learning" Paper Review
"Revisiting self supervised visual representation learning" Paper ReviewLEE HOSEONG
 
Unsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionUnsupervised visual representation learning overview: Toward Self-Supervision
Unsupervised visual representation learning overview: Toward Self-SupervisionLEE HOSEONG
 
Human uncertainty makes classification more robust, ICCV 2019 Review
Human uncertainty makes classification more robust, ICCV 2019 ReviewHuman uncertainty makes classification more robust, ICCV 2019 Review
Human uncertainty makes classification more robust, ICCV 2019 ReviewLEE HOSEONG
 
Single Image Super Resolution Overview
Single Image Super Resolution OverviewSingle Image Super Resolution Overview
Single Image Super Resolution OverviewLEE HOSEONG
 
2019 ICLR Best Paper Review
2019 ICLR Best Paper Review2019 ICLR Best Paper Review
2019 ICLR Best Paper ReviewLEE HOSEONG
 
"Google Vizier: A Service for Black-Box Optimization" Paper Review
"Google Vizier: A Service for Black-Box Optimization" Paper Review"Google Vizier: A Service for Black-Box Optimization" Paper Review
"Google Vizier: A Service for Black-Box Optimization" Paper ReviewLEE HOSEONG
 
"Searching for Activation Functions" Paper Review
"Searching for Activation Functions" Paper Review"Searching for Activation Functions" Paper Review
"Searching for Activation Functions" Paper ReviewLEE HOSEONG
 
"Learning transferable architectures for scalable image recognition" Paper Re...
"Learning transferable architectures for scalable image recognition" Paper Re..."Learning transferable architectures for scalable image recognition" Paper Re...
"Learning transferable architectures for scalable image recognition" Paper Re...LEE HOSEONG
 
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper Review
"Learning From Noisy Large-Scale Datasets With Minimal Supervision" Paper ReviewLEE HOSEONG
 
"Dataset and metrics for predicting local visible differences" Paper Review
"Dataset and metrics for predicting local visible differences" Paper Review"Dataset and metrics for predicting local visible differences" Paper Review
"Dataset and metrics for predicting local visible differences" Paper ReviewLEE HOSEONG
 
"From image level to pixel-level labeling with convolutional networks" Paper ...
"From image level to pixel-level labeling with convolutional networks" Paper ..."From image level to pixel-level labeling with convolutional networks" Paper ...
"From image level to pixel-level labeling with convolutional networks" Paper ...LEE HOSEONG
 

2019 cvpr paper_overview

  • 1. Intelligence Machine Vision Lab Strictly Confidential 2019 CVPR paper overview SUALAB Ho Seong Lee
  • 2. Contents • CVPR 2019 Statistics • 20 papers, one-page summary each
  • 3. CVPR 2019 Statistics • What is CVPR? • Conference on Computer Vision and Pattern Recognition (CVPR) • CVPR was first held in 1983 and has been held annually • CVPR 2019: June 16th – June 20th in Long Beach, CA
  • 4. CVPR 2019 Statistics • CVPR 2019 statistics • The total number of papers increases every year, and this year it increased significantly! • The main topics can be visualized from the paper titles with a simple Python script! • https://github.com/hoya012/CVPR-Paper-Statistics • Chart labels: 28.4%, 30%, 29.9%, 29.6%, 25.1%
  • 5. CVPR 2019 Statistics • 2019 CVPR paper statistics
  • 6. CVPR 2018 Statistics • 2018 CVPR paper statistics • Compared with the 2018 statistics..
  • 7. CVPR 2018 vs CVPR 2019 Statistics • Most of the top keywords stayed the same • Image, detection, 3d, object, video, segmentation, adversarial, recognition, visual … • “graph”, “cloud”, and “representation” became roughly two to three times as frequent • graph: 15 → 45 • representation: 25 → 48 • cloud: 16 → 35
  • 8. Before beginning.. • A paper's absence from this list does not mean that it is not interesting. • Since I have mainly studied Computer Vision, most of the papers I will discuss today are Computer Vision papers.. • Topics not covered today • Natural Language Processing • Reinforcement Learning • Robotics • Etc..?
  • 9. 1. Learning to Synthesize Motion Blur (oral) • Synthesize a motion-blurred image from a pair of unblurred sequential images • Motion blur is important in cinematography and artistic photography • Generate a large-scale synthetic training dataset of motion-blurred images • Recommended reference: “Super SloMo: High Quality Estimation of Multiple Intermediate Frames for Video Interpolation”, 2018 CVPR
  • 10. 2. Semantic Image Synthesis with Spatially-Adaptive Normalization (oral) • Synthesize photorealistic images given an input semantic layout • Spatially-adaptive normalization can preserve semantic information • The model gives the user control over both semantics and style when synthesizing images • Demo Code: https://github.com/NVlabs/SPADE
  • 11. 3. SiCloPe: Silhouette-Based Clothed People (oral) • Reconstruct a complete and textured 3D model of a person from a single image • Use 2D silhouettes and 3D joints of a body pose to reconstruct a 3D mesh • An effective two-stage 3D shape reconstruction pipeline • Predict multi-view 2D silhouettes from a single input segmentation • Deep visual hull based mesh reconstruction technique • Recommended reference: “BodyNet: Volumetric Inference of 3D Human Body Shapes”, 2018 ECCV
  • 12. 4. Im2Pencil: Controllable Pencil Illustration from Photographs • Propose a controllable photo-to-pencil translation method • Model the pencil outline (rough, clean) and pencil shading (4 types) • Create training data pairs from online websites (e.g., Pinterest) using image filtering techniques • Demo Code: https://github.com/Yijunmaverick/Im2Pencil
  • 13. 5. End-to-End Time-Lapse Video Synthesis from a Single Outdoor Image • End-to-end solution for synthesizing a time-lapse video from a single image • Use time-lapse videos and image sequences during training • Use only a single input image during inference
  • 14. 6. StoryGAN: A Sequential Conditional GAN for Story Visualization • Propose a new task called Story Visualization using GANs • StoryGAN is based on a sequential conditional GAN • Story Encoder: stochastic mapping from the story to a low-dimensional embedding vector • Context Encoder: captures contextual information during sequential image generation • Two discriminators: Image Discriminator & Story Discriminator
  • 15. 7. Image Super-Resolution by Neural Texture Transfer (oral) • Improve “RefSR” even when irrelevant reference images are provided • Traditional Single Image Super-Resolution is extremely challenging (ill-posed problem) • Reference-based SR (RefSR) uses rich textures from HR references, but only works well when the Ref images are similar to the input • Adaptively transfer texture from the Ref images according to their texture similarity • Recommended reference: “CrossNet: An end-to-end reference-based super resolution network using cross-scale warping”, 2018 ECCV • Figure labels: Similar / Different
  • 16. 8. DVC: An End-to-end Deep Video Compression Framework (oral) • Propose the first end-to-end deep video compression model • Conventional video compression uses a predictive coding architecture and encodes the corresponding motion information and residual information • Take advantage of both classical compression and neural networks • Use learning-based optical flow estimation
  • 17. 9. Defense Against Adversarial Images using Web-Scale Nearest-Neighbor Search (oral) • Defend against adversarial attacks using big data and the image manifold • Assume that adversarial attacks move the image away from the image manifold • A successful defense mechanism should aim to project the images back onto the image manifold • Search a database of tens of billions of images for the nearest-neighbor images (K=50) and use them • Also propose two novel attack methods to break nearest-neighbor defenses
  • 18. 10. Bag of Tricks for Image Classification with Convolutional Neural Networks • Examine a collection of refinements and empirically evaluate their impact • Improve ResNet-50’s accuracy from 75.3% to 79.29% on ImageNet with these refinements • Efficient Training: FP32 with BS=256 → FP16 with BS=1024 with some techniques • Training Refinements: Cosine Learning Rate Decay / Label Smoothing / Knowledge Distillation / Mixup Training • Transfer from classification to Object Detection and Semantic Segmentation • Figure labels: Linear scaling LR / LR warmup / Zero γ initialization in BN / No bias decay; Result of Efficient Training; Result of Training refinements; ResNet tweaks
  • 19. 11. Fully Learnable Group Convolution for Acceleration of Deep Neural Networks • Automatically learn the group structure during training in an end-to-end manner • Outperform standard group convolution • Propose an efficient strategy for index re-ordering
  • 20. 12. ScratchDet: Exploring to Train Single-Shot Object Detectors from Scratch (oral) • Explore how to robustly train object detectors from scratch • Almost all SOTA detectors are fine-tuned from a pretrained CNN (e.g., ImageNet) • Classification and detection have different degrees of sensitivity to translation • The architecture is limited by the classification network (backbone) → inconvenience! • Find that one of the overlooked points is BatchNorm! • Recommended reference: “DSOD: Learning Deeply Supervised Object Detectors from Scratch”, 2017 ICCV
  • 21. 13. Precise Detection in Densely Packed Scenes • Propose precise detection in densely packed scenes • In the real world, there are many applications of object detection (e.g., detecting and counting objects) • In densely packed scenes, SOTA detectors can’t detect accurately • (1) a layer for estimating the Jaccard index (2) a novel EM merging unit (3) release of the SKU-110K dataset
  • 22. 14. SIXray: A Large-scale Security Inspection X-ray Benchmark for Prohibited Item Discovery in Overlapping Images • Present a large-scale dataset and establish a baseline for security inspection X-ray • 1,059,231 X-ray images in total, containing 8,929 prohibited items in 6 classes • Propose an approach named class-balanced hierarchical refinement (CHR) and a class-balanced loss function
  • 23. 15. Generalized Intersection over Union: A Metric and A Loss for Bounding Box Regression • Address the weaknesses of IoU and introduce a generalized version (GIoU) • Intersection over Union (IoU) is the most popular evaluation metric used in object detection • But there is a gap between optimizing distance losses and maximizing IoU • Introduce generalized IoU as both a new loss and a new metric
  • 24. 16. Bounding Box Regression with Uncertainty for Accurate Object Detection • Propose a novel bounding box regression loss with uncertainty • Most datasets have ambiguities and labeling noise in the bounding box coordinates • The network learns to predict the localization variance for each coordinate
  • 25. 17. UPSNet: A Unified Panoptic Segmentation Network (oral) • Propose a unified panoptic segmentation network (UPSNet) • Semantic segmentation + Instance segmentation = panoptic segmentation • Countable objects → things, uncountable objects → stuff • Semantic Head + Instance Head + Panoptic Head → end-to-end manner • Figure labels: Deformable Conv / Mask R-CNN / Parameter-free • Recommended reference: “Panoptic Segmentation”, 2018 arXiv
  • 26. 18. SFNet: Learning Object-aware Semantic Correspondence (oral) • Propose SFNet for the semantic correspondence problem • Propose to use images annotated with binary foreground masks and synthetic geometric deformations during training • Manually selecting point correspondences is so expensive!! • Outperform SOTA on standard benchmarks by a significant margin
  • 27. 19. Fast Interactive Object Annotation with Curve-GCN • Propose an end-to-end fast interactive object annotation tool (Curve-GCN) • Predict all vertices simultaneously using a Graph Convolutional Network (unlike Polygon-RNN) • A human annotator can correct any wrong point, and only the neighboring points are affected • Recommended reference: “Efficient interactive annotation of segmentation datasets with polygon-rnn++”, 2018 CVPR • Code: https://github.com/fidler-lab/curve-gcn
  • 28. 20. FickleNet: Weakly and Semi-supervised Semantic Image Segmentation using Stochastic Inference • Propose an image-level WSSS method using stochastic inference (dropout) • Localization maps (CAM) only focus on small parts of objects → Problem • FickleNet allows a single network to generate multiple CAMs from a single image • Does not require any additional training steps and only adds a simple layer • Figure labels: Full Stochastic / Both Training and Inference
  • 29. Related Post.. • My personal blog has similar overviews • SIGGRAPH 2018 • NeurIPS 2018 • ICLR 2019 • https://hoya012.github.io/
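
The keyword statistics on slides 4 to 7 come from counting words in paper titles. The sketch below is not the script from the linked repository; it is a minimal illustration of the idea, and the `STOPWORDS` set, the `keyword_counts` helper, and the five sample titles (taken from this deck) are assumptions chosen for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; a real script would likely use a longer one.
STOPWORDS = {"a", "an", "and", "by", "for", "from", "in", "of", "the", "to", "with", "using"}


def keyword_counts(titles):
    """Count lower-cased title words, skipping stop words."""
    counts = Counter()
    for title in titles:
        words = re.findall(r"[a-z0-9]+", title.lower())
        counts.update(w for w in words if w not in STOPWORDS)
    return counts


# Sample titles taken from the papers summarized in this deck.
titles = [
    "Learning to Synthesize Motion Blur",
    "Image Super-Resolution by Neural Texture Transfer",
    "Precise Detection in Densely Packed Scenes",
    "Bounding Box Regression with Uncertainty for Accurate Object Detection",
    "UPSNet: A Unified Panoptic Segmentation Network",
]
counts = keyword_counts(titles)
print(counts.most_common(3))
```

Running the same function over the full title list of two years would give the per-keyword counts that the 2018 vs 2019 comparison is built on.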
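
Several of the training tricks named on slide 18 (linear scaling LR, LR warmup, cosine learning rate decay, label smoothing) reduce to short formulas. The following is a rough sketch of those formulas in plain Python, not the paper's training code; the function names, the base learning rate of 0.1 at batch size 256, and the smoothing value `eps=0.1` are assumptions made for illustration.

```python
import math


def scaled_lr(batch_size, base_lr=0.1, base_batch=256):
    """Linear scaling: grow the learning rate in proportion to batch size."""
    return base_lr * batch_size / base_batch


def lr_at(step, total_steps, warmup_steps, peak_lr):
    """Linear warmup to peak_lr, then cosine decay toward zero."""
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


def smooth_labels(target, num_classes, eps=0.1):
    """Label smoothing: 1 - eps on the true class, eps spread over the rest."""
    off_value = eps / (num_classes - 1)
    return [1.0 - eps if i == target else off_value for i in range(num_classes)]


peak = scaled_lr(1024)  # batch size 1024, as on the slide
schedule = [lr_at(s, total_steps=100, warmup_steps=5, peak_lr=peak) for s in range(101)]
soft_target = smooth_labels(target=2, num_classes=5)
```

The schedule rises linearly for the first five steps, peaks at the scaled learning rate, and then follows half a cosine period down to zero at the final step.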
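
The generalized IoU from slide 23 can be illustrated for two axis-aligned boxes. This is a minimal sketch, assuming boxes are given as `(x1, y1, x2, y2)` tuples with positive area; the function name `iou_and_giou` is hypothetical. GIoU subtracts from IoU the fraction of the smallest enclosing box that is not covered by the union, so it stays informative even when the boxes do not overlap.

```python
def iou_and_giou(box_a, box_b):
    """Return (IoU, GIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (width/height clamp to zero when disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    union = area_a + area_b - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both inputs.
    enclose_w = max(ax2, bx2) - min(ax1, bx1)
    enclose_h = max(ay2, by2) - min(ay1, by1)
    enclose = enclose_w * enclose_h

    giou = iou - (enclose - union) / enclose
    return iou, giou


same = iou_and_giou((0, 0, 2, 2), (0, 0, 2, 2))
partial = iou_and_giou((0, 0, 2, 2), (1, 0, 3, 2))
apart = iou_and_giou((0, 0, 1, 1), (2, 0, 3, 1))
```

For the disjoint pair, IoU is 0 regardless of the distance between the boxes, while GIoU is negative and decreases as the boxes move further apart, which is what makes `1 - GIoU` usable as a regression loss.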