Deep VO and SLAM IV
Yu Huang
Yu.haung07@gmail.com
Sunnyvale, California
Outline
• Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry
• Supervising the new with the old: learning SFM from SFM
• Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency
• Digging Into Self-Supervised Monocular Depth Estimation
• Learning monocular visual odometry with dense 3D mapping from dense 3D flow
• Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion,
Optical Flow and Motion Segmentation
• Estimating Metric Scale Visual Odometry from Videos using 3D Convolutional Networks
• GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with
Generative Adversarial Networks
• DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network
• DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
• Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic
Understanding
Flowdometry: An Optical Flow and Deep Learning
Based Approach to Visual Odometry
• https://github.com/petermuller/flowdometry
• Visual odometry is a challenging task, related to simultaneous localization and mapping, that aims
to generate a map of the path traveled from a visual data stream.
• Based on one or two cameras, motion is estimated from features and pixel differences between
frames.
• Because of the frame rate of the cameras, there are generally small, incremental changes
between subsequent frames where optical flow can be assumed to be proportional to the
physical distance moved by an egocentric reference, such as a camera on a vehicle.
• This paper proposes a visual odometry system called Flowdometry based on optical flow and
deep learning.
• Optical flow images are used as input to a convolutional neural network, which calculates a
rotation and displacement for each image pixel.
• The displacements and rotations are applied incrementally to construct a map of where the
camera has traveled.
• The proposed system is trained and tested on the KITTI visual odometry dataset, and accuracy is
measured by the difference in distances between ground truth and predicted driving trajectories.
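A minimal sketch (PyTorch) of the idea, not the authors' FlowNetS-based network: a small CNN regresses a
per-frame-pair displacement and heading change from a two-channel optical-flow image, and the predictions
are chained by dead reckoning into a trajectory.

import math
import torch
import torch.nn as nn

class TinyFlowdometry(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 16, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1))
        self.head = nn.Linear(64, 2)        # [forward displacement, yaw change]

    def forward(self, flow):                # flow: (B, 2, H, W) optical-flow image
        return self.head(self.encoder(flow).flatten(1))

def integrate(steps):
    # steps: iterable of (displacement, yaw_change); dead-reckon a 2D trajectory
    x, y, yaw, traj = 0.0, 0.0, 0.0, [(0.0, 0.0)]
    for d, dyaw in steps:
        yaw += dyaw
        x += d * math.cos(yaw)
        y += d * math.sin(yaw)
        traj.append((x, y))
    return traj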
Flowdometry: An Optical Flow and Deep Learning
Based Approach to Visual Odometry
The Flowdometry convolutional neural network architecture
based on the contractive part of FlowNetS
FlowNetS architecture with the contractive side of the network
Flowdometry: An Optical Flow and Deep Learning
Based Approach to Visual Odometry
Supervising the new with the old: learning SFM
from SFM
• Recent work has demonstrated that it is possible to learn deep neural networks for monocular
depth and ego-motion estimation from unlabelled video sequences, an interesting theoretical
development with numerous advantages in applications.
• This paper proposes a number of improvements to these approaches.
• First, since such self-supervised approaches are based on the brightness constancy assumption,
which is valid only for a subset of pixels, they apply a probabilistic learning formulation where the
network predicts distributions over variables rather than specific values.
• As these distributions are conditioned on the observed image, the network can learn which scene
and object types are likely to violate the model assumptions, resulting in more robust learning.
• Second, they build on decades of experience in developing handcrafted structure-from-motion (SFM)
algorithms by using an off-the-shelf SFM system to generate a supervisory signal for the deep
neural network.
• While this signal is also noisy, this probabilistic formulation can learn and account for the defects
of SFM, helping to integrate different sources of information and boosting the overall
performance of the network.
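A hedged sketch of the probabilistic idea, not the paper's exact formulation: the network predicts a
per-pixel scale b alongside its other outputs, and the photometric residual is scored under a Laplacian
likelihood, so pixels that routinely violate brightness constancy can be down-weighted by predicting a
large b.

import torch

def laplacian_nll(residual, log_b):
    # residual: per-pixel photometric error |I_t - I_warped|; log_b: predicted log-scale
    b = torch.exp(log_b)
    return (residual.abs() / b + log_b).mean()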
Supervising the new with the old: learning SFM
from SFM
Supervising the new with the old: learning SFM
from SFM
Supervising the new with the old: learning SFM
from SFM
Self-Supervised Learning of Depth and Motion
Under Photometric Inconsistency
• The self-supervised learning of depth and pose from monocular sequences provides an attractive
solution by using the photometric consistency of nearby frames as it depends much less on the
ground-truth data.
• This paper addresses the issue that arises when the assumptions of previous self-supervised
approaches are violated due to the dynamic nature of real-world scenes.
• Rather than handling the noise as uncertainty, the key idea is to incorporate more robust
geometric quantities and enforce internal consistency in the temporal image sequence.
• Enforcing the depth consistency across adjacent frames significantly improves the depth
estimation with much fewer noisy pixels.
• The geometric information is implicitly embedded into the neural networks and adds no
overhead at inference time.
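A hedged sketch of a depth-consistency term; the reproject() helper here is a placeholder that returns,
for each pixel of frame a, its depth after applying the relative pose and its sampling location in
frame b's image.

import torch.nn.functional as F

def depth_consistency(depth_a, depth_b, pose_ab, K, reproject):
    # placeholder helper: z_in_b (B,1,H,W) is depth_a transformed into frame b,
    # grid (B,H,W,2) holds the corresponding sampling locations in [-1, 1]
    z_in_b, grid = reproject(depth_a, pose_ab, K)
    depth_b_at_a = F.grid_sample(depth_b, grid, align_corners=True)
    # symmetric, scale-normalized discrepancy; small where the two depth maps agree
    return ((z_in_b - depth_b_at_a).abs() /
            (z_in_b + depth_b_at_a).clamp(min=1e-6)).mean()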
Self-Supervised Learning of Depth and Motion
Under Photometric Inconsistency
Self-Supervised Learning of Depth and Motion
Under Photometric Inconsistency
uncertainty
Self-Supervised Learning of Depth and Motion
Under Photometric Inconsistency
Digging Into Self-Supervised Monocular Depth
Estimation
• Per-pixel ground-truth depth data is challenging to acquire at scale.
• To overcome this limitation, self-supervised learning has emerged as a promising alternative for
training models to perform monocular depth estimation.
• This paper proposes a set of improvements, which together result in both quantitatively and
qualitatively improved depth maps compared to competing self-supervised methods.
• Research on self-supervised monocular training usually explores increasingly complex
architectures, loss functions, and image formation models, all of which have recently helped to
close the gap with fully-supervised methods.
• It shows that a surprisingly simple model, and associated design choices, lead to superior
predictions.
• (i) a minimum reprojection loss, designed to robustly handle occlusions;
• (ii) a full-resolution multi-scale sampling method that reduces visual artifacts;
• (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions.
• https://github.com/nianticlabs/monodepth2
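A simplified sketch of the per-pixel minimum reprojection and auto-masking ideas, not the released
monodepth2 code: for each source frame the photometric error of the warped image is computed, the
per-pixel minimum over source frames handles occlusions, and pixels whose un-warped error is already
lower (e.g. static scenes or objects moving with the camera) are masked out of the loss.

import torch

def photometric(a, b):
    # plain L1 here; the paper combines SSIM and L1
    return (a - b).abs().mean(1, keepdim=True)

def min_reprojection_loss(target, warped_sources, raw_sources):
    reproj = torch.stack([photometric(target, w) for w in warped_sources], 0)
    identity = torch.stack([photometric(target, s) for s in raw_sources], 0)
    min_reproj, _ = reproj.min(0)            # per-pixel minimum: robust to occlusion
    min_identity, _ = identity.min(0)
    auto_mask = (min_reproj < min_identity).float()
    return (auto_mask * min_reproj).sum() / auto_mask.sum().clamp(min=1.0)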
Digging Into Self-Supervised Monocular Depth
Estimation
Moving objects
Digging Into Self-Supervised Monocular Depth
Estimation
Depth + Pose network | Per-pixel minimum reprojection | Full-resolution multi-scale
Digging Into Self-Supervised Monocular Depth
Estimation
Learning monocular visual odometry with dense 3D
mapping from dense 3D flow
• This paper introduces a fully deep learning approach to monocular SLAM, which can perform
simultaneous localization using a neural network for learning visual odometry (L-VO) and dense 3D mapping.
• Dense 2D flow and a depth image are generated from monocular images by sub-networks, which
are then used by a 3D flow associated layer in the L-VO network to generate dense 3D flow.
• Given this 3D flow, the dual stream L-VO network can then predict the 6DOF relative pose and
furthermore reconstruct the vehicle trajectory.
• In order to learn the correlation between motion directions, bivariate Gaussian modeling is
employed in the loss function.
• Moreover, the learned depth is leveraged to generate a dense 3D map.
• As a result, an entire visual SLAM system, that is, learning monocular odometry combined with
dense 3D mapping, is achieved.
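A hedged sketch of a bivariate Gaussian negative log-likelihood over two correlated translation
components; the names and parameterization below are assumptions, but this is the standard way such a
loss couples motion directions.

import torch

def bivariate_gaussian_nll(pred_xy, gt_xy, log_sx, log_sy, rho_raw):
    sx, sy = torch.exp(log_sx), torch.exp(log_sy)
    rho = torch.tanh(rho_raw)                     # keep correlation in (-1, 1)
    dx = (gt_xy[:, 0] - pred_xy[:, 0]) / sx
    dy = (gt_xy[:, 1] - pred_xy[:, 1]) / sy
    one_m_r2 = (1 - rho ** 2).clamp(min=1e-6)
    z = dx ** 2 - 2 * rho * dx * dy + dy ** 2
    # negative log-likelihood up to an additive constant
    nll = z / (2 * one_m_r2) + log_sx + log_sy + 0.5 * torch.log(one_m_r2)
    return nll.mean()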
Learning monocular visual odometry with dense 3D
mapping from dense 3D flow
Learning monocular visual odometry with dense 3D
mapping from dense 3D flow
learning visual odometry (L-VO) network
Learning monocular visual odometry with dense 3D
mapping from dense 3D flow
Competitive Collaboration: Joint Unsupervised Learning of
Depth, Camera Motion, Optical Flow and Motion Segmentation
• It addresses the unsupervised learning of several interconnected problems in low-level vision:
single view depth prediction, camera motion estimation, optical flow, and segmentation of a
video into the static scene and moving regions.
• The key insight is that these four fundamental vision problems are coupled through geometric constraints.
• Consequently, learning to solve them together simplifies the problem because the solutions can
reinforce each other.
• They go beyond previous work by exploiting geometry more explicitly and segmenting the scene
into static and moving regions.
• To that end, it introduces Competitive Collaboration, a framework that facilitates the coordinated
training of multiple specialized NNs to solve complex problems.
• Competitive Collaboration works much like expectation-maximization, but with NNs that act as
both competitors to explain pixels that correspond to static or moving regions, and as
collaborators through a moderator that assigns pixels to be either static or independently moving.
• This method integrates all these problems in a common framework and simultaneously reasons
about the segmentation of the scene into moving objects and the static background, the camera
motion, depth of the static scene structure, and the optical flow of moving objects.
• All models and code are available at https://github.com/anuragranj/cc.
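A hedged sketch of the two alternating phases as mask-weighted losses; the network names follow the slide,
while the tensors, the photometric helper, and the exact weighting are placeholders rather than the
released code.

import torch
import torch.nn.functional as F

def competition_loss(target, warp_static, warp_flow, mask):
    # mask close to 1 => pixel explained by the static scene + camera motion
    err_static = (target - warp_static).abs().mean(1, keepdim=True)
    err_flow = (target - warp_flow).abs().mean(1, keepdim=True)
    return (mask * err_static + (1 - mask) * err_flow).mean()

def collaboration_loss(target, warp_static, warp_flow, mask):
    # the moderator is pushed to label a pixel static wherever the static
    # reconstruction already explains it at least as well as the flow model
    err_static = (target - warp_static).abs().mean(1, keepdim=True)
    err_flow = (target - warp_flow).abs().mean(1, keepdim=True)
    consensus = (err_static <= err_flow).float()
    return F.binary_cross_entropy(mask.clamp(1e-6, 1 - 1e-6), consensus)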
Competitive Collaboration: Joint Unsupervised Learning of
Depth, Camera Motion, Optical Flow and Motion Segmentation
Competitive Collaboration: Joint Unsupervised Learning of
Depth, Camera Motion, Optical Flow and Motion Segmentation
DispNet
MaskNet
FlowNetC
Camera Motion Network
Competitive Collaboration: Joint Unsupervised Learning of
Depth, Camera Motion, Optical Flow and Motion Segmentation
Top to bottom: image, depth, soft consensus masks, motion-segmented optical flow, and combined optical flow
Competitive Collaboration: Joint Unsupervised Learning of
Depth, Camera Motion, Optical Flow and Motion Segmentation
Estimating Metric Scale Visual Odometry from
Videos using 3D Convolutional Networks
• Monocular visual odometry (VO) is a heavily studied topic in robotics as it enables robust 3D
localization with a ubiquitous, lightweight sensor: a single camera.
• Scale accuracy can only be achieved with geometric methods in one of two ways: 1) by fusing
information from a sensor that measures physical units, such as an IMU or GPS receiver, or 2) by
exploiting prior knowledge about objects in a scene, such as their typical size.
• This is an end-to-end (E2E) deep learning approach for performing metric scale-sensitive regression,
such as visual odometry, with a single camera and no additional sensors.
• They propose a 3D convolutional architecture, 3DC-VO, that can leverage temporal relationships
over a short moving window of images to estimate linear and angular velocities.
• The network makes local predictions on stacks of images that can be integrated to form a full
trajectory.
• https://www.github.com/alexanderkoumis/3dc_vo.
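A minimal sketch of the core idea, not the released 3DC-VO code: a stack of N consecutive frames is
treated as a short clip and passed through 3D convolutions, which convolve over time as well as space, to
regress linear and angular velocity.

import torch
import torch.nn as nn

class Tiny3DCVO(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=(3, 5, 5), stride=(1, 2, 2), padding=(1, 2, 2)),
            nn.ReLU(),
            nn.Conv3d(16, 32, kernel_size=(3, 3, 3), stride=(1, 2, 2), padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1))
        self.head = nn.Linear(32, 2)      # [linear velocity, angular velocity]

    def forward(self, clip):              # clip: (B, 3, N_frames, H, W)
        return self.head(self.features(clip).flatten(1))

# usage: velocities for a 5-frame window
model = Tiny3DCVO()
v = model(torch.randn(1, 3, 5, 64, 192))  # -> tensor of shape (1, 2)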
Estimating Metric Scale Visual Odometry from
Videos using 3D Convolutional Networks
A 3D convolution
Generic subnetwork structure
Estimating Metric Scale Visual Odometry from
Videos using 3D Convolutional Networks
GANVO: Unsupervised Deep Monocular Visual Odometry and
Depth Estimation with Generative Adversarial Networks
• In the last decade, supervised deep learning approaches have been extensively employed in
visual odometry (VO) applications, but these are not feasible in environments where labelled data is
not abundant.
• On the other hand, unsupervised deep learning approaches for localization and mapping in
unknown environments from unlabelled data have received comparatively less attention in VO
research.
• This study proposes a generative unsupervised learning framework that predicts the 6-DoF camera
pose and a monocular depth map of the scene from unlabelled RGB image sequences, using deep
convolutional Generative Adversarial Networks (GANs).
• They create a supervisory signal by warping view sequences and adopting reprojection
minimization as the objective loss function for the multi-view pose estimation and
single-view depth generation networks.
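A hedged sketch of the adversarial training signal; the view-synthesis warp and the discriminator D are
placeholders, not GANVO's exact architecture. The generator side (depth + pose) reconstructs the target
view by warping neighbouring frames, and D scores real target frames against the reconstructions,
complementing the usual reprojection (L1) term.

import torch
import torch.nn.functional as F

def generator_loss(D, target, reconstructed, lam=0.01):
    recon = (target - reconstructed).abs().mean()
    logits = D(reconstructed)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    return recon + lam * adv

def discriminator_loss(D, target, reconstructed):
    real_logits, fake_logits = D(target), D(reconstructed.detach())
    real = F.binary_cross_entropy_with_logits(real_logits, torch.ones_like(real_logits))
    fake = F.binary_cross_entropy_with_logits(fake_logits, torch.zeros_like(fake_logits))
    return real + fake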
GANVO: Unsupervised Deep Monocular Visual Odometry and
Depth Estimation with Generative Adversarial Networks
GANVO: Unsupervised Deep Monocular Visual Odometry and
Depth Estimation with Generative Adversarial Networks
GANVO: Unsupervised Deep Monocular Visual Odometry and
Depth Estimation with Generative Adversarial Networks
GANVO: Unsupervised Deep Monocular Visual Odometry and
Depth Estimation with Generative Adversarial Networks
DeepPCO: End-to-End Point Cloud Odometry
through Deep Parallel Neural Network
• Odometry is of key importance for localization in the absence of a map.
• There is considerable work in the area of visual odometry (VO), and recent advances in deep
learning have brought novel approaches to VO, which directly learn salient features from raw
images.
• These learning-based approaches have led to more accurate and robust VO systems.
• However, they have not been well applied to point cloud data yet.
• This work investigates how to exploit deep learning to estimate point cloud odometry (PCO),
which may serve as a critical component in point cloud-based downstream tasks or learning-
based systems.
• Specifically, they propose an end-to-end deep parallel neural network called DeepPCO, which can
estimate the 6-DOF pose between consecutive point clouds.
• It consists of two parallel sub-networks that estimate 3D translation and orientation respectively,
rather than a single neural network.
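A minimal sketch of the two-branch idea, not DeepPCO's exact layers: the input is assumed to be a pair of
consecutive point clouds encoded as projected 2D images and stacked along the channel dimension; one
branch regresses 3D translation and a parallel branch regresses orientation, instead of a single shared
regressor.

import torch
import torch.nn as nn

def branch(out_dim):
    return nn.Sequential(
        nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(32, out_dim))

class TinyDeepPCO(nn.Module):
    def __init__(self):
        super().__init__()
        self.translation = branch(3)      # (x, y, z)
        self.orientation = branch(3)      # e.g. Euler angles or an so(3) vector

    def forward(self, pair):              # pair: (B, 2, H, W) projected point clouds
        return self.translation(pair), self.orientation(pair)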
DeepPCO: End-to-End Point Cloud Odometry
through Deep Parallel Neural Network
DeepPCO: End-to-End Point Cloud Odometry
through Deep Parallel Neural Network
Four point cloud encoding
approaches for deep learning
DeepPCO: End-to-End Point Cloud Odometry
through Deep Parallel Neural Network
DeepPCO: End-to-End Point Cloud Odometry
through Deep Parallel Neural Network
Ablation experiment with a single-branch fully connected head. All parameter configurations
of the convolutional and fully connected layers are the same as in DeepPCO. Unlike
DeepPCO, in which the transformation vector is trained using two branches, the 3D translation (x, y, z)
and orientation (i, j, k) are jointly trained and inferred by a single branch here.
DeepPCO: End-to-End Point Cloud Odometry
through Deep Parallel Neural Network
“Deep Learning for Laser Based Odometry Estimation”
DF-Net: Unsupervised Joint Learning of Depth and
Flow using Cross-Task Consistency
• https://github.com/vt-vl-lab/DF-Net
• It presents an unsupervised learning framework for simultaneously training single-view depth
prediction and optical flow estimation models using unlabeled video sequences.
• Existing unsupervised methods often exploit brightness constancy and spatial smoothness priors
to train depth or flow models.
• This paper proposes to leverage geometric consistency as additional supervisory signals.
• The core idea is that, for rigid regions, the predicted scene depth and camera motion can be used to
synthesize 2D optical flow by back-projecting the induced 3D scene flow.
• The discrepancy between the rigid flow (from depth prediction and camera motion) and the
estimated flow (from the optical flow model) allows imposing a cross-task consistency loss.
• While all the networks are jointly optimized during training, they can be applied independently at
test time.
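A hedged sketch of synthesizing rigid flow from predicted depth and camera motion; the shapes and names
of depth, T, and K below are assumptions. Each pixel is back-projected with its depth, transformed by the
relative camera pose, and re-projected; the pixel displacement is the rigid flow, which can then be
compared against the flow network's output inside rigid regions.

import torch

def rigid_flow(depth, T, K):             # depth: (B,1,H,W), T: (B,4,4), K: (3,3)
    B, _, H, W = depth.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    pix = torch.stack([xs, ys, torch.ones_like(xs)], 0).float()      # (3,H,W)
    rays = torch.linalg.inv(K) @ pix.reshape(3, -1)                  # back-projected rays
    pts = rays.unsqueeze(0) * depth.reshape(B, 1, -1)                # (B,3,HW) 3D points
    pts = T[:, :3, :3] @ pts + T[:, :3, 3:]                          # apply camera motion
    proj = K @ pts
    uv = proj[:, :2] / proj[:, 2:].clamp(min=1e-3)                   # re-projected pixels
    flow = uv - pix[:2].reshape(1, 2, -1)                            # pixel displacement
    return flow.reshape(B, 2, H, W)

def cross_task_consistency(flow_pred, depth, T, K, rigid_mask):
    return (rigid_mask * (flow_pred - rigid_flow(depth, T, K)).abs()).mean()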
DF-Net: Unsupervised Joint Learning of Depth and
Flow using Cross-Task Consistency
DF-Net: Unsupervised Joint Learning of Depth and
Flow using Cross-Task Consistency
DF-Net: Unsupervised Joint Learning of Depth and
Flow using Cross-Task Consistency
Every Pixel Counts ++: Joint Learning of Geometry
and Motion with 3D Holistic Understanding
• https://github.com/chenxuluo/EPC
• Learning to estimate 3D geometry in a single frame and optical flow from consecutive frames by
watching unlabeled videos via Deep CNN has made significant progress recently.
• Current SoTA methods treat the two tasks independently. One typical assumption of existing
depth estimation methods is that the scene contains no independently moving objects, while object
motion can be easily modeled using optical flow.
• This paper proposes to address the two tasks as a whole, i.e. to jointly understand per-pixel 3D
geometry and motion.
• This eliminates the need for the static-scene assumption and enforces the inherent geometric
consistency during the learning process, yielding significantly improved results for both tasks.
• This method is called "Every Pixel Counts++" or "EPC++".
• Specifically, during training, given two consecutive frames from a video, they adopt three parallel
networks to predict the camera motion (MotionNet), dense depth map (DepthNet), and per-pixel
optical flow between two frames (OptFlowNet) respectively.
• The three types of information are fed into a holistic 3D motion parser (HMP), and the per-pixel 3D
motion of both the rigid background and moving objects is disentangled and recovered.
• Various loss terms are formulated to jointly supervise the three networks.
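A hedged, simplified 2D sketch in the spirit of the holistic 3D motion parser (the paper's HMP operates
on per-pixel 3D motion): the motion induced by the camera over the rigid background is synthesized from
depth and camera pose (a rigid_flow() helper like the DF-Net sketch above is assumed and passed in), and
the remainder of the predicted optical flow is attributed to independently moving objects.

import torch

def parse_motion(flow_full, depth, T_cam, K, rigid_flow):
    flow_background = rigid_flow(depth, T_cam, K)     # motion due to the camera only
    flow_objects = flow_full - flow_background        # residual: object motion
    # a crude soft moving-object mask: large residuals indicate independent motion
    moving_mask = (flow_objects.norm(dim=1, keepdim=True) > 1.0).float()
    return flow_background, flow_objects, moving_mask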
Every Pixel Counts ++: Joint Learning of Geometry
and Motion with 3D Holistic Understanding
Every Pixel Counts ++: Joint Learning of Geometry
and Motion with 3D Holistic Understanding
DepthNet
Every Pixel Counts ++: Joint Learning of Geometry
and Motion with 3D Holistic Understanding
Every Pixel Counts ++: Joint Learning of Geometry
and Motion with 3D Holistic Understanding
Every Pixel Counts ++: Joint Learning of Geometry
and Motion with 3D Holistic Understanding
Deep VO and SLAM IV

Contenu connexe

Tendances

3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving IIYu Huang
 
Depth Fusion from RGB and Depth Sensors III
Depth Fusion from RGB and Depth Sensors  IIIDepth Fusion from RGB and Depth Sensors  III
Depth Fusion from RGB and Depth Sensors IIIYu Huang
 
Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Yu Huang
 
3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IVYu Huang
 
Depth Fusion from RGB and Depth Sensors IV
Depth Fusion from RGB and Depth Sensors  IVDepth Fusion from RGB and Depth Sensors  IV
Depth Fusion from RGB and Depth Sensors IVYu Huang
 
Driving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XIDriving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XIYu Huang
 
Driving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VIIDriving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VIIYu Huang
 
Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIYu Huang
 
3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image IIIYu Huang
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingYu Huang
 
camera-based Lane detection by deep learning
camera-based Lane detection by deep learningcamera-based Lane detection by deep learning
camera-based Lane detection by deep learningYu Huang
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep LearningYu Huang
 
Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataYu Huang
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learningYu Huang
 
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIPedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIYu Huang
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)Yu Huang
 
Camera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIICamera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIIYu Huang
 
Deep vo and slam iii
Deep vo and slam iiiDeep vo and slam iii
Deep vo and slam iiiYu Huang
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous drivingYu Huang
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivYu Huang
 

Tendances (20)

3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II3-d interpretation from single 2-d image for autonomous driving II
3-d interpretation from single 2-d image for autonomous driving II
 
Depth Fusion from RGB and Depth Sensors III
Depth Fusion from RGB and Depth Sensors  IIIDepth Fusion from RGB and Depth Sensors  III
Depth Fusion from RGB and Depth Sensors III
 
Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)Lidar for Autonomous Driving II (via Deep Learning)
Lidar for Autonomous Driving II (via Deep Learning)
 
3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV3-d interpretation from single 2-d image IV
3-d interpretation from single 2-d image IV
 
Depth Fusion from RGB and Depth Sensors IV
Depth Fusion from RGB and Depth Sensors  IVDepth Fusion from RGB and Depth Sensors  IV
Depth Fusion from RGB and Depth Sensors IV
 
Driving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XIDriving behaviors for adas and autonomous driving XI
Driving behaviors for adas and autonomous driving XI
 
Driving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VIIDriving Behavior for ADAS and Autonomous Driving VII
Driving Behavior for ADAS and Autonomous Driving VII
 
Fisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving IIFisheye Omnidirectional View in Autonomous Driving II
Fisheye Omnidirectional View in Autonomous Driving II
 
3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III3-d interpretation from single 2-d image III
3-d interpretation from single 2-d image III
 
Fisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous DrivingFisheye Omnidirectional View in Autonomous Driving
Fisheye Omnidirectional View in Autonomous Driving
 
camera-based Lane detection by deep learning
camera-based Lane detection by deep learningcamera-based Lane detection by deep learning
camera-based Lane detection by deep learning
 
Stereo Matching by Deep Learning
Stereo Matching by Deep LearningStereo Matching by Deep Learning
Stereo Matching by Deep Learning
 
Deep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal DataDeep Learning’s Application in Radar Signal Data
Deep Learning’s Application in Radar Signal Data
 
Multi sensor calibration by deep learning
Multi sensor calibration by deep learningMulti sensor calibration by deep learning
Multi sensor calibration by deep learning
 
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VIPedestrian Behavior/Intention Modeling for Autonomous Driving VI
Pedestrian Behavior/Intention Modeling for Autonomous Driving VI
 
LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)LiDAR-based Autonomous Driving III (by Deep Learning)
LiDAR-based Autonomous Driving III (by Deep Learning)
 
Camera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning IIICamera-based road Lane detection by deep learning III
Camera-based road Lane detection by deep learning III
 
Deep vo and slam iii
Deep vo and slam iiiDeep vo and slam iii
Deep vo and slam iii
 
3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving3-d interpretation from stereo images for autonomous driving
3-d interpretation from stereo images for autonomous driving
 
Driving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xivDriving behaviors for adas and autonomous driving xiv
Driving behaviors for adas and autonomous driving xiv
 

Similaire à Deep VO and SLAM IV

Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)PetteriTeikariPhD
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIYu Huang
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VYu Huang
 
Effective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.IEffective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.IIJMTST Journal
 
Fisheye-Omnidirectional View in Autonomous Driving III
Fisheye-Omnidirectional View in Autonomous Driving IIIFisheye-Omnidirectional View in Autonomous Driving III
Fisheye-Omnidirectional View in Autonomous Driving IIIYu Huang
 
AaSeminar_Template.pptx
AaSeminar_Template.pptxAaSeminar_Template.pptx
AaSeminar_Template.pptxManojGowdaKb
 
New Method for Traffic Density Estimation Based on Topic Model
New Method for Traffic Density Estimation Based on Topic ModelNew Method for Traffic Density Estimation Based on Topic Model
New Method for Traffic Density Estimation Based on Topic ModelNidhi Shirbhayye
 
IRJET- Criminal Recognization in CCTV Surveillance Video
IRJET-  	  Criminal Recognization in CCTV Surveillance VideoIRJET-  	  Criminal Recognization in CCTV Surveillance Video
IRJET- Criminal Recognization in CCTV Surveillance VideoIRJET Journal
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingYu Huang
 
Fusion of Multi-MAV Data
Fusion of Multi-MAV DataFusion of Multi-MAV Data
Fusion of Multi-MAV DataDariolakis
 
High quality single shot capture of facial geometry
High quality single shot capture of facial geometryHigh quality single shot capture of facial geometry
High quality single shot capture of facial geometryBrohi Aijaz Ali
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVYu Huang
 
Emerging 3D Scanning Technologies for PropTech
Emerging 3D Scanning Technologies for PropTechEmerging 3D Scanning Technologies for PropTech
Emerging 3D Scanning Technologies for PropTechPetteriTeikariPhD
 
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...IRJET Journal
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IIYu Huang
 
Real Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance ApplicationsReal Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance ApplicationsEditor IJCATR
 

Similaire à Deep VO and SLAM IV (20)

Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)Deep Learning for Structure-from-Motion (SfM)
Deep Learning for Structure-from-Motion (SfM)
 
Fisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VIFisheye based Perception for Autonomous Driving VI
Fisheye based Perception for Autonomous Driving VI
 
Fisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving VFisheye/Omnidirectional View in Autonomous Driving V
Fisheye/Omnidirectional View in Autonomous Driving V
 
Introduction of slam
Introduction of slamIntroduction of slam
Introduction of slam
 
Effective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.IEffective Object Detection and Background Subtraction by using M.O.I
Effective Object Detection and Background Subtraction by using M.O.I
 
AR/SLAM for end-users
AR/SLAM for end-usersAR/SLAM for end-users
AR/SLAM for end-users
 
Fisheye-Omnidirectional View in Autonomous Driving III
Fisheye-Omnidirectional View in Autonomous Driving IIIFisheye-Omnidirectional View in Autonomous Driving III
Fisheye-Omnidirectional View in Autonomous Driving III
 
AaSeminar_Template.pptx
AaSeminar_Template.pptxAaSeminar_Template.pptx
AaSeminar_Template.pptx
 
New Method for Traffic Density Estimation Based on Topic Model
New Method for Traffic Density Estimation Based on Topic ModelNew Method for Traffic Density Estimation Based on Topic Model
New Method for Traffic Density Estimation Based on Topic Model
 
IRJET- Criminal Recognization in CCTV Surveillance Video
IRJET-  	  Criminal Recognization in CCTV Surveillance VideoIRJET-  	  Criminal Recognization in CCTV Surveillance Video
IRJET- Criminal Recognization in CCTV Surveillance Video
 
Unsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object trackingUnsupervised/Self-supervvised visual object tracking
Unsupervised/Self-supervvised visual object tracking
 
V01 i010405
V01 i010405V01 i010405
V01 i010405
 
Fusion of Multi-MAV Data
Fusion of Multi-MAV DataFusion of Multi-MAV Data
Fusion of Multi-MAV Data
 
High quality single shot capture of facial geometry
High quality single shot capture of facial geometryHigh quality single shot capture of facial geometry
High quality single shot capture of facial geometry
 
Fisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IVFisheye/Omnidirectional View in Autonomous Driving IV
Fisheye/Omnidirectional View in Autonomous Driving IV
 
Emerging 3D Scanning Technologies for PropTech
Emerging 3D Scanning Technologies for PropTechEmerging 3D Scanning Technologies for PropTech
Emerging 3D Scanning Technologies for PropTech
 
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
IRJET- Human Fall Detection using Co-Saliency-Enhanced Deep Recurrent Convolu...
 
slide-171212080528.pptx
slide-171212080528.pptxslide-171212080528.pptx
slide-171212080528.pptx
 
Camera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning IICamera-Based Road Lane Detection by Deep Learning II
Camera-Based Road Lane Detection by Deep Learning II
 
Real Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance ApplicationsReal Time Object Identification for Intelligent Video Surveillance Applications
Real Time Object Identification for Intelligent Video Surveillance Applications
 

Plus de Yu Huang

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingYu Huang
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...Yu Huang
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingYu Huang
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingYu Huang
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationYu Huang
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and PredictionYu Huang
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduYu Huang
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the HoodYu Huang
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)Yu Huang
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingYu Huang
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?Yu Huang
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingYu Huang
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgYu Huang
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymoYu Huang
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningYu Huang
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingYu Huang
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningYu Huang
 
Lidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainLidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainYu Huang
 
Autonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksAutonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksYu Huang
 
3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image VYu Huang
 

Plus de Yu Huang (20)

Application of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous DrivingApplication of Foundation Model for Autonomous Driving
Application of Foundation Model for Autonomous Driving
 
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...The New Perception Framework  in Autonomous Driving: An Introduction of BEV N...
The New Perception Framework in Autonomous Driving: An Introduction of BEV N...
 
Data Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous DrivingData Closed Loop in Simulation Test of Autonomous Driving
Data Closed Loop in Simulation Test of Autonomous Driving
 
Techniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous DrivingTechniques and Challenges in Autonomous Driving
Techniques and Challenges in Autonomous Driving
 
BEV Joint Detection and Segmentation
BEV Joint Detection and SegmentationBEV Joint Detection and Segmentation
BEV Joint Detection and Segmentation
 
BEV Object Detection and Prediction
BEV Object Detection and PredictionBEV Object Detection and Prediction
BEV Object Detection and Prediction
 
Prediction,Planninng & Control at Baidu
Prediction,Planninng & Control at BaiduPrediction,Planninng & Control at Baidu
Prediction,Planninng & Control at Baidu
 
Cruise AI under the Hood
Cruise AI under the HoodCruise AI under the Hood
Cruise AI under the Hood
 
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
LiDAR in the Adverse Weather: Dust, Snow, Rain and Fog (2)
 
Scenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous DrivingScenario-Based Development & Testing for Autonomous Driving
Scenario-Based Development & Testing for Autonomous Driving
 
How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?How to Build a Data Closed-loop Platform for Autonomous Driving?
How to Build a Data Closed-loop Platform for Autonomous Driving?
 
Annotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous DrivingAnnotation tools for ADAS & Autonomous Driving
Annotation tools for ADAS & Autonomous Driving
 
Simulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atgSimulation for autonomous driving at uber atg
Simulation for autonomous driving at uber atg
 
Prediction and planning for self driving at waymo
Prediction and planning for self driving at waymoPrediction and planning for self driving at waymo
Prediction and planning for self driving at waymo
 
Jointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planningJointly mapping, localization, perception, prediction and planning
Jointly mapping, localization, perception, prediction and planning
 
Data pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous drivingData pipeline and data lake for autonomous driving
Data pipeline and data lake for autonomous driving
 
Open Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planningOpen Source codes of trajectory prediction & behavior planning
Open Source codes of trajectory prediction & behavior planning
 
Lidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rainLidar in the adverse weather: dust, fog, snow and rain
Lidar in the adverse weather: dust, fog, snow and rain
 
Autonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucksAutonomous Driving of L3/L4 Commercial trucks
Autonomous Driving of L3/L4 Commercial trucks
 
3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V3-d interpretation from single 2-d image V
3-d interpretation from single 2-d image V
 

Dernier

University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdfKamal Acharya
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptDineshKumar4165
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Bookingdharasingh5698
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Arindam Chakraborty, Ph.D., P.E. (CA, TX)
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptxJIT KUMAR GUPTA
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performancesivaprakash250
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringmulugeta48
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...SUHANI PANDEY
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...roncy bisnoi
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VDineshKumar4165
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086anil_gaur
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaOmar Fathy
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf203318pmpc
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfJiananWang21
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTbhaskargani46
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationBhangaleSonal
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfRagavanV2
 

Dernier (20)

University management System project report..pdf
University management System project report..pdfUniversity management System project report..pdf
University management System project report..pdf
 
Integrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - NeometrixIntegrated Test Rig For HTFE-25 - Neometrix
Integrated Test Rig For HTFE-25 - Neometrix
 
Thermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.pptThermal Engineering -unit - III & IV.ppt
Thermal Engineering -unit - III & IV.ppt
 
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak HamilCara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
Cara Menggugurkan Sperma Yang Masuk Rahim Biyar Tidak Hamil
 
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 BookingVIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
VIP Call Girls Ankleshwar 7001035870 Whatsapp Number, 24/07 Booking
 
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
Navigating Complexity: The Role of Trusted Partners and VIAS3D in Dassault Sy...
 
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
COST-EFFETIVE  and Energy Efficient BUILDINGS ptxCOST-EFFETIVE  and Energy Efficient BUILDINGS ptx
COST-EFFETIVE and Energy Efficient BUILDINGS ptx
 
UNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its PerformanceUNIT - IV - Air Compressors and its Performance
UNIT - IV - Air Compressors and its Performance
 
chapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineeringchapter 5.pptx: drainage and irrigation engineering
chapter 5.pptx: drainage and irrigation engineering
 
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
VIP Model Call Girls Kothrud ( Pune ) Call ON 8005736733 Starting From 5K to ...
 
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
Call Girls Pimpri Chinchwad Call Me 7737669865 Budget Friendly No Advance Boo...
 
Thermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - VThermal Engineering-R & A / C - unit - V
Thermal Engineering-R & A / C - unit - V
 
Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086Minimum and Maximum Modes of microprocessor 8086
Minimum and Maximum Modes of microprocessor 8086
 
Introduction to Serverless with AWS Lambda
Introduction to Serverless with AWS LambdaIntroduction to Serverless with AWS Lambda
Introduction to Serverless with AWS Lambda
 
22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf22-prompt engineering noted slide shown.pdf
22-prompt engineering noted slide shown.pdf
 
data_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdfdata_management_and _data_science_cheat_sheet.pdf
data_management_and _data_science_cheat_sheet.pdf
 
Generative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPTGenerative AI or GenAI technology based PPT
Generative AI or GenAI technology based PPT
 
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
(INDIRA) Call Girl Aurangabad Call Now 8617697112 Aurangabad Escorts 24x7
 
DC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equationDC MACHINE-Motoring and generation, Armature circuit equation
DC MACHINE-Motoring and generation, Armature circuit equation
 
Unit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdfUnit 2- Effective stress & Permeability.pdf
Unit 2- Effective stress & Permeability.pdf
 

Deep VO and SLAM IV

  • 1. Deep VO and SLAM IV Yu Huang Yu.haung07@gmail.com Sunnyvale, California
  • 2. Outline • Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry • Supervising the new with the old: learning SFM from SFM • Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency • Digging Into Self-Supervised Monocular Depth Estimation • Learning monocular visual odometry with dense 3D mapping from dense 3D flow • Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation • Estimating Metric Scale Visual Odometry from Videos using 3D Convolutional Networks • GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks • DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network • DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency • Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding
  • 3. Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry • https://github.com/petermuller/flowdometry • Visual odometry is a challenging task related to simultaneous localization and mapping that aims to generate a map traveled from a visual data stream. • Based on one or two cameras, motion is estimated from features and pixel differences between frames. • Because of the frame rate of the cameras, there are generally small, incremental changes between subsequent frames where optical flow can be assumed to be proportional to the physical distance moved by an egocentric reference, such as a camera on a vehicle. • This paper porposed a visual odometry system called Flowdometry based on optical flow and deep learning. • Optical flow images are used as input to a convolutional neural network, which calculates a rotation and displacement for each image pixel. • The displacements and rotations are applied incrementally to construct a map of where the camera has traveled. • The proposed system is trained and tested on the KITTI visual odometry dataset, and accuracy is measured by the difference in distances between ground truth and predicted driving trajectories.
  • 4. Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry The Flowdometry convolutional neural network architecture based on the contractive part of FlowNetS FlowNetS architecture with the contractive side of the network
  • 5. Flowdometry: An Optical Flow and Deep Learning Based Approach to Visual Odometry
  • 6. Supervising the new with the old: learning SFM from SFM • Recent work has demonstrated that it is possible to learn deep neural networks for monocular depth and ego-motion estimation from unlabelled video sequences, an interesting theoretical development with numerous advantages in applications. • This paper propose a number of improvements to these approaches. • First, since such self supervised approaches are based on the brightness constancy assumption, which is valid only for a subset of pixels, apply a probabilistic learning formulation where the network predicts distributions over variables rather than specific values. • As these distributions are conditioned on the observed image, the network can learn which scene and object types are likely to violate the model assumptions, resulting in more robust learning; so build on dozens of years of experience in developing handcrafted structure-from-motion (SFM) algorithms by using an off-the-shelf SFM system to generate a supervisory signal for the deep neural network. • While this signal is also noisy, this probabilistic formulation can learn and account for the defects of SFM, helping to integrate different sources of information and boosting the overall performance of the network.
  • 7. Supervising the new with the old: learning SFM from SFM
  • 8. Supervising the new with the old: learning SFM from SFM
  • 9. Supervising the new with the old: learning SFM from SFM
  • 10. Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency • The self-supervised learning of depth and pose from monocular sequences provides an attractive solution by using the photometric consistency of nearby frames as it depends much less on the ground-truth data. • This paper addresses the issue when previous assumptions of the self-supervised approaches are violated due to the dynamic nature of real-world scenes. • Different from handling the noise as uncertainty, this key idea is to incorporate more robust geometric quantities and enforce internal consistency in the temporal image sequence. • Enforcing the depth consistency across adjacent frames significantly improves the depth estimation with much fewer noisy pixels. • The geometric information is implicitly embedded into neural networks and does not bring overhead for inference.
  • 11. Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency
  • 12. Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency uncertainty
  • 13. Self-Supervised Learning of Depth and Motion Under Photometric Inconsistency
  • 14. Digging Into Self-Supervised Monocular Depth Estimation • Per-pixel ground-truth depth data is challenging to acquire at scale. • To overcome this limitation, self-supervised learning has emerged as a promising alternative for training models to perform monocular depth estimation. • This paper proposes a set of improvements, which together result in both quantitatively and qualitatively improved depth maps compared to competing self-supervised methods. • Research on self-supervised monocular training usually explores increasingly complex architectures, loss functions, and image formation models, all of which have recently helped to close the gap with fully-supervised methods. • It shows that a surprisingly simple model, and associated design choices, lead to superior predictions. • (i) a minimum reprojection loss, designed to robustly handle occlusions; • (ii) a full-resolution multi-scale sampling method that reduces visual artifacts; • (iii) an auto-masking loss to ignore training pixels that violate camera motion assumptions. • https://github.com/nianticlabs/monodepth2
  • 15. Digging Into Self-Supervised Monocular Depth Estimation Moving objects
  • 16. Digging Into Self-Supervised Monocular Depth Estimation Per-pixel minimum reprojection Full-resolution multi-scaleDepth + Pose network
  • 17. Digging Into Self-Supervised Monocular Depth Estimation
  • 18. Learning monocular visual odometry with dense 3D mapping from dense 3D flow • This paper introduces a fully deep learning approach to monocular SLAM, which can perform simultaneous localization using a NN for learning visual odometry (L-VO) and dense 3D mapping. • Dense 2D flow and a depth image are generated from monocular images by sub-networks, which are then used by a 3D flow associated layer in the L-VO network to generate dense 3D flow. • Given this 3D flow, the dual stream L-VO network can then predict the 6DOF relative pose and furthermore reconstruct the vehicle trajectory. • In order to learn the correlation between motion directions, the Bivariate Gaussian modeling is employed in the loss function. • Moreover, the learned depth is leveraged to generate a dense 3D map. • As a result, an entire visual SLAM system, that is, learning monocular odometry combined with dense 3D mapping, is achieved.
  • 19. Learning monocular visual odometry with dense 3D mapping from dense 3D flow
  • 20. Learning monocular visual odometry with dense 3D mapping from dense 3D flow learning visual odometry (L-VO) network
  • 21. Learning monocular visual odometry with dense 3D mapping from dense 3D flow
• 22. Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation • It addresses the unsupervised learning of several interconnected problems in low-level vision: single-view depth prediction, camera motion estimation, optical flow, and segmentation of a video into the static scene and moving regions. • The key insight is that these four fundamental vision problems are coupled through geometric constraints. • Consequently, learning to solve them together simplifies the problem because the solutions can reinforce each other. • The work goes beyond previous approaches by exploiting geometry more explicitly and segmenting the scene into static and moving regions. • To that end, it introduces Competitive Collaboration, a framework that facilitates the coordinated training of multiple specialized NNs to solve complex problems (a sketch of the alternating schedule follows below). • Competitive Collaboration works much like expectation-maximization, but with NNs that act both as competitors, explaining pixels that correspond to static or moving regions, and as collaborators, through a moderator that assigns pixels as either static or independently moving. • The method integrates all these problems in a common framework and simultaneously reasons about the segmentation of the scene into moving objects and the static background, the camera motion, the depth of the static scene structure, and the optical flow of moving objects. • All models and code are available at https://github.com/anuragranj/cc.
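The alternating schedule can be sketched as below, assuming the four networks, the photometric reconstruction losses, and the optimizers are defined elsewhere; all names are placeholders rather than the authors' API, and regularization terms are omitted:

    def competitive_collaboration_step(batch, depth_net, pose_net, flow_net, mask_net,
                                       static_loss, moving_loss, opt_players, opt_moderator):
        # Competition phase: with the moderator (mask_net) frozen, the static-scene
        # player (depth + pose) and the moving-region player (flow) each explain
        # only the pixels the mask assigns to them.
        mask = mask_net(batch).detach()
        loss_players = (mask * static_loss(batch, depth_net, pose_net)).mean() \
                     + ((1 - mask) * moving_loss(batch, flow_net)).mean()
        opt_players.zero_grad(); loss_players.backward(); opt_players.step()

        # Collaboration phase: the players are frozen and the moderator is trained
        # to assign each pixel to whichever player explains it better.
        mask = mask_net(batch)
        loss_moderator = (mask * static_loss(batch, depth_net, pose_net).detach()).mean() \
                       + ((1 - mask) * moving_loss(batch, flow_net).detach()).mean()
        opt_moderator.zero_grad(); loss_moderator.backward(); opt_moderator.step()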
  • 23. Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation
• 24. Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation DispNet, MaskNet, FlowNetC, Camera Motion Network
• 25. Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation Top to bottom: image, depth, soft consensus masks, motion-segmented optical flow and combined optical flow
  • 26. Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation
• 27. Estimating Metric Scale Visual Odometry from Videos using 3D Convolutional Networks • Monocular visual odometry (VO) is a heavily studied topic in robotics, as it enables robust 3D localization with a ubiquitous, lightweight sensor: a single camera. • With geometric methods, scale accuracy can only be achieved in one of two ways: 1) by fusing information from a sensor that measures physical units, such as an IMU or GPS receiver, or 2) by exploiting prior knowledge about objects in the scene, such as their typical size. • This is an end-to-end deep learning approach for metric scale-sensitive regression tasks such as visual odometry, using a single camera and no additional sensors. • They propose a 3D convolutional architecture, 3DC-VO, that leverages temporal relationships over a short moving window of images to estimate linear and angular velocities. • The network makes local predictions on stacks of images that can be integrated to form a full trajectory (a dead-reckoning sketch follows below). • https://www.github.com/alexanderkoumis/3dc_vo.
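Once the network outputs per-window linear and angular velocities, a trajectory can be recovered by simple dead-reckoning integration; the sketch below assumes planar motion and a fixed time step, and all names are illustrative:

    import numpy as np

    def integrate_trajectory(linear_vel, angular_vel, dt=0.1):
        # linear_vel, angular_vel: 1-D arrays of per-window predictions
        x, y, theta = 0.0, 0.0, 0.0
        traj = [(x, y)]
        for v, w in zip(linear_vel, angular_vel):
            theta += w * dt
            x += v * dt * np.cos(theta)
            y += v * dt * np.sin(theta)
            traj.append((x, y))
        return np.array(traj)

    # example: constant speed, gentle left turn
    path = integrate_trajectory(np.full(50, 5.0), np.full(50, 0.05))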
• 28. Estimating Metric Scale Visual Odometry from Videos using 3D Convolutional Networks A 3D convolution; generic subnetwork structure
  • 29. Estimating Metric Scale Visual Odometry from Videos using 3D Convolutional Networks
• 30. GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks • In the last decade, supervised deep learning approaches have been extensively employed in visual odometry (VO) applications, but they are not feasible in environments where labelled data is not abundant. • On the other hand, unsupervised deep learning approaches for localization and mapping in unknown environments from unlabelled data have received comparatively less attention in VO research. • This study proposes a generative unsupervised learning framework that predicts the 6-DoF camera motion and a monocular depth map of the scene from unlabelled RGB image sequences, using deep convolutional Generative Adversarial Networks (GANs). • They create a supervisory signal by warping view sequences and using re-projection minimization as the objective loss for the multi-view pose estimation and single-view depth generation networks (a sketch follows below).
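A minimal sketch of this kind of supervision, assuming the warped (synthesized) target view has already been produced by the depth and pose networks; the discriminator, loss weights, and all names below are illustrative placeholders rather than GANVO's exact formulation:

    import torch
    import torch.nn.functional as F

    def generator_loss(real_target, synthesized_target, disc):
        # re-projection (L1) term ties the synthesis to the real target view
        recon = (real_target - synthesized_target).abs().mean()
        # adversarial term: the generator tries to make the synthesis look real
        logits_fake = disc(synthesized_target)
        adv = F.binary_cross_entropy_with_logits(logits_fake, torch.ones_like(logits_fake))
        return recon + 0.01 * adv

    def discriminator_loss(real_target, synthesized_target, disc):
        # the discriminator separates real target views from synthesized ones
        logits_real = disc(real_target)
        logits_fake = disc(synthesized_target.detach())
        return F.binary_cross_entropy_with_logits(logits_real, torch.ones_like(logits_real)) \
             + F.binary_cross_entropy_with_logits(logits_fake, torch.zeros_like(logits_fake))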
  • 31. GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
  • 32. GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
  • 33. GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
  • 34. GANVO: Unsupervised Deep Monocular Visual Odometry and Depth Estimation with Generative Adversarial Networks
• 35. DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network • Odometry is of key importance for localization in the absence of a map. • There is considerable work in the area of visual odometry (VO), and recent advances in deep learning have brought novel approaches to VO that directly learn salient features from raw images. • These learning-based approaches have led to more accurate and robust VO systems. • However, they have not yet been well applied to point cloud data. • This work investigates how to exploit deep learning to estimate point cloud odometry (PCO), which may serve as a critical component in point cloud-based downstream tasks or learning-based systems. • Specifically, they propose an end-to-end deep parallel neural network called DeepPCO, which can estimate 6-DOF poses from consecutive point clouds. • Rather than a single neural network, it consists of two parallel sub-networks that estimate the 3-D translation and orientation respectively (an illustrative two-branch sketch follows below).
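The two-branch idea can be illustrated with the PyTorch sketch below; it assumes the two consecutive point clouds are pre-encoded as 2D projection maps stacked along the channel dimension, and the layer sizes are illustrative rather than the paper's configuration:

    import torch
    import torch.nn as nn

    class ParallelOdomNet(nn.Module):
        def __init__(self, in_ch=2):
            super().__init__()
            def branch(out_dim):
                # small conv + FC head; each branch regresses a 3-vector
                return nn.Sequential(
                    nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                    nn.Linear(64, 128), nn.ReLU(),
                    nn.Linear(128, out_dim))
            self.trans_branch = branch(3)   # translation (x, y, z)
            self.rot_branch = branch(3)     # orientation (e.g. Euler angles)

        def forward(self, x):
            return self.trans_branch(x), self.rot_branch(x)

    net = ParallelOdomNet()
    t, r = net(torch.rand(1, 2, 64, 720))   # e.g. two stacked panoramic range images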
  • 36. DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network
  • 37. DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network Four point cloud encoding approaches for deep learning
  • 38. DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network
• 39. DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network Ablation experiment with single-branch fully connected layers. All parameter configurations of the convolutional and fully connected layers are the same as in DeepPCO. Unlike DeepPCO, where the transformation vector is trained using two branches, here the 3-D translation (x, y, z) and orientation (i, j, k) are jointly trained and inferred by a single branch.
  • 40. DeepPCO: End-to-End Point Cloud Odometry through Deep Parallel Neural Network “Deep Learning for Laser Based Odometry Estimation”
• 41. DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency • https://github.com/vt-vl-lab/DF-Net • It presents an unsupervised learning framework for simultaneously training single-view depth prediction and optical flow estimation models using unlabeled video sequences. • Existing unsupervised methods often exploit brightness constancy and spatial smoothness priors to train depth or flow models. • This paper proposes to leverage geometric consistency as an additional supervisory signal. • The core idea is that, for rigid regions, the predicted scene depth and camera motion can be used to synthesize 2D optical flow by back-projecting the induced 3D scene flow. • The discrepancy between the rigid flow (from depth prediction and camera motion) and the estimated flow (from the optical flow model) allows imposing a cross-task consistency loss (see the sketch below). • While all the networks are jointly optimized during training, they can be applied independently at test time.
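The cross-task consistency term can be illustrated by computing a rigid flow from predicted depth and camera motion and comparing it with the optical-flow network's output over rigid regions; the NumPy sketch below uses a simplified pinhole model and illustrative names, not the authors' implementation:

    import numpy as np

    def rigid_flow(depth, K, R, t):
        # depth: HxW, K: 3x3 intrinsics, (R, t): relative camera pose from frame t to t+1
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T     # 3 x N homogeneous pixels
        cam = np.linalg.inv(K) @ pix * depth.reshape(1, -1)                    # back-project to 3D
        cam2 = R @ cam + t.reshape(3, 1)                                       # move points into frame t+1
        pix2 = K @ cam2
        pix2 = pix2[:2] / pix2[2:3]                                            # re-project
        return (pix2 - pix[:2]).T.reshape(H, W, 2)                             # rigid 2D flow

    def cross_task_consistency(rigid, predicted, rigid_mask):
        # penalize disagreement between rigid flow and predicted flow only where the scene is rigid
        return (np.abs(rigid - predicted) * rigid_mask[..., None]).mean()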
  • 42. DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
  • 43. DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
  • 44. DF-Net: Unsupervised Joint Learning of Depth and Flow using Cross-Task Consistency
• 45. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding • https://github.com/chenxuluo/EPC • Learning to estimate 3D geometry from a single frame, and optical flow from consecutive frames, by watching unlabeled videos via deep CNNs has made significant progress recently. • Current SoTA methods treat the two tasks independently: a typical assumption of existing depth estimation methods is that the scene contains no independently moving objects, while object motion can easily be modeled using optical flow. • This paper proposes to address the two tasks as a whole, i.e. to jointly understand per-pixel 3D geometry and motion. • This eliminates the need for the static-scene assumption and enforces the inherent geometric consistency during learning, yielding significantly improved results for both tasks. • The method is called “Every Pixel Counts++” (EPC++). • Specifically, during training, given two consecutive frames from a video, they adopt three parallel networks to predict the camera motion (MotionNet), a dense depth map (DepthNet), and per-pixel optical flow between the two frames (OptFlowNet). • These three types of information are fed into a holistic 3D motion parser (HMP), and the per-pixel 3D motion of both the rigid background and moving objects is disentangled and recovered (a condensed sketch follows below). • Various loss terms are formulated to jointly supervise the three networks.
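A condensed sketch of the motion-parsing idea: the full per-pixel 3D motion (obtained from optical flow and the two depth maps) is split into the rigid component induced by camera motion and a residual attributed to moving objects; the simplified geometry and names below are illustrative only:

    import numpy as np

    def parse_3d_motion(full_flow3d, depth, K, R, t):
        # full_flow3d: HxWx3 per-pixel 3D motion; (R, t): camera motion from frame t to t+1
        H, W = depth.shape
        u, v = np.meshgrid(np.arange(W), np.arange(H))
        pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
        P = np.linalg.inv(K) @ pix * depth.reshape(1, -1)              # 3D points in frame t
        P_rigid = R @ P + t.reshape(3, 1)                              # where a static point would move
        rigid_flow3d = (P_rigid - P).T.reshape(H, W, 3)
        object_flow3d = full_flow3d - rigid_flow3d                     # residual attributed to object motion
        return rigid_flow3d, object_flow3d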
  • 46. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding
  • 47. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding DepthNet
  • 48. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding
  • 49. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding
  • 50. Every Pixel Counts ++: Joint Learning of Geometry and Motion with 3D Holistic Understanding