2. SOTA
I need State Of The Art tech
This talk is my attempt to summarize the current state of the art.
There is so much that we will stay at a high level!
I hope you'll find it fun.
I have a PhD in SLAM and I'm VP of
Robotics at an Australian-based
autonomous car company.
PROJECT 412
3. SOTA: Autonomous Car Tech
Long Range Autonomy Challenges
Where does Deep Learning fit in?
Autonomous Pilbara
NERD CHAT
CURRENT PROJECT
OVERVIEW
4. Sensors
3D & 2D LIDAR (Laser Range Scanning)
Cameras
Radar
INS (Inertial Navigation System)
GPS
LIDAR pucks, camera suite, long-range and short-range radars.
Side cameras
Radar & LIDAR
LEVEL 4-5
RESEARCH
5. SIMPLIFIED ARCHITECTURE
OBSTACLE AVOIDANCE
LOCALIZATION AGAINST MAP
LOCAL PATH PLANNING
GLOBAL PATH PLANNING
Sensors:
Cameras
Lidar
Radar
GPS
Pedestrian detection and
path prediction
*HARD*
Requires large dataset to
match against
*Dataset size is large*
*Has robust performance*
Local navigation in traffic
can be difficult.
*Good solutions exist*
Posed as graph optimization
problem. Won't discuss further.
*Good solutions exist*
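Global path planning posed as a graph problem reduces to shortest-path search, for which good solutions exist. A minimal sketch in Python; the road graph, node names, and costs below are invented for illustration:

```python
import heapq

def dijkstra(graph, start, goal):
    """Shortest path on a weighted digraph {node: [(neighbor, cost), ...]}."""
    frontier = [(0.0, start, [start])]
    visited = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for nbr, edge in graph.get(node, []):
            if nbr not in visited:
                heapq.heappush(frontier, (cost + edge, nbr, path + [nbr]))
    return float("inf"), []

# Hypothetical road graph: nodes are waypoints, weights are travel costs.
roads = {
    "depot": [("junction", 2.0), ("haul_road", 5.0)],
    "junction": [("haul_road", 1.0), ("pit", 4.0)],
    "haul_road": [("pit", 1.0)],
}
cost, route = dijkstra(roads, "depot", "pit")
```

Real planners add a heuristic (A*) and plan over lane-level road networks, but the graph formulation is the same.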
6. Deep networks are everywhere!
"Anticipating Traffic Accidents with Adaptive Loss and Large-scale Incident DB"
"Pedestrian Prediction by Planning Using Deep Neural Networks"
Image segmentation
Lidar based approaches
"Pedestrian Recognition Using High Definition Lidar"
SqueezeSeg: CNN for Lidar
OBSTACLES
VISION-BASED DETECTION & PREDICTION
https://github.com/aitorzip/DeepGTAV
7. MAP LOCALIZATION
FRONT END SENSOR FUSION
Traditional feature detection: SIFT/SURF/Harris
Lidar/radar-based scene recognition (Iterative
Closest Point scan matching).
New techniques like
"PointNet" and
"FoldingNet: Point Cloud Auto-encoder via Deep
Grid Deformation"
Needs a database of features to map against. With
vision only, a ~142 km route in Southern England
required a 330 GB dataset. "Vast-scale Outdoor
Navigation Using Adaptive Relative Bundle Adjustment"
RSLAM registration
SqueezeSeg Registration
"3d Point Cloud Registration for Localization Using a Deep
Neural Network Autoencoder"
FoldingNet autoencoder
PointNet++ architecture
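The Iterative Closest Point scan matching mentioned above can be sketched in a few lines. This is a minimal 2-D point-to-point variant (real systems work in 3-D with robust data association and outlier rejection); the sample points are invented:

```python
import math

def icp_2d(src, dst, iterations=20):
    """Minimal 2-D point-to-point ICP: rigidly align src onto dst.
    Returns the transformed copy of src."""
    pts = [tuple(p) for p in src]
    for _ in range(iterations):
        # 1. Data association: nearest dst point for each src point.
        pairs = [(p, min(dst, key=lambda d: (d[0]-p[0])**2 + (d[1]-p[1])**2))
                 for p in pts]
        # 2. Closed-form 2-D rigid transform from the matched pairs.
        n = len(pairs)
        pcx = sum(p[0] for p, _ in pairs) / n
        pcy = sum(p[1] for p, _ in pairs) / n
        qcx = sum(q[0] for _, q in pairs) / n
        qcy = sum(q[1] for _, q in pairs) / n
        s_cross = s_dot = 0.0
        for p, q in pairs:
            px, py = p[0] - pcx, p[1] - pcy
            qx, qy = q[0] - qcx, q[1] - qcy
            s_cross += px * qy - py * qx
            s_dot += px * qx + py * qy
        theta = math.atan2(s_cross, s_dot)
        c, s = math.cos(theta), math.sin(theta)
        tx = qcx - (c * pcx - s * pcy)
        ty = qcy - (s * pcx + c * pcy)
        # 3. Apply the update to all src points and repeat.
        pts = [(c*x - s*y + tx, s*x + c*y + ty) for x, y in pts]
    return pts

# Example: dst is src rotated by 0.1 rad and shifted; ICP should recover it.
src = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0)]
c, s = math.cos(0.1), math.sin(0.1)
dst = [(c*x - s*y + 0.2, s*x + c*y - 0.1) for x, y in src]
aligned = icp_2d(src, dst)
```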
8. MAP LOCALIZATION
BACK END MAPPING
Maintaining a "map" of the entire world
Referencing current location against a set of
landmarks.
Approximating to reduce complexity.
Relative SLAM
COP-SLAM
GPS/INS is useful but not good enough for the required
precision (0.1 m needed in all conditions).
Need to keep representation small, but maintain
accuracy.
RSLAM localization
COP-SLAM Pose Chain (above) and Landmarks (below)
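The pose-chain idea can be illustrated with a deliberately naive sketch: spread the loop-closure residual evenly along the chain. COP-SLAM does the real version in closed form over full SE(3) poses with covariance weighting; this position-only toy, with invented odometry values, only shows the principle:

```python
def distribute_loop_error(poses, loop_target):
    """Naive pose-chain correction: spread the loop-closure residual
    at the final pose evenly along the chain. Illustration only;
    COP-SLAM solves this in closed form over full 6-DoF poses."""
    ex = loop_target[0] - poses[-1][0]
    ey = loop_target[1] - poses[-1][1]
    n = len(poses) - 1
    return [(x + ex * i / n, y + ey * i / n)
            for i, (x, y) in enumerate(poses)]

# Drifted odometry chain that should have returned to the start (0, 0).
chain = [(0.0, 0.0), (1.0, 0.1), (2.0, 0.3), (1.0, 0.5), (0.2, 0.4)]
corrected = distribute_loop_error(chain, (0.0, 0.0))
```

The correction grows linearly along the chain, so the first pose stays fixed and the last pose lands exactly on the loop-closure target.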
9. SIMULATION BASED
Deep Traffic MIT Self Driving Course.
"Navigating Occluded Intersections with Autonomous Vehicles
using Deep Reinforcement Learning"
TRAFFIC NAV
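The deep-RL intersection work above learns when to creep into an occluded intersection. A toy tabular Q-learning version of the same go/wait decision; the gap-size MDP, rewards, and hyperparameters below are all invented for illustration (the paper uses deep networks and a far richer state):

```python
import random

random.seed(0)
WAIT, GO = 0, 1

def step(gap, action):
    """Toy intersection MDP: observe a traffic gap (0-3), choose WAIT or GO.
    Returns (reward, next_gap, done)."""
    if action == GO:
        # Going into a big enough gap succeeds; a small gap is a collision.
        return (1.0, None, True) if gap >= 2 else (-10.0, None, True)
    return -0.1, random.randint(0, 3), False  # waiting costs a little time

def q_learn(episodes=5000, alpha=0.1, gamma=0.95, eps=0.1):
    q = {(g, a): 0.0 for g in range(4) for a in (WAIT, GO)}
    for _ in range(episodes):
        gap, done = random.randint(0, 3), False
        while not done:
            a = random.choice((WAIT, GO)) if random.random() < eps \
                else max((WAIT, GO), key=lambda a: q[(gap, a)])
            r, nxt, done = step(gap, a)
            target = r if done else r + gamma * max(q[(nxt, WAIT)], q[(nxt, GO)])
            q[(gap, a)] += alpha * (target - q[(gap, a)])
            gap = nxt
    return q

q = q_learn()
policy = {g: max((WAIT, GO), key=lambda a: q[(g, a)]) for g in range(4)}
```

After training, the learned policy waits on small gaps and goes on large ones.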
10. Sensors
LIDAR (Laser Range Scanning)
Cameras
Radar
INS (Inertial Navigation System)
GPS
LEVEL 2-3
PRODUCTION
11. LEVEL 2-3 ISSUES
OBSTACLE AVOIDANCE
LOCALIZATION AGAINST MAP
LOCAL PATH PLANNING
DRIVER STATE MONITORING
Sensors:
Cameras
Lidar
Radar
GPS
+
Driver
sensors
No pedestrians allowed.
What about animals?
Smaller map size as
driving range restricted.
Tech is the same as level 4-5
Reduced set of navigation
tasks, e.g. only lane changing
and lane following.
Driver must be attentive
12. Deep Traffic MIT Self Driving Course. Deep Tesla
"Map Based Precision Vehicle Localization in Urban Environments"
LANE FOLLOWING
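Lane following ultimately bottoms out in a steering controller tracking the lane centerline. A common baseline (not claimed to be what the cited systems use) is pure pursuit; the wheelbase and points below are hypothetical:

```python
import math

def pure_pursuit_steer(vehicle, target, wheelbase=2.7):
    """Pure-pursuit steering angle toward a lane-centerline lookahead point.
    vehicle = (x, y, heading in rad); target = (x, y)."""
    dx, dy = target[0] - vehicle[0], target[1] - vehicle[1]
    alpha = math.atan2(dy, dx) - vehicle[2]  # bearing to target, vehicle frame
    ld = math.hypot(dx, dy)                  # lookahead distance
    # Bicycle-model pure pursuit: steer onto the arc through the target.
    return math.atan2(2.0 * wheelbase * math.sin(alpha), ld)

# Vehicle at the origin facing +x; lookahead point slightly to the left.
steer = pure_pursuit_steer((0.0, 0.0, 0.0), (10.0, 1.0))
```

A target dead ahead yields zero steering; a target to the left yields a positive (left) steering angle, with the lookahead distance trading responsiveness against stability.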
13. "AI Co-Pilot: RNNs for Dynamic Facial Analysis" by NVIDIA allows for
CV-based gaze detection
DRIVER STATE
15. MAPS
Can't store highly detailed maps
Fewer road users means a high chance of
unexpected changes.
GPS is usually more available
Likely to interact with wild animals
Cannot rely on lane markings to guide
lane following
Environment can be dusty
Expect fewer intersections to be
navigated
FATIGUE
OBSTACLES
WHAT MAKES LONG RANGE DIFFERENT
Driver fatigue is probable
Other drivers are likely to be
fatigued as well.