Scalable Fiducial Tag Localization on a 3D Prior Map
Via Graph-Theoretic Global Tag-Map Registration
Kenji Koide, Shuji Oishi, Masashi Yokozuka, and Atsuhiko Banno
National Institute of Advanced Industrial Science and Technology (AIST), Japan
Background
• Map-based visual localization has been attracting much attention
• It is, however, sometimes necessary to rely on visual fiducial tags
(aka visual markers) for initialization and fail-safe
[Oishi, 2020]
Motivation
• Deploying many tags on a 3D prior map is sometimes difficult and tedious
• Tag positions are often measured by hand; large effort and inaccurate results
• We aim to develop an accurate and automatic method to determine tag poses
in the environment
Proposed Method
1. VIO-based Tag-Relative-Pose Estimation
We use an agile camera to observe tags in the environment and
estimate the relative poses between tags via landmark SLAM
2. Global Tag-Map Registration
We then roughly align tags and a prior map by establishing tag-plane
correspondences via graph-theoretic correspondence estimation
3. Estimation Refinement via Direct Camera-Map Alignment
Tag and camera poses are refined by directly aligning agile camera images with
the prior map, and all variables are re-optimized under all constraints
VIO-based Tag-Relative-Pose Estimation
• We use an agile camera and observe each tag in the environment at least once
• The tag poses in the VIO frame are estimated via landmark SLAM
VIO
(VINS-Mono)
Tag detections
(Apriltags)
Pose graph optimization
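As a rough illustration of how a single tag detection can be expressed in the VIO frame, the sketch below composes a camera pose (as VIO would provide) with a camera-relative tag pose (as a tag detector would provide). The helper names and the example numbers are hypothetical, and the actual landmark SLAM back end optimizes all poses jointly rather than relying on one composition like this.

```python
def compose(T_ab, T_bc):
    """Multiply two 4x4 homogeneous transforms given as nested lists."""
    return [[sum(T_ab[r][k] * T_bc[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]

def translation(x, y, z):
    """Pure translation as a 4x4 homogeneous transform."""
    return [[1.0, 0.0, 0.0, x],
            [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z],
            [0.0, 0.0, 0.0, 1.0]]

def rot_z_90():
    """Rotation of +90 degrees about the z axis."""
    return [[0.0, -1.0, 0.0, 0.0],
            [1.0, 0.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]

# Hypothetical camera pose in the VIO frame: at (1, 2, 0), yawed by 90 degrees.
T_vio_cam = compose(translation(1.0, 2.0, 0.0), rot_z_90())
# Hypothetical detection: the tag sits 0.5 m along the camera x axis.
T_cam_tag = translation(0.5, 0.0, 0.0)

# Tag pose in the VIO frame = camera pose composed with the detection.
T_vio_tag = compose(T_vio_cam, T_cam_tag)
tag_position = [T_vio_tag[r][3] for r in range(3)]
print(tag_position)
```

In a pose graph, a composition like this would typically serve only as the initial value of a tag node; repeated detections of the same tag then add constraints that the optimizer reconciles.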
Global Tag-Map Registration
• We want to align the estimated tag poses with a prior 3D map without an initial guess
• The modality difference makes it difficult to apply image matching…
Prior 3D map (sparse point cloud) Estimated tag poses (visually detected)
Align w/o initial guess
Geometry-based Tag-Plane Matching
• We assume that most tags are placed on a plane in the environment
• We establish tag-plane correspondences to determine the tag-map transformation
Detecting planes in the environment
1. Region growing segmentation
2. RANSAC plane detection
3. Fit oriented BBoxes to plane points
Plane = (center, normal, lengths)
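The RANSAC plane detection step can be sketched as follows. This is a minimal, pure-Python illustration under stated assumptions: the function name, the inlier tolerance, and the toy point set are hypothetical, the plane center is taken as the inlier centroid, and the region growing and oriented bounding box steps (which would give the plane lengths) are omitted.

```python
import random

def fit_plane_ransac(points, n_iters=200, inlier_tol=0.02, seed=0):
    """Return (center, normal, inliers) of the dominant plane in `points`."""
    rng = random.Random(seed)
    best_normal, best_inliers = None, []
    for _ in range(n_iters):
        p0, p1, p2 = rng.sample(points, 3)
        u = [p1[i] - p0[i] for i in range(3)]
        v = [p2[i] - p0[i] for i in range(3)]
        # The plane normal is the cross product of two in-plane edge vectors.
        n = [u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0]]
        norm = sum(c * c for c in n) ** 0.5
        if norm < 1e-9:
            continue  # degenerate (collinear) sample
        n = [c / norm for c in n]
        # Inliers are points whose point-to-plane distance is below the tolerance.
        inliers = [p for p in points
                   if abs(sum(n[i] * (p[i] - p0[i]) for i in range(3))) < inlier_tol]
        if len(inliers) > len(best_inliers):
            best_normal, best_inliers = n, inliers
    center = [sum(p[i] for p in best_inliers) / len(best_inliers) for i in range(3)]
    return center, best_normal, best_inliers

# Toy data: a 6x6 grid on the plane z = 0 plus four off-plane outliers.
grid = [(0.1 * x, 0.1 * y, 0.0) for x in range(6) for y in range(6)]
outliers = [(0.1, 0.2, 0.5), (0.3, 0.1, 0.9), (0.4, 0.4, 0.3), (0.2, 0.5, 0.7)]
center, normal, inliers = fit_plane_ransac(grid + outliers)
print(center, normal, len(inliers))
```

The sign of the recovered normal depends on the order of the sampled points, so a consumer of this output would need to fix an orientation convention before comparing normals.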
Max-Clique-based Correspondence Estimation
• Tag-Plane Correspondence Consistency Graph
Vertex: tag-plane correspondence hypothesis
(h_ij: tag i corresponds to plane j; h_kl: tag k corresponds to plane l)
Edge: consistency between correspondence hypotheses
(h_ij does not contradict h_kl, i.e., they are consistent)
• The largest subset of hypotheses that are all mutually consistent (i.e., the maximum clique)
gives the best explanation for the tag placement in the given map
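The idea can be sketched on a toy consistency graph. The code below uses the classic Bron-Kerbosch search with pivoting and a simple size bound, which is a swapped-in textbook technique chosen for brevity, not the solver used in the slides. The hypothesis labels and the edge list are hypothetical.

```python
def maximum_clique(adjacency):
    """Return one maximum clique of an undirected graph.

    `adjacency` maps each vertex to the set of its neighbours.
    """
    best = []

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        if len(r) + len(p) <= len(best):
            return  # this branch cannot beat the best clique found so far
        # Pivoting: only vertices outside the pivot's neighbourhood are branched on.
        pivot = max(p | x, key=lambda v: len(adjacency[v] & p))
        for v in list(p - adjacency[pivot]):
            expand(r + [v], p & adjacency[v], x & adjacency[v])
            p = p - {v}
            x = x | {v}

    expand([], set(adjacency), set())
    return best

# Vertices are (tag, plane) hypotheses; edges join mutually consistent pairs.
edges = [
    (("t1", "pA"), ("t2", "pB")),
    (("t1", "pA"), ("t3", "pC")),
    (("t2", "pB"), ("t3", "pC")),
    (("t1", "pB"), ("t2", "pA")),
]
adjacency = {}
for a, b in edges:
    adjacency.setdefault(a, set()).add(b)
    adjacency.setdefault(b, set()).add(a)

clique = maximum_clique(adjacency)
print(sorted(clique))
```

Here the three hypotheses t1-pA, t2-pB, and t3-pC support each other, so they form the maximum clique, while the competing pair t1-pB and t2-pA is left out.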
Tag-Plane Correspondence Consistency
• Consistency between tag-plane correspondence hypotheses is determined
based on geometric consistency check
• We align tag i with plane j and then evaluate the distance between tag k and plane l
• If the normal and translation errors between tag k and plane l are smaller than
the thresholds, these hypotheses are mutually consistent
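A minimal sketch of this check is given below. It assumes that the rigid transform (R, t) obtained by aligning tag i with plane j is already available as an input, that the normal error is the angle between the transformed tag normal and the plane normal, and that the translation error is the point-to-plane distance. The function names and both threshold values are hypothetical, and the plane extents are ignored.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mat_vec(R, v):
    """Apply a 3x3 rotation matrix (nested lists) to a 3-vector."""
    return [dot(row, v) for row in R]

def is_consistent(R, t, tag_k_pos, tag_k_normal, plane_l_center, plane_l_normal,
                  max_normal_err_deg=10.0, max_trans_err=0.1):
    """Check tag k against plane l after applying the tag-i-to-plane-j alignment."""
    # Move tag k into the map frame with the candidate alignment.
    p = [a + b for a, b in zip(mat_vec(R, tag_k_pos), t)]
    n = mat_vec(R, tag_k_normal)
    # Normal error: angle between the transformed tag normal and the plane normal.
    cos_angle = max(-1.0, min(1.0, dot(n, plane_l_normal)))
    normal_err_deg = math.degrees(math.acos(cos_angle))
    # Translation error: distance from the transformed tag position to the plane.
    offset = [p[i] - plane_l_center[i] for i in range(3)]
    trans_err = abs(dot(plane_l_normal, offset))
    return normal_err_deg < max_normal_err_deg and trans_err < max_trans_err

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
zero = [0.0, 0.0, 0.0]

# Tag k lies 2 cm from a plane with the same normal: consistent.
ok = is_consistent(identity, zero, [0.0, 0.0, 1.0], [0.0, 0.0, 1.0],
                   [2.0, 3.0, 1.02], [0.0, 0.0, 1.0])
# Same tag against a perpendicular plane: the normal error is 90 degrees.
bad = is_consistent(identity, zero, [0.0, 0.0, 1.0], [0.0, 0.0, 1.0],
                    [0.0, 0.0, 1.0], [1.0, 0.0, 0.0])
print(ok, bad)
```

Each pair of hypotheses that passes a check of this kind would contribute one edge to the consistency graph.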
Example Result
Planes
Tags
• While the consistency graph contains many edges,
the max-clique can be found very efficiently [Rossi, 2015]
Consistency graph contains
429,735 hypothesis pairs
Maximum clique consists of
56 tag-plane correspondences
found in 92 msec
• Given the tag-plane correspondences, we estimate the tag-map transformation
by minimizing normal-to-normal ICP distance [Rusinkiewicz, 2019]
Estimation Refinement
• We refine the tag poses by directly aligning agile camera images with the map
VIO
Tag detections
Pose graph
Direct alignment
• We use the normalized information distance (NID), a mutual-information-based
cross-modal metric, to maximize the co-occurrence of pixel and map intensity values
• Tag and camera poses are re-optimized under all the constraints
Agile camera image
Map rendered with
optimized camera pose
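For reference, NID can be computed from a joint histogram as NID(X, Y) = (H(X, Y) - I(X; Y)) / H(X, Y), where H is entropy and I is mutual information. The sketch below is a minimal pure-Python version that assumes the intensities are already quantized into discrete bins; the function names and the toy inputs are hypothetical, and no pose optimization is shown.

```python
import math
from collections import Counter

def entropy(counts, total):
    """Shannon entropy (in bits) of a histogram given as a Counter."""
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def nid(x, y):
    """Normalized information distance between two equal-length bin sequences."""
    total = len(x)
    h_x = entropy(Counter(x), total)
    h_y = entropy(Counter(y), total)
    h_xy = entropy(Counter(zip(x, y)), total)
    if h_xy == 0.0:
        return 0.0  # both signals are constant, so there is nothing to distinguish
    mutual_info = h_x + h_y - h_xy
    return (h_xy - mutual_info) / h_xy

image = [0, 0, 1, 1]
same = nid(image, [0, 0, 1, 1])         # identical intensities
remapped = nid(image, [5, 5, 3, 3])     # different values, same structure
unrelated = nid(image, [0, 1, 0, 1])    # statistically independent
print(same, remapped, unrelated)
```

The remapped case illustrates why a metric of this kind suits cross-modal alignment: the distance depends on how consistently intensity values co-occur, not on the values being equal.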
Evaluation in Simulation
• The method is evaluated on the Replica dataset [Savva, 2019]
Global tag-map registration
: 0.039m / 1.021°
Tag localization accuracy
: 98% success rate
Baseline (FPFH+RANSAC/Teaser) : 26% and 70%
Robustness to outlier tags
Evaluation in Real Environment
• 117 tags were placed in the environment
• Tag poses were estimated in 22 minutes (16 min for VIO recording, 6 min for post-processing)
• Average tag pose error: 0.019 m and 2.382°
Final estimation result
Thank you for your attention!!
Conclusion
• An accurate and scalable method for fiducial tag localization on a 3D prior
environmental map is proposed
• VIO-based tag relative pose estimation via landmark SLAM
• Global tag-map registration based on tag-plane correspondence estimation
via maximum clique finding
• Estimation refinement via NID-based direct camera-map alignment
• The proposed method could localize over 100 tags in 22 minutes
• The average tag localization error was about 2 cm
