[DL Reading Group (DL輪読会)] Segment Anything
Deep Learning JP
2023/4/7 Deep Learning JP http://deeplearning.jp/seminar-2/
1.
Segment Anything
Shohei Taniguchi, Matsuo Lab
2.
Segment Anything: bibliographic information
Authors: Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Alex Berg, Wan-Yen Lo, Piotr Dollar, Ross Girshick
Overview
• SAM, a foundation model for segmentation released by Meta
• The SA-1B dataset, with over 1 billion masks annotated on 11 million images, was released as well
3.
Overview: Segment Anything Model (SAM)
• A model that can generate object masks from a variety of prompts: point clicks, text, regions, and so on
4.
Overview: Segment Anything Model (SAM)
• Edge prediction and text-to-mask also work reasonably well zero-shot
5.
Outline of this talk
• Task: Promptable segmentation
• Model: Segment Anything Model
• Data: Data engine
• Experiments
• Summary
6.
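Background
• Large language models have advanced dramatically in recent years
‣ Given a prompt, they can generate language freely
‣ Performance keeps improving according to scaling laws
➡ Can the same be done in computer vision?
https://j.gifs.com/Y7mBPW.gif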
7.
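Task: Promptable Segmentation
• Unlike conventional segmentation tasks, the target to segment is specified by a prompt
‣ Point clicks, regions, text, and so on
• Because prompts can be ambiguous, there is not necessarily a single correct mask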
8.
Model: Segment Anything Model (SAM)
• The architecture is fairly simple
1. Embed the image and the prompt separately
2. Generate masks from the embeddings with a Transformer-based decoder
9.
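Model: Segment Anything Model (SAM)
• Image encoder
‣ Embeds the image into features
‣ Internally a ViT
‣ This is the most computationally expensive part, but if the features are kept at inference time, the prompt can be changed in real time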
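The encode-once pattern described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not SAM's actual API: `CachedSegmenter`, `toy_encoder`, and `toy_decoder` are hypothetical names, and the two toy functions only stand in for the heavy ViT encoder and the light mask decoder.

```python
class CachedSegmenter:
    """Toy illustration: run the heavy encoder once, then answer many prompts."""

    def __init__(self, image_encoder, mask_decoder):
        self.image_encoder = image_encoder
        self.mask_decoder = mask_decoder
        self.encoder_calls = 0      # counts how often the expensive step runs
        self._embedding = None      # cached image features

    def set_image(self, image):
        # Expensive step: executed once per image.
        self.encoder_calls += 1
        self._embedding = self.image_encoder(image)

    def predict(self, point):
        # Cheap step: reuses the cached features for every new prompt.
        if self._embedding is None:
            raise RuntimeError("call set_image() before predict()")
        return self.mask_decoder(self._embedding, point)


def toy_encoder(image):
    # Hypothetical stand-in for the ViT: one mean value per image row.
    return [sum(row) / len(row) for row in image]


def toy_decoder(embedding, point):
    # Hypothetical stand-in for the decoder: reads the feature of the clicked row.
    _x, y = point
    return embedding[y]


segmenter = CachedSegmenter(toy_encoder, toy_decoder)
segmenter.set_image([[0.0, 1.0], [1.0, 1.0]])
results = [segmenter.predict(p) for p in [(0, 0), (1, 0), (0, 1)]]
```

Three prompts are answered here while the encoder runs only once, which is what makes interactive prompting feasible.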
10.
Model: Segment Anything Model (SAM)
• Prompt encoder (points, box)
‣ Embeds the prompt
‣ The coordinates are turned into a positional encoding and summed with a learnable embedding parameter
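A rough sketch of "positional encoding plus learned embedding" for a point prompt is shown below. The sinusoidal form, the dimension of 8, and the `foreground`/`background` labels are assumptions made for illustration; the real model may use a different encoding, and `learned` would be trained rather than randomly initialized.

```python
import math
import random


def sinusoidal_encoding(x, y, dim=8):
    """Encode a normalized point (x, y) in [0, 1]^2 as sin/cos features."""
    assert dim % 4 == 0, "dim must be a multiple of 4"
    features = []
    for i in range(dim // 4):
        freq = (2.0 ** i) * math.pi
        features += [math.sin(freq * x), math.cos(freq * x),
                     math.sin(freq * y), math.cos(freq * y)]
    return features


def embed_point(x, y, label, learned):
    """Sum the positional encoding with the embedding for the prompt type."""
    type_embedding = learned[label]
    position = sinusoidal_encoding(x, y, dim=len(type_embedding))
    return [p + e for p, e in zip(position, type_embedding)]


# Hypothetical "learned" parameters: random here, trained in a real model.
rng = random.Random(0)
learned = {
    "foreground": [rng.gauss(0.0, 1.0) for _ in range(8)],
    "background": [rng.gauss(0.0, 1.0) for _ in range(8)],
}

fg = embed_point(0.25, 0.75, "foreground", learned)
bg = embed_point(0.25, 0.75, "background", learned)
```

The same location yields different embeddings depending on the prompt type, so the decoder can tell a foreground click from a background click.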
11.
Model: Segment Anything Model (SAM)
• Prompt encoder (text)
‣ Embeds the prompt
‣ Uses the CLIP text encoder
12.
Model: Segment Anything Model (SAM)
• Prompt encoder (mask)
‣ Embeds the prompt
‣ The mask is passed through convolutions and the result is summed with the image embedding
13.
Model: Segment Anything Model (SAM)
• Mask decoder
‣ Outputs mask candidates
‣ Internally a Transformer decoder
‣ Outputs three candidates to cope with the ambiguity of prompts
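One common way to train a head that emits several candidates for an ambiguous prompt is to penalize only the candidate closest to the ground truth. The slide does not spell out the training rule, so the sketch below is an assumption; `best_candidate_loss` and `mean_abs_error` are hypothetical names, and masks are flattened to short lists for brevity.

```python
def mean_abs_error(pred, target):
    """Simple per-pixel loss used only for illustration."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def best_candidate_loss(candidates, target, loss_fn=mean_abs_error):
    """Return (index, loss) of the candidate mask that best matches the target."""
    losses = [loss_fn(c, target) for c in candidates]
    best = min(range(len(losses)), key=losses.__getitem__)
    return best, losses[best]


# Three candidates for one click, e.g. whole object, part, and sub-part.
target = [1, 1, 0, 0]
candidates = [
    [1, 1, 1, 1],   # too large
    [1, 1, 0, 0],   # matches the annotated mask
    [1, 0, 0, 0],   # too small
]
index, loss = best_candidate_loss(candidates, target)
```

Only the selected candidate would contribute to the gradient, so the other two outputs remain free to cover alternative interpretations of the same prompt.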
14.
Model: Segment Anything Model (SAM)
• Training
‣ Trained with a combination of focal loss and dice loss
‣ Prompts are sampled randomly
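The combined objective can be sketched in plain Python as below. The slide only says that focal loss and dice loss are combined; the 20:1 weighting, `gamma=2.0`, `alpha=0.25`, and the smoothing constant are assumptions chosen for illustration, and masks are flattened to lists of per-pixel probabilities.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=0.25):
    """Mean binary focal loss; down-weights pixels that are already easy."""
    eps = 1e-7
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
        p_t = p if y == 1 else 1.0 - p            # probability of the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """One minus the (smoothed) Dice overlap between prediction and target."""
    intersection = sum(p * y for p, y in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, targets, focal_weight=20.0, dice_weight=1.0):
    """Weighted sum of the two losses (weights are assumed, not from the slide)."""
    return (focal_weight * focal_loss(probs, targets)
            + dice_weight * dice_loss(probs, targets))


targets = [1, 0, 1, 0]
good = combined_loss([1.0, 0.0, 1.0, 0.0], targets)   # perfect prediction
bad = combined_loss([0.0, 1.0, 0.0, 1.0], targets)    # fully inverted prediction
```

Focal loss works per pixel and emphasizes hard pixels, while dice loss measures overlap of the whole mask, which is why the two are often paired for segmentation.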
15.
Data: Data Engine
• SAM itself is used for annotation
‣ Model-in-the-loop
• Annotation proceeds in three stages
16.
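Data: Data Engine
1. Annotators correct the masks predicted by SAM
• SAM is first pre-trained on a separate dataset
• Once a certain amount of data has been collected, SAM is retrained on it
• Annotation is limited to what can be labeled within 30 seconds per image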
17.
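Data: Data Engine
2. Annotators label what SAM did not predict
• Finer-grained parts are annotated
• Here too, SAM is retrained on the newly added data
• 10.2 million masks have been collected by the end of this stage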
18.
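Data: Data Engine
3. Annotation by SAM's predictions alone
• After stage 2, SAM is accurate enough that its predictions can be used as annotations almost as they are
• Masks for which the model is highly confident are selected, and duplicates are removed with NMS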
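The selection step of the third stage (keep confident masks, then remove duplicates with NMS) can be sketched as follows. This is a simplified illustration: masks are represented as sets of pixel coordinates, and the thresholds `min_score=0.8` and `iou_threshold=0.7` are assumed values, not ones taken from the slides.

```python
def mask_iou(a, b):
    """Intersection over union of two masks given as sets of (row, col) pixels."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def filter_masks(masks, scores, min_score=0.8, iou_threshold=0.7):
    """Keep confident masks, then greedily drop near-duplicates (NMS)."""
    confident = [i for i, s in enumerate(scores) if s >= min_score]
    order = sorted(confident, key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a mask only if it does not overlap too much with a better one.
        if all(mask_iou(masks[i], masks[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


masks = [
    {(0, 0), (0, 1), (1, 0), (1, 1)},   # confident mask
    {(0, 0), (0, 1), (1, 0)},           # near-duplicate of the first (IoU 0.75)
    {(5, 5), (5, 6)},                   # separate object
    {(9, 9)},                           # low-confidence prediction
]
scores = [0.95, 0.90, 0.85, 0.50]
kept = filter_masks(masks, scores)
```

Processing masks in descending score order means that, among overlapping predictions, the most confident one survives.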
19.
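Data: SA-1B
• The final dataset contains 1.1 billion masks on 11 million images, which is about 100 masks per image on average
• Compared with existing datasets, the number of masks per image is considerably larger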
20.
Data: SA-1B
• There is also little bias in mask position
• Masks in existing datasets are strongly concentrated near the image center
21.
Experiments: mask prediction from point prompts
• On many benchmarks, zero-shot SAM outperforms existing models
• Zero-shot: the model is not fine-tuned on each dataset
22.
Experiments: other zero-shot results
[Figures: edge prediction; text-to-mask]
23.
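Experiments: ablation study
• Analysis of how performance changes with the amount of data and the model size
• With respect to data, performance appears to largely saturate at around 1 million images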
24.
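Summary
• Proposes SAM, a foundation model for segmentation that can be controlled by prompts
• The SA-1B dataset, collected with SAM in a model-in-the-loop fashion, is also released
• A demo is available as well: https://segment-anything.com/demo
• Prompting looks likely to become a general-purpose approach for image tasks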