【DL輪読会】Mastering Diverse Domains through World Models
2023/1/13 Deep Learning JP http://deeplearning.jp/seminar-2/
1.
Mastering Diverse Domains through World Models
Shohei Taniguchi, Matsuo Lab
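2.
Bibliographic information
Mastering Diverse Domains through World Models
• Authors
  • Danijar Hafner, Jurgis Pasukonis, Jimmy Ba, Timothy Lillicrap
• Summary
  • An improved version (ver. 3) of Dreamer, a reinforcement learning method that uses a world model
  • The first method to collect diamonds in Minecraft with reinforcement learning from scratch
https://arxiv.org/abs/2301.04104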
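3.
Minecraft ObtainDiamond
• The task of obtaining a diamond in Minecraft
• Reward is given only when an intermediate item or the diamond is obtained
• A competition has been held at NeurIPS since 2019, and the task is one milestone for RL research
• Until now, no from-scratch RL method had succeeded in obtaining a diamond
• Methods that use human demonstrations have succeeded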
4.
Outline of the talk
• Background
• World models x reinforcement learning
• PlaNet, Dreamer, DreamerV2
• DreamerV3
• Summary
Some of the slides are reused from the following deck:
https://www.slideshare.net/ShoheiTaniguchi2/ss-238325780
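5.
Challenges of reinforcement learning
Sample efficiency
• Training takes a large amount of time
• For robots and similar systems, training frequently on real hardware is hard to justify in terms of cost
6.
World models x reinforcement learning
If a model of the environment can be acquired with deep learning, it should be possible to simulate the environment with that model and learn a policy from the simulation.
➡ World model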
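7.
World models x reinforcement learning
Training flow
1. Collect data $D = \{x_1, a_1, r_1, \dots, x_T, a_T, r_T\}$ from the environment with the policy $\pi$
2. Train the world model $p_\psi(x_{1:T}, r_{1:T} \mid a_{1:T})$ on $D$
3. Update the policy $\pi$ using the world model
• Repeat steps 1 to 3
https://arxiv.org/abs/1903.00374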
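The three-step loop above can be sketched on a toy problem. This is only an illustration under loud assumptions: the 1-D environment, the linear world model `x' = x + k*a`, the known reward form `-|x'|`, and the candidate-gain policy search are all hypothetical stand-ins chosen to keep the example small, not anything from the slides or the papers.

```python
import random

def collect(policy, env_step, x0, T):
    """Step 1: roll out the policy in the real environment, recording (x, a, r, x_next)."""
    data, x = [], x0
    for _ in range(T):
        a = policy(x)
        x_next, r = env_step(x, a)
        data.append((x, a, r, x_next))
        x = x_next
    return data

def fit_world_model(data):
    """Step 2: fit the toy model x' = x + k*a by least squares on k.
    The reward form -|x'| is assumed known here to keep the sketch short."""
    num = sum(a * (xn - x) for x, a, _, xn in data)
    den = sum(a * a for _, a, _, _ in data) or 1.0
    k = num / den
    return lambda x, a: (x + k * a, -abs(x + k * a))

def improve_policy(model, candidates, x0, T):
    """Step 3: choose the feedback gain with the best return imagined under the model."""
    def imagined_return(gain):
        x, total = x0, 0.0
        for _ in range(T):
            x, r = model(x, -gain * x)
            total += r
        return total
    return max(candidates, key=imagined_return)

def train(env_step, x0=1.0, T=20, iterations=3, seed=0):
    """Repeat steps 1 to 3, growing the dataset each iteration."""
    rng = random.Random(seed)
    gain, dataset = 0.0, []
    for _ in range(iterations):
        def policy(x, g=gain):
            return -g * x + rng.uniform(-0.1, 0.1)  # small exploration noise
        dataset += collect(policy, env_step, x0, T)
        model = fit_world_model(dataset)
        gain = improve_policy(model, [0.0, 0.25, 0.5, 0.75, 1.0], x0, T)
    return gain
```

The policy is never evaluated in the real environment during step 3; it only sees rollouts imagined by the fitted model, which is where the sample-efficiency benefit comes from.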
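8.
World Models [Ha and Schmidhuber, 2018]
• The paper that can be considered the starting point of world-model research
• World model training: VAE + MDN-RNN
• Policy training: CMA-ES
• Details are omitted in this talk (see the slides below, among others)
https://www.slideshare.net/masa_s/ss-97848402
https://worldmodels.github.io/
https://arxiv.org/abs/1803.10122
9.
PlaNet [Hafner et al., 2019]
• World model training:
  • Recurrent State Space Model
• Policy training: CEM
• Performance roughly on par with model-free methods
[Figure] Top: rollouts in the real environment. Bottom: simulation by the world model.
[Figure] Experimental results on DM Control Suite
https://arxiv.org/abs/1811.04551
https://planetrl.github.io/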
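10.
Gaussian State Space Model
• A model that uses a normal distribution for the state transition probability
  • $p_\psi(s_{t+1} \mid s_t, a_t) = \mathrm{Normal}\left(\mu_\psi(s_t, a_t), \mathrm{diag}\left(\sigma_\psi^2(s_t, a_t)\right)\right)$
• The functions $\mu_\psi, \sigma_\psi^2$ are implemented with DNNs or similar
• On its own, this does not work well in experiments (vanishing gradients, etc.)
[Figure: graphical model over $o_t, a_t, r_t, s_t$ and $o_{t+1}, a_{t+1}, r_{t+1}, s_{t+1}$]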
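As a rough illustration of the Gaussian transition above, the sketch below samples $s_{t+1}$ and evaluates its log density under a diagonal normal. The function `mu_sigma` is a hypothetical hand-written stand-in for the DNN that outputs $\mu_\psi$ and $\sigma_\psi$; its coefficients are arbitrary.

```python
import math
import random

def mu_sigma(s, a):
    """Hypothetical stand-in for the DNN: mean and std per state dimension."""
    mean = [0.9 * si + 0.1 * a for si in s]
    std = [0.05 + 0.01 * abs(a) for _ in s]
    return mean, std

def sample_next_state(s, a, rng):
    """Draw s_{t+1} ~ Normal(mu(s_t, a_t), diag(sigma^2(s_t, a_t)))."""
    mean, std = mu_sigma(s, a)
    return [rng.gauss(m, sd) for m, sd in zip(mean, std)]

def log_prob(s_next, s, a):
    """Log density of s_{t+1} under the diagonal Gaussian transition."""
    mean, std = mu_sigma(s, a)
    return sum(
        -0.5 * ((x - m) / sd) ** 2 - math.log(sd) - 0.5 * math.log(2 * math.pi)
        for x, m, sd in zip(s_next, mean, std)
    )
```

Because every step is sampled, information about the past can only flow through the noisy state, which is one way to read the slide's remark that this model alone trains poorly.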
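11.
Recurrent State Space Model (RSSM)
• The state $s$ is modeled as two parts: $h$, which transitions deterministically, and $z$, which transitions stochastically (the equations below write the stochastic part as $s_t$)
  • $h_{t+1} = f_\psi(h_t, s_t, a_t)$
  • $p_\psi(s_t \mid h_t) = \mathrm{Normal}\left(\mu_\psi(h_t), \mathrm{diag}\left(\sigma_\psi^2(h_t)\right)\right)$
• $f_\psi$ is an RNN-type function such as an LSTM
[Figure: graphical model over $x_t, a_t, r_t, s_t, h_t$ and $x_{t+1}, a_{t+1}, r_{t+1}, s_{t+1}, h_{t+1}$]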
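The two RSSM equations can be sketched as a rollout that alternates a deterministic update with a stochastic draw. This is a minimal sketch, not the PlaNet implementation: the element-wise tanh cell `f` stands in for the LSTM-style recurrence, and the coefficients in `f` and `prior` are arbitrary assumptions.

```python
import math
import random

def f(h, s, a):
    """Deterministic path h_{t+1} = f(h_t, s_t, a_t); a tanh cell stands in for an LSTM."""
    return [math.tanh(0.5 * hi + 0.3 * si + 0.2 * a) for hi, si in zip(h, s)]

def prior(h):
    """Parameters of p(s_t | h_t) = Normal(mu(h_t), diag(sigma^2(h_t)))."""
    mean = [0.8 * hi for hi in h]
    std = [0.1 + 0.05 * abs(hi) for hi in h]
    return mean, std

def rssm_rollout(h0, s0, actions, rng):
    """Imagine a trajectory of (h, s) pairs for a given action sequence."""
    h, s, traj = h0, s0, []
    for a in actions:
        h = f(h, s, a)                                      # deterministic transition
        mean, std = prior(h)
        s = [rng.gauss(m, sd) for m, sd in zip(mean, std)]  # stochastic transition
        traj.append((h, s))
    return traj
```

The deterministic path gives the model a noise-free channel for carrying information across time steps, while the sampled part keeps the ability to represent uncertainty.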
12.
Recurrent State Space Model (RSSM)
Using the RSSM improves performance considerably.
13.
Dreamer [Hafner et al., 2019]
• Built on PlaNet, with policy learning changed to an actor-critic scheme
• Uses the $\lambda$-return for the value function
• Large performance improvement over PlaNet
https://arxiv.org/abs/1912.01603
https://ai.googleblog.com/2020/03/introducing-dreamer-scalable.html
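14.
Estimating the value function
Bellman equation:
$$V^\pi(s_t) = \mathbb{E}_\pi\left[r(s_t, a_t)\right] + V^\pi(s_{t+1})$$
Extended to $n$ steps:
$$V_n^\pi(s_t) = \mathbb{E}_\pi\left[\sum_{k=0}^{n-1} r(s_{t+k}, a_{t+k})\right] + V^\pi(s_{t+n})$$
15.
Estimating the value function
$$V_n^\pi(s_t) = \mathbb{E}_\pi\left[\sum_{k=0}^{n-1} r(s_{t+k}, a_{t+k})\right] + V^\pi(s_{t+n})$$
Taking an exponentially weighted average over $n = 1, \dots, \infty$ gives
$$\bar{V}^\pi(s_t, \lambda) = (1 - \lambda) \sum_{n=1}^{\infty} \lambda^{n-1} V_n^\pi(s_t)$$
This is called the $\lambda$-return.
16.
Estimating the value function
Dreamer uses the $\lambda$-return as the target for the value function:
$$\theta \leftarrow \theta - \eta_\theta \nabla_\theta \mathbb{E}_{p_\psi, \pi_\phi}\left[\left\| V_\theta^{\pi_\phi}(s_t) - \bar{V}^\pi(s_t, \lambda) \right\|^2\right]$$
The exponentially weighted average is truncated at a suitable horizon, denoted $H$:
$$\bar{V}^\pi(s_t, \lambda) \approx (1 - \lambda) \sum_{n=1}^{H-1} \lambda^{n-1} V_n^\pi(s_t) + \lambda^{H-1} V_H^\pi(s_t)$$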
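The truncated $\lambda$-return can be computed directly from its definition. The sketch below makes two assumptions: the $n$-step value sums the rewards $r_t, \dots, r_{t+n-1}$ (so that $n = 1$ reduces to the Bellman equation), and, as on the slides, no discount factor is applied. The list names `rewards` and `values` are illustrative.

```python
def n_step_value(rewards, values, t, n):
    """V_n(s_t): rewards r_t .. r_{t+n-1} plus the bootstrap value V(s_{t+n})."""
    return sum(rewards[t:t + n]) + values[t + n]

def lambda_return(rewards, values, t, H, lam):
    """Truncated lambda-return: exponentially weighted n-step values for n < H,
    with the remaining weight lam**(H-1) placed on the H-step value."""
    total = 0.0
    for n in range(1, H):
        total += (1 - lam) * lam ** (n - 1) * n_step_value(rewards, values, t, n)
    return total + lam ** (H - 1) * n_step_value(rewards, values, t, H)
```

The weights sum to one for any $\lambda$ in $[0, 1]$: $\lambda = 0$ recovers the one-step target and $\lambda = 1$ recovers the $H$-step target.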
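17.
Effect of the $\lambda$-return
• "No value" is the result of training with the policy gradient method
• Using the $\lambda$-return improves performance regardless of $H$
18.
DreamerV2 [Hafner et al., 2020]
An improved version of Dreamer
1. Uses a discrete categorical distribution for the latent variables
2. Adjusts the learning rate of the KL term so that the encoder is not over-regularized
• Achieves human-level performance on Atari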
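19.
Discrete latent variables
• PlaNet and DreamerV1 use continuous latent variables modeled with a normal distribution
• DreamerV2 switches to a discrete categorical distribution
20.
Discrete latent variables
• With discrete variables, the reparameterization trick can no longer be used to estimate gradients
• The straight-through estimator is used instead
• The estimator is biased, but it is simple to implement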
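The idea behind the straight-through estimator can be shown without an autodiff library: the forward pass uses a discrete one-hot sample, while the backward pass computes the gradient as if the sample had been the softmax probabilities. The sketch below is a hand-derived illustration for a single categorical variable; the function names are hypothetical and `upstream` denotes the gradient of the loss with respect to the sample.

```python
import math
import random

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def sample_one_hot(probs, rng):
    """Forward pass: draw a discrete one-hot sample from the categorical."""
    k = rng.choices(range(len(probs)), weights=probs)[0]
    return [1.0 if i == k else 0.0 for i in range(len(probs))]

def straight_through_grad(logits, upstream):
    """Backward pass: gradient w.r.t. the logits, treating the sample as if it
    were the softmax output (upstream gradient times the softmax Jacobian)."""
    p = softmax(logits)
    dot = sum(pi * gi for pi, gi in zip(p, upstream))
    return [pi * (gi - dot) for pi, gi in zip(p, upstream)]
```

The bias mentioned on the slide comes from this mismatch: the loss was evaluated at the one-hot sample, but the gradient is taken at the probabilities.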
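21.
KL Balancing
• In the world model loss, the KL term acts as a regularizer that pulls the encoder and the prior of the transition model toward each other
• However, especially early in training when the transition model has not yet been learned well, this KL regularization becomes too strong and hinders learning
22.
KL Balancing
• This is mitigated by adjusting the learning rates of the KL term separately for the encoder and the transition model
• $\alpha$ is set to 0.8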
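One way to picture KL balancing is as a rescaling of the two gradients of the same KL value. The sketch below assumes categorical distributions and assumes, following my understanding of DreamerV2 rather than anything stated on the slide, that the weight $\alpha$ applies to the gradient reaching the prior (transition model) and $1 - \alpha$ to the gradient reaching the posterior (encoder). The gradients are derived by hand in place of stop-gradient operations.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def kl(q, p):
    """KL(q || p) for two categorical distributions with full support."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p))

def kl_balanced_grads(post_logits, prior_logits, alpha=0.8):
    """Return the KL value and its gradients w.r.t. both sets of logits,
    scaled by 1 - alpha (posterior) and alpha (prior)."""
    q, p = softmax(post_logits), softmax(prior_logits)
    value = kl(q, p)
    # Prior side: d KL / d prior_logits = p - q, trained with the larger weight.
    grad_prior = [alpha * (pi - qi) for qi, pi in zip(q, p)]
    # Posterior side: d KL / d post_logits = q * (log(q/p) - KL), weaker weight.
    grad_post = [(1 - alpha) * qi * (math.log(qi / pi) - value) for qi, pi in zip(q, p)]
    return value, grad_post, grad_prior
```

With $\alpha = 0.8$ the prior moves toward the posterior four times faster than the posterior is pulled toward a possibly poorly trained prior.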
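23.
Experiments
• Surpasses human performance on Atari
• Stronger than model-free methods such as DQN and Rainbow
24.
Experiments: Ablation
• Categorical variables and KL balancing both have a considerable effect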
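25.
DreamerV3
26.
DreamerV3
• Adds several techniques to turn DreamerV2 into a more general-purpose method
• The goal is to train with the same hyperparameters regardless of the domain
1. Transform observation and reward values with the symlog function
2. Normalize the $\lambda$-return values in the actor objective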
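27.
Symlog Prediction
• When the domain changes, the scale of observation and reward values changes, so hyperparameters have to be retuned each time
• To avoid this, the symlog function is applied so that value ranges are roughly aligned
• Because the function is invertible, applying the inverse recovers the original values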
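For concreteness, here is a small sketch of the transform and its inverse. The slide does not give the formula; the definitions below, $\mathrm{symlog}(x) = \mathrm{sign}(x)\ln(|x| + 1)$ and $\mathrm{symexp}(x) = \mathrm{sign}(x)(e^{|x|} - 1)$, are what I understand the DreamerV3 paper to use and should be treated as an assumption here.

```python
import math

def symlog(x):
    """Compress large magnitudes symmetrically: sign(x) * ln(|x| + 1)."""
    return math.copysign(math.log1p(abs(x)), x)

def symexp(x):
    """Inverse of symlog: sign(x) * (exp(|x|) - 1)."""
    return math.copysign(math.expm1(abs(x)), x)
```

Near zero the transform is close to the identity, so small rewards are left almost unchanged, while values in the thousands are squeezed into single digits.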
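28.
Normalizing the $\lambda$-return
• When the actor is trained with entropy regularization, tuning its coefficient is difficult because it depends on the scale and sparsity of the rewards
• If the range of the rewards can be normalized well, the coefficient of the entropy term should be fixable regardless of the domain
29.
Normalizing the $\lambda$-return
• The $\lambda$-return is normalized by the width of its 5th to 95th percentile range
• Normalizing simply by the variance overestimates the returns when rewards are sparse, so this form is used to avoid outliers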
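30.
Experiments
• High performance across all domains and tasks with the same hyperparameters
31.
Experiments
• Performance is also confirmed to scale with model size
32.
Experiments
Future prediction by the world model
33.
Experiments
• For the first time, an RL agent succeeds in obtaining a diamond in Minecraft
34.
Summary
• Explained the development of Dreamer, a representative world-model method
• As for V3, it admittedly feels like a collection of heuristics
• The results are impressive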