4. A1. Being able to detect that a distribution shift has occurred would already help
‣ Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?
(ICML 2020)
Sergey Levine / Yarin Gal's teams
https://sites.google.com/view/av-detect-recover-adapt
A2. Use self-/semi-supervised learning
‣ Emerging Properties in Self-Supervised Vision Transformers (arXiv 2021/4/29)
‣ Semi-Supervised Learning of Visual Features by Non-Parametrically Predicting View
Assignments with Support Samples (arXiv 2021/4/28)
FAIR team
Approach & bibliographic information
15. Deep imitative model [Rhinehart et al., 2020]
‣ Selects a single θk from the posterior (= point estimate)
‣ Epistemic uncertainty is unavailable & it tends to fail in unfamiliar scenes
This work proposes two aggregation operators ⊕ over the posterior:
‣ Worst Case Model: robust control that treats uncertainty pessimistically [Wald, 1939]
‣ Model Average: Bayesian decision theory that marginalizes out epistemic uncertainty
16. Proposed aggregation operators
Worst Case Model (RIP-WCM)
‣ Assume the worst case and optimize against it [Wald, 1939]:
  y* = arg max_y min_{k=1..K} log q(y|x; θk) + log p(G|y)
‣ The min over the full posterior is generally not tractable, but with an ensemble it is easy:
just take the minimum over the K models
Model Averaging (RIP-MA)
‣ Uses the posterior predictive distribution
‣ Intractable in general, but the ensemble makes it tractable
(in the end, is this just averaging over the models?)
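As a minimal NumPy sketch (not the authors' code), the two operators differ only in how the per-model plan scores are combined; `logq` here is a hypothetical array of log q(y|x; θk) values for a set of candidate plans under each of the K ensemble members:

```python
import numpy as np

def rip_wcm(logq):
    """Worst Case Model: pessimistic aggregation.
    logq: (num_plans, K) log-likelihood of each candidate plan
    under each of the K ensemble members.
    Picks the plan whose worst-case (minimum) score is largest."""
    return np.argmax(logq.min(axis=1))

def rip_ma(logq):
    """Model Averaging: marginalize epistemic uncertainty.
    With a uniform ensemble approximating the posterior, this
    reduces to averaging the per-model scores."""
    return np.argmax(logq.mean(axis=1))

# Toy example: 3 candidate plans scored by K = 4 models.
# Plan 1 is best on average but one model strongly dislikes it.
logq = np.array([[-1.0, -1.2, -0.9, -1.1],
                 [-0.1, -2.0, -0.1, -0.1],
                 [-0.8, -0.9, -1.0, -0.8]])
print(rip_wcm(logq))  # → 2 (best worst-case score, -1.0)
print(rip_ma(logq))   # → 1 (best average score, -0.575)
```

The toy case shows the intended behavior: WCM refuses the plan that any single posterior sample considers catastrophic, while MA trades that risk against the average.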
23. Adaptive Robust Imitative Planning
RIP alone cannot cope with every out-of-distribution scene
‣ Why not return control to the human driver?
‣ → Then we obtain ground-truth data for that very scene!
Online learning can then reduce the uncertainty
‣ When uncertainty exceeds a threshold, hand control over to the driver
‣ Threshold: set to match the tolerated false-negative rate
Algorithm 1: Adaptive Robust Imitative Planning
Input:
  D                     Demonstrations
  K                     Number of models
  B                     Data buffer
  τ                     Variance threshold
  I(a_t | s_t, s_{t+1}) Local planner
  q(y | x; θ)           Imitative model
  p(G | y)              Goal likelihood
  p(θ)                  Model prior
// Approximate model posterior inference, e.g., deep ensemble
1  for model index k = 1 ... K do
2    Bootstrap-sample dataset D_boot^k ∼ D
3    Sample model parameters from prior, θk ∼ p(θ)
4    Train ensemble's k-th component via maximum likelihood estimation (MLE):
       θk ← arg max_θ E_{(x,y)∼D_boot^k}[log q(y | x; θ)]
// Online planning
5  x, G ← env.reset()
6  while not done do
7    Get robust imitative plan:
       y* ← arg max_y ⊕_θ log q(y | x; θ) + log p(G | y)
// Online adaptation
8    Estimate the predictive variance of plan y*'s quality under the model posterior:
       u(y*) = Var_{p(θ|D)}[log q(y* | x; θ)]
9    if u(y*) > τ then
10     y* ← Query expert at x
11     B ← B ∪ {(x, y*)}
12     Update model posterior on B  // with any few-shot adaptation method
13   a_t ← I(· | y*)
14   x, G, done ← env.step(a_t)
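The online-adaptation step (lines 8-12) can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: `ensemble_logq`, `query_expert`, and `finetune` are hypothetical stand-ins for the imitative-model ensemble, the human driver, and whatever few-shot update method is used.

```python
import numpy as np

def adapt_step(x, y_star, ensemble_logq, query_expert, finetune,
               buffer, tau):
    """One ARIP online-adaptation step.
    ensemble_logq(y, x) -> array of log q(y | x; theta_k), k = 1..K.
    If the ensemble disagrees too much about the chosen plan
    (variance above tau), hand control to the expert, store the
    correction, and update the model posterior on the buffer."""
    u = np.var(ensemble_logq(y_star, x))  # epistemic uncertainty of the plan
    if u > tau:
        y_star = query_expert(x)          # driver takes over in this scene
        buffer.append((x, y_star))        # ground-truth data for the OOD scene
        finetune(buffer)                  # any few-shot adaptation method
    return y_star, u
```

Raising `tau` hands over less often but misses more out-of-distribution scenes, which is exactly the false-negative trade-off the slide mentions.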