Paper introduction: Adversarial Cross-Domain Action Recognition with Co-Attention


Boxiao Pan, Zhangjie Cao, Ehsan Adeli, Juan Carlos Niebles, Adversarial Cross-Domain Action Recognition with Co-Attention, AAAI2020. https://doi.org/10.1609/aaai.v34i07.6854 https://ojs.aaai.org/index.php/AAAI/article/view/6854

Adversarial Cross-Domain Action Recognition with Co-Attention
Boxiao Pan, Zhangjie Cao, Ehsan Adeli, Juan Carlos Niebles
AAAI 2020
Presenter: Kazuki Omi (大見一樹), Tamaki Lab, Nagoya Institute of Technology
Paper reading session, 2021/12/16

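Paper overview
◼ Domain adaptation for action recognition
• Domains are more complex for video than for still images, which makes adaptation harder
◼ Why domain adaptation is hard for video
• Some frames are unrelated to the label
• The same action appears at different frame positions in different videos
◼ Proposed method
• Attend to the important frames
• Perform domain adaptation between frames that show the same action
What is a domain?
• The tendencies specific to a given set of data
What is domain adaptation?
• Solving a task in the target domain (T) using information from the source domain (S)
• Appropriate features for T, which has no (or few) labels, are obtained by bringing them close to the features of the labeled S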
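To make "bringing the T features close to the S features" concrete, here is a minimal sketch. The paper learns this alignment adversarially; the code below swaps in a much simpler stand-in, the squared distance between the domain feature means, purely for illustration. The feature arrays, their sizes, and the function name `mean_discrepancy` are all assumptions.

```python
import numpy as np

def mean_discrepancy(source_feats, target_feats):
    """Squared Euclidean distance between the mean feature vectors of two domains."""
    gap = source_feats.mean(axis=0) - target_feats.mean(axis=0)
    return float(gap @ gap)

rng = np.random.default_rng(0)
# Hypothetical features: 100 samples x 8 dims per domain.
# The target domain is shifted by a constant offset to mimic a domain gap.
source = rng.normal(0.0, 1.0, size=(100, 8))
target = rng.normal(0.0, 1.0, size=(100, 8)) + 2.0

before = mean_discrepancy(source, target)

# Toy "adaptation": subtract the difference between the domain means from the target.
aligned_target = target - (target.mean(axis=0) - source.mean(axis=0))
after = mean_discrepancy(source, aligned_target)

print(before, after)
```

A learned feature extractor plays the role of the mean shift here: it is trained so that a discrepancy of this kind (or a domain discriminator's ability to tell S from T) becomes small.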

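Mismatch between frame position and action
◼ Existing domain adaptation
• Adapts between frames at the same temporal position
• As a result, frames showing different actions get aligned with each other
◼ Proposed method
• Adapts between frames whose actions match
• Avoids irrelevant frames
[Figure: source and target frame sequences with per-frame labels (background, draw, anchor, shoot, put away), shown for the existing and the proposed alignment]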
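The difference between pairing frames by position and pairing them by action can be sketched with the per-frame labels from the figure. This is only an illustration: which sequence belongs to which domain is an assumption, and the actual method has no frame-level labels and has to find matching frames through attention.

```python
# Frame-level action labels taken from the figure.
# The assignment of sequences to source and target is assumed for illustration.
source = ["background", "draw", "draw", "anchor", "shoot", "background"]
target = ["draw", "anchor", "shoot", "shoot", "put away", "put away"]

# Existing approach: pair frames at the same temporal position.
position_pairs = list(zip(source, target))
position_matches = sum(s == t for s, t in position_pairs)

# Idea behind the proposed approach: pair frames that show the same action and
# leave frames without a counterpart (background, put away) unpaired.
action_pairs = [(i, j)
                for i, s in enumerate(source)
                for j, t in enumerate(target)
                if s == t]

print(position_matches)  # 0
print(action_pairs)      # [(1, 0), (2, 0), (3, 1), (4, 2), (4, 3)]
```

With these sequences, position-wise pairing never compares two frames of the same action, while action-wise pairing keeps only the draw, anchor, and shoot frames.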

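Module used in the proposed method
◼ Co-attention module
• Attends to frames that are important in both S and T
• Avoids irrelevant frames
• Performs domain adaptation on the important frames
• Performs domain adaptation between frames whose actions match
Computing the co-attention
(1) Self-attention within S and within T
(2) Cross-attention between S and T
(3) Compute the co-attention matrix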
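The three steps can be sketched as follows. The slides do not give the formulas, so the scoring functions here (mean dot-product similarity for self-attention, scaled dot products for cross-attention, and an elementwise product for combining them) are assumptions chosen for simplicity, not the paper's definitions. The function name `co_attention` and the feature shapes are likewise hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax; axis=None normalizes over all entries."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(src, tgt):
    """src: (Ns, d) source frame features, tgt: (Nt, d) target frame features."""
    d = src.shape[1]
    # (1) Self-attention: importance of each frame within its own video,
    #     scored here by its mean similarity to the frames of that video.
    a_src = softmax((src @ src.T).mean(axis=1) / np.sqrt(d))  # (Ns,)
    a_tgt = softmax((tgt @ tgt.T).mean(axis=1) / np.sqrt(d))  # (Nt,)
    # (2) Cross-attention: similarity between every source and target frame.
    cross = softmax((src @ tgt.T) / np.sqrt(d), axis=None)    # (Ns, Nt)
    # (3) Co-attention matrix: large only where both frames are important
    #     in their own video and similar to each other.
    co = np.outer(a_src, a_tgt) * cross
    return co / co.sum()

rng = np.random.default_rng(0)
src = rng.normal(size=(6, 16))  # 6 source frames, 16-dim features (made up)
tgt = rng.normal(size=(5, 16))  # 5 target frames
co = co_attention(src, tgt)

# Most strongly co-attended (source frame, target frame) pair.
i, j = np.unravel_index(co.argmax(), co.shape)
print(co.shape, i, j)
```

Because the entry for a frame pair is a product of three terms, a frame that is important in one video but has no similar counterpart in the other still receives a low value.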

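Proposed method
Source (S), Target (T)
• $f_s$: features of S
• $\hat{f}_t$: features of S aligned to T
• $f_t$: features of T
• The co-attention module outputs the features $\hat{f}_t$
• Bring $\hat{f}_t$ close to $f_s$
• Bring $f_t$ close to $\hat{f}_t$
• By attending to the features that are important in both S and T, $f_t$ is indirectly brought close to $f_s$
Inference
• For use at inference time, a target-side attention network is trained so that its attention matches the attention output by the co-attention module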
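The last point, training a target-only attention network to imitate the co-attention, can be sketched as a matching loss. Everything specific here is an assumption: the teacher attention is taken as the co-attention matrix summed over source frames, the target attention network is a single linear scorer `w`, and the mismatch is measured with a KL divergence. The slides do not state which loss the paper uses.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def attention_matching_loss(co, tgt, w):
    """KL(teacher || student) between two attention distributions over target frames.

    co  : (Ns, Nt) co-attention matrix, non-negative and summing to 1 (training only)
    tgt : (Nt, d) target frame features
    w   : (d,) weights of a hypothetical linear target attention network
    """
    teacher = co.sum(axis=0)    # attention over target frames implied by the co-attention
    student = softmax(tgt @ w)  # attention predicted from the target video alone
    eps = 1e-12                 # avoids log(0)
    return float(np.sum(teacher * (np.log(teacher + eps) - np.log(student + eps))))

rng = np.random.default_rng(0)
co = rng.random((5, 4))
co /= co.sum()
tgt = rng.normal(size=(4, 8))
w = np.zeros(8)  # untrained scorer: uniform attention over the 4 target frames

loss = attention_matching_loss(co, tgt, w)

# At inference no source video is available, so only the student attention is used
# to pool the target frames into a single video-level feature.
pooled = softmax(tgt @ w) @ tgt
print(loss, pooled.shape)
```

Minimizing such a loss with respect to `w` lets the target branch reproduce the co-attention weights without needing a paired source video at test time.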

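Visualization of the co-attention matrix
◼ Dataset
• UCF50 → Olympic Sports
◼ Results
• Important frames receive high values
• Even important frames receive low values if they are not shared between S and T
[Figure: co-attention matrix between source (S) and target (T) frames]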


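Comparison with existing methods
◼ Results
• Best performance on all datasets
• Particularly large improvement over existing methods in the Jester experiment
• Jester is a dataset of people performing gestures
• The speed of the action (gesture) varies from sample to sample
• The proposed method is effective even on datasets with temporal misalignment
Settings compared: (HMDB51 → UCF101) (UCF50 → Olympic Sports) (Olympic Sports → UCF50) (Jester(S) → Jester(T))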

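Summary
◼ Proposed a domain adaptation method for action recognition
• Co-attention module
• Attends to frames that are important in both S and T
• Performs domain adaptation between frames whose actions match
◼ Performance
• Outperforms existing methods
• Performs especially well on datasets with temporal misalignment
• Shows the effectiveness of adapting between frames whose actions match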

