Soumettre la recherche
Mettre en ligne
[DL Hacks]CelebAをNumPyで保存してみた
•
2 j'aime
•
1,376 vues
Deep Learning JP
Suivre
2018/09/10 Deep Learning JP: http://deeplearning.jp/hacks/
Lire moins
Lire la suite
Technologie
Signaler
Partager
Signaler
Partager
1 sur 13
Télécharger maintenant
Télécharger pour lire hors ligne
Recommandé
RでKaggleの登竜門に挑戦
RでKaggleの登竜門に挑戦
幹雄 小川
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
Deep Learning JP
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
Deep Learning JP
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
Deep Learning JP
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
Deep Learning JP
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
Deep Learning JP
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM
Deep Learning JP
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
Deep Learning JP
Recommandé
RでKaggleの登竜門に挑戦
RでKaggleの登竜門に挑戦
幹雄 小川
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
【DL輪読会】AdaptDiffuser: Diffusion Models as Adaptive Self-evolving Planners
Deep Learning JP
【DL輪読会】事前学習用データセットについて
【DL輪読会】事前学習用データセットについて
Deep Learning JP
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
【DL輪読会】 "Learning to render novel views from wide-baseline stereo pairs." CVP...
Deep Learning JP
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
【DL輪読会】Zero-Shot Dual-Lens Super-Resolution
Deep Learning JP
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
【DL輪読会】BloombergGPT: A Large Language Model for Finance arxiv
Deep Learning JP
【DL輪読会】マルチモーダル LLM
【DL輪読会】マルチモーダル LLM
Deep Learning JP
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
【 DL輪読会】ToolLLM: Facilitating Large Language Models to Master 16000+ Real-wo...
Deep Learning JP
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
Deep Learning JP
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
Deep Learning JP
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
Deep Learning JP
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
Deep Learning JP
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
Deep Learning JP
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
Deep Learning JP
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
Deep Learning JP
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
Deep Learning JP
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
Deep Learning JP
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
Deep Learning JP
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
Deep Learning JP
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
Deep Learning JP
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
Deep Learning JP
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
Deep Learning JP
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
Deep Learning JP
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
Deep Learning JP
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
Deep Learning JP
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
Deep Learning JP
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】大量API・ツールの扱いに特化したLLM
Deep Learning JP
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
Deep Learning JP
Contenu connexe
Plus de Deep Learning JP
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
Deep Learning JP
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
Deep Learning JP
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
Deep Learning JP
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
Deep Learning JP
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
Deep Learning JP
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
Deep Learning JP
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
Deep Learning JP
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
Deep Learning JP
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
Deep Learning JP
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
Deep Learning JP
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
Deep Learning JP
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
Deep Learning JP
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
Deep Learning JP
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
Deep Learning JP
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
Deep Learning JP
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
Deep Learning JP
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
Deep Learning JP
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
Deep Learning JP
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】大量API・ツールの扱いに特化したLLM
Deep Learning JP
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
Deep Learning JP
Plus de Deep Learning JP
(20)
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】AnyLoc: Towards Universal Visual Place Recognition
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Can Neural Network Memorization Be Localized?
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】Hopfield network 関連研究について
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】SimPer: Simple self-supervised learning of periodic targets( ICLR 2023 )
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】RLCD: Reinforcement Learning from Contrast Distillation for Language M...
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Secrets of RLHF in Large Language Models Part I: PPO"
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】"Language Instructed Reinforcement Learning for Human-AI Coordination "
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】Llama 2: Open Foundation and Fine-Tuned Chat Models
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】"Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware"
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Parameter is Not All You Need:Starting from Non-Parametric Networks fo...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Drag Your GAN: Interactive Point-based Manipulation on the Generative ...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Self-Supervised Learning from Images with a Joint-Embedding Predictive...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】Towards Understanding Ensemble, Knowledge Distillation and Self-Distil...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】VIP: Towards Universal Visual Reward and Representation via Value-Impl...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】Deep Transformers without Shortcuts: Modifying Self-attention for Fait...
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】マルチモーダル 基盤モデル
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】TrOCR: Transformer-based Optical Character Recognition with Pre-traine...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】HyperDiffusion: Generating Implicit Neural Fields withWeight-Space Dif...
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】大量API・ツールの扱いに特化したLLM
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
【DL輪読会】DINOv2: Learning Robust Visual Features without Supervision
[DL Hacks]CelebAをNumPyで保存してみた
1.
CelebAをNumPyで保存してみた CelebAデータセットの202599枚のjpeg画像と40の属性を NumPy配列に変換し1枚のファイルに保存してみました 2018/9/10 DLHacks LT 植木孝一郎
2.
CelebFaces Attributes (CelebA)
Dataset 香港中文大学(マルチメディア研究室)の大規模な顔属性のデータセット http://mmlab.ie.cuhk.edu.hk/projects/CelebA.html 論文 https://arxiv.org/abs/1411.7766 178x218ピクセル、202599枚のjpeg画像(RGB) ファイルサイズ合計 1.743GB
3.
顔の属性 list_attr_celeba.csv 000001.jpg -1 1
1 -1 -1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 以下略(1と-1の2値、40属性) ・・・ 202599.jpg -1 1 1 -1 -1 -1 -1 -1 -1 1 -1 -1 -1 -1 -1 以下略 属性は全部で40 (正値は1、負値は-1) 1 : 5_o_Clock_Shadow (無精髭) 2 : Arched_Eyebrows (アーチ型の眉) 3 : Attractive (魅力的) ... 40: Young (若い) 論文には顔属性の識別手法と精度も書かれてます
4.
画像ファイルの読み込み DeepLearningで画像を学習するためには、Pillow(PIL)やOpenCVなどで画像 を読み込みNumPy配列に変換する必要があります。 ファイル名は000001から始まり規則的なのでリスト作成・画像読み込みなどは 簡単にできます。 また顔属性ラベルの list_attr_celeba.csvは numpy.loadtxt()で読み込めます。 問題は処理時間で・・・ Core
i7 メモリ32GB(スワップメモリ64GB)のPCでHDD上の202599枚の画像 を読み込み、NumPy配列に変換すると1929.6秒(約32分)かかりました。 時間をかけデータセットを読み込んだのだから、そのまま保存して活用したい!
5.
保存するNumPy配列 保存するNumPy配列は178x218ピクセルのRGB画像 202599枚をtrain_x とし print(train_x.shape)
(202599, 218, 178, 3) データ型は numpy.uint8 (8ビット符号なし整数)で保存します CSVの属性値(-1,1)を(0,1)に変えた2値の40属性、202599行のNumPy配列をtrain_yとし print(train_y.shape) (202599, 40) train_xとtrain_yを1枚のファイルに保存します。 train_xとtrain_yのメモリ使用量は sys.getsizeof(train_x) 23584954932 (23.584GB) sys.getsizeof(train_y) 8104072 (8.1MB) CelebA 202599枚jpegファイルサイズの合計は 1.743GBでしたので、NumPy配列にすることで13.5倍の容量になります。 ※この容量だとPCのメモリ、スワップメモリに注意してないとエラーやフリーズします! (Linuxなら free -m で確認)
6.
pickle VS numpy.savez PythonでNumPy配列をbinary保存するため ●
Pickle ● numpy.savez(非圧縮、複数) ● numpy.savez_compressed(圧縮、複数) を使って比較しました。 HDD上にファイルを保存する時間、保存したファイルを読み込む時間、保存した ファイルサイズを計測しました。
7.
pickleでsaveとload #保存(python3.4以上) #LT説明用の簡略コード import pickle train_xy =
train_x,train_y with open("save_filename", 'wb') as f: pickle.dump(train_xy , f ,protocol=4) # ← protocolが小さいとメモリエラーになる #読み込み import pickle with open("save_filename", 'rb') as f: train_xy = pickle.load(f) train_x = train_xy[0] train_y = train_xy[1] print(train_x.shape) (202599, 218, 178, 3) print(train_y.shape) (202599, 40)
8.
numpy.savez()で非圧縮保存とload #保存 import numpy as
np np.savez("save_filename", train_x=train_x, train_y=train_y) #読み込み import numpy as np with np.load("save_filename.npz") as data: train_x = data[train_x] train_y = data[train_y] print(train_x.shape) (202599, 218, 178, 3) print(train_y.shape) (202599, 40)
9.
numpy.savez_compressed()で圧縮保存とload #保存 import numpy as
np np.savez_compressed("save_filename", train_x=train_x, train_y=train_y) #読み込み import numpy as np with np.load("save_filename.npz") as data: train_x = data[train_x] train_y = data[train_y] print(train_x.shape) (202599, 218, 178, 3) print(train_y.shape) (202599, 40)
10.
ファイル保存の比較 保存(sec) 読み込み(sec) 作成ファイルサイズ pickle
247.522 153.720 23593058963 (23.593GB) np.savez 102.856 107.335 23593059354 (23.593GB) np.savez_compressed 553.789 114.486 15798570036 (15.798GB) 元の2つのNumPy配列の使用メモリサイズは合計で 2359305900 (23.593GB) 時間計測は5回平均 時間がかかる作成保存は最初の1回だけなので、 私はファイルサイズが小さい np.savez_compressedを使ってます。
11.
NumPyで検索・抽出 属性ラベルも一緒に保存したので、CelebAから画像の検索・抽出を行ってみました。 検索するため train_yを転置します。 print(train_y.shape) (202599,
40) yt = train_y.transpose() print(yt.shape) (40, 202599) 検索は np.where() を使います ※各値は正負(1,0)の2値 Eyeglasses(メガネ yt[15])、Wearing_Earrings(イヤリング着用 yt[34])、 Wearing_Necklace(ネックレス着用 yt[37])、Young(若い yt[39])の Male(男性でなく女性は yt[20]==0) で条件(and)検索します。
12.
NumPyの検索と抽出 index = np.where((yt[15]
== 1) & (yt[34] == 1) & (yt[37] == 1) & (yt[39] == 1) & (yt[20] == 0)) print(len(index[0])) 64 条件に合致するインデックス64件を抽出しました。 print(index[0]) [672 7016 7413 8059 ...] 抽出したインデックスから該当画像データは train_x[672], train_x[7016]...などです。 train_x[672] train_x[7016]
13.
[参考]CelebA 顔属性値(正負の2値) 5_o_Clock_Shadow Arched_Eyebrows Attractive Bags_Under_Eyes Bald Bangs Big_Lips Big_Nose Black_Hair Blond_Hair Blurry Brown_Hair Bushy_Eyebrows Chubby Double_Chin Eyeglasses Goatee Gray_Hair Heavy_Makeup High_Cheekbones Male Mouth_Slightly_Open Mustache Narrow_Eyes No_Beard Oval_Face Pale_Skin Pointy_Nose Receding_Hairline Rosy_Cheeks Sideburns Smiling Straight_Hair Wavy_Hair Wearing_Earrings Wearing_Hat Wearing_Lipstick Wearing_Necklace Wearing_Necktie Young
Télécharger maintenant