GTC Japan 2014
Hitoshi Sato
Presentation slides for GTC Japan 2014 (http://www.gputechconf.jp/page/home.html).
3.
TSUBAME2 System Overview

Storage: 11PB (7PB HDD, 4PB Tape, 200TB SSD)
• Parallel File System Volumes
  – "Global Work Space" #1, #2, #3 on SFA10k #1 to #5: /work0, /work1, /gscr
  – GPFS #1 to #4 on SFA10k #6: /data0, /data1
• Home Volumes
  – "cNFS/Clustered Samba w/ GPFS": HOME, system application
  – "NFS/CIFS/iSCSI by BlueARC": HOME, iSCSI
• Infiniband QDR Networks: QDR IB (×4) × 20, QDR IB (×4) × 8, 10GbE × 2
• Capacities shown: 1.2PB, 3.6 PB, 2.4 PB HDD + 4PB Tape

Computing Nodes: 17.1 PFlops (SFP), 5.76 PFlops (DFP), 224.69 TFlops (CPU), ~100TB MEM, ~200TB SSD
• Thin nodes: 1408 nodes (32 nodes x 44 racks), HP Proliant SL390s G7
  – CPU: Intel Westmere-EP 2.93GHz, 6 cores × 2 = 12 cores/node
  – GPU: NVIDIA Tesla K20X, 3 GPUs/node
  – Mem: 54GB (96GB)
  – SSD: 60GB x 2 = 120GB (120GB x 2 = 240GB)
• Medium nodes: 24 nodes, HP Proliant DL580 G7
  – CPU: Intel Nehalem-EX 2.0GHz, 8 cores × 4 = 32 cores/node
  – GPU: NVIDIA Tesla S1070, NextIO vCORE Express 2070
  – Mem: 128GB
  – SSD: 120GB x 4 = 480GB
• Fat nodes: 10 nodes, HP Proliant DL580 G7
  – CPU: Intel Nehalem-EX 2.0GHz, 8 cores × 4 = 32 cores/node
  – GPU: NVIDIA Tesla S1070
  – Mem: 256GB (512GB)
  – SSD: 120GB x 4 = 480GB

Interconnects: Full-bisection Optical QDR Infiniband Network
• Core Switch: Voltaire Grid Director 4700 × 12 (12 switches), IB QDR: 324 ports
• Edge Switch: Voltaire Grid Director 4036 × 179 (179 switches), IB QDR: 36 ports
• Edge Switch (w/ 10GbE ports): Voltaire Grid Director 4036E × 6 (6 switches), IB QDR: 34 ports, 10GbE: 2 ports
4.
Example
• TEPS (Traversed Edges Per Second)
• (Cybersecurity, Medical Informatics, Social Networks, Data Enrichment, Symbolic Networks)
  – concurrent search (Breadth First Search: BFS)
  – optimization (Single Source Shortest Path)
  – edge-oriented (Maximal Independent Set)
• Green Graph500: http://green.graph500.org/
• Kronecker Graph (BFS)
  – 16 (= m/n), 32
  – SCALE: 2^SCALE, 2^(SCALE+4)
  – SCALE 30: 10, 172, 344
• Input parameters
  – SCALE
  – edgefactor (= 16)
• Flow: Graph Generation → Graph Construction → BFS → Validation → results (64 iterations)
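As a rough illustration of the input parameters on this slide, the sketch below derives the Kronecker graph size from SCALE and edgefactor and computes a TEPS figure from a BFS time. The function names are assumptions made for this example and are not taken from the Graph500 reference code.

```cpp
#include <cassert>
#include <cstdint>

// Number of vertices in a Kronecker graph of the given SCALE: 2^SCALE.
std::uint64_t num_vertices(int scale) {
    assert(scale >= 0 && scale < 60);
    return std::uint64_t{1} << scale;
}

// Number of generated edges: edgefactor * 2^SCALE.
// With edgefactor = 16 this equals 2^(SCALE + 4).
std::uint64_t num_edges(int scale, std::uint64_t edgefactor = 16) {
    return edgefactor * num_vertices(scale);
}

// TEPS: edges traversed by one BFS divided by its time in seconds.
double teps(std::uint64_t traversed_edges, double seconds) {
    assert(seconds > 0.0);
    return static_cast<double>(traversed_edges) / seconds;
}
```

For SCALE = 30 and edgefactor = 16 this gives 2^30 vertices and 2^34 edges.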
10.
Hamar Overview
[Figure: a Distributed Array is split into Local Arrays owned by Rank 0, Rank 1, ..., Rank n. Map and Reduce run on each Local Array, and Shuffle performs the data transfer between ranks. Each Local Array is a Virtualized Data Object holding Host (CPU) data and Device (GPU) data, moved by Memcpy (H2D, D2H).]
11.
Map/Reduce code sample

    class MapImpl : public hamar::function::cuda::Map<MapContext> {
     public:
      // Emits each input key-value pair unchanged.
      __host__ __device__ void Operate(MapContext *context) {
        KeyType key = context->input_key();
        ValueType value = context->input_value();
        context->Emit(key, value);
      }
    };

    class ReduceImpl : public hamar::function::cuda::Reduce<ReduceContext> {
     public:
      // Sums all values that share one key and emits the total.
      __host__ __device__ void Operate(ReduceContext *context) {
        KeyType key = context->input_key();
        ValueType *values = context->input_values();
        int n = context->num_input_values();
        ValueType sum = values[0];
        for (int i = 1; i < n; ++i) sum += values[i];
        context->Emit(key, sum);
      }
    };
12.
Map/Reduce code sample (cont'd)

    int main() {
      MapImpl map;
      ReduceImpl reduce;

      Environment env;
      env.Init();  // MPI/CUDA initialization

      Directory object(&env);
      object.Init(path);

      object.Map(map);
      object.Reduce(reduce);

      object.Destroy();

      env.Destroy();  // MPI/CUDA finalization
    }
13.
Highly Accelerated MapReduce with Out-of-core support on GPUs
• Hierarchical memory management for large-scale data parallel processing using multi-GPUs
  – Support out-of-core processing on GPU devices
  – Overlapping computation and communication
[Figure: Map → Shuffle → Reduce stages. Data moves between CPU and GPU by Memcpy (H2D, D2H), with processing for each chunk on the GPU.]
14.
Map/Reduce Implementation
• Initialization before each operation
  – Remove unnecessary keys
  – Reordering data structures
• Optimizations for GPU accelerators
  – Assign a warp (32 threads) per key for avoiding warp divergence in Map/Reduce
  – Overlapping computation on GPU and data transfer between CPU and GPU
[Figure: Map/Reduce → Sort (sort key-value for Scan) → Scan (compact keys to unique) → Map/Reduce, with computation and data transfer overlapped.]
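The sort, scan, and compact steps on this slide can be pictured on the CPU as follows. This is only a sketch under stated assumptions: `reduce_by_key` and `KeyValue` are hypothetical names, summation stands in for a user-supplied Reduce function, and nothing here is taken from Hamar's source.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

using KeyValue = std::pair<int, long>;

// Sort pairs by key, then walk the sorted run once. Each change of key
// starts a new output entry, so keys are compacted to unique values
// while the values that share a key are summed.
std::vector<KeyValue> reduce_by_key(std::vector<KeyValue> pairs) {
    std::stable_sort(pairs.begin(), pairs.end(),
                     [](const KeyValue &a, const KeyValue &b) {
                         return a.first < b.first;
                     });
    std::vector<KeyValue> out;
    for (const KeyValue &kv : pairs) {
        if (out.empty() || out.back().first != kv.first) {
            out.push_back(kv);               // first value of a new key
        } else {
            out.back().second += kv.second;  // same key: accumulate
        }
    }
    return out;
}
```

Sorting first means every key's values are contiguous, which is what lets a GPU version hand one key segment to one warp.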
15.
GPU-based External Sort Implementation
• Out-of-core GPU sorting algorithm *1
  – Adopted sample-based parallel sorting algorithm
  – Overlapping computation on GPU and data transfer between CPU and GPU
• Steps
  1. Divide input data into chunks, then sort on GPU for each chunk
  2. Swap intermediate data on CPU
  3. Sort intermediate data on GPU
*1: Y. Ye et al., "GPUMemSort: A High Performance Graphics Co-processors Sorting Algorithm for Large Scale In-Memory Data", GSTF International Journal on Computing, 2011
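The three steps on this slide can be sketched on the CPU alone. In the sketch below `std::sort` stands in for the GPU sort, the splitters are simply the medians of the sorted chunks, and `external_sort` and `sort_chunk` are hypothetical names. It illustrates the sample-based idea only and is not the GPUMemSort or Hamar implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for "sort one chunk on the GPU": here a plain std::sort.
static void sort_chunk(std::vector<int> &chunk) {
    std::sort(chunk.begin(), chunk.end());
}

// Sorts `data` by sorting chunk-sized pieces, exchanging their contents
// by splitter, and sorting the resulting buckets. Buckets are only
// roughly chunk-sized; skewed input is not handled here.
std::vector<int> external_sort(const std::vector<int> &data,
                               std::size_t chunk_size) {
    assert(chunk_size > 0);
    // Step 1: divide the input into chunks and sort each chunk.
    std::vector<std::vector<int>> chunks;
    for (std::size_t i = 0; i < data.size(); i += chunk_size) {
        std::size_t end = std::min(data.size(), i + chunk_size);
        chunks.emplace_back(data.begin() + i, data.begin() + end);
        sort_chunk(chunks.back());
    }
    if (chunks.size() <= 1) {
        return chunks.empty() ? std::vector<int>{} : chunks.front();
    }
    // Sample one splitter candidate (the median) from every sorted chunk.
    std::vector<int> splitters;
    for (const auto &c : chunks) splitters.push_back(c[c.size() / 2]);
    std::sort(splitters.begin(), splitters.end());
    splitters.pop_back();  // k chunks -> k-1 splitters -> k buckets

    // Step 2: swap data on the CPU so that bucket b holds the values
    // between splitter b-1 and splitter b.
    std::vector<std::vector<int>> buckets(splitters.size() + 1);
    for (const auto &c : chunks) {
        for (int v : c) {
            std::size_t b = static_cast<std::size_t>(
                std::upper_bound(splitters.begin(), splitters.end(), v) -
                splitters.begin());
            buckets[b].push_back(v);
        }
    }
    // Step 3: sort every bucket and concatenate them in order.
    std::vector<int> out;
    out.reserve(data.size());
    for (auto &b : buckets) {
        sort_chunk(b);
        out.insert(out.end(), b.begin(), b.end());
    }
    return out;
}
```

Because every value in bucket b is smaller than every value in bucket b+1, the buckets can be sorted independently, which is what allows each one to be shipped to the GPU on its own.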
16.
Application Example: GIM-V
Generalized Iterative Matrix-Vector multiplication *1
• Easy description of various graph algorithms by implementing combine2, combineAll, assign functions
• PageRank, Random Walk with Restart, Connected Components
  – v' = M ×G v, where
    v'_i = assign(v_i, combineAll_i({x_j | j = 1..n, x_j = combine2(m_i,j, v_j)})) (i = 1..n)
  – Iterative 2-phase MapReduce operations: combine2 (stage 1), then combineAll and assign (stage 2)
• Straightforward implementation using Hamar
*1: Kang, U. et al., "PEGASUS: A Peta-Scale Graph Mining System - Implementation and Observations", IEEE International Conference on Data Mining, 2009
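To make the three GIM-V functions concrete, here is a small dense, single-machine sketch of PageRank written as combine2, combineAll, and assign. The damping factor `c`, the dense `Matrix` type, and the function names `gimv` and `pagerank` are assumptions for illustration. The two MapReduce stages are collapsed into plain loops, so this is not how Hamar or PEGASUS run it.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;
using Vector = std::vector<double>;

// One GIM-V step: v'_i = assign(v_i, combineAll_i({combine2(m_ij, v_j)})).
template <class Combine2, class CombineAll, class Assign>
Vector gimv(const Matrix &m, const Vector &v, Combine2 combine2,
            CombineAll combine_all, Assign assign) {
    Vector next(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        Vector partial;  // the x_j values for row i (stage 1)
        for (std::size_t j = 0; j < v.size(); ++j) {
            if (m[i][j] != 0.0) partial.push_back(combine2(m[i][j], v[j]));
        }
        next[i] = assign(v[i], combine_all(i, partial));  // stage 2
    }
    return next;
}

// PageRank as a GIM-V instance. `m` must be column-normalized:
// m[i][j] = 1 / outdegree(j) if there is an edge j -> i, else 0.
Vector pagerank(const Matrix &m, double c = 0.85, int iterations = 50) {
    const double n = static_cast<double>(m.size());
    Vector v(m.size(), 1.0 / n);
    for (int it = 0; it < iterations; ++it) {
        v = gimv(
            m, v,
            [c](double mij, double vj) { return c * mij * vj; },
            [c, n](std::size_t, const Vector &xs) {
                double sum = 0.0;
                for (double x : xs) sum += x;
                return (1.0 - c) / n + sum;
            },
            [](double /*old_value*/, double updated) { return updated; });
    }
    return v;
}
```

Swapping the three lambdas is enough to express the other algorithms listed on the slide, which is the point of the GIM-V abstraction.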
17.
Weak Scaling Performance [Sato, Shirahata et al. Cluster2014]
• PageRank application on TSUBAME 2.5
• Data size is larger than GPU memory capacity
• 2.81 GE/s on 3072 GPUs (SCALE 34)
• 2.10x speedup (3 GPUs vs 2 CPUs)
[Figure: Performance [MEdges/sec] vs Number of Compute Nodes, SCALE 23 to 24 per node. Series: 1CPU (S23 per node), 1GPU (S23 per node), 2CPUs (S24 per node), 2GPUs (S24 per node), 3GPUs (S24 per node).]
18.
Breakdown
• Performance on 3 GPUs compared with 2 CPUs
  – SCALE 33, 1024 nodes
  – Map: 2.82x, Reduce: 1.11x, Sort: 5.04x speedup
• Overlapping communication effectively
[Figure: Elapsed time [ms] for 1CPU, 1GPU, 2CPUs, 2GPUs, 3GPUs, broken down into Map, Shuffle, Reduce, Sort, Others.]
19.
Towards Multilevel data management on Hamar using GPUs and NVMs [GTC2014]
• How to design local storage for next-gen supercomputers?
  – Designed a local I/O prototype using 16 mSATA SSDs
  – Capacity: 4TB
  – Read bandwidth: 8 GB/s
• I/O performance of multiple mSATA SSDs: 7.39 GB/s from 16 mSATA SSDs (RAID0 enabled)
• I/O performance from GPU to multiple mSATA SSDs: 3.06 GB/s from 8 mSATA SSDs to GPU
[Figure: prototype with mSATA SSDs attached through a RAID card to the motherboard.]
[Figure: Bandwidth [MB/s] vs # of mSATAs. Series: Raw mSATA 4KB, RAID0 1MB, RAID0 64KB.]
[Figure: Throughput [GB/s] vs Matrix Size [GB]. Series: Raw 8 mSATA, 8 mSATA RAID0 (1MB), 8 mSATA RAID0 (64KB).]
20.
Sorting for Rapidly Increasing Datasets [Shamoto, Sato et al]
• The need to process huge datasets is increasing due to growth of data collection in various fields
  – Sensor data
  – SNS network
• Fast sorting methods
  – Distributed sorting: sorting for distributed systems
    • Splitter-based parallel sort
    • Radix sort
    • Merge sort
  – Sorting on heterogeneous architectures
    • Many sorting algorithms are accelerated by many cores and high memory bandwidth.
• Sorting for large-scale heterogeneous systems remains unclear
21.
Existing Sorting Algorithms
Splitter-based parallel sorting
  – The flow of the algorithm
    1. Local sort: Each process sorts its own array
    2. Select splitters: Choose criteria for data segmentation
    3. Data transfer: Transfer data segments
    4. Local merge: Merge sorted arrays
  – Low communication costs → Computation costs start dominating the overall performance
Sorting on GPU
  – There are many attempts to accelerate sorting
    • Thrust sort [D. Merrill et al., 2011]
      – Fast sorting for one compute node
    • A GPU external sort [Y. Ye et al., 2010]
      – Handles GPU memory overflows
    • A multi-node GPU sort [K. L. Spafford et al., 2011]
      – Does not sort huge data sets
Utilize GPU accelerators for splitter-based parallel sorting
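The four phases of splitter-based parallel sorting can be simulated in a single process, treating each inner array as the data owned by one rank. This is a minimal sketch: regular sampling is just one assumed way of choosing splitters, `splitter_sort` is a hypothetical name, and real implementations such as HykSort exchange the segments over MPI instead of through a shared vector.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

using Array = std::vector<long>;

// parts[r] is the array owned by rank r. Returns what each rank holds
// afterwards: rank r's array is sorted, and every key on rank r is
// less than or equal to every key on rank r + 1.
std::vector<Array> splitter_sort(std::vector<Array> parts) {
    const std::size_t p = parts.size();
    assert(p > 0);
    // 1. Local sort: each process sorts its own array.
    for (Array &a : parts) std::sort(a.begin(), a.end());

    // 2. Select splitters: take p-1 evenly spaced samples per process,
    //    sort all samples, and pick p-1 evenly spaced splitters.
    Array samples;
    for (const Array &a : parts) {
        for (std::size_t k = 1; k < p; ++k) {
            if (!a.empty()) samples.push_back(a[k * a.size() / p]);
        }
    }
    std::sort(samples.begin(), samples.end());
    Array splitters;
    for (std::size_t k = 1; k < p && !samples.empty(); ++k) {
        splitters.push_back(samples[k * samples.size() / p]);
    }

    // 3. Data transfer: each process cuts its sorted array at the
    //    splitters and sends segment d to process d.
    // 4. Local merge: the receiver merges every sorted segment it gets.
    std::vector<Array> result(p);
    for (const Array &a : parts) {
        auto begin = a.begin();
        for (std::size_t d = 0; d < p; ++d) {
            auto end = d < splitters.size()
                           ? std::upper_bound(begin, a.end(), splitters[d])
                           : a.end();
            Array merged;
            std::merge(result[d].begin(), result[d].end(), begin, end,
                       std::back_inserter(merged));
            result[d].swap(merged);
            begin = end;
        }
    }
    return result;
}
```

Only phase 3 moves data between ranks, so the local sort in phase 1 and the merge in phase 4 are the phases left to accelerate.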
22.
GPU implementation for Splitter-based Parallel Sorting
• Offloading the most time-consuming phase to GPU accelerators
[Figure: Elapsed time [s] vs # of processes (2 processes per node), from 4 to 2048. Breakdown: synchronization costs, data transfer and merge, local sort (original), merge (remaining arrays), select splitters.]
[Figure: unsorted → local sort (GPU) → select splitters → data transfer → merge → sorted.]
23.
Weak Scaling Performance
• 2 to 1024 nodes (4 to 2048 GPUs) on TSUBAME2.5
• 2 processes per node, and each node holds 2GB of 64-bit integers
• Annotated speedups when the # of processes is 2048: x1.4 and x3.6
[Figure: Keys/second (millions) vs # of processes (2 processes per node). Series: HykSort 1thread (single-threaded implementation), HykSort 6threads (multi-threaded implementation), HykSort GPU + 6threads (GPU implementation based on the multi-threaded implementation).]
24.
Performance Prediction
• PCIe_#: # GB/s bandwidth of interconnect between CPU and GPU
• 8.8% reduction of overall runtime when the accelerators work 4 times faster than K20x
• x2.2 speedup when the PCIe bandwidth increases to 50GB/s
[Figure: two panels, "K20x" and "x4 faster than K20x". Keys/second (millions) vs # of processes (2 processes per node). Series: HykSort 6threads, HykSort GPU + 6threads, and predictions of our implementation for PCIe_10, PCIe_50, PCIe_100, PCIe_200.]