Dai Yang, Josef Weidendorfer, Tilman Küstner and Carsten Trinitis
Chair of Computing Architecture
Technical University of Munich (TUM)
Sibylle Ziegler
Klinik und Poliklinik für Nuklearmedizin,
Ludwig-Maximilians-Universität München
14 September 2017
Enabling Application Integrated Proactive Fault Tolerance
ENVELOPE – Efficiency and Reliability: Selforganisation in HPC Systems
ParCo Conferences 2017
http://envelope.itec.kit.edu/
Motivation
• Complexity of HPC on the way to exascale computing
To hide the complexity of HPC from the application programmer.
• HPC applications lack dynamic behaviour
• The degree of heterogeneity keeps increasing
• Efficiency
To increase the efficiency of existing and new HPC applications.
• Reliability
To increase the reliability of the HPC environment.
- Global checkpointing and restart do not scale well enough for exascale
• This work is part of the BMBF project ENVELOPE and is funded by the BMBF under grant 01IH16010D.
• Computing resources for this project were provided by the Gauss Centre for Supercomputing/Leibniz Supercomputing Centre under grant pr63qi.
(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
Goals
Background
• Application-integrated approach
• In contrast to the application-transparent, system-level approach
• For both existing and new applications
Basic Idea
• Exchange/expand/shrink the application ("malleable" application)
• The application should be able to retreat on its own
• Incrementally adaptable
• Data-oriented, SPMD model (same as MPI)
• PGAS-like
LAIK (0) – Design Principles
• Modularized design, plugin-based, expandable
• Index space abstraction
• A bit of data management – no global array
• Automatic load balancing
• (Proactive) fault tolerance
• (Future) reactive fault tolerance using in-memory checkpointing
LAIK (1) – Design
• Application-integrated
• Typical data types (1D/2D/3D) + (future) any data types
• Typical HPC communication backend: currently MPI (works with simple OpenMP as well)
LAIK (2) – At a Glance
• Partitioning over index spaces
• Automatic data (re-)balancing by repartitioning:
• Uniform distribution by number of elements or task-wise
• By element weight
• (future) By profiling
• Fault tolerance
• Proactive, via repartitioning
• (future) Reactive, via local in-memory checkpointing
• Communication backend:
• Working: MPI
• WIP: Shared memory
• WIP: Agents for system state information
• MQTT and TCP
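Balancing "by element weight" can be illustrated with a small sketch in plain C. It is not LAIK code: the function name `weighted_block_partition` and the `bounds` layout are assumptions made for this example. It splits a 1D index space into contiguous blocks so that each task receives roughly the same share of the total weight.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the LAIK API: split indexes [0, n)
 * into ntasks contiguous blocks of roughly equal total weight.
 * On return, task t owns [bounds[t], bounds[t+1]); bounds must have
 * room for ntasks + 1 entries. */
void weighted_block_partition(const double *weight, size_t n,
                              size_t ntasks, size_t *bounds)
{
    assert(ntasks > 0);
    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += weight[i];

    bounds[0] = 0;
    size_t t = 1;
    double acc = 0.0;
    for (size_t i = 0; i < n && t < ntasks; i++) {
        acc += weight[i];
        /* close block t-1 once its share of the total weight is reached */
        while (t < ntasks && acc >= total * (double)t / (double)ntasks) {
            bounds[t] = i + 1;
            t++;
        }
    }
    /* boundaries not set above (e.g. when n == 0) end at n */
    for (; t <= ntasks; t++)
        bounds[t] = n;
}
```

With weights {4, 1, 1, 1, 1} and two tasks, the first task gets only index 0 and the second gets the remaining four, so both carry a weight of 4.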
LAIK (3) – Partitioning
• Access pattern (r/w) and data flow (CopyIn/CopyOut) are controlled
• Supports coupling of different data containers
• Data consistency is ensured by applying a given reduction operation when multiple tasks write to the same data
• Flexible (malleable) data partitions for repartitioning
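The consistency rule for multiple write access can be pictured as follows. This is a single-process sketch in plain C, not LAIK code; the function name `reduce_sum_all` and the memory layout are assumptions. Every task writes to its own private copy, and a sum reduction then makes all copies agree, which is what an `MPI_Allreduce` with `MPI_SUM` would do across processes in an MPI program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration (plain C, no LAIK calls): each of ntasks
 * tasks holds a private copy of an n-element container, stored
 * row-wise in copies[t * n + i]. After the tasks have written, the
 * copies are combined with a sum reduction and every copy receives
 * the result. */
void reduce_sum_all(double *copies, size_t ntasks, size_t n)
{
    assert(ntasks > 0);
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t t = 0; t < ntasks; t++)
            sum += copies[t * n + i];
        for (size_t t = 0; t < ntasks; t++)
            copies[t * n + i] = sum;   /* all tasks now agree on element i */
    }
}
```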
LAIK (4) – Partitioning and Partitioners
• Types of partitioning and corresponding partitioners
o Master: all data in only one task
o Blocked: every task has a slice of the data
o All: every task has all data
o (future) Halo, Bisection and others
• Switching the partitioning redistributes the data
• Data flow and consistency are checked and enforced
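As a rough sketch of what the three partitioning types mean for a 1D index space, the function below computes the index range a task holds. The enum and function names are invented for this example and are not LAIK identifiers; the choice of task 0 as the master is also an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names for illustration; they are not LAIK identifiers. */
typedef enum { PART_MASTER, PART_BLOCKED, PART_ALL } PartKind;

/* Compute the half-open index range [*from, *to) of a 1D space of
 * size n that task `task` (0-based, out of ntasks) holds. */
void owned_range(PartKind kind, size_t n, size_t ntasks, size_t task,
                 size_t *from, size_t *to)
{
    assert(ntasks > 0 && task < ntasks);
    switch (kind) {
    case PART_MASTER:       /* all data in task 0, others hold nothing */
        *from = 0;
        *to = (task == 0) ? n : 0;
        break;
    case PART_BLOCKED: {    /* contiguous slices, sizes differ by at most 1 */
        size_t base = n / ntasks, rest = n % ntasks;
        *from = task * base + (task < rest ? task : rest);
        *to = *from + base + (task < rest ? 1 : 0);
        break;
    }
    case PART_ALL:          /* every task holds the full range */
        *from = 0;
        *to = n;
        break;
    }
}
```

Switching a container from one type to another then amounts to moving the indexes whose owner differs between the two range assignments.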
LAIK (5) – Repartitioning
• Different repartitioning methods: continuous and incremental
• Steps:
1. Synchronize tasks, communicate the numbers of the failed tasks
2. Create a new group excluding the failed tasks
3. Get the partitioner and rerun it with the new group -> new balanced index ranges
4. Calculate the differences and the data transfer actions required (the transition)
5. For each data container: execute the transition
6. (optional) Remove the old group / migrate it to the new group
7. Update the data/address space mapping
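Steps 3 and 4 of the repartitioning procedure can be sketched for a 1D block partitioning as below. This is not the LAIK implementation: the `Transfer` struct, the function name and the equal-size rebalancing are assumptions for illustration. The sketch relies on the proactive setting, in which a task predicted to fail is still alive and can send its data away.

```c
#include <assert.h>
#include <stddef.h>

/* One data transfer: old task `src` sends indexes [from, to) to task `dst`. */
typedef struct { size_t src, dst, from, to; } Transfer;

/* Hypothetical sketch (not the LAIK implementation) of steps 3 and 4.
 * old_bounds has n_old + 1 entries; old task t owned
 * [old_bounds[t], old_bounds[t+1]). survivors lists the n_new old task
 * ids that remain; survivor k gets an equally sized new block.
 * Ranges whose owner changes are written to out (capacity cap).
 * Returns the number of transfers. */
size_t plan_transition(const size_t *old_bounds, size_t n_old,
                       const size_t *survivors, size_t n_new,
                       Transfer *out, size_t cap)
{
    assert(n_old > 0 && n_new > 0 && n_new <= n_old);
    size_t n = old_bounds[n_old];          /* size of the index space */
    size_t count = 0;
    for (size_t k = 0; k < n_new; k++) {
        /* new balanced block of survivor k */
        size_t nf = n * k / n_new, nt = n * (k + 1) / n_new;
        for (size_t o = 0; o < n_old; o++) {
            /* intersection of old block o with new block k */
            size_t f = old_bounds[o] > nf ? old_bounds[o] : nf;
            size_t t = old_bounds[o + 1] < nt ? old_bounds[o + 1] : nt;
            if (f >= t || o == survivors[k])
                continue;          /* empty, or data already in place */
            assert(count < cap);
            out[count++] = (Transfer){ o, survivors[k], f, t };
        }
    }
    return count;
}
```

For 12 indexes on 4 tasks with task 1 removed, this yields three transfers: task 1 hands [3, 4) to task 0 and [4, 6) to task 2, and task 2 hands [8, 9) to task 3.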
LAIK (6) – Basic API
• Laik_Instance* inst = laik_init_mpi(&argc, &argv);
• Laik_Group* world = laik_world(inst);
• Laik_Space* space = laik_new_space_1d(inst, matrix.rows());
• Laik_Partitioner* part = laik_new_block_partitioner_iw1(getEW, &matrix);
• Laik_Partitioning* p = laik_new_partitioning(world, space, part);
• Laik_Data* result = laik_alloc_1d(world, laik_Float, nRows);
• laik_switchto_new(result, laik_All, LAIK_DF_None);
• laik_switchto_flow(result, LAIK_DF_Init | LAIK_DF_ReduceOut | LAIK_DF_Sum);
• laik_map_def1(result, (void**) &res, 0);
• Laik_Slice* slc = laik_my_slice(p, sNo);
• laik_switchto_flow(result, LAIK_DF_CopyIn);
• Laik_Group* g2 = laik_new_shrinked_group(g, removeLen, removeList);
• rep = laik_new_reassign_partitioner(g2, getEW, (void*)&matrix);
• laik_migrate_and_repartition(part, g2, rep);
MLEM – Short Introduction
The small-animal PET scanner MADPET-II
1152 detectors, 662976 lines of response
Field of view 140 x 140 x 40 voxels, total 784000 voxels
MLEM Algorithm
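The body of this slide did not survive extraction. For orientation, the standard MLEM update (Shepp and Vardi) multiplies each voxel estimate f_j by the back-projected ratio of measured to forward-projected counts, normalized by the column sum of the system matrix A. The sketch below shows one such iteration on a small dense matrix; the function name and buffer layout are assumptions, and the real system matrix is sparse.

```c
#include <assert.h>
#include <stddef.h>

/* One MLEM iteration on a small dense system matrix (illustration
 * only). a is rows x cols, row-major; g[rows] are the measured counts,
 * f[cols] is the image estimate (updated in place), norm[cols] holds
 * the column sums of a, and fwd[rows] and corr[cols] are scratch
 * buffers provided by the caller. */
void mlem_iteration(const double *a, size_t rows, size_t cols,
                    const double *g, const double *norm,
                    double *f, double *fwd, double *corr)
{
    assert(rows > 0 && cols > 0);
    /* forward projection: fwd = A f */
    for (size_t i = 0; i < rows; i++) {
        fwd[i] = 0.0;
        for (size_t j = 0; j < cols; j++)
            fwd[i] += a[i * cols + j] * f[j];
    }
    /* back projection of the ratio: corr = A^T (g / fwd) */
    for (size_t j = 0; j < cols; j++)
        corr[j] = 0.0;
    for (size_t i = 0; i < rows; i++) {
        if (fwd[i] <= 0.0)
            continue;              /* skip rows with no projected activity */
        double ratio = g[i] / fwd[i];
        for (size_t j = 0; j < cols; j++)
            corr[j] += a[i * cols + j] * ratio;
    }
    /* multiplicative update, normalized by the column sums */
    for (size_t j = 0; j < cols; j++)
        if (norm[j] > 0.0)
            f[j] *= corr[j] / norm[j];
}
```

The forward projection works row by row, which is why partitioning the matrix rows across tasks is a natural fit.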
Steps Done for Porting MLEM to LAIK
• Adapted the matrix partitioning to use LAIK
• Improved the mapping algorithm of the sparse matrix to handle multiple independent slices
• Created data containers for all working vectors
• Added a loop to handle multiple slices
• Added a wrapper for handling the repartitioning parameters
• System: CooLMUC 2 - NeXtScale nx360M5, Xeon E5-2697v3 14C 2.6GHz, Infiniband FDR14
• Test input: 12 GB sparse probability matrix, 10 iterations
• Fault simulated by enforcing a shrink after the 6th iteration
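The "loop to handle multiple slices" can be pictured as a sparse matrix-vector product that visits only the row ranges a task currently owns. The sketch below assumes a CSR layout and uses illustrative names (`RowSlice`, `spmv_slices`) that are not taken from the MLEM or LAIK sources.

```c
#include <assert.h>
#include <stddef.h>

/* Half-open row range [from, to) owned by the calling task. */
typedef struct { size_t from, to; } RowSlice;

/* Forward projection restricted to the rows this task owns.
 * The sparse matrix is in CSR form (row_ptr, col_idx, val); only the
 * entries y[i] of owned rows i are written. Names are illustrative and
 * not taken from the MLEM or LAIK sources. */
void spmv_slices(const size_t *row_ptr, const size_t *col_idx,
                 const double *val, const double *x, double *y,
                 const RowSlice *slices, size_t nslices)
{
    for (size_t s = 0; s < nslices; s++) {      /* loop over all slices */
        assert(slices[s].from <= slices[s].to);
        for (size_t i = slices[s].from; i < slices[s].to; i++) {
            double sum = 0.0;
            for (size_t k = row_ptr[i]; k < row_ptr[i + 1]; k++)
                sum += val[k] * x[col_idx[k]];
            y[i] = sum;
        }
    }
}
```

After a repartitioning a task may own several non-adjacent row ranges, so iterating over a list of slices rather than a single range keeps the compute kernel unchanged.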
Results (0) – Overview
Results (1) – Overhead of LAIK
Results (2) – Time for Repartitioning
Results (3) – Two Repartitioning Algorithms
Conclusion
• LAIK: a library to increase elasticity in parallel applications
• By adding partitioned index spaces as an abstraction
• Repartitioning as central functionality
• Automatic load balancing
• Fault tolerant
• Modularized and expandable
• Increased elasticity in parallel codes
• Porting MLEM & results
• Limited effort required for porting the application
• Low overhead of LAIK
• LAIK scales at least as well as the original application
Future Work
Work in Progress
• Porting further applications, e.g. LULESH
• Further scalability research using >10000 cores on SuperMUC
• Agent system
• Shared memory backend
• Further optimization to reduce communication effort
Proposed
• Solution to overcome MPI weaknesses
• Local in-memory checkpointing
• Non-regular data structures
• Elastic index space size for hierarchical instantiations
[1] Alrutz, T., Backhaus, J., and et. al. GASPI: A Partitioned Global Address Space
programming interface. In Facing the Multicore-Challenge III (2013), vol. 7686 of Lecture
notes in computer science, Springer Berlin Heidelberg.
[2] Bergman, K., Borkar, S., and et. al. Exascale computing study: Technology challenges in
achieving exascale systems. DARPA IPTO Office, Tech. Rep 15 (2008).
[3] Forum, M. P. I. MPI: A Message-Passing Interface Standard Version 3.0, 2012.
[4] Furlinger, K., Glass, C., Knüpfer, A., Tao, J., Hünich, D., Idrees, K., Maiterth, M., Mhedheb,
Y., and Zhou, H. DASH: Data structures and algorithms with support for hierarchical locality. In
Euro-Par 2014 Workshops (Porto, Portugal) (2014).
[5] Idrees, K. Effective use of the PGAS paradigm: Driving transformations and self-adaptive
behavior in dash-applications. In Proceedings of the 1st Int. Workshop on Program
Transformation for Programmability in Heterogeneous Architectures (2016).
[6] Kale, L. V., and Krishnan, S. Charm++: a portable concurrent object oriented system based
on c++. In ACM Sigplan Notices (1993), vol. 28, ACM, pp. 91–108.
[7] Küstner, T., Weidendorfer, J., Schirmer, J., Klug, T., Trinitis, C., and Ziegler, S. Parallel
MLEM on multicore architectures. In ICCS 2009: 9th Int. Conf. on Computational Science
(Berlin, Heidelberg, 2009), G. Allen et al., Ed., Springer.
20(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
References
[8] Nagarajan, A. B., and Mueller, F. Proactive fault tolerance for HPC with Xen virtualization.
In Proceedings of the 21st annual Int. Conf. on Supercomputing (2007).
[9] Nieplocha, J., Palmer, B., Tipparaju, V., Krishnan, M., Trease, H., and Apra, E. ` Advances,
applications and performance of the global arrays shared memory programming toolkit. The
Int. Journal of High Performance Computing Applications 20, 2 (2006).
[10] Pickartz, S., Clauss, C., Lankes, S., Krempel, S., Moschny, T., and Monti, A. Nonintrusive
Migration of MPI Processes in OS-Bypass Networks. In 2016 IEEE Int. Parallel and Distributed
Processing Symposium Workshops (IPDPSW) (2016).
[11] Rafecas, M., Mosler, B., Dietz, M., Pgl, M., Stamatakis, A., McElroy, D. P., and Ziegler, S.
I. Use of a Monte Carlo-based probability matrix for 3-D iterative reconstruction of MADPET-II
data. IEEE Trans. on Nuclear Science 51, 5 (2004).
[12] Saraswat, V., Bloom, B., and et. al. X10 language specification version 2.5.
[13] Shepp, L. A., and Vardi, Y. Maximum likelihood reconstruction for emission tomography.
IEEE Transactions on Medical Imaging 1, 2 (1982), 113–122.
[14] Strul, D., Slates, R. B., Dahlbom, M., Cherry, S. R., and Marsden, P. K. An improved
analytical detector response function model for multilayer small-diameter PET scanners.
Physics in Medicine and Biology 48 (2003), 979–994.
21(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
References
[15] Treichler, S., Bauer, M., and Aiken, A. Language support for dynamic, hierarchical data
partitioning. In ACM SIGPLAN Notices (2013), vol. 48, ACM, pp. 495–514.
[16] Wang, C., Mueller, F., and et. al. Proactive process-level live migration and back migration
in HPC environments. J. of Parallel and Distributed Comp. 72, 2 (2012).
[17] Weidendorfer, J., Yang, D., and Trinitis, C. Laik: A library for fault tolerant distribution of
global data for parallel applications. In Proceedings of the 27th PARS Workshop (PARS 2017)
(Hagen, 2017), Gesellschaft für Informatik.
[18] Zhou, H., Mhedheb, Y., and et. al. DART-MPI: an mpi-based implementation of a PGAS
runtime system. CoRR abs/1507.01773 (2015).
[19] Zima, H., Chamberlain, B. L., and Callahan, D. Parallel programmability and the Chapel
language. International Journal on HPC Applications, Special Issue on High Productivity
Languages and Models 21, 3 (2007), 291–312.
22(C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
References
Infos
• LAIK
https://github.com/envelope-project/laik
• MLEM Project
https://github.com/envelope-project/mlem
• Josef Weidendorfer
weidendo@in.tum.de
• Dai Yang
d.yang@tum.de
• Tilman Küstner
kuestner@in.tum.de
• Carsten Trinitis
carsten.trinitis@tum.de

Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 

Enabling Application Integrated Proactive Fault Tolerance

  • 1. Enabling Application Integrated Proactive Fault Tolerance
      Dai Yang, Josef Weidendorfer, Tilman Küstner and Carsten Trinitis
      Chair of Computing Architecture, Technical University of Munich (TUM)
      Sibylle Ziegler
      Klinik und Poliklinik für Nuklearmedizin, Ludwig-Maximilians-Universität München
      14 September 2017, ParCo Conferences 2017
      ENVELOPE – Efficiency and Reliability: Selforganisation in HPC Systems
      http://envelope.itec.kit.edu/
  • 2. Motivation
      • Complexity of HPC towards exascale computing: hide the complexity of HPC from the application programmer.
      • Missing dynamics in HPC applications
      • Increasing degree of heterogeneity
      • Efficiency: increase the efficiency of existing and new HPC applications.
      • Reliability: increase the reliability of the HPC environment.
          - Global checkpointing and restart do not scale well enough for exascale
      • This work is part of the BMBF project ENVELOPE and is funded by BMBF under grant 01IH16010D.
      • Computer resources for this project have been provided by the Gauss Centre for Supercomputing/Leibniz Supercomputing Centre under grant pr63qi.
      (C) 2017 D. Yang (TUM-LRR) | www.lrr.in.tum.de | Enabling Application Integrated Fault Tolerance | PAR-CO 2017
  • 3. Goals
      Background
      • Application-integrated approach
      • In contrast to an application-transparent, system-level approach
      • For both existing and new applications
      Basic Idea
      • Exchange/expand/shrink the application ("malleable" application)
      • Application should be able to retreat by itself
      • Incrementally adaptable
      • Data-oriented, SPMD model (same as MPI)
      • PGAS-like
  • 4. LAIK (0) – Design Principles
      • Modularized design, plugin-based, expandable
      • Index space abstraction
      • A bit of data management, no global array
      • Automatic load balancing
      • (proactive) Fault tolerance
      • (future) Reactive fault tolerance by using in-memory checkpointing
  • 5. LAIK (1) – Design
      • Application-integrated
      • Typical data types (1D/2D/3D), (future) any data types
      • Typical HPC communication backend: currently MPI (works with simple OpenMP as well)
  • 6. LAIK (2) At a Glance
      • Partitioning over index spaces
      • Automatic data (re-)balancing by repartitioning:
          o Uniform distribution per number of elements or task-wise
          o By element weight
          o (future) By profiling
      • Fault tolerance
          o Proactive, via repartitioning
          o (future) Reactive, via local in-memory checkpointing
      • Communication backend:
          o Working: MPI
          o WIP: Shared memory
      • WIP: Agents for system state information
          o MQTT and TCP
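The "by element weight" balancing listed above can be illustrated with a minimal sketch. This is not LAIK code: the function name `weighted_block_bounds` and its parameters are hypothetical, and the sketch only assumes that consecutive indexes are grouped into one block per task so that each block carries roughly the same share of the total weight.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of LAIK: split indexes [0, n) into
 * `ntasks` consecutive blocks of roughly equal total weight.
 * On return, task t owns the half-open range [bounds[t], bounds[t+1]).
 * `bounds` must have room for ntasks + 1 entries. */
void weighted_block_bounds(const double *weight, size_t n,
                           size_t ntasks, size_t *bounds)
{
    assert(ntasks > 0);

    double total = 0.0;
    for (size_t i = 0; i < n; i++)
        total += weight[i];

    size_t t = 1;          /* next boundary to place */
    double running = 0.0;  /* accumulated weight of indexes [0, i] */
    bounds[0] = 0;
    for (size_t i = 0; i < n && t < ntasks; i++) {
        running += weight[i];
        /* close block t-1 once it has reached its share of the total */
        while (t < ntasks &&
               running >= total * (double)t / (double)ntasks) {
            bounds[t] = i + 1;
            t++;
        }
    }
    /* boundaries not placed inside the loop (e.g. n == 0) end at n */
    while (t <= ntasks)
        bounds[t++] = n;
}
```

With uniform weights this reduces to the uniform distribution per number of elements; a single heavy element pulls the block boundary towards it.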
  • 7. LAIK (3) – Partitioning
      • Access pattern (r/w) and data flow (CopyIn/CopyOut) are controlled
      • Supports coupling of different data containers
      • Data consistency by applying the given reduction operation upon multiple write accesses
      • Flexible data partitions (malleable) for repartitioning
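The consistency rule above (a reduction operation resolves multiple write accesses) can be pictured as follows. This is a sketch under assumptions, not LAIK's implementation: every task is assumed to write into its own private copy of the container, and the copies are then combined element-wise with a sum. The function name `reduce_sum` and the flat buffer layout are hypothetical.

```c
#include <stddef.h>

/* Hypothetical illustration, not part of LAIK: combine the private
 * copies written by `ntasks` tasks into one consistent result using
 * a sum reduction. Copy t occupies copies[t * n .. t * n + n - 1]. */
void reduce_sum(const double *copies, size_t ntasks, size_t n,
                double *out)
{
    for (size_t i = 0; i < n; i++) {
        double acc = 0.0;
        for (size_t t = 0; t < ntasks; t++)
            acc += copies[t * n + i];
        out[i] = acc;
    }
}
```

In a distributed run the same combination would be carried out by the communication backend rather than by a loop over a shared buffer.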
  • 8. LAIK (4) – Partitioning and Partitioners
      • Types of partitioning and corresponding partitioners
          o Master: all data in only one task
          o Blocked: every task has a slice of the data
          o All: everyone has everything
          o (future) Halo, Bisection and others
      • Switch partitioning for data redistribution
      • Data flow and consistency are checked and enforced
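The three partitioning types above can be sketched as a function that returns the index range a task owns. The names `PartKind`, `Range` and `owned_range` are hypothetical and not taken from LAIK; the sketch also assumes that the master is task 0 and that the blocked variant distributes a 1D index space as evenly as possible.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types for illustration only, not the LAIK API. */
typedef enum { PART_MASTER, PART_BLOCKED, PART_ALL } PartKind;
typedef struct { size_t from, to; } Range;  /* half-open [from, to) */

/* Range of the 1D index space [0, n) owned by `task` out of `ntasks`. */
Range owned_range(PartKind kind, size_t n, size_t ntasks, size_t task)
{
    assert(ntasks > 0 && task < ntasks);
    Range r = { 0, 0 };
    switch (kind) {
    case PART_MASTER:   /* all data in one task (assumed: task 0) */
        if (task == 0)
            r.to = n;
        break;
    case PART_BLOCKED: { /* every task gets one contiguous slice */
        size_t base = n / ntasks, extra = n % ntasks;
        r.from = task * base + (task < extra ? task : extra);
        r.to = r.from + base + (task < extra ? 1 : 0);
        break;
    }
    case PART_ALL:      /* everyone has everything */
        r.to = n;
        break;
    }
    return r;
}
```

Switching a container from one kind to another then amounts to comparing the old and new ranges of each task and moving the difference.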
  • 9. LAIK (5) – Repartitioning
      • Different repartitioning methods: continuous and incremental
      • Steps:
          1. Synchronize tasks, communicate failed task numbers
          2. Create a new group excluding failed tasks
          3. Get the partitioner, rerun it with this new group -> new balanced indexes
          4. Calculate the differences and the data transfer actions required (the transition)
          5. For each data container: execute the transition
          6. (optional) Remove/migrate old group to new group
          7. Update data/address space mapping
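Steps 2 to 4 above can be sketched in plain C. This is an illustration under assumptions, not LAIK's implementation: it assumes a simple block partitioning of a 1D index space, all function names (`shrink_group`, `plan_transition`, and the two static helpers) are hypothetical, and the "transition" is reduced to a per-index table of new owners plus a count of elements that have to move.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* First index of task `task` when [0, n) is split into `ntasks`
 * blocks; for task == ntasks this returns n. */
static size_t block_start(size_t n, size_t ntasks, size_t task)
{
    size_t base = n / ntasks, extra = n % ntasks;
    return task * base + (task < extra ? task : extra);
}

/* Task that owns index i (i < n) under the block partitioning. */
static size_t block_owner(size_t n, size_t ntasks, size_t i)
{
    size_t t = 0;
    while (block_start(n, ntasks, t + 1) <= i)
        t++;
    return t;
}

/* Step 2: build the new group. new_to_old[r] is the old task number
 * of rank r in the new group; returns the size of the new group. */
size_t shrink_group(const bool *failed, size_t ntasks,
                    size_t *new_to_old)
{
    size_t m = 0;
    for (size_t t = 0; t < ntasks; t++)
        if (!failed[t])
            new_to_old[m++] = t;
    return m;
}

/* Steps 3-4: rerun the block partitioner on the new group and compare
 * with the old partitioning. new_owner[i] receives the old task number
 * of the new owner of index i; returns how many elements must move. */
size_t plan_transition(size_t n, size_t ntasks,
                       const size_t *new_to_old, size_t nsurvivors,
                       size_t *new_owner)
{
    assert(ntasks > 0 && nsurvivors > 0);
    size_t moved = 0;
    for (size_t i = 0; i < n; i++) {
        size_t old_owner = block_owner(n, ntasks, i);
        new_owner[i] = new_to_old[block_owner(n, nsurvivors, i)];
        if (new_owner[i] != old_owner)
            moved++;
    }
    return moved;
}
```

For 12 indexes on 4 tasks with task 1 marked as failing, the surviving tasks 0, 2 and 3 each end up with 4 indexes, and 4 elements change owner. Because the approach is proactive, the data of the task that is about to fail is still available as a transfer source when the transition is executed.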
  • 10. LAIK (6) – Basic API
      • Laik_Instance* inst = laik_init_mpi(&argc, &argv);
      • Laik_Group* world = laik_world(inst);
      • Laik_Space* space = laik_new_space_1d(inst, matrix.rows());
      • Laik_Partitioner* part = laik_new_block_partitioner_iw1(getEW, &matrix);
      • Laik_Partitioning* p = laik_new_partitioning(world, space, part);
      • Laik_Data* result = laik_alloc_1d(world, laik_Float, nRows);
      • laik_switchto_new(result, laik_All, LAIK_DF_None);
      • laik_switchto_flow(result, LAIK_DF_Init | LAIK_DF_ReduceOut | LAIK_DF_Sum);
      • laik_map_def1(result, (void**) &res, 0);
      • Laik_Slice* slc = laik_my_slice(p, sNo);
      • laik_switchto_flow(result, LAIK_DF_CopyIn);
      • Laik_Group* g2 = laik_new_shrinked_group(g, removeLen, removeList);
      • rep = laik_new_reassign_partitioner(g2, getEW, (void*)&matrix);
      • laik_migrate_and_repartition(part, g2, rep);
  • 11. MLEM – Short Introduction
      • The small animal PET scanner MADPET-II
      • 1152 detectors, 662976 lines of response
      • Field of view 140 x 140 x 40 voxels, 784000 voxels in total
  • 12. MLEM Algorithm
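For orientation, the standard MLEM update of Shepp and Vardi can be written as below. The slide content itself did not survive extraction, so the notation is an assumption chosen here: $f_j^{(k)}$ is the estimated activity in voxel $j$ after iteration $k$, $g_i$ is the number of events measured on line of response $i$, and $a_{ij}$ is the system matrix entry, i.e. the probability that an emission in voxel $j$ is detected on line of response $i$.

```latex
f_j^{(k+1)} \;=\; \frac{f_j^{(k)}}{\sum_{i} a_{ij}}
  \sum_{i} a_{ij}\, \frac{g_i}{\sum_{l} a_{il}\, f_l^{(k)}}
```

The inner sum over $l$ is the forward projection of the current estimate, and the outer sum over $i$ is the back projection of the measured-to-estimated ratio. Both run over rows of the sparse system matrix, which is why partitioning the matrix by rows is a natural way to distribute the work.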
  • 13. Steps Done for Porting MLEM to LAIK
      • Adaptation for matrix partitioning using LAIK
      • Improved the mapping algorithm of the sparse matrix to handle multiple independent slices
      • Created data containers for all working vectors
      • Added a loop to handle multiple slices
      • Added a wrapper for handling the repartitioning parameters
      • System: CooLMUC 2 - NeXtScale nx360M5, Xeon E5-2697v3 14C 2.6GHz, Infiniband FDR14
      • Test input: 12 GB sparse probability matrix, 10 iterations
      • Fault simulated by enforcing shrinking after the 6th iteration
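The "loop to handle multiple slices" can be sketched as follows. This is not the code of the MLEM port: the CSR layout, the type names `Csr` and `RowSlice`, and the functions `mlem_accumulate` and `mlem_update` are assumptions made for illustration. The sketch shows one task processing all row slices assigned to it and accumulating its share of the back projection.

```c
#include <stddef.h>

/* Hypothetical sparse matrix in CSR form (illustration only). */
typedef struct {
    size_t rows, cols;
    const size_t *row_ptr;  /* rows + 1 entries */
    const size_t *col_idx;  /* column of each stored value */
    const double *val;      /* stored values a_ij */
} Csr;

typedef struct { size_t from, to; } RowSlice;  /* half-open row range */

/* For every row i in the given slices, add
 * a_ij * g[i] / (sum_l a_il * f[l]) to corr[j]. */
void mlem_accumulate(const Csr *a, const RowSlice *slices,
                     size_t nslices, const double *f,
                     const double *g, double *corr)
{
    for (size_t s = 0; s < nslices; s++) {
        for (size_t i = slices[s].from; i < slices[s].to; i++) {
            double fwd = 0.0;  /* forward projection of row i */
            for (size_t k = a->row_ptr[i]; k < a->row_ptr[i + 1]; k++)
                fwd += a->val[k] * f[a->col_idx[k]];
            if (fwd <= 0.0)
                continue;  /* row contributes nothing */
            double ratio = g[i] / fwd;
            for (size_t k = a->row_ptr[i]; k < a->row_ptr[i + 1]; k++)
                corr[a->col_idx[k]] += a->val[k] * ratio;
        }
    }
}

/* Image update: f[j] *= corr[j] / norm[j], with norm[j] = sum_i a_ij. */
void mlem_update(size_t cols, const double *corr, const double *norm,
                 double *f)
{
    for (size_t j = 0; j < cols; j++)
        if (norm[j] > 0.0)
            f[j] *= corr[j] / norm[j];
}
```

In a distributed run, each task would call `mlem_accumulate` on its own slices and the partial `corr` vectors would be combined with a sum reduction before `mlem_update`. After a repartitioning, only the list of slices passed in changes.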
  • 14. Results (0) – Overview
  • 15. Results (1) – Overhead of LAIK
  • 16. Results (2) – Time for Repartitioning
  • 17. Results (3) – Two Repartitioning Algorithms
  • 18. Conclusion
      • LAIK: a library to increase elasticity in parallel applications
          o By adding partitioned index spaces as an abstraction
          o Repartitioning as central functionality
          o Automatic load balancing
          o Fault tolerant
          o Modularized and expandable
          o Increased elasticity in parallel codes
      • Porting MLEM & results
          o Limited effort required for porting the application
          o Low overhead of LAIK
          o LAIK scales at least as well as the original application
  • 19. Future Work
      Work in Progress
      • Porting further applications, e.g. LULESH
      • Further scalability research using >10000 cores on SuperMUC
      • Agent system
      • Shared memory backend
      • Further optimization to reduce communication effort
      Proposed
      • Solution to overcome MPI weaknesses
      • Local in-memory checkpointing
      • Non-regular data structures
      • Elastic index space size for hierarchical instantiations
  • 23. Infos
      • LAIK: https://github.com/envelope-project/laik
      • MLEM Project: https://github.com/envelope-project/mlem
      • Josef Weidendorfer: weidendo@in.tum.de
      • Dai Yang: d.yang@tum.de
      • Tilman Küstner: kuestner@in.tum.de
      • Carsten Trinitis: carsten.trinitis@tum.de