Distributed Interactive Engineering Toolbox




                                              David Loureiro - Eddy Caron

                                                                              SysFera
                                                     Ecole Normale Supérieure de Lyon
                                                      GRAAL/AVALON Research Team
Outline


   Context
   From DIET…
   … to SysFera-DS
   Conclusion




Why Large Scale systems?

 First need: supercomputing at a national or international scale
 Large size problems (grand challenge) need a collaboration
  between several codes/supercomputing centers
 Always a need for more computing power, memory capacity,
  and disk storage
 The power of any single resource is always small compared to
  the aggregation of several resources
 Network connectivity increased quickly!

•   Many available resources
     – Many clusters
     – Supercomputers
     – Millions of PCs and workstations connected
     – Sharing or renting resources
•   Increasing complexity of applications
     – Multi-scale
     – Multi-disciplinary
     – Huge data sets produced
     – Heterogeneity
                               From DIET to SysFera-DS                                3
Centralized or Decentralized ?
 Centralized!
 (De)Centralized!
 Decentralized!
 Centralized!
 Decentralized!

1946 ENIAC
•   18,000 tubes, 30 tons, 170 m²
•   2,000 tubes replaced every
    month by 6 technicians
1997 Google Cluster
2001 TeraGrid / 2003 Grid’5000
•   Grid Computing
    (Clusters of Clusters)
2002 Earth Simulator
•   Reached 40 Teraflops (40TF)
•   Homogeneous, Centralized, Expensive
2008 IBM Roadrunner
•   First computer to reach
    the Petaflops
Cloud Computing
•   Amazon
•   Google
•   Microsoft
•   …
Sky Computing


Research driven by applications

   Data-centric applications
     Very Large data management (in, out, temporary)
                                                                                                   >30 TB data/night
   Computer-centric applications
     GigaFlops
                                                                     Predicting Impacts of Massive Earthquakes (SDSC)

   Community-centric applications
     Data sharing (acquisition, results, ..)
     Resources
                                                                   Large Hadron Collider (LHC)

       Without an optimal scheduling?
                   I just need my simulation result
       Without minimizing resource consumption?
       Without any optimisation? …
                                                                Grid user point of view
                                                           Single sign-on
                                                           Single compute space
                                                           Single data space
                                                           Single development environment
Which framework ?
   Holy Grail: Transparency and simplicity (maybe even before performance) !
   Scheduling tunability

    Many incarnations of the Grid
         Grid computing
         Cluster computing
         Global computing
         Peer-to-peer systems
         Web Services
         Clouds, …
    Many programming models
         Shared-State models
         Message Passing models
         RPC and RMI models
         Hybrid models
         Peer-to-peer models
         Web Services models
         Coordination models, …
    Do not forget good ol’ time research on scheduling and distributed systems!
         Most scheduling problems are very difficult to solve even in their simplistic
          form …
         … but simple solutions often lead to better performance results in real life



Outline


   Context
   From DIET…
   … to SysFera-DS
   Conclusion




DIET’s Goals                                                           http://graal.ens-lyon.fr/DIET/

  Our goals
      To develop a toolbox for the deployment of environments using the Application Service
       Provider/Software as a Service (ASP/SaaS) paradigm with different applications
      Use as much as possible public domain and standard software
      To obtain a high performance and scalable environment
      Implement and validate our more theoretical results
            Scheduling for heterogeneous platforms, data (re)distribution and replication, performance
             evaluation, algorithms for heterogeneous and distributed platforms, …
  Based on CORBA and our own software developments
      FAST for performance evaluation,
      LogService for monitoring,
      VizDIET for the visualization,
      GoDIET for the deployment
      Dagda for the data management

  Several applications in different fields (simulation, bioinformatics, …)
   Release 2.8 available on the web since November
  ACI Grid ASP, RNTL GASP, ANR LEGO CIGC-05-11, ANR Gwendia, Celtic-plus
   Project SEED4C
RPC and Grid-Computing: Grid-RPC

  • One simple idea
     – Implementing the RPC programming model over the grid
     – Using resources accessible through the network
     – Mixed parallelism model (data-parallel model at the server level and task
        parallelism between the servers)
  • Features needed
     – Load-balancing (resource localization and performance
       evaluation, scheduling),
     – IDL,
     – Data and replica management,
     – Security,
     – Fault-tolerance,
     – Interoperability with other systems,
     – …
   Design of a standard interface
     – within the OGF (Grid-RPC and SAGA WG)
     – Existing implementations: NetSolve/GridSolve, Ninf, DIET, OmniRPC



RPC and Grid Computing: Grid-RPC


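 [Figure: the Client sends a Request to the AGENT(s), which answer "S2 !"; the Client then calls Op(C, A, B) on server S2, one of the servers S1, S2, S3, S4]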
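The request flow on this slide can be sketched in a few lines of Python. This is only an illustration of the two-step idea (ask the agent, then call the chosen server), not the DIET or GridRPC API: the `Server` and `Agent` classes, the `load` field, and the "least loaded server" rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Server:
    name: str
    load: float  # hypothetical load indicator, lower is better
    services: Dict[str, Callable[..., object]]


class Agent:
    """Hypothetical agent: knows the servers and answers 'which one?'."""

    def __init__(self, servers: List[Server]) -> None:
        self.servers = servers

    def find_server(self, service: str) -> Server:
        candidates = [s for s in self.servers if service in s.services]
        if not candidates:
            raise LookupError(f"no server offers {service!r}")
        # Assumed selection rule: pick the least loaded candidate.
        return min(candidates, key=lambda s: s.load)


def matmul(a, b):
    """Stand-in for Op(C, A, B): returns C = A x B for lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def call(agent: Agent, service: str, *args):
    server = agent.find_server(service)                   # step 1: request to the agent
    return server.name, server.services[service](*args)  # step 2: call on the server


agent = Agent([
    Server("S1", 0.9, {"Op": matmul}),
    Server("S2", 0.1, {"Op": matmul}),
    Server("S3", 0.5, {}),
    Server("S4", 0.7, {"Op": matmul}),
])
chosen, c = call(agent, "Op", [[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(chosen, c)
```

In a real deployment the second step is a remote call, and the selection rule is whatever scheduling policy the agents implement.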




Client and server interface
 Client side
    So easy …
    Multi-interface
     (C, C++, Fortran, Java, Python, Scilab, Web
     Services, etc.)
    Grid-RPC compliant
 Server side
    Install and submit new server to agent (LA)
    Problem and parameter description
    Client IDL transfer from server
    Dynamic services
      new service
      new version
      security update
      outdated service
      Etc.




Architecture overview


          [Figure: hierarchy of agents and servers]
          MA : Master Agent
          LA : Local Agent
          SeD : Server Daemon

Workflow Management
    Workflow representation
          Directed Acyclic Graph (DAG)
             Each vertex is a task
             Each directed edge represents
              communication between tasks
         Functional workflows
           Loops, if statements, automatic
            parallelism, fault-tolerance
    Goals
         Build and execute workflows
         Use different heuristics to solve scheduling
          problems
         Extensibility to address multi-workflows
          submission and large grid platform
         Manage heterogeneity and variability of
          environment
     ANR Gwendia
            Language definition (MOTEUR & MADAG)
            Comparison on Grid’5000 vs EGI

                          Idle time    Data transfer    Execution time
    EGI (gLite)           32.857 s     132.143 s        274.643 s
    Grid’5000 (DIET)      0.214 s      3.371 s          540.614 s
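As a rough illustration of the DAG representation above, the sketch below groups the tasks of a small workflow into "waves" that could run in parallel once their predecessors are done. It uses Python's standard `graphlib` module (Python 3.9+); the task names are invented for the example, and this is not how MADAG itself is implemented.

```python
from graphlib import TopologicalSorter

# Each key is a task (vertex); its value is the set of tasks it depends on
# (incoming edges). The task names are hypothetical.
workflow = {
    "preprocess": set(),
    "simulate_a": {"preprocess"},
    "simulate_b": {"preprocess"},
    "merge": {"simulate_a", "simulate_b"},
}


def execution_waves(dag):
    """Return lists of tasks; all tasks in one list can run concurrently."""
    sorter = TopologicalSorter(dag)
    sorter.prepare()  # raises graphlib.CycleError if the graph is not acyclic
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tasks whose predecessors are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(workflow)
print(waves)
```

A scheduling heuristic would then decide, within each wave, which server runs which task.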
DIET Scheduling: Plug-in Schedulers
  SeD level
     Performance estimation function
     Estimation Metric Vector - dynamic collection of performance estimation values
       Performance measures available through DIET
                FAST-NWS performance metrics
                Time elapsed since the last execution
                CoRI (Collector of Resource Information)
         Developer defined values

  Aggregation Methods
     Defining mechanism to sort SeD responses: associated with the service and
      defined at SeD level
     Tunable comparison/aggregation routines for scheduling
     Priority Scheduler
         Performs pairwise server estimation comparisons returning a sorted list of server
          responses;
         Can minimize or maximize based on SeD estimations and taking into consideration the
          order in which the request for those performance estimations was specified at SeD level.
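The priority scheduler described above can be approximated as a pairwise comparison over estimation vectors. The following is a minimal sketch under stated assumptions: the metric names (`free_cpu`, `secs_since_last_exec`), the dictionary layout, and the `make_comparator` helper are hypothetical and do not reproduce the DIET plug-in API.

```python
from functools import cmp_to_key


def make_comparator(priorities):
    """priorities: ordered list of (metric, "min" or "max") pairs."""
    def compare(a, b):
        for metric, goal in priorities:
            va, vb = a["estimation"][metric], b["estimation"][metric]
            if va == vb:
                continue  # tie on this metric: fall through to the next one
            a_is_better = va < vb if goal == "min" else va > vb
            return -1 if a_is_better else 1
        return 0
    return compare


# Hypothetical SeD responses, each carrying its estimation vector.
responses = [
    {"sed": "SeD-1", "estimation": {"free_cpu": 0.50, "secs_since_last_exec": 12.0}},
    {"sed": "SeD-2", "estimation": {"free_cpu": 0.80, "secs_since_last_exec": 3.0}},
    {"sed": "SeD-3", "estimation": {"free_cpu": 0.80, "secs_since_last_exec": 40.0}},
]

# Maximize free CPU first, then prefer the server that has been idle longest.
priorities = [("free_cpu", "max"), ("secs_since_last_exec", "max")]
ranked = sorted(responses, key=cmp_to_key(make_comparator(priorities)))
order = [r["sed"] for r in ranked]
print(order)
```

The order of the `priorities` list plays the role of the order in which the performance estimations were requested at SeD level.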




DIET Scheduling: Performance estimation

   Collector of Resource Information (CoRI)
       Interface to gather performance information
        Currently 2 modules available
                      CoRI Easy
                      FAST (Martin Quinson’s PhD)
                      Sigar, GPU, etc. to come…
        [Figure: the CoRI Manager sits on top of the CoRI-Easy collector, the FAST collector, and other collectors like Ganglia]
    Extension for parallel program
      • Code analysis / FAST calls combination
      • Allow the estimation of parallel
         regular routines (ScaLAPACK-like)

        [Figure: two surface plots, Measured and Estimated. Max. error: 14.7 %, avg. error: 3.8 %]
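The slide does not say how the maximum and average errors were computed. One common reading, assumed here, is the relative error between each measured value and its estimate. The numbers below are made up for illustration and are not the data behind the plots.

```python
def relative_errors(measured, estimated):
    """Relative error |estimated - measured| / measured for each pair."""
    return [abs(e - m) / m for m, e in zip(measured, estimated)]


# Illustrative values only (not the measurements from the slide).
measured = [10.0, 20.0, 25.0, 32.0]
estimated = [10.5, 19.0, 25.0, 33.6]

errors = relative_errors(measured, estimated)
max_error = max(errors)
avg_error = sum(errors) / len(errors)
print(f"max: {max_error:.1%}, avg: {avg_error:.1%}")
```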

Data Management
 Three approaches for DIET
    DTM (LIFC, Besançon)
           Hierarchical and distributed data manager
           Redistribution between servers
    JuxMem (Paris, Rennes)
           P2P data cache
    DAGDA (IN2P3, Clermont-Ferrand and LIP)
           Joining task scheduling and data management

            Data Arrangement for Grid and Distributed Applications
            Standardized through GridRPC OGF WG.
            Explicit data replication: using the API.
            Implicit data replication.
            Data replacement algorithms: LRU, LFU, and FIFO.
            Transfer optimization by selecting the most convenient source.
            Storage resources usage management.
            Data status backup/restoration.
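The three replacement policies named above differ only in which item is evicted when storage is full. The sketch below shows that difference on a tiny in-memory store; the `DataStore` class and its methods are hypothetical and are not the DAGDA API.

```python
from collections import OrderedDict


class DataStore:
    """Toy bounded store with a pluggable replacement policy."""

    def __init__(self, capacity, policy="LRU"):
        if policy not in ("LRU", "LFU", "FIFO"):
            raise ValueError(f"unknown policy: {policy}")
        self.capacity = capacity
        self.policy = policy
        self.items = OrderedDict()  # order: insertion (FIFO/LFU) or recency (LRU)
        self.uses = {}              # access counts, used by LFU

    def put(self, key, value):
        if key not in self.items and len(self.items) >= self.capacity:
            self._evict()
        self.items[key] = value
        self.uses.setdefault(key, 0)
        if self.policy == "LRU":
            self.items.move_to_end(key)  # writing counts as a recent use

    def get(self, key):
        value = self.items[key]
        self.uses[key] += 1
        if self.policy == "LRU":
            self.items.move_to_end(key)
        return value

    def _evict(self):
        if self.policy == "LFU":
            # Least frequently used; ties go to the oldest item.
            victim = min(self.items, key=self.uses.__getitem__)
        else:
            # LRU: least recently used is first; FIFO: oldest insert is first.
            victim = next(iter(self.items))
        del self.items[victim]
        del self.uses[victim]


def survivors(policy):
    store = DataStore(capacity=2, policy=policy)
    store.put("A", b"...")
    store.put("B", b"...")
    store.get("A")          # A is now both recently and frequently used
    store.put("C", b"...")  # forces one eviction
    return sorted(store.items)


for policy in ("LRU", "LFU", "FIFO"):
    print(policy, survivors(policy))
```

With this access pattern, LRU and LFU keep A and drop B, while FIFO drops A because it was stored first.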

Parallel and batch submissions

  Parallel & sequential jobs
     transparent for the user
     system dependent submission
  SeDBatch
     Many batch systems
     Batch schedulers behaviour
     Internal scheduling process
            Monitoring & Performance prediction
            Simulation (Simbatch)
  [Figure: MA, LA, SeD, and SeD// (with NFS); SeDBatch connected to the batch systems OAR, SLURM, PBS, LSF, OGE, and LoadLeveler]

  6/03/12                        From DIET to SysFera-DS
DIET Cloud

   Inside the Cloud
     DIET platform is virtualized
      inside the cloud.
      (as Xen image for example)
     Very flexible and scalable
      as DIET nodes can be launched
     Scheduling is more complex
   DIET as a Cloud manager
     Eucalyptus interface
     Eucalyptus is treated as a new Batch System
     Provide a new implementation for the BatchSystem abstract class
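Treating a cloud as "a new Batch System" amounts to implementing the same abstract interface with a different backend. The Python sketch below only illustrates that design: the method names (`submit`, `status`), the `FakeCloudBatchSystem` class, and the instance naming are assumptions, and no real Eucalyptus call is made.

```python
from abc import ABC, abstractmethod


class BatchSystem(ABC):
    """Hypothetical abstract interface shared by all batch backends."""

    @abstractmethod
    def submit(self, script: str) -> str:
        """Submit a job script and return a job identifier."""

    @abstractmethod
    def status(self, job_id: str) -> str:
        """Return the state of a previously submitted job."""


class FakeCloudBatchSystem(BatchSystem):
    """Stand-in for a cloud backend: one 'job' maps to one virtual machine."""

    def __init__(self) -> None:
        self._instances = {}

    def submit(self, script: str) -> str:
        # A real backend would start an instance and run the script on it.
        job_id = f"vm-{len(self._instances) + 1}"
        self._instances[job_id] = "RUNNING"
        return job_id

    def status(self, job_id: str) -> str:
        return self._instances.get(job_id, "UNKNOWN")


backend: BatchSystem = FakeCloudBatchSystem()
job = backend.submit("#!/bin/sh\n./solver input.dat\n")
print(job, backend.status(job))
```

Code written against `BatchSystem` does not need to know whether the job lands on a batch queue or on a cloud instance.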




Grid’5000
        Building a nationwide experimental platform for
               Grid & P2P research (like a particle accelerator for computer scientists)
        9 geographically distributed sites hosting clusters (with 256 CPUs to 1K CPUs)
             All sites are connected by RENATER (French Res. and Edu. Net.)
             Design and develop a system/middleware environment to safely test and repeat
              experiments
        Use the platform for Grid experiments in real life conditions
        4 main features:
             High security for Grid’5000 and the Internet, despite the deep reconfiguration feature
             Single sign-on
             High-performance LRMS: OAR
             A user toolkit to reconfigure the nodes and monitor experiments: Kadeploy


                      DIET deployment over a maximum of processors
                      1 MA, 8 LA, 540 SeDs
                      1120 clients on 140 machines
                      DGEMM requests (2000x2000 matrices)
                      Simple round-robin scheduling
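Simple round-robin scheduling, as used in this experiment, hands each incoming request to the next server in a fixed cycle, regardless of load. A minimal sketch, with made-up server and request names:

```python
from itertools import cycle


def round_robin(servers, requests):
    """Assign each request to the next server in turn."""
    turn = cycle(servers)
    return {request: next(turn) for request in requests}


seds = ["SeD-1", "SeD-2", "SeD-3"]                  # hypothetical server names
plan = round_robin(seds, [f"dgemm-{i}" for i in range(5)])
for request, sed in plan.items():
    print(request, "->", sed)
```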


Applications: 4 of them
     Cosmology Application
          • Dark Matter Halos
          • Large Scale experiment on Grid’5K

     Climatology Application
          • Forecasting of the world's environment and climate on regional to global scales
          • Plug-in Scheduler

     Robotic Application
          • Experiment between Italy and France

     Bioinformatics Application
          • BLAST
          • 40000 requests over 5 databases of different sizes (from 1 to 5 GB)
          • Data management optimized
          [Figure: Parameters and Request go through the DIET API to the DIET middleware, which makes the external application call and returns Results; the BLAST service declaration provides a Metrics vector to the Plugin-scheduler]
Conclusions

 Grid-RPC
   Interesting approach for several applications
   Simple, flexible, and efficient
   Many interesting research issues (scheduling, data management, resource
    discovery and reservation, deployment, fault-tolerance, …)
 DIET
   Scalable, open-source, and multi-application platform
   Concentration on several issues like resource discovery, scheduling (distributed
    scheduling and plugin schedulers), deployment (GoDIET and
    GRUDU), performance evaluation (CoRI), monitoring (LogService and
    VizDIET), data management and replication (DTM, JuxMem, and DAGDA)
   Large scale validation on the Grid’5000 platform
   A middleware designed and tunable for different applications


                                                                   http://www.grid5000.org/


Results

   A complete Middleware for heterogeneous infrastructure
     DIET is light to use and non-intrusive
     Dedicated to many applications
     Designed for Grid and Cloud
     Efficient even in comparison to commercial tools
      DIET is a highly tunable middleware
     Used in production


   The DIET Team

    SysFera Company (14 people today)
     http://www.sysfera.com




Future Prospects

   Do we need application specific schedulers ?
     Scheduling based on Economic Model for Cloud Platform
     DIET Green (Collaboration with RESO)


    Increase the DIET capacity to deal with heterogeneous
     resources
      Single System Image Cluster OS
      Box Cluster
      Virtual Machines
      GPU architecture
      Multi-core
      Large scale architecture
      …

    [Figure: an MA above several LAs, with four kinds of SED below.
     SED Batch: batch script generator, submission to batch scheduler (PBS, OAR, LoadLeveler, ...).
     SED Cloud: cloud script generator, deploy the image, new services are registered (cloud platform: Eucalyptus, EC2, ...).
     SED Kerrighed: Kerrighed script generator, deploy the image, new services are registered (SMP Virtual).
     Plain SED.]


Outline


   Context
   From DIET…
   … to SysFera-DS
   Conclusion




Who are we?
 • 2001: Research project from the Graal team
   (Inria/ENS)
    – DIET: grid middleware
 • 2007: SysFera-DS used within the Décrypthon project
    – Used in production
    – Selected by IBM to replace Univa-UD
 • 2010: Creation of SysFera, INRIA spin-off
 • 2012: A team of 14 (R&D: 4 engineers and 5 PhDs)
    – Supported by two experts from INRIA and ENS
    – SysFera-DS
Décrypthon
HPC management & mutualization

  Before SysFera-DS:
  • Local usage of resources
  • No unique submission interface
  • 5 sites, 2 different batch schedulers

  [Figure: map of the 5 sites. BORDEAUX, LILLE, ORSAY, and JUSSIEU run LoadLeveler; LYON runs OAR + storage]
Décrypthon
HPC management & mutualization
 With SysFera-DS:
 • Resources mutualization
 • Web interface for submission
 • Application specific scheduling
 • Data management
 • Hardware failures hidden from the users (automatic re-submission)

 [Figure: the same 5 sites (BORDEAUX, LILLE, ORSAY, JUSSIEU with LoadLeveler; LYON with OAR + storage), now reached through a single submission website]
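Hiding hardware failures through automatic re-submission can be pictured as a retry loop that moves the job to another site when one attempt fails. This is a sketch of the general idea only: the `run_with_resubmission` helper, the use of `RuntimeError` to signal a failure, and the site rotation rule are assumptions, not SysFera-DS behaviour.

```python
def run_with_resubmission(job, sites, max_attempts=3):
    """Try the job on successive sites until one attempt succeeds."""
    failures = []
    for attempt in range(max_attempts):
        site = sites[attempt % len(sites)]  # assumed rule: rotate over the sites
        try:
            result = job(site)
        except RuntimeError as error:       # assumed failure signal
            failures.append((site, str(error)))
            continue
        return result, failures
    raise RuntimeError(f"job failed after {max_attempts} attempts: {failures}")


def flaky_job(site):
    """Hypothetical job that fails on one site and succeeds elsewhere."""
    if site == "LYON":
        raise RuntimeError("node failure")
    return f"done on {site}"


result, failures = run_with_resubmission(flaky_job, ["LYON", "LILLE"])
print(result, failures)
```

The user only sees `result`; the list of failed attempts stays available for administrators.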
Helping cure muscular dystrophy
« The Décrypthon Steering Committee chose
SysFera-DS starting in June 2007 for its qualities
of robustness and modularity. It has been
progressively implemented on the Décrypthon
grid's resources while ensuring a completely
transparent and smooth transition for the
users. »
                  Thierry Toursel
                  Research Project Manager, AFM
EDF - Distributed platforms are complex
EDF - The solution
Working with a leading international company
« Thanks to SysFera-DS, we can now provide our
R&D engineers a stable, reliable and
performant solution to access our
supercomputers and computing clusters. »
                              David Bateman
                    ICCOS Group Manager, EDF
SysFera-DS does it all
  • Simple access to complex infrastructures
  • Advanced administration features
      – User management and access control
      – Monitoring and reporting
  •   Consistent platform for application development
  •   Integration to existing environments
  •   Compatibility with many different resources
  •   Non-intrusive, non-exclusive
  •   Flexible, stable, reliable, performant
Key benefits
  • Heterogeneous applications management
  • Big Data
  • Efficient management
  • Workflow & dataflow management & design
  • Collaborative Webboard
  • Hybrid Cloud
Offers
•   A software to optimize your computations
•   A licence to plug inside your software
•   Your applications migration
•   A webboard to manage your applications & infrastructures
•   Skilled competences to support these tools
•   Skilled competences to develop dedicated plugins




[Figure: "Our Software" sits between "Your applications" and "Your infrastructure"; it can also pool resources (CIMENT, CLOUD, …) for "Your Applications"]
Offers
  • Webboard: « To manage your applications »
  • Vishnu: « A set of dedicated plugins – infrastructure management »
  • DIET: « to optimize your computations & integrate your infrastructures »
Features overview
  • Meta-scheduling (load balancing), workflows
    management, jobs management, data management
  • Resources and communications management
  • Launch and monitoring of jobs, file transfers, hardware and
    software infrastructure through a scientific portal
  •   User management with single sign-on
  •   Cross network domain
  •   Advanced and fine-grained data management
  •   Automatic management of dynamic resources
  •   Maintenance management
  •   Easy deployment
  •   Usable in user space: no need to be root
  •   Cloud management
The WebBoard (Before SysFera)
User and admin interface   One app - one page




User rights management
                                    Statistics
SysFera-DS WebBoard
Outline


  •   Context
  •   From DIET…
  •   … to SysFera-DS
  •   Conclusion




05.04.12                                      ANR-SOP




An open source solution
The core of SysFera-DS is open-source software...

...which means anyone can use it, share it, and
contribute to it.




[Figure: DIET Open Source and SysFera-DS, with the contributors LIP; MIS, CNRS, ENSI, ENSHEEIT, LIFC, IRISA, …; and SysFera]
Conclusion
 • An open source solution with two different kinds of
   collaborative support

 DIET
 LIP - Avalon Team

 - Proof of concept
 - Simulations
 - New features
 - Grid’5000 experiments
 - Scientific expertise
 - etc.
                                SysFera-DS
                                SysFera
                                - Application support with industrial quality
                                - Platfom development
                                - New features
                                - Personnal features
                                - Research Grid to Production Grid
                                - Hotline
Acknowledgment
    Abdelkader Amar                  Florent Rochette               Nicolas Bard
    Adrian Muresan                   Frédéric Desprez               Ousmane Thiare
    Alan Su                          Frédéric Lombard               Peter Frauenkron
    Amine Bsila                      Frédéric Suter                 Philippe Combes
    Andréea Chis                     Gaël Le Mahec                  Philippe Martinez
    Antoine Vernois                  Georg Hoesch                   Philippe Vicens
    Barbara Walter                   Ghislain Charrier              Phuspinder Kaur Chouhan
    Benjamin Depardon                Haïkel Guemar                  Raphaël Bolze
    Benjamin Isnard                  Ibrahima Cissé                 Romain Lacroix
    Bert Van Heukelom                Jean-Marc Nicod                Stéphane Vialle
    Bruno DelFabro                   Jonathan Rouzaud-Cornabas      Sylvain Dahan
    Christophe Pera                  Kevin Coulomb                  Vincent Pichon
    Cyril Pontvieux                  Laurent Philippe               Yves Caniou
    Cédric Tedeschi                  Ludovic Bertsch
    Damien Reimert-Vasconcellos      Luis Rodero-Merino
    Daouda Traore                    Marc Boury
    David Loureiro                   Martin Quinson
    Eric Boix                        Mathias Colin
    Eugene Pamba Capochichi          Mathieu Jan
    Emmanuel Quémener                Maurice Djibril Faye



http://graal.ens-lyon.fr/DIET

                                    http://www.sysfera.com
                                     http://blog.sysfera.com




David Loureiro (SysFera CEO):
- david.loureiro@sysfera.com
- @DavidLoureiroFr
- www.sysfera.com

David Loureiro - Presentation at HP's HPC & OSL TES

  • 1. Distributed Interactive Engineering Toolbox David Loureiro - Eddy Caron SysFera Ecole Normale Supérieure de Lyon GRAAL/AVALON Research Team
  • 2. Outline  Context  From DIET…  … to SysFera-DS  Conclusion 2
  • 3. Why Large Scale systems?  First need: supercomputing at a national or international scale  Large size problems (grand challenge) need a collaboration between several codes/supercomputing centers  Always a need for more computing power, memory capacity, and disk storage  The power of any single resource is always small compared to the aggregation of several resources  Network connectivity increased quickly! • Many available resources • Increasing complexity of applications – Many clusters – Multi-scale – Supercomputers – Multi-disciplinary – Millions of PC and – Huge data set produced workstations connected – Heterogeneity – Sharing or renting resources From DIET to SysFera-DS 3
  • 4. Centralized or Decentralized ? 2001 TeraGrid / 2003 Grid’5000  Centralized! 1997 Google Cluster • Grid Computing (Clusters of Clusters)  (De)Centralized!  Decentralized!  Centralized!  Decentralized! Sky Computing 2002 Earth Simulator • First computer to reach the Teraflops (40TF) • Homogeneous, Centralized, Expensive 1946 ENIAC • 18.000 tubes, 30 tons, 170 m² • 2.000 tubes replaced every months by 6 technicians Cloud Computing • Amazon • Google • Microsoft 2008 IBM Roadrunner • … • First computer to reach the Petaflops From DIET to SysFera-DS 4
  • 5. Research driven by applications  Data-centric applications  Very Large data management (in, out, temporary) >30 TB data/night  Computer-centric applications  GigaFlops Predicting Impacts of Massive Earthquakes (SDSC)  Community-centric applications  Data sharing (acquisition, results, ..)  Resources Large Hadron Collider (LHC) Without an optimal scheduling? I just need my simulation result Without minimizing ressources consumption? Without any optimisation? … Grid user point of view  Single sign-on  Single compute space  Single data space  Single development environment From DIET to SysFera-DS 5
  • 6. Which framework ?  Holy Grail: Transparency and simplicity (maybe even before performance) !  Scheduling tunability  Many incarnations of the Grid  Grid computing  Cluster computing  peer-to-peer systems,  Global computing  Web Services,  Clouds, …  Many programming models  Shared-State Models  Message Passing Models,  Hybrids models  RPC and RMI models  Peer-to-peer models  Web Services models  Coordination models, …  Do not forget good ol’ time research on scheduling and distributed systems !  Most scheduling problems are very difficult to solve even in their simplistic form …  … but simple solutions often lead to better performance results in real life From DIET to SysFera-DS 6
  • 7. Outline  Context  From DIET…  … to SysFera-DS  Conclusion 7
  • 8. DIET’s Goals http://graal.ens-lyon.fr/DIET/  Our goals  To develop a toolbox for the deployment of environments using the Application Service Provider/Software as a Service (ASP/SaaS) paradigm with different applications  Use as much as possible public domain and standard software  To obtain a high performance and scalable environment  Implement and validate our more theoretical results  Scheduling for heterogeneous platforms, data (re)distribution and replication, performance evaluation, algorithmic for heterogeneous and distributed platforms, …  Based on CORBA and our own software developments  FAST for performance evaluation,  LogService for monitoring,  VizDIET for the visualization,  GoDIET for the deployment  Dagda for the data management  Several applications in different fields (simulation, bioinformatics, …)  Release 2.8 available on the web since november  ACI Grid ASP, RNTL GASP, ANR LEGO CIGC-05-11, ANR Gwendia, Celtic-plus Project SEED4C From DIET to SysFera-DS 8
  • 9. RPC and Grid-Computing: Grid-RPC • One simple idea – Implementing the RPC programming model over the grid – Using resources accessible through the network – Mixed parallelism model (data-parallel model at the server level and task parallelism between the servers) • Features needed – Load-balancing (resource localization and performance evaluation, scheduling), – IDL, – Data and replica management, – Security, – Fault-tolerance, – Interoperability with other systems, – …  Design of a standard interface – within the OGF (Grid-RPC and SAGA WG) – Existing implementations: NetSolve/GridSolve, Ninf, DIET, OmniRPC From DIET to SysFera-DS 9
  • 10. RPC and Grid Computing: Grid-RPC Request AGENT(s) Client S2 ! Op(C, A, B) S3 S4 S1 S2 From DIET to SysFera-DS 10
  • 11. Client and server interface  Client side  So easy …  Multi-interface (C, C++, Fortran, Java, Python, Scilab, Web Services, etc.)  Grid-RPC compliant  Server side  Install and submit new server to agent (LA)  Problem and parameter description  Client IDL transfer from server  Dynamic services  new service  new version  security update  outdated service  Etc. From DIET to SysFera-DS 11
  • 12. Architecture overview ( )* +,$ " ' &$ ( )* "+,$ ' &$ ( )* "+,$ ' &$ ' &$ %&$ %&$ ! "# $ ! "# $ ! "# $ ! "# $ MA : Master Agent ! "# $ LA : Local Agent ! "# $ SeD : ServerDeamon From DIET to SysFera-DS 12
  • 13. Workflow Management  Workflow representation  Direct Acyclic Graph (DAG) Each vertex is a task   Each directed edge represents communication between tasks  Functional workflows  Loops, if statements, automatic parallelism, fault-tolerance  Goals !  Build and execute workflows  Use different heuristics to solve scheduling problems  Extensibility to address multi-workflows submission and large grid platform  Manage heterogeneity and variability of environment  ANR Gwendia time Idle Data transfert Execution time  Language definition (MOTEUR & MADAG) EGI (Glite) Comparison on Grid’5000 vs EGI 132.143 s  32.857s 274.643 s Grid’5000 (DIET) 0.214s Contribution to the management of large 540.614 s 3.371 s scale platforms: the DIET experience 13
  • 14. DIET Scheduling: Plug-in Schedulers  SeD level  Performance estimation function  Estimation Metric Vector - dynamic collection of performance estimation values  Performance measures available through DIET  FAST-NWS performance metrics  Time elapsed since the last execution  CoRI (Collector of Resource Information)  Developer defined values  Aggregation Methods  Defining mechanism to sort SeD responses: associated with the service and defined at SeD level  Tunable comparison/aggregation routines for scheduling  Priority Scheduler  Performs pairwise server estimation comparisons returning a sorted list of server responses;  Can minimize or maximize based on SeD estimations and taking into consideration the order in which the request for those performance estimations was specified at SeD level. From DIET to SysFera-DS 14
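The priority scheduler described above can be sketched as follows: each server returns an estimation vector, and the responses are sorted by a list of (metric, minimize or maximize) criteria taken in declaration order. The metric names and values are invented for illustration, and a key-based sort is used here in place of explicit pairwise comparisons; it is not the DIET plug-in API.

```python
from typing import Dict, List, Tuple

# Hypothetical estimation vectors returned by three servers (SeDs).
responses: Dict[str, Dict[str, float]] = {
    "sed-a": {"free_cpu": 0.50, "time_since_last_exec": 12.0},
    "sed-b": {"free_cpu": 0.90, "time_since_last_exec": 3.0},
    "sed-c": {"free_cpu": 0.90, "time_since_last_exec": 40.0},
}

# Criteria in priority order: the first one decides, later ones break ties.
priorities: List[Tuple[str, str]] = [
    ("free_cpu", "max"),
    ("time_since_last_exec", "max"),
]


def rank_servers(
    estimates: Dict[str, Dict[str, float]],
    criteria: List[Tuple[str, str]],
) -> List[str]:
    """Sort server names from best to worst according to the criteria."""

    def key(name: str) -> Tuple[float, ...]:
        vector = estimates[name]
        # Negate maximized metrics so that an ascending sort puts the
        # best server first for every criterion.
        return tuple(
            -vector[tag] if direction == "max" else vector[tag]
            for tag, direction in criteria
        )

    return sorted(estimates, key=key)


ranked = rank_servers(responses, priorities)
```

Here `sed-b` and `sed-c` tie on free CPU, so the second criterion (longest time since last execution) places `sed-c` first.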
  • 15. DIET Scheduling: Performance estimation  Collector of Resource Information (CoRI)  Interface to gather performance information  Currently 2 modules available  CoRI Easy  FAST (Martin Quinson’s PhD)  Sigar, GPU, etc. to come… [Figure: CoRI Manager above the CoRI-Easy collector, the FAST collector, and other collectors like Ganglia]  Extension for parallel program • Code analysis / FAST calls combination • Allow the estimation of parallel regular routines (ScaLAPACK-like) [Figure: measured vs. estimated values; max. error: 14.7 %, avg. error: 3.8 %] From DIET to SysFera-DS 15
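The maximum and average error figures quoted on this slide are the kind of summary obtained by comparing estimated values against measured ones. A minimal sketch, assuming relative error with respect to the measured value; the sample numbers are made up and are not the data behind the slide.

```python
from typing import List, Tuple


def error_summary(measured: List[float], estimated: List[float]) -> Tuple[float, float]:
    """Return (max, average) relative error in percent."""
    if len(measured) != len(estimated) or not measured:
        raise ValueError("need two non-empty series of equal length")
    errors = [100 * abs(e - m) / m for m, e in zip(measured, estimated)]
    return max(errors), sum(errors) / len(errors)


# Invented sample data, for illustration only.
measured = [10.0, 20.0, 30.0, 40.0]
estimated = [11.0, 19.0, 30.0, 42.0]
max_err, avg_err = error_summary(measured, estimated)
```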
  • 16. Data Management  Three approaches for DIET  DTM (LIFC, Besançon)  Hierarchical and distributed data manager  Redistribution between servers  JuxMem (Paris, Rennes)  P2P data cache  DAGDA (IN2P3, Clermont-Ferrand and LIP)  Joining task scheduling and data management  Standardized through GridRPC OGF WG. • Data Arrangement for Grid and Distributed Applications  Explicit data replication: Using the API.  Implicit data replication.  Data replacement algorithm: LRU, LFU, and FIFO  Transfer optimization by selecting the most convenient source.  Storage resources usage management.  Data status backup/restoration. From DIET to SysFera-DS 16
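Of the replacement algorithms listed above, LRU (least recently used) is the easiest to show in a few lines. The `LRUStore` class below is a hypothetical illustration of the policy on a fixed-capacity store; it is not DAGDA code.

```python
from collections import OrderedDict
from typing import Optional


class LRUStore:
    """Fixed-capacity store that evicts the least recently used item."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.data: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self.data:
            return None
        self.data.move_to_end(key)  # reading counts as a use
        return self.data[key]

    def put(self, key: str, value: bytes) -> Optional[str]:
        """Store a value; return the evicted key, if any."""
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            evicted, _ = self.data.popitem(last=False)  # oldest entry
            return evicted
        return None


store = LRUStore(capacity=2)
store.put("A", b"matrix A")
store.put("B", b"matrix B")
store.get("A")                          # "A" is now more recent than "B"
evicted = store.put("C", b"matrix C")   # the store is full: "B" is evicted
```

LFU would evict the item with the fewest accesses instead, and FIFO the item inserted first regardless of later reads.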
  • 17. Parallel and batch submissions  Parallel & sequential jobs  transparent for the user  system dependent submission  SeDBatch  Many batch systems  Batch schedulers behaviour  Internal scheduling process  Monitoring & Performance prediction  Simulation (Simbatch) [Figure labels: MA, LA, SeD//, SeD, SeDBatch, NFS; batch systems: OAR, SLURM, PBS, LSF, OGE, LoadLeveler] 6/03/12 From DIET to SysFera-DS
  • 18. DIET Cloud  Inside the Cloud  DIET platform is virtualized inside the cloud (as a Xen image, for example)  Very flexible and scalable as DIET nodes can be launched  Scheduling is more complex  DIET as a Cloud manager  Eucalyptus interface  Eucalyptus is treated as a new Batch System  Provide a new implementation for the BatchSystem abstract class From DIET to SysFera-DS 18
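The design idea on this slide, treating a cloud as one more batch system by adding a new implementation of a common abstract class, can be sketched as below. This is a Python illustration only: the method name `submit`, the two fake backends, and the job-id formats are assumptions, not the actual DIET `BatchSystem` interface.

```python
from abc import ABC, abstractmethod
from itertools import count
from typing import List


class BatchSystem(ABC):
    """Hypothetical stand-in for the BatchSystem abstraction."""

    @abstractmethod
    def submit(self, script: str) -> str:
        """Submit a job script and return a backend-specific job id."""


class FakeClusterBatch(BatchSystem):
    """Pretend cluster scheduler: hands out sequential job ids."""

    def __init__(self) -> None:
        self._ids = count(1)

    def submit(self, script: str) -> str:
        return f"cluster-{next(self._ids)}"


class FakeCloudBatch(BatchSystem):
    """Pretend cloud backend: 'submitting' first starts an instance."""

    def __init__(self) -> None:
        self._ids = count(1)
        self.instances: List[str] = []

    def submit(self, script: str) -> str:
        instance = f"vm-{next(self._ids)}"
        self.instances.append(instance)  # record the instance we "booted"
        return f"cloud-{instance}"


def run_everywhere(backends: List[BatchSystem], script: str) -> List[str]:
    """The caller only sees the abstract interface, never the backend."""
    return [backend.submit(script) for backend in backends]


job_ids = run_everywhere([FakeClusterBatch(), FakeCloudBatch()], "job.sh")
```

Because the caller depends only on the abstract class, supporting a new kind of resource means writing one more subclass and leaving the submission code untouched.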
  • 19. Grid’5000  Building a nationwide experimental platform for Grid & P2P research (like a particle accelerator for computer scientists)  9 geographically distributed sites hosting clusters (with 256 CPUs to 1K CPUs)  All sites are connected by RENATER (French Res. and Edu. Net.)  Design and develop a system/middleware environment to safely test and repeat experiments  Use the platform for Grid experiments in real life conditions  4 main features:  A high security for Grid’5000 and the Internet, despite the deep reconfiguration feature  Single sign-on  High-performance LRMS: OAR  A user toolkit to reconfigure the nodes and monitor experiments: Kadeploy  DIET deployment over a maximum of processors  1 MA, 8 LA, 540 SeDs  1120 clients on 140 machines  DGEMM requests (2000x2000 matrices)  Simple round-robin scheduling From DIET to SysFera-DS 19
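The "simple round-robin scheduling" used in this experiment hands requests to servers in a fixed rotation, ignoring load. A minimal sketch with invented, much smaller numbers than the deployment above:

```python
from collections import Counter
from itertools import cycle
from typing import Dict, List


def round_robin(requests: List[str], servers: List[str]) -> Dict[str, str]:
    """Assign each request to the next server in a fixed rotation."""
    if not servers:
        raise ValueError("at least one server is required")
    rotation = cycle(servers)
    return {request: next(rotation) for request in requests}


# Illustrative sizes only: 10 requests over 4 servers.
servers = [f"sed-{i}" for i in range(4)]
requests = [f"req-{i}" for i in range(10)]
assignment = round_robin(requests, servers)
per_server = Counter(assignment.values())
```

With identical requests, as with fixed-size DGEMM calls, the rotation keeps the per-server counts within one of each other without needing any performance estimate.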
  • 20. Applications: 4 of them Cosmology Application • Dark Matter Halos • Large Scale experiment on Grid’5K Climatology Application • Forecasting of the world's environment and climate on regional to global scales • Plug-in Scheduler Robotic Application Bioinformatics Application [Figure labels: external application call, DIET API, DIET middleware, parameters, request, metrics vector, results, BLAST service, plug-in scheduler declaration] • BLAST • 40000 requests over 5 databases of different sizes (from 1 to 5 GB) • Experiment between Italy and France • Data management optimized From DIET to SysFera-DS 20
  • 21. Conclusions  Grid-RPC  Interesting approach for several applications  Simple, flexible, and efficient  Many interesting research issues (scheduling, data management, resource discovery and reservation, deployment, fault-tolerance, …)  DIET  Scalable, open-source, and multi-application platform  Concentration on several issues like resource discovery, scheduling (distributed scheduling and plugin schedulers), deployment (GoDIET and GRUDU), performance evaluation (CoRI), monitoring (LogService and VizDIET), data management and replication (DTM, JuxMem, and DAGDA)  Large scale validation on the Grid’5000 platform  A middleware designed and tunable for different applications http://www.grid5000.org/ From DIET to SysFera-DS 21
  • 22. Results  A complete Middleware for heterogeneous infrastructure  DIET is light to use and non-intrusive  Dedicated to many applications  Designed for Grid and Cloud  Efficient even in comparison to commercial tools  DIET is a highly tunable middleware  Used in production  The DIET Team  SysFera Company (14 persons today)  http://www.sysfera.com From DIET to SysFera-DS 22
  • 23. Future Prospects  Do we need application specific schedulers?  Scheduling based on Economic Model for Cloud Platform  DIET Green (Collaboration with RESO)  Increase the DIET capacity to deal with heterogeneous resources  Single System Image Cluster OS  Box Cluster  Virtual Machines  GPU architecture  Multi-core  Large scale architecture  … [Figure labels: MA; LA; SeD Kerrighed (Kerrighed script generator, deploy the image, new services are registered); SeD Batch (batch script generator, submission to batch scheduler: PBS, OAR, LoadLeveler, ...); SeD Cloud (cloud script generator, deploy the image, new services are registered; cloud platform: Eucalyptus, EC2, ...); SMP; Virtual] From DIET to SysFera-DS 23
  • 24. Outline  Context  From DIET…  … to SysFera-DS  Conclusion 24
  • 25. Who are we? • 2001: Research project from the Graal team (Inria/ENS) – DIET: grid middleware • 2007: SysFera-DS used within the Décrypthon project – Used in production – Selected by IBM to replace Univa-UD • 2010: Creation of SysFera, INRIA spin-off • 2012: A team of 14 (R&D: 4 engineers and 5 PhD) – Supported by two experts from INRIA and ENS – SysFera-DS
  • 26. Décrypthon HPC management & mutualization Before SysFera-DS: • Local usage of resources • No unique submission interface • 5 sites, 2 different batch schedulers [Map: Bordeaux, Lille, Jussieu, and Orsay (LoadLeveler); Lyon (OAR + storage)]
  • 27. Décrypthon HPC management & mutualization With SysFera-DS: • Resource mutualization (pooling) • Web interface for submission • Application specific scheduling • Data management • Hardware failures hidden from the users (automatic re-submission) [Map: submission website in front of Bordeaux, Lille, Jussieu, and Orsay (LoadLeveler); Lyon (OAR + storage)]
  • 28. Helping cure muscular dystrophy « The Décrypthon Steering Committee chose SysFera-DS starting in June 2007 for its qualities of robustness and modularity. It has been progressively implemented on the Décrypthon grid's resources while ensuring a completely transparent and smooth transition for the users. » Thierry Toursel Research Project Manager, AFM
  • 29. EDF - Distributed platforms are complex
  • 30. EDF - The solution
  • 31. Working with a leading international company Thanks to SysFera-DS, we can now provide our R&D engineers a stable, reliable and performant solution to access our supercomputers and computing clusters. David Bateman ICCOS Group Manager, EDF
  • 32. SysFera-DS does it all • Simple access to complex infrastructures • Advanced administration features – User management and access control – Monitoring and reporting • Consistent platform for application development • Integration to existing environments • Compatibility with many different resources • Non-intrusive, non-exclusive • Flexible, stable, reliable, performant
  • 33. Key benefits Heterogeneous applications management Big Data Efficient Management Workflow & dataflow management & design Collaborative Webboard Hybrid Cloud
  • 34. Offers • Software to optimize your computations • A licence to embed it inside your software • Migration of your applications • A webboard to manage your applications & infrastructures • Expertise to support these tools • Expertise to develop dedicated plugins [Figure labels: your applications, our software, your infrastructure, pool resources, CIMENT, cloud, …]
  • 35. Offers [Figure: layers beneath your applications] Webboard « To manage your applications » Vishnu « A set of dedicated plugins – infrastructure management » DIET « To optimize your computations & integrate your infrastructures »
  • 36. Features overview • Meta-scheduling (load balancing), workflows management, jobs management, data management • Resources and communications management • Launch and monitoring of jobs, file transfers, hardware and software infrastructure through a scientific portal • User management with single sign-on • Cross network domain • Advanced and fine-grained data management • Automatic management of dynamic resources • Maintenance management • Easy deployment • Usable in user space: no need to be root • Cloud management
  • 37. The WebBoard (Before SysFera) User and admin interface One app - one page User rights management Statistics
  • 39. Outline • Context • From DIET… • … to SysFera-DS • Conclusion 39
  • 40. 05.04.12 ANR-SOP An open source solution The core of SysFera-DS is open-source software... ...which means anyone can use it, share it, and contribute to it. 40
  • 41. LIP SysFera MIS, CNRS, ENSI, ENSHEEIT, LIFC, IRISA,… DIET Open Source SysFera-DS
  • 42. Conclusion • An open source solution with two different kinds of collaborative support DIET LIP - Avalon Team - Proof of concept - Simulations - New features - Grid’5000 experiments - Scientific expertise - etc. SysFera-DS SysFera - Application support with industrial quality - Platform development - New features - Personalized features - Research Grid to Production Grid - Hotline
  • 43. Acknowledgment  Abdelkader Amar  Florent Rochette  Nicolas Bard  Adrian Muresan  Frédéric Desprez  Ousmane Thiare  Alan Su  Frédéric Lombard  Peter Frauenkron  Amine Bsila  Frédéric Suter  Philippe Combes  Andréea Chis  Gaël Le Mahec  Philippe Martinez  Antoine Vernois  Georg Hoesch  Philippe Vicens  Barbara Walter  Ghislain Charrier  Phuspinder Kaur Chouhan  Benjamin Depardon  Haïkel Guemar  Raphaël Bolze  Benjamin Isnard  Ibrahima Cissé  Romain Lacroix  Bert Van Heukelom  Jean-Marc Nicod  Stéphane Vialle  Bruno DelFabro  Jonathan Rouzaud-Cornabas  Sylvain Dahan  Christophe Pera  Kevin Coulomb  Vincent Pichon  Cyril Pontvieux  Laurent Philippe  Yves Caniou  Cédric Tedeschi  Ludovic Bertsch  Damien Reimert-Vasconcellos  Luis Rodero-Merino  Daouda Traore  Marc Boury  David Loureiro  Martin Quinson  Eric Boix  Mathias Colin  Eugene Pamba Capochichi  Mathieu Jan  Emmanuel Quémener  Maurice Djibril Faye 43
  • 44. http://graal.ens-lyon.fr/DIET http://www.sysfera.com http://blog.sysfera.com David Loureiro (SysFera CEO): - david.loureiro@sysfera.com - @DavidLoureiroFr - www.sysfera.com