Developing Cyberinfrastructure to Support Computational Chemistry Workflows
Marlon Pierce (IU), Suresh Marru (IU), Sudhakar Pamidighantam (NCSA), Sashikiran Challa (IU), Ye Fan (NCSA/IU), Patanachai Tangchaisin (IU)
Part 1: Reusable Middleware for OREChem
Services and workflows for OREChem
Microsoft Research’s ORECHEM Project
“A collaboration between chemistry scholars and information scientists to develop and deploy the infrastructure, services, and applications to enable new models for research and dissemination of scholarly materials in the chemistry community.”
http://research.microsoft.com/en-us/projects/orechem/
A not particularly accurate summary of OREChem
[Diagram. Labels: Citations; Figures; Tables; Chunks; Reactions; Molecular Compounds; NMR Spectra and Structural Data; Experiment data; Southampton; PSU; Cambridge; Indiana; services; Triplestore on Azure Cloud.]
IU’s Objective
To build a pipeline to:
• Fetch ORE ATOM feeds.
• Transform the ATOM feeds into triples and store them in a triple store (using GRDDL/Saxon HE).
• Extract crystallographically obtained 3D coordinate information.
• Submit compute-intensive electronic structure calculations and geometry optimization tasks to tools like Gaussian09 on TeraGrid supercomputing resources.
• Transform the Gaussian output into triples and store them in a triple store.
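The "transform ATOM feeds into triples" step can be illustrated with a small Python sketch. This is not the GRDDL/Saxon HE (XSLT) transformation named on the slide; it is a plain standard-library stand-in. The sample feed, the example.org URIs, and the choice of the `dcterms:title` and `rdfs:seeAlso` predicates are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"
# Predicate choices are assumptions for this sketch, not the OREChem vocabulary.
DCTERMS_TITLE = "http://purl.org/dc/terms/title"
RDFS_SEE_ALSO = "http://www.w3.org/2000/01/rdf-schema#seeAlso"

# Made-up sample feed; a real pipeline would fetch this from a feed URL.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example crystal feed</title>
  <entry>
    <id>http://example.org/entry/1</id>
    <title>Example moiety entry</title>
    <link rel="alternate" href="http://example.org/entry/1.cml"/>
  </entry>
</feed>
"""


def escape_literal(text):
    """Escape backslashes, quotes, and newlines for an N-Triples literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def atom_to_ntriples(feed_xml):
    """Return one N-Triples line per entry title and per entry link."""
    root = ET.fromstring(feed_xml)
    lines = []
    for entry in root.iter(ATOM_NS + "entry"):
        subject = (entry.findtext(ATOM_NS + "id") or "").strip()
        if not subject:
            continue  # an entry without an id has no usable subject URI
        title = entry.findtext(ATOM_NS + "title")
        if title:
            literal = escape_literal(title.strip())
            lines.append(f'<{subject}> <{DCTERMS_TITLE}> "{literal}" .')
        for link in entry.findall(ATOM_NS + "link"):
            href = link.get("href")
            if href:
                lines.append(f"<{subject}> <{RDFS_SEE_ALSO}> <{href}> .")
    return lines


if __name__ == "__main__":
    print("\n".join(atom_to_ntriples(SAMPLE_FEED)))
```

The resulting lines could then be loaded into whichever triple store is in use; that loading step is store-specific and is left out here.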
OREChem-Computation Workflow
[Workflow diagram. Stages: ATOM feeds from eCrystals or CrystalEye; extract moiety feeds in CML format (moiety files); convert CML to Gaussian input format; Gaussian on TeraGrid; Gaussian output to RDF triples (N3 files or RDF/XML); triplestore.]
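The "convert CML to Gaussian input format" stage might look roughly like the following Python sketch. It is not the project's converter. The sample water molecule, the default route line (`# B3LYP/6-31G(d) Opt`), the title, and the charge and multiplicity defaults are assumptions chosen for illustration; only the CML attribute names (`elementType`, `x3`, `y3`, `z3`) and the general layout of a Gaussian input file (route, blank line, title, blank line, charge and multiplicity, atoms, blank line) are standard.

```python
import xml.etree.ElementTree as ET

CML_NS = "{http://www.xml-cml.org/schema}"

# Made-up sample molecule with 3D coordinates in angstroms.
SAMPLE_CML = """<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.471"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.471"/>
  </atomArray>
</molecule>
"""


def cml_to_gaussian(cml_xml, route="# B3LYP/6-31G(d) Opt",
                    title="Generated from CML", charge=0, multiplicity=1):
    """Build a Gaussian input deck from the 3D atoms of a CML molecule."""
    root = ET.fromstring(cml_xml)
    atom_lines = []
    for atom in root.iter(CML_NS + "atom"):
        coords = [atom.get(key) for key in ("x3", "y3", "z3")]
        if None in coords:
            raise ValueError(f"atom {atom.get('id')} has no 3D coordinates")
        x, y, z = map(float, coords)
        atom_lines.append(
            f"{atom.get('elementType'):<2} {x:12.6f} {y:12.6f} {z:12.6f}"
        )
    if not atom_lines:
        raise ValueError("no atoms found in CML input")
    # Route, blank, title, blank, charge/multiplicity, atoms, closing blank line.
    parts = [route, "", title, "", f"{charge} {multiplicity}", *atom_lines, "", ""]
    return "\n".join(parts)


if __name__ == "__main__":
    print(cml_to_gaussian(SAMPLE_CML))
```

Raising on atoms without `x3`/`y3`/`z3` is a deliberate choice in this sketch: a geometry optimization started from partial coordinates would be meaningless, so failing early seems safer than skipping atoms silently.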
ORECHEM REST Services
• http://gf18.ucs.indiana.edu:8146/FeedsHarvester/cml3d/csv?harvester=moiety&numofentries=5
• http://gf18.ucs.indiana.edu:8146/CML2GaussianSemCompChem/gauss/inputgenerator
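As a small illustration of how a client might address the feed harvester, the Python sketch below rebuilds the first URL from its query parameters. The host, path, and the `harvester` and `numofentries` parameter names come from the slide; the function name and its defaults are hypothetical, and nothing here is known about the service's response format or whether the host is still reachable, so the sketch only constructs the URL and does not send a request.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Host and path as shown on the slide.
BASE_URL = "http://gf18.ucs.indiana.edu:8146"
HARVESTER_PATH = "/FeedsHarvester/cml3d/csv"


def harvester_url(harvester="moiety", numofentries=5):
    """Return the harvester URL for the given feed type and entry count."""
    query = urlencode({"harvester": harvester, "numofentries": numofentries})
    return f"{BASE_URL}{HARVESTER_PATH}?{query}"


if __name__ == "__main__":
    url = harvester_url()
    print(url)
    # Round-trip the query string to show which parameters a server would see.
    print(parse_qs(urlsplit(url).query))
```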
OREChem Workflow in XBaya
Part 2: Computational Chemistry Middleware
Reusing software from the Open Gateway Computing Environments (OGCE) Project
What Is a Science Gateway?
• Science gateway: user interface and supporting Web services to scientific applications, data sets, and resources running on cyberinfrastructure.
  • Science portals, Grid Computing Environments, …
  • Broaden and simplify usage.
• Cyberinfrastructure: distributed computing resources and overlaying middleware for scientific computing.
  • Prominent examples include TeraGrid and Open Science Grid.
  • Middleware includes Globus, Condor, iRODS/SRB, …
• Some of these approaches are being pushed by scientific cloud computing.
  • That is another topic.
TeraGrid is one of the largest investments in shared CI from NSF’s Office of Cyberinfrastructure. Soon to become TeraGrid/XD.
• Dedicated high-speed, cross-country network
• Staff & Advanced Support
• 20 Petabytes Storage
• 2 PetaFLOPS Computation
• Visualization
Computational Chemistry Grid
• Has a long history (S. Pamidighantam)
  • Started in 1998 as Quantum Chemistry Workbench
  • Evolved into ChemViz in NCSA Expedition Era
  • A pioneer of the TeraGrid Science Gateway and Community Account concepts
• Manages software installations and licensing as well as middleware
• Currently in two incarnations
  • GridChem: Science Gateway for Molecular Sciences (production gateway)
  • ParamChem: Automatic Parameterization of Molecular Mechanics (infrastructure research built on GridChem)
GridChem Science Gateway
• Supported applications: Gaussian, CHARMM, NWChem, GAMESS, Molpro, QMCPack, MD Amber, ACES, NAMD, Wien2K, Gromacs, Castep
• Usage statistics (December 2010)
  • 431 distinct users
  • Metadata for 37,500 computational jobs in the database
  • Over 2,000,000 Service Units consumed
  • Over 50 peer-reviewed publications tracked
• Reportable metrics are an important issue.
Simplified GridChem Architecture
[Architecture diagram. Labels: GridChem Client; Gaussian, GAMESS & Other Molecular Editors & Input Generators; Configure Inputs; Submit & Monitor Jobs; Download Output; Output Analysis & Visualization; OGCE/GridChem Middleware; Job Managers & Data Movement Interfaces; Monitor Resources; Manage Jobs; Gaussian, CHARMM, NWChem, GAMESS, NAMD, Amber …]
Sample GridChem Post Processing
Collaborations with Open Gateway Computing Environments
• The OGCE has several general-purpose tools that are being phased into GridChem’s production middleware.
• XBaya: graphical composition and execution of sequences of tasks.
• Workflow Interpreter Service and GFAC:
  • Support long-running executions and asynchronous invocations.
  • Stop, rewind, and replay executions.
  • Support parametric sweeps of workflows.
  • Integrate human interactions into workflow executions.
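To make "parametric sweeps of workflows" concrete, here is a minimal Python sketch of the general idea: expand a set of parameter ranges into one run per combination. It does not use or imitate the XBaya, Workflow Interpreter, or GFAC APIs. The function names, the `method` and `basis` parameters, and the stand-in `run_workflow` step are all hypothetical.

```python
from itertools import product


def sweep(parameters):
    """Expand {name: [values, ...]} into one dict per parameter combination."""
    names = list(parameters)
    value_lists = [parameters[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


def run_workflow(params):
    """Hypothetical stand-in for one workflow execution.

    Here it only builds the route line that a Gaussian job for this
    parameter combination might use.
    """
    return f"# {params['method']}/{params['basis']} Opt"


if __name__ == "__main__":
    runs = sweep({"method": ["HF", "B3LYP"], "basis": ["STO-3G", "6-31G(d)"]})
    for params in runs:
        print(params, "->", run_workflow(params))
```

In a real workflow system each combination would be submitted as an independent, possibly long-running, asynchronous execution instead of being run in a local loop.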
OGCE-Generalized GridChem Infrastructure
[Architecture diagram. Labels: GridChem Client; Molecular Editors & Input Generators; Output Analysis and Visualization (Requirements Driven); OGCE Workflow & Job Management; Java CoG Abstraction; Globus; TeraGrid/XD; DRMAA & SSH Utilities; Condor, SSH, (SLURM); Campus Resources; Cloud APIs; EC2 Interface; Amazon, Eucalyptus; Other Grid Middleware; Unicore, Open Nebula; European Grids.]

PNNL April 2011 ogce

  • 1. Developing Cyberinfrastructure to Support Computational Chemistry Workflows Marlon Pierce (IU), Suresh Marru (IU), SudhakarPamidighantam (NCSA) SashikiranChalla (IU), Ye Fan (NCSA/IU), PatanachaiTangchaisin (IU)
  • 2. Part 1: Reusable Middleware for OREChem Services and workflows for OREChem
  • 3. Microsoft Research’s ORECHEM Project “A collaboration between chemistry scholars and information scientists to develop and deploy the infrastructure, services, and applications to enable new models for research and dissemination of scholarly materials in the chemistry community.” http://research.microsoft.com/en-us/projects/orechem/ 3
  • 4.
  • 11. NMR Spectra and Structural Data
  • 12.
  • 13. servicesTriplestore On Azure Cloud A not particularly accurate summary of OREChem 4
  • 14. IU’s Objective To build a pipeline to: Fetch ORE ATOM feeds Transform ATOM feeds into triples and store them into a triple store ( Using GRDDL/Saxon HE) Extract crystallographically obtained 3D coordinates information Submit compute intensive electronic structure calculations, geometry optimization tasks to tools like Gaussian09 on TeraGrid supercomputing resources. Transform the Gaussian output into triples and store them into a triple store 5
  • 15. OREChem-Computation Workflow Convert CML to Gaussian Input format Extract Moiety feeds in CML format Gaussian on TeraGrid Moiety files Gaussian Output to RDF triples ATOM Feeds from eCrystals or CrystalEye N3 files or RDF/XML Triplestore 6
  • 17. ORECHEM REST Services http://gf18.ucs.indiana.edu:8146/FeedsHarvester/cml3d/csv?harvester=moiety&numofentries=5 http://gf18.ucs.indiana.edu:8146/CML2GaussianSemCompChem/gauss/inputgenerator 8
  • 18. 9 OREChem Workflow in XBaya
  • 19. Part 2: Computational Chemistry Middleware Reusing software from the Open Gateway Computing Environments (OGCE) Project
  • 20. What Is a Science Gateway? User Interface and supporting Web services to scientific applications, data sets, and resources running on cyberinfrastructure. Science portals, Grid Computing Environments, … Broaden and simplify usage Cyberinfrastructure: Distributed computing resources and overlaying middleware for scientific computing. Prominent examples include TeraGrid, Open Science Grid Middleware includes Globus, Condor, iRods/SRB, … Some of these approaches being pushed by scientific cloud computing That is another topic
  • 21. TeraGrid is one of the largest investments in shared CI from NSF’s Office of Cyberinfrastructure. Soon to become TeraGrid/XD. Dedicated high-speed, cross-country network; staff and advanced support; 20 petabytes of storage; 2 petaFLOPS of computation; visualization.
  • 22. Computational Chemistry Grid. Has a long history (S. Pamidighantam): started in 1998 as Quantum Chemistry Workbench; evolved into ChemViz in the NCSA Expedition era. A pioneer of the TeraGrid Science Gateway and Community Account concepts. Manages software installations and licensing as well as middleware. Currently in two incarnations: GridChem, the Science Gateway for Molecular Sciences (production gateway), and ParamChem, Automatic Parameterization of Molecular Mechanics (infrastructure research built on GridChem).
  • 23. GridChem Science Gateway. Supported applications: Gaussian, CHARMM, NWChem, GAMESS, Molpro, QMCPack, MD Amber, ACES, NAMD, Wien2K, Gromacs, Castep. Usage statistics (December 2010): 431 distinct users; metadata for 37,500 computational jobs in the DB; over 2,000,000 Service Units consumed; over 50 peer-reviewed publications tracked. Reportable metrics are an important issue.
  • 24. Simplified GridChem Architecture (diagram labels): GridChem Client; molecular editors and input generators (Gaussian, GAMESS and others); configure inputs; submit and monitor jobs; download output; output analysis and visualization; OGCE/GridChem Middleware; job managers and data movement interfaces; monitor resources; manage jobs; applications: Gaussian, CHARMM, NWChem, GAMESS, NAMD, Amber, …
  • 25. Sample GridChem Post Processing
  • 26. Collaborations with Open Gateway Computing Environments. The OGCE has several general-purpose tools that are being phased into GridChem’s production middleware. XBaya: graphical composition and execution of sequences of tasks. Workflow Interpreter Service and GFAC: support long-running executions and asynchronous invocations; stop, rewind, and replay executions; support parametric sweeps of workflows; integrate human interactions into workflow executions.
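The idea of a parametric sweep of a workflow can be illustrated in plain Python: the same workflow is run once for every combination of parameter values. This is a conceptual sketch only and does not use XBaya, the Workflow Interpreter Service, or GFAC; `parameter_sweep`, `toy_workflow`, and the parameter names are hypothetical.

```python
from itertools import product


def parameter_sweep(workflow, grid):
    """Run `workflow` once per combination of values in `grid`.

    `grid` maps parameter names to lists of candidate values.
    Returns a list of (parameters, result) pairs.
    """
    names = list(grid)
    results = []
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        results.append((params, workflow(**params)))
    return results


def toy_workflow(basis, scale):
    # Stand-in for a chain of real tasks (input generation, job submission, analysis).
    return f"optimize with basis={basis}, scale={scale}"


runs = parameter_sweep(
    toy_workflow,
    {"basis": ["sto-3g", "6-31g(d)"], "scale": [0.9, 1.0, 1.1]},
)
for params, result in runs:
    print(params, "->", result)
```

In a real gateway each combination would become an independent, asynchronously executed workflow instance rather than a local function call.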
  • 27. OGCE-Generalized GridChem Infrastructure (diagram labels): Java CoG Abstraction; TeraGrid/XD; Globus; Molecular Editors & Input Generators; DRMAA & SSH Utilities; Campus Resources; European Grids; Amazon, Eucalyptus; EC2 Interface; Unicore, Open Nebula; Condor, SSH, (SLURM); OGCE Workflow & Job Management; GridChem Client; Cloud APIs; Other Grid Middleware; Output Analysis and Visualization (Requirements Driven)
  • 28. ParamChem Overview. Collaboration between University of Maryland, NCSA, University of Kentucky, and University of Florida. Goal: automate the process of parameterization for classical molecular mechanics and semi-empirical methods. These are realized as parameter sweeps of workflows. Results are disseminated through GridChem data management tools. Coupled execution of Quantum Chemistry and Molecular Mechanics. OGCE partners with ParamChem through the NSF SDCI program to provide workflow and job management middleware. Dynamics applications with optimization algorithms are being constructed as workflow chains. Workflow chains are submitted as part of parametric sweeps (in progress).
  • 29. Empirical Force Fields Parameterization: Need and Process. A lack of accurate force fields produces erroneous property estimation. Vanommeslaeghe et al., J. Comp. Chem. 2010, 31, 671-690
  • 30. ParamChem Workflows: Initial Structure, Optimized Structure
  • 32. Part 3: Developing Sustainable Science Gateway Software The Open Gateway Computing Environments Project and Apache Software Foundation
  • 33. OGCE Software We try very hard to keep software scope under control. We don’t build data management systems, for example. We collaborate with groups who do.
  • 34. OGCE Funds Software Lifecycle. Obvious, but new for NSF as it becomes more interested in sustaining its research investments.
  • 35. Apache Incubators. Joining Apache is our software sustainability strategy: open source licensing, meritocracy, visibility. Apache’s community development model is our experiment; it is more important than simply being open source. Need to go beyond SourceForge: distributed control, distributed credit. Airavata: tools for science gateway services and workflows (XBaya, GFAC, Messenger, XRegistry); collaboration with WSO2/LSF, IBM; builds on Apache Axis2, Apache ODE, (Apache Hadoop). Rave: OpenSocial gadget manager, general-purpose gadgets; collaboration with Hippo, Mitre, SURFnet; builds on Apache Shindig.
  • 36. More Information. OGCE Web Site: http://www.collab-ogce.org; News Feed/Blog: http://collab-ogce.blogspot.com; Contact us: ogce-discuss@googlegroups.com, http://groups.google.com/group/ogce-discuss/; Software Downloads: software is available via SVN from our SourceForge project, http://sourceforge.net/projects/ogce/. See http://www.collab-ogce.org/ogce/index.php/Portal_download