SlideShare une entreprise Scribd logo
1  sur  24
1
Supporting Research through "Desktop as a
Service" models of e-infrastructure access
David Wallom
Overview
• What is e-infrastructure?
• Challenges of a changing community
• Availability of uniform environments for analysis
• Cloud computing and ‘as a Service’
What is e-Infrastructure?
The integration of digitally-based technology, resources, facilities, and
services combined with people and organizational structures needed to
support modern, collaborative research (and teaching).
1. Data and Storage
2. Hardware (Compute)
3. Software (and Algorithms)
4. Networks
5. Security and authentication
6. People (Collaboration, Skills, Capacity)
7. The Digital Library
RCUK e-Infrastructure Working Group
Changes in Earth Observation
13 Tools or Frameworks
presented so far @ EO
Open Science ‘16
What are the issues?
• Complex large volume data from multiple sources, access policy adherence
• Congested hardware, unsuitable usage mechanisms, changing platforms
• Software diversely applied across a community with significant sustainability questions
• Limited training @ undergraduate level → large amount of training at further career stages
• Disconnect between academic and industrial community members & tooling used
A complex geographically and
organisationally distributed community
with rapidly evolving challenges using
diverse tooling…
Bioinformatics software challenges
• An onslaught of new challenges for
bioinformatics:
– projects that used to require teams of
500 are now accessible to small (<5)
teams
– but biology curricula (i.e. biologists) still
lack computational skills.
– thus biologists are overwhelmed by large
amounts of data
– furthermore data types are young - so
software is young, thus
• software may be badly built (by biologists
with no formal software dev training/xp).
• software needs to be frequently updated
(bugfixes, algorithmic improvements
(sensitivity/specificity), new data type
support).
changes everything for biology
How to distribute complex community
software in a way that ensures we
minimise the amount of time learning to
use it takes and minimises support
requriements?
Bio-Linux: A scalable solution
• Comprehensive, free bioinformatics workstation based on Ubuntu
Linux and Debian Med
• 11 years & 8 major releases
• >8000 users from 1600 locations
• 200+ bioinf packages including big integrative tools :- QIIME, Galaxy
Server, PredictProtein, EMBOSS, ...Incorporates all software
Dual BootLinux Live Local Servers Cloud
Docker, simplifying the portability
of applications and services
How to simplify the access to resources
onto which we can run applications on
demand in an environment in which we
are familiar?
Where do we do work at the moment?
External Network inside JASMIN
Unmanaged Cloud – IaaS, PaaS, SaaS
JASMIN Internal Network
Panasas
storage
Lotus Batch
Compute
JASMIN Cloud Architecture
Standard Remote Access Protocols –
ftp, http, …
Managed Cloud - PaaS, SaaS
JASMIN
Analysis
Platform
VM
Project1-org
Science
Analysis
VM 0
Science
Analysis
VM 0
Science
Analysis
VM
JASMIN Cloud Management Interfaces
Direct File
System
Access
Direct access to
batch processing
cluster
Appliance
Catalogue
Firewall + NAT
Firewall
optirad-org
Science
Analysis
VM 0
Science
Analysis
VM 0
IPython
Slave VM
File Server
VM
IPython
JupyterHub VM
eos-cloud-org
Science
Analysis
VM 0
Science
Analysis VM
0
EOSCloud VM
File Server
VM
EOSCloud
Fat Node
IPython Notebook VM with access
cluster through IPython.parallel EOSCoud Desktop as a Service
with dynamic RAM boost
Appliance
Catalogue
Appliance
Catalogue
Firewall + NAT Firewall + NAT
Firewall
Thanks to Phil Kershaw
OTHER PUBLIC CLOUD PROVIDERS ARE AVAILABLE
Why Cloud?
• Data sets can be too big or restricted to easily move
– move the compute to the data
– Researcher work patterns are maintained
• More efficient use of shared resources
• Central maintenance of infrastructure
• Central Management of data sharing agreements possible
• Lower barrier to entry (Compared to traditional HPC and Grid)
• What type of cloud?
• What role for traditional HPC?
TRAINING IS KEY TO MAKING INFORMED CHOICES
Bringing together the ideal platform with
a uniform software tooling distribution
mechanism
EOS Cloud
• A tenancy in the JASMIN Unmanaged Cloud (& QMUL RCC)
• Reusing JASMIN web interfaces and user management to
provide custom IaaS software platform
• Each receives two VMs
– Bio-Linux
– Ubuntu Docker hosting environment
• Users have total responsibility for instantiated system
• Accessible though standard remote desktop tools
• But,
– utilising single scale of resources would be a waste
– Can we scale the users virtual services to take into account demand?
Boosting Resource Capabilities
• Users VMs operate in native state ‘Standard’
– Enough capability to access stored data
– Configure applications and workflows
– Free
• User may boost his running VM to increased
capability
– Enough to run analysis applications on useful
timescale
– Credit consumption only for Boosted instances
• Reference datasets available to users
through shared storage
Name # Core Memory (GB) Cost(Credit/hou
r)
Standard 1 16 0
Standard+ 2 40 1
Big 8 140 4
Max 16 500 8
oSwitch
One-line access to other operating systems.
• Docker applications though portable can feel extremely alien in their usability
• With oSwitch in contrast things feel (largely) unchanged:
– Current working directory is maintained.
– User name, uid and gid are maintained.
– Login shell (bash/zsh/fish) is maintained.
– Home directory is maintained (thus all .dotfiles and config files are maintained).
– read/write permissions are maintained.
– Paths are maintained whenever possible. Thus volumes (external drives, NAS, USB) mounted on
the host are available in the container at the same path.
https://github.com/yeban/oswitch
Pilot Users
• Citizen data collection project
– Physical samples sent for sequencing to assess microbial diversity
– ~200 sites
• Creating compute pipelines and containers for each OSD in silico analysis
– HPC, Cloud (IaaS & PaaS)
– Portable
• Run same analysis on different laptops/grids/clouds
– Repeatable/Reproducible
• Same input gives same output given that reference databases did not change
– Preservation
• All analysis tools and dependencies are in one image
• Images are simple tar.gz
• Preserving Docker and base images is preserving all analysis
Desktop as a Service for research
• Giving researchers an environment they are confident in by changing the
infrastructure around them
• Location independent persistence of the research environment
• Support further key usage models such as teaching or online learning
• Gives tool developers a deployment mechanism that already has community
visibility
Conclusions
• Abstract underpinning e-infrastructure services from the users
– In many cases not interested in what type of tin they’re running on!
– Run something on one resource should be able to be moved to others through the use of
standards etc!
• Cloud is (obviously) an enabler for research
– Allowing flexibility in infrastructure hitherto not possible
– User control rather than provider control
– Higher level services more easily composed and made accessible through Marketplaces
• Creating an ‘EnvLinux’ distribution would create a community wide software suite
across all stakeholder groups.
– Support a single point of distribution for all community relevant software
– Create simple deposit mechanism to allow new tools and services to easily join
– Support activities such as teaching that do not generally work well with rapidly changing
activities such as research software development
Thank you & questions?

Contenu connexe

Tendances

Other distributed systems
Other distributed systemsOther distributed systems
Other distributed systems
Sri Prasanna
 
CS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question BankCS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question Bank
pkaviya
 
2010 09-24-闕志克老師-cloud computing where do we go
2010 09-24-闕志克老師-cloud computing where do we go2010 09-24-闕志克老師-cloud computing where do we go
2010 09-24-闕志克老師-cloud computing where do we go
nccuscience
 
Multi-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud ComputingMulti-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud Computing
Alexandru Iosup
 

Tendances (20)

Cluster Computing Seminar.
Cluster Computing Seminar.Cluster Computing Seminar.
Cluster Computing Seminar.
 
Seminar on cloudcomputing
Seminar on cloudcomputingSeminar on cloudcomputing
Seminar on cloudcomputing
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Cs6703 grid and cloud computing unit 1
Cs6703 grid and cloud computing unit 1Cs6703 grid and cloud computing unit 1
Cs6703 grid and cloud computing unit 1
 
A cloud environment for backup and data storage
A cloud environment for backup and data storageA cloud environment for backup and data storage
A cloud environment for backup and data storage
 
Virtualization in cloud computing
Virtualization in cloud computingVirtualization in cloud computing
Virtualization in cloud computing
 
Clusters
ClustersClusters
Clusters
 
cluster computing
cluster computingcluster computing
cluster computing
 
Other distributed systems
Other distributed systemsOther distributed systems
Other distributed systems
 
Cluster computing
Cluster computingCluster computing
Cluster computing
 
CS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question BankCS8791 Cloud Computing - Question Bank
CS8791 Cloud Computing - Question Bank
 
Week 1 lecture material cc
Week 1 lecture material ccWeek 1 lecture material cc
Week 1 lecture material cc
 
Cluster Computing
Cluster ComputingCluster Computing
Cluster Computing
 
Integrating DuraCloud with DPN at Chronopolis and the Texas Digital Library
Integrating DuraCloud with DPN at Chronopolis and the Texas Digital LibraryIntegrating DuraCloud with DPN at Chronopolis and the Texas Digital Library
Integrating DuraCloud with DPN at Chronopolis and the Texas Digital Library
 
Towards a Cloud Native Big Data Platform using MiCADO
Towards a Cloud Native Big Data Platform using MiCADOTowards a Cloud Native Big Data Platform using MiCADO
Towards a Cloud Native Big Data Platform using MiCADO
 
Cloud slide
Cloud slideCloud slide
Cloud slide
 
2010 09-24-闕志克老師-cloud computing where do we go
2010 09-24-闕志克老師-cloud computing where do we go2010 09-24-闕志克老師-cloud computing where do we go
2010 09-24-闕志克老師-cloud computing where do we go
 
Computing Outside The Box
Computing Outside The BoxComputing Outside The Box
Computing Outside The Box
 
Cluster computing report
Cluster computing reportCluster computing report
Cluster computing report
 
Multi-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud ComputingMulti-Tenancy and Virtualization in Cloud Computing
Multi-Tenancy and Virtualization in Cloud Computing
 

En vedette

certificate-BOW325 Web Intelligence Advanced
certificate-BOW325 Web Intelligence Advancedcertificate-BOW325 Web Intelligence Advanced
certificate-BOW325 Web Intelligence Advanced
Charles Brown
 
Принципы объектно-ориентированного дизайна
Принципы объектно-ориентированного дизайнаПринципы объектно-ориентированного дизайна
Принципы объектно-ориентированного дизайна
Сергей Шебанин
 
PLAN DE NEGOCIOS LIBRERIA EXPERTISE
PLAN DE NEGOCIOS LIBRERIA EXPERTISEPLAN DE NEGOCIOS LIBRERIA EXPERTISE
PLAN DE NEGOCIOS LIBRERIA EXPERTISE
Luis Baquero
 

En vedette (16)

certificate-BOW325 Web Intelligence Advanced
certificate-BOW325 Web Intelligence Advancedcertificate-BOW325 Web Intelligence Advanced
certificate-BOW325 Web Intelligence Advanced
 
The University of Oxford e-Research Centre
The University of Oxford e-Research CentreThe University of Oxford e-Research Centre
The University of Oxford e-Research Centre
 
Introduction to Cloud Computing
Introduction to Cloud ComputingIntroduction to Cloud Computing
Introduction to Cloud Computing
 
Sme strategy half day strategic planning workshops offsites
Sme strategy half day strategic planning workshops offsitesSme strategy half day strategic planning workshops offsites
Sme strategy half day strategic planning workshops offsites
 
6.1 Шаблоны классов
6.1 Шаблоны классов6.1 Шаблоны классов
6.1 Шаблоны классов
 
Принципы объектно-ориентированного дизайна
Принципы объектно-ориентированного дизайнаПринципы объектно-ориентированного дизайна
Принципы объектно-ориентированного дизайна
 
Big Data technology for systems monitoring in Energy – Big Data Europe
Big Data technology for systems monitoring in Energy – Big Data Europe Big Data technology for systems monitoring in Energy – Big Data Europe
Big Data technology for systems monitoring in Energy – Big Data Europe
 
Devops Enterprise Summit: My Great Awakening: 
Top “Ah-ha” Moments As Former ...
Devops Enterprise Summit: My Great Awakening: 
Top “Ah-ha” Moments As Former ...Devops Enterprise Summit: My Great Awakening: 
Top “Ah-ha” Moments As Former ...
Devops Enterprise Summit: My Great Awakening: 
Top “Ah-ha” Moments As Former ...
 
Задача №1. Работа с видео.
Задача №1. Работа с видео.Задача №1. Работа с видео.
Задача №1. Работа с видео.
 
С. Дасгупта, Х. Пападимитриу, У. Вазирани. Алгоритмы
С. Дасгупта, Х. Пападимитриу, У. Вазирани. АлгоритмыС. Дасгупта, Х. Пападимитриу, У. Вазирани. Алгоритмы
С. Дасгупта, Х. Пападимитриу, У. Вазирани. Алгоритмы
 
Game Programming 13 - Debugging & Performance Optimization
Game Programming 13 - Debugging & Performance OptimizationGame Programming 13 - Debugging & Performance Optimization
Game Programming 13 - Debugging & Performance Optimization
 
Разбор задач модуля "Теория графов ll"
Разбор задач модуля "Теория графов ll"Разбор задач модуля "Теория графов ll"
Разбор задач модуля "Теория графов ll"
 
Разбор задач модуля Комбинаторика l
Разбор задач модуля Комбинаторика lРазбор задач модуля Комбинаторика l
Разбор задач модуля Комбинаторика l
 
Las redes sociales en la educación
Las redes sociales en la educaciónLas redes sociales en la educación
Las redes sociales en la educación
 
PLAN DE NEGOCIOS LIBRERIA EXPERTISE
PLAN DE NEGOCIOS LIBRERIA EXPERTISEPLAN DE NEGOCIOS LIBRERIA EXPERTISE
PLAN DE NEGOCIOS LIBRERIA EXPERTISE
 
Papelería lluvias-de-bendición
Papelería lluvias-de-bendiciónPapelería lluvias-de-bendición
Papelería lluvias-de-bendición
 

Similaire à Supporting Research through "Desktop as a Service" models of e-infrastructure access

Desktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omicsDesktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omics
David Wallom
 
Clould Computing and its application in Libraries
Clould Computing and its application in LibrariesClould Computing and its application in Libraries
Clould Computing and its application in Libraries
Amit Shaw
 
Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtc
DataTactics
 

Similaire à Supporting Research through "Desktop as a Service" models of e-infrastructure access (20)

Desktop as a Service supporting Environmental 'Omics
Desktop as a Service supporting Environmental 'OmicsDesktop as a Service supporting Environmental 'Omics
Desktop as a Service supporting Environmental 'Omics
 
e-infrastructural needs to support informatics
e-infrastructural needs to support informaticse-infrastructural needs to support informatics
e-infrastructural needs to support informatics
 
Desktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omicsDesktop as a Service supporting Environmental ‘omics
Desktop as a Service supporting Environmental ‘omics
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...
 
OIT552 Cloud Computing Material
OIT552 Cloud Computing MaterialOIT552 Cloud Computing Material
OIT552 Cloud Computing Material
 
Introduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OSIntroduction to Apache Mesos and DC/OS
Introduction to Apache Mesos and DC/OS
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
HPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journeyHPC and cloud distributed computing, as a journey
HPC and cloud distributed computing, as a journey
 
Ism
IsmIsm
Ism
 
Data Lake and the rise of the microservices
Data Lake and the rise of the microservicesData Lake and the rise of the microservices
Data Lake and the rise of the microservices
 
Packaging computational biology tools for broad distribution and ease-of-reuse
Packaging computational biology tools for broad distribution and ease-of-reusePackaging computational biology tools for broad distribution and ease-of-reuse
Packaging computational biology tools for broad distribution and ease-of-reuse
 
Analyzing Big Data in Medicine with Virtual Research Environments and Microse...
Analyzing Big Data in Medicine with Virtual Research Environments and Microse...Analyzing Big Data in Medicine with Virtual Research Environments and Microse...
Analyzing Big Data in Medicine with Virtual Research Environments and Microse...
 
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
The Effectiveness, Efficiency and Legitimacy of Outsourcing Your Data
 
DOE Magellan OpenStack user story
DOE Magellan OpenStack user storyDOE Magellan OpenStack user story
DOE Magellan OpenStack user story
 
Clould Computing and its application in Libraries
Clould Computing and its application in LibrariesClould Computing and its application in Libraries
Clould Computing and its application in Libraries
 
Application Virtualization, University of New Hampshire
Application Virtualization, University of New HampshireApplication Virtualization, University of New Hampshire
Application Virtualization, University of New Hampshire
 
Federated Cloud Computing
Federated Cloud ComputingFederated Cloud Computing
Federated Cloud Computing
 
在小學有效運用雲端電腦以促進電子學習(第一節筆記)
在小學有效運用雲端電腦以促進電子學習(第一節筆記)在小學有效運用雲端電腦以促進電子學習(第一節筆記)
在小學有效運用雲端電腦以促進電子學習(第一節筆記)
 
{code} and containers
{code} and containers{code} and containers
{code} and containers
 
Data Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtcData Tactics dhs introduction to cloud technologies wtc
Data Tactics dhs introduction to cloud technologies wtc
 

Plus de David Wallom

Plus de David Wallom (20)

Quantifying the impact of green leasing on energy use in a retail portfolio: ...
Quantifying the impact of green leasing on energy use in a retail portfolio: ...Quantifying the impact of green leasing on energy use in a retail portfolio: ...
Quantifying the impact of green leasing on energy use in a retail portfolio: ...
 
Trust and Cloud computing, removing the need for the consumer to trust their ...
Trust and Cloud computing, removing the need for the consumer to trust their ...Trust and Cloud computing, removing the need for the consumer to trust their ...
Trust and Cloud computing, removing the need for the consumer to trust their ...
 
Trust and Cloud computing, removing the need for the consumer to trust their ...
Trust and Cloud computing, removing the need for the consumer to trust their ...Trust and Cloud computing, removing the need for the consumer to trust their ...
Trust and Cloud computing, removing the need for the consumer to trust their ...
 
Benefits of big data analytics in Smart Metering, ADEPT, WICKED and beyond
Benefits of big data analytics in Smart Metering,  ADEPT, WICKED and beyondBenefits of big data analytics in Smart Metering,  ADEPT, WICKED and beyond
Benefits of big data analytics in Smart Metering, ADEPT, WICKED and beyond
 
Smarter Energy, Infrastruture service, consumtion analytics and applications
Smarter Energy, Infrastruture service, consumtion analytics and applicationsSmarter Energy, Infrastruture service, consumtion analytics and applications
Smarter Energy, Infrastruture service, consumtion analytics and applications
 
The Climateprediction.net programme, big data climate modelling
The Climateprediction.net programme, big data climate modellingThe Climateprediction.net programme, big data climate modelling
The Climateprediction.net programme, big data climate modelling
 
1990-2050 sulphur dioxide emissions data from ECLIPSE v5a for use in Met Offi...
1990-2050 sulphur dioxide emissions data from ECLIPSE v5a for use in Met Offi...1990-2050 sulphur dioxide emissions data from ECLIPSE v5a for use in Met Offi...
1990-2050 sulphur dioxide emissions data from ECLIPSE v5a for use in Met Offi...
 
e-Research & the art of linking Astrophysics to Deforestation
e-Research & the art of linking Astrophysics to Deforestatione-Research & the art of linking Astrophysics to Deforestation
e-Research & the art of linking Astrophysics to Deforestation
 
Privacy and Security policies in the cloud
Privacy and Security policies in the cloudPrivacy and Security policies in the cloud
Privacy and Security policies in the cloud
 
Working with Earth Observation Data, INFORM and the IEA
Working with Earth Observation Data, INFORM and the IEAWorking with Earth Observation Data, INFORM and the IEA
Working with Earth Observation Data, INFORM and the IEA
 
WICKED - Working with the data rich
WICKED - Working with the data richWICKED - Working with the data rich
WICKED - Working with the data rich
 
Mapping Priorities and Future Collaborations for you Projects
Mapping Priorities and Future Collaborations for you ProjectsMapping Priorities and Future Collaborations for you Projects
Mapping Priorities and Future Collaborations for you Projects
 
CloudWatch: Mapping priorities and future collaboration for your project
CloudWatch: Mapping priorities and future collaboration for your projectCloudWatch: Mapping priorities and future collaboration for your project
CloudWatch: Mapping priorities and future collaboration for your project
 
Trust and Cloud Computing, removing the need to trust your cloud provider
Trust and Cloud Computing, removing the need to trust your cloud providerTrust and Cloud Computing, removing the need to trust your cloud provider
Trust and Cloud Computing, removing the need to trust your cloud provider
 
CloudWatch2 Adoption Deep Dive
CloudWatch2 Adoption Deep DiveCloudWatch2 Adoption Deep Dive
CloudWatch2 Adoption Deep Dive
 
Generating Insight from Big Data
Generating Insight from Big DataGenerating Insight from Big Data
Generating Insight from Big Data
 
International Forest Risk Model
International Forest Risk ModelInternational Forest Risk Model
International Forest Risk Model
 
Generating Insight from Big Data in Energy and the Environment
Generating Insight from Big Data in Energy and the EnvironmentGenerating Insight from Big Data in Energy and the Environment
Generating Insight from Big Data in Energy and the Environment
 
Smart Grid, Smart Metering and Cybersecurity
Smart Grid, Smart Metering and CybersecuritySmart Grid, Smart Metering and Cybersecurity
Smart Grid, Smart Metering and Cybersecurity
 
Introduction to the OeRC and Environmental ICT projects
Introduction to the OeRC and Environmental ICT projectsIntroduction to the OeRC and Environmental ICT projects
Introduction to the OeRC and Environmental ICT projects
 

Dernier

Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Sérgio Sacani
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
Areesha Ahmad
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
NazaninKarimi6
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
PirithiRaju
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
Areesha Ahmad
 

Dernier (20)

Forensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdfForensic Biology & Its biological significance.pdf
Forensic Biology & Its biological significance.pdf
 
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
High Class Escorts in Hyderabad ₹7.5k Pick Up & Drop With Cash Payment 969456...
 
FAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical ScienceFAIRSpectra - Enabling the FAIRification of Analytical Science
FAIRSpectra - Enabling the FAIRification of Analytical Science
 
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune WaterworldsBiogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
Biogenic Sulfur Gases as Biosignatures on Temperate Sub-Neptune Waterworlds
 
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
Human & Veterinary Respiratory Physilogy_DR.E.Muralinath_Associate Professor....
 
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRLKochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
Kochi ❤CALL GIRL 84099*07087 ❤CALL GIRLS IN Kochi ESCORT SERVICE❤CALL GIRL
 
Introduction to Viruses
Introduction to VirusesIntroduction to Viruses
Introduction to Viruses
 
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptxPSYCHOSOCIAL NEEDS. in nursing II sem pptx
PSYCHOSOCIAL NEEDS. in nursing II sem pptx
 
Zoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdfZoology 5th semester notes( Sumit_yadav).pdf
Zoology 5th semester notes( Sumit_yadav).pdf
 
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
9999266834 Call Girls In Noida Sector 22 (Delhi) Call Girl Service
 
Conjugation, transduction and transformation
Conjugation, transduction and transformationConjugation, transduction and transformation
Conjugation, transduction and transformation
 
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceuticsPulmonary drug delivery system M.pharm -2nd sem P'ceutics
Pulmonary drug delivery system M.pharm -2nd sem P'ceutics
 
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICESAMASTIPUR CALL GIRL 7857803690  LOW PRICE  ESCORT SERVICE
SAMASTIPUR CALL GIRL 7857803690 LOW PRICE ESCORT SERVICE
 
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verifiedSector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
Sector 62, Noida Call girls :8448380779 Model Escorts | 100% verified
 
development of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virusdevelopment of diagnostic enzyme assay to detect leuser virus
development of diagnostic enzyme assay to detect leuser virus
 
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.Molecular markers- RFLP, RAPD, AFLP, SNP etc.
Molecular markers- RFLP, RAPD, AFLP, SNP etc.
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
module for grade 9 for distance learning
module for grade 9 for distance learningmodule for grade 9 for distance learning
module for grade 9 for distance learning
 
Bacterial Identification and Classifications
Bacterial Identification and ClassificationsBacterial Identification and Classifications
Bacterial Identification and Classifications
 

Supporting Research through "Desktop as a Service" models of e-infrastructure access

  • 1. 1 Supporting Research through "Desktop as a Service" models of e-infrastructure access David Wallom
  • 2. Overview • What is e-infrastructure? • Challenges of a changing community • Availability of uniform environments for analysis • Cloud computing and ‘as a Service’
  • 3. What is e-Infrastructure? The integration of digitally-based technology, resources, facilities, and services combined with people and organizational structures needed to support modern, collaborative research (and teaching). 1. Data and Storage 2. Hardware (Compute) 3. Software (and Algorithms) 4. Networks 5. Security and authentication 6. People (Collaboration, Skills, Capacity) 7. The Digital Library RCUK e-Infrastructure Working Group
  • 4. Changes in Earth Observation 13 Tools or Frameworks presented so far @ EO Open Science ‘16
  • 5. What are the issues? • Complex large volume data from multiple sources, access policy adherence • Congested hardware, unsuitable usage mechanisms, changing platforms • Software diversely applied across a community with significant sustainability questions • Limited training @ undergraduate level → large amount of training at further career stages • Disconnect between academic and industrial community members & tooling used
  • 6. A complex geographically and organisationally distributed community with rapidly evolving challenges using diverse tooling…
  • 7. Bioinformatics software challenges • An onslaught of new challenges for bioinformatics: – projects that used to require teams of 500 are now accessible to small (<5) teams – but biology curricula (i.e. biologists) still lack computational skills. – thus biologists are overwhelmed by large amounts of data – furthermore data types are young - so software is young, thus • software may be badly built (by biologists with no formal software dev training/xp). • software needs to be frequently updated (bugfixes, algorithmic improvements (sensitivity/specificity), new data type support). changes everything for biology
  • 8. How to distribute complex community software in a way that ensures we minimise the amount of time learning to use it takes and minimises support requriements?
  • 9. Bio-Linux: A scalable solution • Comprehensive, free bioinformatics workstation based on Ubuntu Linux and Debian Med • 11 years & 8 major releases • >8000 users from 1600 locations • 200+ bioinf packages including big integrative tools :- QIIME, Galaxy Server, PredictProtein, EMBOSS, ...Incorporates all software Dual BootLinux Live Local Servers Cloud
  • 10. Docker, simplifying the portability of applications and services
  • 11. How to simplify the access to resources onto which we can run applications on demand in an environment in which we are familiar?
  • 12. Where do we do work at the moment?
  • 13. External Network inside JASMIN Unmanaged Cloud – IaaS, PaaS, SaaS JASMIN Internal Network Panasas storage Lotus Batch Compute JASMIN Cloud Architecture Standard Remote Access Protocols – ftp, http, … Managed Cloud - PaaS, SaaS JASMIN Analysis Platform VM Project1-org Science Analysis VM 0 Science Analysis VM 0 Science Analysis VM JASMIN Cloud Management Interfaces Direct File System Access Direct access to batch processing cluster Appliance Catalogue Firewall + NAT Firewall optirad-org Science Analysis VM 0 Science Analysis VM 0 IPython Slave VM File Server VM IPython JupyterHub VM eos-cloud-org Science Analysis VM 0 Science Analysis VM 0 EOSCloud VM File Server VM EOSCloud Fat Node IPython Notebook VM with access cluster through IPython.parallel EOSCoud Desktop as a Service with dynamic RAM boost Appliance Catalogue Appliance Catalogue Firewall + NAT Firewall + NAT Firewall Thanks to Phil Kershaw
  • 14. OTHER PUBLIC CLOUD PROVIDERS ARE AVAILABLE
  • 15. Why Cloud? • Data sets can be too big or restricted to easily move – move the compute to the data – Researcher work patterns are maintained • More efficient use of shared resources • Central maintenance of infrastructure • Central Management of data sharing agreements possible • Lower barrier to entry (Compared to traditional HPC and Grid) • What type of cloud? • What role for traditional HPC? TRAINING IS KEY TO MAKING INFORMED CHOICES
  • 16. Bringing together the ideal platform with a uniform software tooling distribution mechanism
  • 17. EOS Cloud • A tenancy in the JASMIN Unmanaged Cloud (& QMUL RCC) • Reusing JASMIN web interfaces and user management to provide custom IaaS software platform • Each receives two VMs – Bio-Linux – Ubuntu Docker hosting environment • Users have total responsibility for instantiated system • Accessible though standard remote desktop tools • But, – utilising single scale of resources would be a waste – Can we scale the users virtual services to take into account demand?
  • 18. Boosting Resource Capabilities • Users VMs operate in native state ‘Standard’ – Enough capability to access stored data – Configure applications and workflows – Free • User may boost his running VM to increased capability – Enough to run analysis applications on useful timescale – Credit consumption only for Boosted instances • Reference datasets available to users through shared storage Name # Core Memory (GB) Cost(Credit/hou r) Standard 1 16 0 Standard+ 2 40 1 Big 8 140 4 Max 16 500 8
  • 19. oSwitch One-line access to other operating systems. • Docker applications though portable can feel extremely alien in their usability • With oSwitch in contrast things feel (largely) unchanged: – Current working directory is maintained. – User name, uid and gid are maintained. – Login shell (bash/zsh/fish) is maintained. – Home directory is maintained (thus all .dotfiles and config files are maintained). – read/write permissions are maintained. – Paths are maintained whenever possible. Thus volumes (external drives, NAS, USB) mounted on the host are available in the container at the same path. https://github.com/yeban/oswitch
  • 20.
  • 21. Pilot Users • Citizen data collection project – Physical samples sent for sequencing to assess microbial diversity – ~200 sites • Creating compute pipelines and containers for each OSD in silico analysis – HPC, Cloud (IaaS & PaaS) – Portable • Run same analysis on different laptops/grids/clouds – Repeatable/Reproducible • Same input gives same output given that reference databases did not change – Preservation • All analysis tools and dependencies are in one image • Images are simple tar.gz • Preserving Docker and base images is preserving all analysis
  • 22. Desktop as a Service for research • Giving researchers an environment they are confident in by changing the infrastructure around them • Location independent persistence of the research environment • Support further key usage models such as teaching or online learning • Gives tool developers a deployment mechanism that already has community visibility
  • 23. Conclusions • Abstract underpinning e-infrastructure services from the users – In many cases not interested in what type of tin they’re running on! – Run something on one resource should be able to be moved to others through the use of standards etc! • Cloud is (obviously) an enabler for research – Allowing flexibility in infrastructure hitherto not possible – User control rather than provider control – Higher level services more easily composed and made accessible through Marketplaces • Creating an ‘EnvLinux’ distribution would create a community wide software suite across all stakeholder groups. – Support a single point of distribution for all community relevant software – Create simple deposit mechanism to allow new tools and services to easily join – Support activities such as teaching that do not generally work well with rapidly changing activities such as research software development
  • 24. Thank you & questions?

Notes de l'éditeur

  1. Software – Mathais slide…
  2. Have we seen this before?
  3. Bio-Linux[1] is a major success with widespread adoption across multiple communities. It provides a method by which personal bioinformatics workstations can be set up and maintained by novice users, removing the need for individuals to compile and install packages themselves. The project has an active user and developer community and engages with other open-source community efforts like Debian Med[2], Galaxy[3], Open Bioinformatics Foundation[4], etc. Workstations running Bio-Linux connect to a package repository where hundreds of tools have been built into Debian (.deb) packages and are tested and continually updated by the community of maintainers. Installation or removal of even complex tools is trivial for the end user, and updates are automatic. Bio-Linux has traditionally been installed onto bare-metal, possibly on a dual-boot workstation, or run directly from a USB stick for training and demo purposes. In recent times, use on local Virtual Machines(VM) and Cloud systems has increased, with a community CloudBioLinux[5] effort building and releasing machine images on the Amazon EC2[6] public cloud. Also, partly through this current project, support for local VM installations (on VirtualBox[7], Parallels[8], VMWare[9]) has been made a core part of each Bio-Linux release. Users can obtain an Open Virtualization Archive (OVA)[10] file which loads directly into these products to instantly provide a functional desktop system. It is this same pre-packaged virtualised workstation that users now access on the EOS Cloud.
  4. Docker[11] allows a uniform platform-independent application to be built and encapsulated in a format know as a container. As the research community, and people in general, make use of a large number of different operating systems the complexity of installing applications on new systems and can be high. Sometimes there may be incompatibilities between the libraries and helper tools needed by an application and those on the target platform. The Docker approach aims to get round this by providing single application hosting platform irrespective of hosting system/operating system etc. We have recently seen within the bioinformatics community an increase in sharing of applications and tools in this manner. One problem faced when using the Docker hosting environment is that the applications that are installed and running within it look and feel like remotely hosted applications, they do not behave as if they have been installed on the users machine. Therefore a tool that bridges this usability gap would greatly increase the acceptability of tools and services that are installed and distributed using Docker.
  5. Hosted through the Centre for Environmental Data Analysis(CEDA) at STFC the JASMIN service[13] is intended to be a multi user and multi use-case facility that is able to support a wide range of user communities. The system is built in a modular fashion with a hardware layer that is deliberately designed to support multiple different distributed computing paradigms, from HPC type workloads to cloud hosting. Utilising commercial off-the-shelf private cloud software, which supports API level access, the cloud service operated in two modes: firstly a highly controlled Platform as a Service system in the managed cloud, and secondly a more flexible community or user controlled Infrastructure as a Service system, the unmanaged cloud. Users may opt to run in the managed mode, where VMs are standardised to JASMIN approved templates and configured by Puppet[14] scripts that apply centralised access controls, or in the unmanaged mode which we use for the EOS Cloud. Here each user can have full root access rights on their VM once it is deployed and handed over to them, and we as clients of JASMIN can make use of the full vCloud Director API[15] to assign resources to VMs according to our own policy. A further advantage of working on JASMIN is that, as a community cloud, the service already hosts, and synchronises data from relevant providers. These include large reference datasets which our target user community finds useful. Users can rely on this resource and only need to concern themselves with the transfer of their own data into and out of the cloud.
  6. The oSwitch application enables seamless switching from one operating system to another - providing access to diverse ranges of tools, whilst ensuring that during their utilisation the environment provided is not too alien to the user community. oSwitch is a wrapper facilitating access to Docker images. Importantly, when switching operating systems inside a shell, most things remain unchanged: Current working directory is maintained User name, uid and gid are maintained Login shell (bash/zsh/fish) is maintained Home directory is maintained (with all .dotfiles and config files). Read/Write permissions are maintained Paths are maintained whenever possible. Thus volumes (external drives, Network Attached Storage, USB) mounted on the host are available in the container at the same paths. The service is designed to sit within the operating system to allow a user to manage their own applications whilst ensuring that the way they interact with their own system is not changed by the running of containerised applications. This also overcomes a major hurdle for systems administrators because it makes it possible for even tools with complex dependencies to be installed entirely by users in user-space, and for bioinformaticians because it dramatically increases the agility with which they can test out and use different tools as part of their analysis processes. We will not have to perform any packaging of any specific tools because docker-packaging is already widely occurring for existing tools (e.g. http://bioboxes.org and see http://hub.docker.com). oSwitch enables access to easily and quickly installed applications with enabled utilisation as if they are installed natively.