Why The Cloud Is A Computational Biologist's Best Friend


  1. Amazon Cloud: A Religious Experience (Yannick Pouliot, 2/23/2012)
  2. Amazon Cloud services in a nutshell: highly flexible storage and compute power sold on a per-use basis
  3. Why the Cloud?
     • Complete flexibility of computing power and storage
     • Grow or diminish as needed
     • Arbitrary number of machines
     • Ridiculously powerful machines made affordable on a short-lease basis to address a particular task (e.g., 15B ANOVAs)
     • Unusual architectures (e.g., GPUs)
  4. There Are Many Cloud Providers… but Amazon is the clear leader, IMO
  5. Q: What does working with a Cloud machine feel like?
     A: It’s not materially different from accessing a machine on our cluster, except you can do anything you want
  6. Main Services Provided by Amazon Cloud
     • Storage
       ▫ Traditional disk volumes
       ▫ S3 buckets (“Simple Storage Service”)
     • Computing (EC2, “Elastic Compute Cloud”)
       ▫ Single machine instances
       ▫ Clusters of various types
     • Machine types
       ▫ Compute servers
       ▫ Database servers
       ▫ Cluster
       ▫ Specialized architectures
       ▫ Variety of operating systems (Linux flavors, Windows)
  7. Types of Instances
     • Based on the definition of the virtual machine
       ▫ I/O bus
       ▫ Number of CPUs
       ▫ Memory
       ▫ Type of CPU, cluster
     • Deployment: Spot market vs. “Reserved”
  8. Costs
     • You pay for (almost) everything you do
       ▫ Data transfers (out)
       ▫ Storage
       ▫ CPU cycles (depends on instance type; one instance is free)
     • Can purchase cycles at below-average market price
       ▫ Can provide access to vast amounts of computing power at a price you can afford
     • Research grants from Amazon
  9. Controlling Your Services
     • Web-based console
     • Command-line tools
       ▫ EC2 API tools
     • Third-party systems: RightScale
  10. Using & Distributing Instances
      • You can always make images of your instances for later use/backup
      • Images can be made public
      • You can launch other people’s images (i.e., public images), e.g.,
        ▫ CloudBioLinux: pre-made biocomputational instances
        ▫ Galaxy Cloud: pre-made Cluster-based Galaxy instance (Web-based, no less)
        ▫ PathSeq: pre-made comprehensive bowtie engine that uses Hadoop
  11. Issues
      • Security
        ▫ Lots of it
      • Data transfers
        ▫ Free for upload; $ for download
        ▫ No big deal, so far
        ▫ Can send drives…
      • Latency
        ▫ No big deal
      • Small “ephemeral” storage
        ▫ Gotcha if you don’t know
      • Max 1 terabyte per disk
        ▫ Hum…
      • “Max” 20 disks per instance
        ▫ Can be circumvented
      • No sharing of disks between instances, usually
  12. Support
      • Unless you purchase support, you’re on your own
      • Hasn’t been an issue for me, though finding a solution can consume time…
      Support options:
  13. Questions?
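
The disk limits listed under Issues (at most 1 terabyte per disk, nominally 20 disks per instance) suggest a quick capacity check before provisioning storage. A minimal Python sketch follows. It assumes the limits quoted in the slides, which date from 2012 and may no longer hold, and the function names `disks_needed` and `fits_on_one_instance` are hypothetical, not part of any Amazon tool.

```python
import math

# Limits as quoted on the "Issues" slide (2012). Treat them as assumptions,
# not as current Amazon values.
MAX_TB_PER_DISK = 1.0
MAX_DISKS_PER_INSTANCE = 20


def disks_needed(dataset_tb, max_tb_per_disk=MAX_TB_PER_DISK):
    """Return how many disk volumes are needed to hold dataset_tb terabytes."""
    if dataset_tb < 0:
        raise ValueError("dataset size must be non-negative")
    return math.ceil(dataset_tb / max_tb_per_disk)


def fits_on_one_instance(dataset_tb, max_disks=MAX_DISKS_PER_INSTANCE):
    """True if the dataset fits within the nominal per-instance disk limit."""
    return disks_needed(dataset_tb) <= max_disks
```

For example, `disks_needed(2.5)` rounds up to 3 volumes, and a 20.5 TB dataset exceeds the nominal 20-disk limit, which the slides note can be circumvented.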

Editor's notes

  • Blue = services I’ve used
  • Describe I/O
  • Mention cost calculator: http://calculator.s3.amazonaws.com/calc5.html
  • Go to security menu
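
The billing model on the Costs slide (charges for compute time, storage, and outbound data transfer, with uploads free) can be sketched as a small estimator. This is an illustration only: every rate below is a hypothetical placeholder, not an Amazon price, and the names `Rates` and `estimate_monthly_cost` are invented for this example. The cost calculator mentioned in the notes is the place to get real figures.

```python
from dataclasses import dataclass


@dataclass
class Rates:
    # Hypothetical placeholder rates in USD; NOT actual Amazon pricing.
    per_instance_hour: float = 0.50
    per_gb_month_storage: float = 0.10
    per_gb_transfer_out: float = 0.12


def estimate_monthly_cost(instance_hours, storage_gb, transfer_out_gb, rates=None):
    """Estimate one month's bill from the three charged items on the slide.

    Inbound transfers are not a parameter because the slides describe
    uploads as free.
    """
    rates = rates or Rates()
    compute = instance_hours * rates.per_instance_hour
    storage = storage_gb * rates.per_gb_month_storage
    transfer = transfer_out_gb * rates.per_gb_transfer_out
    return round(compute + storage + transfer, 2)
```

With the placeholder rates, 100 instance-hours, 500 GB stored, and 50 GB downloaded come to 50 + 50 + 6 = 106 USD. Swapping in a different `Rates` object lets you compare instance types or a spot-market price against a reserved one.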
