October 2013 "Beyond the Genome" presentation slides. Talk is mostly focused on issues around IaaS cloud usage for "Bio-IT" and life science informatics & scientific computing.
PDF SLIDES AVAILABLE DIRECTLY - PLEASE EMAIL "CHRIS@BIOTEAM.NET" FOR SLIDES
Bio-IT & Cloud Sobriety: 2013 Beyond The Genome Meeting
1. Bio-IT & Cloud Sobriety
Beyond the Genome, San Francisco 2013
Thursday, October 3, 13
2. 2
1. Intro & Terminology: Getting our buzzwords straight
2. The ‘Meta’ Issue: What is driving all of this?
3. Drivers For Cloud Adoption In Bio-IT
4. What The Cloud Salespeople Will Not Tell You
5. Private Clouds & Practical Advice
6. The Road Ahead
3. 3
I’m Chris.
I’m an infrastructure geek.
I work for the BioTeam.
Twitter: @chris_dag
4. Who, What, Why ...
BioTeam
‣ Independent consulting shop
‣ Staffed by scientists forced to
learn IT, SW & HPC to get our
own research done
‣ 10+ years bridging the “gap”
between science, IT & high
performance computing
‣ Our wide-ranging work is what
gets us invited to speak at
events like this ...
5. Seriously.
Listen to me at your own risk
‣ Clever people find multiple
solutions to common issues
‣ I’m fairly blunt, burnt-out
and cynical in my advanced
age
‣ Significant portion of my
work has been done in
demanding production
Biotech & Pharma
environments
‣ Filter my words accordingly
7. 7
Defining Terms
‣ The term ‘cloud computing’ is almost meaning-
free today – too many marketers have fuzzed
and co-opted the term
‣ Before serious discussion can occur it is
essential that all parties are operating from
similar baseline presumptions
8. Gartner
Defining Terms
‣ Gartner:
• “Cloud computing is a style of computing where
scalable and elastic IT-enabled capabilities are
delivered as a service to external customers using
Internet technologies.”
9. 9
My preferred definition
‣ Jinesh Varia on Amazon Web Services:
• “… a highly reliable and scalable infrastructure for
deploying web-scale solutions, with minimal support
and administration costs, and more flexibility than
you’ve come to expect from your own infrastructure,
either on-premise or at a datacenter facility.”
10. I’m an infrastructure geek, which do you think I prefer?
Cloud Subtypes
‣ Software as a Service
(SaaS)
‣ Platform as a Service
(PaaS)
‣ Infrastructure as a Service
(IaaS)
11. 11
This is an IaaS cloud talk
‣ We need flexible scientific computing and
informatics capability “on the cloud”
‣ Service and Platform clouds are not a good fit
for the flexible/general use case
‣ IaaS clouds provide “building blocks” that allow
us to build the informatics environments we
require
29. Non-Trivial HPC on the cloud
16 of AWS’s biggest servers + 22 GPU nodes
... at a cost of $30/hour via Spot Market
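For a sense of scale, the figures on this slide work out to well under a dollar per node-hour. A back-of-the-envelope sketch, assuming the $30/hour covers all 38 nodes (16 large servers plus 22 GPU nodes) and that the spot price holds steady, neither of which the slide states. The 8-hour run length is purely hypothetical:

```python
def per_node_hourly_cost(total_hourly_usd: float, node_count: int) -> float:
    """Average hourly cost of one node, ignoring price differences between instance types."""
    return total_hourly_usd / node_count


def run_cost(total_hourly_usd: float, hours: float) -> float:
    """Total cost of keeping the whole cluster up for `hours`."""
    return total_hourly_usd * hours


NODES = 16 + 22              # big servers + GPU nodes, from the slide
CLUSTER_HOURLY_USD = 30.0    # spot price for the whole cluster, from the slide
HYPOTHETICAL_RUN_HOURS = 8   # assumption, not from the talk

avg = per_node_hourly_cost(CLUSTER_HOURLY_USD, NODES)
print(f"{NODES} nodes, about ${avg:.2f} per node-hour")
print(f"{HYPOTHETICAL_RUN_HOURS}-hour run: ${run_cost(CLUSTER_HOURLY_USD, HYPOTHETICAL_RUN_HOURS):.2f}")
```

Spot prices float with demand, so treat any number like this as a snapshot rather than a budget.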
30. Why this work was ‘easy’ on Amazon AWS ...
Difficult on any other cloud
‣ Let’s discuss why this simulation workload
would be much, much harder to do on some
other cloud platform ...
31. Why this work was ‘easy’ on Amazon AWS ...
Nightmare on any other cloud
Brand ‘X’ Cloud
1. Virtual Servers
2. Block Storage
3. Object Storage
4. ... and maybe some other stuff if I’m lucky
Amazon
‣ EC2, S3, EBS, RDS, SNS,
SQS, SWF, GPUs, SSDs,
CloudFormation, VPC,
ENIs, Security Groups,
10GbE, Direct Connect,
Reserved Instances,
Import/Export, Spot Market
‣ And ~30 other products
and service features with
more added monthly
32. Easy on AWS; much harder elsewhere
One very specific example
‣ The widely used
FLEXlm license server
uses NIC MAC
addresses when
generating license keys
‣ Different MAC? Science
stops. Screwed.
‣ VPC ENIs decouple the
MAC address from the
server instance. Badass.
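The mechanism is easy to see in a toy model. The sketch below is neither FLEXlm nor the AWS API. `NetworkInterface`, `Server`, `license_valid` and the MAC addresses are hypothetical names and values, used only to illustrate why a MAC address that travels with a detachable interface lets a license server be rebuilt without invalidating its keys:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NetworkInterface:
    """Stand-in for a detachable interface (such as a VPC ENI) that owns its MAC."""
    mac: str


@dataclass
class Server:
    name: str
    nic: Optional[NetworkInterface] = None


def license_valid(licensed_mac: str, host: Server) -> bool:
    """A MAC-locked license only validates on a host presenting the licensed MAC."""
    return host.nic is not None and host.nic.mac == licensed_mac


# Hypothetical MAC addresses, for illustration only.
LICENSED_MAC = "02:00:00:aa:bb:cc"
portable_nic = NetworkInterface(mac=LICENSED_MAC)
old_host = Server("license-server-1", nic=portable_nic)

# A replacement server with its own fresh interface presents a different MAC.
naive_replacement = Server("license-server-2", nic=NetworkInterface("02:00:00:11:22:33"))
naive_ok = license_valid(LICENSED_MAC, naive_replacement)

# Detaching the original interface and attaching it to the new server
# carries the licensed MAC along with it.
old_host.nic = None
moved_replacement = Server("license-server-2", nic=portable_nic)
moved_ok = license_valid(LICENSED_MAC, moved_replacement)

print("fresh interface:", naive_ok)
print("moved interface:", moved_ok)
```

On a platform where the MAC is tied to the virtual server itself, only the first outcome is available, which is the "science stops" case.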
35. 35
Big Picture / Meta Issue
‣ HUGE revolution in the rate at which
lab platforms are being redesigned,
improved & refreshed
• Example: CCD sensor upgrade on that
confocal microscopy rig just doubled
storage requirements
• Example: The 2D ultrasound imager is
now a 3D imager
• Example: Illumina HiSeq upgrade just
doubled the rate at which you can acquire
genomes. Massive downstream increase
in storage, compute & data movement
needs
‣ For the above examples, do you
think IT was informed in advance?
36. Science progressing way faster than IT can refresh/change
The Central Problem Is ...
‣ Instrumentation & protocols are changing FAR
FASTER than we can refresh our Research-IT &
Scientific Computing infrastructure
• Bench science is changing month-to-month ...
• ... while our IT infrastructure only gets refreshed every
2-7 years
‣ We have to design systems TODAY that can
support unknown research requirements &
workflows over many years (gulp ...)
37. The Central Problem Is ...
‣ The easy period is over
‣ 5 years ago we could toss
inexpensive storage and
servers at the problem;
even in a nearby closet or
under a lab bench if
necessary
‣ That does not work any
more; real solutions
required
38. And a related problem ...
‣ It has never been easier to
acquire vast amounts of data
cheaply and easily
‣ Growth rate of data creation/
ingest exceeds rate at which
the storage industry is
improving disk capacity
‣ Not just a storage lifecycle
problem. This data *moves*
and often needs to be shared
among multiple entities and
providers
• ... ideally without punching holes in
your firewall or consuming all
available internet bandwidth
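The gap between data growth and disk capacity growth compounds, which is why it bites harder every year. A minimal sketch with made-up rates: the talk gives no numbers, so `DATA_GROWTH` and `CAPACITY_GROWTH` below are hypothetical placeholders to be replaced with local measurements:

```python
def relative_drive_count(years: int, data_growth: float, capacity_growth: float) -> float:
    """Drives needed after `years`, relative to today, when data volume and
    per-drive capacity each compound by the given yearly factor."""
    return (data_growth / capacity_growth) ** years


# Hypothetical yearly growth factors, NOT taken from the talk.
DATA_GROWTH = 2.0       # data volume doubles each year
CAPACITY_GROWTH = 1.4   # per-drive capacity improves 40% each year

for year in range(6):
    ratio = relative_drive_count(year, DATA_GROWTH, CAPACITY_GROWTH)
    print(f"year {year}: {ratio:.1f}x the drives needed today")
```

Whenever the first factor exceeds the second, the drive count (and with it power, floor space and admin effort) grows without bound, whatever the exact rates are.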
39. If we get it wrong ...
‣ Lost opportunity
‣ Missing capability
‣ Beaten by the competition
‣ Frustrated & very vocal scientific staff
‣ Problems in recruiting, retention,
publication & product development
42. Mainstream in life science for quite some time ...
Public IaaS Clouds
‣ Public infrastructure clouds offer
excellent “pressure release valve”
when rapidly changing scientific
requirements can’t be satisfied by
on-premise infrastructure
‣ Economics can’t be ignored
‣ Popular meeting ground for data
swapping and collaboration
‣ ‘Scriptable Datacenters’ enabling
entirely new capabilities
‣ Money people like converting
CapEx to OpEx
43. The ‘neutral’ meeting ground ...
Cloud Hubs & Portals
‣ Many types of entities need
to meet, collaborate and
exchange life science data
‣ Data sharing hubs and
portals becoming popular on
public IaaS clouds like AWS
‣ Why?
• Far easier than punching holes in
your firewall and issuing VPN
credentials to outsiders
44. Compelling economics
Cloud Data Repositories
‣ IaaS clouds becoming ‘center of
gravity’ for some large scale
scientific data hosting
‣ Why?
• Compelling pricing
• No need to own & operate mirror sites
• AWS has some very interesting
‘downloader pays’ models (S3 calls
this Requester Pays) that seem
to be a good fit for grant-funded
science with mandated multi-year
data accessibility requirements
www.1000genomes.org
45. My $.02
Amazon vs. Everyone Else
‣ AWS is the clear leader for Bio-IT IaaS cloud use
‣ Why?
• By far the largest number of IaaS building blocks
• Rate of innovation puts AWS years ahead of competition
‣ Exceptions
• For specific high-value pipelines & workstreams, Google
& Microsoft are valid alternatives
47. What the salesfolk won’t tell you ...
‣ There is no one-size-fits-all
research design pattern ...
‣ You are not going to toss everything
and replace it with “Big Data”
‣ Very few of us have a single pipeline
or workflow that we can devote
endless engineering effort to
‣ We are not going to toss out
hundreds of legacy codes and
rewrite everything for GPUs or
MapReduce
‣ For research HPC it’s all about the
building blocks { and how we can
effectively use/deploy them }
48. 48
What the salesfolk won’t tell you
‣ Your organization actually needs THREE
tested cloud design patterns:
‣ (1) To handle ‘legacy’ scientific apps &
workflows
‣ (2) The special stuff that is worth re-architecting
‣ (3) Hadoop & big data analytics
49. Legacy HPC on the Cloud
Design Pattern #1 - Legacy
‣ There are many hundreds of
existing algorithms and
applications in the life
science informatics space
‣ We’ll be running/using these
codes for years to come
‣ Many can’t or will never be
refactored or rewritten
‣ I call this the “legacy”
design pattern
51. StarCluster
Design Pattern #1 - Legacy
‣ MIT StarCluster
• http://web.mit.edu/star/cluster/
‣ Infinite Awesomeness. Worth a talk by itself.
‣ This is your baseline
‣ Extend as needed
52. 52
Design Pattern #2 - “Cloudy”
‣ Some of our research workflows are important
enough to be rewritten for “the cloud” and the
advantages that a truly elastic & API-driven
infrastructure can deliver
‣ This is where you have the most freedom
‣ Many published best practices you can borrow
‣ Warning: Cloud vendor lock-in potential is
strongest here
53. 53
Design Pattern #3 - Hadoop/BigData
‣ Hadoop and “big data” need to be on your
radar
‣ Be careful though, you’ll need a gas mask to
avoid the smog of marketing and vapid hype
‣ The utility is real and this does represent one
“future path” for analysis of large data sets
54. 54
Design Pattern #3 - Hadoop/BigData
‣ It’s gonna be a MapReduce world, get used to it
‣ Little need to roll your own Hadoop in 2013
‣ ISV & commercial ecosystem already healthy
‣ Multiple providers today; both onsite & cloud-
based
‣ Often a slam-dunk cloud use case
55. What you need to know
Design Pattern #3 - Hadoop/BigData
‣ “Hadoop” and “Big Data” are now general
terms
‣ You need to drill down to find out what people
actually mean
‣ We are still in the period where senior
leadership may demand “Hadoop” or “BigData”
capability without any actual business or
scientific need
56. What you need to know
Hadoop & “Big Data”
‣ In broad terms you can break “Big Data” down into two
very basic use cases:
1. Compute: Hadoop can be used as a very powerful
platform for the analysis of very large data sets. The
Google search term here is “map reduce”
2. Data Stores: Hadoop is driving the development of very
sophisticated “NoSQL”, non-relational databases and
data query engines. The Google search terms include
“nosql”, “couchdb”, “hive”, “pig” & “mongodb”, etc.
‣ Your job is to figure out which type applies for the
groups requesting “Hadoop” or “BigData” capability
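For the compute use case, it helps to see how small the map reduce programming model really is. The sketch below is a single-process illustration in plain Python, not the Hadoop API. The function names and the two reads are made up for the example, and k-mer counting is just one convenient life-science-flavored workload:

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_kmers(read: str, k: int) -> Iterator[Tuple[str, int]]:
    """Map step: emit (k-mer, 1) for every k-length window in one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle step: group the emitted values by key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce step: sum the values collected for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def count_kmers(reads: Iterable[str], k: int) -> Dict[str, int]:
    pairs = (pair for read in reads for pair in map_kmers(read, k))
    return reduce_counts(shuffle(pairs))


# Made-up reads for illustration.
reads = ["ACGTAC", "GTACGT"]
print(count_kmers(reads, 3))
```

A real framework runs the map and reduce steps on many machines and does the shuffle over the network. That is where the value is, and also where the operational cost lives.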
57. What you need to know
Hadoop & “Big Data”
‣ Hadoop use in our field is being driven by a small
group of academics writing and releasing open source
life science Hadoop applications
‣ Your people will want to run these codes
‣ In some academic environments you may find
people wanting to develop on this platform
60. 60
Private Clouds: Only 60% BS in ’13
‣ I’m known as a private cloud cynic
‣ The hype::usefulness ratio is still extreme
‣ For vendors it’s still a play to get you to toss
everything in your datacenter and ‘start fresh’
‣ However ...
61. 61
Private Clouds: Make sense for ...
‣ If you are a globe-spanning enterprise with tens
of thousands of employees or “customers”
‣ If you want to leverage hardcore DevOps for
serious infrastructure automation and
configuration management
‣ If you want to use Private Cloud to drive fresh
new tech like object storage and software
defined networking (SDN) into your
environment
62. 62
Private Clouds: However ...
‣ My $.02 is that the two primary science-facing benefits
from Cloud are:
1. Browsable catalogs of available server images
2. Self-service (Scientists can select & provision systems)
‣ And guess what? You can do that TODAY on most
enterprise virtualization stacks WITHOUT jumping on
the private cloud bandwagon
‣ My advice:
• Think hard about what you hope to gain from private clouds and
do some extra due-diligence to see if you can gain those
capabilities in a simpler and cheaper way
64. Design Patterns
Practical Advice
‣ Remember the three design patterns on the
cloud:
• Legacy HPC systems
(replicate traditional clusters in the cloud)
• Hadoop
• Cloudy
(when you rewrite something to fully leverage cloud
capability)
65. Policies and Procedures
Practical Advice
‣ Cloud technology bits are easy. Cloud Process
and Policy discussions take forever
‣ Start these conversations sooner rather than
later!
66. Core services that take time and advance planning
Practical Advice
‣ A few key cloud services take time and
advance planning to deploy properly:
‣ VPNs & subnet schemes
‣ Identity Management & Access Control
‣ Data Movement
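Subnet schemes are a good example of planning that is cheap up front and painful to change later. A minimal sketch using Python's standard `ipaddress` module. The address blocks, tier names and the /20 subnet size are hypothetical and should be swapped for ranges agreed with your network team:

```python
import ipaddress

# Hypothetical address blocks and tier names, for illustration only.
CLOUD_BLOCK = ipaddress.ip_network("10.42.0.0/16")
ON_PREMISE = ipaddress.ip_network("10.0.0.0/16")
TIERS = ["public", "compute", "storage", "management"]


def plan_subnets(block, names, new_prefix=20):
    """Assign one equally sized subnet per name, in address order."""
    subnets = block.subnets(new_prefix=new_prefix)
    return {name: next(subnets) for name in names}


plan = plan_subnets(CLOUD_BLOCK, TIERS)
for name, net in plan.items():
    print(f"{name:<12}{net}  ({net.num_addresses} addresses)")

# Cloud ranges that overlap on-premise ranges are a common reason
# VPN routing breaks later, so check before anything is deployed.
no_overlap = not CLOUD_BLOCK.overlaps(ON_PREMISE)
print("safe to route over VPN:", no_overlap)
```

Writing the plan down as data also gives the policy discussions something concrete to review.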
68. 68
Physical Ingest Just Plain Nasty
‣ Easy to talk about in theory
‣ Seems “easy” to scientists
and even IT at first glance
‣ Really really nasty in practice
• Incredibly time consuming
• Significant operational burden
• Easy to do badly / lose data
69. And huge need for fast(er) research networks!
Huge Need For Network Ingest
1. Public data repositories have
petabytes of useful data
2. Collaborators still need to
swap data in serious ways
3. Amazon becoming an
important repo of public and
private sources
4. Many vendors now “deliver”
to the cloud
76. Network vs. Physical
Cloud Data Movement
‣ With a 1GbE internet connection ...
‣ and using Aspera software ...
‣ We sustained 700 Mb/sec for more than 7 hours
freighting genomes into Amazon Web Services
‣ This is fast enough for many use cases,
including genome sequencing core facilities*
‣ Chris Dwan’s webinar on this topic:
http://biote.am/7e
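To put the transfer result in perspective: a 1GbE link carries at most 125 megabytes per second, so the sketch below reads the sustained figure as 700 megabits per second. That reading is an assumption, as is the use of decimal units throughout:

```python
def terabytes_moved(rate_megabits_per_s: float, hours: float) -> float:
    """Decimal terabytes transferred at a sustained rate (1 TB = 10**12 bytes)."""
    bytes_per_s = rate_megabits_per_s * 1_000_000 / 8
    return bytes_per_s * hours * 3600 / 1e12


LINK_MEGABITS = 1000       # 1GbE line rate
SUSTAINED_MEGABITS = 700   # assumption: the slide's figure read as megabits per second
HOURS = 7

print(f"link utilisation: {SUSTAINED_MEGABITS / LINK_MEGABITS:.0%}")
print(f"moved in {HOURS} hours: {terabytes_moved(SUSTAINED_MEGABITS, HOURS):.1f} TB")
```

Swap in your own link speed and data volumes to see whether network ingest is viable before committing to shipping disks.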
77. Network vs. Physical
Cloud Data Movement
‣ Results like this mean we now favor network-
based data movement over physical media
movement
‣ Large-scale physical data movement carries a
high operational burden and consumes non-
trivial staff time & resources
78. There are three ways to do network data movement ...
Cloud Data Movement
1. Buy software from Aspera and be done with it
2. Attend the annual SuperComputing conference
& see which student group wins the bandwidth
challenge contest; use their code
3. Get GridFTP from the Globus folks
81. Some final thoughts
Future Trends & Patterns
‣ Compute continues to become easier
‣ Data movement (physical & network) gets harder
‣ The cloud decision may be made by
where your data actually resides
‣ Cost of storage will be dwarfed by “cost of
managing stored data”
‣ We can see end-of-life for our current IT
architecture and design patterns; new patterns
will start to appear over next 2-5 years
82. Very blurry lines in 2013 for all of these roles
Scientist/SysAdmin/Programmer
‣ Cloud is forcing these issues ...
‣ Far more control is going into
the hands of the research end
user
‣ IT support roles will radically
change -- no longer owners or
gatekeepers
‣ IT will handle policies,
procedures, reference patterns,
security & best practices
‣ Researchers will control the
“what”, “when” and “how big”