2. Why Software is Eating The World
Expression coined by Marc Andreessen, co-founder of Netscape
2011 essay in The Wall Street Journal
http://www.wsj.com/articles/SB10001424053111903480904576512250915629460
3. What Does it Mean?
• Value is moving into software (greater margins)
• Hardware is becoming more commodity than ever (thinner margins)
• Get out of the hardware business!
• Get into the software business!
4. What About Storage?
• Surely this can’t happen to storage?
• Storage needs hardware, right?
• Storage will always be delivered on hardware – won’t it?
9. Google For the Data Centre?
• Google designs for a small number of apps and millions of users
• Redundancy built into the application
• Very read-centric architecture – lots of ability to cache
• Internet-based response times
• Home grown Software Defined Storage
• FREE!!
10. Private Data Centres are different
• Google/Facebook/Backblaze architectures don’t directly translate to the private data centre
• Applications have specific SLAs
– Availability
– Performance
• Downtime = MONEY
11. Enter Software Defined Storage
• What could SDS and storage virtualisation give us?
– Abstract away the hardware issues
– Use abstraction & software to automate provisioning
– Opportunity to reduce costs
12. SDS Defined
• Wikipedia defines SDS as:
Software-defined storage (SDS) is an evolving concept for computer data
storage software to manage policy-based provisioning and management
of data storage independent of the underlying hardware.
• SNIA defines SDS as:
Virtualized storage with a service management interface. SDS includes
pools of storage with data service characteristics that may be applied to
meet the requirements specified through the service management
interface.
SDS as a term evolved from SDN; however, SDN is fundamentally different, as networks don’t have to manage state.
13. SDS Key Features – My Definition
• Abstraction – I/O services should be delivered independently of the underlying hardware, through logical constructs like LUNs, volumes, file shares and repositories.
• Automation – resources should be consumed using CLIs and APIs rather than manually allocated through a GUI.
• Policy/Service Driven – the service received (IOPS, latency) should be established by policies that implement Quality of Service, availability and resiliency.
• Scalable – solutions should enable performance & capacity scaling independently of I/O delivery.
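The automation and policy-driven features above can be sketched in a few lines. This is a toy illustration, not any vendor’s real API – the `SDSController` class, the `gold` policy and every parameter name are made up for the example. The point is that the consumer asks for a service level and the software decides how to deliver it, rather than someone picking hardware in a GUI.

```python
# Hypothetical sketch of policy-driven SDS provisioning.
# All class, policy and field names are illustrative, not a real product API.
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Service levels requested through the service management interface."""
    name: str
    min_iops: int
    max_latency_ms: float
    replicas: int  # availability/resiliency requirement


class SDSController:
    """Toy controller: volumes are created against a policy, never a device."""

    def __init__(self):
        self.policies = {}
        self.volumes = {}

    def register_policy(self, policy: StoragePolicy) -> None:
        self.policies[policy.name] = policy

    def create_volume(self, name: str, size_gb: int, policy_name: str) -> dict:
        # Provisioning is driven by the policy, not by manual hardware choice.
        policy = self.policies[policy_name]
        volume = {
            "name": name,
            "size_gb": size_gb,
            "iops": policy.min_iops,
            "replicas": policy.replicas,
        }
        self.volumes[name] = volume
        return volume


sds = SDSController()
sds.register_policy(StoragePolicy("gold", min_iops=10_000, max_latency_ms=1.0, replicas=3))
vol = sds.create_volume("db01", size_gb=500, policy_name="gold")
print(vol["replicas"])  # 3 – resiliency came from the policy, not the caller
```

In a real SDS product the same request would arrive over a CLI or REST API, which is what makes the provisioning scriptable.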
14. SDS – Commodity Hardware?
• Well, D’oh! Obviously!
• Vendors have been moving away from bespoke hardware for many years
• Intel x86 deployed in EMC Symmetrix in 2009 – VMAX
– CLARiiON, Celerra, Centera all Intel Xeon based in 2010
• Hitachi VSP was x86-based in 2011
• Compellent hardware was all commodity
15. Bespoke Hardware isn’t bad!
• 3PAR has had a custom ASIC since inception and still uses it to accelerate some functions
• SimpliVity has hardware-based de-duplication
• Server vendors (e.g. SuperMicro) are creating storage-specific hardware – multi-node, high-drive-count systems
• Just don’t create a “Homer”…
16. Storage Virtualisation
• Abstracts the physical storage from the user through the use of logical LUNs, volumes and shares
• Provides:
– Mobility – move physical data around without affecting the logical view
– Flexibility – re-use existing resources effectively, extend the life of legacy assets
– Efficiency – use the virtualisation controller to implement data services
– Lower Cost – can be used to reduce the cost of solutions with intelligent design
• Storage Virtualisation is 25 Years Old!
– First Integrated Cached Disk Array (ICDA) introduced by EMC in 1990
– Abstracted 5.25” drives as logical LUNs with RAID mirroring
– We’ve been abstracting data ever since!
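The abstraction and mobility bullets above boil down to a mapping table. Here is a minimal sketch, with made-up device names, of a virtual LUN whose logical blocks map to physical extents: the host addresses only logical blocks, so data can be migrated to new hardware by updating the map, without the logical view ever changing.

```python
# Minimal sketch of storage virtualisation: logical blocks mapped to
# physical extents. Device names ("legacy-array" etc.) are illustrative.
class VirtualLUN:
    def __init__(self, name: str, num_blocks: int, backing_device: str):
        self.name = name
        # Mapping table: logical block -> (device, physical block)
        self.map = {lb: (backing_device, lb) for lb in range(num_blocks)}

    def locate(self, logical_block: int) -> tuple:
        # The host only ever sees logical block numbers.
        return self.map[logical_block]

    def migrate(self, logical_block: int, new_device: str, new_physical_block: int) -> None:
        # Mobility: relocate the data and update the map; the host's
        # logical view is completely unaffected.
        self.map[logical_block] = (new_device, new_physical_block)


lun = VirtualLUN("LUN0", num_blocks=4, backing_device="legacy-array")
before = lun.locate(2)               # ("legacy-array", 2)
lun.migrate(2, "new-flash-array", 17)
after = lun.locate(2)                # ("new-flash-array", 17)
print(before, after)
```

The same indirection is what lets a virtualisation controller sweat legacy assets: old arrays stay in service behind the map while hot data moves to faster media.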
17. Storage Virtualisation - Evolution
• Monolithic – single central controller
– Inline in the data path
– Controller maps logical to “physical” storage
– Data management features built into the controller
– Not highly scalable
• Centralised Metadata – Distributed Data
– Central metadata functions
– Data distributed/replicated across many devices
– System not in the data path
– Separation of control and data planes
• Totally Distributed
– No central metadata or data
– Data distributed across many devices
– Fully scalable architecture
18. What Does SDS & Virtualisation let us do differently?
• Flexibility – lets end users choose the whole configuration, including hardware components
• Agility – easily introduce new features; a simple software upgrade can implement more efficient dedupe or support additional snapshots
• Efficiency – reuse old resources; sweat the assets, move data around to the most effective hardware based on price/performance
• Transparency – now we can see how much markup vendors were making on the hardware
• Deliver Storage as a Service
19. But what’s the negative side?
• Vendors can’t test every hardware combination – sometimes you will be on your own
• Performance profiles will be difficult to gauge unless you test yourself
• There’s little or no shared experience – unlike the analytics an appliance vendor can run across all their customers
• SDS doesn’t necessarily fix issues of data migration, data protection, etc. (although storage virtualisation helps)
20. Is Software Defined Storage for Me?
• Well, it depends…
• Good use cases to dip your toe in the water:
– ROBO/SMB – low-cost branch deployments with virtualisation
– Scale-out archive – object or NAS
– Hyper-convergence
21. Who’s after your business?
• The storage industry is a huge growth area
• New businesses coming to market each year
• So who might be knocking on your door?
• Bring on the NASCAR slide….
22. (Vendor logo slide – the “NASCAR slide”)
23. So, is SDS Fact or Fiction?
• SDS is FACT
• It is here today and has been with us for some time
• The hard part now is deciding which vendor is right for you and cutting through the marketing hype
Much-quoted work by Marc Andreessen that highlights how the value of IT is moving into the software being developed, rather than innovations in hardware.
There is a perception that more margin can be made from software – after all, it’s virtual, there’s nothing to ship, you just download and go – which means the cost of manufacture and delivery is nearly zero. By comparison, hardware takes time to develop and engineer, and needs field support and maintenance. Components are commodity and can be bought cheaply by any vendor looking to build hardware solutions. Margins as a result are low.
The issues with storage are the usual suspects we’ve seen over the last 30 years – growth, the effort to manage it, the cost, being able to be agile and deliver on demand, and being flexible to (internal) customer requirements.
The hyperscalers store petabytes of data per year – surely they have cracked the storage problem?
2009 CNET post – http://www.cnet.com/news/google-uncloaks-once-secret-server-10209580/ – pretty much commodity hardware: standard motherboard, dual disks, memory and PSU. It also has a 12-volt battery for backup – no UPS! However, this is a single-application company.
Google designs infrastructure for a different purpose to the typical enterprise; they run a small number of apps for millions of mostly read-only users. All of the data sits behind the public Internet, which can mask poor response times. Most of all, the service is free – no SLAs, no guarantees against lost data, etc. Think also of how that applies to Gmail – no real compensation for loss of your data…