© 2012 VMware Inc. All rights reserved
Confidential
Hadoop-as-a-Service
CXO Big Data Seminar
September 26, 2012
2 Confidential
Agenda
• VMware Data Portfolio
• Big Data and Virtualization Trends
• Enterprise Hadoop Needs
• Virtualized Hadoop for the Enterprise
• Summary
3 Confidential
Trends Driving Change in Enterprise IT
Cloud
• Offered “as-a-Service”
• Virtualization
New Application Types
• Mobile, SaaS, social
• Apps released early and often
Frameworks
• New application frameworks driving
• Increase in application development
Data Disruption
• Web orientation drives exponential data volumes
• Reduced latency and new types of data
4 Confidential
The Database is Being Stretched
Cloud Delivery
• Virtualized
• Offered “as-a-Service”
Big Data
• Petabytes vs. gigabytes
• Democratize BI
Flexible Data
• Multi-structured data
• Developer productivity
Fast Data
• Global access patterns
• Mobile app proliferation
5 Confidential
Big, Fast and Flexible Data
[Diagram: VMware data portfolio on a cloud delivery model ("Data as a service for private and public clouds"). Labels: Big (Big Data Processing, Big Data Analytics; Serengeti); Fast (OLTP workloads, analytic workloads); Flexible; data models: OSS Relational, Document, Object, Key / Value; products shown: vPostgres, GemFire.]
7 Confidential
Data is exploding & Hadoop is driving growth
Unstructured data driving growth
[Chart: structured vs. unstructured data volume, 2011 to 2020]
Complex unstructured data is forecast to outpace structured relational data by 10x by 2020
Hadoop adoption is ramping
[Pie chart: Hadoop adoption status]
Evaluating: 53%
In production: 23%
Piloting: 18%
Testing: 2%
Don't know: 2%
Other: 2%
Source: Forrester Survey of 60 CIOs, September 2011
• Unstructured data explosion and Hadoop capabilities causing CIOs to
reconsider Enterprise data strategy
• Gartner predicts +800% data growth over next 5 years
• Hadoop’s ability to process raw data at cost presents intriguing value prop for CIOs
8 Confidential
Broad Application of Hadoop Technology
Horizontal Use Cases
• Log Processing / Click Stream Analytics
• Machine Learning / sophisticated data mining
• Web crawling / text processing
• Extract Transform Load (ETL) replacement
• Image / XML message processing
• General archiving / compliance
Vertical Use Cases
• Financial Services
• Mobile / Telecom
• Internet Retailer
• Scientific Research
• Pharmaceutical / Drug Discovery
• Social Media
Hadoop’s ability to handle large unstructured data affordably and efficiently makes it a valuable tool kit for enterprises across a number of applications and fields.
9 Confidential
The Future of Virtualization
[Diagram labels: VDC; Software-defined Datacenter Services]
                                  2008    2012         Future
Time to provision new services    Weeks   Days/Hours   Minutes/Seconds
Workloads virtualized             25%     60%+         >90%
10 Confidential
Virtualization enables a Common Infrastructure for Big Data
Single-purpose clusters for various business applications lead to cluster sprawl.
Virtualization Platform
• Simplify
  - Single hardware infrastructure
  - Unified operations
• Optimize
  - Shared resources = higher utilization
  - Elastic resources = faster on-demand access
[Diagram: cluster sprawl (separate MPP DB, Hadoop, and HBase clusters) vs. cluster consolidation (MPP DB, Hadoop, and HBase on one virtualization platform)]
12 Confidential
Hadoop Users
Data scientists, analysts, developers
• Line of business users
• Intimate with data and analysis, not IT
• Tasked with providing actionable intelligence that impacts the business
Concerns
• Obtain a Hadoop cluster on demand
• Minimize time to insight
• Require reasonable performance from Hadoop cluster
13 Confidential
The IT Guy
Admins, architects, CIO
• Responsible for technology infrastructure, compliance, budget management
• Evaluates new technologies and recommends best practices
Concerns
• Keeping up with demands of the business
• Cost savings and consolidation
• Reliability
• Complexity of running and tuning Hadoop clusters
• Shortage of skills to do the above
14 Confidential
Hadoop Journey in Enterprises
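[Diagram: stages of enterprise Hadoop adoption along a cluster-size axis marked 0, 20, and 300 nodes; stage labels include "Integrated" and "Scale"]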
16 Confidential
Why Virtualize Hadoop?
Elasticity & Multi-tenancy
• Shrink and expand cluster on demand
• Independent scaling of compute and data
• Strong multi-tenancy
High Availability
• High availability for entire Hadoop stack
• One click to set up
• Battle-tested
Operational Simplicity
• Rapid deployment
• One-stop command center
• Easy to configure/reconfigure
17 Confidential
Project Serengeti
• Open source project launched in June 2012
• Toolkit that leverages virtualization to simplify Hadoop deployment and operations
• To learn more, visit projectserengeti.org
With Serengeti:
• Deploy a Hadoop cluster in 10 minutes
• Customize your Hadoop cluster
• Use your favorite Hadoop distribution
• One-stop command center
18 Confidential
Rapid Deployment of a Hadoop Cluster with Serengeti
Step 1: Deploy the Serengeti virtual appliance on vSphere.
Step 2: Run a few simple commands to stand up a Hadoop cluster.
Done.
19 Confidential
A Walk Through Serengeti
20 Confidential
A Walk Through Serengeti
21 Confidential
A Walk Through Serengeti
Scaling out a cluster
Advanced cluster creation
22 Confidential
Customizing Your Hadoop Cluster
• Choice of distros
• Storage configuration
  - Choice of shared storage or local disk
• Resource configuration
• High availability option
• Number of nodes
• The same spec file is also used to tune Hadoop configuration
Cluster spec file (excerpt):
…
"distro": "apache",
"groups": [
  { "name": "master",
    "roles": [
      "hadoop_namenode",
      "hadoop_jobtracker"],
    "storage": {
      "type": "SHARED",
      "sizeGB": 20},
    "instanceType": "MEDIUM",
    "instanceNum": 1,
    "haFlag": "on"},
  { "name": "worker",
    "roles": [
      "hadoop_datanode",
      "hadoop_tasktracker"
    ],
    "instanceType": "SMALL",
    "instanceNum": 5,
    "haFlag": "off"
…
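A spec like the excerpt above can be assembled and sanity-checked in code before it is handed to a deployment tool. The following is a minimal sketch, assuming only the field names visible on the slide (`distro`, `groups`, `roles`, `storage`, `instanceType`, `instanceNum`, `haFlag`). The helpers `make_group` and `total_nodes` are hypothetical names introduced for illustration, and the fields elided on the slide are left out rather than guessed.

```python
import json


def make_group(name, roles, instance_type, instance_num, ha_flag, storage=None):
    """Hypothetical helper: build one node group using only keys shown on the slide."""
    group = {
        "name": name,
        "roles": list(roles),
        "instanceType": instance_type,
        "instanceNum": instance_num,
        "haFlag": ha_flag,
    }
    if storage is not None:
        group["storage"] = storage
    return group


def total_nodes(spec):
    """Sum the instance counts across all node groups."""
    return sum(group["instanceNum"] for group in spec["groups"])


spec = {
    "distro": "apache",
    "groups": [
        make_group(
            "master",
            ["hadoop_namenode", "hadoop_jobtracker"],
            "MEDIUM",
            1,
            "on",
            storage={"type": "SHARED", "sizeGB": 20},
        ),
        make_group(
            "worker",
            ["hadoop_datanode", "hadoop_tasktracker"],
            "SMALL",
            5,
            "off",
        ),
    ],
}

# Serialize to JSON so the result can be compared with the slide's excerpt.
spec_json = json.dumps(spec, indent=2)
print(spec_json)
print("total nodes:", total_nodes(spec))
```

Building the spec from a function rather than by hand keeps quoting consistent, which matters because the excerpt on the slide mixes quote styles that a JSON parser would reject.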
23 Confidential
Freedom of Choice and Open Source
[Diagram labels: Community, Projects, Distributions]
• Flexibility to choose from major distributions
• Support for multiple projects (work in progress)
• Open architecture to welcome industry participation
• Contributing Hadoop Virtualization Extensions (HVE) to the open source community
24 Confidential
Use Local Disk where it’s Needed
What $1M buys, by storage type:
Storage type     Cost per gigabyte   Capacity        IOPS      Bandwidth
SAN Storage      $2 - $10            0.5 petabytes   200,000   8 Gbytes/sec
NAS Filers       $1 - $5             1 petabyte      200,000   10 Gbytes/sec
Local Storage    $0.05               10 petabytes    400,000   250 Gbytes/sec
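The capacity figures on this slide can be cross-checked with simple division. The sketch below is illustrative only: it assumes decimal units (1 petabyte = 1,000,000 gigabytes) and treats the quoted prices as the whole cost, and the function name `raw_petabytes` is hypothetical. Under those assumptions the SAN and NAS figures match the low end of their price ranges, while the raw quotient for local storage comes out at 20 petabytes rather than the 10 petabytes quoted; the slide does not explain the difference.

```python
GB_PER_PB = 1_000_000  # assumption: decimal units, 1 PB = 1,000,000 GB
BUDGET_USD = 1_000_000


def raw_petabytes(budget_usd, usd_per_gb):
    """Raw capacity a budget buys at a given price per gigabyte."""
    return budget_usd / usd_per_gb / GB_PER_PB


# (low, high) price per gigabyte, taken from the slide.
price_ranges = {
    "SAN Storage": (2.00, 10.00),
    "NAS Filers": (1.00, 5.00),
    "Local Storage": (0.05, 0.05),
}

# Map each storage type to (capacity at high price, capacity at low price).
capacity = {
    name: (raw_petabytes(BUDGET_USD, high), raw_petabytes(BUDGET_USD, low))
    for name, (low, high) in price_ranges.items()
}

for name, (worst, best) in capacity.items():
    print(f"{name}: {worst:.1f} to {best:.1f} PB raw for ${BUDGET_USD:,}")
```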
25 Confidential
Virtual Storage Architecture Includes Local Disk
• Shared Storage: SAN or NAS
  - Easy to provision
  - Automated cluster rebalancing
  - Leverage high availability protection
• Local Storage: Local Disks
  - Local disk for Hadoop
  - Scalable bandwidth, lower cost/GB
[Diagram: hosts running Hadoop VMs alongside other VMs, attached to shared storage and to local storage]
26 Confidential
Hadoop Runs Well on Virtualization
[Bar chart: elapsed time in seconds (lower is better; axis 0 to 450) for TeraGen, TeraSort, and TeraValidate, comparing Native, 1 VM, 2 VMs, and 4 VMs]
Source: http://www.vmware.com/files/pdf/techpaper/VMW-Hadoop-Performance-vSphere5.pdf
28 Confidential
High Availability for the Hadoop Stack
[Diagram: the Hadoop stack. HDFS (Hadoop Distributed File System); HBase (key-value store); MapReduce (job scheduling/execution system); Pig (data flow); Hive (SQL); HCatalog; ZooKeeper (coordination); BI reporting; ETL tools; RDBMS. Server components: NameNode, JobTracker, Hive MetaDB, HCatalog MDB, management server.]
HA for the Hadoop stack is more than NameNode HA
29 Confidential
vMotion Reduces Planned Downtime
Description:
Enables the live migration of virtual
machines from one host to another
with continuous service availability.
Benefits:
• Revolutionary technology that is the
basis for automated virtual machine
movement
• Meets service level and performance
goals
30 Confidential
Hadoop Aware HA - Protection Against Unplanned Downtime
Overview
• Protection against host and VM failures
• Added application-aware HA for the Hadoop NameNode (NN) and JobTracker (JT), protecting against NN and JT failures
• Automatic failure detection and virtual machine restart within minutes, on any available host in the cluster
• In-progress Hadoop jobs pause and then resume once the NameNode is back up
31 Confidential
vSphere Fault Tolerance Provides Continuous Protection
[Diagram: virtual machines (App/OS) on two VMware ESX hosts, protected by FT and HA when a host fails]
Overview
• Single identical VMs running in lockstep on separate hosts
• Zero downtime, zero data loss failover for all virtual machines in case of hardware failures
• Integrated with VMware HA/DRS
• No complex clustering or specialized hardware required
• Single common mechanism for all applications and operating systems
Zero downtime for NameNode, JobTracker and other components in Hadoop clusters
32 Confidential
Achieve HA for the Entire Hadoop Stack
[Diagram: the Hadoop stack and its server components (NameNode, JobTracker, Hive MetaDB, HCatalog MDB, management server), repeated from the earlier high-availability slide]
• Battle-tested high availability technology
• Single mechanism to achieve HA for the entire Hadoop stack
• One click to enable HA and/or FT
34 Confidential
Evolution of Hadoop on VMs
Current Hadoop: combined storage/compute
Hadoop in VM
- VM lifecycle determined by DataNode
- Limited elasticity
- Limited to Hadoop multi-tenancy
Separate Storage
- Separate compute from data
- Elastic compute
- Enable shared workloads
- Raise utilization
Separate Compute Clusters
- Separate virtual clusters per tenant
- Stronger VM-grade security and resource isolation
- Enable deployment of multiple Hadoop runtime versions
[Diagram: slave node VMs with storage and compute combined, then split into separate storage and compute VMs (T1, T2)]
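The point of separating compute from data is that the compute tier can grow and shrink while the VMs holding HDFS data stay put. The sketch below is a toy model of that idea, not Serengeti's implementation: the class `VirtualHadoopCluster`, its method `resize_compute`, and the node counts are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VirtualHadoopCluster:
    """Toy model: data and compute tiers are sized independently."""

    data_nodes: int     # VMs holding HDFS blocks; their lifecycle follows the data
    compute_nodes: int  # VMs running tasks only; can be added or removed freely

    def resize_compute(self, target):
        """Change only the compute tier; the data tier is never touched."""
        if target < 0:
            raise ValueError("compute node count cannot be negative")
        self.compute_nodes = target
        return self


# Hypothetical numbers: burst the compute tier for a large job, then shrink it.
cluster = VirtualHadoopCluster(data_nodes=5, compute_nodes=5)
cluster.resize_compute(20)
print(cluster)
cluster.resize_compute(2)
print(cluster)
```

In the combined "Hadoop in VM" model, by contrast, removing a VM would also remove a DataNode, which is why the slide describes its elasticity as limited.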
35 Confidential
In-house Hadoop as a Service: “Enterprise EMR” (Hadoop + Hadoop)
[Diagram: on a virtualization platform spanning six hosts, a compute layer with separate clusters for ad hoc data mining, a production recommendation engine, and production ETL of log files, above an HDFS data layer]
36 Confidential
Integrated Big Data Production (Hadoop + other big data)
[Diagram: on a virtualization platform spanning six hosts, a compute layer with Hadoop batch analysis, HBase real-time queries, a NoSQL key-value store (Cassandra), and an MPP DBMS for analysis of structured data, above an HDFS data layer]
37 Confidential
Integrated Hadoop and Webapps (Hadoop + Other Workloads)
[Diagram: on a virtualization platform spanning six hosts, a compute layer with a short-lived Hadoop compute cluster, web servers for an ecommerce site, and a Hadoop compute cluster, above an HDFS data layer]
39 Confidential
Simple, Reliable, Elastic Hadoop on Demand
Elasticity & Multi-tenancy
• Shrink and expand cluster on demand
• Independent scaling of compute and data
• Strong multi-tenancy
High Availability
• High availability for entire Hadoop stack
• One click to set up
• Battle-tested
Operational Simplicity
• Rapid deployment
• One-stop command center
• Easy to configure/reconfigure
Hadoop-as-a-Service (Enterprise Grade EMR)
40 Confidential
Virtualization Benefits across Hadoop Maturity Spectrum
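[Diagram: virtualization benefits mapped to stages of Hadoop adoption along a cluster-size axis marked 0, 20, and 300 nodes; stage labels include "Integrated" and "Scale"]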
41 Confidential
Serengeti Resources
• Download and try Serengeti
  - projectserengeti.org
• VMware Hadoop site
  - vmware.com/hadoop
• Hadoop performance on vSphere
  - vmware.com/files/pdf/VMW-Hadoop-Performance-vSphere5.pdf
• Hadoop High Availability solution
  - vmware.com/files/pdf/Apache-Hadoop-VMware-HA-solution.pdf
42 Confidential
Thank You!

Transform Your Business with Big Data and Hortonworks Transform Your Business with Big Data and Hortonworks
Transform Your Business with Big Data and Hortonworks
 
Hadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the expertsHadoop in the Cloud - The what, why and how from the experts
Hadoop in the Cloud - The what, why and how from the experts
 
Hadoop and NoSQL joining forces by Dale Kim of MapR
Hadoop and NoSQL joining forces by Dale Kim of MapRHadoop and NoSQL joining forces by Dale Kim of MapR
Hadoop and NoSQL joining forces by Dale Kim of MapR
 

VMware Serengeti - Based on Infochimps Ironfan

  • 1. © 2012 VMware Inc. All rights reserved Confidential Hadoop-as-a-Service CXO Big Data Seminar September 26, 2012
  • 2. Agenda: VMware Data Portfolio; Big Data and Virtualization Trends; Enterprise Hadoop Needs; Virtualized Hadoop for the Enterprise; Summary
  • 3. Trends Driving Change in Enterprise IT. Cloud: offered “as-a-Service”; virtualization. New Application Types: mobile, SaaS, social; apps released early and often. Frameworks: new application frameworks driving an increase in application development. Data Disruption: web orientation drives exponential data volumes; reduced latency and new types of data.
  • 4. The Database is Being Stretched. Cloud Delivery: virtualized; offered “as-a-Service”. Big Data: petabytes vs. gigabytes; democratize BI. Flexible Data: multi-structured data; developer productivity. Fast Data: global access patterns; mobile app proliferation.
  • 5. Big, Fast and Flexible Data [diagram; recoverable labels: Flexible; Big (Big Data Processing, Big Data Analytics; Serengeti); Fast (OLTP workloads, analytic workloads); OSS Relational, Document, Object, Key/Value; GemFire, vPostgres]. Cloud Delivery Model: data as a service for private and public clouds.
  • 6. Agenda: VMware Data Portfolio; Big Data and Virtualization Trends; Enterprise Hadoop Needs; Virtualized Hadoop for the Enterprise; Summary
  • 7. Data is exploding & Hadoop is driving growth. Unstructured data driving growth [chart: structured vs. unstructured data, 2011 to 2020]: complex unstructured data is forecasted to outpace structured relational data by 10x by 2020. Hadoop adoption is ramping [chart]: Evaluating 53%, In production 23%, Piloting 18%, Testing 2%, Don't know 2%, Other 2% (Source: Forrester Survey of 60 CIOs, September 2011). Unstructured data explosion and Hadoop capabilities are causing CIOs to reconsider enterprise data strategy. Gartner predicts +800% data growth over the next 5 years. Hadoop’s ability to process raw data at cost presents an intriguing value proposition for CIOs.
  • 8. Broad Application of Hadoop technology. Horizontal Use Cases: log processing / click stream analytics; machine learning / sophisticated data mining; web crawling / text processing; Extract Transform Load (ETL) replacement; image / XML message processing; general archiving / compliance. Vertical Use Cases: financial services; mobile / telecom; internet retailer; scientific research; pharmaceutical / drug discovery; social media. Hadoop’s ability to handle large unstructured data affordably and efficiently makes it a valuable tool kit for enterprises across a number of applications and fields.
  • 9. The Future of Virtualization [chart; stage labels: VDC, Software-defined Datacenter, Services]. Time to provision new services: weeks (2008); days/hours (2012); minutes/seconds (future). Workloads virtualized: 25% (2008); 60%+ (2012); >90% (future).
  • 10. Virtualization enables a Common Infrastructure for Big Data. Single-purpose clusters for various business applications lead to cluster sprawl. Simplify: single hardware infrastructure; unified operations. Optimize: shared resources = higher utilization; elastic resources = faster on-demand access. [Diagram: cluster sprawl (separate MPP DB, Hadoop, and HBase clusters) vs. cluster consolidation (MPP DB, Hadoop, and HBase on one virtualization platform)]
  • 11. Agenda: VMware Data Portfolio; Big Data and Virtualization Trends; Enterprise Hadoop Needs; Virtualized Hadoop for the Enterprise; Summary
  • 12. Hadoop Users: data scientists, analysts, developers. Line-of-business users; intimate with data and analysis, not IT; tasked with providing actionable intelligence that impacts the business. Concerns: obtain a Hadoop cluster on demand; minimize time to insight; require reasonable performance from the Hadoop cluster.
  • 13. The IT Guy: admins, architects, CIO. Responsible for technology infrastructure, compliance, and budget management; evaluates new technologies and recommends best practices. Concerns: keeping up with the demands of the business; cost savings and consolidation; reliability; complexity of running and tuning Hadoop clusters; shortage of skills to do the above.
  • 14. Hadoop Journey in Enterprises [chart; recoverable labels: 0 node, 20, 300; Integrated; Scale]
  • 15. Agenda: VMware Data Portfolio; Big Data and Virtualization Trends; Enterprise Hadoop Needs; Virtualized Hadoop for the Enterprise; Summary
  • 16. Why Virtualize Hadoop? Elasticity & Multi-tenancy: shrink and expand the cluster on demand; independent scaling of compute and data; strong multi-tenancy. High Availability: high availability for the entire Hadoop stack; one click to set up; battle-tested. Operational Simplicity: rapid deployment; one-stop command center; easy to configure/reconfigure.
  • 17. Project Serengeti. Open source project launched in June 2012. Toolkit that leverages virtualization to simplify Hadoop deployment and operations. To learn more, see projectserengeti.org. Highlights: deploy a Hadoop cluster in 10 minutes; customize the Hadoop cluster; use your favorite Hadoop distribution; one-stop command center.
  • 18. Rapid Deployment of a Hadoop Cluster with Serengeti. Step 1: Deploy the Serengeti virtual appliance on vSphere. Step 2: Run a few simple commands to stand up a Hadoop cluster. Done.
  • 19. A Walk Through Serengeti
  • 20. A Walk Through Serengeti
  • 21. A Walk Through Serengeti: scaling out a cluster; advanced cluster creation
  • 22. Customizing Your Hadoop Cluster. Choice of distros. Storage configuration: choice of shared storage or local disk. Resource configuration. High availability option. Number of nodes. Also used to tune Hadoop config. Example spec fragment: … "distro": "apache", "groups": [ { "name": "master", "roles": [ "hadoop_namenode", "hadoop_jobtracker" ], "storage": { "type": "SHARED", "sizeGB": 20 }, "instanceType": "MEDIUM", "instanceNum": 1, "haFlag": "on" }, { "name": "worker", "roles": [ "hadoop_datanode", "hadoop_tasktracker" ], "instanceType": "SMALL", "instanceNum": 5, "haFlag": "off" …
  • 23. Freedom of Choice and Open Source. Distributions: flexibility to choose from major distributions. Community Projects: support for multiple projects (work in progress). Open architecture to welcome industry participation. Contributing Hadoop Virtualization Extensions (HVE) to the open source community.
  • 24. Use Local Disk where it’s Needed. SAN Storage: $2 - $10/gigabyte; $1M gets 0.5 petabytes, 200,000 IOPS, 8 Gbyte/sec. NAS Filers: $1 - $5/gigabyte; $1M gets 1 petabyte, 200,000 IOPS, 10 Gbyte/sec. Local Storage: $0.05/gigabyte; $1M gets 10 petabytes, 400,000 IOPS, 250 Gbyte/sec.
  • 25. Virtual Storage Architecture Includes Local Disk. Shared Storage (SAN or NAS): easy to provision; automated cluster rebalancing; leverage high availability protection. Local Storage (local disks): local disk for Hadoop; scalable bandwidth, lower cost/GB. [Diagram: hosts running Hadoop VMs alongside other VMs, backed by shared storage and local storage]
  • 26. Hadoop Runs Well on Virtualization [chart: elapsed time in seconds (lower is better) for TeraGen, TeraSort, and TeraValidate; series: Native, 1 VM, 2 VMs, 4 VMs]. Source: http://www.vmware.com/files/pdf/techpaper/VMW-Hadoop-Performance-vSphere5.pdf
  • 27. Why Virtualize Hadoop? Elasticity & Multi-tenancy: shrink and expand the cluster on demand; independent scaling of compute and data; strong multi-tenancy. High Availability: high availability for the entire Hadoop stack; one click to set up; battle-tested. Operational Simplicity: rapid deployment; one-stop command center; easy to configure/reconfigure.
  • 28. High Availability for the Hadoop Stack [stack diagram; labels: HDFS (Hadoop Distributed File System); HBase (key-value store); MapReduce (job scheduling/execution system); Pig (data flow); Hive (SQL); BI reporting; ETL tools; management server; ZooKeeper (coordination); HCatalog; RDBMS; NameNode; JobTracker; Hive MetaDB; HCatalog MDB; server]. HA for the Hadoop stack is more than NameNode HA.
  • 29. vMotion Reduces Planned Downtime. Description: enables the live migration of virtual machines from one host to another with continuous service availability. Benefits: revolutionary technology that is the basis for automated virtual machine movement; meets service level and performance goals.
  • 30. Hadoop Aware HA - Protection Against Unplanned Downtime. Overview: protection against host and VM failures; added application-aware HA for the Hadoop NameNode (NN) and JobTracker (JT), protecting against NN and JT failures; automatic failure detection and restart of the virtual machine in minutes, on any available host in the cluster; in-progress Hadoop jobs pause and resume when the NameNode is back up.
  • 31. vSphere Fault Tolerance Provides Continuous Protection [diagram: VMs (App/OS) on two VMware ESX hosts, protected by FT and HA]. Overview: single identical VMs running in lockstep on separate hosts; zero downtime, zero data loss failover for all virtual machines in case of hardware failures; integrated with VMware HA/DRS; no complex clustering or specialized hardware required; single common mechanism for all applications and operating systems. Zero downtime for the NameNode, JobTracker, and other components in Hadoop clusters.
  • 32. Achieve HA for the Entire Hadoop Stack [stack diagram; labels: HDFS (Hadoop Distributed File System); HBase (key-value store); MapReduce (job scheduling/execution system); Pig (data flow); Hive (SQL); BI reporting; ETL tools; management server; ZooKeeper (coordination); HCatalog; RDBMS; NameNode; JobTracker; Hive MetaDB; HCatalog MDB; server]. Battle-tested high availability technology. Single mechanism to achieve HA for the entire Hadoop stack. One click to enable HA and/or FT.
  • 33. Why Virtualize Hadoop? Elasticity & Multi-tenancy: shrink and expand the cluster on demand; independent scaling of compute and data; strong multi-tenancy. High Availability: high availability for the entire Hadoop stack; one click to set up; battle-tested. Operational Simplicity: rapid deployment; one-stop command center; easy to configure/reconfigure.
  • 34. Evolution of Hadoop on VMs [diagram; labels: Storage, Compute, T1, T2, VM, Slave Node]. Current Hadoop, combined storage/compute (Hadoop in VM): VM lifecycle determined by the DataNode; limited elasticity; limited to Hadoop multi-tenancy. Separate Storage: separate compute from data; elastic compute; enable shared workloads; raise utilization. Separate Compute Clusters: separate virtual clusters per tenant; stronger VM-grade security and resource isolation; enable deployment of multiple Hadoop runtime versions.
  • 35. In-house Hadoop as a Service: “Enterprise EMR” (Hadoop + Hadoop). [Diagram: a compute layer running ad hoc data mining, a production recommendation engine, and production ETL of log files; an HDFS data layer; hosts on a virtualization platform]
  • 36. Integrated Big Data Production (Hadoop + other big data). [Diagram: a compute layer running Hadoop batch analysis, HBase real-time queries, a NoSQL Cassandra key-value store, and an MPP DBMS for analysis of structured data; an HDFS data layer; hosts on a virtualization platform]
  • 37. Integrated Hadoop and Webapps (Hadoop + other workloads). [Diagram: a compute layer running a short-lived Hadoop compute cluster, web servers for an ecommerce site, and a Hadoop compute cluster; an HDFS data layer; hosts on a virtualization platform]
  • 38. Agenda: VMware Data Portfolio; Big Data and Virtualization Trends; Enterprise Hadoop Needs; Virtualized Hadoop for the Enterprise; Summary
  • 39. Simple, Reliable, Elastic Hadoop on Demand: Hadoop-as-a-Service (Enterprise Grade EMR). Elasticity & Multi-tenancy: shrink and expand the cluster on demand; independent scaling of compute and data; strong multi-tenancy. High Availability: high availability for the entire Hadoop stack; one click to set up; battle-tested. Operational Simplicity: rapid deployment; one-stop command center; easy to configure/reconfigure.
  • 40. Virtualization Benefits across Hadoop Maturity Spectrum [chart; recoverable labels: 0 node, 20, 300; Integrated; Scale]
  • 41. Serengeti Resources. Download and try Serengeti: projectserengeti.org. VMware Hadoop site: vmware.com/hadoop. Hadoop performance on vSphere: vmware.com/files/pdf/VMW-Hadoop-Performance-vSphere5.pdf. Hadoop High Availability solution: vmware.com/files/pdf/Apache-Hadoop-VMware-HA-solution.pdf
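
The cluster customization fragment on slide 22 can be easier to read as a structured object. The sketch below copies only the fields visible on the slide (the slide elides the rest, and so does this example); the `check_spec` and `total_nodes` helpers and their rules are hypothetical illustrations, not Serengeti's own validation.

```python
import json

# Field names and values are copied from the slide-22 fragment.
# The worker group's storage settings are not shown on the slide, so they are omitted.
CLUSTER_SPEC = {
    "distro": "apache",
    "groups": [
        {
            "name": "master",
            "roles": ["hadoop_namenode", "hadoop_jobtracker"],
            "storage": {"type": "SHARED", "sizeGB": 20},
            "instanceType": "MEDIUM",
            "instanceNum": 1,
            "haFlag": "on",
        },
        {
            "name": "worker",
            "roles": ["hadoop_datanode", "hadoop_tasktracker"],
            "instanceType": "SMALL",
            "instanceNum": 5,
            "haFlag": "off",
        },
    ],
}


def check_spec(spec):
    """Return a list of problems found in a spec dict (empty if none).

    These checks are assumptions made for illustration only.
    """
    problems = []
    for group in spec.get("groups", []):
        name = group.get("name", "<unnamed>")
        if not group.get("roles"):
            problems.append(f"{name}: no roles listed")
        num = group.get("instanceNum")
        if not isinstance(num, int) or num < 1:
            problems.append(f"{name}: instanceNum must be a positive integer")
        if group.get("haFlag") not in ("on", "off"):
            problems.append(f"{name}: haFlag must be 'on' or 'off'")
    return problems


def total_nodes(spec):
    """Sum instanceNum across all node groups."""
    return sum(group["instanceNum"] for group in spec["groups"])


if __name__ == "__main__":
    print(json.dumps(CLUSTER_SPEC, indent=2))
    print("problems:", check_spec(CLUSTER_SPEC))
    print("total nodes:", total_nodes(CLUSTER_SPEC))
```

Note that the slide's original fragment mixes single and curly quotes; valid JSON requires straight double quotes around every string, which `json.dumps` produces.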
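
The storage comparison on slide 24 is simple arithmetic: capacity is budget divided by price per gigabyte. The sketch below reproduces that division using the slide's quoted prices, assuming decimal units (1 PB = 1,000,000 GB); the function names are made up for this example. Plain division gives 20 PB for local storage at $0.05/GB, whereas the slide quotes 10 PB. The slide does not explain the gap, so the slide's figure may include costs that this calculation ignores.

```python
BUDGET_USD = 1_000_000
GB_PER_PB = 1_000_000  # assumption: decimal units

# (low, high) price per gigabyte in USD, as quoted on slide 24
PRICE_PER_GB = {
    "SAN": (2.00, 10.00),
    "NAS": (1.00, 5.00),
    "Local": (0.05, 0.05),
}


def capacity_pb(budget_usd, price_per_gb):
    """Raw capacity in petabytes that a budget buys at a given price per GB."""
    return budget_usd / price_per_gb / GB_PER_PB


def capacity_range_pb(tier):
    """Return (worst case, best case) capacity in PB for a storage tier."""
    low_price, high_price = PRICE_PER_GB[tier]
    return capacity_pb(BUDGET_USD, high_price), capacity_pb(BUDGET_USD, low_price)


if __name__ == "__main__":
    for tier in PRICE_PER_GB:
        worst, best = capacity_range_pb(tier)
        print(f"{tier}: {worst:.2f} to {best:.2f} PB per $1M")
```

For SAN and NAS, the slide's "$1M gets" figures (0.5 PB and 1 PB) match the best case of each price range.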