Introduction and agenda
Ops benefits
Tech benefits
Architecture
Use cases
Demo video
Hybrid data model
Current directions
Q&A
Supplementals
Adobe is a Big Data company.
Virtualizing Hadoop has both business and technical justifications for Adobe and allows competitive differentiation.
Analytics is a core competency of DMBU.
Rapid provisioning: Much of the cluster deployment process can be automated using existing tools.
High availability: The virtualization platform can provide HA protection for the single points of failure in the Hadoop system.
Elasticity: Hadoop capacity can be scaled up and down on demand in a virtual environment.
Multi-tenancy: Different tenants running Hadoop can be isolated in separate VMs, providing stronger VM-grade resource and security isolation.
Operational Simplicity
Rapid Deployment
Self service tools
Performance
Maximize Resource Utilization
True multi-tenancy
Elastic scaling
Avoid dedicated hardware
VM-based isolation
Increase resource utilization
Architect Scalable Platform
Deployment choice
Maintain management flexibility at scale
Control Costs
Leverage toolsets
Security
Expecting a lot of questions on this one, and it is the halfway point, so this is a good time for an intermediate Q&A if Chris wants to discuss some of the physical design. We can defer questions on use cases and workflows since those follow immediately.
Prod and dev review
Video walkthrough of vCAC deployment and auto-discovery via Cloudera Manager
Hybrid storage model to get the best of both worlds
Or for flexibility
Master nodes:
NameNode, JobTracker on shared storage
Leverage vSphere vMotion, HA and FT
Slave nodes
TaskTracker, DataNode on local storage
Lower cost, scalable bandwidth
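The hybrid placement above can be sketched as a simple role-to-storage mapping. This is a minimal illustration, not the deployed tooling: the names ROLE_STORAGE, storage_for and ha_eligible are hypothetical, and the sketch assumes that only VMs on shared storage are candidates for vSphere HA and FT protection.

```python
# Hypothetical sketch of the hybrid storage model: master roles on shared
# storage, slave roles on local storage. All names here are illustrative.
SHARED = "shared"
LOCAL = "local"

ROLE_STORAGE = {
    "NameNode": SHARED,     # master role
    "JobTracker": SHARED,   # master role
    "TaskTracker": LOCAL,   # slave role
    "DataNode": LOCAL,      # slave role
}


def storage_for(roles):
    """Return the storage tier for a VM hosting the given Hadoop roles."""
    tiers = {ROLE_STORAGE[role] for role in roles}
    if len(tiers) != 1:
        # A VM mixing master and slave roles does not fit the hybrid model.
        raise ValueError(f"roles span multiple storage tiers: {sorted(roles)}")
    return tiers.pop()


def ha_eligible(roles):
    """Assumption: only VMs on shared storage are protected by HA/FT."""
    return storage_for(roles) == SHARED
```

For example, storage_for(["NameNode", "JobTracker"]) returns "shared", while a VM that mixes NameNode and DataNode is rejected because it would straddle both tiers.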
Define the acronyms DMBU and vCAC first.
Integration with Adobe DMBU Private Cloud: IaaS environment leveraging VMware stack (vCAC + vCOPs + vCenter).
HDFS Storage Integration: The storage team currently manages >10PB of data on Isilon and presents this layer, via HDFS, to multiple product teams as a single view.
Service Blueprints in vCAC: Offering multiple blueprints for various cluster types and sizes within vCAC. Present these blueprints to the Service Catalog and our internal self-provisioning portal.
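One way to picture a catalog of blueprints by cluster type and size is the sketch below. It does not call any vCAC API; the ClusterBlueprint class, the blueprint names and the node counts are all hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterBlueprint:
    """Hypothetical description of one Hadoop cluster blueprint."""
    name: str
    cluster_type: str   # e.g. "dev" or "prod"
    master_nodes: int
    worker_nodes: int

    @property
    def total_vms(self):
        return self.master_nodes + self.worker_nodes


# Illustrative catalog entries; the sizes are made up for the example.
CATALOG = [
    ClusterBlueprint("dev-small", "dev", master_nodes=1, worker_nodes=3),
    ClusterBlueprint("prod-medium", "prod", master_nodes=2, worker_nodes=10),
]


def blueprints_for(cluster_type, catalog=CATALOG):
    """Return the blueprints a self-service portal would list for one type."""
    return [bp for bp in catalog if bp.cluster_type == cluster_type]
```

A self-provisioning portal could then filter the catalog by type, for example blueprints_for("dev"), before presenting the choices to a product team.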