3. Pivotal Confidential–Internal Use Only
Pivotal HD Architecture
[Architecture diagram]
Apache components: HDFS, HBase, Pig, Hive, Mahout, MapReduce, Sqoop, Flume, YARN, Zookeeper (resource management & workflow)
Pivotal HD added value (Pivotal HD Enterprise): Command Center (configure, deploy, monitor, manage), Hadoop Virtualization (HVE), Data Loader, and HAWQ – Advanced Database Services (Xtension Framework, Catalog Services, Query Optimizer, Dynamic Pipelining, ANSI SQL + Analytics)
4.
Pivotal HD Components
• HDFS – The Hadoop Distributed File System acts as the storage layer for Hadoop
• MapReduce – Parallel processing framework used for data computation in Hadoop
• Hive – Structured data-warehouse implementation for data in HDFS that provides a SQL-like interface to Hadoop
• Pig – High-level procedural language for data pipeline/data flow processing in Hadoop
• HBase – NoSQL, key-value data store on top of HDFS
• Mahout – Library of scalable machine-learning algorithms
• Spring Hadoop – Integrates the Spring framework into Hadoop
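The MapReduce model listed above can be sketched in miniature: a map step emits key/value pairs and a reduce step aggregates them by key. This is a plain-Python illustration of the concept, not Hadoop's actual API:

```python
from collections import defaultdict

def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reduce_phase(pairs):
    """Reduce: sum the counts emitted for each distinct word."""
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["Hadoop stores data in HDFS", "MapReduce processes data in Hadoop"]
counts = reduce_phase(map_phase(lines))  # e.g. counts["hadoop"] == 2
```

In real Hadoop the map and reduce phases run in parallel across the cluster, with a shuffle step grouping pairs by key between them; the data flow is the same as in this toy version.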
5.
Pivotal HD Value-Added Components
GPHD includes:
• Installation and Configuration Manager (ICM) – cluster installation, upgrade, and expansion tools.
• GP Command Center – visual interface for cluster health, system metrics, and job monitoring.
• Hadoop Virtualization Extension (HVE) – enhances Hadoop to support virtual node awareness and enables greater cluster elasticity.
• GP Data Loader – parallel loading infrastructure that supports “line speed” data loading into HDFS.
• Isilon Integration – extensively tested at scale, with guidelines for compute-heavy, storage-heavy, and balanced configurations.
Pivotal HD adds the following to GPHD:
• Advanced Database Services (HAWQ) – high-performance, “true SQL” query interface running within the Hadoop cluster.
• Extensions Framework (GPXF) – support for HAWQ interfaces on external data providers (HBase, Avro, etc.).
• Advanced Analytics Functions (MADlib) – parallelized machine-learning and data-mining functions at scale.
6.
Pivotal Core Components & Versions

Component       GPHD 1.2 Core Distribution   Pivotal HD Enterprise
Hadoop          1.0.3                        2.0.2
HBase           0.92.1                       0.94.2
Hive            0.8.1                        0.9.1
Mahout          0.6                          0.8.0
Pig             0.9.2                        0.10.0
Zookeeper       3.3.5                        3.4.5
Flume           1.2.0                        1.3.1
Sqoop           1.4.1                        1.4.2
Spring Hadoop   –                            1.0.0
7.
Data Loader
[Diagram] Data Loader sits between data sources and HDFS, moving data via streams (push/pull), connectors, and Flume.
• Supported sources: files (HDFS, NFS, local), HTTP, FTP
• Core functions: data source registration, data destination registration, copy strategy optimization, data copy job management, data processing
• Interfaces: web GUI, CLI, and REST APIs
8.
Command Center
Simple and complete cluster management: configure, deploy, monitor, manage, analyze
• Install and configure Hadoop components and services
• Centralized interface for Pivotal HD cluster monitoring, diagnostics, and management
• Live and historical Hadoop system metrics analysis
9.
Command Center – Monitor, Manage, and Analyze
• Host-, application-, and job-level monitoring of performance across the entire Pivotal HD cluster
• Visualize and analyze live and historical Hadoop cluster information through the Command Center dashboard
• Quick diagnosis of functional or performance issues
10.
Hadoop Virtualization Extensions (HVE)
• HVE enables Hadoop to support more effective virtual deployments
• This creates the opportunity to provision and scale the compute and storage processes independently, resulting in:
• Much better resource utilization
• Improved resource allocation and consumption
• Support for multi-tenancy
11.
HAWQ Delivers
SQL compliant
World-class query optimizer
Interactive query
Horizontal scalability
Robust data management
Common Hadoop formats
Deep analytics
12.
Xtension Framework
• An advanced version of GPDB external tables
• Enables combining HAWQ data and Hadoop data in a single query
• Supports connectors for HDFS, HBase, and Hive
• Provides an extensible framework API to enable custom connector development for other data sources
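A custom connector for the framework essentially has to answer two questions: what splittable pieces does the source have, and how are rows read from each piece. The sketch below is a hypothetical illustration of that shape; the names are not the actual GPXF/Xtension Framework API:

```python
from abc import ABC, abstractmethod
from typing import Iterator

class Connector(ABC):
    """Hypothetical sketch of an Xtension-style connector interface."""

    @abstractmethod
    def fragments(self) -> list:
        """Return the splittable units (files, regions, ...) of the source."""

    @abstractmethod
    def read(self, fragment) -> Iterator[tuple]:
        """Yield rows from one fragment for the query engine to scan."""

class InMemoryConnector(Connector):
    """Toy connector over a dict mapping fragment names to row lists."""
    def __init__(self, tables):
        self.tables = tables
    def fragments(self):
        return list(self.tables)
    def read(self, fragment):
        yield from self.tables[fragment]

conn = InMemoryConnector({"part1": [(1, "a")], "part2": [(2, "b")]})
rows = [row for frag in conn.fragments() for row in conn.read(frag)]
```

Splitting the source into fragments is what lets HAWQ scan an external source in parallel, one fragment per worker.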
Editor's notes
Start with basic HD and then comment about the addition of a true SQL interface
Copy strategies:
Uniform – uniformly distribute copy tasks between workers to maximize throughput
Data Locality Driven – for HDFS/local-disk sources, the job planner assigns copy tasks to the local/closest worker node: when deployed on the source, reads go to the local worker; when deployed on the destination HDFS, writes go to local nodes. The job planner gets locality information from the NameNode. Patches to the HDFS schedulers add:
HDFS rack awareness to reduce inter-rack traffic
Local disk awareness to assign read/write MapReduce tasks to workers local to the data
Dynamic – used for loading large numbers of small files; assigns small tasks to workers and re-assigns them at runtime
Connection Limited – limits the number of read connections to the source FTP/HTTP server
Intelligent – chooses the correct copy strategy based on source type and data
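The "Intelligent" selection described above could be sketched as a simple decision function. The rules and thresholds here are assumptions drawn only from these notes, not Data Loader's actual logic:

```python
def choose_strategy(source_type, file_count, avg_file_size_mb):
    """Pick a copy strategy from the source type and data shape.
    Thresholds are illustrative assumptions, not Data Loader's real ones."""
    if source_type in ("ftp", "http"):
        # Remote servers tolerate only a limited number of read connections.
        return "connection-limited"
    if source_type in ("hdfs", "local"):
        if file_count > 10_000 and avg_file_size_mb < 1:
            # Many small files: assign small tasks, re-assign at runtime.
            return "dynamic"
        # Use NameNode locality info to keep reads/writes node-local.
        return "data-locality-driven"
    # Otherwise just spread copy tasks evenly across workers.
    return "uniform"
```

For example, a 50,000-file directory of sub-megabyte files on HDFS would route to the dynamic strategy, while an FTP source always routes to the connection-limited one.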
ICM
What is it – GPHD Manager for Greenplum HD
GPHD Manager is part of the Command Center package.
GPHD Manager supports installation and default configuration of Hadoop, MapReduce, Hive, HBase, Zookeeper, Pig, and Mahout
GPHD Manager provides a command-line interface, built on a RESTful web-services API, to install, configure, and start/stop the various Hadoop services
It stores all metadata from the Hadoop cluster nodes and services in a PostgreSQL database to keep track of cluster configuration and runtime stats
How it works
GPHD Manager is installed on an admin node that is typically separate from the Hadoop cluster nodes
Functionality of the GPHD Manager admin node is exposed as REST-based web-service APIs
Leverages Puppet to manage the installation of Hadoop services (master/slave mode)
Benefits
Provides a centralized, role-based configuration and deployment tool
Includes validation – machine validation, reachability validation, dependency validation
A single GPHD Manager admin node can manage multiple Hadoop clusters (integrated into GP Command Center in the next release)
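A CLI built on a RESTful API like the one described above typically just translates commands into HTTP requests against the admin node. The endpoint paths and payload shape below are hypothetical placeholders, not the documented ICM API:

```python
import json
import urllib.request

# Hypothetical base URL and routes; the real ICM REST API's paths and
# payloads are not documented here, so these are placeholders only.
BASE = "http://admin-node:8080/api/v1"

def build_request(action, cluster, payload=None):
    """Build an HTTP request against a (hypothetical) cluster-management API."""
    url = f"{BASE}/clusters/{cluster}/{action}"
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        url,
        data=data,
        method="POST" if data else "GET",
        headers={"Content-Type": "application/json"},
    )

# Start HDFS and MapReduce on cluster "prod" (request built, not sent).
req = build_request("services/start", "prod", {"services": ["hdfs", "mapreduce"]})
```

Because every operation is a plain HTTP call, the same API can back both the CLI and other tooling, which is what lets the admin node manage multiple clusters from one place.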
Command Center
What is it – application for monitoring & management of the GPHD environment
Web-based interface that provides standard infrastructure system metrics and Hadoop-specific metrics
Designed to make a Hadoop administrator’s job easier
How it works
Visualizes live and historical data in the GPCC dashboard to display the state of the Hadoop cluster (stored in a backend GPDB)
GPCC provides:
Host-level monitoring (all information specific to a particular host)
Application-level monitoring (HDFS information across the whole cluster)
Job-level monitoring and analysis (information on particular MapReduce jobs)
Benefits
Verify that the GPHD cluster is running efficiently and without problems
Quickly diagnose functional or performance issues with the Hadoop cluster
GPSM
What is it – GPHD System Management & Monitoring
Web-services component of GPHD 1.2 that allows applications to easily monitor and manage one or more Hadoop clusters
GP-SM is designed to work with Greenplum Command Center as the UI
Leverages GPDB to store/analyze both GPHD application and system metrics
How it works
GP-SM provides a Thrift plugin to retrieve data from GPHD
Live and historical data are stored in a GPDB instance with a pre-defined schema (gpperfmon)
The Thrift plugin exposes its APIs via web services
GPCC uses this web service to visualize live and historical data of the GPHD environment
Benefits
Serves as the backend system for Greenplum Command Center
Enables users to analyze both live and historical Hadoop system information
Topology Extensions:
Enable Hadoop to recognize an additional virtualization layer for read/write/balancing
Enable compute/data node separation without losing locality
Elasticity Extensions:
Ability to adjust resource allocation (CPU, memory, map/reduce slots) for compute nodes
Enables multiple compute VMs to share common HDFS data VMs
HVE – allows HDFS to be virtualization-aware
Serengeti – deployment tool for virtualized Hadoop
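The topology extension effectively inserts a node-group layer between rack and host, so replica placement can avoid putting two replicas on VMs that share a physical machine. A simplified, illustrative sketch of that placement rule (not HVE's actual implementation):

```python
def place_replicas(nodes, count):
    """Pick `count` nodes whose topology paths (/rack/nodegroup/host)
    differ in the node-group component, so that no two replicas land
    on VMs backed by the same physical machine."""
    chosen, used_groups = [], set()
    for path in nodes:
        _, rack, nodegroup, host = path.split("/")
        if nodegroup not in used_groups:
            chosen.append(path)
            used_groups.add(nodegroup)
        if len(chosen) == count:
            break
    return chosen

nodes = [
    "/r1/ng1/vm1", "/r1/ng1/vm2",  # two VMs on the same physical host
    "/r1/ng2/vm3", "/r2/ng3/vm4",
]
replicas = place_replicas(nodes, 3)  # skips vm2: same node group as vm1
```

Without the node-group layer, stock HDFS would treat vm1 and vm2 as independent hosts and could place two replicas on the same physical machine, losing the fault-tolerance the replication was meant to provide.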
This is the first true SQL engine for Hadoop
Supported formats and features by connector:
HDFS – delimited text, sequence file, GPDB writable format, protocol buffers, Avro
HBase – predicate pushdown
Hive – RCFile, text file, sequence file
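The connector/format matrix in these notes can be captured as a small lookup table; this is just a reference sketch mirroring the list above, not an API:

```python
# Connector capabilities as listed in the notes above.
XTENSION_CONNECTORS = {
    "hdfs": {"delimited text", "sequence file", "gpdb writable",
             "protocol buffers", "avro"},
    "hbase": {"predicate pushdown"},
    "hive": {"rcfile", "text file", "sequence file"},
}

def supports(connector, fmt):
    """Check whether a connector handles a given format or feature."""
    return fmt in XTENSION_CONNECTORS.get(connector, set())
```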