2. whoami
Never Worked for Oracle
Worked with Oracle Since 1982 (V2)
Working with Exadata since early 2010
Work for Enkitec (www.enkitec.com)
(Enkitec owns a Half Rack – V2/X2)
(Enkitec owns a Big Data Appliance)
Many Exadata customers and POCs
Exadata Book (recently translated to Chinese)
Hadoop Aficionado
Blog: kerryosborne.oracle-guy.com
Twitter: @KerryOracleGuy
4. What’s the Point?
Data Volumes are Increasing Rapidly
Cost of Processing / Storing is High
Something’s Gotta Give!
Besides – managing large quantities of data is what we do!
12. HDFS/Hadoop Architecture
[Diagram: master nodes – a Job Tracker and a Name Node (HA?) – coordinate worker nodes, each running a DataNode and a TaskTracker against local storage]
13. HDFS/Hadoop Architecture
[Diagram: same HDFS layout as slide 12, with the Name Node relabeled "Block Mapper (namenode)"]
14. Exadata Architecture
[Diagram: RAC nodes as workers (with cache) on top, ASM as the "Block Mapper", and Storage Nodes below, which also run workers against local storage]
15. HDFS/Hadoop Architecture
[Diagram: repeat of slide 13 – HDFS with the namenode as "Block Mapper" – shown again for comparison with the Exadata layout]
20. Sqoop (SQL-to-Hadoop)
• Graduated from Incubator Status in March 2012
• Slower (no direct path?)
• Quest has a plug-in (OraOop)
• Bi-Directional – imports into HDFS and exports back to Oracle (example below)
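A minimal sketch of the bi-directional flow from the command line. The flags are standard Sqoop options; the host, schema, and table names are made up for illustration:

# Import an Oracle table into HDFS (Oracle -> Hadoop)
sqoop import \
  --connect jdbc:oracle:thin:@//dbhost:1521/orcl \
  --username SCOTT -P \
  --table SALES \
  --target-dir /data/sales \
  --num-mappers 4

# Export aggregated results back into an Oracle table (Hadoop -> Oracle)
sqoop export \
  --connect jdbc:oracle:thin:@//dbhost:1521/orcl \
  --username SCOTT -P \
  --table SALES_AGG \
  --export-dir /data/sales_agg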
21. Oracle Big Data Connectors
Oracle Loader for Hadoop – OLH
Oracle Direct Connector for HDFS – ODCH
Oracle R Connector for Hadoop – ORHC
Oracle Data Integrator Application Adapter for Hadoop
Note:
All Connectors are One-Way (into Oracle)
All sold together for $2K per core (list price)
23. Oracle R Connector for Hadoop (ORHC)
• Provides ability to pull data from Oracle RDBMS
• Provides ability to pull data from HDFS
• Provides access to local file system
• Not really a loader tool
• Most useful for analysts
24. Oracle Loader for Hadoop (OLH)
• Implemented as a MapReduce job (oraloader.jar) – see the sketch below
• Saves CPU on the DB Server
• Can convert to Oracle datatypes
• Can partition data and optionally sort it
• Online – direct into Oracle tables
• Can load into Oracle via JDBC or OCI Direct Path
• Offline – generates preprocessed files in HDFS (Data Pump format)
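As a rough sketch, an OLH run is just a MapReduce job driven from the hadoop command line; the target table, input format, and online (JDBC/OCI) vs. offline (Data Pump files) behavior are specified in an XML configuration file. The OLH_HOME path and conf file name below are assumptions:

# OLH runs as a MapReduce job, so the heavy lifting (datatype conversion,
# partitioning, sorting) happens on the Hadoop cluster, not the DB server.
hadoop jar $OLH_HOME/jlib/oraloader.jar \
  oracle.hadoop.loader.OraLoader \
  -conf sales_load_conf.xml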
25. Oracle Direct Connector for HDFS (ODCH)
• My Favorite
• Uses External Tables
• Fastest
• ~12 TB per hour
• Can load Data Pump files preprocessed by OLH
• Allows Oracle SQL to query HDFS data
• Doesn't require loading into Oracle
• Pretty Cool! (example below)
• Downside – uses DB CPUs
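A hedged sketch of the external table setup ODCH uses: the hdfs_stream preprocessor streams HDFS file contents to the access driver, and the location files list the HDFS paths to read. The directory objects, columns, and file names here are assumptions, not the exact demo config:

# hdfs_bin_dir points at ODCH's hdfs_stream script; sales1.loc is a
# location file containing the HDFS paths to read.
sqlplus scott/tiger <<'EOF'
CREATE TABLE sales_hdfs_ext (
  sale_id   NUMBER,
  amount    NUMBER,
  sale_date VARCHAR2(20)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY hdfs_ext_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    PREPROCESSOR hdfs_bin_dir:'hdfs_stream'
    FIELDS TERMINATED BY ','
  )
  LOCATION ('sales1.loc')
)
REJECT LIMIT UNLIMITED;

-- Plain Oracle SQL against HDFS data, no load step required
SELECT COUNT(*) FROM sales_hdfs_ext;
EOF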
27. Exadoop
Unusual Situation!
Half Rack with 4 Spare Storage Servers
Exadata Cells Very Similar to BDA Servers
slower CPUs
less memory
but same drives (12 x 3TB)
and InfiniBand
and Flash
4 Cells ≈ Mini BDA! :)
31. Exadoop
Situation
• Pilot Underway – But Wanted More Power
• 4 Exadata Storage Servers Were Sitting Idle
• Suggestion Was to Install a Hadoop Cluster on Them
• 1st Concern Was Being Able to Reclaim Them for Exadata
• Removing a Data Node from HDFS – Not a Problem
• Adding Storage Back to ASM – Not a Problem (see the sketch after this list)
• So the Decision Was Made to Move Forward
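For what it's worth, a hedged sketch of why reclaiming was not a concern, assuming CDH3-era commands and made-up host and disk names:

# 1. Decommission a data node: add it to the excludes file referenced by
#    dfs.hosts.exclude in hdfs-site.xml, then have the namenode re-read it.
#    HDFS re-replicates the node's blocks elsewhere before it drops out.
echo "cell05.example.com" >> /etc/hadoop/conf/dfs.exclude
hadoop dfsadmin -refreshNodes

# 2. Hand the cell's disks back to Exadata by adding them to an ASM disk
#    group; ASM rebalances data onto the new disks automatically.
sqlplus / as sysasm <<'EOF'
ALTER DISKGROUP data ADD DISK 'o/192.168.10.5/DATA_CD_*_cell05'
REBALANCE POWER 8;
EOF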
32. Exadoop
Set Up
• Removed the Internal USBs
• Installed OEL 6.2
• Installed CDH3
• Loaded Some Data
• Set Up ODCH with External Tables
33. Exadoop
Testing
• Selecting Data Using External Tables Was Not Very Fast
• Quickly Determined We Had Used the Default 1G Network
• Reconfigured with InfiniBand
• Helped, But Not as Much as Expected
• Using Little CPU on the Data Nodes
• But a Single Process Was Pegging a CPU on the DB Server
• Added Parallelism
• No Good – Only One PX Slave Active
• Added Multiple Files to External Table Def. – Bingo! (sketched below)
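The fix, as a sketch (reusing the hypothetical table and file names from the ODCH example): with a single location file the external table gives PX nothing to split, so only one slave stays busy; listing several files lets each slave take its own.

sqlplus scott/tiger <<'EOF'
-- Multiple location files give the PX slaves separate granules to work on
ALTER TABLE sales_hdfs_ext
LOCATION ('sales1.loc', 'sales2.loc', 'sales3.loc', 'sales4.loc');

-- Now a parallel query actually engages more than one slave
SELECT /*+ parallel(s 4) */ COUNT(*) FROM sales_hdfs_ext s;
EOF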
34. Exadoop
Testing - Continued
• Added the FUSE Client (see the sketch below)
• Created External Tables with FUSE
• PX Seems to Work Even on Single Files
• Puts Additional CPU Load on the DB Server (2 TB/hr)
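A rough sketch of the FUSE approach, assuming CDH's hadoop-fuse-dfs package and made-up mount points and names: HDFS gets mounted like an ordinary file system, so the external table points straight at the files and no preprocessor is needed – which is also why the usual PX granule logic can carve up even a single large file.

# Mount HDFS through FUSE (namenode host/port are assumptions)
mkdir -p /mnt/hdfs
hadoop-fuse-dfs dfs://namenode:8020 /mnt/hdfs

# Point an external table at the mounted path like any ordinary file
sqlplus scott/tiger <<'EOF'
CREATE OR REPLACE DIRECTORY hdfs_fuse_dir AS '/mnt/hdfs/data/sales';

CREATE TABLE sales_fuse_ext (
  sale_id NUMBER,
  amount  NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY hdfs_fuse_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('sales.csv')
)
REJECT LIMIT UNLIMITED;
EOF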
35. Wrap Up
Right Tool For The Job?
Maybe
All the Cool Kids Are Doing It!
Many companies that are using Hadoop in a big way still have Oracle databases sitting right next to them. Nokia – I had a meeting with a guy from Nokia a couple of weeks ago, and he described basically an ETL kind of setup: the HDFS cluster ingests data, which is then processed by MapReduce jobs, and the aggregated data is fed into a relational database so the analysts can have their way with it. People have preferences for certain tools (BI tools, for example), and an RDBMS can be very fast for this type of access if the data is of reasonable size. Not using Flume, ??? Using it for many things, but positional data from phones was one of the main cases we discussed. Canadian NSA – they have Exadata and a Hadoop cluster – rows of racks of both.
Use Firefox http://192.168.9.98:7777/pls/apex/f?p=100:2:1849672391763932::NO#
With all the new options available, it will take some serious thought about which architecture makes the most sense for any given problem. I had a conversation two weeks ago with the Canadian NSA (CSE) – a completely static data set, never updated. Good for Hadoop or for HCC. HCC provides about 10x compression on their data set, so a single Exadata rack, which has a raw storage capacity of about half a petabyte (roughly 250 TB usable under normal redundancy), can store over 2 petabytes of uncompressed data. On the other hand, I had a conversation with Nokia about how they are using Hadoop. They have been investing heavily in the technology for a couple of years. A large part of what they do involves ingesting data produced by mobile phones. The data is typically mined by MapReduce jobs, and aggregated data sets are then loaded into RDBMSs where analysts can use standard BI tools to do what they do. So they described it as an ETL-type process.