The Data Warehouse Evolution Roadshow
  

©MapR Technologies - Confidential
	
  
Agenda

Welcome                                   MapR
Big Data and your Data Warehouse          MapR
The New Data Warehouse                    Informatica
Making the Most of Big Data               MicroStrategy
Enterprise-Grade Hadoop: Use Cases        MapR
Infrastructure Platform For Big Data      Cisco
Getting Started/Q&A                       All
Close                                     MapR
	
  
Big Data and Your Data Warehouse
	
  
 
	
  

“Data is a precious thing and will last
longer than the systems themselves.”
– Tim Berners-Lee, inventor of the World Wide Web.

	
  
 
	
  

“Without big data analytics, companies are
blind and deaf, wandering out onto the web
like deer on a freeway.”
– Geoffrey Moore, author and consultant.

	
  
 
	
  

“If we have data, let’s look at data. If all we
have are opinions, let’s go with mine.”
– Jim Barksdale, former Netscape CEO

	
  
Big Data today in the Enterprise
“Too many different types, sources & formats of critical data”
§  Multiple data sources
§  Multiple technologies
§  Multiple copies of data
	
  
An Enterprise Data Hub
[Diagram labels: Enterprise Data Hub at the center; Sensor Data, Click Streams, Production Data, Web Logs, Location, Public, Social Media, Sales, SCM, CRM, Billing]
✓  Combine different data sources
✓  Minimize data movement
✓  One platform for analytics
  
Big Data in our World
§  YouTube users upload 48 hours of new video every minute of the day.
§  Twitter sees roughly 175 million tweets every day, and has more than 465 million accounts.
§  Facebook stores, accesses, and analyzes 30+ petabytes of user-generated data.
§  More than 5 billion people are calling, texting, tweeting and browsing on mobile phones worldwide.
§  2.7 zettabytes of data exist in the digital universe today.
§  Data production will be 44 times greater in 2020 than it was in 2009.
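The 44x figure can be turned into an implied yearly growth rate. The sketch below is only a back-of-the-envelope reading of the slide: it assumes growth is spread evenly over the 11 years from 2009 to 2020, which the slide does not state.

```python
# Implied compound annual growth if data production grows 44x from 2009 to 2020.
# Assumption: steady year-over-year growth (the slide only gives the two endpoints).
growth_factor = 44.0
years = 2020 - 2009

annual_multiplier = growth_factor ** (1 / years)
annual_growth_pct = (annual_multiplier - 1) * 100

print(f"Implied growth: about {annual_growth_pct:.0f}% per year")
```

Under that assumption, data production would grow by roughly two fifths every year.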
	
  
Arrival of Big Data Impacts Data Warehouses
Volume: Prohibitively expensive storage costs
Variety: Inability to process unstructured formats
Velocity: Faster arrival and processing needs
  
The Hadoop Advantage
§  Fueling an industry revolution by providing infinite capability to store and process big data
§  Expanding analytics across data types
§  Compelling economics
–  20 to 100X more cost effective than alternatives
Pioneered at [logos]
	
  
Important Drivers for Hadoop
§  Data on compute drives efficiencies and better analytics
§  With Hadoop you don’t need to know what questions to ask beforehand
§  Simple algorithms on Big Data outperform complex models
§  Powerful ability to analyze unstructured data
	
  
What is the Best Way to Deploy Hadoop?

Transitory Data Store
•  No long-term scale advantages
•  Unprotected data
•  ETL Tool focus

vs.

Permanent Data Store
•  Highly available and fully protected data
•  Works with existing tools
•  Real-time ingestion and extraction
•  Archive data from data warehouse

Enterprise Data Hub
	
  
 
	
  
“Hadoop ingests and stores data very cost effectively, and
handles workloads such as the simple transformations in ETL.
On the other hand, Hadoop does not address the mission-critical
complex business analytic workloads…”
– Mike Koehler, CEO, Teradata
	
  
Data Warehouse Optimized: Cost Savings
[Diagram labels: RDBMS, Sensor Data, Web Logs; Hadoop (ETL + Long Term Storage); DW (Query + Present)]
Benefits:
✓  Both structured and unstructured data
✓  Expanded analytics with MapReduce, NoSQL, etc.

Solution                        Cost / Terabyte    Hadoop Advantage
Hadoop                          $333
Teradata Warehouse Appliance    $16,500            50x savings
Oracle Exadata                  $14,000            42x savings
IBM Netezza                     $10,000            30x savings
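The "Hadoop Advantage" column is simply each appliance's cost per terabyte divided by the Hadoop figure. A minimal sketch that reproduces the ratios from the numbers quoted on the slide (the function name is illustrative, not from the deck):

```python
# Cost per terabyte as quoted on the slide (USD).
cost_per_tb = {
    "Hadoop": 333,
    "Teradata Warehouse Appliance": 16_500,
    "Oracle Exadata": 14_000,
    "IBM Netezza": 10_000,
}


def savings_multiple(solution: str, baseline: str = "Hadoop") -> float:
    """Return how many times more the solution costs per TB than the baseline."""
    return cost_per_tb[solution] / cost_per_tb[baseline]


for name in cost_per_tb:
    if name != "Hadoop":
        print(f"{name}: {savings_multiple(name):.0f}x")
```

Rounded to whole numbers, the ratios match the 50x, 42x, and 30x savings shown in the table.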
	
  
[Diagram labels: Existing Data, Social Data, Weblog Data, Telemetry; The Enterprise Data Hub for Hadoop Compute; Enterprise Data Warehouse; Freed Up Space; Fraud Detection Application; Recommendation Engine]
  
Multi-Tenant Capabilities to Share a Cluster Successfully
§  Isolation
–  Data placement control
–  Label based job scheduling
§  Quotas
–  Storage, CPU, Memory
§  Security and delegation
–  ACLs
–  AD, LDAP, Linux PAM
§  Reporting
–  About 70 resource usage metrics
–  REST API integration
	
  
One Platform for Big Data
Batch: Log file Analysis, Data Warehouse Offload, Fraud Detection, Clickstream Analytics
Interactive: Forensic Analysis, Analytic Modeling, BI User Focus
Real-Time: Sensor Analysis, “Twitterscraping”, Telematics, Process Optimization
[Diagram labels: MapReduce, File-Based Applications, SQL, Database, Search, Stream Processing, …; Batch, Interactive, Real-time; 99.999% HA, Data Protection, Disaster Recovery, Scalability & Performance, Enterprise Integration, Multi-tenancy]
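For a sense of scale, an availability percentage such as the 99.999% HA figure can be converted into a yearly downtime budget. This is generic arithmetic, not a statement about how any vendor measures availability; the helper name is illustrative.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(availability_pct: float) -> float:
    """Return the downtime per year implied by an availability percentage."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)


for pct in (99.9, 99.99, 99.999):
    print(f"{pct}% -> {allowed_downtime_minutes(pct):.1f} minutes/year")
```

"Five nines" therefore leaves a budget of a little over five minutes of downtime per year.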
MapR Means More from Hadoop
	
  
The New Data Warehouse
Big Data + Hadoop
	
  
Agenda
§  Big Data and Data Warehouse Optimization
§  What Are Customers Doing to Optimize their Data Warehouse?
§  Informatica on Hadoop Complements Your Data Warehouse
	
  
Big Data and Data Warehouse Optimization
	
  
Era          Technology          Users                  Value                          Sources
1960s-1970s  Mainframe (OS/360)  Few Employees          Back Office Automation         10^2
1980s        Client-Server       Many Employees         Front Office Productivity      10^4
1990s        Web                 Customers/Consumers    E-Commerce                     10^6
2007         Cloud               Business Ecosystems    Line-of-Business Self-Service  10^7
2011         Social              Communities & Society  Social Engagement              10^9
2014         Internet of Things  Devices & Machines     Real-Time Optimization         10^11
	
  
Informatica + Hadoop
PowerCenter Developers are Now Hadoop Developers
[Diagram labels: Transactions, OLTP, OLAP; Documents and Emails; Social Media, Web Logs; Machine Device, Scientific; Archive, Profile, Parse, ETL, Cleanse, Match; Analytics & Op Dashboards; Mobile Apps; Real-Time Alerts]
	
  
Data Warehouse Optimization
1.  Identify inactive & infrequently used data
2.  Offload data & processing to Hadoop
3.  Ingest raw data, replicate changes & schemas
4.  Store & prepare (e.g. ETL) data on Hadoop
5.  Move high value results data into DW
[Diagram labels: Data Warehouse; Reports; Transactions, OLTP, OLAP; Documents and Emails; Social Media, Web Logs; Machine Device, Scientific]
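Step 1 above, identifying inactive and infrequently used data, can be approximated from query-log metadata. The sketch below is hypothetical throughout: the table names, the usage dictionary, and the thresholds are invented for illustration and do not come from Informatica or any warehouse product.

```python
from datetime import date

# Hypothetical usage metadata: table name -> (last query date, queries in the past year).
table_usage = {
    "orders_2009": (date(2012, 1, 15), 3),
    "orders_current": (date(2013, 10, 1), 48_000),
    "clickstream_raw": (date(2013, 3, 2), 12),
}


def offload_candidates(usage, today, max_idle_days=180, max_queries=50):
    """Return tables idle longer than max_idle_days or queried at most max_queries times."""
    candidates = []
    for table, (last_query, query_count) in usage.items():
        idle_days = (today - last_query).days
        if idle_days > max_idle_days or query_count <= max_queries:
            candidates.append(table)
    return sorted(candidates)


print(offload_candidates(table_usage, today=date(2013, 10, 15)))
```

In practice the thresholds would be tuned against the warehouse's own access patterns; the point is only that the first step is a filter over usage statistics.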
	
  
PowerCenter Big Data Edition
Minimize Risk
Quickly staff projects with trained experts
Map Once. Deploy Anywhere™
Deploy On-Premise or in the Cloud
[Diagram label: Traditional Grid]
	
  
What Are Customers Doing to Optimize their Data Warehouse?
	
  
Minimize risk and grow digital business
Large Global Financial Services and Communications Company
The Challenge: Grow digital business to 30% ($1.8B) and reduce fraud
The Solution: PowerCenter Big Data Edition (Profile, Parse, ETL)
[Diagram labels: Relational - SQL Server, Oracle, DB2, AS400, Mainframe; Surveys & Net Promoter Scores (NPS); Social Media, Web Logs, JSON, XML; Machine, Forensic, Splunk; Netezza, SQL Server, Oracle, SAS; BI / Analytics Visualization & Reporting]
The Result
•  Comprehensive data integration platform to integrate large volumes of data from 18+ systems
•  Ability to use existing skill sets & make them more productive
•  Lowest risk as industry leader
	
  
Reduce Costs & Increase Revenue
Consolidate Data on Hadoop & Provide 360 View of Customer
Large Global Media & Entertainment Company
The Challenge: Data increasing 20x every year with costs rising from $17K per day to $50K per day within 6 months. Time to deliver information taking too long.
The Solution: PowerCenter Big Data Edition & Data Validation
[Diagram labels: Transactions from 70 Data Centers; In-Store POS Data; B2B Data Exchange; Data from Gaming Consoles, TV, Tablets, Readers, & Clickstreams from 5000 Web Sites; Traditional Grid; Data Warehouse; 172 TB; Business Reports]
Expected Result
•  Gain 360 view of customer behavior, increase cross-sell & up-sell revenue
•  Reduce data storage costs from $50K per day to $500 per day
•  Reduce time to deliver information to business from 48 hours to 15 minutes
	
  
Flexible architecture to support rapid changes
Large Government Agency
The Challenge: Data volumes growing 3-5 times over the next 2-3 years
The Solution
[Diagram labels: Mainframe, RDBMS, Unstructured Data; Traditional Grid; Data Virtualization; DW, EDW; Business Reports]
The Result
•  Manage data integration and load of 10+ billion records from multiple disparate data sources
•  Flexible data integration architecture to support changing business requirements in a heterogeneous data management environment
	
  
Lower costs of Big Data projects
Large Global Financial Institution
The Challenge: Data warehouse exploding with over 200TB of data. User activity generating up to 5 million queries a day impacting query performance
The Solution
[Diagram labels: ERP, CRM, Custom, Interaction Data; EDW; Archived Data; Business Reports; Phase 1]
The Result
•  Saved $20M + $2-3M on-going by archiving & optimization
•  Reduced project timeline from 6 months to 2 weeks
•  Improved performance by 25%
•  Return on investment in less than 6 months
	
  
Lower costs and minimize risk
Large Global Financial Institution
The Challenge: Increasing demand for faster data driven decision making and analytics as data volumes and processing loads rapidly increase
The Solution
[Diagram labels: RDBMS, Web Logs; Traditional Grid; Near Real-Time; Datamarts; Data Warehouse]
The Result
•  Cost-effectively scale performance
•  Increased agility by standardizing on one data integration platform
•  Lower hardware costs
•  Leverage new data sources for faster innovation
  
Informatica on Hadoop Complements Your Data Warehouse
	
  
Maximize Your Return On Big Data
Hadoop complements your existing infrastructure
[Diagram labels: Data Assets (Transactions, OLTP, OLAP; Documents, Email; Social Media, Web Logs; Machine Device, Scientific); Operational Systems (OLTP, MDM, ODS); Analytical Systems (Data Warehouse, Data Mart); Data Products; & other NoSQL]
Access & Ingest, Parse & Prepare, Discover & Profile, Transform & Cleanse, Extract & Deliver
Manage (i.e. Security, Performance, Governance, Collaboration)
  
Data Integration & Quality on Hadoop
1.  Entire Informatica mapping translated to Hive Query Language
2.  Optimized HQL converted to MapReduce & submitted to Hadoop cluster (job tracker).
3.  Advanced mapping transformations executed on Hadoop through User Defined Functions using Vibe

Hive-QL (custom transformations run as a MapReduce UDF):

    -- Sketch of the generated HiveQL shown on the slide.
    -- Assumptions: the outer "FROM (" opener and the T1.ORDERKEY1 join key,
    -- neither of which survives intact in the slide text.
    FROM (
      SELECT
        T1.ORDERKEY1 AS ORDERKEY2, T1.li_count, orders.O_CUSTKEY AS CUSTKEY, customer.C_NAME,
        customer.C_NATIONKEY, nation.N_NAME, nation.N_REGIONKEY
      FROM (
        -- CustomInfaTx is the user-defined transformation invoked through TRANSFORM
        SELECT TRANSFORM (L_Orderkey.id) USING CustomInfaTx
        FROM lineitem
        GROUP BY L_ORDERKEY
      ) T1
      JOIN orders ON (T1.ORDERKEY1 = orders.O_ORDERKEY)
      JOIN customer ON (orders.O_CUSTKEY = customer.C_CUSTKEY)
      JOIN nation ON (customer.C_NATIONKEY = nation.N_NATIONKEY)
      WHERE nation.N_NAME = 'UNITED STATES'
    ) T2
    -- One pass over T2 feeds both targets (Hive multi-insert)
    INSERT OVERWRITE TABLE TARGET1 SELECT *
    INSERT OVERWRITE TABLE TARGET2 SELECT CUSTKEY, count(ORDERKEY2) GROUP BY CUSTKEY;
	
  

	
  
	
  
Accelerate Development
Reuse and Import PowerCenter Metadata
Import and validate existing PowerCenter mappings before running on Hadoop
	
  
Hadoop Data Profiling Results
Hadoop Data Profiling results, exposed to anyone in enterprise via browser
1.  Profiling Stats: Min/Max Values, NULLs, Inferred Data Types, etc.
    Stats to identify outliers and anomalies in data
2.  Value & Pattern Analysis of Hadoop Data
    Value and Pattern Frequency to isolate inconsistent/dirty data or unexpected patterns
3.  Drilldown Analysis (into Hadoop Data)
    Drill down into actual data values to inspect results across entire data set, including potential duplicates
[Screenshot labels: CUSTOMER_ID example; COUNTRY CODE example]
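The profiling statistics listed above (min/max, NULL counts, value and pattern frequency) are straightforward to illustrate on a single column. The following is a generic sketch, not Informatica's implementation; the function name and the sample country codes are made up for the example.

```python
from collections import Counter
import re


def profile_column(values):
    """Return basic profiling stats for a list of raw string values (None = NULL)."""
    non_null = [v for v in values if v is not None]
    # Pattern signature: every digit becomes 9, every ASCII letter becomes A.
    patterns = Counter(
        re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", v)) for v in non_null
    )
    return {
        "count": len(values),
        "nulls": len(values) - len(non_null),
        "min": min(non_null) if non_null else None,
        "max": max(non_null) if non_null else None,
        "value_frequency": Counter(non_null),
        "pattern_frequency": patterns,
    }


# Hypothetical COUNTRY CODE column with a few dirty values.
country_codes = ["US", "US", "DE", "usa", None, "U5"]
stats = profile_column(country_codes)
print(stats["nulls"], stats["pattern_frequency"].most_common())
```

Rare patterns such as a three-letter code or a code containing a digit stand out in the pattern frequency, which is what makes it useful for spotting inconsistent data.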
	
  
Hadoop Data Domain Discovery
Finding functional meaning of Data in Hadoop
Leverage INFA rules/mapplets to identify functional meaning of Hadoop data
Sensitive data (e.g. SSN, Credit Card number, etc.)
PHI: Protected Health Information
PII: Personally Identifiable Information
Scalable to look for/discover ANY Domain type
View/share report of data domains/sensitive data contained in Hadoop. Ability to drill down to see suspect data values.
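Domain discovery of the kind described above usually comes down to rules applied to sampled values. The sketch below is an assumption-laden illustration, not the INFA rules or mapplets themselves: the regular expressions, the Luhn checksum for card numbers, and the 75% threshold are all choices made for this example.

```python
import re
from collections import Counter

SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
CARD_RE = re.compile(r"^\d{13,19}$")


def luhn_valid(number: str) -> bool:
    """Check a digit string against the Luhn checksum used by payment cards."""
    total = 0
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(value: str) -> str:
    """Label one value as SSN, CREDIT_CARD, or UNKNOWN (illustrative rules only)."""
    if SSN_RE.match(value):
        return "SSN"
    digits = re.sub(r"[ -]", "", value)
    if CARD_RE.match(digits) and luhn_valid(digits):
        return "CREDIT_CARD"
    return "UNKNOWN"


def discover_domain(values, threshold=0.75):
    """Return the dominant sensitive domain of a column, or None if below threshold."""
    if not values:
        return None
    label, hits = Counter(classify(v) for v in values).most_common(1)[0]
    if label != "UNKNOWN" and hits / len(values) >= threshold:
        return label
    return None


print(discover_domain(["123-45-6789", "987-65-4321", "n/a", "111-22-3333"]))
```

Using a threshold rather than requiring every value to match lets a column be flagged as sensitive even when it contains some placeholder or dirty values.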
	
  
Unified Administration
Single Place to Manage & Monitor
Full traceability from workflow to MapReduce jobs
View generated Hive scripts
	
  
Maximize Your Return on Big Data
Lower Big Data Costs Up To 2X (helps self-fund big data projects)
•  5x productivity increase using existing developer skills
Minimize Risk of New Technologies (single platform, quickly staff projects)
•  Design in PowerCenter, run on Hadoop or any other data platform
Accelerate Innovation (onboard, discover, operationalize)
•  Enterprise scalability, security, & support
	
  
Making the Most of Big Data
Leveraging business intelligence to turn business users into data scientists
MicroStrategy Confidential. Distribution Prohibited without Prior Authorization.
  
Agenda
1.  Self Service
2.  Information Driven Apps
3.  Mobility
4.  Advanced Analytics
	
  
SELF-SERVICE ANALYTICS
Empowering everyone with rapid-fire
data exploration and dashboarding
	
  
Self-Service Analytics Revolutionizes Traditional BI
Boost user satisfaction while massively increasing productivity
More Productive! 5-10x more content per creator (More Content)
More Producers! 5-10x more users can create content (More Content Creators)
More Collaborative! 5-10x peer-to-peer sharing (More Sharing)
>100x more content creation and consumption
Business User Access to 1000s of Data Sources
Faster access to your data
Enterprise Applications: SAP, Oracle e-Business, Siebel, Peoplesoft, etc.
Relational Databases: Oracle, SQL Server, MySQL, Teradata, Netezza, etc.
Cloud-Based Data: Salesforce.com, NetSuite, Facebook, Eloqua, Google Docs, etc.
Personal or Departmental: Spreadsheets, Access databases, CSV, public data downloads, etc.
Big Data & Hadoop: MapR
MicroStrategy Modeled Data: Enterprise-certified single version of the truth
Quick Data Import: No SQL or Scripting
Enrich Every Analysis with Added Insight
Professional Sports: Enrich with Weather Data
Impact of weather on game outcome and attendance
Product Sales: Enrich with Demographic Data
Product popularity by demographic segment
Marketing Promotions: Enrich with Social Data
Cross-brand affinity to determine promotions or bundling offers
	
  
World-Class Production Dashboard Applications
Information-Driven Apps are the future of dashboards
•  100% customized look and feel
•  Comprehensive data
•  Easy to use
•  Guided workflow for consistent user experience
•  Personalized for each user
•  Online or distributed via email
•  Multimedia content-enabled
•  Transaction-enabled
•  Live data
	
  
Beyond Mobile Dashboards
Build great mobile Smart Apps without the pain of native development
Analytics: Analytics and data visualization
+ Transactions: Update systems like ERP/CRM
+ Multimedia: Add videos and other content
Apps for Every Customer-Facing Process
Apps for Every Internal Business Process
Logistics Apps, Operations Apps, B2E Apps, Data Collection Apps, Product Apps, Context-Aware Apps, Executive Apps
  
Easy Integration with Third Party Analytic Models
All of an Organization’s Analytics Can Now be Distributed Through a Single Platform
Deploy Any of 5000+ Open Source R Analytics: MicroStrategy R Integration Pack
Import Predictive Models from Popular Packages: PMML Model
Create Your Own Custom Functions: MicroStrategy Custom Function Plug-in, ƒApply(X)
As a MicroStrategy metric, use models and functions in any report or dashboard
	
  
MicroStrategy Analytics Platform
Comprehensive analytics suite for business
Self-Service
Analytics

Enterprise-Grade
Business Intelligence

Big Data
Analytics

Rapid-fire data
discovery

Produce and publish trusted
analytics to elevate performance

The power to transform your
Big Data into insight

•  Intuitive data exploration

•  Self-service with no IT needed

•  Access and combine data from all
sources

•  Trusted system-of-record reliability

•  Advanced and predictive analytics

•  Easy, cost-effective administration

•  Fast dashboard development

•  Comprehensive delivery options
with massive user scale

•  Blazing speed and performance

Web or Mobile
On-Premises or on MicroStrategy Cloud

	
  
Two Ways to Experience MicroStrategy Today
Best of all, they’re free!

MicroStrategy Analytics Desktop
Fastest, easiest self-service analytics tool for business users.
100% free!

MicroStrategy Analytics Express
Cloud-based self-service visual analytics for any organization.
Free for one year!
	
  
Enterprise-Grade Hadoop
Use Cases
	
  
Data Warehouse Offload: Cost Savings + Analytics
[Diagram labels: RDBMS, Sensor Data, Web Logs; Hadoop (ETL + Long Term Storage); DW (Query + Present)]
Benefits:
✓  Both structured and unstructured data
✓  Expanded analytics with MapReduce, NoSQL, etc.

Solution                        Cost / Terabyte    Hadoop Advantage
Hadoop                          $333
Teradata Warehouse Appliance    $16,500            50x savings
Oracle Exadata                  $14,000            42x savings
IBM Netezza                     $10,000            30x savings
	
  
Expand Data For Existing Applications
§  Network security: Network IDS with a 3-day window instead of a 10-minute window
§  Trade Surveillance: Rogue trader detection on intra-day instead of end-of-day market data
§  Insurance: Calculate risk triangles for individual properties instead of neighborhoods
Advantages:
✓  1T files and tables
✓  Real-time data ingestion with streaming writes
✓  24x7 operations with automated failure recovery
✓  Better hardware utilization with 2x performance
  
Combine Different Data Sources
[Diagram labels: POS/Online Data; Retail purchase Info; Streaming writes to Hadoop; Hadoop; Real-time offers]
Advantages:
✓  Exponential decrease in time to market
✓  Real-time data ingestion with streaming writes
✓  1T files and tables
✓  24x7 operations with automated failure recovery
	
  
New Analytics
•  Enhanced search
•  Real-time event processing
•  MapReduce-enabled machine learning algorithms
Advantages
✓  Increased ROI with 2x performance
✓  Highly available, fully data protected environment
✓  Multiple users running different jobs on one cluster
  	
  
Customer Example
Cloud-based predictive analytics platform
Sociocast conducted a POC with the three solutions

✗  Apache HBase
•  Compactions
•  Manual administration
•  Poor reliability

✗  Cassandra
•  Compactions
•  Manual administration
•  Eventual consistency

✓  [logo]
•  No compactions
•  Zero administration
•  Strong consistency
•  2x Cassandra performance
•  3x HBase performance
	
  
MapR Advantages for Enterprise Data Hub
  
•  Enterprise Grade Platform
•  99.999% HA
•  Full data protection
•  Disaster recovery
•  Easiest Integration
•  Industry-standard interfaces:
NFS, ODBC, LDAP, REST

•  Streaming writes
•  Best ROI
•  Faster time to market
•  Eliminate risk
•  Reuse existing apps and tools
	
  
In the era of the “Internet Of Everything”
Unified Computing Systems
The Infrastructure Platform For Big Data
	
  
	
  
“The internet of everything will provide a 21% increase in
corporate profits in the next 10 years”
	
  
How many IP addresses does your home have?
IPV6
	
  
How will the internet of things change Basketball?
	
  
Facebook And Cisco Let Brick-&-Mortars Demand Customers Check-In To Get Wi-Fi
10.03.13 at Interop: Facebook and Cisco roll out a way to help any brick-and-mortar recoup its costs by asking users to check-in to get Internet access. Those who oblige get dropped on the business’ Facebook Page, and their anonymous, aggregate demographic info is passed to the merchant.
http://techcrunch.com/2013/10/02/facebook-wifi/
	
  
In-Store Manager View & Capabilities
Product Catalog
–  Product Characteristics
–  Marketing Description
–  Quality Data
–  Multi-media Information
–  Product Suggestions
Promotion Portfolio
–  Campaign Management
–  Customer Segmentation
–  Location triggered Rules
Consumer Profile
–  CRM profile
–  Loyalty status
–  Consumer Preferences
Application Analytics & Forecasting
–  Based on Historical Footfall
–  Heatmap Preferences
	
  
Better Retailing?

Retailers Dashboard
Mobility Services Engine
Existing ERP Systems
Existing Retailing Platform
Cisco Wireless WLAN Controller
Cisco Wireless Access Point
Consumer Personal Shopping Assistant
Big Data and Key Infrastructure Attributes
(What big data isn’t)
§  Usually not blade servers (not enough local storage)
§  Usually not virtualized (hypervisor only adds overhead)
§  Usually not highly oversubscribed (significant east-west traffic)
§  Usually not SAN/NAS

Low-cost, DAS-based, scale-out clustered filesystem

Move the compute to the storage
  	
  


$$$	
  

Cost, Performance, and Capacity

HW:SW $ split 30:70
Expensive Load: 1TB/hr ETL
Structured Data: Relational Database, $20K/TB
Enterprise Data
Massive Scale-Out Column Store: $10K/TB
Hadoop, NoSQL: $500-$1K/TB
HW:SW $ split 70:30

Unstructured Data: Machine Logs, Web Click Stream, Call Data Records, Satellite Feeds, GPS Data, Sensor Readings, Sales Data, Blogs, Emails, Video
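
To make the per-terabyte figures on this slide concrete, here is a minimal Python sketch that multiplies them out for a given data volume. It assumes the slide pairs $20K/TB with the relational database, $10K/TB with the scale-out column store, and $500-$1K/TB with Hadoop/NoSQL. The tier names and the `storage_cost_range` helper are illustrative and do not come from the deck.

```python
# Per-terabyte cost figures quoted on the slide, as (low, high) USD ranges.
# The tier keys are hypothetical names chosen for this example.
COST_PER_TB_USD = {
    "relational_database": (20_000, 20_000),
    "scale_out_column_store": (10_000, 10_000),
    "hadoop_nosql": (500, 1_000),
}


def storage_cost_range(tier: str, terabytes: int) -> tuple[int, int]:
    """Return the (low, high) cost in USD of holding `terabytes` on a tier."""
    low, high = COST_PER_TB_USD[tier]
    return (low * terabytes, high * terabytes)


if __name__ == "__main__":
    volume_tb = 100  # example volume, chosen arbitrarily
    for tier in COST_PER_TB_USD:
        low, high = storage_cost_range(tier, volume_tb)
        print(f"{tier}: ${low:,} to ${high:,} for {volume_tb} TB")
```

The figures cover only the quoted cost per terabyte; they say nothing about query performance, staffing, or migration effort.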
  
Typical big data deployments

Dedicated “Pod” for Big Data
General Purpose IT Data Center
IT Infrastructure: standard IT servers (SAP, VMware, WEB)
X86 servers
Big Data

§  Experimental use of Big Data
§  App team mandated infrastructure
§  Deployed into IT Ops mandated infrastructures
§  Purpose built for Big Data
§  Big Data has established business value
§  Performance matters
§  Large or small clusters
§  “Skunk works”
§  Small to medium clusters
  

Cisco UCS Common Platform Architecture (CPA)
Building Blocks for Big Data

UCS Manager
UCS 6200 Series Fabric Interconnects
LAN, SAN, Management
Nexus 2232 Fabric Extenders
UCS 240 M3 Servers
  
Cisco Big Data Common Platform Architecture
Single-SKU Big Data SmartPlay Bundles

Half-Rack UCS Solutions Bundle for MPP Configuration (UCS-EZ-BD-STRT)
•  2 x UCS 6248
•  2 x Nexus 2232 PP
•  8 x C240 M3 (SFF)
•  2x E5-2690
•  256GB
•  24x 600GB 10K SAS

Single Rack UCS Solutions Bundle for Hadoop Capacity (UCS-EZ-BD-HC)
•  2 x UCS 6296
•  2 x Nexus 2232 PP
•  16 x C240 M3 (LFF)
•  E5-2640 (12 cores)
•  128GB
•  12x 3TB 7.2K SATA

Single Rack UCS Solutions Bundle for Hadoop Performance (UCS-EZ-BD-HP)
•  2 x UCS 6296
•  2 x Nexus 2232 PP
•  16 x C240 M3 (SFF)
•  2x E5-2665 (16 cores)
•  256GB
•  24 x 1TB 7.2K SAS

The Big Data Acceleration Kit

Cisco Components
•  16 node UCS CPA Solution
•  Cisco SKUs: UCS-EZ-BD-HP and UCS-EZ-BD-HC

MapR Components
•  16-node M7 license
•  (2) Free Administrator Training Credits
•  Installation and configuration
•  Data strategy and exploration
•  MapR SKU: M7-16-CISCO-12

http://www.cisco.com/en/US/docs/unified_computing/ucs/UCS_CVDs/Cisco_UCS_CPA_for_Big_Data_with_MapR.html
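
As a rough illustration of what the three bundle configurations above provide, the following Python sketch multiplies out raw disk capacity. It assumes the drive counts on the slide are per server, and it reports raw decimal gigabytes before any replication, RAID, or filesystem overhead. The `Bundle` type and `raw_capacity_gb` helper are hypothetical names used only for this example.

```python
from typing import NamedTuple


class Bundle(NamedTuple):
    """One bundle configuration, identified by its server and drive layout."""
    label: str
    nodes: int
    drives_per_node: int  # assumption: slide drive counts are per server
    drive_gb: int         # decimal gigabytes per drive


BUNDLES = [
    Bundle("8 x C240 M3 (SFF), 24 x 600GB 10K SAS", 8, 24, 600),
    Bundle("16 x C240 M3 (LFF), 12 x 3TB 7.2K SATA", 16, 12, 3000),
    Bundle("16 x C240 M3 (SFF), 24 x 1TB 7.2K SAS", 16, 24, 1000),
]


def raw_capacity_gb(bundle: Bundle) -> int:
    """Raw capacity in GB: nodes x drives per node x drive size."""
    return bundle.nodes * bundle.drives_per_node * bundle.drive_gb


if __name__ == "__main__":
    for bundle in BUNDLES:
        print(f"{bundle.label}: {raw_capacity_gb(bundle) / 1000:.1f} TB raw")
```

Usable capacity would be lower once the cluster's replication factor is applied; that factor is not stated on the slide, so it is left out here.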
  
Hadoop Hardware Evolving in the Enterprise

Typical 2009 Hadoop node
•  1RU server
•  4 x 1TB 3.5” spindles
•  2 x 4-core CPU
•  1 x GE
•  24 GB RAM
•  Single PSU
•  Running Apache
•  $

Economics favor “fat” nodes
•  6x-9x more data/node
•  3x-6x more IOPS/node
•  Saturated gigabit, 10GE on the rise
•  Fewer total nodes lowers licensing/support costs
•  Increased significance of node and switch failure

Typical 2012 Hadoop node
•  2RU server
•  12 x 3TB 3.5” or 24 x 1TB 2.5” spindles
•  2 x 8-core CPU
•  1-2 x 10GE
•  128 GB RAM
•  Dual PSU
•  Running MapR
•  $$$
  
Seamless Integration with Enterprise Applications

Connections: ETH 1, ETH 2, SAN A, SAN B, MGMT
Layers: Uplink Ports, OOB Mgmt, Fabric Switch, Server Ports, Fabric Extenders, Virtualized Adapters, Compute Blades (Half / Full width)
Fabric interconnects: 6200 Fabric A, 6200 Fabric B (Cluster)
Chassis 1: FEX A, FEX B, CNA, B200
Rack Mount: FEX A, FEX B, CNA
	
  
Extending UCS Enterprise Application Ecosystem to Big Data

Big Data Common Platform Architecture
Enterprise Applications
UCS Rack-Mount Servers
UCS Blade Servers
SAN/NAS Arrays
UCSM policy-based management, provisioning, and monitoring for Big Data Infrastructure
UCS Management (160 Nodes per UCS Managed Cluster Domain)

Inventory & Asset Mgmt
•  Cluster Layout and Inventory
•  Per-Server Inventory
•  ID Pools (MAC, IP, UUID) Management

Fault Detection & SW Updates
•  Fault detection & Logs
•  Event Aggregation
•  System software updates

QoS Policies & Power Capping
•  QoS Policy definition
•  Policy driven framework
•  Policy Based Power Capping
	
  
CPA: High-performance unified fabric and compute increases cluster efficiency
Single wire for data and management

8 x 10GE uplinks per FEX = 2:1 oversub (16 servers/rack), no portchannel (static pinning)

2 x 10GE links per server for all traffic, data and management
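
The 2:1 oversubscription figure follows from simple bandwidth arithmetic, sketched below in Python. The sketch assumes each of the 16 servers sends one of its two 10GE links to each of two fabric extenders in the rack, so each FEX sees 16 server-facing links against 8 uplinks. That wiring is an assumption for illustration, and the `oversubscription` function is a hypothetical helper, not a Cisco tool.

```python
def oversubscription(
    servers: int,
    links_per_server_to_fex: int,
    fex_uplinks: int,
    link_gbps: float = 10.0,
) -> float:
    """Ratio of server-facing bandwidth to uplink bandwidth on one FEX."""
    downstream_gbps = servers * links_per_server_to_fex * link_gbps
    upstream_gbps = fex_uplinks * link_gbps
    return downstream_gbps / upstream_gbps


if __name__ == "__main__":
    # 16 servers per rack, one 10GE link per server to this FEX (assumed),
    # 8 x 10GE uplinks per FEX as stated on the slide.
    ratio = oversubscription(servers=16, links_per_server_to_fex=1, fex_uplinks=8)
    print(f"Oversubscription per FEX: {ratio:g}:1")
```

With those inputs the function returns 2.0, matching the 2:1 figure quoted on the slide.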
  

Cisco Unified IO Grant Bandwidth

Figure: bandwidth at times t1, t2, and t3 for three traffic classes, labelled Individual Ethernets and Prioritised QoS, with per-class values between 2G/s and 5G/s.
Traffic classes: LAN Traffic (HDFS Import), Cluster Traffic (Shuffle), Application Traffic (HBase)

•  Near wire speed without CPU load
•  Dynamic bandwidth management according to SLAs
•  See network section for more
  
Scaling the CPA

L2/L3 Switching
Single Rack: 16 servers
Single Domain: up to 10 racks, 160 servers (UCS Manager)
Multiple Domains (UCS Central)
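
The scaling limits on this slide (16 servers per rack, 160 servers per domain) translate into a quick sizing calculation. The Python sketch below is one way to express it; the `plan` function and constant names are illustrative only, and it assumes racks and domains are filled to their stated limits.

```python
import math

SERVERS_PER_RACK = 16     # "Single Rack: 16 servers"
SERVERS_PER_DOMAIN = 160  # "Single Domain: up to 10 racks, 160 servers"


def plan(servers: int) -> tuple[int, int]:
    """Return (racks, UCS domains) needed for a given number of servers."""
    if servers < 0:
        raise ValueError("servers must be non-negative")
    racks = math.ceil(servers / SERVERS_PER_RACK)
    domains = math.ceil(servers / SERVERS_PER_DOMAIN)
    return (racks, domains)


if __name__ == "__main__":
    for n in (16, 160, 400):
        racks, domains = plan(n)
        print(f"{n} servers: {racks} rack(s), {domains} domain(s)")
```

A cluster that needs more than one domain would then be managed across domains, which the slide attributes to UCS Central.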
  
Big Data Infrastructure
UCS Multi-Domain (UCS Central Manages up to 10,000 nodes)

•  Inventory, Fault, Log, Event Aggregation
•  Global ID Pools, Firmware Updates, Backups and Global Admin Policies
•  Global Service Profiles, Templates & Policies
•  Statistics Aggregation
•  HA for UCS Central Virtual Machine with shared storage
  

Q&A	
  

Big Data Acceleration - Key Benefits

§  Rapid Big Data platform deployment/Accelerate Big Data ROI
§  Ease of infrastructure management and cluster administration
§  Support for mission critical workloads
§  Enterprise-ready workload automation
§  Powerful platform for high performance and high capacity
§  Production ready with full data protection and disaster recovery
§  Support for wide variety of Big Data applications, including but not limited to:
–  data warehouse offload,
–  predictive analytics,
–  360° view of the customer,
–  recommendation engine, and
–  long-term data store
  

Big Data Acceleration Kit
Helping You Get Started

Consulting Services
✓  Data strategy & exploration
✓  Integration planning
✓  Installation & configuration

16-node M7 UCS Cluster
✓  Highly scalable Cisco UCS CPA solution
✓  HA and full data protection
✓  Advanced admin console

Formal Training & Support
✓  Free admin training for (2)
✓  24/7 support

Hadoop Self Training
✓  Series of jumpstart videos
✓  User forum access
	
  
Thank You
  

©MapR	
  Technologies	
  -­‐	
  Confiden6al	
  

87
	
  

More Related Content

What's hot

Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...Facultad de Informática UCM
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache DrillDataWorks Summit
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged ApplicationsMapR Technologies
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkDatabricks
 
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
20140228 - Singapore - BDAS - Ensuring Hadoop Production SuccessAllen Day, PhD
 
BIG DATA: From mammoth to elephant
BIG DATA: From mammoth to elephantBIG DATA: From mammoth to elephant
BIG DATA: From mammoth to elephantRoman Nikitchenko
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR Technologies
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningMapR Technologies
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainMapR Technologies
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR Technologies
 
Cisco connect toronto 2015 big data sean mc keown
Cisco connect toronto 2015 big data  sean mc keownCisco connect toronto 2015 big data  sean mc keown
Cisco connect toronto 2015 big data sean mc keownCisco Canada
 
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationIndexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationCesare Cugnasco
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR Technologies
 
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
Show me the Money! Cost & Resource  Tracking for Hadoop and Storm Show me the Money! Cost & Resource  Tracking for Hadoop and Storm
Show me the Money! Cost & Resource Tracking for Hadoop and Storm DataWorks Summit/Hadoop Summit
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...StreamNative
 

What's hot (20)

Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...Paradigmas de procesamiento en  Big Data: estado actual,  tendencias y oportu...
Paradigmas de procesamiento en Big Data: estado actual, tendencias y oportu...
 
Drilling into Data with Apache Drill
Drilling into Data with Apache DrillDrilling into Data with Apache Drill
Drilling into Data with Apache Drill
 
How Spark is Enabling the New Wave of Converged Applications
How Spark is Enabling  the New Wave of Converged ApplicationsHow Spark is Enabling  the New Wave of Converged Applications
How Spark is Enabling the New Wave of Converged Applications
 
MapR 5.2 Product Update
MapR 5.2 Product UpdateMapR 5.2 Product Update
MapR 5.2 Product Update
 
Optimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache SparkOptimizing Delta/Parquet Data Lakes for Apache Spark
Optimizing Delta/Parquet Data Lakes for Apache Spark
 
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
20140228 - Singapore - BDAS - Ensuring Hadoop Production Success
 
BIG DATA: From mammoth to elephant
BIG DATA: From mammoth to elephantBIG DATA: From mammoth to elephant
BIG DATA: From mammoth to elephant
 
MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -MapR on Azure: Getting Value from Big Data in the Cloud -
MapR on Azure: Getting Value from Big Data in the Cloud -
 
Deep Learning vs. Cheap Learning
Deep Learning vs. Cheap LearningDeep Learning vs. Cheap Learning
Deep Learning vs. Cheap Learning
 
The Evolution of Big Data Pipelines at Intuit
The Evolution of Big Data Pipelines at Intuit The Evolution of Big Data Pipelines at Intuit
The Evolution of Big Data Pipelines at Intuit
 
Keys for Success from Streams to Queries
Keys for Success from Streams to QueriesKeys for Success from Streams to Queries
Keys for Success from Streams to Queries
 
Evolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and RainEvolving Beyond the Data Lake: A Story of Wind and Rain
Evolving Beyond the Data Lake: A Story of Wind and Rain
 
MapR and Cisco Make IT Better
MapR and Cisco Make IT BetterMapR and Cisco Make IT Better
MapR and Cisco Make IT Better
 
Real-World NoSQL Schema Design
Real-World NoSQL Schema DesignReal-World NoSQL Schema Design
Real-World NoSQL Schema Design
 
Asd 2015
Asd 2015Asd 2015
Asd 2015
 
Cisco connect toronto 2015 big data sean mc keown
Cisco connect toronto 2015 big data  sean mc keownCisco connect toronto 2015 big data  sean mc keown
Cisco connect toronto 2015 big data sean mc keown
 
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integrationIndexing 3-dimensional trajectories: Apache Spark and Cassandra integration
Indexing 3-dimensional trajectories: Apache Spark and Cassandra integration
 
MapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data PlatformMapR 5.2: Getting More Value from the MapR Converged Data Platform
MapR 5.2: Getting More Value from the MapR Converged Data Platform
 
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
Show me the Money! Cost & Resource  Tracking for Hadoop and Storm Show me the Money! Cost & Resource  Tracking for Hadoop and Storm
Show me the Money! Cost & Resource Tracking for Hadoop and Storm
 
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
Change Data Capture to Data Lakes Using Apache Pulsar and Apache Hudi - Pulsa...
 

Viewers also liked

Cosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARECosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWAREFernando Lopez Aguilar
 
Facebook Retrospective - Big data-world-europe-2012
Facebook Retrospective - Big data-world-europe-2012Facebook Retrospective - Big data-world-europe-2012
Facebook Retrospective - Big data-world-europe-2012Joydeep Sen Sarma
 
Facebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage ChallengeFacebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage ChallengeDataWorks Summit
 
A Survey of Petabyte Scale Databases and Storage Systems Deployed at Facebook
A Survey of Petabyte Scale Databases and Storage Systems Deployed at FacebookA Survey of Petabyte Scale Databases and Storage Systems Deployed at Facebook
A Survey of Petabyte Scale Databases and Storage Systems Deployed at FacebookBigDataCloud
 
Hw09 Rethinking The Data Warehouse With Hadoop And Hive
Hw09   Rethinking The Data Warehouse With Hadoop And HiveHw09   Rethinking The Data Warehouse With Hadoop And Hive
Hw09 Rethinking The Data Warehouse With Hadoop And HiveCloudera, Inc.
 
Facebook - Jonthan Gray - Hadoop World 2010
Facebook - Jonthan Gray - Hadoop World 2010Facebook - Jonthan Gray - Hadoop World 2010
Facebook - Jonthan Gray - Hadoop World 2010Cloudera, Inc.
 
Storage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook MessagesStorage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook Messagesyarapavan
 
Creating a Culture of Data @ Facebook - TCCEU13
Creating a Culture of Data @ Facebook - TCCEU13Creating a Culture of Data @ Facebook - TCCEU13
Creating a Culture of Data @ Facebook - TCCEU13Andy Kriebel
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataCloudera, Inc.
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprisesmarkgrover
 
Hive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use CasesHive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use Casesnzhang
 
Webinar: Oracle R12 Warehouse Management System (WMS) Overview
Webinar: Oracle R12 Warehouse Management System (WMS) OverviewWebinar: Oracle R12 Warehouse Management System (WMS) Overview
Webinar: Oracle R12 Warehouse Management System (WMS) OverviewiWare Logic Technologies Pvt. Ltd.
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversAngelo Failla
 
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookSREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookAngelo Failla
 
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and HadoopFacebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and Hadooproyans
 
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopZheng Shao
 

Viewers also liked (20)

Cosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARECosmos, Big Data GE implementation in FIWARE
Cosmos, Big Data GE implementation in FIWARE
 
ADER RRHH PRESENTACIÓN CORPORATIVA
ADER RRHH PRESENTACIÓN CORPORATIVAADER RRHH PRESENTACIÓN CORPORATIVA
ADER RRHH PRESENTACIÓN CORPORATIVA
 
Facebook Retrospective - Big data-world-europe-2012
Facebook Retrospective - Big data-world-europe-2012Facebook Retrospective - Big data-world-europe-2012
Facebook Retrospective - Big data-world-europe-2012
 
Facebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage ChallengeFacebook's Approach to Big Data Storage Challenge
Facebook's Approach to Big Data Storage Challenge
 
A Survey of Petabyte Scale Databases and Storage Systems Deployed at Facebook
A Survey of Petabyte Scale Databases and Storage Systems Deployed at FacebookA Survey of Petabyte Scale Databases and Storage Systems Deployed at Facebook
A Survey of Petabyte Scale Databases and Storage Systems Deployed at Facebook
 
Project Voldemort
Project VoldemortProject Voldemort
Project Voldemort
 
Hw09 Rethinking The Data Warehouse With Hadoop And Hive
Hw09   Rethinking The Data Warehouse With Hadoop And HiveHw09   Rethinking The Data Warehouse With Hadoop And Hive
Hw09 Rethinking The Data Warehouse With Hadoop And Hive
 
Facebook - Jonthan Gray - Hadoop World 2010
Facebook - Jonthan Gray - Hadoop World 2010Facebook - Jonthan Gray - Hadoop World 2010
Facebook - Jonthan Gray - Hadoop World 2010
 
planning & project management for DWH
planning & project management for DWHplanning & project management for DWH
planning & project management for DWH
 
Storage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook MessagesStorage Infrastructure Behind Facebook Messages
Storage Infrastructure Behind Facebook Messages
 
Planning Data Warehouse
Planning Data WarehousePlanning Data Warehouse
Planning Data Warehouse
 
Creating a Culture of Data @ Facebook - TCCEU13
Creating a Culture of Data @ Facebook - TCCEU13Creating a Culture of Data @ Facebook - TCCEU13
Creating a Culture of Data @ Facebook - TCCEU13
 
Enterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big DataEnterprise Data Hub: The Next Big Thing in Big Data
Enterprise Data Hub: The Next Big Thing in Big Data
 
Hadoop and Hive in Enterprises
Hadoop and Hive in EnterprisesHadoop and Hive in Enterprises
Hadoop and Hive in Enterprises
 
Hive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use CasesHive Training -- Motivations and Real World Use Cases
Hive Training -- Motivations and Real World Use Cases
 
Webinar: Oracle R12 Warehouse Management System (WMS) Overview
Webinar: Oracle R12 Warehouse Management System (WMS) OverviewWebinar: Oracle R12 Warehouse Management System (WMS) Overview
Webinar: Oracle R12 Warehouse Management System (WMS) Overview
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
 
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at FacebookSREConEurope15 - The evolution of the DHCP infrastructure at Facebook
SREConEurope15 - The evolution of the DHCP infrastructure at Facebook
 
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and HadoopFacebooks Petabyte Scale Data Warehouse using Hive and Hadoop
Facebooks Petabyte Scale Data Warehouse using Hive and Hadoop
 
HIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on HadoopHIVE: Data Warehousing & Analytics on Hadoop
HIVE: Data Warehousing & Analytics on Hadoop
 

Similar to Data Warehouse Evolution Roadshow

Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise Steve Jenkins - Business Opportunities for Big Data in the Enterprise
Steve Jenkins - Business Opportunities for Big Data in the Enterprise WeAreEsynergy
 
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTBig Data Solutions on Cloud – The Way Forward by Kiththi Perera SLT
Big Data Solutions on Cloud – The Way Forward by Kiththi Perera SLTKiththi Perera
 
Big data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardBig data solutions on cloud – the way forward
Big data solutions on cloud – the way forwardKiththi Perera
 
How One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 Million
How One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 MillionHow One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 Million
How One Company Offloaded Data Warehouse ETL To Hadoop and Saved $30 MillionDataWorks Summit
 
Exploring the Wider World of Big Data
Exploring the Wider World of Big DataExploring the Wider World of Big Data
Exploring the Wider World of Big DataNetApp
 
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...
MongoDB IoT City Tour LONDON: Hadoop and the future of data management. By, M...MongoDB
 
How Experian increased insights with Hadoop
How Experian increased insights with HadoopHow Experian increased insights with Hadoop
How Experian increased insights with HadoopPrecisely
 
Data warehouse-optimization-with-hadoop-informatica-cloudera
Data warehouse-optimization-with-hadoop-informatica-clouderaData warehouse-optimization-with-hadoop-informatica-cloudera
Data warehouse-optimization-with-hadoop-informatica-clouderaJyrki Määttä
 
ADV Slides: 2021 Trends in Enterprise Analytics
ADV Slides: 2021 Trends in Enterprise AnalyticsADV Slides: 2021 Trends in Enterprise Analytics
ADV Slides: 2021 Trends in Enterprise AnalyticsDATAVERSITY
 
Fueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantageFueling AI & Machine Learning: Legacy Data as a Competitive Advantage
Fueling AI & Machine Learning: Legacy Data as a Competitive AdvantagePrecisely
 
Operational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresOperational Analytics Using Spark and NoSQL Data Stores
Operational Analytics Using Spark and NoSQL Data StoresDATAVERSITY
 
Keynote Address at 2013 CloudCon: Future of Big Data by Richard McDougall (In...
Keynote Address at 2013 CloudCon: Future of Big Data by Richard McDougall (In...Keynote Address at 2013 CloudCon: Future of Big Data by Richard McDougall (In...
Keynote Address at 2013 CloudCon: Future of Big Data by Richard McDougall (In...exponential-inc
 
Big Data in small words
Big Data in small wordsBig Data in small words
Big Data in small wordsYogesh Tomar
 
Using Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeUsing Data Platforms That Are Fit-For-Purpose
Using Data Platforms That Are Fit-For-PurposeDATAVERSITY
 
The Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubThe Future of Data Management: The Enterprise Data Hub
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
 
Integrating Hadoop into your enterprise IT environment
Integrating Hadoop into your enterprise IT environmentIntegrating Hadoop into your enterprise IT environment
Integrating Hadoop into your enterprise IT environmentMapR Technologies
 
Separating Hadoop Myths from Reality by ROB ANDERSON at Big Data Spain 2013
 Separating Hadoop Myths from Reality by ROB ANDERSON at Big Data Spain 2013 Separating Hadoop Myths from Reality by ROB ANDERSON at Big Data Spain 2013
Separating Hadoop Myths from Reality by ROB ANDERSON at Big Data Spain 2013Big Data Spain
 
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data PlatformDeploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data Platform
Deploy Apache Spark™ on Rackspace OnMetal™ for Cloud Big Data PlatformRackspace
 

Data Warehouse Evolution Roadshow

  • 1. The Data Warehouse Evolution Roadshow ©MapR Technologies - Confidential
  • 2. Agenda: Welcome (MapR); Big Data and your Data Warehouse (MapR); The New Data Warehouse (Informatica); Making the Most of Big Data (MicroStrategy); Enterprise-Grade Hadoop: Use Cases (MapR); Infrastructure Platform For Big Data (Cisco); Getting Started/Q&A (All); Close (MapR)
  • 3. Big Data and Your Data Warehouse
  • 4. “Data is a precious thing and will last longer than the systems themselves.” – Tim Berners-Lee, inventor of the World Wide Web.
  • 5. “Without big data analytics, companies are blind and deaf, wandering out onto the web like deer on a freeway.” – Geoffrey Moore, author and consultant.
  • 6. “If we have data, let’s look at data. If all we have are opinions, let’s go with mine.” – Jim Barksdale, former Netscape CEO
  • 7. Big Data today in the Enterprise: “Too many different types, sources & formats of critical data”. Multiple data sources; Multiple technologies; Multiple copies of data
  • 8. An Enterprise Data Hub. Data feeding the Enterprise Data Hub: Sensor Data; Click Streams; Production Data; Web Logs; Location; Public; Social Media; Sales; SCM; CRM; Billing. ✓ Combine different data sources ✓ Minimize data movement ✓ One platform for analytics
  • 9. Big Data in our World. YouTube users upload 48 hours of new video every minute of the day. Twitter sees roughly 175 million tweets every day, and has more than 465 million accounts. Facebook stores, accesses, and analyzes 30+ petabytes of user-generated data. More than 5 billion people are calling, texting, tweeting and browsing on mobile phones worldwide. 2.7 zettabytes of data exist in the digital universe today. Data production will be 44 times greater in 2020 than it was in 2009.
  • 10. Arrival of Big Data Impacts Data Warehouses. Volume: Prohibitively expensive storage costs. Variety: Inability to process unstructured formats. Velocity: Faster arrival and processing needs.
  • 11. The Hadoop Advantage. Fueling an industry revolution by providing infinite capability to store and process big data; Expanding analytics across data types; Compelling economics: 20 to 100X more cost effective than alternatives. Pioneered at [logos]
  • 12. Important Drivers for Hadoop. Data on compute drives efficiencies and better analytics; With Hadoop you don’t need to know what questions to ask beforehand; Simple algorithms on Big Data outperform complex models; Powerful ability to analyze unstructured data
  • 13. What is the Best Way to Deploy Hadoop? Transitory Data Store: No long-term scale advantages; Unprotected data; ETL Tool focus. vs. Permanent Data Store (Enterprise Data Hub): Highly available and fully protected data; Works with existing tools; Real-time ingestion and extraction; Archive data from data warehouse
  • 14. “Hadoop ingests and stores data very cost effectively, and handles workloads such as the simple transformations in ETL. On the other hand, Hadoop does not address the mission-critical complex business analytic workloads…” – Mike Koehler, CEO Teradata
  • 15. Data Warehouse Optimized: Cost Savings. Sensor Data and Web Logs flow into Hadoop (ETL + Long Term Storage), then into the RDBMS DW (Query + Present). Benefits: ✓ Both structured and unstructured data ✓ Expanded analytics with MapReduce, NoSQL, etc. Cost / Terabyte by solution: Hadoop $333; Teradata Warehouse Appliance $16,500 (Hadoop advantage: 50x savings); Oracle Exadata $14,000 (42x savings); IBM Netezza $10,000 (30x savings)
  • 16. Existing Data; Social Data; Weblog Data; Telemetry; The Enterprise Data Hub for Hadoop Compute; Freed Up Space; Fraud Detection Application; Enterprise Data Warehouse; Recommendation Engine
  • 17. Multi-Tenant Capabilities to Share a Cluster Successfully. Isolation: Data placement control; Label based job scheduling. Quotas: Storage, CPU, Memory. Security and delegation: ACLs; AD, LDAP, Linux PAM. Reporting: About 70 resource usage metrics; REST API integration
  • 18. One Platform for Big Data. Batch, Interactive, Real-Time. Log file Analysis; Data Warehouse Offload; Fraud Detection; Clickstream Analytics; Forensic Analysis; Analytic Modeling; BI User Focus; Sensor Analysis; “Twitterscraping”; Telematics; Process Optimization. MapReduce; File-Based Applications; SQL; Database; Search; Stream Processing; … 99.999% HA; Data Protection; Disaster Recovery; Scalability & Performance; Enterprise Integration; Multi-tenancy
  • 19. MapR Means More from Hadoop
  • 20. The New Data Warehouse: Big Data + Hadoop
  • 21. Agenda: Big Data and Data Warehouse Optimization; What Are Customers Doing to Optimize their Data Warehouse?; Informatica on Hadoop Complements Your Data Warehouse
  • 22. Big Data and Data Warehouse Optimization
  • 23. [Timeline chart] Eras: 1960s-1970s, 1980s, 1990s, 2007, 2011, 2014. Technology and number of sources: Mainframe 10^2; Client-Server 10^4; Web 10^6; Cloud 10^7; Social 10^9; Internet of Things 10^11. Users: Few Employees; Many Employees; Customers/Consumers; Business Ecosystems; Communities & Society; Devices & Machines. Business value: Back Office Automation; Front Office Productivity; E-Commerce; Line-of-Business Self-Service; Social Engagement; Real-Time Optimization. Technologies: OS/360
  • 24. Informatica + Hadoop: PowerCenter Developers are Now Hadoop Developers. Archive; Profile; Parse; ETL; Cleanse; Match. Transactions, OLTP, OLAP; Documents and Emails; Social Media, Web Logs; Machine Device, Scientific. Analytics & Op Dashboards; Mobile Apps; Real-Time Alerts
  • 25. Data Warehouse Optimization. 1. Identify inactive & infrequently used data 2. Offload data & processing to Hadoop 3. Ingest raw data, replicate changes & schemas 4. Store & prepare (e.g. ETL) data on Hadoop 5. Move high value results data into DW. Diagram labels: Data Warehouse; Reports; Transactions, OLTP, OLAP; Documents and Emails; Social Media, Web Logs; Machine Device, Scientific
  • 26. PowerCenter Big Data Edition. Minimize Risk: Quickly staff projects with trained experts. Map Once. Deploy Anywhere™: Deploy On-Premise or in the Cloud; Traditional Grid
  • 27. What Are Customers Doing to Optimize their Data Warehouse?
  • 28. Minimize risk and grow digital business. The Challenge: Grow digital business to 30% ($1.8B) and reduce fraud. The Solution: PowerCenter Big Data Edition (Profile, Parse, ETL). Relational - SQL Server, Oracle, DB2, AS400, Mainframe; Surveys & Net Promoter Scores (NPS); Social Media, Web Logs, JSON, XML; Machine, Forensic, Splunk; Netezza, SQL Server, Oracle, SAS; BI / Analytics Visualization & Reporting. The Result: Comprehensive data integration platform to integrate large volumes of data from 18+ systems; Ability to use existing skill sets & make them more productive; Lowest risk as industry leader. Large Global Financial Services and Communications Company
  • 29. Reduce Costs & Increase Revenue: Consolidate Data on Hadoop & Provide 360 View of Customer. The Challenge: Data increasing 20x every year, with costs rising from $17K per day to $50K per day within 6 months. Time to deliver information taking too long. The Solution: PowerCenter Big Data Edition & Data Validation. Diagram labels: Transactions from 70 Data Centers; In-Store POS Data; B2B Data Exchange; Data from Gaming Consoles, TV, Tablets, Readers, & Clickstreams from 5000 Web Sites; 172 TB; Traditional Grid; Data Warehouse; Business Reports. Expected Result: • Gain 360 view of customer behavior, increase cross-sell & up-sell revenue • Reduce data storage costs from $50K per day to $500 per day • Reduce time to deliver information to business from 48 hours to 15 minutes. Large Global Media & Entertainment Company.
  • 30. Flexible architecture to support rapid changes. The Challenge: Data volumes growing 3-5 times over the next 2-3 years. The Solution (diagram labels): Mainframe; RDBMS; Unstructured Data; Traditional Grid; Data Virtualization; DW; EDW; DW; Business Reports. The Result: • Manage data integration and load of 10+ billion records from multiple disparate data sources • Flexible data integration architecture to support changing business requirements in a heterogeneous data management environment. Large Government Agency.
  • 31. Lower costs of Big Data projects. The Challenge: Data warehouse exploding with over 200TB of data. User activity generating up to 5 million queries a day, impacting query performance. The Solution (diagram labels): ERP; CRM; Custom; Interaction Data; EDW; Archived Data; Business Reports; Phase 1. The Result: • Saved $20M + $2-3M on-going by archiving & optimization • Reduced project timeline from 6 months to 2 weeks • Improved performance by 25% • Return on investment in less than 6 months. Large Global Financial Institution.
  • 32. Lower costs and minimize risk. The Challenge: Increasing demand for faster data driven decision making and analytics as data volumes and processing loads rapidly increase. The Solution (diagram labels): RDBMS; RDBMS; Web Logs; Traditional Grid; Near Real-Time Datamarts; Data Warehouse. The Result: • Cost-effectively scale performance • Increased agility by standardizing on one data integration platform • Lower hardware costs • Leverage new data sources for faster innovation. Large Global Financial Institution.
  • 33. Informatica on Hadoop Complements Your Data Warehouse
  • 34. Maximize Your Return On Big Data: Hadoop complements your existing infrastructure. Diagram labels: Data Assets; Operational Systems; Analytical Systems; Data Products; OLTP; OLTP; ODS; MDM; Data Warehouse; Data Mart; Transactions, OLTP, OLAP; Documents, Email & other; NoSQL; Social Media, Web Logs; Machine Device, Scientific. Pipeline: Access & Ingest; Parse & Prepare; Discover & Profile; Transform & Cleanse; Extract & Deliver. Manage (i.e. Security, Performance, Governance, Collaboration).
  • 35. Data Integration & Quality on Hadoop. 1. Entire Informatica mapping translated to Hive Query Language. 2. Optimized HQL converted to MapReduce & submitted to Hadoop cluster (job tracker). 3. Advanced mapping transformations executed on Hadoop through User Defined Functions using Vibe. SELECT T1.ORDERKEY1 AS ORDERKEY2, T1.li_count, orders.O_CUSTKEY AS CUSTKEY, customer.C_NAME, customer.C_NATIONKEY, nation.N_NAME, nation.N_REGIONKEY FROM ( SELECT TRANSFORM (L_Orderkey.id) USING CustomInfaTx FROM lineitem GROUP BY L_ORDERKEY ) T1 JOIN orders ON (customer.C_ORDERKEY = orders.O_ORDERKEY) JOIN customer ON (orders.O_CUSTKEY = customer.C_CUSTKEY) JOIN nation ON (customer.C_NATIONKEY = nation.N_NATIONKEY) WHERE nation.N_NAME = 'UNITED STATES' ) T2 INSERT OVERWRITE TABLE TARGET1 SELECT * INSERT OVERWRITE TABLE TARGET2 SELECT CUSTKEY, count(ORDERKEY2) GROUP BY CUSTKEY; Diagram labels: Hive-QL; MapReduce; UDF.
  • 36. Accelerate Development: Reuse and Import PowerCenter Metadata. Import and validate existing PowerCenter mappings before running on Hadoop.
  • 37. Hadoop Data Profiling Results. Hadoop Data Profiling results, exposed to anyone in the enterprise via browser. 1. Profiling Stats: Min/Max Values, NULLs, Inferred Data Types, etc. Stats to identify outliers and anomalies in data. 2. Value & Pattern Analysis of Hadoop Data: Value and Pattern Frequency to isolate inconsistent/dirty data or unexpected patterns. 3. Drilldown Analysis (into Hadoop Data): Drill down into actual data values to inspect results across entire data set, including potential duplicates. Screenshots: CUSTOMER_ID example; COUNTRY CODE example.
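The profiling statistics and pattern-frequency analysis described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Informatica profiler: the sample country codes and the shape notation (9 for a digit, A for a letter) are assumptions made for the example.

```python
import re
from collections import Counter

def pattern_of(value):
    """Map digits to 9 and letters to A so values with the same shape group together."""
    shape = re.sub(r"[0-9]", "9", value)
    return re.sub(r"[A-Za-z]", "A", shape)

def profile(values):
    """Basic column statistics plus value-pattern frequencies."""
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "nulls": len(values) - len(present),
        "min": min(present) if present else None,
        "max": max(present) if present else None,
        "patterns": Counter(pattern_of(v) for v in present),
    }

# Hypothetical COUNTRY CODE column with a few dirty values.
country_codes = ["US", "GB", "us", None, "U5", "DE", "USA"]
stats = profile(country_codes)
for shape, freq in stats["patterns"].most_common():
    print(shape, freq)
```

Rare shapes such as `A9` or `AAA` stand out against the dominant `AA` pattern, which is the kind of signal a pattern-frequency view is meant to surface.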
  • 38. Hadoop Data Domain Discovery: Finding functional meaning of Data in Hadoop. Leverage INFA rules/mapplets to identify functional meaning of Hadoop data. Sensitive data (e.g. SSN, Credit Card number, etc.). PHI: Protected Health Information. PII: Personally Identifiable Information. Scalable to look for/discover ANY Domain type. View/share report of data domains/sensitive data contained in Hadoop. Ability to drill down to see suspect data values.
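As a rough illustration of domain discovery for sensitive data, the sketch below tags values that look like US Social Security numbers or payment card numbers. It does not represent INFA rules or mapplets: the regular expressions, the `classify` function, and the sample values are assumptions, and the card check uses the standard Luhn checksum.

```python
import re

# Assumed formats: SSN as ddd-dd-dddd, card numbers as 13 to 19 digits.
SSN_RE = re.compile(r"^\d{3}-\d{2}-\d{4}$")
CARD_RE = re.compile(r"^\d{13,19}$")

def luhn_ok(digits):
    """Luhn checksum: double every second digit from the right, sum, test mod 10."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def classify(value):
    """Return a domain label for a value, or None if no domain matches."""
    if SSN_RE.match(value):
        return "SSN"
    compact = value.replace(" ", "").replace("-", "")
    if CARD_RE.match(compact) and luhn_ok(compact):
        return "CREDIT_CARD"
    return None

# Hypothetical sample values; 4111 1111 1111 1111 is a widely used test card number.
samples = ["123-45-6789", "4111 1111 1111 1111", "4111111111111112", "hello"]
labels = [classify(v) for v in samples]
print(labels)
```

Adding a new domain type in this sketch means adding another pattern and label to `classify`, which loosely mirrors the "ANY Domain type" point on the slide.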
  • 39. Unified Administration: Single Place to Manage & Monitor. Full traceability from workflow to MapReduce jobs. View generated Hive scripts.
  • 40. Maximize Your Return on Big Data. Lower Big Data Costs Up To 2X (helps self-fund big data projects) • 5x productivity increase using existing developer skills. Minimize Risk of New Technologies (single platform, quickly staff projects) • Design in PowerCenter, run on Hadoop or any other data platform. Accelerate Innovation (onboard, discover, operationalize) • Enterprise scalability, security, & support.
  • 41. Making the Most of Big Data: Leveraging business intelligence to turn business users into data scientists. MicroStrategy Confidential. Distribution Prohibited without Prior Authorization.
  • 42. Agenda 1. Self Service 2. Information Driven Apps 3. Mobility 4. Advanced Analytics
  • 43. SELF-SERVICE ANALYTICS: Empowering everyone with rapid-fire data exploration and dashboarding
  • 44. Self-Service Analytics Revolutionizes Traditional BI: Boost user satisfaction while massively increasing productivity. More Productive: 5-10x more content per creator (More Content). More Producers: 5-10x more users can create content (More Content Creators). More Collaborative: 5-10x peer-to-peer sharing (More Sharing). >100x more content creation and consumption.
  • 45. Business User Access to 1000s of Data Sources: Faster access to your data. Enterprise Applications: SAP, Oracle e-Business, Siebel, Peoplesoft, etc. Relational Databases: Oracle, SQL Server, MySQL, Teradata, Netezza, etc. Cloud-Based Data: Salesforce.com, NetSuite, Facebook, Eloqua, Google Docs, etc. Personal or Departmental: Spreadsheets, Access databases, CSV, public data downloads, etc. Big Data & Hadoop: MapR. MicroStrategy Modeled Data: Enterprise-certified single version of the truth. Quick Data Import: No SQL or Scripting.
  • 46. Enrich Every Analysis with Added Insight. Professional Sports: Enrich with Weather Data. Impact of weather on game outcome and attendance. Product Sales: Enrich with Demographic Data. Product popularity by demographic segment. Marketing Promotions: Enrich with Social Data. Cross-brand affinity to determine promotions or bundling offers.
  • 47. World-Class Production Dashboard Applications: Information-Driven Apps are the future of dashboards • 100% customized look and feel • Comprehensive data • Easy to use • Guided workflow for consistent user experience • Personalized for each user • Online or distributed via email • Multimedia content-enabled • Transaction-enabled • Live data
  • 48. Beyond Mobile Dashboards: Build great mobile Smart Apps without the pain of native development. Analytics (analytics and data visualization) + Transactions (update systems like ERP/CRM) + Multimedia (add videos and other content). Apps for Every Customer-Facing Process. Apps for Every Internal Business Process. Logistics Apps; Operations Apps; B2E Apps; Data Collection Apps; Product Apps; Context-Aware Apps; Executive Apps.
  • 49. Easy Integration with Third Party Analytic Models: All of an Organization’s Analytics Can Now be Distributed Through a Single Platform. Deploy Any of 5000+ Open Source R Analytics (MicroStrategy R Integration Pack). Import Predictive Models from Popular Packages (PMML Model). Create Your Own Custom Functions (MicroStrategy Custom Function Plug-in). ƒApply(X): As a MicroStrategy metric, use models and functions in any report or dashboard.
  • 50. MicroStrategy Analytics Platform: Comprehensive analytics suite for business. Self-Service Analytics: Rapid-fire data discovery. Enterprise-Grade Business Intelligence: Produce and publish trusted analytics to elevate performance. Big Data Analytics: The power to transform your Big Data into insight. • Intuitive data exploration • Self-service with no IT needed • Access and combine data from all sources • Trusted system-of-record reliability • Advanced and predictive analytics • Easy, cost-effective administration • Fast dashboard development • Comprehensive delivery options with massive user scale • Blazing speed and performance. Web or Mobile. On-Premises or on MicroStrategy Cloud.
  • 51. Two Ways to Experience MicroStrategy Today: Best of all, they’re free! MicroStrategy Analytics Desktop: Fastest, easiest self-service analytics tool for business users. 100% free! MicroStrategy Analytics Express: Cloud-based self-service visual analytics for any organization. Free for one year!
  • 52. Enterprise-Grade Hadoop
  • 53. Use Cases
  • 54. Data Warehouse Offload: Cost Savings + Analytics. Diagram labels: RDBMS; Sensor Data; Web Logs; Hadoop (ETL + Long Term Storage); DW (Query + Present). Benefits: ✓ Both structured and unstructured data ✓ Expanded analytics with MapReduce, NoSQL, etc. Table (Solution: Cost / Terabyte, Hadoop Advantage): Hadoop: $333. Teradata Warehouse Appliance: $16,500, 50x savings. Oracle Exadata: $14,000, 42x savings. IBM Netezza: $10,000, 30x savings.
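The "Hadoop Advantage" column follows from dividing each platform's cost per terabyte by the Hadoop figure. A small Python check, using only the numbers on the slide (the variable names are illustrative):

```python
# Cost per terabyte in USD, as listed on the slide.
COST_PER_TB = {
    "Hadoop": 333,
    "Teradata Warehouse Appliance": 16_500,
    "Oracle Exadata": 14_000,
    "IBM Netezza": 10_000,
}

hadoop_cost = COST_PER_TB["Hadoop"]
savings = {
    name: round(cost / hadoop_cost)
    for name, cost in COST_PER_TB.items()
    if name != "Hadoop"
}

for name, factor in savings.items():
    print(f"{name}: {factor}x savings")
```

Rounded to the nearest whole number, the ratios reproduce the 50x, 42x, and 30x figures quoted in the table.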
  • 55. Expand Data For Existing Applications • Network security: Network IDS with a 3-day window instead of a 10-minute window • Trade Surveillance: Rogue trader detection on intra-day instead of end-of-day market data • Insurance: Calculate risk triangles for individual properties instead of neighborhoods. Advantages: ✓ 1T files and tables ✓ Real-time data ingestion with streaming writes ✓ 24x7 operations with automated failure recovery ✓ Better hardware utilization with 2x performance
  • 56. Combine Different Data Sources. Diagram labels: POS/Online Data; Retail purchase Info; Streaming writes to Hadoop; Hadoop; Real-time offers. Advantages: ✓ Exponential decrease in time to market ✓ Real-time data ingestion with streaming writes ✓ 1T files and tables ✓ 24x7 operations with automated failure recovery
  • 57. New Analytics • Enhanced search • Real-time event processing • MapReduce-enabled machine learning algorithms. Advantages: ✓ Increased ROI with 2x performance ✓ Highly available, fully data protected environment ✓ Multiple users running different jobs on one cluster
  • 58. Customer Example: Cloud-based predictive analytics platform. Sociocast conducted a POC with the three solutions. Apache HBase ☒ • Compactions • Manual administration • Poor reliability. Cassandra ☒ • Compactions • Manual administration • Eventual consistency. ☑ • No compactions • Zero administration • Strong consistency • 2x Cassandra performance • 3x HBase performance
  • 59. MapR Advantages for Enterprise Data Hub. Enterprise Grade Platform: • 99.999% HA • Full data protection • Disaster recovery. Easiest Integration: • Industry-standard interfaces: NFS, ODBC, LDAP, REST • Streaming writes. Best ROI: • Faster time to market • Eliminate risk • Reuse existing apps and tools
  • 60. In the era of the “Internet Of Everything”: Unified Computing Systems, The Infrastructure Platform For Big Data
  • 62. “The internet of everything will provide a 21% increase in corporate profits in the next 10 years”
  • 63. How many IP addresses does your home have? IPv6
  • 65. How will the internet of things change Basketball?
  • 66. Facebook And Cisco Let Brick-&-Mortars Demand Customers Check-In To Get Wi-Fi. 10.03.13 at Interop: Facebook and Cisco roll out a way to help any brick-and-mortar recoup its costs by asking users to check in to get Internet access. Those who oblige get dropped on the business’ Facebook Page, and their anonymous, aggregate demographic info is passed to the merchant. http://techcrunch.com/2013/10/02/facebook-wifi/
  • 67. In-Store Manager View & Capabilities. Product Catalog: Product Characteristics; Marketing Description; Quality Data; Multi-media Information; Product Suggestions. Promotion Portfolio: Campaign Management; Customer Segmentation; Location triggered Rules. Consumer Profile: CRM profile; Loyalty status; Consumer Preferences. Application Analytics & Forecasting: Based on Historical Footfall; Heatmap Preferences.
  • 68. Better Retailing? Diagram labels: Retailers Dashboard; Mobility Services Engine; Existing ERP Systems; Existing Retailing Platform; Cisco Wireless WLAN Controller; Cisco Wireless Access Point; Consumer Personal Shopping Assistant.
  • 69. Big Data and Key Infrastructure Attributes (What big data isn’t) • Usually not blade servers (not enough local storage) • Usually not virtualized (hypervisor only adds overhead) • Usually not highly oversubscribed (significant east-west traffic) • Usually not SAN/NAS. Low-cost, DAS-based, scale-out clustered filesystem. Move the compute to the storage. Diagram label: $$$.
  • 70. Cost, Performance, and Capacity. Diagram labels: HW:SW $ split 30:70; Expensive Load; 1TB/hr ETL; Structured Data: Relational Database, $20K/TB; Enterprise Data; Massive Scale-Out Column Store, $10K/TB; Hadoop, NoSQL, $500-$1K/TB; HW:SW $ split 70:30; Unstructured Data: Machine Logs, Web Click Stream, Call Data Records, Satellite Feeds, GPS Data, Sensor Readings, Sales Data, Blogs, Emails, Video.
  • 71. Typical big data deployments. General Purpose IT Data Center (IT Infrastructure, standard IT servers: SAP, VMware, WEB, Big Data): • Experimental use of Big Data • Deployed into IT Ops mandated infrastructures • “Skunk works” • Small to medium clusters. Dedicated “Pod” for Big Data (x86 servers: Big Data): • App team mandated infrastructure • Purpose built for Big Data • Big Data has established business value • Performance matters • Large or small clusters
  • 72. Cisco UCS Common Platform Architecture (CPA): Building Blocks for Big Data. UCS Manager; UCS 6200 Series Fabric Interconnects; Nexus 2232 Fabric Extenders; UCS 240 M3 Servers; LAN, SAN, Management.
  • 73. Cisco Big Data Common Platform Architecture: Single-SKU Big Data SmartPlay Bundles. The Big Data Acceleration Kit. Cisco Components: • 16 node UCS CPA Solution. Cisco SKUs: UCS-EZ-BD-HP and UCS-EZ-BD-HC. MapR Components: • 16-node M7 license • (2) Free Administrator Training Credits • Installation and configuration • Data strategy and exploration • MapR SKU: M7-16-CISCO-12. Bundle headings: Single Rack UCS Solutions Bundle for Hadoop, Performance (UCS-EZ-BD-HP); Single Rack UCS Solutions Bundle for Hadoop, Capacity (UCS-EZ-BD-HC); Half-Rack UCS Solutions Bundle for MPP Configuration (UCS-EZ-BD-STRT). Hardware configurations listed: (a) 2 x UCS 6248, 2 x Nexus 2232 PP, 8 x C240 M3 (SFF), 2x E5-2690, 256GB, 24x 600GB 10K SAS; (b) 2 x UCS 6296, 2 x Nexus 2232 PP, 16 x C240 M3 (LFF), E5-2640 (12 cores), 128GB, 12x 3TB 7.2K SATA; (c) 2 x UCS 6296, 2 x Nexus 2232 PP, 16 x C240 M3 (SFF), 2x E5-2665 (16 cores), 256GB, 24 x 1TB 7.2K SAS. http://www.cisco.com/en/US/docs/unified_computing/ucs/UCS_CVDs/Cisco_UCS_CPA_for_Big_Data_with_MapR.html
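The raw disk capacity of each listed hardware configuration follows from servers x drives per server x drive size. The sketch below works this out in Python. The drive counts are read as per-server figures, and the replication factor of 3 used for the usable-capacity estimate is an assumption for illustration, not a number from the slide.

```python
# (description, servers, drives per server, drive size in GB), from the slide.
CONFIGS = [
    ("8 x C240 M3 (SFF), 24 x 600GB 10K SAS", 8, 24, 600),
    ("16 x C240 M3 (LFF), 12 x 3TB 7.2K SATA", 16, 12, 3000),
    ("16 x C240 M3 (SFF), 24 x 1TB 7.2K SAS", 16, 24, 1000),
]

REPLICATION = 3  # assumed replication factor, not stated on the slide

def raw_gb(servers, drives, drive_gb):
    """Total raw disk capacity of the cluster in GB."""
    return servers * drives * drive_gb

capacities = {}
for label, servers, drives, drive_gb in CONFIGS:
    raw = raw_gb(servers, drives, drive_gb)
    capacities[label] = raw
    print(f"{label}: {raw / 1000:.1f} TB raw, "
          f"about {raw / REPLICATION / 1000:.1f} TB at {REPLICATION}x replication")
```

The LFF configuration with 3TB drives gives the most raw capacity, while the two SFF configurations trade capacity for more spindles per server.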
  • 74. Hadoop Hardware Evolving in the Enterprise. Typical 2009 Hadoop node: • 1RU server • 4 x 1TB 3.5” spindles • 2 x 4-core CPU • 1 x GE • 24 GB RAM • Single PSU • Running Apache • $. Typical 2012 Hadoop node: • 2RU server • 12 x 3TB 3.5” or 24 x 1TB 2.5” spindles • 2 x 8-core CPU • 1-2 x 10GE • 128 GB RAM • Dual PSU • Running MapR • $$$. Economics favor “fat” nodes: • 6x-9x more data/node • 3x-6x more IOPS/node • Saturated gigabit, 10GE on the rise • Fewer total nodes lowers licensing/support costs • Increased significance of node and switch failure
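The "6x-9x more data/node" and "3x-6x more IOPS/node" ranges can be reproduced from the spindle counts on the slide. The Python sketch below assumes that IOPS scale roughly with the number of spindles, which is a simplification introduced here rather than a claim from the deck.

```python
# Disk layout of the typical nodes, as listed on the slide.
NODE_2009 = {"spindles": 4, "tb_per_spindle": 1}
NODE_2012 = [
    {"label": '12 x 3TB 3.5"', "spindles": 12, "tb_per_spindle": 3},
    {"label": '24 x 1TB 2.5"', "spindles": 24, "tb_per_spindle": 1},
]

def capacity_tb(node):
    """Raw disk capacity of one node in TB."""
    return node["spindles"] * node["tb_per_spindle"]

ratios = [
    (
        node["label"],
        capacity_tb(node) / capacity_tb(NODE_2009),   # data per node
        node["spindles"] / NODE_2009["spindles"],     # spindles, a proxy for IOPS
    )
    for node in NODE_2012
]

for label, data_ratio, spindle_ratio in ratios:
    print(f"{label}: {data_ratio:g}x data/node, {spindle_ratio:g}x spindles/node")
```

The 3.5” option gives 9x the data with 3x the spindles, and the 2.5” option gives 6x the data with 6x the spindles, matching both ranges.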
  • 75. Seamless Integration with Enterprise Applications. Diagram labels: ETH 1; ETH 2; SAN A; SAN B; MGMT; MGMT; Uplink Ports; OOB Mgmt; Fabric Switch (6200 Fabric A, 6200 Fabric B, Cluster); Server Ports; Fabric Extenders (FEX A, FEX B); Virtualized Adapters (CNA); Compute Blades, Half / Full width (Chassis 1, B200); Rack Mount.
  • 76. Extending UCS Enterprise Application Ecosystem to Big Data. Diagram labels: Big Data Common Platform Architecture; Enterprise Applications; UCS Rack-Mount Servers; UCS Blade Servers; SAN/NAS Arrays.
  • 77. UCSM policy-based management, provisioning, and monitoring for Big Data Infrastructure. UCS Management (160 Nodes per UCS Managed Cluster Domain). Inventory & Asset Mgmt: • Cluster Layout and Inventory • Per-Server Inventory • ID Pools (MAC, IP, UUID) Management. Fault Detection & SW Updates: • Fault detection & Logs • Event Aggregation • System software updates. QoS Policies & Power Capping: • QoS Policy definition • Policy driven framework • Policy Based Power Capping
  • 78. CPA: High-performance unified fabric and compute increases cluster efficiency. Single wire for data and management. 2 x 10GE links per server for all traffic, data and management. 8 x 10GE uplinks per FEX = 2:1 oversubscription (16 servers/rack), no port channel (static pinning).
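The 2:1 oversubscription figure can be checked with simple arithmetic. The Python sketch below assumes that each server's two 10GE links go to different fabric extenders, so each FEX sees one 10GE link per server; that wiring is an assumption consistent with the slide's numbers, not something the transcript states.

```python
def oversubscription(servers, server_link_gbps, uplinks, uplink_gbps):
    """Ratio of server-facing bandwidth to uplink bandwidth on one FEX."""
    downstream = servers * server_link_gbps
    upstream = uplinks * uplink_gbps
    return downstream / upstream

# 16 servers per rack, one 10GE link per server per FEX (assumed),
# 8 x 10GE uplinks per FEX (from the slide).
ratio = oversubscription(servers=16, server_link_gbps=10, uplinks=8, uplink_gbps=10)
print(f"{ratio:g}:1")
```

With 160 Gb/s of server-facing bandwidth and 80 Gb/s of uplinks per FEX, the result is the 2:1 ratio quoted on the slide.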
  • 79. Cisco Unified IO Grant Bandwidth. [Chart: bandwidth per traffic class at times t1, t2, and t3 for LAN Traffic (HDFS Import), Cluster Traffic (Shuffle), and Application Traffic (HBase), comparing individual Ethernets with prioritised QoS; plotted values range from 2G/s to 5G/s.] • Near Wire Speed without CPU load • Dynamic bandwidth management according to SLAs • See network section for more
  • 80. Scaling the CPA. Single Rack: 16 servers. Single Domain: up to 10 racks, 160 servers (UCS Manager). Multiple Domains (UCS Central). L2/L3 Switching.
  • 81. Big Data Infrastructure: UCS Multi-Domain (UCS Central Manages up to 10,000 nodes) • Inventory, Fault, Log, Event Aggregation • Global ID Pools, Firmware Updates, Backups and Global Admin Policies • Global Service Profiles, Templates & Policies • Statistics Aggregation • HA for UCS Central Virtual Machine with shared storage
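The scaling figures (16 servers per rack, up to 10 racks per UCS Manager domain, up to 10,000 nodes under UCS Central) can be combined into a quick sizing calculation. The helper names in this Python sketch are illustrative, and it assumes every rack and domain is filled to the stated maximum.

```python
import math

SERVERS_PER_RACK = 16        # from the slides
RACKS_PER_DOMAIN = 10        # from the slides
UCS_CENTRAL_MAX_NODES = 10_000  # from the slides

NODES_PER_DOMAIN = SERVERS_PER_RACK * RACKS_PER_DOMAIN

def domains_needed(nodes):
    """Number of UCS Manager domains needed for a cluster of the given size."""
    if nodes > UCS_CENTRAL_MAX_NODES:
        raise ValueError("exceeds the node count managed by one UCS Central")
    return math.ceil(nodes / NODES_PER_DOMAIN)

def racks_needed(nodes):
    """Number of fully populated racks needed for the given node count."""
    return math.ceil(nodes / SERVERS_PER_RACK)

for size in (16, 160, 500, 10_000):
    print(f"{size} nodes: {racks_needed(size)} racks, {domains_needed(size)} domains")
```

A 160-node cluster fits in a single domain, while anything larger needs UCS Central to manage several domains together.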
  • 84. Q&A
  • 85. Big Data Acceleration - Key Benefits • Rapid Big Data platform deployment/Accelerate Big Data ROI • Ease of infrastructure management and cluster administration • Support for mission critical workloads • Enterprise-ready workload automation • Powerful platform for high performance and high capacity • Production ready with full data protection and disaster recovery • Support for wide variety of Big Data applications, including but not limited to: data warehouse offload, predictive analytics, 360° view of the customer, recommendation engine, and long-term data store
  • 86. Big Data Acceleration Kit: Helping You Get Started. Consulting Services: ✓ Data strategy & exploration ✓ Integration planning ✓ Installation & configuration. 16-node M7 UCS Cluster: ✓ Highly scalable Cisco UCS CPA solution ✓ HA and full data protection ✓ Advanced admin console. Formal Training & Support: ✓ Free admin training for (2) ✓ 24/7 support. Hadoop Self Training: ✓ Series of jumpstart videos ✓ User forum access
  • 87. Thank You