Contenu connexe Similaire à In memory computing principles by Mac Moore of GridGain (20) In memory computing principles by Mac Moore of GridGain2. ©
2014
GridGain
Systems,
Inc.
Agenda
• Overview
• Why
In
Memory
&
Use
Cases
• Evolution
of
Architectures
• Concepts
and
Considerations
• In-‐Memory
Data
Fabric
6.5
• Data
Grid
• Clustering
and
Compute
• Streaming
• Hadoop
Acceleration
• Highlights:
Release
6.5
3. ©
2014
GridGain
Systems,
Inc.
Why
In-‐Memory
Computing?
Cloud/SaaS
apps,
Mobile
Computing
back-‐ends,
Internet
of
Things,
Big
Data
analytics,
Social
Networks
–
all
need
to
be
done
in-‐memory
to
reach
Internet
scale
“RAM
is
the
new
disk,
disk
is
the
new
tape.”
RAM
is
3,000
times
faster
than
spinning
disks.
By
moving
data
from
disk
to
RAM
and
employing
modern
in-‐memory
data
grid
technology,
things
get
fast.
Really,
really
fast.
4. “In-‐memory
computing
will
have
a
long
term,
disruptive
impact
by
radically
changing
users’
expectations,
application
design
principles,
products’
architectures
and
vendors’
strategies.”
In-‐memory
computing
is
the
future
of
computing…
it
offers
a
massive
potential
not
only
in
TCO
reduction
but
across
all
four
value
dimensions:
performance,
process
innovation,
simplification
and
flexibility.
©
2014
GridGain
Systems,
Inc.
5. “Organizations
that
do
not
consider
adopting
in-‐memory
application
infrastructure
technologies
risk
being
out-‐
innovated
by
competitors
that
are
early
mainstream
users
of
these
capabilities”
©
2014
GridGain
Systems,
Inc.
6. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Computing:
Why
Now?
In-memory will have an industry impact comparable to
web and cloud. RAM is the new disk, and disk is the
new tape.!
In-Memory Computing Market:!
• $13.23B in 2018!
• 2013-2018 CAGR 43%!
DRAM Cost, $
Cost Less
than
2
zetabytes
in
2011,
8
in
2015 drops 30% every 12 months
BigData Technologies Planned
34% will use in-memory technology
Data Growth
7. ©
2014
GridGain
Systems,
Inc.
Top
3
Reasons
for
In-‐Memory
Computing
1. Performance
2. Scalability
3. Future-‐proofing
8. ©
2014
GridGain
Systems,
Inc.
How
In-‐Memory
Computing
Works:
The
Basic
Idea
•Persistence
•Recovery
•Post-‐Processing
•Backup
9. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Technology:
Use
Cases
Data
Velocity,
Data
Volume,
Real-‐Time
Performance
> Automated Trading Systems
Real time analysis of trading positions & market
risk. High volume transactions, ultra low latencies.!
> Financial Services
Fraud Detection, Risk Analysis, Insurance rating and
modeling.!
> Online & Mobile Advertising
Real time decisions, geo-targeting & retail traffic
information.!
> Big Data Analytics
Customer 360 view, real-time analysis of KPIs,
up-to-the-second operational BI.!
> Online Gaming
Real-time back-ends for mobile and massively parallel
games.!
> Bioinformatics & Sciences
High performance genome data matching,
Environmental simulation.
11. ©
2014
GridGain
Systems,
Inc.
Traditional
Architecture
App Server1
User App
Data Requests
Traditional RDBMS Server
Disks
Disks
Disks
Disks
Disks
Disks
App Server2
User App
App Server3
User App
App Server4
User App
App Server n
User App
Processing
Happens Here
Data Requests
Data Requests
Data Requests
Relational
Data
Data is converted
Data is shipped
across the network
for every request
to objects
(Marshaling)
All data is on disk.
This
is
the
central
bottleneck.
Beyond
a
certain
point,
scaling
the
RDBMS
becomes
complex,
expensive
and
difficult
to
manage.
12. ©
2014
GridGain
Systems,
Inc.
Horizontally
Scale
the
RDBMS
Data is converted
Scaling
improves,
but
little
else
changes.
Also
these
are
very
expensive!
Data Requests
RDBMS Server
Server1 Server2
Disks
Disks
Disks
Disks
Disks
Disks
App Server1
User App
App Server2
User App
App Server3
User App
App Server4
User App
App Server n
User App
Processing
Happens Here
Data Requests
Data Requests
Data Requests
Relational
Data
Data is shipped
across the network
for every request
to objects
(Marshaling)
All data is still on disk.
13. ©
2014
GridGain
Systems,
Inc.
IMDG:
Distributed
Caching
Scaling
improves,
as
some
(or
all)
data
is
now
in
RAM.
But,
we
still
have
to
ship
data
to
the
app
for
every
request.
Distributed Cache
Server1 Server2
App Server1
User App
App Server2
User App
Traditional RDBMS Server
Disks
Disks
Disks
Disks
Disks
Disks
App Server3
User App
App Server4
User App
App Server n
User App
Object
Data
RAM
RAM
RAM
RAM
RAM
RAM
Processing
Happens Here
Data is shipped
across the network
for every request
14. ©
2014
GridGain
Systems,
Inc.
GridGain:
IMDG
+
Grid
Compute
By
bringing
computation
to
the
In-‐Memory
Data
Grid,
you
can
now
achieve
the
fastest
possible
results.
User App
User App
User App
User App
In-Memory Compute+Data Grid
Processing
is local, in-memory,
and in native object
format.
All data is in
RAM
Server1
User App
GridGain
Node
RAM
RAM
Server2
GridGain
Node
RAM
RAM
Server3
GridGain
Node
RAM
RAM
Server n
User App
GridGain
Node
RAM
RAM
Server4
User App
GridGain
Node
RAM
RAM
Server5
GridGain
Node
RAM
RAM
Server6
GridGain
Node
RAM
RAM
Server n
User App
GridGain
Node
RAM
RAM
Tasks/Queries
15. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
Performance
Scaling Consistency
Search/Query
Persistence Transactions
Icons
by:
http://www.icons-‐land.com/
http://www.awicons.com/
http://www.aha-‐soft.com/
16. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
Performance > Understand
your
criteria…
> Real-‐Time
Performance
> High
Throughput
> Low
Latency
> Dataset
sizes
(growing)
17. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
> Horizontal
Scaling
(“add
a
brick”)
> Dynamic
Topology
(no
downtime)
> Vertical
Scaling
(large
RAM
allocation
per-‐
process)
> Good
monitoring
tools
(utilization
&
mgmt.)
Scaling
18. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
> Weak
(Eventual)
Consistency
vs
Strong
Consistency
> Per-‐store
or
per-‐cache
configuration
> Understand
impact
on
performance
Consistency
19. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
> Flexible
Persistence
Options
> Read-‐Through,
Write-‐Through,
Write-‐
Behind
> Examples:
Relational
Database,
Local
Storage,
Shared
Storage
Persistence
20. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
> Are
transactions
supported?
> Local
or
Distributed
> Global
or
per-‐store/per
cache
configuration
> Impact
on
performance
Transactions
21. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Data
Grids:
Considerations
> Query
Capability:
SQL
or
Proprietary
API
> Indexing
support
> JDBC/ODBC
Interface
Search/Query
23. Data
Velocity,
Data
Volume,
Real-‐Time
Performance
©
2014
GridGain
Systems,
Inc.
Customer
Use
Cases
> Automated Trading Systems
Real time analysis of trading positions & market
risk. High volume transactions, ultra low latencies.!
> Financial Services
Fraud Detection, Risk Analysis, Insurance rating and
modeling.!
> Online & Mobile Advertising
Real time decisions, geo-targeting & retail traffic
information.!
> Big Data Analytics
Customer 360 view, real-time analysis of KPIs,
up-to-the-second operational BI.!
> Online Gaming
Real-time back-ends for mobile and massively parallel
games.!
> Bioinformatics & Sciences
High performance genome data matching,
Environmental simulation.
24. Gartner
names
GridGain
a
2014
“Cool
Vendor”
for
IMC
©
2014
GridGain
Systems,
Inc.
GridGain
named
a
2014
AlwaysOn
Global
250
Top
Private
Company
“This
positions
GridGain
as
one
of
the
few
IMC
open-‐source
technologies
available
and
the
only
one…
providing
such
a
rich
set
of
functionality.”
25. Use
Case:
• Open
©
2014
GridGain
Systems,
Inc.
Largest bank in Eastern Europe (Russia), and the third largest in Europe
tender
won
by
GridGain
–Goal:
Real-‐time
risk
and
leverage
reporting
on
their
global
financial
trading
portfolio
– Performed
a
detailed
evaluation
and
software
assurance
test
–Delivered
best
performance,
scale
and
high
availability
1 Billion
Transactions per Second
10 Dell R610 servers
1 TB Memory
< $40K
26. ©
2014
GridGain
Systems,
Inc.
GridGain
enters
the
Apache
Software
Foundation
27. ©
2014
GridGain
Systems,
Inc.
GridGain
In-‐Memory
Data
Fabric:
Strategic
Approach
to
IMC
• Supports Applications of
various types and
languages
• Open Source – Apache 2.0!
• Simple Java/C#/C++ APIs!
• 1 JAR Dependency!
• High Performance & Scale!
• Automatic Fault Tolerance!
• Management/Monitoring!
• Runs on Commodity Hardware
• Supports existing & new data sources!
• No need to rip & replace
28. • Distributed
©
2014
GridGain
Systems,
Inc.
In-‐Memory
Key-‐
Value
Store
• Local,
Replicated,
Partitioned
• TBs
of
data,
of
any
type
• On-‐Heap
and
Off-‐Heap
Storage
• Backup
Replicas
/
Automatic
Failover
• Distributed
ACID
Transactions
• SQL
queries
and
JDBC
driver
• Collocation
of
Compute
and
Data
In-‐Memory
Data
Grid
29. • Direct
©
2014
GridGain
Systems,
Inc.
API
for
MapReduce
• Zero
Clustering
&
Compute
Deployment
• Cron-‐like
Task
Scheduling
• State
Checkpoints
• Early
and
Late
Load
Balancing
• Automatic
Failover
• Full
Cluster
Management
• Pluggable
SPI
Design
30. ©
2014
GridGain
Systems,
Inc.
Client-‐Server
vs
Affinity
Colocation
Client-‐
Server
Affinity
Colocation
31. ©
2014
GridGain
Systems,
Inc.
Distributed
Java
Structures
• Distributed
Map
(cache)
• Distributed
Set
• Distributed
Queue
• CountDownLatch
• AtomicLong
• AtomicSequence
• AtomicReference
• Distributed
ExecutorService
32. In-‐Memory
Streaming
and
CEP
• Streaming
©
2014
GridGain
Systems,
Inc.
Data
Never
Ends
• Branching
Pipelines
• Pluggable
Routing
• Sliding
Windows
for
CEP/Continuous
Query
• Real
Time
Analysis
33. • Plug
©
2014
GridGain
Systems,
Inc.
Hadoop
Accelerator
and
Play
installation
• 10x
to
100x
Acceleration
• In-‐Memory
Native
MapReduce
• In-‐Process
Data
Colocation
• GGFS
In-‐Memory
File
System
• Pure
In-‐Memory
• Read-‐Through
from
HDFS
• Write-‐Through
to
HDFS
• Sync
and
Async
Persistence
34. ©
2014
GridGain
Systems,
Inc.
In-‐Memory
Accelerated
Map
Reduce
• In-‐Memory
Native
Performance
• Zero
Code
Change
• Use
existing
MR
code
• Use
existing
Hive
queries
• No
Name
Node
• No
Network
Noise
• In-‐Process
Data
Colocation
37. ©
2014
GridGain
Systems,
Inc.
Cross-‐Language
Interoperability
• Portable
Objects
Language-‐neutral
storage
format.
Allows
you
to
write
data
from
one
language,
and
access
or
modify
it
from
another.
• Performance
Across
Languages
Optimized
neutral
format
that
provides
great
performance
across
all
supported
languages.
• Client
Feature
Parity
Feature
parity
across
supported
language
APIs.
C#
C++
Cross Language Data Interoperability
Java GridGain Data Grid
38. ©
2014
GridGain
Systems,
Inc.
Dynamic
Schema
Changes
• Dynamic
Schema
Changes
Change
data
structures:
add
properties/fields
dynamically
at
runtime
when
needed.
• Searchable
/
Indexable
Index
and
search
into
arbitrary
fields,
without
needing
to
create
annotated
classes.
• Version
Independent
Data
storage
format
no
longer
tied
to
class
versioning
or
product
versioning.
C#
C++
Dynamic Schema Changes
Language Independent Format
Java
39. ©
2014
GridGain
Systems,
Inc.
Grid
Managed
Services
• Automatic
High-‐Availability
Define
services
that
should
be
instantiated
by
the
grid.
• Configurable
Support
for
Grid
Singletons,
Node
Singletons,
and
more.
• Maintenance-‐free
Deployment
of
services
in
the
desired
configuration
is
guaranteed
by
the
GridGain
Grid.
Grid Managed Services
Grid Singleton
Node Singleton
GridGain Data Grid GridGain Data Grid
40. ©
2014
GridGain
Systems,
Inc.
Enterprise
Edition:
Exclusive
Features
> Management & Monitoring
GridGain Visor for GUI-based DevOps!
> Local Restartable Store
fast recovery during planned outages or DR!
> Data Center Replication
Multi-datacenter WAN support!
> Network Segmentation
Protection
Configurable fault-tolerance for network interruptions!
> Security Features
Client Authentication and related SPIs!
> Rolling Production Updates
Perform software upgrades without downtime!
> Support & Maintenance
Support incidents, ticket access, upgrades, patches!
> Training & Consulting
Technical training and customized consulting services!
> Deploy with Confidence
Indemnification for Enterprise Customers
41. ©
2014
GridGain
Systems,
Inc.
Enterprise
&
Open
Source
Comparison
Chart
GridGain
Enterprise
Subscriptions
include
the
following
during
the
subscription
term:
> Right
to
Use
the
Enterprise
Edition
of
the
GridGain
product.
> Bug
fixes,
patches,
updates
and
upgrades
to
the
latest
version
of
the
product.
> 9x5
or
24x7
Support
for
the
product.
> Ability
to
procure
Training
and
Consulting
Services
from
GridGain.
> Confidence
and
protection,
not
provided
under
Open
Source
licensing,
that
only
a
commercial
vendor
can
provide,
such
as
Indemnification.
42. ©
2014
GridGain
Systems,
Inc.
ANY
QUESTIONS?
Thank
you
for
joining
us.
Follow
the
conversation.
www.gridgain.com
@gridgain
#gridgain