Covers the problems of achieving scalability in server farm environments and how distributed data grids provide in-memory storage and boost performance. Includes a summary of ScaleOut Software product offerings, including ScaleOut StateServer and Grid Computing Edition.
2. Agenda
• About ScaleOut Software
• Overview of Products
• What is a Distributed Data Grid (DDG)?
• The Top Six Reasons
• What to Look for in a DDG Product
2 ScaleOut Software, Inc.
3. Company
• Founded in September 2003, privately funded
• Offices in Bellevue, WA and Beaverton, OR
• Team:
– Dr. William Bain, Founder & CEO
• Career focused on parallel computing – Bell Labs, Intel, Microsoft
• 3 prior start-ups, last acquired by Microsoft and product now ships as Network Load Balancing in Windows Server
– David Brinker, COO
• 20 years software business and executive management experience
• Mentor Graphics, Cadence, Webridge
• Develops and markets Linux & Windows DDG products.
• Seven years market experience.
4. It’s All About Scaling Performance
• Scaling performance:
[Figure: Scale out, from one server (CPU, memory, storage) to four servers, each with its own CPU, memory, and storage]
Scaling out:
• Has excellent scalability.
• But is challenging to implement.
5. What is a Distributed Data Grid?
(Aka “distributed cache”, “in-memory data grid”)
• A new “vertical” storage tier:
– Adds missing layer to boost performance.
– Uses in-memory, out-of-process storage.
– Avoids repeated trips to backing storage.
• A new “horizontal” storage tier:
– Allows data sharing among servers.
– Scales performance & capacity.
– Adds high availability.
– Can be used independently of backing storage.
[Figure: storage tiers on two servers: processor cache, L2 cache, application memory (“in-process”), distributed data grid (“out-of-process”), and backing storage]
6. Distributed Data Grids: A Closer Look
• Incorporates a client-side, in-process cache (“near cache”):
– Transparent to the application
– Holds recently accessed data.
• Boosts performance:
– Eliminates repeated network data transfers & deserialization.
– Reduces access times to near “in-process” latency.
– Is automatically updated if the distributed grid changes.
– Supports various coherency models (coherent, polled, event-driven)
[Figure: application memory (“in-process”), client-side cache (“in-process”), distributed data grid (“out-of-process”)]
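The near-cache idea on this slide can be sketched in a few lines of Java. This is a minimal illustration, not ScaleOut's implementation: the class names NearCacheSketch and RemoteGrid, the LRU eviction policy, and the onGridChanged callback are all assumptions made for the example, and the sketch invalidates a stale entry on change rather than updating it in place.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a client-side "near cache" in front of a remote grid. */
class NearCacheSketch {
    /** Stand-in for the out-of-process grid; counts simulated network reads. */
    static class RemoteGrid {
        private final Map<String, String> store = new ConcurrentHashMap<>();
        int remoteReads = 0;

        void put(String key, String value) { store.put(key, value); }

        String get(String key) {
            remoteReads++;            // each call models one network round trip
            return store.get(key);
        }
    }

    private final RemoteGrid grid;
    private final Map<String, String> near;

    NearCacheSketch(RemoteGrid grid, final int capacity) {
        this.grid = grid;
        // Access-ordered LinkedHashMap evicts the least recently used entry.
        this.near = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Serve from the near cache when possible; otherwise fetch and remember. */
    String get(String key) {
        String value = near.get(key);
        if (value == null) {
            value = grid.get(key);
            if (value != null) near.put(key, value);
        }
        return value;
    }

    /** Assumed event-driven coherency hook: drop the entry when the grid copy changes. */
    void onGridChanged(String key) { near.remove(key); }

    public static void main(String[] args) {
        RemoteGrid grid = new RemoteGrid();
        grid.put("price:GOOG", "101.5");
        NearCacheSketch cache = new NearCacheSketch(grid, 100);
        cache.get("price:GOOG");      // miss: goes to the grid
        cache.get("price:GOOG");      // hit: served in-process
        System.out.println("remote reads = " + grid.remoteReads);
    }
}
```

The second read never leaves the process, which is the effect the slide describes as eliminating repeated network transfers and deserialization.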
7. The Need for Memory-Based Storage
Example: Web server farm:
• Load-balancer directs incoming client requests to Web servers.
• Web and app. server farms build Web pages and run business logic.
• Database server holds all mission-critical, LOB data.
• Server farms share fast-changing data using a DDG to avoid bottlenecks and maximize scalability.
[Figure: Internet, load-balancer, Web server farm with a distributed, in-memory data grid, app. server farm with a distributed, in-memory data grid, and database servers with RAID disk array (bottleneck), connected by Ethernet]
8. The Need for Memory-Based Storage
Example: Cloud Application:
• Application runs as multiple, virtual servers (VS).
• Application instances store and retrieve LOB data from cloud-based file system or database.
• Applications need fast, scalable storage for fast-changing data.
• Distributed data grid runs as multiple, virtual servers to provide “elastic,” in-memory storage.
[Figure: cloud application of App VS instances, distributed data grid of Grid VS instances, and cloud-based storage]
9. Scalability Challenges for Applications
• “Scaled out” server applications repeatedly access two types of data:
– Repeatedly referenced database data (e.g., stock prices) and
– Fast changing, business-logic data (e.g., session-state, workflow state)
• Database servers are not designed to meet this need:
Characteristics       Typical DBMS data    Application data
Volume                High                 Low
Lifetime/turnover     Long/slow            Short/fast
Access patterns       Complex              Simple
Data preservation     Critical             Less critical
Fast access/update    Less important       More important
• Scaled-out applications create additional challenges:
– How to make shared application data quickly accessible by any server
– How to maintain fast access and avoid bottlenecks as the server farm grows
– How to keep application data highly available when a server fails
10. Wide Range of Applications for DDGs
Financial Services
• Portfolio risk analysis
• VaR calculations
• Monte Carlo simulations
• Algorithmic trading
• Market message caching
• Derivatives trading
• Pricing calculations
E-commerce
• Session-state storage
• Application state storage
• Online banking
• Loan applications
• Wealth management
• Online learning
• Hotel reservations
• News story caching
• Shopping carts
• Social networking
• Service call tracking
• Online surveys
Other Applications
• Edge servers: chat, email
• Online gaming servers
• Scientific computations
• Command and control
11. Product: ScaleOut StateServer®
Fully distributed data grid designed for storing application data on server farms, compute grids, and the cloud:
• Runs in-memory directly on a farm or grid as a distributed service.
• Automatically:
– Distributes and shares data across the farm.
– Reduces access time.
– Scales when the farm grows.
– Survives when a server fails.
• Cost-effective
• Complements & offloads DBMS.
• Portable across Windows and Linux.
[Figure: Internet-facing Web servers, each running an SOSS service, connected by Ethernet to a DBMS server (bottleneck)]
12. Product: ScaleOut Remote Client Option
• Allows hosting ScaleOut StateServer on a separate server farm.
• Ensures highly available connectivity to SOSS store.
• Automatically load-balances access requests to minimize response times.
• Uses multiple connections to maximize throughput.
[Figure: Web or application server farm of client applications with Windows and Linux remote clients, using load-balanced connections to a ScaleOut StateServer farm of Windows and Linux SOSS hosts]
13. Products: Grid Computing Edition
• Extends ScaleOut StateServer for use in high performance computing (HPC) applications.
• Provides advanced capabilities for parallel data analysis.
• Includes optional management tools.
• Complements SSI’s extended support plans.
[Figure: compute servers with a master and SOSS services; database servers (data bottleneck)]
14. Products: ScaleOut GeoServer Option
Global, Multi-Site Data Grids
• Extends SOSS across multiple sites.
• Protects against site-wide failures.
• Replicates data between SOSS farms.
• Employs scalable, hi-av connections.
• Automatically handles membership changes at remote sites.
• Can support both “push” and “pull” access models.
15. Reason #1: Faster Access Time
• Eliminates repeated network data transfers.
• Eliminates repeated object deserialization.
[Chart: Average Response Time in microseconds (0 to 3500), DDG vs. DBMS; 10KB objects, 20:1 read/update]
16. Example of Faster API Read Access
• Example for direct API access:
– 10 KB objects, 20:1 read/update ratio
– 3-host ScaleOut StateServer store with 3 clients
• Results:
– Distributed cache provided >6X faster read time than database server.
17. Reason #2: Linearly Scalable Throughput
ScaleOut StateServer automatically scales its performance to match
the size and workload of a server farm or HPC compute grid.
[Chart: Read/Write Throughput for 10KB objects: accesses per second (0 to 80,000) vs. number of nodes (4 to 64), with 16,000 to 256,000 objects]
Tests performed in Microsoft Enterprise Engineering Center
18. What is Scalable Throughput?
• What it is (a perfect fit for server farms):
– Workload W takes time T on 1 server (1 W/T).
– Workload 2W takes time T on 2 servers (2 W/T).
– Workload nW takes time T on n servers (n W/T).
– Total completion time (i.e., response time) stays fixed.
• What it is not (common misperception):
– Workload W takes time T/2 on 2 servers (2 W/T).
– Workload W takes time T/n on n servers (n W/T).
• Why increase the workload with more servers?
– Adding servers adds overhead (e.g., networking).
– Increasing workload hides overheads for linear scaling.
– DDG must keep overheads low for linear scaling.
– Must not let network saturate! (Its throughput is fixed.)
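The definition above can be written as a throughput relation. The symbols X(n) for throughput on n servers, o(n) for the added overhead, s for object size, and B for network bandwidth are introduced here for illustration; the overhead and network terms are an assumed simple model, not measurements from the deck.

```latex
% From the slide: n servers complete workload nW in time T
X(n) = \frac{nW}{T} = n \cdot X(1), \qquad X(1) = \frac{W}{T}

% Assumed overhead model: adding servers adds overhead o(n) to the completion time
X(n) = \frac{nW}{T + o(n)} \approx \frac{nW}{T} \quad \text{when } o(n) \ll T

% Assumed network limit: objects of size s over a network of fixed bandwidth B
X(n) \cdot s \le B
```

Under this model, throughput stays linear in n only while the overhead remains small relative to T and the aggregate traffic stays below the fixed network bandwidth.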
19. How SOSS Achieves Scalable Throughput
• Fully peer-to-peer architecture to eliminate bottlenecks.
• Automatically partitioned data storage with dynamic load-balancing.
• Fixed number of replicas per stored object (1 or 2) to avoid order-n overhead (storage and latency)
• Patented technique for scaling quorum updates to stored objects
• Patented, scalable heart-beating algorithm
[Figure: ScaleOut StateServer distributed cache: four Web or application servers, each running a cache service, exchange heartbeats over Ethernet; an object copy and its replica reside on different servers]
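To make the "fixed number of replicas" point concrete, here is a generic sketch of hash partitioning with a constant replica count. It is not ScaleOut's patented algorithm, which the deck does not describe; the class PartitionSketch, the method hostsFor, and the modulo placement rule are assumptions, and the sketch ignores dynamic load-balancing entirely.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: hash partitioning with a fixed replica count per object. */
class PartitionSketch {
    /**
     * Returns the hosts holding an object: the primary first, then its replicas.
     * The replica count is fixed, so per-object cost does not grow with the farm.
     */
    static List<Integer> hostsFor(String key, int hostCount, int replicas) {
        if (hostCount < 1 || replicas < 0) {
            throw new IllegalArgumentException("need hostCount >= 1 and replicas >= 0");
        }
        int copies = Math.min(replicas + 1, hostCount);
        int primary = Math.floorMod(key.hashCode(), hostCount);
        List<Integer> hosts = new ArrayList<>();
        for (int i = 0; i < copies; i++) {
            hosts.add((primary + i) % hostCount);   // replicas go to the next hosts in ring order
        }
        return hosts;
    }

    public static void main(String[] args) {
        // The number of copies stays at 2 whether the farm has 4 or 64 hosts.
        for (int n : new int[] {4, 16, 64}) {
            System.out.println(n + " hosts -> " + hostsFor("session:42", n, 1));
        }
    }
}
```

Plain modulo placement moves many keys when the host count changes, so a real grid would need a more careful scheme; the sketch only shows why a fixed replica count keeps storage and update cost independent of farm size.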
20. Integrated, Powerful Platform for Scaling
• All product features benefit from the scalable, hi-av architecture:
– Ex. Parallel object eventing:
• All hosts handle events.
• Event delivery is hi-av.
– Ex. Global replication:
• All hosts replicate objects.
• Caches automatically handle membership changes.
[Figure: client applications with client libraries connected to the cache services of the ScaleOut StateServer distributed cache; local farm and remote farm]
21. Impact of Scalable TP on Access Latency
• Scalable, distributed data grid scales throughput and thereby maintains low latency:
– DDG scales throughput by adding servers.
– Avoids throughput barrier of a DBMS or file system.
– Maintains low latency as throughput increases.
– Network bandwidth is only throughput limit.
– Also has inherently lower latency due to:
• Memory-based storage
• Client-side caching
[Chart: Access Latency (msec) vs. Throughput (accesses / sec), SOSS vs. DBMS]
22. Putting it Together: How SOSS Works
• Creating or updating an object:
– Client connects to a SOSS service instance and makes request.
– Local SOSS service load-balances request to a selected host.
– Selected host creates object and one or two remote replicas.
[Figure: client connected to four SOSS servers]
23. How SOSS Works
• Reading an object:
– Client connects to SOSS service and makes request.
– Local SOSS service forwards to selected host.
– Selected host returns object’s data.
– Requesting host caches object for future reads.
[Figure: client connected to four SOSS servers]
24. How SOSS Works
• Adding a new host:
– Neighboring hosts detect SOSS on new host.
– Hosts automatically establish new membership.
– Neighbor hosts migrate objects to new host to rebalance load.
[Figure: five SOSS servers]
25. Reason #3: High Availability
• Recovering from a host failure:
– Host or NIC fails.
– Neighboring hosts detect heartbeat failure.
– Hosts establish new membership.
– Neighbor host creates new object replica to “self-heal”.
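The recovery steps above (detect a missed heartbeat, establish new membership, then re-replicate) can be sketched with a simple timeout-based detector. This is a generic illustration, not the patented heart-beating algorithm mentioned elsewhere in the deck; HeartbeatSketch, its methods, and the timeout rule are assumptions, and times are passed in explicitly so the example is deterministic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical sketch of heartbeat-based failure detection and membership update. */
class HeartbeatSketch {
    private final Map<String, Long> lastSeen = new HashMap<>();
    private final long timeoutMillis;

    HeartbeatSketch(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    /** Record a heartbeat received from a host at the given time. */
    void heartbeat(String host, long nowMillis) { lastSeen.put(host, nowMillis); }

    /** Find hosts whose heartbeats have stopped, drop them from membership, and return them. */
    List<String> detectFailures(long nowMillis) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastSeen.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) failed.add(e.getKey());
        }
        lastSeen.keySet().removeAll(failed);   // surviving hosts form the new membership
        return failed;
    }

    /** Current membership, sorted for stable output. */
    Set<String> members() { return new TreeSet<>(lastSeen.keySet()); }

    public static void main(String[] args) {
        HeartbeatSketch hb = new HeartbeatSketch(1000);
        hb.heartbeat("hostA", 0);
        hb.heartbeat("hostB", 0);
        hb.heartbeat("hostA", 900);            // hostB stays silent
        List<String> failed = hb.detectFailures(1500);
        System.out.println("failed: " + failed + ", members: " + hb.members());
        // A surviving host would now create new replicas for objects that lived on the failed host.
    }
}
```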
[Figure: four SOSS servers, one of them stopped]
26. SOSS: Integrated High Availability
• Peer-to-peer architecture for maximum redundancy & scalability
• Fully integrated data replication for data redundancy, scalability, and
ease of use:
– Partial replicas ensure scalable storage and throughput.
– Per-server and per-client caches ensure fast access.
• Self-discovery and self-healing for hi-av and ease of use
• Patented quorum algorithm for reliable updating with scalability
[Figure: client application retrieves a cached copy through the client library; cache services of the ScaleOut StateServer distributed cache hold the object copy and its replica]
27. Reason #4: Sharing Data Across the Farm
The first step for server farms (1998): load-balanced, stateless, Web applications:
• Without the ability to share data, we need “sticky” sessions (no hi av!):
• Or we can overload the database server:
• Or we can share data across the farm in a distributed data grid for both scalability & high av.
[Figure: Internet-facing Web servers, each running an SOSS service, connected by Ethernet to a DBMS server]
28. The Evolution in DDGs and Data Sharing
Drivers:
• Scaling data access & analysis are critical to competitiveness.
• Server farms & the cloud are now mainstream computing platforms.
• Data access is a key bottleneck.
• Short dev. cycles are mandatory.
• Standard APIs are emerging.
[Chart: Market Penetration, 2005 to 2011: early adoption on Web and app. server farms for speed and hi-av; expansion to new verticals (e.g., financial services) for data & compute grids; cloud computing using industry-standard APIs. Usage stages: session-state storage, application caching, grid computing, platform-wide usage, data analysis]
29. Data Sharing: a Closer Look
• Advantages of sharing data in a distributed data grid:
– Boosts application performance and offloads the DBMS.
– Advances & simplifies the programming model:
• Allows “stateful” business objects
• Keeps object/relational mapping at the data access layer
• Examples: session & profile data, business objects,
workflow state
• Requirements of a distributed data grid:
– Coherent storage so all clients see a consistent view
– Easy-to-use APIs
– Integrated object locking to enable coordinated updating
– High availability to avoid data loss if a server fails
– Advanced features to enable effective use of the grid (e.g.,
parallel query, map/reduce analysis)
30. Basic APIs for Data Access
• Are easy to use in C#, Java, or C/C++.
• Store objects in the grid as serialized blobs.
• Primarily use string or numeric keys to identify objects.
• Group objects into name spaces (“named caches”).
// Read and update object:
MyClass retrievedObj;
retrievedObj = cache["myObj"] as MyClass;
retrievedObj.var1 = "Hello, again!";
cache["myObj"] = retrievedObj;
31. Example: Named Cache Access (Java)
public static void main(String[] argv)
{
// (Assumes java.util.UUID and the SOSS client classes are imported.)
// Initialize string object to be stored:
String s = "Test string";
// Create a cache collection:
SossCache cache = SossCacheFactory.getCache("MyCache");
// Store object in ScaleOut StateServer (SOSS):
CachedObjectId id = new CachedObjectId(UUID.randomUUID());
cache.put(id, s);
// Read object stored in SOSS:
String answerJNC = (String)cache.get(id);
// Remove object from SOSS:
cache.remove(id);
}
32. Example: Named Cache Access (C#)
static void Main(string[] args)
{
// Initialize object to be stored:
SampleClass sampleObj = new SampleClass();
sampleObj.var1 = "Hello, SOSS!";
// Create a cache:
SossCache cache = CacheFactory.GetCache("myCache");
// Store object in the distributed cache:
cache["myObj"] = sampleObj;
// Read and update object stored in cache:
SampleClass retrievedObj = null;
retrievedObj = cache["myObj"] as SampleClass;
retrievedObj.var1 = "Hello, again!";
cache["myObj"] = retrievedObj;
// Remove object from the cache:
cache["myObj"] = null;
}
33. Fully Distributed Locking
• Goal: synchronize access to a stored object by multiple client
threads.
• Two mechanisms: pessimistic and optimistic locking
• Pessimistic uses read-modify-write semantics:
– Can be set as default for all objects within a named cache.
– Reads to locked objects are automatically retried.
– Locks have timeouts to handle client failures.
– Simple reads and updates can bypass locks.
string myObj = cache.Retrieve("key", true) as string; // read and lock
...
cache.Update("key", "new value", true); // update and unlock
• Optimistic uses object’s version number to allow or inhibit an update:
– User supplies version number from read to a locking update.
– Benefit: one trip to the server if high probability of success.
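The optimistic mechanism can be illustrated with a small in-memory model. This is a sketch of the general version-check technique, not the SOSS API: OptimisticLockSketch, Versioned, and the read/update signatures are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of optimistic locking with per-object version numbers. */
class OptimisticLockSketch {
    /** A stored value together with the version it was written at. */
    static final class Versioned {
        final String value;
        final long version;

        Versioned(String value, long version) {
            this.value = value;
            this.version = version;
        }
    }

    private final Map<String, Versioned> store = new HashMap<>();

    /** Read returns the value and the version the caller must present on update. */
    synchronized Versioned read(String key) { return store.get(key); }

    /** Unconditional create or overwrite; always advances the version. */
    synchronized void put(String key, String value) {
        Versioned old = store.get(key);
        store.put(key, new Versioned(value, old == null ? 1 : old.version + 1));
    }

    /** Update succeeds only if nobody else has updated the object since the read. */
    synchronized boolean update(String key, String newValue, long expectedVersion) {
        Versioned current = store.get(key);
        if (current == null || current.version != expectedVersion) return false;
        store.put(key, new Versioned(newValue, expectedVersion + 1));
        return true;
    }

    public static void main(String[] args) {
        OptimisticLockSketch grid = new OptimisticLockSketch();
        grid.put("key", "v1");
        Versioned seenByA = grid.read("key");
        Versioned seenByB = grid.read("key");
        System.out.println("A updates: " + grid.update("key", "from A", seenByA.version));
        System.out.println("B updates: " + grid.update("key", "from B", seenByB.version));
    }
}
```

Client B's update is rejected because its version is stale, so B must re-read and retry; when conflicts are rare, most updates complete without a separate lock request.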
34. Advanced API Features
• Object timeouts
• Distributed locking for coordinating access
• Object dependency relationships
• Asynchronous events on object changes
• Automatic access to a backing store
• Object eviction on high memory usage
• Object metadata
• Bulk insertion
• Authentication
• Custom serialization for compression & encryption
• Parallel query based on metadata or properties
35. Parallel Data Analysis
• The goal:
– Quickly analyze a large set of data for patterns and trends.
– Take advantage of scalable computing to shorten “time to insight.”
• Applications:
– Search
– Financial services
– Business intelligence
– Risk analysis
– Weather simulation
– Structural modeling
– Fluid-flow analysis
– Climate modeling
NCAR Community Climate Model
http://www.vets.ucar.edu/vg/IPCC_CCSM3/index.shtml
36. Reason #5: Parallel Data Analysis
• Rapid analysis of large data sets has become a top priority.
• Distributed data grids enable fast parallel analysis:
– Automatically harness the power of many servers and cores.
– Offer a simple, easy-to-use development model.
– Deliver top performance for memory-based datasets.
• Key attributes of DDG-based data analysis:
– Data is memory-based and data motion is minimized.
– Programming model is object-oriented; parallelism is automatic.
[Chart: PMI vs. Random Access Throughput Comparison for 2 MB time series objects: objects per second (0 to 600), SOSS PMI vs. random access, for 4 to 32 nodes holding 512 to 4096 objects]
37. Parallel Query
• Goal: identify a set of objects with selected properties.
• Uses all grid servers to scale query performance.
• Uses fast, optimized lookup on each grid server.
[Figure: query the DDG in parallel; merge the keys into a list; sequentially analyze all queried objects]
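The parallel query flow can be sketched with Java parallel streams, treating each map as the contents of one grid server. This is an illustration only: ParallelQuerySketch, queryKeys, and the ticker-to-price data model are assumptions, and threads in one JVM stand in for separate servers.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch: query every grid partition in parallel, then merge the keys. */
class ParallelQuerySketch {
    /** Each map models the objects (ticker -> price) held by one grid server. */
    static List<String> queryKeys(List<Map<String, Double>> partitions, Predicate<Double> filter) {
        return partitions.parallelStream()                  // partitions are scanned concurrently
                .flatMap(p -> p.entrySet().stream()
                        .filter(e -> filter.test(e.getValue()))
                        .map(Map.Entry::getKey))
                .sorted()                                   // merge the matching keys into one ordered list
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Map<String, Double>> partitions = List.of(
                Map.of("GOOG", 120.0, "ORCL", 30.0),
                Map.of("MSFT", 300.0, "IBM", 140.0));
        // The client then analyzes the returned objects sequentially.
        System.out.println(queryKeys(partitions, price -> price > 100.0));
    }
}
```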
38. Parallel Query Example (Java)
• Mark class properties as indexes for SOSS query:
public class Stock implements Serializable {
    private String ticker;
    private int totalShares;
    private double price;

    @SossIndexAttribute
    public String getTicker() {
        return ticker;
    }
    …
}
• Define a query using these properties:
NamedCache cache = CacheFactory.getCache("Stocks", false);
Set keys = cache.queryKeys(Stock.class,
    or(equal("ticker", "GOOG"),
       equal("ticker", "ORCL")));
39. Parallel Query Example (C#)
• Mark class properties as indexes for SOSS query:
class Stock {
[SossIndex]
public string Ticker { get; set; }
public decimal TotalShares { get; set; }
public decimal Price { get; set; }}
• Define a query using these properties. Objects are
automatically read into memory:
NamedCache cache = CacheFactory.GetCache("Stocks");
var q = from s in cache.QueryObjects<Stock>()
where s.Ticker == "GOOG" || s.Ticker == "ORCL"
select s;
Console.WriteLine("{0} Stocks found", q.Count());
40. Parallel Method Invocation (“Map/Reduce”)
• Goal: analyze a set of objects with selected properties.
• Executes user’s code in parallel across the grid.
• Uses a parallel query to select objects for analysis.
[Figure: the in-memory distributed data grid runs map/reduce analysis: analyze data (map), then combine results (reduce)]
41. Example in Financial Services
Analyze trading strategies across stock histories:
Why?
• Back-testing systems help guard against risks in deploying new
trading strategies.
• Performance is critical for “first to market” advantage.
• Uses significant amount of market data and computation time.
How?
• Write method E to analyze trading strategies across a single
stock history.
• Write method M to merge two sets of results.
• Populate the data store with a set of stock histories.
• Run method E in parallel on all stock histories.
• Merge the results with method M to produce a report.
• Refine and repeat…
42. Stage the Data for Analysis
• Step 1: Populate the distributed data grid with objects each of which
represents a price history for a ticker symbol:
43. Code the Eval and Merge Methods
• Step 2: Write a method to evaluate a stock history based on parameters:
Results EvalStockHistory(StockHistory history, Parameters params)
{
<analyze trading strategy for this stock history>
return results;
}
• Step 3: Write a method to merge the results of two evaluations:
Results MergeResults(Results results1, Results results2)
{
<merge both results>
return results;
}
• Notes:
– This code can be run as a sequential calculation on in-memory data.
– No explicit accesses to the distributed data grid are used.
44. Run the Analysis
• Step 4: Invoke parallel evaluation and merging of results:
Results Invoke(EvalStockHistory, MergeResults, querySpec,
params);
[Figure: EvalStockHistory() and MergeResults() stages]
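The eval/merge pattern from steps 2 to 4 can be approximated on a single machine with Java parallel streams. This is a sketch of the pattern, not the PMI API: EvalMergeSketch, the StockHistory and Results fields, the first-to-last-close gain metric, and the generic invoke helper are all assumptions made for the example.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.BinaryOperator;
import java.util.function.Function;

/** Hypothetical sketch of the eval/merge pattern using Java parallel streams. */
class EvalMergeSketch {
    /** Stand-in for a stock history: a ticker and its closing prices. */
    static final class StockHistory {
        final String ticker;
        final double[] closes;

        StockHistory(String ticker, double[] closes) {
            this.ticker = ticker;
            this.closes = closes;
        }
    }

    /** Stand-in for analysis results: the best gain seen and its ticker. */
    static final class Results {
        final String bestTicker;
        final double bestGain;

        Results(String bestTicker, double bestGain) {
            this.bestTicker = bestTicker;
            this.bestGain = bestGain;
        }
    }

    /** Eval: analyze one stock history (here, the gain from first to last close). */
    static Results evalStockHistory(StockHistory h) {
        return new Results(h.ticker, h.closes[h.closes.length - 1] - h.closes[0]);
    }

    /** Merge: combine two results; it must be associative to allow pairwise merging. */
    static Results mergeResults(Results a, Results b) {
        return a.bestGain >= b.bestGain ? a : b;
    }

    /** Invoke: run eval on every object in parallel and merge the results pairwise. */
    static <T, R> Optional<R> invoke(List<T> objects, Function<T, R> eval, BinaryOperator<R> merge) {
        return objects.parallelStream().map(eval).reduce(merge);
    }

    public static void main(String[] args) {
        List<StockHistory> histories = List.of(
                new StockHistory("GOOG", new double[] {100, 110}),
                new StockHistory("ORCL", new double[] {30, 55}),
                new StockHistory("IBM", new double[] {140, 135}));
        Results r = invoke(histories, EvalMergeSketch::evalStockHistory,
                EvalMergeSketch::mergeResults).get();
        System.out.println(r.bestTicker + " gained " + r.bestGain);
    }
}
```

As in the deck's notes, the eval and merge methods contain no grid access code; only the invoke step knows how the objects are distributed.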
45. [Figure: start parallel analysis; .eval() runs on each stock history to produce results; results are combined by .merge() in stages; the final results are returned to the client]
46. Advantages of Using PMI
• Fast:
– Automatically scales application performance across grid servers.
– Automatically uses all server cores.
– Minimizes data motion between servers.
– API-based invocation delivers very low latency.
• Easy to Use:
– User writes simple, “in memory” code; all grid accesses are implicit.
– Matches Java/C# model of object-oriented collections.
– Requires no tuning.
[Figure: PMI engine running on four cores above the grid service]
47. Comparison of DDGs and File-Based M/R
                        DDG                                        File-Based M/R
Data set size           Gigabytes to terabytes                     Terabytes to petabytes
Data repository         In-memory                                  File / database
Data view               Queried object collection                  File-based key/value pairs
Development time        Low                                        High
Automatic scalability   Yes                                        Application dependent
Best use                Quick-turn analysis of memory-based data   Complex analysis of large datasets
I/O overhead            Low                                        High
Cluster mgt.            Simple                                     Complex
High availability       Memory-based                               File-based
48. DDG Minimizes Data Motion
• File-based map/reduce must move data to memory for analysis:
[Figure: three M/R servers run E in server memory on data D loaded from the file system / database]
• Memory-based DDG analyzes data in place:
[Figure: three grid servers run E directly on data D held in the distributed data grid]
49. [Figure: the parallel analysis flow (.eval() on each stock history, staged .merge() of results, results returned to client) with File I/O steps between stages]
50. Performance Impact of Data Motion
Measured random access to DDG data to simulate file I/O:
51. PMI Delivers 16X Speedup Over Hadoop
[Chart: Throughput Comparison: throughput (obj/sec, 0 to 800) vs. number of servers (4, 6, 8) for SOSS PMI, Hadoop/SOSS, and Hadoop]
52. Reason #6: Simplify Data Migration
• DDGs enable seamless data migration across on-premise sites and the cloud:
– Automatically access remote data as needed.
– Efficiently manage WAN bandwidth.
– Enable full data synchronization across sites.
[Figure: In-Memory Distributed Data Grid]
53. Example: Web Farm Cloud-Bursting
• DDGs bridge on-premise and cloud-based in-memory storage of
Web session state.
• DDG automatically migrates session-state objects into the cloud
on demand.
• This enables seamless access to data across multiple sites.
[Figure: a Web load balancer in front of a cloud application (App VS instances) and the user's on-premise application servers; a cloud-hosted distributed data grid of SOSS VS instances and an on-premise distributed data grid of SOSS hosts with a backing store; data is migrated automatically between the two grids, forming one virtual distributed data grid]
54. Example: Global Access to Shared Data
[Figure: global distributed data grid linking mirrored data centers and satellite data centers, each with its own distributed data grid of SOSS servers]
55. What to Look for in a DDG Product
Ease of Use
• SSI's products have an unusually high level of integration and focus on automatic operation. This dramatically simplifies deployment and management of a distributed data grid.
Performance
• In direct comparison tests, SSI demonstrates faster access performance and scalability in key benchmarks.
Architecture
• SSI’s architecture integrates both scalability and high availability and uniformly applies key architectural principles, such as peer-to-peer design.
Portability
• Seamless interoperability across Windows and Unix (Linux, Solaris, etc.) operating systems was designed into SSI’s architecture from the outset.
Data Analysis
• Advanced capabilities for "map/reduce"-style parallel data analysis open up important new applications for distributed data grids.
Manageability
• SSI’s comprehensive tools for managing distributed data grids, such as its object browser and parallel backup and restore utility, are unique in the industry.
56. SOSS Maximizes Ease of Use
Grid servers self-aggregate, self-heal, and automatically load-balance.
Tree list shows:
• Store status
• Host list
• Host status
• Remote stores
• Remote client configuration
Host configuration pane:
• Just need to select subnet shared by all hosts.
58. SOSS Object Browser
• Simplifies development.
• Provides extremely useful visibility into grid usage.
• Allows grid objects to be analyzed and managed.
59. SOSS Parallel Backup and Restore
• Enables grid contents (or portions) to be backed up or
restored in parallel either to:
– Separate file systems on all caching servers or
– A single network file share
• Creates backups or snapshots for later analysis.
• Makes full use of SOSS’s parallel implementation to
deliver highly scalable performance and high availability.
[Figure: two groups of four SOSS servers, each group connected by Ethernet]
60. Recap: Top 6 Reasons to Use a DDG
1. Faster access time for business logic state or database data
2. Scalable throughput to match a growing workload and keep response times low
3. High availability to prevent data loss if a grid server (or network link) fails
4. Shared access to data across the server farm
5. Advanced capabilities for quickly and easily mining data using scalable, “map/reduce” analysis
6. Transparent data migration across multiple sites and the cloud.
[Chart: Access Latency (msec) vs. Throughput (accesses / sec), Grid vs. DBMS]
61. Thank you for joining us today!
Distributed Data Grids for
Server Farms & High Performance Computing
www.scaleoutsoftware.com