Sharded Joins for Scalable Incremental Graph Queries
1. Budapest University of Technology and Economics
Department of Measurement and Information Systems
Fault Tolerant Systems Research Group
János Maginecz, Gábor Szárnyas
21. Motivating Example
Error pattern for an AUTOSAR validation constraint
[Figure: pattern over Communication channel, Logical signal, Mapping and Physical signal; Validation is shown with an Invalid submodel and a Valid submodel]
42. Rete Algorithm
[Figure: Rete network over the inputs Communication channel, Logical signal, Mapping and Physical signal; two Join nodes and an Antijoin node; Result set]
Steps
o Fill indexer nodes
o Store interim results
o Read result set
o Edit model
o Propagate changes
o Read result set
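The Rete steps on this slide (fill the indexers, store interim results, propagate changes after a model edit) can be illustrated with a minimal sketch of a single incremental join node. This is not the IncQuery implementation: the class name `JoinNode`, its methods, and the tuple layout are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class JoinNode:
    """Hypothetical incremental equi-join node with one index per input."""

    def __init__(self, left_key, right_key):
        self.left_key = left_key      # position of the join attribute in left tuples
        self.right_key = right_key    # position of the join attribute in right tuples
        self.left_index = defaultdict(set)
        self.right_index = defaultdict(set)
        self.results = set()          # interim results stored by the node

    def update_left(self, tup, insert=True):
        return self._update(tup, insert, self.left_index, self.right_index,
                            self.left_key, lambda own, other: own + other)

    def update_right(self, tup, insert=True):
        return self._update(tup, insert, self.right_index, self.left_index,
                            self.right_key, lambda own, other: other + own)

    def _update(self, tup, insert, own_index, other_index, key_pos, combine):
        key = tup[key_pos]
        if insert:
            own_index[key].add(tup)
        else:
            own_index[key].discard(tup)
        # Only the join partners of the changed tuple are visited,
        # so an edit does not trigger a full re-evaluation.
        delta = {combine(tup, other) for other in other_index.get(key, ())}
        if insert:
            self.results |= delta
        else:
            self.results -= delta
        return delta  # the change set that would be sent to the next node


join = JoinNode(left_key=1, right_key=0)
join.update_left(("channel1", "signalA"))      # fill the left indexer
join.update_right(("signalA", "mapping1"))     # fill the right indexer
print(join.results)                            # read the result set
join.update_left(("channel1", "signalA"), insert=False)  # edit the model
print(join.results)                            # read the updated result set
```

Each update returns only the delta, which is what makes revalidation after a small model change cheap compared to re-running the whole query.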
43. Single Workstation Rete Implementation
Rete-based incremental graph query engine
Open-source Eclipse project
Java Virtual Machine limitations
o Cannot handle 15+ GB heap memory efficiently
Proposed solution
o Horizontal scaling: distributed system
55. IncQuery-D Architecture
[Figure: Server 0 to Server 3, each hosting one database shard (shard 0 to shard 3); Transaction]
INCQUERY-D layers
Distributed persistent storage
o Database shards
Distributed indexing, notification
o Distributed indexer
o Model access adapter
Distributed production network
o Distributed query evaluation network
• Each intermediate node can be allocated to a different host
• Remote internode communication
57. IncQuery-D Architecture
[Figure: Server 0 to Server 3, each hosting one database shard (shard 0 to shard 3); Transaction]
INCQUERY-D
Distributed query evaluation network
o Four Indexer nodes
o Join, Join, Antijoin
59. Working around Memory Limits
Local (EMF-IncQuery): Input, Node A, Node B and Output all on Host-1
+ Simple and efficient
− Memory of the machine is an upper bound for the network
Solution 1: Distributed (IncQuery-D): Input, Node A, Node B and Output spread over Host-1 and Host-2
+ Nodes run on different computers
− The memory of each node is limited to the assigned machine
61. Working around Memory Limits
Distributed (IncQuery-D): Input, Node A, Node B and Output spread over Host-1 and Host-2
+ Nodes run on different computers
− The memory of each node is limited to the assigned machine
Solution 2: Distributed + Sharded (IncQuery-DS): Input, Node A, Node B and Output spread over Host-1, Host-2 and Host-3
+ Nodes may be allocated on more than 1 computer
− Network overhead
67. Sharded Rete Algorithm
[Figure: Rete network over the inputs Communication channel, Logical signal, Mapping and Physical signal; Join, Antijoin, and one join split into Join / Shard 1 and Join / Shard 2]
IncQuery-DS: Distributed and Sharded
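A sharded join can be sketched as several independent join instances with a router in front of them. The slides do not state how tuples are assigned to shards; the sketch below assumes hash partitioning on the join key, so that matching tuples from both inputs always meet in the same shard. The names `JoinShard`, `ShardedJoin` and `shard_for` are hypothetical.

```python
from collections import defaultdict
from zlib import crc32


class JoinShard:
    """One shard: holds only the tuples whose join key is routed to it."""

    def __init__(self):
        self.left = defaultdict(list)
        self.right = defaultdict(list)

    def insert_left(self, key, tup):
        self.left[key].append(tup)
        return [(tup, r) for r in self.right.get(key, [])]

    def insert_right(self, key, tup):
        self.right[key].append(tup)
        return [(l, tup) for l in self.left.get(key, [])]


class ShardedJoin:
    """Router that sends each tuple to exactly one shard (assumed scheme)."""

    def __init__(self, shard_count):
        self.shards = [JoinShard() for _ in range(shard_count)]

    def shard_for(self, key):
        # crc32 is stable across processes, unlike Python's built-in
        # hash() for strings, which is randomized per interpreter run.
        digest = crc32(str(key).encode("utf-8"))
        return self.shards[digest % len(self.shards)]

    def insert_left(self, key, tup):
        return self.shard_for(key).insert_left(key, tup)

    def insert_right(self, key, tup):
        return self.shard_for(key).insert_right(key, tup)


join = ShardedJoin(shard_count=2)
join.insert_left("signalA", ("channel1", "signalA"))
print(join.insert_right("signalA", ("signalA", "mapping1")))
```

In a real deployment each `JoinShard` would live on a different host, so the memory needed by one logical join is divided among machines at the cost of the extra network hop noted on the previous slide.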
68. Validation of Critical Systems
Model validation for large models
Well-formedness constraints with complex graph patterns
Train Benchmark
o Open-source performance measurement framework
o Presented yesterday
69. Train Benchmark
Phases
o Initial read and validation
o Small changes and revalidation
• Simulating modifications from a user
Goal: Measure response times
[Figure: execution time per phase: Read, Validation, Transformation, Revalidation; repetition markers ×10 and ×3]
73. Join Optimization
Hash join
o Using hash maps
Sort merge join
o Using red-black trees
Collection frameworks
o Standard library in Scala
o Goldman Sachs Collections
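The two join strategies named on this slide can be compared with a short sketch. It is written in Python for brevity rather than in Scala, and it uses sorted lists where the slide mentions red-black trees, because the Python standard library has no tree-based sorted map; both function names are hypothetical.

```python
from collections import defaultdict


def hash_join(left, right, left_key, right_key):
    """Build a hash map on the left input, then probe it with the right input."""
    index = defaultdict(list)
    for l in left:
        index[l[left_key]].append(l)
    return [l + r for r in right for l in index.get(r[right_key], [])]


def sort_merge_join(left, right, left_key, right_key):
    """Sort both inputs on the join key, then merge them in one pass."""
    ls = sorted(left, key=lambda t: t[left_key])
    rs = sorted(right, key=lambda t: t[right_key])
    out, i, j = [], 0, 0
    while i < len(ls) and j < len(rs):
        lk, rk = ls[i][left_key], rs[j][right_key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of equal keys on the right side.
            j_end = j
            while j_end < len(rs) and rs[j_end][right_key] == lk:
                j_end += 1
            # Pair every left tuple of this key with the whole right run.
            while i < len(ls) and ls[i][left_key] == lk:
                out.extend(ls[i] + r for r in rs[j:j_end])
                i += 1
            j = j_end
    return out


channels = [("channel1", "signalA"), ("channel2", "signalB")]
mappings = [("signalA", "mapping1"), ("signalC", "mapping3")]
print(hash_join(channels, mappings, 1, 0))
print(sort_merge_join(channels, mappings, 1, 0))
```

Both functions return the same set of joined tuples; they differ in the data structure that has to be kept in memory and in the order of the output, which is why the choice of collection framework matters for the measurements.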
76. Summary
Designed a sharded Rete engine
Evaluated its scalability
Analyzed join algorithms and collection frameworks
Future work
o Domains with similar challenges