Trinity: A Distributed Graph Engine on a Memory Cloud

  1. Trinity: A Distributed Graph Engine on a Memory Cloud. Speaker: LIN Qian, http://www.comp.nus.edu.sg/~linqian/
  2. Graph applications. Online query processing: low latency. Offline graph analytics: high throughput.
  3. Online queries. Random data access, e.g., BFS, sub-graph matching, …
  4. Offline computations. Performed iteratively.
  5. Insight: keep the graph in memory, at least the topology.
  6. Trinity: online query + offline analytics.
  7. Random data access problem in large graph computation. Globally addressable distributed memory; random access abstraction.
  8. Belief. High-speed networks are more available; DRAM is cheaper; in-memory solutions become practical.
  9. “Trinity itself is not a system that comes with comprehensive built-in graph computation modules.”
  10. Trinity cluster.
  11. Stack of Trinity system modules. User-defined: graph schema, communication protocols, computation paradigms.
  12. Memory cloud. Partition the memory space into trunks; hashing.
  13. Memory trunks. 2^p > m. (1) Trunk-level parallelism; (2) efficient hashing.
  14. Hashing. Key-value store; p-bit value i ∈ [0, 2^p − 1]; inner-trunk hash table.
  15. Data partitioning and addressing. Benefits: scalability, fault tolerance.
  16. Modeling a graph. Cell: value + schema. A node is represented in a cell.
  17. TSL. Object-oriented cell manipulation; data integration; network communication.
  18. Online queries. Traversal-based; new paradigm.
  19. Vertex-centric offline analytics. Restrictive vertex-centric model.
  20. Message passing optimization. Create a bipartite partition of the local graph; buffer hub vertices.
  21. A new paradigm for offline analytics. (1) Aggregate answers from local computations; (2) employ probabilistic inference.
  22. Circular memory management. Aims to avoid memory gaps between a large number of key-value pairs.
  23. Fault tolerance. Heartbeat-based failure detection. BSP: checkpointing. Async.: “periodical interruption”.
  24. Performance.
  25. Performance (cont.).
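
Slides 12 to 15 describe the memory cloud's addressing only in outline: a key is hashed to a p-bit value i ∈ [0, 2^p − 1] that selects a trunk, and a hash table inside the trunk locates the value. The following is a minimal single-process sketch of that two-level lookup in Python, not Trinity's actual implementation. The names `MemoryCloud`, `trunk_of`, `locate`, `save`, `load`, and `reassign` are hypothetical, and the choice of BLAKE2b as the hash, the round-robin trunk-to-machine table, and the reading of m as the number of machines are all assumptions made for illustration.

```python
import hashlib

P = 4             # assumed: the memory space is split into 2**P trunks
NUM_MACHINES = 3  # assumed: m machines, with 2**P > m


def trunk_of(cell_id: int, p: int = P) -> int:
    """Hash a 64-bit cell id to a p-bit trunk index i in [0, 2**p - 1]."""
    digest = hashlib.blake2b(cell_id.to_bytes(8, "little"), digest_size=8).digest()
    return int.from_bytes(digest, "little") & ((1 << p) - 1)


class MemoryCloud:
    """Toy model: every trunk lives in this process, tagged with a machine id."""

    def __init__(self, p: int = P, machines: int = NUM_MACHINES) -> None:
        if (1 << p) <= machines:
            raise ValueError("need 2**p > m so every machine can host a trunk")
        self.p = p
        # Addressing table: trunk index -> machine id (round-robin is an assumption).
        self.addressing_table = [t % machines for t in range(1 << p)]
        # Inner-trunk hash tables: one key -> value map per trunk.
        self.trunks = [dict() for _ in range(1 << p)]

    def locate(self, cell_id: int) -> tuple[int, int]:
        """Return (machine id, trunk index) for a cell id."""
        trunk = trunk_of(cell_id, self.p)
        return self.addressing_table[trunk], trunk

    def save(self, cell_id: int, blob: bytes) -> None:
        _, trunk = self.locate(cell_id)
        self.trunks[trunk][cell_id] = blob

    def load(self, cell_id: int) -> bytes:
        _, trunk = self.locate(cell_id)
        return self.trunks[trunk][cell_id]

    def reassign(self, trunk: int, machine: int) -> None:
        """Point a trunk at another machine, e.g. after a failure or when scaling."""
        self.addressing_table[trunk] = machine


cloud = MemoryCloud()
cloud.save(42, b"node 42: neighbours 7, 9")
machine, trunk = cloud.locate(42)
print(f"cell 42 -> trunk {trunk} on machine {machine}")
```

Because keys map to trunks rather than directly to machines, moving a trunk only changes one entry of the addressing table and leaves the key-to-trunk hash untouched, which is one way to read the scalability and fault-tolerance benefits listed on slide 15.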