NS-CUK Joint Journal Club: V.T.Hoang, Review on "Universal Graph Transformer Self-Attention Networks", WWW 2022

  1. In Companion Proceedings of the Web Conference 2022 (WWW ’22 Companion), April 25–29, 2022, Virtual Event, Lyon, France Thuy Hoang Van, PhD student. Network Science Lab, The Catholic University of Korea. https://nslab.catholic.ac.kr/
  2. Contributions • A transformer-based GNN model (UGformer) with 2 variants: – Variant 1: leverage the transformer on a set of sampled neighbors for each input node (SOTA on graph classification) – Variant 2: leverage the transformer on all input nodes (text classification)
  3. Graph Data • Graphs are everywhere: program flow, ecological networks, biological networks, social networks, chemical networks, web graphs
  4. Graph embedding learning • The goal: – Mapping individual nodes to vector points in a latent space
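As a concrete illustration of this goal (not from the slides), node embedding learning can be viewed as maintaining a trainable lookup table that maps each node ID to a latent-space vector; the sizes below are hypothetical.

```python
import torch
import torch.nn as nn

num_nodes, dim = 1000, 128                          # illustrative graph size and latent dimension
node_embedding = nn.Embedding(num_nodes, dim)       # one trainable vector per node

vectors = node_embedding(torch.tensor([0, 42, 7]))  # three nodes mapped to points in latent space
```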
  5. Problems • However, as graph data grow unprecedentedly in volume and complexity in modern times: – Traditional learning methods for graphs are mostly inadequate to model this increasing complexity
  6. GNN-based models • GNN-based approaches provide fast, practical training and state-of-the-art results on benchmark datasets for downstream tasks such as node classification
  7. The proposed model: UGformer • Novelty: applying the transformer to a new domain, namely GNNs • A transformer-based GNN model to learn graph representations
  8. Variant 1: Leveraging the transformer on a set of sampled neighbors for each node
  9. Variant 1: Leveraging the transformer on a set of sampled neighbors for each node • At the 𝑘-th layer, given a node v ∈ V, at each step 𝑡, a transformer-based function aggregates the vector representations of all nodes u ∈ N_v ∪ {v} • Trans(.): an MLP network • ATT(.): a self-attention layer
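A minimal PyTorch sketch of what this slide describes, as I read it: for each node v, a transformer encoder (multi-head self-attention followed by a feed-forward MLP, i.e. the ATT(.) and Trans(.) components named above) is applied over the vectors of v and its sampled neighbors, and v's updated representation is read from the output. The class name, sampling, and sizes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class NeighborSetTransformerLayer(nn.Module):
    """Self-attention over {v} ∪ sampled N(v) for every node v (a sketch, not the paper's code)."""
    def __init__(self, dim, num_heads=2, num_steps=2):
        super().__init__()
        # One transformer encoder layer plays the role of ATT(.) + Trans(.):
        # multi-head self-attention followed by a feed-forward MLP.
        self.encoder = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads,
                                                  dim_feedforward=4 * dim,
                                                  batch_first=True)
        self.num_steps = num_steps  # T: number of aggregation steps per layer

    def forward(self, x, sampled_neighbors):
        # x: (num_nodes, dim) node features; sampled_neighbors[v]: sampled neighbor indices of v
        out = torch.empty_like(x)
        for v, nbrs in enumerate(sampled_neighbors):
            idx = torch.tensor([v] + list(nbrs))
            h = x[idx].unsqueeze(0)            # (1, |N(v)|+1, dim): v plus its sampled neighbors
            for _ in range(self.num_steps):    # apply the transformer T times
                h = self.encoder(h)
            out[v] = h[0, 0]                   # updated representation of node v
        return out

# Usage on a toy graph (all sizes are illustrative):
layer = NeighborSetTransformerLayer(dim=16)
x = torch.randn(4, 16)
x = layer(x, sampled_neighbors=[[1, 2], [0], [0, 3], [2]])
```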
  10. Variant 2: Leveraging the transformer on all input nodes • For small and medium graphs
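The slide only states that Variant 2 runs the transformer over all input nodes of a (small or medium) graph. A hedged sketch of that reading, reusing the head count, hidden size, and layer count from the experimental setup slide; everything else is illustrative.

```python
import torch
import torch.nn as nn

dim, num_heads, num_layers = 384, 2, 2        # sizes taken from the experimental setup slide
layer = nn.TransformerEncoderLayer(d_model=dim, nhead=num_heads, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=num_layers)

node_features = torch.randn(1, 50, dim)       # one small graph with 50 nodes (illustrative)
node_reps = encoder(node_features)            # self-attention computed across all input nodes
```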
  11. Experiment • UGformer Variant 2 for inductive text classification: – Build a graph G: • words as nodes • co-occurrences between words (within a fixed-size sliding window of length 3) as edges – Graph-level readout function:
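A small sketch of the graph construction described above: words become nodes, and an edge links two words whenever they co-occur inside a sliding window of length 3. The helper name, tokenization, and the choice of undirected edges are illustrative assumptions.

```python
def build_word_graph(tokens, window_size=3):
    """Words as nodes; an edge links two words that co-occur inside a sliding window."""
    nodes = sorted(set(tokens))
    edges = set()
    for i in range(len(tokens) - window_size + 1):
        window = tokens[i:i + window_size]
        for a in window:
            for b in window:
                if a != b:
                    edges.add((min(a, b), max(a, b)))   # undirected edge (assumption)
    return nodes, edges

nodes, edges = build_word_graph("the cat sat on the mat".split())
```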
  12. Experimental setup • 4 benchmarks – MR, R8, R52, and Ohsumed • Number of attention heads: 2 • Hidden size: 384 • 2-layer model • Adam optimizer
  13. UGformer Variant 1 for graph classification in an inductive setting • The vector representations e_v of nodes v (𝐾 is the number of layers): • 7 datasets – 3 social network datasets (COLLAB, IMDB-B, and IMDB-M) – 4 bioinformatics datasets (DD, MUTAG, PROTEINS, and PTC)
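A hedged sketch of the kind of readout this slide refers to: the per-node vector e_v is formed from the K layer outputs (here by concatenation, which is an assumption on my part) and the node vectors are then pooled into a single graph-level vector for classification.

```python
import torch

def graph_readout(layer_outputs):
    """layer_outputs: K tensors of shape (num_nodes, dim), one per UGformer layer."""
    e = torch.cat(layer_outputs, dim=-1)   # e_v built from all K layers (concatenation is assumed)
    return e.sum(dim=0)                    # pool node vectors into one graph-level vector

K, num_nodes, dim = 3, 20, 128             # illustrative sizes
graph_vector = graph_readout([torch.randn(num_nodes, dim) for _ in range(K)])
```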
  14. UGformer Variant 1 for graph classification in an inductive setting • The number 𝐾 of UGformer layers in {1, 2, 3} • The number of steps 𝑇 in {1, 2, 3, 4}
  15. UGformer Variant 1 for graph classification in an “unsupervised transductive” setting • An unsupervised transductive learning approach to train GNNs, addressing the limited availability of class labels • Where: – Node embeddings o_v are also learned as model parameters
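A minimal sketch of the one concrete detail on this slide: in the unsupervised transductive setting, the node embeddings o_v are plain trainable parameters, optimized jointly with the UGformer weights. The objective below is only an illustrative placeholder, not the paper's actual loss, and the UGformer itself is replaced by a stand-in tensor.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

num_nodes, dim = 100, 128                  # illustrative sizes
o = nn.Embedding(num_nodes, dim)           # node embeddings o_v, trained as model parameters

h = torch.randn(num_nodes, dim)            # stand-in for UGformer outputs (model omitted here)

# Placeholder objective (NOT the paper's loss): score every node output against every
# embedding o_u and ask each node's output to match its own embedding o_v.
scores = h @ o.weight.t()                  # (num_nodes, num_nodes)
loss = F.cross_entropy(scores, torch.arange(num_nodes))
loss.backward()                            # gradients also reach o.weight
```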
  16. Graph classification results (% accuracy)
  17. Conclusion • GNN as an auxiliary module: – Sampling strategy – Captures local graph structure • For small graphs: – Local structure is more meaningful – Self-attention may not be beneficial