CMSA Computer Science for Mathematicians: Graph Neural Networks: Expressive Power, Generalization, and Extrapolation
Keyulu Xu - MIT
Recent advances in deep learning exploit the structure in data and architectures. Graph Neural Networks (GNNs) are a powerful framework for learning with graph-structured objects and for learning the interactions among objects on a graph. Applications include recommender systems, drug discovery, physical and visual reasoning, program synthesis, and natural language processing.
In this talk, we study GNNs from three aspects: expressive power, generalization, and extrapolation.

First, we characterize the expressive power of GNNs from the perspective of graph isomorphism tests. We show an upper bound: GNNs are at most as powerful as the Weisfeiler-Lehman (WL) graph isomorphism test. We then give conditions for achieving this upper bound and present a maximally powerful GNN.

Next, we analyze the generalization of GNNs. The optimization trajectories of over-parameterized GNNs trained by gradient descent correspond to those of kernel regression with a particular graph neural tangent kernel. Using this correspondence, we show that GNNs provably learn a class of functions on graphs. More generally, we study how architectural inductive biases influence generalization on a task. We introduce an algorithmic alignment measure and show that better alignment implies better generalization. Our framework suggests that GNNs can sample-efficiently learn dynamic-programming algorithms.

Finally, we study how neural networks trained by gradient descent extrapolate, i.e., what they learn outside the support of the training distribution (e.g., on larger graphs or edge weights). We prove that ReLU multilayer perceptrons (MLPs) extrapolate linearly, and we identify conditions under which MLPs and GNNs extrapolate well. Our results suggest how a good representation or architecture can help extrapolation.
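To make the expressive-power bound concrete, here is a minimal sketch of 1-dimensional WL color refinement, the test that upper-bounds what message-passing GNNs can distinguish. The function names and the adjacency-list graph encoding are illustrative choices, not the speaker's implementation; the classic example of a cycle C6 versus two triangles (both 2-regular) shows a pair that 1-WL, and hence any such GNN, cannot tell apart.

```python
# Sketch of 1-WL color refinement (illustrative, not the talk's code).
# Graphs are adjacency lists: {node: [neighbors]}.

def wl_colors(adj, rounds=3):
    """Refine node colors by hashing each node's color together with
    the sorted multiset of its neighbors' colors."""
    colors = {v: 0 for v in adj}  # uniform initial coloring
    for _ in range(rounds):
        colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in adj[v]))))
            for v in adj
        }
    return sorted(colors.values())  # final color multiset = graph signature

def wl_distinguishes(adj1, adj2, rounds=3):
    """True if 1-WL certifies the two graphs are non-isomorphic."""
    return wl_colors(adj1, rounds) != wl_colors(adj2, rounds)

# C6 vs. two disjoint triangles: both 2-regular, so every node keeps
# the same color at every round and 1-WL cannot separate them.
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_c3 = {0: [1, 2], 1: [0, 2], 2: [0, 1],
          3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_distinguishes(c6, two_c3))  # False: 1-WL (and GNNs) fail here

# But 1-WL does separate a 3-node path from a triangle (degrees differ).
p3 = {0: [1], 1: [0, 2], 2: [1]}
c3 = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
print(wl_distinguishes(p3, c3))  # True
```

A maximally powerful GNN in the talk's sense is one whose neighborhood aggregation is injective on multisets, so each message-passing layer loses no more information than one round of this refinement.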