The Energy-Based Learning Model
MATHEMATICAL PICTURE LANGUAGE
Yann LeCun - New York University and Facebook
One of the hottest sub-topics of machine learning in recent times has been Self-Supervised Learning (SSL). In SSL, a learning machine captures the dependencies between input variables, some of which may be observed, denoted X, and others of which are not always observed, denoted Y. SSL pre-training has revolutionized natural language processing and is making rapid progress in speech and image recognition. SSL may enable machines to learn predictive models of the world through observation, and to learn representations of the perceptual world, thereby reducing the number of labeled samples or rewarded trials needed to learn a downstream task. In the Energy-Based Model (EBM) framework, both X and Y are inputs, and the model outputs a scalar energy that measures the degree of incompatibility between X and Y. EBMs are implicit functions that can represent complex and multimodal dependencies between X and Y. EBM architectures belong to two main families: joint embedding architectures and latent-variable generative architectures. There are two main families of methods to train EBMs: contrastive methods and volume regularization methods. Much of the underlying mathematics of EBMs is borrowed from statistical physics, including the concepts of partition function, free energy, and variational approximations thereof.
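The energy-based picture described above can be illustrated with a small numerical sketch. This is not taken from the talk: the energy function, the grid over Y, and the inverse temperature `beta` are all hypothetical choices made for illustration. The sketch shows a scalar energy E(X, Y) that is low for compatible pairs, inference by minimizing the energy over Y, and the partition function and free energy that turn energies into a Gibbs distribution over Y.

```python
import numpy as np

def energy(x, y):
    # Hypothetical toy energy: zero when y**2 == x, so every x > 0 has two
    # equally compatible answers (y = +sqrt(x) and y = -sqrt(x)).
    return (y ** 2 - x) ** 2

def infer(x, y_grid):
    # Inference in an EBM: pick the y with the lowest energy for the given x.
    # np.argmin returns only the first minimizer, even when several exist.
    e = energy(x, y_grid)
    return y_grid[np.argmin(e)], e

def gibbs(x, y_grid, beta=1.0):
    # Gibbs density p(y | x) = exp(-beta * E(x, y)) / Z(x), where the partition
    # function Z(x) is approximated by a Riemann sum over the grid.
    a = -beta * energy(x, y_grid)
    dy = y_grid[1] - y_grid[0]
    m = a.max()  # log-sum-exp shift for numerical stability
    log_z = m + np.log(np.sum(np.exp(a - m)) * dy)
    free_energy = -log_z / beta  # F(x) = -(1 / beta) * log Z(x)
    density = np.exp(a - log_z)
    return density, free_energy

y_grid = np.linspace(-3.0, 3.0, 601)  # assumed discretization of Y
x = 4.0
y_star, e = infer(x, y_grid)
p, F = gibbs(x, y_grid, beta=1.0)
```

With `x = 4.0`, the energy has two minima, near y = -2 and y = 2, and the resulting density places equal mass around each. A function that maps X to a single Y could not represent this dependency, which is the sense in which an implicit energy function captures multimodal relations.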