CMSA New Technologies in Mathematics: Toward Demystifying Transformers and Attention


February 9, 2022 2:00 pm - 3:00 pm
via Zoom Video Conferencing

Ben Edelman - Harvard Computer Science

Over the past several years, attention mechanisms (primarily in the form of the Transformer architecture) have revolutionized deep learning, leading to advances in natural language processing, computer vision, code synthesis, protein structure prediction, and beyond. Attention has a remarkable ability to enable the learning of long-range dependencies in diverse modalities of data. And yet, there is at present limited principled understanding of the reasons for its success. In this talk, I'll explain how attention mechanisms and Transformers work, and then I'll share the results of a preliminary investigation into why they work so well. In particular, I'll discuss an inductive bias of attention that we call sparse variable creation: bounded-norm Transformer layers are capable of representing sparse Boolean functions, with statistical generalization guarantees akin to sparse regression.
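The core attention computation the talk builds on can be sketched in a few lines. Below is a minimal, illustrative numpy implementation of scaled dot-product attention (the basic operation inside a Transformer layer); the function and variable names are our own choices for this sketch, not from the talk.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: each query attends to every key,
    # and the output is a convex combination of the value vectors.
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)      # (n_queries, n_keys) similarities
    weights = softmax(scores, axis=-1)   # each row sums to 1
    return weights @ V                   # (n_queries, d_v) mixed values

# Tiny example: 3 query positions attending over 4 key/value pairs.
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out = attention(Q, K, V)
print(out.shape)  # (3, 8)
```

Because the softmax weights form a probability distribution over positions, attention can concentrate on a few relevant inputs regardless of how far apart they are in the sequence, which is the intuition behind its ability to capture long-range dependencies.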