Thinking Like Transformers – A Practical Session

CMSA EVENTS: CMSA NEW TECHNOLOGIES IN MATHEMATICS

When: November 20, 2024
10:00 am - 11:00 am
Where: Virtually
Speaker: Gail Weiss (EPFL)

With the help of the RASP programming language, we can better imagine how transformers—the powerful attention-based sequence processing architecture—solve certain tasks. Some tasks, such as simply repeating or reversing an input sequence, have reasonably straightforward solutions, but many others are more difficult. To unlock a fuller intuition of what can and cannot be achieved with transformers, we must understand not just the RASP operations but also how to use them effectively.
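To make the reversal example concrete, here is a minimal Python sketch of two RASP-style primitives, select and aggregate, and a reverse program built from them. This is an illustrative toy, not the RASP reference implementation; the function names mirror the RASP primitives, but the exact semantics shown (boolean selection matrix, copy-through aggregation) are a simplifying assumption.

```python
def select(keys, queries, predicate):
    """Build a selection matrix: entry [q][k] says whether
    query position q attends to key position k."""
    return [[predicate(k, q) for k in keys] for q in queries]

def aggregate(selection, values):
    """Copy the selected value to each query position (here each
    query selects exactly one key, so no averaging is needed)."""
    out = []
    for row in selection:
        chosen = [v for sel, v in zip(row, values) if sel]
        out.append(chosen[0] if len(chosen) == 1 else chosen)
    return out

def reverse(tokens):
    """Reverse a sequence: position q attends to position n-1-q
    and copies that token."""
    n = len(tokens)
    indices = list(range(n))
    flipped = select(indices, indices, lambda k, q: k == n - 1 - q)
    return aggregate(flipped, tokens)
```

For example, `reverse(list("hello"))` yields `['o', 'l', 'l', 'e', 'h']` — a single attention-like step, regardless of sequence length.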

In this session, I would like to discuss some useful tricks with you in more detail. How is the powerful selector_width operation derived from the core RASP operations? How can a fixed-depth RASP program perform arbitrary-length long-addition, despite the equally large number of potential carry operations such a computation entails? How might a transformer perform in-context reasoning? And are any of these solutions reasonable, i.e., realisable in practice?
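As a taste of the selector_width question, the RASP paper derives it from mean-aggregation via a virtual beginning-of-sequence token that every query selects. The sketch below is an assumption-laden Python illustration of that trick, not the library's actual code: if a query selects w real keys plus the BOS token, mean-aggregating an indicator that is 1 only at BOS gives 1/(w+1), from which w can be recovered.

```python
def selector_width(select_pred, n):
    """Count, per query position, how many of the n keys the predicate
    selects - using only mean-aggregation plus a virtual BOS key."""
    widths = []
    for q in range(n):
        # The virtual BOS key is always selected; real keys follow.
        row = [True] + [select_pred(k, q) for k in range(n)]
        # Indicator value: 1 at BOS, 0 at every real key.
        indicator = [1] + [0] * n
        selected = [v for sel, v in zip(row, indicator) if sel]
        mean = sum(selected) / len(selected)   # equals 1 / (width + 1)
        widths.append(round(1 / mean) - 1)
    return widths
```

With a "select all keys up to me" predicate, `selector_width(lambda k, q: k <= q, 4)` counts prefix lengths: `[1, 2, 3, 4]`.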
I will begin with a brief introduction to the base RASP operations to ground our discussion, and then walk us through several interesting task solutions. Armed with this deeper intuition of how transformers solve these tasks, we will conclude with a discussion of what this implies for how knowledge and computation must spread out across transformer layers and embeddings in practice.
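The long-addition question can also be previewed in miniature. A naive carry ripples sequentially through the digits, but carries can instead be resolved in parallel: each position looks rightward to the nearest digit-sum that is not 9, and a carry arrives iff that sum is 10 or more. The Python below is a sequential simulation of that parallel, depth-independent idea under my own (hypothetical) digit conventions, with most-significant digit first.

```python
def long_add(a_digits, b_digits):
    """Add two equal-length digit lists (most significant first),
    resolving every carry by an independent rightward lookup rather
    than a sequential ripple."""
    n = len(a_digits)
    s = [x + y for x, y in zip(a_digits, b_digits)]  # per-position digit sums
    carry = [0] * n
    for i in range(n):
        # Look right for the nearest sum that is not 9; a run of 9s
        # merely propagates whatever carry that position generates.
        for j in range(i + 1, n):
            if s[j] != 9:
                carry[i] = 1 if s[j] >= 10 else 0
                break
    result = [(s[i] + carry[i]) % 10 for i in range(n)]
    overflow = 1 if s[0] + carry[0] >= 10 else 0
    return ([overflow] if overflow else []) + result
```

For example, 199 + 001 correctly yields 200 even though the carry must cross a run of 9s — each position decided this independently, without waiting for its neighbour.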

This talk will be on Zoom only. Here is the link (password: cmsa).

https://harvard.zoom.us/j/92220006185?pwd=V3mrb4cNSbgRXtNJtRJkTvWFVhmbI5.1