CMSA New Technologies in Mathematics Seminar: What Algorithms can Transformers Learn? A Study in Length Generalization
CMSA EVENTS
February 14, 2024 2:00 pm
Large language models exhibit many surprising “out-of-distribution” generalization abilities, yet also struggle to solve certain simple tasks like decimal addition. To clarify the scope of Transformers' out-of-distribution generalization, we isolate...