The initialization determines whether in-context learning is gradient descent

Xie, Shifeng; Yuan, Rui; Rossi, Simone; Hannagan, Thomas
NeurIPS 2025, Workshop, What Can(’t) Transformers Do?, 39th Annual Conference on Neural Information Processing Systems, 2-7 December 2025, San Diego, USA


Type: Poster / Demo
City: San Diego
Date: 2025-12-02
Department: Data Science
Eurecom Ref: 8537

Permalink: https://www.eurecom.fr/publication/8537