Language models have grown increasingly powerful at performing complex tasks, motivating the study of both their behavior and their internals. However, distinct research communities often pursue these two objectives in isolation. As a result, we lack robust, standardized interpretability methods for comprehensively assessing LM behavior in complex, real-world scenarios. This workshop promotes research and discussion on the interplay between behavior and model internals to address this gap. We aim to explore how understanding internal mechanisms can enhance our knowledge of complex model behaviors, and vice versa.
Organizers: Leshem Choshen, Vagrant Gautam, Yufang Hou, Anne Lauscher, Tamar Rott Shaham, Andreas Waldis
Steering Committee: Jacob Andreas, David Bau, Yonatan Belinkov, Iryna Gurevych, Kyle Mahowald
News
- We are looking for reviewers; register here
- Call for papers published!
- Workshop accepted at COLM '25!
Important Dates
June 23 - Submission due
July 24 - Acceptance notification
October 10 - Workshop day