DIC Seminar: "Systematicity in language models' knowledge and self-knowledge" by Jacob Andreas
Seminar held as part of the doctoral program in cognitive computer science (informatique cognitive), in collaboration with the CRIA research centre
TITLE: Systematicity in language models' knowledge and self-knowledge
Jacob ANDREAS
Thursday, January 22, 2026, at 10:30 a.m.
Room PK-5115 (it is also possible to attend online by registering here)
ABSTRACT
Current language models (LMs) can converse knowledgeably, and in remarkable depth, about a wide range of topics. But these same LMs often generate confident-but-incorrect outputs, contradict themselves, and generally behave in ways that appear surprising and unnatural to human users. Increasingly, researchers attribute these failures not to surface-level statistical errors, but instead to mistakes and inconsistencies in LMs' "knowledge" or "beliefs" about the outside world. To what extent should we understand LMs as possessing beliefs at all? How should this understanding influence the procedures we use to train them? This talk will describe a family of training objectives that optimize language models for *internal systematicity* rather than predictive accuracy on some external dataset, showing that such objectives can improve models' linguistic and factual generalization, as well as the reliability of their explanations of their own behavior.
BIOGRAPHY
Jacob ANDREAS is an Associate Professor in the Department of Electrical Engineering and Computer Science at MIT and a member of CSAIL, where he directs the Language & Intelligence Group. His research focuses on understanding the computational foundations of language learning and on building intelligent systems that communicate effectively with humans. Andreas earned his PhD from UC Berkeley, his MPhil from the University of Cambridge as a Churchill Scholar, and his BS from Columbia University. He has received the Samsung AI Researcher of the Year award, MIT's Kolokotrones teaching award, and paper awards at NAACL and ICML. His work bridges machine learning and natural language processing, with particular expertise in compositional generalization, neural module networks, and systematic reasoning in language models.
REFERENCES
Akyürek, A. F., Akyürek, E., Choshen, L., Wijaya, D. T., & Andreas, J. (2024, August). Deductive closure training of language models for coherence, accuracy, and updatability. In Findings of the Association for Computational Linguistics: ACL 2024 (pp. 9802-9818).
Li, B. Z., Guo, Z. C., Huang, V., Steinhardt, J., & Andreas, J. (2025). Training language models to explain their own computations. arXiv preprint arXiv:2511.08579.
Damani, M., Puri, I., Slocum, S., Shenfeld, I., Choshen, L., Kim, Y., & Andreas, J. (2025). Beyond binary rewards: Training LMs to reason about their uncertainty.

Location
Montréal (QC)
Information
- Mylène Dagenais
- dic@uqam.ca
- https://www.dic.uqam.ca