About

Jacob Andreas

Associate Professor, Department of Electrical Engineering and Computer Science

Categories

Linguistics, Natural Language Processing, Neuro-Symbolic AI

Jacob Andreas is an associate professor in MIT’s Department of Electrical Engineering and Computer Science (EECS) and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL). His research aims to understand the computational foundations of efficient language learning and to build general-purpose intelligent systems that can communicate effectively with humans and learn from human guidance. He earned a BS from Columbia University and an MPhil from the University of Cambridge, where he studied as a Churchill Scholar. He earned a PhD from the University of California, Berkeley.

Selected Publications

Jacob, A. P., Shen, Y., Farina, G. & Andreas, J. (2024). The Consensus Game: Language Model Generation via Equilibrium Search. The Proceedings of the International Conference on Learning Representations (ICLR).
Schwettmann, S., Shaham, T. R., Materzyńska, J., Chowdhury, N., Li, S., Andreas, J., Bau, D. & Torralba, A. (2023). A Function Interpretation Benchmark for Evaluating Interpretability Methods. The Proceedings of the Conference on Neural Information Processing Systems (NeurIPS).
Hou, B., O’Connor, J., Andreas, J., Chang, S. & Zhang, Y. (2023). PromptBoosting: Black-Box Text Classification with Ten Forward Passes. The Proceedings of the International Conference on Machine Learning (ICML).

Media

July 23, 2024: MIT News, MIT researchers advance automated interpretability in AI models
July 11, 2024: MIT News, Reasoning skills of large language models are often overestimated
May 14, 2024: MIT News, Using ideas from game theory to improve the reliability of language models