Hi, I’m Katie! I’m a Machine Learning PhD candidate in the Computational and Biological Learning (CBL) Lab at the University of Cambridge, where I am supervised by Adrian Weller MBE and advised by Richard Turner. I collaborate closely with Josh Tenenbaum and the Computational Cognitive Science Group at MIT. I am also a Student Fellow at the Leverhulme Centre for the Future of Intelligence (CFI) and volunteer with the Human-Oriented Automated Theorem Proving project, led by Sir Tim Gowers. Previously, I was a part-time Student Researcher at Google DeepMind with Krishnamurthy (Dj) Dvijotham.

I’m particularly excited about two prongs: 1) bridging “the gap” between human and AI problem-solving, decision-making, and reasoning, and 2) leveraging this gap. The former (1) revolves around bringing AI systems closer to humans in their ability to learn from little data and to handle (and propose) entirely new problems reliably, or at least to express uncertainty when they can’t. The latter (2), however, emphasizes the perhaps advantageous differences between humans and AI systems; humans are not perfect. I am especially excited about designing AI assistive systems which complement human strengths and weaknesses to push the frontiers of human productivity, wellbeing, and discovery. Both prongs require characterizing what such “gaps” even are! Across these efforts, I work with amazing collaborators on expanding and applying the toolkit of computational cognitive science, especially (increasingly) probabilistic programming, and on studying foundation models and their interactions with humans.

I recently completed an MPhil in Machine Learning and Machine Intelligence from the University of Cambridge and obtained a Bachelor of Science from MIT in 2021 in Brain and Cognitive Sciences, with minors in Computer Science and Biomedical Engineering. I am grateful to the Marshall Scholarship for funding my MPhil and PhD, and to King’s College and the Cambridge Trust for additional support. During my undergrad, I founded the MITxHarvard Women in AI Group in an effort to bring more diverse voices to the table in AI! I have since enjoyed co-organizing other AI community efforts, including the NeurIPS 2023 Math-AI Workshop and the ICML 2022 Workshop on Human-Machine Collaboration and Teaming.

Outside of research, I love to run (!) and used to run competitively for MIT.


You can find the most up-to-date listing of my publications on my Google Scholar profile.

Evaluating Language Models for Mathematics through Interactions
Katherine M. Collins^, Albert Q. Jiang^, Simon Frieder, Lionel Wong, Miri Zilka, Umang Bhatt, Thomas Lukasiewicz, Yuhuai Wu, Joshua B. Tenenbaum, William Hart, Timothy Gowers, Wenda Li, Adrian Weller^^, Mateja Jamnik^^.
Working Paper, arXiv (2023).
CheckMate Interactive Eval Platform | MathConverse Data

Human Uncertainty in Concept-Based AI Systems
Katherine M. Collins, Matthew Barker^, Mateo Espinosa Zarlenga^, Naveen Raman^, Umang Bhatt, Mateja Jamnik, Ilia Sucholutsky, Adrian Weller, Krishnamurthy (Dj) Dvijotham.
AIES (2023).
CUB-S Data | Project Page

Human-in-the-Loop Mixup
Katherine M. Collins, Umang Bhatt, Weiyang Liu, Vihari Piratla, Ilia Sucholutsky, Bradley Love, Adrian Weller.
UAI (2023), selected for an Oral presentation! Earlier version at the AAAI Workshop R^2HCAI; Best Demo/Poster at the HCOMP Demo Track (2022).
H-Mix Data and HILL MixE Suite | Code | Video

On the Informativeness of Supervision Signals
Ilia Sucholutsky, Ruairidh M. Battleday, Katherine M. Collins, Raja Marjieh, Joshua C. Peterson, Pulkit Singh, Umang Bhatt, Nori Jacoby, Adrian Weller, Thomas L. Griffiths.
UAI (2023).

BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences
Jacqueline A. Valeri^, Luis R. Soenksen^, Katherine M. Collins^, Pradeep Ramesh, George Cai, Rani Powers, Nicolaas M. Angenent-Mari, Diogo M. Camacho, Felix Wong, Timothy K. Lu, James J. Collins.
Cell Systems (2023).
BioAutoMATED Platform

Harms from Increasingly Agentic Systems
Alan Chan, Rebecca Salganik, Alva Markelius, Chris Pang, Nitarshan Rajkumar, Dmitrii Krasheninnikov, Lauro Langosco, Zhonghao He, Yawen Duan, Micah Carroll, Michelle Lin, Alex Mayhew, Katherine M. Collins, Maryam Molamohammadi, John Burden, Wanru Zhao, Shalaleh Rismani, Konstantinos Voudouris, Umang Bhatt, Adrian Weller, David Krueger, Tegan Maharaj.
FAccT (2023).

Eliciting and learning with soft labels from every annotator
Katherine M Collins^, Umang Bhatt^, Adrian Weller.
AAAI HCOMP (2022).
Code and Data | Project Page

Structured, flexible, and robust: benchmarking and improving large language models towards more human-like behavior in out-of-distribution reasoning tasks
Katherine M. Collins^, Lionel Wong^, Jiahai Feng, Megan Wei, Joshua B. Tenenbaum.
CogSci (2022). Invited Talk. Awarded Travel Grant for Paper.
Code and Data | Project Page | Turing Institute Talk

Hybrid Memoised Wake-Sleep: Approximate Inference at the Discrete-Continuous Interface
Tuan Anh Le, Katherine M Collins, Luke Hewitt, Kevin Ellis, Samuel J Gershman, Joshua B Tenenbaum.
ICLR (2022).

Learning signal-agnostic manifolds of neural fields
Yilun Du, Katherine M. Collins, Joshua B. Tenenbaum, Vincent Sitzmann.
NeurIPS (2021).
Code | Project Page

Deep representation learning improves prediction of LacI-mediated transcriptional repression
Alexander S Garruss, Katherine M Collins, George M Church.
PNAS (2021).

Sequence-to-function deep learning frameworks for engineered riboregulators
Jacqueline A Valeri^, Katherine M Collins^, Pradeep Ramesh^, Miguel A Alcantar, Bianca A Lepe, Timothy K Lu, Diogo M Camacho.
Nature Communications (2020).

Next-generation machine learning for biological networks
Diogo M Camacho, Katherine M Collins, Rani K Powers, James C Costello, James J Collins.
Cell (2018).

^Contributed equally. ^^Equal co-supervision.

Website template modified from academicpages