Bio

I work on grantmaking for AI safety and interpretability.

I am currently a program lead at Schmidt Sciences and a visiting research scientist at Stanford University. Previously, I’ve been at Anthropic, AI2, Google, and Meta. I completed my PhD at the University of North Carolina and my postdoc at Stanford.

Historically, I have done research in interpretability, monitoring, unlearning, weak-to-strong generalization, LLM calibration, and multi-agent communication. My PhD research was supported by a Google PhD Fellowship, and my postdoc was supported by a Schmidt Sciences AI Fellowship.

Broadly, I am interested in explaining and controlling the behavior of machine learning models. I see language models as a good object of study since we lack complete explanations for their behavior and human language provides a rich means of interaction with models. I find work on clarifying concepts and developing strong evaluation procedures especially valuable.

I take some pride in reviewing for the community. I’ve served on 40+ program committees (as a reviewer, AC, or SAC) and have received three reviewer awards and two AC awards.