I am a fourth-year PhD student in the UNC-NLP lab at the University of North Carolina at Chapel Hill, where I am advised by Mohit Bansal. My work at UNC is supported by a Google PhD Fellowship and previously by a Royster Fellowship. Before this, I graduated with a bachelor’s degree from Duke University, where my thesis advisor was Cynthia Rudin. At Duke I was supported by a Trinity Scholarship.
My research interests center on interpretable machine learning and natural language processing. I am particularly interested in techniques for explaining model behavior and aligning ML systems with human values. I see language models as a good object of study because we lack complete explanations for their behavior, and human language provides a rich means of interaction with models. I am broadly interested in topics related to AI safety; besides interpretable ML, I have worked on methods for supervising model reasoning via explanations, providing recourse to people adversely affected by ML models, and editing language models to be more truthful. In all of these areas, I find work on clarifying concepts and developing strong evaluation procedures especially valuable.
- 2023 - New paper out! “Does Localization Inform Editing? Surprising Differences in Causality-Based Localization vs. Knowledge Editing in Language Models” [pdf] [code]
- 2022 - Serving as an Area Chair for ACL 2023 in the Interpretability and Analysis of Models for NLP track
- 2022 - Serving as an Area Chair for the AAAI 2023 Workshop on Representation Learning for Responsible Human-Centric AI
- 2022 - Work accepted to EMNLP 2022: “Are Hard Examples also Harder to Explain? A Study with Human and Model-Generated Explanations” [pdf]
- 2022 - Work accepted to NeurIPS 2022: “VisFIS: Visual Feature Importance Supervision with Right-for-the-Right-Reason Objectives” [pdf]
- 2022 - Serving as an Area Chair for EMNLP 2022 in the Interpretability, Interactivity and Analysis of Models for NLP track
- 2022 - Started summer internship at Google Research! Supervised by Asma Ghandeharioun and Been Kim
- 2022 - Invited talk at the University of Oxford on Explainable Machine Learning in NLP
- 2022 - Paper accepted to ACL 2022 Workshop on Natural Language Supervision! “When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data” [pdf] [code]
- 2022 - Invited talk at NEC Laboratories Europe, on Explainable Machine Learning in NLP
- 2022 - Invited talk at the National Institute of Standards and Technology, on Evaluating Explainable AI
- 2022 - Invited talk at the Allen Institute for AI, on Detecting, Updating, and Visualizing Language Model Beliefs
- 2022 - Invited talk at Uber AI, on The OOD Problem and Search Methods in Explainable ML
- 2021 - New preprint on arXiv! “Do Language Models Have Beliefs? Methods for Detecting, Updating, and Visualizing Model Beliefs” [pdf] [code]
- 2021 - Paper accepted to NeurIPS 2021! “The Out-of-Distribution Problem in Explainability and Search Methods for Feature Importance Explanations” [pdf] [code]
- 2021 - Awarded a Google PhD Fellowship for Natural Language Processing!
- 2021 - Invited talk at CHAI, UC Berkeley, on Evaluating Explainable AI
- 2021 - Paper accepted to EMNLP 2021: “FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging” [pdf] [code]
- 2021 - Named as an outstanding reviewer for ACL-IJCNLP 2021
- 2021 - New paper on arXiv! “Search Methods for Sufficient, Socially-Aligned Feature Importance Explanations with In-Distribution Counterfactuals” [pdf] [code]
- 2021 - Started summer internship at FAIR, supervised by Srini Iyer
- 2021 - New blog post on the Alignment Forum: “Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers” [link]
- 2021 - New preprint on arXiv: “When Can Models Learn From Explanations? A Formal Framework for Understanding the Roles of Explanation Data” [pdf] [code]
- 2020 - New preprint on arXiv! “FastIF: Scalable Influence Functions for Efficient Model Interpretation and Debugging” [pdf] [code]
- 2020 - Recognized as an Outstanding Reviewer for EMNLP 2020
- 2020 - Paper accepted into Findings of EMNLP, “Leakage-Adjusted Simulatability: Can Models Generate Non-Trivial Explanations of Their Behavior in Natural Language?” [pdf] [code]
- 2020 - Paper accepted into ACL 2020, “Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?” [pdf] [code]
- 2019 - Paper accepted into AAAI-HCOMP 2019, “Interpretable Image Recognition with Hierarchical Prototypes” [pdf] [code]
- 2019 - Joined the UNC NLP lab
- 2019 - Graduated with a B.S. from the Department of Statistical Science at Duke University
- 2019 - Awarded a Royster PhD Fellowship from UNC Chapel Hill