I am a first-year PhD student at the University of Pennsylvania advised by Eric Wong and Hamed Hassani. I'm also a research scientist in the National Security Directorate at Pacific Northwest National Lab. I have research interests in AI safety, security, and the science of deep learning.

I've recently been thinking about (a) evals and (b) how some topics in AI security could be informed by the science of deep learning. I also have recent work evaluating models on open math problems. In my free time, I like to run, ski, scroll the arXiv (see some papers I think are underrated), and (re-)read Robert Caro's biographies.

Recent Papers (on model diffing, universality) (All Papers)

EMNLP '23 Understanding the Inner Workings of Language Models Through Representation Dissimilarity
Davis Brown, Charles Godfrey, Nicholas Konz, Jonathan Tu, Henry Kvinge
ATTRIB NeurIPS '23 (oral) Attributing Learned Concepts in Neural Networks to Training Data
Nicholas Konz, Charles Godfrey, Madelyn Shapiro, Jonathan Tu, Henry Kvinge, Davis Brown
HiLD ICML '23 On Privileged and Convergent Bases in Neural Network Representations
{Davis Brown, Nikhil Vyas}, Yamini Bansal
NeurIPS '22 On the Symmetries of Deep Learning Models and their Internal Representations
{Charles Godfrey, Davis Brown}, Tegan Emerson, Henry Kvinge