I am a first-year PhD student at the University of Pennsylvania advised by Eric Wong and Hamed Hassani.
I'm also a research scientist in the National Security Directorate at Pacific Northwest National Lab.
I have research interests in AI safety, security, and the science of deep learning.
I've recently been thinking about a) evals and b) how some topics in AI security could be informed by the science of deep learning.
I also have recent work evaluating models on open math problems.
In my free time, I like to run, ski, scroll arXiv (see some papers I think are underrated), and (re-)read Robert Caro's biographies.
Recent Papers (on model diffing, universality)
 
(All Papers)
 
EMNLP '23 | Understanding the Inner Workings of Language Models Through Representation Dissimilarity | Davis Brown, Charles Godfrey, Nicholas Konz, Jonathan Tu, Henry Kvinge
ATTRIB NeurIPS '23 (oral) | Attributing Learned Concepts in Neural Networks to Training Data | Nicholas Konz, Charles Godfrey, Madelyn Shapiro, Jonathan Tu, Henry Kvinge, Davis Brown
HiLD ICML '23 | On Privileged and Convergent Bases in Neural Network Representations | {Davis Brown, Nikhil Vyas}, Yamini Bansal
NeurIPS '22 | On the Symmetries of Deep Learning Models and their Internal Representations | {Charles Godfrey, Davis Brown}, Tegan Emerson, Henry Kvinge