Paper Feed: May 2026
Highlighting research I find interesting and think may deserve more attention (as of 04/21/26).
Training and Interpretability
- Synthetic Data for any Differentiable Target
  Tristan Thrush, Sung Min Park, Herman Brunborg, [...], Tatsunori Hashimoto (2026)
- Mechanisms of Introspective Awareness
  Uzay Macar, Li Yang, Atticus Wang, Peter Wallich, Emmanuel Ameisen, Jack Lindsey (2026)
- Subliminal Effects in Your Data: A General Mechanism via Log-Linearity
  Ishaq Aden-Ali, Noah Golowich, Allen Liu, Abhishek Shetty, Ankur Moitra, Nika Haghtalab (2026)
- Disentangling MLP Neuron Weights in Vocabulary Space
  Asaf Avrahamy, Yoav Gur-Arieh, Mor Geva (2026)
Misalignment and Generalization
Security, Governance, and Agents
AI Economics and Forecasting