I work on making AI go well! At CAIS, I serve as special projects associate and technical executive assistant to Director Dan Hendrycks. I manage projects in public engagement strategy, research coordination, and stakeholder relations. I also co-authored A Definition of AGI.
Previously, I researched adversarial robustness, chain-of-thought faithfulness, singular learning theory, and interpretability. I'm a former CHAI intern and am incredibly grateful to have been mentored by researchers at OpenAI, Google DeepMind, Apollo Research, and FAR AI.
Publications
- A Definition of AGI · arXiv Preprint
- Emerging Vulnerabilities in Frontier Models: Multi-Turn Jailbreak Attacks · arXiv Preprint
- Decompose, Recompose, and Conquer: Multi-modal LLMs are Vulnerable to Compositional Adversarial Attacks in Multi-Image Queries · NeurIPS 2024 RBFM · NeurIPS 2024 Red Teaming GenAI
- The Structural Safety Generalization Problem · ACL 2025 Findings