Hello! I’m Saffron. I’m a technologist, researcher and writer.
I’m co-founder and co-director of the Collective Intelligence Project (CIP), a research and policy organization looking to improve and leverage humanity’s collective intelligence to better govern and thus benefit from transformative technologies such as AI.
I'm driven by improving the interface between AI and people. I was previously a research engineer at DeepMind, investigating and engineering large language models, human-AI interaction, value alignment, conceptual reasoning, and multi-agent RL.
More stuff about me: I have a degree in Applied Mathematics-Computer Science, with minors in Government and German, from Harvard. I'm from New Zealand. And sometimes I do photography.
Selected Papers / Book Chapters
- Collective Constitutional AI - D Ganguli*, S Huang*, L Lovitt*, D Siddarth* et al (*joint co-authors) (2023). Featured in the New York Times.
- Collective Intelligence over Artificial Intelligence - S Huang, D Siddarth. AI Morality. Oxford University Press (forthcoming 2024).
- Using the Veil of Ignorance to align AI systems with principles of justice - L Weidinger, K McKee, ... (PNAS 2023)
- Generative AI and the Digital Commons - S Huang, D Siddarth (2022)
- Red Teaming Language Models with Language Models - E Perez, S Huang, ... (EMNLP 2022)
- Scaling Language Models: Methods, Analysis & Insights from Training Gopher - JW Rae, S Borgeaud, ... (2021)
- Improving language models by retrieving from trillions of tokens - S Borgeaud, A Mensch, ... (ICML 2022)
- Bi-Level Multi-Agent Reinforcement Learning for Intervening in Intertemporal Social Dilemmas - S Huang (2020)
Talks / Podcasts
- Under the Rose - a sporadically maintained blog on the mechanisms and politics of digital information flows.
- Kernel Magazine (co-founder, creative director, writer)
- November 2022: Wrote my first piece in the New Statesman.
- November 2022: I'm on the Advisory Council of Civic Future, an initiative to identify talented people and help them contribute to policy and public service.
- October 2022: Red Teaming Language Models with Language Models was accepted at EMNLP 2022.
- September 2022: Kernel Magazine (the magazine I co-founded) issue 2 is out.
- August 2022: Excited to announce that I am co-directing the Collective Intelligence Project with Divya Siddarth. If we can collectively govern, own, or have input on transformative technology, we will be able to steer the most significant engine of societal change for the better.
- August 2022: Josh Stark of the Ethereum Foundation and I spent many months writing a piece on consistent patterns of centralization in past information technologies (e.g. radio, film, telephone and TV) and what crypto can learn from them. We put forward some fun, scary "statechain" scenarios, trying to predict the traffic jam, not just the automobile.
- August 2022: I presented at the Interact Summer Symposium on how we structure our digital information flows. Some of the research has ended up, or will end up, on my Substack.
- July 2022: Presented Red Teaming Language Models with Language Models at the IJCAI 2022 Evaluation Beyond Metrics workshop in Vienna.
- March 2022: My work on toxicity metrics of large language models is a top takeaway in the AI Index Report of 2022.