Saffron Huang

About

Hi! I'm Saffron. I'm currently a research scientist on the Societal Impacts team at Anthropic. We tackle big questions about how AI will change society, and use these insights to guide responsible AI development.

I previously co-founded the Collective Intelligence Project (CIP), which works to make AI development more democratic and use AI to strengthen democracy. For this work, my co-founder and I were named in the 2024 TIME 100 Most Influential People in AI list, and our work was featured twice in the New York Times. As an advisor there now, I continue to support their work on improving how society makes decisions about transformative technologies.

I also helped to start the Societal Impacts team at the UK AI Safety Institute.

Before that, I was a research engineer at DeepMind, working on a hodgepodge of things, including language models, human-AI interaction, conceptual reasoning, value alignment, and multi-agent RL.

My bylines have appeared in WIRED, Noema, The New Statesman, Reboot and elsewhere. You can also find my thoughts at Substack and Twitter.

Even more stuff about me: I co-founded Kernel Magazine. I have a degree in Applied Mathematics-Computer Science, with minors in Government and German, from Harvard. I'm from New Zealand. And sometimes I do photography.

Research

Values in the Wild: Discovering and Analyzing Values in Real-World Language Model Interactions - S Huang, E Durmus, ..., D Ganguli (2025). I was interviewed about this work in a VentureBeat article. We also released the dataset and taxonomy of thousands of AI values
Clio: Privacy-Preserving Insights into Real-World AI Use - Anthropic Societal Impacts Team (arXiv 2024). Introduces a system for analyzing AI assistant usage patterns across millions of conversations while preserving user privacy
How large language models can reshape collective intelligence - Nature Human Behaviour (2024)
Evaluating feature steering: A case study in mitigating social biases - Anthropic Societal Impacts Team (arXiv 2024)
How will advanced AI systems impact democracy? - arXiv (2024)
Collective Constitutional AI - S Huang*, D Siddarth*, L Lovitt*, ..., D Ganguli* (*joint co-authors) (2024). Featured in the New York Times. Blog post
Beyond Static AI Evaluations: Advancing Human Interaction Evaluations for LLM Harms and Risks - L Ibrahim, S Huang, ...
Using the Veil of Ignorance to align AI systems with principles of justice - L Weidinger, K McKee, ... (PNAS 2023)
Generative AI and the Digital Commons - S Huang, D Siddarth (2022)
Red Teaming Language Models with Language Models - E Perez, S Huang, ... (EMNLP 2022)
Scaling Language Models: Methods, Analysis & Insights from Training Gopher - JW Rae, S Borgeaud, ... (2021)
Improving language models by retrieving from trillions of tokens - S Borgeaud, A Mensch, ... (ICML 2022)

Writing

Here's How To Share AI's Future Wealth - Noema Magazine
Collective Intelligence over Artificial Intelligence - S Huang, D Siddarth. AI Morality. Oxford University Press
A Vision of Democratic AI - D Siddarth, S Huang, A Tang. The Digitalist Papers (site). Stanford Digital Economy Lab
Predistribution over Redistribution: Beyond the Windfall Clause - CIP Blog (with Sam Manning)
The Surprising Synergy Between Acupuncture and AI - WIRED
ChatGPT and the death of the author - The New Statesman
Control and Consciousness of Time - Summer of Protocols
A philosophy of subtraction - Reboot
Who will control crypto? - on consistent patterns of centralisation in past information technologies and implications for open blockchains
Privacy and pluralism - what is privacy, and why does it matter?
Turing-Complete Governance - on the power and potential of EVM arbitrariness for collective decision-making
What is Technology? - Kernel Magazine (republished from Letters to a Young Technologist)
Letters to a Young Technologist - a collection of essays written with some dear friends. I penned What is Technology? and To Be a Technologist is to be Human
Harvard Creates Managers Instead of Elites - Palladium Magazine. I also did a podcast episode on this
Virtually Social - The Wave Arts Magazine

Media

TIME: The 100 Most Influential People in AI 2024 - Saffron Huang and Divya Siddarth (TIME, Sep 2024)
The Race to Democratise AI (Democracy Technologies, October 2023)
Rewiring Democracy: can society control its tech destiny? (Culture3, August 2023)
Aligning Technology Development with Public Input: A Conversation with Saffron Huang (C/Change, August 2023)
Can AI And Democracy Fix Each Other? (The New York Times, April 2023)
Alumni Profile: Saffron Huang, A.B. '20 (Harvard SEAS, April 2023)

Talks / Podcasts

UN Internet Governance Forum (IGF 2023): Digital Democracy in the Age of AI (Oct 2023)
OpenAI Forum: The Importance of Public Input in Designing AI Systems (inaugural event, July 2023)
Panel at Civic Future's The Great Stagnation conference (July 2023)
Demos: Collective Intelligence Panel (July 2023)
The Rhys Show podcast (April 2023)
Generative AI and Democracy at Bruce Schneier's International Workshop on Recreating Democracy (December 2022)
Red Teaming Language Models with Language Models at EMNLP and IJCAI (2022)
Palladium Magazine podcast (July 2020)

Work I hope you'll read

Here's How To Share AI's Future Wealth

Collective Constitutional AI

Values in the Wild

The Surprising Synergy Between Acupuncture and AI

Collective Intelligence Project Whitepaper

Generative AI and the Digital Commons

What is Technology?

To Be a Technologist is to be Human
