research
* * * responsible ml * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
on ai deployment
Our series, On AI Deployment, discusses the economic and regulatory implications of AI supply chains.
open foundation models
What are the benefits of open models? What are the risks? Led by Sayash Kapoor and Rishi Bommasani, this work collects the thoughts of 25 authors to start answering these questions.
w/ Sayash Kapoor, Rishi Bommasani, Kevin Klyman, Shayne Longpre, Ashwin Ramaswami, Peter Cihon, Kevin Bankston, Stella Biderman, Miranda Bogen, Rumman Chowdhury, Alex Engler, Peter Henderson, Yacine Jernite, Seth Lazar, Stefano Maffulli, Alondra Nelson, Joelle Pineau, Aviya Skowron, Dawn Song, Victor Storchan, Daniel Zhang, Daniel E. Ho, Percy Liang & Arvind Narayanan
sampling with llms
People have begun using large language models (LLMs) to induce sample distributions (e.g., to generate synthetic training data), but there are no guarantees about what distribution the model actually produces. We evaluate LLMs as distribution samplers across multiple modalities, finding that they struggle to produce reasonable distributions (a sketch of the evaluation idea appears below).
w/ Alex Renda & Michael Carbin
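To make the setup concrete, here is a minimal sketch, not the paper's actual code, of how one might score an LLM as a sampler: prompt it repeatedly for draws from a simple target distribution, build the empirical distribution of its answers, and measure the gap with total variation distance. The query_llm_for_sample stub is a hypothetical placeholder for a real model call.

```python
from collections import Counter

def empirical_distribution(samples):
    """Turn raw samples into an empirical probability distribution."""
    counts = Counter(samples)
    total = len(samples)
    return {outcome: n / total for outcome, n in counts.items()}

def total_variation_distance(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def query_llm_for_sample(prompt):
    """Hypothetical stand-in for a real LLM call; returns a fixed answer
    only so the sketch runs as written."""
    return "heads"

# Target the model is asked to sample from, e.g. a fair coin.
target = {"heads": 0.5, "tails": 0.5}
samples = [query_llm_for_sample("Flip a fair coin; answer heads or tails.") for _ in range(100)]
induced = empirical_distribution(samples)

print("induced distribution:", induced)
print("TV distance from target:", total_variation_distance(induced, target))
```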
designing data for ml
The ML pipeline includes data collection and iteration. But what data should you collect, how should you collect it, and how do you evaluate what a model has learned prior to deployment?
w/Fred Hohman, Luca Zappella, Xavier Suau Cuadros, & Dominik Moritz
ml practices outside big tech
Support for the democratization of machine learning is growing rapidly, but responsible ML development outside Big Tech is poorly understood. As more organizations turn to ML, what challenges do they face in creating fair and ethical ML?
We explore these challenges, highlighting future research directions for the ML community in our AIES spotlight paper.
w/ Serena Booth
* * * ml interpretability * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
emergent world representations
Do complex language models merely memorize surface statistics, or do they develop internal representations of the underlying processes generating their sequences? We explored this question using a synthetic board-game task (Othello), uncovering nonlinear internal representations of the board state. By intervening on the model's layer activations during its computation, we show that these representations are causal. Finally, we leverage these techniques to create latent saliency maps that explain what influenced the model's output (a sketch of the probing step appears below).
w/Kenneth Li, David Bau, Fernanda Viegas, Hanspeter Pfister, & Martin Wattenberg
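As a rough illustration of the probing step only, the sketch below trains a small nonlinear probe to predict one board square's state from a hidden-layer activation vector; accuracy well above chance suggests the activations encode that piece of board state. The random stand-in data and array shapes are assumptions made to keep the sketch runnable, not the paper's actual setup.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

# Placeholder data: in the real setup, each row would be a hidden-layer activation
# recorded while the model processes an Othello move sequence, and each label would be
# the true state of one board square (empty / mine / theirs). Random data here only
# keeps the sketch self-contained.
rng = np.random.default_rng(0)
activations = rng.normal(size=(2000, 512))      # (num_positions, hidden_dim)
square_state = rng.integers(0, 3, size=2000)    # 0 = empty, 1 = mine, 2 = theirs

X_train, X_test, y_train, y_test = train_test_split(
    activations, square_state, test_size=0.2, random_state=0
)

# A small nonlinear probe: if it predicts the square's state well above chance,
# the activations plausibly represent that part of the board.
probe = MLPClassifier(hidden_layer_sizes=(128,), max_iter=300, random_state=0)
probe.fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))
```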
* * * uncertainty in ml * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
uncertainty in ml systems
ML systems are the product of complex sociotechnical processes, each of which introduces distinct forms of uncertainty into the final output. Communicating this uncertainty is critical for building appropriate trust, but it is often reduced to simple, cumulative encodings that can obscure its underlying complexity. Our work explores which uncertainty measures should be presented to different stakeholders, and how.
w/Harini Suresh
socializing data
Labeled datasets are historically treated as authoritative sources of ground truth. But how is that ground truth determined, and how can we build historical contexts for these systems? This project focuses on collaborative sensemaking and label provenance.
* * * visualizations * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
misleading visualizations
Misinformation comes in many forms, including charts and graphs. So we built a spell-check equivalent for visualizations! We hope that by flagging ineffective design choices, we can encourage best practices in design and increase data literacy. Just as importantly, we can encourage accuracy and critique in public domains.
So what is the red wavy line analogue for graphs? A toy example of one such check appears below.
w/Michael Correll & Arvind Satyanarayan
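For a flavor of what a "chart linter" might look for, here is a toy rule, purely illustrative and not the tool's actual implementation, that flags bar charts whose y-axis does not start at zero. The dict-based chart spec is invented for this sketch.

```python
def lint_bar_chart(spec):
    """Toy linter rule: bar charts with a truncated y-axis can exaggerate differences.
    `spec` is a minimal chart description invented for this sketch."""
    warnings = []
    if spec.get("mark") == "bar":
        y_min = spec.get("y_axis", {}).get("min", 0)
        if y_min > 0:
            warnings.append(
                f"bar chart y-axis starts at {y_min}, not 0; bar lengths may mislead"
            )
    return warnings

chart = {"mark": "bar", "y_axis": {"min": 40, "max": 60}}
print(lint_bar_chart(chart))   # flags the truncated axis
```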
communicating air quality
Air quality, like many environmental and health considerations, is important to communicate to the public.
But how do you effectively communicate that information to lay readers, particularly in the context of uncertainty and statistical model outputs?