Megan Richards

About Me

I’m a Computer Science PhD student at NYU’s Courant Institute of Mathematical Sciences, working with Kyunghyun Cho. My PhD is supported by NYU’s Dean’s Doctoral Fellowship and the NSF Graduate Research Fellowship (GRFP). Previously, I was an AI Resident at Meta AI (FAIR labs), where I was fortunate to be advised by Mark Ibrahim and Diane Bouchacourt.

I study the foundations of reliable machine learning, with the goal of making models more consistent, representative, fair, and useful. Most recently, I have studied geographical failures in vision models, including the widening progress gap between ImageNet-based benchmarks and global, crowdsourced data (ICLR 2024), mechanisms of geographical bias (Spotlight, NeurIPS 2023), and improved measurement of geographic disparities in image generation (Outstanding Paper, TiFA workshop ICML 2024).

My current research interests in machine learning reliability are motivated by my experiences building models for high-stakes medical settings. I completed my undergraduate degree at Duke University, where I worked with Mark Sendak as part of the Duke Institute for Healthcare Innovation (DIHI). At DIHI, I worked on building risk prediction models for severe pregnancy complications, which are now integrated and in silent trials. I also worked with DIHI to implement a data quality assurance framework which improved model performance by better integrating clinical feedback into dataset design. While at Duke, I also worked at the Duke Center for Global Women’s Health Technologies on a self-screening device for cervical cancer designed for low-resource global settings, which earned a Best Research award at NIH’s IEEE HIPOCT Conference in 2019.

Recent Updates

June ‘25: Delighted to be speaking at CVPR’s DemoDiv workshop about some of our work studying geographic underrepresentation in computer vision. I made my presentation publicly available here!
Nov ‘24: 📝 Check out our new preprint On the Role of Speech Data in Reducing Toxicity Detection Bias, led by Samuel Bell! We generate and release a new set of multilingual toxicity annotations for MuTox, and find that when models have access to the audio itself, rather than a transcript, they are more accurate and less biased in detecting toxicity (w.r.t. group mentions).
July ‘24: 🎊 We’re honored to receive an outstanding paper award at the ICML TiFA workshop for our work measuring geographic disparities in image generation! It was such a pleasure to help supervise this project, led by Abhishek Sureddy, Dishant Padalia, and Nandhinee Periyakaruppa.
May ‘24: 📝 Check out our Introduction to Vision-Language Modeling, created through a broad collaboration of researchers (> 40 people across 10 institutions) to help democratize knowledge about VLMs!
April ‘24: 🎓 This fall, I will start a PhD at NYU Computer Science, working with Kyunghyun Cho. My work will be supported by NYU’s GSAS Dean’s Doctoral Fellowship, as well as the NSF Graduate Research Fellowship (GRFP).
Jan ‘24: 📝 We’re thrilled to have our work accepted at ICLR 2024! Our work demonstrates a widening progress gap between ImageNet-based benchmarks and global, crowdsourced data, driven by significant overrepresentation of Western images in internet-scraped datasets.
Dec ‘23: 🎊 We’re honored to receive a spotlight award at NeurIPS 2023 under the datasets track for our paper investigating mechanisms of geographical bias in vision models! See the full paper here and our annotations here!

Publications

On the Role of Speech Data in Reducing Toxicity Detection Bias
Samuel J. Bell^*, Mariano Coria Meglioli^*, Megan Richards^*, Eduardo Sánchez^*,
Christophe Ropers, Skyler Wang, Adina Williams, Levent Sagun, Marta R. Costa-jussà^*
^*core contributor
[ArXiv]
Decomposed Evaluations of Geographic Disparities in Text-To-Image Models
Abhishek Sureddy^*, Dishant Padalia^*, Nandhinee Periyakaruppa^*, Oindrila Saha, Adina Williams, Adriana Romero-Soriano,
Megan Richards^**, Polina Kirichenko^**, Melissa Hall^**
^*joint first author ^**joint senior author
Outstanding Paper, Trustworthy Multi-modal Foundation Models and AI Agents (TiFA) Workshop, ICML 2024.
Next Generation of AI Safety Workshop, ICML 2024.
[ArXiv]
An Introduction to Vision-Language Modeling
Florian Bordes, Richard Yuanzhe Pang, Anurag Ajay, …, Megan Richards, …, Kate Saenko, Asli Celikyilmaz, Vikas Chandra
[ArXiv]
Does Progress On Object Recognition Benchmarks Improve Generalization on Crowdsourced, Global Data?
Megan Richards, Polina Kirichenko, Diane Bouchacourt, Mark Ibrahim
ICLR ‘24
[ICLR ‘24]
Exploring Why Object Recognition Performance Degrades Across Income Levels and Geographies
Laura Gustafson, Megan Richards, Melissa Hall, Caner Hazirbas, Diane Bouchacourt, Mark Ibrahim
[(Spotlight) NeurIPS 2023 Datasets and Benchmarks].
Development and Validation of ML-DQA – a Machine Learning Data Quality Assurance Framework for Healthcare
Mark Sendak, Gaurav Sirdeshmukh, Timothy Ochoa, Hayley Premo, Linda Tang, Kira Niederhoffer, Sarah Reed, Kaivalya Deshpande, Emily Sterrett, Melissa Bauer, Laurie Snyder, Afreen Shariff, David Whellan, Jeffrey Riggio, David Gaieski, Kristin Corey, Megan Richards, Michael Gao, Marshall Nichols, Bradley Heintze, William Knechtle, William Ratliff, Suresh Balu
Machine Learning for Healthcare, 2022
[PMLR]
Multicontrast Pocket Colposcopy Cervical Cancer Diagnostic Algorithm for Referral Populations
Erica Skerrett, Zichen Miao, Mercy N Asiedu, Megan Richards, Brian Crouch, Guillermo Sapiro, Qiang Qiu, Nirmala Ramanujam
BME Frontiers, 2022
[BME Frontiers]

Posters

Does Progress On Object Recognition Benchmarks Improve Generalization on Crowdsourced, Global Data?
Megan Richards, Polina Kirichenko, Diane Bouchacourt, Mark Ibrahim
Poster, Data Centric Machine Learning (DMLR) Workshop, ICML 2023
Towards Deploying Predictive Models for Maternal Health
Kaivalya Deshpande, Willie Boag, Freya Gulamali, Megan Richards, Michael Gao, Namita Kansal, Vaishakhi Mayya, Mark Sendak, Ashraf Habib, Terrence Allen, Sarah McWay Boling, Melissa Bauer, Jennifer Gilner, Brenna Hughes, Courtney Mitchell, Heather Tally, Amanda Craig, Suresh Balu, William Knechtle
Poster, Machine Learning for Healthcare, 2023
Phenotype Development and Validation for a Maternal Early Warning System
Megan Richards, Michael Gao, William Knechtle, Namita Kansal, Vaishakhi Mayya, Mark Sendak, Ashraf Habib, Terrence Allen, Sarah McWay Boling, Melissa Bauer, Jennifer Gilner, Courtney Mitchell
Poster, Machine Learning for Healthcare, 2022
Development of a Speculum-Free Liquid Applicator for At-Home Cervical Cancer Screening
Erica Skerrett, Mercy N Asiedu, Megan Richards, John Wilson Schmitt, Nirmala Ramanujam
Best Poster, NIH IEEE HIPOCT Conference, 2019

Talks

CVPR 2025, DemoDiv Workshop
Geographic Underrepresentation in Computer Vision
Slides

Organizing

I’m really excited by efforts to make machine learning/science more inclusive, and am proud to be part of the following efforts:

Organizer, Queer In AI 🌈
I helped organize the QinAI NeurIPS 2023 workshop - see our NeurIPS website here and our org website here.
Discussion Lead, Women-In-Machine-Learning (WiML) at ICML 2023
I helped organize a breakout session on robustness and large-scale vision models at the Women in Machine Learning workshop at ICML 2023 (slides).

Service

Reviewer, ICLR Workshops 2024
I was a reviewer for the Workshops at ICLR 2024. Excited to see so many great new avenues of research!
Reviewer, DMLR Workshop at ICML 2023
I was a reviewer for the DMLR workshop at ICML 2023. See more about the workshop here.