Peter Stone's Selected Publications



Fair human-centric image dataset for ethical AI benchmarking

Fair human-centric image dataset for ethical AI benchmarking.
Alice Xiang, Jerone T. A. Andrews, Rebecca L. Bourke, William Thong, Julienne M. LaChance, Tiffany Georgievski, Apostolos Modas, Aida Rahmattalabbi, Yunhao Ba, Shruti Nagpal, Orestis Papakyriakopoulos, Dora Zhao, Jinru Xue, Victoria Matthews, Linxia Gong, Austin T. Hoag, Mircea Cimpoi, Swami Sankaranarayanan, Wiebke Hutiri, Morgan K. Scheuerman, Albert S. Abedi, Peter Stone, Peter R. Wurman, Hiroaki Kitano, and Michael Spranger.
Nature, 648:97–108, 2025.
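The official paper from Nature: https://www.nature.com/articles/s41586-025-09716-2
An editorial about the paper: https://www.nature.com/articles/d41586-025-03568-6
The project website: https://fairnessbenchmark.ai.sony/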


Abstract

Computer vision is central to many artificial intelligence (AI) applications, from autonomous vehicles to consumer devices. However, the data behind such technical innovations are often collected with insufficient consideration of ethical concerns. This has led to a reliance on datasets that lack diversity, perpetuate biases and are collected without the consent of data rights holders. These datasets compromise the fairness and accuracy of AI models and disenfranchise stakeholders. Although awareness of the problems of bias in computer vision technologies, particularly facial recognition, has become widespread, the field lacks publicly available, consensually collected datasets for evaluating bias for most tasks. In response, we introduce the Fair Human-Centric Image Benchmark (FHIBE, pronounced ‘Feebee’), a publicly available human image dataset implementing best practices for consent, privacy, compensation, safety, diversity and utility. FHIBE can be used responsibly as a fairness evaluation dataset for many human-centric computer vision tasks, including pose estimation, person segmentation, face detection and verification, and visual question answering. By leveraging comprehensive annotations capturing demographic and physical attributes, environmental factors, instrument and pixel-level annotations, FHIBE can identify a wide variety of biases. The annotations also enable more nuanced and granular bias diagnoses, enabling practitioners to better understand sources of bias and mitigate potential downstream harms. FHIBE therefore represents an important step forward towards trustworthy AI, raising the bar for fairness benchmarks and providing a road map for responsible data curation in AI.

BibTeX Entry

@Article{peter_nature_2025,
  author   = {Alice Xiang and Jerone T. A. Andrews and Rebecca L. Bourke and William Thong and Julienne M. LaChance and Tiffany Georgievski and Apostolos Modas and Aida Rahmattalabbi and Yunhao Ba and Shruti Nagpal and Orestis Papakyriakopoulos and Dora Zhao and Jinru Xue and Victoria Matthews and Linxia Gong and Austin T. Hoag and Mircea Cimpoi and Swami Sankaranarayanan and Wiebke Hutiri and Morgan K. Scheuerman and Albert S. Abedi and Peter Stone and Peter R. Wurman and Hiroaki Kitano and Michael Spranger},
  title    = {Fair human-centric image dataset for ethical {AI} benchmarking},
  journal  = {Nature},
  year     = {2025},
  volume   = {648},
  pages    = {97--108},
  abstract = {Computer vision is central to many artificial intelligence (AI) applications, from autonomous vehicles to consumer devices. However, the data behind such technical innovations are often collected with insufficient consideration of ethical concerns. This has led to a reliance on datasets that lack diversity, perpetuate biases and are collected without the consent of data rights holders. These datasets compromise the fairness and accuracy of AI models and disenfranchise stakeholders. Although awareness of the problems of bias in computer vision technologies, particularly facial recognition, has become widespread, the field lacks publicly available, consensually collected datasets for evaluating bias for most tasks. In response, we introduce the Fair Human-Centric Image Benchmark (FHIBE, pronounced ‘Feebee’), a publicly available human image dataset implementing best practices for consent, privacy, compensation, safety, diversity and utility. FHIBE can be used responsibly as a fairness evaluation dataset for many human-centric computer vision tasks, including pose estimation, person segmentation, face detection and verification, and visual question answering. By leveraging comprehensive annotations capturing demographic and physical attributes, environmental factors, instrument and pixel-level annotations, FHIBE can identify a wide variety of biases. The annotations also enable more nuanced and granular bias diagnoses, enabling practitioners to better understand sources of bias and mitigate potential downstream harms. FHIBE therefore represents an important step forward towards trustworthy AI, raising the bar for fairness benchmarks and providing a road map for responsible data curation in AI.},
  wwwnote={<a href="https://www.nature.com/articles/s41586-025-09716-2">The official paper from Nature</a><br>
      <a href="https://www.nature.com/articles/d41586-025-03568-6">An editorial about the paper</a><br>
      <a href="https://fairnessbenchmark.ai.sony/">The project website</a>
  },
}

Generated by bib2html.pl (written by Patrick Riley) on Wed Jun 10, 2026 15:26:41