Investigating Demographic Bias in Brain MRI Segmentation: A Comparative Study of Deep-Learning and Non-Deep-Learning Methods
Ghazal Danaee, Marc Niethammer, Jarrett Rushmore, Sylvain Bouix
Publication date: 2025/12/30
https://doi.org/10.59275/j.melba.2025-d1g3
Abstract
Deep-learning-based segmentation algorithms have substantially advanced the field of medical image analysis, particularly the delineation of anatomical structures in MRI. However, an important consideration is the intrinsic bias in the data. Concerns about unfairness, such as performance disparities based on sensitive attributes like race and sex, are increasingly urgent. In this work, we evaluate the results of four segmentation methods, three deep-learning models (UNesT, nnU-Net, and CoTr) and a traditional atlas-based method (ANTs), applied to segment the left and right nucleus accumbens (NAc) in MR images. We utilize a dataset including four demographic subgroups: black female, black male, white female, and white male. We employ manually labeled gold-standard segmentations to train and test the segmentation models. This study consists of two parts: the first assesses the segmentation performance of the models, while the second measures the volumes they produce to evaluate the effects of race, sex, and their interaction. Fairness is measured using a metric designed to quantify fairness in segmentation performance. Additionally, linear mixed models are used to analyze the impact of demographic variables on segmentation accuracy and derived volumes. Training on the same race as the test subjects leads to significantly better segmentation accuracy for some models. ANTs and UNesT show notable improvements in segmentation accuracy when trained and tested on race-matched data, unlike nnU-Net, which demonstrates robust performance independent of demographic matching. Finally, we examine sex and race effects on the volume of the NAc using segmentations from the manual rater and from our biased models. Results reveal that the sex effects observed with manual segmentation can also be observed with biased models, whereas the race effects disappear in all but one model. Our findings underscore the importance of diverse and balanced datasets for equitable brain MRI segmentation and highlight the need for systematic bias analysis in developing medical imaging models.
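To make the idea of a subgroup performance disparity concrete, here is a minimal sketch under stated assumptions. The abstract names neither the overlap score nor the fairness metric, so this sketch assumes the Dice coefficient as the accuracy measure and uses the gap between the best and worst subgroup means as a simple stand-in for disparity. It is not the fairness metric used in the paper. The function names, subgroup labels, and toy masks are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def dice(pred, truth):
    """Dice overlap between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def subgroup_gap(scores):
    """Return mean Dice per subgroup and the max-min gap between those means.

    `scores` is an iterable of (subgroup, dice_value) pairs. The gap is an
    illustrative disparity measure only (an assumption of this sketch),
    not the paper's fairness metric.
    """
    by_group = defaultdict(list)
    for group, value in scores:
        by_group[group].append(value)
    means = {group: mean(values) for group, values in by_group.items()}
    return means, max(means.values()) - min(means.values())


# Toy masks (hypothetical data, for illustration only).
truth = [0, 1, 1, 1, 0, 0]
pred_a = [0, 1, 1, 0, 0, 0]  # misses one foreground voxel
pred_b = [0, 1, 1, 1, 0, 0]  # matches the reference exactly

scores = [
    ("group_a", dice(pred_a, truth)),
    ("group_b", dice(pred_b, truth)),
]
means, gap = subgroup_gap(scores)
print(means, gap)
```

A gap near zero would indicate similar average accuracy across subgroups. Testing whether race, sex, or their interaction explains the differences would require a statistical model such as the linear mixed models mentioned in the abstract.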
Keywords
Bias · Fairness · Deep learning · Multi-atlas label fusion segmentation · Brain · MRI
Bibtex
@article{melba:2025:035:danaee,
title = "Investigating Demographic Bias in Brain MRI Segmentation: A Comparative Study of Deep-Learning and Non-Deep-Learning Methods",
author = "Danaee, Ghazal and Niethammer, Marc and Rushmore, Jarrett and Bouix, Sylvain",
journal = "Machine Learning for Biomedical Imaging",
volume = "3",
issue = "Special issue on FAIMI",
year = "2025",
pages = "792--808",
issn = "2766-905X",
doi = "https://doi.org/10.59275/j.melba.2025-d1g3",
url = "https://melba-journal.org/2025:035"
}
RIS
TY - JOUR
AU - Danaee, Ghazal
AU - Niethammer, Marc
AU - Rushmore, Jarrett
AU - Bouix, Sylvain
PY - 2025
TI - Investigating Demographic Bias in Brain MRI Segmentation: A Comparative Study of Deep-Learning and Non-Deep-Learning Methods
T2 - Machine Learning for Biomedical Imaging
VL - 3
IS - Special issue on FAIMI
SP - 792
EP - 808
SN - 2766-905X
DO - https://doi.org/10.59275/j.melba.2025-d1g3
UR - https://melba-journal.org/2025:035
ER -