Effect of Demographic Bias on Skin Lesion Classification

Ralf Raumanns1,2,3, Gerard Schouten4, Veronika Cheplygina3, Josien P.W. Pluim2
1: Fontys University of Applied Science, Venlo, The Netherlands, 2: Eindhoven University of Technology, Eindhoven, The Netherlands, 3: IT University of Copenhagen, Denmark, 4: Fontys University of Applied Science, Eindhoven, The Netherlands
Publication date: 2026/05/29
https://doi.org/10.59275/j.melba.2026-4156

Abstract

The influence of bias in datasets on the fairness of model predictions is a topic of ongoing research in various fields. In this study, we evaluate the performance of skin lesion classification using ResNet-based convolutional models, focusing on the impact of demographic bias in training data, particularly variations in patient sex and age. We use a linear programming method to generate datasets with controlled demographic characteristics, allowing systematic investigation of bias effects. Three distinct learning strategies are evaluated: a single-task model, a reinforcing multi-task model, and an adversarial learning scheme.
Our sex-based analysis indicates that sex-specific training datasets optimise model performance. Notably, including male patients in the training data improved performance for the male subgroup, even in female-majority cases. Reinforcing and adversarial learning schemes narrowed or eliminated bias gaps in balanced and female-majority datasets. However, these strategies proved less effective in male-majority settings, where models continued to perform better for males than females. The two learning schemes showed marginal bias reduction compared to the baseline model in predominantly male patient populations.
Age-based analysis demonstrates comparable baseline performance across the three model approaches, with performance declining across age categories. Younger groups consistently achieve the highest performance, regardless of training data distribution. Although balanced training yields optimal results for the youngest age category, performance decreases in older categories.
We find that sex biases arise mainly from data imbalances, while age biases consistently favour younger groups regardless of distribution. These distinct mechanisms require targeted mitigation strategies. Our work aims to advance equitable AI in medical imaging by addressing these specific sources of disparity.
Additionally, cross-dataset validation on two external datasets revealed that domain shifts notably affect performance and demographic bias patterns. The source code and models are available on GitHub: https://github.com/raumannsr/demographic-fairness-extended

Keywords

Skin lesions · Bias · Fairness · Multi-task learning · Adversarial learning · Cross-dataset analysis

Bibtex

@article{melba:2026:011:raumanns,
    title = "Effect of Demographic Bias on Skin Lesion Classification",
    author = "Raumanns, Ralf and Schouten, Gerard and Cheplygina, Veronika and Pluim, Josien P.W.",
    journal = "Machine Learning for Biomedical Imaging",
    volume = "2026",
    issue = "Special issue on FAIMI",
    year = "2026",
    pages = "200--225",
    issn = "2766-905X",
    doi = "https://doi.org/10.59275/j.melba.2026-4156",
    url = "https://melba-journal.org/2026:011"
}

RIS

TY  - JOUR
AU  - Raumanns, Ralf
AU  - Schouten, Gerard
AU  - Cheplygina, Veronika
AU  - Pluim, Josien P.W.
PY  - 2026
TI  - Effect of Demographic Bias on Skin Lesion Classification
T2  - Machine Learning for Biomedical Imaging
VL  - 2026
IS  - Special issue on FAIMI
SP  - 200
EP  - 225
SN  - 2766-905X
DO  - https://doi.org/10.59275/j.melba.2026-4156
UR  - https://melba-journal.org/2026:011
ER  -
