T-SYNTH: A Knowledge-Based Dataset of Synthetic Breast Images

Christopher Wiedeman1, Anastasiia Sarmakeeva1, Elena Sizikova1, Daniil Filienko1, Miguel Lago1, Jana G. Delfino1, Aldo Badano1
1: Office of Science and Engineering Laboratories, Center for Devices and Radiological Health, U.S. Food and Drug Administration, Silver Spring, MD 20993 USA
Publication date: 2025/12/31
https://doi.org/10.59275/j.melba.2025-g444

Abstract

One of the key impediments to developing and assessing robust medical imaging algorithms is limited access to large-scale datasets with suitable annotations. Synthetic data generated with plausible physical and biological constraints may address some of these data limitations. We propose the use of physics simulations to generate synthetic images with pixel-level segmentation annotations, which are notoriously difficult to obtain. Specifically, we apply this approach to breast imaging analysis and release T-SYNTH, a large-scale open-source dataset of paired 2D digital mammography (DM) and 3D digital breast tomosynthesis (DBT) images. Our initial experimental results indicate that T-SYNTH images show promise for augmenting limited real patient datasets for detection tasks in DM and DBT. Our data and code are publicly available at https://github.com/DIDSR/tsynth-release

Keywords

Digital Breast Tomosynthesis (DBT) · Synthetic Data · Lesion Detection

Bibtex

@article{melba:2025:038:wiedeman,
  title = "T-SYNTH: A Knowledge-Based Dataset of Synthetic Breast Images",
  author = "Wiedeman, Christopher and Sarmakeeva, Anastasiia and Sizikova, Elena and Filienko, Daniil and Lago, Miguel and Delfino, Jana G. and Badano, Aldo",
  journal = "Machine Learning for Biomedical Imaging",
  volume = "3",
  issue = "Special Issue on Open Data at MICCAI 2024–2025",
  year = "2025",
  pages = "833--847",
  issn = "2766-905X",
  doi = "https://doi.org/10.59275/j.melba.2025-g444",
  url = "https://melba-journal.org/2025:038"
}

RIS

TY  - JOUR
AU  - Wiedeman, Christopher
AU  - Sarmakeeva, Anastasiia
AU  - Sizikova, Elena
AU  - Filienko, Daniil
AU  - Lago, Miguel
AU  - Delfino, Jana G.
AU  - Badano, Aldo
PY  - 2025
TI  - T-SYNTH: A Knowledge-Based Dataset of Synthetic Breast Images
T2  - Machine Learning for Biomedical Imaging
VL  - 3
IS  - Special Issue on Open Data at MICCAI 2024–2025
SP  - 833
EP  - 847
SN  - 2766-905X
DO  - https://doi.org/10.59275/j.melba.2025-g444
UR  - https://melba-journal.org/2025:038
ER  -
