COVID-19 Image Data Collection: Prospective Predictions are the Future
Joseph Paul Cohen1, Paul Morrison2, Lan Dao3, Karsten Roth4,5,6, Tim Duong7, Marzyeh Ghassemi8
1: Mila, University of Montreal, 2: Mila, Fontbonne University, 3: Department of Medicine, Mila, University of Montreal, 4: Vector, 5: Mila, 6: Heidelberg University, 7: Stony Brook Medicine, 8: Vector, University of Toronto
December 2020 issue
Publication date: 2020/12/15
Abstract
Across the world’s coronavirus disease 2019 (COVID-19) hot spots, the need to streamline patient diagnosis and management has become more pressing than ever. As one of the main imaging tools, chest X-rays (CXRs) are common, fast, non-invasive, relatively cheap, and can potentially be taken at the bedside to monitor the progression of the disease. This paper describes the first public COVID-19 image data collection as well as a preliminary exploration of possible use cases for the data. The dataset currently contains hundreds of frontal-view X-rays and is the largest public resource for COVID-19 image and prognostic data, making it a necessary resource for developing and evaluating tools to aid in the treatment of COVID-19. It was manually aggregated from publication figures and various web-based repositories into a machine learning (ML) friendly format with accompanying dataloader code. We collected frontal- and lateral-view imagery and metadata such as the time since first symptoms, intensive care unit (ICU) status, survival status, intubation status, and hospital location. We present multiple possible use cases for the data, such as predicting the need for the ICU, predicting patient survival, and understanding a patient’s trajectory during treatment. Data can be accessed here: https://github.com/ieee8023/covid-chestxray-dataset
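As a rough illustration of the use cases named in the abstract, the sketch below shows how per-image metadata might be turned into ICU-prediction labels and a per-patient trajectory. It is not the dataset's own dataloader: the column names (patientid, offset, view, went_icu, survival, filename), the Y/N encoding, and the rows themselves are assumptions made up for this example, with offset standing in for days since first symptoms.

```python
import csv
import io

# Hypothetical, hand-written rows for illustration only; not real patient data.
# Column names and the Y/N encoding are assumptions modeled on the metadata
# fields listed in the abstract, not a confirmed schema.
SAMPLE_CSV = """patientid,offset,view,went_icu,survival,filename
p1,3,PA,Y,Y,p1_day3.png
p1,7,AP,Y,Y,p1_day7.png
p2,1,L,N,Y,p2_day1.png
p3,5,PA,N,,p3_day5.png
p4,2,AP,,N,p4_day2.png
"""

# Frontal projections; lateral ("L") images are excluded from this task.
FRONTAL_VIEWS = {"PA", "AP"}


def load_rows(text):
    """Parse CSV text into a list of dicts, one per image."""
    return list(csv.DictReader(io.StringIO(text)))


def icu_examples(rows):
    """Return (filename, label) pairs for frontal images with known ICU status."""
    examples = []
    for row in rows:
        if row["view"] not in FRONTAL_VIEWS:
            continue  # skip lateral views
        if row["went_icu"] not in ("Y", "N"):
            continue  # skip images whose ICU status is missing
        examples.append((row["filename"], 1 if row["went_icu"] == "Y" else 0))
    return examples


def trajectory(rows, patient_id):
    """Return one patient's image filenames ordered by offset (days)."""
    own = [r for r in rows if r["patientid"] == patient_id and r["offset"] != ""]
    return [r["filename"] for r in sorted(own, key=lambda r: int(r["offset"]))]


rows = load_rows(SAMPLE_CSV)
examples = icu_examples(rows)
p1_images = trajectory(rows, "p1")
```

Filtering out rows with missing labels, rather than treating them as negatives, keeps the toy task from silently mislabeling patients whose outcome was never recorded.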
Keywords
dataset · covid-19 · machine learning · chest x-ray · computer vision
Bibtex
@article{melba:2020:002:cohen,
title = "COVID-19 Image Data Collection: Prospective Predictions are the Future",
author = "Cohen, Joseph Paul and Morrison, Paul and Dao, Lan and Roth, Karsten and Duong, Tim and Ghassemi, Marzyeh",
journal = "Machine Learning for Biomedical Imaging",
volume = "1",
issue = "December 2020 issue",
year = "2020",
pages = "1--38",
issn = "2766-905X",
url = "https://melba-journal.org/2020:002"
}
RIS
TY - JOUR
AU - Cohen, Joseph Paul
AU - Morrison, Paul
AU - Dao, Lan
AU - Roth, Karsten
AU - Duong, Tim
AU - Ghassemi, Marzyeh
PY - 2020
TI - COVID-19 Image Data Collection: Prospective Predictions are the Future
T2 - Machine Learning for Biomedical Imaging
VL - 1
IS - December 2020 issue
SP - 1
EP - 38
SN - 2766-905X
UR - https://melba-journal.org/2020:002
ER -
