ZORA (Zurich Open Repository and Archive)

Handling missing values in machine learning to predict patient-specific risk of adverse cardiac events: Insights from REFINE SPECT registry

Rios, Richard; Miller, Robert J H; Manral, Nipun; Sharir, Tali; Einstein, Andrew J; Fish, Mathews B; Ruddy, Terrence D; Kaufmann, Philipp A; Sinusas, Albert J; Miller, Edward J; Bateman, Timothy M; Dorbala, Sharmila; Di Carli, Marcelo; Van Kriekinge, Serge D; Kavanagh, Paul B; Parekh, Tejas; Liang, Joanna X; Dey, Damini; Berman, Daniel S; Slomka, Piotr J (2022). Handling missing values in machine learning to predict patient-specific risk of adverse cardiac events: Insights from REFINE SPECT registry. Computers in Biology and Medicine, 145:105449.

Abstract

BACKGROUND

Machine learning (ML) models can improve prediction of major adverse cardiovascular events (MACE), but in clinical practice some values may be missing. We evaluated the influence of missing values in ML models for patient-specific prediction of MACE risk.

METHODS

We included 20,179 patients from the multicenter REFINE SPECT registry with MACE follow-up data. We evaluated seven methods for handling missing values: 1) removal of variables with missing values (ML-Remove), 2) imputation with the median for continuous variables and a unique category for categorical variables (ML-Traditional), 3) unique category for missing variables (ML-Unique), 4) cluster-based imputation (ML-Cluster), 5) regression-based imputation (ML-Regression), 6) missRanger imputation (ML-MR), and 7) multiple imputation (ML-MICE). We trained ML models on complete data and simulated missing values in test patients. Prediction performance was evaluated using the area under the receiver-operating characteristic curve (AUC) and compared with a model without missing values (ML-All), expert visual diagnosis, and total perfusion deficit (TPD).
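As an illustration of the second strategy (median for continuous variables, a unique category for categorical ones), the following is a minimal sketch and not the authors' implementation. The variable names, the `MISSING` label, and the choice to estimate medians on the training rows are assumptions made for the example.

```python
from statistics import median

# Hypothetical label used as the "unique category" for missing categorical values.
MISSING_LABEL = "MISSING"


def fit_medians(train_rows, continuous):
    """Compute per-variable medians from training rows, ignoring missing (None) values."""
    return {
        name: median(row[name] for row in train_rows if row[name] is not None)
        for name in continuous
    }


def impute_traditional(rows, medians, categorical):
    """Return copies of rows with missing values filled in.

    Continuous variables get the training median; categorical variables
    get a dedicated category that marks the value as missing.
    """
    imputed = []
    for row in rows:
        filled = dict(row)
        for name, value in medians.items():
            if filled[name] is None:
                filled[name] = value
        for name in categorical:
            if filled[name] is None:
                filled[name] = MISSING_LABEL
        imputed.append(filled)
    return imputed


# Toy data with assumed variable names; None marks a missing value.
train = [
    {"age": 50.0, "sex": "F"},
    {"age": 60.0, "sex": "M"},
    {"age": 70.0, "sex": "F"},
]
test = [
    {"age": None, "sex": "M"},
    {"age": 65.0, "sex": None},
]

medians = fit_medians(train, ["age"])
imputed_test = impute_traditional(test, medians, ["sex"])
```

Estimating the medians on training data only keeps information from the test patients out of the imputation step, which mirrors the setup described above where models are trained on complete data and missingness appears at test time.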

RESULTS

During a mean follow-up of 4.7 ± 1.5 years, 3,541 patients experienced at least one MACE (3.7% annualized risk). ML-All (the reference model, with no missing values) had an AUC of 0.799 for MACE risk prediction. All seven models with missing values had lower AUCs (ML-Remove: 0.778, ML-MICE: 0.774, ML-Cluster: 0.771, ML-Traditional: 0.771, ML-Regression: 0.770, ML-MR: 0.766, and ML-Unique: 0.766; p < 0.01 for ML-Remove vs. the remaining methods). Stress TPD (AUC 0.698) and visual diagnosis (AUC 0.681) had the lowest AUCs.
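The reported annualized risk can be roughly reproduced from the figures in the abstract. The sketch below assumes the simplest definition, cumulative incidence divided by mean follow-up; the paper may use a different calculation (for example, one based on person-years).

```python
# Figures taken from the abstract.
patients = 20_179
patients_with_mace = 3_541
mean_follow_up_years = 4.7

# Assumed definition: cumulative incidence spread evenly over the mean follow-up.
cumulative_incidence = patients_with_mace / patients
annualized_risk = cumulative_incidence / mean_follow_up_years

print(f"Cumulative incidence: {cumulative_incidence:.1%}")
print(f"Annualized risk: {annualized_risk:.1%}")
```

Under this assumption the result rounds to 3.7% per year, in line with the reported value.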

CONCLUSION

Missing values reduce the accuracy of ML models when predicting MACE risk. Removing variables with missing values and retraining the model may yield superior patient-level prediction performance.

Additional indexing

Item Type: Journal Article, refereed, original work
Communities & Collections: 04 Faculty of Medicine > University Hospital Zurich > Clinic for Nuclear Medicine
Dewey Decimal Classification: 610 Medicine & health
Scopus Subject Areas: Physical Sciences > Computer Science Applications; Health Sciences > Health Informatics
Language: English
Date: June 2022
Deposited On: 24 Jan 2023 17:59
Last Modified: 27 Apr 2025 01:35
Publisher: Elsevier
ISSN: 0010-4825
OA Status: Closed
Publisher DOI: https://doi.org/10.1016/j.compbiomed.2022.105449
PubMed ID: 35381453
Full text not available from this repository.

Citations

18 citations in Web of Science®
22 citations in Scopus®