Jen-Ping Lee, Yu-Lin Chao, Ping-Hsun Wu, Yun-Shiuan Chuang, Chan Hsu, Pei-Yu Wu, Szu-Chia Chen, Wei-Chung Tsai, Yi-Wen Chiu, Shang-Jyh Hwang, Yi-Ting Lin, Mei-Chuan Kuo
Objective: Cardiac function is a robust and seemingly independent predictor of all-cause and cardiovascular mortality in patients undergoing hemodialysis (HD). Because efficient assessment of cardiac function is needed, we explored whether readily accessible blood samples could be used for this evaluation. In this study, we combined cardiovascular proteomics with machine learning (ML) techniques to test the feasibility of predicting cardiac function in HD patients.
Methods: We recruited a cohort of 328 HD patients from two units in southern Taiwan and measured 184 cardiovascular proteins using proximity extension assays. Using machine learning, we optimized a model to predict cardiac dysfunction, defined on the basis of ejection fraction. Model performance was evaluated using the area under the curve (AUC), and the SHapley Additive exPlanations (SHAP) method was used to identify the variables most important for prediction.
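To make the AUC metric concrete, the following is a minimal sketch of how it can be computed from predicted scores, using its rank interpretation: the probability that a randomly chosen patient with dysfunction receives a higher score than a randomly chosen patient without it. The function name `auc_score` and the toy labels and scores are hypothetical illustrations, not data or code from this study.

```python
def auc_score(labels, scores):
    """Return the AUC as the fraction of (positive, negative) pairs in which
    the positive case has the higher score; tied pairs count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = cardiac dysfunction, 0 = no dysfunction.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
toy_auc = auc_score(toy_labels, toy_scores)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation. The pairwise loop is quadratic in the number of patients, which is acceptable for a cohort of a few hundred; a sort-based implementation would be preferable for larger data.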
Results: Using a dataset of 184 proteomic biomarkers and 34 standard clinical variables, we found that the proteomic biomarkers predicted cardiac dysfunction better than the routine clinical and laboratory variables across several machine learning algorithms: Classification And Regression Tree (CART), Least Absolute Shrinkage and Selection Operator (LASSO), random forest, ranger, and extreme gradient boosting (XgBoost). In XgBoost-based feature selection, N-terminal pro-B-type natriuretic peptide (NT-proBNP) was the foremost contributor, with angiotensin-converting enzyme 2 (ACE-2) and chitotriosidase-1 (CHIT-1) also contributing to the prediction of cardiac dysfunction. SHAP-based interpretation of the XgBoost model supported these findings.
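As an illustration of the idea behind SHAP-based interpretation, the sketch below computes exact Shapley values for a three-feature toy model by enumerating all feature coalitions. It is not the study's pipeline: the risk function `toy_model`, its coefficients, and the all-zero baseline are invented for illustration, and features outside a coalition are simply set to their baseline value, which is one common simplifying assumption.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) - predict(baseline).
    Features outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(features):
    """Hypothetical risk score; the coefficients are invented, not study results."""
    nt_probnp, ace2, chit1 = features
    return 0.6 * nt_probnp + 0.2 * ace2 + 0.1 * chit1 + 0.1 * nt_probnp * ace2


toy_x = [1.0, 1.0, 1.0]         # hypothetical standardized protein levels
toy_baseline = [0.0, 0.0, 0.0]  # hypothetical reference patient
toy_phi = shapley_values(toy_model, toy_x, toy_baseline)
```

The attributions sum to the difference between the prediction for the patient and the prediction for the baseline, and the interaction term is split equally between the two features involved. Exact enumeration grows exponentially with the number of features, so it is impractical for 184 proteins; in practice, tree-specific algorithms compute these attributions efficiently for gradient-boosted models.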
Conclusion: Proteomic features outperformed clinical variables in predicting cardiac dysfunction with machine learning. Further analysis with XgBoost and SHAP highlighted NT-proBNP and CHIT-1 as key biomarkers, indicating that blood biomarkers may support the assessment of cardiac dysfunction in HD patients.