Predicting improvement of physical exercise capacity using pulse wave analysis in patients with coronary artery disease using a machine learning approach

https://doi.org/10.1007/s00392-025-02625-4

Hendrik Schäfer (Witten)1, V. Tsakanikas (Ioannina)2, D. I. Fotiadis (Ioannina)2, B. Schmitz (Ennepetal)3, F. C. Mooren (Witten)1

1Universität Witten/Herdecke gGmbH, Fakultät für Gesundheit, Witten, Germany; 2Foundation for Research and Technology-Hellas, Ioannina, Greece; 3Klinik Königsfeld, Zentrum für Rehabilitation, Ennepetal, Germany

 

Background and aims:
Patients with stable coronary artery disease (CAD) have a residual risk of adverse events and all-cause mortality. Physical exercise capacity (EC), or fitness, is one of the cardinal determinants of morbidity and mortality in patients with CAD. Enhancing EC by exercise training (ET) during cardiac rehabilitation (CR) is thus a class IA guideline recommendation. However, there is a high number of ET non-responders, i.e. patients who do not improve their EC significantly with CR. We aimed to develop a prediction model using machine learning (ML) for the early identification of ET non-responders based on cardiopulmonary exercise testing (CPET) and cardiovascular information derived from pulse wave analysis (PWA).

Methods:
Data from 404 CAD patients (mean age 57.3±9.2 years; 18.3% women) after myocardial infarction (MI) and/or percutaneous coronary intervention (PCI), without bypass surgery, who underwent inpatient phase II CR were included in the analysis. CPET (Ergostic, Amedtec, Aue, Germany) and PWA (Tel-O-Graph, IEM GmbH, Stolberg, Germany) were conducted at the beginning (T0) and end (T1) of CR. Data on diagnosis, disease severity and medication were also included for modelling. A positive response to ET, in terms of an increase in oxygen uptake, was defined as a change greater than twice the typical error away from zero. A feature importance analysis was conducted to assess the relative significance of the determinants in explaining the change in EC. A data-driven approach was applied to root the derived model in empirical evidence. Different supervised ML models, including Logistic Regression, Random Forest, K-Nearest Neighbors, Naive Bayes, and XGBoost, were applied. The dataset was split into training and test sets, and 10-fold cross-validation was used. Trained models were evaluated using metrics such as accuracy, sensitivity, and specificity. Predictions were explained using the model-agnostic SHapley Additive exPlanation (SHAP) methodology.
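The response criterion above can be illustrated with a minimal sketch. The abstract does not state how the typical error was estimated; the code below assumes the common convention of taking the standard deviation of paired test-retest differences divided by the square root of 2. The function names and all numeric values are hypothetical and serve only to show the threshold logic.

```python
from math import sqrt
from statistics import stdev


def typical_error(test, retest):
    """Assumed convention: SD of paired differences divided by sqrt(2)."""
    diffs = [b - a for a, b in zip(test, retest)]
    return stdev(diffs) / sqrt(2)


def is_responder(vo2_t0, vo2_t1, te):
    """Positive response: the gain in oxygen uptake exceeds 2 x typical error."""
    return (vo2_t1 - vo2_t0) > 2 * te


# Hypothetical repeated oxygen uptake measurements (ml/kg/min), not study data
test = [20.0, 25.0, 18.0, 30.0]
retest = [21.0, 24.0, 19.0, 29.0]
te = typical_error(test, retest)

# Hypothetical patients: T0 and T1 peak oxygen uptake
patient_a = is_responder(20.0, 22.5, te)  # gain of 2.5 is above the threshold
patient_b = is_responder(20.0, 21.0, te)  # gain of 1.0 is below the threshold
```

With these made-up values the threshold is about 1.63 ml/kg/min, so the first hypothetical patient counts as a responder and the second as a non-responder.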

Results:
Of the included patients, 69.7% were responders and 30.3% were non-responders. The feature importance analysis revealed that peak oxygen uptake at T0 had the greatest influence, followed by variables derived from the raw pulse wave data, such as kurtosis or entropy at T0. For the prediction model, the Random Forest classifier provided the best mean balanced accuracy of 84.2% (Table 1) and a mean area under the curve (AUC) of 82% in the receiver operating characteristic (ROC) analysis (Figure 1). The variability of 4.4% across the cross-validation folds indicated highly consistent performance. The most influential features affecting the predictive performance of the model were breathing frequency, power and oxygen uptake, combined with PWA-derived vascular characteristics at T0, including pulse pressure, augmentation index and central arterial pressure. Of note, primary diagnosis, disease severity and medication had only limited influence on the model.
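Because responders and non-responders are unevenly distributed (roughly 70% versus 30%), balanced accuracy is more informative than plain accuracy. The sketch below shows how sensitivity, specificity and balanced accuracy relate. The labels are hypothetical (1 = non-responder, 0 = responder), the helper names are not taken from the study, and the example is not a reconstruction of the authors' pipeline.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity); both classes must occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity and specificity."""
    sens, spec = sensitivity_specificity(y_true, y_pred, positive)
    return (sens + spec) / 2


def accuracy(y_true, y_pred):
    """Fraction of correct predictions."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical cohort of 10 patients: 7 responders (0), 3 non-responders (1)
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

# A trivial classifier that labels every patient as a responder
always_responder = [0] * 10

# A hypothetical classifier that finds 2 of 3 non-responders
# and misclassifies 1 of 7 responders
better_model = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

trivial_acc = accuracy(y_true, always_responder)
trivial_bal = balanced_accuracy(y_true, always_responder)
better_bal = balanced_accuracy(y_true, better_model)
```

In this toy example the trivial classifier reaches an accuracy of 0.7 but a balanced accuracy of only 0.5, which is why a balanced metric is the more meaningful yardstick for an imbalanced cohort.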

Table 1. Performance of the ML models

ML model              Mean accuracy   Standard deviation
Logistic Regression   0.746           0.1555
Random Forest         0.842           0.094
K-Nearest Neighbors   0.645           0.084
Naive Bayes           0.500           0.009
XGBoost               0.794           0.118

Figure 1. ROC curves of the generated ML models


Conclusion:
Our newly developed ML-based model enables early identification of ET non-responders. Knowing the expected course of EC during phase II CR, clinicians can better tailor patient-centered, individual interventions.

Keywords:
Coronary Artery Disease, Prediction, Physical Capacity, Pulse Wave Analysis, Data Mining, Modelling
