Machine learning-based prediction of paroxysmal atrial fibrillation in patients with pulmonary hypertension

Timo Kratz (Heidelberg)1, S. Hegde (Heidelberg)1, M. Prüser (Heidelberg)1, C. Dieterich (Heidelberg)1, C. Schmidt (Heidelberg)1

1Universitätsklinikum Heidelberg Klinik für Innere Med. III, Kardiologie, Angiologie u. Pneumologie Heidelberg, Deutschland


Background: Atrial fibrillation (AF) is the most common cardiac arrhythmia globally and has been linked to many adverse outcomes, including thromboembolism and stroke. It frequently goes undetected, and early diagnosis and treatment remain a challenge. In patients with pulmonary hypertension (PH), which is already a rapidly progressive disease with a poor prognosis, concomitant AF has been shown to significantly worsen outcomes through higher rates of cardiac decompensation and frequent hospital re-admission. Identifying AF in this special patient cohort is therefore of utmost clinical importance, both to prevent morbidity and mortality and to improve quality of life.

Purpose/aim: Our aim was to identify markers that will predict the presence of paroxysmal atrial fibrillation in patients with pulmonary hypertension, ultimately allowing us to derive a predictive model for this small, yet highly vulnerable patient group.

Methods: A retrospective cohort of 306 patients who presented to the PH outpatient clinic at Heidelberg University Hospital between 2000 and 2023 was used in the study. The cohort consisted of 128 male (41.8%) and 178 female (58.2%) individuals with an average age of 66.5 [55.0, 75.0] years. Feature selection was carried out using three methods: SHAP importance, Recursive Feature Elimination (RFE), and permutation importance. Random Forest and XGBoost classifiers were used for the binary classification (sinus rhythm vs. AF). Model performance was evaluated using the F1 score as the primary metric. Secondary metrics included AUROC, precision, recall and accuracy. The study protocol was approved by the ethics committee of the University of Heidelberg.
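To make the evaluation metrics concrete, the following is a minimal sketch of how F1, precision, recall and accuracy follow from the confusion-matrix counts of a binary classifier. It is an illustration only, not the study's code: the function name `binary_metrics` and the label vectors are hypothetical, with 1 standing for AF and 0 for sinus rhythm. AUROC is left out because it requires predicted probabilities rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, F1 and accuracy for 0/1 labels (1 = AF)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # AF correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # sinus rhythm flagged as AF
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # AF missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # sinus rhythm correctly cleared

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


# Hypothetical labels for eight patients (not study data).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

Because F1 ignores true negatives, it tends to be more informative than accuracy when the AF class is the one of clinical interest.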

Results: Our single-centre cohort consisted of 306 individual patients with pulmonary hypertension of any aetiology (i.e. groups 1-5). Using a combination of clinical parameters, ECG, lung function testing, echocardiography and right heart catheterization data as well as laboratory values, we identified 98 potentially predictive features for further processing (i.e. imputation, ranking and selection). The highest F1 score, 0.814, was obtained with the XGBoost classifier using just the top 5 features ranked by permutation feature importance: pulmonary arterial resistance (PAR), left atrial (LA) size, right atrial (RA) area, age and serum creatinine. Age and LA size are well-known risk factors for the development of AF, whilst PAR and RA area are closely correlated with PH severity and right heart dysfunction. Creatinine, a widely used biochemical marker of renal function, has also been linked to both left- and right-sided heart failure. Together, these findings suggest that, in addition to established AF risk factors, PH-related features are also predictive of AF development.
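For readers unfamiliar with permutation feature importance, the idea can be sketched in a few lines: shuffle one feature column at a time and record how much the model's score drops. The sketch below is a simplified, model-agnostic illustration and not the pipeline used in the study. The helper names (`f1_score`, `permutation_importance`), the toy model and the toy data are all hypothetical, and in practice the importance would be computed on held-out data with a fitted classifier.

```python
import random


def f1_score(y_true, y_pred):
    """F1 for 0/1 labels, written as 2*TP / (2*TP + FP + FN)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def permutation_importance(predict, X, y, score, n_repeats=10, seed=0):
    """Mean drop in score when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the label
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


# Hypothetical toy data: feature 0 drives the label, feature 1 is noise.
X = [[0.9, 3], [0.8, 1], [0.7, 4], [0.6, 1],
     [0.1, 5], [0.2, 9], [0.3, 2], [0.4, 6]]
y = [1, 1, 1, 1, 0, 0, 0, 0]


def toy_model(row):
    """Stand-in for a fitted classifier; it only looks at feature 0."""
    return 1 if row[0] > 0.5 else 0


importances = permutation_importance(toy_model, X, y, f1_score)
print(importances)
```

Ranking features by these values and keeping the top few mirrors, in spirit, the selection step described above; a feature the model never uses receives an importance of exactly zero.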

Conclusions: By combining clinical parameters obtained from different modalities, we identified a set of readily available markers that adequately predict the presence of AF in patients with pulmonary hypertension. We hope that these results will enable us to derive an easy-to-use prediction score for AF in this high-risk patient group. The clinical importance of such a score cannot be overstated given the devastating effects of undetected AF in patients with concomitant PH. In future studies, we aim to validate our findings in a larger external cohort, such as that of the UK Biobank.
