Individual predictions and visualization of outcomes from clinical routine data of chronic heart failure patients by a machine learning model: a potential tool for improvement of medical care

Jan Ulmer (Hannover)1, K. Werle (Hannover)1, S. Schallhorn (Hannover)2, S. Soltani (Hannover)2, E. Angelini (Hannover)2, M.-A. Sandu (Hannover)2, T. Zeppernick (Hannover)2, M. Freitag (Hannover)2, A. Schröder (Hannover)2, J. Bauersachs (Hannover)2, M. Gietzelt (Hannover)1, M. Marschollek (Hannover)1, U. Bavendiek (Hannover)2

1Medizinische Hochschule Hannover, Institut für Medizininformatik, Hannover, Germany; 2Medizinische Hochschule Hannover, Klinik für Kardiologie und Angiologie, Hannover, Germany



Chronic heart failure (CHF) affects about 2-3 million people in Germany and is one of the most common reasons for hospitalization. Medical data from routine clinical practice are stored in different primary systems, which limits access for health-care professionals and for research. As a result, the use of existing medical data for optimal patient care and research is markedly restricted.


As part of the HiGHmed consortium within the Medical Informatics Initiative, the Use Case Cardiology (UCC) aims to collect standardized medical data of patients with CHF from routine clinical practice and to make these data accessible for clinical research. The long-term goal is to improve patient care and outcomes by identifying high-risk patients with CHF and examining warning signals of disease worsening. This study aims to identify attributes in standardized clinical routine data that predict outcomes of patients with CHF, and to develop a tool that employs a machine learning model to visualize outcomes at the individual patient level.


In this study, standardized and harmonized clinical data from the routine treatment of a cohort of 1320 patients with CHF enrolled at the UCC site of Hannover Medical School were analyzed. Patient age in this cohort ranged from 18 to 95 years (average 70 years); 67% of patients were male. The statistical analysis was based on a time-to-event analysis using Kaplan-Meier curves over a fixed study period of 2 years. First, the results derived from the curves were statistically verified using the log-rank test. Second, they were used to evaluate a machine learning model, the Random Survival Forest (RSF), which provides risk prediction curves for individual patients.
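The two statistical steps described above can be illustrated with a minimal sketch in plain Python. It is not the study's code: the function names `kaplan_meier` and `logrank_test`, the data layout (one follow-up time and one event flag per patient) and the restriction to two groups are assumptions made for illustration only.

```python
import math
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time per patient (assumed capped at the study period)
    events -- 1 if the event occurred at that time, 0 if censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)  # each patient leaves the risk set at their own time
    at_risk = len(times)
    surv, curve = 1.0, []
    for t in sorted(leaving):
        d = event_counts.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    obs_minus_exp, variance = 0.0, 0.0
    for t in sorted({u for u, e, _ in data if e}):
        n = sum(1 for u, _, _ in data if u >= t)                 # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == 0)    # at risk, group A
        d = sum(1 for u, e, _ in data if u == t and e)           # events at t
        d_a = sum(1 for u, e, g in data if u == t and e and g == 0)
        obs_minus_exp += d_a - d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value
```

For a binary attribute such as NYHA I-II vs. III-IV, `kaplan_meier` would be called once per group to draw the two curves, and `logrank_test` would compare them.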


The analysis of survival time using Kaplan-Meier curves identified the following attributes as particularly relevant: as binary attributes, NYHA class (NYHA I-II vs. III-IV), heart failure hospitalization within the last 18 months (HFH 18M) and furosemide treatment; and, grouped by quartiles, estimated glomerular filtration rate (eGFR), NT-proBNP and left ventricular ejection fraction (LVEF). Statistical significance was confirmed for the events death and hospitalization for worsening of heart failure (HHF) using the log-rank test and the multivariate log-rank test (for p-values see table 1). The RSF for the prediction of survival time achieved a concordance index (C-index) of 0.74 for death and of 0.65 for HHF in a 5-fold cross-validation. The model delivers individual prediction curves for single patients, which can be visualized for different scenarios of risk attributes (figure 1).
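To make the reported evaluation metric concrete, the following sketch shows how a concordance index can be computed on censored data, together with a simple way of splitting patients into five folds. It is a simplified illustration, not the study's implementation: the function names, the handling of ties (pairs with equal times are skipped) and the shuffled fold assignment with a fixed seed are assumptions.

```python
import random


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable if patient i had an observed event
    before patient j's follow-up time. It is concordant if the model
    assigned the higher risk score to patient i. Tied scores count 0.5.
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def k_fold_indices(n_patients, k=5, seed=0):
    """Split patient indices into k disjoint folds after shuffling."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]
```

In a cross-validation, each fold would serve once as the test set: the model is fitted on the other four folds, `concordance_index` is computed on the held-out fold, and the five values are averaged. A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking.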


Individual predictions from clinical routine data employing a machine learning model make it possible to investigate the effects of risk attributes on the outcomes of a single patient with CHF. Visualizing for the patient how different scenarios of risk attributes affect individual outcomes provides a potential tool for patient education and improvement of medical care.


Table 1

HFH 18M (yes vs. no)

furosemide treatment (yes vs. no)

eGFR (ml/min)*: p≤0.001 (1st quartile); p≤0.001 (1st quartile)

NT-proBNP (ng/ml)*: p≤0.001 (1st and 4th quartile); p≤0.001 (all quartiles)

LVEF (%)*: p≤0.001 (1st quartile)

Figure 1

Changing the values of the attributes NYHA class (-1) and beta-blocker therapy (+1) yields a new survival curve for a single patient
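The scenario mechanism in the figure can be sketched as follows: copy the patient record, shift the chosen attributes, and ask the model for a new curve. Everything in this sketch is hypothetical. The attribute names `nyha_class` and `betablocker`, the helper `apply_scenario`, and in particular `toy_survival_curve` with its made-up weights are stand-ins for illustration; they are not the study's trained RSF and carry no clinical meaning.

```python
import math


def apply_scenario(patient, changes):
    """Return a copy of the patient record with attribute deltas applied."""
    modified = dict(patient)
    for name, delta in changes.items():
        modified[name] = modified[name] + delta
    return modified


def toy_survival_curve(patient, months=(0, 6, 12, 18, 24)):
    """Stand-in for a trained model. The weights below are invented
    for illustration and are NOT estimates from the study."""
    log_hazard = 0.5 * patient["nyha_class"] - 0.4 * patient["betablocker"]
    rate = 0.01 * math.exp(log_hazard)
    return [(m, math.exp(-rate * m)) for m in months]


# Hypothetical patient: NYHA class III, no beta-blocker therapy.
patient = {"nyha_class": 3, "betablocker": 0}
scenario = apply_scenario(patient, {"nyha_class": -1, "betablocker": +1})

baseline_curve = toy_survival_curve(patient)
scenario_curve = toy_survival_curve(scenario)
```

Plotting `baseline_curve` and `scenario_curve` in the same chart would give the kind of side-by-side view the figure describes. With a real model, `toy_survival_curve` would be replaced by that model's own prediction call.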
