1 Universitätsmedizin Mannheim, Klinik für Dermatologie, Venerologie und Allergologie, Mannheim, Germany; 2 Universitätsmedizin Mannheim, V. Medizinische Klinik, Nephrologie, Hypertensiologie, Endokrinologie, Diabetologie, Rheumatologie, Pneumologie, Mannheim, Germany; 3 SYNLAB Holding Deutschland GmbH, SYNLAB Akademie, Mannheim, Germany; 4 Universitätsklinikum Mannheim, V. Medizinische Klinik, Mannheim, Germany
Background:
Using two large clinical datasets, LURIC and UMM, we applied automated machine learning techniques to develop predictive and analytical models. Compared with other machine learning studies, this approach makes more efficient use of data science technology when building tools from existing clinical cardiovascular data, with the aim of improving cardiovascular risk stratification.
Methods:
Two large clinical datasets, LURIC (1997-2010, 3058 patients) and UMM (2017-2020, 423 patients), were analyzed with automated machine learning techniques to create predictive and analytical models. Unlike the approaches used in other machine learning studies, ours can be applied by people who are not data science experts, such as medical professionals. This allows data science technology to be used more efficiently when developing tools from existing clinical cardiovascular data. Evaluation metrics including AUC, LogLoss, and the Matthews correlation coefficient guided the development of accurate models tailored to our specific use cases.
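For readers less familiar with the three metrics named above, the following minimal sketch shows how each can be computed from binary outcomes and predicted probabilities. It is an illustration only and not the pipeline used in this study: the function names, the 0.5 classification threshold, and the toy data are assumptions made for the example.

```python
import math

def roc_auc(y_true, y_score):
    """AUC as the probability that a random positive case is scored
    higher than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the observed outcomes."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(y_true)

def matthews_corrcoef(y_true, y_prob, threshold=0.5):
    """Matthews correlation coefficient after thresholding probabilities.
    The 0.5 threshold is an assumption for this example."""
    tp = tn = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 0 and y == 0:
            tn += 1
        elif pred == 1 and y == 0:
            fp += 1
        else:
            fn += 1
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

# Made-up toy data (not patient data): 1 = event, 0 = no event.
y_true = [0, 0, 1, 1, 0, 1]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]

print("AUC:", roc_auc(y_true, y_prob))
print("LogLoss:", log_loss(y_true, y_prob))
print("MCC:", matthews_corrcoef(y_true, y_prob))
```

AUC measures how well the model ranks patients, LogLoss penalizes poorly calibrated probabilities, and the Matthews correlation coefficient summarizes the full confusion matrix at one threshold, so the three metrics capture complementary aspects of model quality.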
Results:
In the first part of our study, we analyzed parameter interactions in the two study cohorts that had previously gone undetected because the relationships between features are nonlinear. In the second part, models built on a consolidated dataset achieved AUC scores ranging from 0.73 to 0.84 in cross-validation and 0.7 to 0.85 in external validation, demonstrating that models built on LURIC data generalized accurately to an entirely new patient cohort in the UMM dataset. This approach substantially improved model accuracy in external validation.
The third part involved the creation of four different cardiovascular mortality risk stratification models. The AUC scores for these models ranged from 0.74 to 0.85, providing more accurate predictions than traditional tools. Shapley-based plots identified the features that contributed most to the predictions, including age, NT-proBNP levels, hsCRP levels and the variable effect of lipoprotein(a), providing insight into the prediction process. These predictive insights were applied to individual patient risk assessments in an AI risk stratification application, which we will demonstrate in this presentation and which is intended to compete with established scores such as ESC, Framingham or PROCAM (Figure 1).
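To illustrate what a Shapley-based attribution expresses, the sketch below computes exact Shapley values for a single hypothetical patient by enumerating all feature subsets. Everything in it is an assumption made for illustration: the `toy_risk` function, its coefficients, the feature names and the patient and reference values are invented and are not the models or data of this study. Exhaustive enumeration is only feasible for a handful of features; practical tools rely on approximations.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, patient, baseline):
    """Exact Shapley values for one prediction.

    Features outside a coalition are set to their baseline (reference)
    value; features inside it take the patient's value.
    """
    names = list(patient)
    n = len(names)

    def value(subset):
        x = {k: (patient[k] if k in subset else baseline[k]) for k in names}
        return predict(x)

    phi = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi

def toy_risk(x):
    """Hypothetical risk score with made-up coefficients and one
    interaction term (age x hsCRP) to mimic a nonlinear model."""
    return (0.03 * x["age"]
            + 0.5 * x["log_nt_probnp"]
            + 0.2 * x["hscrp"]
            + 0.01 * x["age"] * x["hscrp"])

# Invented values for illustration only.
patient = {"age": 70, "log_nt_probnp": 3.0, "hscrp": 5.0}
baseline = {"age": 60, "log_nt_probnp": 2.0, "hscrp": 1.0}

phi = shapley_values(toy_risk, patient, baseline)
for feature, contribution in phi.items():
    print(f"{feature}: {contribution:+.3f}")
```

The contributions sum to the difference between the patient's score and the reference score, which is the property that lets such plots decompose an individual risk estimate feature by feature.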
Discussion:
Our findings underscore the transformative potential of machine learning in cardiovascular medicine. We highlight relationships such as the impact of statin use on coronary artery disease (CAD) and show improved prediction with “novel” risk factors such as NT-proBNP, hsCRP and lipoprotein(a) (Figure 2). Our results not only align with the conference theme, but also add to the body of knowledge in the field of digital medicine and cardiology.
Conclusion:
Our work exemplifies the fusion of medical expertise and technological innovation. The improved risk prediction capabilities and the development of robust models open the door to more accurate and personalized assessments of cardiovascular disease risk. Our study encourages further discussion and research in this emerging field. These findings enable adaptive, real-time predictive applications that can evolve with new data, making them valuable future tools for clinical practice at the local level, for example in hospitals or for practicing physicians.
Figures:
Figure 1
Figure 2