Standardization and AI-readiness across 5 international heart failure datasets to support the development of AI models: an iCARE4CVD use case

Clin Res Cardiol (2026). DOI 10.1007/s00392-026-02870-1
M. Verket (Aachen)¹, C. Peters (Maastricht)², H. Ghaem Sigarchian (Geel)³, E. Zilonova (London)⁴, S. Kabak (Maastricht)², M. Colombo (Milan)⁵, A. Henderson (Glasgow)⁶, N. Pavo (Vienna)⁷, N. Krautenbacher (Penzberg)⁸, W. Wei (Maastricht)⁹, A. A. Voors (Groningen)¹⁰, M. Huelsmann (Vienna)⁷, R. Latini (Milan)¹¹, D. Müller-Wieland (Aachen)¹, H.-P. Brunner-La Rocca (Maastricht)¹²
¹Uniklinik RWTH Aachen, Med. Klinik I - Kardiologie, Angiologie und Internistische Intensivmedizin, Aachen, Germany; ²Maastricht University Medical Centre, Department of Cardiology, Maastricht, Netherlands; ³Thomas More, Care and Well-Being – Research Group Mobilab & Care, Geel, Belgium; ⁴Novo Nordisk, Digital Biology, AI & Digital Innovation, London, United Kingdom; ⁵Istituto di Ricerche Farmacologiche Mario Negri, Dipartimento di Ricerca Danno Cerebrale e Cardiovascolare Acuto, Milan, Italy; ⁶University of Glasgow, School of Cardiovascular & Metabolic Health, Glasgow, United Kingdom; ⁷Medizinische Universität Wien, Innere Medizin II / Kardiologie, Vienna, Austria; ⁸Roche Diagnostics GmbH, Penzberg, Germany; ⁹Maastricht University, Institute of Data Science, Maastricht, Netherlands; ¹⁰University Medical Center Groningen, Department of Echocardiography, Groningen, Netherlands; ¹¹IRCCS Istituto Neurologico, Department of Cardiovascular Research, Milan, Italy; ¹²Maastricht University Medical Center, Maastricht, Netherlands

Background: The integration of diverse heart failure (HF) datasets offers the potential to enhance predictive modelling and personalized treatment strategies. However, heterogeneity in data structure, terminology, and medication representation poses significant challenges for pooled analyses and artificial intelligence (AI) model development.

Purpose: To standardize distinct HF datasets by mapping clinical variables and transforming medication data, thereby creating a unified, high-quality dataset suitable for AI-driven predictive modelling.

Methods: Five HF datasets were used: Aachen-HF (DE), Biostat (NL), TIME-HF (CH), GISSI-HF (IT), and Vienna-HF (AT). Each dataset included demographic, clinical, laboratory, and treatment variables. To allow for AI-based modelling, a standardization pipeline using coding standards (SNOMED-CT, LOINC, ATC) was implemented by developing metadata dictionaries. Custom mappings were developed to address missing standardization and legacy coding. HF medication mapping and transformation to a percentage of the target dose were aligned with the 2021 ESC HF Guidelines, ensuring consistency in the definitions, classification, and target doses of HF therapies across datasets. This percentage was pooled into one variable per medication class.

Results: More than 80 feature variables from 5181 patients with 25982 records were identified for harmonization across the HF datasets. Five medication classes, including beta blockers, RAS inhibitors, diuretics, and MRAs, were transformed to percentages of the daily target dose for each patient. Signs and symptoms of HF, such as edema, NYHA class, and orthopnoea, were included. Additionally, comorbidities such as diabetes, kidney disease, and COPD were identified as feature variables.

Conclusion: Standardization and transformation of HF medication data across multiple, heterogeneous datasets is critical for developing clinically relevant AI models. This process yielded a robust, standardized dataset with coherent definitions of demographic, laboratory, and treatment variables, particularly those reflecting guideline-directed HF therapies. By bridging the data variability across cohorts and successfully reusing older datasets, this work lays the foundation for AI-driven precision HF medicine.
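
The medication transformation described in the Methods (each prescribed dose expressed as a percentage of its target dose, then pooled into one variable per medication class) can be illustrated with a minimal sketch. It is not the pipeline used in this work. The drug table, the class labels, the function names `percent_of_target` and `pool_by_class`, and the choice to pool by summing percentages within a class are all assumptions made for illustration, and the target doses shown should be checked against the 2021 ESC HF Guidelines before any real use.

```python
from collections import defaultdict

# Hypothetical lookup: drug name -> (medication class, daily target dose in mg).
# Values are illustrative assumptions; verify against the 2021 ESC HF Guidelines.
TARGET_DAILY_DOSE_MG = {
    "bisoprolol": ("beta_blocker", 10.0),
    "carvedilol": ("beta_blocker", 50.0),
    "candesartan": ("ras_inhibitor", 32.0),
    "spironolactone": ("mra", 50.0),
}


def percent_of_target(drug, daily_dose_mg):
    """Return (medication class, dose as a percentage of the daily target dose).

    Raises KeyError for drugs missing from the lookup table, so unmapped
    medications are surfaced instead of being silently dropped.
    """
    med_class, target_mg = TARGET_DAILY_DOSE_MG[drug.lower()]
    return med_class, 100.0 * daily_dose_mg / target_mg


def pool_by_class(prescriptions):
    """Pool per-drug percentages into one value per medication class.

    Assumption: percentages of drugs in the same class are summed.
    """
    pooled = defaultdict(float)
    for drug, daily_dose_mg in prescriptions:
        med_class, pct = percent_of_target(drug, daily_dose_mg)
        pooled[med_class] += pct
    return dict(pooled)


# One hypothetical patient record: (drug, total daily dose in mg).
record = [("Bisoprolol", 5.0), ("spironolactone", 25.0), ("candesartan", 16.0)]
pooled = pool_by_class(record)
print(pooled)
```

Summing within a class is only one possible pooling rule. Taking the maximum per class, or capping values at 100%, would be equally simple variants, and the abstract does not state which rule was applied.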

