Introduction Heart failure (HF) is a frequent yet complex condition with diverse etiology and large clinical heterogeneity. We explored the application of digital vocal biomarkers combined with unsupervised machine learning to identify novel HF subgroups that may aid in personalizing monitoring and treatment.
Methods The AHF-Voice study is an ongoing, BMBF-funded, monocentric prospective cohort study conducted at the University Hospital Würzburg. Inclusion criteria are hospitalization for acute HF, age ≥18 years, and life expectancy ≥6 months. Exclusion criteria are high-output HF, cardiogenic shock, listing for high-urgency heart transplantation, or a history of vocal fold disease or phonosurgery. Supervised by study staff, patients collect daily voice recordings in a dedicated smartphone app using three different voice tasks: spontaneous speech, sustained vowel, and text reading. Patients are comprehensively phenotyped during hospitalization. We set up a vocal biomarker pipeline for audio processing and extraction of phonation and spectral-based voice features. Data were analyzed using principal component analysis (PCA) followed by unsupervised K-means clustering. The number of clusters was determined using the Silhouette score. The AHF-Voice study aims to recruit 123 patients and follow them for 6 months. Here, we report on the in-hospital period of the first 50 AHF-Voice patients included between April and August 2023.
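The selection of the number of clusters via the Silhouette score can be sketched as follows. This is a minimal illustration under stated assumptions, not the study pipeline: the helper names (`dist`, `kmeans`, `silhouette_score`, `select_k`) are hypothetical, the feature vectors are synthetic toy data rather than voice features, the PCA step is omitted, and K-means is implemented in plain Python with a deterministic farthest-point initialization so that the example is reproducible.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm; returns one cluster label per point."""
    # Deterministic farthest-point initialization (an assumption of this sketch).
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        new_labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster runs empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette_score(points, labels):
    """Mean silhouette coefficient s(i) = (b - a) / max(a, b) over all points."""
    clusters = set(labels)
    total = 0.0
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:  # singleton cluster contributes s(i) = 0
            continue
        a = sum(own) / len(own)  # mean distance within the own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(dist(p, q) for q, lab in zip(points, labels) if lab == c)
            / labels.count(c)
            for c in clusters if c != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)

def select_k(points, candidates=range(2, 6)):
    """Return the candidate k with the highest silhouette score, plus all scores."""
    scores = {k: silhouette_score(points, kmeans(points, k)) for k in candidates}
    return max(scores, key=scores.get), scores

# Synthetic toy data: three well-separated 2-D groups (NOT study data).
blob_centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
offsets = [(-0.5, -0.3), (0.4, 0.2), (0.1, -0.6), (-0.2, 0.5), (0.6, 0.1)]
features = [[cx + dx, cy + dy] for cx, cy in blob_centers for dx, dy in offsets]

best_k, scores = select_k(features)
print(best_k, scores)
```

In a real analysis the input would be the leading principal components of the standardized voice features, and established implementations such as those in scikit-learn would typically replace the hand-written helpers.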
Conclusion Machine learning-based cluster analysis of voice features can identify distinct groups of HF patients. The clinical utility of these groups remains to be explored.
| | Cluster 1 (N=9) | Cluster 2 (N=10) | Cluster 3 (N=23) | P value |
|---|---|---|---|---|
| Age (yrs) | 74±8 | 74±13 | 74±11 | 0.967 |
| Male sex | 7 (78) | 7 (70) | 13 (57) | 0.490 |
| HF characteristics | | | | |
| History of HF | | | | 0.009 |
| de novo | 1 (11) | 7 (70) | 10 (44) | |
| <1 year | - | - | 3 (13) | |
| 1-5 years | - | 3 (30) | 2 (9) | |
| >5 years | 7 (78) | - | 7 (30) | |
| unknown | 1 (11) | - | 1 (4) | |
| NYHA class III/IV | 8 (89) | 9 (90) | 20 (87) | 0.966 |
| Comorbidities and risk factors | | | | |
| History of MI | 4 (44) | 3 (30) | 7 (30) | 0.733 |
| Current smoking | - | 1 (10) | 4 (17) | 0.393 |
| Diabetes mellitus | 5 (56) | 5 (50) | 8 (35) | 0.502 |
| Peripheral artery disease | 1 (11) | 4 (40) | 4 (17) | 0.250 |
| COPD | 3 (33) | 1 (10) | 4 (17) | 0.423 |
| Revascularization | 3 (33) | 4 (40) | 6 (26) | 0.724 |
| Valve intervention | 3 (33) | - | 3 (13) | 0.119 |
| Device (CRT/ICD) | 4 (44) | - | 8 (35) | 0.067 |
| Charlson comorbidity score | 3 [2, 5] | 3 [2, 3] | 2 [1, 4] | 0.685 |
| Measurements | | | | |
| LVEF (%) | 37±12 | 47±14 | 50±16 | 0.052 |
| BMI (kg/m²) | 31±6 | 32±5 | 31±7 | 0.911 |
| Systolic pressure (mmHg) | 124±34 | 142±26 | 133±21 | 0.316 |
| NT-proBNP (pg/mL) | 7387 [2199, 12069] | 4429 [1441, 13894] | 5190 [3100, 11034] | 0.867 |
| Sodium (mmol/L) | 138 [137, 140] | 138 [138, 140] | 141 [138, 143] | 0.096 |
| Potassium (mmol/L) | 4±1 | 5±1 | 4±1 | 0.020 |
| Hematocrit (%) | 39±5 | 36±8 | 38±7 | 0.634 |
| eGFR (mL/min/1.73 m²) | 43 [40, 46] | 66 [45, 85] | 41 [33, 76] | 0.165 |
| C-reactive protein (mg/dL) | 1 [0, 3] | 1 [0, 1] | 1 [0, 3] | 0.544 |
Data are n (%), mean±SD, or median [quartiles]. Group comparisons by Kruskal-Wallis test or χ² test.
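As an illustration of the χ² group comparison, the sketch below applies a Pearson χ² test of independence to the counts in the table's "Male sex" row. It is an assumption-laden example rather than the authors' analysis code: the function name `chi_square_test` is hypothetical, no continuity correction is applied, and the closed-form p-value exp(-χ²/2) is valid only for 2 degrees of freedom, which is the case for a 2 × 3 table.

```python
import math

def chi_square_test(table):
    """Pearson chi-square test of independence for a table with df = 2."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    if dof != 2:
        raise ValueError("closed-form p-value used here holds only for df = 2")
    p_value = math.exp(-stat / 2)  # chi-square survival function for df = 2
    return stat, p_value

# Counts from the "Male sex" row: males per group and group sizes.
male = [7, 7, 13]
group_sizes = [9, 10, 23]
table = [male, [n - m for n, m in zip(group_sizes, male)]]

stat, p = chi_square_test(table)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")
```

The resulting p-value lies close to the tabulated 0.490; small differences are to be expected because the exact test settings and the handling of missing data are not stated here.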
