Atrial fibrillation (AF) is the most common
cardiac arrhythmia and is associated with increased morbidity and mortality.
Early prediction of AF episodes remains a clinical challenge. This study aimed
to generate pathophysiological hypotheses for AF onset by analyzing
correlations among heart rate variability (HRV) parameters in patients
monitored via long-term Holter ECG. We used the IRIDIA-AF database,
comprising 1319 paroxysmal AF episodes from 872 patients. An XGBoost machine
learning model was developed to predict AF onset within 24 h using short- and
long-term HRV features, fragmentation indices, and non-linear metrics extracted
during sinus rhythm. Model interpretation was performed using SHapley Additive
exPlanations (SHAP) values, and dimensionality reduction techniques were
applied for data visualization. The model achieved an area under the receiver
operating characteristic curve of 0.919 and an area under the precision-recall
curve of 0.919, with high accuracy, sensitivity, and specificity. Key
predictive features included short-term vagal activity, HRV fragmentation
indices, and non-linear parameters, highlighting the role of the autonomic
nervous system in AF initiation. Our findings suggest that distinct
physiological profiles, detectable via HRV, may underlie AF susceptibility and
could inform personalized monitoring and prevention strategies.
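
As an illustration of the feature families named above, the following minimal sketch computes two widely used short-term vagal markers (RMSSD and pNN50) and one fragmentation index (the percentage of inflection points, PIP) from a sequence of RR intervals in milliseconds. The abstract does not list the exact features or formulas used in the study, so the choice of these three metrics, the function names, and the PIP denominator (total number of RR intervals) are assumptions made for illustration only.

```python
import math


def _successive_diffs(rr_ms):
    """Differences between consecutive RR intervals (ms)."""
    return [b - a for a, b in zip(rr_ms, rr_ms[1:])]


def rmssd(rr_ms):
    """Root mean square of successive RR differences (ms)."""
    diffs = _successive_diffs(rr_ms)
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def pnn50(rr_ms):
    """Percentage of successive RR differences larger than 50 ms."""
    diffs = _successive_diffs(rr_ms)
    return 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)


def pip(rr_ms):
    """Percentage of inflection points, a fragmentation index.

    An inflection point is counted where two consecutive RR differences
    change sign or one of them is zero (product <= 0). The denominator
    used here, the number of RR intervals, is an assumption.
    """
    diffs = _successive_diffs(rr_ms)
    inflections = sum(d1 * d2 <= 0 for d1, d2 in zip(diffs, diffs[1:]))
    return 100.0 * inflections / len(rr_ms)


if __name__ == "__main__":
    # Hypothetical RR series (ms), not taken from the IRIDIA-AF database.
    rr = [800, 810, 790, 850, 845, 900]
    print(rmssd(rr), pnn50(rr), pip(rr))
```

In a pipeline of the kind described, such values would be computed over sinus-rhythm windows and passed, together with the other features, to the classifier; that step is not shown here.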