k-Means Clustering in Fingerprint-Based Configuration Selection for Fitting Interatomic Potentials
Journal of Chemical Theory and Computation, 2024
Please see the full published article: https://doi.org/10.1021/acs.jctc.4c01225
Highlights:
- K-means clustering of atomistic fingerprints improves configuration selection when fitting interatomic potentials from large datasets.
- The method outperforms uniform random sampling, achieving lower energy and force errors with fewer training configurations.
- About 30 configurations were sufficient to obtain an embedded-atom method (EAM) potential that accurately describes the full set of 1800 Ti configurations.
- The clustering approach yields more reliable fits, with consistently lower standard deviations than random selection.
- Fingerprinting combines CrystalNN, bond-distance statistics, and the radial distribution function (RDF), enabling selection of structurally distinct configurations.
- t-SNE visualization revealed overlap between vacancy and non-vacancy subsets, indicating similar atomic environments and potential data redundancy.
- Configurations with vacancies could be predicted accurately even when excluded from training, confirming the redundancy suggested by the t-SNE map.
- The method offers an efficient route to reduce DFT workload and streamline potential fitting workflows.
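
The selection idea in the highlights can be illustrated with a minimal sketch: cluster the fingerprint vectors with k-means, then keep one configuration per cluster. This is not the article's implementation. The function name `kmeans_select`, the rule of keeping the configuration nearest each centroid, and the synthetic 2D "fingerprints" are all assumptions made for illustration; real fingerprints would come from CrystalNN, bond-distance statistics, and the RDF. The k-means loop is a plain Lloyd iteration written with the standard library only so the example stays self-contained; in practice an established implementation such as scikit-learn's `KMeans` would be the more usual choice.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_select(fingerprints, n_select, n_iter=100, seed=0):
    """Pick up to n_select configuration indices, one per k-means cluster.

    Hypothetical helper: runs Lloyd's algorithm on the fingerprint vectors
    and returns, for each non-empty cluster, the index of the configuration
    closest to the cluster centroid (an assumed selection rule).
    """
    rng = random.Random(seed)
    # Initialize centroids at n_select distinct data points.
    start = rng.sample(range(len(fingerprints)), n_select)
    centroids = [list(fingerprints[i]) for i in start]
    labels = None
    for _ in range(n_iter):
        # Assignment step: each configuration joins its nearest centroid.
        new_labels = [
            min(range(n_select), key=lambda k: squared_distance(fp, centroids[k]))
            for fp in fingerprints
        ]
        if new_labels == labels:
            break  # assignments are stable, so the iteration has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for k in range(n_select):
            members = [fingerprints[i] for i, lab in enumerate(labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    # Keep the configuration nearest to each centroid as the representative.
    selected = []
    for k in range(n_select):
        member_idx = [i for i, lab in enumerate(labels) if lab == k]
        if member_idx:
            selected.append(
                min(member_idx, key=lambda i: squared_distance(fingerprints[i], centroids[k]))
            )
    return sorted(selected)


# Synthetic stand-in for fingerprint vectors: three groups of 20 points in 2D.
rng = random.Random(42)
centers = [(0.0, 0.0), (5.0, 5.0), (10.0, 0.0)]
toy_fingerprints = [
    [cx + rng.gauss(0, 0.3), cy + rng.gauss(0, 0.3)]
    for cx, cy in centers
    for _ in range(20)
]
selected = kmeans_select(toy_fingerprints, n_select=3)
print("Selected configuration indices:", selected)
```

The selected indices would then define the training subset passed to the potential-fitting step. Because k-means depends on its initialization, repeating the selection with several seeds and comparing the resulting fits is one way to gauge the spread of the outcome.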