⭐ Article highlights: K-means clustering of atomistic fingerprints enables efficient selection of distinct configurations for interatomic potential fitting

k-Means Clustering in Fingerprint-Based Configuration Selection for Fitting Interatomic Potentials

Journal of Chemical Theory and Computation, 2024

Please see the full published article: https://doi.org/10.1021/acs.jctc.4c01225

Highlights:

  • K-means clustering of atomistic fingerprints improves the selection of training configurations from a larger dataset when fitting interatomic potentials.
  • The method outperforms uniform random sampling, achieving lower energy and force errors with fewer training configurations.
  • Only about 30 configurations were sufficient to fit an EAM potential that described the full set of 1800 Ti configurations well.
  • The clustering approach yields more reliable fits, with consistently lower standard deviations than random selection.
  • The fingerprint combines CrystalNN, bond-distance statistics, and the radial distribution function (RDF), enabling the selection of structurally distinct configurations.
  • t-SNE visualization revealed overlap between vacancy and non-vacancy subsets, indicating similar atomic environments and potential data redundancy.
  • Configurations with vacancies could be predicted accurately even when excluded from training, confirming the redundancy suggested by the t-SNE map.
  • The method offers an efficient route to reduce DFT workload and streamline potential fitting workflows.
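
The selection idea in the highlights can be illustrated with a minimal sketch. It assumes each configuration has already been reduced to a fixed-length numeric fingerprint vector; the actual fingerprint construction (CrystalNN, bond-distance statistics, RDF) is not reproduced here. The function names `farthest_first_init`, `kmeans`, and `select_representatives` are hypothetical. Two choices are assumptions of this sketch rather than details taken from the article: the deterministic farthest-first initialisation, and the rule of keeping the configuration closest to each cluster centroid.

```python
import math


def farthest_first_init(points, k):
    """Deterministic initialisation (an assumption of this sketch):
    start from the first point, then repeatedly add the point that is
    farthest from all centroids chosen so far."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means on a list of equal-length feature vectors."""
    centroids = farthest_first_init(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    [sum(col) / len(members) for col in zip(*members)]
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


def select_representatives(fingerprints, k):
    """Return sorted indices of the configuration closest to each centroid.

    Picking the member nearest the centroid is one plausible selection
    rule, assumed here for illustration."""
    labels, centroids = kmeans(fingerprints, k)
    selected = []
    for c in range(k):
        members = [i for i, lab in enumerate(labels) if lab == c]
        if members:
            selected.append(
                min(members, key=lambda i: math.dist(fingerprints[i], centroids[c]))
            )
    return sorted(selected)
```

Calling `select_representatives(fingerprints, k=30)` on a list of fingerprint vectors would return up to 30 indices, one per non-empty cluster, which could then be passed to the fitting step in place of a uniformly random subset. In practice a library implementation such as scikit-learn's `KMeans` would likely be preferred over this hand-written loop.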