Training sets based on uncertainty estimates in the cluster-expansion method

David Kleiven, Jaakko Akola, Andrew A. Peterson, Tejs Vegge, Jin Hyun Chang

Research output: Contribution to journal › Article › Scientific › peer-review



Cluster expansion (CE) has become increasingly popular in recent years, and its applications now extend far beyond its original roots in binary alloys to complex crystalline systems often used in energy materials research. As with other modern machine learning approaches in materials science, many strategies have been proposed for training and fitting CE models to first-principles calculation results. Here, we propose a new strategy for constructing a training set in which structures are selected based on their relevance in Monte Carlo sampling for statistical analysis and reduction of the expected error. The CE model constructed with the proposed approach depends less on the specific details of the training set, which increases the reproducibility of the model. The same method can be applied to other machine learning approaches where it is desirable to sample the relevant configurational space with a small set of training data, which is often the case when the data consist of first-principles calculations.
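
The idea of ranking candidate structures by an uncertainty estimate can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a CE energy model that is linear in the cluster correlation functions, uses bootstrap resampling of the training set to obtain an ensemble of effective cluster interactions (ECIs), and uses the spread of the ensemble predictions as a stand-in for the selection criterion described in the abstract. The function names `bootstrap_eci` and `most_uncertain`, the ridge parameter, and the synthetic data are all hypothetical.

```python
import numpy as np


def bootstrap_eci(X, y, n_boot=200, ridge=1e-8, seed=None):
    """Fit an ensemble of linear models on bootstrap resamples.

    X: (n_structures, n_clusters) correlation functions (assumed given).
    y: (n_structures,) reference energies, e.g. from first principles.
    Returns an (n_boot, n_clusters) array of fitted coefficients (ECIs).
    """
    rng = np.random.default_rng(seed)
    n_structures, n_clusters = X.shape
    ecis = np.empty((n_boot, n_clusters))
    for b in range(n_boot):
        # Resample training structures with replacement.
        idx = rng.integers(0, n_structures, size=n_structures)
        Xb, yb = X[idx], y[idx]
        # Ridge-regularised normal equations keep the solve well posed
        # (the regulariser is an assumption of this sketch).
        A = Xb.T @ Xb + ridge * np.eye(n_clusters)
        ecis[b] = np.linalg.solve(A, Xb.T @ yb)
    return ecis


def most_uncertain(ecis, candidates):
    """Return the index of the candidate with the largest prediction
    spread across the bootstrap ensemble, plus all spreads."""
    preds = candidates @ ecis.T      # shape (n_candidates, n_boot)
    spread = preds.std(axis=1)       # one standard deviation per candidate
    return int(np.argmax(spread)), spread


# Synthetic stand-in data; real inputs would come from a CE code.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(30, 4))
y_train = X_train @ np.array([1.0, -0.5, 0.2, 0.0]) + 0.01 * rng.normal(size=30)
X_candidates = rng.normal(size=(10, 4))

ensemble = bootstrap_eci(X_train, y_train, n_boot=100, seed=1)
next_index, spreads = most_uncertain(ensemble, X_candidates)
```

In such a loop, the selected candidate would be computed from first principles, appended to the training set, and the procedure repeated. In the approach summarised above, the candidates would be drawn from Monte Carlo sampling rather than generated at random as in this sketch.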

Original language: English
Article number: 034012
Number of pages: 12
Journal: JPhys Energy
Issue number: 3
Publication status: Published - 2021
Publication type: A1 Journal article-refereed


Keywords

  • Bootstrapping
  • Cluster expansion
  • Energy materials
  • Machine learning
  • Monte Carlo
  • Phase transition

Publication forum classification

  • Publication forum level 0

ASJC Scopus subject areas

  • Energy(all)
  • Materials Chemistry
  • Materials Science (miscellaneous)

