Machine learning and data science in soft materials engineering

Andrew L. Ferguson

Research output: Contribution to journal, review article, peer-reviewed


In many branches of materials science it is now routine to generate data sets of such large size and dimensionality that conventional methods of analysis fail. Paradigms and tools from data science and machine learning can provide scalable approaches to identify and extract trends and patterns within voluminous data sets, perform guided traversals of high-dimensional phase spaces, and furnish data-driven strategies for inverse materials design. This topical review provides an accessible introduction to machine learning tools in the context of soft and biological materials by 'de-jargonizing' data science terminology, presenting a taxonomy of machine learning techniques, and surveying the mathematical underpinnings and software implementations of popular tools, including principal component analysis, independent component analysis, diffusion maps, support vector machines, and relative entropy. We present illustrative examples of machine learning applications in soft matter, including inverse design of self-assembling materials, nonlinear learning of protein folding landscapes, high-throughput antimicrobial peptide design, and data-driven materials design engines. We close with an outlook on the challenges and opportunities for the field.

Original language: English (US)
Article number: 043002
Journal: Journal of Physics Condensed Matter
Issue number: 4
State: Published - Jan 31 2018

Keywords

  • biological materials
  • data science
  • data-driven design
  • inverse design
  • machine learning
  • soft materials

ASJC Scopus subject areas

  • General Materials Science
  • Condensed Matter Physics

