Breaking the barrier of human-annotated training data for machine learning-aided plant research using aerial imagery

Sebastian Varela, Xuying Zheng, Joyce Njuguna, Erik Sacks, Dylan Allen, Jeremy Ruhter, Andrew D B Leakey

Research output: Contribution to journal › Article › peer-review

Abstract

Machine learning (ML) can accelerate biological research. However, the adoption of such tools to facilitate phenotyping based on sensor data has been limited by (i) the need for a large amount of human-annotated training data for each context in which the tool is used and (ii) phenotypes varying across contexts defined in terms of genetics and environment. This is a major bottleneck because acquiring training data is generally costly and time-consuming. This study demonstrates how an ML approach can address these challenges by minimizing the amount of human supervision needed for tool building. A case study was performed to compare ML approaches that examine images collected by an uncrewed aerial vehicle to determine the presence/absence of panicles (i.e. “heading”) across thousands of field plots containing genetically diverse breeding populations of 2 Miscanthus species. Automated analysis of aerial imagery enabled the identification of heading approximately 9 times faster than in-field visual inspection by humans. Leveraging an Efficiently Supervised Generative Adversarial Network (ESGAN) learning strategy reduced the requirement for human-annotated data by 1 to 2 orders of magnitude compared to traditional, fully supervised learning approaches. The ESGAN model learned the salient features of the data set by using thousands of unlabeled images to inform the discriminative ability of a classifier so that it required minimal human-labeled training data. This method can accelerate the phenotyping of heading date as a measure of flowering time in Miscanthus across diverse contexts (e.g. in multistate trials) and opens avenues to promote the broad adoption of ML tools.

Original language: English (US)
Article number: kiaf132
Journal: Plant Physiology
Volume: 197
Issue number: 4
State: Published - Apr 2025

ASJC Scopus subject areas

  • Physiology
  • Genetics
  • Plant Science
