Visual recognition software for binary classification and its application to spruce pollen identification

David K. Tcheng, Ashwin K. Nayak, Charless C. Fowlkes, Surangi W. Punyasena

Research output: Contribution to journalArticle

Abstract

Discriminating between black and white spruce (Picea mariana and Picea glauca) is a difficult palynological classification problem that, if solved, would provide valuable data for paleoclimate reconstructions. We developed an open-source visual recognition software (ARLO, Automated Recognition with Layered Optimization) capable of differentiating between these two species at an accuracy on par with human experts. The system applies pattern recognition and machine learning to the analysis of pollen images and discovers general-purpose image features, defined by simple features of lines and grids of pixels taken at different dimensions, size, spacing, and resolution. It adapts to a given problem by searching for the most effective combination of both feature representation and learning strategy. This results in a powerful and flexible framework for image classification. We worked with images acquired using an automated slide scanner. We first applied a hash-based "pollen spotting" model to segment pollen grains from the slide background. We next tested ARLO's ability to reconstruct black to white spruce pollen ratios using artificially constructed slides of known ratios. We then developed a more scalable hash-based method of image analysis that was able to distinguish between the pollen of black and white spruce with an estimated accuracy of 83.61%, comparable to human expert performance. Our results demonstrate the capability of machine learning systems to automate challenging taxonomic classifications in pollen analysis, and our success with simple image representations suggests that our approach is generalizable to many other object recognition problems.

Original languageEnglish (US)
Article numbere0148879
JournalPloS one
Volume11
Issue number2
DOIs
StatePublished - Feb 2016

Fingerprint

Pollen
Learning systems
Picea
Software
Picea glauca
pollen
Picea mariana
Pattern recognition systems
artificial intelligence
Image classification
Object recognition
Image analysis
Automated Pattern Recognition
Pixels
Metrorrhagia
Aptitude
scanners
Recognition (Psychology)
learning
Learning

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

Visual recognition software for binary classification and its application to spruce pollen identification. / Tcheng, David K.; Nayak, Ashwin K.; Fowlkes, Charless C.; Punyasena, Surangi W.

In: PloS one, Vol. 11, No. 2, e0148879, 02.2016.

Research output: Contribution to journalArticle

@article{0d045266476641a095b37f2b09b64201,
title = "Visual recognition software for binary classification and its application to spruce pollen identification",
abstract = "Discriminating between black and white spruce (Picea mariana and Picea glauca) is a difficult palynological classification problem that, if solved, would provide valuable data for paleoclimate reconstructions. We developed an open-source visual recognition software (ARLO, Automated Recognition with Layered Optimization) capable of differentiating between these two species at an accuracy on par with human experts. The system applies pattern recognition and machine learning to the analysis of pollen images and discovers general-purpose image features, defined by simple features of lines and grids of pixels taken at different dimensions, size, spacing, and resolution. It adapts to a given problem by searching for the most effective combination of both feature representation and learning strategy. This results in a powerful and flexible framework for image classification. We worked with images acquired using an automated slide scanner. We first applied a hash-based {"}pollen spotting{"} model to segment pollen grains from the slide background. We next tested ARLO's ability to reconstruct black to white spruce pollen ratios using artificially constructed slides of known ratios. We then developed a more scalable hash-based method of image analysis that was able to distinguish between the pollen of black and white spruce with an estimated accuracy of 83.61{\%}, comparable to human expert performance. Our results demonstrate the capability of machine learning systems to automate challenging taxonomic classifications in pollen analysis, and our success with simple image representations suggests that our approach is generalizable to many other object recognition problems.",
author = "Tcheng, {David K.} and Nayak, {Ashwin K.} and Fowlkes, {Charless C.} and Punyasena, {Surangi W.}",
year = "2016",
month = "2",
doi = "10.1371/journal.pone.0148879",
language = "English (US)",
volume = "11",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "2",

}

TY - JOUR

T1 - Visual recognition software for binary classification and its application to spruce pollen identification

AU - Tcheng, David K.

AU - Nayak, Ashwin K.

AU - Fowlkes, Charless C.

AU - Punyasena, Surangi W.

PY - 2016/2

Y1 - 2016/2

N2 - Discriminating between black and white spruce (Picea mariana and Picea glauca) is a difficult palynological classification problem that, if solved, would provide valuable data for paleoclimate reconstructions. We developed an open-source visual recognition software (ARLO, Automated Recognition with Layered Optimization) capable of differentiating between these two species at an accuracy on par with human experts. The system applies pattern recognition and machine learning to the analysis of pollen images and discovers general-purpose image features, defined by simple features of lines and grids of pixels taken at different dimensions, size, spacing, and resolution. It adapts to a given problem by searching for the most effective combination of both feature representation and learning strategy. This results in a powerful and flexible framework for image classification. We worked with images acquired using an automated slide scanner. We first applied a hash-based "pollen spotting" model to segment pollen grains from the slide background. We next tested ARLO's ability to reconstruct black to white spruce pollen ratios using artificially constructed slides of known ratios. We then developed a more scalable hash-based method of image analysis that was able to distinguish between the pollen of black and white spruce with an estimated accuracy of 83.61%, comparable to human expert performance. Our results demonstrate the capability of machine learning systems to automate challenging taxonomic classifications in pollen analysis, and our success with simple image representations suggests that our approach is generalizable to many other object recognition problems.

AB - Discriminating between black and white spruce (Picea mariana and Picea glauca) is a difficult palynological classification problem that, if solved, would provide valuable data for paleoclimate reconstructions. We developed an open-source visual recognition software (ARLO, Automated Recognition with Layered Optimization) capable of differentiating between these two species at an accuracy on par with human experts. The system applies pattern recognition and machine learning to the analysis of pollen images and discovers general-purpose image features, defined by simple features of lines and grids of pixels taken at different dimensions, size, spacing, and resolution. It adapts to a given problem by searching for the most effective combination of both feature representation and learning strategy. This results in a powerful and flexible framework for image classification. We worked with images acquired using an automated slide scanner. We first applied a hash-based "pollen spotting" model to segment pollen grains from the slide background. We next tested ARLO's ability to reconstruct black to white spruce pollen ratios using artificially constructed slides of known ratios. We then developed a more scalable hash-based method of image analysis that was able to distinguish between the pollen of black and white spruce with an estimated accuracy of 83.61%, comparable to human expert performance. Our results demonstrate the capability of machine learning systems to automate challenging taxonomic classifications in pollen analysis, and our success with simple image representations suggests that our approach is generalizable to many other object recognition problems.

UR - http://www.scopus.com/inward/record.url?scp=84959387721&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84959387721&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0148879

DO - 10.1371/journal.pone.0148879

M3 - Article

C2 - 26867017

AN - SCOPUS:84959387721

VL - 11

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 2

M1 - e0148879

ER -