A 1000-word vocabulary, speaker-independent, continuous live-mode speech recognizer implemented in a single FPGA

Edward C. Lin, Kai Yu, Rob A. Rutenbar, Tsuhan Chen

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution

Abstract

The Carnegie Mellon In Silico Vox project seeks to move best-quality speech recognition technology from its current software-only form into a range of efficient all-hardware implementations. The central thesis is that, as with graphics chips, the application is simply too performance-hungry, and too power-sensitive, to stay as a large software application. As a first step in this direction, we describe the design and implementation of a fully functional speech-to-text recognizer on a single Xilinx XUP platform. The design recognizes a 1000-word vocabulary, is speaker-independent, recognizes continuous (connected) speech, and is a "live mode" engine, wherein recognition can start as soon as speech input appears. To the best of our knowledge, this is the most complex recognizer architecture ever fully committed to a hardware-only form. The implementation is extraordinarily small, and achieves the same accuracy as state-of-the-art software recognizers, while running at a fraction of the clock speed.

Original language: English (US)
Title of host publication: FPGA 2007
Subtitle of host publication: Fifteenth ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
Pages: 60-68
Number of pages: 9
State: Published - 2007
Externally published: Yes
Event: FPGA 2007: Fifteenth ACM/SIGDA International Symposium on Field-Programmable Gate Arrays - Monterey, CA, United States
Duration: Feb 18 2007 to Feb 20 2007

Publication series

Name: ACM/SIGDA International Symposium on Field Programmable Gate Arrays - FPGA

Other

Other: FPGA 2007: Fifteenth ACM/SIGDA International Symposium on Field-Programmable Gate Arrays
Country/Territory: United States
City: Monterey, CA
Period: 2/18/07 to 2/20/07

Keywords

  • DSP
  • FPGA
  • In silico vox
  • Speech recognition

ASJC Scopus subject areas

  • Computer Science (all)
