Abstract
We describe a large audio-visual speech corpus recorded in a car environment, together with the equipment and procedures used to build it. Data are collected through a multi-sensory array consisting of eight microphones on the sun visor and four video cameras on the dashboard. The script for the corpus consists of four categories: isolated digits, isolated letters, phone numbers, and sentences, all in English. The speakers, 50 male and 50 female, come from various language backgrounds. To vary the signal-to-noise ratio, each script is recorded under five noise conditions: idling, driving at 35 mph with windows open and closed, and driving at 55 mph with windows open and closed. The corpus is available through <http://www.ifp.uiuc.edu/speech/AVICAR/>.
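As an illustration only, the recording design described in the abstract can be enumerated in a short Python sketch. All identifier names are hypothetical and are not part of the corpus distribution. The sketch also assumes that every script category is paired with every noise condition, which the abstract suggests but does not state explicitly.

```python
from itertools import product

# Figures below come from the abstract; the names are hypothetical.
SCRIPT_CATEGORIES = ["isolated digits", "isolated letters", "phone numbers", "sentences"]
NUM_MICROPHONES = 8       # microphone array on the sun visor
NUM_CAMERAS = 4           # video cameras on the dashboard
NUM_SPEAKERS = 50 + 50    # 50 male and 50 female


def noise_conditions():
    """Return the five noise conditions: idling, plus each speed with windows open and closed."""
    conditions = ["idling"]
    for speed_mph, windows in product((35, 55), ("open", "closed")):
        conditions.append(f"{speed_mph} mph, windows {windows}")
    return conditions


def recording_grid():
    """Pair every script category with every noise condition (an assumption, see above)."""
    return list(product(SCRIPT_CATEGORIES, noise_conditions()))


if __name__ == "__main__":
    for category, condition in recording_grid():
        print(f"{category}: {condition}")
```

Under these assumptions the grid contains one entry per category and noise condition pair; the actual file layout of the corpus may differ.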
Original language | English (US) |
---|---|
Pages | 2489-2492 |
Number of pages | 4 |
State | Published - 2004 |
Event | 8th International Conference on Spoken Language Processing, ICSLP 2004 - Jeju, Jeju Island, Korea, Republic of; Duration: Oct 4 2004 → Oct 8 2004 |
Other
Other | 8th International Conference on Spoken Language Processing, ICSLP 2004 |
---|---|
Country/Territory | Korea, Republic of |
City | Jeju, Jeju Island |
Period | 10/4/04 → 10/8/04 |
ASJC Scopus subject areas
- Language and Linguistics
- Linguistics and Language