Aligning ASL for statistical translation using a discriminative word model

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists of joined (rather than isolated) signs with unknown word boundaries. We start with windows known to contain an example of a word, but not limited to it. We estimate the start and end of the word in these examples using a voting method. This provides a small number of training examples (typically three per word). Since there is no shared structure, we use a discriminative rather than a generative word model. While our word spotters are not perfect, they are sufficient to establish an alignment. We demonstrate that quite small numbers of good word spotters result in an alignment good enough to produce simple English-ASL translations, both by phrase matching and using word substitution.

Original language: English (US)
Title of host publication: Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006
Pages: 1471-1476
Number of pages: 6
DOI: https://doi.org/10.1109/CVPR.2006.51
State: Published - Dec 22 2006
Event: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006 - New York, NY, United States
Duration: Jun 17 2006 - Jun 22 2006

Publication series

Name: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2
ISSN (Print): 1063-6919

Other

Other: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006
Country: United States
City: New York, NY
Period: 6/17/06 - 6/22/06

Keywords

  • Action analysis and recognition
  • Applications of vision
  • Image and video retrieval
  • Object recognition

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Farhadi, A., & Forsyth, D. A. (2006). Aligning ASL for statistical translation using a discriminative word model. In Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006 (pp. 1471-1476). [1640930] (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2). https://doi.org/10.1109/CVPR.2006.51

Aligning ASL for statistical translation using a discriminative word model. / Farhadi, Ali; Forsyth, David Alexander.

Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006. 2006. p. 1471-1476 1640930 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition; Vol. 2).

Farhadi, A & Forsyth, DA 2006, Aligning ASL for statistical translation using a discriminative word model. in Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006., 1640930, Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 1471-1476, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006, New York, NY, United States, 6/17/06. https://doi.org/10.1109/CVPR.2006.51

Farhadi A, Forsyth DA. Aligning ASL for statistical translation using a discriminative word model. In Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006. 2006. p. 1471-1476. 1640930. (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition). https://doi.org/10.1109/CVPR.2006.51

Farhadi, Ali ; Forsyth, David Alexander. / Aligning ASL for statistical translation using a discriminative word model. Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006. 2006. pp. 1471-1476 (Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition).

@inproceedings{9c91195eaedf4f058187359eb4d519d1,
title = "Aligning ASL for statistical translation using a discriminative word model",
abstract = "We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists of joined (rather than isolated) signs with unknown word boundaries. We start with windows known to contain an example of a word, but not limited to it. We estimate the start and end of the word in these examples using a voting method. This provides a small number of training examples (typically three per word). Since there is no shared structure, we use a discriminative rather than a generative word model. While our word spotters are not perfect, they are sufficient to establish an alignment. We demonstrate that quite small numbers of good word spotters result in an alignment good enough to produce simple English-ASL translations, both by phrase matching and using word substitution.",
keywords = "Action analysis and recognition, Applications of vision, Image and video retrieval, Object recognition",
author = "Farhadi, Ali and Forsyth, David Alexander",
year = "2006",
month = "12",
day = "22",
doi = "10.1109/CVPR.2006.51",
language = "English (US)",
isbn = "0769525970",
series = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
pages = "1471--1476",
booktitle = "Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006",
volume = "2",
}

TY  - GEN
T1  - Aligning ASL for statistical translation using a discriminative word model
AU  - Farhadi, Ali
AU  - Forsyth, David Alexander
PY  - 2006/12/22
Y1  - 2006/12/22
N2  - We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists of joined (rather than isolated) signs with unknown word boundaries. We start with windows known to contain an example of a word, but not limited to it. We estimate the start and end of the word in these examples using a voting method. This provides a small number of training examples (typically three per word). Since there is no shared structure, we use a discriminative rather than a generative word model. While our word spotters are not perfect, they are sufficient to establish an alignment. We demonstrate that quite small numbers of good word spotters result in an alignment good enough to produce simple English-ASL translations, both by phrase matching and using word substitution.
AB  - We describe a method to align ASL video subtitles with a closed-caption transcript. Our alignments are partial, based on spotting words within the video sequence, which consists of joined (rather than isolated) signs with unknown word boundaries. We start with windows known to contain an example of a word, but not limited to it. We estimate the start and end of the word in these examples using a voting method. This provides a small number of training examples (typically three per word). Since there is no shared structure, we use a discriminative rather than a generative word model. While our word spotters are not perfect, they are sufficient to establish an alignment. We demonstrate that quite small numbers of good word spotters result in an alignment good enough to produce simple English-ASL translations, both by phrase matching and using word substitution.
KW  - Action analysis and recognition
KW  - Applications of vision
KW  - Image and video retrieval
KW  - Object recognition
UR  - http://www.scopus.com/inward/record.url?scp=33845569392&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=33845569392&partnerID=8YFLogxK
U2  - 10.1109/CVPR.2006.51
DO  - 10.1109/CVPR.2006.51
M3  - Conference contribution
AN  - SCOPUS:33845569392
SN  - 0769525970
SN  - 9780769525976
T3  - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
SP  - 1471
EP  - 1476
BT  - Proceedings - 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2006
ER  -