Who's in the picture?

Tamara L. Berg, Alexander C. Berg, Jaety Edwards, David Alexander Forsyth

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The context in which a name appears in a caption provides powerful cues as to who is depicted in the associated image. We obtain 44,773 face images, using a face detector, from approximately half a million captioned news images and automatically link names, obtained using a named entity recognizer, with these faces. A simple clustering method can produce fair results. We improve these results significantly by combining the clustering process with a model of the probability that an individual is depicted given its context. Once the labeling procedure is over, we have an accurately labeled set of faces, an appearance model for each individual depicted, and a natural language model that can produce accurate results on captions in isolation.

Original language: English (US)
Title of host publication: Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004
Publisher: Neural information processing systems foundation
ISBN (Print): 0262195348, 9780262195348
State: Published - Jan 1 2005
Externally published: Yes
Event: 18th Annual Conference on Neural Information Processing Systems, NIPS 2004 - Vancouver, BC, Canada
Duration: Dec 13 2004 to Dec 16 2004

Publication series

Name: Advances in Neural Information Processing Systems
ISSN (Print): 1049-5258

Other

Other: 18th Annual Conference on Neural Information Processing Systems, NIPS 2004
Country: Canada
City: Vancouver, BC
Period: 12/13/04 to 12/16/04

Fingerprint

Labeling
Detectors

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Berg, T. L., Berg, A. C., Edwards, J., & Forsyth, D. A. (2005). Who's in the picture? In Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004 (Advances in Neural Information Processing Systems). Neural information processing systems foundation.

Who's in the picture? / Berg, Tamara L.; Berg, Alexander C.; Edwards, Jaety; Forsyth, David Alexander.

Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004. Neural information processing systems foundation, 2005. (Advances in Neural Information Processing Systems).

Berg, TL, Berg, AC, Edwards, J & Forsyth, DA 2005, Who's in the picture? in Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004. Advances in Neural Information Processing Systems, Neural information processing systems foundation, 18th Annual Conference on Neural Information Processing Systems, NIPS 2004, Vancouver, BC, Canada, 12/13/04.
Berg TL, Berg AC, Edwards J, Forsyth DA. Who's in the picture? In Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004. Neural information processing systems foundation. 2005. (Advances in Neural Information Processing Systems).
Berg, Tamara L.; Berg, Alexander C.; Edwards, Jaety; Forsyth, David Alexander. / Who's in the picture? Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004. Neural information processing systems foundation, 2005. (Advances in Neural Information Processing Systems).
@inproceedings{36791da1b9b846789cea7ecff29feac8,
title = "Who's in the picture?",
abstract = "The context in which a name appears in a caption provides powerful cues as to who is depicted in the associated image. We obtain 44,773 face images, using a face detector, from approximately half a million captioned news images and automatically link names, obtained using a named entity recognizer, with these faces. A simple clustering method can produce fair results. We improve these results significantly by combining the clustering process with a model of the probability that an individual is depicted given its context. Once the labeling procedure is over, we have an accurately labeled set of faces, an appearance model for each individual depicted, and a natural language model that can produce accurate results on captions in isolation.",
author = "Berg, {Tamara L.} and Berg, {Alexander C.} and Edwards, Jaety and Forsyth, {David Alexander}",
year = "2005",
month = "1",
day = "1",
language = "English (US)",
isbn = "0262195348",
series = "Advances in Neural Information Processing Systems",
publisher = "Neural information processing systems foundation",
booktitle = "Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004"
}

TY - GEN
T1 - Who's in the picture?
AU - Berg, Tamara L.
AU - Berg, Alexander C.
AU - Edwards, Jaety
AU - Forsyth, David Alexander
PY - 2005/1/1
Y1 - 2005/1/1

AB - The context in which a name appears in a caption provides powerful cues as to who is depicted in the associated image. We obtain 44,773 face images, using a face detector, from approximately half a million captioned news images and automatically link names, obtained using a named entity recognizer, with these faces. A simple clustering method can produce fair results. We improve these results significantly by combining the clustering process with a model of the probability that an individual is depicted given its context. Once the labeling procedure is over, we have an accurately labeled set of faces, an appearance model for each individual depicted, and a natural language model that can produce accurate results on captions in isolation.

UR - http://www.scopus.com/inward/record.url?scp=84898929216&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84898929216&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84898929216
SN - 0262195348
SN - 9780262195348
T3 - Advances in Neural Information Processing Systems
BT - Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004
PB - Neural information processing systems foundation
ER -