Names and faces in the news

Tamara L. Berg, Alexander C. Berg, Jaety Edwards, Michael Maire, Ryan White, Yee Whye Teh, Erik Learned-Miller, D. A. Forsyth

Research output: Contribution to journal > Conference article

Abstract

We show quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured "in the wild" in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Each face image is associated with a set of names, automatically extracted from the associated caption. Many, but not all such sets contain the correct name. We cluster face images in appropriate discriminant coordinates. We use a clustering procedure to break ambiguities in labelling and identify incorrectly labelled faces. A merging procedure then identifies variants of names that refer to the same individual. The resulting representation can be used to label faces in news images or to organize news pictures by individuals present. An alternative view of our procedure is as a process that cleans up noisy supervised data. We demonstrate how to use entropy measures to evaluate such procedures.
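The abstract does not say which entropy measure is used. As a rough illustration only, and not the authors' procedure, one common way to score a clustering against known identities is the size-weighted Shannon entropy of the true labels inside each cluster: 0 bits means every cluster holds a single person, and larger values mean more mixing. The function name and the toy labels below are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cluster_label_entropy(cluster_ids, true_labels):
    """Size-weighted mean Shannon entropy (in bits) of true labels per cluster.

    Illustrative assumption: this is a generic purity-style score, not the
    specific measure defined in the paper.
    """
    if len(cluster_ids) != len(true_labels):
        raise ValueError("cluster_ids and true_labels must have equal length")
    total = len(true_labels)
    if total == 0:
        return 0.0

    # Group the true labels by the cluster each face was assigned to.
    by_cluster = defaultdict(list)
    for cluster, label in zip(cluster_ids, true_labels):
        by_cluster[cluster].append(label)

    weighted = 0.0
    for members in by_cluster.values():
        n = len(members)
        # Entropy of the label distribution inside this cluster.
        h = -sum((k / n) * math.log2(k / n) for k in Counter(members).values())
        # Weight each cluster by its share of all faces.
        weighted += (n / total) * h
    return weighted
```

For example, with hypothetical labels, `cluster_label_entropy([0, 0, 1, 1], ["person_a", "person_b", "person_c", "person_c"])` scores one evenly mixed cluster and one pure cluster, so a cleanup procedure that lowers this score has made the clusters more consistent.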

Original language: English (US)
Journal: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition
Volume: 2
State: Published - Oct 19 2004
Externally published: Yes
Event: Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR 2004 - Washington, DC, United States
Duration: Jun 27 2004 to Jul 2 2004

Fingerprint

Face recognition
Merging
Labeling
Labels
Entropy
Lighting
Cameras
Color

ASJC Scopus subject areas

  • Software
  • Computer Vision and Pattern Recognition

Cite this

Names and faces in the news. / Berg, Tamara L.; Berg, Alexander C.; Edwards, Jaety; Maire, Michael; White, Ryan; Teh, Yee Whye; Learned-Miller, Erik; Forsyth, D. A.

In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2, 19.10.2004.

Berg, TL, Berg, AC, Edwards, J, Maire, M, White, R, Teh, YW, Learned-Miller, E & Forsyth, DA 2004, 'Names and faces in the news', Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2.
Berg, Tamara L. ; Berg, Alexander C. ; Edwards, Jaety ; Maire, Michael ; White, Ryan ; Teh, Yee Whye ; Learned-Miller, Erik ; Forsyth, D. A. / Names and faces in the news. In: Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. 2004 ; Vol. 2.
@article{ccd6e31c46d9463088746030c73fb676,
title = "Names and faces in the news",
abstract = "We show quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured {"}in the wild{"} in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Each face image is associated with a set of names, automatically extracted from the associated caption. Many, but not all such sets contain the correct name. We cluster face images in appropriate discriminant coordinates. We use a clustering procedure to break ambiguities in labelling and identify incorrectly labelled faces. A merging procedure then identifies variants of names that refer to the same individual. The resulting representation can be used to label faces in news images or to organize news pictures by individuals present. An alternative view of our procedure is as a process that cleans up noisy supervised data. We demonstrate how to use entropy measures to evaluate such procedures.",
author = "Berg, {Tamara L.} and Berg, {Alexander C.} and Jaety Edwards and Michael Maire and Ryan White and Teh, {Yee Whye} and Erik Learned-Miller and Forsyth, {D. A.}",
year = "2004",
month = "10",
day = "19",
language = "English (US)",
volume = "2",
journal = "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition",
issn = "1063-6919",
publisher = "IEEE Computer Society",

}

TY - JOUR

T1 - Names and faces in the news

AU - Berg, Tamara L.

AU - Berg, Alexander C.

AU - Edwards, Jaety

AU - Maire, Michael

AU - White, Ryan

AU - Teh, Yee Whye

AU - Learned-Miller, Erik

AU - Forsyth, D. A.

PY - 2004/10/19

Y1 - 2004/10/19

N2 - We show quite good face clustering is possible for a dataset of inaccurately and ambiguously labelled face images. Our dataset is 44,773 face images, obtained by applying a face finder to approximately half a million captioned news images. This dataset is more realistic than usual face recognition datasets, because it contains faces captured "in the wild" in a variety of configurations with respect to the camera, taking a variety of expressions, and under illumination of widely varying color. Each face image is associated with a set of names, automatically extracted from the associated caption. Many, but not all such sets contain the correct name. We cluster face images in appropriate discriminant coordinates. We use a clustering procedure to break ambiguities in labelling and identify incorrectly labelled faces. A merging procedure then identifies variants of names that refer to the same individual. The resulting representation can be used to label faces in news images or to organize news pictures by individuals present. An alternative view of our procedure is as a process that cleans up noisy supervised data. We demonstrate how to use entropy measures to evaluate such procedures.

UR - http://www.scopus.com/inward/record.url?scp=5044236741&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=5044236741&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:5044236741

VL - 2

JO - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

JF - Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

SN - 1063-6919

ER -