It's all about the data

Tamara L. Berg, Alexander Sorokin, Gang Wang, David Alexander Forsyth, Derek W. Hoiem, Ian Endres, Ali Farhadi

Research output: Contribution to journal › Article

Abstract

Modern computer vision research consumes labelled data in quantity, and building datasets has become an important activity. The Internet has become a tremendous resource for computer vision researchers. By seeing the Internet as a vast, slightly disorganized collection of visual data, we can build datasets. The key point is that visual data are surrounded by contextual information like text and HTML tags, which is a strong, if noisy, cue to what the visual data means. In a series of case studies, we illustrate how useful this contextual information is. It can be used to build a large and challenging labelled face dataset with no manual intervention. With very small amounts of manual labor, contextual data can be used together with image data to identify pictures of animals. In fact, these contextual data are sufficiently reliable that a very large pool of noisily tagged images can be used as a resource to build image features, which reliably improve on conventional visual features. By seeing the Internet as a marketplace that can connect sellers of annotation services to researchers, we can obtain accurately annotated datasets quickly and cheaply. We describe methods to prepare data, check quality, and set prices for work for this annotation process. The problems posed by attempting to collect very big research datasets are fertile for researchers because collecting datasets requires us to focus on two important questions: What makes a good picture? What is the meaning of a picture?
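
A concrete way to picture the central idea above, that the text and HTML surrounding an image is a strong but noisy cue to its content, is sketched below. The snippet is not taken from the paper; it is a minimal, hypothetical Python illustration (the URL, the concept name, and the naive substring match are all assumptions) of harvesting an image's alt text and nearby page text as a noisy label for that image.

import requests
from bs4 import BeautifulSoup

def noisy_examples(page_url, concept):
    # Download the page and parse its HTML.
    html = requests.get(page_url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    examples = []
    for img in soup.find_all("img"):
        # Contextual cues: the image's alt text plus the text of its parent element.
        alt = img.get("alt") or ""
        parent = img.find_parent()
        nearby = parent.get_text(" ", strip=True) if parent is not None else ""
        context = (alt + " " + nearby).lower()
        # Keep the image as a (noisy) positive example if the concept is mentioned nearby.
        if concept.lower() in context:
            examples.append({"src": img.get("src"), "context": context[:200]})
    return examples

# Hypothetical usage: collect noisy "monkey" examples from a page about monkeys.
# candidates = noisy_examples("https://example.com/monkeys", "monkey")

Labels gathered this way are noisy; the case studies described in the abstract pair such contextual cues with image data and, where needed, small amounts of manual checking.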

Original language: English (US)
Article number: 5464301
Pages (from-to): 1434-1452
Number of pages: 19
Journal: Proceedings of the IEEE
Volume: 98
Issue number: 8
DOIs: https://doi.org/10.1109/JPROC.2009.2032355
State: Published - Aug 1, 2010

Keywords

  • Computer vision
  • Internet

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Cite this

Berg, T. L., Sorokin, A., Wang, G., Forsyth, D. A., Hoiem, D. W., Endres, I., & Farhadi, A. (2010). It's all about the data. Proceedings of the IEEE, 98(8), 1434-1452. [5464301]. https://doi.org/10.1109/JPROC.2009.2032355

@article{de85eadc9612416c994743478e0aabd3,
title = "It's all about the data",
abstract = "Modern computer vision research consumes labelled data in quantity, and building datasets has become an important activity. The Internet has become a tremendous resource for computer vision researchers. By seeing the Internet as a vast, slightly disorganized collection of visual data, we can build datasets. The key point is that visual data are surrounded by contextual information like text and HTML tags, which is a strong, if noisy, cue to what the visual data means. In a series of case studies, we illustrate how useful this contextual information is. It can be used to build a large and challenging labelled face dataset with no manual intervention. With very small amounts of manual labor, contextual data can be used together with image data to identify pictures of animals. In fact, these contextual data are sufficiently reliable that a very large pool of noisily tagged images can be used as a resource to build image features, which reliably improve on conventional visual features. By seeing the Internet as a marketplace that can connect sellers of annotation services to researchers, we can obtain accurately annotated datasets quickly and cheaply. We describe methods to prepare data, check quality, and set prices for work for this annotation process. The problems posed by attempting to collect very big research datasets are fertile for researchers because collecting datasets requires us to focus on two important questions: What makes a good picture? What is the meaning of a picture?",
keywords = "Computer vision, Internet",
author = "Berg, {Tamara L.} and Alexander Sorokin and Gang Wang and Forsyth, {David Alexander} and Hoiem, {Derek W} and Ian Endres and Ali Farhadi",
year = "2010",
month = "8",
day = "1",
doi = "10.1109/JPROC.2009.2032355",
language = "English (US)",
volume = "98",
pages = "1434--1452",
journal = "Proceedings of the IEEE",
issn = "0018-9219",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
number = "8",

}

TY - JOUR
T1 - It's all about the data
AU - Berg, Tamara L.
AU - Sorokin, Alexander
AU - Wang, Gang
AU - Forsyth, David Alexander
AU - Hoiem, Derek W
AU - Endres, Ian
AU - Farhadi, Ali
PY - 2010/8/1
Y1 - 2010/8/1
AB - Modern computer vision research consumes labelled data in quantity, and building datasets has become an important activity. The Internet has become a tremendous resource for computer vision researchers. By seeing the Internet as a vast, slightly disorganized collection of visual data, we can build datasets. The key point is that visual data are surrounded by contextual information like text and HTML tags, which is a strong, if noisy, cue to what the visual data means. In a series of case studies, we illustrate how useful this contextual information is. It can be used to build a large and challenging labelled face dataset with no manual intervention. With very small amounts of manual labor, contextual data can be used together with image data to identify pictures of animals. In fact, these contextual data are sufficiently reliable that a very large pool of noisily tagged images can be used as a resource to build image features, which reliably improve on conventional visual features. By seeing the Internet as a marketplace that can connect sellers of annotation services to researchers, we can obtain accurately annotated datasets quickly and cheaply. We describe methods to prepare data, check quality, and set prices for work for this annotation process. The problems posed by attempting to collect very big research datasets are fertile for researchers because collecting datasets requires us to focus on two important questions: What makes a good picture? What is the meaning of a picture?
KW - Computer vision
KW - Internet
UR - http://www.scopus.com/inward/record.url?scp=77954864888&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77954864888&partnerID=8YFLogxK
U2 - 10.1109/JPROC.2009.2032355
DO - 10.1109/JPROC.2009.2032355
M3 - Article
AN - SCOPUS:77954864888
VL - 98
SP - 1434
EP - 1452
JO - Proceedings of the IEEE
JF - Proceedings of the IEEE
SN - 0018-9219
IS - 8
M1 - 5464301
ER -