TY - GEN
T1 - Every picture tells a story
T2 - 11th European Conference on Computer Vision, ECCV 2010
AU - Farhadi, Ali
AU - Hejrati, Mohsen
AU - Sadeghi, Mohammad Amin
AU - Young, Peter
AU - Rashtchian, Cyrus
AU - Hockenmaier, Julia
AU - Forsyth, David
PY - 2010
Y1 - 2010
N2 - Humans can prepare concise descriptions of pictures, focusing on what they find important. We demonstrate that automatic methods can do so too. We describe a system that can compute a score linking an image to a sentence. This score can be used to attach a descriptive sentence to a given image, or to obtain images that illustrate a given sentence. The score is obtained by comparing an estimate of meaning obtained from the image to one obtained from the sentence. Each estimate of meaning comes from a discriminative procedure that is learned using data. We evaluate on a novel dataset consisting of human-annotated images. While our underlying estimate of meaning is impoverished, it is sufficient to produce very good quantitative results, evaluated with a novel score that can account for synecdoche.
UR - http://www.scopus.com/inward/record.url?scp=78149311145&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=78149311145&partnerID=8YFLogxK
U2 - 10.1007/978-3-642-15561-1_2
DO - 10.1007/978-3-642-15561-1_2
M3 - Conference contribution
AN - SCOPUS:78149311145
SN - 364215560X
SN - 9783642155604
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 15
EP - 29
BT - Computer Vision, ECCV 2010 - 11th European Conference on Computer Vision, Proceedings
PB - Springer
Y2 - 10 September 2010 through 11 September 2010
ER -