TY - GEN
T1 - Constructing an anonymous dataset from the personal digital photo libraries of Mac App Store users
AU - Gozali, Jesse Prabawa
AU - Kan, Min Yen
AU - Sundaram, Hari
PY - 2013
Y1 - 2013
N2 - Personal digital photo libraries embody a large amount of information useful for research into photo organization, photo layout, and development of novel photo browser features. Even when anonymity can be ensured, amassing a sizable dataset from these libraries is still difficult due to the visibility and cost that such a study would require. We explore using the Mac App Store to reach more users and collect data from such personal digital photo libraries. More specifically, we compare and discuss how it differs from common data collection methods, e.g., Amazon Mechanical Turk, in terms of time, cost, quantity, and design of the data collection application. We have collected a large, openly available photo feature dataset in this manner, and we illustrate the types of data that can be collected. In 60 days, we collected data from 20,778 photo sets (473,772 photos). Our study with the Mac App Store suggests that popular application distribution channels are a viable means for researchers to acquire massive data collections.
AB - Personal digital photo libraries embody a large amount of information useful for research into photo organization, photo layout, and development of novel photo browser features. Even when anonymity can be ensured, amassing a sizable dataset from these libraries is still difficult due to the visibility and cost that such a study would require. We explore using the Mac App Store to reach more users and collect data from such personal digital photo libraries. More specifically, we compare and discuss how it differs from common data collection methods, e.g., Amazon Mechanical Turk, in terms of time, cost, quantity, and design of the data collection application. We have collected a large, openly available photo feature dataset in this manner, and we illustrate the types of data that can be collected. In 60 days, we collected data from 20,778 photo sets (473,772 photos). Our study with the Mac App Store suggests that popular application distribution channels are a viable means for researchers to acquire massive data collections.
KW - Crowd-sourcing
KW - Data collection
KW - Ground truth
KW - Personal digital library
KW - Photography
UR - http://www.scopus.com/inward/record.url?scp=84882254199&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84882254199&partnerID=8YFLogxK
U2 - 10.1145/2467696.2467730
DO - 10.1145/2467696.2467730
M3 - Conference contribution
AN - SCOPUS:84882254199
SN - 9781450320764
T3 - Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
SP - 305
EP - 308
BT - JCDL 2013 - Proceedings of the 13th ACM/IEEE-CS Joint Conference on Digital Libraries
T2 - 13th ACM/IEEE-CS Joint Conference on Digital Libraries, JCDL 2013
Y2 - 22 July 2013 through 26 July 2013
ER -