TY - GEN
T1 - Crowd-powered find algorithms
AU - Das Sarma, Anish
AU - Parameswaran, Aditya
AU - Garcia-Molina, Hector
AU - Halevy, Alon
PY - 2014
Y1 - 2014
N2 - We consider the problem of using humans to find a bounded number of items satisfying certain properties, from a data set. For instance, we may want humans to identify a select number of travel photos from a data set of photos to display on a travel website, or a candidate set of resumes that meet certain requirements from a large pool of applicants. Since data sets can be enormous, and since monetary cost and latency of data processing with humans can be large, optimizing the use of humans for finding items is an important challenge. We formally define the problem using the metrics of cost and time, and design optimal algorithms that span the skyline of cost and time, i.e., we provide designers the ability to control the cost vs. time trade-off. We study the deterministic as well as error-prone human answer settings, along with multiplicative and additive approximations. Lastly, we study how we may design algorithms with specific expected cost and time measures.
AB - We consider the problem of using humans to find a bounded number of items satisfying certain properties, from a data set. For instance, we may want humans to identify a select number of travel photos from a data set of photos to display on a travel website, or a candidate set of resumes that meet certain requirements from a large pool of applicants. Since data sets can be enormous, and since monetary cost and latency of data processing with humans can be large, optimizing the use of humans for finding items is an important challenge. We formally define the problem using the metrics of cost and time, and design optimal algorithms that span the skyline of cost and time, i.e., we provide designers the ability to control the cost vs. time trade-off. We study the deterministic as well as error-prone human answer settings, along with multiplicative and additive approximations. Lastly, we study how we may design algorithms with specific expected cost and time measures.
UR - http://www.scopus.com/inward/record.url?scp=84901813373&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84901813373&partnerID=8YFLogxK
U2 - 10.1109/ICDE.2014.6816715
DO - 10.1109/ICDE.2014.6816715
M3 - Conference contribution
AN - SCOPUS:84901813373
SN - 9781479925544
T3 - Proceedings - International Conference on Data Engineering
SP - 964
EP - 975
BT - 2014 IEEE 30th International Conference on Data Engineering, ICDE 2014
PB - IEEE Computer Society
T2 - 30th IEEE International Conference on Data Engineering, ICDE 2014
Y2 - 31 March 2014 through 4 April 2014
ER -