TY - CONF
T1 - Towards efficient data valuation based on the Shapley value
AU - Jia, Ruoxi
AU - Dao, David
AU - Wang, Boxin
AU - Hubis, Frances Ann
AU - Hynes, Nick
AU - Gurel, Nezihe Merve
AU - Li, Bo
AU - Zhang, Ce
AU - Song, Dawn
AU - Spanos, Costas
N1 - Funding Information:
This work is supported in part by the Republic of Singapore's National Research Foundation through a grant to the Berkeley Education Alliance for Research in Singapore (BEARS) for the Singapore-Berkeley Building Efficiency and Sustainability in the Tropics (SinBerBEST) Program. This work is also supported in part by the CLTC (Center for Long-Term Cybersecurity); FORCES (Foundations Of Resilient CybEr-Physical Systems), which receives support from the National Science Foundation (NSF award numbers CNS-1238959, CNS-1238962, CNS-1239054, CNS-1239166); and the National Science Foundation under Grant No. TWC-1518899. CZ and the DS3Lab gratefully acknowledge the support from Mercedes-Benz Research & Development NA, MeteoSwiss, Oracle Labs, Swiss Data Science Center, Swisscom, Zurich Insurance, Chinese Scholarship Council, and the Department of Computer Science at ETH Zurich.
Publisher Copyright:
© 2019 by the author(s).
PY - 2020
Y1 - 2020
N2 - “How much is my data worth?” is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value which originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, the Shapley value often requires exponential time to compute. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.
AB - “How much is my data worth?” is an increasingly common question posed by organizations and individuals alike. An answer to this question could allow, for instance, fairly distributing profits among multiple data contributors and determining prospective compensation when data breaches happen. In this paper, we study the problem of data valuation by utilizing the Shapley value, a popular notion of value which originated in cooperative game theory. The Shapley value defines a unique payoff scheme that satisfies many desiderata for the notion of data value. However, the Shapley value often requires exponential time to compute. To meet this challenge, we propose a repertoire of efficient algorithms for approximating the Shapley value. We also demonstrate the value of each training instance for various benchmark datasets.
UR - http://www.scopus.com/inward/record.url?scp=85085061245&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085061245&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85085061245
T2 - 22nd International Conference on Artificial Intelligence and Statistics, AISTATS 2019
Y2 - 16 April 2019 through 18 April 2019
ER -