TY - JOUR
T1 - Dealer
T2 - An end-to-end model marketplace with differential privacy
AU - Liu, Jinfei
AU - Lou, Jian
AU - Liu, Junxu
AU - Xiong, Li
AU - Pei, Jian
AU - Sun, Jimeng
N1 - Funding Information:
This work was supported in part by the NSF grants (CNS-1952192, CNS-2027783, SCH-2014438, PPoSS 2028839, and IIS-1838042), AFOSR under DDDAS program FA9550-12-1-0240, the NSFC grants (61941121 and 91646203), the NSERC Discovery Grant program, and the NIH grants (R01 1R01NS107291-01 and R56HL138415).
Publisher Copyright:
© is held by the owner/author(s).
PY - 2021/2
Y1 - 2021/2
N2 - Data-driven machine learning has become ubiquitous. A marketplace for machine learning models connects data owners and model buyers, and can dramatically facilitate data-driven machine learning applications. In this paper, we take a formal data marketplace perspective and propose the first enD-to-end model marketplace with differential privacy (Dealer) towards answering the following questions: How to formulate data owners’ compensation functions and model buyers’ price functions? How can the broker determine prices for a set of models to maximize revenue with an arbitrage-free guarantee, and train a set of models with maximum Shapley coverage given a manufacturing budget to remain competitive? For the former, we propose a compensation function for each data owner based on Shapley value and privacy sensitivity, and a price function for each model buyer based on Shapley coverage sensitivity and noise sensitivity. Both privacy sensitivity and noise sensitivity are measured by the level of differential privacy. For the latter, we formulate two optimization problems for model pricing and model training, and propose efficient dynamic programming algorithms. Experimental results on the real chess dataset and synthetic datasets justify the design of Dealer and verify the efficiency and effectiveness of the proposed algorithms.
AB - Data-driven machine learning has become ubiquitous. A marketplace for machine learning models connects data owners and model buyers, and can dramatically facilitate data-driven machine learning applications. In this paper, we take a formal data marketplace perspective and propose the first enD-to-end model marketplace with differential privacy (Dealer) towards answering the following questions: How to formulate data owners’ compensation functions and model buyers’ price functions? How can the broker determine prices for a set of models to maximize revenue with an arbitrage-free guarantee, and train a set of models with maximum Shapley coverage given a manufacturing budget to remain competitive? For the former, we propose a compensation function for each data owner based on Shapley value and privacy sensitivity, and a price function for each model buyer based on Shapley coverage sensitivity and noise sensitivity. Both privacy sensitivity and noise sensitivity are measured by the level of differential privacy. For the latter, we formulate two optimization problems for model pricing and model training, and propose efficient dynamic programming algorithms. Experimental results on the real chess dataset and synthetic datasets justify the design of Dealer and verify the efficiency and effectiveness of the proposed algorithms.
UR - http://www.scopus.com/inward/record.url?scp=85102659336&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85102659336&partnerID=8YFLogxK
U2 - 10.14778/3447689.3447700
DO - 10.14778/3447689.3447700
M3 - Article
AN - SCOPUS:85102659336
SN - 2150-8097
VL - 14
SP - 957
EP - 969
JO - Proceedings of the VLDB Endowment
JF - Proceedings of the VLDB Endowment
IS - 6
ER -