TY - GEN
T1 - Proof-of-Contribution-Based Design for Collaborative Machine Learning on Blockchain
AU - Buyukates, Baturalp
AU - He, Chaoyang
AU - Han, Shanshan
AU - Fang, Zhiyong
AU - Zhang, Yupeng
AU - Long, Jieyi
AU - Farahanchi, Ali
AU - Avestimehr, Salman
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - We consider a project (model) owner that would like to train a model by utilizing the local private data and compute power of interested data owners, i.e., trainers. Our goal is to design a data marketplace for such decentralized collaborative/federated learning applications that simultaneously provides i) proof-of-contribution-based reward allocation, so that the trainers are compensated based on their contributions to the trained model; ii) privacy-preserving decentralized model training by avoiding any data movement from data owners; iii) robustness against malicious parties (e.g., trainers aiming to poison the model); iv) verifiability, in the sense that the integrity, i.e., correctness, of all computations in the data market protocol, including contribution assessment and outlier detection, is verifiable through zero-knowledge proofs; and v) an efficient and universal design. We propose a blockchain-based marketplace design to achieve all five objectives. In our design, we utilize a distributed storage infrastructure and an aggregator in addition to the project owner and the trainers. The aggregator is a processing node that performs certain computations, including assessing trainer contributions, removing outliers, and updating hyper-parameters. We execute the proposed data market through a blockchain smart contract. The deployed smart contract ensures that the project owner cannot evade payment and that honest trainers are rewarded based on their contributions at the end of training. Finally, we implement the building blocks of the proposed data market and demonstrate their applicability in practical scenarios through extensive experiments.
KW - blockchain
KW - collaborative machine learning
KW - contribution assessment
KW - data markets
KW - zero-knowledge proof
UR - http://www.scopus.com/inward/record.url?scp=85172861049&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85172861049&partnerID=8YFLogxK
U2 - 10.1109/DAPPS57946.2023.00012
DO - 10.1109/DAPPS57946.2023.00012
M3 - Conference contribution
AN - SCOPUS:85172861049
T3 - Proceedings - 2023 IEEE International Conference on Decentralized Applications and Infrastructures, DAPPS 2023
SP - 13
EP - 22
BT - Proceedings - 2023 IEEE International Conference on Decentralized Applications and Infrastructures, DAPPS 2023
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 5th IEEE International Conference on Decentralized Applications and Infrastructures, DAPPS 2023
Y2 - 17 July 2023 through 20 July 2023
ER -