TY - GEN
T1 - Aggregation for Sensitive Data
AU - Bhowmik, Avradeep
AU - Ghosh, Joydeep
AU - Koyejo, Oluwasanmi
N1 - Publisher Copyright: © 2019 IEEE.
PY - 2019/7
Y1 - 2019/7
AB - In many modern applications, considerations like privacy, security and legal doctrines like the GDPR put limitations on data storage and sharing with third parties. Specifically, access to individual level data points is restricted and machine learning models need to be trained with aggregated versions of the datasets. Learning with aggregated data is a new and relatively unexplored form of semi-supervision. We tackle this problem by designing aggregation paradigms that conform to certain kinds of privacy or non-identifiability requirements. We further develop novel learning algorithms that can nevertheless be used to learn from only these aggregates. We motivate our framework for the case of Gaussian regression, and subsequently extend our techniques to subsume arbitrary binary classifiers and generalised linear models. We provide theoretical results and empirical evaluation of our methods on real data from healthcare and telecom.
UR - http://www.scopus.com/inward/record.url?scp=85082879545&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85082879545&partnerID=8YFLogxK
U2 - 10.1109/SampTA45681.2019.9030955
DO - 10.1109/SampTA45681.2019.9030955
M3 - Conference contribution
AN - SCOPUS:85082879545
T3 - 2019 13th International Conference on Sampling Theory and Applications, SampTA 2019
BT - 2019 13th International Conference on Sampling Theory and Applications, SampTA 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 13th International Conference on Sampling Theory and Applications, SampTA 2019
Y2 - 8 July 2019 through 12 July 2019
ER -