TY - GEN
T1 - Robust Mean Estimation in High Dimensions
T2 - 2022 IEEE International Symposium on Information Theory, ISIT 2022
AU - Deshmukh, Aditya
AU - Liu, Jing
AU - Veeravalli, Venugopal V.
N1 - *Equal contribution. A. Deshmukh and V.V. Veeravalli are with the ECE Department and Coordinated Science Lab, UIUC, Illinois, USA. Email: ad11,[email protected] J. Liu is with the CS Department, UIUC, Illinois, USA. Email: [email protected] This research was supported by the US Army Research Laboratory under Cooperative Agreement W911NF-17-2-0196 and by the US National Science Foundation under grant 2106727, through UIUC.
PY - 2022
Y1 - 2022
N2 - The problem of robust mean estimation in high dimensions is studied, in which a certain fraction (less than half) of the datapoints can be arbitrarily corrupted. Motivated by compressive sensing, the robust mean estimation problem is formulated as the minimization of the ℓ0-'norm' of an outlier indicator vector, under a second moment constraint on the datapoints. The ℓ0-'norm' is then relaxed to the ℓp-norm (0 < p ≤ 1) in the objective, and it is shown that the global minima for each of these objectives are order-optimal and have optimal breakdown point for the robust mean estimation problem. Furthermore, a computationally tractable iterative ℓp-minimization and hard thresholding algorithm is proposed that outputs an order-optimal robust estimate of the population mean. The proposed algorithm (with breakdown point ≈0.3) does not require prior knowledge of the fraction of outliers, in contrast with most existing algorithms, and for p = 1 it has near-linear time complexity. Both synthetic and real data experiments demonstrate that the proposed algorithm outperforms state-of-the-art robust mean estimation methods.
AB - The problem of robust mean estimation in high dimensions is studied, in which a certain fraction (less than half) of the datapoints can be arbitrarily corrupted. Motivated by compressive sensing, the robust mean estimation problem is formulated as the minimization of the ℓ0-'norm' of an outlier indicator vector, under a second moment constraint on the datapoints. The ℓ0-'norm' is then relaxed to the ℓp-norm (0 < p ≤ 1) in the objective, and it is shown that the global minima for each of these objectives are order-optimal and have optimal breakdown point for the robust mean estimation problem. Furthermore, a computationally tractable iterative ℓp-minimization and hard thresholding algorithm is proposed that outputs an order-optimal robust estimate of the population mean. The proposed algorithm (with breakdown point ≈0.3) does not require prior knowledge of the fraction of outliers, in contrast with most existing algorithms, and for p = 1 it has near-linear time complexity. Both synthetic and real data experiments demonstrate that the proposed algorithm outperforms state-of-the-art robust mean estimation methods.
KW - Global outlier pursuit
KW - High-dimensional statistics
KW - Robust estimation
KW - linear time complexity algorithm
UR - http://www.scopus.com/inward/record.url?scp=85136247115&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136247115&partnerID=8YFLogxK
U2 - 10.1109/ISIT50566.2022.9834585
DO - 10.1109/ISIT50566.2022.9834585
M3 - Conference contribution
AN - SCOPUS:85136247115
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 1115
EP - 1120
BT - 2022 IEEE International Symposium on Information Theory, ISIT 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 26 June 2022 through 1 July 2022
ER -