Robust Mean Estimation in High Dimensions: An Outlier Fraction Agnostic and Efficient Algorithm

Aditya Deshmukh, Jing Liu, Venugopal V. Veeravalli

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

The problem of robust mean estimation in high dimensions is studied, in which a certain fraction (less than half) of the datapoints can be arbitrarily corrupted. Motivated by compressive sensing, the robust mean estimation problem is formulated as the minimization of the ℓ0-'norm' of an outlier indicator vector, under a second moment constraint on the datapoints. The ℓ0-'norm' is then relaxed to the ℓp-norm (0 < p ≤ 1) in the objective, and it is shown that the global minima for each of these objectives are order-optimal and have optimal breakdown point for the robust mean estimation problem. Furthermore, a computationally tractable iterative ℓp-minimization and hard thresholding algorithm is proposed that outputs an order-optimal robust estimate of the population mean. The proposed algorithm (with breakdown point ≈0.3) does not require prior knowledge of the fraction of outliers, in contrast with most existing algorithms, and for p = 1 it has near-linear time complexity. Both synthetic and real data experiments demonstrate that the proposed algorithm outperforms state-of-the-art robust mean estimation methods.

Original languageEnglish (US)
Title of host publication2022 IEEE International Symposium on Information Theory, ISIT 2022
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages1115-1120
Number of pages6
ISBN (Electronic)9781665421591
DOIs
StatePublished - 2022
Event2022 IEEE International Symposium on Information Theory, ISIT 2022 - Espoo, Finland
Duration: Jun 26 2022Jul 1 2022

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
Volume2022-June
ISSN (Print)2157-8095

Conference

Conference2022 IEEE International Symposium on Information Theory, ISIT 2022
Country/TerritoryFinland
CityEspoo
Period6/26/227/1/22

Keywords

  • Global outlier pursuit
  • High-dimensional statistics
  • Robust estimation
  • linear time complexity algorithm

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Robust Mean Estimation in High Dimensions: An Outlier Fraction Agnostic and Efficient Algorithm'. Together they form a unique fingerprint.

Cite this