### Abstract

The principal submatrix localization problem deals with recovering a K ×K principal submatrix of elevated mean µ in a large n × n symmetric matrix subject to additive standard Gaussian noise, or more generally, mean zero, variance one, subgaussian noise. This problem serves as a prototypical example for community detection, in which the community corresponds to the support of the submatrix. The main result of this paper is that in the regime Ω(n) ≤ K ≤ o(n), the support of the submatrix can be weakly recovered (with o(K) misclassification errors on average) by an optimized message passing algorithm if λ = µ^{2}K^{2}/n, the signal-to-noise ratio, exceeds 1/e. This extends a result by Deshpande and Montanari previously obtained for K = Θ(n) and µ = Θ(1). In addition, the algorithm can be combined with a voting procedure to achieve the information-theoretic limit of exact recovery with sharp constants for all K ≥_{log} ^{n} _{n}(_{8} ^{1} _{e} + o(1)). The total running time of the algorithm is O(n^{2} log n). Another version of the submatrix localization problem, known as noisy biclustering, aims to recover a K_{1} × K_{2} submatrix of elevated mean µ in a large n_{1} × n_{2} Gaussian matrix. The optimized message passing algorithm and its analysis are adapted to the bicluster problem assuming Ω(n_{i}) ≤ K_{i} ≤ o(n_{i}) and K_{1} K_{2}. A sharp information-theoretic condition for the weak recovery of both clusters is also identified.

Original language | English (US) |
---|---|

Pages (from-to) | 1-52 |

Number of pages | 52 |

Journal | Journal of Machine Learning Research |

Volume | 18 |

State | Published - Apr 1 2018 |

### Keywords

- Biclustering
- High-dimensional statistics
- Message passing
- Spectral algorithms computational complexity
- Submatrix localization

### ASJC Scopus subject areas

- Software
- Control and Systems Engineering
- Statistics and Probability
- Artificial Intelligence

## Fingerprint Dive into the research topics of 'Submatrix localization via message passing'. Together they form a unique fingerprint.

## Cite this

*Journal of Machine Learning Research*,

*18*, 1-52.