Inference in High-Dimensional Linear Regression via Lattice Basis Reduction and Integer Relation Detection

David Gamarnik, Eren C. Kizildaǧ, Ilias Zadik

Research output: Contribution to journalArticlepeer-review

Abstract

We consider the high-dimensional linear regression problem, where the algorithmic goal is to efficiently infer an unknown feature vector β ∈ Rp from its linear measurements, using a small number n of samples. Unlike most of the literature, we make no sparsity assumption on β*, but instead adopt a different regularization: In the noiseless setting, we assume β*consists of entries, which are either rational numbers with a common denominator Q ∈ Z+ (referred to as Q-rationality); or irrational numbers taking values in a rationally independent set of bounded cardinality, known to learner; collectively called as the mixed-range assumption. Using a novel combination of the Partial Sum of Least Squares (PSLQ) integer relation detection, and the Lenstra-Lenstra-Lovász (LLL) lattice basis reduction algorithms, we propose a polynomial-time algorithm which provably recovers a β*∈ Rp enjoying the mixed-range assumption, from its linear measurements Y=Xβ*∈ Rn for a large class of distributions for the random entries of X, even with one measurement (n=1). In the noisy setting, we propose a polynomial-time, lattice-based algorithm, which recovers a β*∈ Rp enjoying the Q-rationality property, from its noisy measurements Y=Xβ*+W ∈ Rn, even from a single sample (n=1). We further establish that for large Q, and normal noise, this algorithm tolerates information-theoretically optimal level of noise. We then apply these ideas to develop a polynomial-time, single-sample algorithm for the phase retrieval problem. Our methods address the single-sample (n=1) regime, where the sparsity-based methods such as the Least Absolute Shrinkage and Selection Operator (LASSO) and the Basis Pursuit are known to fail. Furthermore, our results also reveal algorithmic connections between the high-dimensional linear regression problem, and the integer relation detection, randomized subset-sum, and shortest vector problems.

Original languageEnglish (US)
Pages (from-to)8109-8139
Number of pages31
JournalIEEE Transactions on Information Theory
Volume67
Issue number12
DOIs
StatePublished - Dec 1 2021
Externally publishedYes

Keywords

  • channel capacity
  • High-dimensional statistical inference
  • information-theoretic thresholds
  • integer relation detection
  • lattices
  • linear regression
  • phase retrieval
  • polynomial-time algorithm
  • single-sample recovery
  • subset sum problem

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

Fingerprint

Dive into the research topics of 'Inference in High-Dimensional Linear Regression via Lattice Basis Reduction and Integer Relation Detection'. Together they form a unique fingerprint.

Cite this