Riemannian geometry and statistical modeling correct for batch effects and control false discoveries in single-cell surface protein count data

Shuyi Zhang, Jacob R. Leistico, Christopher Cook, Yale Liu, Raymond J. Cho, Jeffrey B. Cheng, Jun S. Song

Research output: Contribution to journal (Article, peer-reviewed)

Abstract

Recent advances in next-generation sequencing-based single-cell technologies have allowed high-throughput quantitative detection of cell-surface proteins along with the transcriptome in individual cells, extending our understanding of the heterogeneity of cell populations in diverse tissues that are in different diseased states or under different experimental conditions. Count data of surface proteins from the cellular indexing of transcriptomes and epitopes by sequencing (CITE-seq) technology pose new computational challenges, and there is currently a dearth of rigorous mathematical tools for analyzing the data. This work uses concepts from Riemannian geometry to remove batch effects between samples and develops a statistical framework for distinguishing positive signals from background noise. The strengths of these approaches are demonstrated on two independent CITE-seq data sets, one from mouse and one from human.

Original language: English (US)
Article number: 012409
Journal: Physical Review E
Volume: 102
Issue number: 1
State: Published - Jul 2020

ASJC Scopus subject areas

  • Statistical and Nonlinear Physics
  • Statistics and Probability
  • Condensed Matter Physics
