Fundamental Limits of Multi-Sample Flow Graph Decomposition

Kayvon Mazooji, Sreeram Kannan, William Stafford Noble, Ilan Shomorony

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

The problem of decomposing a graph flow into a small set of paths has a wide range of applications, including transcriptome assembly and routing in data networks. A standard formulation is the sparsest flow decomposition problem, which is known to be NP-hard. In this work, we consider a multi-sample variant of this problem, motivated by the problem of identifying and quantifying proteoforms from mass spectrometry data, where multiple views of the graph can be obtained from multiple biological samples. We derive necessary conditions for the set of samples to unambiguously determine the ground truth set of paths, and we design an algorithm with matching sufficient conditions for a large class of problem instances, making our algorithm information optimal for this class of problem instances. The necessary conditions, combined with a probabilistic model for sample generation, yield a characterization of the number of samples needed for unambiguous recovery of the underlying paths. We analyze the algorithm's performance on flow data simulated on peptide graphs from real mass spectrometry data.

Original languageEnglish (US)
Title of host publication2022 IEEE International Symposium on Information Theory, ISIT 2022
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2403-2408
Number of pages6
ISBN (Electronic)9781665421591
DOIs
StatePublished - 2022
Event2022 IEEE International Symposium on Information Theory, ISIT 2022 - Espoo, Finland
Duration: Jun 26 2022Jul 1 2022

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
Volume2022-June
ISSN (Print)2157-8095

Conference

Conference2022 IEEE International Symposium on Information Theory, ISIT 2022
Country/TerritoryFinland
CityEspoo
Period6/26/227/1/22

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Fundamental Limits of Multi-Sample Flow Graph Decomposition'. Together they form a unique fingerprint.

Cite this