Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions

Ryan Gabrys, Srilakshmi Pattabiraman, Olgica Milenkovic

Research output: Chapter in Book/Report/Conference proceedingConference contribution


The problem of string reconstruction from substring information has found many applications due to its relevance in DNA- and polymer-based data storage. One practically important and challenging paradigm requires reconstructing mixtures of strings based on the union of compositions of their prefixes and suffixes, generated by mass spectrometry readouts. We describe new coding methods that allow for unique joint reconstruction of subsets of strings selected from a code and provide matching upper and lower bounds on the asymptotic rate of the underlying codebooks. Under certain mild constraints on the problem parameters, one can show that the largest possible rate of a codebook that allows for all subcollections of less than or equal to h codestrings to be uniquely reconstructable from the prefix-suffix information equals 1{h.

Original languageEnglish (US)
Title of host publication2020 IEEE Information Theory Workshop, ITW 2020
PublisherInstitute of Electrical and Electronics Engineers Inc.
ISBN (Electronic)9781728159621
StatePublished - Apr 11 2021
Event2020 IEEE Information Theory Workshop, ITW 2020 - Virtual, Riva del Garda, Italy
Duration: Apr 11 2021Apr 15 2021

Publication series

Name2020 IEEE Information Theory Workshop, ITW 2020


Conference2020 IEEE Information Theory Workshop, ITW 2020
CityVirtual, Riva del Garda


  • B sequences
  • Polymer-based data storage
  • Unique string reconstruction

ASJC Scopus subject areas

  • Computational Theory and Mathematics
  • Information Systems
  • Signal Processing
  • Software
  • Theoretical Computer Science


Dive into the research topics of 'Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions'. Together they form a unique fingerprint.

Cite this