Unique Reconstruction of Coded Sequences from Multiset Substring Spectra

Ryan Gabrys, Olgica Milenkovic

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

The problem of reconstructing strings from their substring spectra has a long history and in its most simple incarnation asks for determining under which conditions the spectrum uniquely determines the string. We study the problem of coded string reconstruction from multiset substring spectra, where the strings are restricted to lie in some codebook. In particular, we consider binary codebooks that allow for unique string reconstruction and propose a new method, termed repeat replacement, to create the codebook. Our contributions include algorithmic solutions for repeat replacement and constructive redundancy bounds for the underlying coding schemes. The study is motivated by applications in DNA-based data storage systems that use high throughput readout sequencers.

Original languageEnglish (US)
Title of host publication2018 IEEE International Symposium on Information Theory, ISIT 2018
PublisherInstitute of Electrical and Electronics Engineers Inc.
Pages2540-2544
Number of pages5
ISBN (Print)9781538647806
DOIs
StatePublished - Aug 15 2018
Event2018 IEEE International Symposium on Information Theory, ISIT 2018 - Vail, United States
Duration: Jun 17 2018Jun 22 2018

Publication series

NameIEEE International Symposium on Information Theory - Proceedings
Volume2018-June
ISSN (Print)2157-8095

Other

Other2018 IEEE International Symposium on Information Theory, ISIT 2018
Country/TerritoryUnited States
CityVail
Period6/17/186/22/18

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Information Systems
  • Modeling and Simulation
  • Applied Mathematics

Fingerprint

Dive into the research topics of 'Unique Reconstruction of Coded Sequences from Multiset Substring Spectra'. Together they form a unique fingerprint.

Cite this