Mutually uncorrelated primers for DNA-based data storage

S. M. H. Tabatabaei Yazdi, Han Mao Kiah, Ryan Gabrys, Olgica Milenkovic

Research output: Contribution to journal › Article › peer-review

Abstract

We introduce the notion of weakly mutually uncorrelated (WMU) sequences, motivated by applications in DNA-based data storage systems and synchronization between communication devices. WMU sequences are characterized by the property that no sufficiently long suffix of one sequence is the prefix of the same or another sequence. WMU sequences used for primer design in DNA-based data storage systems are also required to be at large mutual Hamming distance from each other, have balanced compositions of symbols, and avoid primer-dimer byproducts. We derive bounds on the size of WMU and various constrained WMU codes and present a number of constructions for balanced, error-correcting, primer-dimer free WMU codes using Dyck paths, prefix-synchronized, and cyclic codes.
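The defining property in the abstract can be sketched as a short check. This is only an illustrative sketch, not the authors' construction: the abstract does not say how long a "sufficiently long" suffix is, so the threshold `k` is an assumed parameter. The function names, the restriction to proper overlaps (shorter than either sequence), and the reading of "balanced" as 50% G/C content are likewise assumptions made for this example.

```python
from itertools import product


def is_wmu(codewords, k):
    """Return True if no proper suffix of length >= k of any codeword
    equals a prefix of the same or another codeword.

    `k` is an assumed threshold standing in for "sufficiently long".
    """
    for a, b in product(codewords, repeat=2):
        # Only proper overlaps: strictly shorter than both sequences.
        max_len = min(len(a), len(b)) - 1
        for length in range(k, max_len + 1):
            if a[-length:] == b[:length]:
                return False
    return True


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def is_balanced(seq):
    """Assumed notion of balance: exactly half the symbols are G or C."""
    return 2 * sum(ch in "GC" for ch in seq) == len(seq)


primers = ["ACGA", "GACT"]
print(is_wmu(primers, 2))   # "GA" ends ACGA and starts GACT
print(is_wmu(primers, 3))   # no overlap of length 3
print(hamming(*primers), [is_balanced(p) for p in primers])
```

With `k = 2` the pair fails because the suffix `GA` of `ACGA` is a prefix of `GACT`; raising the threshold to `k = 3` makes the same pair pass, which is the sense in which the condition is "weak" in this sketch.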

Original language: English (US)
Article number: 8255669
Pages (from-to): 6283-6296
Number of pages: 14
Journal: IEEE Transactions on Information Theory
Volume: 64
Issue number: 9
State: Published - Sep 2018

Keywords

  • Biological information theory
  • DNA-based data storage systems
  • bioinformatics
  • channel coding
  • constrained coding
  • data storage systems

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences
