DNA-Based Storage: Models and Fundamental Limits

Ilan Shomorony, Reinhard Heckel

Research output: Contribution to journal › Article › peer-review


Due to its longevity and enormous information density, DNA is an attractive medium for archival storage. In this work, we study the fundamental limits and trade-offs of DNA-based storage systems by introducing a new channel model, which we call the noisy shuffling-sampling channel. Motivated by current technological constraints on DNA synthesis and sequencing, this model captures three key distinctive aspects of DNA storage systems: (1) the data is written onto many short DNA molecules; (2) the molecules are corrupted by noise during synthesis and sequencing; and (3) the data is read by randomly sampling from the DNA pool. We provide capacity results for this channel under specific noise and sampling assumptions and show that, in many scenarios, a simple index-based coding scheme is optimal.
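The three aspects of the channel and the idea of index-based coding can be illustrated with a toy simulation. The sketch below is not the authors' construction. It assumes a binary alphabet instead of the four DNA bases, independent bit flips as the noise, uniform sampling with replacement as the read process, and a per-position majority vote as the decoder. All function names and parameters are hypothetical and chosen only for illustration.

```python
import random


def encode_with_index(data_bits, payload_len):
    """Split data into short payloads and prefix each with a fixed-width index.

    The index restores the order that the unordered pool destroys.
    """
    num_mol = (len(data_bits) + payload_len - 1) // payload_len
    index_len = max(1, (num_mol - 1).bit_length())
    # Zero-pad so that every molecule carries a full payload.
    padded = data_bits + [0] * (num_mol * payload_len - len(data_bits))
    molecules = []
    for i in range(num_mol):
        index = [(i >> (index_len - 1 - k)) & 1 for k in range(index_len)]
        molecules.append(index + padded[i * payload_len:(i + 1) * payload_len])
    return molecules, index_len


def noisy_shuffling_sampling_channel(molecules, num_reads, flip_prob, rng):
    """Draw reads uniformly with replacement and flip each bit independently.

    Assumption: this noise and sampling model is a simplification picked for
    the sketch. Drawing at random discards the order of the molecules.
    """
    reads = []
    for _ in range(num_reads):
        mol = rng.choice(molecules)
        reads.append([b ^ (rng.random() < flip_prob) for b in mol])
    return reads


def decode_by_index(reads, index_len, num_mol, payload_len):
    """Group reads by their index and majority-vote each payload position.

    Molecules that were never sampled come back as None (erasures). The index
    itself is unprotected here, so a corrupted index misfiles the read.
    """
    groups = {}
    for read in reads:
        idx = int("".join(map(str, read[:index_len])), 2)
        if idx < num_mol:
            groups.setdefault(idx, []).append(read[index_len:])
    decoded = []
    for i in range(num_mol):
        group = groups.get(i)
        if group is None:
            decoded.append(None)
        else:
            decoded.append([int(2 * sum(col) > len(group)) for col in zip(*group)])
    return decoded


# Usage: 64 data bits stored on 8 molecules, read back with 40 noisy reads.
rng = random.Random(0)
data = [rng.randint(0, 1) for _ in range(64)]
molecules, index_len = encode_with_index(data, payload_len=8)
reads = noisy_shuffling_sampling_channel(molecules, num_reads=40, flip_prob=0.01, rng=rng)
decoded = decode_by_index(reads, index_len, len(molecules), payload_len=8)
```

In this toy setup the index bits are the price paid for the lost ordering, and unsampled molecules appear as erasures. A practical system would presumably add an outer code across molecules to recover those erasures and any residual errors.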

Original language: English (US)
Article number: 9353576
Pages (from-to): 3675-3689
Number of pages: 15
Journal: IEEE Transactions on Information Theory
Issue number: 6
State: Published - Jun 2021


Keywords

  • DNA storage
  • Data storage
  • Channel capacity

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Library and Information Sciences

