DNA Merge-Sort: A Family of Nested Varshamov-Tenengolts Reassembly Codes for Out-of-Order Media

Sajjad Nassirpour, Ilan Shomorony, Alireza Vahid

Research output: Contribution to journalArticlepeer-review

Abstract

Motivated by the DNA storage paradigm, we consider the torn-paper channel (TPC), which models data storage in long DNA molecules and breaks the input sequence into a random number of out-of-order variable-length non-overlapped fragments. We propose a computationally-efficient code construction for this model. More specifically, we introduce a family of nested Varshamov-Tenengolts (VT) codes to merge and sort the fragments in order to recover the stored data. We numerically show that our scheme (i) obtains rates that are higher than in prior results, (ii) has a decoding complexity that is cubic in the number of codeword fragments, which is significantly lower than the complexity of the brute-force approach, and (iii) offers decreasing and negligible error rates as the codeword length increases. We also propose a new construction for VT codes, quantify the number of required parity bits, and show that our approach requires fewer parity bits compared to known results.

Original languageEnglish (US)
Pages (from-to)1303-1317
Number of pages15
JournalIEEE Transactions on Communications
Volume72
Issue number3
DOIs
StatePublished - Mar 1 2024

Keywords

  • DNA storage
  • codeword fragmentation
  • sequence assembly
  • shuffling channel
  • unordered communication

ASJC Scopus subject areas

  • Electrical and Electronic Engineering

Fingerprint

Dive into the research topics of 'DNA Merge-Sort: A Family of Nested Varshamov-Tenengolts Reassembly Codes for Out-of-Order Media'. Together they form a unique fingerprint.

Cite this