Abstract
We consider the problem of reconstructing a sequence of independent and identically distributed symbols from a set of equal-size, consecutive, fragments, as well as a dependent reference sequence. First, in the regime in which the fragments are relatively long and typically no fragment appears more than once, we determine the scaling of the failure probability of the maximum-likelihood reconstruction algorithm for a perfect reconstruction, and bound it for a partial reconstruction. Second, we characterize the regime in which the fragments are relatively short and repeating fragments abound. We state a trade-off between the fraction of fragments that cannot be adequately reconstructed vs. the distortion level allowed for the reconstruction of each fragment, while still allowing vanishing failure probability.
Original language | English (US) |
---|---|
Article number | 10504879 |
Pages (from-to) | 4634-4654 |
Number of pages | 21 |
Journal | IEEE Transactions on Information Theory |
Volume | 70 |
Issue number | 7 |
DOIs | |
State | Published - Jul 1 2024 |
Externally published | Yes |
Keywords
- bee-identification problem
- Distortion
- DNA sequencing
- Encoding
- Fragment reordering
- Genomics
- Noise measurement
- permutation reconstruction
- Reconstruction algorithms
- reference sequence
- sequence reconstruction
- Sequential analysis
- side information
- sliced sequences
- Vectors
ASJC Scopus subject areas
- Information Systems
- Computer Science Applications
- Library and Information Sciences