TY - GEN
T1 - Reconstruction and Error-Correction Codes for Polymer-Based Data Storage
AU - Pattabiraman, Srilakshmi
AU - Gabrys, Ryan
AU - Milenkovic, Olgica
N1 - Funding Information:
The work was funded by the DARPA Molecular Informatics program, the SemiSynBio program of the NSF and SRC, and the NSF CIF grant 1618366.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/8
Y1 - 2019/8
N2 - Motivated by polymer-based data-storage platforms that use chains of binary synthetic polymers as the recording media and read the content via tandem mass spectrometers, we propose a new family of codes that allows for unique string reconstruction and correction of one mass error. Our approach is based on introducing redundancy that scales logarithmically with the length of the string and allows for the string to be uniquely reconstructed based only on its erroneous substring composition multiset. The key idea behind our unique reconstruction approach is to interleave Catalan-type paths with arbitrary binary strings and 'reflect' them so as to allow prefixes and suffixes of the same length to have different weights. For error correction, we add a constant number of bits that provides information about the weights of reflected pairs of bits and hence enable recovery from a single mass error. The asymptotic code rate of the scheme is one, and decoding is accomplished via a simplified version of the backtracking algorithm used for the Turnpike problem.
AB - Motivated by polymer-based data-storage platforms that use chains of binary synthetic polymers as the recording media and read the content via tandem mass spectrometers, we propose a new family of codes that allows for unique string reconstruction and correction of one mass error. Our approach is based on introducing redundancy that scales logarithmically with the length of the string and allows for the string to be uniquely reconstructed based only on its erroneous substring composition multiset. The key idea behind our unique reconstruction approach is to interleave Catalan-type paths with arbitrary binary strings and 'reflect' them so as to allow prefixes and suffixes of the same length to have different weights. For error correction, we add a constant number of bits that provides information about the weights of reflected pairs of bits and hence enable recovery from a single mass error. The asymptotic code rate of the scheme is one, and decoding is accomplished via a simplified version of the backtracking algorithm used for the Turnpike problem.
KW - Composition errors
KW - Polymer-based data storage
KW - String reconstruction
UR - http://www.scopus.com/inward/record.url?scp=85081100599&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85081100599&partnerID=8YFLogxK
U2 - 10.1109/ITW44776.2019.8989171
DO - 10.1109/ITW44776.2019.8989171
M3 - Conference contribution
AN - SCOPUS:85081100599
T3 - 2019 IEEE Information Theory Workshop, ITW 2019
BT - 2019 IEEE Information Theory Workshop, ITW 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2019 IEEE Information Theory Workshop, ITW 2019
Y2 - 25 August 2019 through 28 August 2019
ER -