TY - GEN
T1 - Mass Error-Correction Codes for Polymer-Based Data Storage
AU - Gabrys, Ryan
AU - Pattabiraman, Srilakshmi
AU - Milenkovic, Olgica
N1 - Publisher Copyright:
© 2020 IEEE.
PY - 2020/6
Y1 - 2020/6
N2 - We consider the problem of correcting mass readout errors in information encoded in binary polymer strings. Our work builds on results for string reconstruction problems using composition multisets [1] and the unique string reconstruction framework proposed in [2]. Binary polymer-based data storage systems [3] operate by designing two molecules of significantly different masses to represent the symbols {0,1} and perform readouts through noisy tandem mass spectrometry. Tandem mass spectrometers fragment the strings to be read into shorter substrings and only report their masses, often with errors due to imprecise ionization. Modeling the fragmentation process output in terms of composition multisets allows for designing asymptotically optimal codes capable of unique reconstruction and the correction of a single mass error [2] through the use of derivatives of Catalan paths. Nevertheless, no solutions for multiple-mass error-corrections are currently known. Our work addresses this issue by describing the first multiple-error correction codes that use the polynomial factorization approach for the Turnpike problem [4] and the related factorization described in [1]. Adding Reed-Solomon type coding redundancy into the corresponding polynomials allows for correcting t mass errors in polynomial time using {\mathcal{O}}\left( {{t^2}\log k} \right) redundant bits, where k is the information string length. The redundancy can be improved to {\mathcal{O}}(t + \log k). However, no decoding algorithm that runs polynomial-time in both t and n for this scheme are currently known, where n is the length of the coded string.
AB - We consider the problem of correcting mass readout errors in information encoded in binary polymer strings. Our work builds on results for string reconstruction problems using composition multisets [1] and the unique string reconstruction framework proposed in [2]. Binary polymer-based data storage systems [3] operate by designing two molecules of significantly different masses to represent the symbols {0,1} and perform readouts through noisy tandem mass spectrometry. Tandem mass spectrometers fragment the strings to be read into shorter substrings and only report their masses, often with errors due to imprecise ionization. Modeling the fragmentation process output in terms of composition multisets allows for designing asymptotically optimal codes capable of unique reconstruction and the correction of a single mass error [2] through the use of derivatives of Catalan paths. Nevertheless, no solutions for multiple-mass error-corrections are currently known. Our work addresses this issue by describing the first multiple-error correction codes that use the polynomial factorization approach for the Turnpike problem [4] and the related factorization described in [1]. Adding Reed-Solomon type coding redundancy into the corresponding polynomials allows for correcting t mass errors in polynomial time using {\mathcal{O}}\left( {{t^2}\log k} \right) redundant bits, where k is the information string length. The redundancy can be improved to {\mathcal{O}}(t + \log k). However, no decoding algorithm that runs polynomial-time in both t and n for this scheme are currently known, where n is the length of the coded string.
UR - http://www.scopus.com/inward/record.url?scp=85090414132&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85090414132&partnerID=8YFLogxK
U2 - 10.1109/ISIT44484.2020.9174404
DO - 10.1109/ISIT44484.2020.9174404
M3 - Conference contribution
AN - SCOPUS:85090414132
T3 - IEEE International Symposium on Information Theory - Proceedings
SP - 25
EP - 30
BT - 2020 IEEE International Symposium on Information Theory, ISIT 2020 - Proceedings
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2020 IEEE International Symposium on Information Theory, ISIT 2020
Y2 - 21 July 2020 through 26 July 2020
ER -