TY - JOUR
T1 - Identification of amino acids with sensitive nanoporous MoS2
T2 - towards machine learning-based prediction
AU - Barati Farimani, Amir
AU - Heiranian, Mohammad
AU - Aluru, Narayana R.
N1 - Funding Information:
This work is supported by NSF under grants 1420882, 1506619, 1708852, 1720701, 1720633, and 1545907. We acknowledge the use of the parallel computing resource Blue Waters provided by the University of Illinois and the National Center for Supercomputing Applications.
Publisher Copyright:
© 2018, The Author(s).
PY - 2018/12/1
Y1 - 2018/12/1
AB - Protein detection plays a key role in determining the single point mutations which can cause a variety of diseases. Nanopore sequencing provides a label-free, single-base, fast and long reading platform, which makes it amenable to personalized medicine. A challenge facing nanopore technology is the noise in ionic current. Here, we show that a nanoporous single-layer molybdenum disulfide (MoS2) can detect individual amino acids in a polypeptide chain (16 units) with high accuracy and distinguishability. Using extensive molecular dynamics simulations (with a total aggregate simulation time of 66 µs) and machine learning techniques, we featurize and cluster the ionic current and residence time of the 20 amino acids and identify the fingerprints of the signals. Using logistic regression, nearest neighbor, and random forest classifiers, the sensor reading is predicted with an accuracy of 72.45%, 94.55%, and 99.6%, respectively. In addition, using advanced ML classification techniques, we are able to theoretically predict the amino acid types of over 2.8 million hypothetical sensor readings.
UR - http://www.scopus.com/inward/record.url?scp=85059441333&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85059441333&partnerID=8YFLogxK
U2 - 10.1038/s41699-018-0060-8
DO - 10.1038/s41699-018-0060-8
M3 - Article
AN - SCOPUS:85059441333
SN - 2397-7132
VL - 2
JO - npj 2D Materials and Applications
JF - npj 2D Materials and Applications
IS - 1
M1 - 14
ER -