Evaluation of significance level assignment of database search programs using monte carlo permutation approach

Malik N. Akhtar, Bruce R. Southey, Per E. Andrén, Jonathan V Sweedler, Sandra Luisa Rodriguez-Zas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Accurate detection of neuropeptides in tandem mass spectrometry experiments using database search programs remains an area of active research. Of particular interest is the accurate computation of the statistical significance levels assigned to matches between observed and theoretical spectra. The main factors that influence significance values are peptide size, incomplete fragmentation, low spectral quality, and the density of the database search space. A Monte Carlo approach was used to generate k-permuted decoy databases that would offer accurate p-value calculations in the database search program Crux. The k-permuted decoy databases were generated from 236 peptides that fall within 12 Daltons of the masses of 80 SwePep tandem spectra in a target database of 618 neuropeptides. The performance of the k-permuted decoy databases in identifying peptides was examined relative to the approach already implemented in Crux. The ability of Crux's indicators of peptide match (number of matched fragment ions, XCorr, and Sp score) to identify peptides was compared using permutation p-values. The proposed method improved the detection of neuropeptides relative to the approach implemented in Crux. The performance of the number of matched fragment ions and the Sp score was comparable: both indicators detected 98.75% and 100.0% of the peptides using 10^5 and 10^6 whole-sequence k-permuted decoy databases, respectively. The XCorr indicator had the weakest performance of the three indicators. The proposed approach can be integrated with multiple database search programs and applied to other types of tandem mass spectrometry experiments.
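
The permutation idea summarized in the abstract can be illustrated with a minimal sketch. This is not Crux's implementation and not the authors' code: the function names (`candidates`, `matched_ions`, `permutation_p_value`), the 0.5 m/z fragment tolerance, and the simplified matched-ion count are all assumptions made for illustration, and the (hits + 1) / (k + 1) estimator is a common convention for empirical p-values rather than something stated in the abstract. Only the 12 Dalton candidate window and the whole-sequence permutation come from the text.

```python
import random

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728


def peptide_mass(seq):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def fragment_mzs(seq):
    """Singly charged b- and y-ion m/z values of an unmodified peptide."""
    mzs = []
    for i in range(1, len(seq)):
        b = sum(RESIDUE_MASS[aa] for aa in seq[:i]) + PROTON
        y = sum(RESIDUE_MASS[aa] for aa in seq[i:]) + WATER + PROTON
        mzs.extend([b, y])
    return mzs


def matched_ions(seq, peaks, tol=0.5):
    """Toy indicator (assumption): theoretical fragments within tol of any peak."""
    return sum(any(abs(mz - p) <= tol for p in peaks) for mz in fragment_mzs(seq))


def candidates(target_db, precursor_mass, window=12.0):
    """Target peptides whose mass lies within `window` Da of the precursor mass."""
    return [s for s in target_db if abs(peptide_mass(s) - precursor_mass) <= window]


def permutation_p_value(seq, peaks, k, rng):
    """Empirical p-value from k whole-sequence permutations of `seq`.

    Each shuffle keeps the amino acid composition, and therefore the mass,
    of the target peptide, so every decoy stays inside the candidate window.
    """
    observed = matched_ions(seq, peaks)
    residues = list(seq)
    hits = 0
    for _ in range(k):
        rng.shuffle(residues)
        if matched_ions("".join(residues), peaks) >= observed:
            hits += 1
    return (hits + 1) / (k + 1)


if __name__ == "__main__":
    rng = random.Random(0)
    peptide = "YGGFL"               # illustrative sequence
    peaks = fragment_mzs(peptide)   # idealized, noise-free spectrum
    print(permutation_p_value(peptide, peaks, k=1000, rng=rng))
```

With this estimator the smallest attainable p-value is 1 / (k + 1), which is one reason a larger number of permutations, such as the 10^5 and 10^6 mentioned in the abstract, allows smaller significance thresholds to be resolved.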

Original language: English (US)
Title of host publication: Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Publisher: International Society for Computers and Their Applications
Pages: 103-107
Number of pages: 5
ISBN (Print): 9781632665140
State: Published - Jan 1 2014
Event: 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 - Las Vegas, NV, United States
Duration: Mar 24 2014 → Mar 26 2014

Publication series

Name: Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014

Other

Other: 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Country: United States
City: Las Vegas, NV
Period: 3/24/14 → 3/26/14

Fingerprint

Databases
Peptides
Neuropeptides
Tandem Mass Spectrometry
Mass spectrometry
Ions
Experiments
Research

Keywords

  • Database search programs
  • K-permuted decoy databases
  • Monte carlo approach
  • Neuropeptides
  • P-value

ASJC Scopus subject areas

  • Information Systems
  • Health Informatics

Cite this

Akhtar, M. N., Southey, B. R., Andrén, P. E., Sweedler, J. V., & Rodriguez-Zas, S. L. (2014). Evaluation of significance level assignment of database search programs using monte carlo permutation approach. In Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 (pp. 103-107). (Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014). International Society for Computers and Their Applications.

Evaluation of significance level assignment of database search programs using monte carlo permutation approach. / Akhtar, Malik N.; Southey, Bruce R.; Andrén, Per E.; Sweedler, Jonathan V; Rodriguez-Zas, Sandra Luisa.

Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014. International Society for Computers and Their Applications, 2014. p. 103-107 (Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014).

Akhtar, MN, Southey, BR, Andrén, PE, Sweedler, JV & Rodriguez-Zas, SL 2014, Evaluation of significance level assignment of database search programs using monte carlo permutation approach. in Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014. Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014, International Society for Computers and Their Applications, pp. 103-107, 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014, Las Vegas, NV, United States, 3/24/14.
Akhtar MN, Southey BR, Andrén PE, Sweedler JV, Rodriguez-Zas SL. Evaluation of significance level assignment of database search programs using monte carlo permutation approach. In Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014. International Society for Computers and Their Applications. 2014. p. 103-107. (Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014).
Akhtar, Malik N. ; Southey, Bruce R. ; Andrén, Per E. ; Sweedler, Jonathan V ; Rodriguez-Zas, Sandra Luisa. / Evaluation of significance level assignment of database search programs using monte carlo permutation approach. Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014. International Society for Computers and Their Applications, 2014. pp. 103-107 (Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014).
@inproceedings{1c5f802cef2342408d1b701881671af0,
title = "Evaluation of significance level assignment of database search programs using monte carlo permutation approach",
abstract = "Accurate detection of neuropeptides in tandem mass spectrometry experiments using database search programs remains an area of active research. Of particular interest is the accurate computation of the statistical significance levels assigned to matches between observed and theoretical spectra. The main factors that influence significance values are peptide size, incomplete fragmentation, low spectral quality, and the density of the database search space. A Monte Carlo approach was used to generate k-permuted decoy databases that would offer accurate p-value calculations in the database search program Crux. The k-permuted decoy databases were generated from 236 peptides that fall within 12 Daltons of the masses of 80 SwePep tandem spectra in a target database of 618 neuropeptides. The performance of the k-permuted decoy databases in identifying peptides was examined relative to the approach already implemented in Crux. The ability of Crux's indicators of peptide match (number of matched fragment ions, XCorr, and Sp score) to identify peptides was compared using permutation p-values. The proposed method improved the detection of neuropeptides relative to the approach implemented in Crux. The performance of the number of matched fragment ions and the Sp score was comparable: both indicators detected 98.75{\%} and 100.0{\%} of the peptides using $10^{5}$ and $10^{6}$ whole-sequence k-permuted decoy databases, respectively. The XCorr indicator had the weakest performance of the three indicators. The proposed approach can be integrated with multiple database search programs and applied to other types of tandem mass spectrometry experiments.",
keywords = "Database search programs, K-permuted decoy databases, Monte carlo approach, Neuropeptides, P-value",
author = "Akhtar, {Malik N.} and Southey, {Bruce R.} and Andr{\'e}n, {Per E.} and Sweedler, {Jonathan V} and Rodriguez-Zas, {Sandra Luisa}",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
isbn = "9781632665140",
series = "Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014",
publisher = "International Society for Computers and Their Applications",
pages = "103--107",
booktitle = "Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014",

}

TY - GEN

T1 - Evaluation of significance level assignment of database search programs using monte carlo permutation approach

AU - Akhtar, Malik N.

AU - Southey, Bruce R.

AU - Andrén, Per E.

AU - Sweedler, Jonathan V

AU - Rodriguez-Zas, Sandra Luisa

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Accurate detection of neuropeptides in tandem mass spectrometry experiments using database search programs remains an area of active research. Of particular interest is the accurate computation of the statistical significance levels assigned to matches between observed and theoretical spectra. The main factors that influence significance values are peptide size, incomplete fragmentation, low spectral quality, and the density of the database search space. A Monte Carlo approach was used to generate k-permuted decoy databases that would offer accurate p-value calculations in the database search program Crux. The k-permuted decoy databases were generated from 236 peptides that fall within 12 Daltons of the masses of 80 SwePep tandem spectra in a target database of 618 neuropeptides. The performance of the k-permuted decoy databases in identifying peptides was examined relative to the approach already implemented in Crux. The ability of Crux's indicators of peptide match (number of matched fragment ions, XCorr, and Sp score) to identify peptides was compared using permutation p-values. The proposed method improved the detection of neuropeptides relative to the approach implemented in Crux. The performance of the number of matched fragment ions and the Sp score was comparable: both indicators detected 98.75% and 100.0% of the peptides using 10^5 and 10^6 whole-sequence k-permuted decoy databases, respectively. The XCorr indicator had the weakest performance of the three indicators. The proposed approach can be integrated with multiple database search programs and applied to other types of tandem mass spectrometry experiments.

KW - Database search programs

KW - K-permuted decoy databases

KW - Monte carlo approach

KW - Neuropeptides

KW - P-value

UR - http://www.scopus.com/inward/record.url?scp=84905845230&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84905845230&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:84905845230

SN - 9781632665140

T3 - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014

SP - 103

EP - 107

BT - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014

PB - International Society for Computers and Their Applications

ER -