An assessment of true and false positive detection rates of stepwise epistatic model selection as a function of sample size and number of markers

Angela H. Chen, Weihao Ge, William Metcalf, Eric Jakobsson, Liudmila Sergeevna Mainzer, Alexander Edward Lipka

Research output: Contribution to journal › Article

Abstract

Association studies have been successful at identifying genomic regions associated with important traits, but routinely employ models that only consider the additive contribution of an individual marker. Because quantitative trait variability typically arises from multiple additive and non-additive sources, utilization of statistical approaches that include main and two-way interaction marker effects of several loci in one model could lead to unprecedented characterization of these sources. Here we examine the ability of one such approach, called the Stepwise Procedure for constructing an Additive and Epistatic Multi-Locus model (SPAEML), to detect additive and epistatic signals simulated using maize and human marker data. Our results revealed that SPAEML was capable of detecting quantitative trait nucleotides (QTNs) at sample sizes as low as n = 300 and consistently specifying signals as additive and epistatic for larger sizes. Sample size and minor allele frequency had a major influence on SPAEML’s ability to distinguish between additive and epistatic signals, while the number of markers tested did not. We conclude that SPAEML is a useful approach for providing further elucidation of the additive and epistatic sources contributing to trait variability when applied to a small subset of genome-wide markers located within specific genomic regions identified using a priori analyses.
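To make the idea of "main and two-way interaction marker effects" concrete, the following is a minimal sketch of the candidate term set that a model of this kind could search over. It is not the SPAEML implementation. The function names, the marker labels, and the element-wise product coding of an interaction are all assumptions made for illustration.

```python
from itertools import combinations


def candidate_terms(markers):
    """List additive terms (one marker) and two-way interaction terms (marker pairs).

    Hypothetical helper: each term is a tuple of marker labels.
    """
    additive = [(m,) for m in markers]
    epistatic = list(combinations(markers, 2))
    return additive + epistatic


def n_candidate_terms(n_markers):
    """Count additive plus two-way interaction terms for n_markers markers."""
    return n_markers + n_markers * (n_markers - 1) // 2


def interaction_column(geno_a, geno_b):
    """Build an interaction predictor as the element-wise product of two
    numerically coded genotype vectors (an assumed coding, not taken from the paper)."""
    return [a * b for a, b in zip(geno_a, geno_b)]


if __name__ == "__main__":
    # Three hypothetical markers give 3 additive + 3 pairwise terms.
    print(candidate_terms(["m1", "m2", "m3"]))
    # The candidate set grows quadratically with the number of markers.
    print(n_candidate_terms(1000))
```

Because the number of pairwise terms grows quadratically, 1,000 markers already yield 500,500 candidate terms under this enumeration, which is one way to see why restricting the search to a small subset of markers is attractive.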

Original language: English (US)
Pages (from-to): 660-671
Number of pages: 12
Journal: Heredity
Volume: 122
Issue number: 5
DOI: 10.1038/s41437-018-0162-2
State: Published - May 1 2019

Fingerprint

Sample Size
Gene Frequency
Zea mays
Nucleotides
Genome

ASJC Scopus subject areas

  • Genetics
  • Genetics (clinical)

Cite this

An assessment of true and false positive detection rates of stepwise epistatic model selection as a function of sample size and number of markers. / Chen, Angela H.; Ge, Weihao; Metcalf, William; Jakobsson, Eric; Mainzer, Liudmila Sergeevna; Lipka, Alexander Edward.

In: Heredity, Vol. 122, No. 5, 01.05.2019, p. 660-671.

@article{0e163a11f61c499ab78770fef758295f,
title = "An assessment of true and false positive detection rates of stepwise epistatic model selection as a function of sample size and number of markers",
abstract = "Association studies have been successful at identifying genomic regions associated with important traits, but routinely employ models that only consider the additive contribution of an individual marker. Because quantitative trait variability typically arises from multiple additive and non-additive sources, utilization of statistical approaches that include main and two-way interaction marker effects of several loci in one model could lead to unprecedented characterization of these sources. Here we examine the ability of one such approach, called the Stepwise Procedure for constructing an Additive and Epistatic Multi-Locus model (SPAEML), to detect additive and epistatic signals simulated using maize and human marker data. Our results revealed that SPAEML was capable of detecting quantitative trait nucleotides (QTNs) at sample sizes as low as n = 300 and consistently specifying signals as additive and epistatic for larger sizes. Sample size and minor allele frequency had a major influence on SPAEML’s ability to distinguish between additive and epistatic signals, while the number of markers tested did not. We conclude that SPAEML is a useful approach for providing further elucidation of the additive and epistatic sources contributing to trait variability when applied to a small subset of genome-wide markers located within specific genomic regions identified using a priori analyses.",
author = "Chen, {Angela H.} and Ge, Weihao and Metcalf, William and Jakobsson, Eric and Mainzer, {Liudmila Sergeevna} and Lipka, {Alexander Edward}",
year = "2019",
month = "5",
day = "1",
doi = "10.1038/s41437-018-0162-2",
language = "English (US)",
volume = "122",
pages = "660--671",
journal = "Heredity",
issn = "0018-067X",
publisher = "Nature Publishing Group",
number = "5",
}

TY - JOUR
T1 - An assessment of true and false positive detection rates of stepwise epistatic model selection as a function of sample size and number of markers
AU - Chen, Angela H.
AU - Ge, Weihao
AU - Metcalf, William
AU - Jakobsson, Eric
AU - Mainzer, Liudmila Sergeevna
AU - Lipka, Alexander Edward
PY - 2019/5/1
Y1 - 2019/5/1
AB - Association studies have been successful at identifying genomic regions associated with important traits, but routinely employ models that only consider the additive contribution of an individual marker. Because quantitative trait variability typically arises from multiple additive and non-additive sources, utilization of statistical approaches that include main and two-way interaction marker effects of several loci in one model could lead to unprecedented characterization of these sources. Here we examine the ability of one such approach, called the Stepwise Procedure for constructing an Additive and Epistatic Multi-Locus model (SPAEML), to detect additive and epistatic signals simulated using maize and human marker data. Our results revealed that SPAEML was capable of detecting quantitative trait nucleotides (QTNs) at sample sizes as low as n = 300 and consistently specifying signals as additive and epistatic for larger sizes. Sample size and minor allele frequency had a major influence on SPAEML’s ability to distinguish between additive and epistatic signals, while the number of markers tested did not. We conclude that SPAEML is a useful approach for providing further elucidation of the additive and epistatic sources contributing to trait variability when applied to a small subset of genome-wide markers located within specific genomic regions identified using a priori analyses.

UR - http://www.scopus.com/inward/record.url?scp=85056715346&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85056715346&partnerID=8YFLogxK
U2 - 10.1038/s41437-018-0162-2
DO - 10.1038/s41437-018-0162-2
M3 - Article
VL - 122
SP - 660
EP - 671
JO - Heredity
JF - Heredity
SN - 0018-067X
IS - 5
ER -