Impact of bias correction methods and increasing the biological samples in transcriptomic analysis

Dianelys González-Peña, Scott E. Nixon, Bruce R. Southey, Marcus A. Lawson, Robert H. McCusker, Robert Dantzer, Keith W. Kelley, Sandra Luisa Rodriguez-Zas

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, the identification of splice junctions, and the discovery of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g., genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment, and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking, and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected at both sample sizes (4 and 6 samples per group). The average number of DATs was 144.5 and 79 for the 4- and 6-sample subsets, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.

Original language: English (US)
Title of host publication: Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Publisher: International Society for Computers and Their Applications
Pages: 115-118
Number of pages: 4
ISBN (Print): 9781632665140
State: Published - Jan 1 2014
Event: 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 - Las Vegas, NV, United States
Duration: Mar 24 2014 → Mar 26 2014

Publication series

Name: Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014

Other

Other: 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
Country: United States
City: Las Vegas, NV
Period: 3/24/14 → 3/26/14

Fingerprint

RNA
Sample Size
Indoleamine-Pyrrole 2,3,-Dioxygenase
RNA Sequence Analysis
Macrophages
Hormones
Gene expression
Knockout Mice
Prolactin
Growth Hormone
Cluster Analysis
Brain
Genes
Genotype
Technology
Experiments

Keywords

  • Bias correction
  • RNA-Seq
  • Sample size
  • Transcriptome

ASJC Scopus subject areas

  • Information Systems
  • Health Informatics

Cite this

González-Peña, D., Nixon, S. E., Southey, B. R., Lawson, M. A., McCusker, R. H., Dantzer, R., ... Rodriguez-Zas, S. L. (2014). Impact of bias correction methods and increasing the biological samples in transcriptomic analysis. In Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014 (pp. 115-118). (Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014). International Society for Computers and Their Applications.

@inproceedings{1c8ace1bdf054a14a8aa4c5e72156fdd,
title = "Impact of bias correction methods and increasing the biological samples in transcriptomic analysis",
abstract = "RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, the identification of splice junctions, and the discovery of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g., genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment, and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking, and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected at both sample sizes (4 and 6 samples per group). The average number of DATs was 144.5 and 79 for the 4- and 6-sample subsets, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.",
keywords = "Bias correction, RNA-Seq, Sample size, Transcriptome",
author = "Dianelys Gonz{\'a}lez-Pe{\~n}a and Nixon, {Scott E.} and Southey, {Bruce R.} and Lawson, {Marcus A.} and McCusker, {Robert H.} and Robert Dantzer and Kelley, {Keith W.} and Rodriguez-Zas, {Sandra Luisa}",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
isbn = "9781632665140",
series = "Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014",
publisher = "International Society for Computers and Their Applications",
pages = "115--118",
booktitle = "Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014",

}

TY  - GEN
T1  - Impact of bias correction methods and increasing the biological samples in transcriptomic analysis
AU  - González-Peña, Dianelys
AU  - Nixon, Scott E.
AU  - Southey, Bruce R.
AU  - Lawson, Marcus A.
AU  - McCusker, Robert H.
AU  - Dantzer, Robert
AU  - Kelley, Keith W.
AU  - Rodriguez-Zas, Sandra Luisa
PY  - 2014/1/1
Y1  - 2014/1/1
AB  - RNA-sequencing (RNA-Seq) technologies enable the quantification of gene expression levels, the identification of splice junctions, and the discovery of novel transcripts. Comparing the number of sequences (reads) that map to transcripts under different conditions (e.g., genotypes) can detect differentially expressed genes and transcripts. RNA-Seq experiments typically have small sample sizes, which makes accurate detection of differentially abundant transcripts (DATs) challenging. Normalization, bias adjustment, and modeling of the RNA-Seq count data can further influence the detection and ranking of DATs. The objectives of this study were to assess the impact of sample size, normalization, and bias correction on the detection, ranking, and functional characterization of DATs. RNA-Seq data from the brain macrophages of wild-type and indoleamine 2,3-dioxygenase 1-knockout mice were used. Data subsets (4 and 6 samples per group) and a range of bias adjustments were tested using TopHat and Cufflinks routines. Prolactin variant 1 and growth hormone DATs were detected at both sample sizes (4 and 6 samples per group). The average number of DATs was 144.5 and 79 for the 4- and 6-sample subsets, respectively. Bias corrections affected the precision of the estimates, resulting in reranking of the DATs. Despite the additional DATs identified using bias adjustments, functional clustering remained stable. Identification of robust DAT sets requires the evaluation of complementary bias-correction approaches.
KW  - Bias correction
KW  - RNA-Seq
KW  - Sample size
KW  - Transcriptome
UR  - http://www.scopus.com/inward/record.url?scp=84905819573&partnerID=8YFLogxK
UR  - http://www.scopus.com/inward/citedby.url?scp=84905819573&partnerID=8YFLogxK
M3  - Conference contribution
AN  - SCOPUS:84905819573
SN  - 9781632665140
T3  - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
SP  - 115
EP  - 118
BT  - Proceedings of the 6th International Conference on Bioinformatics and Computational Biology, BICOB 2014
PB  - International Society for Computers and Their Applications
ER  -