TY - JOUR
T1 - RSim
T2 - A reference-based normalization method via rank similarity
AU - Yuan, Bo
AU - Wang, Shulei
N1 - Publisher Copyright: © 2023 Yuan, Wang. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
PY - 2023/9
Y1 - 2023/9
N2 - Microbiome sequencing data normalization is crucial for eliminating technical bias and ensuring accurate downstream analysis. However, this process can be challenging due to the high frequency of zero counts in microbiome data. We propose a novel reference-based normalization method called normalization via rank similarity (RSim) that corrects sample-specific biases, even in the presence of many zero counts. Unlike other normalization methods, RSim does not require additional assumptions or treatments for the high prevalence of zero counts. This makes it robust and minimizes potential bias resulting from procedures that address zero counts, such as pseudo-counts. Our numerical experiments demonstrate that RSim reduces false discoveries, improves detection power, and reveals true biological signals in downstream tasks such as PCoA plotting, association analysis, and differential abundance analysis.
UR - http://www.scopus.com/inward/record.url?scp=85171133852&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85171133852&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1011447
DO - 10.1371/journal.pcbi.1011447
M3 - Article
C2 - 37656740
AN - SCOPUS:85171133852
SN - 1553-734X
VL - 19
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 9
M1 - e1011447
ER -