TY - JOUR
T1 - Bias in estimates of the classic and incidence-based Jaccard similarity indices
T2 - Insights from assemblage simulation
AU - Cao, Y.
N1 - Funding Information:
The author gratefully acknowledges the constructive comments of P. Minchin on an earlier draft. This study was supported by Illinois Natural History Survey, University of Illinois. Discussions with C.P. Hawkins and J. Van Sickle helped the author to frame this study.
Publisher Copyright:
© Akadémiai Kiadó, Budapest.
PY - 2018/12
Y1 - 2018/12
N2 - Similarity indices are often used for measuring β-diversity and as the starting point of multivariate analysis. In this study, I used simulation to examine the direction and amount of bias in estimates of two similarity indices, the Jaccard coefficient (J) and incidence-based J (J^). I designed a novel simulation to generate three sets of assemblages that vary in species richness, species-occurrence distributions, and β-diversity. I characterized assemblage differences with the ratio of [proportion of rare species in all shared species / proportion of rare species in all unshared species] (i.e., PR_ss/PR_us) and the Pearson correlation in the probabilities of shared species between two assemblages (i.e., shared-species correlation). I found that J was subject to strong positive or negative bias, depending on PR_ss/PR_us. J^ was mainly subject to negative bias, which varied with shared-species correlation. In both indices, bias varied substantially from one pair of assemblages to another and among datasets. The high variation in the bias across different comparisons of assemblages may compromise β-diversity estimation established at low sampling efforts based on the two indices or their variants.
AB - Similarity indices are often used for measuring β-diversity and as the starting point of multivariate analysis. In this study, I used simulation to examine the direction and amount of bias in estimates of two similarity indices, the Jaccard coefficient (J) and incidence-based J (J^). I designed a novel simulation to generate three sets of assemblages that vary in species richness, species-occurrence distributions, and β-diversity. I characterized assemblage differences with the ratio of [proportion of rare species in all shared species / proportion of rare species in all unshared species] (i.e., PR_ss/PR_us) and the Pearson correlation in the probabilities of shared species between two assemblages (i.e., shared-species correlation). I found that J was subject to strong positive or negative bias, depending on PR_ss/PR_us. J^ was mainly subject to negative bias, which varied with shared-species correlation. In both indices, bias varied substantially from one pair of assemblages to another and among datasets. The high variation in the bias across different comparisons of assemblages may compromise β-diversity estimation established at low sampling efforts based on the two indices or their variants.
KW - Assemblage simulation
KW - Beta-diversity
KW - Estimating assemblage similarity
KW - Under-sampling
UR - http://www.scopus.com/inward/record.url?scp=85060706107&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85060706107&partnerID=8YFLogxK
U2 - 10.1556/168.2018.19.3.12
DO - 10.1556/168.2018.19.3.12
M3 - Article
AN - SCOPUS:85060706107
SN - 1585-8553
VL - 19
SP - 311
EP - 318
JO - Community Ecology
JF - Community Ecology
IS - 3
ER -