TY - JOUR
T1 - Minimum Hellinger distance estimation for the analysis of count data
AU - Simpson, Douglas G.
N1 - Funding Information:
Douglas G. Simpson is Assistant Professor, Department of Statistics and Institute for Environmental Studies, University of Illinois, Champaign, IL 61820. This work was partially supported by National Science Foundation Contract DMS 8400602. The author thanks Barry H. Margolin for stimulating this research and Raymond J. Carroll and David Ruppert for helpful suggestions.
PY - 1987/9
Y1 - 1987/9
N2 - Minimum Hellinger distance (MHD) estimation is studied in the context of discrete data. The MHD estimator is shown to provide an effective treatment of anomalous data points, and its properties are illustrated using short-term mutagenicity test data. Asymptotic normality for a discrete distribution with countable support is derived under a readily verified condition on the model. Breakdown properties of the MHD estimator and an outlier screen are compared. Count data occur frequently in statistical applications. For instance, in chemical mutagenicity studies, which comprise an important step in the identification of environmental carcinogens, much of the resultant data are counts. Woodruff, Mason, Valencia, and Zimmering (1984) reported anomalous counts in the sex-linked recessive lethal test in Drosophila. These outliers can have a substantial impact on the experimental conclusions. MHD estimation provides a means for reliable inference when modeling count data that are prone to outliers. The MHD fit gives little weight to counts that are improbable relative to the model. On the other hand, the MHD estimator is asymptotically equivalent to the maximum likelihood estimator when the model is correct. This latter result, long known for a parametric multinomial model with finite support, is extended here to models with countable support. The breakdown point provides a quantification of outlier resistance. Roughly, it is the smallest proportion of outliers in the data that can cause an arbitrarily large shift in the estimate (Donoho and Huber 1983). Here the MHD estimator is shown to have an asymptotic breakdown point of ½ at the model.
AB - Minimum Hellinger distance (MHD) estimation is studied in the context of discrete data. The MHD estimator is shown to provide an effective treatment of anomalous data points, and its properties are illustrated using short-term mutagenicity test data. Asymptotic normality for a discrete distribution with countable support is derived under a readily verified condition on the model. Breakdown properties of the MHD estimator and an outlier screen are compared. Count data occur frequently in statistical applications. For instance, in chemical mutagenicity studies, which comprise an important step in the identification of environmental carcinogens, much of the resultant data are counts. Woodruff, Mason, Valencia, and Zimmering (1984) reported anomalous counts in the sex-linked recessive lethal test in Drosophila. These outliers can have a substantial impact on the experimental conclusions. MHD estimation provides a means for reliable inference when modeling count data that are prone to outliers. The MHD fit gives little weight to counts that are improbable relative to the model. On the other hand, the MHD estimator is asymptotically equivalent to the maximum likelihood estimator when the model is correct. This latter result, long known for a parametric multinomial model with finite support, is extended here to models with countable support. The breakdown point provides a quantification of outlier resistance. Roughly, it is the smallest proportion of outliers in the data that can cause an arbitrarily large shift in the estimate (Donoho and Huber 1983). Here the MHD estimator is shown to have an asymptotic breakdown point of ½ at the model.
KW - Asymptotic efficiency
KW - Breakdown point
KW - Discrete probability model
KW - Minimum-distance estimate
KW - Outlier screen
UR - http://www.scopus.com/inward/record.url?scp=0000167105&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0000167105&partnerID=8YFLogxK
U2 - 10.1080/01621459.1987.10478501
DO - 10.1080/01621459.1987.10478501
M3 - Article
AN - SCOPUS:0000167105
SN - 0162-1459
VL - 82
SP - 802
EP - 807
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 399
ER -