TY - CONF
T1 - On the Importance of Firth Bias Reduction in Few-Shot Classification
AU - Ghaffari, Saba
AU - Saleh, Ehsan
AU - Forsyth, David A.
AU - Wang, Yu-Xiong
N1 - This work was supported in part by (1) the National Science Foundation’s Major Research Instrumentation program (Kindratenko et al., 2020), grant number 1725729, (2) the National Science Foundation’s Creating Knowledge with All-Novel-Class Computer Vision program, grant number 2106825, and (3) the Jump ARCHES endowment through the Health Care Engineering Systems Center. The majority of NSF funding contributions were designated to Ehsan Saleh and Saba Ghaffari equally in the form of computational resource allocations on high-performance computing platforms for conducting the experiments. Overall, this work consumed more than 32 CPU years and one Nvidia V-100 GPU year from the NSF-funded resource allocations in the course of its analysis. Also, this work made use of the Illinois Campus Cluster, a computing resource that is operated by the Illinois Campus Cluster Program (ICCP) in conjunction with the National Center for Supercomputing Applications (NCSA) and is supported by funds from the University of Illinois Urbana-Champaign.
PY - 2022
Y1 - 2022
N2 - Learning accurate classifiers for novel categories from very few examples, known as few-shot image classification, is a challenging task in statistical machine learning and computer vision. The performance in few-shot classification suffers from the bias in the estimation of classifier parameters; however, an effective underlying bias reduction technique that could alleviate this issue in training few-shot classifiers has been overlooked. In this work, we demonstrate the effectiveness of Firth bias reduction in few-shot classification. Theoretically, Firth bias reduction removes the O(N^-1) first-order term from the small-sample bias of the Maximum Likelihood Estimator. Here we show that the general Firth bias reduction technique simplifies to encouraging uniform class assignment probabilities for multinomial logistic classification, and has almost the same effect in cosine classifiers. We derive an easy-to-implement optimization objective for Firth penalized multinomial logistic and cosine classifiers, which is equivalent to penalizing the cross-entropy loss with a KL-divergence between the uniform label distribution and the predictions. Then, we empirically show that it is consistently effective across the board for few-shot image classification, regardless of (1) the feature representations from different backbones, (2) the number of samples per class, and (3) the number of classes. Furthermore, we demonstrate the effectiveness of Firth bias reduction in cross-domain and imbalanced data settings. Our implementation is available at https://github.com/ehsansaleh/firth_bias_reduction.
UR - http://www.scopus.com/inward/record.url?scp=85144507538&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85144507538&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:85144507538
T2 - 10th International Conference on Learning Representations, ICLR 2022
Y2 - 25 April 2022 through 29 April 2022
ER -