TY - JOUR
T1 - On the interpretability of conditional probability estimates in the agnostic setting
AU - Gao, Yihan
AU - Parameswaran, Aditya
AU - Peng, Jian
N1 - Publisher Copyright: © 2017 Institute of Mathematical Statistics. All rights reserved.
PY - 2017
Y1 - 2017
AB - We study the interpretability of conditional probability estimates for binary classification in the agnostic setting. In this setting, conditional probability estimates do not necessarily reflect the true conditional probabilities. Instead, they have a certain calibration property: among all data points for which the classifier predicts P(Y=1|X)=p, a fraction p actually have label Y=1. For cost-sensitive decision problems, this calibration property provides adequate support for using the Bayes decision rule. In this paper, we define a novel measure of the calibration property together with its empirical counterpart, and prove a uniform convergence result between them. This new measure enables us to formally justify the calibration property of conditional probability estimates. It also provides new insights into the problem of estimating and calibrating conditional probabilities, and allows us to reliably estimate the expected cost of decision rules when applied to an unlabeled dataset.
UR - http://www.scopus.com/inward/record.url?scp=85038407762&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85038407762&partnerID=8YFLogxK
U2 - 10.1214/17-EJS1376SI
DO - 10.1214/17-EJS1376SI
M3 - Article
AN - SCOPUS:85038407762
SN - 1935-7524
VL - 11
SP - 5198
EP - 5231
JO - Electronic Journal of Statistics
JF - Electronic Journal of Statistics
IS - 2
ER -