Consistent binary classification with generalized performance metrics

Oluwasanmi Koyejo, Nagarajan Natarajan, Pradeep Ravikumar, Inderjit S. Dhillon

Research output: Contribution to journal › Conference article

Abstract

Performance metrics for binary classification are designed to capture tradeoffs between four fundamental population quantities: true positives, false positives, true negatives and false negatives. Despite significant interest from theoretical and applied communities, little is known about either optimal classifiers or consistent algorithms for optimizing binary classification performance metrics beyond a few special cases. We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities. This family includes many well known binary classification metrics such as classification accuracy, AM measure, F-measure and the Jaccard similarity coefficient as special cases. Our analysis identifies the optimal classifiers as the sign of the thresholded conditional probability of the positive class, with a performance metric-dependent threshold. The optimal threshold can be constructed using simple plug-in estimators when the performance metric is a linear combination of the population quantities, but alternative techniques are required for the general case. We propose two algorithms for estimating the optimal classifiers, and prove their statistical consistency. Both algorithms are straightforward modifications of standard approaches to address the key challenge of optimal threshold selection, thus are simple to implement in practice. The first algorithm combines a plug-in estimate of the conditional probability of the positive class with optimal threshold selection. The second algorithm leverages recent work on calibrated asymmetric surrogate losses to construct candidate classifiers. We present empirical comparisons between these algorithms on benchmark datasets.
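For concreteness, the first algorithm in the abstract admits a short sketch: estimate the conditional probability of the positive class with a plug-in probability model, then pick the threshold that maximizes the empirical metric on held-out data, so the final classifier predicts positive exactly when the estimated probability exceeds a metric-dependent threshold. The sketch below is illustrative only, assuming scikit-learn's LogisticRegression as the plug-in estimator and the F-measure, written as the ratio 2·TP / (2·TP + FP + FN), as the example metric; the helper names (f_measure, plug_in_with_threshold), the 50/50 split, and the threshold grid are assumptions for illustration, not the authors' exact protocol.

# Minimal sketch of the plug-in algorithm (illustrative, not the authors' code).
# Assumes binary labels in {0, 1} and scikit-learn available.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def f_measure(tp, fp, tn, fn):
    # F1 as a ratio of linear combinations of the four population quantities:
    # 2*TP / (2*TP + FP + FN). TN is unused by F1 but kept for a generic signature.
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom > 0 else 0.0

def plug_in_with_threshold(X, y, metric=f_measure, n_grid=101):
    # One half of the sample fits the probability model; the other tunes the threshold.
    X_fit, X_val, y_fit, y_val = train_test_split(X, y, test_size=0.5, random_state=0)
    model = LogisticRegression().fit(X_fit, y_fit)   # plug-in estimate of P(y = 1 | x)
    p_val = model.predict_proba(X_val)[:, 1]

    best_delta, best_score = 0.5, -np.inf
    for delta in np.linspace(0.0, 1.0, n_grid):      # candidate thresholds
        y_hat = (p_val >= delta).astype(int)
        tp = int(np.sum((y_hat == 1) & (y_val == 1)))
        fp = int(np.sum((y_hat == 1) & (y_val == 0)))
        tn = int(np.sum((y_hat == 0) & (y_val == 0)))
        fn = int(np.sum((y_hat == 0) & (y_val == 1)))
        score = metric(tp, fp, tn, fn)
        if score > best_score:
            best_delta, best_score = delta, score

    # Final classifier: predict 1 when the estimated probability exceeds best_delta.
    return model, best_delta

Splitting the sample keeps the threshold choice independent of the probability fit, which mirrors the sample-splitting typically used in consistency analyses of such estimators.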

Original language: English (US)
Pages (from-to): 2744-2752
Number of pages: 9
Journal: Advances in Neural Information Processing Systems
Volume: 3
Issue number: January
State: Published - Jan 1 2014
Externally published: Yes
Event: 28th Annual Conference on Neural Information Processing Systems 2014, NIPS 2014 - Montreal, Canada
Duration: Dec 8 2014 – Dec 13 2014

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Information Systems
  • Signal Processing

Cite this

Consistent binary classification with generalized performance metrics. / Koyejo, Oluwasanmi; Natarajan, Nagarajan; Ravikumar, Pradeep; Dhillon, Inderjit S.

In: Advances in Neural Information Processing Systems, Vol. 3, No. January, 01.01.2014, p. 2744-2752.

Research output: Contribution to journal › Conference article

Koyejo, O, Natarajan, N, Ravikumar, P & Dhillon, IS 2014, 'Consistent binary classification with generalized performance metrics', Advances in Neural Information Processing Systems, vol. 3, no. January, pp. 2744-2752.
Koyejo, Oluwasanmi; Natarajan, Nagarajan; Ravikumar, Pradeep; Dhillon, Inderjit S. / Consistent binary classification with generalized performance metrics. In: Advances in Neural Information Processing Systems. 2014; Vol. 3, No. January. pp. 2744-2752.
@article{04d5dbbb24b14778924b8fea197de787,
title = "Consistent binary classification with generalized performance metrics",
abstract = "Performance metrics for binary classification are designed to capture tradeoffs between four fundamental population quantities: true positives, false positives, true negatives and false negatives. Despite significant interest from theoretical and applied communities, little is known about either optimal classifiers or consistent algorithms for optimizing binary classification performance metrics beyond a few special cases. We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities. This family includes many well known binary classification metrics such as classification accuracy, AM measure, F-measure and the Jaccard similarity coefficient as special cases. Our analysis identifies the optimal classifiers as the sign of the thresholded conditional probability of the positive class, with a performance metric-dependent threshold. The optimal threshold can be constructed using simple plug-in estimators when the performance metric is a linear combination of the population quantities, but alternative techniques are required for the general case. We propose two algorithms for estimating the optimal classifiers, and prove their statistical consistency. Both algorithms are straightforward modifications of standard approaches to address the key challenge of optimal threshold selection, thus are simple to implement in practice. The first algorithm combines a plug-in estimate of the conditional probability of the positive class with optimal threshold selection. The second algorithm leverages recent work on calibrated asymmetric surrogate losses to construct candidate classifiers. We present empirical comparisons between these algorithms on benchmark datasets.",
author = "Koyejo, Oluwasanmi and Natarajan, Nagarajan and Ravikumar, Pradeep and Dhillon, {Inderjit S.}",
year = "2014",
month = "1",
day = "1",
language = "English (US)",
volume = "3",
pages = "2744--2752",
journal = "Advances in Neural Information Processing Systems",
issn = "1049-5258",
number = "January",

}

TY - JOUR

T1 - Consistent binary classification with generalized performance metrics

AU - Koyejo, Oluwasanmi

AU - Natarajan, Nagarajan

AU - Ravikumar, Pradeep

AU - Dhillon, Inderjit S.

PY - 2014/1/1

Y1 - 2014/1/1

N2 - Performance metrics for binary classification are designed to capture tradeoffs between four fundamental population quantities: true positives, false positives, true negatives and false negatives. Despite significant interest from theoretical and applied communities, little is known about either optimal classifiers or consistent algorithms for optimizing binary classification performance metrics beyond a few special cases. We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities. This family includes many well known binary classification metrics such as classification accuracy, AM measure, F-measure and the Jaccard similarity coefficient as special cases. Our analysis identifies the optimal classifiers as the sign of the thresholded conditional probability of the positive class, with a performance metric-dependent threshold. The optimal threshold can be constructed using simple plug-in estimators when the performance metric is a linear combination of the population quantities, but alternative techniques are required for the general case. We propose two algorithms for estimating the optimal classifiers, and prove their statistical consistency. Both algorithms are straightforward modifications of standard approaches to address the key challenge of optimal threshold selection, thus are simple to implement in practice. The first algorithm combines a plug-in estimate of the conditional probability of the positive class with optimal threshold selection. The second algorithm leverages recent work on calibrated asymmetric surrogate losses to construct candidate classifiers. We present empirical comparisons between these algorithms on benchmark datasets.

AB - Performance metrics for binary classification are designed to capture tradeoffs between four fundamental population quantities: true positives, false positives, true negatives and false negatives. Despite significant interest from theoretical and applied communities, little is known about either optimal classifiers or consistent algorithms for optimizing binary classification performance metrics beyond a few special cases. We consider a fairly large family of performance metrics given by ratios of linear combinations of the four fundamental population quantities. This family includes many well known binary classification metrics such as classification accuracy, AM measure, F-measure and the Jaccard similarity coefficient as special cases. Our analysis identifies the optimal classifiers as the sign of the thresholded conditional probability of the positive class, with a performance metric-dependent threshold. The optimal threshold can be constructed using simple plug-in estimators when the performance metric is a linear combination of the population quantities, but alternative techniques are required for the general case. We propose two algorithms for estimating the optimal classifiers, and prove their statistical consistency. Both algorithms are straightforward modifications of standard approaches to address the key challenge of optimal threshold selection, thus are simple to implement in practice. The first algorithm combines a plug-in estimate of the conditional probability of the positive class with optimal threshold selection. The second algorithm leverages recent work on calibrated asymmetric surrogate losses to construct candidate classifiers. We present empirical comparisons between these algorithms on benchmark datasets.

UR - http://www.scopus.com/inward/record.url?scp=84937905822&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84937905822&partnerID=8YFLogxK

M3 - Conference article

AN - SCOPUS:84937905822

VL - 3

SP - 2744

EP - 2752

JO - Advances in Neural Information Processing Systems

JF - Advances in Neural Information Processing Systems

SN - 1049-5258

IS - January

ER -