A variance minimization criterion to active learning on graphs

Ming Ji, Jiawei Han

Research output: Contribution to journalArticle

Abstract

We consider the problem of active learning over the vertices in a graph, without feature representation. Our study is based on the common graph smoothness assumption, which is formulated in a Gaussian random field model. We analyze the probability distribution over the unlabeled vertices conditioned on the label information, which is a multivariate normal with the mean being the harmonic solution over the field. Then we select the nodes to label such that the total variance of the distribution on the unlabeled data, as well as the expected prediction error, is minimized. In this way, the classifier we obtain is theoretically more robust. Compared with existing methods, our algorithm has the advantage of selecting data in a batch offline mode with solid theoretical support. We show improved performance over existing label selection criteria on several real world data sets.

Original languageEnglish (US)
Pages (from-to)556-564
Number of pages9
JournalJournal of Machine Learning Research
Volume22
StatePublished - 2012

ASJC Scopus subject areas

  • Control and Systems Engineering
  • Software
  • Statistics and Probability
  • Artificial Intelligence

Fingerprint Dive into the research topics of 'A variance minimization criterion to active learning on graphs'. Together they form a unique fingerprint.

  • Cite this