Learning from Imbalanced Data in Relational Domains: A Soft Margin Approach

Shuo Yang, Tushar Khot, Kristian Kersting, Gautam Kunapuli, Kris Hauser, Sriraam Natarajan

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We consider the problem of learning probabilistic models from relational data. One of the key issues with relational data is class imbalance, where negative examples far outnumber positive examples. The common approach to this problem is to sub-sample the negative examples. We instead consider a soft margin approach that explicitly trades off false positives against false negatives. We apply this approach to the recently successful formalism of relational functional gradient boosting. Specifically, we modify the objective function of the learning problem to explicitly include the trade-off between false positives and false negatives. We show empirically that this approach handles the class imbalance problem more successfully than the original framework, which weighted all examples equally.
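The trade-off described in the abstract can be illustrated with a minimal sketch. This is not the paper's objective or its relational learner: it is a generic class-weighted log-likelihood for a single sigmoid model, and the names `weighted_gradient`, `w_pos`, and `w_neg` are hypothetical, introduced only for illustration. It shows how per-class weights change the pointwise gradients that a functional gradient boosting step would fit.

```python
import math


def sigmoid(psi):
    """Map a real-valued potential psi to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-psi))


def weighted_gradient(y, psi, w_pos=1.0, w_neg=1.0):
    """Pointwise gradient of a class-weighted log-likelihood w.r.t. psi.

    With w_pos == w_neg == 1 this reduces to the usual I(y = 1) - P(y = 1),
    i.e. every example is weighted equally. The weights are an assumption
    of this sketch, not the formulation used in the paper.
    """
    p = sigmoid(psi)
    if y == 1:
        return w_pos * (1.0 - p)   # pushes psi up; missing a positive costs w_pos
    return -w_neg * p              # pushes psi down; a false alarm costs w_neg


# Toy imbalanced set: 1 positive and 9 negatives, all starting at psi = 0.
examples = [(1, 0.0)] + [(0, 0.0)] * 9

uniform = [weighted_gradient(y, psi) for y, psi in examples]
reweighted = [weighted_gradient(y, psi, w_pos=9.0, w_neg=1.0) for y, psi in examples]

print(sum(uniform))      # negatives dominate the total gradient
print(sum(reweighted))   # the single positive now balances the nine negatives
```

With equal weights the nine negatives dominate the summed gradient; raising `w_pos` lets the lone positive counterbalance them without discarding any negative examples, which is the contrast with sub-sampling that the abstract draws.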

Original language: English (US)
Title of host publication: Proceedings - 14th IEEE International Conference on Data Mining, ICDM 2014
Editors: Ravi Kumar, Hannu Toivonen, Jian Pei, Joshua Zhexue Huang, Xindong Wu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1085-1090
Number of pages: 6
Edition: January
ISBN (Electronic): 9781479943029
State: Published - Jan 1 2014
Externally published: Yes
Event: 14th IEEE International Conference on Data Mining, ICDM 2014 - Shenzhen, China
Duration: Dec 14 2014 → Dec 17 2014

Publication series

Name: Proceedings - IEEE International Conference on Data Mining, ICDM
Number: January
Volume: 2015-January
ISSN (Print): 1550-4786

Other

Other: 14th IEEE International Conference on Data Mining, ICDM 2014
Country/Territory: China
City: Shenzhen
Period: 12/14/14 → 12/17/14

ASJC Scopus subject areas

  • General Engineering
