Survey on distance metric learning and dimensionality reduction in data mining

Fei Wang, Jimeng Sun

Research output: Contribution to journal › Article › peer-review

Abstract

Distance metric learning is a fundamental problem in data mining and knowledge discovery. Many representative data mining algorithms, such as the k-nearest neighbor classifier, hierarchical clustering, and spectral clustering, rely heavily on the underlying distance metric to correctly measure relations among input data. In recent years, many studies have demonstrated, either theoretically or empirically, that learning a good distance metric can greatly improve the performance of classification, clustering, and retrieval tasks. In this survey, we review existing distance metric learning approaches within a common framework. Specifically, depending on the supervision information available during the distance metric learning process, we categorize each distance metric learning algorithm as supervised, unsupervised, or semi-supervised. We compare these different types of metric learning methods and point out their strengths and limitations. Finally, we summarize open challenges in distance metric learning and propose future research directions.

Original language: English (US)
Pages (from-to): 534-564
Number of pages: 31
Journal: Data Mining and Knowledge Discovery
Volume: 29
Issue number: 2
State: Published - Jan 1 2014
Externally published: Yes

Keywords

  • Data mining
  • Dimensionality reduction
  • Distance metric learning

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computer Networks and Communications
