Ranking-based name matching for author disambiguation in bibliographic data

Jialu Liu, Kin Hou Lei, Jeffery Yufei Liu, Chi Wang, Jiawei Han

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Author name ambiguity is a frequently encountered problem in digital publication libraries such as Microsoft Academic Search. It arises mainly because different authors may publish under the same name, while the same author may publish under several names due to abbreviations, nicknames, etc. Author disambiguation is the goal of Track II of the KDD Cup Data Mining Contest 2013. In this paper we introduce RankMatch, our ranking-based name matching algorithm and system. An important feature of our solution is its use of heterogeneous meta-paths to evaluate the similarity between two potential duplicate authors whose names are compatible. We participated under the team name "SmallData", and our final solution achieved a Mean F1 score of 99.157%, placing second in the contest.
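The idea in the abstract (filter to compatible names, then rank the candidates by a meta-path similarity) can be illustrated with a small sketch. This is not the RankMatch implementation: the compatibility rule, the choice of a PathSim-style measure over a hypothetical author-paper-venue-paper-author meta-path, and all names and counts below are assumptions made for illustration only.

```python
from collections import Counter


def names_compatible(a: str, b: str) -> bool:
    """Assumed rule: same last name, and given names agree in full or as initials."""
    ta = a.lower().replace(".", " ").split()
    tb = b.lower().replace(".", " ").split()
    if not ta or not tb or ta[-1] != tb[-1]:
        return False
    # zip stops at the shorter name, so a missing middle name is tolerated.
    for x, y in zip(ta[:-1], tb[:-1]):
        full_match = x == y
        initial_match = (len(x) == 1 and y.startswith(x)) or (
            len(y) == 1 and x.startswith(y)
        )
        if not (full_match or initial_match):
            return False
    return True


def pathsim(a: Counter, b: Counter) -> float:
    """PathSim-style score from per-venue paper counts (a stand-in measure)."""
    shared = sum(a[v] * b[v] for v in a.keys() & b.keys())
    norm = sum(c * c for c in a.values()) + sum(c * c for c in b.values())
    return 2 * shared / norm if norm else 0.0


def rank_candidates(target, authors):
    """Return (score, author_id) pairs for name-compatible authors, best first."""
    name, venues = authors[target]
    scored = [
        (pathsim(venues, other_venues), other_id)
        for other_id, (other_name, other_venues) in authors.items()
        if other_id != target and names_compatible(name, other_name)
    ]
    return sorted(scored, reverse=True)


# Hypothetical toy records: author id -> (name, papers per venue).
authors = {
    1: ("Jane Q. Doe", Counter({"V1": 2, "V2": 1})),
    2: ("J. Doe", Counter({"V1": 1})),
    3: ("J. Q. Doe", Counter({"V3": 3})),
    4: ("John Smith", Counter({"V1": 2})),
}
ranking = rank_candidates(1, authors)
```

Here author 4 is never scored because the last names differ, and author 2 ranks above author 3 because only author 2 shares a venue with the target. A real system would combine several meta-paths (for example shared coauthors or keywords) rather than venues alone.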

Original language: English (US)
Title of host publication: Proceedings of the 2013 KDD Cup 2013 Workshop
Publisher: Association for Computing Machinery
ISBN (Print): 9781450324953
State: Published - 2013
Externally published: Yes
Event: 2013 KDD Cup 2013 Workshop - Chicago, IL, United States
Duration: Aug 11 2013 to Aug 14 2013

Publication series

Name: Proceedings of the 2013 KDD Cup 2013 Workshop

Conference

Conference: 2013 KDD Cup 2013 Workshop
Country/Territory: United States
City: Chicago, IL
Period: 8/11/13 to 8/14/13

Keywords

  • name disambiguation
  • name matching

ASJC Scopus subject areas

  • Computer Networks and Communications
