Anonymized user datasets are often released for research or industry applications. For example, t.qq.com released anonymized user profile, social interaction, and recommendation log data for KDD Cup 2012 to solicit recommendation algorithms. Because the entities (users and so on) and edges (links among entities) are of multiple types, the released social network is a heterogeneous information network. Prior work has shown how privacy can be compromised in homogeneous information networks by exploiting specific types of graph patterns. We show how the extra information derived from heterogeneity can be used to relax the assumptions those attacks rely on. To characterize and demonstrate this added threat, we formally define privacy risk in an anonymized heterogeneous information network, which identifies the vulnerability in the way such data may be released, and we present a new de-anonymization attack that exploits that vulnerability. Our attack successfully de-anonymized most individuals in the data: for an anonymized 1,000-user t.qq.com network of density 0.01, the attack precision is over 90% with a 2.3-million-user auxiliary network.
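
As an illustration of why heterogeneity can add risk, here is a minimal toy sketch. It is not the attack presented in the paper. It assumes each user is summarized by a count of outgoing links per link type, and it claims a match only when exactly one auxiliary user has the same signature. The link type names (`follow`, `mention`, `retweet`), the user ids, and all counts are hypothetical. Precision is taken here to mean the fraction of claimed matches that are correct, which may differ from the paper's definition.

```python
from collections import defaultdict


def signature(links, link_types):
    """Tuple of per-type link counts for one user (a hypothetical feature)."""
    return tuple(links.get(t, 0) for t in link_types)


def deanonymize(anonymized, auxiliary, link_types):
    """Map an anonymized id to an auxiliary id when its signature is unique."""
    index = defaultdict(list)
    for aux_id, links in auxiliary.items():
        index[signature(links, link_types)].append(aux_id)

    matches = {}
    for anon_id, links in anonymized.items():
        candidates = index.get(signature(links, link_types), [])
        if len(candidates) == 1:  # only claim a match when it is unambiguous
            matches[anon_id] = candidates[0]
    return matches


def precision(matches, truth):
    """Fraction of claimed matches that agree with the ground truth."""
    if not matches:
        return 0.0
    correct = sum(1 for anon_id, aux_id in matches.items()
                  if truth.get(anon_id) == aux_id)
    return correct / len(matches)


# Hypothetical auxiliary network: real identities with per-type link counts.
auxiliary = {
    "alice": {"follow": 3, "mention": 1, "retweet": 0},
    "bob": {"follow": 3, "mention": 0, "retweet": 2},
    "carol": {"follow": 3, "mention": 1, "retweet": 4},
}
# Hypothetical anonymized release: same structure, identities replaced.
anonymized = {
    "u1": {"follow": 3, "mention": 1, "retweet": 0},
    "u2": {"follow": 3, "mention": 0, "retweet": 2},
    "u3": {"follow": 3, "mention": 1, "retweet": 4},
}
truth = {"u1": "alice", "u2": "bob", "u3": "carol"}

# One link type only: every user looks the same, so nothing is matched.
homogeneous = deanonymize(anonymized, auxiliary, ["follow"])
# All link types: the signatures become distinct and every user is matched.
heterogeneous = deanonymize(anonymized, auxiliary,
                            ["follow", "mention", "retweet"])
```

In this toy data, a single link type leaves all three users indistinguishable, while combining three link types makes each signature unique. Real releases are noisy and far larger, so exact signature equality is only a stand-in for whatever matching criterion an actual attack would use.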

Original language: English (US)
Title of host publication: Advances in Database Technology - EDBT 2014
Subtitle of host publication: 17th International Conference on Extending Database Technology, Proceedings
Editors: Vincent Leroy, Vassilis Christophides, Stratos Idreos, Anastasios Kementsietsidis, Minos Garofalakis, Sihem Amer-Yahia
Publisher: OpenProceedings.org, University of Konstanz, University Library
Number of pages: 12
ISBN (Electronic): 9783893180653
State: Published - 2014
Event: 17th International Conference on Extending Database Technology, EDBT 2014 - Athens, Greece
Duration: Mar 24 2014 to Mar 28 2014


Keywords

  • Anonymization
  • Attack
  • Data mining
  • Privacy
  • Social networks

ASJC Scopus subject areas

  • Computer Science Applications
  • Information Systems
  • Software

