Curated Open Citations Dataset

  • Dmitriy Korobskiy (Creator)
  • George Chacko (Creator)



This dataset is derived from COCI, the OpenCitations Index of Crossref open DOI-to-DOI references (Silvio Peroni, David Shotton (2020). OpenCitations, an infrastructure organization for open scholarship. Quantitative Science Studies, 1(1): 428-444). We have curated it to remove duplicates, self-loops, and parallel edges. These data were copied from the OpenCitations website on May 6, 2023 and subsequently processed to produce a node list and an edge list. Integer IDs have been assigned to the DOIs to reduce memory and storage needs when working with these data. As noted on the OpenCitations website, each record is a citing-cited pair that uses DOIs as persistent identifiers.
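The curation steps described above can be illustrated with a minimal Python sketch. It is not the creators' actual pipeline: the function name `curate`, the in-memory input of (citing, cited) pairs, the lowercasing of DOIs, and the reading of "duplicates" and "parallel edges" as repeated citing-cited pairs are all assumptions made for illustration.

```python
def curate(pairs):
    """Build a node list and an edge list from (citing_doi, cited_doi) pairs.

    Hypothetical helper for illustration only; not the dataset creators' code.
    """
    ids = {}       # normalized DOI -> integer id, assigned in order of first appearance
    edges = set()  # a set keeps each (citing_id, cited_id) pair only once

    for citing, cited in pairs:
        # Assumption: DOIs are compared case-insensitively after trimming whitespace.
        citing = citing.strip().lower()
        cited = cited.strip().lower()

        if citing == cited:
            continue  # drop self-loops (a record citing itself)

        u = ids.setdefault(citing, len(ids))
        v = ids.setdefault(cited, len(ids))
        edges.add((u, v))  # repeated pairs collapse into a single directed edge

    node_list = sorted(ids.items(), key=lambda item: item[1])  # (doi, integer_id)
    edge_list = sorted(edges)                                  # (citing_id, cited_id)
    return node_list, edge_list


# Made-up DOIs, used only to exercise the function.
sample = [
    ("10.1/a", "10.1/b"),
    ("10.1/A", "10.1/b"),  # same pair as above once case is normalized
    ("10.1/c", "10.1/c"),  # self-loop, dropped
    ("10.1/b", "10.1/a"),  # reverse direction is a distinct citation
]
nodes, edges = curate(sample)
```

Storing edges as pairs of integers rather than pairs of DOI strings is what reduces memory and storage: each DOI string appears once in the node list, and the edge list refers to it by number. In this sketch a DOI that occurs only in self-loops receives no integer ID.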
Date made available: Jun 6 2023
Publisher: University of Illinois Urbana-Champaign


  • bibliometrics
  • open citations
  • citation network
  • scientometrics
