Disambiguation and co-authorship networks of the U.S. patent inventor database (1975-2010)

Guan Cheng Li, Ronald Lai, Alexander D'Amour, David M. Doolin, Ye Sun, Vetle I. Torvik, Amy Z. Yu, Fleming Lee

Research output: Contribution to journal › Article › peer-review


Research into invention, innovation policy, and technology strategy can greatly benefit from an accurate understanding of inventor careers. The United States Patent and Trademark Office does not provide unique inventor identifiers, however, making large-scale studies challenging. Many scholars of innovation have implemented ad-hoc disambiguation methods based on string similarity thresholds and string comparison matching; such methods have been shown to be vulnerable to a number of problems that can adversely affect research results. The authors address this issue by contributing (1) an application of the Author-ity disambiguation approach (Torvik et al., 2005; Torvik and Smalheiser, 2009) to the US utility patent database, (2) a new iterative blocking scheme that expands the match space of this algorithm while maintaining scalability, (3) a public posting of the algorithm and code, and (4) a public posting of the results of the algorithm in the form of a database of inventors and their associated patents. The paper provides an overview of the disambiguation method, assesses its accuracy, and calculates network measures based on co-authorship and collaboration variables. It illustrates the potential for large-scale innovation studies across time and space with visualizations of inventor mobility across the United States. The complete input and results data from the original disambiguation are available at (http://dvn.iq.harvard.edu/dvn/dv/patent); revised data described here are at (http://funglab.berkeley.edu/pub/disamb-no-postpolishing.csv); original and revised code is available at (https://github.com/funginstitute/disambiguator); visualizations of inventor mobility are at (http://funglab.berkeley.edu/mobility/).

Original language: English (US)
Pages (from-to): 941-955
Number of pages: 15
Journal: Research Policy
Issue number: 6
State: Published - Jul 2014


Keywords

  • Careers
  • Disambiguation
  • Inventors
  • Networks
  • Patents

ASJC Scopus subject areas

  • Strategy and Management
  • Management Science and Operations Research
  • Management of Technology and Innovation

