Matching Methods for Observational Studies Derived from Large Administrative Databases

Ruoqi Yu, Jeffrey H. Silber, Paul R. Rosenbaum

Research output: Contribution to journal › Article › peer-review

Abstract

We propose new optimal matching techniques for large administrative data sets. In current practice, very large matched samples are constructed by subdividing the population and solving a series of smaller problems, for instance, matching men to men and separately matching women to women. Without simplification of some kind, the time required to optimally match T treated individuals to T controls selected from C ≥ T potential controls grows much faster than linearly with the number of people to be matched (the required time is of order O{(T + C)^3}), so splitting one large problem into many small problems greatly accelerates the computations. This common practice has several disadvantages that we describe. In its place, we propose a single match, using everyone, that accelerates the computations in a different way. In particular, we use an iterative form of Glover’s algorithm for a doubly convex bipartite graph to determine an optimal caliper for the propensity score, radically reducing the number of candidate matches; then we optimally match in a large but much sparser graph. In this graph, a modified form of near-fine balance can be used on a much larger scale, improving its effectiveness. We illustrate the method using data from US Medicaid, matching children receiving surgery at a children’s hospital to similar children receiving surgery at a hospital that mostly treats adults. In the example, we form 38,841 matched pairs from 159,527 potential controls, controlling for 29 covariates plus 463 Principal Surgical Procedures, plus 973 Principal Diagnoses. The method is implemented in an R package bigmatch available from CRAN.
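
The caliper step described in the abstract can be illustrated with a minimal Python sketch. This is not the bigmatch implementation, and it is not the iterative procedure of the paper. It assumes a single propensity score per person, uses the standard greedy rule for a convex bipartite graph (each treated person can be paired only with controls whose scores lie within the caliper, a contiguous run once the controls are sorted), and then searches by bisection for the smallest caliper that still lets every treated person be paired. The function names `max_matching_size` and `smallest_feasible_caliper` are hypothetical, and enumerating all pairwise score differences is a simplification that is only reasonable for small inputs.

```python
import heapq
from bisect import bisect_left, bisect_right


def max_matching_size(treated, controls, caliper):
    """Size of a maximum pair matching when |treated - control| <= caliper.

    Greedy rule for a convex bipartite graph: visit controls in score
    order and give each one to the still-unmatched treated person whose
    admissible run of controls ends soonest.
    """
    controls = sorted(controls)
    intervals = []
    for p in treated:
        # Indices of the first and last control inside the caliper.
        # With float scores, exact-boundary ties can depend on rounding.
        lo = bisect_left(controls, p - caliper)
        hi = bisect_right(controls, p + caliper) - 1
        if lo <= hi:
            intervals.append((lo, hi))
    intervals.sort()

    heap = []      # right endpoints of treated intervals that have opened
    nxt = 0        # next interval (by left endpoint) not yet opened
    matched = 0
    for j in range(len(controls)):
        while nxt < len(intervals) and intervals[nxt][0] <= j:
            heapq.heappush(heap, intervals[nxt][1])
            nxt += 1
        # Drop treated people whose admissible run ended before control j.
        while heap and heap[0] < j:
            heapq.heappop(heap)
        if heap:
            heapq.heappop(heap)  # pair control j with the tightest interval
            matched += 1
    return matched


def smallest_feasible_caliper(treated, controls):
    """Smallest caliper that lets every treated person be paired, or None.

    Assumption of this sketch: candidate calipers are the pairwise score
    differences, searched by bisection (feasibility is monotone in the
    caliper).
    """
    candidates = sorted({abs(p - q) for p in treated for q in controls})
    n = len(treated)
    if not candidates or max_matching_size(treated, controls, candidates[-1]) < n:
        return None
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if max_matching_size(treated, controls, candidates[mid]) == n:
            hi = mid
        else:
            lo = mid + 1
    return candidates[lo]


# Toy scores scaled to integers (hypothetical data, not from the paper).
print(smallest_feasible_caliper([10, 20, 30], [12, 13, 29, 50, 31]))  # 7
```

Once such a caliper is known, only treated-control pairs inside it need to be kept as edges, which is what makes the later optimal match run on a much sparser graph.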

Original language: English (US)
Pages (from-to): 338-355
Number of pages: 18
Journal: Statistical Science
Volume: 35
Issue number: 3
State: Published - Aug 2020
Externally published: Yes

Keywords

  • Causal inference
  • fine balance
  • Glover’s algorithm
  • observational study
  • optimal caliper
  • optimal matching
  • propensity score

ASJC Scopus subject areas

  • Statistics and Probability
  • General Mathematics
  • Statistics, Probability and Uncertainty
