GPU-accelerated Hungarian algorithms for the Linear Assignment Problem

Ketan Date, Rakesh Nagi

Research output: Contribution to journalArticlepeer-review


In this paper, we describe parallel versions of two different variants (classical and alternating tree) of the Hungarian algorithm for solving the Linear Assignment Problem (LAP). We have chosen Compute Unified Device Architecture (CUDA) enabled NVIDIA Graphics Processing Units (GPU) as the parallel programming architecture because of its ability to perform intense computations on arrays and matrices. The main contribution of this paper is an efficient parallelization of the augmenting path search phase of the Hungarian algorithm. Computational experiments on problems with up to 25 million variables reveal that the GPU-accelerated versions are extremely efficient in solving large problems, as compared to their CPU counterparts. Tremendous parallel speedups are achieved for problems with up to 400 million variables, which are solved within 13 seconds on average. We also tested multi-GPU versions of the two variants on up to 16 GPUs, which show decent scaling behavior for problems with up to 1.6 billion variables and dense cost matrix structure.

Original languageEnglish (US)
Pages (from-to)52-72
Number of pages21
JournalParallel Computing
StatePublished - Sep 1 2016


  • CUDA
  • Graphics processing unit
  • Linear assignment problem
  • Parallel algorithm

ASJC Scopus subject areas

  • Software
  • Theoretical Computer Science
  • Hardware and Architecture
  • Computer Networks and Communications
  • Computer Graphics and Computer-Aided Design
  • Artificial Intelligence


Dive into the research topics of 'GPU-accelerated Hungarian algorithms for the Linear Assignment Problem'. Together they form a unique fingerprint.

Cite this