An algorithm for fast edit distance computation on GPUs

Reza Farivar, Harshit Kharbanda, Shivaram Venkataraman, R H Campbell

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

The problem of finding the edit distance between two sequences (and its closely related problem of longest common subsequence) are important problems with applications in many domains like virus scanners, security kernels, natural language translation and genome sequence alignment. The traditional dynamic-programming based algorithm is hard to parallelize on SIMD processors as the algorithm is memory intensive and has many divergent control paths. In this paper we introduce a new algorithm which modifies the dynamic programming method to reduce its amount of data storage and eliminate control flow divergences. Our algorithm divides the problem into independent 'quadrants' and makes efficient use of shared memory and registers available in GPUs to store data between different phases of the algorithm. Further, we eliminate any control flow divergences by embedding condition variables in the program logic to ensure all the threads execute the same instructions even though they work on different data items. We present an implementation of this algorithm on an NVIDIA GeForce GTX 275 GPU and compare against an optimized multi-threaded implementation on an Intel Core i7-920 quad core CPU with hyper-threading support. Our results show that our GPU implementation is up to 8x faster when operating on a large number of sequences.

Original languageEnglish (US)
Title of host publication2012 Innovative Parallel Computing, InPar 2012
DOIs
StatePublished - Dec 12 2012
Event2012 Innovative Parallel Computing, InPar 2012 - San Jose, CA, United States
Duration: May 13 2012May 14 2012

Publication series

Name2012 Innovative Parallel Computing, InPar 2012

Other

Other2012 Innovative Parallel Computing, InPar 2012
CountryUnited States
CitySan Jose, CA
Period5/13/125/14/12

Fingerprint

Dynamic programming
Flow control
Data storage equipment
Viruses
Program processors
Graphics processing unit
Genes

ASJC Scopus subject areas

  • Software

Cite this

Farivar, R., Kharbanda, H., Venkataraman, S., & Campbell, R. H. (2012). An algorithm for fast edit distance computation on GPUs. In 2012 Innovative Parallel Computing, InPar 2012 [6339593] (2012 Innovative Parallel Computing, InPar 2012). https://doi.org/10.1109/InPar.2012.6339593

An algorithm for fast edit distance computation on GPUs. / Farivar, Reza; Kharbanda, Harshit; Venkataraman, Shivaram; Campbell, R H.

2012 Innovative Parallel Computing, InPar 2012. 2012. 6339593 (2012 Innovative Parallel Computing, InPar 2012).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Farivar, R, Kharbanda, H, Venkataraman, S & Campbell, RH 2012, An algorithm for fast edit distance computation on GPUs. in 2012 Innovative Parallel Computing, InPar 2012., 6339593, 2012 Innovative Parallel Computing, InPar 2012, 2012 Innovative Parallel Computing, InPar 2012, San Jose, CA, United States, 5/13/12. https://doi.org/10.1109/InPar.2012.6339593
Farivar R, Kharbanda H, Venkataraman S, Campbell RH. An algorithm for fast edit distance computation on GPUs. In 2012 Innovative Parallel Computing, InPar 2012. 2012. 6339593. (2012 Innovative Parallel Computing, InPar 2012). https://doi.org/10.1109/InPar.2012.6339593
Farivar, Reza ; Kharbanda, Harshit ; Venkataraman, Shivaram ; Campbell, R H. / An algorithm for fast edit distance computation on GPUs. 2012 Innovative Parallel Computing, InPar 2012. 2012. (2012 Innovative Parallel Computing, InPar 2012).
@inproceedings{774abe0b118848a18e5f809413c8929a,
title = "An algorithm for fast edit distance computation on GPUs",
abstract = "The problem of finding the edit distance between two sequences (and its closely related problem of longest common subsequence) are important problems with applications in many domains like virus scanners, security kernels, natural language translation and genome sequence alignment. The traditional dynamic-programming based algorithm is hard to parallelize on SIMD processors as the algorithm is memory intensive and has many divergent control paths. In this paper we introduce a new algorithm which modifies the dynamic programming method to reduce its amount of data storage and eliminate control flow divergences. Our algorithm divides the problem into independent 'quadrants' and makes efficient use of shared memory and registers available in GPUs to store data between different phases of the algorithm. Further, we eliminate any control flow divergences by embedding condition variables in the program logic to ensure all the threads execute the same instructions even though they work on different data items. We present an implementation of this algorithm on an NVIDIA GeForce GTX 275 GPU and compare against an optimized multi-threaded implementation on an Intel Core i7-920 quad core CPU with hyper-threading support. Our results show that our GPU implementation is up to 8x faster when operating on a large number of sequences.",
author = "Reza Farivar and Harshit Kharbanda and Shivaram Venkataraman and Campbell, {R H}",
year = "2012",
month = "12",
day = "12",
doi = "10.1109/InPar.2012.6339593",
language = "English (US)",
isbn = "9781467326322",
series = "2012 Innovative Parallel Computing, InPar 2012",
booktitle = "2012 Innovative Parallel Computing, InPar 2012",

}

TY - GEN

T1 - An algorithm for fast edit distance computation on GPUs

AU - Farivar, Reza

AU - Kharbanda, Harshit

AU - Venkataraman, Shivaram

AU - Campbell, R H

PY - 2012/12/12

Y1 - 2012/12/12

N2 - The problem of finding the edit distance between two sequences (and its closely related problem of longest common subsequence) are important problems with applications in many domains like virus scanners, security kernels, natural language translation and genome sequence alignment. The traditional dynamic-programming based algorithm is hard to parallelize on SIMD processors as the algorithm is memory intensive and has many divergent control paths. In this paper we introduce a new algorithm which modifies the dynamic programming method to reduce its amount of data storage and eliminate control flow divergences. Our algorithm divides the problem into independent 'quadrants' and makes efficient use of shared memory and registers available in GPUs to store data between different phases of the algorithm. Further, we eliminate any control flow divergences by embedding condition variables in the program logic to ensure all the threads execute the same instructions even though they work on different data items. We present an implementation of this algorithm on an NVIDIA GeForce GTX 275 GPU and compare against an optimized multi-threaded implementation on an Intel Core i7-920 quad core CPU with hyper-threading support. Our results show that our GPU implementation is up to 8x faster when operating on a large number of sequences.

AB - The problem of finding the edit distance between two sequences (and its closely related problem of longest common subsequence) are important problems with applications in many domains like virus scanners, security kernels, natural language translation and genome sequence alignment. The traditional dynamic-programming based algorithm is hard to parallelize on SIMD processors as the algorithm is memory intensive and has many divergent control paths. In this paper we introduce a new algorithm which modifies the dynamic programming method to reduce its amount of data storage and eliminate control flow divergences. Our algorithm divides the problem into independent 'quadrants' and makes efficient use of shared memory and registers available in GPUs to store data between different phases of the algorithm. Further, we eliminate any control flow divergences by embedding condition variables in the program logic to ensure all the threads execute the same instructions even though they work on different data items. We present an implementation of this algorithm on an NVIDIA GeForce GTX 275 GPU and compare against an optimized multi-threaded implementation on an Intel Core i7-920 quad core CPU with hyper-threading support. Our results show that our GPU implementation is up to 8x faster when operating on a large number of sequences.

UR - http://www.scopus.com/inward/record.url?scp=84870723547&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84870723547&partnerID=8YFLogxK

U2 - 10.1109/InPar.2012.6339593

DO - 10.1109/InPar.2012.6339593

M3 - Conference contribution

AN - SCOPUS:84870723547

SN - 9781467326322

T3 - 2012 Innovative Parallel Computing, InPar 2012

BT - 2012 Innovative Parallel Computing, InPar 2012

ER -