Optimizing R VM: Allocation removal and path length reduction via interpreter-level specialization

Haichuan Wang, Peng Wu, David Padua

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

The performance of R, a popular data analysis language, has never been properly understood. Some have claimed that their R code runs as efficiently as native code; others have reported orders-of-magnitude slowdowns of R code relative to equivalent C implementations. We found both claims to be true, depending on how the R code is written. This paper introduces a first classification of R programming styles into Type I (looping over data), Type II (vector programming), and Type III (glue code). The most serious overheads of R manifest mostly in Type I code, whereas much Type III code can be quite fast. This paper focuses on improving the performance of Type I R code. We propose the ORBIT VM, an extension of the GNU R VM, which aggressively removes allocated objects and reduces instruction path lengths in the GNU R VM via profile-driven specialization techniques. The ORBIT VM is fully compatible with the R language and is based purely on interpreted execution. It is a specialization JIT and runtime that focuses on data representation specialization and operation specialization. On our benchmarks of Type I R code, ORBIT achieves an average speedup of 3.5X over the current release of the GNU R VM and outperforms most other currently available R optimization projects.

Original language: English (US)
Title of host publication: Proceedings of the 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014
Publisher: Association for Computing Machinery
Pages: 295-305
Number of pages: 11
ISBN (Print): 9781450326704
State: Published - 2014
Event: 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014 - Orlando, FL, United States
Duration: Feb 15 2014 → Feb 19 2014

Publication series

Name: Proceedings of the 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014

Other

Other: 12th ACM/IEEE International Symposium on Code Generation and Optimization, CGO 2014
Country/Territory: United States
City: Orlando, FL
Period: 2/15/14 → 2/19/14

Keywords

  • Dynamic scripting language
  • R
  • Specialization

ASJC Scopus subject areas

  • Software
  • Computational Theory and Mathematics
  • Applied Mathematics
