The design and implementation of a scalable deep learning benchmarking platform

Cheng Li, Abdul Dakkak, Jinjun Xiong, Wen Mei Hwu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution


The current Deep Learning (DL) landscape is fast-paced and rife with non-uniform models and hardware/software (HW/SW) stacks. Currently, there is no DL benchmarking platform to facilitate the evaluation and comparison of DL innovations, be they models, frameworks, libraries, or hardware. As a result, the current practice of evaluating the benefits of proposed DL innovations is both arduous and error-prone, stifling the adoption of the innovations. In this work, we first identify 10 design features that are desirable within a DL benchmarking platform. These features include performing the evaluation in a consistent, reproducible, and scalable manner, being framework and hardware agnostic, supporting real-world benchmarking workloads, and providing in-depth model execution inspection across the HW/SW stack levels. We then propose MLModelScope, a DL benchmarking platform that realizes these 10 design objectives. MLModelScope introduces a specification to define DL model evaluations and provides a runtime to provision the evaluation workflow using the user-specified HW/SW stack. MLModelScope defines abstractions for frameworks and supports a broad range of DL models and evaluation scenarios. We implement MLModelScope as an open-source project with support for all major frameworks and hardware architectures. Through MLModelScope's evaluation and automated analysis workflows, we perform a case-study analysis of 37 models across 4 systems and show how model, hardware, and framework selection affects model accuracy and performance under different benchmarking scenarios. We further demonstrate how MLModelScope's tracing capability gives a holistic view of model execution and helps pinpoint bottlenecks.

Original language: English (US)
Title of host publication: Proceedings - 2020 IEEE 13th International Conference on Cloud Computing, CLOUD 2020
Publisher: IEEE Computer Society
Number of pages: 12
ISBN (Electronic): 9781728187808
State: Published - Oct 2020
Event: 13th IEEE International Conference on Cloud Computing, CLOUD 2020 - Virtual, Beijing, China
Duration: Oct 18 2020 - Oct 24 2020

Publication series

Name: IEEE International Conference on Cloud Computing, CLOUD
ISSN (Print): 2159-6182
ISSN (Electronic): 2159-6190


Conference: 13th IEEE International Conference on Cloud Computing, CLOUD 2020
City: Virtual, Beijing


Keywords

  • Artificial Intelligence
  • Benchmarking
  • Deep Learning
  • Machine Learning
  • Software Engineering

ASJC Scopus subject areas

  • Artificial Intelligence
  • Information Systems
  • Software
