Characterizing cloud applications on a Google data center

Sheng Di, Derrick Kondo, Franck Cappello

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

In this paper, we characterize Google applications based on a one-month Google trace with over 650k jobs running across over 12000 heterogeneous hosts in a Google data center. First, we compute statistics about task events and resource utilization for Google applications, broken down by resource type (such as CPU and memory) and execution type (e.g., whether they can run batch tasks or not). Resource utilization per application closely follows the Pareto principle. Second, we classify applications via a K-means clustering algorithm with an optimized number of sets, based on task events and resource usage. The number of applications in the K-means clustering sets follows a Pareto-like distribution. We believe this work is valuable for further investigation of Cloud environments.
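As a rough illustration of the Pareto-style observation in the abstract, the sketch below computes how much of the total resource usage is consumed by the top fraction of applications. It is not the authors' method: the function name `top_share`, the 20% cutoff, and the sample numbers are all hypothetical and are not taken from the Google trace.

```python
import math


def top_share(usages, top_fraction=0.2):
    """Return the share of total usage consumed by the top `top_fraction` of applications."""
    if not usages:
        raise ValueError("usages must be non-empty")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    # Rank applications from heaviest to lightest consumer.
    ranked = sorted(usages, reverse=True)
    # Number of applications in the top group (at least one).
    k = max(1, math.ceil(top_fraction * len(ranked)))
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:k]) / total


# Hypothetical per-application CPU usage in arbitrary units (illustrative only).
example_usage = [500, 300, 40, 35, 30, 25, 25, 20, 15, 10]
print(top_share(example_usage))
```

A value near 0.8 for `top_fraction=0.2` would correspond to the familiar 80/20 reading of the Pareto principle. The abstract does not state which ratio the trace actually exhibits.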

Original language: English (US)
Title of host publication: Proceedings
Subtitle of host publication: International Conference on Parallel Processing - The 42nd Annual Conference, ICPP 2013
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 468-473
Number of pages: 6
ISBN (Print): 9780769551173
DOIs
State: Published - 2013
Event: 42nd Annual International Conference on Parallel Processing, ICPP 2013 - Lyon, France
Duration: Oct 1 2013 to Oct 4 2013

Publication series

Name: Proceedings of the International Conference on Parallel Processing
ISSN (Print): 0190-3918

Conference

Conference: 42nd Annual International Conference on Parallel Processing, ICPP 2013
Country/Territory: France
City: Lyon
Period: 10/1/13 to 10/4/13

ASJC Scopus subject areas

  • Software
  • General Mathematics
  • Hardware and Architecture
