Mining historical issue repositories to heal large-scale online service systems

Rui Ding, Qiang Fu, Jian Guang Lou, Qingwei Lin, Dongmei Zhang, Tao Xie

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Online service systems have become increasingly popular and important. Reducing the MTTR (Mean Time to Restore) of a service remains one of the most important steps in assuring the user-perceived availability of the service. To reduce the MTTR, a common practice is to restore the service by identifying and applying an appropriate healing action. In this paper, we present an automated mining-based approach for suggesting an appropriate healing action for a given new issue. Our approach retrieves similar historical issues and adapts their healing actions to the new issue. We have applied our approach to a real-world, large-scale product online service. Studies on 243 real issues of the service show that our approach can effectively suggest appropriate healing actions (with 87% accuracy) to reduce the MTTR of the service. In addition, we study and categorize, according to issue characteristics, the issues for which automatic healing suggestion faces difficulties.
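The retrieval step described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not state how similarity is measured or how actions are adapted, so the token-level Jaccard similarity, the `Issue` record, and the `suggest_healing_action` function below are all assumptions made for illustration. The sketch simply returns the healing action of the most similar historical issue and omits the adaptation step.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Issue:
    """Hypothetical record of a resolved historical issue."""
    description: str
    healing_action: str  # the action that restored the service


def jaccard(a: set, b: set) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def suggest_healing_action(new_description: str,
                           history: List[Issue]) -> Optional[str]:
    """Return the healing action of the most similar historical issue.

    Similarity here is an assumed stand-in (word-set Jaccard), not the
    metric used in the paper. Returns None when there is no history.
    """
    new_tokens = set(new_description.lower().split())
    best = max(
        history,
        key=lambda issue: jaccard(new_tokens,
                                  set(issue.description.lower().split())),
        default=None,
    )
    return best.healing_action if best is not None else None


# Illustrative, made-up history; not data from the study.
history = [
    Issue("database connection timeout on frontend", "restart database proxy"),
    Issue("disk full on log server", "purge old logs"),
]
print(suggest_healing_action("frontend reports database timeout", history))
```

With the made-up history above, the new issue shares three tokens with the first record and none with the second, so the first record's action is suggested. A realistic system would likely need a richer issue representation than free-text tokens.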

Original language: English (US)
Title of host publication: Proceedings of the International Conference on Dependable Systems and Networks
Publisher: IEEE Computer Society
Pages: 311-322
Number of pages: 12
ISBN (Electronic): 9781479922338
State: Published - Sep 18 2014
Event: 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014 - Atlanta, United States
Duration: Jun 23 2014 → Jun 26 2014

Publication series

Name: Proceedings of the International Conference on Dependable Systems and Networks

Other

Other: 44th Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2014
Country/Territory: United States
City: Atlanta
Period: 6/23/14 → 6/26/14

Keywords

  • Online service system
  • healing action
  • incident management
  • issue repository

ASJC Scopus subject areas

  • Software
  • Hardware and Architecture
  • Computer Networks and Communications
