TY - GEN
T1 - Understanding issue correlations
T2 - 6th ACM Symposium on Cloud Computing, ACM SoCC 2015
AU - Huang, Jian
AU - Zhang, Xuechen
AU - Schwan, Karsten
N1 - Publisher Copyright: © 2015 ACM.
PY - 2015/8/27
Y1 - 2015/8/27
N2 - Over the last decade, Hadoop has evolved into a widely used platform for Big Data applications. Acknowledging its widespread use, we present a comprehensive analysis of the solved issues with applied patches in the Hadoop ecosystem. The analysis is conducted with a focus on Hadoop's two essential components: HDFS (storage) and MapReduce (computation). It involves a total of 4218 solved issues over the last six years, covering 2180 issues from HDFS and 2038 issues from MapReduce. Insights derived from the study concern system design and development, particularly with respect to correlated issues and correlations between root causes of issues and characteristics of the Hadoop subsystems. These findings shed light on the future development of Big Data systems, on their testing, and on bug-finding tools.
KW - Big data
KW - Bug study
KW - Hadoop
KW - Issue correlation
UR - http://www.scopus.com/inward/record.url?scp=84958959260&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84958959260&partnerID=8YFLogxK
U2 - 10.1145/2806777.2806937
DO - 10.1145/2806777.2806937
M3 - Conference contribution
AN - SCOPUS:84958959260
T3 - ACM SoCC 2015 - Proceedings of the 6th ACM Symposium on Cloud Computing
SP - 2
EP - 15
BT - ACM SoCC 2015 - Proceedings of the 6th ACM Symposium on Cloud Computing
A2 - Balazinska, Magdalena
A2 - Freedman, Michael J.
A2 - Barahmand, Sumita
A2 - Ghandeharizadeh, Shahram
PB - Association for Computing Machinery, Inc
Y2 - 27 August 2015 through 29 August 2015
ER -