TY - GEN
T1 - Modeling stream processing applications for dependability evaluation
AU - Jacques-Silva, Gabriela
AU - Kalbarczyk, Zbigniew T
AU - Gedik, Bugra
AU - Andrade, Henrique
AU - Wu, Kun-Lung
AU - Iyer, Ravishankar K
PY - 2011
Y1 - 2011
N2 - This paper describes a modeling framework for evaluating the impact of faults on the output of streaming applications. Our model is based on three abstractions: stream operators, stream connections, and tuples. By composing these abstractions within a Stochastic Activity Network, we allow the modeling of complete applications. We consider faults that lead to data loss and to silent data corruption (SDC). Our framework captures how faults originating in one operator propagate to other operators down the stream processing graph. We demonstrate the extensibility of our framework by evaluating three different fault tolerance techniques: checkpointing, partial graph replication, and full graph replication. We show that under crashes that lead to data loss, partial graph replication has a great advantage in maintaining the accuracy of the application output when compared to checkpointing. We also show that SDC can break the no data duplication guarantees of a full graph replication-based fault tolerance technique.
UR - http://www.scopus.com/inward/record.url?scp=80051920334&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=80051920334&partnerID=8YFLogxK
U2 - 10.1109/DSN.2011.5958256
DO - 10.1109/DSN.2011.5958256
M3 - Conference contribution
AN - SCOPUS:80051920334
SN - 9781424492336
T3 - Proceedings of the International Conference on Dependable Systems and Networks
SP - 430
EP - 441
BT - 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks, DSN 2011
T2 - 2011 IEEE/IFIP 41st International Conference on Dependable Systems and Networks, DSN 2011
Y2 - 27 June 2011 through 30 June 2011
ER -