TY - GEN
T1 - An object-oriented testbed for the evaluation of checkpointing and recovery systems
AU - Ramamurthy, B.
AU - Upadhyaya, S. J.
AU - Iyer, R. K.
N1 - Funding Information:
Acknowledgement: This research was supported in part by the Defense Advanced Research Projects Agency (DARPA) under contract DABT63-94-C-0045. The content of this paper does not necessarily reflect the position or policy of these agencies, and no endorsement should be inferred.
Publisher Copyright:
© 1997 IEEE.
PY - 1997
Y1 - 1997
N2 - The paper presents the design and development of an object-oriented testbed for simulation and analysis of checkpointing and recovery schemes in distributed systems. An important contribution of the testbed is a unified environment that provides a set of specialized components for easy and detailed simulation of checkpointing and recovery schemes. The testbed allows a designer to mix and match different components, either to study the effectiveness of a particular scheme or to freely experiment with hybrid designs before the actual implementation. The testbed also facilitates the evaluation of interdependencies among various parameters, such as communication and application dynamics, and their effect on the performance of checkpointing and recovery schemes. The implementation of the testbed as an extension of DEPEND, which is an integrated design and fault-injection environment, provides for unique system-level dependability analysis under realistic fault conditions, unlike existing simulation tools. The authors illustrate the versatility of the testbed by using four diverse applications, ranging from the comparison of the performance of two checkpointing and recovery schemes to the study of the effect of checkpoint size.
AB - The paper presents the design and development of an object-oriented testbed for simulation and analysis of checkpointing and recovery schemes in distributed systems. An important contribution of the testbed is a unified environment that provides a set of specialized components for easy and detailed simulation of checkpointing and recovery schemes. The testbed allows a designer to mix and match different components, either to study the effectiveness of a particular scheme or to freely experiment with hybrid designs before the actual implementation. The testbed also facilitates the evaluation of interdependencies among various parameters, such as communication and application dynamics, and their effect on the performance of checkpointing and recovery schemes. The implementation of the testbed as an extension of DEPEND, which is an integrated design and fault-injection environment, provides for unique system-level dependability analysis under realistic fault conditions, unlike existing simulation tools. The authors illustrate the versatility of the testbed by using four diverse applications, ranging from the comparison of the performance of two checkpointing and recovery schemes to the study of the effect of checkpoint size.
UR - http://www.scopus.com/inward/record.url?scp=85043483520&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85043483520&partnerID=8YFLogxK
U2 - 10.1109/FTCS.1997.614092
DO - 10.1109/FTCS.1997.614092
M3 - Conference contribution
AN - SCOPUS:85043483520
T3 - Digest of Papers - 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997
SP - 194
EP - 203
BT - Digest of Papers - 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 27th Annual International Symposium on Fault-Tolerant Computing, FTCS 1997
Y2 - 24 June 1997 through 27 June 1997
ER -