TY - GEN
T1 - A flexible software architecture for high availability computing
AU - Iyer, R. K.
AU - Kalbarczyk, Z.
AU - Whisnant, K.
AU - Bagchi, S.
N1 - Funding Information:
play a key role in attaining high availability for substantially off-the-shelf applications. The system is designed to recover from faults in the application, hardware, operating system, and Chameleon components themselves. Chameleon by design ensures that there is no single point of failure. The following features differentiate Chameleon from other software-implemented approaches to fault tolerance: • Construction of fault-tolerant execution strategies from a comprehensive set of ARMORs. These fault-tolerant execution strategies can then be reused by different applications without modification. • Creation of ARMORs from a library of reusable basic building blocks and the ability to seamlessly integrate new ARMORs into existing fault-tolerant execution strategies through ARMOR factories. • Dynamic adaptation to changing fault tolerance requirements achieved through the run-time reconfiguration of ARMORs. • Flexibility of using the same computation nodes to concurrently execute applications with different availability requirements. • Hierarchical error detection and recovery whereby every ARMOR and user application is overseen by another ARMOR so that failure of a single ARMOR does not compromise the dependability of the system. • Operation in a network of heterogeneous computation nodes, including UNIX and Windows NT platforms. Acknowledgment: This work was supported in part by JPL – NASA Jet Propulsion Laboratory under contract JPL961345.
Publisher Copyright:
© 1998 IEEE.
PY - 1998
Y1 - 1998
N2 - Presents an overview of the Chameleon architecture for supporting a wide range of criticality requirements in a heterogeneous network environment. Chameleon employs ARMORs (Adaptive, Reconfigurable and Mobile Objects for Reliability) to synthesize different fault-tolerant configurations and to maintain run-time adaptation to changes in the fault tolerance requirements of an application. ARMORs have a flexible architecture that allows their composition to be reconfigured at run-time, i.e., the ARMORs may dynamically adapt to changing application requirements. In this paper, we focus on the detailed description of the ARMOR architecture, including ARMOR class hierarchy, basic building blocks, ARMOR composition and use of ARMOR factories. We describe how ARMORs can be reconfigured and reengineered, and demonstrate how the architecture serves our objective of providing an adaptive software infrastructure. Our experience with an early Chameleon implementation demonstrates that the proposed ARMOR architecture provides for a highly flexible and reconfigurable software infrastructure.
AB - Presents an overview of the Chameleon architecture for supporting a wide range of criticality requirements in a heterogeneous network environment. Chameleon employs ARMORs (Adaptive, Reconfigurable and Mobile Objects for Reliability) to synthesize different fault-tolerant configurations and to maintain run-time adaptation to changes in the fault tolerance requirements of an application. ARMORs have a flexible architecture that allows their composition to be reconfigured at run-time, i.e., the ARMORs may dynamically adapt to changing application requirements. In this paper, we focus on the detailed description of the ARMOR architecture, including ARMOR class hierarchy, basic building blocks, ARMOR composition and use of ARMOR factories. We describe how ARMORs can be reconfigured and reengineered, and demonstrate how the architecture serves our objective of providing an adaptive software infrastructure. Our experience with an early Chameleon implementation demonstrates that the proposed ARMOR architecture provides for a highly flexible and reconfigurable software infrastructure.
UR - http://www.scopus.com/inward/record.url?scp=39149115601&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=39149115601&partnerID=8YFLogxK
U2 - 10.1109/HASE.1998.731594
DO - 10.1109/HASE.1998.731594
M3 - Conference contribution
AN - SCOPUS:39149115601
T3 - Proceedings - 3rd IEEE International High-Assurance Systems Engineering Symposium, HASE 1998
SP - 42
EP - 49
BT - Proceedings - 3rd IEEE International High-Assurance Systems Engineering Symposium, HASE 1998
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 3rd IEEE International High-Assurance Systems Engineering Symposium, HASE 1998
Y2 - 13 November 1998 through 14 November 1998
ER -