Dependability evaluation is an important, but difficult, aspect of the design of fault-tolerant parallel and distributed computing systems. One possible technique is to use Markov models but, if applied directly to realistic designs, this often results in large and intractable models. Many authors have investigated methods to avoid this explosive state-space growth, but have typically either solved the problem for a specific system design, or required manipulation of the model at the state-space level. Stochastic activity networks (SANs), a stochastic extension of Petri nets, together with recently developed reduced base model construction techniques, have the potential to avoid this state-space growth at the SAN level for many parallel and distributed systems. This paper investigates this claim by considering their application to three different systems: a fault-tolerant parallel computing system, a distributed database architecture, and a multiprocessor-multimemory system. We show that this method does indeed result in tractable Markov models for these systems, and argue that it can be applied to the dependability evaluation of many parallel and distributed systems.
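The state-space growth mentioned above can be made concrete with a small illustration. The sketch below is not the SAN-level reduced base model construction studied in the paper; it only shows the underlying idea of exploiting symmetry, under the assumption of `n` identical components with exponential failures and a single shared repair facility. The function names and the rates are hypothetical and chosen purely for illustration.

```python
def full_state_count(n: int) -> int:
    """States needed when each of n components is tracked individually (up or down)."""
    return 2 ** n


def lumped_state_count(n: int) -> int:
    """States needed when only the number of failed components is tracked."""
    return n + 1


def lumped_generator(n: int, fail_rate: float, repair_rate: float) -> list[list[float]]:
    """Generator matrix of the lumped Markov chain.

    State k means "k components have failed". Assumptions (not from the paper):
    components are identical and fail independently at rate fail_rate, and one
    repair facility restores a single component at rate repair_rate.
    """
    size = n + 1
    q = [[0.0] * size for _ in range(size)]
    for k in range(size):
        if k < n:
            q[k][k + 1] = (n - k) * fail_rate  # any of the n - k working components may fail
        if k > 0:
            q[k][k - 1] = repair_rate          # single shared repair facility
        q[k][k] = -sum(q[k])                   # diagonal makes each row sum to zero
    return q


if __name__ == "__main__":
    for n in (4, 10, 20):
        print(n, full_state_count(n), lumped_state_count(n))
```

Under these assumptions the lumped chain grows linearly in `n`, whereas tracking every component separately grows exponentially. This is the kind of reduction that symmetry-aware model construction aims to obtain without building the full state space first.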
ASJC Scopus subject areas
- Theoretical Computer Science
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence