In this paper, we propose a single-fault-tolerant distributed shared memory (DSM) that uses a competitive update protocol. Under this protocol, multiple copies of each page may be maintained at different nodes. However, because copies may be invalidated, a page can end up existing at only one node. We propose an implementation that makes the competitive update protocol recoverable from a single node failure by guaranteeing that at least two copies of each page exist. We also present a mechanism that keeps shared data and process local state consistent after recovery by updating the two atomically. We evaluate the recoverable DSM using an implementation. The results show that the overhead of making the DSM recoverable, measured in the number of messages and the amount of data transferred, is small for many applications.
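The two-copy guarantee can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a common formulation of competitive update in which each copy holder counts remote updates received since its last local access and drops its copy once a threshold is reached. The names `Page`, `limit`, `counters`, and `MIN_COPIES` are hypothetical, and messaging, failure detection, and the atomic update of process local state are omitted.

```python
from dataclasses import dataclass, field

# From the abstract: at least two copies of each page must exist.
MIN_COPIES = 2


@dataclass
class Page:
    # Hypothetical threshold: remote updates tolerated before a copy is dropped.
    limit: int
    value: int = 0
    # Node id -> remote updates received since that node's last local access.
    # The keys are the nodes that currently hold a copy of the page.
    counters: dict = field(default_factory=dict)

    def local_access(self, node):
        """A node that uses its own copy resets its update counter."""
        if node in self.counters:
            self.counters[node] = 0

    def write(self, writer, value):
        """Apply a write at `writer` and push the update to the other holders."""
        self.value = value
        self.counters[writer] = 0  # the writer holds an up-to-date copy
        for node in sorted(self.counters):  # sorted() copies the keys
            if node == writer:
                continue
            self.counters[node] += 1
            over_limit = self.counters[node] >= self.limit
            # Assumed recoverability rule: never invalidate a copy if that
            # would leave fewer than MIN_COPIES copies of the page.
            if over_limit and len(self.counters) > MIN_COPIES:
                del self.counters[node]

    def holders(self):
        """Return the set of nodes that currently hold a copy."""
        return set(self.counters)


page = Page(limit=2, counters={0: 0, 1: 0, 2: 0})
for v in range(1, 4):
    page.write(0, v)  # node 0 writes repeatedly; nodes 1 and 2 never access
print(page.holders())
```

In this run, one idle copy is invalidated as usual, but the last remaining remote copy is retained even though its counter has passed the threshold, so a failure of node 0 would still leave a surviving copy.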
Keywords
- Competitive update
- Recoverable distributed shared memory
ASJC Scopus subject areas
- Hardware and Architecture
- Computer Networks and Communications
- Artificial Intelligence