TY - JOUR
T1 - Single fault-tolerant distributed shared memory using competitive update
AU - Kim, Jai Hoon
AU - Vaidya, Nitin H.
N1 - Funding Information:
* Corresponding author. E-mail: jhkim,[email protected], Web: http://www.cs.tamu.edu/faculty/vaidya. This work is supported in part by the National Science Foundation under grant MIP-9502563. A preliminary version of this paper was presented at Pacific Rim International Symposium on Fault-Tolerant Systems, December 1995.
PY - 1997/12/15
Y1 - 1997/12/15
N2 - In this paper, we propose a single fault-tolerant distributed shared memory (DSM) that uses a competitive update protocol. In this update protocol, multiple copies of each page may be maintained at different nodes. However, it is also possible for a page to exist at only one node, as some copies of the page may be invalidated. We propose an implementation that makes the competitive update protocol recoverable from a single node failure by guaranteeing that at least two copies of each page exist. We also present a mechanism that maintains consistency between shared data and process local state after recovery by updating shared data and process local state atomically. The paper presents an evaluation of the recoverable DSM using an implementation. It is shown that the overhead of making the DSM recoverable, measured in terms of the number of messages and the amount of data transferred, is small in many applications.
AB - In this paper, we propose a single fault-tolerant distributed shared memory (DSM) that uses a competitive update protocol. In this update protocol, multiple copies of each page may be maintained at different nodes. However, it is also possible for a page to exist at only one node, as some copies of the page may be invalidated. We propose an implementation that makes the competitive update protocol recoverable from a single node failure by guaranteeing that at least two copies of each page exist. We also present a mechanism that maintains consistency between shared data and process local state after recovery by updating shared data and process local state atomically. The paper presents an evaluation of the recoverable DSM using an implementation. It is shown that the overhead of making the DSM recoverable, measured in terms of the number of messages and the amount of data transferred, is small in many applications.
KW - Back-up
KW - Competitive update
KW - Fault-tolerant
KW - Recoverable distributed shared memory
UR - http://www.scopus.com/inward/record.url?scp=0031372045&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0031372045&partnerID=8YFLogxK
U2 - 10.1016/S0141-9331(97)00032-X
DO - 10.1016/S0141-9331(97)00032-X
M3 - Article
AN - SCOPUS:0031372045
SN - 0141-9331
VL - 21
SP - 183
EP - 196
JO - Microprocessors and Microsystems
JF - Microprocessors and Microsystems
IS - 3
ER -