Dark side of risk (What your mother never told you about Time Warp)

D. M. Nicol, X. Liu

Research output: Contribution to conference (paper)

Abstract

This paper is a reminder of the danger of allowing 'risk' when synchronizing a parallel discrete-event simulation: a simulation code that runs correctly on a serial machine may, when run in parallel, fail catastrophically. This can happen when Time Warp presents an 'inconsistent' message to a logical process (LP), a message that makes absolutely no sense given the LP's state. Failure may result if the simulation modeler did not anticipate the possibility of this inconsistency. While the problem is not new, there has been little discussion of how to deal with it; furthermore, the problem may not be evident to new users or potential users of parallel simulation. This paper shows how the problem may occur, and the damage it may cause. We show how one may eliminate inconsistencies due to lagging rollbacks and stale state, but then show that so long as risk is allowed it is still possible for an LP to be placed in a state that is inconsistent with model semantics, again making it vulnerable to failure. We finally show how simulation code can be tested to ensure safe execution under a risk-free protocol. Whether risky or risk-free, we conclude that under current practice the development of correct and safe parallel simulation code is not transparent to the modeler; certain protections must be included in model code or model testing that are not rigorously necessary if the simulation were executed only serially.
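
The failure mode described in the abstract can be illustrated with a small sketch. It is not taken from the paper: the `QueueLP` class, its two handlers, and the `demo` function are hypothetical names, and the single-server queue is an assumed example model. The sketch contrasts an event handler that is safe only under serial execution with one that tolerates a message that makes no sense for the current state.

```python
from dataclasses import dataclass, field


@dataclass
class QueueLP:
    """Hypothetical logical process modeling a single-server queue."""
    jobs: list = field(default_factory=list)
    inconsistent: bool = False  # set when an impossible message is seen

    def handle_unsafe(self, kind, job=None):
        # Correct in a serial run: a departure is only ever scheduled
        # while a job is present, so the list is never empty here.
        if kind == "arrive":
            self.jobs.append(job)
        elif kind == "depart":
            return self.jobs.pop(0)  # raises IndexError on an empty list

    def handle_guarded(self, kind, job=None):
        # Same model logic, but an impossible message is recorded
        # instead of crashing. Assumption: the optimistic kernel will
        # later roll this LP back and cancel the offending message.
        if kind == "arrive":
            self.jobs.append(job)
        elif kind == "depart":
            if not self.jobs:
                self.inconsistent = True
                return None
            return self.jobs.pop(0)


def demo():
    # A departure delivered to an empty queue stands in for a message
    # that was sent speculatively and has not yet been cancelled.
    unsafe = QueueLP()
    try:
        unsafe.handle_unsafe("depart")
        crashed = False
    except IndexError:
        crashed = True

    guarded = QueueLP()
    result = guarded.handle_guarded("depart")
    return crashed, result, guarded.inconsistent
```

Calling `demo()` shows the unguarded handler raising an exception while the guarded one records the inconsistency and returns. A guard of this kind is only one possible protection; the abstract does not say which specific remedies the paper proposes, and it cautions that an LP can still reach a state inconsistent with model semantics as long as risk is allowed.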

Original language: English (US)
Pages: 188-195
Number of pages: 8
State: Published - Jan 1 1997
Externally published: Yes
Event: Proceedings of the 1997 11th Workshop on Parallel and Distributed Simulation - Lockenhaus, Austria
Duration: Jun 10 1997 to Jun 13 1997

Fingerprint

Discrete event simulation
Semantics
Testing

ASJC Scopus subject areas

  • Engineering (all)

Cite this

Nicol, D. M., & Liu, X. (1997). Dark side of risk (What your mother never told you about Time Warp). Paper presented at Proceedings of the 1997 11th Workshop on Parallel and Distributed Simulation, Lockenhaus, Austria, pp. 188-195.

@conference{9d7e7e049d9e44a8b41b7322b63eea9b,
title = "Dark side of risk (What your mother never told you about Time Warp)",
author = "Nicol, {D. M.} and X. Liu",
year = "1997",
month = "1",
day = "1",
language = "English (US)",
pages = "188--195",
note = "Proceedings of the 1997 11th Workshop on Parallel and Distributed Simulation ; Conference date: 10-06-1997 Through 13-06-1997",

}

TY - CONF
T1 - Dark side of risk (What your mother never told you about Time Warp)
AU - Nicol, D. M.
AU - Liu, X.
PY - 1997/1/1
Y1 - 1997/1/1
UR - http://www.scopus.com/inward/record.url?scp=0030652415&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0030652415&partnerID=8YFLogxK
M3 - Paper
AN - SCOPUS:0030652415
SP - 188
EP - 195
ER -