Restart-based fault-tolerance: System design and schedulability analysis

Fardin Abdi, Renato Mancuso, Rohan Tabish, Marco Caccamo

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution

Abstract

Embedded systems in safety-critical environments are continuously required to deliver more performance and functionality, while being expected to provide verified safety guarantees. Nonetheless, platform-wide software verification (required for safety) is often expensive. Therefore, design methods that enable the use of components such as real-time operating systems (RTOS), without requiring their correctness to guarantee safety, are necessary. In this paper, we propose a design approach to deploy safe-by-design embedded systems. To attain this goal, we rely on a small core of verified software to handle faults in applications and the RTOS and to recover from them, while ensuring that the timing constraints of safety-critical tasks are always satisfied. Faults are detected by monitoring application timing, and fault recovery is achieved via full platform restart and software reload, enabled by the short restart time of embedded systems. Schedulability analysis is used to ensure that the timing constraints of critical plant control tasks are always satisfied in spite of faults and consequent restarts. We derive schedulability results for four restart-tolerant task models. We use a simulator to evaluate and compare the performance of the considered scheduling models.
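
To make the idea of a restart-aware schedulability check concrete, the following is a minimal sketch and not the analysis derived in the paper. It uses the classic fixed-priority response-time iteration and adds a constant restart penalty to the job under analysis. The penalty model (at most one restart per response window, costing the restart time plus the longest job that could lose its progress) and all names such as response_time and restart_time are assumptions made for illustration only.

```python
import math


def response_time(tasks, i, restart_penalty=0):
    """Worst-case response time of task i, or None if it misses its deadline.

    tasks: list of (wcet, period) tuples sorted from highest to lowest
    priority, with deadline equal to period. restart_penalty is a
    hypothetical constant added once to the demand of the analysed job.
    """
    wcet_i, period_i = tasks[i]
    demand = wcet_i + restart_penalty
    r = demand
    while True:
        # Interference from all higher-priority tasks released within r.
        interference = sum(
            math.ceil(r / period_j) * wcet_j for wcet_j, period_j in tasks[:i]
        )
        r_next = demand + interference
        if r_next > period_i:
            return None  # deadline (= period) would be missed
        if r_next == r:
            return r  # fixed point reached
        r = r_next


# Hypothetical task set: (wcet, period) in integer time units.
tasks = [(1, 4), (2, 6), (3, 12)]
restart_time = 1  # assumed platform restart and reload time

for i in range(len(tasks)):
    # Assumed penalty: restart time plus the longest job of equal or
    # higher priority whose progress could be lost by the restart.
    penalty = restart_time + max(wcet for wcet, _ in tasks[: i + 1])
    print(i, response_time(tasks, i), response_time(tasks, i, penalty))
```

A task set that is schedulable without faults may fail this check once the restart penalty is included, which is the kind of effect a restart-tolerant analysis has to account for.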

Original language: English (US)
Title of host publication: RTCSA 2017 - 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538618981
State: Published - Sep 19 2017
Event: 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA 2017 - Hsinchu, Taiwan, Province of China
Duration: Aug 16 2017 to Aug 18 2017

Publication series

Name: RTCSA 2017 - 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications

Other

Other: 23rd IEEE International Conference on Embedded and Real-Time Computing Systems and Applications, RTCSA 2017
Country/Territory: Taiwan, Province of China
City: Hsinchu
Period: 8/16/17 to 8/18/17

ASJC Scopus subject areas

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
