TY - GEN
T1 - Exploiting Temporal Data Diversity for Detecting Safety-critical Faults in AV Compute Systems
AU - Jha, Saurabh
AU - Cui, Shengkun
AU - Tsai, Timothy
AU - Hari, Siva Kumar Sastry
AU - Sullivan, Michael B.
AU - Kalbarczyk, Zbigniew T.
AU - Keckler, Stephen W.
AU - Iyer, Ravishankar K.
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Silent data corruption caused by random hardware faults in autonomous vehicle (AV) computational elements is a significant threat to vehicle safety. Previous research has explored design diversity, data diversity, and duplication techniques to detect such faults in other safety-critical domains. However, these are challenging to use for AVs in practice due to significant resource overhead and design complexity. We propose, DiverseAV, a low-cost data-diversity-based redundancy technique for detecting safety-critical random hardware faults in computational elements. DiverseAV introduces data-diversity between the redundant agents by exploiting the temporal semantic consistency available in the AV sensor data. DiverseAV is a black-box technique that offers a plug-and-play solution as it requires no knowledge of the internals of the AI agent responsible for executing driving decisions, requiring little to no modification to the agent itself for achieving high coverage of transient and permanent hardware faults. It is commercially viable because it avoids software modifications to agents that are costly in terms of development and testing time. Specifically, DiverseAV distributes the sensor data between the two software agents in a round-robin manner. As a result, the sensor data for two consecutive time steps are semantically similar in terms of their worldview but significantly different at the bit level, thus ensuring the state and data diversity between the two agents necessary for detecting faults. We demonstrate DiverseAV using an open-source self-driving AI agent which is controlling a car in an open-source world simulator.
AB - Silent data corruption caused by random hardware faults in autonomous vehicle (AV) computational elements is a significant threat to vehicle safety. Previous research has explored design diversity, data diversity, and duplication techniques to detect such faults in other safety-critical domains. However, these are challenging to use for AVs in practice due to significant resource overhead and design complexity. We propose, DiverseAV, a low-cost data-diversity-based redundancy technique for detecting safety-critical random hardware faults in computational elements. DiverseAV introduces data-diversity between the redundant agents by exploiting the temporal semantic consistency available in the AV sensor data. DiverseAV is a black-box technique that offers a plug-and-play solution as it requires no knowledge of the internals of the AI agent responsible for executing driving decisions, requiring little to no modification to the agent itself for achieving high coverage of transient and permanent hardware faults. It is commercially viable because it avoids software modifications to agents that are costly in terms of development and testing time. Specifically, DiverseAV distributes the sensor data between the two software agents in a round-robin manner. As a result, the sensor data for two consecutive time steps are semantically similar in terms of their worldview but significantly different at the bit level, thus ensuring the state and data diversity between the two agents necessary for detecting faults. We demonstrate DiverseAV using an open-source self-driving AI agent which is controlling a car in an open-source world simulator.
UR - http://www.scopus.com/inward/record.url?scp=85136327862&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85136327862&partnerID=8YFLogxK
U2 - 10.1109/DSN53405.2022.00021
DO - 10.1109/DSN53405.2022.00021
M3 - Conference contribution
AN - SCOPUS:85136327862
T3 - Proceedings - 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2022
SP - 88
EP - 100
BT - Proceedings - 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 52nd Annual IEEE/IFIP International Conference on Dependable Systems and Networks, DSN 2022
Y2 - 27 June 2022 through 30 June 2022
ER -