Abstract
This paper proposes a general framework for detecting unsafe states of a system whose basic real-time parameters are captured by multiple sensors. Our approach is to learn a danger-level function that can alert users to dangerous situations in advance, so that measures can be taken to avoid a collapse. The main challenge in this learning problem is the labeling issue: it is difficult to assign an objective danger level to each time step of the training data, except at collapse points, where a definitive penalty can be assigned, and at successful ends, where a reward can be assigned. We treat the danger level as an expected future reward (a penalty is regarded as a negative reward) and use temporal difference (TD) learning to learn a function that approximates this expected future reward, given the current and historical sensor readings. TD learning obtains the approximation by propagating the penalties and rewards observable at collapse points or successful ends to the entire feature space, subject to some constraints. This avoids the labeling issue and naturally yields a general framework for detecting unsafe states. We apply the approach to monitoring driving safety, although it is not limited to that application, and the experimental results demonstrate its effectiveness.
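The propagation idea described above can be illustrated with a minimal TD(0) sketch. This is not the paper's implementation: the one-hot features, the linear value function, the rewards of -1 and +1, the step size `alpha`, the discount `gamma`, and all function names are assumptions chosen only to show how a reward observed at the end of an episode spreads back to earlier, unlabeled time steps.

```python
# Toy TD(0) with a linear value function (illustrative assumptions throughout).
# Hypothetical data: each state is a one-hot stand-in for a sensor feature vector.
N_STATES = 5


def one_hot(i, n=N_STATES):
    """Return a one-hot feature vector for state index i."""
    return [1.0 if j == i else 0.0 for j in range(n)]


def value(w, x):
    """Linear estimate of the expected future reward for features x."""
    return sum(wi * xi for wi, xi in zip(w, x))


def td0(episodes, n_features, alpha=0.1, gamma=0.95, sweeps=500):
    """Fit weights w by TD(0); only the last step of an episode has a reward."""
    w = [0.0] * n_features
    for _ in range(sweeps):
        for states, final_reward in episodes:
            for t, x in enumerate(states):
                if t + 1 < len(states):
                    # Intermediate step: no label, bootstrap from the next state.
                    target = gamma * value(w, states[t + 1])
                else:
                    # Episode end: the only place a penalty/reward is observed.
                    target = final_reward
                delta = target - value(w, x)  # TD error
                w = [wi + alpha * delta * xi for wi, xi in zip(w, x)]
    return w


# Two assumed trajectories: one ends in a collapse (-1), one ends safely (+1).
episodes = [
    ([one_hot(0), one_hot(1), one_hot(2)], -1.0),
    ([one_hot(0), one_hot(3), one_hot(4)], +1.0),
]
w = td0(episodes, N_STATES)
```

In this toy setting, the learned estimate for the state just before the collapse approaches -1, and the state one step earlier approaches `gamma` times that value, even though neither state was labeled directly. A lower estimate can then be read as a higher danger level.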
Original language | English (US) |
---|---|
Article number | 5200366 |
Pages (from-to) | 4-15 |
Number of pages | 12 |
Journal | IEEE Transactions on Intelligent Transportation Systems |
Volume | 11 |
Issue number | 1 |
State | Published - Mar 2010 |
Keywords
- Driving safety
- Labeling issue
- Multisensor
- Temporal difference (TD) learning
- Unsafe system state
ASJC Scopus subject areas
- Automotive Engineering
- Mechanical Engineering
- Computer Science Applications