On-line failure prediction in safety-critical systems

Authors :
Marco Rizzuto
Luca Montanari
Roberto Baldoni
Publication Year :
2015
Publisher :
Elsevier, 2015.

Abstract

In safety-critical systems such as Air Traffic Control systems, SCADA systems, and Railway Control Systems, there has been a rapid transition from monolithic systems to highly modular ones that use off-the-shelf hardware and software applications, possibly developed by different manufacturers. This shift has increased the probability that a fault occurring in one application propagates to others, with the risk of a failure of the entire safety-critical system. This calls for new tools for the on-line detection of anomalous system behavior, thus predicting a system failure before it happens and allowing the deployment of appropriate mitigation policies. The paper proposes a novel architecture, named CASPER, for online failure prediction with two distinctive features: it is (i) black-box: no knowledge of application internals or of the logic of the system is required; and (ii) non-intrusive: no status information of the components, such as CPU or memory usage, is used. The architecture has been implemented to predict failures in a real Air Traffic Control System. CASPER exhibits a high degree of accuracy in predicting failures with a low false positive rate. The experimental validation shows that operators are provided with predictions issued a few hundred seconds before the occurrence of the failure.

Non-intrusive and black-box effective online failure prediction. We monitor only network traffic to perform online failure prediction. Application agnostic: no knowledge of application logic is required. We use complex event processing to produce a representation of the system state. We use hidden Markov models to create a state recognizer.
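
The abstract's last point, a hidden Markov model used as a state recognizer, can be illustrated with a minimal sketch. This is not the CASPER implementation: the two hidden states ("safe", "unsafe"), the two observation symbols ("normal", "anomalous"), every probability, and the function names `filter_states` and `first_alert` are hypothetical choices made for illustration. The sketch also assumes that an earlier step (complex event processing in the paper) has already reduced the monitored network traffic to a sequence of discrete symbols.

```python
# Hypothetical two-state model. Every number below is an illustrative
# assumption, not a parameter taken from the paper.
STATES = ("safe", "unsafe")
START = {"safe": 0.95, "unsafe": 0.05}
TRANS = {
    "safe": {"safe": 0.9, "unsafe": 0.1},
    "unsafe": {"safe": 0.05, "unsafe": 0.95},
}
EMIT = {
    "safe": {"normal": 0.9, "anomalous": 0.1},
    "unsafe": {"normal": 0.3, "anomalous": 0.7},
}


def filter_states(symbols):
    """Normalized forward algorithm.

    Returns, for each position, P(hidden state | symbols seen so far).
    """
    beliefs = []
    belief = None
    for sym in symbols:
        if belief is None:
            # First symbol: start from the initial state distribution.
            prior = dict(START)
        else:
            # Predict step: push the previous belief through the transitions.
            prior = {
                s: sum(belief[p] * TRANS[p][s] for p in STATES) for s in STATES
            }
        # Update step: weight by the emission likelihood, then normalize.
        unnorm = {s: prior[s] * EMIT[s][sym] for s in STATES}
        total = sum(unnorm.values())
        belief = {s: unnorm[s] / total for s in STATES}
        beliefs.append(belief)
    return beliefs


def first_alert(symbols, threshold=0.8):
    """Index of the first symbol at which P(unsafe) exceeds the threshold."""
    for i, belief in enumerate(filter_states(symbols)):
        if belief["unsafe"] > threshold:
            return i
    return None
```

With a model of this kind, a prediction is issued as soon as the estimated probability of the "unsafe" state crosses the chosen threshold, which is how a warning can precede the failure itself. In practice the transition and emission probabilities would be learned from recorded traces rather than set by hand.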

Details

Language :
English
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....ff8f4437f3d9198dd2e55636e81191a5