Text mining of accident reports using semi-supervised keyword extraction and topic modeling.

Authors :
Ahadh, Abdhul
Binish, Govind Vallabhasseri
Srinivasan, Rajagopalan
Source :
Process Safety & Environmental Protection: Transactions of the Institution of Chemical Engineers Part B. Nov 2021, Vol. 155, p455-465. 11p.
Publication Year :
2021

Abstract

Learning from past incidents is critical to achieving and maintaining high process safety performance. Accident and incident records offer one way to learn; however, these are usually in the form of unstructured text, which makes analysis difficult. Recently, text mining methods based on supervised learning have been proposed for analyzing accident reports; however, they require an impractically large number of labeled records as training examples. This paper proposes an automated, semi-supervised, domain-independent approach for analyzing accident reports. Given a set of user-defined classification topics and domain literature such as handbooks, glossaries, and Wikipedia articles, the method can identify domain-specific keywords and group them into topics with minimal expert involvement. These keywords and topics can then be used for various data mining purposes, including classification. The proposed approach is demonstrated using two case studies in different domains: (1) in aviation, to identify the stage of flight when an accident occurs, and (2) in the process industry, to identify the cause of pipeline accidents. The average classification accuracy of the proposed method was 80%, which is comparable to that of supervised learning methods. The key benefit of this approach is that it can generate domain-specific predictive models with limited manual intervention. [ABSTRACT FROM AUTHOR]
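
The general idea in the abstract (derive topic keywords from domain literature, then classify reports by those keywords) can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes a simple term-frequency score weighted by how few topics share a word, and a keyword-overlap classifier. The topic names, the domain texts, the stopword list, and the function names extract_keywords and classify are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for this toy example.
STOPWORDS = {"a", "an", "and", "the", "on", "during", "after"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def extract_keywords(topic_docs, top_k=10):
    """Pick up to top_k words per topic that are frequent in that topic's
    domain text and rare across the other topics (an assumed scoring rule)."""
    counts = {topic: Counter(tokenize(doc)) for topic, doc in topic_docs.items()}
    n_topics = len(counts)
    keywords = {}
    for topic, tf in counts.items():
        scores = {}
        for word, freq in tf.items():
            # Number of topics whose domain text contains this word.
            df = sum(1 for c in counts.values() if word in c)
            scores[word] = freq * math.log(n_topics / df)
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        # Words shared by every topic score zero and are dropped.
        keywords[topic] = {w for w in ranked[:top_k] if scores[w] > 0}
    return keywords


def classify(report, keywords):
    """Return the topic whose keywords occur most often in the report,
    or None when no keyword occurs at all."""
    tokens = Counter(tokenize(report))
    hits = {topic: sum(tokens[w] for w in kws) for topic, kws in keywords.items()}
    best = max(hits, key=hits.get)
    return best if hits[best] > 0 else None


# Hypothetical stand-ins for user-defined topics and their domain literature.
topic_docs = {
    "takeoff": "During takeoff the aircraft accelerates along the runway, "
               "rotates, and begins the initial climb.",
    "landing": "During landing the aircraft descends on approach, flares, "
               "and touches down on the runway.",
}
keywords = extract_keywords(topic_docs)

print(classify("Engine failure during the initial climb after takeoff.", keywords))
print(classify("The aircraft bounced after an unstable approach.", keywords))
```

In this sketch the only manual input is the topic list and a pointer to domain text per topic; no labeled accident reports are needed, which is the property the abstract emphasizes.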

Details

Language :
English
ISSN :
0957-5820
Volume :
155
Database :
Academic Search Index
Journal :
Process Safety & Environmental Protection: Transactions of the Institution of Chemical Engineers Part B
Publication Type :
Academic Journal
Accession number :
153659109
Full Text :
https://doi.org/10.1016/j.psep.2021.09.022