Discovering recurring anomalies in text reports regarding complex space systems

Authors :
B. Zane-Ulman
Ashok N. Srivastava
Source :
2005 IEEE Aerospace Conference.
Publication Year :
2005
Publisher :
IEEE, 2005.

Abstract

Many existing complex space systems have accumulated large historical maintenance and problem databases that are stored as unstructured text. The problem that we address in this paper is the discovery of recurring anomalies and relationships between problem reports that may indicate larger systemic problems. We illustrate our techniques on data from discrepancy reports regarding software anomalies in the Space Shuttle. These free-text reports are written by a number of different people, so the emphasis and wording vary considerably. We test four automatic methods of anomaly detection in text that are popular in the current literature on text mining. The first method is k-means or a Gaussian mixture model applied to the term-document matrix. The second method is the Sammon nonlinear map, which projects high-dimensional document vectors into two dimensions for visualization and clustering purposes. The third method is based on analyzing the results of a clustering method, expectation maximization on a mixture of von Mises-Fisher distributions, which represents each document as a point on a high-dimensional sphere; in this space, we perform clustering to obtain sets of similar documents. The fourth method is spectral clustering, a newer method in which vectors from the term-document matrix are embedded in a high-dimensional space for clustering. The paper concludes with recommendations regarding the development of an operational text mining system for analysis of problem reports that arise from complex space systems. We also contrast such systems with general-purpose text mining systems, illustrating the areas in which this system needs to be specialized for the space domain.
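
As a rough illustration of the clustering idea in the abstract, the sketch below builds a small term-document matrix, scales each document vector to unit length so that it lies on a sphere, and groups the documents with a hard-assignment spherical k-means. This is only a simplified stand-in for the k-means and von Mises-Fisher mixture approaches named above, not the authors' implementation. The four report texts, the function names, and the choice of seed documents are all invented for the example.

```python
import math
from collections import Counter


def unit_tf_vectors(docs):
    """Build term-frequency rows (one per document) scaled to unit length.

    Assumes every document contains at least one word.
    """
    vocab = sorted({w for d in docs for w in d.lower().split()})
    rows = []
    for d in docs:
        counts = Counter(d.lower().split())
        vec = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec))
        rows.append([x / norm for x in vec])
    return vocab, rows


def cosine(a, b):
    """Cosine similarity of two unit-length vectors (a plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


def spherical_kmeans(rows, seeds, iters=10):
    """Hard-assignment k-means on the unit sphere, seeded with given row indices."""
    centroids = [rows[i][:] for i in seeds]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each document to the centroid with the highest cosine similarity.
        labels = [
            max(range(len(centroids)), key=lambda k: cosine(r, centroids[k]))
            for r in rows
        ]
        # Recompute each centroid as the re-normalized sum of its members.
        for k in range(len(centroids)):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if not members:
                continue  # keep the old centroid if a cluster is empty
            total = [sum(col) for col in zip(*members)]
            norm = math.sqrt(sum(x * x for x in total))
            centroids[k] = [x / norm for x in total]
    return labels


# Hypothetical problem reports, invented for illustration only.
reports = [
    "timer overflow in guidance software",
    "guidance software timer overflow after reset",
    "display shows stale telemetry value",
    "stale telemetry value on crew display",
]
vocab, rows = unit_tf_vectors(reports)
labels = spherical_kmeans(rows, seeds=[0, 2])
print(labels)
```

Reports that share a cluster label are candidates for the same recurring anomaly. A practical system would presumably also need stop-word handling, term weighting, and a less arbitrary initialization than fixed seed documents.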

Details

Database :
OpenAIRE
Journal :
2005 IEEE Aerospace Conference
Accession number :
edsair.doi...........a6444ca3371474b286d7fa37fd531188
Full Text :
https://doi.org/10.1109/aero.2005.1559692