Biswas, Rahul, Blackburn, Lindy, Cao, Junwei, Essick, Reed, Hodge, Kari Alison, Katsavounidis, Erotokritos, Kyungmin Kim, Young-Min Kim, Bigot, Eric-Olivier Le, Chang-Hwan Lee, Oh, John J., Sang Hoon Oh, Son, Edwin J., Ye Tao, Vaulin, Ruslan, and Xiaoge Wang
The sensitivity of searches for astrophysical transients in data from the Laser Interferometer Gravitational-wave Observatory (LIGO) is generally limited by the presence of transient, non-Gaussian noise artifacts, which occur at a high enough rate such that accidental coincidence across multiple detectors is non-negligible. These "glitches" can easily be mistaken for transient gravitational-wave signals, and their robust identification and removal will help any search for astrophysical gravitational waves. We apply machine-learning algorithms (MLAs) to the problem, using data from auxiliary channels within the LIGO detectors that monitor degrees of freedom unaffected by astrophysical signals. Noise sources may produce artifacts in these auxiliary channels as well as the gravitational-wave channel. The number of auxiliary-channel parameters describing these disturbances may also be extremely large; high dimensionality is an area where MLAs are particularly well suited. We demonstrate the feasibility and applicability of three different MLAs: artificial neural networks, support vector machines, and random forests. These classifiers identify and remove a substantial fraction of the glitches present in two different data sets: four weeks of LIGO's fourth science run and one week of LIGO's sixth science run. We observe that all three algorithms agree on which events are glitches to within 10% for the sixth-science-run data, and support this by showing that the different optimization criteria used by each classifier generate the same decision surface, based on a likelihood-ratio statistic. Furthermore, we find that all classifiers obtain similar performance to the benchmark algorithm, the ordered veto list, which is optimized to detect pairwise correlations between transients in LIGO auxiliary channels and glitches in the gravitational-wave data. This suggests that most of the useful information currently extracted from the auxiliary channels is already described by this model. Future performance gains are thus likely to involve additional sources of information, rather than improvements in the classification algorithms themselves. We discuss several plausible sources of such new information as well as the ways of propagating it through the classifiers into gravitational-wave searches. [ABSTRACT FROM AUTHOR]