Linnaeus: A highly reusable and adaptable ML based log classification pipeline

Authors :
Catovic, Armin
Cartwright, Carolyn
Gebreyesus, Yasmin Tesfaldet
Ferlin, Simone
Publication Year :
2021

Abstract

Logs are a common way to record detailed run-time information in software. As modern software systems evolve in scale and complexity, logs have become indispensable to understanding the internal states of the system. At the same time, however, manually inspecting logs has become impractical. In recent times, there has been more emphasis on statistical and machine learning (ML) based methods for analyzing logs. While the results have shown promise, most of the literature focuses on algorithms and state-of-the-art (SOTA), while largely ignoring the practical aspects. In this paper we demonstrate our end-to-end log classification pipeline, Linnaeus. Besides showing the more traditional ML flow, we also demonstrate our solutions for adaptability and re-use, integration into large-scale software development processes, and how we cope with a lack of labelled data. We hope Linnaeus can serve as a blueprint for, and inspire the integration of, various ML-based solutions in other large-scale industrial settings.

Comment: 8 pages, 7 figures; to be included in ICSE/WAIN'21

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2103.06927
Document Type :
Working Paper