Automating Outlier Detection via Meta-Learning

Authors :
Zhao, Yue
Rossi, Ryan A.
Akoglu, Leman
Publication Year :
2020

Abstract

Given an unsupervised outlier detection (OD) task on a new dataset, how can we automatically select a good outlier detection method and its hyperparameter(s) (collectively called a model)? Thus far, model selection for OD has been a "black art", as any model evaluation is infeasible due to the lack of (i) hold-out data with labels, and (ii) a universal objective function. In this work, we develop the first principled data-driven approach to model selection for OD, called MetaOD, based on meta-learning. MetaOD capitalizes on the past performances of a large body of detection models on existing outlier detection benchmark datasets, and carries over this prior experience to automatically select an effective model to be employed on a new dataset without using any labels. To capture task similarity, we introduce specialized meta-features that quantify outlying characteristics of a dataset. Through comprehensive experiments, we show the effectiveness of MetaOD in selecting a detection model that significantly outperforms the most popular outlier detectors (e.g., LOF and iForest) as well as various state-of-the-art unsupervised meta-learners while being extremely fast. To foster reproducibility and further research on this new problem, we open-source our entire meta-learning system, benchmark environment, and testbed datasets.

Comment: 21 pages. The code is available at http://github.com/yzhao062/MetaOD
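
The general idea described in the abstract (reuse the recorded performance of many models on benchmark datasets, then pick a model for a new dataset by task similarity, without labels) can be illustrated with a minimal sketch. This is not MetaOD's actual procedure or API: it swaps in a plain nearest-neighbor lookup over meta-features, and the function names, the toy meta-features, the model labels, and the performance numbers are all hypothetical placeholders chosen for illustration.

```python
import math

def meta_features(X):
    """Toy meta-features for a dataset X (list of equal-length numeric rows).

    Hypothetical stand-ins, not the meta-features from the paper.
    """
    n, d = len(X), len(X[0])
    feats = [math.log(n), math.log(d)]
    # Average, over columns, of the fraction of values lying more than
    # two standard deviations from the column mean.
    frac = 0.0
    for j in range(d):
        col = [row[j] for row in X]
        mu = sum(col) / n
        sd = math.sqrt(sum((v - mu) ** 2 for v in col) / n)
        if sd > 0:
            frac += sum(abs(v - mu) > 2 * sd for v in col) / n
    feats.append(frac / d)
    return feats

def select_model(new_feats, train_feats, perf, models, k=1):
    """Return the model with the best mean recorded performance on the
    k benchmark datasets whose meta-features are closest to new_feats."""
    dists = sorted((math.dist(new_feats, f), i) for i, f in enumerate(train_feats))
    nearest = [i for _, i in dists[:k]]
    avg = [sum(perf[i][m] for i in nearest) / len(nearest)
           for m in range(len(models))]
    return models[max(range(len(models)), key=avg.__getitem__)]

# Placeholder "prior experience": two benchmark datasets, two candidate models.
# perf[i][m] is an invented score of model m on benchmark dataset i.
models = ["LOF(k=20)", "iForest(n=100)"]
train_feats = [[0.0, 0.0], [10.0, 10.0]]
perf = [[0.9, 0.6],
        [0.5, 0.8]]

# The new dataset resembles the second benchmark, so its best model is chosen.
choice = select_model([9.0, 9.5], train_feats, perf, models)
```

No labels from the new dataset are used at selection time; only its meta-features and the stored performance table enter the decision, which is the property the abstract emphasizes.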

Details

Database :
arXiv
Publication Type :
Report
Accession number :
edsarx.2009.10606
Document Type :
Working Paper