51. Bag encoding strategies in multiple instance learning problems
- Author
- Mustafa Gokce Baydogan and Emel Seyma Kucukasci
- Subjects
Information Systems and Management, Theoretical computer science, Computer science, Decision trees, Multiple instance learning, Feature vector, Supervised learning, 02 engineering and technology, Classification, Mixture model, Partition (database), Computer Science Applications, Theoretical Computer Science, Set (abstract data type), Tree (data structure), ComputingMethodologies_PATTERNRECOGNITION, Bag encoding, Artificial Intelligence, Control and Systems Engineering, 020204 information systems, Encoding (memory), 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Representation (mathematics), Software
- Abstract
Multiple instance learning (MIL) deals with supervised learning tasks where the aim is to learn from a set of labeled bags, each containing a certain number of instances. In the MIL setting, instance label information is unavailable, which makes it difficult to apply regular supervised learning. To resolve this problem, researchers devise methods that rely on certain assumptions regarding the instance labels. However, it is not a trivial task to determine which assumption holds for a new type of MIL problem. A bag-level representation based on instance characteristics does not require assumptions about the instance labels and has been shown to be successful in MIL tasks. These approaches mainly encode bag vectors using bag-of-features representations. In this paper, we propose tree-based encoding strategies that partition the instance feature space and represent the bags using the frequency of instances residing in each partition. Our encoding implicitly learns a generalized Gaussian Mixture Model (GMM) on the instance feature space and transforms this information into a bag-level summary. We show that bag representation using tree ensembles provides fast, accurate and robust representations. Our experiments on a large database of MIL problems show that tree-based encoding is highly scalable, and its performance is competitive with the state-of-the-art algorithms. © 2018 Elsevier Inc.
- Published
- 2018
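
The encoding idea in the abstract (partition the instance feature space with trees, then describe each bag by the fraction of its instances falling in each partition) can be illustrated with a minimal sketch. This is not the authors' implementation: the random axis-aligned splits, the helper names (`build_tree`, `index_leaves`, `leaf_of`, `fit_trees`, `encode_bag`), the parameters, and the toy data are all assumptions made for illustration, and bags are assumed to be non-empty.

```python
import random


def build_tree(points, depth, rng):
    """Recursively partition points with random axis-aligned splits (assumed split rule)."""
    if depth == 0 or len(points) < 2:
        return {"leaf": None}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"leaf": None}
    thr = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] <= thr]
    right = [p for p in points if p[dim] > thr]
    if not left or not right:
        return {"leaf": None}
    return {
        "dim": dim,
        "thr": thr,
        "left": build_tree(left, depth - 1, rng),
        "right": build_tree(right, depth - 1, rng),
    }


def index_leaves(node, counter=None):
    """Assign consecutive integer ids to the leaves; return the number of leaves."""
    if counter is None:
        counter = [0]
    if "leaf" in node:
        node["leaf"] = counter[0]
        counter[0] += 1
    else:
        index_leaves(node["left"], counter)
        index_leaves(node["right"], counter)
    return counter[0]


def leaf_of(node, x):
    """Route one instance down the tree and return the id of the leaf it reaches."""
    while "leaf" not in node:
        node = node["left"] if x[node["dim"]] <= node["thr"] else node["right"]
    return node["leaf"]


def fit_trees(bags, n_trees=5, depth=3, seed=0):
    """Build an ensemble of partition trees on the pooled instances of all bags."""
    rng = random.Random(seed)
    instances = [x for bag in bags for x in bag]
    trees = []
    for _ in range(n_trees):
        tree = build_tree(instances, depth, rng)
        trees.append((tree, index_leaves(tree)))
    return trees


def encode_bag(bag, trees):
    """Concatenate, over all trees, the fraction of the bag's instances in each leaf."""
    vec = []
    for tree, n_leaves in trees:
        counts = [0] * n_leaves
        for x in bag:
            counts[leaf_of(tree, x)] += 1
        vec.extend(c / len(bag) for c in counts)
    return vec


# Toy data: three bags of 2-D instances (hypothetical values).
bags = [
    [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8)],
    [(0.8, 0.9), (0.7, 0.95)],
    [(0.15, 0.15), (0.1, 0.3), (0.2, 0.25), (0.85, 0.9)],
]
trees = fit_trees(bags, n_trees=4, depth=2, seed=42)
encoded = [encode_bag(bag, trees) for bag in bags]
```

Each bag, whatever its size, is mapped to a fixed-length vector, so an ordinary supervised classifier could then be trained on `encoded` together with the bag labels. No instance-label assumption is needed at any step, which is the property the abstract highlights.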