A three-way clustering approach for novelty detection
- Source :
- Information Sciences. 569:650-668
- Publication Year :
- 2021
- Publisher :
- Elsevier BV, 2021.
Abstract
- Novelty detection aims to identify instances in the test data that differ in some respect from the normal instances in the training data. Novel instances may be defined and interpreted in different ways. We consider a specific interpretation in which novel instances belong to unknown classes that are not seen during the training phase. This setting is sometimes referred to as open world classification or open set recognition. A key challenge in this scenario is to design approaches that effectively classify normal instances and reject the classification of novel instances. Three-way decisions offer a useful strategy for dealing with this challenge: they allow the decision to classify an object to be deferred whenever the available evidence is insufficient. The deferred cases may be treated as novel or unknown, since their classification results are not available. Three-way clustering is an important three-way decision model that can be used to classify objects by treating classes as clusters in the data. In this paper, we introduce a three-way clustering based algorithm called reduction and elevation based three-way clustering for open world classification, or RE3OWC. A three-way cluster consists of a pair of sets, the core and the support. RE3OWC uses the operations of reduction and elevation to define the core and support of a three-way cluster. The two sets induce three regions for each cluster: inside, partial, and outside. These regions realize three-way decisions and are used to identify instances from the unknown classes. Experimental results on the 20 Newsgroups and Amazon reviews datasets suggest improvements in the widely used F1 measure of up to 2.3% and 6.5%, respectively, in comparison with some of the best-known available approaches, including DOC, cbsSVM, and openMax, for identifying instances from unknown classes.
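The relationship between a cluster's core and support sets and its three regions can be illustrated with a minimal sketch. This is not the RE3OWC algorithm: the abstract does not specify how reduction and elevation build the two sets, so here they are simply given as input. The names `ThreeWayCluster`, `region`, and `classify` are hypothetical, and two points are assumptions rather than statements from the abstract: that the core is a subset of the support (a common convention in three-way clustering), and that an instance is accepted only when it lies inside some core and is otherwise deferred and treated as unknown.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class ThreeWayCluster:
    """Hypothetical container: a cluster described by a (core, support) pair."""
    label: str
    core: FrozenSet[str]      # instances that definitely belong to the cluster
    support: FrozenSet[str]   # instances that possibly belong to the cluster

    def __post_init__(self) -> None:
        # Assumed convention: the core is contained in the support.
        if not self.core <= self.support:
            raise ValueError("core must be a subset of support")

    def region(self, instance: str) -> str:
        """Map an instance to one of the three regions of this cluster."""
        if instance in self.core:
            return "inside"
        if instance in self.support:
            return "partial"
        return "outside"


def classify(instance: str, clusters: Iterable[ThreeWayCluster]) -> Optional[str]:
    """Assumed decision rule: accept only instances inside some core.

    Returns the cluster label, or None when the decision is deferred
    (the instance is then treated as coming from an unknown class).
    """
    for cluster in clusters:
        if cluster.region(instance) == "inside":
            return cluster.label
    return None


# Toy data: document identifiers stand in for real instances.
clusters = [
    ThreeWayCluster("sports", frozenset({"d1", "d2"}), frozenset({"d1", "d2", "d3"})),
    ThreeWayCluster("politics", frozenset({"d4"}), frozenset({"d3", "d4"})),
]

print(classify("d1", clusters))  # sports: d1 is inside the "sports" core
print(classify("d3", clusters))  # None: d3 is only in partial regions, so deferred
print(classify("d9", clusters))  # None: d9 is outside every cluster
```

In this toy setup, an instance in a partial region is deferred in the same way as one outside every cluster; a real system could handle the two cases differently.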
- Subjects :
- Information Systems and Management
Computer science
Open set
Realization (linguistics)
02 engineering and technology
Machine learning
computer.software_genre
Novelty detection
Theoretical Computer Science
Reduction (complexity)
Artificial Intelligence
0202 electrical engineering, electronic engineering, information engineering
Cluster analysis
business.industry
05 social sciences
050301 education
Computer Science Applications
Control and Systems Engineering
Key (cryptography)
020201 artificial intelligence & image processing
Artificial intelligence
business
0503 education
Decision model
computer
Software
Test data
Details
- ISSN :
- 00200255
- Volume :
- 569
- Database :
- OpenAIRE
- Journal :
- Information Sciences
- Accession number :
- edsair.doi...........1fe1f6ed7e649f7032e84616bf9c717d
- Full Text :
- https://doi.org/10.1016/j.ins.2021.05.021