Feature selection algorithm using neighborhood equivalence tolerance relation for incomplete decision systems.

Authors :
Wu, Shangzhi
Wang, Litai
Ge, Shuyue
Xiong, Zheng
Liu, Jie
Source :
Applied Soft Computing; May 2024, Vol. 157, pN.PAG-N.PAG, 1p
Publication Year :
2024

Abstract

Rough set theory is an important method for dealing with incomplete information systems. In such systems, the most common way to determine the relation between two samples is the tolerance relation. However, the condition under which the tolerance relation judges that two samples may belong to the same category is very lenient, which leads to a low reduction rate when the rough set generated by this relation is used to select features. To address this problem, we design the neighborhood equivalence tolerance relation. It differs from other improved tolerance relations in two respects. First, it requires no additional threshold, which avoids the difficulty of choosing one. Second, whereas most current improvements for this kind of problem are computationally cumbersome, the relation designed in this paper is simple and effective. Using this relation, we construct a neighborhood rough set model that handles incomplete information, introduce its properties, set out the properties that a reduction set should satisfy, and quantify the importance of conditional attributes with the attribute dependence degree, which provides the basis for the design of the feature selection algorithm. Finally, a greedy strategy is used to design a forward feature selection algorithm. Experimental results show that the model is effective in dealing with incomplete information systems. The feature selection algorithm yields the smallest average reduced subset size on twelve datasets while maintaining the accuracy of the classifier, which verifies that it can effectively deal with incomplete information systems.

● Because the judgment made by the tolerance relation is very lenient, existing improvement methods either rely on the probability distribution of attribute values in the information system or require an additional threshold. To solve these problems, this paper improves the tolerance relation; the improved binary relation relies only on the existing information system itself and requires no additional threshold.

● Because of the lenient judgment of the tolerance relation, rough sets in incomplete information systems can only select more attributes to refine the knowledge granules during feature selection. This compensates for the problems caused by the tolerance relation but leads to a low reduction rate. In this paper, we use the newly proposed binary relation to construct a neighborhood rough set model in incomplete information systems and use this model for feature selection, which improves the reduction rate while maintaining the accuracy of the classifier. [ABSTRACT FROM AUTHOR]
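The abstract does not define the neighborhood equivalence tolerance relation, so the following is only a minimal sketch of the baseline it sets out to improve: the classical tolerance relation for incomplete data (two samples are tolerant on an attribute set if, on every attribute, their values agree or at least one is missing), a dependence degree computed from it, and a greedy forward selection loop. The function names, the use of `None` as the missing-value marker, and the stopping rule (stop once the selected subset reaches the dependence degree of the full attribute set) are assumptions made for illustration, not the authors' algorithm.

```python
from typing import Hashable, List, Optional, Sequence

MISSING = None  # assumed marker for a missing attribute value
Row = Sequence[Optional[Hashable]]


def tolerant(x: Row, y: Row, attrs: Sequence[int]) -> bool:
    """Classical tolerance relation: on each attribute the values
    agree, or at least one of them is missing."""
    return all(
        x[a] is MISSING or y[a] is MISSING or x[a] == y[a]
        for a in attrs
    )


def dependency(data: List[Row], labels: Sequence[Hashable],
               attrs: Sequence[int]) -> float:
    """Fraction of samples whose entire tolerance class carries the
    same decision label (size of the positive region over |U|)."""
    if not data:
        return 0.0
    consistent = 0
    for i, x in enumerate(data):
        if all(labels[j] == labels[i]
               for j, y in enumerate(data) if tolerant(x, y, attrs)):
            consistent += 1
    return consistent / len(data)


def greedy_forward_selection(data: List[Row],
                             labels: Sequence[Hashable]) -> List[int]:
    """Add, one at a time, the attribute that raises the dependence
    degree most, until the subset matches the full attribute set."""
    n_attrs = len(data[0]) if data else 0
    target = dependency(data, labels, list(range(n_attrs)))
    selected: List[int] = []
    current = dependency(data, labels, selected)
    while current < target:
        remaining = [a for a in range(n_attrs) if a not in selected]
        best = max(remaining,
                   key=lambda a: dependency(data, labels, selected + [a]))
        selected.append(best)
        current = dependency(data, labels, selected)
    return selected


# Toy incomplete decision table: three conditional attributes, None = missing.
example_data = [
    (1, 0, None),
    (1, 1, 0),
    (0, 0, 1),
    (None, 1, 1),
]
example_labels = ["no", "yes", "no", "yes"]
example_subset = greedy_forward_selection(example_data, example_labels)
```

In the toy table, the sample with a missing value in attribute 0 is tolerant with every other sample on that attribute, so attribute 0 alone separates nothing. This is the leniency the abstract describes: missing values enlarge the tolerance classes, and more attributes may be needed to refine them.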

Details

Language :
English
ISSN :
1568-4946
Volume :
157
Database :
Supplemental Index
Journal :
Applied Soft Computing
Publication Type :
Academic Journal
Accession number :
176543279
Full Text :
https://doi.org/10.1016/j.asoc.2024.111463