
Robust feature selection via central point link information and sparse latent representation.

Authors :
Kong, Jiarui
Shang, Ronghua
Zhang, Weitong
Wang, Chao
Xu, Songhua
Source :
Pattern Recognition. Oct 2024, Vol. 154.
Publication Year :
2024

Abstract

• This paper proposes a novel unsupervised feature selection method called CPSLR.
• CPSLR computes a central point matrix and, together with the Laplacian matrix, constructs a link graph that preserves the similarity between data points.
• The link graph and the data graph form a dual graph structure, which preserves more complete data information while also retaining the manifold structure of the data.
• Feature selection is conducted in the latent representation space; latent representation learning mines the interconnection information among data so that the connections within the data itself are preserved.
• CPSLR applies an l2,1/2-norm constraint to the feature transformation matrix to select robust, low-redundancy features.

Unsupervised feature selection usually assumes that samples are independent of one another. In real data, however, samples influence each other, so traditional feature selection methods may lose this interconnection information, generate inaccurate pseudo-labels, and produce poor feature selection results. To address this issue, this paper proposes robust feature selection via central point link information and sparse latent representation (CPSLR). Firstly, a link graph is constructed by computing the center matrix, which stores the distance from each sample to the center points; if two samples have similar distances to a center point, they are taken to belong to the same class, so the similarity between samples is preserved and more accurate pseudo-label information is obtained. Secondly, CPSLR combines the data graph and the link graph into a dual graph structure, which retains both the link information between samples and the manifold structure of the samples. Then, CPSLR preserves the interconnection information between samples through sparse latent representation: an l2,1-norm constraint is imposed on the latent representation so that sparse, non-redundant interconnection information is retained. Combining the central point link information with the sparse latent representation preserves the interconnections between data more comprehensively, so the resulting pseudo-labels are closer to the true class labels. Finally, CPSLR constrains the feature transformation matrix with an l2,1/2-norm so as to select robust, sparse features; the l2,1/2-norm keeps the feature transformation matrix sparse and selects more discriminative features, improving the efficiency of feature selection. Experiments on eight datasets show that the clustering results of CPSLR outperform those of six classical and recent algorithms. [ABSTRACT FROM AUTHOR]
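The abstract describes three computational ingredients: a central point (sample-to-center distance) matrix, a link graph built from samples with similar distance profiles, and row-sparsity penalties (l2,1 and l2,1/2 norms). The sketch below is a minimal NumPy illustration of these ideas, not the authors' implementation; the choice of provisional centers, the Gaussian affinity, and the parameter names (k, sigma) are assumptions made only for this example.

```python
# Illustrative sketch of the ingredients named in the CPSLR abstract.
# Assumptions (not from the paper): centers are k randomly chosen samples,
# the link graph uses a Gaussian affinity, and the l_{2,1/2} penalty is
# taken as the sum of square roots of row-wise l2 norms.
import numpy as np


def central_point_matrix(X, k, seed=0):
    """D[i, c] = Euclidean distance from sample i to provisional center c."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(X.shape[0], size=k, replace=False)]
    return np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)


def link_graph(D, sigma=1.0):
    """Affinity between samples whose center-distance profiles are close:
    such samples are assumed to belong to the same class."""
    diff = np.linalg.norm(D[:, None, :] - D[None, :, :], axis=2)
    return np.exp(-diff ** 2 / (2.0 * sigma ** 2))


def l2_half_penalty(W, eps=1e-12):
    """Row-sparsity penalty in the spirit of the l_{2,1/2} constraint:
    small values push entire rows (features) of W toward zero."""
    return np.sum(np.sqrt(np.linalg.norm(W, axis=1) + eps))


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(50, 8))   # toy data
    D = central_point_matrix(X, k=3)                    # sample-to-center distances
    S_link = link_graph(D)                              # link graph over samples
    W = np.random.default_rng(2).normal(size=(8, 3))    # toy feature transformation
    print(S_link.shape, l2_half_penalty(W))
```

In the paper's formulation this link graph is paired with an ordinary data graph (the dual graph structure) and the penalties enter a joint optimization over the latent representation and the feature transformation matrix; the snippet above only shows how the individual quantities could be computed.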

Details

Language :
English
ISSN :
00313203
Volume :
154
Database :
Academic Search Index
Journal :
Pattern Recognition
Publication Type :
Academic Journal
Accession number :
177843603
Full Text :
https://doi.org/10.1016/j.patcog.2024.110617