Zhang, Jifu; Jiang, Yiyong; Chang, Kai H.; Zhang, Sulan; Cai, Jianghui; Hu, Lihua
Abstract: Traditional outlier mining methods identify outliers from a global point of view, which usually makes it difficult to find deviating data points in low-dimensional subspaces. The concept lattice, owing to its straightforwardness, conciseness, and completeness in knowledge expression, has become an effective tool for data analysis and knowledge discovery. In this paper, a concept lattice based outlier mining algorithm (CLOM) for low-dimensional subspaces is proposed, which treats the intent of every concept lattice node as a subspace. First, sparsity and density coefficients, which measure outliers in low-dimensional subspaces, are defined and discussed. Second, sparsity subspaces are identified among the node intents using a predefined sparsity coefficient threshold. For each sparsity subspace, the intents of its ancestor nodes are then tested against a predefined density coefficient threshold to determine whether any of them is a density subspace. If one is, the objects in the extent of the node whose intent is the sparsity subspace are defined as outliers. Experimental results on a star spectral database show that CLOM is effective in mining outliers in low-dimensional subspaces and that it greatly improves the accuracy of the results. [Copyright Elsevier]
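The two-stage procedure described in the abstract can be illustrated with a minimal sketch. The abstract does not give the definitions of the sparsity and density coefficients, so this sketch assumes a single stand-in coefficient, the fraction of objects in a node's extent, and treats a node as sparse when that fraction is at most `sparse_max` and as dense when it is at least `dense_min`. The function names, the thresholds, the toy context, and the choice to ignore ancestors with an empty intent are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

def concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small binary
    context by brute force over attribute subsets."""
    objects = set(context)
    attributes = set().union(*context.values())
    found = set()
    for r in range(len(attributes) + 1):
        for attrs in combinations(sorted(attributes), r):
            # Extent: objects that have every attribute in attrs.
            extent = frozenset(o for o in objects if set(attrs) <= context[o])
            # Intent: attributes shared by every object in the extent.
            intent = frozenset(a for a in attributes
                               if all(a in context[o] for o in extent))
            found.add((extent, intent))
    return found

def mine_outliers(context, sparse_max, dense_min):
    """Flag objects whose node intent is sparse while some ancestor intent
    is dense. The coefficient |extent| / |objects| is an assumed stand-in."""
    n = len(context)
    nodes = concepts(context)
    outliers = set()
    for extent, intent in nodes:
        if not extent or len(extent) / n > sparse_max:
            continue  # not a sparse subspace under the assumed coefficient
        for anc_extent, anc_intent in nodes:
            # Ancestors are more general nodes: strictly smaller intent.
            # Assumption: an empty intent is not counted as a subspace.
            is_ancestor = bool(anc_intent) and anc_intent < intent
            if is_ancestor and len(anc_extent) / n >= dense_min:
                outliers |= extent
                break
    return outliers

# Toy context: object -> set of attributes it has (hypothetical data).
context = {
    "o1": {"a", "b"},
    "o2": {"a", "b"},
    "o3": {"a", "b"},
    "o4": {"a", "b"},
    "o5": {"a", "c"},
}
print(mine_outliers(context, sparse_max=0.2, dense_min=0.8))
```

In this toy context the subspace {a, c} holds only `o5`, while its ancestor subspace {a} holds every object, so `o5` is the one object flagged. The brute-force enumeration is exponential in the number of attributes and is meant only to make the lattice relationships concrete, not to reflect how CLOM builds or traverses its lattice.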