Interdependence analysis on heterogeneous data via behavior interior dimensions.

Authors :
Wang, Can
Chi, Chi-Hung
Yao, Lina
Liew, Alan Wee-Chung
Shen, Hong
Source :
Knowledge-Based Systems. Nov 2023, Vol. 279, pN.PAG-N.PAG. 1p.
Publication Year :
2023

Abstract

Interdependent dimensions, including categorical and continuous variables, are common in heterogeneous behavioral data in the real world, and mixed-type objects are associated to varying degrees through certain coupling relationships. Such behavioral data is usually represented as an information table with explicit behavior exterior dimensions (i.e., the original attributes that describe data heterogeneity), under the assumption that dimensions are independent of one another and that objects are independent of one another. However, both variables and objects are very often interdependent, either explicitly or implicitly, in functional and semantic ways. Limited research has analyzed the interactions among dimensions and the relationships among objects, so learning results tend to be more local than global. This paper proposes an interdependence analysis that captures the multifarious functional relationships among attributes and among objects in heterogeneous data by addressing the coupling context and coupling weights in unsupervised learning. These global couplings consider the interactions within discrete dimensions, within numerical attributes, and across the two, as well as the relationships within an individual object and between multiple objects, to form attribute-based and object-based coupled data representation schemes based on feature conversion and neighborhood calculation. In addition, we interpret both representation models via implicit behavior interior dimensions (i.e., the newly defined attributes that model data interdependence) to explain the intrinsic rationale for the superiority of the proposed methods. This work explicitly models the coupling of multiple attributes and the coupling of multiple objects for heterogeneous data sets, as demonstrated in various data mining and machine learning applications, such as cluster structure analysis, data clustering evaluation, and data density comparison.
Moreover, a sensitivity study is carried out to tune the neighborhood parameter and the weight parameter, and a scalability analysis tests the robustness of both models. Extensive experiments on a series of synthetic data sets and multiple UCI data sets show that the proposed framework effectively captures the global couplings of both heterogeneous variables and mixed-type objects and is superior to the traditional approach as well as state-of-the-art approaches, a result also verified by statistical analysis. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0950-7051
Volume :
279
Database :
Academic Search Index
Journal :
Knowledge-Based Systems
Publication Type :
Academic Journal
Accession number :
172845369
Full Text :
https://doi.org/10.1016/j.knosys.2023.110893