M-LDQ feature embedding and regression modeling for distribution-valued data.
- Source :
- Information Sciences. Sep 2022, Vol. 609, p121-152. 32p.
- Publication Year :
- 2022
Abstract
- With the improving capacity to collect massive amounts of data, distribution-valued data are increasingly used in many applications, where they are presented in a clustered, summarized, or aggregated form to provide detailed information, as opposed to single-valued data. Most of the existing models for distribution-valued data are subject to limitations attributed to the inherent constraints caused by the special expressions of probability distributions. This makes the practical usage of distribution-valued data highly challenging. This paper introduces a novel feature embedding method to characterize a probability distribution, and on this basis, an effective linear regression model that does not contain additional constraints is proposed. Unlike previous models with nonnegative constraints on coefficients, our model is capable of addressing negative coefficients. The detailed parameter estimation procedure applying partial least squares for this model is presented to guarantee more stable results, especially in the presence of a relatively small sample size or multicollinearity among variables. Overall, the proposed method fundamentally facilitates distribution-valued data regression analysis. Extensive simulation experiments and empirical PM2.5 concentration modeling not only verify the effectiveness of our regression method for distribution-valued data but also demonstrate the advantages of the proposed method compared with existing approaches. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISSN :
- 0020-0255
- Volume :
- 609
- Database :
- Academic Search Index
- Journal :
- Information Sciences
- Publication Type :
- Periodical
- Accession number :
- 158863320
- Full Text :
- https://doi.org/10.1016/j.ins.2022.07.064