3D clustering of gene expression data from systemic autoinflammatory diseases using self-organizing maps (Clust3D)

Authors :
Orestis D. Papagiannopoulos
Vasileios C. Pezoulas
Costas Papaloukas
Dimitrios I. Fotiadis
Source :
Computational and Structural Biotechnology Journal, Vol 23, Pp 2152-2162 (2024)
Publication Year :
2024
Publisher :
Elsevier, 2024.

Abstract

Background and objective: Systemic autoinflammatory diseases (SAIDs) are characterized by widespread inflammation, but most of them lack specific biomarkers for accurate diagnosis. Although a number of machine learning algorithms have been used to analyze SAID datasets and aid the discovery of novel biomarkers, there is growing recognition of the importance of SAID timeseries clustering, as it can capture the temporal dynamics of gene expression patterns.

Methodology: This paper proposes a novel clustering methodology to efficiently associate three-dimensional data. The algorithm uses competitive learning to create a self-organizing neural network and adjusts neuron positions in a time-dependent, high-dimensional feature space so that the neurons serve as cluster centers. The quantitative evaluation of the clustering was based on well-known clustering indices. Furthermore, a differential expression analysis and classification pipeline was employed to assess the capability of the proposed methodology to extract more accurate pathway-specific genes from its clusters. To that end, a comparative analysis was also conducted against a heuristic timeseries clustering method.

Results: The proposed methodology achieved better overall clustering index scores, and better classification metrics using genes derived from its clusters. Notable cases include a threefold increase in the Calinski-Harabasz clustering index, a twofold improvement in the Davies–Bouldin clustering index and a ∼60% increase in the classification specificity score.

Conclusion: A novel clustering methodology was developed and applied to several gene expression timeseries datasets from systemic autoinflammatory diseases, and its ability to efficiently produce well-separated clusters compared to existing heuristic methods was demonstrated.
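The competitive-learning idea in the methodology can be illustrated with a small sketch. It is not the Clust3D implementation, and several simplifications are assumptions made here for illustration: the three dimensions are taken to be genes × timepoints × samples, each gene's matrix is flattened into one vector, the update is plain winner-take-all (no neighborhood function, so simpler than a full self-organizing map), and the data are synthetic. All function names are hypothetical. The Calinski-Harabasz index is computed from its standard definition.

```python
import random

def flatten(series):
    """Flatten one gene's (timepoints x samples) matrix into a feature vector."""
    return [v for row in series for v in row]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def train_competitive(data, n_neurons, epochs=50, lr0=0.5, seed=0):
    """Winner-take-all competitive learning (assumed simplification of a SOM)."""
    rng = random.Random(seed)
    # Initialise each neuron on a randomly chosen input vector.
    neurons = [list(v) for v in rng.sample(data, n_neurons)]
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)  # linearly decaying learning rate
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x = data[i]
            # The closest neuron wins and is pulled toward the input.
            w = min(range(n_neurons), key=lambda k: sq_dist(neurons[k], x))
            neurons[w] = [nw + lr * (xv - nw) for nw, xv in zip(neurons[w], x)]
    return neurons

def assign(data, neurons):
    """Label each vector with the index of its nearest neuron (cluster center)."""
    return [min(range(len(neurons)), key=lambda k: sq_dist(neurons[k], x))
            for x in data]

def calinski_harabasz(data, labels):
    """Ratio of between-cluster to within-cluster dispersion (higher is better)."""
    n, dim = len(data), len(data[0])
    clusters = sorted(set(labels))
    k = len(clusters)
    overall = [sum(x[d] for x in data) / n for d in range(dim)]
    between = within = 0.0
    for c in clusters:
        members = [x for x, lab in zip(data, labels) if lab == c]
        centroid = [sum(x[d] for x in members) / len(members) for d in range(dim)]
        between += len(members) * sq_dist(centroid, overall)
        within += sum(sq_dist(x, centroid) for x in members)
    return (between / (k - 1)) / (within / (n - k))

# Synthetic toy data (not from the paper): 10 rising and 10 falling genes,
# each measured at 4 timepoints in 3 samples, with small Gaussian noise.
toy_rng = random.Random(1)

def toy_gene(trend):
    return [[trend * t + toy_rng.gauss(0, 0.1) for _ in range(3)] for t in range(4)]

genes = [toy_gene(+1.0) for _ in range(10)] + [toy_gene(-1.0) for _ in range(10)]
data = [flatten(g) for g in genes]

neurons = train_competitive(data, n_neurons=2)
labels = assign(data, neurons)
score = calinski_harabasz(data, labels)
print(labels, round(score, 1))
```

On this toy input the two neurons settle near the rising and falling profiles, so the labels separate the two groups. Flattening discards the distinction between the time and sample axes, which is exactly what a genuinely three-dimensional method is meant to avoid, so the sketch only conveys the neuron-as-cluster-center mechanism.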

Details

Language :
English
ISSN :
2001-0370
Volume :
23
Pages :
2152-2162
Database :
Directory of Open Access Journals
Journal :
Computational and Structural Biotechnology Journal
Publication Type :
Academic Journal
Accession number :
edsdoj.87e649650bbf40ff947cc67f752506d8
Document Type :
article
Full Text :
https://doi.org/10.1016/j.csbj.2024.05.003