What is a consistent glycan composition dataset?

Authors :
Saba, Federico
Mariethoz, Julien
Lisacek, Frederique
Source :
Frontiers in Analytical Science; 6/7/2023, p1-17, 17p
Publication Year :
2023

Abstract

Introduction: One of the main challenges in bioinformatics has been, and still is, the comparison of entities through the development of algorithms for similarity scoring and data clustering according to biologically relevant aspects. Glycoinformatics also faces this challenge, in particular regarding the automated comparison of protein and/or tissue glycomes, which remains relatively uncharted territory.
Methods: Low- and high-throughput experimental glycomic and glycoproteomic results were collected, revealing a bias toward N-linked glycomes. N-glycomes were then considered and represented as networks of related glycan compositions rather than as lists of glycans. They were processed and compared through a Java application generating graphs and another producing a similarity matrix based on graph content. Several scoring schemes (e.g., Jaccard index or cosine) were tested and evaluated using the Matthews Correlation Coefficient, in order to capture a meaningful protein and tissue N-glycome similarity.
Results: Assuming that a glycome corresponds to a well-connected graph of glycan compositions, graph comparison has revealed gaps that can be interpreted as inconsistencies. The outcome of systematic graph comparison is both formal and practical. In principle, it is shown that the idiosyncrasy of current glycome data limits the definition of appropriate estimates for systematically comparing N-glycomes. Yet, several potentially interesting criteria could be identified in a series of use cases detailed in the study.
Discussion: Differentially expressed glycomes are usually compared manually, but the resulting work tends to remain in publications due to the lack of dedicated tools. Even manually, cross-comparison is challenging, mostly because different sets of features are used from one study to the next. The work presented here makes it possible to lay down guidelines for developing a software tool that compares glycomes based on appropriate definitions of similarity and suitable methods for its evaluation and implementation. [ABSTRACT FROM AUTHOR]
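
As a reading aid for the scoring schemes named in the abstract, the following is a minimal sketch of the Jaccard index, a set-based cosine similarity, and the Matthews Correlation Coefficient. It is not the authors' Java implementation: the function names, the treatment of a glycome as a plain set of composition labels (ignoring graph edges), and the example labels such as "H5N4F1" are all assumptions made for illustration.

```python
import math


def jaccard(a, b):
    """Jaccard index of two sets of glycan composition labels."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty sets are treated as identical (assumed convention).
    return len(a & b) / len(union) if union else 1.0


def cosine_sets(a, b):
    """Cosine similarity of two sets seen as binary presence vectors."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def mcc(tp, tn, fp, fn):
    """Matthews Correlation Coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when a marginal is empty (common convention, assumed here).
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical composition labels, for illustration only.
glycome_a = {"H5N2", "H5N4", "H5N4F1"}
glycome_b = {"H5N4", "H5N4F1", "H5N4F1S1"}

print(jaccard(glycome_a, glycome_b))      # 2 shared out of 4 distinct labels
print(cosine_sets(glycome_a, glycome_b))  # 2 / sqrt(3 * 3)
print(mcc(tp=5, tn=5, fp=0, fn=0))        # perfect agreement
```

In this sketch the Matthews Correlation Coefficient would be applied to counts obtained by comparing predicted "similar / not similar" calls against some reference labelling; how such a reference is defined is not stated in the abstract.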

Details

Language :
English
ISSN :
2673-9283
Database :
Complementary Index
Journal :
Frontiers in Analytical Science
Publication Type :
Academic Journal
Accession number :
174397617
Full Text :
https://doi.org/10.3389/frans.2023.1073540