New strategy for the representation and the integration of biomolecular knowledge at a cellular scale
- Source :
- New strategy for the representation and the integration of biomolecular knowledge at a cellular scale, Actes des 5ème Journées Ouvertes Biologie Informatique Mathématiques à Montréal (Canada), 2004, Canada. pp.3581-3589, Nucleic Acids Research, Oxford University Press, 2004, 32 (12), pp.3581-9. ⟨10.1093/nar/gkh681⟩, HAL
- Publication Year :
- 2004
- Publisher :
- HAL CCSD, 2004.
Abstract
- The combination of sequencing and post-sequencing experimental approaches produces huge collections of data that are highly heterogeneous in both structure and semantics. We propose a new strategy for the integration of such data. This strategy uses structured sets of sequences as a unified representation of biological information and defines a probabilistic measure of similarity between the sets. Sets can be composed of sequences that are known to have a biological relationship (e.g. proteins involved in a complex or a pathway) or that share similar values for a particular attribute (e.g. expression profile). We have developed a software tool, BlastSets, which implements this strategy. It exploits a database in which the sets derived from diverse biological information can be deposited using a standard XML format. For a given query set, BlastSets returns the target sets in the database whose similarity to the query is statistically significant. The tool allowed us to automatically identify verified relationships between correlated expression profiles and biological pathways using publicly available data for Saccharomyces cerevisiae. It was also used to retrieve the members of a complex (the ribosome) by mining expression profiles. These first results validate the relevance of the strategy and demonstrate the potential of BlastSets.
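
The abstract does not spell out the probabilistic similarity measure, so the following is only an illustrative sketch of one common way to score the overlap between two sets of sequences: a hypergeometric tail probability. This choice of test, the function names `overlap_p_value` and `search_sets`, the `alpha` threshold, and the toy identifiers are all assumptions made for illustration; they are not taken from BlastSets.

```python
from math import comb


def overlap_p_value(query, target, universe_size):
    """Probability of seeing at least the observed overlap by chance.

    Assumption: the query is modelled as a random draw, without replacement,
    from a universe of `universe_size` sequences (hypergeometric model).
    This is a stand-in, not necessarily the measure used by BlastSets.
    """
    query, target = set(query), set(target)
    n, K, N = len(query), len(target), universe_size
    if n > N or K > N:
        raise ValueError("sets cannot be larger than the universe")
    k = len(query & target)  # observed number of shared sequences
    # Sum P(X = i) for i = k .. min(n, K); math.comb returns 0 when the
    # lower argument exceeds the upper one, so impossible terms vanish.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / comb(N, n)


def search_sets(query, database, universe_size, alpha=0.01):
    """Return (name, p_value) pairs for target sets with p_value <= alpha,
    most significant first. `database` maps a set name to its members."""
    hits = [
        (name, overlap_p_value(query, members, universe_size))
        for name, members in database.items()
    ]
    return sorted((h for h in hits if h[1] <= alpha), key=lambda h: h[1])


# Toy usage with made-up identifiers (not real data).
toy_database = {
    "set_A": {"g1", "g2", "g3"},
    "set_B": {"g7", "g8", "g9"},
}
results = search_sets({"g1", "g2", "g3"}, toy_database, universe_size=10)
```

In this toy case only `set_A` passes the threshold. A real search over many target sets would also need a multiple-testing correction, which this sketch omits.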
- Subjects :
- data-mining
computer.internet_protocol
gene enrichment
[INFO.INFO-OH]Computer Science [cs]/Other [cs.OH]
Saccharomyces cerevisiae
Biology
Bioinformatics
computer.software_genre
Set (abstract data type)
03 medical and health sciences
Similarity (psychology)
Databases, Genetic
Genetics
Relevance (information retrieval)
Representation (mathematics)
030304 developmental biology
Structure (mathematical logic)
0303 health sciences
Gene Expression Profiling
030302 biochemistry & molecular biology
Probabilistic logic
Computational Biology
Articles
Genomics
Expression (mathematics)
Systems Integration
classification
Data mining
[INFO.INFO-BI]Computer Science [cs]/Bioinformatics [q-bio.QM]
computer
Sequence Analysis
XML
Software
Details
- Language :
- English
- ISSN :
- 0305-1048 and 1362-4962
- Database :
- OpenAIRE
- Journal :
- New strategy for the representation and the integration of biomolecular knowledge at a cellular scale, Actes des 5ème Journées Ouvertes Biologie Informatique Mathématiques à Montréal (Canada), 2004, Canada. pp.3581-3589, Nucleic Acids Research, Oxford University Press, 2004, 32 (12), pp.3581-9. ⟨10.1093/nar/gkh681⟩, HAL
- Accession number :
- edsair.doi.dedup.....306e808d470bd8890086b39734418e74
- Full Text :
- https://doi.org/10.1093/nar/gkh681