MACSIMS: multiple alignment of complete sequences information management system
- Source :
- BMC Bioinformatics, BioMed Central, 2006, Vol 7, Iss 1, p 318. ⟨10.1186/1471-2105-7-318⟩
- Publication Year :
- 2006
- Publisher :
- HAL CCSD, 2006.
Abstract
- Background: In the post-genomic era, systems-level studies seek to explain complex biological systems by integrating diverse resources from fields such as genomics, proteomics and transcriptomics. New information management systems are needed for the collection, validation and analysis of the vast amount of heterogeneous data available. Multiple alignments of complete sequences provide an ideal environment for integrating this information in the context of the protein family.
Results: MACSIMS is a multiple alignment-based information management program that combines the advantages of knowledge-based and ab initio sequence analysis methods. Structural and functional information is retrieved automatically from the public databases. In the multiple alignment, homologous regions are identified, and the retrieved data is evaluated and propagated from known to unknown sequences within these reliable regions. In a large-scale evaluation, the specificity of the propagated sequence features is estimated to be >99%, i.e. very few false positive predictions are made. MACSIMS is then used to characterise mutations in a test set of 100 proteins known to be involved in human genetic diseases. The number of sequence features associated with these proteins increased by 60% compared to the features available in the public databases. An XML format output file allows automatic parsing of the MACSIMS results, while a graphical display using the JalView program allows manual analysis.
Conclusion: MACSIMS is a new information management system that incorporates detailed analyses of protein families at the structural, functional and evolutionary levels. MACSIMS thus provides a unique environment that facilitates knowledge extraction and the presentation of the most pertinent information to the biologist. A web server and the source code are available at http://bips.u-strasbg.fr/MACSIMS/.
- Subjects :
- Proteomics
Information management
Source code
MESH: Sequence Analysis, Protein
Biochemistry
Knowledge extraction
Sequence Analysis, Protein
Structural Biology
MESH: Animals
Databases, Protein
lcsh:QH301-705.5
MESH: Evolution, Molecular
0303 health sciences
Multiple sequence alignment
Applied Mathematics
MESH: Proteomics
MESH: Genomics
030302 biochemistry & molecular biology
Genomics
Computer Science Applications
MESH: Reproducibility of Results
MESH: Programming Languages
lcsh:R858-859.7
Data mining
Algorithms
MESH: Computational Biology
MESH: Databases, Protein
MESH: Mutation
Sequence analysis
MESH: Management Information Systems
Context (language use)
MESH: Algorithms
Biology
lcsh:Computer applications to medicine. Medical informatics
Management Information Systems
Evolution, Molecular
03 medical and health sciences
MESH: Software
Animals
Humans
Molecular Biology
030304 developmental biology
MESH: Humans
Computational Biology
Reproducibility of Results
[SDV.BBM.BM]Life Sciences [q-bio]/Biochemistry, Molecular Biology/Molecular biology
Management information systems
lcsh:Biology (General)
Test set
Mutation
Programming Languages
Software
Details
- Language :
- English
- ISSN :
- 1471-2105
- Database :
- OpenAIRE
- Journal :
- BMC Bioinformatics, BioMed Central, 2006, Vol 7, Iss 1, p 318. ⟨10.1186/1471-2105-7-318⟩
- Accession number :
- edsair.doi.dedup.....0322a514f7c90c41ee04dac9d28a25e5