Author: "Meinicke, Peter" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Meinicke, Peter"' showing total 218 results

1. Critical Assessment of Metagenome Interpretation—a benchmark of metagenomics software

Author: Sczyrba, Alexander, Hofmann, Peter, Belmann, Peter, Koslicki, David, Janssen, Stefan, Dröge, Johannes, Gregor, Ivan, Majda, Stephan, Fiedler, Jessika, Dahms, Eik, Bremges, Andreas, Fritz, Adrian, Garrido-Oter, Ruben, Jørgensen, Tue Sparholt, Shapiro, Nicole, Blood, Philip D, Gurevich, Alexey, Bai, Yang, Turaev, Dmitrij, DeMaere, Matthew Z, Chikhi, Rayan, Nagarajan, Niranjan, Quince, Christopher, Meyer, Fernando, Balvočiūtė, Monika, Hansen, Lars Hestbjerg, Sørensen, Søren J, Chia, Burton KH, Denis, Bertrand, Froula, Jeff L, Wang, Zhong, Egan, Robert, Don Kang, Dongwan, Cook, Jeffrey J, Deltel, Charles, Beckstette, Michael, Lemaitre, Claire, Peterlongo, Pierre, Rizk, Guillaume, Lavenier, Dominique, Wu, Yu-Wei, Singer, Steven W, Jain, Chirag, Strous, Marc, Klingenberg, Heiner, Meinicke, Peter, Barton, Michael D, Lingner, Thomas, Lin, Hsin-Hung, Liao, Yu-Chieh, Silva, Genivaldo Gueiros Z, Cuevas, Daniel A, Edwards, Robert A, Saha, Surya, Piro, Vitor C, Renard, Bernhard Y, Pop, Mihai, Klenk, Hans-Peter, Göker, Markus, Kyrpides, Nikos C, Woyke, Tanja, Vorholt, Julia A, Schulze-Lefert, Paul, Rubin, Edward M, Darling, Aaron E, Rattei, Thomas, and McHardy, Alice C
Subjects: Biological Sciences, Networking and Information Technology R&D (NITRD), Algorithms, Benchmarking, Metagenomics, Sequence Analysis, DNA, Software, Technology, Medical and Health Sciences, Developmental Biology
Abstract: Methods for assembly, taxonomic profiling and binning are key to interpreting metagenome data, but a lack of consensus about benchmarking complicates performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on highly complex and realistic data sets, generated from ∼700 newly sequenced microorganisms and ∼600 novel viruses and plasmids and representing common experimental setups. Assembly and genome binning programs performed well for species represented by individual genomes but were substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below family level. Parameter settings markedly affected performance, underscoring their importance for program reproducibility. The CAMI results highlight current challenges but also provide a roadmap for software selection to answer specific research questions.
Published: 2017

2. CoCoPyE: feature engineering for learning and prediction of genome quality indices

Author: Birth, Niklas, primary, Leppich, Nicolina, additional, Schirmacher, Julia, additional, Andreae, Nina, additional, Steinkamp, Rasmus, additional, Blanke, Matthias, additional, and Meinicke, Peter, additional
Published: 2024
Full Text: View/download PDF

3. Tax4Fun2: prediction of habitat-specific functional profiles and functional redundancy based on 16S rRNA gene sequences

Author: Wemheuer, Franziska, Taylor, Jessica A., Daniel, Rolf, Johnston, Emma, Meinicke, Peter, Thomas, Torsten, and Wemheuer, Bernd
Published: 2020
Full Text: View/download PDF

4. Using Maximum Contrast Classifiers for EEG data analysis

Author: Meinicke, Peter, primary, Kaper, Matthias, additional, Weiss, Sabine, additional, Müller, Horst M., additional, and Ritter, Helge, additional
Published: 2019
Full Text: View/download PDF

5. Utilizing SVMs to derive psychophysiological information from a Brain-Computer Interfacing study.

Author: Kaper, Matthias, primary, Meinicke, Peter, additional, and Ritter, Helge, additional
Published: 2019
Full Text: View/download PDF

6. Land Use Type Significantly Affects Microbial Gene Transcription in Soil

Author: Nacke, Heiko, Fischer, Christiane, Thürmer, Andrea, Meinicke, Peter, and Daniel, Rolf
Published: 2014

7. Fast Target Set Reduction for Large-Scale Protein Function Prediction: A Multi-class Multi-label Machine Learning Approach

Author: Lingner, Thomas, Meinicke, Peter, Istrail, Sorin, editor, Pevzner, Pavel, editor, Waterman, Michael S., editor, Crandall, Keith A., editor, and Lagergren, Jens, editor
Published: 2008
Full Text: View/download PDF

8. Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts

Author: Mersch, Britta, Glasmachers, Tobias, Meinicke, Peter, Igel, Christian, Hutchison, David, editor, Kanade, Takeo, editor, Kittler, Josef, editor, Kleinberg, Jon M., editor, Mattern, Friedemann, editor, Mitchell, John C., editor, Naor, Moni, editor, Nierstrasz, Oscar, editor, Pandu Rangan, C., editor, Steffen, Bernhard, editor, Sudan, Madhu, editor, Terzopoulos, Demetri, editor, Tygar, Doug, editor, Vardi, Moshe Y., editor, Weikum, Gerhard, editor, Kollias, Stefanos, editor, Stafylopatis, Andreas, editor, Duch, Włodzisław, editor, and Oja, Erkki, editor
Published: 2006
Full Text: View/download PDF

9. MarVis-Pathway: integrative and exploratory pathway analysis of non-targeted metabolomics data

Author: Kaever, Alexander, Landesfeind, Manuel, Feussner, Kirstin, Mosblech, Alina, Heilmann, Ingo, Morgenstern, Burkhard, Feussner, Ivo, and Meinicke, Peter
Published: 2015
Full Text: View/download PDF

10. Identification of Novel Plant Peroxisomal Targeting Signals by a Combination of Machine Learning Methods and in Vivo Subcellular Targeting Analyses

Author: Lingner, Thomas, Kataya, Amr R., Antonicelli, Gerardo E., Benichou, Aline, Nilssen, Kjersti, Chen, Xiong-Yan, Siemsen, Tanja, Morgenstern, Burkhard, Meinicke, Peter, and Reumann, Sigrun
Published: 2011
Full Text: View/download PDF

11. BinChecker: a new algorithm for quality assessment of microbial draft genomes

Author: Klingenberg, Heiner, primary and Meinicke, Peter, additional
Published: 2021
Full Text: View/download PDF

12. Mining metagenomes for natural product biosynthetic gene clusters: unlocking new potential with ultrafast techniques

Author: Pereira-Flores, Emiliano, Medema, Marnix, Buttigieg, Pier Luigi, Meinicke, Peter, Glöckner, Frank Oliver, and Fernández-Guerra, Antonio
Abstract: Microorganisms produce an immense variety of natural products through the expression of Biosynthetic Gene Clusters (BGCs): physically clustered genes that encode the enzymes of a specialized metabolic pathway. These natural products cover a wide range of chemical classes (e.g., aminoglycosides, lantibiotics, nonribosomal peptides, oligosaccharides, polyketides, terpenes) that are highly valuable for industrial and medical applications (1). Metagenomics, as a culture-independent approach, has greatly enhanced our ability to survey the functional potential of microorganisms and is growing in popularity for the mining of BGCs. However, to effectively exploit metagenomic data to this end, it will be crucial to more efficiently identify these genomic elements in highly complex and ever-increasing volumes of data (2). Here, we address this challenge by developing the ultrafast Biosynthetic Gene cluster MEtagenomic eXploration toolbox (BiG-MEx). BiG-MEx rapidly identifies a broad range of BGC protein domains, assesses their diversity and novelty, and predicts the abundance profile of natural product BGC classes in metagenomic data. We show the advantages of BiG-MEx compared to standard BGC-mining approaches, and use it to explore the BGC domain and class composition of samples in the TARA Oceans (3) and Human Microbiome Project datasets (4). In these analyses, we demonstrate BiG-MEx's applicability to study the distribution, diversity, and ecological roles of BGCs in metagenomic data, and guide the exploration of natural products with clinical applications. Competing Interest Statement: The authors have declared no competing interest.
Published: 2021

13. Mining metagenomes for natural product biosynthetic gene clusters: unlocking new potential with ultrafast techniques

Author: Pereira-Flores, Emiliano, Medema, M, Buttigieg, Pier Luigi, Meinicke, Peter, Glöckner, Frank Oliver, and Fernández-Guerra, Antonio
Published: 2021

14. Metabolic priming by a secreted fungal effector

Author: Djamei, Armin, Schipper, Kerstin, Rabe, Franziska, Ghosh, Anupama, Vincon, Volker, Kahnt, Jorg, Osorio, Sonia, Tohge, Takayuki, Fernie, Alisdair R., Feussner, Ivo, Feussner, Kirstin, Meinicke, Peter, Stierhof, York-Dieter, Schwarz, Heinz, Macek, Boris, Mann, Matthias, and Kahmann, Regine
Subjects: Chorismate -- Genetic aspects -- Research, Mutation (Biology) -- Research, Tumors, Plant -- Genetic aspects -- Research -- Risk factors, Environmental issues, Science and technology, Zoology and wildlife conservation
Abstract: Maize smut caused by the fungus Ustilago maydis is a widespread disease characterized by the development of large plant tumours. U. maydis is a biotrophic pathogen that requires living plant tissue for its development and establishes an intimate interaction zone between fungal hyphae and the plant plasma membrane. U. maydis actively suppresses plant defence responses by secreted protein effectors (1,2). Its effector repertoire comprises at least 386 genes mostly encoding proteins of unknown function (1,3,4) and expressed exclusively during the biotrophic stage (3). The U. maydis secretome also contains about 150 proteins with probable roles in fungal nutrition, fungal cell wall modification and host penetration as well as proteins unlikely to act in the fungal-host interface (4) like a chorismate mutase. Chorismate mutases are key enzymes of the shikimate pathway and catalyse the conversion of chorismate to prephenate, the precursor for tyrosine and phenylalanine synthesis. Root-knot nematodes inject a secreted chorismate mutase into plant cells likely to affect development (5,6). Here we show that the chorismate mutase Cmu1 secreted by U. maydis is a virulence factor. The enzyme is taken up by plant cells, can spread to neighbouring cells and changes the metabolic status of these cells through metabolic priming. Secreted chorismate mutases are found in many plant-associated microbes and might serve as general tools for host manipulation. The U. maydis genome (http://mips.helmholtz-muenchen.de/genre/proj/ustilago) contains genes for both a cytosolic chorismate mutase, designated aro7 (um04220), and a putatively secreted chorismate mutase, cmu1 (um05731). Cmu1 belongs to the AroQ class [...]
Published: 2011
Full Text: View/download PDF

15. Metabolite Clustering and Visualization of Mass Spectrometry Data Using One-Dimensional Self-Organizing Maps

Author: Kaever, Alexander, primary, Landesfeind, Manuel, additional, Feussner, Kirstin, additional, Feussner, Ivo, additional, and Meinicke, Peter, additional
Published: 2013
Full Text: View/download PDF

16. Mining metagenomes for natural product biosynthetic gene clusters: unlocking new potential with ultrafast techniques

Author: Pereira-Flores, Emiliano, primary, Medema, Marnix, additional, Buttigieg, Pier Luigi, additional, Meinicke, Peter, additional, Glöckner, Frank Oliver, additional, and Fernández-Guerra, Antonio, additional
Published: 2021
Full Text: View/download PDF

17. Principal surfaces from unsupervised kernel regression

Author: Meinicke, Peter, Klanke, Stefan, Memisevic, Roland, and Ritter, Helge
Subjects: Artificial intelligence, Principal components analysis, Kernel functions, Dimensional analysis
Abstract: We propose a nonparametric approach to learning of principal surfaces based on an unsupervised formulation of the Nadaraya-Watson kernel regression estimator. As compared with previous approaches to principal curves and surfaces, the new method offers several advantages: First, it provides a practical solution to the model selection problem because all parameters can be estimated by leave-one-out cross-validation without additional computational cost. In addition, our approach allows for a convenient incorporation of nonlinear spectral methods for parameter initialization, beyond classical initializations based on linear PCA. Furthermore, it shows a simple way to fit principal surfaces in general feature spaces, beyond the usual data space setup. The experimental results illustrate these convenient features on simulated and real data. Index Terms--Dimensionality reduction, principal curves, principal surfaces, density estimation, model selection, kernel methods.
Published: 2005
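
Note on entry 17: the abstract builds on the Nadaraya-Watson kernel regression estimator and on parameter selection by leave-one-out cross-validation. The following is a minimal sketch of the classical supervised estimator only, not the unsupervised formulation proposed by the authors. The Gaussian kernel, the function names, and the naive leave-one-out loop are assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    # Unnormalized Gaussian kernel; the normalizing constant cancels in the ratio below.
    return math.exp(-0.5 * u * u)


def nadaraya_watson(x_query, xs, ys, bandwidth, skip=None):
    """Kernel-weighted average of ys at x_query.

    `skip` is a hypothetical convenience argument: the training index to leave out.
    """
    num = 0.0
    den = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        if i == skip:
            continue
        w = gaussian_kernel((x_query - x) / bandwidth)
        num += w * y
        den += w
    # If all weights underflow to zero, fall back to 0.0 instead of dividing by zero.
    return num / den if den > 0.0 else 0.0


def loo_cv_error(xs, ys, bandwidth):
    # Mean squared leave-one-out prediction error for one bandwidth.
    total = 0.0
    for i, (x, y) in enumerate(zip(xs, ys)):
        pred = nadaraya_watson(x, xs, ys, bandwidth, skip=i)
        total += (y - pred) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    # Pick the candidate bandwidth with the smallest leave-one-out error.
    return min(candidates, key=lambda h: loo_cv_error(xs, ys, h))
```

For example, `select_bandwidth(xs, ys, [0.1, 0.5, 1.0])` returns whichever of the three candidate bandwidths has the lowest leave-one-out error on the given data. This naive loop recomputes every prediction from scratch and is meant only to show the idea.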

18. BCI competition 2003--data set IIb: support vector machines for the P300 speller paradigm

Author: Kaper, Matthias, Meinicke, Peter, Lingner, Thomas, and Ritter, Helge
Subjects: Biomedical engineering -- Research, Biological sciences, Business, Computers, Health care industry
Abstract: We propose an approach to analyze data from the P300 speller paradigm using the machine-learning technique support vector machines. In a conservative classification scheme, we found the correct solution after five repetitions. While the classification within the competition is designed for offline analysis, our approach is also well-suited for a real-world online solution: It is fast, requires only 10 electrode positions and demands only a small amount of preprocessing. Index Terms--BCI competition 2003, brain-computer interface, P300 speller, SVM.
Published: 2004

19. Protein signature-based estimation of metagenomic abundances including all domains of life and viruses

Author: Klingenberg, Heiner, Aßhauer, Kathrin Petra, Lingner, Thomas, and Meinicke, Peter
Published: 2013
Full Text: View/download PDF

20. Gene Prediction in Metagenomic Fragments with Orphelia: A Large-Scale Machine Learning Approach

Author: Hoff, Katharina J., primary, Tech, Maike, additional, Lingner, Thomas, additional, Daniel, Rolf, additional, Morgenstern, Burkhard, additional, and Meinicke, Peter, additional
Published: 2011
Full Text: View/download PDF

21. CoMet—a web server for comparative functional profiling of metagenomes

Author: Lingner, Thomas, Aßhauer, Kathrin Petra, Schreiber, Fabian, and Meinicke, Peter
Published: 2011
Full Text: View/download PDF

22. Mixture models for analysis of the taxonomic composition of metagenomes

Author: Meinicke, Peter, Aßhauer, Kathrin Petra, and Lingner, Thomas
Published: 2011
Full Text: View/download PDF

23. The COP9 signalosome mediates transcriptional and metabolic response to hormones, oxidative stress protection and cell wall rearrangement during fungal development

Author: Nahlik, Krystyna, Dumkow, Marc, Bayram, Özgür, Helmstaedt, Kerstin, Busch, Silke, Valerius, Oliver, Gerke, Jennifer, Hoppert, Michael, Schwier, Elke, Opitz, Lennart, Westermann, Mieke, Grond, Stephanie, Feussner, Kirstin, Goebel, Cornelia, Kaever, Alexander, Meinicke, Peter, Feussner, Ivo, and Braus, Gerhard H.
Published: 2010
Full Text: View/download PDF

24. DIALIGN-TX and multiple protein alignment using secondary structure information at GOBICS

Author: Subramanian, Amarendran R., Hiran, Suvrat, Steinkamp, Rasmus, Meinicke, Peter, Corel, Eduardo, and Morgenstern, Burkhard
Published: 2010
Full Text: View/download PDF

25. Treephyler: fast taxonomic profiling of metagenomes

Author: Schreiber, Fabian, Gumrich, Peter, Daniel, Rolf, and Meinicke, Peter
Published: 2010
Full Text: View/download PDF

26. Evolutionary Optimization of Sequence Kernels for Detection of Bacterial Gene Starts

Author: Mersch, Britta, primary, Glasmachers, Tobias, additional, Meinicke, Peter, additional, and Igel, Christian, additional
Published: 2006
Full Text: View/download PDF

27. Orphelia: predicting genes in metagenomic sequencing reads

Author: Hoff, Katharina J., Lingner, Thomas, Meinicke, Peter, and Tech, Maike
Published: 2009

28. Remote homology detection based on oligomer distances

Author: Lingner, Thomas and Meinicke, Peter
Published: 2006

29. TICO: a tool for improving predictions of prokaryotic translation initiation sites

Author: Tech, Maike, Pfeifer, Nico, Morgenstern, Burkhard, and Meinicke, Peter
Published: 2005

30. Tax4Fun2: a R-based tool for the rapid prediction of habitat-specific functional profiles and functional redundancy based on 16S rRNA gene marker gene sequences

Author: Wemheuer, Franziska, primary, Taylor, Jessica A, additional, Daniel, Rolf, additional, Johnston, Emma, additional, Meinicke, Peter, additional, Thomas, Torsten, additional, and Wemheuer, Bernd, additional
Published: 2018
Full Text: View/download PDF

31. Predicting phenotypic traits of prokaryotes from protein domain frequencies

Author: Notredame, Cedric, Gabaldón, Toni, Mühlhausen, Stefanie, Lingner, Thomas, and Meinicke, Peter
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Establishing the relationship between an organism's genome sequence and its phenotype is a fundamental challenge that remains largely unsolved. Accurately predicting microbial phenotypes solely based on genomic features will allow us to infer relevant phenotypic characteristics when the availability of a genome sequence precedes experimental characterization, a scenario that is favored by the advent of novel high-throughput and single cell sequencing techniques. Results: We present a novel approach to predict the phenotype of prokaryotes directly from their protein domain frequencies. Our discriminative machine learning approach provides high prediction accuracy of relevant phenotypes such as motility, oxygen requirement or spore formation. Moreover, the set of discriminative domains provides biological insight into the underlying phenotype-genotype relationship and enables deriving hypotheses on the possible functions of uncharacterized domains. Conclusions: Fast and accurate prediction of microbial phenotypes based on genomic protein domain content is feasible and has the potential to provide novel biological insights. First results of a systematic check for annotation errors indicate that our approach may also be applied to semi-automatic correction and completion of the existing phenotype annotation.
Published: 2010
Full Text: View/download PDF

32. UFO: a web server for ultra-fast functional profiling of whole genome protein sequences

Author: Meinicke, Peter
Subjects: Biotechnology, TP248.13-248.65, Genetics, QH426-470
Abstract: Background: Functional profiling is a key technique to characterize and compare the functional potential of entire genomes. The estimation of profiles according to an assignment of sequences to functional categories is a computationally expensive task because it requires the comparison of all protein sequences from a genome with a usually large database of annotated sequences or sequence families. Description: Based on machine learning techniques for Pfam domain detection, the UFO web server for ultra-fast functional profiling allows researchers to process large protein sequence collections instantaneously. Besides the frequencies of Pfam and GO categories, the user also obtains the sequence specific assignments to Pfam domain families. In addition, a comparison with existing genomes provides dissimilarity scores with respect to 821 reference proteomes. Considering the underlying UFO domain detection, the results on 206 test genomes indicate a high sensitivity of the approach. In comparison with current state-of-the-art HMMs, the runtime measurements show a considerable speed up in the range of four orders of magnitude. For an average size prokaryotic genome, the computation of a functional profile together with its comparison typically requires about 10 seconds of processing time. Conclusion: For the first time the UFO web server makes it possible to get a quick overview on the functional inventory of newly sequenced organisms. The genome scale comparison with a large number of precomputed profiles allows a first guess about functionally related organisms. The service is freely available and does not require user registration or specification of a valid email address.
Published: 2009
Full Text: View/download PDF

33. MarVis: a tool for clustering and visualization of metabolic biomarkers

Author: Feussner, Ivo, Göbel, Cornelia, Feussner, Kirstin, Lingner, Thomas, Kaever, Alexander, and Meinicke, Peter
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: A central goal of experimental studies in systems biology is to identify meaningful markers that are hidden within a diffuse background of data originating from large-scale analytical intensity measurements as obtained from metabolomic experiments. Intensity-based clustering is an unsupervised approach to the identification of metabolic markers based on the grouping of similar intensity profiles. A major problem of this basic approach is that in general there is no prior information about an adequate number of biologically relevant clusters. Results: We present the tool MarVis (Marker Visualization) for data mining on intensity-based profiles using one-dimensional self-organizing maps (1D-SOMs). MarVis can import and export customizable CSV (Comma Separated Values) files and provides aggregation and normalization routines for preprocessing of intensity profiles that contain repeated measurements for a number of different experimental conditions. Robust clustering is then achieved by training of a 1D-SOM model, which introduces a similarity-based ordering of the intensity profiles. The ordering allows a convenient visualization of the intensity variations within the data and facilitates an interactive aggregation of clusters into larger blocks. The intensity-based visualization is combined with the presentation of additional data attributes, which can further support the analysis of experimental data. Conclusion: MarVis is a user-friendly and interactive tool for exploration of complex pattern variation in a large set of experimental intensity profiles. The application of 1D-SOMs gives a convenient overview of relevant profiles and groups of profiles. The specialized visualization effectively supports researchers in analyzing a large number of putative clusters, even though the true number of biologically meaningful groups is unknown. Although MarVis has been developed for the analysis of metabolomic data, the tool may be applied to gene expression data as well.
Published: 2009
Full Text: View/download PDF
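
Note on entry 33: the abstract describes clustering intensity profiles with a one-dimensional self-organizing map (1D-SOM), which places similar profiles at neighbouring positions along a chain of prototypes. The following is a minimal sketch of a generic 1D-SOM, not the MarVis implementation. The learning-rate and neighbourhood schedules, the function names, and the defaults are assumptions chosen for illustration.

```python
import math
import random


def best_unit(x, protos):
    # Index of the prototype closest to x in squared Euclidean distance.
    return min(
        range(len(protos)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(x, protos[j])),
    )


def train_som_1d(profiles, n_units, epochs=50, seed=0):
    """Train a chain of n_units prototypes on a list of equal-length profiles."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Initialize each prototype as a copy of a randomly chosen profile.
    protos = [list(rng.choice(profiles)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / max(1, epochs - 1)
        lr = 0.5 * (1.0 - frac) + 0.01                    # decaying learning rate (assumed schedule)
        radius = max(n_units / 2.0 * (1.0 - frac), 0.5)   # shrinking neighbourhood width
        order = list(range(len(profiles)))
        rng.shuffle(order)
        for idx in order:
            x = profiles[idx]
            bmu = best_unit(x, protos)
            for j in range(n_units):
                # Units near the winner on the 1D chain are pulled more strongly towards x.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for d in range(dim):
                    protos[j][d] += lr * h * (x[d] - protos[j][d])
    return protos


def order_profiles(profiles, protos):
    # Profile indices sorted by the chain position of their best-matching unit.
    return sorted(range(len(profiles)), key=lambda i: best_unit(profiles[i], protos))
```

Calling `order_profiles(profiles, train_som_1d(profiles, n_units=5))` yields a similarity-based ordering of the profile indices, which is the kind of ordering the abstract uses as the basis for visualization.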

34. Metabolite-based clustering and visualization of mass spectrometry data using one-dimensional self-organizing maps

Author: Karlovsky, Petr, Feussner, Ivo, Göbel, Cornelia, Feussner, Kirstin, Kaever, Alexander, Lingner, Thomas, Meinicke, Peter, and Morgenstern, Burkhard
Subjects: Biology (General), QH301-705.5, Genetics, QH426-470
Abstract: Background: One of the goals of global metabolomic analysis is to identify metabolic markers that are hidden within a large background of data originating from high-throughput analytical measurements. Metabolite-based clustering is an unsupervised approach for marker identification based on grouping similar concentration profiles of putative metabolites. A major problem of this approach is that in general there is no prior information about an adequate number of clusters. Results: We present an approach for data mining on metabolite intensity profiles as obtained from mass spectrometry measurements. We propose one-dimensional self-organizing maps for metabolite-based clustering and visualization of marker candidates. In a case study on the wound response of Arabidopsis thaliana, based on metabolite profile intensities from eight different experimental conditions, we show how the clustering and visualization capabilities can be used to identify relevant groups of markers. Conclusion: Our specialized realization of self-organizing maps is well suited for gaining insight into complex pattern variation in a large set of metabolite profiles. In comparison to other methods, our visualization approach facilitates the identification of interesting groups of metabolites by means of a convenient overview of relevant intensity patterns. In particular, the visualization effectively supports researchers in analyzing many putative clusters when the true number of biologically meaningful groups is unknown.
Published: 2008
Full Text: View/download PDF

35. Word correlation matrices for protein sequence analysis and remote homology detection

Author: Meinicke, Peter and Lingner, Thomas
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Classification of protein sequences is a central problem in computational biology. Currently, among computational methods discriminative kernel-based approaches provide the most accurate results. However, kernel-based methods often lack an interpretable model for analysis of discriminative sequence features, and predictions on new sequences usually are computationally expensive. Results: In this work we present a novel kernel for protein sequences based on average word similarity between two sequences. We show that this kernel gives rise to a feature space that allows analysis of discriminative features and fast classification of new sequences. We demonstrate the performance of our approach on a widely-used benchmark setup for protein remote homology detection. Conclusion: Our word correlation approach provides highly competitive performance as compared with state-of-the-art methods for protein remote homology detection. The learned model is interpretable in terms of biologically meaningful features. In particular, analysis of discriminative words allows the identification of characteristic regions in biological sequences. Because of its high computational efficiency, our method can be applied to ranking of potential homologs in large databases.
Published: 2008
Full Text: View/download PDF

36. Gene prediction in metagenomic fragments: A large scale machine learning approach

Author: Morgenstern, Burkhard, Daniel, Rolf, Lingner, Thomas, Tech, Maike, Hoff, Katharina J, and Meinicke, Peter
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. The amount of metagenomic sequence data is growing fast while computational methods for metagenome analysis are still in their infancy. In contrast to genomic sequences of single species, which can usually be assembled and analyzed by many available methods, a large proportion of metagenome data remains as unassembled anonymous sequencing reads. One of the aims of all metagenomic sequencing projects is the identification of novel genes. Short length (for example, Sanger sequencing yields on average 700 bp fragments) and unknown phylogenetic origin of most fragments require approaches to gene prediction that are different from the currently available methods for genomes of single species. In particular, the large size of metagenomic samples requires fast and accurate methods with small numbers of false positive predictions. Results: We introduce a novel gene prediction algorithm for metagenomic fragments based on a two-stage machine learning approach. In the first stage, we use linear discriminants for monocodon usage, dicodon usage and translation initiation sites to extract features from DNA sequences. In the second stage, an artificial neural network combines these features with open reading frame length and fragment GC-content to compute the probability that this open reading frame encodes a protein. This probability is used for the classification and scoring of gene candidates. With large scale training, our method provides fast single fragment predictions with good sensitivity and specificity on artificially fragmented genomic DNA. Additionally, this method is able to predict translation initiation sites accurately and distinguishes complete from incomplete genes with high reliability. Conclusion: Large scale machine learning methods are well-suited for gene prediction in metagenomic DNA fragments. In particular, the combination of linear discriminants and neural networks is promising and should be considered for integration into metagenomic analysis pipelines. The data sets can be downloaded from the URL provided (see Availability and requirements section).
Published: 2008
Full Text: View/download PDF

37. P-value based visualization of codon usage data

Author: Fricke, Wolfgang, Brodag, Thomas, Meinicke, Peter, and Waack, Stephan
Subjects: Biology (General), QH301-705.5, Genetics, QH426-470
Abstract: Two important and not yet solved problems in bacterial genome research are the identification of horizontally transferred genes and the prediction of gene expression levels. Both problems can be addressed by multivariate analysis of codon usage data. In particular, dimensionality reduction methods for visualization of multivariate data have been shown to be effective tools for codon usage analysis. We here propose a multidimensional scaling approach using a novel similarity measure for codon usage tables. Our probabilistic similarity measure is based on P-values derived from the well-known chi-square test for comparison of two distributions. Experimental results on four microbial genomes indicate that the new method is well-suited for the analysis of horizontal gene transfer and translational selection. As compared with the widely-used correspondence analysis, our method did not suffer from outlier sensitivity and showed a better clustering of putative alien genes in most cases.
Published: 2006
Full Text: View/download PDF

38. Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

Author: Surovcik, Katharina, Fricke, Wolfgang, Damm, Carsten, Brodag, Thomas, Asper, Roman, Keller, Oliver, Waack, Stephan, Meinicke, Peter, and Merkl, Rainer
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or more specifically pathogenicity or symbiotic islands. Results: We have implemented the program SIGI-HMM that predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of codon usage (CU) of each individual gene of a genome under study. CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model working on gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in Java and is publicly available. It accepts as input any file created according to the EMBL format. It generates output in the common GFF format readable by genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were consistent both with annotated GIs and with predictions generated by different methods. Conclusion: SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows users to analyze genomes interactively and in detail, and to generate or test hypotheses about the origin of acquired genes.
Published: 2006
Full Text: View/download PDF

39. An unsupervised classification scheme for improving predictions of prokaryotic TIS

Author: Meinicke, Peter and Tech, Maike
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Although it is not difficult for state-of-the-art gene finders to identify coding regions in prokaryotic genomes, exact prediction of the corresponding translation initiation sites (TIS) is still a challenging problem. Recently, a number of post-processing tools have been proposed for improving the annotation of prokaryotic TIS. However, inherent difficulties of these approaches arise from the considerable variation of TIS characteristics across different species. Therefore, prior assumptions about the properties of prokaryotic gene starts may cause suboptimal predictions for newly sequenced genomes with TIS signals differing from those of well-investigated genomes. Results: We introduce a clustering algorithm for completely unsupervised scoring of potential TIS, based on positionally smoothed probability matrices. The algorithm requires an initial gene prediction and the genomic sequence of the organism to perform the reannotation. In contrast to other methods for improving predictions of gene starts in bacterial genomes, our approach is not based on any specific assumptions about prokaryotic TIS. Despite the generality of the underlying algorithm, the prediction rate of our method is competitive on experimentally verified test data from E. coli and B. subtilis. For genomes with high G+C content, in contrast to some previously proposed methods, our algorithm also provides good performance on P. aeruginosa, B. pseudomallei and R. solanacearum. Conclusion: On reliable test data we showed that our method provides good results in post-processing the predictions of the widely used program GLIMMER. The underlying clustering algorithm is robust with respect to variations in the initial TIS annotation and does not require specific assumptions about prokaryotic gene starts. These features are particularly useful on genomes with high G+C content. The algorithm has been implemented in the tool "TICO" (TIs COrrector), which is publicly available from our web site.
Published: 2006
Full Text: View/download PDF

40. Oligo kernels for datamining on biological sequences: a case study on prokaryotic translation initiation sites

Author: Merkl, Rainer, Morgenstern, Burkhard, Tech, Maike, and Meinicke, Peter
Subjects: Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: Kernel-based learning algorithms are among the most advanced machine learning methods and have been successfully applied to a variety of sequence classification tasks within the field of bioinformatics. Conventional kernels utilized so far do not provide an easy interpretation of the learnt representations in terms of positional and compositional variability of the underlying biological signals. Results: We propose a kernel-based approach to datamining on biological sequences. With our method it is possible to model and analyze positional variability of oligomers of any length in a natural way. On the one hand, this is achieved by mapping the sequences to an intuitive but high-dimensional feature space, well suited for interpretation of the learnt models. On the other hand, by means of the kernel trick we can provide a general learning algorithm for that high-dimensional representation because all required statistics can be computed without performing an explicit feature space mapping of the sequences. By introducing a kernel parameter that controls the degree of position-dependency, our feature space representation can be tailored to the characteristics of the biological problem at hand. A regularized learning scheme enables application even to biological problems for which only small sets of example sequences are available. Our approach includes a visualization method for transparent representation of characteristic sequence features. The importance of features can thereby be measured in terms of discriminative strength with respect to classification of the underlying sequences. To demonstrate and validate our concept on a biochemically well-defined case, we analyze E. coli translation initiation sites in order to show that we can find biologically relevant signals. For that case, our results clearly show that the Shine-Dalgarno sequence is the most important signal upstream of a start codon. The variability in position and composition we found for that signal is in accordance with previous biological knowledge. We also find evidence for signals downstream of the start codon, previously introduced as transcriptional enhancers. These signals are mainly characterized by occurrences of adenine in a region of about 4 nucleotides next to the start codon. Conclusions: We showed that the oligo kernel can provide a valuable tool for the analysis of relevant signals in biological sequences. In the case of translation initiation sites we could clearly deduce the most discriminative motifs and their positional variation from example sequences. Attractive features of our approach are its flexibility with respect to oligomer length and position conservation. By means of these two parameters oligo kernels can easily be adapted to different biological problems.
Published: 2004
Full Text: View/download PDF

41. Exploring Neighborhoods in the Metagenome Universe

Author: Aßhauer, Kathrin P., Klingenberg, Heiner, Lingner, Thomas, and Meinicke, Peter
Subjects: metagenomics, functional profile, Genome, Human, Microbiota, Genomics, Sequence Analysis, DNA, taxonomic profile, Article, metagenome comparison, lcsh:Chemistry, lcsh:Biology (General), lcsh:QD1-999, Humans, Metagenome, lcsh:QH301-705.5
Abstract: The variety of metagenomes in current databases provides a rapidly growing source of information for comparative studies. However, the quantity and quality of supplementary metadata are still lagging behind. It is therefore important to be able to identify related metagenomes by means of the available sequence data alone. We have studied efficient sequence-based methods for large-scale identification of similar metagenomes within a database retrieval context. In a broad comparison of different profiling methods we found that vector-based distance measures are well suited for the detection of metagenomic neighbors. Our evaluation on more than 1700 publicly available metagenomes indicates that for a query metagenome from a particular habitat, on average nine out of ten nearest neighbors represent the same habitat category, independent of the utilized profiling method or distance measure. While for well-defined labels a neighborhood accuracy of 100% can be achieved, in general the neighbor detection is severely affected by a natural overlap of manually annotated categories. In addition, we present results of a novel visualization method that is able to reflect the similarity of metagenomes in a 2D scatter plot. The visualization method shows a similarly high accuracy in the reduced space as compared with the high-dimensional profile space. Our study suggests that for inspection of metagenome neighborhoods the profiling methods and distance measures can be chosen to provide a convenient interpretation of results in terms of the underlying features. Furthermore, supplementary metadata of metagenome samples will in future need to comply with readily available ontologies for fine-grained and standardized annotation. To make profile-based k-nearest-neighbor search and the 2D visualization of the metagenome universe available to the research community, we included the proposed methods in our CoMet-Universe server for comparative metagenome analysis.
Open Access Publication Fund 2014; peer-reviewed.
Published: 2014

42. Critical Assessment of Metagenome Interpretation: a benchmark of metagenomics software

Author: Sczyrba, Alexander, Hofmann, Peter, Belmann, Peter, Koslicki, David, Janssen, Stefan, Dröge, Johannes, Gregor, Ivan, Majda, Stephan, Fiedler, Jessika, Dahms, Eik, Bremges, Andreas, Fritz, Adrian, Garrido-Oter, Ruben, Jørgensen, Tue Sparholt, Shapiro, Nicole, Blood, Philip D., Gurevich, Alexey, Bai, Yang, Turaev, Dmitrij, DeMaere, Matthew Z., Chikhi, Rayan, Nagarajan, Niranjan, Quince, Christopher, Meyer, Fernando, Balvociute, Monika, Hansen, Lars Hestbjerg, Sørensen, Søren Johannes, Chia, Burton K. H., Denis, Bertrand, Froula, Jeff L., Wang, Zhong, Egan, Robert, Kang, Dongwan Don, Cook, Jeffrey J., Deltel, Charles, Beckstette, Michael, Lemaitre, Claire, Peterlongo, Pierre, Rizk, Guillaume, Lavenier, Dominique, Wu, Yu-Wei, Singer, Steven W., Jain, Chirag, Strous, Marc, Klingenberg, Heiner, Meinicke, Peter, Barton, Michael D., Lingner, Thomas, Lin, Hsin-Hung, Liao, Yu-Chieh, Silva, Genivaldo Gueiros Z., Cuevas, Daniel A., Edwards, Robert A., Saha, Surya, Piro, Vitor C., Renard, Bernhard Y., Pop, Mihai, Klenk, Hans-Peter, Göker, Markus, Kyrpides, Nikos C., Woyke, Tanja, Vorholt, Julia A., Schulze-Lefert, Paul, Rubin, Edward M., Darling, Aaron E., Rattei, Thomas, and McHardy, Alice C.
Published: 2017

43. Critical assessment of metagenome interpretation − a benchmark of computational metagenomics software

Author: Sczyrba, Alexander, Hofmann, Peter, Belmann, Peter, Koslicki, David, Janssen, Stefan, Droege, Johannes, Gregor, Ivan, Majda, Stephan, Fiedler, Jessika, Dahms, Eik, Bremges, Andreas, Fritz, Adrian, Garrido-Oter, Ruben, Sparholt Jorgensen, Tue, Shapiro, Nicole, Blood, Philip D., Gurevich, Alexey, Bai, Yang, Turaev, Dmitrij, DeMaere, Matthew Z., Chikhi, Rayan, Nagarajan, Niranjan, Quince, Christopher, Meyer, Fernando, Balvociute, Monika, Hestbjerg Hansen, Lars, Sorensen, Soren J., Chia, Burton K. H., Denis, Bertrand, Froula, Jeff L., Wang, Zhong, Egan, Robert, Kang, Dongwan Don, Cook, Jeffrey J., Deltel, Charles, Beckstette, Michael, Lemaitre, Claire, Peterlongo, Pierre, Rizk, Guillaume, Lavenier, Dominique, Wu, Yu-Wei, Singer, Steven W., Jain, Chirag, Strous, Marc, Klingenberg, Heiner, Meinicke, Peter, Barton, Michael, Lingner, Thomas, Lin, Hsin-Hung, Liao, Yu-Chieh, Gueiros Z Silva, Genivaldo, Cuevas, Daniel A., Edwards, Robert A., Saha, Surya, Piro, Vitor C., Renard, Bernhard Y., Pop, Mihai, Klenk, Hans-Peter, Goeker, Markus, Kyrpides, Nikos C., Woyke, Tanja, Vorholt, Julia A., Schulze-Lefert, Paul, Rubin, Edward M., Darling, Aaron E., Rattei, Thomas, and McHardy, Alice C.
Abstract: In metagenome analysis, computational methods for assembly, taxonomic profiling and binning are key components facilitating downstream biological data interpretation. However, a lack of consensus about benchmarking datasets and evaluation metrics complicates proper performance assessment. The Critical Assessment of Metagenome Interpretation (CAMI) challenge has engaged the global developer community to benchmark their programs on datasets of unprecedented complexity and realism. Benchmark metagenomes were generated from ~700 newly sequenced microorganisms and ~600 novel viruses and plasmids, including genomes with varying degrees of relatedness to each other and to publicly available ones and representing common experimental setups. Across all datasets, assembly and genome binning programs performed well for species represented by individual genomes, while performance was substantially affected by the presence of related strains. Taxonomic profiling and binning programs were proficient at high taxonomic ranks, with a notable performance decrease below the family level. Parameter settings substantially impacted performances, underscoring the importance of program reproducibility. While highlighting current challenges in computational metagenomics, the CAMI results provide a roadmap for software selection to answer specific research questions.
Published: 2017

44. How to normalize metatranscriptomic count data for differential expression analysis

Author: Klingenberg, Heiner and Meinicke, Peter
Published: 2017
Full Text: View/download PDF

45. Critical Assessment of Metagenome Interpretation – a benchmark of computational metagenomics software

Author: Sczyrba, Alexander, Hofmann, Peter, Belmann, Peter, Koslicki, David, Janssen, Stefan, Dröge, Johannes, Gregor, Ivan, Majda, Stephan, Fiedler, Jessika, Dahms, Eik, Bremges, Andreas, Fritz, Adrian, Garrido-Oter, Ruben, Jørgensen, Tue Sparholt, Shapiro, Nicole, Blood, Philip D., Gurevich, Alexey, Bai, Yang, Turaev, Dmitrij, DeMaere, Matthew Z., Chikhi, Rayan, Nagarajan, Niranjan, Quince, Christopher, Meyer, Fernando, Balvočiūtė, Monika, Hansen, Lars Hestbjerg, Sørensen, Søren J., Chia, Burton K. H., Denis, Bertrand, Froula, Jeff L., Wang, Zhong, Egan, Robert, Kang, Dongwan Don, Cook, Jeffrey J., Deltel, Charles, Beckstette, Michael, Lemaitre, Claire, Peterlongo, Pierre, Rizk, Guillaume, Lavenier, Dominique, Wu, Yu-Wei, Singer, Steven W., Jain, Chirag, Strous, Marc, Klingenberg, Heiner, Meinicke, Peter, Barton, Michael, Lingner, Thomas, Lin, Hsin-Hung, Liao, Yu-Chieh, Silva, Genivaldo Gueiros Z., Cuevas, Daniel A., Edwards, Robert A., Saha, Surya, Piro, Vitor C., Renard, Bernhard Y., Pop, Mihai, Klenk, Hans-Peter, Göker, Markus, Kyrpides, Nikos C., Woyke, Tanja, Vorholt, Julia A., Schulze-Lefert, Paul, Rubin, Edward M., Darling, Aaron E., Rattei, Thomas, and McHardy, Alice C.
Published: 2017
Full Text: View/download PDF

46. Integrative study of Arabidopsis thaliana metabolomic and transcriptomic data with the interactive MarVis-Graph software

Author: Landesfeind, Manuel, Kaever, Alexander, Feussner, Kirstin, Thurow, Corinna, Gatz, Christiane, Feussner, Ivo, and Meinicke, Peter
Subjects: Metabolomics, Transcriptomics, Metabolic network analysis, DNA microarray, Metabolite fingerprinting, Bioinformatics, lcsh:R, lcsh:Medicine, Computational Biology, Plant Science
Abstract: State-of-the-art high-throughput technologies allow comprehensive experimental studies of organism metabolism and create the need for a convenient presentation of large heterogeneous datasets. In particular, the combined analysis and visualization of data from different high-throughput technologies remains a key challenge in bioinformatics. We present here the MarVis-Graph software for integrative analysis of metabolic and transcriptomic data. All experimental data is investigated in terms of the full metabolic network obtained from a reference database. The reactions of the network are scored based on the associated data, and sub-networks, according to connected high-scoring reactions, are identified. Finally, MarVis-Graph scores the detected sub-networks, evaluates them by means of a random permutation test and presents them as a ranked list. Furthermore, MarVis-Graph features an interactive network visualization that provides researchers with a convenient view on the results. The key advantage of MarVis-Graph is the analysis of reactions detached from their pathways so that it is possible to identify new pathways or to connect known pathways by previously unrelated reactions. The MarVis-Graph software is freely available for academic use and can be downloaded at: http://marvis.gobics.de/marvis-graph. Open Access Publication Fund 2014; peer-reviewed.
Published: 2014

47. Predicting the functional repertoire of an organism from unassembled RNA–seq data

Author: Landesfeind, Manuel and Meinicke, Peter
Subjects: Genetics, Biotechnology
Abstract: Background: The annotation of biomolecular functions is an essential step in the analysis of newly sequenced organisms. Usually, the functions are inferred from predicted genes on the genome using homology search techniques. A high-quality genomic sequence is an important prerequisite which, however, is difficult to achieve for certain organisms, such as hybrids or organisms with a large genome. For functional analysis it is also possible to use a de novo transcriptome assembly, but the computational requirements can be demanding. Up to now, it has been unclear how much of the functional repertoire of an organism can be reliably predicted from unassembled RNA-seq short reads alone. Results: We have conducted a study to investigate to what degree it is possible to reconstruct the functional profile of an organism from unassembled transcriptome data. We simulated the de novo prediction of biomolecular functions for Arabidopsis thaliana using a comprehensive RNA-seq data set. We evaluated the prediction performance using several homology search methods in combination with different evidence measures. For the decision on the presence or absence of a particular function under noisy conditions we propose a statistical mixture model enabling unsupervised estimation of a detection threshold. Our results indicate that the prediction of the biomolecular functions from the KEGG database is possible with a high sensitivity of up to 94 percent. In this setting, the application of the mixture model for automatic threshold calibration allowed the reduction of the falsely predicted functions down to 4 percent. Furthermore, we found that our statistical approach even outperforms the prediction from a de novo transcriptome assembly. Conclusion: The analysis of an organism’s transcriptome can provide a solid basis for the prediction of biomolecular functions. Using RNA-seq short reads directly, the functional profile of an organism can be reconstructed in a computationally efficient way to provide a draft annotation in cases where the classical genome-based approaches cannot be applied. Peer-reviewed.
Published: 2014

48. Dinucleotide distance histograms for fast detection of rRNA in metatranscriptomic sequences

Author: Klingenberg, Heiner, Martinjak, Robin, Glöckner, Frank Oliver, Daniel, Rolf, Lingner, Thomas, and Meinicke, Peter
Subjects: 000 Computer science, knowledge, general works, Computer Science
Abstract: With the advent of metatranscriptomics it has now become possible to study the dynamics of microbial communities. The analysis of environmental RNA-Seq data implies several challenges for the development of efficient tools in bioinformatics. One of the first steps in the computational analysis of metatranscriptomic sequencing reads requires the separation of rRNA and mRNA fragments to ensure that only protein coding sequences are actually used in a subsequent functional analysis. In the context of the rRNA filtering task it is desirable to have a broad spectrum of different methods in order to find a suitable trade-off between speed and accuracy for a particular dataset. We introduce a machine learning approach for the detection of rRNA in metatranscriptomic sequencing reads that is based on support vector machines in combination with dinucleotide distance histograms for feature representation. The results show that our SVM-based approach is at least one order of magnitude faster than any of the existing tools with only a slight degradation of the detection performance when compared to state-of-the-art alignment-based methods.
Published: 2013
Full Text: View/download PDF

49. Tax4Fun: predicting functional profiles from metagenomic 16S rRNA data

Author: Aßhauer, Kathrin P., Wemheuer, Bernd, Daniel, Rolf, and Meinicke, Peter
Published: 2015
Full Text: View/download PDF

50. UProC: tools for ultra-fast protein domain classification

Author: Meinicke, Peter
Published: 2014
Full Text: View/download PDF
