12 results on '"Basher, Abdur Rahman M. A."'
Search Results
2. Relabeling Metabolic Pathway Data with Groups to Improve Prediction Outcomes
- Author
-
Basher, Abdur Rahman M. A., primary and Hallam, Steven J., additional
- Published
- 2022
- Full Text
- View/download PDF
3. Metabolic pathway inference using multi-label classification with rich pathway features
- Author
-
Basher, Abdur Rahman M. A., McLaughlin, Ryan J., and Hallam, Steven J.
- Subjects
0301 basic medicine; Computer science; ved/biology.organism_classification_rank.species; Inference; Genome; Biochemistry; Machine Learning; 0302 clinical medicine; Software; Mathematical and Statistical Techniques; Metabolic potential; Databases, Genetic; Biology (General); 0303 health sciences; education.field_of_study; Ecology; Rule sets; Applied Mathematics; Simulation and Modeling; Statistics; Genomics; Enzymes; Computational Theory and Mathematics; Modeling and Simulation; Physical Sciences; Benchmark (computing); Metabolic Pathways; Algorithms; Metabolic Networks and Pathways; Research Article; Computer and Information Sciences; Cell Physiology; QH301-705.5; Population; Computational biology; Research and Analysis Methods; Genome Complexity; Biosynthesis; Low complexity; 03 medical and health sciences; Cellular and Molecular Neuroscience; Machine Learning Algorithms; Artificial Intelligence; Proteobacteria; Genetics; Statistical Methods; Model organism; education; Molecular Biology; Ecology, Evolution, Behavior and Systematics; 030304 developmental biology; Multi-label classification; ved/biology; business.industry; Biology and Life Sciences; Proteins; Computational Biology; Cell Biology; Cell Metabolism; Metabolic pathway; 030104 developmental biology; ComputingMethodologies_PATTERNRECOGNITION; Metabolism; Logistic Models; Enzymology; business; 030217 neurology & neurosurgery; Mathematics; Forecasting
- Abstract
Metabolic inference from genomic sequence information is a necessary step in determining the capacity of cells to make a living in the world at different levels of biological organization. A common method for determining the metabolic potential encoded in genomes is to map conceptually translated open reading frames onto a database containing known product descriptions. Such gene-centric methods are limited in their capacity to predict pathway presence or absence and do not support standardized rule sets for automated and reproducible research. Pathway-centric methods based on defined rule sets or machine learning algorithms provide an adjunct or alternative inference method that supports hypothesis generation and testing of metabolic relationships within and between cells. Here, we present mlLGPR, multi-label based on logistic regression for pathway prediction, a software package that uses supervised multi-label classification and rich pathway features to infer metabolic networks in organismal and multi-organismal datasets. We evaluated mlLGPR performance using a corpus of 12 experimental datasets manifesting diverse multi-label properties, including manually curated organismal genomes, synthetic microbial communities and low complexity microbial communities. Resulting performance metrics equaled or exceeded previous reports for organismal genomes and identified specific challenges associated with feature engineering and training data for community-level metabolic inference.
Author summary: Predicting the complex series of metabolic interactions (e.g. pathways) within and between cells from genomic sequence information is an integral problem in biology linking genotype to phenotype. This is a prerequisite to both understanding fundamental life processes and ultimately engineering these processes for specific biotechnological applications. A pathway prediction problem exists because we have limited knowledge of the reactions and pathways operating in cells, even in model organisms like Escherichia coli where the majority of protein functions are determined. To improve pathway prediction outcomes for genomes at different levels of complexity and completion, we have developed mlLGPR, multi-label based on logistic regression for pathway prediction, a scalable open source software package that uses supervised multi-label classification and rich pathway features to infer metabolic networks. We benchmark mlLGPR performance against other inference methods, providing a code base and metrics for continued application of machine learning methods to the pathway prediction problem.
- Published
- 2020
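The supervised multi-label setup named in the mlLGPR abstract above (one logistic-regression decision per pathway over shared features) can be illustrated with a minimal sketch. This is not the mlLGPR code: the toy enzyme-presence features, the two hypothetical pathways, the function names, and the plain full-batch gradient-descent loop are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(w, x):
    """Linear score; the bias is stored as the last entry of w."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w[-1]


def train_binary_relevance(X, Y, epochs=2000, lr=1.0):
    """Fit one independent logistic regression per label (binary relevance)."""
    n_features, n_labels = len(X[0]), len(Y[0])
    weights = [[0.0] * (n_features + 1) for _ in range(n_labels)]
    for _ in range(epochs):
        for k, w in enumerate(weights):
            # Full-batch gradient of the mean log-loss for label k.
            grad = [0.0] * (n_features + 1)
            for x, y in zip(X, Y):
                err = sigmoid(score(w, x)) - y[k]
                for j, xj in enumerate(x):
                    grad[j] += err * xj
                grad[-1] += err
            for j in range(n_features + 1):
                w[j] -= lr * grad[j] / len(X)
    return weights


def predict(weights, x, threshold=0.5):
    """Return a 0/1 decision for every label."""
    return [int(sigmoid(score(w, x)) >= threshold) for w in weights]


# Hypothetical toy data: rows are genomes, columns are enzyme presence flags.
X = [[1, 1, 0], [1, 1, 1], [0, 0, 1], [0, 0, 0], [1, 0, 0], [0, 1, 1]]
# Labels: [pathway_A, pathway_B]; A needs enzymes 0 and 1, B needs enzyme 2.
Y = [[1, 0], [1, 1], [0, 1], [0, 0], [0, 0], [0, 1]]

weights = train_binary_relevance(X, Y)
predictions = [predict(weights, x) for x in X]
```

Treating each pathway as its own binary problem is the simplest multi-label strategy; it ignores correlations between pathways, which a real system may want to model.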
4. Leveraging heterogeneous network embedding for metabolic pathway prediction
- Author
-
Basher, Abdur Rahman M. A. and Hallam, Steven J.
- Subjects
Statistics and Probability; AcademicSubjects/SCI01060; Computer science; Population; 02 engineering and technology; Machine learning; computer.software_genre; Biochemistry; Machine Learning; 03 medical and health sciences; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Leverage (statistics); education; Molecular Biology; 030304 developmental biology; chemistry.chemical_classification; 0303 health sciences; education.field_of_study; business.industry; Systems Biology; MetaCyc; Proteins; Genomics; Original Papers; Computer Science Applications; Visualization; Computational Mathematics; Metabolic pathway; Enzyme; ComputingMethodologies_PATTERNRECOGNITION; Computational Theory and Mathematics; chemistry; Embedding; Artificial intelligence; business; Heuristics; computer; Heterogeneous network; Metabolic Networks and Pathways; Software
- Abstract
Motivation: Metabolic pathway reconstruction from genomic sequence information is a key step in predicting regulatory and functional potential of cells at the individual, population and community levels of organization. Although the most common methods for metabolic pathway reconstruction are gene-centric, e.g. mapping annotated proteins onto known pathways using a reference database, pathway-centric methods based on heuristics or machine learning to infer pathway presence provide a powerful engine for hypothesis generation in biological systems. Such methods rely on rule sets or rich feature information that may not be known or readily accessible.
Results: Here, we present pathway2vec, a software package consisting of six representational learning modules used to automatically generate features for pathway inference. Specifically, we build a three-layered network composed of compounds, enzymes and pathways, where nodes within a layer manifest inter-interactions and nodes between layers manifest betweenness interactions. This layered architecture captures relevant relationships used to learn a neural embedding-based low-dimensional space of metabolic features. We benchmark pathway2vec performance based on node-clustering, embedding visualization and pathway prediction using MetaCyc as a trusted source. In the pathway prediction task, results indicate that it is possible to leverage embeddings to improve prediction outcomes.
Availability and implementation: The software package and installation instructions are published on http://github.com/pathway2vec.
Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2020
5. leADS: improved metabolic pathway inference based on active dataset subsampling
- Author
-
Basher, Abdur Rahman M. A., primary, Nallan, Aditi N., additional, McLaughlin, Ryan J., additional, Anstett, Julia, additional, and Hallam, Steven J., additional
- Published
- 2020
- Full Text
- View/download PDF
6. Relabeling metabolic pathway data with groups to improve prediction outcomes
- Author
-
Basher, Abdur Rahman M. A., primary and Hallam, Steven J., additional
- Published
- 2020
- Full Text
- View/download PDF
7. Metabolic pathway prediction using non-negative matrix factorization with improved precision
- Author
-
Basher, Abdur Rahman M. A., primary, McLaughlin, Ryan J., additional, and Hallam, Steven J., additional
- Published
- 2020
- Full Text
- View/download PDF
8. Leveraging Heterogeneous Network Embedding for Metabolic Pathway Prediction
- Author
-
Basher, Abdur Rahman M. A., primary and Hallam, Steven J., additional
- Published
- 2020
- Full Text
- View/download PDF
9. Metabolic pathway inference using multi-label classification with rich pathway features
- Author
-
Basher, Abdur Rahman M. A., primary, McLaughlin, Ryan J., additional, and Hallam, Steven J., additional
- Published
- 2020
- Full Text
- View/download PDF
10. Leveraging heterogeneous network embedding for metabolic pathway prediction.
- Author
-
Basher, Abdur Rahman M A and Hallam, Steven J
- Subjects
- *COMPUTER software installation, *INTEGRATED software, *BIOLOGICAL systems, *FORECASTING, *INTERNET servers, *LEARNING modules
- Abstract
Motivation: Metabolic pathway reconstruction from genomic sequence information is a key step in predicting regulatory and functional potential of cells at the individual, population and community levels of organization. Although the most common methods for metabolic pathway reconstruction are gene-centric, e.g. mapping annotated proteins onto known pathways using a reference database, pathway-centric methods based on heuristics or machine learning to infer pathway presence provide a powerful engine for hypothesis generation in biological systems. Such methods rely on rule sets or rich feature information that may not be known or readily accessible.
Results: Here, we present pathway2vec, a software package consisting of six representational learning modules used to automatically generate features for pathway inference. Specifically, we build a three-layered network composed of compounds, enzymes and pathways, where nodes within a layer manifest inter-interactions and nodes between layers manifest betweenness interactions. This layered architecture captures relevant relationships used to learn a neural embedding-based low-dimensional space of metabolic features. We benchmark pathway2vec performance based on node-clustering, embedding visualization and pathway prediction using MetaCyc as a trusted source. In the pathway prediction task, results indicate that it is possible to leverage embeddings to improve prediction outcomes.
Availability and implementation: The software package and installation instructions are published on http://github.com/pathway2vec.
Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
11. Heterogeneity-Preserving Discriminative Feature Selection for Subtype Discovery.
- Author
-
Basher ARMA, Hallinan C, and Lee K
- Abstract
The discovery of subtypes is pivotal for disease diagnosis and targeted therapy, considering the diverse responses of different cells or patients to specific treatments. Exploring the heterogeneity within disease or cell states provides insights into disease progression mechanisms and cell differentiation. The advent of high-throughput technologies has enabled the generation and analysis of various molecular data types, such as single-cell RNA-seq, proteomic, and imaging datasets, at large scales. While presenting opportunities for subtype discovery, these datasets pose challenges in finding relevant signatures due to their high dimensionality. Feature selection, a crucial step in the analysis pipeline, involves choosing signatures that reduce the feature size for more efficient downstream computational analysis. Numerous existing methods focus on selecting signatures that differentiate known diseases or cell states, yet they often fall short in identifying features that preserve heterogeneity and reveal subtypes. To identify features that can capture the diversity within each class while also maintaining the discrimination of known disease states, we employed deep metric learning-based feature embedding to conduct a detailed exploration of the statistical properties of features essential in preserving heterogeneity. Our analysis revealed that features with a significant difference in interquartile range (IQR) between classes possess crucial subtype information. Guided by this insight, we developed a robust statistical method, termed PHet (Preserving Heterogeneity) that performs iterative subsampling differential analysis of IQR and Fisher's method between classes, identifying a minimal set of heterogeneity-preserving discriminative features to optimize subtype clustering quality. 
Validation using public single-cell RNA-seq and microarray datasets showcased PHet's effectiveness in preserving sample heterogeneity while maintaining discrimination of known disease/cell states, surpassing the performance of previous outlier-based methods. Furthermore, analysis of a single-cell RNA-seq dataset from mouse tracheal epithelial cells revealed, through PHet-based features, the presence of two distinct basal cell subtypes undergoing differentiation toward a luminal secretory phenotype. Notably, one of these subtypes exhibited high expression of BPIFA1. Interestingly, previous studies have linked BPIFA1 secretion to the emergence of secretory cells during mucociliary differentiation of airway epithelial cells. PHet successfully pinpointed the basal cell subtype associated with this phenomenon, a distinction that pre-annotated markers and dispersion-based features failed to make due to their admixed feature expression profiles. These findings underscore the potential of our method to deepen our understanding of the mechanisms underlying diseases and cell differentiation and contribute significantly to personalized medicine.
Competing Interests: The authors declare no competing financial or non-financial interests.
- Published
- 2023
- Full Text
- View/download PDF
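Two statistical ingredients named in the PHet abstract above, the difference in interquartile range (IQR) between classes and Fisher's method for combining p-values, can be sketched as follows. This is not the PHet implementation: the function names and toy expression values are hypothetical, the iterative subsampling step is omitted, and the choice of the "inclusive" quantile method is an assumption.

```python
import math
import statistics


def iqr(values):
    """Interquartile range (Q3 - Q1) using inclusive quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1


def delta_iqr(case_values, control_values):
    """Absolute difference in spread between two classes for one feature."""
    return abs(iqr(case_values) - iqr(control_values))


def fisher_combine(p_values):
    """Combine independent p-values (each > 0) with Fisher's method.

    The statistic X = -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null. For even degrees of freedom the
    survival function has the closed form exp(-X/2) * sum((X/2)^i / i!)
    for i = 0 .. k-1, so no external library is needed.
    """
    k = len(p_values)
    half_x = -sum(math.log(p) for p in p_values)
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return math.exp(-half_x) * total


# Hypothetical features: one differs in spread, the other only in mean.
spread_feature = {"case": [1, 3, 5, 7, 9], "control": [4, 5, 5, 5, 6]}
shift_feature = {"case": [11, 12, 13, 14, 15], "control": [1, 2, 3, 4, 5]}

spread_score = delta_iqr(spread_feature["case"], spread_feature["control"])
shift_score = delta_iqr(shift_feature["case"], shift_feature["control"])
combined_p = fisher_combine([0.05, 0.05])
```

In this toy data the mean-shifted feature separates the classes well but carries no IQR difference, which is the kind of contrast the abstract draws between discriminative and heterogeneity-preserving features.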
12. Metabolic Pathway Prediction Using Non-Negative Matrix Factorization with Improved Precision.
- Author
-
Basher ARMA, McLaughlin RJ, and Hallam SJ
- Subjects
- Algorithms, Bacterial Proteins/genetics, Cluster Analysis, Machine Learning, Microbiota, Bacteria/genetics, Computational Biology/methods, Metabolic Networks and Pathways
- Abstract
Machine learning provides a probabilistic framework for metabolic pathway inference from genomic sequence information at different levels of complexity and completion. However, several challenges, including pathway features engineering, multiple mapping of enzymatic reactions, and emergent or distributed metabolism within populations or communities of cells, can limit prediction performance. In this article, we present triUMPF (triple non-negative matrix factorization [NMF] with community detection for metabolic pathway inference), which combines three stages of NMF to capture myriad relationships between enzymes and pathways within a graph network. This is followed by community detection to extract a higher-order structure based on the clustering of vertices that share similar statistical properties. We evaluated triUMPF performance by using experimental datasets manifesting diverse multi-label properties, including Tier 1 genomes from the BioCyc collection of organismal Pathway/Genome Databases and low complexity microbial communities. Resulting performance metrics equaled or exceeded other prediction methods on organismal genomes with improved precision on multi-organismal datasets.
- Published
- 2021
- Full Text
- View/download PDF
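The triUMPF abstract above builds on non-negative matrix factorization (NMF). A minimal sketch of a single plain NMF, using the standard multiplicative update rules for the squared-error objective, is shown below. It is not triUMPF: the three-stage factorization and the community detection step are not reproduced, and the toy enzyme-by-pathway matrix, function names, rank, and iteration count are assumptions for illustration.

```python
import random


def matmul(A, B):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W @ H."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2 for rv, rwh in zip(V, WH) for v, wh in zip(rv, rwh))


def nmf(V, rank, iterations=200, seed=0, eps=1e-9):
    """Factor V (n x m) into non-negative W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(rank)]
    initial_error = frobenius_error(V, W, H)
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
    return W, H, initial_error


# Hypothetical enzyme-by-pathway association matrix with two blocks.
V = [[1, 1, 0, 0],
     [1, 1, 0, 0],
     [0, 0, 1, 1],
     [0, 0, 1, 1]]

W, H, initial_error = nmf(V, rank=2)
final_error = frobenius_error(V, W, H)
```

Because the updates only multiply by non-negative ratios, W and H stay non-negative throughout, and the reconstruction error does not increase from one iteration to the next.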
Discovery Service for Jio Institute Digital Library