Descriptor: "Computational Biology" / Search Limiters: Full Text - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Computational Biology"' showing total 16,728 results

1. Biocomputation: Moving Beyond Turing with Living Cellular Computers.

Author: Goñi-Moreno, Ángel
Subjects: *SYNTHETIC biology, *BIOLOGICALLY inspired computing, *COMPUTER science, *BOOLEAN functions, *COMPUTATIONAL biology, *MOLECULAR computers, *DNA
Abstract: This article explores the topic of biocomputation with living cellular computers by detailing the basic concepts of cellular computer construction as well as exploring improvements. The article includes programming bacteria to perform living Boolean logic functions and developing advanced living models of computation beyond combinatorial logic. Topics include synthetic biology, cellular supremacy, as well as systems complexity and algorithmic complexity.
Published: 2024
Full Text: View/download PDF

2. CodLncScape Provides a Self‐Enriching Framework for the Systematic Collection and Exploration of Coding LncRNAs.

Author: Liu, Tianyuan, Qiao, Huiyuan, Wang, Zixu, Yang, Xinyan, Pan, Xianrun, Yang, Yu, Ye, Xiucai, Sakurai, Tetsuya, Lin, Hao, and Zhang, Yang
Abstract: Recent studies have revealed that numerous lncRNAs can translate proteins under specific conditions, performing diverse biological functions, thus termed coding lncRNAs. Their comprehensive landscape, however, remains elusive due to this field's preliminary and dispersed nature. This study introduces codLncScape, a framework for coding lncRNA exploration consisting of codLncDB, codLncFlow, codLncWeb, and codLncNLP. Specifically, it contains a manually compiled knowledge base, codLncDB, encompassing 353 coding lncRNA entries validated by experiments. Building upon codLncDB, codLncFlow investigates the expression characteristics of these lncRNAs and their diagnostic potential in the pan‐cancer context, alongside their association with spermatogenesis. Furthermore, codLncWeb emerges as a platform for storing, browsing, and accessing knowledge concerning coding lncRNAs within various programming environments. Finally, codLncNLP serves as a knowledge‐mining tool to enhance the timely content inclusion and updates within codLncDB. In summary, this study offers a well‐functioning, content‐rich ecosystem for coding lncRNA research, aiming to accelerate systematic studies in this field. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. In silico bioprospecting of receptors associated with the mechanism of action of Rondonin, an antifungal peptide from spider Acanthoscurria rondoniae haemolymph.

Author: Muniz Seif, Elias Jorge, Icimoto, Marcelo Yudi, and Silva Júnior, Pedro Ismael
Subjects: *PEPTIDES, *SPIDER venom, *SUCCINATE dehydrogenase, *HEMOLYMPH, *BIOPROSPECTING, *SMALL molecules
Abstract: Multiple drug-resistant fungal species are associated with the development of diseases. Thus, more efficient drugs for the treatment of these aetiological agents are needed. Rondonin is a peptide isolated from the haemolymph of the spider Acanthoscurria rondoniae. Previous studies have shown that this peptide has antifungal activity against Candida sp. and Trichosporon sp. strains, acting on their genetic material. However, the molecular targets involved in its biological activity have not yet been described. Bioinformatics tools were used to determine the possible targets involved in the biological activity of Rondonin. The PharmMapper server was used to search for microorganismal targets of Rondonin. The PatchDock server was used to perform the molecular docking. UCSF Chimera software was used to evaluate these intermolecular interactions. In addition, the I-TASSER server was used to predict the target ligand sites. Then, these predictions were contrasted with the sites previously described in the literature. Molecular dynamics simulations were conducted for two promising complexes identified from the docking analysis. Rondonin demonstrated consistency with the ligand sites of the following targets: outer membrane proteins F (id: 1MPF) and A (id: 1QJP), which are responsible for facilitating the passage of small molecules through the plasma membrane; the subunit of the flavoprotein fumarate reductase (id: 1D4E), which is involved in the metabolism of nitrogenous bases; and the ATP-dependent Holliday DNA helicase junction (id: 1IN4), which is associated with histone proteins that package genetic material. Additionally, the molecular dynamics results indicated the stability of the interaction of Rondonin with 1MPF and 1IN4 during a 10 ns simulation. These interactions corroborate with previous in vitro studies on Rondonin, which acts on fungal genetic material without causing plasma membrane rupture. 
Therefore, the bioprospecting methods used in this research were considered satisfactory since they were consistent with previous results obtained via in vitro experimentation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. Assessing opportunities of SYCL for biological sequence alignment on GPU-based systems.

Author: Costanzo, Manuel, Rucci, Enzo, García-Sanchez, Carlos, Naiouf, Marcelo, and Prieto-Matías, Manuel
Subjects: *SEQUENCE alignment, *COMPUTATIONAL biology, *PROGRAMMING languages, *BIOINFORMATICS, *C++, *BIOINFORMATICS software
Abstract: Bioinformatics and computational biology are two fields that have been exploiting GPUs for more than two decades, with CUDA being the most used programming language for them. However, as CUDA is an NVIDIA proprietary language, it imposes a strong portability restriction across a wide range of heterogeneous architectures, such as AMD or Intel GPUs. To address this issue, the Khronos Group has recently proposed the SYCL standard, which is an open, royalty-free, cross-platform abstraction layer that enables the programming of a heterogeneous system to be written using standard, single-source C++ code. Over the past few years, several implementations of this SYCL standard have emerged, oneAPI being the one from Intel. This paper presents the migration process of the SW# suite, a biological sequence alignment tool developed in CUDA, to SYCL using Intel's oneAPI ecosystem. The experimental results show that SW# was completely migrated with little programmer intervention in terms of hand-coding. In addition, it was possible to port the migrated code between different architectures (considering multiple vendor GPUs and also CPUs), with no noticeable performance degradation on five different NVIDIA GPUs. Moreover, performance remained stable when switching to another SYCL implementation. As a consequence, SYCL and its implementations can offer attractive opportunities for the bioinformatics community, especially considering the large body of CUDA-based legacy code. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
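
Illustrative note on record 4: the abstract does not say which alignment algorithm SW# implements. Assuming it is Smith-Waterman local alignment (an assumption based only on the tool's name), the kind of dynamic-programming kernel that such GPU tools accelerate could be sketched as follows. The function name, the scoring values, and the linear gap penalty are hypothetical choices for illustration, and this is plain Python, not the SW# or SYCL code.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b.

    Assumed scoring (illustrative only): +2 match, -1 mismatch,
    -2 per gap position (linear gap penalty).
    """
    prev = [0] * (len(b) + 1)  # previous row of the DP matrix
    best = 0
    for ca in a:
        curr = [0]  # first column is always 0 for local alignment
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            cell = max(0, diag, up, left)  # 0 restarts the alignment
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Only two rows of the matrix are kept because the sketch returns the score alone; recovering the alignment itself would need the full matrix or a traceback structure.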

5. Maboss for HPC environments: implementations of the continuous time Boolean model simulator for large CPU clusters and GPU accelerators.

Author: Šmelko, Adam, Kratochvíl, Miroslav, Barillot, Emmanuel, and Noël, Vincent
Subjects: *CONTINUOUS time models, *SYSTEMS biology, *GRAPHICS processing units, *SIMULATION software, *HIGH performance computing, *SYNTHETIC biology, *COMPUTATIONAL biology
Abstract: Background: Computational models in systems biology are becoming more important with the advancement of experimental techniques to query the mechanistic details responsible for leading to phenotypes of interest. In particular, Boolean models are well fit to describe the complexity of signaling networks while being simple enough to scale to a very large number of components. With the advance of Boolean model inference techniques, the field is transforming from an artisanal way of building models of moderate size to a more automatized one, leading to very large models. In this context, adapting the simulation software for such increases in complexity is crucial. Results: We present two new developments in the continuous time Boolean simulators: MaBoSS.MPI, a parallel implementation of MaBoSS which can exploit the computational power of very large CPU clusters, and MaBoSS.GPU, which can use GPU accelerators to perform these simulations. Conclusion: These implementations enable simulation and exploration of the behavior of very large models, thus becoming a valuable analysis tool for the systems biology community. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
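
Illustrative note on record 5: the abstract describes continuous time Boolean simulation without giving details. A minimal sketch of one common way to do this, a Gillespie-style simulation in which each node flips toward the value of its logical rule at a given rate, might look like the following. All names (`simulate`, `rules`, `rate_up`, `rate_down`) are hypothetical, and this is not the MaBoSS, MaBoSS.MPI, or MaBoSS.GPU implementation.

```python
import random


def simulate(rules, state, rate_up, rate_down, max_time, rng):
    """Simulate one trajectory of a Boolean network in continuous time.

    rules: node -> function(state) returning the node's target value.
    rate_up / rate_down: node -> rate of switching on / off (assumed constant).
    Returns a list of (time, state) pairs, one per transition.
    """
    state = dict(state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        # A node can flip only when its rule disagrees with its current value.
        rates = {}
        for node, rule in rules.items():
            target = bool(rule(state))
            if target and not state[node]:
                rates[node] = rate_up[node]
            elif not target and state[node]:
                rates[node] = rate_down[node]
        total = sum(rates.values())
        if total == 0:
            break  # fixed point: no transition is possible
        t += rng.expovariate(total)  # waiting time to the next flip
        if t > max_time:
            break
        nodes = list(rates)
        node = rng.choices(nodes, weights=[rates[n] for n in nodes])[0]
        state[node] = not state[node]
        trajectory.append((t, dict(state)))
    return trajectory


if __name__ == "__main__":
    rules = {"A": lambda s: True, "B": lambda s: s["A"]}
    ones = {"A": 1.0, "B": 1.0}
    start = {"A": False, "B": False}
    for time, snapshot in simulate(rules, start, ones, ones, 1e9, random.Random(0)):
        print(round(time, 3), snapshot)
```

Estimating state probabilities would require many such trajectories, which is the part that parallel CPU or GPU implementations are suited to.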

6. Optimized model architectures for deep learning on genomic data.

Author: Gündüz, Hüseyin Anil, Mreches, René, Moosbauer, Julia, Robertson, Gary, To, Xiao-Yin, Franzosa, Eric A., Huttenhower, Curtis, Rezaei, Mina, McHardy, Alice C., Bischl, Bernd, Münch, Philipp C., and Binder, Martin
Subjects: *DEEP learning, *ARCHITECTURAL design, *COMPUTER vision, *COMPUTATIONAL biology, *VISUAL fields
Abstract: The success of deep learning in various applications depends on task-specific architecture design choices, including the types, hyperparameters, and number of layers. In computational biology, there is no consensus on the optimal architecture design, and decisions are often made using insights from more well-established fields such as computer vision. These may not consider the domain-specific characteristics of genome sequences, potentially limiting performance. Here, we present GenomeNet-Architect, a neural architecture design framework that automatically optimizes deep learning models for genome sequence data. It optimizes the overall layout of the architecture, with a search space specifically designed for genomics. Additionally, it optimizes hyperparameters of individual layers and the model training procedure. On a viral classification task, GenomeNet-Architect reduced the read-level misclassification rate by 19%, with 67% faster inference and 83% fewer parameters, and achieved similar contig-level accuracy with ~100 times fewer parameters compared to the best-performing deep learning baselines. Introducing GenomeNet-Architect, a neural architecture design framework that automatically optimises the overall layout of the architecture, the hyperparameters, and the training procedure of deep learning models for genome sequence data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

7. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks.

Author: Candia, Julián and Ferrucci, Luigi
Subjects: *ETIOLOGY of cancer, *LIVER cancer, *HEPATOCELLULAR carcinoma, *GENES, *PHENOTYPES, *COMPUTATIONAL biology, *SYSTEMS biology
Abstract: Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is nowadays extensively utilized for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance in the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We utilized a new consensus metric, the Enrichment Evidence Score (EES), which showed a remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. 
This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
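
Illustrative note on record 7: the abstract refers to the classic/unweighted Kolmogorov-Smirnov enrichment statistic. A minimal sketch of that running-sum statistic, assuming the standard unweighted form (each gene in the set adds 1/N_hit, each gene outside it subtracts 1/N_miss, and the score is the largest deviation from zero), is shown below. The function name is hypothetical; permutation testing and the Enrichment Evidence Score from the paper are not covered.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted KS-like enrichment score of gene_set in a ranked gene list.

    ranked_genes: genes ordered from most to least associated with the phenotype.
    Returns the running-sum value with the largest absolute deviation from zero.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene_set must contain some, but not all, ranked genes")
    running = 0.0
    score = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(score):
            score = running
    return score


if __name__ == "__main__":
    ranking = ["g1", "g2", "g3", "g4", "g5", "g6"]
    print(enrichment_score(ranking, {"g1", "g2"}))
    print(enrichment_score(ranking, {"g5", "g6"}))
```

A set concentrated at the top of the ranking yields a positive score and a set concentrated at the bottom a negative one; significance would then be assessed against permuted gene sets or permuted phenotype labels.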

8. Interpretable online network dictionary learning for inferring long-range chromatin interactions.

Author: Rana, Vishal, Peng, Jianhao, Pan, Chao, Lyu, Hanbaek, Cheng, Albert, Kim, Minji, and Milenkovic, Olgica
Subjects: *ENCYCLOPEDIAS & dictionaries, *CHROMATIN, *FLUORESCENCE in situ hybridization, *COMPUTATIONAL biology, *DROSOPHILA melanogaster, *MATRIX decomposition
Abstract: Dictionary learning (DL), implemented via matrix factorization (MF), is commonly used in computational biology to tackle ubiquitous clustering problems. The method is favored due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability in terms of real biological data. Additionally, they are not optimized for graph-structured data and hence often fail to handle them in a scalable manner. In order to address these limitations, we propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL). Unlike classical DL algorithms, online cvxNDL is implemented via MF and designed to handle extremely large datasets by virtue of its online nature. Importantly, it enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to data with a network structure by incorporating specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply cvxNDL on 3D-genome RNAPII ChIA-Drop data with the goal of identifying important long-range interaction patterns (long-range dictionary elements). ChIA-Drop probes higher-order interactions, and produces data in the form of hypergraphs whose nodes represent genomic fragments. The hyperedges represent observed physical contacts. Our hypergraph model analysis has the objective of creating an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Through the use of dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. To accomplish the task at hand, we focus on RNAPII-enriched ChIA-Drop data from Drosophila Melanogaster S2 cell lines. Our results offer two key insights. 
First, we demonstrate that online cvxNDL retains the accuracy of classical DL (MF) methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology (GO) enrichment analysis and perform multiple RNA coexpression studies. Author summary: We introduce a novel method for dictionary learning termed online convex Network Dictionary Learning (online cvxNDL). The method operates in an online manner and utilizes representative subnetworks of a network dataset as dictionary elements. A key feature of online cvxNDL is its ability to work with graph-structured data and generate dictionary elements that represent convex combinations of real data points, thus ensuring interpretability. Online cvxNDL is used to investigate long-range chromatin interactions in S2 cell lines of Drosophila Melanogaster obtained through RNAPII ChIA-Drop measurements represented as hypergraphs. The results show that dictionary elements can accurately and efficiently reconstruct the original interactions present in the data, even when subjected to convexity constraints. To shed light on the biological relevance of the identified dictionaries, we perform Gene Ontology enrichment and RNA-seq coexpression analyses. These studies uncover multiple long-range interaction patterns that are chromosome-specific. Furthermore, the findings affirm the significance of convex dictionaries in representing TADs cross-validated by imaging methods (such as 3-color FISH (fluorescence in situ hybridization)). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

9. Special Issue "Bioinformatics of Unusual DNA and RNA Structures".

Author: Bartas, Martin, Brázda, Václav, and Pečinka, Petr
Subjects: *DNA structure, *QUADRUPLEX nucleic acids, *DNA analysis, *COMPUTATIONAL biology, *MOLECULAR biology, *MOLECULAR structure
Abstract: This editorial from the International Journal of Molecular Sciences provides an overview of the field of unusual nucleic acid structures (UNas), which are noncanonical structures that differ from the classical double-stranded structure of B-DNA. The editorial highlights eight articles published in the special issue, covering topics such as G-quadruplex polymorphism, R-loop prediction, G-quadruplexes in cervical cancer, G-quadruplexes in arboviruses, the role of G-quadruplexes in gene expression regulation, the effects of ions on G-quadruplex formation in rice, tRNA fragments in bacterial communities, and virus-induced gene silencing. The editorial also discusses the future perspectives of UNas research, including the development of bioinformatic tools, structural modeling, virtual screening, and molecular dynamics methods. The article also discusses the potential of UNas as molecular targets for drug discovery, specifically focusing on G-quadruplexes. However, the main challenge with UNa-binding compounds has been their low specificity and high toxicity. The authors believe that advances in bioinformatic methods will soon allow for the selective targeting of specific pathological UNas, paving the way for their application in drug discovery. The article also provides a table summarizing the characteristics and functions of different UNas. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

10. Platelet Biorheology and Mechanobiology in Thrombosis and Hemostasis: Perspectives from Multiscale Computation.

Author: Tuna, Rukiye, Yi, Wenjuan, Crespo Cruz, Esmeralda, Romero, JP, Ren, Yi, Guan, Jingjiao, Li, Yan, Deng, Yuefan, Bluestein, Danny, Liu, Zixiang Leonardo, and Sheriff, Jawaad
Subjects: *RHEOLOGY (Biology), *COMPUTATIONAL biology, *BLOOD platelet activation, *HEMOSTASIS, *BLOOD platelets, *BLOOD platelet aggregation
Abstract: Thrombosis is the pathological clot formation under abnormal hemodynamic conditions, which can result in vascular obstruction, causing ischemic strokes and myocardial infarction. Thrombus growth under moderate to low shear (<1000 s−1) relies on platelet activation and coagulation. Thrombosis at elevated high shear rates (>10,000 s−1) is predominantly driven by unactivated platelet binding and aggregating mediated by von Willebrand factor (VWF), while platelet activation and coagulation are secondary in supporting and reinforcing the thrombus. Given the molecular and cellular level information it can access, multiscale computational modeling informed by biology can provide new pathophysiological mechanisms that are otherwise not accessible experimentally, holding promise for novel first-principle-based therapeutics. In this review, we summarize the key aspects of platelet biorheology and mechanobiology, focusing on the molecular and cellular scale events and how they build up to thrombosis through platelet adhesion and aggregation in the presence or absence of platelet activation. In particular, we highlight recent advancements in multiscale modeling of platelet biorheology and mechanobiology and how they can lead to the better prediction and quantification of thrombus formation, exemplifying the exciting paradigm of digital medicine. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. Dysregulated microRNAs in prostate cancer: In silico prediction and in vitro validation.

Author: Rezaei, Samaneh, Najaf Abadi, Mohammad Hasan Jafari, Bazyari, Mohammad Javad, Jalili, Amin, Oskuee, Reza Kazemi, and Aghaee-Bakhtiari, Seyed Hamid
Subjects: *MICRORNA, *BIOINFORMATICS, *PROSTATE cancer, *GENE expression, *COMPUTATIONAL biology
Abstract: Objective(s): MicroRNAs, which are micro-coordinators of gene expression, have been recently investigated as a potential treatment for cancer. The study used computational techniques to identify microRNAs that could target a set of genes simultaneously. Due to their multi-target-directed nature, microRNAs have the potential to impact multiple key pathways and their pathogenic cross-talk. Materials and Methods: We identified microRNAs that target a prostate cancer-associated gene set using integrated bioinformatics analyses and experimental validation. The candidate gene set included genes targeted by clinically approved prostate cancer medications. We used STRING, GO, and KEGG web tools to confirm gene-gene interactions and their clinical significance. Then, we employed integrated predicted and validated bioinformatics approaches to retrieve hsa-miR-124-3p, 16-5p, and 27a-3p as the top three relevant microRNAs. KEGG and DIANA-miRPath showed the related pathways for the candidate genes and microRNAs. Results: The Real-time PCR results showed that miR-16-5p simultaneously down-regulated all genes significantly except for PIK3CA/CB in LNCaP; miR-27a-3p simultaneously down-regulated all genes significantly, excluding MET in LNCaP and PIK3CA in PC-3; and miR-124-3p could not significantly down-regulate PIK3CB, MET, and FGFR4 in LNCaP and FGFR4 in PC-3. Finally, we used a cell cycle assay to show significant G0/G1 arrest by transfecting miR-124-3p in LNCaP and miR-16-5p in both cell lines. Conclusion: Our findings suggest that this novel approach may have therapeutic benefits and these predicted microRNAs could effectively target the candidate genes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. Machine Learning Strategies in MicroRNA Research: Bridging Genome to Phenome.

Author: Daniel Thomas, Sonet, Vijayakumar, Krithika, John, Levin, Krishnan, Deepak, Rehman, Niyas, Revikumar, Amjesh, Kandel Codi, Jalaluddin Akbar, Prasad, Thottethodi Subrahmanya Keshava, S.S., Vinodchandra, and Raju, Rajesh
Subjects: *CIRCULAR RNA, *GENOMICS, *LEARNING strategies, *MACHINE learning, *GENE expression, *GENETIC regulation
Abstract: MicroRNAs (miRNAs) have emerged as a prominent layer of regulation of gene expression. This article offers the salient and current aspects of machine learning (ML) tools and approaches from genome to phenome in miRNA research. First, we underline that the complexity in the analysis of miRNA function ranges from their modes of biogenesis to the target diversity in diverse biological conditions. Therefore, it is imperative to first ascertain the miRNA coding potential of genomes and understand the regulatory mechanisms of their expression. This knowledge enables the efficient classification of miRNA precursors and the identification of their mature forms and respective target genes. Second, and because one miRNA can target multiple mRNAs and vice versa, another challenge is the assessment of the miRNA-mRNA target interaction network. Furthermore, long-noncoding RNA (lncRNA) and circular RNAs (circRNAs) also contribute to this complexity. ML has been used to tackle these challenges at the high-dimensional data level. The present expert review covers more than 100 tools adopting various ML approaches pertaining to, for example, (1) miRNA promoter prediction, (2) precursor classification, (3) mature miRNA prediction, (4) miRNA target prediction, (5) miRNA-lncRNA and miRNA-circRNA interactions, (6) miRNA-mRNA expression profiling, (7) miRNA regulatory module detection, (8) miRNA-disease association, and (9) miRNA essentiality prediction. Taken together, we unpack, critically examine, and highlight the cutting-edge synergy of ML approaches and miRNA research so as to develop a dynamic and microlevel understanding of human health and diseases. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

13. miRSNP rs188493331: A key player in genetic control of microRNA‐induced pathway activation in hypertrophic scars and keloids.

Author: Chen, Meiqing, Pan, Yuyan, Chen, Zhiwei, Qi, Fazhi, Gu, Jianying, Qiu, Yangyang, He, Anqi, and Liu, Jiaqi
Subjects: *HYPERTROPHIC scars, *COMPUTATIONAL biology, *GENE expression, *KELOIDS, *FOCAL adhesions, *PI3K/AKT pathway, *PROTEOGLYCANS
Abstract: Background: Our study aims to delineate the miRSNP–microRNA–gene–pathway interactions in the context of hypertrophic scars (HS) and keloids. Materials and Methods: We performed a computational biology study involving differential expression analysis to identify genes and their mRNAs in HS and keloid tissues compared to normal skin, identifying key hub genes and enriching their functional roles, comprehensively analyzing microRNA‐target genes and related signaling pathways through bioinformatics, identifying miRSNPs, and constructing a pathway‐based network to illustrate miRSNP‐miRNA‐gene‐signaling pathway interactions. Results: Our results revealed a total of 429 hub genes, with a strong enrichment in signaling pathways related to proteoglycans in cancer, focal adhesion, TGF‐β, PI3K/Akt, and EGFR tyrosine kinase inhibitor resistance. Particularly noteworthy was the substantial crosstalk between the focal adhesion and PI3K/Akt signaling pathways, making them more susceptible to regulation by microRNAs. We also identified specific miRNAs, including miRNA‐1279, miRNA‐429, and miRNA‐302e, which harbored multiple SNP loci, with miRSNPs rs188493331 and rs78979933 exerting control over a significant number of miRNA target genes. Furthermore, we observed that miRSNP rs188493331 shared a location with microRNA302e, microRNA202a‐3p, and microRNA20b‐5p, and these three microRNAs collectively targeted the gene LAMA3, which is integral to the focal adhesion signaling pathway. Conclusions: The study successfully unveils the complex interactions between miRSNPs, miRNAs, genes, and signaling pathways, shedding light on the genetic factors contributing to HS and keloid formation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. Optimized model architectures for deep learning on genomic data.

Author: Gündüz, Hüseyin Anil, Mreches, René, Moosbauer, Julia, Robertson, Gary, To, Xiao-Yin, Franzosa, Eric A., Huttenhower, Curtis, Rezaei, Mina, McHardy, Alice C., Bischl, Bernd, Münch, Philipp C., and Binder, Martin
Subjects: *DEEP learning, *ARCHITECTURAL design, *COMPUTER vision, *COMPUTATIONAL biology, *VISUAL fields
Abstract: The success of deep learning in various applications depends on task-specific architecture design choices, including the types, hyperparameters, and number of layers. In computational biology, there is no consensus on the optimal architecture design, and decisions are often made using insights from more well-established fields such as computer vision. These may not consider the domain-specific characteristics of genome sequences, potentially limiting performance. Here, we present GenomeNet-Architect, a neural architecture design framework that automatically optimizes deep learning models for genome sequence data. It optimizes the overall layout of the architecture, with a search space specifically designed for genomics. Additionally, it optimizes hyperparameters of individual layers and the model training procedure. On a viral classification task, GenomeNet-Architect reduced the read-level misclassification rate by 19%, with 67% faster inference and 83% fewer parameters, and achieved similar contig-level accuracy with ~100 times fewer parameters compared to the best-performing deep learning baselines. Introducing GenomeNet-Architect, a neural architecture design framework that automatically optimises the overall layout of the architecture, the hyperparameters, and the training procedure of deep learning models for genome sequence data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. Biology System Description Language (BiSDL): a modeling language for the design of multicellular synthetic biological systems.

Author: Giannantoni, Leonardo, Bardini, Roberta, Savino, Alessandro, and Di Carlo, Stefano
Subjects: *SYNTHETIC biology, *BIOLOGICAL systems, *SYSTEMS biology, *DEVELOPMENTAL biology, *CONCEPTUAL design, *BIOLOGISTS
Abstract: Background: The Biology System Description Language (BiSDL) is an accessible, easy-to-use computational language for multicellular synthetic biology. It allows synthetic biologists to represent spatiality and multi-level cellular dynamics inherent to multicellular designs, filling a gap in the state of the art. Developed for designing and simulating spatial, multicellular synthetic biological systems, BiSDL integrates high-level conceptual design with detailed low-level modeling, fostering collaboration in the Design-Build-Test-Learn cycle. BiSDL descriptions directly compile into Nets-Within-Nets (NWNs) models, offering a unique approach to spatial and hierarchical modeling in biological systems. Results: BiSDL's effectiveness is showcased through three case studies on complex multicellular systems: a bacterial consortium, a synthetic morphogen system and a conjugative plasmid transfer process. These studies highlight the BiSDL proficiency in representing spatial interactions and multi-level cellular dynamics. The language facilitates the compilation of conceptual designs into detailed, simulatable models, leveraging the NWNs formalism. This enables intuitive modeling of complex biological systems, making advanced computational tools more accessible to a broader range of researchers. Conclusions: BiSDL represents a significant step forward in computational languages for synthetic biology, providing a sophisticated yet user-friendly tool for designing and simulating complex biological systems with an emphasis on spatiality and cellular dynamics. Its introduction has the potential to transform research and development in synthetic biology, allowing for deeper insights and novel applications in understanding and manipulating multicellular systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

16. Clusters of grapevine genes for a burning world.

Author: Coupel‐Ledru, Aude, Westgeest, Adrianus J., Albasha, Rami, Millan, Mathilde, Pallas, Benoît, Doligez, Agnès, Flutre, Timothée, Segura, Vincent, This, Patrice, Torregrosa, Laurent, Simonneau, Thierry, and Pantin, Florent
Subjects: *VITIS vinifera, *PLANT molecular biology, *GRAPES, *BOTANY, *ABSCISIC acid, *LOCUS (Genetics), *COMPUTATIONAL biology, *BOTANICAL chemistry
Abstract: A study published in the New Phytologist journal examines the genetic diversity of grapevines and their ability to withstand extreme heatwaves caused by climate change. The researchers conducted experiments on a grapevine diversity panel in South France during a record heatwave in 2019. They discovered that certain genomic regions were linked to heat tolerance, suggesting that genetic diversity could be used to breed fruit crops that can withstand heatwaves. The study also investigated the role of leaf size, leaf mass per area, and evaporative cooling in heat tolerance. The researchers identified candidate genes that may contribute to heat tolerance in grapevines. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

17. Rescue of Mycobacterium bovis DNA Obtained from Cultured Samples during Official Surveillance of Animal TB: Key Steps for Robust Whole Genome Sequence Data Generation.

Author: Pinto, Daniela, Themudo, Gonçalo, Pereira, André C., Botelho, Ana, and Cunha, Mónica V.
Subjects: *MYCOBACTERIUM bovis, *WHOLE genome sequencing, *IDENTIFICATION of animals, *DNA, *MIXED infections, *COMPUTATIONAL biology
Abstract: Epidemiological surveillance of animal tuberculosis (TB) based on whole genome sequencing (WGS) of Mycobacterium bovis has recently gained traction due to its high resolution to identify infection sources, characterize the pathogen population structure, and facilitate contact tracing. However, the workflow from bacterial isolation to sequence data analysis has several technical challenges that may severely impact the power to understand the epidemiological scenario and inform outbreak response. While trying to use archived DNA from cultured samples obtained during routine official surveillance of animal TB in Portugal, we struggled against three major challenges: the low amount of M. bovis DNA obtained from routinely processed animal samples; the lack of purity of M. bovis DNA, i.e., high levels of contamination with DNA from other organisms; and the co-occurrence of more than one M. bovis strain per sample (within-host mixed infection). The loss of isolated genomes generates missed links in transmission chain reconstruction, hampering the biological and epidemiological interpretation of data as a whole. Upon identification of these challenges, we implemented an integrated solution framework based on whole genome amplification and a dedicated computational pipeline to minimize their effects and recover as many genomes as possible. With the approaches described herein, we were able to recover 62 out of 100 samples that would have otherwise been lost. Based on these results, we discuss adjustments that should be made in official and research laboratories to facilitate the sequential implementation of bacteriological culture, PCR, downstream genomics, and computational-based methods, all within a time frame that supports data-driven intervention. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. PLEACH: a new heuristic algorithm for pure parsimony haplotyping problem.

Author: Feizabadi, Reza, Bagherian, Mehri, Vaziri, Hamidreza, and Salahi, Maziar
Subjects: *MIXED integer linear programming, *PARSIMONIOUS models, *HAPLOTYPES, *COMPUTATIONAL biology, *HEURISTIC algorithms, *HEART abnormalities
Abstract: Haplotype inference is an important issue in computational biology due to its various applications in diagnosing and treating genetic diseases such as diabetes, Alzheimer's disease, and heart defects. There are different criteria for choosing a solution from the alternatives. Parsimony is one of the most important criteria, and the problem posed under it is known as the Pure Parsimony Haplotyping (PPH) problem. The approaches to solving PPH fall into two groups: exact and non-exact. The exact approaches often model the problem as a Mixed Integer Linear Programming (MILP) problem. Although these models generate the optimal solution for small instances in a reasonable time, the NP-hardness of the PPH problem makes them ineffective for very large instances. This deficiency is compensated for by non-exact algorithms. In this paper, we present a non-exact algorithm for large instances of the PPH problem based on the divide-and-conquer technique. The algorithm first divides the problem into small sub-problems, which are solved by one of the previous exact approaches, and then combines the solutions of the sub-problems by solving an MILP. The MILPs that arise when solving the sub-problems and when combining their solutions are small enough to be solved rapidly. The performance of the algorithm has been evaluated on real and simulated instances and compared with two well-known methods, PHASE and WinHap2. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

19. Integration between Bioinformatics Algorithms and Neutrosophic Theory.

Author: Farag, Romany M., Shams, Mahmoud Y., Aldawody, Dalia A., Khalid, Huda E., El-Bakry, Hazem M., and Salama, Ahmed A.
Subjects: *COMPUTATIONAL biology, *ARTIFICIAL intelligence, *BIOINFORMATICS, *DATA mining, *DATABASES, *BIOINFORMATICS software, *SYNTHETIC biology
Abstract: This paper presents a neutrosophic inference model for bioinformatics. The model is used to develop a system for accurate comparisons of human nucleic acids, where the new nucleic acid is compared to a database of old nucleic acids. The comparisons are analyzed in terms of accuracy, certainty, uncertainty, neutrality, and bias. The proposed system achieves good results and provides a reliable standard for future comparisons. It highlights the potential of neutrosophic inference models in bioinformatics applications. Data mining and bioinformatics play a crucial role in computational biology, with applications in scientific research and industrial development. Biological analysts rely on specialized tools and algorithms to collect, store, categorize, and analyze large volumes of unstructured data. Data mining techniques are used to extract valuable information from this data, aiding in the development of new therapies and understanding genetic relationships between organisms. Recent advancements in bioinformatics include gene expression tools, Bio sequencing, and Bio databases, which facilitate the extraction and analysis of vital biological information. These technologies contribute to the analysis of big data, identification of key bioinformatics insights, and generation of new biological knowledge. Data collection, analysis, and interpretation in this field involves the use of modern technologies such as cloud computing, machine learning, and artificial intelligence, enabling more efficient and accurate results. Ultimately, data mining and bioinformatics enhance our understanding of genetic relationships, aid in developing new therapies, and improve healthcare outcomes. [ABSTRACT FROM AUTHOR]
Published: 2024

20. Combining machine learning with structure-based protein design to predict and engineer post-translational modifications of proteins.

Author: Ertelt, Moritz, Mulligan, Vikram Khipple, Maguire, Jack B., Lyskov, Sergey, Moretti, Rocco, Schiffner, Torben, Meiler, Jens, and Schoeder, Clara T.
Subjects: *POST-translational modification, *PROTEIN engineering, *MACHINE learning, *ENGINEERING design, *COMPUTATIONAL biology, *ARTIFICIAL neural networks, *COMPUTATIONAL neuroscience
Abstract: Post-translational modifications (PTMs) of proteins play a vital role in their function and stability. These modifications influence protein folding, signaling, protein-protein interactions, enzyme activity, binding affinity, aggregation, degradation, and much more. To date, over 400 types of PTMs have been described, representing chemical diversity well beyond the genetically encoded amino acids. Such modifications pose a challenge to the successful design of proteins, but also represent a major opportunity to diversify the protein engineering toolbox. To this end, we first trained artificial neural networks (ANNs) to predict eighteen of the most abundant PTMs, including protein glycosylation, phosphorylation, methylation, and deamidation. In a second step, these models were implemented inside the computational protein modeling suite Rosetta, which allows flexible combination with existing protocols to model the modified sites and understand their impact on protein stability as well as function. Lastly, we developed a new design protocol that either maximizes or minimizes the predicted probability of a particular site being modified. We find that this combination of ANN prediction and structure-based design can enable the modification of existing, as well as the introduction of novel, PTMs. The potential applications of our work include, but are not limited to, glycan masking of epitopes, strengthening protein-protein interactions through phosphorylation, as well as protecting proteins from deamidation liabilities. These applications are especially important for the design of new protein therapeutics where PTMs can drastically change the therapeutic properties of a protein. Our work adds novel tools to Rosetta's protein engineering toolbox that allow for the rational design of PTMs. 
Author summary: Machine learning (ML) is changing the world of protein design, from structure prediction methods like AlphaFold to fixed-backbone design methods like ProteinMPNN. ML methods have made much progress in various aspects of protein computational biology, both complementing and, in some cases, surpassing traditional macromolecular modeling methods such as those combined in libraries like the Rosetta software suite. However, a lack of compatibility and flexibility can hinder interoperability with existing methods, preventing the full potential of these new solutions from being realized. Here, we first present a new machine learning tool for predicting post-translational modifications (PTMs), which play an important role in the stability and function of proteins, and then highlight how the implementation of this tool in the existing Rosetta toolbox can facilitate new applications. To this end, we combine PTM prediction with protein design, maximizing or minimizing the predicted probability of a post-translational modification occurring at a specific site. As one example, we predict the N-linked glycosylation of influenza hemagglutinin, which has applications in both understanding the evolution of viral strains over time, and engineering additional glycosylation sites to mask unwanted epitopes of vaccine candidates. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. xCAPT5: protein–protein interaction prediction using deep and wide multi-kernel pooling convolutional neural networks with protein language model.

Author: Dang, Thanh Hai and Vu, Tien Anh
Subjects: *CONVOLUTIONAL neural networks, *LANGUAGE models, *NERVE tissue proteins, *PROTEIN models, *COMPUTATIONAL biology, *PROTEIN-protein interactions, *FUMONISINS
Abstract: Background: Predicting protein–protein interactions (PPIs) from sequence data is a key challenge in computational biology. While various computational methods have been proposed, the utilization of sequence embeddings from protein language models, which contain diverse information, including structural, evolutionary, and functional aspects, has not been fully exploited. Additionally, there is a significant need for a comprehensive neural network capable of efficiently extracting these multifaceted representations. Results: Addressing this gap, we propose xCAPT5, a novel hybrid classifier that uniquely leverages the T5-XL-UniRef50 protein large language model for generating rich amino acid embeddings from protein sequences. The core of xCAPT5 is a multi-kernel deep convolutional siamese neural network, which effectively captures intricate interaction features at both micro and macro levels, integrated with the XGBoost algorithm, enhancing PPIs classification performance. By concatenating max and average pooling features in a depth-wise manner, xCAPT5 effectively learns crucial features with low computational cost. Conclusion: This study represents one of the initial efforts to extract informative amino acid embeddings from a large protein language model using a deep and wide convolutional network. Experimental results show that xCAPT5 outperforms recent state-of-the-art methods in binary PPI prediction, excelling in cross-validation on several benchmark datasets and demonstrating robust generalization across intra-species, cross-species, inter-species, and stringent similarity contexts. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. In silico prospection of receptors associated with the biological activity of U1-SCTRX-lg1a: an antimicrobial peptide isolated from the venom of Loxosceles gaucho.

Author: de Oliveira, André Souza, Muniz Seif, Elias Jorge, and da Silva Junior, Pedro Ismael
Subjects: *ANTIMICROBIAL peptides, *SPIDER venom, *PEPTIDE antibiotics, *LOXOSCELES, *VENOM, *PHOSPHOLIPASE D, *MOLECULAR dynamics
Abstract: The emergence of antibiotic-resistant pathogens generates impairment to human health. U1-SCTRX-lg1a is a peptide isolated from a phospholipase D extracted from the spider venom of Loxosceles gaucho with antimicrobial activity against Gram-negative bacteria (between 1.15 and 4.6 μM). The aim of this study was to suggest potential receptors associated with the antimicrobial activity of U1-SCTRX-lg1a using in silico bioinformatics tools. The search for potential targets of U1-SCRTX-lg1a was performed using the PharmMapper server. Molecular docking between U1-SCRTX-lg1a and the receptor was performed using PatchDock software. The prediction of ligand sites for each receptor was conducted using the PDBSum server. Chimera 1.6 software was used to perform molecular dynamics simulations only for the best dock score receptor. In addition, U1-SCRTX-lg1a and native ligand interactions were compared using AutoDock Vina software. Finally, predicted interactions were compared with the ligand site previously described in the literature. The bioprospecting of U1-SCRTX-lg1a resulted in the identification of three hundred (300) diverse targets (Table S1), forty-nine (49) of which were intracellular proteins originating from Gram-negative microorganisms (Table S2). Docking results indicate Scores (10,702 to 6066), Areas (1498.70 to 728.40) and ACEs (417.90 to – 152.8) values. Among these, NAD + NH3-dependent synthetase (PDB ID: 1wxi) showed a dock score of 9742, area of 1223.6 and ACE of 38.38 in addition to presenting a Normalized Fit score of 8812 on PharmMapper server. Analysis of the interaction of ligands and receptors suggests that the peptide derived from brown spider venom can interact with residues SER48 and THR160. Furthermore, the C terminus (– 7.0 score) has greater affinity for the receptor than the N terminus (– 7.7 score). 
The molecular dynamics assay showed a free energy value of – 214,890.21 kJ/mol for the protein complex, whereas with rigid docking this value was – 29.952.8, suggesting that after the molecular dynamics simulation the complex exhibits a more favorable energy value than in the previous state. The in silico bioprospecting of receptors suggests that U1-SCRTX-lg1a may interfere with NAD+ production in Escherichia coli, a Gram-negative bacterium, altering the homeostasis of the microorganism and impairing growth. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. A ubiquitous GC content signature underlies multimodal mRNA regulation by DDX3X.

Author: Jowhar, Ziad, Xu, Albert, Venkataramanan, Srivats, Dossena, Francesco, Hoye, Mariah L, Silver, Debra L, Floor, Stephen N, and Calviello, Lorenzo
Subjects: *RNA regulation, *DEVELOPMENTAL neurobiology, *RNA-binding proteins, *GENETIC regulation, *STATISTICAL learning, *GENE expression, *MESSENGER RNA
Abstract: The road from transcription to protein synthesis is paved with many obstacles, allowing for several modes of post-transcriptional regulation of gene expression. A fundamental player in mRNA biology is DDX3X, an RNA binding protein that canonically regulates mRNA translation. By monitoring dynamics of mRNA abundance and translation following DDX3X depletion, we observe stabilization of translationally suppressed mRNAs. We use interpretable statistical learning models to uncover GC content in the coding sequence as the major feature underlying RNA stabilization. This result corroborates GC content-related mRNA regulation detectable in other studies, including hundreds of ENCODE datasets and recent work focusing on mRNA dynamics in the cell cycle. We provide further evidence for mRNA stabilization by detailed analysis of RNA-seq profiles in hundreds of samples, including a Ddx3x conditional knockout mouse model exhibiting cell cycle and neurogenesis defects. Our study identifies a ubiquitous feature underlying mRNA regulation and highlights the importance of quantifying multiple steps of the gene expression cascade, where RNA abundance and protein production are often uncoupled. Synopsis: Monitoring the dynamics of mRNA changes after DDX3X depletion indicates translation suppression followed by mRNA stabilization. GC content in the CDS is a strong predictor of mRNA stabilization and it is detectable in multiple transcriptomics datasets, with a potential link to cell-cycle regulation. RNA-seq and Ribo-seq on a time course of DDX3X depletion show regulation of translation and mRNA levels. Intron-exon count modeling and SLAM-seq demonstrate post-transcriptional mRNA stabilization. Random Forest and Lasso show GC content in the CDS (GCcds) as a main predictor of mRNA stabilization. ENCODE RBP knockdowns and in vivo datasets show widespread GCcds-dependent mRNA stabilization. 
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
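
Note: The abstract of record 23 names GC content in the coding sequence (GCcds) as the main predictor of mRNA stabilization. As a minimal illustration of that feature only, the Python sketch below computes the GC fraction of a nucleotide string. The function name `gc_content` and the choice to ignore ambiguous bases such as N are assumptions made for this example; they are not taken from the article, and this is not the authors' pipeline.

```python
def gc_content(seq: str) -> float:
    """Return the fraction of G and C among the unambiguous bases of seq.

    Hypothetical helper for illustration: characters other than
    A, C, G, T, U (for example N) are ignored, and an empty or fully
    ambiguous sequence yields 0.0.
    """
    # Keep only unambiguous nucleotides, case-insensitively.
    bases = [c for c in seq.upper() if c in "ACGTU"]
    if not bases:
        return 0.0
    # Count G and C and divide by the number of bases kept.
    gc = sum(1 for c in bases if c in "GC")
    return gc / len(bases)


if __name__ == "__main__":
    # Toy coding sequence, not data from the study.
    print(gc_content("ATGGCCGCGTAA"))
```

In a real analysis the input would be the annotated CDS of each transcript, and the resulting value would be one feature among many passed to a model such as the Random Forest or Lasso mentioned in the abstract.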

24. Efficiency of Various Tiling Strategies for the Zuker Algorithm Optimization.

Author: Blaszynski, Piotr, Palkowski, Marek, Bielecki, Wlodzimierz, and Poliwoda, Maciej
Subjects: *DYNAMIC programming, *AFFINE transformations, *COMPILERS (Computer programs), *BIOINFORMATICS software, *ENERGY consumption, *COMPUTATIONAL biology, *PLUTO (Dwarf planet)
Abstract: This paper focuses on optimizing the Zuker RNA folding algorithm, a bioinformatics task with non-serial polyadic dynamic programming and non-uniform loop dependencies. The intricate dependence pattern is represented using affine formulas, enabling the automatic application of tiling strategies via the polyhedral method. Three source-to-source compilers—PLUTO, TRACO, and DAPT—are employed, utilizing techniques such as affine transformations, the transitive closure of dependence relation graphs, and space–time tiling to generate cache-efficient codes, respectively. A dedicated transpose code technique for non-serial polyadic dynamic programming codes is also examined. The study evaluates the performance of these optimized codes for speed-up and scalability on multi-core machines and explores energy efficiency using RAPL. The paper provides insights into related approaches and outlines future research directions within the context of bioinformatics algorithm optimization. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. New classifications for quantum bioinformatics: Q-bioinformatics, QCt-bioinformatics, QCg-bioinformatics, and QCr-bioinformatics.

Author: Mokhtari, Majid, Khoshbakht, Samane, Ziyaei, Kobra, Akbari, Mohammad Esmaeil, and Moravveji, Sayyed Sajjad
Subjects: *BIOMECHANICS, *QUANTUM biochemistry, *MOLECULAR biology, *QUANTUM mechanics, *BIOINFORMATICS, *COMPUTATIONAL biology, *PROTEIN folding
Abstract: Bioinformatics has revolutionized biology and medicine by using computational methods to analyze and interpret biological data. Quantum mechanics has recently emerged as a promising tool for the analysis of biological systems, leading to the development of quantum bioinformatics. This new field employs the principles of quantum mechanics, quantum algorithms, and quantum computing to solve complex problems in molecular biology, drug design, and protein folding. However, the intersection of bioinformatics, biology, and quantum mechanics presents unique challenges. One significant challenge is the possibility of confusion among scientists between quantum bioinformatics and quantum biology, which have similar goals and concepts. Additionally, the diverse calculations in each field make it difficult to establish boundaries and identify purely quantum effects from other factors that may affect biological processes. This review provides an overview of the concepts of quantum biology and quantum mechanics and their intersection in quantum bioinformatics. We examine the challenges and unique features of this field and propose a classification of quantum bioinformatics to promote interdisciplinary collaboration and accelerate progress. By unlocking the full potential of quantum bioinformatics, this review aims to contribute to our understanding of quantum mechanics in biological systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks.

Author: Nourbakhsh, Mona, Degn, Kristine, Saksager, Astrid, Tiberti, Matteo, and Papaleo, Elena
Subjects: *CANCER genes, *GENETIC mutation, *SINGLE nucleotide polymorphisms, *COMPUTER software developers, *RESEARCH personnel, *COMPUTATIONAL neuroscience, *COMPUTATIONAL biology
Abstract: The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. scRNA-seq Reveals Novel Genetic Pathways and Sex Chromosome Regulation in Tribolium Spermatogenesis.

Author: Robben, Michael, Ramesh, Balan, Pau, Shana, Meletis, Demetra, Luber, Jacob, and Demuth, Jeffery
Subjects: *SEX chromosomes, *RED flour beetle, *SPERMATOGENESIS, *TRIBOLIUM, *X chromosome, *BEETLES, *SPERMATOZOA
Abstract: Spermatogenesis is critical to sexual reproduction yet evolves rapidly in many organisms. High-throughput single-cell transcriptomics promises unparalleled insight into this important process but understanding can be impeded in nonmodel systems by a lack of known genes that can reliably demarcate biologically meaningful cell populations. Tribolium castaneum, the red flour beetle, lacks known markers for spermatogenesis found in insect species like Drosophila melanogaster. Using single-cell sequencing data collected from adult beetle testes, we implement a strategy for elucidating biologically meaningful cell populations by using transient expression stage identification markers, weighted principal component clustering, and SNP-based haploid/diploid phasing. We identify populations that correspond to observable points in sperm differentiation and find species-specific markers for each stage. Our results indicate that molecular pathways underlying spermatogenesis in Coleoptera are substantially diverged from those in Diptera. We also show that most genes on the X chromosome experience meiotic sex chromosome inactivation. Temporal expression of Drosophila MSL complex homologs coupled with spatial analysis of potential chromatin entry sites further suggests that the dosage compensation machinery may mediate escape from meiotic sex chromosome inactivation and postmeiotic reactivation of the X chromosome. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

28. Herbivory‐driven shifts in arbuscular mycorrhizal fungal community assembly: increased fungal competition and plant phosphorus benefits.

Author: Frew, Adam, Öpik, Maarja, Oja, Jane, Vahter, Tanel, Hiiesalu, Inga, and Aguilar‐Trigueros, Carlos A.
Subjects: *PLANT competition, *FUNGAL communities, *SOIL biology, *BIOTIC communities, *BOTANY, *COMPUTATIONAL biology, *FUNGAL spores, *PLANT defenses
Abstract: This article examines the impact of aboveground insect herbivory on the diversity and composition of arbuscular mycorrhizal (AM) fungal communities in plant roots. The study suggests that herbivory can affect the assembly of AM fungal communities, but the specific effects vary. While herbivory did not significantly reduce AM fungal richness, it did increase community evenness. The study also found that herbivory altered the composition and structure of AM fungal communities, leading to increased phylogenetic diversity. Additionally, plants with herbivores benefited more from AM fungi in terms of phosphorus acquisition compared to herbivore-free plants. These findings highlight the potential influence of aboveground herbivory on plant performance and nutrient acquisition. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

29. NeoHunter: Flexible software for systematically detecting neoantigens from sequencing data.

Author: Ma, Tianxing, Zhao, Zetong, Li, Haochen, Wei, Lei, and Zhang, Xuegong
Subjects: *MOLECULAR biology, *ANTIGENS, *COMPUTER software, *CANCER vaccines, *COMPUTATIONAL biology, *GENE fusion
Abstract: Complicated molecular alterations in tumors generate various mutant peptides. Some of these mutant peptides can be presented to the cell surface and then elicit immune responses, and such mutant peptides are called neoantigens. Accurate detection of neoantigens could help to design personalized cancer vaccines. Although some computational frameworks for neoantigen detection have been proposed, most of them can only detect SNV- and indel-derived neoantigens. In addition, current frameworks adopt oversimplified neoantigen prioritization strategies. These factors hinder the comprehensive and effective detection of neoantigens. We developed NeoHunter, flexible software to systematically detect and prioritize neoantigens from sequencing data in different formats. NeoHunter can detect not only SNV- and indel-derived neoantigens but also gene fusion- and aberrant splicing-derived neoantigens. NeoHunter supports both direct and indirect immunogenicity evaluation strategies to prioritize candidate neoantigens. These strategies utilize binding characteristics, existing biological big data, and T-cell receptor specificity to ensure accurate detection and prioritization. We applied NeoHunter to the TESLA dataset, cohorts of melanoma and non-small cell lung cancer patients. NeoHunter achieved high performance across the TESLA cancer patients and detected 79% (27 out of 34) of validated neoantigens in total. SNV- and indel-derived neoantigens accounted for 90% of the top 100 candidate neoantigens while neoantigens from aberrant splicing accounted for 9%. Gene fusion-derived neoantigens were detected in one patient. NeoHunter is a powerful tool to 'catch all' neoantigens and is available for free academic use on Github (XuegongLab/NeoHunter). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. Exact global alignment using A* with chaining seed heuristic and match pruning.

Author: Koerkamp, Ragnar Groot and Ivanov, Pesho
Subjects: *SEQUENCE alignment, *COMPUTATIONAL biology, *HEURISTIC, *SEEDS
Abstract: Motivation: Sequence alignment has been at the core of computational biology for half a century. Still, it is an open problem to design a practical algorithm for exact alignment of a pair of related sequences in linear-like time. Results: We solve exact global pairwise alignment with respect to edit distance by using the A* shortest path algorithm. In order to efficiently align long sequences with high divergence, we extend the recently proposed seed heuristic with match chaining, gap costs, and inexact matches. We additionally integrate the novel match pruning technique and diagonal transition to improve the A* search. We prove the correctness of our algorithm, implement it in the A*PA aligner, and justify our extensions intuitively and empirically. On random sequences of divergence $d = 4\%$ and length $n$, the empirical runtime of A*PA scales near-linearly with length (best fit $n^{1.06}$, $n \le 10^{7}$ bp). A similar scaling remains up to $d = 12\%$ (best fit $n^{1.24}$, $n \le 10^{7}$ bp). For $n = 10^{7}$ bp and $d = 4\%$, A*PA reaches $> 500\times$ speedup compared to the leading exact aligners Edlib and BiWFA. The performance of A*PA is highly influenced by long gaps. On long ($n > 500$ kb) ONT reads of a human sample it efficiently aligns sequences with $d < 10\%$, leading to $3\times$ median speedup compared to Edlib and BiWFA. When the sequences come from different human samples, A*PA performs $1.7\times$ faster than Edlib and BiWFA. Availability and implementation: github.com/RagnarGrootKoerkamp/astar-pairwise-aligner. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
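
Note: The abstract of record 30 describes computing exact edit distance by running A* over the alignment graph. The Python sketch below illustrates that general idea only. It assumes a deliberately simple heuristic (the difference between the remaining lengths of the two sequences) in place of the chaining seed heuristic, match pruning, and diagonal transition that the article introduces, so it shows none of the reported speedups. The function name `astar_edit_distance` is hypothetical and is not part of the A*PA code base.

```python
import heapq


def astar_edit_distance(a: str, b: str) -> int:
    """Exact unit-cost edit distance via A* on the alignment graph.

    A node (i, j) means the first i characters of a are aligned with
    the first j characters of b. Edges: match or substitution
    (diagonal, cost 0 or 1), deletion and insertion (cost 1 each).
    """
    n, m = len(a), len(b)

    def h(i: int, j: int) -> int:
        # Assumed heuristic for this sketch: the remaining suffixes differ
        # in length by this many characters, so at least that many indels
        # are still required. It never overestimates and is consistent.
        return abs((n - i) - (m - j))

    best = {(0, 0): 0}              # cheapest known cost to each node
    heap = [(h(0, 0), 0, 0, 0)]     # entries are (g + h, g, i, j)
    while heap:
        _, g, i, j = heapq.heappop(heap)
        if (i, j) == (n, m):
            return g                # first time the goal is popped is optimal
        if g > best.get((i, j), float("inf")):
            continue                # stale queue entry, a cheaper path exists
        steps = []
        if i < n and j < m:
            steps.append((i + 1, j + 1, 0 if a[i] == b[j] else 1))
        if i < n:
            steps.append((i + 1, j, 1))   # delete a[i]
        if j < m:
            steps.append((i, j + 1, 1))   # insert b[j]
        for ni, nj, cost in steps:
            ng = g + cost
            if ng < best.get((ni, nj), float("inf")):
                best[(ni, nj)] = ng
                heapq.heappush(heap, (ng + h(ni, nj), ng, ni, nj))
    raise AssertionError("goal node is always reachable")


if __name__ == "__main__":
    print(astar_edit_distance("kitten", "sitting"))
```

Because the heuristic here is weak, the search still explores a large part of the quadratic alignment graph. According to the abstract, the near-linear scaling it reports comes from the stronger heuristics and pruning that this sketch omits.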

31. Exploring DNA Damage and Repair Mechanisms: A Review with Computational Insights.

Author: Chen, Jiawei, Potlapalli, Ravi, Quan, Heng, Chen, Lingtao, Xie, Ying, Pouriyeh, Seyedamin, Sakib, Nazmus, Liu, Lichao, and Xie, Yixin
Subjects: *DNA repair, *DNA data banks, *DNA damage, *COMPUTATIONAL neuroscience, *DEOXYRIBOZYMES, *COMPUTATIONAL biology
Abstract: DNA damage is a critical factor contributing to genetic alterations, directly affecting human health, including developing diseases such as cancer and age-related disorders. DNA repair mechanisms play a pivotal role in safeguarding genetic integrity and preventing the onset of these ailments. Over the past decade, substantial progress and pivotal discoveries have been achieved in DNA damage and repair. This comprehensive review paper consolidates research efforts, focusing on DNA repair mechanisms, computational research methods, and associated databases. Our work is a valuable resource for scientists and researchers engaged in computational DNA research, offering the latest insights into DNA-related proteins, diseases, and cutting-edge methodologies. The review addresses key questions, including the major types of DNA damage, common DNA repair mechanisms, the availability of reliable databases for DNA damage and associated diseases, and the predominant computational research methods for enzymes involved in DNA damage and repair. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. Deep Learning for Subtypes Identification of Pure Seminoma of the Testis.

Author: Medvedev, Kirill E, Acosta, Paul H, Jia, Liwei, and Grishin, Nick V
Subjects: *TESTIS, *DEEP learning, *BIOINFORMATICS, *CANCER patients, *TESTIS tumors, *DECISION making, *DESCRIPTIVE statistics, *HISTOLOGICAL techniques, *RECEIVER operating characteristic curves, *PREDICTION models, *SEMINOMA
Abstract: The most critical step in the clinical diagnosis workflow is the pathological evaluation of each tumor sample. Deep learning is a powerful approach that is widely used to enhance diagnostic accuracy and streamline the diagnosis process. In our previous study using omics data, we identified 2 distinct subtypes of pure seminoma. Seminoma is the most common histological type of testicular germ cell tumors (TGCTs). Here we developed a deep learning decision making tool for the identification of seminoma subtypes using histopathological slides. We used all available slides for pure seminoma samples from The Cancer Genome Atlas (TCGA). The developed model showed an area under the ROC curve of 0.896. Our model not only confirms the presence of 2 distinct subtypes within pure seminoma but also unveils the presence of morphological differences between them that are imperceptible to the human eye. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Generic model to unravel the deeper insights of viral infections: an empirical application of evolutionary graph coloring in computational network biology.

Author: Kole, Arnab, Bag, Arup Kumar, Pal, Anindya Jyoti, and De, Debashis
Subjects: *GRAPH coloring, *VIRUS diseases, *COMPUTATIONAL biology, *TRANSCRIPTION factors, *DRUG target
Abstract: Purpose: The graph coloring approach has emerged as a valuable problem-solving tool for both theoretical and practical aspects across various scientific disciplines, including biology. In this study, we demonstrate graph coloring's effectiveness in computational network biology, more precisely in analyzing protein–protein interaction (PPI) networks to gain insights into viral infections and their consequences for human health. Accordingly, we propose a generic model that can highlight important hub proteins of virus-associated disease manifestations, changes in disease-associated biological pathways, potential drug targets and respective drugs. We test our model on SARS-CoV-2 infection, a highly transmissible virus responsible for the COVID-19 pandemic. The pandemic claimed many human lives, causing severe respiratory illnesses and exhibiting various symptoms ranging from fever and cough to gastrointestinal, cardiac, renal, neurological, and other manifestations. Methods: To investigate the underlying mechanisms of SARS-CoV-2 infection-induced dysregulation of human pathobiology, we construct a two-level PPI network and employ a differential evolution-based graph coloring (DEGCP) algorithm to identify critical hub proteins that might serve as potential targets for resolving the associated issues. Initially, we concentrate on the direct human interactors of SARS-CoV-2 proteins to construct the first-level PPI network and subsequently apply the DEGCP algorithm to identify essential hub proteins within this network. We then build a second-level PPI network by incorporating the next-level human interactors of the first-level hub proteins and use the DEGCP algorithm to predict the second level of hub proteins. Results: We first identify the potential crucial hub proteins associated with SARS-CoV-2 infection at different levels. 
Through comprehensive analysis, we then investigate the cellular localization, interactions with other viral families, involvement in biological pathways and processes, functional attributes, gene regulation capabilities as transcription factors, and associations with disease-associated symptoms of these identified hub proteins. Our findings highlight the significance of these hub proteins and their intricate connections with disease pathophysiology. Furthermore, we predict potential drug targets among the hub proteins and identify specific drugs that hold promise in preventing or treating SARS-CoV-2 infection and its consequences. Conclusion: Our generic model demonstrates the effectiveness of the DEGCP algorithm in analyzing biological PPI networks, provides valuable insights into disease biology, and offers a basis for developing novel therapeutic strategies for other viral infections that may cause future pandemics. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
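
Note on record 33: the abstract names a differential evolution-based graph coloring (DEGCP) algorithm but does not describe its steps. The block below is therefore only a minimal sketch of the general idea of coloring an interaction network, using a plain greedy heuristic in place of DEGCP. The protein names, the toy edge list, and the use of node degree as a stand-in for "hub" importance are all assumptions made for illustration.

```python
from collections import defaultdict


def build_graph(edges):
    """Return an undirected adjacency map from (u, v) interaction pairs."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return dict(adj)


def greedy_coloring(adj):
    """Give each node the smallest color not used by an already-colored neighbor.

    Nodes are visited by decreasing degree (ties broken by name).
    This is a standard greedy heuristic, NOT the DEGCP algorithm.
    """
    order = sorted(adj, key=lambda n: (-len(adj[n]), n))
    colors = {}
    for node in order:
        used = {colors[nb] for nb in adj[node] if nb in colors}
        color = 0
        while color in used:
            color += 1
        colors[node] = color
    return colors


def top_hubs(adj, k):
    """Rank nodes by degree; an assumed, simplistic notion of a hub protein."""
    return sorted(adj, key=lambda n: (-len(adj[n]), n))[:k]


# Hypothetical toy network; names do not refer to real proteins.
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P4", "P5")]
adj = build_graph(edges)
colors = greedy_coloring(adj)
hubs = top_hubs(adj, 2)
```

In a proper coloring no two interacting proteins share a color, so each color class is a set of mutually non-interacting nodes. How the authors turn a coloring into hub predictions is not stated in the abstract.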

34. K+ and pH homeostasis in plant cells is controlled by a synchronized K+/H+ antiport at the plasma and vacuolar membrane.

Author: Li, Kunkun, Grauschopf, Christina, Hedrich, Rainer, Dreyer, Ingo, and Konrad, Kai R.
Subjects: *CELL membranes, *COMPUTATIONAL biology, *CYTOLOGY, *ION transport (Biology), *MEMBRANE potential, *HOMEOSTASIS, *POTASSIUM channels, *STOMATA
Abstract: Summary: Stomatal movement involves ion transport across the plasma membrane (PM) and vacuolar membrane (VM) of guard cells. However, the coupling mechanisms of ion transporters in both membranes and their interplay with Ca2+ and pH changes are largely unclear. Here, we investigated transporter networks in tobacco guard cells and mesophyll cells using multiparametric live‐cell ion imaging and computational simulations. K+ and anion fluxes at both, PM and VM, affected H+ and Ca2+, as changes in extracellular KCl or KNO3 concentrations were accompanied by cytosolic and vacuolar pH shifts and changes in [Ca2+]cyt and the membrane potential. At both membranes, the K+ transporter networks mediated an antiport of K+ and H+. By contrast, net transport of anions was accompanied by parallel H+ transport, with differences in transport capacity for chloride and nitrate. Guard and mesophyll cells exhibited similarities in K+/H+ transport but cell type‐specific differences in [H+]cyt and pH‐dependent [Ca2+]cyt signals. Computational cell biology models explained mechanistically the properties of transporter networks and the coupling of transport across the PM and VM. Our integrated approach indicates fundamental principles of coupled ion transport at membrane sandwiches to control H+/K+ homeostasis and points to transceptor‐like Ca2+/H+‐based ion signaling in plant cells. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. Special Issue "State-of-the-Art Molecular Plant Sciences in Japan".

Author: Komatsu, Setsuko and Uemura, Matsuo
Subjects: *BIOTIC communities, *MOLECULAR biology, *AGRICULTURE, *COMPUTATIONAL biology, *DEVELOPMENTAL biology, *SYNTHETIC biology, *BOTANY
Abstract: This document is a summary of a special issue in the International Journal of Molecular Sciences titled "State-of-the-Art Molecular Plant Sciences in Japan." The issue focuses on the impact of climate change on food production and the efforts of plant scientists in Japan to develop resilient crops. The articles cover various topics such as stress tolerance, plant metabolism, plant-microbe interactions, and the use of omics techniques in agricultural research. The research presented in the issue aims to improve food production and understand the interaction between plants and their environment. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

36. AI-Based Detection of Oral Squamous Cell Carcinoma with Raman Histology.

Author: Weber, Andreas, Enderle-Ammour, Kathrin, Kurowski, Konrad, Metzger, Marc C., Poxleitner, Philipp, Werner, Martin, Rothweiler, René, Beck, Jürgen, Straehle, Jakob, Schmelzeisen, Rainer, Steybe, David, and Bronsert, Peter
Subjects: *HEAD & neck cancer diagnosis, *TISSUE analysis, *DEEP learning, *MOUTH tumors, *PREDICTIVE tests, *STAINS & staining (Microscopy), *INTRAOPERATIVE care, *ARTIFICIAL intelligence, *HEAD & neck cancer, *RAMAN spectroscopy, *EPITHELIUM, *RESEARCH funding, *COMPUTER-aided diagnosis, *ARTIFICIAL neural networks, *SQUAMOUS cell carcinoma, *ADIPOSE tissues, RESEARCH evaluation
Abstract: Simple Summary: Stimulated Raman Histology (SRH) is a technique that uses laser light to create detailed images of tissues without the need for traditional staining. This study aimed to use deep learning to classify oral squamous cell carcinoma (OSCC) and different non-malignant tissue types using SRH images. The performances of the classifications between SRH images and the original images obtained from stimulated Raman scattering (SRS) were compared. A deep learning model was trained on 64 images and tested on 16, showing that it could effectively identify tissue types during surgery, potentially speeding up decision making in oral cancer surgery. Stimulated Raman Histology (SRH) employs the stimulated Raman scattering (SRS) of photons at biomolecules in tissue samples to generate histological images. Subsequent pathological analysis allows for an intraoperative evaluation without the need for sectioning and staining. The objective of this study was to investigate a deep learning-based classification of oral squamous cell carcinoma (OSCC) and the sub-classification of non-malignant tissue types, as well as to compare the performances of the classifier between SRS and SRH images. Raman shifts were measured at wavenumbers k1 = 2845 cm−1 and k2 = 2930 cm−1. SRS images were transformed into SRH images resembling traditional H&E-stained frozen sections. The annotation of 6 tissue types was performed on images obtained from 80 tissue samples from eight OSCC patients. A VGG19-based convolutional neural network was then trained on 64 SRS images (and corresponding SRH images) and tested on 16. A balanced accuracy of 0.90 (0.87 for SRH images) and F1-scores of 0.91 (0.91 for SRH) for stroma, 0.98 (0.96 for SRH) for adipose tissue, 0.90 (0.87 for SRH) for squamous epithelium, 0.92 (0.76 for SRH) for muscle, 0.87 (0.90 for SRH) for glandular tissue, and 0.88 (0.87 for SRH) for tumor were achieved. 
The results of this study demonstrate the suitability of deep learning for the intraoperative identification of tissue types directly on SRS and SRH images. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
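
Note on record 36: the abstract reports a balanced accuracy and per-class F1-scores. As a hedged illustration of how such figures are usually computed, the sketch below derives both from true and predicted labels. The label lists are invented toy data and are unrelated to the study's results. The definitions assumed here are the common ones: balanced accuracy as the mean of per-class recall, and F1 as 2TP / (2TP + FP + FN).

```python
def per_class_counts(y_true, y_pred, label):
    """Count true positives, false positives and false negatives for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    return tp, fp, fn


def f1_score(y_true, y_pred, label):
    """F1 for one class: harmonic mean of precision and recall."""
    tp, fp, fn = per_class_counts(y_true, y_pred, label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def balanced_accuracy(y_true, y_pred):
    """Mean recall over the classes that occur in y_true."""
    recalls = []
    for label in sorted(set(y_true)):
        tp, _, fn = per_class_counts(y_true, y_pred, label)
        recalls.append(tp / (tp + fn))
    return sum(recalls) / len(recalls)


# Invented toy labels (three of the tissue classes named in the abstract).
y_true = ["tumor", "tumor", "tumor", "tumor", "stroma", "stroma", "muscle", "muscle"]
y_pred = ["tumor", "tumor", "tumor", "stroma", "stroma", "stroma", "muscle", "tumor"]

bal_acc = balanced_accuracy(y_true, y_pred)
f1_tumor = f1_score(y_true, y_pred, "tumor")
f1_stroma = f1_score(y_true, y_pred, "stroma")
```

Averaging recall per class keeps a large class from dominating the score, which matters when tissue types are unevenly represented.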

37. m1A-Ensem: accurate identification of 1-methyladenosine sites through ensemble models.

Author: Suleman, Muhammad Taseer, Alturise, Fahad, Alkhalifah, Tamim, and Khan, Yaser Daanial
Subjects: *NUCLEOTIDE sequence, *SITE-specific mutagenesis, *COMPUTATIONAL biology, *MASS spectrometry, *METABOLITES, *IDENTIFICATION
Abstract: Background: 1-methyladenosine (m1A) is a variant of methyladenosine that carries a methyl substituent at the 1st position and has a prominent role in RNA stability and human metabolites. Objective: Traditional approaches, such as mass spectrometry and site-directed mutagenesis, proved to be time-consuming and complicated. Methodology: The present research focused on the identification of m1A sites within RNA sequences using novel feature development mechanisms. The obtained features were used to train the ensemble models, including blending, boosting, and bagging. Independent testing and k-fold cross validation were then performed on the trained ensemble models. Results: The proposed model outperformed the preexisting predictors and revealed optimized scores based on major accuracy metrics. Conclusion: For research purposes, a user-friendly webserver of the proposed model can be accessed through https://taseersuleman-m1a-ensem1.streamlit.app/. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. StressME: Unified computing framework of Escherichia coli metabolism, gene expression, and stress responses.

Author: Zhao, Jiao, Chen, Ke, Palsson, Bernhard O., and Yang, Laurence
Subjects: *ESCHERICHIA coli, *GENE expression, *INDUSTRIAL architecture, *THERMAL stresses, *THERMAL batteries, *COMPUTATIONAL biology
Abstract: Generalist microbes have adapted to a multitude of environmental stresses through their integrated stress response system. Individual stress responses have been quantified by E. coli metabolism and expression (ME) models under thermal, oxidative and acid stress, respectively. However, the systematic quantification of cross-stress & cross-talk among these stress responses remains lacking. Here, we present StressME: the unified stress response model of E. coli combining thermal (FoldME), oxidative (OxidizeME) and acid (AcidifyME) stress responses. StressME is the most up-to-date ME model for E. coli and it reproduces all published single-stress ME models. Additionally, it includes refined rate constants to improve prediction accuracy for wild-type and stress-evolved strains. StressME revealed certain optimal proteome allocation strategies associated with cross-stress and cross-talk responses. These stress-optimal proteomes were shaped by trade-offs between protective vs. metabolic enzymes; cytoplasmic vs. periplasmic chaperones; and expression of stress-specific proteins. As StressME is tuned to compute metabolic and gene expression responses under mild acid, oxidative, and thermal stresses, it is useful for engineering and health applications. The modular design of our open-source package also facilitates model expansion (e.g., to new stress mechanisms) by the computational biology community. Author summary: A fundamental understanding of multi-stress adaptation in E. coli has potential industrial relevance. While individual stress responses have been quantified through the protein regulatory network in E. coli, the systematic quantification of the cross-stress & cross-talk among stress responses remains lacking. Here, we develop a new modeling pipeline by which thermal, oxidative and acid stress response can be coupled to each other, and the metabolic activities, protein and metabolic flux redistribution due to cross-stress & cross-talk can be quantified. 
We optimize the effective rate constants in the integrated model. We then confirm the model robustness by validating against the published data under single stress. Finally, we use the model to characterize the cross-adaptation between protective and catalytic proteins as well as between chaperones present in different cellular compartments. We find effective cross-protection against cross-stress by adapting the E. coli cells to the thermal stress first. We also indicate the presence of cross-talk through trade-offs by which the cell may refuse to give up more protein allocation away from one stress response to the other, because doing so would decrease stress tolerance further. The single stress plug-in design makes the model build-up pipeline flexible and expandable, allowing incorporation of more stressors into the model architecture for industrial applications. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. A framework for multi-scale intervention modeling: virtual cohorts, virtual clinical trials, and model-to-model comparisons.

Author: Michael, Christian T., Almohri, Sayed Ahmad, Linderman, Jennifer J., and Kirschner, Denise E.
Abstract: Computational models of disease progression have been constructed for a myriad of pathologies. Typically, the conceptual implementation for pathology-related in silico intervention studies has been ad hoc and similar in design to experimental studies. We introduce a multi-scale interventional design (MID) framework toward two key goals: tracking of disease dynamics from within-body to patient to population scale; and tracking impact(s) of interventions across these same spatial scales. Our MID framework prioritizes investigation of impact on individual patients within virtual pre-clinical trials, instead of replicating the design of experimental studies. We apply a MID framework to develop, organize, and analyze a cohort of virtual patients for the study of tuberculosis (TB) as an example disease. For this study, we use HostSim: our next-generation whole patient-scale computational model of individuals infected with Mycobacterium tuberculosis. HostSim captures infection within lungs by tracking multiple granulomas, together with dynamics occurring with blood and lymph node compartments, the compartments involved during pulmonary TB. We extend HostSim to include a simple drug intervention as an example of our approach and use our MID framework to quantify the impact of treatment at cellular and tissue (granuloma), patient (lungs, lymph nodes and blood), and population scales. Sensitivity analyses allow us to determine which features of virtual patients are the strongest predictors of intervention efficacy across scales. These insights allow us to identify patient-heterogeneous mechanisms that drive outcomes across scales. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. A framework for multi-scale intervention modeling: virtual cohorts, virtual clinical trials, and model-to-model comparisons.

Author: Michael, Christian T., Almohri, Sayed Ahmad, Linderman, Jennifer J., and Kirschner, Denise E.
Subjects: *MULTISCALE modeling, *MYCOBACTERIUM tuberculosis, *SIMULATED patients, *CLINICAL trials, *LYMPH nodes, *LUNGS
Abstract: Computational models of disease progression have been constructed for a myriad of pathologies. Typically, the conceptual implementation for pathology-related in silico intervention studies has been ad hoc and similar in design to experimental studies. We introduce a multi-scale interventional design (MID) framework toward two key goals: tracking of disease dynamics from within-body to patient to population scale; and tracking impact(s) of interventions across these same spatial scales. Our MID framework prioritizes investigation of impact on individual patients within virtual pre-clinical trials, instead of replicating the design of experimental studies. We apply a MID framework to develop, organize, and analyze a cohort of virtual patients for the study of tuberculosis (TB) as an example disease. For this study, we use HostSim: our next-generation whole patient-scale computational model of individuals infected with Mycobacterium tuberculosis. HostSim captures infection within lungs by tracking multiple granulomas, together with dynamics occurring with blood and lymph node compartments, the compartments involved during pulmonary TB. We extend HostSim to include a simple drug intervention as an example of our approach and use our MID framework to quantify the impact of treatment at cellular and tissue (granuloma), patient (lungs, lymph nodes and blood), and population scales. Sensitivity analyses allow us to determine which features of virtual patients are the strongest predictors of intervention efficacy across scales. These insights allow us to identify patient-heterogeneous mechanisms that drive outcomes across scales. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. Learning from high dimensional data based on weighted feature importance in decision tree ensembles.

Author: Pour, Nayiri Galestian and Shemehsavar, Soudabeh
Subjects: *DECISION trees, *RANDOM forest algorithms, *IMAGE recognition (Computer vision), *COMPUTATIONAL biology, *MACHINE learning, *DATA analysis
Abstract: Learning from high dimensional data has been utilized in various applications such as computational biology, image classification, and finance. Most classical machine learning algorithms fail to give accurate predictions in high dimensional settings due to the enormous feature space. In this article, we present a novel ensemble of classification trees based on weighted random subspaces that aims to adjust the distribution of selection probabilities. In the proposed algorithm, base classifiers are built on random feature subspaces in which the probability that influential features will be selected for the next subspace is updated by incorporating grouping information based on previous classifiers through a weighting function. As an interpretation tool, we show that variable importance measures computed by the new method can identify influential features efficiently. We provide theoretical reasoning for the different elements of the proposed method, and we evaluate the usefulness of the new method based on simulation studies and real data analysis. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. Transcriptome‐wide analysis reveals GYG2 as a mitochondria‐related aging biomarker in human subcutaneous adipose tissue.

Author: Ham, Mira, Cho, Yeonju, Kang, Tae‐Wook, Oh, Taeyun, Kim, Hyoung‐June, and Kim, Kyu‐Han
Subjects: *BIOMARKERS, *GENE expression, *AGING, *COMPUTATIONAL biology, *GENE regulatory networks, *ADIPOSE tissues, *HOMEOSTASIS
Abstract: Subcutaneous adipose tissue (SAT), a vital energy reservoir and endocrine organ for maintaining systemic glucose, lipid, and energy homeostasis, undergoes significant changes with age. However, among the existing aging‐related markers, only a few genes are associated with SAT aging. In this study, weighted gene co‐expression network analysis was used on a transcriptome of SAT obtained from the Genotype‐Tissue Expression portal to identify biologically relevant, SAT‐specific, and age‐related marker genes. We found modules that exhibited significant changes with age and identified GYG2 as a novel key aging‐associated gene. The link between GYG2 and mitochondrial function as well as brown/beige adipocytes was supported using additional bioinformatics and experimental analyses. Additionally, we identified PPARG as the transcription factor of GYG2 expression. The newly discovered GYG2 marker can be used to not only determine the age of SAT but also uncover new mechanisms underlying SAT aging. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. A Review: Multi-Omics Approach to Studying the Association between Ionizing Radiation Effects on Biological Aging.

Author: Ruprecht, Nathan A., Singhal, Sonalika, Schaefer, Kalli, Panda, Om, Sens, Donald, and Singhal, Sandeep K.
Subjects: *PHYSIOLOGICAL effects of radiation, *IONIZING radiation, *MULTIOMICS, *DOUBLE-strand DNA breaks, *LITERATURE reviews, *COMPUTATIONAL biology, *OLD age
Abstract: Simple Summary: The effects of radiation exposure seem closely related to effects of old age—so much so that the idea of a radiation–age association came about in the 1960s. While not a new idea, modern technology is allowing us to revisit these ideas and explore them with a fresh perspective. Separately, there are gaps in the community's understanding of the effects of radiation and aging, such as with respect to low-level, long-term effects of radiation and estimating someone's biological age. To study their association, a number of tools exist that need to be efficiently integrated to study this complex and interdisciplinary field. This article includes an extensive literature review on the theory of these two topics, providing a detailed foundation for a current understanding. We then present a resource-agnostic approach for researchers in these areas, focusing on studying the association between the two. Primary points of interest are focused on indirect damage of radiation exposure via oxidative stress within a cell, a comprehensive table of functional estimators for biological age, and using modern computational tools and biology to overlap fields of study to develop and exploit a rad–age association. Multi-omics studies have emerged as powerful tools for tailoring individualized responses to various conditions, capitalizing on genome sequencing technologies' increasing affordability and efficiency. This paper delves into the potential of multi-omics in deepening our understanding of biological age, examining the techniques available in light of evolving technology and computational models. The primary objective is to review the relationship between ionizing radiation and biological age, exploring a wide array of functional, physiological, and psychological parameters. This comprehensive review draws upon an extensive range of sources, including peer-reviewed journal articles, government documents, and reputable websites. 
The literature review spans from fundamental insights into radiation effects to the latest developments in aging research. Ionizing radiation exerts its influence through direct mechanisms, notably single- and double-strand DNA breaks and cross links, along with other critical cellular events. The cumulative impact of DNA damage forms the foundation for the intricate process of natural aging, intersecting with numerous diseases and pivotal biomarkers. Furthermore, there is a resurgence of interest in ionizing radiation research from various organizations and countries, reinvigorating its importance as a key contributor to the study of biological age. Biological age serves as a vital reference point for the monitoring and mitigation of the effects of various stressors, including ionizing radiation. Ionizing radiation emerges as a potent candidate for modeling the separation of biological age from chronological age, offering a promising avenue for tailoring protocols across diverse fields, including the rigorous demands of space exploration. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. Targeting the Bottlenecks in Levan Biosynthesis Pathway in Bacillus subtilis and Strain Optimization by Computational Modeling and Omics Integration.

Author: Immanuel, Aruldoss, Yennamalli, Ragothaman M., and Ulaganathan, Venkatasubramanian
Abstract: Levan is a fructan polymer with many industrial applications such as the formulation of hydrogels, drug delivery, and wound healing, among others. To this end, metabolic systems engineering is a valuable method to improve the yield of a specific metabolite in a wide range of bacterial and eukaryotic organisms. In this study, we report a systems biology approach integrating genomics data for the Bacillus subtilis model, wherein the metabolic pathway for levan biosynthesis is unpacked. We analyzed a revised genome-scale enzyme-constrained metabolic model (ecGEM) and performed simulations to increase levan biopolymer production capacity in B. subtilis. We used the model ec_iYO844_lvn to (1) identify the essential genes and bottlenecks in levan production, and (2) specifically design an engineered B. subtilis strain capable of producing higher levan yields. Flux balance analysis (FBA) and flux variability analysis (FVA) showed a maximal growth rate of the organism of up to 0.624 hr−1 at 20 mmol gDw−1 hr−1 of sucrose intake. Gene knockout analyses were performed to identify knockout targets that increase the levan flux in B. subtilis. Importantly, we found that the pgk and ctaD genes are the two target genes for the knockout. Perturbing these two genes yields flux gains for levan production reactions, with 1.3- and 1.4-fold the relative flux span in the mutant strains, respectively, compared to the wild type. In all, this work identifies the bottlenecks in the production of levan and possible ways to overcome them. Our results provide deeper insights on the bacterium's physiology and new avenues for strain engineering. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. ESGCT 30th Annual Congress In collaboration with SFTCG and NVGCT Brussels, Belgium October 24–27, 2023 Abstracts.

Subjects: *COMPUTATIONAL biology, *GENE expression, *GENETIC variation, *HUMAN embryonic stem cells, *GRAFT versus host disease
Abstract: The article provides summaries of several scientific studies and developments related to gene therapy and gene editing. The studies cover a range of genetic disorders, including myotonic dystrophy, mucopolysaccharidosis, frontotemporal dementia, ocular diseases, Duchenne muscular dystrophy, sickle cell disease, X-linked severe combined immunodeficiency, and cancer. The research explores the use of various gene editing techniques, such as CRISPR/Cas9 and Cas12 enzymes, as well as the development of non-viral delivery systems. The studies demonstrate the potential of gene therapy and gene editing in treating genetic diseases and improving patient outcomes. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

46. Bioframe: operations on genomic intervals in Pandas dataframes.

Author: Abdennur, Nezar, Fudenberg, Geoffrey, Flyamer, Ilya M, Galitsyna, Aleksandra A, Goloborodko, Anton, Imakaev, Maxim, and Venev, Sergey
Subjects: *INTERVAL analysis, *DATA structures, *COMPUTATIONAL biology, *PYTHON programming language, *BINDING sites, *PANDAS
Abstract: Motivation: Genomic intervals are one of the most prevalent data structures in computational genome biology, and are used to represent features ranging from genes, to DNA binding sites, to disease variants. Operations on genomic intervals provide a language for asking questions about relationships between features. While there are excellent interval arithmetic tools for the command line, they are not smoothly integrated into Python, one of the most popular general-purpose computational and visualization environments. Results: Bioframe is a library to enable flexible and performant operations on genomic interval dataframes in Python. Bioframe extends the Python data science stack to use cases for computational genome biology by building directly on top of two of the most commonly-used Python libraries, NumPy and Pandas. The bioframe API enables flexible name and column orders, and decouples operations from data formats to avoid unnecessary conversions, a common scourge for bioinformaticians. Bioframe achieves these goals while maintaining high performance and a rich set of features. Availability and implementation: Bioframe is open-source under the MIT license, cross-platform, and can be installed from the Python Package Index. The source code is maintained by Open2C on GitHub at https://github.com/open2c/bioframe. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
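
Note on record 46: the abstract describes operations on genomic intervals as a way of asking how features relate to one another. The sketch below illustrates the most basic such operation, finding overlapping pairs between two interval sets. It is written in plain Python and deliberately does not use or imitate the bioframe API. The half-open coordinate convention (start inclusive, end exclusive), the field names, and the gene and site names are assumptions chosen for illustration.

```python
from collections import defaultdict, namedtuple

# Assumed record layout: chromosome, half-open [start, end) coordinates, a label.
Interval = namedtuple("Interval", ["chrom", "start", "end", "name"])


def overlap_pairs(left, right):
    """Return (left.name, right.name) for every pair of overlapping intervals.

    Two half-open intervals on the same chromosome overlap when each one
    starts before the other ends. Intervals that merely touch do not overlap.
    """
    by_chrom = defaultdict(list)
    for iv in right:
        by_chrom[iv.chrom].append(iv)

    pairs = []
    for a in left:
        for b in by_chrom.get(a.chrom, []):
            if a.start < b.end and b.start < a.end:
                pairs.append((a.name, b.name))
    return pairs


# Hypothetical features.
genes = [
    Interval("chr1", 100, 200, "geneA"),
    Interval("chr1", 300, 400, "geneB"),
    Interval("chr2", 100, 200, "geneC"),
]
sites = [
    Interval("chr1", 150, 160, "site1"),  # inside geneA
    Interval("chr1", 200, 250, "site2"),  # touches geneA's end, no overlap
    Interval("chr2", 50, 101, "site3"),   # overlaps geneC by one base
]

pairs = overlap_pairs(genes, sites)
```

This nested loop is quadratic per chromosome and is meant only to make the overlap predicate concrete. Dataframe-based libraries typically use sorted or vectorized approaches for the same question.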

47. pycofitness—Evaluating the fitness landscape of RNA and protein sequences.

Author: Pucci, Fabrizio, Zerihun, Mehari B, Rooman, Marianne, and Schug, Alexander
Subjects: *AMINO acid sequence, *COMPUTATIONAL biology, *GENETIC variation, *PROTEIN engineering, *RESEARCH personnel, *SYNTHETIC biology
Abstract: Motivation: The accurate prediction of how mutations change biophysical properties of proteins or RNA is a major goal in computational biology with tremendous impacts on protein design and genetic variant interpretation. Evolutionary approaches such as coevolution can help solve this issue. Results: We present pycofitness, a standalone Python-based software package for the in silico mutagenesis of protein and RNA sequences. It is based on coevolution and, more specifically, on a popular inverse statistical approach, namely direct coupling analysis by pseudo-likelihood maximization. Its efficient implementation and user-friendly command line interface make it an easy-to-use tool even for researchers with no bioinformatics background. To illustrate its strengths, we present three applications in which pycofitness efficiently predicts the deleteriousness of genetic variants and the effect of mutations on protein fitness and thermodynamic stability. Availability and implementation: https://github.com/KIT-MBS/pycofitness. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. Allelic Variations in Vernalization (Vrn) Genes in Triticum spp.

Author: Afshari-Behbahanizadeh, Sanaz, Puglisi, Damiano, Esposito, Salvatore, and De Vita, Pasquale
Subjects: *VERNALIZATION, *COMPUTATIONAL biology, *FLOWERING time, *GENES, *CLIMATE change, *DURUM wheat, *WHEAT, *SYSTEMS biology
Abstract: Rapid climate changes, with higher warming rates during winter and spring seasons, dramatically affect the vernalization requirements, one of the most critical processes for the induction of wheat reproductive growth, with severe consequences on flowering time, grain filling, and grain yield. Specifically, the Vrn genes play a major role in the transition from vegetative to reproductive growth in wheat. Recent advances in wheat genomics have significantly improved the understanding of the molecular mechanisms of Vrn genes (Vrn-1, Vrn-2, Vrn-3, and Vrn-4), unveiling a diverse array of natural allelic variations. In this review, we have examined the current knowledge of Vrn genes from a functional and structural point of view, considering the studies conducted on Vrn alleles at different ploidy levels (diploid, tetraploid, and hexaploid). The molecular characterization of Vrn-1 alleles has been a focal point, revealing a diverse array of allelic forms with implications for flowering time. We have highlighted the structural complexity of the different allelic forms and the problems linked to the different nomenclature of some Vrn alleles. Addressing these issues will be crucial for harmonizing research efforts and enhancing our understanding of Vrn gene function and evolution. The increasing availability of genome and transcriptome sequences, along with the improvements in bioinformatics and computational biology, offers a versatile range of possibilities for enriching genomic regions surrounding the target sites of Vrn genes, paving the way for innovative approaches to manipulate flowering time and improve wheat productivity. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. Molecular epidemiology of pregnancy using omics data: advances, success stories, and challenges.

Author: Rahnavard, Ali, Chatterjee, Ranojoy, Wen, Hui, Gaylord, Clark, Mugusi, Sabina, Klatt, Kevin C., and Smith, Emily R.
Subjects: *MOLECULAR epidemiology, *PREGNANCY outcomes, *PREGNANCY, *MULTIOMICS, *COMPUTATIONAL biology
Abstract: Multi-omics approaches have been successfully applied to investigate pregnancy and health outcomes at a molecular and genetic level in several studies. As omics technologies advance, research areas are open to study further. Here we discuss overall trends and examples of successfully using omics technologies and techniques (e.g., genomics, proteomics, metabolomics, and metagenomics) to investigate the molecular epidemiology of pregnancy. In addition, we outline omics applications and study characteristics of pregnancy for understanding fundamental biology, causal health, and physiological relationships, risk and prediction modeling, diagnostics, and correlations. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. Fulgor: a fast and compact k-mer index for large-scale matching and color queries.

Author: Fan, Jason, Khan, Jamshed, Singh, Noor Pratap, Pibiri, Giulio Ermanno, and Patro, Rob
Subjects: *DE Bruijn graph, *COMPUTATIONAL biology, *DATA structures, *SALMONELLA enterica, *NUCLEOTIDE sequence, *SALMONELLA, *COLORS
Abstract: The problem of sequence identification or matching—determining the subset of reference sequences from a given collection that are likely to contain a short, queried nucleotide sequence—is relevant for many important tasks in Computational Biology, such as metagenomics and pangenome analysis. Due to the complex nature of such analyses and the large scale of the reference collections a resource-efficient solution to this problem is of utmost importance. This poses the threefold challenge of representing the reference collection with a data structure that is efficient to query, has light memory usage, and scales well to large collections. To solve this problem, we describe an efficient colored de Bruijn graph index, arising as the combination of a k-mer dictionary with a compressed inverted index. The proposed index takes full advantage of the fact that unitigs in the colored compacted de Bruijn graph are monochromatic (i.e., all k-mers in a unitig have the same set of references of origin, or color). Specifically, the unitigs are kept in the dictionary in color order, thereby allowing for the encoding of the map from k-mers to their colors in as little as 1 + o(1) bits per unitig. Hence, one color per unitig is stored in the index with almost no space/time overhead. By combining this property with simple but effective compression methods for integer lists, the index achieves very small space. We implement these methods in a tool called Fulgor, and conduct an extensive experimental analysis to demonstrate the improvement of our tool over previous solutions. For example, compared to Themisto—the strongest competitor in terms of index space vs. query time trade-off—Fulgor requires significantly less space (up to 43% less space for a collection of 150,000 Salmonella enterica genomes), is at least twice as fast for color queries, and is 2–6 × faster to construct. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
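
Note on entry 50: the abstract describes an index that pairs a k-mer dictionary with an inverted index of "colors" (the set of references a k-mer occurs in). The sketch below is a toy illustration of that pairing only, not Fulgor's implementation: it uses plain Python dictionaries, does not build unitigs or a de Bruijn graph, applies no compression, and answers a query by intersecting the colors of all k-mers in the read, which is an assumed query rule. The names `kmers`, `build_index`, and `query` are hypothetical.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield every length-k substring of seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def build_index(references, k):
    """Return (kmer_to_color, colors).

    kmer_to_color maps each k-mer to a small integer color id.
    colors[color_id] is the sorted tuple of reference ids sharing that color,
    so each distinct color is stored once no matter how many k-mers use it.
    """
    kmer_refs = defaultdict(set)
    for ref_id, seq in enumerate(references):
        for km in kmers(seq, k):
            kmer_refs[km].add(ref_id)

    color_ids = {}      # distinct color -> color id
    colors = []         # inverted index: color id -> reference ids
    kmer_to_color = {}  # dictionary: k-mer -> color id
    for km, refs in kmer_refs.items():
        color = tuple(sorted(refs))
        if color not in color_ids:
            color_ids[color] = len(colors)
            colors.append(color)
        kmer_to_color[km] = color_ids[color]
    return kmer_to_color, colors


def query(read, kmer_to_color, colors, k):
    """Return the set of reference ids containing every k-mer of read."""
    result = None
    for km in kmers(read, k):
        color_id = kmer_to_color.get(km)
        if color_id is None:
            return set()  # an unseen k-mer rules out every reference
        refs = set(colors[color_id])
        result = refs if result is None else result & refs
    return result or set()


references = ["ACGTACGT", "ACGTTTGA", "GGGACGTA"]
kmer_to_color, colors = build_index(references, k=4)
hits = query("ACGTA", kmer_to_color, colors, k=4)
```

In this toy input many k-mers share the same color, so the list of distinct colors is shorter than the k-mer dictionary. The abstract's space saving comes from exploiting the same redundancy at the level of unitigs.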
