Descriptor: "Computational Biology" / Publication Year Range: This year - Searchworks@Jio Institute Digital Library Search Results

Your search for "Computational Biology" returned 116 results.

1. Biocomputation: Moving Beyond Turing with Living Cellular Computers.

Author: Goñi-Moreno, Ángel
Subjects: *SYNTHETIC biology, *BIOLOGICALLY inspired computing, *COMPUTER science, *BOOLEAN functions, *COMPUTATIONAL biology, *MOLECULAR computers, *DNA
Abstract: This article explores the topic of biocomputation with living cellular computers by detailing the basic concepts of cellular computer construction as well as exploring improvements. The article includes programming bacteria to perform living Boolean logic functions and developing advanced living models of computation beyond combinatorial logic. Topics include synthetic biology, cellular supremacy, as well as systems complexity and algorithmic complexity.
Published: 2024
Full Text: View/download PDF

2. Molecular mechanisms of quetiapine bidirectional regulation of bipolar depression and mania based on network pharmacology and molecular docking: Evidence from computational biology.

Author: Li, Chao, Tian, Hongjun, Li, Ranli, Jia, Feng, Wang, Lina, Ma, Xiaoyan, Yang, Lei, Zhang, Qiuyu, Zhang, Ying, Yao, Kaifang, and Zhuo, Chuanjun
Subjects: *BIPOLAR disorder, *MOLECULAR docking, *MOLECULAR pharmacology, *COMPUTATIONAL biology, *QUETIAPINE
Abstract: Quetiapine monotherapy is recommended as the first-line option for acute mania and acute bipolar depression. However, the mechanism of action of quetiapine is unclear. Network pharmacology and molecular docking were employed to determine the molecular mechanisms of quetiapine bidirectional regulation of bipolar depression and mania. Putative target genes for quetiapine were collected from the GeneCards, SwissTargetPrediction, and DrugBank databases. Targets for bipolar depression and bipolar mania were identified from the DisGeNET and GeneCards databases. A protein-protein interaction (PPI) network was generated using the STRING database and imported into Cytoscape. DAVID and the Bioinformatics platform were employed to perform the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of the top 15 core targets. The drug-pathway-target-disease network was constructed using Cytoscape. Finally, molecular docking was performed to evaluate the interactions between quetiapine and potential targets. Targets for quetiapine actions against bipolar depression (126 targets) and bipolar mania (81 targets) were identified. Based on PPI and KEGG pathway analyses, quetiapine may affect bipolar depression by targeting the MAPK and PI3K/AKT insulin signaling pathways via BDNF, INS, EGFR, IGF1, and NGF, and it may affect bipolar mania by targeting the neuroactive ligand-receptor interaction signaling pathway via HTR1A, HTR1B, HTR2A, DRD2, and GRIN2B. Molecular docking revealed good binding affinity between quetiapine and potential targets. Pharmacological experiments should be conducted to verify and further explore these results. Our findings suggest that quetiapine affects bipolar depression and bipolar mania through distinct biological core targets, and thus through different mechanisms. Furthermore, our results provide a theoretical basis for the clinical use of quetiapine and possible directions for new drug development.
• Quetiapine affects bipolar depression and bipolar mania through different mechanisms. • Quetiapine affects depression by targeting MAPK and PI3K/AKT insulin signaling pathways via BDNF, INS, EGFR, IGF1, NGF. • Quetiapine affects mania through neuroactive ligand-receptor interaction pathway via HTR1A, HTR1B, HTR2A, DRD2, GRIN2B. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

3. CodLncScape Provides a Self‐Enriching Framework for the Systematic Collection and Exploration of Coding LncRNAs.

Author: Liu, Tianyuan, Qiao, Huiyuan, Wang, Zixu, Yang, Xinyan, Pan, Xianrun, Yang, Yu, Ye, Xiucai, Sakurai, Tetsuya, Lin, Hao, and Zhang, Yang
Subjects: *LINCRNA, *COMPUTATIONAL biology, *KNOWLEDGE base, *SPERMATOGENESIS
Abstract: Recent studies have revealed that numerous lncRNAs can translate proteins under specific conditions, performing diverse biological functions, thus termed coding lncRNAs. Their comprehensive landscape, however, remains elusive due to this field's preliminary and dispersed nature. This study introduces codLncScape, a framework for coding lncRNA exploration consisting of codLncDB, codLncFlow, codLncWeb, and codLncNLP. Specifically, it contains a manually compiled knowledge base, codLncDB, encompassing 353 coding lncRNA entries validated by experiments. Building upon codLncDB, codLncFlow investigates the expression characteristics of these lncRNAs and their diagnostic potential in the pan‐cancer context, alongside their association with spermatogenesis. Furthermore, codLncWeb emerges as a platform for storing, browsing, and accessing knowledge concerning coding lncRNAs within various programming environments. Finally, codLncNLP serves as a knowledge‐mining tool to enhance the timely content inclusion and updates within codLncDB. In summary, this study offers a well‐functioning, content‐rich ecosystem for coding lncRNA research, aiming to accelerate systematic studies in this field. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

4. In silico bioprospecting of receptors associated with the mechanism of action of Rondonin, an antifungal peptide from spider Acanthoscurria rondoniae haemolymph.

Author: Muniz Seif, Elias Jorge, Icimoto, Marcelo Yudi, and Silva Júnior, Pedro Ismael
Subjects: *PEPTIDES, *SPIDER venom, *SUCCINATE dehydrogenase, *HEMOLYMPH, *BIOPROSPECTING, *SMALL molecules
Abstract: Multiple drug-resistant fungal species are associated with the development of diseases. Thus, more efficient drugs for the treatment of these aetiological agents are needed. Rondonin is a peptide isolated from the haemolymph of the spider Acanthoscurria rondoniae. Previous studies have shown that this peptide has antifungal activity against Candida sp. and Trichosporon sp. strains, acting on their genetic material. However, the molecular targets involved in its biological activity have not yet been described. Bioinformatics tools were used to determine the possible targets involved in the biological activity of Rondonin. The PharmMapper server was used to search for microorganismal targets of Rondonin. The PatchDock server was used to perform the molecular docking. UCSF Chimera software was used to evaluate these intermolecular interactions. In addition, the I-TASSER server was used to predict the target ligand sites. Then, these predictions were contrasted with the sites previously described in the literature. Molecular dynamics simulations were conducted for two promising complexes identified from the docking analysis. Rondonin demonstrated consistency with the ligand sites of the following targets: outer membrane proteins F (id: 1MPF) and A (id: 1QJP), which are responsible for facilitating the passage of small molecules through the plasma membrane; the subunit of the flavoprotein fumarate reductase (id: 1D4E), which is involved in the metabolism of nitrogenous bases; and the ATP-dependent Holliday DNA helicase junction (id: 1IN4), which is associated with histone proteins that package genetic material. Additionally, the molecular dynamics results indicated the stability of the interaction of Rondonin with 1MPF and 1IN4 during a 10 ns simulation. These interactions corroborate previous in vitro studies on Rondonin, which acts on fungal genetic material without causing plasma membrane rupture.
Therefore, the bioprospecting methods used in this research were considered satisfactory since they were consistent with previous results obtained via in vitro experimentation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

5. An Ensemble Classifiers for Improved Prediction of Native–Non-Native Protein–Protein Interaction.

Author: Pratiwi, Nor Kumalasari Caecar, Tayara, Hilal, and Chong, Kil To
Abstract: In this study, we present an innovative approach to improve the prediction of protein–protein interactions (PPIs) through the utilization of an ensemble classifier, specifically focusing on distinguishing between native and non-native interactions. Leveraging the strengths of various base models, including random forest, gradient boosting, extreme gradient boosting, and light gradient boosting, our ensemble classifier integrates these diverse predictions using a logistic regression meta-classifier. Our model was evaluated using a comprehensive dataset generated from molecular dynamics simulations. While the gains in AUC and other metrics might seem modest, they contribute to a model that is more robust, consistent, and adaptable. To assess the effectiveness of various approaches, we compared the performance of logistic regression to four baseline models. Our results indicate that logistic regression consistently underperforms across all evaluated metrics. This suggests that it may not be well-suited to capture the complex relationships within this dataset. Tree-based models, on the other hand, appear to be more effective for problems involving molecular dynamics simulations. Extreme gradient boosting (XGBoost) and light gradient boosting (LightGBM) are optimized for performance and speed, handling datasets effectively and incorporating regularizations to avoid over-fitting. Our findings indicate that the ensemble method enhances the predictive capability of PPIs, offering a promising tool for computational biology and drug discovery by accurately identifying potential interaction sites and facilitating the understanding of complex protein functions within biological systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

6. Prediction of the antigenic regions in eight RhD variants identified by computational biology.

Author: Trueba‐Gómez, Rocio, Rosenfeld‐Mann, Fany, and Estrada‐Juárez, Higinio
Subjects: *COMPUTATIONAL biology, *TERTIARY structure, *PROTEIN structure, *MEMBRANE proteins, *ANTIGENS
Abstract: Background and Objectives: Changes in RHD generate variations in protein structure that lead to antigenic variants. The classical model divides them into quantitative (weak and Del) and qualitative (partial D). There are two types of protein antigens: linear and conformational. Computational biology analyses the theoretical assembly of tertiary protein structures and allows us to identify the 'topological' differences between isoforms. Our aim was to determine the theoretical antigenic differences between weak RhD variants compared with normal RhD based on structural analysis using bioinformatic techniques. Materials and Methods: We analysed the variations in secondary structures and hydrophobicity of RHD*01, RHD*01W.1, W2, W3, RHD*09.03.01, RHD*09.04, RHD*11, RHD*15 and RHD*21. We then modelled the tertiary structure and calculated their probable antigenic regions, intra‐protein interactions, displacement and membrane width and compared them with Rhce. Results: The 10 proteins are similar in their secondary structure and hydrophobicity, with the main differences observed in the exofacial coils. We identified six potential antigenic regions: one that is unique to RhD (R3), one that is common to all D (R6), three that are highly variable among RhD isoforms (R1, R2 and R4), one that they share with Rhce (R5) and two that are unique to Rhce (Ra and Rbc). Conclusion: The alloimmunization capacity of these subjects could be explained by the variability of the antigen pattern, which is not necessarily recognized or recognized with lower intensity by the commercially available antibodies, and not because they have a lower protein concentration in the membrane. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

7. Regulatory mechanism of emodin on pain behavior in osteoarthritis model mice.

Author: 袁满, 冯子瀚, 谢敏, and 王柏军
Abstract: Objective To explore the regulatory mechanism of emodin on pain behavior in a mouse model of osteoarthritis based on mitochondrial key genes. Methods Thirty C57BL/6J mice were randomly divided into the control group, the osteoarthritis (OA) model group and the emodin-treated (OA+emodin) group, with 10 mice per group. The mice in the OA group and the OA+emodin group received an intra-articular injection of complete Freund’s adjuvant (20 µL) in the knee to establish the OA model, and mice in the OA+emodin group were treated with an intraperitoneal injection of emodin (10 mg/kg). After behavioral testing, knee tissue was collected for hematoxylin-eosin staining. Western blot analysis was used to detect the expression levels of the proinflammatory factors interleukin-1β (IL-1β) and tumor necrosis factor-α (TNF-α) and of the mitochondria-related proteins NADH dehydrogenase (ubiquinone) flavoprotein 1 (NDUFV1), cytochrome C oxidase subunit 5B (COX5B), cytochrome C oxidase assembly protein COX15 homolog (COX15), and NADH dehydrogenase (ubiquinone) 1 alpha subcomplex subunit 10 (NDUFA10) in knee tissue. Results Compared with the control group, mice in the OA group showed a decreased mechanical nociceptive threshold (PWT) and reduced latency and distance in the rotarod test (P<0.05). Compared with the OA group, mice in the OA+emodin group showed increased PWT, latency, and distance (P<0.05). In the control group, the structures of cartilage and subchondral bone were intact, while in the OA group, the cartilage was thinner and the subchondral trabeculae had deteriorated. Treatment with emodin alleviated cartilage degeneration. The expression levels of IL-1β, TNF-α, COX15 and NDUFA10 were increased while the expression levels of NDUFV1 and COX5B were decreased in the OA group compared with the control group. Emodin treatment restored the above-mentioned protein expression levels (P<0.05). Conclusion Emodin can alleviate pain behavior in OA mice by regulating the expression of inflammatory factors and mitochondria-related proteins. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

8. Evolutionary feedback from the environment shapes mechanisms that generate genome variation.

Author: Caporale, Lynn Helena
Subjects: *GENOMES, *GENETICS, *COMPUTATIONAL biology, *BIOINFORMATICS, *NATURAL selection
Abstract: Darwin recognized that 'a grand and almost untrodden field of inquiry will be opened, on the causes and laws of variation.' However, because the Modern Synthesis assumes that the intrinsic probability of any individual mutation is unrelated to that mutation's potential adaptive value, attention has been focused on selection rather than on the intrinsic generation of variation. Yet many examples illustrate that the term 'random' mutation, as widely understood, is inaccurate. The probabilities of distinct classes of variation are neither evenly distributed across a genome nor invariant over time, nor unrelated to their potential adaptive value. Because selection acts upon variation, multiple biochemical mechanisms can and have evolved that increase the relative probability of adaptive mutations. In effect, the generation of heritable variation is in a feedback loop with selection, such that those mechanisms that tend to generate variants that survive recurring challenges in the environment would be captured by this survival and thus inherited and accumulated within lineages of genomes. Moreover, because genome variation is affected by a wide range of biochemical processes, genome variation can be regulated. Biochemical mechanisms that sense stress, from lack of nutrients to DNA damage, can increase the probability of specific classes of variation. A deeper understanding of evolution involves attention to the evolution of, and environmental influences upon, the intrinsic variation generated in gametes, in other words upon the biochemical mechanisms that generate variation across generations. These concepts have profound implications for the types of questions that can and should be asked, as omics databases become more comprehensive, detection methods more sensitive, and computation and experimental analyses even more high throughput and thus capable of revealing the intrinsic generation of variation in individual gametes. 
These concepts also have profound implications for evolutionary theory, which, upon reflection it will be argued, predicts that selection would increase the probability of generating adaptive mutations, in other words, predicts that the ability to evolve itself evolves. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

9. Assessing opportunities of SYCL for biological sequence alignment on GPU-based systems.

Author: Costanzo, Manuel, Rucci, Enzo, García-Sanchez, Carlos, Naiouf, Marcelo, and Prieto-Matías, Manuel
Subjects: *SEQUENCE alignment, *COMPUTATIONAL biology, *PROGRAMMING languages, *BIOINFORMATICS, *C++, *BIOINFORMATICS software
Abstract: Bioinformatics and computational biology are two fields that have been exploiting GPUs for more than two decades, with CUDA being the most used programming language for them. However, as CUDA is an NVIDIA proprietary language, it strongly restricts portability across a wide range of heterogeneous architectures, such as AMD or Intel GPUs. To face this issue, the Khronos Group has recently proposed the SYCL standard, which is an open, royalty-free, cross-platform abstraction layer that enables the programming of a heterogeneous system to be written using standard, single-source C++ code. Over the past few years, several implementations of this SYCL standard have emerged, oneAPI being the one from Intel. This paper presents the migration process of the SW# suite, a biological sequence alignment tool developed in CUDA, to SYCL using Intel's oneAPI ecosystem. The experimental results show that SW# was completely migrated with little hand-coding by the programmer. In addition, it was possible to port the migrated code between different architectures (considering multiple vendor GPUs and also CPUs), with no noticeable performance degradation on five different NVIDIA GPUs. Moreover, performance remained stable when switching to another SYCL implementation. As a consequence, SYCL and its implementations can offer attractive opportunities for the bioinformatics community, especially considering the large body of CUDA-based legacy code. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
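For readers unfamiliar with the kind of kernel being ported in entry 9, the following is a minimal Python sketch of the Smith-Waterman local alignment recurrence. It assumes, from the tool's name rather than from anything stated in the abstract, that SW# computes Smith-Waterman scores. The function name, the scoring values, and the linear gap penalty are illustrative choices and are not taken from SW# or from the paper's CUDA or SYCL code.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score of a and b (linear gap penalty).

    Hypothetical helper for illustration only; scoring values are assumptions.
    """
    prev = [0] * (len(b) + 1)   # previous row of the dynamic-programming table
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("GGACGTCC", "TTACGTAA"))
```

Each cell depends only on its left, upper, and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that independence is what GPU implementations of this recurrence typically exploit.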

10. Expression of HAUS5 in renal clear cell carcinoma and its prognostic significance.

Author: 杨川, 张元峰, 张唯力, 郑俊, and 唐纳贤
Abstract: Objective To investigate the expression of the Augmin like complex subunit 5 (HAUS5) gene and its prognostic significance in renal clear cell carcinoma (RCCC), elucidating its potential mechanisms in the occurrence and development of RCCC through bioinformatics methods and various analytical approaches. Methods Pan-cancer analysis was conducted using the TIMER2 website. The Human Protein Atlas database was utilized to obtain immunohistochemistry-based antibody-specific staining scores for RCCC. Additionally, gene mRNA expression levels and related clinical data for RCCC were downloaded from the Cancer Genome Atlas database, and the mRNA expression levels and associated clinical data of the HAUS5 gene in RCCC were organized using R and Excel software. Patients were categorized into high and low expression groups based on the median HAUS5 expression level. SPSS 27.0 and R software were used for survival analysis, functional analysis, and visualization. Results HAUS5 exhibited high expression in both RCCC tissues and cells. Higher expression of HAUS5 in RCCC was associated with an increased risk of mortality, with statistically significant differences (hazard ratio=1.744, 95% confidence interval: 1.286-2.365, P<0.001). Other clinical variables significantly associated with overall survival, such as distant metastasis (M stage), clinical stage, and primary tumor status (T stage), also had an impact (hazard ratio=1.550, 1.866, 1.891; 95% confidence interval: 1.302-1.846, 1.637-2.126, 1.608-2.225; P<0.001). Conclusion HAUS5 is highly expressed in both RCCC tissues and cells and is closely associated with the progression and adverse prognosis of RCCC, suggesting it as a potential target for clinical treatment of RCCC. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

11. Establishment and validation of a prognostic model of autophagy-related genes in lung adenocarcinoma based on the TCGA database.

Author: 吴双丽, 吴铁成, 喻光, 徐敬宣, 李保健, and 邢龙
Abstract: Objective To explore autophagy-related genes (ARGs) in lung adenocarcinoma and construct a prognostic model for lung adenocarcinoma based on ARGs. Methods High-throughput RNA transcriptome data for lung adenocarcinoma were obtained from The Cancer Genome Atlas (TCGA) database, and ARGs were acquired from the HADb database. A prognostic model for lung adenocarcinoma was constructed and validated based on differentially expressed ARGs, followed by the construction of nomograms and calibration curves to explore the clinical application value of the model. Gene ontology and Kyoto Encyclopedia of Genes and Genomes enrichment analyses were performed on differentially expressed ARGs. Lasso regression analysis was conducted on differentially expressed ARGs with prognostic significance to construct the prognostic model for lung adenocarcinoma. Kaplan-Meier survival curves were plotted. Results A total of 31 differentially expressed ARGs were obtained, and 10 differentially expressed ARGs with prognostic significance were selected. Patients in the high-risk group had significantly poorer overall survival (P<0.05). T stage, N stage and risk score were independent prognostic factors for patients with lung adenocarcinoma. The global consistency of the nomogram calibration curve was 0.710, indicating a high level of agreement between the model's predicted results and actual outcomes. Conclusion The risk model constructed based on differentially expressed ARGs can serve as a prognostic feature for patients with lung adenocarcinoma or provide a reference for individualized treatment of patients with lung adenocarcinoma. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

12. An Effective DNA‐Based File Storage System for Practical Archiving and Retrieval of Medical MRI Data.

Author: Rasool, Abdur, Hong, Jingwei, Hong, Zhiling, Li, Yuanzhen, Zou, Chao, Chen, Hui, Qu, Qiang, Wang, Yang, Jiang, Qingshan, Huang, Xiaoluo, and Dai, Junbiao
Abstract: DNA‐based data storage is a new technology in computational and synthetic biology that offers a solution for long‐term, high‐density data archiving. Given the critical importance of medical data in advancing human health, there is a growing interest in developing an effective medical data storage system based on DNA. Data integrity, accuracy, reliability, and efficient retrieval are all significant concerns. Therefore, this study proposes an Effective DNA Storage (EDS) approach for archiving medical MRI data. The EDS approach incorporates three key components: (i) a novel fraction strategy to address the critical issue of rotating encoding, which often leads to data loss due to single base error propagation; (ii) a novel rule‐based quaternary transcoding method that satisfies bio‐constraints and ensures reliable mapping; and (iii) an indexing technique designed to simplify random search and access. The effectiveness of this approach is validated through computer simulations and biological experiments, confirming its practicality. The EDS approach outperforms existing methods, providing superior control over bio‐constraints and reducing computational time. The results and code provided in this study open new avenues for practical DNA storage of medical MRI data, offering promising prospects for the future of medical data archiving and retrieval. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
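As background for the "quaternary transcoding" mentioned in entry 12, here is a minimal Python sketch of the simplest possible scheme: a fixed mapping of two bits to one nucleotide. This is not the paper's rule-based EDS method, which is not described in the abstract. The mapping table, the function names, and the two constraints checked (GC content and homopolymer run length) are assumptions chosen as typical examples of bio-constraints.

```python
BASES = "ACGT"  # assumed mapping: 00->A, 01->C, 10->G, 11->T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(BASES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of four."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases (0.0 for an empty sequence)."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def max_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base."""
    longest = run = 0
    prev = None
    for base in seq:
        run = run + 1 if base == prev else 1
        longest = max(longest, run)
        prev = base
    return longest


if __name__ == "__main__":
    strand = bytes_to_dna(b"MRI")
    print(strand, gc_content(strand), max_homopolymer(strand))
```

A fixed mapping like this one gives no control over GC content or repeated bases, since both depend entirely on the input bytes. That limitation is the usual motivation for constraint-aware transcoding rules.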

13. Maboss for HPC environments: implementations of the continuous time Boolean model simulator for large CPU clusters and GPU accelerators.

Author: Šmelko, Adam, Kratochvíl, Miroslav, Barillot, Emmanuel, and Noël, Vincent
Subjects: *CONTINUOUS time models, *SYSTEMS biology, *GRAPHICS processing units, *SIMULATION software, *HIGH performance computing, *SYNTHETIC biology, *COMPUTATIONAL biology
Abstract: Background: Computational models in systems biology are becoming more important with the advancement of experimental techniques to query the mechanistic details responsible for leading to phenotypes of interest. In particular, Boolean models are well fit to describe the complexity of signaling networks while being simple enough to scale to a very large number of components. With the advance of Boolean model inference techniques, the field is transforming from an artisanal way of building models of moderate size to a more automatized one, leading to very large models. In this context, adapting the simulation software for such increases in complexity is crucial. Results: We present two new developments in the continuous time Boolean simulators: MaBoSS.MPI, a parallel implementation of MaBoSS which can exploit the computational power of very large CPU clusters, and MaBoSS.GPU, which can use GPU accelerators to perform these simulations. Conclusion: These implementations enable simulation and exploration of the behavior of very large models, thus becoming a valuable analysis tool for the systems biology community. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
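The idea of a continuous-time Boolean simulation in entry 13 can be illustrated with a Gillespie-style loop: each node has a state-dependent flip rate, the waiting time to the next event is exponentially distributed, and the node that flips is chosen in proportion to its rate. The Python sketch below is not MaBoSS code and does not use its API or model format. The rule representation, the function name, and the two-node toy model are hypothetical.

```python
import random


def simulate(rules, initial_state, max_time, rng):
    """Simulate one trajectory of a continuous-time Boolean model.

    rules maps each node name to a function state -> flip rate (>= 0).
    Returns a list of (time, state) pairs, starting with the initial state.
    """
    state = dict(initial_state)
    t = 0.0
    trajectory = [(t, dict(state))]
    while True:
        rates = {node: rate_fn(state) for node, rate_fn in rules.items()}
        total = sum(rates.values())
        if total <= 0:          # no transition possible: absorbing state
            break
        t += rng.expovariate(total)   # exponential waiting time
        if t > max_time:
            break
        # Choose which node flips, with probability proportional to its rate.
        node = rng.choices(list(rates), weights=list(rates.values()))[0]
        state[node] = not state[node]
        trajectory.append((t, dict(state)))
    return trajectory


# Toy model (assumed): A switches on at rate 1.0; B switches on at rate 2.0
# once A is on. Neither node ever switches off.
RULES = {
    "A": lambda s: 1.0 if not s["A"] else 0.0,
    "B": lambda s: 2.0 if s["A"] and not s["B"] else 0.0,
}

if __name__ == "__main__":
    run = simulate(RULES, {"A": False, "B": False}, 1000.0, random.Random(0))
    for time, snapshot in run:
        print(round(time, 3), snapshot)
```

Estimating state probabilities requires many independent trajectories of this kind, which is why such simulations parallelize naturally across CPU cores or GPU threads.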

14. Optimized model architectures for deep learning on genomic data.

Author: Gündüz, Hüseyin Anil, Mreches, René, Moosbauer, Julia, Robertson, Gary, To, Xiao-Yin, Franzosa, Eric A., Huttenhower, Curtis, Rezaei, Mina, McHardy, Alice C., Bischl, Bernd, Münch, Philipp C., and Binder, Martin
Subjects: *DEEP learning, *ARCHITECTURAL design, *COMPUTER vision, *COMPUTATIONAL biology, *VISUAL fields
Abstract: The success of deep learning in various applications depends on task-specific architecture design choices, including the types, hyperparameters, and number of layers. In computational biology, there is no consensus on the optimal architecture design, and decisions are often made using insights from more well-established fields such as computer vision. These may not consider the domain-specific characteristics of genome sequences, potentially limiting performance. Here, we present GenomeNet-Architect, a neural architecture design framework that automatically optimizes deep learning models for genome sequence data. It optimizes the overall layout of the architecture, with a search space specifically designed for genomics. Additionally, it optimizes hyperparameters of individual layers and the model training procedure. On a viral classification task, GenomeNet-Architect reduced the read-level misclassification rate by 19%, with 67% faster inference and 83% fewer parameters, and achieved similar contig-level accuracy with ~100 times fewer parameters compared to the best-performing deep learning baselines. Introducing GenomeNet-Architect, a neural architecture design framework that automatically optimises the overall layout of the architecture, the hyperparameters, and the training procedure of deep learning models for genome sequence data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks.

Author: Candia, Julián and Ferrucci, Luigi
Subjects: *ETIOLOGY of cancer, *LIVER cancer, *HEPATOCELLULAR carcinoma, *GENES, *PHENOTYPES, *COMPUTATIONAL biology, *SYSTEMS biology
Abstract: Pathway enrichment analysis is a ubiquitous computational biology method to interpret a list of genes (typically derived from the association of large-scale omics data with phenotypes of interest) in terms of higher-level, predefined gene sets that share biological function, chromosomal location, or other common features. Among many tools developed so far, Gene Set Enrichment Analysis (GSEA) stands out as one of the pioneering and most widely used methods. Although originally developed for microarray data, GSEA is nowadays extensively utilized for RNA-seq data analysis. Here, we quantitatively assessed the performance of a variety of GSEA modalities and provide guidance in the practical use of GSEA in RNA-seq experiments. We leveraged harmonized RNA-seq datasets available from The Cancer Genome Atlas (TCGA) in combination with large, curated pathway collections from the Molecular Signatures Database to obtain cancer-type-specific target pathway lists across multiple cancer types. We carried out a detailed analysis of GSEA performance using both gene-set and phenotype permutations combined with four different choices for the Kolmogorov-Smirnov enrichment statistic. Based on our benchmarks, we conclude that the classic/unweighted gene-set permutation approach offered comparable or better sensitivity-vs-specificity tradeoffs across cancer types compared with other, more complex and computationally intensive permutation methods. Finally, we analyzed other large cohorts for thyroid cancer and hepatocellular carcinoma. We utilized a new consensus metric, the Enrichment Evidence Score (EES), which showed a remarkable agreement between pathways identified in TCGA and those from other sources, despite differences in cancer etiology. This finding suggests an EES-based strategy to identify a core set of pathways that may be complemented by an expanded set of pathways for downstream exploratory analysis. 
This work fills the existing gap in current guidelines and benchmarks for the use of GSEA with RNA-seq data and provides a framework to enable detailed benchmarking of other RNA-seq-based pathway analysis tools. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
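The classic (unweighted) Kolmogorov-Smirnov-style enrichment statistic and the gene-set permutation approach mentioned in entry 15 can be sketched as a running sum over a ranked gene list. This Python sketch is a simplified illustration, not the GSEA software's implementation. The function names, the toy gene identifiers, and the two-sided empirical p-value are assumptions, and the ranked list is assumed to contain each gene once.

```python
import random


def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum statistic: maximum deviation from zero."""
    hits = set(gene_set) & set(ranked_genes)
    n, n_hit = len(ranked_genes), len(hits)
    if n_hit == 0 or n_hit == n:
        raise ValueError("gene set must overlap the list without covering it")
    hit_step = 1.0 / n_hit           # increment when a set member is met
    miss_step = 1.0 / (n - n_hit)    # decrement otherwise
    running = 0.0
    best = 0.0
    for gene in ranked_genes:
        running += hit_step if gene in hits else -miss_step
        if abs(running) > abs(best):
            best = running
    return best


def gene_set_permutation_pvalue(ranked_genes, gene_set, n_perm=1000, seed=0):
    """Empirical two-sided p-value from random gene sets of the same size."""
    rng = random.Random(seed)
    observed = abs(enrichment_score(ranked_genes, gene_set))
    size = len(set(gene_set) & set(ranked_genes))
    exceed = 0
    for _ in range(n_perm):
        random_set = rng.sample(ranked_genes, size)
        if abs(enrichment_score(ranked_genes, random_set)) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


if __name__ == "__main__":
    ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]  # toy ranking, best first
    print(enrichment_score(ranked, {"g1", "g2"}))
    print(gene_set_permutation_pvalue(ranked, {"g1", "g2"}))
```

Gene-set permutation reshuffles set membership while keeping the ranking fixed, so it needs only one ranked list. Phenotype permutation instead recomputes the ranking for every shuffle of the sample labels, which is considerably more expensive.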

16. Interpretable online network dictionary learning for inferring long-range chromatin interactions.

Author: Rana, Vishal, Peng, Jianhao, Pan, Chao, Lyu, Hanbaek, Cheng, Albert, Kim, Minji, and Milenkovic, Olgica
Subjects: *ENCYCLOPEDIAS & dictionaries, *CHROMATIN, *FLUORESCENCE in situ hybridization, *COMPUTATIONAL biology, *DROSOPHILA melanogaster, *MATRIX decomposition
Abstract: Dictionary learning (DL), implemented via matrix factorization (MF), is commonly used in computational biology to tackle ubiquitous clustering problems. The method is favored due to its conceptual simplicity and relatively low computational complexity. However, DL algorithms produce results that lack interpretability in terms of real biological data. Additionally, they are not optimized for graph-structured data and hence often fail to handle them in a scalable manner. In order to address these limitations, we propose a novel DL algorithm called online convex network dictionary learning (online cvxNDL). Unlike classical DL algorithms, online cvxNDL is implemented via MF and designed to handle extremely large datasets by virtue of its online nature. Importantly, it enables the interpretation of dictionary elements, which serve as cluster representatives, through convex combinations of real measurements. Moreover, the algorithm can be applied to data with a network structure by incorporating specialized subnetwork sampling techniques. To demonstrate the utility of our approach, we apply cvxNDL on 3D-genome RNAPII ChIA-Drop data with the goal of identifying important long-range interaction patterns (long-range dictionary elements). ChIA-Drop probes higher-order interactions, and produces data in the form of hypergraphs whose nodes represent genomic fragments. The hyperedges represent observed physical contacts. Our hypergraph model analysis has the objective of creating an interpretable dictionary of long-range interaction patterns that accurately represent global chromatin physical contact maps. Through the use of dictionary information, one can also associate the contact maps with RNA transcripts and infer cellular functions. To accomplish the task at hand, we focus on RNAPII-enriched ChIA-Drop data from Drosophila melanogaster S2 cell lines. Our results offer two key insights.
First, we demonstrate that online cvxNDL retains the accuracy of classical DL (MF) methods while simultaneously ensuring unique interpretability and scalability. Second, we identify distinct collections of proximal and distal interaction patterns involving chromatin elements shared by related processes across different chromosomes, as well as patterns unique to specific chromosomes. To associate the dictionary elements with biological properties of the corresponding chromatin regions, we employ Gene Ontology (GO) enrichment analysis and perform multiple RNA coexpression studies. Author summary: We introduce a novel method for dictionary learning termed online convex Network Dictionary Learning (online cvxNDL). The method operates in an online manner and utilizes representative subnetworks of a network dataset as dictionary elements. A key feature of online cvxNDL is its ability to work with graph-structured data and generate dictionary elements that represent convex combinations of real data points, thus ensuring interpretability. Online cvxNDL is used to investigate long-range chromatin interactions in S2 cell lines of Drosophila melanogaster obtained through RNAPII ChIA-Drop measurements represented as hypergraphs. The results show that dictionary elements can accurately and efficiently reconstruct the original interactions present in the data, even when subjected to convexity constraints. To shed light on the biological relevance of the identified dictionaries, we perform Gene Ontology enrichment and RNA-seq coexpression analyses. These studies uncover multiple long-range interaction patterns that are chromosome-specific. Furthermore, the findings affirm the significance of convex dictionaries in representing TADs cross-validated by imaging methods (such as 3-color FISH (fluorescence in situ hybridization)). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

17. Pericyte Control of Gene Expression in the Blood-Brain Barrier Endothelium: Implications for Alzheimer's Disease.

Author: Nelson, Doug, Thompson, Kevin J., Wang, Lushan, Wang, Zengtao, Eberts, Paulina, Azarin, Samira M., Kalari, Krishna R., and Kandimalla, Karunya K.
Subjects: *BLOOD-brain barrier, *ALZHEIMER'S disease, *SOMATOMEDIN A, *GENE expression, *BRAIN-derived neurotrophic factor
Abstract: Background: A strong body of evidence suggests that cerebrovascular pathologies augment the onset and progression of Alzheimer's disease (AD). One distinctive aspect of this cerebrovascular dysfunction is the degeneration of brain pericytes—often overlooked supporting cells of blood-brain barrier endothelium. Objective: The current study investigates the influence of pericytes on gene and protein expressions in the blood-brain barrier endothelium, which is expected to facilitate the identification of pathophysiological pathways that are triggered by pericyte loss and lead to blood-brain barrier dysfunction in AD. Methods: Bioinformatics analysis was conducted on the RNA-Seq expression counts matrix (GSE144474), which compared solo-cultured human blood-brain barrier endothelial cells against endothelial cells co-cultured with human brain pericytes in a non-contact model. We constructed a similar cell culture model to verify protein expression using western blots. Results: The insulin resistance and ferroptosis pathways were found to be enriched. Western blots of the insulin receptor and heme oxygenase expressions were consistent with those observed in RNA-Seq data. Additionally, we observed more than 5-fold upregulation of several genes associated with neuroprotection, including insulin-like growth factor 2 and brain-derived neurotrophic factor. Conclusions: Results suggest that pericyte influence on blood-brain barrier endothelial gene expression confers protection from insulin resistance, iron accumulation, oxidative stress, and amyloid deposition. Since these are conditions associated with AD pathophysiology, they imply mechanisms by which pericyte degeneration could contribute to disease progression. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. Special Issue "Bioinformatics of Unusual DNA and RNA Structures".

Author: Bartas, Martin, Brázda, Václav, and Pečinka, Petr
Subjects: *DNA structure, *QUADRUPLEX nucleic acids, *DNA analysis, *COMPUTATIONAL biology, *MOLECULAR biology, *MOLECULAR structure
Abstract: This editorial from the International Journal of Molecular Sciences provides an overview of the field of unusual nucleic acid structures (UNas), which are noncanonical structures that differ from the classical double-stranded structure of B-DNA. The editorial highlights eight articles published in the special issue, covering topics such as G-quadruplex polymorphism, R-loop prediction, G-quadruplexes in cervical cancer, G-quadruplexes in arboviruses, the role of G-quadruplexes in gene expression regulation, the effects of ions on G-quadruplex formation in rice, tRNA fragments in bacterial communities, and virus-induced gene silencing. The editorial also discusses the future perspectives of UNas research, including the development of bioinformatic tools, structural modeling, virtual screening, and molecular dynamics methods. The article also discusses the potential of UNas as molecular targets for drug discovery, specifically focusing on G-quadruplexes. However, the main challenge with UNa-binding compounds has been their low specificity and high toxicity. The authors believe that advances in bioinformatic methods will soon allow for the selective targeting of specific pathological UNas, paving the way for their application in drug discovery. The article also provides a table summarizing the characteristics and functions of different UNas. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

19. Platelet Biorheology and Mechanobiology in Thrombosis and Hemostasis: Perspectives from Multiscale Computation.

Author: Tuna, Rukiye, Yi, Wenjuan, Crespo Cruz, Esmeralda, Romero, JP, Ren, Yi, Guan, Jingjiao, Li, Yan, Deng, Yuefan, Bluestein, Danny, Liu, Zixiang Leonardo, and Sheriff, Jawaad
Subjects: *RHEOLOGY (Biology), *COMPUTATIONAL biology, *BLOOD platelet activation, *HEMOSTASIS, *BLOOD platelets, *BLOOD platelet aggregation
Abstract: Thrombosis is pathological clot formation under abnormal hemodynamic conditions, which can result in vascular obstruction, causing ischemic strokes and myocardial infarction. Thrombus growth under moderate to low shear (<1000 s⁻¹) relies on platelet activation and coagulation. Thrombosis at elevated shear rates (>10,000 s⁻¹) is predominantly driven by unactivated platelet binding and aggregation mediated by von Willebrand factor (VWF), while platelet activation and coagulation are secondary in supporting and reinforcing the thrombus. Given the molecular and cellular level information it can access, multiscale computational modeling informed by biology can provide new pathophysiological mechanisms that are otherwise not accessible experimentally, holding promise for novel first-principle-based therapeutics. In this review, we summarize the key aspects of platelet biorheology and mechanobiology, focusing on the molecular and cellular scale events and how they build up to thrombosis through platelet adhesion and aggregation in the presence or absence of platelet activation. In particular, we highlight recent advancements in multiscale modeling of platelet biorheology and mechanobiology and how they can lead to the better prediction and quantification of thrombus formation, exemplifying the exciting paradigm of digital medicine. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Self-controlled in silico gene knockdown strategies to enhance the sustainable production of heterologous terpenoid by Saccharomyces cerevisiae.

Author: Zhang, Na, Li, Xiaohan, Zhou, Qiang, Zhang, Ying, Lv, Bo, Hu, Bing, and Li, Chun
Subjects: *SUSTAINABILITY, *SACCHAROMYCES cerevisiae, *OPTIMIZATION algorithms, *GENE regulatory networks, *COMPUTATIONAL biology, *SYNTHETIC biology, *BIOENGINEERING
Abstract: In recent decades, microbial bioengineering has become a growing field for producing plant natural products (PNPs) using heterologous metabolic pathways in host cells. Once heterologous metabolic pathways have been introduced into host cells, traditional metabolic engineering techniques are employed to enhance the productivity and yield of PNP biosynthetic routes, as well as to manage competing pathways. The advent of computational biology has marked the beginning of a novel epoch in strain design through in silico methods. These methods utilize genome-scale metabolic models (GEMs) and flux optimization algorithms to facilitate rational design across the entire cellular metabolic network. However, the implementation of in silico strategies can often result in an uneven distribution of metabolic fluxes due to the rigid knocking out of endogenous genes, which can impede cell growth and ultimately impact the accumulation of target products. In this study, we creatively utilized synthetic biology to refine in silico strain design for efficient PNP production. OptKnock simulation was performed on the GEM of Saccharomyces cerevisiae OA07, an engineered strain for oleanolic acid (OA) bioproduction that has been reported previously. The simulation predicted that the single deletion of fol1, fol2, fol3, abz1, and abz2, or a combined knockout of hfd1, ald2, and ald3 could improve its OA production. Consequently, strains EK1 to EK7 were constructed and cultivated. EK3 (OA07Δfol3), EK5 (OA07Δabz1), and EK6 (OA07Δabz2) had significantly higher OA titers in a batch cultivation compared to the original strain OA07. However, these increases were less pronounced in the fed-batch mode, indicating that gene deletion did not support sustainable OA production.
To address this, we designed a negative feedback circuit regulated by malonyl-CoA, a growth-associated intermediate whose synthesis served as a bypass to OA synthesis, at fol3, abz1, abz2, and at the acetyl-CoA carboxylase-encoding gene acc1, to dynamically and autonomously regulate the expression of these genes in OA07. The constructed strains R_3A, R_5A and R_6A had significantly higher OA titers than the initial strain and the corresponding gene-knockout mutants in either batch or fed-batch culture modes. Among them, strain R_3A stood out with the highest OA titer reported to date. Its OA titer was double that of the initial strain in the flask-level fed-batch cultivation, and reached 1.23 ± 0.04 g L⁻¹ in 96 h in the fermenter-level fed-batch mode. This indicated that integrating optimization algorithms with synthetic biology approaches is an efficient and rational route to PNP-producing strain design. • We developed a semi-rational strain design method for PNP overproduction in microbes. • The flux optimization algorithm and gene circuit design approach were jointly used. • The 3 constructed S. cerevisiae variants were sustainably efficient in OA production. • Strain R_3A with the self-regulated suppression of fol3 improved OA production to 1.23 g L⁻¹. • Illustration of novel ideas to compromise host cell growth and the heterologous production. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Dysregulated microRNAs in prostate cancer: In silico prediction and in vitro validation.

Author: Rezaei, Samaneh, Najaf Abadi, Mohammad Hasan Jafari, Bazyari, Mohammad Javad, Jalili, Amin, Oskuee, Reza Kazemi, and Aghaee-Bakhtiari, Seyed Hamid
Subjects: *MICRORNA, *BIOINFORMATICS, *PROSTATE cancer, *GENE expression, *COMPUTATIONAL biology
Abstract: Objective(s): MicroRNAs, which are micro-coordinators of gene expression, have been recently investigated as a potential treatment for cancer. The study used computational techniques to identify microRNAs that could target a set of genes simultaneously. Due to their multi-target-directed nature, microRNAs have the potential to impact multiple key pathways and their pathogenic cross-talk. Materials and Methods: We identified microRNAs that target a prostate cancer-associated gene set using integrated bioinformatics analyses and experimental validation. The candidate gene set included genes targeted by clinically approved prostate cancer medications. We used STRING, GO, and KEGG web tools to confirm gene-gene interactions and their clinical significance. Then, we employed integrated predicted and validated bioinformatics approaches to retrieve hsa-miR-124-3p, 16-5p, and 27a-3p as the top three relevant microRNAs. KEGG and DIANA-miRPath showed the related pathways for the candidate genes and microRNAs. Results: The real-time PCR results showed that miR-16-5p simultaneously down-regulated all genes significantly except for PIK3CA/CB in LNCaP; miR-27a-3p simultaneously down-regulated all genes significantly, excluding MET in LNCaP and PIK3CA in PC-3; and miR-124-3p did not significantly down-regulate PIK3CB, MET, and FGFR4 in LNCaP and FGFR4 in PC-3. Finally, we used a cell cycle assay to show significant G0/G1 arrest by transfecting miR-124-3p in LNCaP and miR-16-5p in both cell lines. Conclusion: Our findings suggest that this novel approach may have therapeutic benefits and these predicted microRNAs could effectively target the candidate genes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. Machine Learning Strategies in MicroRNA Research: Bridging Genome to Phenome.

Author: Daniel Thomas, Sonet, Vijayakumar, Krithika, John, Levin, Krishnan, Deepak, Rehman, Niyas, Revikumar, Amjesh, Kandel Codi, Jalaluddin Akbar, Prasad, Thottethodi Subrahmanya Keshava, S.S., Vinodchandra, and Raju, Rajesh
Subjects: *CIRCULAR RNA, *GENOMICS, *LEARNING strategies, *MACHINE learning, *GENE expression, *GENETIC regulation
Abstract: MicroRNAs (miRNAs) have emerged as a prominent layer of regulation of gene expression. This article offers the salient and current aspects of machine learning (ML) tools and approaches from genome to phenome in miRNA research. First, we underline that the complexity in the analysis of miRNA function ranges from their modes of biogenesis to the target diversity in diverse biological conditions. Therefore, it is imperative to first ascertain the miRNA coding potential of genomes and understand the regulatory mechanisms of their expression. This knowledge enables the efficient classification of miRNA precursors and the identification of their mature forms and respective target genes. Second, because one miRNA can target multiple mRNAs and vice versa, another challenge is the assessment of the miRNA-mRNA target interaction network. Furthermore, long noncoding RNAs (lncRNAs) and circular RNAs (circRNAs) also contribute to this complexity. ML has been used to tackle these challenges at the high-dimensional data level. The present expert review covers more than 100 tools adopting various ML approaches pertaining to, for example, (1) miRNA promoter prediction, (2) precursor classification, (3) mature miRNA prediction, (4) miRNA target prediction, (5) miRNA-lncRNA and miRNA-circRNA interactions, (6) miRNA-mRNA expression profiling, (7) miRNA regulatory module detection, (8) miRNA-disease association, and (9) miRNA essentiality prediction. Taken together, we unpack, critically examine, and highlight the cutting-edge synergy of ML approaches and miRNA research so as to develop a dynamic and microlevel understanding of human health and diseases. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

23. miRSNP rs188493331: A key player in genetic control of microRNA‐induced pathway activation in hypertrophic scars and keloids.

Author: Chen, Meiqing, Pan, Yuyan, Chen, Zhiwei, Qi, Fazhi, Gu, Jianying, Qiu, Yangyang, He, Anqi, and Liu, Jiaqi
Subjects: *HYPERTROPHIC scars, *COMPUTATIONAL biology, *GENE expression, *KELOIDS, *FOCAL adhesions, *PI3K/AKT pathway, *PROTEOGLYCANS
Abstract: Background: Our study aims to delineate the miRSNP–microRNA–gene–pathway interactions in the context of hypertrophic scars (HS) and keloids. Materials and Methods: We performed a computational biology study involving differential expression analysis to identify genes and their mRNAs in HS and keloid tissues compared to normal skin, identifying key hub genes and enriching their functional roles, comprehensively analyzing microRNA‐target genes and related signaling pathways through bioinformatics, identifying miRSNPs, and constructing a pathway‐based network to illustrate miRSNP‐miRNA‐gene‐signaling pathway interactions. Results: Our results revealed a total of 429 hub genes, with a strong enrichment in signaling pathways related to proteoglycans in cancer, focal adhesion, TGF‐β, PI3K/Akt, and EGFR tyrosine kinase inhibitor resistance. Particularly noteworthy was the substantial crosstalk between the focal adhesion and PI3K/Akt signaling pathways, making them more susceptible to regulation by microRNAs. We also identified specific miRNAs, including miRNA‐1279, miRNA‐429, and miRNA‐302e, which harbored multiple SNP loci, with miRSNPs rs188493331 and rs78979933 exerting control over a significant number of miRNA target genes. Furthermore, we observed that miRSNP rs188493331 shared a location with microRNA302e, microRNA202a‐3p, and microRNA20b‐5p, and these three microRNAs collectively targeted the gene LAMA3, which is integral to the focal adhesion signaling pathway. Conclusions: The study successfully unveils the complex interactions between miRSNPs, miRNAs, genes, and signaling pathways, shedding light on the genetic factors contributing to HS and keloid formation. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

24. Optimized model architectures for deep learning on genomic data.

Author: Gündüz, Hüseyin Anil, Mreches, René, Moosbauer, Julia, Robertson, Gary, To, Xiao-Yin, Franzosa, Eric A., Huttenhower, Curtis, Rezaei, Mina, McHardy, Alice C., Bischl, Bernd, Münch, Philipp C., and Binder, Martin
Subjects: *DEEP learning, *ARCHITECTURAL design, *COMPUTER vision, *COMPUTATIONAL biology, *VISUAL fields
Abstract: The success of deep learning in various applications depends on task-specific architecture design choices, including the types, hyperparameters, and number of layers. In computational biology, there is no consensus on the optimal architecture design, and decisions are often made using insights from more well-established fields such as computer vision. These may not consider the domain-specific characteristics of genome sequences, potentially limiting performance. Here, we present GenomeNet-Architect, a neural architecture design framework that automatically optimizes deep learning models for genome sequence data. It optimizes the overall layout of the architecture, with a search space specifically designed for genomics. Additionally, it optimizes hyperparameters of individual layers and the model training procedure. On a viral classification task, GenomeNet-Architect reduced the read-level misclassification rate by 19%, with 67% faster inference and 83% fewer parameters, and achieved similar contig-level accuracy with ~100 times fewer parameters compared to the best-performing deep learning baselines. Introducing GenomeNet-Architect, a neural architecture design framework that automatically optimises the overall layout of the architecture, the hyperparameters, and the training procedure of deep learning models for genome sequence data. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. Biology System Description Language (BiSDL): a modeling language for the design of multicellular synthetic biological systems.

Author: Giannantoni, Leonardo, Bardini, Roberta, Savino, Alessandro, and Di Carlo, Stefano
Subjects: *SYNTHETIC biology, *BIOLOGICAL systems, *SYSTEMS biology, *DEVELOPMENTAL biology, *CONCEPTUAL design, *BIOLOGISTS
Abstract: Background: The Biology System Description Language (BiSDL) is an accessible, easy-to-use computational language for multicellular synthetic biology. It allows synthetic biologists to represent spatiality and multi-level cellular dynamics inherent to multicellular designs, filling a gap in the state of the art. Developed for designing and simulating spatial, multicellular synthetic biological systems, BiSDL integrates high-level conceptual design with detailed low-level modeling, fostering collaboration in the Design-Build-Test-Learn cycle. BiSDL descriptions directly compile into Nets-Within-Nets (NWNs) models, offering a unique approach to spatial and hierarchical modeling in biological systems. Results: BiSDL's effectiveness is showcased through three case studies on complex multicellular systems: a bacterial consortium, a synthetic morphogen system and a conjugative plasmid transfer process. These studies highlight the BiSDL proficiency in representing spatial interactions and multi-level cellular dynamics. The language facilitates the compilation of conceptual designs into detailed, simulatable models, leveraging the NWNs formalism. This enables intuitive modeling of complex biological systems, making advanced computational tools more accessible to a broader range of researchers. Conclusions: BiSDL represents a significant step forward in computational languages for synthetic biology, providing a sophisticated yet user-friendly tool for designing and simulating complex biological systems with an emphasis on spatiality and cellular dynamics. Its introduction has the potential to transform research and development in synthetic biology, allowing for deeper insights and novel applications in understanding and manipulating multicellular systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. Nucleoside‐Based Drug Target with General Antimicrobial Screening and Specific Computational Studies against SARS‐CoV‐2 Main Protease.

Author: Kawsar, Sarkar M. A., Hossain, Md. Ahad, Saha, Supriyo, Abdallah, Emad M., Bhat, Ajmal R., Ahmed, Sumeer, Jamalis, Joazaizulfazli, and Ozeki, Yasuhiro
Subjects: *SARS-CoV-2, *PROTEOLYTIC enzymes, *DRUG target, *EMERGING infectious diseases, *ENZYME kinetics
Abstract: This review article aims to significantly advance the scientific community's efforts to develop effective nucleoside‐based drugs for treating severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) and other emerging infectious diseases. This study concentrates on the main viral protease (Mpro) and explores nucleoside‐based compounds as potential therapeutic agents. This investigation examined the impact of acylation‐induced modifications on the nucleoside hydroxyl group and subsequent properties. Nucleoside analogs, which are recognized for their diverse biochemical properties, were synthesized and rigorously screened to evaluate their antimicrobial efficacy. In the domain of pharmaceutical research, computational pharmacokinetics has emerged as a critical tool, especially in the pursuit of nucleoside analogs as potential therapeutics. In silico methods aid in predicting pharmacokinetic traits, interactions with crucial enzymes, and the stability of these analogs in biological environments, thereby streamlining drug design and reducing experimental costs. Concurrently, computational studies revealed the intricate interactions between the analogs and the active site of the main protease. The amalgamation of experimental screening and computational insights underscores the emergence of potent nucleoside candidates with inhibitory activity against SARS‐CoV‐2 Mpro. Additionally, this review integrates computational studies that provide valuable insights into the interactions between nucleoside analogs and the main protease of SARS‐CoV‐2. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Clusters of grapevine genes for a burning world.

Author: Coupel‐Ledru, Aude, Westgeest, Adrianus J., Albasha, Rami, Millan, Mathilde, Pallas, Benoît, Doligez, Agnès, Flutre, Timothée, Segura, Vincent, This, Patrice, Torregrosa, Laurent, Simonneau, Thierry, and Pantin, Florent
Subjects: *VITIS vinifera, *PLANT molecular biology, *GRAPES, *BOTANY, *ABSCISIC acid, *LOCUS (Genetics), *COMPUTATIONAL biology, *BOTANICAL chemistry
Abstract: A study published in the New Phytologist journal examines the genetic diversity of grapevines and their ability to withstand extreme heatwaves caused by climate change. The researchers conducted experiments on a grapevine diversity panel in South France during a record heatwave in 2019. They discovered that certain genomic regions were linked to heat tolerance, suggesting that genetic diversity could be used to breed fruit crops that can withstand heatwaves. The study also investigated the role of leaf size, leaf mass per area, and evaporative cooling in heat tolerance. The researchers identified candidate genes that may contribute to heat tolerance in grapevines. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

28. A novel professional‐use synergistic peel technology to reduce visible hyperpigmentation on face: Clinical evidence and mechanistic understanding by computational biology and optical biopsy.

Author: Bhardwaj, Vinay, Handler, Marc Zachary, Mao, Junhong, Azadegan, Chloe, Panda, Pritam K., Breunig, Hans Georg, Wenskus, Isabell, Diaz, Isabel, and König, Karsten
Subjects: *COMPUTATIONAL biology, *CHEMICAL peel, *HYPERPIGMENTATION, *CONFOCAL microscopy, *BIOPSY, *PHENOL oxidase
Abstract: Topicals and chemical peels are the standard of care for management of facial hyperpigmentation. However, traditional therapies have come under recent scrutiny: topical hydroquinone (HQ) faces some regulatory restrictions, and high‐concentration trichloroacetic acid (TCA) peels pose a risk in patients with skin of colour. The objective of our research was to identify, investigate and elucidate the mechanism of action of a novel TCA‐ and HQ‐free professional‐use chemical peel to manage common types of facial hyperpigmentation. Using computational modelling and in vitro assays on tyrosinase, we identified proprietary multi‐acid synergistic technology (MAST). After a single application on human skin explants, MAST peel was found to be more effective than a commercial HQ peel in inhibiting melanin (histochemical imaging and gene expression). All participants completed the case study (N = 9) without any adverse events. After administration of the MAST peel by a dermatologist, the scoring and VISIA photography reported improvements in hyperpigmentation, texture and erythema, which could be linked to underlying pathophysiological changes in skin after peeling, visualized by non‐invasive optical biopsy of the face. Using reflectance confocal microscopy (VivaScope®) and multiphoton tomography (MPTflex™), we observed a reduction in melanin, an increase in metabolic activity of keratinocytes, and no signs of inflammatory cells after peeling. Subsequent swabbing of the cheek skin found no microbiota dysbiosis resulting from the chemical peel. The strong efficacy with minimum downtime and no adverse events could be linked to the synergistic action of the ingredients in the novel HQ‐ and TCA‐free professional peel technology. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

29. Rescue of Mycobacterium bovis DNA Obtained from Cultured Samples during Official Surveillance of Animal TB: Key Steps for Robust Whole Genome Sequence Data Generation.

Author: Pinto, Daniela, Themudo, Gonçalo, Pereira, André C., Botelho, Ana, and Cunha, Mónica V.
Subjects: *MYCOBACTERIUM bovis, *WHOLE genome sequencing, *IDENTIFICATION of animals, *DNA, *MIXED infections, *COMPUTATIONAL biology
Abstract: Epidemiological surveillance of animal tuberculosis (TB) based on whole genome sequencing (WGS) of Mycobacterium bovis has recently gained traction due to its high resolution to identify infection sources, characterize the pathogen population structure, and facilitate contact tracing. However, the workflow from bacterial isolation to sequence data analysis has several technical challenges that may severely impact the power to understand the epidemiological scenario and inform outbreak response. While trying to use archived DNA from cultured samples obtained during routine official surveillance of animal TB in Portugal, we struggled against three major challenges: the low amount of M. bovis DNA obtained from routinely processed animal samples; the lack of purity of M. bovis DNA, i.e., high levels of contamination with DNA from other organisms; and the co-occurrence of more than one M. bovis strain per sample (within-host mixed infection). The loss of isolated genomes generates missed links in transmission chain reconstruction, hampering the biological and epidemiological interpretation of data as a whole. Upon identification of these challenges, we implemented an integrated solution framework based on whole genome amplification and a dedicated computational pipeline to minimize their effects and recover as many genomes as possible. With the approaches described herein, we were able to recover 62 out of 100 samples that would have otherwise been lost. Based on these results, we discuss adjustments that should be made in official and research laboratories to facilitate the sequential implementation of bacteriological culture, PCR, downstream genomics, and computational-based methods, all within a time frame supporting data-driven intervention. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

30. PLEACH: a new heuristic algorithm for pure parsimony haplotyping problem.

Author: Feizabadi, Reza, Bagherian, Mehri, Vaziri, Hamidreza, and Salahi, Maziar
Subjects: *MIXED integer linear programming, *PARSIMONIOUS models, *HAPLOTYPES, *COMPUTATIONAL biology, *HEURISTIC algorithms, *HEART abnormalities
Abstract: Haplotype inference is an important issue in computational biology due to its various applications in diagnosing and treating genetic diseases such as diabetes, Alzheimer's disease, and heart defects. There are different criteria for choosing a solution from the alternatives. Parsimony is one of the most important criteria, and under it the problem is known as the Pure Parsimony Haplotyping (PPH) problem. The approaches to solving PPH are classified into two groups: exact and non-exact. The exact approaches often model the problem as a Mixed Integer Linear Programming (MILP) problem. Although these models generate the optimal solution for small instances in a reasonable time, they are ineffective in solving very large instances because the PPH problem is NP-hard. This deficiency is compensated for by non-exact algorithms. In this paper, we present a non-exact algorithm for large instances of the PPH problem based on the divide-and-conquer technique. This algorithm first divides the problem into small sub-problems, which are solved by one of the previous exact approaches, and finally the solutions of the sub-problems are combined by solving an MILP. The MILPs that arise in solving the sub-problems and in combining the solutions are so small that they are solved rapidly. The performance of this algorithm has been evaluated by implementing it on real and simulated instances and comparing it with two well-known methods, PHASE and WinHap2. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

31. Learning the structure of the mTOR protein signaling pathway from protein phosphorylation data.

Author: Salam, Abdul and Grzegorczyk, Marco
Subjects: *MTOR protein, *PROTEIN structure, *CELLULAR signal transduction, *SYSTEMS biology, *COMPUTATIONAL biology, *POLYMER networks
Abstract: Statistical learning of the structures of cellular networks, such as protein signaling pathways, is a topical research field in computational systems biology. To get the most information out of experimental data, it is often required to develop a tailored statistical approach rather than applying one of the off-the-shelf network reconstruction methods. The focus of this paper is on learning the structure of the mTOR protein signaling pathway from immunoblotting protein phosphorylation data. Under two experimental conditions eleven phosphorylation sites of eight key proteins of the mTOR pathway were measured at ten non-equidistant time points. For the statistical analysis we propose a new advanced hierarchically coupled non-homogeneous dynamic Bayesian network (NH-DBN) model, and we consider various data imputation methods for dealing with non-equidistant temporal observations. Because of the absence of a true gold standard network, we propose to use predictive probabilities in combination with a leave-one-out cross validation strategy to objectively cross-compare the accuracies of different NH-DBN models and data imputation methods. Finally, we employ the best combination of model and data imputation method for predicting the structure of the mTOR protein signaling pathway. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

32. How much can ChatGPT really help computational biologists in programming?

Author: Rahman, Chowdhury Rafeed and Wong, Limsoon
Subjects: *CHATGPT, *MACHINE learning, *CHATBOTS, *BIOLOGISTS, *COMPUTATIONAL biology, *COMPUTER science
Abstract: ChatGPT, a recently developed product by OpenAI, is successfully leaving its mark as a multi-purpose natural language based chatbot. In this paper, we are more interested in analyzing its potential in the field of computational biology. A major share of the work done by computational biologists these days involves coding up bioinformatics algorithms, analyzing data, creating pipelining scripts, and even machine learning modeling and feature extraction. This paper focuses on the potential influence (both positive and negative) of ChatGPT in the mentioned aspects with illustrative examples from different perspectives. Compared to other fields of computer science, computational biology has (1) fewer coding resources, (2) more sensitivity and bias issues (it deals with medical data), and (3) more need for coding assistance (people from diverse backgrounds come to this field). Keeping such issues in mind, we cover use cases such as code writing, reviewing, debugging, converting, refactoring, and pipelining using ChatGPT from the perspective of computational biologists in this paper. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Approaches for studying human macrophages.

Author: Bao, Yuzhou, Wang, Guanlin, and Li, Hanjie
Subjects: *MACROPHAGES, *HUMAN biology, *COMPUTATIONAL biology, *YOLK sac, *FETAL development
Abstract: Differences and similarities between human and mouse macrophages call for more human-centric studies. Hematopoiesis in the yolk sac, aorta–gonad–mesonephros (AGM) region, and fetal liver contributes to human macrophage ontogeny during prenatal development. Due to technical limitations, studying human macrophage biology is still in its infancy; however, studies can be accelerated by combining cutting-edge approaches that include single-cell methods and analyses, human organoid platforms, novel animal models, and computational biology. Our knowledge of mouse macrophages has progressed significantly over the past two decades, whereas our understanding of human macrophages is still in its infancy. Recent studies are unveiling the ontogeny and function of human macrophages. Cutting-edge approaches can be used to examine the diversity, development, niche, and functions of human tissue-resident macrophages. These methodologies can facilitate analyses of human macrophages and lay the groundwork for new therapeutic endeavors in macrophage-relevant disorders. Macrophages are vital tissue components involved in organogenesis, maintaining homeostasis, and responses to disease. Mouse models have significantly improved our understanding of macrophages. Further investigations into the characteristics and development of human macrophages are crucial, considering the substantial anatomical and physiological distinctions between mice and humans. Despite challenges in human macrophage research, recent studies are shedding light on the ontogeny and function of human macrophages. In this opinion, we propose combinations of cutting-edge approaches to examine the diversity, development, niche, and function of human tissue-resident macrophages. These methodologies can facilitate our exploration of human macrophages more efficiently, ideally providing new therapeutic avenues for macrophage-relevant disorders. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Combining machine learning with structure-based protein design to predict and engineer post-translational modifications of proteins.

Author: Ertelt, Moritz, Mulligan, Vikram Khipple, Maguire, Jack B., Lyskov, Sergey, Moretti, Rocco, Schiffner, Torben, Meiler, Jens, and Schoeder, Clara T.
Subjects: *POST-translational modification, *PROTEIN engineering, *MACHINE learning, *ENGINEERING design, *COMPUTATIONAL biology, *ARTIFICIAL neural networks, *COMPUTATIONAL neuroscience
Abstract: Post-translational modifications (PTMs) of proteins play a vital role in their function and stability. These modifications influence protein folding, signaling, protein-protein interactions, enzyme activity, binding affinity, aggregation, degradation, and much more. To date, over 400 types of PTMs have been described, representing chemical diversity well beyond the genetically encoded amino acids. Such modifications pose a challenge to the successful design of proteins, but also represent a major opportunity to diversify the protein engineering toolbox. To this end, we first trained artificial neural networks (ANNs) to predict eighteen of the most abundant PTMs, including protein glycosylation, phosphorylation, methylation, and deamidation. In a second step, these models were implemented inside the computational protein modeling suite Rosetta, which allows flexible combination with existing protocols to model the modified sites and understand their impact on protein stability as well as function. Lastly, we developed a new design protocol that either maximizes or minimizes the predicted probability of a particular site being modified. We find that this combination of ANN prediction and structure-based design can enable the modification of existing, as well as the introduction of novel, PTMs. The potential applications of our work include, but are not limited to, glycan masking of epitopes, strengthening protein-protein interactions through phosphorylation, as well as protecting proteins from deamidation liabilities. These applications are especially important for the design of new protein therapeutics where PTMs can drastically change the therapeutic properties of a protein. Our work adds novel tools to Rosetta's protein engineering toolbox that allow for the rational design of PTMs. 
Author summary: Machine learning (ML) is changing the world of protein design, from structure prediction methods like AlphaFold to fixed-backbone design methods like ProteinMPNN. ML methods have made much progress in various aspects of protein computational biology, both complementing and, in some cases, surpassing traditional macromolecular modeling methods such as those combined in libraries like the Rosetta software suite. However, a lack of compatibility and flexibility can hinder interoperability with existing methods, preventing the full potential of these new solutions from being realized. Here, we first present a new machine learning tool for predicting post-translational modifications (PTMs), which play an important role in the stability and function of proteins, and then highlight how the implementation of this tool in the existing Rosetta toolbox can facilitate new applications. To this end, we combine PTM prediction with protein design, maximizing or minimizing the predicted probability of a post-translational modification occurring at a specific site. As one example, we predict the N-linked glycosylation of influenza hemagglutinin, which has applications in both understanding the evolution of viral strains over time, and engineering additional glycosylation sites to mask unwanted epitopes of vaccine candidates. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. xCAPT5: protein–protein interaction prediction using deep and wide multi-kernel pooling convolutional neural networks with protein language model.

Author: Dang, Thanh Hai and Vu, Tien Anh
Subjects: *CONVOLUTIONAL neural networks, *LANGUAGE models, *NERVE tissue proteins, *PROTEIN models, *COMPUTATIONAL biology, *PROTEIN-protein interactions, *FUMONISINS
Abstract: Background: Predicting protein–protein interactions (PPIs) from sequence data is a key challenge in computational biology. While various computational methods have been proposed, the utilization of sequence embeddings from protein language models, which contain diverse information, including structural, evolutionary, and functional aspects, has not been fully exploited. Additionally, there is a significant need for a comprehensive neural network capable of efficiently extracting these multifaceted representations. Results: Addressing this gap, we propose xCAPT5, a novel hybrid classifier that uniquely leverages the T5-XL-UniRef50 protein large language model for generating rich amino acid embeddings from protein sequences. The core of xCAPT5 is a multi-kernel deep convolutional siamese neural network, which effectively captures intricate interaction features at both micro and macro levels, integrated with the XGBoost algorithm, enhancing PPIs classification performance. By concatenating max and average pooling features in a depth-wise manner, xCAPT5 effectively learns crucial features with low computational cost. Conclusion: This study represents one of the initial efforts to extract informative amino acid embeddings from a large protein language model using a deep and wide convolutional network. Experimental results show that xCAPT5 outperforms recent state-of-the-art methods in binary PPI prediction, excelling in cross-validation on several benchmark datasets and demonstrating robust generalization across intra-species, cross-species, inter-species, and stringent similarity contexts. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. In silico prospection of receptors associated with the biological activity of U1-SCTRX-lg1a: an antimicrobial peptide isolated from the venom of Loxosceles gaucho.

Author: de Oliveira, André Souza, Muniz Seif, Elias Jorge, and da Silva Junior, Pedro Ismael
Subjects: *ANTIMICROBIAL peptides, *SPIDER venom, *PEPTIDE antibiotics, *LOXOSCELES, *VENOM, *PHOSPHOLIPASE D, *MOLECULAR dynamics
Abstract: The emergence of antibiotic-resistant pathogens generates impairment to human health. U1-SCTRX-lg1a is a peptide isolated from a phospholipase D extracted from the spider venom of Loxosceles gaucho with antimicrobial activity against Gram-negative bacteria (between 1.15 and 4.6 μM). The aim of this study was to suggest potential receptors associated with the antimicrobial activity of U1-SCTRX-lg1a using in silico bioinformatics tools. The search for potential targets of U1-SCRTX-lg1a was performed using the PharmMapper server. Molecular docking between U1-SCRTX-lg1a and the receptor was performed using PatchDock software. The prediction of ligand sites for each receptor was conducted using the PDBSum server. Chimera 1.6 software was used to perform molecular dynamics simulations only for the best dock score receptor. In addition, U1-SCRTX-lg1a and native ligand interactions were compared using AutoDock Vina software. Finally, predicted interactions were compared with the ligand site previously described in the literature. The bioprospecting of U1-SCRTX-lg1a resulted in the identification of three hundred (300) diverse targets (Table S1), forty-nine (49) of which were intracellular proteins originating from Gram-negative microorganisms (Table S2). Docking results indicate Scores (10,702 to 6066), Areas (1498.70 to 728.40) and ACEs (417.90 to – 152.8) values. Among these, NAD + NH3-dependent synthetase (PDB ID: 1wxi) showed a dock score of 9742, area of 1223.6 and ACE of 38.38 in addition to presenting a Normalized Fit score of 8812 on PharmMapper server. Analysis of the interaction of ligands and receptors suggests that the peptide derived from brown spider venom can interact with residues SER48 and THR160. Furthermore, the C terminus (– 7.0 score) has greater affinity for the receptor than the N terminus (– 7.7 score). 
The molecular dynamics assay showed a free energy value for the protein complex of – 214,890.21 kJ/mol, whereas with rigid docking this value was – 29.952.8, suggesting that after the molecular dynamics simulation the complex exhibits a more favorable energy value compared to the previous state. The in silico bioprospecting of receptors suggests that U1-SCRTX-lg1a may interfere with NAD + production in Escherichia coli, a Gram-negative bacterium, altering the homeostasis of the microorganism and impairing growth. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. Flipped C-Terminal Ends of APOA1 Promote ABCA1-Dependent Cholesterol Efflux by Small HDLs.

Author: Yi He, Pavanello, Chiara, Hutchins, Patrick M., Chongren Tang, Pourmousa, Mohsen, Vaisar, Tomas, Song, Hyun D., Pastor, Richard W., Remaley, Alan T., Goldberg, Ira J., Costacou, Tina, Davidson, W. Sean, Bornfeldt, Karin E., Calabresi, Laura, Segrest, Jere P., and Heinecke, Jay W.
Subjects: *CHOLESTERYL ester transfer protein, *APOLIPOPROTEIN A, *ATP-binding cassette transporters, *CHOLESTEROL, *HIGH density lipoproteins, *ION mobility, *MOLECULAR dynamics
Abstract: BACKGROUND: Cholesterol efflux capacity (CEC) predicts cardiovascular disease independently of high-density lipoprotein (HDL) cholesterol levels. Isolated small HDL particles are potent promoters of macrophage CEC by the ABCA1 (ATP-binding cassette transporter A1) pathway, but the underlying mechanisms are unclear. METHODS: We used model system studies of reconstituted HDL and plasma from control and lecithin-cholesterol acyltransferase (LCAT)-deficient subjects to investigate the relationships among the sizes of HDL particles, the structure of APOA1 (apolipoprotein A1) in the different particles, and the CECs of plasma and isolated HDLs. RESULTS: We quantified macrophage and ABCA1 CEC of 4 distinct sizes of reconstituted HDL. CEC increased as particle size decreased. Tandem mass spectrometric analysis of chemically cross-linked peptides and molecular dynamics simulations of APOA1, the major protein of HDL, indicated that the mobility of the C-terminus of that protein was markedly higher and flipped off the surface in the smallest particles. To explore the physiological relevance of the model system studies, we isolated HDL from LCAT-deficient subjects, whose small HDLs (like reconstituted HDLs) are discoidal and composed of APOA1, cholesterol, and phospholipid. Despite their very low plasma levels of HDL particles, these subjects had normal CEC. In both the LCAT-deficient subjects and control subjects, the CEC of isolated extra-small HDL (a mixture of extra-small and small HDL by calibrated ion mobility analysis) was 3- to 5-fold greater than that of the larger sizes of isolated HDL. Incubating LCAT-deficient plasma and control plasma with human LCAT converted extra-small and small HDL particles into larger particles, and it markedly inhibited CEC. CONCLUSIONS: We present a mechanism for the enhanced CEC of small HDLs. In smaller particles, the C-termini of the 2 antiparallel molecules of APOA1 are "flipped" off the lipid surface of HDL. 
This extended conformation allows them to engage with ABCA1. In contrast, the C-termini of larger HDLs are unable to interact productively with ABCA1 because they form a helical bundle that strongly adheres to the lipid on the particle. Enhanced CEC, as seen with the smaller particles, predicts decreased cardiovascular disease risk. Thus, extra-small and small HDLs may be key mediators and indicators of the cardioprotective effects of HDL. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. A ubiquitous GC content signature underlies multimodal mRNA regulation by DDX3X.

Author: Jowhar, Ziad, Xu, Albert, Venkataramanan, Srivats, Dossena, Francesco, Hoye, Mariah L, Silver, Debra L, Floor, Stephen N, and Calviello, Lorenzo
Subjects: *RNA regulation, *DEVELOPMENTAL neurobiology, *RNA-binding proteins, *GENETIC regulation, *STATISTICAL learning, *GENE expression, *MESSENGER RNA
Abstract: The road from transcription to protein synthesis is paved with many obstacles, allowing for several modes of post-transcriptional regulation of gene expression. A fundamental player in mRNA biology is DDX3X, an RNA binding protein that canonically regulates mRNA translation. By monitoring dynamics of mRNA abundance and translation following DDX3X depletion, we observe stabilization of translationally suppressed mRNAs. We use interpretable statistical learning models to uncover GC content in the coding sequence as the major feature underlying RNA stabilization. This result corroborates GC content-related mRNA regulation detectable in other studies, including hundreds of ENCODE datasets and recent work focusing on mRNA dynamics in the cell cycle. We provide further evidence for mRNA stabilization by detailed analysis of RNA-seq profiles in hundreds of samples, including a Ddx3x conditional knockout mouse model exhibiting cell cycle and neurogenesis defects. Our study identifies a ubiquitous feature underlying mRNA regulation and highlights the importance of quantifying multiple steps of the gene expression cascade, where RNA abundance and protein production are often uncoupled. Synopsis: Monitoring the dynamics of mRNA changes after DDX3X depletion indicates translation suppression followed by mRNA stabilization. GC content in the CDS is a strong predictor of mRNA stabilization and it is detectable in multiple transcriptomics datasets, with a potential link to cell-cycle regulation. RNA-seq and Ribo-seq on a time course of DDX3X depletion show regulation of translation and mRNA levels. Intron-exon count modeling and SLAM-seq demonstrate post-transcriptional mRNA stabilization. Random Forest and Lasso show GC content in the CDS (GCcds) as a main predictor of mRNA stabilization. ENCODE RBP knockdowns and in vivo datasets show widespread GCcds-dependent mRNA stabilization. 
[ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. Efficiency of Various Tiling Strategies for the Zuker Algorithm Optimization.

Author: Blaszynski, Piotr, Palkowski, Marek, Bielecki, Wlodzimierz, and Poliwoda, Maciej
Subjects: *DYNAMIC programming, *AFFINE transformations, *COMPILERS (Computer programs), *BIOINFORMATICS software, *ENERGY consumption, *COMPUTATIONAL biology, *PLUTO (Dwarf planet)
Abstract: This paper focuses on optimizing the Zuker RNA folding algorithm, a bioinformatics task with non-serial polyadic dynamic programming and non-uniform loop dependencies. The intricate dependence pattern is represented using affine formulas, enabling the automatic application of tiling strategies via the polyhedral method. Three source-to-source compilers—PLUTO, TRACO, and DAPT—are employed, utilizing techniques such as affine transformations, the transitive closure of dependence relation graphs, and space–time tiling to generate cache-efficient codes, respectively. A dedicated transpose code technique for non-serial polyadic dynamic programming codes is also examined. The study evaluates the performance of these optimized codes for speed-up and scalability on multi-core machines and explores energy efficiency using RAPL. The paper provides insights into related approaches and outlines future research directions within the context of bioinformatics algorithm optimization. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. New classifications for quantum bioinformatics: Q-bioinformatics, QCt-bioinformatics, QCg-bioinformatics, and QCr-bioinformatics.

Author: Mokhtari, Majid, Khoshbakht, Samane, Ziyaei, Kobra, Akbari, Mohammad Esmaeil, and Moravveji, Sayyed Sajjad
Subjects: *BIOMECHANICS, *QUANTUM biochemistry, *MOLECULAR biology, *QUANTUM mechanics, *BIOINFORMATICS, *COMPUTATIONAL biology, *PROTEIN folding
Abstract: Bioinformatics has revolutionized biology and medicine by using computational methods to analyze and interpret biological data. Quantum mechanics has recently emerged as a promising tool for the analysis of biological systems, leading to the development of quantum bioinformatics. This new field employs the principles of quantum mechanics, quantum algorithms, and quantum computing to solve complex problems in molecular biology, drug design, and protein folding. However, the intersection of bioinformatics, biology, and quantum mechanics presents unique challenges. One significant challenge is the possibility of confusion among scientists between quantum bioinformatics and quantum biology, which have similar goals and concepts. Additionally, the diverse calculations in each field make it difficult to establish boundaries and identify purely quantum effects from other factors that may affect biological processes. This review provides an overview of the concepts of quantum biology and quantum mechanics and their intersection in quantum bioinformatics. We examine the challenges and unique features of this field and propose a classification of quantum bioinformatics to promote interdisciplinary collaboration and accelerate progress. By unlocking the full potential of quantum bioinformatics, this review aims to contribute to our understanding of quantum mechanics in biological systems. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. Prediction of cancer driver genes and mutations: the potential of integrative computational frameworks.

Author: Nourbakhsh, Mona, Degn, Kristine, Saksager, Astrid, Tiberti, Matteo, and Papaleo, Elena
Subjects: *CANCER genes, *GENETIC mutation, *SINGLE nucleotide polymorphisms, *COMPUTER software developers, *RESEARCH personnel, *COMPUTATIONAL neuroscience, *COMPUTATIONAL biology
Abstract: The vast amount of available sequencing data allows the scientific community to explore different genetic alterations that may drive cancer or favor cancer progression. Software developers have proposed a myriad of predictive tools, allowing researchers and clinicians to compare and prioritize driver genes and mutations and their relative pathogenicity. However, there is little consensus on the computational approach or a gold standard for comparison. Hence, benchmarking the different tools depends highly on the input data, indicating that overfitting is still a massive problem. One of the solutions is to limit the scope and usage of specific tools. However, such limitations force researchers to walk a tightrope between creating and using high-quality tools for a specific purpose and describing the complex alterations driving cancer. While the knowledge of cancer development increases daily, many bioinformatic pipelines rely on single nucleotide variants or alterations in a vacuum without accounting for cellular compartments, mutational burden or disease progression. Even within bioinformatics and computational cancer biology, the research fields work in silos, risking overlooking potential synergies or breakthroughs. Here, we provide an overview of databases and datasets for building or testing predictive cancer driver tools. Furthermore, we introduce predictive tools for driver genes, driver mutations, and the impact of these based on structural analysis. Additionally, we suggest and recommend directions in the field to avoid silo-research, moving towards integrative frameworks. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

42. scRNA-seq Reveals Novel Genetic Pathways and Sex Chromosome Regulation in Tribolium Spermatogenesis.

Author: Robben, Michael, Ramesh, Balan, Pau, Shana, Meletis, Demetra, Luber, Jacob, and Demuth, Jeffery
Subjects: *SEX chromosomes, *RED flour beetle, *SPERMATOGENESIS, *TRIBOLIUM, *X chromosome, *BEETLES, *SPERMATOZOA
Abstract: Spermatogenesis is critical to sexual reproduction yet evolves rapidly in many organisms. High-throughput single-cell transcriptomics promises unparalleled insight into this important process but understanding can be impeded in nonmodel systems by a lack of known genes that can reliably demarcate biologically meaningful cell populations. Tribolium castaneum, the red flour beetle, lacks known markers for spermatogenesis found in insect species like Drosophila melanogaster. Using single-cell sequencing data collected from adult beetle testes, we implement a strategy for elucidating biologically meaningful cell populations by using transient expression stage identification markers, weighted principal component clustering, and SNP-based haploid/diploid phasing. We identify populations that correspond to observable points in sperm differentiation and find species-specific markers for each stage. Our results indicate that molecular pathways underlying spermatogenesis in Coleoptera are substantially diverged from those in Diptera. We also show that most genes on the X chromosome experience meiotic sex chromosome inactivation. Temporal expression of Drosophila MSL complex homologs coupled with spatial analysis of potential chromatin entry sites further suggests that the dosage compensation machinery may mediate escape from meiotic sex chromosome inactivation and postmeiotic reactivation of the X chromosome. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. Herbivory‐driven shifts in arbuscular mycorrhizal fungal community assembly: increased fungal competition and plant phosphorus benefits.

Author: Frew, Adam, Öpik, Maarja, Oja, Jane, Vahter, Tanel, Hiiesalu, Inga, and Aguilar‐Trigueros, Carlos A.
Subjects: *PLANT competition, *FUNGAL communities, *SOIL biology, *BIOTIC communities, *BOTANY, *COMPUTATIONAL biology, *FUNGAL spores, *PLANT defenses
Abstract: This article examines the impact of aboveground insect herbivory on the diversity and composition of arbuscular mycorrhizal (AM) fungal communities in plant roots. The study suggests that herbivory can affect the assembly of AM fungal communities, but the specific effects vary. While herbivory did not significantly reduce AM fungal richness, it did increase community evenness. The study also found that herbivory altered the composition and structure of AM fungal communities, leading to increased phylogenetic diversity. Additionally, plants with herbivores benefited more from AM fungi in terms of phosphorus acquisition compared to herbivore-free plants. These findings highlight the potential influence of aboveground herbivory on plant performance and nutrient acquisition. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

44. The hollow newel state in gastropods: when snail shells are open-axis.

Author: Friend, Dana S, Anderson, Brendan M, and Allmon, Warren D
Subjects: *SNAIL shells, *GASTROPODA, *NATURAL history, *COMPUTATIONAL biology, *RIVER channels, *ATMOSPHERIC sciences
Abstract: This article examines the concept of holes in philosophy and their practical implications in zoological morphology, specifically in gastropod shells. The authors introduce different types of openings in gastropod shells, such as the true umbilicus, pseudoumbilicus, and a newly identified opening called the hollow newel. They propose that the axial openings in gastropod shells be designated as hollow newels, which lack complete inner shell walls forming the columella. The authors compare this concept to spiral staircases without a central supporting pillar. The text explores the presence of a hole, known as the "HN," in certain gastropod shells, discussing its attachment points for muscles and various character states associated with this feature. The authors conducted a survey and found that HNs are primarily found in turritellids. The text also delves into the potential functions and evolutionary significance of the HN. Overall, the article emphasizes the need for further research and examination of physical specimens to gain a better understanding of gastropod evolution and ecology. [Extracted from the article]
Published: 2024
Full Text: View/download PDF

45. NeoHunter: Flexible software for systematically detecting neoantigens from sequencing data.

Author: Ma, Tianxing, Zhao, Zetong, Li, Haochen, Wei, Lei, and Zhang, Xuegong
Subjects: *MOLECULAR biology, *ANTIGENS, *COMPUTER software, *CANCER vaccines, *COMPUTATIONAL biology, *GENE fusion
Abstract: Complicated molecular alterations in tumors generate various mutant peptides. Some of these mutant peptides can be presented to the cell surface and then elicit immune responses, and such mutant peptides are called neoantigens. Accurate detection of neoantigens could help to design personalized cancer vaccines. Although some computational frameworks for neoantigen detection have been proposed, most of them can only detect SNV- and indel-derived neoantigens. In addition, current frameworks adopt oversimplified neoantigen prioritization strategies. These factors hinder the comprehensive and effective detection of neoantigens. We developed NeoHunter, flexible software to systematically detect and prioritize neoantigens from sequencing data in different formats. NeoHunter can detect not only SNV- and indel-derived neoantigens but also gene fusion- and aberrant splicing-derived neoantigens. NeoHunter supports both direct and indirect immunogenicity evaluation strategies to prioritize candidate neoantigens. These strategies utilize binding characteristics, existing biological big data, and T-cell receptor specificity to ensure accurate detection and prioritization. We applied NeoHunter to the TESLA dataset, cohorts of melanoma and non-small cell lung cancer patients. NeoHunter achieved high performance across the TESLA cancer patients and detected 79% (27 out of 34) of validated neoantigens in total. SNV- and indel-derived neoantigens accounted for 90% of the top 100 candidate neoantigens while neoantigens from aberrant splicing accounted for 9%. Gene fusion-derived neoantigens were detected in one patient. NeoHunter is a powerful tool to 'catch all' neoantigens and is available for free academic use on Github (XuegongLab/NeoHunter). [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

46. Inferring a directed acyclic graph of phenotypes from GWAS summary statistics.

Author: Zilinskas, Rachel, Li, Chunlin, Shen, Xiaotong, Pan, Wei, and Yang, Tianzhong
Subjects: *DIRECTED acyclic graphs, *GENOME-wide association studies, *COMPUTATIONAL biology, *ALZHEIMER'S disease, *DIRECTED graphs, *PHENOTYPES, *INSTRUMENTAL variables (Statistics), *LIKELIHOOD ratio tests
Abstract: Estimating phenotype networks is a growing field in computational biology. It deepens the understanding of disease etiology and is useful in many applications. In this study, we present a method that constructs a phenotype network by assuming a Gaussian linear structure model embedding a directed acyclic graph (DAG). We utilize genetic variants as instrumental variables and show how our method only requires access to summary statistics from a genome-wide association study (GWAS) and a reference panel of genotype data. Besides estimation, a distinct feature of the method is its summary statistics-based likelihood ratio test on directed edges. We applied our method to estimate a causal network of 29 cardiovascular-related proteins and linked the estimated network to Alzheimer's disease (AD). A simulation study was conducted to demonstrate the effectiveness of this method. An R package sumdag implementing the proposed method, all relevant code, and a Shiny application are available. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

47. Exact global alignment using A* with chaining seed heuristic and match pruning.

Author: Koerkamp, Ragnar Groot and Ivanov, Pesho
Subjects: *SEQUENCE alignment, *COMPUTATIONAL biology, *HEURISTIC, *SEEDS
Abstract: Motivation: Sequence alignment has been at the core of computational biology for half a century. Still, it is an open problem to design a practical algorithm for exact alignment of a pair of related sequences in linear-like time. Results: We solve exact global pairwise alignment with respect to edit distance by using the A* shortest path algorithm. In order to efficiently align long sequences with high divergence, we extend the recently proposed seed heuristic with match chaining, gap costs, and inexact matches. We additionally integrate the novel match pruning technique and diagonal transition to improve the A* search. We prove the correctness of our algorithm, implement it in the A*PA aligner, and justify our extensions intuitively and empirically. On random sequences of divergence d = 4% and length n, the empirical runtime of A*PA scales near-linearly with length (best fit n^1.06, n ≤ 10^7 bp). A similar scaling remains up to d = 12% (best fit n^1.24, n ≤ 10^7 bp). For n = 10^7 bp and d = 4%, A*PA reaches > 500× speedup compared to the leading exact aligners Edlib and BiWFA. The performance of A*PA is highly influenced by long gaps. On long (n > 500 kb) ONT reads of a human sample it efficiently aligns sequences with d < 10%, leading to 3× median speedup compared to Edlib and BiWFA. When the sequences come from different human samples, A*PA performs 1.7× faster than Edlib and BiWFA. Availability and implementation: github.com/RagnarGrootKoerkamp/astar-pairwise-aligner. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

48. Exploring DNA Damage and Repair Mechanisms: A Review with Computational Insights.

Author: Chen, Jiawei, Potlapalli, Ravi, Quan, Heng, Chen, Lingtao, Xie, Ying, Pouriyeh, Seyedamin, Sakib, Nazmus, Liu, Lichao, and Xie, Yixin
Subjects: *DNA repair, *DNA data banks, *DNA damage, *COMPUTATIONAL neuroscience, *DEOXYRIBOZYMES, *COMPUTATIONAL biology
Abstract: DNA damage is a critical factor contributing to genetic alterations and directly affects human health, including the development of diseases such as cancer and age-related disorders. DNA repair mechanisms play a pivotal role in safeguarding genetic integrity and preventing the onset of these ailments. Over the past decade, substantial progress and pivotal discoveries have been achieved in DNA damage and repair. This comprehensive review paper consolidates research efforts, focusing on DNA repair mechanisms, computational research methods, and associated databases. Our work is a valuable resource for scientists and researchers engaged in computational DNA research, offering the latest insights into DNA-related proteins, diseases, and cutting-edge methodologies. The review addresses key questions, including the major types of DNA damage, common DNA repair mechanisms, the availability of reliable databases for DNA damage and associated diseases, and the predominant computational research methods for enzymes involved in DNA damage and repair. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

49. Deep Learning for Subtypes Identification of Pure Seminoma of the Testis.

Author: Medvedev, Kirill E, Acosta, Paul H, Jia, Liwei, and Grishin, Nick V
Subjects: *TESTIS, *DEEP learning, *BIOINFORMATICS, *CANCER patients, *TESTIS tumors, *DECISION making, *DESCRIPTIVE statistics, *HISTOLOGICAL techniques, *RECEIVER operating characteristic curves, *PREDICTION models, *SEMINOMA
Abstract: The most critical step in the clinical diagnosis workflow is the pathological evaluation of each tumor sample. Deep learning is a powerful approach that is widely used to enhance diagnostic accuracy and streamline the diagnosis process. In our previous study using omics data, we identified 2 distinct subtypes of pure seminoma. Seminoma is the most common histological type of testicular germ cell tumors (TGCTs). Here we developed a deep learning decision making tool for the identification of seminoma subtypes using histopathological slides. We used all available slides for pure seminoma samples from The Cancer Genome Atlas (TCGA). The developed model showed an area under the ROC curve of 0.896. Our model not only confirms the presence of 2 distinct subtypes within pure seminoma but also unveils the presence of morphological differences between them that are imperceptible to the human eye. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

50. Generic model to unravel the deeper insights of viral infections: an empirical application of evolutionary graph coloring in computational network biology.

Author: Kole, Arnab, Bag, Arup Kumar, Pal, Anindya Jyoti, and De, Debashis
Subjects: *GRAPH coloring, *VIRUS diseases, *COMPUTATIONAL biology, *TRANSCRIPTION factors, *DRUG target
Abstract: Purpose: The graph coloring approach has emerged as a valuable problem-solving tool for both theoretical and practical aspects across various scientific disciplines, including biology. In this study, we demonstrate the effectiveness of graph coloring in computational network biology, more precisely in analyzing protein–protein interaction (PPI) networks to gain insights into viral infections and their consequences for human health. Accordingly, we propose a generic model that can highlight important hub proteins of virus-associated disease manifestations, changes in disease-associated biological pathways, potential drug targets and respective drugs. We test our model on infection by SARS-CoV-2, a highly transmissible virus responsible for the COVID-19 pandemic. The pandemic claimed many human lives, causing severe respiratory illnesses and exhibiting various symptoms ranging from fever and cough to gastrointestinal, cardiac, renal, neurological, and other manifestations. Methods: To investigate the underlying mechanisms of SARS-CoV-2 infection-induced dysregulation of human pathobiology, we construct a two-level PPI network and employ a differential evolution-based graph coloring (DEGCP) algorithm to identify critical hub proteins that might serve as potential targets for resolving the associated issues. Initially, we concentrate on the direct human interactors of SARS-CoV-2 proteins to construct the first-level PPI network and subsequently apply the DEGCP algorithm to identify essential hub proteins within this network. We then build a second-level PPI network by incorporating the next-level human interactors of the first-level hub proteins and use the DEGCP algorithm to predict the second level of hub proteins. Results: We first identify the potential crucial hub proteins associated with SARS-CoV-2 infection at different levels. Through comprehensive analysis, we then investigate the cellular localization, interactions with other viral families, involvement in biological pathways and processes, functional attributes, gene regulation capabilities as transcription factors, and their associations with disease-associated symptoms of these identified hub proteins. Our findings highlight the significance of these hub proteins and their intricate connections with disease pathophysiology. Furthermore, we predict potential drug targets among the hub proteins and identify specific drugs that hold promise in preventing or treating SARS-CoV-2 infection and its consequences. Conclusion: Our generic model demonstrates the effectiveness of the DEGCP algorithm in analyzing biological PPI networks, provides valuable insights into disease biology, and offers a basis for developing novel therapeutic strategies for other viral infections that may cause future pandemics. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
