Author: "Zou, Quan" / Publication Year Range: This year / Search Limiters: Full Text and Peer Reviewed - Searchworks@Jio Institute Digital Library Search Results

1. Deep learning meta-analysis for predicting plant soil-borne fungal disease occurrence from soil microbiome data

Author: Wang, Yansu and Zou, Quan
Published: 2024
Full Text: View/download PDF

2. Pebble flow in the HTR-PM reactor core by GPU-DEM simulation: Effect of friction

Author: Zhang, Zuoyi, Zou, Quan, Gui, Nan, Xia, Bing, Liu, Zhiyong, and Yang, Xingtuan
Published: 2024
Full Text: View/download PDF

3. Model-driven full system dynamics estimation of PMSM-driven chain shell magazine

Author: Wei, Kai, Chen, Longmiao, and Zou, Quan
Published: 2024
Full Text: View/download PDF

4. Numerical study of the effect of particle size on pebble flow in the HTR-PM

Author: Zou, Quan, Gui, Nan, Yang, Xingtuan, Tu, Jiyuan, Jiang, Shengyao, and Liu, Zhiyong
Published: 2024
Full Text: View/download PDF

5. Fusion of multi-source relationships and topology to infer lncRNA-protein interactions

Author: Zhang, Xinyu, Liu, Mingzhe, Li, Zhen, Zhuo, Linlin, Fu, Xiangzheng, and Zou, Quan
Published: 2024
Full Text: View/download PDF

6. iAFPs-Mv-BiTCN: Predicting antifungal peptides using self-attention transformer embedding and transform evolutionary based multi-view features with bidirectional temporal convolutional networks

Author: Akbar, Shahid, Zou, Quan, Raza, Ali, and Alarfaj, Fawaz Khaled
Published: 2024
Full Text: View/download PDF

7. Random subsequence forests

Author: He, Zengyou, Wang, Jiaqi, Jiang, Mudi, Hu, Lianyu, and Zou, Quan
Published: 2024
Full Text: View/download PDF

8. CD39hi identifies an exhausted tumor-reactive CD8+ T cell population associated with tumor progression in human gastric cancer

Author: Shen, Yang, Qiu, Yuan, Duan, Zhen-quan, Li, Yu-xian, Wang, Ying, Zhang, Yuan-yuan, Zhu, Bao-hang, Yu, Xiao-hong, Tan, Xue-ling, Chen, Weisan, Zhuang, Yuan, Zou, Quan-ming, Ma, Dai-yuan, and Peng, Liu-sheng
Published: 2024
Full Text: View/download PDF

9. GPU-DEM-based heat transfer model for an HTGR pebble bed

Author: Zou, Quan, Gui, Nan, Yang, Xingtuan, Tu, Jiyuan, and Jiang, Shengyao
Published: 2024
Full Text: View/download PDF

10. Joint masking and self-supervised strategies for inferring small molecule-miRNA associations

Author: Zhou, Zhecheng, Zhuo, Linlin, Fu, Xiangzheng, Lv, Juan, Zou, Quan, and Qi, Ren
Published: 2024
Full Text: View/download PDF

11. Potential inhibition of SARS-CoV-2 infection and its mutation with the novel geldanamycin analogue: Ignaciomycin

Author: Stalin, Antony, Saravana Kumar, Pachaiyappan, Senthamarai Kannan, Balakrishnan, Saravanan, Rajamanikam, Ignacimuthu, Savarimuthu, and Zou, Quan
Published: 2024
Full Text: View/download PDF

12. MVST: Identifying spatial domains of spatial transcriptomes from multiple views using multi-view graph convolutional networks.

Author: Duan, Hao, Zhang, Qingchen, Cui, Feifei, Zou, Quan, and Zhang, Zilong
Subjects: MORPHOLOGY, GENE expression, TISSUES, STRUCTURAL frames, TRANSCRIPTOMES
Abstract: Spatial transcriptome technology can parse transcriptomic data at the spatial level to detect high-throughput gene expression and preserve information regarding the spatial structure of tissues. Identifying spatial domains, that is identifying regions with similarities in gene expression and histology, is the most basic and critical aspect of spatial transcriptome data analysis. Most current methods identify spatial domains only through a single view, which may obscure certain important information and thus fail to make full use of the information embedded in spatial transcriptome data. Therefore, we propose an unsupervised clustering framework based on multiview graph convolutional networks (MVST) to achieve accurate spatial domain recognition by the learning graph embedding features of neighborhood graphs constructed from gene expression information, spatial location information, and histopathological image information through multiview graph convolutional networks. By exploring spatial transcriptomes from multiple views, MVST enables data from all parts of the spatial transcriptome to be comprehensively and fully utilized to obtain more accurate spatial expression patterns. We verified the effectiveness of MVST on real spatial transcriptome datasets, the robustness of MVST on some simulated datasets, and the reasonableness of the framework structure of MVST in ablation experiments, and from the experimental results, it is clear that MVST can achieve a more accurate spatial domain identification compared with the current more advanced methods. In conclusion, MVST is a powerful tool for spatial transcriptome research with improved spatial domain recognition. Author summary: Spatial transcriptome sequencing can not only reveal the mechanisms of disease development, but can also be used to explore the structure of biological tissues, which are widely used in a variety of fields such as developmental biology and oncology. 
To utilize spatial transcriptome data to understand how different cells work together to perform complex functions, spatially relevant groups of cells must be identified, which leads to the task of dissecting tissues into spatial domains. However, most of the currently available spatial domain identification tools have yet to fully utilize the information embedded in spatial transcriptomic data, and their identification remains to be improved for the fine-grained exploration of tissue structure. Here, we developed a tool for accurate spatial domain identification, and our tool makes full use of the various types of spatial transcriptome data from multiple perspectives, identifying the real grouping information in spatial transcriptome data as much as possible. Our results show that our spatial domain identification tool can identify spatial domains more accurately and provide effective help for biological organization structure exploration. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

13. Overcoming CRISPR-Cas9 off-target prediction hurdles: A novel approach with ESB rebalancing strategy and CRISPR-MCA model.

Author: Yang, Yanpeng, Zheng, Yanyi, Zou, Quan, Li, Jian, and Feng, Hailin
Subjects: ARTIFICIAL neural networks, GENOME editing, CRISPRS, DEEP learning, ENCODING
Abstract: The off-target activities within the CRISPR-Cas9 system remain a formidable barrier to its broader application and development. Recent advancements have highlighted the potential of deep learning models in predicting these off-target effects, yet they encounter significant hurdles including imbalances within datasets and the intricacies associated with encoding schemes and model architectures. To surmount these challenges, our study innovatively introduces an Efficiency and Specificity-Based (ESB) class rebalancing strategy, specifically devised for datasets featuring mismatches-only off-target instances, marking a pioneering approach in this realm. Furthermore, through a meticulous evaluation of various One-hot encoding schemes alongside numerous hybrid neural network models, we discern that encoding and models of moderate complexity ideally balance performance and efficiency. On this foundation, we advance a novel hybrid model, the CRISPR-MCA, which capitalizes on multi-feature extraction to enhance predictive accuracy. The empirical results affirm that the ESB class rebalancing strategy surpasses five conventional methods in addressing extreme dataset imbalances, demonstrating superior efficacy and broader applicability across diverse models. Notably, the CRISPR-MCA model excels in off-target effect prediction across four distinct mismatches-only datasets and significantly outperforms contemporary state-of-the-art models in datasets comprising both mismatches and indels. In summation, the CRISPR-MCA model, coupled with the ESB rebalancing strategy, offers profound insights and a robust framework for future explorations in this field. Author summary: In the field of gene editing, the application of deep learning technologies holds significant promise for predicting off-target effects in the CRISPR-Cas9 system. 
Nevertheless, one of the primary challenges encountered is the extreme imbalance among classes within the off-target datasets, which severely hampers the predictive accuracy for certain classes. Furthermore, as an array of sequence encoding methods continue to evolve, there has been a corresponding increase in model complexity. Addressing these issues, we introduce a novel Efficiency and Specificity-Based (ESB) class rebalancing strategy designed to mitigate the impact of class imbalance. Additionally, we assess the influence of six encoding schemes and four distinct architectural approaches on the prediction performance, employing four benchmark datasets for validation. Building upon these insights, we have developed a new hybrid model, termed CRISPR-MCA. Our experimental results demonstrate that the ESB strategy significantly surpasses the performance of existing baseline methods across multiple models. Moreover, the CRISPR-MCA model exhibits robust performance on two distinct types of datasets, affirming its effectiveness in enhancing the accuracy of deep learning predictions for off-target activities. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

14. ECD-CDGI: An efficient energy-constrained diffusion model for cancer driver gene identification.

Author: Wang, Tao, Zhuo, Linlin, Chen, Yifan, Fu, Xiangzheng, Zeng, Xiangxiang, and Zou, Quan
Subjects: CANCER genes, TRANSFORMER models, MEASUREMENT errors, GRAPH neural networks, INDIVIDUALIZED medicine
Abstract: The identification of cancer driver genes (CDGs) poses challenges due to the intricate interdependencies among genes and the influence of measurement errors and noise. We propose a novel energy-constrained diffusion (ECD)-based model for identifying CDGs, termed ECD-CDGI. This model is the first to design an ECD-Attention encoder by combining the ECD technique with an attention mechanism. ECD-Attention encoder excels at generating robust gene representations that reveal the complex interdependencies among genes while reducing the impact of data noise. We concatenate topological embedding extracted from gene-gene networks through graph transformers to these gene representations. We conduct extensive experiments across three testing scenarios. Extensive experiments show that the ECD-CDGI model possesses the ability to not only be proficient in identifying known CDGs but also efficiently uncover unknown potential CDGs. Furthermore, compared to the GNN-based approach, the ECD-CDGI model exhibits fewer constraints by existing gene-gene networks, thereby enhancing its capability to identify CDGs. Additionally, ECD-CDGI is open-source and freely available. We have also launched the model as a complimentary online tool specifically crafted to expedite research efforts focused on CDGs identification. Author summary: Cancer has become a major disease threatening human life and health. Cancer usually originates from abnormal gene activities, such as mutations and copy number variations. Mutations in cancer driver genes are crucial for the selective growth of tumor cells. Identifying cancer driver genes is crucial in cancer-related research and treatment strategies, as it helps understand cancer occurrence and development. However, the complex gene-gene interactions, measurement errors, and the prevalence of unlabeled data significantly complicate the identification of these driver genes. 
We developed a new method that integrates an energy-constrained diffusion mechanism with an attention mechanism to uncover implicit gene dependencies in biomolecular networks and generate robust gene representations. Extensive experiments demonstrated that our model accurately identifies known cancer driver genes and effectively discovers potential ones. Furthermore, we analyzed and predicted patient-specific mutated genes, enhancing our understanding of their pathogenesis and advancing precision medicine. In summary, our method offers a promising tool for advancing the identification of cancer driver genes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

15. scRNMF: An imputation method for single-cell RNA-seq data by robust and non-negative matrix factorization.

Author: Qian, Yuqing, Zou, Quan, Zhao, Mengyuan, Liu, Yi, Guo, Fei, and Ding, Yijie
Subjects: NONNEGATIVE matrices, MATRIX decomposition, GENE expression, RNA sequencing, DATA recovery
Abstract: Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool in genomics research, enabling the analysis of gene expression at the individual cell level. However, scRNA-seq data often suffer from a high rate of dropouts, where certain genes fail to be detected in specific cells due to technical limitations. This missing data can introduce biases and hinder downstream analysis. To overcome this challenge, the development of effective imputation methods has become crucial in the field of scRNA-seq data analysis. Here, we propose an imputation method based on robust and non-negative matrix factorization (scRNMF). Unlike other matrix factorization algorithms, scRNMF integrates two loss functions: L2 loss and C-loss. The L2 loss function is highly sensitive to outliers, which can introduce substantial errors. We utilize the C-loss function when dealing with zero values in the raw data. The primary advantage of the C-loss function is that it imposes a smaller punishment for larger errors, which results in more robust factorization when handling outliers. Various datasets of different sizes and zero rates are used to evaluate the performance of scRNMF against other state-of-the-art methods. Our method demonstrates its power and stability as a tool for imputation of scRNA-seq data. Author summary: It is still difficult to analyze scRNA-seq data because a significant portion of expressed genes have zeros. Gene expression levels can be restored through the imputation of scRNA-seq data, facilitating downstream analysis. To overcome this challenge, we propose an imputation method based on robust and non-negative matrix factorization (scRNMF). Unlike other matrix factorization algorithms, scRNMF integrates two loss functions: L2 loss and C-loss. Through the use of several simulated and real datasets, we perform a comprehensive evaluation of scRNMF against existing methods. 
scRNMF can enhance various aspects of downstream analysis, including gene expression data recovery, cell clustering analysis, gene differential expression analysis, and cellular trajectory reconstruction. The results of our study demonstrate that scRNMF is a powerful tool that can improve the accuracy of single-cell data analysis. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
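
The scRNMF abstract above contrasts the L2 loss with the C-loss but does not give a formula. As a rough illustration only, the sketch below assumes the C-loss takes the common correntropy-induced form 1 - exp(-e^2 / (2 * sigma^2)); the function names and the sigma value are hypothetical and are not taken from the paper. It shows why a bounded loss penalizes large residuals far less than the squared error does.

```python
import math


def l2_loss(residual: float) -> float:
    """Squared error: grows without bound as the residual grows."""
    return residual ** 2


def c_loss(residual: float, sigma: float = 1.0) -> float:
    """Correntropy-induced loss (assumed form); it never exceeds 1."""
    return 1.0 - math.exp(-(residual ** 2) / (2.0 * sigma ** 2))


if __name__ == "__main__":
    # Compare how the two losses react to increasingly large residuals.
    for r in (0.0, 0.5, 1.0, 5.0, 50.0):
        print(f"residual={r:5.1f}  L2={l2_loss(r):9.2f}  C-loss={c_loss(r):.4f}")
```

Under this assumed form, an outlier residual contributes at most 1 to the objective, whereas its L2 contribution grows quadratically.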

16. StackedEnC-AOP: prediction of antioxidant proteins using transform evolutionary and sequential features based multi-scale vector with stacked ensemble learning.

Author: Rukh, Gul, Akbar, Shahid, Rehman, Gauhar, Alarfaj, Fawaz Khaled, and Zou, Quan
Subjects: DISCRETE wavelet transforms, INDEPENDENT sets, DRUG design, FREE radicals, OXIDATIVE stress
Abstract: Background: Antioxidant proteins are involved in several biological processes and can protect DNA and cells from the damage of free radicals. These proteins regulate the body's oxidative stress and perform a significant role in many antioxidant-based drugs. The current in vitro-based medications are costly, time-consuming, and unable to efficiently screen and identify the targeted motif of antioxidant proteins. Methods: In this model, we proposed an accurate prediction method to discriminate antioxidant proteins, namely StackedEnC-AOP. The training sequences are formulation encoded via incorporating a discrete wavelet transform (DWT) into the evolutionary matrix to decompose the PSSM-based images via two levels of DWT to form a Pseudo position-specific scoring matrix (PsePSSM-DWT) based embedded vector. Additionally, the Evolutionary difference formula and composite physiochemical properties methods are also employed to collect the structural and sequential descriptors. Then the combined vector of sequential features, evolutionary descriptors, and physiochemical properties is produced to cover the flaws of individual encoding schemes. To reduce the computational cost of the combined features vector, the optimal features are chosen using Minimum redundancy and maximum relevance (mRMR). The optimal feature vector is trained using a stacking-based ensemble meta-model. Results: Our developed StackedEnC-AOP method reported a prediction accuracy of 98.40% and an AUC of 0.99 via training sequences. To evaluate model validation, the StackedEnC-AOP training model using an independent set achieved an accuracy of 96.92% and an AUC of 0.98. Conclusion: Our proposed StackedEnC-AOP strategy performed significantly better than current computational models with a ~5% and ~3% improved accuracy via training and independent sets, respectively. 
The efficacy and consistency of our proposed StackedEnC-AOP make it a valuable tool for data scientists and can execute a key role in research academia and drug design. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

17. Predicting intercellular communication based on metabolite-related ligand-receptor interactions with MRCLinkdb.

Author: Zhang, Yuncong, Yang, Yu, Ren, Liping, Zhan, Meixiao, Sun, Taoping, Zou, Quan, and Zhang, Yang
Abstract: Background: Metabolite-associated cell communications play critical roles in maintaining human biological function. However, most existing tools and resources focus only on ligand-receptor interaction pairs where both partners are proteinaceous, neglecting other non-protein molecules. To address this gap, we introduce the MRCLinkdb database and algorithm, which aggregates and organizes data related to non-protein L-R interactions in cell-cell communication, providing a valuable resource for predicting intercellular communication based on metabolite-related ligand-receptor interactions. Results: Here, we manually curated the metabolite-ligand-receptor (ML-R) interactions from the literature and known databases, ultimately collecting over 790 human and 670 mouse ML-R interactions. Additionally, we compiled information on over 1900 enzymes and 260 transporter entries associated with these metabolites. We developed Metabolite-Receptor based Cell Link Database (MRCLinkdb) to store these ML-R interactions data. Meanwhile, the platform also offers extensive information for presenting ML-R interactions, including fundamental metabolite information and the overall expression landscape of metabolite-associated gene sets (such as receptor, enzymes, and transporter proteins) based on single-cell transcriptomics sequencing (covering 35 human and 26 mouse tissues, 52 human and 44 mouse cell types) and bulk RNA-seq/microarray data (encompassing 62 human and 39 mouse tissues). Furthermore, MRCLinkdb introduces a web server dedicated to the analysis of intercellular communication based on ML-R interactions. MRCLinkdb is freely available at . Conclusions: In addition to supplementing ligand-receptor databases, MRCLinkdb may provide new perspectives for decoding the intercellular communication and advancing related prediction tools based on ML-R interactions. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

18. GraphADT: empowering interpretable predictions of acute dermal toxicity with multi-view graph pooling and structure remapping.

Author: Ma, Xinqian, Fu, Xiangzheng, Wang, Tao, Zhuo, Linlin, and Zou, Quan
Subjects: GRAPH neural networks, MOLECULAR graphs, MOLECULAR structure, CHEMICAL bonds, MOLECULES, DEEP learning
Abstract: Motivation Accurate prediction of acute dermal toxicity (ADT) is essential for the safe and effective development of contact drugs. Currently, graph neural networks, a form of deep learning technology, accurately model the structure of compound molecules, enhancing predictions of their ADT. However, many existing methods emphasize atom-level information transfer and overlook crucial data conveyed by molecular bonds and their interrelationships. Additionally, these methods often generate "equal" node representations across the entire graph, failing to accentuate "important" substructures like functional groups, pharmacophores, and toxicophores, thereby reducing interpretability. Results We introduce a novel model, GraphADT, utilizing structure remapping and multi-view graph pooling (MVPool) technologies to accurately predict compound ADT. Initially, our model applies structure remapping to better delineate bonds, transforming "bonds" into new nodes and "bond-atom-bond" interactions into new edges, thereby reconstructing the compound molecular graph. Subsequently, we use MVPool to amalgamate data from various perspectives, minimizing biases inherent to single-view analyses. Following this, the model generates a robust node ranking collaboratively, emphasizing critical nodes or substructures to enhance model interpretability. Lastly, we apply a graph comparison learning strategy to train both the original and structure remapped molecular graphs, deriving the final molecular representation. Experimental results on public datasets indicate that the GraphADT model outperforms existing state-of-the-art models. The GraphADT model has been demonstrated to effectively predict compound ADT, offering potential guidance for the development of contact drugs and related treatments. Availability and implementation Our code and data are accessible at: https://github.com/mxqmxqmxq/GraphADT.git. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
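
The GraphADT abstract above describes structure remapping as turning bonds into new nodes and "bond-atom-bond" interactions into new edges. A minimal sketch of that idea follows, assuming it corresponds to a plain line-graph construction over an undirected molecular graph; the function name and the input format (a list of atom-index pairs) are hypothetical, and the authors' actual implementation, including any node or edge features, may differ.

```python
from itertools import combinations


def remap_bonds_to_nodes(bonds):
    """Return (bond_nodes, bond_edges) for a molecular graph.

    bonds: list of (atom_i, atom_j) pairs, one per chemical bond.
    Each bond becomes a node, identified by its position in `bonds`.
    Two bond nodes are linked when their bonds share an atom
    (the "bond-atom-bond" pattern).
    """
    bond_nodes = list(range(len(bonds)))

    # Group bond indices by the atoms they touch.
    bonds_at_atom = {}
    for idx, (a, b) in enumerate(bonds):
        bonds_at_atom.setdefault(a, []).append(idx)
        bonds_at_atom.setdefault(b, []).append(idx)

    # Every pair of bonds meeting at the same atom yields one new edge.
    bond_edges = set()
    for incident in bonds_at_atom.values():
        for u, v in combinations(sorted(incident), 2):
            bond_edges.add((u, v))

    return bond_nodes, sorted(bond_edges)


if __name__ == "__main__":
    # Branched skeleton: atom 1 is bonded to atoms 0, 2, and 3.
    nodes, edges = remap_bonds_to_nodes([(0, 1), (1, 2), (1, 3)])
    print(nodes)
    print(edges)
```

In the branched example, all three bonds meet at atom 1, so every pair of bond nodes is connected in the remapped graph; in a linear chain, only consecutive bonds would be.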

19. An Information Integration Technology for Safety Assessment on Civil Airborne System.

Author: Chen, Xi, Zou, Quan, Bai, Jie, and Dong, Lei
Subjects: TECHNOLOGY assessment, MULTIPLE scattering (Physics), ELECTRONIC records, MODEL airplanes, AIRWORTHINESS
Abstract: With the significant expansion of civil aviation, particularly in the low-altitude economy, there is a significant gap between the escalating demand for airworthiness certification of novel aircraft designs, such as electric vertical take-off and landing (eVTOL) vehicles, and the inefficiency of the current safety assessment process. This gap is partially attributed to safety assessors' limited exposure to these innovative aircraft models in the safety assessment process, necessitating extensive efforts in identifying precedents and their handling strategies. Complicating matters further, pertinent case studies are scattered across diverse, unstandardized digital formats, obliging assessors to navigate voluminous electronic records while concurrently establishing links among fragmented information scattered across multiple files. This study introduces an advanced information integration methodology, comprising a multi-level path-based architecture and a self-updating algorithm. The proposed method not only furnishes safety assessors with pertinent knowledge featuring explicative interconnectedness automatically, but also dynamically enriches this knowledge corpus through operational usage. Additionally, we devise a suite of evaluative criteria to validate the capacity of our method in processing and consolidating relevant safety datasets. Experimental analyses affirm the efficacy of our proposed approach in streamlining and refreshing safety assessment data. The automation of the retrieval of analogous cases, which relieves the reliance on expert knowledge, enhances the efficiency of the overall safety appraisal procedure. Consequently, this research contributes a solution to enhancing the velocity and accuracy of aircraft certification processes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

20. Drug–target interaction predictions with multi-view similarity network fusion strategy and deep interactive attention mechanism.

Author: Song, Wei, Xu, Lewen, Han, Chenguang, Tian, Zhen, and Zou, Quan
Subjects: DRUG discovery, DRUG repositioning, PREDICTION models, FORECASTING, COMPUTATIONAL neuroscience
Abstract: Motivation Accurately identifying the drug–target interactions (DTIs) is one of the crucial steps in the drug discovery and drug repositioning process. Currently, many computational-based models have already been proposed for DTI prediction and achieved some significant improvement. However, these approaches pay little attention to fuse the multi-view similarity networks related to drugs and targets in an appropriate way. Besides, how to fully incorporate the known interaction relationships to accurately represent drugs and targets is not well investigated. Therefore, there is still a need to improve the accuracy of DTI prediction models. Results In this study, we propose a novel approach that employs Multi-view similarity network fusion strategy and deep Interactive attention mechanism to predict Drug–Target Interactions (MIDTI). First, MIDTI constructs multi-view similarity networks of drugs and targets with their diverse information and integrates these similarity networks effectively in an unsupervised manner. Then, MIDTI obtains the embeddings of drugs and targets from multi-type networks simultaneously. After that, MIDTI adopts the deep interactive attention mechanism to further learn their discriminative embeddings comprehensively with the known DTI relationships. Finally, we feed the learned representations of drugs and targets to the multilayer perceptron model and predict the underlying interactions. Extensive results indicate that MIDTI significantly outperforms other baseline methods on the DTI prediction task. The results of the ablation experiments also confirm the effectiveness of the attention mechanism in the multi-view similarity network fusion strategy and the deep interactive attention mechanism. Availability and implementation https://github.com/XuLew/MIDTI. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

21. Integrating Single‐Cell and Spatial Transcriptomics Reveals Heterogeneity of Early Pig Skin Development and a Subpopulation with Hair Placode Formation.

Author: Wang, Yi, Jiang, Yao, Ni, Guiyan, Li, Shujuan, Balderson, Brad, Zou, Quan, Liu, Huatao, Jiang, Yifan, Sun, Jingchun, and Ding, Xiangdong
Subjects: TRANSCRIPTOMES, EPIDERMIS, HAIR follicles, SWINE, HAIR, HETEROGENEITY, ETIOLOGY of diseases, FETUS, KERATINOCYTE differentiation
Abstract: The dermis and epidermis, crucial structural layers of the skin, encompass appendages, hair follicles (HFs), and intricate cellular heterogeneity. However, an integrated spatiotemporal transcriptomic atlas of embryonic skin has not yet been described and would be invaluable for studying skin‐related diseases in humans. Here, single‐cell and spatial transcriptomic analyses are performed on skin samples of normal and hairless fetal pigs across four developmental periods. The cross‐species comparison of skin cells illustrated that the pig epidermis is more representative of the human epidermis than mice epidermis. Moreover, Phenome‐wide association study analysis revealed that the conserved genes between pigs and humans are strongly associated with human skin‐related diseases. In the epidermis, two lineage differentiation trajectories describe hair follicle (HF) morphogenesis and epidermal development. By comparing normal and hairless fetal pigs, it is found that the hair placode (Pc), the most characteristic initial structure in HFs, arises from progenitor‐like OGN+/UCHL1+ cells. These progenitors appear earlier in development than the previously described early Pc cells and exhibit abnormal proliferation and migration during differentiation in hairless pigs. The study provides a valuable resource for in‐depth insights into HF development, which may serve as a key reference atlas for studying human skin disease etiology using porcine models. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

22. Electrochemically Enable N‑Sulfenylation/Phosphinylation of Sulfoximines via Oxidative Dehydrocoupling Reaction.

Author: Zhang, Wenbao, Jin, Dongsheng, Hu, Yongkang, Yin, Kun, Zou, Quan, Tang, Liang, and Qian, Peng
Published: 2024
Full Text: View/download PDF

23. Application and Comparison of Machine Learning and Database-Based Methods in Taxonomic Classification of High-Throughput Sequencing Data.

Author: Tian, Qinzhong, Zhang, Pinglu, Zhai, Yixiao, Wang, Yansu, and Zou, Quan
Subjects: NUCLEOTIDE sequencing, TECHNOLOGICAL innovations, CLASSIFICATION, DEVELOPMENTAL biology, DATABASES, SYNTHETIC biology
Abstract: The advent of high-throughput sequencing technologies has not only revolutionized the field of bioinformatics but has also heightened the demand for efficient taxonomic classification. Despite technological advancements, efficiently processing and analyzing the deluge of sequencing data for precise taxonomic classification remains a formidable challenge. Existing classification approaches primarily fall into two categories, database-based methods and machine learning methods, each presenting its own set of challenges and advantages. On this basis, the aim of our study was to conduct a comparative analysis between these two methods while also investigating the merits of integrating multiple database-based methods. Through an in-depth comparative study, we evaluated the performance of both methodological categories in taxonomic classification by utilizing simulated data sets. Our analysis revealed that database-based methods excel in classification accuracy when backed by a rich and comprehensive reference database. Conversely, while machine learning methods show superior performance in scenarios where reference sequences are sparse or lacking, they generally show inferior performance compared with database methods under most conditions. Moreover, our study confirms that integrating multiple database-based methods does, in fact, enhance classification accuracy. These findings shed new light on the taxonomic classification of high-throughput sequencing data and bear substantial implications for the future development of computational biology. For those interested in further exploring our methods, the source code of this study is publicly available on https://github.com/LoadStar822/Genome-Classifier-Performance-Evaluator. Additionally, a dedicated webpage showcasing our collected database, data sets, and various classification software can be found at http://lab.malab.cn/~tqz/project/taxonomic/. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

24. scMNMF: a novel method for single-cell multi-omics clustering based on matrix factorization.

Author: Qiu, Yushan, Guo, Dong, Zhao, Pu, and Zou, Quan
Subjects: MATRIX decomposition, MULTIOMICS, METABOLOMICS, NONNEGATIVE matrices, CONSTRAINED optimization, FEATURE selection, TRANSCRIPTOMES
Abstract: Motivation The technology for analyzing single-cell multi-omics data has advanced rapidly and has provided comprehensive and accurate cellular information by exploring cell heterogeneity in genomics, transcriptomics, epigenomics, metabolomics and proteomics data. However, because of the high-dimensional and sparse characteristics of single-cell multi-omics data, as well as the limitations of various analysis algorithms, the clustering performance is generally poor. Matrix factorization is an unsupervised, dimensionality reduction-based method that can cluster individuals and discover related omics variables from different blocks. Here, we present a novel algorithm that performs joint dimensionality reduction learning and cell clustering analysis on single-cell multi-omics data using non-negative matrix factorization that we named scMNMF. We formulate the objective function of joint learning as a constrained optimization problem and derive the corresponding iterative formulas through alternating iterative algorithms. The major advantage of the scMNMF algorithm remains its capability to explore hidden related features among omics data. Additionally, the feature selection for dimensionality reduction and cell clustering mutually influence each other iteratively, leading to a more effective discovery of cell types. We validated the performance of the scMNMF algorithm using two simulated and five real datasets. The results show that scMNMF outperformed seven other state-of-the-art algorithms in various measurements. Availability and implementation scMNMF code can be found at https://github.com/yushanqiu/scMNMF. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

25. MS-BACL: enhancing metabolic stability prediction through bond graph augmentation and contrastive learning.

Author: Wang, Tao, Li, Zhen, Zhuo, Linlin, Chen, Yifan, Fu, Xiangzheng, and Zou, Quan
Subjects: BOND graphs, GRAPH neural networks, MOLECULAR graphs, MOLECULAR structure, DRUG efficacy
Abstract: Motivation Accurately predicting molecular metabolic stability is of great significance to drug research and development, ensuring drug safety and effectiveness. Existing deep learning methods, especially graph neural networks, can reveal the molecular structure of drugs and thus efficiently predict the metabolic stability of molecules. However, most of these methods focus on the message passing between adjacent atoms in the molecular graph, ignoring the relationship between bonds. This makes it difficult for these methods to estimate accurate molecular representations, thereby being limited in molecular metabolic stability prediction tasks. Results We propose the MS-BACL model based on bond graph augmentation technology and contrastive learning strategy, which can efficiently and reliably predict the metabolic stability of molecules. To our knowledge, this is the first time that bond-to-bond relationships in molecular graph structures have been considered in the task of metabolic stability prediction. We build a bond graph based on 'atom-bond-atom', and the model can simultaneously capture the information of atoms and bonds during the message propagation process. This enhances the model's ability to reveal the internal structure of the molecule, thereby improving the structural representation of the molecule. Furthermore, we perform contrastive learning training based on the molecular graph and its bond graph to learn the final molecular representation. Multiple sets of experimental results on public datasets show that the proposed MS-BACL model outperforms the state-of-the-art model. Availability and Implementation The code and data are publicly available at https://github.com/taowang11/MS. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

26. DeepAVP-TPPred: identification of antiviral peptides using transformed image-based localized descriptors and binary tree growth algorithm.

Author: Ullah, Matee, Akbar, Shahid, Raza, Ali, and Zou, Quan
Subjects: ARTIFICIAL neural networks, TREE growth, PEPTIDES, LIFE cycles (Biology), FEATURE selection, FEATURE extraction, IDENTIFICATION
Abstract: Motivation Despite the extensive manufacturing of antiviral drugs and vaccination, viral infections continue to be a major human ailment. Antiviral peptides (AVPs) have emerged as potential candidates in the pursuit of novel antiviral drugs. These peptides show vigorous antiviral activity against a diverse range of viruses by targeting different phases of the viral life cycle. Therefore, the accurate prediction of AVPs is an essential yet challenging task. Lately, many machine learning-based approaches have been developed for this purpose; however, their limited capabilities in terms of feature engineering, accuracy, and generalization make these methods restricted. Results In the present study, we aim to develop an efficient machine learning-based approach for the identification of AVPs, referred to as DeepAVP-TPPred, to address the aforementioned problems. First, we extract two new transformed feature sets using our designed image-based feature extraction algorithms and integrate them with an evolutionary information-based feature. Next, these feature sets were optimized using a novel feature selection approach called the binary tree growth algorithm. Finally, the optimal feature space from the training dataset was fed to the deep neural network to build the final classification model. The proposed model DeepAVP-TPPred was tested using stringent 5-fold cross-validation and two independent dataset testing methods, which achieved the maximum performance and showed enhanced efficiency over existing predictors in terms of both accuracy and generalization capabilities. Availability and implementation https://github.com/MateeullahKhan/DeepAVP-TPPred. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

27. Integrated convolution and self-attention for improving peptide toxicity prediction.

Author: Jiao, Shihu, Ye, Xiucai, Sakurai, Tetsuya, Zou, Quan, and Liu, Ruijun
Subjects: PEPTIDES, AMINO acid sequence, PEPTIDE drugs, SOURCE code, DRUG development
Abstract: Motivation Peptides are promising agents for the treatment of a variety of diseases due to their specificity and efficacy. However, the development of peptide-based drugs is often hindered by the potential toxicity of peptides, which poses a significant barrier to their clinical application. Traditional experimental methods for evaluating peptide toxicity are time-consuming and costly, making the development process inefficient. Therefore, there is an urgent need for computational tools specifically designed to predict peptide toxicity accurately and rapidly, facilitating the identification of safe peptide candidates for drug development. Results We provide here a novel computational approach, CAPTP, which leverages the power of convolution and self-attention to enhance the prediction of peptide toxicity from amino acid sequences. CAPTP demonstrates outstanding performance, achieving a Matthews correlation coefficient of approximately 0.82 in both cross-validation settings and on independent test datasets. This performance surpasses that of existing state-of-the-art peptide toxicity predictors. Importantly, CAPTP maintains its robustness and generalizability even when dealing with data imbalances. Further analysis by CAPTP reveals that certain sequential patterns, particularly in the head and central regions of peptides, are crucial in determining their toxicity. This insight can significantly inform and guide the design of safer peptide drugs. Availability and implementation The source code for CAPTP is freely available at https://github.com/jiaoshihu/CAPTP. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
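
Note on record 27: the CAPTP abstract reports performance as a Matthews correlation coefficient (MCC). As a reminder of what that figure measures, the following is a minimal sketch of the standard binary MCC formula computed from confusion-matrix counts. The function name and the example counts are illustrative assumptions and are not taken from the paper.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Binary MCC from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined 0/0 case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts, chosen only to illustrate the arithmetic.
score = matthews_corrcoef(tp=90, tn=90, fp=10, fn=10)
print(score)
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred on imbalanced datasets such as the ones the abstract mentions.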

28. scTPC: a novel semisupervised deep clustering model for scRNA-seq data.

Author: Qiu, Yushan, Yang, Lingfei, Jiang, Hao, and Zou, Quan
Subjects: DEEP learning, NEGATIVE binomial distribution, RNA sequencing, DATA modeling, FUZZY clustering technique, SEQUENCE analysis, RESEARCH personnel
Abstract: Motivation Continuous advancements in single-cell RNA sequencing (scRNA-seq) technology have enabled researchers to further explore the study of cell heterogeneity, trajectory inference, identification of rare cell types, and neurology. Accurate scRNA-seq data clustering is crucial in single-cell sequencing data analysis. However, the high dimensionality, sparsity, and presence of "false" zero values in the data can pose challenges to clustering. Furthermore, current unsupervised clustering algorithms have not effectively leveraged prior biological knowledge, making cell clustering even more challenging. Results This study investigates a semisupervised clustering model called scTPC, which integrates the triplet constraint, pairwise constraint, and cross-entropy constraint based on deep learning. Specifically, the model begins by pretraining a denoising autoencoder based on a zero-inflated negative binomial distribution. Deep clustering is then performed in the learned latent feature space using triplet constraints and pairwise constraints generated from partial labeled cells. Finally, to address imbalanced cell-type datasets, a weighted cross-entropy loss is introduced to optimize the model. A series of experimental results on 10 real scRNA-seq datasets and five simulated datasets demonstrate that scTPC achieves accurate clustering with a well-designed framework. Availability and implementation scTPC is a Python-based algorithm, and the code is available from https://github.com/LF-Yang/Code or https://zenodo.org/records/10951780. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
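
Note on record 28: the scTPC abstract mentions a weighted cross-entropy loss for imbalanced cell-type datasets but does not state how the weights are chosen. The sketch below assumes inverse-frequency class weights, which is one common choice and not necessarily the authors' scheme. All function names and the toy labels are hypothetical.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n / (k * count), so rare classes weigh more.

    ASSUMPTION: this weighting scheme is illustrative; the abstract does
    not specify the one used by scTPC.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(probabilities, labels, weights):
    """Weighted mean of -log p(true class) over all samples.

    `probabilities` is a list of dicts mapping class -> predicted probability.
    """
    weighted_sum = sum(
        -weights[label] * math.log(probs[label])
        for probs, label in zip(probabilities, labels)
    )
    return weighted_sum / sum(weights[label] for label in labels)


# Toy example: three cells of type "a", one rare cell of type "b".
labels = ["a", "a", "a", "b"]
weights = inverse_frequency_weights(labels)
uniform = [{"a": 0.5, "b": 0.5} for _ in labels]
print(weights, weighted_cross_entropy(uniform, labels, weights))
```

With these weights, misclassifying the single rare cell costs as much as misclassifying all three common cells together, which is the effect the abstract's imbalance remark points to.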

29. Revisiting drug–protein interaction prediction: a novel global–local perspective.

Author: Zhou, Zhecheng, Liao, Qingquan, Wei, Jinhang, Zhuo, Linlin, Wu, Xiaonan, Fu, Xiangzheng, and Zou, Quan
Subjects: MULTILAYER perceptrons, BIPARTITE graphs, DRUG repositioning, DEEP learning, TRANSFORMER models, INDIVIDUALIZED medicine, PROTEIN-protein interactions
Abstract: Motivation Accurate inference of potential drug–protein interactions (DPIs) aids in understanding drug mechanisms and developing novel treatments. Existing deep learning models, however, struggle with accurate node representation in DPI prediction, limiting their performance. Results We propose a new computational framework that integrates global and local features of nodes in the drug–protein bipartite graph for efficient DPI inference. Initially, we employ pre-trained models to acquire fundamental knowledge of drugs and proteins and to determine their initial features. Subsequently, the MinHash and HyperLogLog algorithms are utilized to estimate the similarity and set cardinality between drug and protein subgraphs, serving as their local features. Then, an energy-constrained diffusion mechanism is integrated into the transformer architecture, capturing interdependencies between nodes in the drug–protein bipartite graph and extracting their global features. Finally, we fuse the local and global features of nodes and employ multilayer perceptrons to predict the likelihood of potential DPIs. A comprehensive and precise node representation guarantees efficient prediction of unknown DPIs by the model. Various experiments validate the accuracy and reliability of our model, with molecular docking results revealing its capability to identify potential DPIs not present in existing databases. This approach is expected to offer valuable insights for furthering drug repurposing and personalized medicine research. Availability and implementation Our code and data are accessible at: https://github.com/ZZCrazy00/DPI. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
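
Note on record 29: the abstract states that MinHash is used to estimate similarity between drug and protein subgraphs. A minimal, generic sketch of MinHash-based Jaccard estimation follows, assuming each subgraph is represented as a non-empty set of node identifiers. The function names, the node labels, and the use of salted BLAKE2b as the hash family are assumptions for illustration and do not describe the authors' implementation.

```python
import hashlib


def minhash_signature(items, num_hashes=256):
    """For each of `num_hashes` salted hash functions, keep the minimum
    64-bit hash value over the (non-empty) set of items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")  # a distinct salt per hash function
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(str(item).encode(), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(signature_a, signature_b):
    """Fraction of positions where two signatures agree; this is an
    unbiased estimate of the Jaccard similarity of the underlying sets."""
    matches = sum(a == b for a, b in zip(signature_a, signature_b))
    return matches / len(signature_a)


# Hypothetical neighbourhoods: node IDs 0..99 and 50..149 (true Jaccard = 1/3).
neighbours_a = set(range(0, 100))
neighbours_b = set(range(50, 150))
print(estimate_jaccard(minhash_signature(neighbours_a), minhash_signature(neighbours_b)))
```

The signature length trades accuracy for cost: the standard error of the estimate shrinks roughly with the square root of the number of hash functions.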

30. Electrocatalytic C–H/S–H Coupling of Amino Pyrazoles and Thiophenols: Synthesis of Amino Pyrazole Thioether Derivatives.

Author: Zhang, Wenbao, Zou, Quan, Wang, Qian, Jin, Dongsheng, Jiang, Shan, and Qian, Peng
Published: 2024
Full Text: View/download PDF

31. TPMA: A two pointers meta-alignment tool to ensemble different multiple nucleic acid sequence alignments.

Author: Zhai, Yixiao, Chao, Jiannan, Wang, Yizheng, Zhang, Pinglu, Tang, Furong, and Zou, Quan
Subjects: SEQUENCE alignment, RESEARCH personnel, SEQUENCE analysis, SOURCE code, NUCLEIC acids
Abstract: Accurate multiple sequence alignment (MSA) is imperative for the comprehensive analysis of biological sequences. However, a notable challenge arises as no single MSA tool consistently outperforms its counterparts across diverse datasets. Users often have to try multiple MSA tools to achieve optimal alignment results, which can be time-consuming and memory-intensive. While the overall accuracy of certain MSA results may be lower, there could be local regions with the highest alignment scores, prompting researchers to seek a tool capable of merging these locally optimal results from multiple initial alignments into a globally optimal alignment. In this study, we introduce Two Pointers Meta-Alignment (TPMA), a novel tool designed for the integration of nucleic acid sequence alignments. TPMA employs two pointers to partition the initial alignments into blocks containing identical sequence fragments. It selects blocks with the high sum of pairs (SP) scores to concatenate them into an alignment with an overall SP score superior to that of the initial alignments. Through tests on simulated and real datasets, the experimental results consistently demonstrate that TPMA outperforms M-Coffee in terms of aSP, Q, and total column (TC) scores across most datasets. Even in cases where TPMA's scores are comparable to M-Coffee, TPMA exhibits significantly lower running time and memory consumption. Furthermore, we comprehensively assessed all the MSA tools used in the experiments, considering accuracy, time, and memory consumption. We propose accurate and fast combination strategies for small and large datasets, which streamline the user tool selection process and facilitate large-scale dataset integration. The dataset and source code of TPMA are available on GitHub (https://github.com/malabz/TPMA). Author summary: Accurate multiple sequence alignment (MSA) is vital for comprehensive biological sequence analysis. 
However, as no single MSA tool consistently outperforms others across diverse datasets, researchers must invest significant time exploring multiple tools to identify the most suitable one for their specific dataset. To address this, researchers seek tools that can merge locally optimal results from diverse initial alignments into a globally optimal alignment. Our novel approach, Two Pointers Meta-Alignment (TPMA), employs a two-pointer to partition initial alignments into blocks, selecting those with the higher sum of pairs (SP) scores for integration into a globally optimal alignment. TPMA consistently outperforms M-Coffee, demonstrating superior aSP, Q, and total column (TC) scores, coupled with faster running times and lower memory consumption. We present comprehensive assessments of various MSA tools, proposing efficient combination strategies for diverse datasets. Our tool, TPMA, and associated resources are publicly available on GitHub (https://github.com/malabz/TPMA), offering a valuable contribution to the field of evolutionary biology and streamlining the selection process for users dealing with large-scale datasets. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF
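
Note on record 31: the TPMA abstract describes selecting, among candidate alignment blocks that contain identical sequence fragments, the block with the higher sum of pairs (SP) score. The sketch below shows a generic SP score and that selection step. The match, mismatch, and gap values are assumed for illustration, and the code is not the TPMA implementation.

```python
from itertools import combinations


def sum_of_pairs(block, match=1, mismatch=-1, gap=-2):
    """Sum of pairs score of an alignment block (a list of equal-length rows).

    ASSUMPTION: the scoring values are illustrative; gap-gap pairs score 0.
    """
    if len({len(row) for row in block}) > 1:
        raise ValueError("all rows in a block must have the same length")
    total = 0
    for column in zip(*block):
        for a, b in combinations(column, 2):
            if a == "-" and b == "-":
                continue
            if a == "-" or b == "-":
                total += gap
            elif a == b:
                total += match
            else:
                total += mismatch
    return total


def best_block(candidate_blocks):
    """Pick the candidate block with the highest SP score."""
    return max(candidate_blocks, key=sum_of_pairs)


# Two hypothetical alignments of the same fragments ("ACG" and "ACG").
block_from_tool_1 = ["A-CG", "AC-G"]
block_from_tool_2 = ["ACG", "ACG"]
print(best_block([block_from_tool_1, block_from_tool_2]))
```

Concatenating the winning block from each partition is what lets the merged alignment reach an overall SP score at least as high as any single input alignment under the same scoring.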

32. Effects of crystalline lens rise and anterior chamber parameters on vault after implantable collamer lens placement.

Author: Zou, Quan, Zhao, Sen, Cheng, Lei, Song, Chao, Yuan, Ping, and Zhu, Ran
Subjects: CRYSTALLINE lens, BLAND-Altman plot, ACOUSTIC microscopy, MULTIPLE regression analysis, RECEIVER operating characteristic curves, FACTOR analysis
Abstract: Background: To analyze vault effects of crystalline lens rise (CLR) and anterior chamber parameters (recorded by Pentacam) in highly myopic patients receiving implantable collamer lenses (ICLs), which may avoid subsequent complications such as glaucoma and cataract caused by the abnormal vault. Methods: We collected clinical data of 137 patients with highly myopic vision, who were all subsequent recipients of V4c ICLs between June 2020 and January 2021. Horizontal ciliary sulcus-to-sulcus diameter (hSTS) and CLR were measured by ultrasonic biomicroscopy (UBM), and a Pentacam anterior segment analyzer was used to measure horizontal white-to-white diameter (hWTW), anterior chamber depth (ACD), anterior chamber angle (ACA), anterior chamber volume (ACV), CLR, and postoperative vault (Year 1 and Month 1). The lens thickness (LT) was determined by optical biometry (IOL Master instrument). The predictive model was generated through multiple linear regression analyses of influential factors, such as hSTS, CLR, hWTW, ACD, ACA, ACV, ICL size, and LT. The predictive performance of the multivariate model on vault after ICL was assessed using the receiver operating characteristic (ROC) curve with area under the curve (AUC) as well as the point of tangency. Results: Average CLR assessed by UBM was lower than the average value obtained by Pentacam (0.561 vs. 0.683). Bland-Altman analysis showed a good consistency in the two measurement methods and substantial correlation (r = 0.316; P = 0.000). The ROC curve of Model 1 (postoperative Year 1) displayed an AUC of 0.847 (95% confidence interval [CI]: 74.19–95.27), with optimal threshold of 0.581 (sensitivity, 0.857; specificity, 0.724). In addition, respective values for Model 2 (postoperative Month 1) were 0.783 (95% CI: 64.94–91.64) and 0.522 (sensitivity, 0.917; specificity, 0.605). Conclusion: CLR and anterior chamber parameters are important determinants of postoperative vault after ICL placement. 
The multivariate regression model we constructed may serve in large part as a predictive gauge and help avoid postoperative complications. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

33. Deepstacked-AVPs: predicting antiviral peptides using tri-segment evolutionary profile and word embedding based multi-perspective features with deep stacking model.

Author: Akbar, Shahid, Raza, Ali, and Zou, Quan
Subjects: PEPTIDES, ANTIMICROBIAL peptides, VIRUS diseases, MACHINE learning, ANTIVIRAL agents
Abstract: Background: Viral infections have been the main health issue in the last decade. Antiviral peptides (AVPs) are a subclass of antimicrobial peptides (AMPs) with substantial potential to protect the human body against various viral diseases. However, there has been significant production of antiviral vaccines and medications. Recently, the development of AVPs as an antiviral agent suggests an effective way to treat virus-affected cells. Recently, the involvement of intelligent machine learning techniques for developing peptide-based therapeutic agents is becoming an increasing interest due to its significant outcomes. The existing wet-laboratory-based drugs are expensive, time-consuming, and cannot effectively perform in screening and predicting the targeted motif of antiviral peptides. Methods: In this paper, we proposed a novel computational model called Deepstacked-AVPs to discriminate AVPs accurately. The training sequences are numerically encoded using a novel Tri-segmentation-based position-specific scoring matrix (PSSM-TS) and word2vec-based semantic features. Composition/Transition/Distribution-Transition (CTDT) is also employed to represent the physicochemical properties based on structural features. Apart from these, the fused vector is formed using PSSM-TS features, semantic information, and CTDT descriptors to compensate for the limitations of single encoding methods. Information gain (IG) is applied to choose the optimal feature set. The selected features are trained using a stacked-ensemble classifier. Results: The proposed Deepstacked-AVPs model achieved a predictive accuracy of 96.60%, an area under the curve (AUC) of 0.98, and a precision-recall (PR) value of 0.97 using training samples. In the case of the independent samples, our model obtained an accuracy of 95.15%, an AUC of 0.97, and a PR value of 0.97. 
Conclusion: Our Deepstacked-AVPs model outperformed existing models with a ~ 4% and ~ 2% higher accuracy using training and independent samples, respectively. The reliability and efficacy of the proposed Deepstacked-AVPs model make it a valuable tool for scientists and may perform a beneficial role in pharmaceutical design and research academia. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

34. Diff-AMP: tailored designed antimicrobial peptide framework with all-in-one generation, identification, prediction and optimization.

Author: Wang, Rui, Wang, Tao, Zhuo, Linlin, Wei, Jinhang, Fu, Xiangzheng, Zou, Quan, and Yao, Xiaojun
Subjects: ANTIMICROBIAL peptides, INTERNET servers, CONVOLUTIONAL neural networks, PEPTIDE antibiotics, REINFORCEMENT learning, DEEP learning, DRUG toxicity
Abstract: Antimicrobial peptides (AMPs), short peptides with diverse functions, effectively target and combat various organisms. The widespread misuse of chemical antibiotics has led to increasing microbial resistance. Due to their low drug resistance and toxicity, AMPs are considered promising substitutes for traditional antibiotics. While existing deep learning technology enhances AMP generation, it also presents certain challenges. Firstly, AMP generation overlooks the complex interdependencies among amino acids. Secondly, current models fail to integrate crucial tasks like screening, attribute prediction and iterative optimization. Consequently, we develop an integrated deep learning framework, Diff-AMP, that automates AMP generation, identification, attribute prediction and iterative optimization. We innovatively integrate kinetic diffusion and attention mechanisms into the reinforcement learning framework for efficient AMP generation. Additionally, our prediction module incorporates pre-training and transfer learning strategies for precise AMP identification and screening. We employ a convolutional neural network for multi-attribute prediction and a reinforcement learning-based iterative optimization strategy to produce diverse AMPs. This framework automates molecule generation, screening, attribute prediction and optimization, thereby advancing AMP research. We have also deployed Diff-AMP on a web server, with code, data and server details available in the Data Availability section. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

35. FEOpti-ACVP: identification of novel anti-coronavirus peptide sequences based on feature engineering and optimization.

Author: Jiang, Jici, Pei, Hongdi, Li, Jiayu, Li, Mingxin, Zou, Quan, and Lv, Zhibin
Subjects: AMINO acid sequence, DEEP learning, MACHINE learning, ENGINEERING, FEATURE extraction, IDENTIFICATION
Abstract: Anti-coronavirus peptides (ACVPs) represent a relatively novel approach to inhibiting the adsorption and fusion of the virus with human cells. Several peptide-based inhibitors showed promise as potential therapeutic drug candidates. However, identifying such peptides in laboratory experiments is both costly and time consuming. Therefore, there is growing interest in using computational methods to predict ACVPs. Here, we describe a model for the prediction of ACVPs that is based on the combination of feature engineering (FE) optimization and deep representation learning. FEOpti-ACVP was pre-trained using two feature extraction frameworks. At the next step, several machine learning approaches were tested to construct the final algorithm. The final version of FEOpti-ACVP outperformed existing methods used for ACVP prediction and it has the potential to become a valuable tool in ACVP drug design. A user-friendly webserver of FEOpti-ACVP can be accessed at http://servers.aibiochem.net/soft/FEOpti-ACVP/. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

36. Identification, characterization and expression analysis of circRNA encoded by SARS-CoV-1 and SARS-CoV-2.

Author: Niu, Mengting, Wang, Chunyu, Chen, Yaojia, Zou, Quan, and Xu, Lei
Subjects: CIRCULAR RNA, SARS virus, SARS-CoV-2, VIRUS diseases, CORONAVIRUSES, FUNCTIONAL analysis
Abstract: Virus-encoded circular RNA (circRNA) participates in the immune response to viral infection, affects the human immune system, and can be used as a target for precision therapy and as a tumor biomarker. The coronaviruses SARS-CoV-1 and SARS-CoV-2 (SARS-CoV-1/2) that have emerged in recent years are highly contagious and have high mortality rates. Little is known about the circRNA encoded by SARS-CoV-1/2. Therefore, this study explores whether SARS-CoV-1/2 encodes circRNA and, if so, the characteristics and functions of that circRNA. Based on RNA-seq data of SARS-CoV-1 and SARS-CoV-2 infections, we used circRNA identification tools (circRNA_finder, find_circ and CIRI2) to identify circRNAs. The number of circRNAs encoded by SARS-CoV-1 and SARS-CoV-2 was identified as 151 and 470, respectively. It can be found that SARS-CoV-2 shows more prominent circRNA encoding ability than SARS-CoV-1. Expression analysis showed that only a few circRNAs encoded by SARS-CoV-1/2 showed high expression levels, and the positive strand produced more abundant circRNAs. Then, based on the identified SARS-CoV-1/2-encoded circRNAs, we performed circRNA identification and characterization using the previously developed CirRNAPL. Finally, target gene prediction and functional enrichment analysis were performed. It was found that viral circRNA is closely related to cancer and has a potential role in regulating host cell functions. This study examined the characteristics and functions of viral circRNA encoded by coronavirus SARS-CoV-1/2, providing a valuable resource for further research on the function and molecular mechanism of coronavirus circRNA. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

37. CD39 expression defines exhausted CD4+ T cells associated with poor survival and immune evasion in human gastric cancer.

Author: Duan, Zhen‐quan, Li, Yu‐xian, Qiu, Yuan, Shen, Yang, Wang, Ying, Zhang, Yuan‐yuan, Zhu, Bao‐hang, Yu, Xiao‐hong, Tan, Xue‐ling, Chen, Weisan, Zhuang, Yuan, Cheng, Ping, Zhang, Wei‐jun, Zou, Quan‐ming, Ma, Dai‐yuan, and Peng, Liu‐sheng
Subjects: T cells, T helper cells, REGULATORY T cells, STOMACH cancer, T-cell exhaustion
Abstract: Objectives: CD4+ T cell helper and regulatory function in human cancers has been well characterised. However, the definition of tumor‐infiltrating CD4+ T cell exhaustion and how it contributes to the immune response and disease progression in human gastric cancer (GC) remain largely unknown. Methods: A total of 128 GC patients were enrolled in the study. The expression of CD39 and PD‐1 on CD4+ T cells in the different samples was analysed by flow cytometry. GC‐infiltrating CD4+ T cell subpopulations based on CD39 expression were phenotypically and functionally assessed. The role of CD39 in the immune response of GC‐infiltrating T cells was investigated by inhibiting CD39 enzymatic activity. Results: In comparison with CD4+ T cells from the non‐tumor tissues, significantly more GC‐infiltrating CD4+ T cells expressed CD39. Most GC‐infiltrating CD39+CD4+ T cells exhibited CD45RA−CCR7− effector–memory phenotype expressing more exhaustion‐associated inhibitory molecules and transcription factors and produced less TNF‐α, IFN‐γ and cytolytic molecules than their CD39−CD4+ counterparts. Moreover, ex vivo inhibition of CD39 enzymatic activity enhanced their functional potential reflected by TNF‐α and IFN‐γ production. Finally, increased percentages of GC‐infiltrating CD39+CD4+ T cells were positively associated with disease progression and patients' poorer overall survival. Conclusion: Our study demonstrates that CD39 expression defines GC‐infiltrating CD4+ T cell exhaustion and their immunosuppressive function. Targeting CD39 may be a promising therapeutic strategy for treating GC patients. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

38. CircRNA identification and feature interpretability analysis.

Author: Niu, Mengting, Wang, Chunyu, Chen, Yaojia, Zou, Quan, Qi, Ren, and Xu, Lei
Subjects: CIRCULAR RNA, LINCRNA
Abstract: Background: Circular RNAs (circRNAs) can regulate microRNA activity and are related to various diseases, such as cancer. Functional research on circRNAs is the focus of scientific research. Accurate identification of circRNAs is important for gaining insight into their functions. Although several circRNA prediction models have been developed, their prediction accuracy is still unsatisfactory. Therefore, providing a more accurate computational framework to predict circRNAs and analyse their looping characteristics is crucial for systematic annotation. Results: We developed a novel framework, CircDC, for classifying circRNAs from other lncRNAs. CircDC uses four different feature encoding schemes and adopts a multilayer convolutional neural network and bidirectional long short-term memory network to learn high-order feature representation and make circRNA predictions. The results demonstrate that the proposed CircDC model is more accurate than existing models. In addition, an interpretable analysis of the features affecting the model is performed, and the computational framework is applied to the extended application of circRNA identification. Conclusions: CircDC is suitable for the prediction of circRNA. The identification of circRNA helps to understand and delve into the related biological processes and functions. Feature importance analysis increases model interpretability and uncovers significant biological properties. The relevant code and data in this article can be accessed for free at https://github.com/nmt315320/CircDC.git. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

39. A computational model of circRNA-associated diseases based on a graph neural network: prediction and case studies for follow-up experimental validation.

Author: Niu, Mengting, Wang, Chunyu, Zhang, Zhanguo, and Zou, Quan
Subjects: INTERNET servers, CIRCULAR RNA, DEEP learning, STOMACH cancer, HEPATOCELLULAR carcinoma, FORECASTING, MULTIOMICS
Abstract: Background: Circular RNAs (circRNAs) have been confirmed to play a vital role in the occurrence and development of diseases. Exploring the relationship between circRNAs and diseases is of far-reaching significance for studying etiopathogenesis and treating diseases. To this end, based on the graph Markov neural network algorithm (GMNN) constructed in our previous work GMNN2CD, we further considered the multisource biological data that affects the association between circRNA and disease and developed an updated web server CircDA and based on the human hepatocellular carcinoma (HCC) tissue data to verify the prediction results of CircDA. Results: CircDA is built on a Tumarkov-based deep learning framework. The algorithm regards biomolecules as nodes and the interactions between molecules as edges, reasonably abstracts multiomics data, and models them as a heterogeneous biomolecular association network, which can reflect the complex relationship between different biomolecules. Case studies using literature data from HCC, cervical, and gastric cancers demonstrate that the CircDA predictor can identify missing associations between known circRNAs and diseases, and using the quantitative real-time PCR (RT-qPCR) experiment of HCC in human tissue samples, it was found that five circRNAs were significantly differentially expressed, which proved that CircDA can predict diseases related to new circRNAs. Conclusions: This efficient computational prediction and case analysis with sufficient feedback allows us to identify circRNA-associated diseases and disease-associated circRNAs. Our work provides a method to predict circRNA-associated diseases and can provide guidance for the association of diseases with certain circRNAs. For ease of use, an online prediction server (http://server.malab.cn/CircDA) is provided, and the code is open-sourced (https://github.com/nmt315320/CircDA.git) for the convenience of algorithm improvement. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

40. AutoEdge-CCP: A novel approach for predicting cancer-associated circRNAs and drugs based on automated edge embedding.

Author: Chen, Yaojia, Wang, Jiacheng, Wang, Chunyu, and Zou, Quan
Subjects: DRUG target, DRUGS, ANTINEOPLASTIC agents, MOLECULAR interactions, CANCER treatment, NOMOGRAPHY (Mathematics)
Abstract: The unique expression patterns of circRNAs linked to the advancement and prognosis of cancer underscore their considerable potential as valuable biomarkers. Repurposing existing drugs for new indications can significantly reduce the cost of cancer treatment. Computational prediction of circRNA-cancer and drug-cancer relationships is crucial for precise cancer therapy. However, prior computational methods fail to analyze the interaction between circRNAs, drugs, and cancer at the systematic level. It is essential to propose a method that uncovers more valuable information for achieving cancer-centered multi-association prediction. In this paper, we present a novel computational method, AutoEdge-CCP, to unveil cancer-associated circRNAs and drugs. We abstract the complex relationships between circRNAs, drugs, and cancer into a multi-source heterogeneous network. In this network, each molecule is represented by two types of information: one is the intrinsic attribute information of molecular features, and the other is the link information explicitly modeled by autoGNN, which searches information from both the intra-layer and inter-layer of a message passing neural network. The significant performance on multi-scenario applications and case studies establishes AutoEdge-CCP as a potent and promising association prediction tool. Author summary: CircRNAs serve as crucial biomarkers and drug targets in cancer therapy. Predicting cancer-associated circRNAs and drugs contributes to uncovering intricate molecular mechanisms driving tumorigenesis, thus offering novel insights into cancer diagnosis, treatment, and research. However, prevailing predictive methods often neglect the comprehensive interactions within circRNAs, drugs, and cancer, leading to an incomplete understanding of their complex interplay. In response, we introduce AutoEdge-CCP, a framework that models circRNA-cancer-drug interactions within a multi-source heterogeneous network. 
Each molecule combines intrinsic attribute information describing molecular features with interaction information derived through autoGNN, revealing pivotal circRNAs and drugs associated with cancer. Experimental results across multi-scenario attest to AutoEdge-CCP's superior performance compared to competing methods, particularly in predicting novel circRNAs and drugs associated with cancer. Additionally, visualization of edge embeddings and case studies provide interpretable insights into the prediction outcomes. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

41. RAVAR: a curated repository for rare variant–trait associations.

Author: Cao, Chen, Shao, Mengting, Zuo, Chunman, Kwok, Devin, Liu, Lin, Ge, Yuli, Zhang, Zilong, Cui, Feifei, Chen, Mingshuai, Fan, Rui, Ding, Yijie, Jiang, Hangjin, Wang, Guishen, and Zou, Quan
Published: 2024
Full Text: View/download PDF

42. Joint deep autoencoder and subgraph augmentation for inferring microbial responses to drugs.

Author: Zhou, Zhecheng, Zhuo, Linlin, Fu, Xiangzheng, and Zou, Quan
Subjects: GRAPH algorithms, ARTIFICIAL intelligence, VIDEO coding, DRUGS, SUBGRAPHS
Abstract: Exploring microbial stress responses to drugs is crucial for the advancement of new therapeutic methods. While current artificial intelligence methodologies have expedited our understanding of potential microbial responses to drugs, the models are constrained by the imprecise representation of microbes and drugs. To this end, we combine deep autoencoder and subgraph augmentation technology for the first time to propose a model called JDASA-MRD, which can identify the potential indistinguishable responses of microbes to drugs. In the JDASA-MRD model, we begin by feeding the established similarity matrices of microbes and drugs into the deep autoencoder, enabling the extraction of robust initial features of both microbes and drugs. Subsequently, we employ the MinHash and HyperLogLog algorithms to account for intersection and cardinality data between microbe and drug subgraphs, thus deeply extracting the multi-hop neighborhood information of nodes. Finally, by integrating the initial node features with subgraph topological information, we leverage graph neural network technology to predict the microbes' responses to drugs, offering a more effective solution to the 'over-smoothing' challenge. Comparative analyses on multiple public datasets confirm that the JDASA-MRD model's performance surpasses that of current state-of-the-art models. This research aims to offer a more profound insight into the adaptability of microbes to drugs and to furnish pivotal guidance for drug treatment strategies. Our data and code are publicly available at: https://github.com/ZZCrazy00/JDASA-MRD. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

43. FMAlign2: a novel fast multiple nucleotide sequence alignment method for ultralong datasets.

Author: Zhang, Pinglu, Liu, Huan, Wei, Yanming, Zhai, Yixiao, Tian, Qinzhong, and Zou, Quan
Subjects: SEQUENCE alignment, NUCLEOTIDE sequence, SOURCE code, RESEARCH personnel, BIOINFORMATICS
Abstract: Motivation: In bioinformatics, multiple sequence alignment (MSA) is a crucial task. However, conventional methods often struggle with aligning ultralong sequences. To address this issue, researchers have designed MSA methods rooted in a vertical division strategy, which segments sequence data for parallel alignment. A prime example of this approach is FMAlign, which utilizes the FM-index to extract common seeds and segment the sequences accordingly. Results: FMAlign2 leverages the suffix array to identify maximal exact matches, redefining the approach of FMAlign from searching for global chains to partial chains. By using a vertical division strategy, a large-scale problem is deconstructed into manageable tasks, enabling parallel execution of subMSA. Furthermore, sequence-profile alignment and refinement are incorporated to concatenate subsets, yielding the final result seamlessly. Compared to FMAlign, FMAlign2 markedly augments the segmentation of sequences and significantly reduces the time while maintaining accuracy, especially on ultralong datasets. Importantly, FMAlign2 enhances existing MSA methods by conferring the capability to handle sequences reaching billions in length within an acceptable time frame. Availability and implementation: Source code and datasets are available at https://github.com/malabz/FMAlign2 and https://zenodo.org/records/10435770. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

44. CircSI-SSL: circRNA-binding site identification based on self-supervised learning.

Author: Cao, Chao, Wang, Chunyu, Yang, Shuhong, and Zou, Quan
Subjects: SUPERVISED learning, CIRCULAR RNA, PROTEOMICS, BINDING sites, SOURCE code, CARRIER proteins
Abstract: Motivation: In recent years, circular RNAs (circRNAs), a particular form of RNA with a closed-loop structure, have attracted widespread attention due to their physiological significance (they can directly bind proteins), leading to the development of numerous protein site identification algorithms. Unfortunately, these studies are supervised and require a large number of labeled training samples to produce superior performance, yet sample labels are difficult to obtain because acquiring them requires a large number of biological experiments. Results: To reduce the large number of labels required for training in the circRNA-binding site prediction task, a self-supervised learning binding site identification algorithm named CircSI-SSL is proposed in this article. According to the survey, this is unprecedented in the research field. Specifically, CircSI-SSL initially combines multiple feature coding schemes and employs RNA_Transformer for cross-view sequence prediction (self-supervised task) to learn mutual information from the multi-view data, and is then fine-tuned with only a few sample labels. Comprehensive experiments on six widely used circRNA datasets indicate that our CircSI-SSL algorithm achieves excellent performance in comparison to previous algorithms, even in the extreme case where the ratio of training data to test data is 1:9. In addition, the transplantation experiment on six linRNA datasets without network modification and hyperparameter adjustment shows that CircSI-SSL has good scalability. In summary, the prediction algorithm based on self-supervised learning proposed in this article is expected to replace previous supervised algorithms and has more extensive application value. Availability and implementation: The source code and data are available at https://github.com/cc646201081/CircSI-SSL. [ABSTRACT FROM AUTHOR]
Published: 2024
Full Text: View/download PDF

45. Robust generalised predictive position control for chain‐type rotary shell magazine with disturbance observer.

Author: Zhou, Guangzu, Qian, Linfang, Zou, Quan, Sun, Le, and Wei, Kai
Published: 2024
Full Text: View/download PDF

45 results on '"Zou, Quan"'