10 results on '"Zou, Quan"'
Search Results
2. MVST: Identifying spatial domains of spatial transcriptomes from multiple views using multi-view graph convolutional networks.
- Author
-
Duan, Hao, Zhang, Qingchen, Cui, Feifei, Zou, Quan, and Zhang, Zilong
- Subjects
MORPHOLOGY, GENE expression, TISSUES, STRUCTURAL frames, TRANSCRIPTOMES
- Abstract
Spatial transcriptome technology can parse transcriptomic data at the spatial level to detect high-throughput gene expression and preserve information regarding the spatial structure of tissues. Identifying spatial domains, that is, identifying regions with similarities in gene expression and histology, is the most basic and critical aspect of spatial transcriptome data analysis. Most current methods identify spatial domains only through a single view, which may obscure certain important information and thus fail to make full use of the information embedded in spatial transcriptome data. Therefore, we propose an unsupervised clustering framework based on multiview graph convolutional networks (MVST) to achieve accurate spatial domain recognition by learning graph embedding features of neighborhood graphs constructed from gene expression information, spatial location information, and histopathological image information through multiview graph convolutional networks. By exploring spatial transcriptomes from multiple views, MVST enables data from all parts of the spatial transcriptome to be comprehensively and fully utilized to obtain more accurate spatial expression patterns. We verified the effectiveness of MVST on real spatial transcriptome datasets, the robustness of MVST on some simulated datasets, and the reasonableness of the framework structure of MVST in ablation experiments, and from the experimental results, it is clear that MVST can achieve a more accurate spatial domain identification compared with the current more advanced methods. In conclusion, MVST is a powerful tool for spatial transcriptome research with improved spatial domain recognition. Author summary: Spatial transcriptome sequencing can not only reveal the mechanisms of disease development, but can also be used to explore the structure of biological tissues, and it is widely used in a variety of fields such as developmental biology and oncology.
To utilize spatial transcriptome data to understand how different cells work together to perform complex functions, spatially relevant groups of cells must be identified, which leads to the task of dissecting tissues into spatial domains. However, most of the currently available spatial domain identification tools have yet to fully utilize the information embedded in spatial transcriptomic data, and their identification remains to be improved for the fine-grained exploration of tissue structure. Here, we developed a tool for accurate spatial domain identification, and our tool makes full use of the various types of spatial transcriptome data from multiple perspectives, identifying the real grouping information in spatial transcriptome data as much as possible. Our results show that our spatial domain identification tool can identify spatial domains more accurately and provide effective help for biological organization structure exploration. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
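The MVST abstract above mentions neighborhood graphs built from gene expression and spatial location. As a rough illustration only, the sketch below builds a k-nearest-neighbour graph for each view from toy data. The function name `knn_graph`, the choice of Euclidean distance, the value of `k`, and the toy values are all assumptions made for this example; the abstract does not say how MVST constructs its graphs.

```python
import math


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest neighbours.

    Hypothetical helper: Euclidean distance and a fixed k are assumptions,
    not details taken from the MVST abstract.
    """
    neighbours = []
    for i, p in enumerate(points):
        # Sort all other points by distance to p (ties broken by index).
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        neighbours.append([j for _, j in dists[:k]])
    return neighbours


# Toy data: four spots, each with (x, y) coordinates and a 3-gene expression vector.
coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
expression = [(1.0, 0.0, 2.0), (1.1, 0.1, 2.0), (4.0, 3.0, 0.0), (0.8, 0.0, 2.3)]

spatial_graph = knn_graph(coords, k=2)         # one "view": physical proximity
expression_graph = knn_graph(expression, k=2)  # another "view": expression similarity

print(spatial_graph[0], expression_graph[0])
```

In this toy example, spot 0 is physically closest to spots 1 and 2 but most similar in expression to spots 1 and 3, which is the kind of disagreement between views that a multi-view method is meant to reconcile.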
3. Overcoming CRISPR-Cas9 off-target prediction hurdles: A novel approach with ESB rebalancing strategy and CRISPR-MCA model.
- Author
-
Yang, Yanpeng, Zheng, Yanyi, Zou, Quan, Li, Jian, and Feng, Hailin
- Subjects
ARTIFICIAL neural networks, GENOME editing, CRISPRS, DEEP learning, ENCODING
- Abstract
The off-target activities within the CRISPR-Cas9 system remain a formidable barrier to its broader application and development. Recent advancements have highlighted the potential of deep learning models in predicting these off-target effects, yet they encounter significant hurdles including imbalances within datasets and the intricacies associated with encoding schemes and model architectures. To surmount these challenges, our study innovatively introduces an Efficiency and Specificity-Based (ESB) class rebalancing strategy, specifically devised for datasets featuring mismatches-only off-target instances, marking a pioneering approach in this realm. Furthermore, through a meticulous evaluation of various One-hot encoding schemes alongside numerous hybrid neural network models, we discern that encoding and models of moderate complexity ideally balance performance and efficiency. On this foundation, we advance a novel hybrid model, the CRISPR-MCA, which capitalizes on multi-feature extraction to enhance predictive accuracy. The empirical results affirm that the ESB class rebalancing strategy surpasses five conventional methods in addressing extreme dataset imbalances, demonstrating superior efficacy and broader applicability across diverse models. Notably, the CRISPR-MCA model excels in off-target effect prediction across four distinct mismatches-only datasets and significantly outperforms contemporary state-of-the-art models in datasets comprising both mismatches and indels. In summation, the CRISPR-MCA model, coupled with the ESB rebalancing strategy, offers profound insights and a robust framework for future explorations in this field. Author summary: In the field of gene editing, the application of deep learning technologies holds significant promise for predicting off-target effects in the CRISPR-Cas9 system.
Nevertheless, one of the primary challenges encountered is the extreme imbalance among classes within the off-target datasets, which severely hampers the predictive accuracy for certain classes. Furthermore, as an array of sequence encoding methods continue to evolve, there has been a corresponding increase in model complexity. Addressing these issues, we introduce a novel Efficiency and Specificity-Based (ESB) class rebalancing strategy designed to mitigate the impact of class imbalance. Additionally, we assess the influence of six encoding schemes and four distinct architectural approaches on the prediction performance, employing four benchmark datasets for validation. Building upon these insights, we have developed a new hybrid model, termed CRISPR-MCA. Our experimental results demonstrate that the ESB strategy significantly surpasses the performance of existing baseline methods across multiple models. Moreover, the CRISPR-MCA model exhibits robust performance on two distinct types of datasets, affirming its effectiveness in enhancing the accuracy of deep learning predictions for off-target activities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
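The CRISPR-MCA abstract above refers to one-hot encoding schemes for off-target prediction on mismatches-only data. The sketch below shows one simple way such a guide/target pair could be encoded: the one-hot vectors of the two bases at each position are concatenated into an 8-element vector. The function names and this particular 8-channel layout are assumptions for illustration; they are not one of the six schemes evaluated in the paper, which the abstract does not describe.

```python
BASES = "ACGT"


def one_hot(base):
    """One-hot encode a single nucleotide over the alphabet A, C, G, T."""
    if base not in BASES:
        raise ValueError(f"unexpected base: {base!r}")
    return [1 if base == b else 0 for b in BASES]


def encode_pair(guide, target):
    """Encode an aligned guide/target pair position by position.

    Hypothetical scheme: each position becomes the guide one-hot followed
    by the target one-hot (8 values). Mismatches-only data is assumed, so
    both sequences must have the same length.
    """
    if len(guide) != len(target):
        raise ValueError("guide and target must have equal length")
    return [one_hot(g) + one_hot(t) for g, t in zip(guide, target)]


def mismatch_positions(guide, target):
    """Return the 0-based positions where guide and target differ."""
    return [i for i, (g, t) in enumerate(zip(guide, target)) if g != t]


encoded = encode_pair("ACGT", "ACCT")
print(len(encoded), len(encoded[0]), mismatch_positions("ACGT", "ACCT"))
```

A matrix like `encoded` (sequence length by 8) is the sort of input a convolutional or recurrent model would consume; richer schemes add channels, which is the complexity trade-off the abstract discusses.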
4. ECD-CDGI: An efficient energy-constrained diffusion model for cancer driver gene identification.
- Author
-
Wang, Tao, Zhuo, Linlin, Chen, Yifan, Fu, Xiangzheng, Zeng, Xiangxiang, and Zou, Quan
- Subjects
CANCER genes, TRANSFORMER models, MEASUREMENT errors, GRAPH neural networks, INDIVIDUALIZED medicine
- Abstract
The identification of cancer driver genes (CDGs) poses challenges due to the intricate interdependencies among genes and the influence of measurement errors and noise. We propose a novel energy-constrained diffusion (ECD)-based model for identifying CDGs, termed ECD-CDGI. This model is the first to design an ECD-Attention encoder by combining the ECD technique with an attention mechanism. ECD-Attention encoder excels at generating robust gene representations that reveal the complex interdependencies among genes while reducing the impact of data noise. We concatenate topological embedding extracted from gene-gene networks through graph transformers to these gene representations. We conduct extensive experiments across three testing scenarios. Extensive experiments show that the ECD-CDGI model possesses the ability to not only be proficient in identifying known CDGs but also efficiently uncover unknown potential CDGs. Furthermore, compared to the GNN-based approach, the ECD-CDGI model exhibits fewer constraints by existing gene-gene networks, thereby enhancing its capability to identify CDGs. Additionally, ECD-CDGI is open-source and freely available. We have also launched the model as a complimentary online tool specifically crafted to expedite research efforts focused on CDGs identification. Author summary: Cancer has become a major disease threatening human life and health. Cancer usually originates from abnormal gene activities, such as mutations and copy number variations. Mutations in cancer driver genes are crucial for the selective growth of tumor cells. Identifying cancer driver genes is crucial in cancer-related research and treatment strategies, as it helps understand cancer occurrence and development. However, the complex gene-gene interactions, measurement errors, and the prevalence of unlabeled data significantly complicate the identification of these driver genes. 
We developed a new method that integrates an energy-constrained diffusion mechanism with an attention mechanism to uncover implicit gene dependencies in biomolecular networks and generate robust gene representations. Extensive experiments demonstrated that our model accurately identifies known cancer driver genes and effectively discovers potential ones. Furthermore, we analyzed and predicted patient-specific mutated genes, enhancing our understanding of their pathogenesis and advancing precision medicine. In summary, our method offers a promising tool for advancing the identification of cancer driver genes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
5. scRNMF: An imputation method for single-cell RNA-seq data by robust and non-negative matrix factorization.
- Author
-
Qian, Yuqing, Zou, Quan, Zhao, Mengyuan, Liu, Yi, Guo, Fei, and Ding, Yijie
- Subjects
NONNEGATIVE matrices, MATRIX decomposition, GENE expression, RNA sequencing, DATA recovery
- Abstract
Single-cell RNA sequencing (scRNA-seq) has emerged as a powerful tool in genomics research, enabling the analysis of gene expression at the individual cell level. However, scRNA-seq data often suffer from a high rate of dropouts, where certain genes fail to be detected in specific cells due to technical limitations. This missing data can introduce biases and hinder downstream analysis. To overcome this challenge, the development of effective imputation methods has become crucial in the field of scRNA-seq data analysis. Here, we propose an imputation method based on robust and non-negative matrix factorization (scRNMF). Unlike other matrix factorization algorithms, scRNMF integrates two loss functions: L2 loss and C-loss. The L2 loss function is highly sensitive to outliers, which can introduce substantial errors. We utilize the C-loss function when dealing with zero values in the raw data. The primary advantage of the C-loss function is that it imposes a smaller punishment for larger errors, which results in more robust factorization when handling outliers. Various datasets of different sizes and zero rates are used to evaluate the performance of scRNMF against other state-of-the-art methods. Our method demonstrates its power and stability as a tool for imputation of scRNA-seq data. Author summary: It is still difficult to analyze scRNA-seq data because a significant portion of expressed genes have zeros. Gene expression levels can be restored through the imputation of scRNA-seq data, facilitating downstream analysis. To overcome this challenge, we propose an imputation method based on robust and non-negative matrix factorization (scRNMF). Unlike other matrix factorization algorithms, scRNMF integrates two loss functions: L2 loss and C-loss. Through the use of several simulated and real datasets, we perform a comprehensive evaluation of scRNMF against existing methods.
scRNMF can enhance various aspects of downstream analysis, including gene expression data recovery, cell clustering analysis, gene differential expression analysis, and cellular trajectory reconstruction. The results of our study demonstrate that scRNMF is a powerful tool that can improve the accuracy of single-cell data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
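The scRNMF abstract above contrasts the L2 loss, which is sensitive to outliers, with a C-loss that penalizes large errors less. The sketch below compares the two on a few error values, assuming the common correntropy-induced form 1 - exp(-e^2 / (2 * sigma^2)) for the C-loss. That form, the `sigma` parameter, and the function names are assumptions; the abstract does not give the exact definition or scaling used in scRNMF.

```python
import math


def l2_loss(error):
    """Squared error: grows without bound as the error grows."""
    return error ** 2


def c_loss(error, sigma=1.0):
    """Correntropy-induced loss (assumed form): bounded above by 1.

    sigma is a hypothetical bandwidth parameter controlling how quickly
    the loss saturates.
    """
    return 1.0 - math.exp(-(error ** 2) / (2.0 * sigma ** 2))


for e in (0.0, 1.0, 3.0, 10.0):
    print(f"error={e:5.1f}  L2={l2_loss(e):7.1f}  C-loss={c_loss(e):.4f}")
```

Because the assumed C-loss saturates, a single extreme residual contributes at most 1 to the objective, whereas under L2 it contributes the square of its size. This is one way to read the abstract's claim that the C-loss yields a more robust factorization.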
6. AutoEdge-CCP: A novel approach for predicting cancer-associated circRNAs and drugs based on automated edge embedding
- Author
-
Chen, Yaojia, primary, Wang, Jiacheng, additional, Wang, Chunyu, additional, and Zou, Quan, additional
- Published
- 2024
- Full Text
- View/download PDF
7. PCB: A pseudotemporal causality-based Bayesian approach to identify EMT-associated regulatory relationships of AS events and RBPs during breast cancer progression
- Author
-
Sun, Liangjie, primary, Qiu, Yushan, additional, Ching, Wai-Ki, additional, Zhao, Pu, additional, and Zou, Quan, additional
- Published
- 2023
- Full Text
- View/download PDF
8. A multi-label learning model for predicting drug-induced pathology in multi-organ based on toxicogenomics data
- Author
-
Su, Ran, primary, Yang, Haitang, additional, Wei, Leyi, additional, Chen, Siqi, additional, and Zou, Quan, additional
- Published
- 2022
- Full Text
- View/download PDF
9. Recall DNA methylation levels at low coverage sites using a CNN model in WGBS.
- Author
-
Luo, Ximei, Wang, Yansu, Zou, Quan, and Xu, Lei
- Subjects
DNA methylation, REGULATOR genes, GENETIC regulation, DEEP learning, METHYLATION, METHYLGUANINE
- Abstract
DNA methylation is an important regulator of gene transcription. WGBS is the gold-standard approach for base-pair resolution quantification of DNA methylation. It requires high sequencing depth. Many CpG sites have insufficient coverage in the WGBS data, resulting in inaccurate DNA methylation levels of individual sites. Many state-of-the-art computational methods have been proposed to predict the missing values. However, many methods required either other omics datasets or other cross-sample data. And most of them only predicted the state of DNA methylation. In this study, we proposed the RcWGBS, which can impute the missing (or low coverage) values from the DNA methylation levels on the adjacent sides. Deep learning techniques were employed for the accurate prediction. The WGBS datasets of H1-hESC and GM12878 were down-sampled. The average difference between the DNA methylation level at 12× depth predicted by RcWGBS and that at >50× depth in the H1-hESC and GM12878 cells is less than 0.03 and 0.01, respectively. RcWGBS performed better than METHimpute even though the sequencing depth was as low as 12×. Our work would help to process methylation data of low sequencing depth. It is beneficial for researchers to save sequencing costs and improve data utilization through computational methods. Author summary: DNA methylation has a major impact on gene regulation. WGBS is the gold standard for investigating DNA methylation. The DNA methylation levels of the sites with low coverage are often not accurate in WGBS datasets. Therefore, we proposed a method based on the CNN model to perform DNA methylation level interpolation for specific sites and named this method RcWGBS. RcWGBS did not rely on other omics data or other cross-sample data. It only used the sites with sufficient coverage contained in the target WGBS dataset for model training to obtain parameters. Then, the trained model can be used to predict the DNA methylation level of sites with low coverage.
Our analyses showed that RcWGBS could recalibrate the methylation level of some CpGs with insufficient coverage. It is suggested that our research could benefit the WGBS datasets with insufficient sequencing coverage. RcWGBS is implemented as an R package. It is efficient and convenient and does not need other WGBS or omics data. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
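The RcWGBS abstract above describes predicting methylation levels at low-coverage sites from adjacent sites. The sketch below illustrates only the data preparation such an approach implies: computing per-site methylation levels from read counts, flagging low-coverage sites, and collecting the levels of flanking sites as candidate model inputs. The function names, the `min_depth` threshold, the `flank` window, and the toy counts are all assumptions; this is not the RcWGBS R package or its CNN.

```python
def methylation_level(methylated_reads, total_reads):
    """Fraction of reads supporting methylation, or None if the site is uncovered."""
    if total_reads == 0:
        return None
    return methylated_reads / total_reads


def low_coverage_sites(coverage, min_depth):
    """Indices of sites whose read depth is below min_depth (hypothetical threshold)."""
    return [i for i, depth in enumerate(coverage) if depth < min_depth]


def neighbour_levels(levels, index, flank):
    """Levels of up to `flank` sites on each side of `index`, excluding the site itself."""
    start = max(0, index - flank)
    return levels[start:index] + levels[index + 1:index + 1 + flank]


# Toy data: methylated and total read counts at four consecutive CpG sites.
methylated = [45, 2, 40, 50]
coverage = [50, 3, 50, 55]

levels = [methylation_level(m, c) for m, c in zip(methylated, coverage)]
to_impute = low_coverage_sites(coverage, min_depth=10)
features = {i: neighbour_levels(levels, i, flank=1) for i in to_impute}

print(to_impute, features)
```

Here site 1 has only 3 reads, so its observed level of about 0.67 is unreliable, while its well-covered neighbours sit at 0.9 and 0.8. A trained model would take such neighbour values as input to estimate the site's level.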
10. CRBPDL: Identification of circRNA-RBP interaction sites using an ensemble neural network approach
- Author
-
Niu, Mengting, primary, Zou, Quan, additional, and Lin, Chen, additional
- Published
- 2022
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library