43 results on '"Fang-Xiang Wu"'
Search Results
2. Biomarker Identification via a Factorization Machine-based Neural Network with Binary Pairwise Encoding
- Author
Yulian Ding, Xiujuan Lei, Bo Liao, and Fang-Xiang Wu
- Subjects
Applied Mathematics; Genetics; Biotechnology
- Published
- 2023
- Full Text
- View/download PDF
3. Temporal-Spatial Analysis of the Essentiality of Hub Proteins in Protein-Protein Interaction Networks
- Author
Xiangmao Meng, Wenkai Li, Ju Xiang, Hayat Dino Bedru, Wenkang Wang, Fang-Xiang Wu, and Min Li
- Subjects
Computer Networks and Communications; Control and Systems Engineering; Computer Science Applications
- Published
- 2022
- Full Text
- View/download PDF
4. A Dual Ranking Algorithm Based on the Multiplex Network for Heterogeneous Complex Disease Analysis
- Author
Fang-Xiang Wu, Ju Xiang, Min Li, and Xingyi Li
- Subjects
Computer science; business.industry; Applied Mathematics; Aggregate (data warehouse); Machine learning; computer.software_genre; Network topology; Medical research; Data type; Dual (category theory); Ranking; Genetics; Multiplex; Artificial intelligence; business; computer; Algorithms; Biomarkers; Biotechnology; Interpretability
- Abstract
Identifying biomarkers of heterogeneous complex diseases has long been a focus of medical research. In previous studies, powerful network propagation methods have been applied to find marker genes related to specific diseases, but existing methods are mostly based on a single network, and so may be strongly affected by the incompleteness of that network and by the neglect of a large amount of information about physical and functional interactions between biological components. Other methods that directly integrate multiple types of interactions into an aggregate network run the risk that different types of data conflict with each other and that the characteristics and topology of each individual network are lost. Meanwhile, biomarkers used in clinical trials should be few in number and have strong discriminative ability. In this study, we developed a multiplex network-based dual ranking framework (DualRank) for heterogeneous complex disease analysis. We applied the proposed method to heterogeneous complex diseases for diagnosis, prognosis, and classification. The results showed that DualRank outperformed competing methods and could identify a small number of biomarkers with strong prediction performance (average AUC = 0.818) and biological interpretability.
- Published
- 2022
- Full Text
- View/download PDF
5. Predicting Drug-Drug Interactions Based on Integrated Similarity and Semi-Supervised Learning
- Author
Zhang Yayan, Yi Pan, Fang-Xiang Wu, Jianxin Wang, Cheng Yan, and Guihua Duan
- Subjects
Drug; Computer science; media_common.quotation_subject; 0206 medical engineering; 02 engineering and technology; Semi-supervised learning; Machine learning; computer.software_genre; Cross-validation; Genetics; Humans; Drug Interactions; Drug reaction; Least-Squares Analysis; media_common; business.industry; Applied Mathematics; Cosine similarity; Pharmaceutical Preparations; Drug development; Learning methods; Supervised Machine Learning; Artificial intelligence; business; Classifier (UML); computer; Algorithms; 020602 bioinformatics; Biotechnology
- Abstract
A drug-drug interaction (DDI) is defined as an association between two drugs in which the pharmacological effects of one drug are influenced by the other. Positive DDIs can usually improve therapeutic effects for patients, but negative DDIs are a major cause of adverse drug reactions and can even result in the withdrawal of a drug from the market or in patient death. Therefore, identifying DDIs has become a key component of drug development and disease treatment. In this study, we propose a novel method to predict DDIs based on integrated similarity and semi-supervised learning (DDI-IS-SL). DDI-IS-SL integrates drug chemical, biological, and phenotype data to calculate the feature similarity of drugs with the cosine similarity method. The Gaussian Interaction Profile kernel similarity of drugs is also calculated from known DDIs. A semi-supervised learning method (the Regularized Least Squares classifier) is used to calculate the interaction possibility scores of drug-drug pairs. Under 5-fold cross validation, 10-fold cross validation, and de novo drug validation, DDI-IS-SL achieves better prediction performance than the compared methods. In addition, the average computation time of DDI-IS-SL is shorter than that of the compared methods. Finally, case studies further demonstrate the performance of DDI-IS-SL in practical applications.
- Published
- 2022
- Full Text
- View/download PDF
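The two similarity measures named in the DDI-IS-SL abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy matrix, and the `gamma_prime` parameter are hypothetical, and the bandwidth normalization (dividing by the mean squared norm of the interaction profiles) is the commonly used form of the Gaussian Interaction Profile kernel, assumed here because the abstract does not state it. The Regularized Least Squares step is omitted.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def gip_kernel(interactions, gamma_prime=1.0):
    """Gaussian Interaction Profile kernel over the rows of a 0/1 matrix.

    K[i][j] = exp(-gamma * ||row_i - row_j||^2), where gamma is gamma_prime
    divided by the mean squared norm of the rows (an assumed, commonly used
    normalization; not taken from the abstract).
    """
    n = len(interactions)
    mean_sq_norm = sum(sum(x * x for x in row) for row in interactions) / n
    gamma = gamma_prime / mean_sq_norm if mean_sq_norm > 0 else gamma_prime
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum(
                (x - y) ** 2 for x, y in zip(interactions[i], interactions[j])
            )
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy data: three hypothetical drugs and their known interactions.
known_ddi = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(known_ddi)
```

In this toy matrix the first and third drugs have identical interaction profiles, so their kernel similarity is the maximum value, while each differs from the second drug in every position and so scores much lower.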
6. A Deep Neural Network for Cervical Cell Classification Based on Cytology Images
- Author
Ming Fang, Xiujuan Lei, Bo Liao, and Fang-Xiang Wu
- Subjects
General Computer Science; General Engineering; General Materials Science; Electrical and Electronic Engineering
- Published
- 2022
- Full Text
- View/download PDF
7. Predicting miRNA-Disease Associations Based On Multi-View Variational Graph Auto-Encoder With Matrix Factorization
- Author
Xiujuan Lei, Yulian Ding, Fang-Xiang Wu, and Bo Liao
- Subjects
020205 medical informatics; Computer science; Association (object-oriented programming); 02 engineering and technology; Machine learning; computer.software_genre; Matrix decomposition; 03 medical and health sciences; Matrix (mathematics); 0302 clinical medicine; Health Information Management; Similarity (psychology); 0202 electrical engineering, electronic engineering, information engineering; Humans; Genetic Predisposition to Disease; 030212 general & internal medicine; Electrical and Electronic Engineering; Computational model; Artificial neural network; business.industry; Computational Biology; Autoencoder; Computer Science Applications; MicroRNAs; Graph (abstract data type); Neural Networks, Computer; Artificial intelligence; business; computer; Algorithms; Biotechnology
- Abstract
MicroRNAs (miRNAs) have been proved to play critical roles in diverse biological processes, including the human disease development process. Exploring the potential associations between miRNAs and diseases can help us better understand complex disease mechanisms. Given that traditional biological experiments are expensive and time-consuming, computational models can serve as efficient means to uncover potential miRNA-disease associations. This study presents a new computational model based on variational graph auto-encoder with matrix factorization (VGAMF) for miRNA-disease association prediction. More specifically, VGAMF first integrates four different types of information about miRNAs into an miRNA comprehensive similarity network and two types of information about diseases into a disease comprehensive similarity network, respectively. Then, VGAMF gets the non-linear representations of miRNAs and diseases, respectively, from those two comprehensive similarity networks with variational graph auto-encoders. Simultaneously, a non-negative matrix factorization is conducted on the miRNA-disease association matrix to get the linear representations of miRNAs and diseases. Finally, a fully connected neural network combines linear and non-linear representations of miRNAs and diseases to get the final predicted association score for all miRNA-disease pairs. In the 10-fold cross-validation experiments, VGAMF achieves an average AUC of 0.9280 on HMDD v2.0 and 0.9470 on HMDD v3.2, which outperforms other competing methods. Besides, the case studies on colon cancer and esophageal cancer further demonstrate the effectiveness of VGAMF in predicting novel miRNA-disease associations.
- Published
- 2022
- Full Text
- View/download PDF
8. A Deep Learning Framework for Gene Ontology Annotations With Sequence- and Network-Based Information
- Author
Yi Pan, Min Zeng, Hong Song, Fang-Xiang Wu, Min Li, Fuhao Zhang, and Yaohang Li
- Subjects
InterPro; 0206 medical engineering; 02 engineering and technology; Convolutional neural network; Deep Learning; Subsequence; Genetics; Word2vec; Amino Acid Sequence; Protein Interaction Maps; Sequence; Artificial neural network; business.industry; Applied Mathematics; Deep learning; Computational Biology; Proteins; Molecular Sequence Annotation; Pattern recognition; Gene Ontology; Embedding; Artificial intelligence; business; Algorithms; Software; 020602 bioinformatics; Biotechnology
- Abstract
Knowledge of protein functions plays an important role in biology and medicine. With the rapid development of high-throughput technologies, a huge number of proteins have been discovered. However, a great number of proteins still lack functional annotations. A protein usually has multiple functions, and some functions or biological processes require interactions among several proteins. Additionally, Gene Ontology provides a useful classification for protein functions and contains more than 40,000 terms. We propose a deep learning framework called DeepGOA to predict protein functions from protein sequences and protein-protein interaction (PPI) networks. For protein sequences, we extract two types of information: sequence semantic information and subsequence-based features. We use the word2vec technique to represent protein sequences numerically, and use a bidirectional long short-term memory network (Bi-LSTM) and a multi-scale convolutional neural network (multi-scale CNN) to obtain the global and local semantic features of protein sequences, respectively. Additionally, we use the InterPro tool to scan protein sequences and extract subsequence-based information, such as domains and motifs. This information is then fed into a neural network to generate high-quality features. For the PPI network, the Deepwalk algorithm is applied to generate embeddings of the network. The two types of features are then concatenated to predict protein functions. To evaluate the performance of DeepGOA, several different evaluation methods and metrics are used. The experimental results show that DeepGOA outperforms DeepGO and BLAST.
- Published
- 2021
- Full Text
- View/download PDF
9. Deletion Detection Method Using the Distribution of Insert Size and a Precise Alignment Strategy
- Author
Junwei Luo, Fang-Xiang Wu, Zhen Zhang, Juan Shang, Jianxin Wang, Yi Pan, and Min Li
- Subjects
Genome, Human; Computer science; business.industry; Applied Mathematics; Breakpoint; Computational Biology; Genomics; Pattern recognition; Sequence Analysis, DNA; Insert (molecular biology); Structural variation; Mutagenesis, Insertional; Distribution (mathematics); Genomic Structural Variation; Genetics; Humans; Human genome; Artificial intelligence; business; Sequence Alignment; Gene Deletion; Biotechnology
- Abstract
Homozygous and heterozygous deletions commonly exist in the human genome. For current structural variation detection tools, it is important to determine whether a deletion is homozygous or heterozygous. However, sequencing errors, micro-homologies, and micro-insertions prevent common alignment tools from identifying accurate breakpoint locations, and often lead to false structural variation calls. In this study, we present a novel deletion detection tool called Sprites2. Compared with Sprites, Sprites2 makes the following modifications: (1) The distribution of insert size is used in Sprites2, which can identify the type of deletions and improve the accuracy of deletion calls. (2) A precise alignment method based on AGE (an algorithm that simultaneously aligns the 5’ and 3’ ends of two sequences) is adopted in Sprites2 to identify breakpoints, which helps to resolve the problems introduced by sequencing errors, micro-homologies, and micro-insertions. To test and verify the performance of Sprites2, simulated and real datasets are used in our experiments, and Sprites2 is compared with five popular tools. The experimental results show that Sprites2 can improve the performance of deletion detection. Sprites2 can be downloaded from https://github.com/zhangzhen/sprites2 .
- Published
- 2021
- Full Text
- View/download PDF
10. High-Risk Prediction of Cardiovascular Diseases via Attention-Based Deep Neural Networks
- Author
Chen Xianlai, Fang-Xiang Wu, Jianxin Wang, Huang Nengjun, and Ying An
- Subjects
Adult; Male; Prognosis prediction; Computer science; 0206 medical engineering; Feature extraction; MEDLINE; 02 engineering and technology; Disease; Machine learning; computer.software_genre; Risk Assessment; Data modeling; Deep Learning; Genetics; Data Mining; Electronic Health Records; Humans; Artificial neural network; Mechanism (biology); business.industry; Applied Mathematics; Middle Aged; Cardiovascular Diseases; Deep neural networks; Female; Neural Networks, Computer; Artificial intelligence; business; computer; Algorithms; Medical Informatics; 020602 bioinformatics; Biotechnology
- Abstract
High-risk prediction of cardiovascular disease is of great significance and urgency in medicine, given the growing prevalence of sub-health in recent years. Most existing pathological methods for prognosis prediction are either costly or prone to misjudgement. Therefore, many automated models based on machine learning have been proposed to predict the onset of cardiovascular disease from the premorbid information of patients extracted from their historical Electronic Health Records (EHRs). However, it is difficult to select proper features from longitudinal and heterogeneous EHRs, and also a great challenge to obtain accurate and robust representations for patients. In this paper, we propose an entirely end-to-end model called DeepRisk, based on an attention mechanism and deep neural networks, which can not only learn high-quality features automatically from EHRs but also efficiently integrate heterogeneous and time-ordered medical data, and finally predict patients' risk of cardiovascular disease. Experiments are carried out on a real medical dataset, and the results show that DeepRisk can significantly improve high-risk prediction accuracy for cardiovascular disease compared with state-of-the-art approaches.
- Published
- 2021
- Full Text
- View/download PDF
11. A Novel Drug Repositioning Approach Based on Collaborative Metric Learning
- Author
Fang-Xiang Wu, Jianxin Wang, Huimin Luo, Cheng Yan, Yi Pan, and Min Li
- Subjects
Drug; Computer science; Association (object-oriented programming); media_common.quotation_subject; 0206 medical engineering; 02 engineering and technology; ENCODE; Machine learning; computer.software_genre; Toxicogenetics; Task (project management); Machine Learning; Genetics; Humans; media_common; Models, Statistical; business.industry; Applied Mathematics; Drug Repositioning; Computational Biology; Drug repositioning; Metric space; Drug development; Metric (mathematics); Artificial intelligence; business; computer; Algorithms; 020602 bioinformatics; Biotechnology
- Abstract
Computational drug repositioning, an efficient approach to finding potential indications for drugs, has been used to increase the efficiency of drug development. The drug repositioning problem is essentially a top-K recommendation task that recommends the most likely diseases for drugs based on drug- and disease-related information. Therefore, many recommendation methods can be adapted to drug repositioning. The collaborative metric learning (CML) algorithm can produce distance metrics that capture the important relationships among objects, and has been widely used in recommendation domains. By applying CML to drug repositioning, a joint metric space is learned to encode each drug's relationships with different diseases. In this study, we propose a novel computational drug repositioning method using collaborative metric learning (CMLDR) to predict novel drug-disease associations based on known drug- and disease-related information. Specifically, the proposed method learns latent vectors of drugs and diseases by applying metric learning, and then predicts the association probability of a drug-disease pair based on the learned vectors. The comprehensive experimental results show that CMLDR outperforms the other state-of-the-art drug repositioning algorithms in terms of precision, recall, and AUPR.
- Published
- 2021
- Full Text
- View/download PDF
12. A Gene Rank Based Approach for Single Cell Similarity Assessment and Clustering
- Author
Feng Luo, Jianxin Wang, Fang-Xiang Wu, Yunpei Xu, Yi Pan, and Hong-Dong Li
- Subjects
Cell type; Computer science; 0206 medical engineering; Population; 02 engineering and technology; Correlation; Mice; Similarity (network science); Databases, Genetic; Genetics; Animals; Cluster Analysis; Humans; Cluster analysis; education; education.field_of_study; Sequence Analysis, RNA; business.industry; Applied Mathematics; Rank (computer programming); Computational Biology; Pattern recognition; Gene Ontology; Key (cryptography); Unsupervised learning; Artificial intelligence; Single-Cell Analysis; Transcriptome; business; Algorithms; 020602 bioinformatics; Biotechnology
- Abstract
Single-cell RNA sequencing (scRNA-seq) technology provides quantitative gene expression profiles at single-cell resolution. As a result, researchers have established new ways to explore cell population heterogeneity and the genetic variability of cells. One current research direction for scRNA-seq data is to identify different cell types accurately through unsupervised clustering methods. However, scRNA-seq data analysis is challenging because of the data's high noise level, high dimensionality, and sparsity. Moreover, the impact of multiple latent factors on gene expression heterogeneity and on the ability to accurately identify cell types remains unclear. Overcoming these challenges to reveal the biological differences between cell types has become the key to analyzing scRNA-seq data. For these reasons, unsupervised learning for cell population discovery based on scRNA-seq data has become an important research area. A cell similarity assessment method plays a significant role in cell clustering. Here, we present BioRank, a new cell similarity assessment method based on annotated gene sets and gene ranks. To evaluate its performance, we cluster cells with two classical clustering algorithms using the cell-cell similarity obtained by BioRank. In addition, BioRank can be used by any clustering algorithm that requires a similarity matrix. Applying BioRank to 12 public scRNA-seq datasets, we show that it is better than, or at least as good as, several popular similarity assessment methods for single-cell clustering.
- Published
- 2021
- Full Text
- View/download PDF
13. Deep Matrix Factorization Improves Prediction of Human CircRNA-Disease Associations
- Author
Chengqian Lu, Jianxin Wang, Min Li, Fuhao Zhang, Min Zeng, and Fang-Xiang Wu
- Subjects
0301 basic medicine; Computer science; Association (object-oriented programming); Disease; Machine learning; computer.software_genre; Matrix decomposition; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Humans; Diagnostic biomarker; Electrical and Electronic Engineering; Artificial neural network; business.industry; GRASP; RNA, Circular; Computer Science Applications; Projection (relational algebra); 030104 developmental biology; Research Design; 030220 oncology & carcinogenesis; Mutation (genetic algorithm); Neural Networks, Computer; Artificial intelligence; business; computer; Forecasting; Biotechnology
- Abstract
In recent years, more and more evidence indicates that circular RNAs (circRNAs), which form covalently closed loops, play various roles in biological processes. Dysregulation and mutation of circRNAs may be implicated in diseases. Owing to their stable structure and resistance to degradation, circRNAs have great potential as diagnostic biomarkers. Therefore, predicting circRNA-disease associations is helpful in disease diagnosis. However, there are few experimentally validated associations between circRNAs and diseases. Although several computational methods have been proposed, precisely representing underlying features and capturing the complex structure of the data remain challenging. In this paper, we design a new method, called DMFCDA (Deep Matrix Factorization CircRNA-Disease Association), to infer potential circRNA-disease associations. DMFCDA takes both explicit and implicit feedback into account. It then uses a projection layer to automatically learn latent representations of circRNAs and diseases. With multi-layer neural networks, DMFCDA can model non-linear associations to capture the complex structure of the data. We assess the performance of DMFCDA using leave-one-out cross-validation and 5-fold cross-validation on two datasets. Computational results show that DMFCDA efficiently infers circRNA-disease associations according to AUC values, the percentage of precisely retrieved associations in various top ranks, and statistical comparison. We also conduct case studies to evaluate DMFCDA. All results show that DMFCDA provides accurate predictions.
- Published
- 2021
- Full Text
- View/download PDF
14. Multi-Receptive-Field CNN for Semantic Segmentation of Medical Images
- Author
-
Yu-Ping Wang, Liangliang Liu, Fang-Xiang Wu, and Jianxin Wang
- Subjects
Computer science; Feature extraction; 02 engineering and technology; Convolutional neural network; 030218 nuclear medicine & medical imaging; 03 medical and health sciences; 0302 clinical medicine; Health Information Management; Image Processing, Computer-Assisted; 0202 electrical engineering, electronic engineering, information engineering; Humans; Segmentation; Electrical and Electronic Engineering; business.industry; Pattern recognition; Image segmentation; Subnet; Semantics; Computer Science Applications; Kernel (image processing); Receptive field; Test set; 020201 artificial intelligence & image processing; Neural Networks, Computer; Artificial intelligence; business; Biotechnology
- Abstract
The context-based convolutional neural network (CNN) is one of the most well-known CNNs for improving the performance of semantic segmentation. It has achieved remarkable success in various medical image segmentation tasks. However, extracting rich and useful context information from complex and changeable medical images is a challenge for medical image segmentation. In this study, a novel Multi-Receptive-Field CNN (MRFNet) is proposed to tackle this challenge. MRFNet offers the optimal receptive field for each subnet in the encoder-decoder module (EDM) and generates multi-receptive-field context information at the feature map level. Moreover, MRFNet fuses these multi-feature maps by the concatenation operation. MRFNet is evaluated on 3 public medical image data sets: SISS, 3DIRCADb, and SPES. Experimental results show that MRFNet achieves outstanding performance on all 3 data sets, and outperforms other segmentation methods on the 3DIRCADb test set without pre-training the model.
- Published
- 2020
- Full Text
- View/download PDF
15. MEC: Misassembly Error Correction in Contigs based on Distribution of Paired-End Reads and Statistics of GC-contents
- Author
Yi Pan, Junwei Luo, Xingyu Liao, Binbin Wu, Jianxin Wang, Fang-Xiang Wu, and Min Li
- Subjects
Contig; Computer science; Applied Mathematics; 0206 medical engineering; food and beverages; Sequence assembly; 02 engineering and technology; Repetitive Regions; Computational biology; Genome; Genetics; Statistical analysis; Error detection and correction; 020602 bioinformatics; Biotechnology
- Abstract
De novo assembly tools aim to reconstruct genomes from next-generation sequencing (NGS) data. However, assembly tools usually generate a large number of contigs containing many misassemblies, which are caused by repetitive regions, chimeric reads, and sequencing errors. Detecting and correcting the misassemblies in contigs is appealing, because it can improve the accuracy of assembly results, yet challenging. In this study, a novel method, called MEC, is proposed to identify and correct misassemblies in contigs. Based on the insert size distribution of paired-end reads and the statistical analysis of GC-contents, MEC can identify more misassemblies accurately. We evaluate MEC with the NA50 and NGA50 metrics on four datasets, compare it with most of the available misassembly correction tools, and carry out experiments to analyze the influence of MEC on scaffolding results. The results show that MEC can reduce misassemblies effectively and yield quantitative improvements in scaffolding quality. MEC is publicly available at https://github.com/bioinfomaticsCSU/MEC .
- Published
- 2020
- Full Text
- View/download PDF
16. miRTRS: A Recommendation Algorithm for Predicting miRNA Targets
- Author
Fang-Xiang Wu, Wei Lan, Yi Pan, Min Li, Hui Jiang, and Jianxin Wang
- Subjects
Models, Genetic; Computer science; Applied Mathematics; 0206 medical engineering; Feature extraction; Computational Biology; 02 engineering and technology; Cross-validation; Mirna target; MicroRNAs; Prediction algorithms; Prediction methods; microRNA; Genetics; Humans; Gene sequence; Algorithm; Algorithms; 020602 bioinformatics; Biotechnology
- Abstract
MicroRNAs (miRNAs) are small and important non-coding RNAs that regulate gene expression at the transcriptional and post-transcriptional levels by binding to their targets (genes). Predicting miRNA targets is an important problem in biological research. It is expensive and time-consuming to identify miRNA targets through biological experiments. Many computational methods have been proposed to predict miRNA targets. In this study, we develop a novel method, named miRTRS, for predicting miRNA targets based on a recommendation algorithm. miRTRS can predict targets for an isolated (new) miRNA using miRNA sequence similarity, as well as isolated (new) targets for a miRNA using gene sequence similarity. Furthermore, unlike supervised machine learning methods, miRTRS does not need to select negative samples. We use 10-fold cross validation and independent datasets to evaluate the performance of our method. We compare miRTRS with the two most recently published methods for miRNA target prediction. The experimental results show that miRTRS outperforms competing prediction methods in terms of AUC and other evaluation metrics.
- Published
- 2020
- Full Text
- View/download PDF
17. ALSBMF: Predicting lncRNA-Disease Associations by Alternating Least Squares Based on Matrix Factorization
- Author
Wen Zhu, Kaimei Huang, Xiaofang Xiao, Bo Liao, Yuhua Yao, and Fang-Xiang Wu
- Subjects
General Computer Science; Computer science; Covariance matrix; General Engineering; Computational biology; lncRNA similarity; matrix factorization; Missing data; Least squares; leave-one-out cross validation; ROC curve; Cross-validation; Matrix decomposition; Matrix (mathematics); disease similarity; Alternating least squares; Feature (machine learning); General Materials Science; lcsh:Electrical engineering. Electronics. Nuclear engineering; lcsh:TK1-9971
- Abstract
In recent years, it has become increasingly clear that long non-coding RNAs (lncRNAs) can regulate their target genes at multiple levels, including the transcriptional and translational levels, and play key regulatory roles in many important biological processes, such as cell differentiation and chromatin remodeling. Inferring potential lncRNA-disease associations is essential to reveal the mechanisms behind diseases, develop novel drugs, and optimize personalized treatments. However, biological experiments to validate lncRNA-disease associations are very time-consuming and costly. Thus, it is critical to develop effective computational models. In this study, we propose a method based on alternating least squares and matrix factorization to predict lncRNA-disease associations, referred to as ALSBMF. ALSBMF first decomposes the known lncRNA-disease correlation matrix into two characteristic matrices, then defines the optimization function using disease semantic similarity, lncRNA functional similarity, and known lncRNA-disease associations, and solves for two optimal feature matrices by the least squares method. The two optimal feature matrices are finally multiplied to reconstruct the scoring matrix, filling the missing values of the original matrix to predict lncRNA-disease associations. Compared to existing methods, ALSBMF has the same advantages as BPLLDA: it does not require negative samples and can predict associations related to novel lncRNAs or novel diseases. In addition, this study performs leave-one-out cross-validation (LOOCV) and five-fold cross-validation to evaluate the prediction performance of ALSBMF. The AUCs are 0.9501 and 0.9215, respectively, which are better than those of the existing methods. Furthermore, colon cancer, kidney cancer, and liver cancer are selected as case studies. The top three predicted lncRNAs related to colon cancer, kidney cancer, and liver cancer were validated in the latest LncRNADisease database and related literature.
To test the ability of ALSBMF to predict novel disease-associated lncRNAs and new lncRNA-associated diseases, all known associations of the selected diseases and lncRNAs were eliminated; the top five predicted lncRNAs related to breast cancer and nasopharyngeal carcinoma, and the top five predicted cancers related to the lncRNAs H19 and MALAT1, were validated in PubMed and dbSNP.
- Published
- 2020
- Full Text
- View/download PDF
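The alternating least squares idea described in the ALSBMF abstract above can be sketched in a deliberately simplified form. The sketch is not ALSBMF itself: it factorizes the association matrix into a single pair of factor vectors (rank 1), uses a plain ridge penalty `reg`, and omits the disease semantic similarity and lncRNA functional similarity terms that the abstract says enter the real objective. All names and the toy matrix are hypothetical.

```python
def als_rank1(matrix, reg=0.1, iterations=20):
    """Rank-1 alternating least squares: matrix[i][j] is approximated by u[i] * v[j].

    Each half-step fixes one factor and solves the ridge-regularized least
    squares problem for the other in closed form.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    u = [1.0] * n_rows
    v = [1.0] * n_cols
    for _ in range(iterations):
        # Fix v, update every u[i].
        v_sq = sum(x * x for x in v)
        u = [
            sum(matrix[i][j] * v[j] for j in range(n_cols)) / (v_sq + reg)
            for i in range(n_rows)
        ]
        # Fix u, update every v[j].
        u_sq = sum(x * x for x in u)
        v = [
            sum(matrix[i][j] * u[i] for i in range(n_rows)) / (u_sq + reg)
            for j in range(n_cols)
        ]
    return u, v


def objective(matrix, u, v, reg):
    """Squared reconstruction error plus the ridge penalty on both factors."""
    error = sum(
        (matrix[i][j] - u[i] * v[j]) ** 2
        for i in range(len(u))
        for j in range(len(v))
    )
    return error + reg * (sum(x * x for x in u) + sum(x * x for x in v))


# Toy 0/1 association matrix (rows: hypothetical lncRNAs, columns: diseases).
assoc = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
]
u, v = als_rank1(assoc)
# Reconstructed score matrix; zeros in the input receive a predicted score.
scores = [[ui * vj for vj in v] for ui in u]
```

Because each half-step minimizes the objective exactly with the other factor held fixed, the objective never increases from one iteration to the next. In the toy matrix, the unobserved pair in the second row and second column ends up with a higher score than the pair in the second row and third column, since the second row shares a disease with the first.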
18. MGT-SM: A Method for Constructing Cellular Signal Transduction Networks
- Author
Min Li, Ruiqing Zheng, Yaohang Li, Jianxin Wang, and Fang-Xiang Wu
- Subjects
Multivariate statistics; Computer science; Gene Expression Profiling; Applied Mathematics; Linear model; Computational Biology; Inference; Bivariate analysis; computer.software_genre; Electronic mail; Granger causality; Neoplasms; Yeasts; Singular value decomposition; Linear Models; Genetics; Humans; Computer Simulation; Data mining; Coefficient matrix; Monte Carlo Method; computer; Algorithms; Signal Transduction; Biotechnology
- Abstract
A cellular signal transduction network is an important means of describing biological responses to environmental stimuli and the exchange of biological signals. Constructing the cellular signal transduction network provides an important basis for the study of biological activities, disease mechanisms, drug targets, and so on. Statistical approaches to network inference are popular in the literature. The Granger test has been used as an effective method for causality inference. Compared with bivariate Granger tests, multivariate Granger tests reduce indirect causality and have been widely used for the construction of cellular signal transduction networks. A multivariate Granger test requires that the number of time points in the time-series data exceed the number of nodes in the network. However, many real datasets have far fewer time points than there are nodes in the network. In this study, we propose a new multivariate Granger test-based framework, called MGT-SM, to construct cellular signal transduction networks. MGT-SM uses SVD to compute the coefficient matrix from gene expression data and adopts Monte Carlo simulation to estimate the significance of directed edges in the constructed networks. We apply MGT-SM to the Yeast Synthetic Network and MDA-MB-468, and evaluate its performance in terms of recall and AUC. The results show that MGT-SM achieves better results than other popular methods (CGC2SPR, PGC, and DBN).
- Published
- 2019
- Full Text
- View/download PDF
19. Improving Alzheimer's Disease Classification by Combining Multiple Measures
- Author
Fang-Xiang Wu, Bin Hu, Jianxin Wang, Zhenjun Tang, Yi Pan, and Jin Liu
- Subjects
Male; Databases, Factual; Feature extraction; Computed tomography; 02 engineering and technology; computer.software_genre; 03 medical and health sciences; Mri image; 0302 clinical medicine; Alzheimer Disease; Image Interpretation, Computer-Assisted; 0202 electrical engineering, electronic engineering, information engineering; Genetics; medicine; Humans; Cognitive Dysfunction; Cognitive impairment; Feature set; Aged; Mathematics; Aged, 80 and over; Multiple kernel learning; medicine.diagnostic_test; business.industry; Applied Mathematics; Brain; Disease classification; Pattern recognition; Magnetic Resonance Imaging; Female; 020201 artificial intelligence & image processing; Artificial intelligence; Data mining; business; Classifier (UML); computer; Algorithms; 030217 neurology & neurosurgery; Biotechnology
- Abstract
Several anatomical magnetic resonance imaging (MRI) markers for Alzheimer's disease (AD) have been identified. Cortical gray matter volume, cortical thickness, and subcortical volume have been used successfully to assist the diagnosis of Alzheimer's disease including its early warning and developing stages, e.g., mild cognitive impairment (MCI) including MCI converted to AD (MCIc) and MCI not converted to AD (MCInc). Currently, these anatomical MRI measures have mainly been used separately. Thus, the full potential of anatomical MRI scans for AD diagnosis might not yet have been used optimally. Meanwhile, most studies have focused only on morphological features of regions of interest (ROIs) or on interregional features, without considering their combination. To further improve the diagnosis of AD, we propose a novel approach of extracting ROI features and interregional features based on multiple measures from MRI images to distinguish AD, MCI (including MCIc and MCInc), and healthy controls (HC). First, we construct six individual networks based on six different anatomical measures (i.e., CGMV, CT, CSA, CC, CFI, and SV) and the Automated Anatomical Labeling (AAL) atlas for each subject. Then, for each individual network, we extract all node (ROI) features and edge (interregional) features, denoted as the node feature set and the edge feature set, respectively. Therefore, we can obtain six node feature sets and six edge feature sets from six different anatomical measures. Next, each feature within a feature set is ranked by $F$-score in descending order, and the top-ranked features of each feature set are applied to the MKBoost algorithm to obtain the best classification accuracy. After obtaining the best classification accuracy, we can get the optimal feature subset and the corresponding classifier for each node or edge feature set.
Afterwards, to investigate the classification performance with only node features, we proposed a weighted multiple kernel learning (wMKL) framework to combine these six optimal node feature subsets, and obtain a combined classifier to perform AD classification. Similarly, we can obtain the classification performance with only edge features. Finally, we combine both six optimal node feature subsets and six optimal edge feature subsets to further improve the classification performance. Experimental results show that the proposed method outperforms some state-of-the-art methods in AD classification, and demonstrate that different measures contain complementary information.
- Published
- 2018
- Full Text
- View/download PDF
20. M-Matrix-Based State Observer Design for Genetic Regulatory Networks With Mixed Delays
- Author
-
Fang-Xiang Wu, Venkat Palgat, and Li-Ping Tian
- Subjects
0209 industrial biotechnology ,Computer science ,Quantitative Biology::Molecular Networks ,Feasible region ,02 engineering and technology ,Linear matrix ,020901 industrial engineering & automation ,Control theory ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,State observer ,Electrical and Electronic Engineering ,M-matrix ,Stable state - Abstract
Understanding genetic regulatory networks (GRNs) and designing a controller requires access to all the states of the system. However, not all the states of GRN can be experimentally measured in practice. Therefore, a state observer is necessary to estimate the unknown states from measured data. In this brief, we use M-matrix theory to design stable state observers for GRNs with mixed delays. Different from the linear matrix inequalities-based method, we formulate the state observer design as a feasible set problem which can be easily solved by some programming solvers with the optimization principle. The simulation results illustrate the effectiveness of our design approach.
- Published
- 2018
- Full Text
- View/download PDF
21. Alzheimer’s Disease Classification Based on Individual Hierarchical Networks Constructed With 3-D Texture Features
- Author
-
Yi Pan, Jianxin Wang, Fang-Xiang Wu, Jin Liu, and Bin Hu
- Subjects
Male ,0301 basic medicine ,Computer science ,Feature extraction ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Sensitivity and Specificity ,Cross-validation ,Pattern Recognition, Automated ,03 medical and health sciences ,Imaging, Three-Dimensional ,0302 clinical medicine ,Neuroimaging ,Alzheimer Disease ,Connectome ,medicine ,Humans ,Dementia ,Cognitive Dysfunction ,Computer vision ,Electrical and Electronic Engineering ,Aged ,Multiple kernel learning ,business.industry ,Node (networking) ,Brain ,Reproducibility of Results ,Pattern recognition ,medicine.disease ,Magnetic Resonance Imaging ,Computer Science Applications ,030104 developmental biology ,Pattern recognition (psychology) ,Female ,Artificial intelligence ,Nerve Net ,Alzheimer's disease ,business ,Algorithms ,030217 neurology & neurosurgery ,Biotechnology - Abstract
Brain networks play an important role in representing abnormalities in Alzheimer's disease (AD) and mild cognitive impairment (MCI), which includes MCIc (MCI converted to AD) and MCInc (MCI not converted to AD). In our previous study, we proposed an AD classification approach based on individual hierarchical networks constructed with 3D texture features of brain images. However, we used only the edge features of the networks, not their node features. In this paper, we propose a multiple-kernel framework that combines edge features and node features for AD classification. An evaluation of the proposed approach has been conducted with MRI images of 710 subjects (230 healthy controls (HC), 280 MCI (including 120 MCIc and 160 MCInc), and 200 AD) from the Alzheimer's Disease Neuroimaging Initiative database by using ten-fold cross validation. Experimental results show that the proposed method is not only superior to the existing AD classification methods, but also efficient and promising for clinical applications for the diagnosis of AD via MRI images. Furthermore, the results also indicate that 3D texture could detect the subtle texture differences between tissues in AD, MCI, and HC, and texture features of MRI images might be related to the severity of AD cognitive impairment. These results suggest that 3D texture is a useful aid in AD diagnosis.
- Published
- 2017
- Full Text
- View/download PDF
22. Biomolecular Network Controllability With Drug Binding Information
- Author
-
Lin Wu, Fang-Xiang Wu, Lingkai Tang, Jianxin Wang, and Min Li
- Subjects
0301 basic medicine ,Systems biology ,Distributed computing ,0206 medical engineering ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Nanotechnology ,02 engineering and technology ,Biology ,Models, Biological ,03 medical and health sciences ,Computer Simulation ,Electrical and Electronic Engineering ,Molecular biophysics ,Proteins ,Complex network ,Network controllability ,Computer Science Applications ,Controllability ,Drug repositioning ,Identification (information) ,030104 developmental biology ,Pharmaceutical Preparations ,Algorithms ,020602 bioinformatics ,Biological network ,Protein Binding ,Signal Transduction ,Biotechnology - Abstract
Complex networks are ubiquitous in nature. In biological systems, biomolecules interact with each other to form so-called biomolecular networks, which determine the cellular behaviors of living organisms. Controlling the cellular behaviors by regulating certain biomolecules in the network is one of the problems of greatest concern in systems biology. Recently, the connections between biological networks and structural control theory have been explored, uncovering some interesting biological phenomena. Some researchers have paid attention to the structural controllability of networks in terms of the minimum steering sets (MSSs). However, because the MSSs for complex networks are not unique and different MSSs differ in importance in real applications, MSSs with specific meanings should be studied. In this paper, we investigated the MSSs of biomolecular networks by considering the drug binding information. The biomolecules in the MSSs with binding preference are enriched with known drug targets and are likely to have more chemical-binding opportunities with existing drugs compared with randomly chosen MSSs, suggesting novel applications for drug target identification and drug repositioning.
- Published
- 2017
- Full Text
- View/download PDF
23. United Complex Centrality for Identification of Essential Proteins from PPI Networks
- Author
-
Fang-Xiang Wu, Zhibei Niu, Yu Lu, and Min Li
- Subjects
0301 basic medicine ,Saccharomyces cerevisiae Proteins ,0206 medical engineering ,02 engineering and technology ,Computational biology ,Biology ,computer.software_genre ,Cellular life ,03 medical and health sciences ,Network level ,Genetics ,Humans ,Protein Interaction Maps ,Databases, Protein ,Organism ,Applied Mathematics ,Computational Biology ,Proteins ,Reproducibility of Results ,High-Throughput Screening Assays ,030104 developmental biology ,ROC Curve ,Proteins metabolism ,Ppi network ,Identification (biology) ,Data mining ,Centrality ,computer ,020602 bioinformatics ,Biotechnology - Abstract
Essential proteins are indispensable for the survival or reproduction of an organism. Identification of essential proteins is not only necessary for understanding the minimal requirements for cellular life, but also important for disease study and drug design. With the development of high-throughput techniques, a large amount of protein-protein interaction data is available, which promotes the study of essential proteins at the network level. Although a series of computational methods have been proposed, the prediction precision still needs to be improved. In this paper, we propose a new method, United complex Centrality (UC), to identify essential proteins by integrating the protein complexes with the topological features of protein-protein interaction (PPI) networks. By analyzing the relationship between the essential proteins and the known protein complexes of S. cerevisiae and human, we find that proteins in complexes are more likely to be essential than proteins not included in any complex, and that proteins appearing in multiple complexes are more likely to be essential than those appearing in only a single complex. Considering that some protein complexes generated by computational methods are inaccurate, we also provide a modified version of UC with parameter alpha, named UC-P. The experimental results show that protein complex information can help identify the essential proteins more accurately, both for the PPI network of S. cerevisiae and for that of human. The proposed method UC performs clearly better than the eight previously proposed methods (DC, IC, EC, SC, BC, CC, NC, and LAC) for identifying essential proteins.
- Published
- 2017
- Full Text
- View/download PDF
24. NovoExD: De novo Peptide Sequencing for ETD/ECD Spectra
- Author
-
Fang-Xiang Wu, Yan Yan, and Anthony Kusalik
- Subjects
0301 basic medicine ,chemistry.chemical_classification ,Sequence analysis ,Applied Mathematics ,Peptide ,De novo peptide sequencing ,Computational biology ,Bioinformatics ,Tandem mass spectrometry ,03 medical and health sciences ,030104 developmental biology ,chemistry ,Protein methods ,Genetics ,De novo sequencing ,Biotechnology - Abstract
De novo peptide sequencing using tandem mass spectrometry (MS/MS) data has become a major computational method for sequence identification in recent years. With the development of new instruments and technology, novel computational methods have emerged with enhanced performance. However, there are only a few methods focusing on ECD/ETD spectra, which mainly contain variants of $c$-ions and $z$-ions. Here, a de novo sequencing method for ECD/ETD spectra, NovoExD, is presented. NovoExD applies a new form of spectrum graph with multiple edge types (called a GMET), considers multiple peptide tags, and integrates amino acid combination (AAC) and fragment ion charge information. Its performance is compared with another successful de novo sequencing method, pNovo+, which has an option for ECD/ETD spectra. Experiments conducted on three different datasets show that the average full length peptide identification accuracy of NovoExD is as high as 88.70 percent, and that NovoExD's average accuracy is more than 20 percent greater on all datasets than that of pNovo+.
- Published
- 2017
- Full Text
- View/download PDF
25. Identifying Individual-Cancer-Related Genes by Rebalancing the Training Samples
- Author
-
Min Li, Fang-Xiang Wu, Bolin Chen, Jianxin Wang, and Xuequn Shang
- Subjects
0301 basic medicine ,Computer science ,0206 medical engineering ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Sample (statistics) ,02 engineering and technology ,Machine learning ,computer.software_genre ,Logistic regression ,Cross-validation ,Clinical knowledge ,Machine Learning ,Random Allocation ,03 medical and health sciences ,Neoplasms ,Resampling ,Humans ,Electrical and Electronic Engineering ,Set (psychology) ,business.industry ,3. Good health ,Computer Science Applications ,Cancer related genes ,Identification (information) ,Logistic Models ,030104 developmental biology ,Artificial intelligence ,business ,computer ,Algorithms ,020602 bioinformatics ,Genes, Neoplasm ,Biotechnology - Abstract
The identification of individual-cancer-related genes is typically an imbalanced classification problem. The number of known cancer-related genes is far smaller than the number of all unknown genes, which makes it very hard to detect novel predictions from such imbalanced training samples. A regular machine learning method can either only detect genes related to all cancers or add clinical knowledge to circumvent this issue. In this study, we introduce a training sample rebalancing strategy to overcome this issue by using a two-step logistic regression and a random resampling method. The two-step logistic regression selects a set of genes that are related to all cancers, while the random resampling method is performed to further classify those genes associated with individual cancers. The issue of imbalanced classification is circumvented by first randomly adding positive instances related to other cancers, and then excluding unrelated predictions according to the overall performance in the following step. Numerical experiments show that the proposed resampling method is able to identify cancer-related genes even when the number of known genes related to a given cancer is small. The final predictions for all individual cancers achieve AUC values around 0.93 under leave-one-out cross validation, which is very promising compared with existing methods.
- Published
- 2016
- Full Text
- View/download PDF
26. A New Method for Predicting Protein Functions From Dynamic Weighted Interactome Networks
- Author
-
Yaohang Li, Jianxin Wang, Min Li, Bihai Zhao, Fang-Xiang Wu, Xueyong Li, and Yi Pan
- Subjects
0301 basic medicine ,Saccharomyces cerevisiae Proteins ,0206 medical engineering ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,02 engineering and technology ,Biology ,computer.software_genre ,Interactome ,Protein–protein interaction ,Domain (software engineering) ,03 medical and health sciences ,Annotation ,Protein Interaction Mapping ,Protein function prediction ,Protein Interaction Maps ,Electrical and Electronic Engineering ,Databases, Protein ,Computational Biology ,Construct (python library) ,Function (mathematics) ,Protein engineering ,Computer Science Applications ,ComputingMethodologies_PATTERNRECOGNITION ,030104 developmental biology ,Data mining ,computer ,Algorithms ,020602 bioinformatics ,Biotechnology - Abstract
Automated annotation of protein function is challenging. As the number of sequenced genomes rapidly grows, the overwhelming majority of proteins can only be annotated computationally. Under new conditions or stimuli, not only the number and location of proteins would be changed, but also their interactions. This dynamic feature of protein interactions, however, was not considered in the existing function prediction algorithms. Taking the dynamic nature of protein interactions into consideration, we construct a dynamic weighted interactome network (DWIN) by integrating protein-protein interaction (PPI) network and time course gene expression data, as well as proteins' domain information and protein complex information. Then, we propose a new prediction approach that predicts protein functions from the constructed dynamic weighted interactome network. For an unknown protein, the proposed method visits dynamic networks at different time points and scores functions derived from all neighbors. Finally, the method selects top N functions from these ranked candidate functions to annotate the testing protein. Experiments on PPI datasets were conducted to evaluate the effectiveness of the proposed approach in predicting unknown protein functions. The evaluation results demonstrated that the proposed method outperforms other competing methods.
- Published
- 2016
- Full Text
- View/download PDF
27. DPCMNE: detecting protein complexes from protein-protein interaction networks via multi-level network embedding
- Author
-
Ju Xiang, Fang-Xiang Wu, Xiangmao Meng, Min Li, and Ruiqing Zheng
- Subjects
business.industry ,Applied Mathematics ,Topological information ,0206 medical engineering ,Network embedding ,Computational Biology ,Proteins ,Pattern recognition ,Topology (electrical circuits) ,Saccharomyces cerevisiae ,02 engineering and technology ,Network topology ,Protein protein interaction network ,ComputingMethodologies_PATTERNRECOGNITION ,Protein Interaction Mapping ,Genetics ,Protein Interaction Maps ,Granularity ,Artificial intelligence ,Cluster analysis ,business ,Algorithms ,020602 bioinformatics ,Biological network ,Biotechnology - Abstract
Biological functions of a cell are typically carried out through protein complexes. The detection of protein complexes is therefore of great significance for understanding cellular organization and protein functions. In the past decades, many computational methods have been proposed to detect protein complexes. However, most of the existing methods search only the local topological information to mine dense subgraphs as protein complexes, ignoring the global topological information. To tackle this issue, we propose the DPCMNE method to detect protein complexes via multi-level network embedding. It can preserve both the local and global topological information of biological networks. First, DPCMNE employs a hierarchical compressing strategy to recursively compress the input protein-protein interaction (PPI) network into multi-level smaller PPI networks. Then, a network embedding method is applied on these smaller PPI networks to learn protein embeddings of different levels of granularity. The embeddings learned from all the compressed PPI networks are concatenated to represent the final protein embeddings of the original input PPI network. Finally, a core-attachment based strategy is adopted to detect protein complexes in the weighted PPI network constructed by the pairwise similarity of protein embeddings. To assess the effectiveness of our proposed method, DPCMNE is compared with eight other clustering algorithms on two yeast datasets. The experimental results show that DPCMNE outperforms those state-of-the-art complex detection methods in terms of F1 and F1+Acc. Furthermore, the results of functional enrichment analysis indicate that protein complexes detected by DPCMNE are more biologically significant in terms of P-score.
- Published
- 2021
- Full Text
- View/download PDF
28. A Framework of De Novo Peptide Sequencing for Multiple Tandem Mass Spectra
- Author
-
Fang-Xiang Wu, Yan Yan, and Anthony Kusalik
- Subjects
chemistry.chemical_classification ,Electron-capture dissociation ,Biomedical Engineering ,Peptide sequence tag ,Analytical chemistry ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Peptide ,De novo peptide sequencing ,Tandem mass spectrometry ,Dissociation (chemistry) ,Computer Science Applications ,Electron-transfer dissociation ,chemistry ,Fragmentation (mass spectrometry) ,Electrical and Electronic Engineering ,Biotechnology - Abstract
With tandem mass spectrometry (MS/MS), spectra can be generated by various fragmentation techniques including collision-induced dissociation (CID), higher-energy collisional dissociation (HCD), electron capture dissociation (ECD), electron transfer dissociation (ETD) and so on. At the same time, de novo sequencing using multiple spectra from the same peptide generated by different fragmentation techniques is becoming popular in proteomics studies. The focus of this study is the use of paired spectra from CID (or HCD) and ECD (or ETD) fragmentation because of the complementarity between them. We present a de novo peptide sequencing framework for multiple tandem mass spectra, and apply it to the paired spectra sequencing problem. The performance of the framework on paired spectra is compared to another successful method named pNovo+. The results show that our proposed method outperforms pNovo+ in terms of full length peptide sequencing accuracy on three pairs of experimental datasets, with the accuracy increasing up to 13.6% compared to pNovo+.
- Published
- 2015
- Full Text
- View/download PDF
29. UDoNC: An Algorithm for Identifying Essential Proteins Based on Protein Domains and Protein-Protein Interaction Networks
- Author
-
Jianxin Wang, Wei Peng, Yi Pan, Fang-Xiang Wu, Yu Lu, and Yingjiao Cheng
- Subjects
Saccharomyces cerevisiae Proteins ,Computer science ,Escherichia coli Proteins ,Applied Mathematics ,Protein domain ,Molecular biophysics ,Feature extraction ,Computational Biology ,Computational biology ,Network topology ,computer.software_genre ,Protein Structure, Tertiary ,Domain (software engineering) ,Completeness (order theory) ,Frequency domain ,Genetics ,Cluster Analysis ,Protein Interaction Maps ,Data mining ,computer ,Algorithms ,Biotechnology ,Clustering coefficient - Abstract
Prediction of essential proteins, which are crucial to an organism's survival, is important for disease analysis and drug design, as well as for the understanding of cellular life. The majority of prediction methods infer the likelihood that a protein is essential from the network topology. However, these methods are limited by the completeness of available protein-protein interaction (PPI) data and depend on the network accuracy. To overcome these limitations, some computational methods have been proposed. However, few of them address this problem by taking protein domains into consideration. In this work, we first analyze the correlation between the essentiality of proteins and their domain features based on data of 13 species. We find that the proteins containing more protein domain types which rarely occur in other proteins tend to be essential. Accordingly, we propose a new prediction method, named UDoNC, by combining the domain features of proteins with their topological properties in the PPI network. In UDoNC, the essentiality of proteins is decided by the number and the frequency of their protein domain types, as well as the essentiality of their adjacent edges measured by edge clustering coefficient. The experimental results on S. cerevisiae data show that UDoNC outperforms other existing methods in terms of area under the curve (AUC). Additionally, UDoNC also performs well in predicting essential proteins on data of E. coli.
- Published
- 2015
- Full Text
- View/download PDF
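The abstract of entry 29 mentions edge essentiality measured by the edge clustering coefficient. As a minimal sketch, assuming the commonly used definition ECC(u, v) = z / min(deg(u) - 1, deg(v) - 1), where z is the number of triangles that contain the edge (u, v), the topological part could look like the following. The function names and the toy graph are hypothetical, and the domain-feature part of UDoNC is not reproduced here.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def edge_clustering_coefficient(adj, u, v):
    """ECC(u, v): triangles through edge (u, v) over the maximum possible number.

    Assumed definition: z / min(deg(u) - 1, deg(v) - 1); returns 0.0 when an
    endpoint has degree 1, since no triangle can contain such an edge.
    """
    denom = min(len(adj[u]) - 1, len(adj[v]) - 1)
    if denom <= 0:
        return 0.0
    triangles = len(adj[u] & adj[v])  # common neighbors close a triangle
    return triangles / denom


def neighborhood_score(adj, u):
    """Sum of ECC over all edges incident to u (a purely topological score)."""
    return sum(edge_clustering_coefficient(adj, u, v) for v in adj[u])


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to C.
adj = build_adjacency([("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")])
scores = {node: neighborhood_score(adj, node) for node in adj}
```

In this toy graph the pendant node D scores zero because its only edge lies in no triangle, while the triangle nodes score higher.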
30. Network Output Controllability-Based Method for Drug Target Identification
- Author
-
Min Li, Fang-Xiang Wu, Yichao Shen, and Lin Wu
- Subjects
Computer science ,Drug target ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Bioinformatics ,computer.software_genre ,Models, Biological ,Network output ,Set (abstract data type) ,Animals ,Humans ,Computer Simulation ,Molecular Targeted Therapy ,Electrical and Electronic Engineering ,Feedback, Physiological ,Drug discovery ,Molecular biophysics ,Proteins ,Computer Science Applications ,Controllability ,Identification (information) ,ComputingMethodologies_PATTERNRECOGNITION ,Pharmaceutical Preparations ,Drug Design ,State (computer science) ,Data mining ,computer ,Signal Transduction ,Biotechnology - Abstract
Biomolecules do not perform their functions alone, but interact with one another to form so-called biomolecular networks. It is well known that a complex disease stems from the malfunctions of corresponding biomolecular networks. Therefore, one important task is to identify drug targets from biomolecular networks. In this study, drug target identification is formulated as a problem of finding steering nodes in biomolecular networks, and the concept of network output controllability is applied to this problem. By applying control signals to these steering nodes, the biomolecular networks are expected to transition from one state to another. A graph-theoretic algorithm is proposed to find a minimum set of steering nodes in biomolecular networks, which can serve as a potential set of drug targets. Application results of the method to real biomolecular networks show that identified potential drug targets are in agreement with existing research results. This indicates that the method can generate testable predictions and provide insights into experimental design of drug discovery.
- Published
- 2015
- Full Text
- View/download PDF
31. A Topology Potential-Based Method for Identifying Essential Proteins from PPI Networks
- Author
-
Yi Pan, Fang-Xiang Wu, Jianxin Wang, Min Li, and Yu Lu
- Subjects
Models, Molecular ,Saccharomyces cerevisiae Proteins ,Surface Properties ,Applied Mathematics ,Reliability (computer networking) ,Computational Biology ,Biology ,Topology ,computer.software_genre ,Network topology ,Development (topology) ,Ranking ,Betweenness centrality ,Area Under Curve ,Protein Interaction Mapping ,Genetics ,Protein Interaction Maps ,Data mining ,Centrality ,computer ,Topology (chemistry) ,Biological network ,Biotechnology - Abstract
Essential proteins are indispensable for cellular life. Identifying essential proteins is of great significance because it helps us understand the minimal requirements for cellular life and is also very important for drug design. However, identification of essential proteins based on experimental approaches is typically time-consuming and expensive. With the development of high-throughput technology in the post-genomic era, more and more protein-protein interaction data can be obtained, which makes it possible to study essential proteins at the network level. A series of computational approaches have been proposed for predicting essential proteins based on network topologies. Most of these topology-based essential protein discovery methods use network centralities. In this paper, we investigate the essential proteins' topological characteristics from a completely new perspective. To our knowledge it is the first time that topology potential is used to identify essential proteins from a protein-protein interaction (PPI) network. The basic idea is that each protein in the network can be viewed as a material particle which creates a potential field around itself and the interaction of all proteins forms a topological field over the network. By defining and computing the value of each protein's topology potential, we can obtain a more precise ranking which reflects the importance of proteins from the PPI network. The experimental results show that topology potential-based methods TP and TP-NC outperform traditional topology measures: degree centrality (DC), betweenness centrality (BC), closeness centrality (CC), subgraph centrality (SC), eigenvector centrality (EC), information centrality (IC), and network centrality (NC) for predicting essential proteins. In addition, the performance of these centrality measures in identifying essential proteins in biological networks improves when they are controlled by topology potential.
- Published
- 2015
- Full Text
- View/download PDF
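The abstract of entry 31 describes each protein as a particle that creates a potential field over the network. A minimal sketch of that idea follows, assuming the Gaussian form often used for topology potential, phi(v) = sum over u != v of m(u) * exp(-(d(v, u) / sigma)^2), with d the hop distance. The unit masses, the default `sigma`, and all function names are assumptions for illustration, not the paper's TP or TP-NC definitions.

```python
import math
from collections import deque


def shortest_path_lengths(adj, source):
    """Hop distances from source to every reachable node (breadth-first search)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


def topology_potential(adj, sigma=1.0, mass=None):
    """Gaussian topology potential of every node (assumed form, see lead-in).

    adj:   {node: set(neighbors)} for an undirected, unweighted network
    sigma: influence factor controlling how fast the field decays with distance
    mass:  optional {node: weight}; defaults to 1.0 for every node
    """
    if mass is None:
        mass = {node: 1.0 for node in adj}
    potential = {}
    for node in adj:
        dist = shortest_path_lengths(adj, node)
        potential[node] = sum(
            mass[other] * math.exp(-((d / sigma) ** 2))
            for other, d in dist.items()
            if other != node  # a node does not contribute to its own potential here
        )
    return potential


# Hypothetical toy network: a path A - B - C.
path = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}}
phi = topology_potential(path, sigma=1.0)
ranking = sorted(phi, key=phi.get, reverse=True)
```

On this path graph the middle node B receives the highest potential, since both other nodes sit one hop away from it.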
32. NIMCE: a gene regulatory network inference approach based on multi time delays causal entropy
- Author
-
Ruiqing Zheng, Min Li, Fang-Xiang Wu, Feng Haonan, and Jianxin Wang
- Subjects
Candidate gene ,Time delays ,Time Factors ,Computer science ,Entropy ,Applied Mathematics ,Gene regulatory network ,Inference ,Computational biology ,Causality ,Expression data ,Genetics ,Entropy (information theory) ,Gene Regulatory Networks ,Transfer entropy ,Gene ,Biotechnology - Abstract
Gene regulatory networks (GRNs) are involved in various biological processes, such as cell cycle, differentiation and apoptosis. The large amount of existing expression data, especially time-series expression data, provides a chance to infer GRNs by computational methods. These data can reveal the dynamics of gene expression and imply the regulatory relationships among genes. However, identifying the indirect regulatory links is still a big challenge, as most studies treat time points as independent observations and ignore the influence of time delays. In this study, we propose a GRN inference method based on information-theoretic measures, called NIMCE. NIMCE uses transfer entropy to measure the regulatory links between each pair of genes, then applies causation entropy to filter indirect relationships. In addition, NIMCE applies multiple time delays to identify indirect regulatory relationships from candidate genes. Experiments on simulated and colorectal cancer data show that NIMCE outperforms other competing methods. All data and codes used in this study are publicly available at https://github.com/CSUBioGroup/NIMCE.
- Published
- 2020
- Full Text
- View/download PDF
33. Prediction of Essential Proteins Based on Overlapping Essential Modules
- Author
-
Fang-Xiang Wu, Min Li, Jianxin Wang, Yi Pan, and Bihai Zhao
- Subjects
Saccharomyces cerevisiae Proteins ,Eigenvector centrality ,Biomedical Engineering ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Saccharomyces cerevisiae ,Computational biology ,Network topology ,computer.software_genre ,Models, Biological ,Interactome ,Betweenness centrality ,Prediction methods ,Protein Interaction Mapping ,Animals ,Humans ,Computer Simulation ,Electrical and Electronic Engineering ,Clustering coefficient ,Mathematics ,Modularity (networks) ,Computer Science Applications ,Gene Expression Regulation ,Metabolome ,Data mining ,Centrality ,computer ,Algorithms ,Software ,Biotechnology - Abstract
Many computational methods have been proposed to identify essential proteins by using the topological features of interactome networks. However, the precision of essential protein discovery still needs to be improved. Research shows that the majority of hubs (essential proteins) in the yeast interactome network are essential due to their involvement in essential complex biological modules, and that hubs can be classified into two categories: date hubs and party hubs. In this study, combining gene expression profiles, we propose a new method to predict essential proteins based on overlapping essential modules, named POEM. In POEM, the original protein interactome network is partitioned into many overlapping essential modules. The frequencies and weighted degrees of proteins in these modules are employed to decide which category a protein belongs to. The comparative results show that POEM outperforms the classical centrality measures: Degree Centrality (DC), Information Centrality (IC), Eigenvector Centrality (EC), Subgraph Centrality (SC), Betweenness Centrality (BC), Closeness Centrality (CC), Edge Clustering Coefficient Centrality (NC), and two newly proposed essential protein prediction methods: PeC and CoEWC. Experimental results indicate that the precision of predicting essential proteins can be improved by considering the modularity of proteins and integrating gene expression profiles with network topological features.
- Published
- 2014
- Full Text
- View/download PDF
34. NovoHCD: De novo Peptide Sequencing From HCD Spectra
- Author
-
Anthony Kusalik, Fang-Xiang Wu, and Yan Yan
- Subjects
chemistry.chemical_classification ,Chemistry ,Biomedical Engineering ,Peptide sequence tag ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Peptide ,De novo peptide sequencing ,Computational biology ,Models, Theoretical ,Mass spectrometry ,Combinatorial chemistry ,Spectral line ,Computer Science Applications ,Peptide mass fingerprinting ,Sequence Analysis, Protein ,Tandem Mass Spectrometry ,Peptide spectral library ,Graph (abstract data type) ,Electrical and Electronic Engineering ,Peptides ,Algorithms ,Biotechnology - Abstract
In recent years, de novo peptide sequencing from mass spectrometry data has developed into one of the major peptide identification methods with the emergence of new instruments and advanced computational methods. However, there are still limitations to this method; for example, the typically used spectrum graph model cannot represent all the information and relationships inherent in tandem mass spectra (MS/MS spectra). Here, we present a new method named NovoHCD, which applies a spectrum graph model with multiple types of edges (called a multi-edge graph) and integrates into it amino acid combination (AAC) information and peptide tags. In addition, information on immonium ions observed particularly in higher-energy collisional dissociation (HCD) spectra is incorporated. Comparisons between NovoHCD and another successful de novo peptide sequencing method for HCD spectra, pNovo, were performed. Experiments were conducted on five HCD spectral datasets. Results show that NovoHCD outperforms pNovo in terms of full-length peptide identification accuracy; specifically, the accuracy increases by 13%-21% over the five datasets.
- Published
- 2014
- Full Text
- View/download PDF
35. Detecting Protein Complexes Based on Uncertain Graph Model
- Author
-
Bihai Zhao, Fang-Xiang Wu, Min Li, Jianxin Wang, and Yi Pan
- Subjects
Saccharomyces cerevisiae Proteins ,Theoretical computer science ,Applied Mathematics ,Computational Biology ,A protein ,Saccharomyces cerevisiae ,Protein engineering ,computer.software_genre ,Graph model ,Graph ,Protein–protein interaction ,Future study ,ComputingMethodologies_PATTERNRECOGNITION ,Ppi network ,Protein Interaction Mapping ,Genetics ,Data mining ,Cluster analysis ,computer ,Algorithms ,Biotechnology ,Mathematics - Abstract
Advanced biological technologies are producing large-scale protein-protein interaction (PPI) data at an ever-increasing pace, which enables us to identify protein complexes from PPI networks. Pairwise protein interactions can be modeled as a graph, where vertices represent proteins and edges represent PPIs. However, most current algorithms detect protein complexes based on deterministic graphs, whose edges are either present or absent. Neighboring information is neglected in these methods. Based on the uncertain graph model, we propose the concept of expected density to assess the density of a subgraph and the concept of relative degree to describe the relationship between a protein and a subgraph in a PPI network. We develop an algorithm called DCU (detecting complex based on uncertain graph model) to detect complexes from PPI networks. In our method, the expected density combined with the relative degree is used to determine whether a subgraph represents a complex with high cohesion and low coupling. We apply our method and the existing competing algorithms to two yeast PPI networks. Experimental results indicate that our method performs significantly better than the state-of-the-art methods and that the proposed model can provide more insights for future study of PPI networks.
- Published
- 2014
- Full Text
- View/download PDF
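The abstract of entry 35 names an expected density for subgraphs of an uncertain graph but does not give its formula. As a rough illustration only, the sketch below assumes a common definition: the sum of edge existence probabilities inside the subgraph divided by the number of vertex pairs. The function name `expected_density`, the `edge_prob` mapping, and the toy probabilities are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def expected_density(nodes, edge_prob):
    """Expected density of the subgraph induced by `nodes`.

    `edge_prob` maps frozenset({u, v}) to the probability that edge (u, v)
    exists. ASSUMED definition (not confirmed by the abstract):
    sum of edge probabilities / number of possible vertex pairs.
    """
    nodes = list(nodes)
    n_pairs = len(nodes) * (len(nodes) - 1) // 2
    if n_pairs == 0:
        return 0.0  # a single vertex has no pairs; treat density as zero
    total = sum(
        edge_prob.get(frozenset(pair), 0.0)  # missing edges contribute 0
        for pair in combinations(nodes, 2)
    )
    return total / n_pairs


# Hypothetical toy PPI network with interaction probabilities.
edge_prob = {
    frozenset({"A", "B"}): 0.9,
    frozenset({"A", "C"}): 0.6,
    frozenset({"B", "C"}): 0.3,
    frozenset({"C", "D"}): 0.8,
}

density_abc = expected_density(["A", "B", "C"], edge_prob)
```

On a deterministic graph (all probabilities 0 or 1) this reduces to the ordinary edge density, which is why it is a natural candidate for the uncertain-graph setting.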
36. Identifying Protein Complexes Based on Multiple Topological Structures in PPI Networks
- Author
-
Bolin Chen and Fang-Xiang Wu
- Subjects
Models, Statistical ,Matching (graph theory) ,Topological information ,Molecular biophysics ,Biomedical Engineering ,Structure (category theory) ,Computational Biology ,Proteins ,Pharmaceutical Science ,Medicine (miscellaneous) ,Bioengineering ,Topology ,Computer Science Applications ,ComputingMethodologies_PATTERNRECOGNITION ,Yeasts ,Ppi network ,Humans ,Structure based ,Protein Interaction Maps ,Electrical and Electronic Engineering ,Databases, Protein ,Algorithms ,Topology (chemistry) ,Biotechnology ,Mathematics - Abstract
Various computational algorithms have been developed to identify protein complexes based on only one specific topological structure in protein-protein interaction (PPI) networks, such as cliques, dense subgraphs, core-attachment structures, and starlike structures. However, protein complexes exhibit intricate connections in a PPI network and cannot be fully detected using only a single topological structure. In this paper, we propose an algorithm based on multiple topological structures to identify protein complexes from PPI networks. In the proposed algorithm, four single-topological-structure-based algorithms are first employed to identify raw predictions with their respective topological structures. Those raw predictions are trimmed according to their topological information or GO annotations. Similar results are carefully merged before generating final predictions. Numerical experiments are conducted on a yeast PPI network from DIP and a human PPI network from HPRD. The predicted results show that the multiple-topological-structure-based algorithm not only obtains a larger number of predictions, but also generates results with high accuracy in terms of f-score, matching with known protein complexes, and functional enrichment with GO.
- Published
- 2013
- Full Text
- View/download PDF
37. Inference of Biological S-System Using the Separable Estimation Method and the Genetic Algorithm
- Author
-
Li-Zhi Liu, Fang-Xiang Wu, and Wenjun Zhang
- Subjects
Mathematical optimization ,Models, Genetic ,Estimation theory ,Systems Biology ,Applied Mathematics ,Population-based incremental learning ,Models, Biological ,Regularization (mathematics) ,Hybrid algorithm ,Nonlinear system ,Genetic algorithm ,Genetics ,Pruning (decision trees) ,Algorithms ,Biotechnology ,Killer heuristic ,Mathematics - Abstract
Reconstruction of a biological system from its experimental time series data is a challenging task in systems biology. The S-system, which consists of a group of nonlinear ordinary differential equations (ODEs), is an effective model to characterize molecular biological systems and analyze the system dynamics. However, inference of S-systems without knowledge of the system structure is not a trivial task due to their nonlinearity and complexity. In this paper, a pruning separable parameter estimation algorithm (PSPEA) is proposed for inferring S-systems. This novel algorithm combines the separable parameter estimation method (SPEM) and a pruning strategy, which includes adding an ℓ1 regularization term to the objective function and pruning the solution with a threshold value. Then, this algorithm is combined with the continuous genetic algorithm (CGA) to form a hybrid algorithm that has the properties of both combined algorithms. The performance of the pruning strategy in the proposed algorithm is evaluated from two aspects: the parameter estimation error and the structure identification accuracy. The results show that the proposed algorithm with the pruning strategy has much lower estimation error and much higher identification accuracy than the existing method.
- Published
- 2012
- Full Text
- View/download PDF
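For readers unfamiliar with the model in entry 37, an S-system in its standard form writes each state derivative as a difference of two power-law terms: dx_i/dt = alpha_i * prod_j x_j^g_ij - beta_i * prod_j x_j^h_ij. The sketch below evaluates that right-hand side and shows one plausible reading of "pruning the solution with a threshold value", namely zeroing kinetic orders whose magnitude falls below the threshold. The pruning rule, the function names, and the numbers are assumptions for illustration, not the PSPEA implementation.

```python
import math


def s_system_rhs(x, alpha, beta, g, h):
    """Right-hand side of a standard-form S-system.

    dx_i/dt = alpha[i] * prod_j x[j]**g[i][j] - beta[i] * prod_j x[j]**h[i][j]
    """
    n = len(x)
    dxdt = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        dxdt.append(production - degradation)
    return dxdt


def prune(kinetic_orders, threshold):
    """ASSUMED pruning rule: set entries with |value| < threshold to zero."""
    return [
        [0.0 if abs(p) < threshold else p for p in row]
        for row in kinetic_orders
    ]


# Hypothetical two-variable system.
x = [2.0, 1.0]
alpha, beta = [1.0, 2.0], [1.0, 1.0]
g = [[0, -1], [1, 0]]   # production kinetic orders
h = [[1, 0], [0, 1]]    # degradation kinetic orders

derivatives = s_system_rhs(x, alpha, beta, g, h)
sparse_g = prune([[0.5, 0.01], [-0.02, -1.2]], threshold=0.05)
```

A zero kinetic order means variable j has no influence on that term, so pruning small entries is what turns parameter estimates into an inferred network structure.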
38. Stability and Bifurcation of Ring-Structured Genetic Regulatory Networks With Time Delays
- Author
-
Fang-Xiang Wu
- Subjects
Ring (mathematics) ,Control theory ,Quantitative Biology::Molecular Networks ,Connection (vector bundle) ,Stability (learning theory) ,Ring network ,Electrical and Electronic Engineering ,Parameter space ,Network topology ,Quantitative Biology::Genomics ,Biological applications of bifurcation theory ,Bifurcation ,Mathematics - Abstract
Recently, nonlinear differential equations with time delays have been proposed to model genetic regulatory networks, providing a powerful tool for understanding gene regulatory processes in living organisms. In this paper, we study the stability and bifurcation of a class of genetic regulatory networks with ring topology and multiple time delays at their equilibrium state. We first present necessary and sufficient conditions for delay-independent local stability of such genetic regulatory networks in the parameter space. Then we investigate their bifurcation when such networks lose their stability. Although such networks may have multiple delays and different connection strengths among individual nodes, their stability and bifurcation depend on the sum of all time delays among all elements (including both mRNAs and proteins) and the product of the connection strengths between all elements. An autoregulatory network and a repressilatory network are employed to illustrate the theorems developed in this study.
- Published
- 2012
- Full Text
- View/download PDF
39. Delay-Independent Stability of Genetic Regulatory Networks
- Author
-
Fang-Xiang Wu
- Subjects
Models, Genetic ,Computer Networks and Communications ,Computer science ,Alternative splicing ,Stability (learning theory) ,General Medicine ,Computational biology ,Nonlinear differential equations ,Computer Science Applications ,Alternative Splicing ,Gene Expression Regulation ,Artificial Intelligence ,Control theory ,RNA splicing ,Genetics ,Thermodynamics ,Computer Simulation ,Neural Networks, Computer ,RNA, Messenger ,Nonnegative matrix ,Gene ,Algorithms ,Software - Abstract
Genetic regulatory networks can be described by nonlinear differential equations with time delays. In this paper, we study both local and global delay-independent stability of genetic regulatory networks, taking messenger ribonucleic acid alternative splicing into consideration. Based on nonnegative matrix theory, we first develop necessary and sufficient conditions for local delay-independent stability of genetic regulatory networks with multiple time delays. Compared to the previous results, these conditions are easy to verify. Then we develop sufficient conditions for global delay-independent stability of genetic regulatory networks. Compared to the previous results, these sufficient conditions are less conservative. To illustrate the theorems developed in this paper, we analyze delay-independent stability of two genetic regulatory networks: a real-life repressilatory network with three genes and three proteins, and a synthetic gene regulatory network with five genes and seven proteins. The simulation results show that the theorems developed in this paper can effectively determine the delay-independent stability of genetic regulatory networks.
- Published
- 2011
- Full Text
- View/download PDF
40. Artificial Fish Swarm Optimization Based Method to Identify Essential Proteins
- Author
-
Fang-Xiang Wu, Xiaoqin Yang, and Xiujuan Lei
- Subjects
Identification methods ,Saccharomyces cerevisiae Proteins ,Computer science ,0206 medical engineering ,Saccharomyces cerevisiae ,Cellular functions ,02 engineering and technology ,Computational biology ,Network topology ,Models, Biological ,Protein–protein interaction ,Artificial Intelligence ,Genetics ,Animals ,Drosophila Proteins ,Protein Interaction Maps ,biology ,Gene ontology ,Applied Mathematics ,Fishes ,Computational Biology ,Swarm behaviour ,biology.organism_classification ,Drosophila melanogaster ,Algorithms ,020602 bioinformatics ,Biotechnology - Abstract
It is well known that essential proteins play an extremely important role in controlling cellular activities in living organisms. Identifying essential proteins from protein-protein interaction (PPI) networks is conducive to the understanding of cellular functions and molecular mechanisms. Many essential protein detection methods have been proposed to date. Nevertheless, those existing identification methods are not satisfactory because of low efficiency and low sensitivity to noisy data. This paper presents a novel computational approach based on artificial fish swarm optimization for essential protein prediction in PPI networks (called AFSO_EP). In AFSO_EP, a subset of known essential proteins is first randomly chosen as artificial fish carrying prior knowledge. Then, essential proteins are detected by imitating four principal biological behaviors of artificial fish when searching for food or companions: foraging behavior, following behavior, swarming behavior, and random behavior. In this process, the network topology, gene expression, gene ontology (GO) annotation, and subcellular localization information are utilized. To evaluate the performance of AFSO_EP, we conduct experiments on two species (Saccharomyces cerevisiae and Drosophila melanogaster). The experimental results show that AFSO_EP achieves better performance in identifying essential proteins than several other well-known identification methods, which confirms the effectiveness of AFSO_EP.
- Published
- 2019
- Full Text
- View/download PDF
41. A deep learning framework for identifying essential proteins by integrating multiple types of biological information
- Author
-
Jianxin Wang, Yaohang Li, Fang-Xiang Wu, Min Zeng, Yi Pan, Min Li, and Zhihui Fei
- Subjects
Saccharomyces cerevisiae Proteins ,Computer science ,0206 medical engineering ,Feature extraction ,Intracellular Space ,Score ,02 engineering and technology ,Machine learning ,computer.software_genre ,Network topology ,Deep Learning ,Protein Interaction Mapping ,Genetics ,Feature (machine learning) ,Protein Interaction Maps ,Indicator vector ,business.industry ,Applied Mathematics ,Deep learning ,Computational Biology ,Identification (information) ,Artificial intelligence ,Transcriptome ,Centrality ,business ,computer ,020602 bioinformatics ,Biotechnology - Abstract
Computational methods, including centrality and machine learning-based methods, have been proposed to identify essential proteins for understanding the minimum requirements of the survival and evolution of a cell. In centrality methods, researchers are required to design a score function that is based on prior knowledge yet is usually not sufficient to capture the complexity of biological information. In machine learning-based methods, some selected biological features cannot represent the complete properties of biological information, as these methods lack a computational framework to automatically select features. To tackle these problems, we propose a deep learning framework to automatically learn biological features without prior knowledge. We use the node2vec technique to automatically learn a richer representation of protein-protein interaction (PPI) network topologies than a score function. Bidirectional long short-term memory cells are applied to capture non-local relationships in gene expression data. For subcellular localization information, we use a high-dimensional indicator vector to characterize its features. To evaluate the performance of our method, we tested it on the PPI network of S. cerevisiae. Our experimental results demonstrate that the performance of our method is better than traditional centrality methods and is superior to existing machine learning-based methods. To explore which of the three types of biological information is the most vital element, we conduct an ablation study by removing each component in turn. Our results show that the PPI network embedding contributes most to the improvement. In addition, gene expression profiles and subcellular localization information also help to improve the performance in identification of essential proteins.
- Published
- 2019
- Full Text
- View/download PDF
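Entry 41 mentions encoding subcellular localization as an indicator vector. A minimal sketch of that idea follows, assuming a multi-hot encoding in which each position corresponds to one compartment and is set to 1 when the protein is annotated to it. The compartment list, the function name `indicator_vector`, and the example annotation are hypothetical; the abstract does not state the dimensionality or the compartments used.

```python
def indicator_vector(compartments, protein_locations):
    """Multi-hot encoding of a protein's subcellular localizations.

    `compartments` fixes the vector layout; `protein_locations` lists the
    compartments the protein is annotated to. Unknown names are ignored.
    """
    index = {name: i for i, name in enumerate(compartments)}
    vec = [0] * len(compartments)
    for loc in protein_locations:
        if loc in index:
            vec[index[loc]] = 1
    return vec


# Hypothetical compartment vocabulary and annotation.
compartments = ["nucleus", "cytosol", "mitochondrion", "membrane"]
encoded = indicator_vector(compartments, ["nucleus", "membrane"])
```

Because a protein can occupy several compartments, more than one position may be set, which distinguishes this from a strict one-hot encoding.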
42. Control of hybrid Machines with 2-DOF for trajectory tracking problems
- Author
-
Wenjun Zhang, Fang-Xiang Wu, Z.X. Zhou, P.R. Ouyang, and Q. Li
- Subjects
Electric motor ,Engineering ,business.industry ,Servomechanism ,Servomotor ,Motion control ,Flywheel ,law.invention ,Control and Systems Engineering ,Control theory ,law ,Hybrid system ,Electrical and Electronic Engineering ,business ,Simulation ,Machine control - Abstract
There are two types of drivers in production machine systems: constant velocity (CV) motors and servo-motors. If a system contains two or more drivers, some of which are CV motors while the others are servo-motors, the system has the so-called hybrid driver architecture and is called a hybrid machine for short. The hybrid system has the advantages of high payload and application flexibility. In this brief, we propose a control algorithm and show that the controlled hybrid machine is stable. A simulation is performed to verify the proposed controller. The CV motor exhibits velocity fluctuation due to changes in its workload. The common approach to attenuating the velocity fluctuation is to attach a flywheel to the shaft of the CV motor. We show that this can further improve the tracking performance of the hybrid system. A five-bar linkage with two degrees of freedom is used for illustration throughout the brief.
- Published
- 2005
- Full Text
- View/download PDF
43. An Analysis of Two-Heater Active Thermal Control Technology for Device Class Testing
- Author
-
David A. Torvi, Fang-Xiang Wu, J.W. Wan, and Wenjun Zhang
- Subjects
Engineering ,Class (computer programming) ,Temperature control ,business.industry ,Process (computing) ,Control engineering ,Electronic, Optical and Magnetic Materials ,Power (physics) ,Overshoot (signal) ,Process control ,Point (geometry) ,Stage (hydrology) ,Electrical and Electronic Engineering ,business - Abstract
A novel technology for controlling temperature rise in class testing is described in this article. This technology is based on two active heater sources and is called a two-heater active thermal control (2H-ATC) system. From a control point of view, a lumped analytical model representing the whole class testing process is very important, and one is developed in this article. The model was validated by comparing the simulated result with the measured result on a commercial tester. Based on this model, we have studied the optimization of the performance of the testing process, in particular examining the effects of test system parameters on system performance. We have also observed a concept called critical heater power, which is important in achieving a minimum overshoot at the transition from the preheating stage to the testing stage. The outcome of this study has already been applied in practical process control during the whole class testing process.
- Published
- 2004
- Full Text
- View/download PDF