110 results on '"Bajic Vladimir B"'
Search Results
2. Genome-wide analysis of regions similar to promoters of histone genes.
- Author
- Chowdhary, Rajesh, Bajic, Vladimir B., Dong, Difeng, Limsoon Wong, and Liu, Jun S.
- Subjects
- *HISTONES, *GENES, *HUMAN genome, *HYPOTHESIS, *GENE expression
- Abstract
Background: The purpose of this study is to: i) develop a computational model of promoters of human histone-encoding genes (hereafter, histone genes), an important class of genes that participate in various critical cellular processes; ii) use this model to identify regions across the human genome whose structure is similar to that of histone gene promoters; such regions could represent potential genomic regulatory regions, e.g. promoters, of genes that may be coregulated with histone genes; and iii) identify in this way genes that have a high likelihood of being coregulated with the histone genes. Results: We successfully developed a histone promoter model using a comprehensive collection of histone genes. Based on a leave-one-out cross-validation test, the model produced good prediction accuracy (94.1% sensitivity, 92.6% specificity, and 92.8% positive predictive value). We used this model to predict across the genome a number of genes that shared similar promoter structures with the histone gene promoters. We thus hypothesize that these predicted genes could be coregulated with histone genes. This hypothesis matches well with the available gene expression, gene ontology, and pathways data. Together with the promoters of the above-mentioned genes, we found a large number of intergenic regions with a structure similar to that of histone promoters. Conclusions: This study represents one of the most comprehensive computational analyses conducted thus far on a genome-wide scale of promoters of human histone genes. Our analysis suggests a number of other human genes that share a high similarity of promoter structure with the histone genes and thus are highly likely to be coregulated, and consequently coexpressed, with the histone genes. We also found that there are a large number of intergenic regions across the genome with structures similar to promoters of histone genes. These regions may be promoters of as yet unidentified genes, or may represent remote control regions that participate in regulation of histone and histone-coregulated gene transcription initiation. While these hypotheses still remain to be verified, we believe that they form a useful resource for researchers to further explore regulation of human histone genes and the human genome. It is worthwhile to note that the regulatory regions of the human genome remain largely un-annotated even today and this study is an attempt to supplement our understanding of histone regulatory regions. [ABSTRACT FROM AUTHOR]
- Published
- 2010
3. E2F5 status significantly improves malignancy diagnosis of epithelial ovarian cancer.
- Author
- Kothandaraman, Narasimhan, Bajic, Vladimir B., Brendan, Pang N. K., Huak, Chan Y., Keow, Peh B., Razvi, Khalil, Salto-Tellez, Manuel, and Choolani, Mahesh
- Subjects
- *CANCER patients, *OVARIAN cancer, *TRANSCRIPTION factors, *WESTERN immunoblotting, *DIAGNOSIS
- Abstract
Background: Ovarian epithelial cancer (OEC) usually presents in the later stages of the disease. Factors, especially those associated with cell-cycle genes, affecting the genesis and tumour progression of ovarian cancer are largely unknown. We hypothesized that over-expressed transcription factors (TFs), as well as those that are driving the expression of the OEC over-expressed genes, could be the key to OEC genesis and potentially useful tissue and serum markers for malignancy associated with OEC. Methods: Using a combination of computational (selection of candidate TF markers and malignancy prediction) and experimental approaches (tissue microarray and western blotting on patient samples), we identified and evaluated the E2F5 transcription factor, which is involved in cell proliferation, as a promising candidate regulatory target in early stage disease. Our hypothesis was supported by our tissue array experiments, which showed E2F5 expression only in OEC samples but not in normal and benign tissues, and by significantly positively biased expression in serum samples analysed by western blotting. Results: Analysis of clinical cases shows that the E2F5 status is characteristic of a different population group than the one covered by CA125, a conventional OEC biomarker. E2F5 used in different combinations with CA125 for distinguishing malignant cysts from benign cysts shows that the presence of CA125 or E2F5 increases sensitivity of OEC detection to 97.9% (an increase from 87.5% if only CA125 is used) and, more importantly, the presence of both CA125 and E2F5 increases specificity of OEC detection to 72.5% (an increase from 55% if only CA125 is used). This significantly improved accuracy suggests the possibility of improved diagnostics of OEC. Furthermore, detection of malignancy status in 86 cases (38 benign, 48 early and late OEC) shows that the use of E2F5 status in combination with other clinical characteristics allows for an improved detection of malignant cases with sensitivity, specificity, F-measure and accuracy of 97.92%, 97.37%, 97.92% and 97.67%, respectively. Conclusions: Overall, our findings, in addition to opening a realistic possibility for improved OEC diagnosis, provide indirect evidence that a cell-cycle regulatory protein, E2F5, might play a significant role in OEC pathogenesis. [ABSTRACT FROM AUTHOR]
- Published
- 2010
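Note on the figures in the abstract of result 3: the four percentages reported for the 86-case set (38 benign, 48 OEC) are consistent with a single confusion matrix. The sketch below is only an arithmetic check. The individual counts (47 of 48 malignant cases detected, 37 of 38 benign cases correctly excluded) are an assumption inferred from the reported percentages and are not stated in the abstract, and the function name `diagnostic_metrics` is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)          # recall: detected malignant / all malignant
    specificity = tn / (tn + fp)          # correctly excluded benign / all benign
    precision = tp / (tp + fp)            # positive predictive value
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "accuracy": accuracy,
    }


# Assumed counts (inferred, not given in the abstract): 48 OEC and 38 benign cases.
metrics = diagnostic_metrics(tp=47, fn=1, tn=37, fp=1)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under these assumed counts the script prints 97.92%, 97.37%, 97.92% and 97.67%, matching the sensitivity, specificity, F-measure and accuracy quoted in the abstract.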
4. Sex differences in the recognition of and innate antiviral responses to Seoul virus in Norway rats
- Author
- Hannah, Michele F., Bajic, Vladimir B., and Klein, Sabra L.
- Subjects
- *HANTAVIRUSES, *RATS, *RNA, SEX differences (Biology)
- Abstract
Abstract: Among rodents that carry hantaviruses, more males are infected than females. Male rats also have more copies of Seoul virus RNA and lower transcription of immune-related genes in the lungs than females. To further characterize sex differences in antiviral defenses and whether these differences are mediated by gonadal hormones, we examined viral RNA in the lungs, virus shedding in saliva, and antiviral defenses among male and female rats that were intact, gonadectomized neonatally, or gonadectomized in adulthood. Following inoculation with Seoul virus, high amounts of viral RNA persisted longer in lungs from intact males than intact females. Removal of the gonads in males reduced the amount of viral RNA to levels comparable with intact females at 40 days post-inoculation (p.i.). Intact males shed more virus in saliva than intact females 15 days p.i.; removal of the gonads during either the neonatal period or in adulthood increased virus shedding in females and decreased virus shedding in males. Induction of pattern recognition receptors (PRRs; Tlr7 and Rig-I), expression of antiviral genes (Myd88, Visa, Jun, Irf7, Ifnβ, Ifnar1, Jak2, Stat3, and Mx2), and production of Mx protein were elevated in the lungs of intact females compared with intact males. Gonadectomy had more robust effects on the induction of PRRs than on downstream IFNβ or Mx2 expression. Putative androgen and estrogen response elements are present in the promoters of several of these antiviral genes, suggesting the propensity for sex steroids to directly affect dimorphic antiviral responses against Seoul virus infection. [Copyright Elsevier]
- Published
- 2008
5. Transcriptional network dynamics in macrophage activation
- Author
- Nilsson, Roland, Bajic, Vladimir B., Suzuki, Harukazu, di Bernardo, Diego, Björkegren, Johan, Katayama, Shintaro, Reid, James F., Sweet, Matthew J., Gariboldi, Manuela, Carninci, Piero, Hayashizaki, Yosihide, Hume, David A., Tegner, Jesper, and Ravasi, Timothy
- Subjects
- *KILLER cells, *NATURAL immunity, *PROTEINS, *TRANSCRIPTION factors
- Abstract
Abstract: Transcriptional regulatory networks govern cell differentiation and the cellular response to external stimuli. However, mammalian model systems have not yet been accessible for network analysis. Here, we present a genome-wide network analysis of the transcriptional regulation underlying the mouse macrophage response to bacterial lipopolysaccharide (LPS). Key to uncovering the network structure is our combination of time-series cap analysis of gene expression with in silico prediction of transcription factor binding sites. By integrating microarray and qPCR time-series expression data with a promoter analysis, we find dynamic subnetworks that describe how signaling pathways change during the progress of the macrophage LPS response, thus defining regulatory modules characteristic of the inflammatory response. In particular, our integrative analysis enabled us to suggest novel roles for the transcription factors ATF-3 and NRF-2 during the inflammatory response. We believe that the systems approach presented here is applicable to understanding cellular differentiation in higher eukaryotes. [Copyright Elsevier]
- Published
- 2006
6. Mice and Men: Their Promoter Properties.
- Author
- Bajic, Vladimir B., Sin Lam Tan, Christoffels, Alan, Schönbach, Christian, Lipovich, Leonard, Liang Yang, Hofmann, Oliver, Kruger, Adele, Hide, Winston, Kai, Chikatoshi, Kawai, Jun, Hume, David A., Carninci, Piero, and Hayashizaki, Yoshihide
- Subjects
- *PROMOTERS (Genetics), *MICE, *HUMAN beings, *GENETIC transcription, *CIRCULAR DNA, *GENOMES
- Abstract
Using the two largest collections of Mus musculus and Homo sapiens transcription start sites (TSSs) determined based on CAGE tags, ditags, full-length cDNAs, and other transcript data, we describe the compositional landscape surrounding TSSs with the aim of gaining better insight into the properties of mammalian promoters. We classified TSSs into four types based on compositional properties of regions immediately surrounding them. These properties highlighted distinctive features in the extended core promoters that helped us delineate boundaries of the transcription initiation domain space for both species. The TSS types were analyzed for associations with initiating dinucleotides, CpG islands, TATA boxes, and an extensive collection of statistically significant cis-elements in mouse and human. We found that different TSS types show preferences for different sets of initiating dinucleotides and cis-elements. Through Gene Ontology and eVOC categories and tissue expression libraries we linked TSS characteristics to expression. Moreover, we show a link of TSS characteristics to very specific genomic organization in an example of immune-response-related genes (GO:0006955). Our results shed light on the global properties of the two transcriptomes not revealed before and therefore provide the framework for better understanding of the transcriptional mechanisms in the two species, as well as a framework for development of new and more efficient promoter- and gene-finding tools. [ABSTRACT FROM AUTHOR]
- Published
- 2006
7. Dragon Plant Biology Explorer. A Text-Mining Tool for Integrating Associations between Genetic and Biochemical Entities with Genome Annotation and Biochemical Terms Lists.
- Author
- Bajic, Vladimir B., Veronika, Merlin, Veladandi, Pardha Sarathi, Meka, Archana, Mok-Wei Heng, Rajaraman, Kanagasabai, Hong Pan, and Swarup, Sanjay
- Subjects
- *DATA mining, *DATABASE searching, *ONLINE data processing, *ARABIDOPSIS, *BRASSICACEAE, *PLANT physiology, *INFORMATION resources
- Abstract
We introduce a tool for text mining, Dragon Plant Biology Explorer (DPBE) that integrates information on Arabidopsis (Arabidopsis thaliana) genes with their functions, based on gene ontologies and biochemical entity vocabularies, and presents the associations as interactive networks. The associations are based on (1) user-provided PubMed abstracts; (2) a list of Arabidopsis genes compiled by The Arabidopsis Information Resource; (3) user-defined combinations of four vocabulary lists based on the ones developed by the general, plant, and Arabidopsis GO consortia; and (4) three lists developed here based on metabolic pathways, enzymes, and metabolites derived from AraCyc, BRENDA, and other metabolism databases. We demonstrate how various combinations can be applied to fields of (1) gene function and gene interaction analyses, (2) plant development, (3) biochemistry and metabolism, and (4) pharmacology of bioactive compounds. Furthermore, we show the suitability of DPBE for systems approaches by integration with ‘omics’ platform outputs. Using a list of abiotic stress-related genes identified by microarray experiments, we show how this tool can be used to rapidly build an information base on the previously reported relationships. This tool complements the existing biological resources for systems biology by identifying potentially novel associations using text analysis between cellular entities based on genome annotation terms. Thus, it allows researchers to efficiently summarize existing information for a group of genes or pathways, so as to make better informed choices for designing validation experiments. Last, DPBE can be helpful for beginning researchers and graduate students to summarize vast information in an unfamiliar area. [ABSTRACT FROM AUTHOR]
- Published
- 2005
8. Computational methods for prediction of T-cell epitopes—a framework for modelling, testing, and applications
- Author
- Brusic, Vladimir, Bajic, Vladimir B., and Petrovsky, Nikolai
- Subjects
- *EPITOPES, *HLA histocompatibility antigens, *T cells, *PEPTIDES
- Abstract
Abstract: Computational models complement laboratory experimentation for efficient identification of MHC-binding peptides and T-cell epitopes. Methods for prediction of MHC-binding peptides include binding motifs, quantitative matrices, artificial neural networks, hidden Markov models, and molecular modelling. Models derived by these methods have been successfully used for prediction of T-cell epitopes in cancer, autoimmunity, infectious disease, and allergy. For maximum benefit, the use of computer models must be treated as experiments analogous to standard laboratory procedures and performed according to strict standards. This requires careful selection of data for model building, and adequate testing and validation. A range of web-based databases and MHC-binding prediction programs are available. Although some available prediction programs for particular MHC alleles have reasonable accuracy, there is no guarantee that all models produce good quality predictions. In this article, we present and discuss a framework for modelling, testing, and applications of computational methods used in predictions of T-cell epitopes. [Copyright Elsevier]
- Published
- 2004
9. Promoter prediction analysis on the whole human genome.
- Author
- Bajic, Vladimir B., Tan, Sin Lam, Suzuki, Yutaka, and Sugano, Sumio
- Subjects
- *HUMAN genome, *GENETIC regulation, *DNA, *MESSENGER RNA, *CHROMOSOMES, *GENOMES
- Abstract
Promoter prediction programs (PPPs) are important for in silico gene discovery without support from expressed sequence tag (EST)/cDNA/mRNA sequences, in the analysis of gene regulation and in genome annotation. Contrary to previous expectations, a comprehensive analysis of PPPs reveals that no program simultaneously achieves both a sensitivity and a positive predictive value above 65%. PPP performances deduced from a limited number of chromosomes or smaller data sets do not hold when evaluated at the level of the whole genome, with serious inaccuracy of predictions for non-CpG-island-related promoters. Some PPPs even perform worse than, or close to, pure random guessing. [ABSTRACT FROM AUTHOR]
- Published
- 2004
10. Content Analysis of the Core Promoter Region of Human Genes.
- Author
- Bajic, Vladimir B., Choudhary, Vidhu, and Hock, Chuan Koh
- Subjects
- *BIOCHEMISTRY, *BINDING sites, *TRANSCRIPTION factors, *PROTEINS, *DATABASES
- Abstract
We analyzed extended core promoter regions covering the [-70,+60] segment relative to the transcription start site of human promoters contained in the Eukaryotic Promoter Database. The analysis was made by using the Match program ver. 1.9 with an optimized setting and the TRANSFAC Professional database ver. 7.2. This analysis revealed that the most common transcription factor binding site in the examined collection of core promoters appears to be the initiator (characterized by GEN_INI), which is expected. The other, less obvious sites found were Spz1, E2F-1, ZF5, and C/EBP. The 'cap' site was also in this most common group. Over-representation of these sites relative to the non-promoter background data ranged from 0.3167 to 32.1645. These sites were characterized by being present in more than 60% of promoter sequences. Interestingly, the TATA-box was found in only 11.63% of all examined promoters. The study is complemented by separate analyses of promoter groups having different GC content. These additional analyses revealed that the most common promoter elements found also include AP-2, CdxA, Pax-2, SRY, STAT1 and STAT5A. It was also observed that a number of promoter elements show a strong preference for either the GC-rich or the GC-poor core promoters. [ABSTRACT FROM AUTHOR]
- Published
- 2004
11. Dragon Gene Start Finder: An Advanced System for Finding Approximate Locations of the Start of Gene Transcriptional Units.
- Author
- Bajic, Vladimir B. and Seng Hong Seah
- Subjects
- *GENETIC transcription, *GENOMES, *HUMAN chromosomes
- Abstract
Describes Dragon Gene Start Finder, a system for finding approximate locations of the start of gene transcriptional units in mammalian genomes. Accuracy of the system compared to other systems; Programs used for the comparison analysis; Comparison of Dragon Gene Start Finder with FirstEF and Eponine.
- Published
- 2003
12. Computer model for recognition of functional transcription start sites in RNA polymerase II promoters of vertebrates
- Author
- Bajic, Vladimir B., Seah, Seng Hong, Chong, Allen, Krishnan, S.P.T., Koh, Judice L.Y., and Brusic, Vladimir
- Subjects
- *MEDICAL transcription, *COMPUTER simulation
- Abstract
This paper introduces a new computer system for recognition of functional transcription start sites (TSSs) in RNA polymerase II promoter regions of vertebrates. This system allows scanning complete vertebrate genomes for promoters with a significantly reduced number of false positive predictions. It can be used in the context of gene finding through its recognition of the 5′ end of genes. The implemented recognition model uses a composite-hierarchical approach, artificial intelligence, statistics, and signal processing techniques. It also exploits the separation of promoter sequences into those that are C+G-rich or C+G-poor. The system was evaluated on a large and diverse human sequence set and exhibited several times higher accuracy than several publicly available TSS-finding programs. Results obtained using human chromosome 22 data showed even greater specificity than the evaluation set results. The system has been implemented in the Dragon Promoter Finder package, which can be accessed at http://sdmc.krdl.org.sg:8080/promoter/. [Copyright Elsevier]
- Published
- 2003
13. Robust discrete adaptive input-output-based sliding mode controller.
- Author
- Sha, Daohang and Bajic, Vladimir B
- Subjects
- *SLIDING mode control, *POLE assignment
- Abstract
A robust input-output-based discrete adaptive sliding mode controller is proposed. It combines an integral action, a nonlinear output feedback, an adjustable sliding mode and an adaptive plant parameter estimator. The controller design is carried out via the Lyapunov direct method. A pole assignment procedure is developed for determination of the integral control gain and the coefficients of the sliding mode hyperplane. An on-line update of the hyperplane coefficients is used to improve control loop behaviour further. Compared with the optimally tuned proportional-integral-derivative (PID) controller, the new controller has increased robustness with regard to variation in the main process parameters and it has much better set point tracking characteristics. The new controller also exhibits a very good disturbance rejection property, comparable to or better than that obtained by the optimally tuned PID controller. Simulation experiments are made to illustrate the quality and robustness of control achieved. [ABSTRACT FROM AUTHOR]
- Published
- 2000
14. DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches.
- Author
- Olayan, Rawan S, Ashoor, Haitham, and Bajic, Vladimir B
- Subjects
- *TARGETED drug delivery, *PREDICTION models, *ACCURACY, *PROTEIN analysis, *SYSTEMS biology
- Abstract
Motivation: Computationally finding drug-target interactions (DTIs) is a convenient strategy to identify new DTIs at low cost with reasonable accuracy. However, current DTI prediction methods suffer from a high false-positive prediction rate. Results: We developed DDR, a novel method that improves the DTI prediction accuracy. DDR is based on the use of a heterogeneous graph that contains known DTIs with multiple similarities between drugs and multiple similarities between target proteins. DDR applies a non-linear similarity fusion method to combine different similarities. Before fusion, DDR performs a pre-processing step where a subset of similarities is selected in a heuristic process to obtain an optimized combination of similarities. Then, DDR applies a random forest model using different graph-based features extracted from the DTI heterogeneous graph. Using 5 repeats of 10-fold cross-validation, three testing setups and the weighted average of area under the precision-recall curve (AUPR) scores, we show that DDR significantly reduces the AUPR score error relative to the next best state-of-the-art method for predicting DTIs by 34% when the drugs are new, by 23% when targets are new and by 34% when the drugs and the targets are known but not all DTIs between them are known. Using independent sources of evidence, we verify as correct 22 out of the top 25 DDR novel predictions. This suggests that DDR can be used as an efficient method to identify correct DTIs. [ABSTRACT FROM AUTHOR]
- Published
- 2018
15. Omni-PolyA: a method and tool for accurate recognition of Poly(A) signals in human genomic DNA.
- Author
- Magana-Mora, Arturo, Kalkatawi, Manal, and Bajic, Vladimir B.
- Subjects
- *NUCLEOTIDE sequence, *MACHINE learning, *BIOINFORMATICS, *GENETIC algorithms, *POLYMERIZATION
- Abstract
Background: Polyadenylation is a critical stage of RNA processing during the formation of mature mRNA, and is present in most of the known eukaryote protein-coding transcripts and many long non-coding RNAs. The correct identification of poly(A) signals (PAS) not only helps to elucidate the 3'-end genomic boundaries of a transcribed DNA region and gene regulatory mechanisms but also gives insight into the multiple transcript isoforms resulting from alternative PAS. Although progress has been made in the in-silico prediction of genomic signals, the recognition of PAS in DNA genomic sequences remains a challenge. Results: In this study, we analyzed human genomic DNA sequences for the 12 most common PAS variants. Our analysis has identified a set of features that helps in the recognition of true PAS, which may be involved in the regulation of the polyadenylation process. The proposed features, in combination with a recognition model, resulted in a novel method and tool, Omni-PolyA. Omni-PolyA combines several machine learning techniques such as different classifiers in a tree-like decision structure and genetic algorithms for deriving a robust classification model. We performed a comparison between results obtained by state-of-the-art methods, deep neural networks, and Omni-PolyA. Results show that Omni-PolyA significantly reduced the average classification error rate by 35.37% in the prediction of the 12 considered PAS variants relative to the state-of-the-art results. Conclusions: The results of our study demonstrate that Omni-PolyA is currently the most accurate model for the prediction of PAS in human and can serve as a useful complement to other PAS recognition methods. Omni-PolyA is publicly available as an online tool accessible at www.cbrc.kaust.edu.sa/omnipolya/. [ABSTRACT FROM AUTHOR]
- Published
- 2017
16. Progress and challenges in bioinformatics approaches for enhancer identification.
- Author
- Kleftogiannis, Dimitrios, Kalnis, Panos, and Bajic, Vladimir B.
- Subjects
- *BIOINFORMATICS, *GENE enhancers, *GENETIC regulation, *GENE expression, *CHROMATIN, *MACHINE learning
- Abstract
Enhancers are cis-acting DNA elements that play critical roles in distal regulation of gene expression. Identifying enhancers is an important step for understanding distinct gene expression programs that may reflect normal and pathogenic cellular conditions. Experimental identification of enhancers is constrained by the set of conditions used in the experiment. This requires multiple experiments to identify enhancers, as they can be active under specific cellular conditions but not in different cell types/tissues or cellular states. This has opened prospects for computational prediction methods that can be used for high-throughput identification of putative enhancers to complement experimental approaches. Potential functions and properties of predicted enhancers have been catalogued and summarized in several enhancer-oriented databases. Because the current methods for the computational prediction of enhancers produce significantly different enhancer predictions, it will be beneficial for the research community to have an overview of the strategies and solutions developed in this field. In this review, we focus on the identification and analysis of enhancers by bioinformatics approaches. First, we describe a general framework for computational identification of enhancers, present relevant data types and discuss possible computational solutions. Next, we cover over 30 existing computational enhancer identification methods that were developed since 2000. Our review highlights advantages, limitations and potentials, while suggesting pragmatic guidelines for development of more efficient computational enhancer prediction methods. Finally, we discuss challenges and open problems of this topic, which require further consideration. [ABSTRACT FROM AUTHOR]
- Published
- 2016
17. BEACON: automated tool for Bacterial GEnome Annotation ComparisON.
- Author
- Kalkatawi, Manal, Alam, Intikhab, and Bajic, Vladimir B.
- Subjects
- *BACTERIAL genomes, *PROKARYOTES, *MICROBIAL genomes, *GENOMES, *PROKARYOTIC genomes
- Abstract
Background: Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results: The presence of many AMs and the development of new ones create the need to: a) compare different annotations for a single genome, and b) generate an annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. To illustrate BEACON's utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated with putative functions by up to 27%, while the number of genes without any function assignment is reduced. Conclusions: We developed BEACON, a fast tool for automated and systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/. [ABSTRACT FROM AUTHOR]
- Published
- 2015
19. DES-Amyloidoses "Amyloidoses through the looking-glass": A knowledgebase developed for exploring and linking information related to human amyloid-related diseases.
- Author
- Bajic, Vladan P., Salhi, Adil, Lakota, Katja, Radovanovic, Aleksandar, Razali, Rozaimi, Zivkovic, Lada, Spremo-Potparevic, Biljana, Uludag, Mahmut, Tifratene, Faroug, Motwalli, Olaa, Marchand, Benoit, Bajic, Vladimir B., Gojobori, Takashi, Isenovic, Esma R., and Essack, Magbubah
- Subjects
- *SCIENTIFIC literature, *TEXT mining, *AMYLOID, *ALZHEIMER'S disease, *AMYLOID beta-protein, *DATA mining, *BIOLOGICAL networks
- Abstract
More than 30 types of amyloids are linked to close to 50 diseases in humans, the most prominent being Alzheimer's disease (AD). AD is a brain-related local amyloidosis, while other amyloidoses, such as AA amyloidosis, tend to be more systemic. Therefore, we need to know more about the biological entities influencing these amyloidosis processes. However, there is currently no support system developed specifically to handle this extraordinarily complex and demanding task. To acquire a systematic view of amyloidosis and how this may be relevant to the brain and other organs, we needed a means to explore "amyloid network systems" that may underlie processes that lead to an amyloid-related disease. In this regard, we developed the DES-Amyloidoses knowledgebase (KB) to obtain fast and relevant information regarding the biological network related to amyloid proteins/peptides and amyloid-related diseases. This KB contains information obtained through text and data mining of available scientific literature and other public repositories. The information compiled into the DES-Amyloidoses system based on 19 topic-specific dictionaries resulted in 796,409 associations between terms from these dictionaries. Users can explore this information through various options, including enriched concepts, enriched pairs, and semantic similarity. We show the usefulness of the KB using an example focused on inflammasome-amyloid associations. To our knowledge, this is the only KB dedicated to human amyloid-related diseases derived primarily through literature text mining and complemented by data mining that provides a novel way of exploring information relevant to amyloidoses. [ABSTRACT FROM AUTHOR]
- Published
- 2022
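The "associations between terms from these dictionaries" in the abstract above can be pictured as cross-dictionary co-occurrences within a document. The sketch below is only an illustration under stated assumptions: the function `cooccurrence_pairs`, the two toy dictionaries, and the three toy sentences are all hypothetical, and naive substring matching stands in for whatever text-mining pipeline the KB actually uses.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_pairs(documents, dictionaries):
    """Count pairs of terms from *different* dictionaries seen in the same document."""
    # Map every lower-cased term to the name of the dictionary it belongs to.
    term_to_dict = {
        term.lower(): name
        for name, terms in dictionaries.items()
        for term in terms
    }
    pairs = Counter()
    for text in documents:
        lowered = text.lower()
        # Naive substring matching (assumption); sorted so each pair has one canonical order.
        found = sorted(term for term in term_to_dict if term in lowered)
        for a, b in combinations(found, 2):
            if term_to_dict[a] != term_to_dict[b]:
                pairs[(a, b)] += 1
    return pairs


# Hypothetical dictionaries and documents, for illustration only.
toy_dictionaries = {
    "disease": ["Alzheimer's disease", "AA amyloidosis"],
    "protein": ["amyloid beta", "NLRP3"],
}
toy_documents = [
    "NLRP3 activation by amyloid beta in Alzheimer's disease.",
    "Amyloid beta deposits in Alzheimer's disease brains.",
    "Serum markers of AA amyloidosis.",
]
toy_pairs = cooccurrence_pairs(toy_documents, toy_dictionaries)
for pair, count in toy_pairs.most_common():
    print(pair, count)
```

Pairs within the same dictionary are skipped on purpose, so the counts reflect links between concept types rather than synonyms or sibling terms.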
20. Chemical Compounds Toxic to Invertebrates Isolated from Marine Cyanobacteria of Potential Relevance to the Agricultural Industry.
- Author
-
Essack, Magbubah, Alzubaidy, Hanin S., Bajic, Vladimir B., and Archer, John A. C.
- Subjects
- *
MARINE toxins , *CYANOBACTERIA , *INVERTEBRATE pests , *PEST control , *AGRICULTURAL industries , *CLIMATE change - Abstract
In spite of advances in invertebrate pest management, the agricultural industry is suffering from impeded pest control exacerbated by global climate changes that have altered rain patterns to favour opportunistic breeding. Thus, novel naturally derived chemical compounds toxic to both terrestrial and aquatic invertebrates are of interest, as potential pesticides. In this regard, marine cyanobacterium-derived metabolites that are toxic to both terrestrial and aquatic invertebrates continue to be a promising, but neglected, source of potential pesticides. A PubMed query combined with hand-curation of the information from retrieved articles allowed for the identification of 36 cyanobacteria-derived chemical compounds experimentally confirmed as being toxic to invertebrates. These compounds are discussed in this review. [ABSTRACT FROM AUTHOR]
- Published
- 2014
21. Information Exploration System for Sickle Cell Disease and Repurposing of Hydroxyfasudil.
- Author
-
Essack, Magbubah, Radovanovic, Aleksandar, and Bajic, Vladimir B.
- Subjects
- *
SICKLE cell anemia , *FETAL diseases , *DISEASE complications , *MEDICAL literature , *COMPUTATIONAL biology , *MEDICAL informatics , *MEDICAL genetics - Abstract
Background: Sickle cell disease (SCD) is a fatal monogenic disorder with no effective cure and thus high rates of morbidity and sequelae. Efforts toward discovery of disease-modifying drugs and curative strategies can be augmented by leveraging the plethora of information contained in available biomedical literature. To facilitate research in this direction we have developed a resource, Dragon Exploration System for Sickle Cell Disease (DESSCD) (http://cbrc.kaust.edu.sa/desscd/), that aims to promote the easy exploration of SCD-related data. Description: The Dragon Exploration System (DES), developed based on text mining and complemented by data mining, processed 419,612 MEDLINE abstracts retrieved from a PubMed query using SCD-related keywords. The processed SCD-related data has been made available via the DESSCD web query interface that enables: a/ information retrieval using specified concepts, keywords and phrases, and b/ the generation of inferred association networks and hypotheses. The usefulness of the system is demonstrated by: a/ reproducing a known scientific fact, the “Sickle_Cell_Anemia–Hydroxyurea” association, and b/ generating a novel and plausible “Sickle_Cell_Anemia–Hydroxyfasudil” hypothesis. A PCT patent (PCT/US12/55042) has been filed for the latter drug repurposing for SCD treatment. Conclusion: We developed the DESSCD resource dedicated to exploration of text-mined and data-mined information about SCD. No similar SCD-related resource exists. Thus, we anticipate that DESSCD will serve as a valuable tool for physicians and researchers interested in SCD. [ABSTRACT FROM AUTHOR]
- Published
- 2013
22. Simplified Method to Predict Mutual Interactions of Human Transcription Factors Based on Their Primary Structure.
- Author
-
Schmeier, Sebastian, Jankovic, Boris, and Bajic, Vladimir B.
- Subjects
- *
TRANSCRIPTION factors , *AMINO acid sequence , *PROTEIN-protein interactions , *GENETIC regulation , *DISCRIMINANT analysis , *CELL compartmentation , *COMPARATIVE studies - Abstract
Background: Physical interactions between transcription factors (TFs) are necessary for forming regulatory protein complexes and thus play a crucial role in gene regulation. Currently, knowledge about the mechanisms of these TF interactions is incomplete and the number of known TF interactions is limited. Computational prediction of such interactions can help identify potential new TF interactions as well as contribute to a better understanding of the complex machinery involved in gene regulation. Methodology: We propose here such a method for the prediction of TF interactions. The method uses only the primary sequence information of the interacting TFs, resulting in a much greater simplicity of the prediction algorithm. Through an advanced feature selection process, we determined a subset of 97 model features that constitute the optimized model in the subset we considered. The model, based on quadratic discriminant analysis, achieves a prediction accuracy of 85.39% on a blind set of interactions. This result is achieved even though the negative data set contains only TFs from the same type of proteins, i.e. TFs that function in the same cellular compartment (nucleus) and in the same type of molecular process (transcription initiation). Such selection poses significant challenges for developing models with high specificity, but at the same time better reflects real-world problems. Conclusions: The performance of our predictor compares well to those of much more complex approaches for predicting TF and general protein-protein interactions, particularly when taking the reduced complexity of model utilisation into account. [ABSTRACT FROM AUTHOR]
- Published
- 2011
23. Systems biology of innate immunity
- Author
-
Tegnér, Jesper, Nilsson, Roland, Bajic, Vladimir B., Björkegren, Johan, and Ravasi, Timothy
- Subjects
- *
NATURAL immunity , *MOLECULAR biology , *KILLER cells , *BIOCHEMISTRY - Abstract
Abstract: Systems biology has emerged as an exciting research approach in molecular biology and functional genomics that involves a systematic use of genomic, proteomic, and metabolomic technologies for the construction of network-based models of biological processes. These endeavors, collectively referred to as systems biology, establish a paradigm by which to systematically interrogate, model, and iteratively refine our knowledge of the regulatory events within a cell. Here, we present a new systems approach, integrating DNA and transcript expression information, specifically designed to identify transcriptional networks governing the macrophage immune response to lipopolysaccharide (LPS). Using this approach, we are able to infer not only a global macrophage transcriptional network, but also time-specific sub-networks that are dynamically active across the LPS response. We believe that our systems biology approach could be useful for identifying other complex networks mediating immunological responses. [Copyright Elsevier]
- Published
- 2006
24. Information for the Coordinates of Exons (ICE): a human splice sites database
- Author
-
Chong, Allen, Zhang, Guanglan, and Bajic, Vladimir B.
- Subjects
- *
GENETICS , *HEREDITY , *GENE silencing , *MESSENGER RNA - Abstract
We present a comprehensive database, Information for the Coordinates of Exons (ICE), of genomic splice sites (SSs) for 10,803 human genes. ICE contains 91,846 pairs of donor-acceptor sites, supported by the alignment of “full-length” human mRNAs (including transcript variants) on human genomic sequences. ICE represents the largest collection of human SSs known to date and provides a significant resource to both molecular biologists and bioinformaticians alike. A user can visualize and extract genomic sequences around SSs of the donor-acceptor pairs and can also visualize the primary structure of individual genes. We list in this article the 22 most frequently found canonical and noncanonical splice sites. The top four most represented donor-acceptor pairs (GT-AG, GC-AG, AT-AC, and GT-GG) accounted for 99.16% of our data set. In addition, we calculated the SS matrix models for the three most common donor-acceptor pairs. The database is focused on providing SSs and surrounding sequence information, associated SS and sequence characteristics, and relation to overall transcript structure. It allows targeted search and presents evidence for the gene structure. [Copyright Elsevier]
- Published
- 2004
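The donor-acceptor tallies quoted in the ICE abstract (for example, the share of the data set covered by the top four pairs) can be illustrated with a small sketch. It assumes intron sequences are already available as plain strings; the helper names `count_splice_pairs` and `coverage` and the toy introns are hypothetical and not part of ICE. The donor is taken as the first two intron bases and the acceptor as the last two.

```python
from collections import Counter


def count_splice_pairs(introns):
    """Tally donor-acceptor dinucleotide pairs (e.g. 'GT-AG') over intron sequences."""
    pairs = Counter()
    for seq in introns:
        seq = seq.upper()
        if len(seq) < 4:
            continue  # too short to hold both a donor and an acceptor dinucleotide
        pairs[f"{seq[:2]}-{seq[-2:]}"] += 1
    return pairs


def coverage(pairs, top_n):
    """Percentage of all introns accounted for by the top_n most frequent pairs."""
    total = sum(pairs.values())
    if total == 0:
        return 0.0
    top = sum(count for _, count in pairs.most_common(top_n))
    return 100.0 * top / total


# Invented toy introns, for illustration only.
toy_introns = [
    "GTAAGTCCTTAG",
    "GTGAGTTTTCAG",
    "GCAAGTCTGCAG",
    "ATATCCTTTTAC",
    "gtaagaccccag",
]
splice_pairs = count_splice_pairs(toy_introns)
print(splice_pairs.most_common())
print(coverage(splice_pairs, 1))
```

On real data the same two functions would be fed introns derived from mRNA-to-genome alignments; how ICE derives those alignments is not covered by this sketch.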
25. Enhancement of Plant-Microbe Interactions Using a Rhizosphere Metabolomics in the Removal of Polychlorinated Biphenyls.
- Author
-
Narasimhan, Kothandaraman, Basheer, Chanbasha, Bajic, Vladimir B., and Swarup, Sanjay
- Subjects
- *
POLYCHLORINATED biphenyls , *BACTERIA , *PHYSIOLOGICAL control systems - Abstract
Examines the enhanced depletion of polychlorinated biphenyls (PCB) using root-associated microbes. Improvement of the competitive abilities of biocontrol and biofertilization strains; Colonization of the phenylpropanoid-utilizing microbes; Removal of PCBs.
- Published
- 2003
26. Information and Sequence Extraction around the 5'-End and Translation Initiation Site of Human Genes.
- Author
-
Chong, Allen, Zhang, Guanglan, and Bajic, Vladimir B.
- Subjects
- *
GENES , *PROMOTERS (Genetics) , *BIOINFORMATICS - Abstract
FIE (5'-end Information Extraction) is a web-based program designed primarily to extract the sequence of the regions around the 5'-end and around the translation initiation sites for a particular gene, based on information provided by LocusLink. [ABSTRACT FROM AUTHOR]
- Published
- 2002
27. DES-Tcell is a knowledgebase for exploring immunology-related literature.
- Author
-
AlSaieedi, Ahdab, Salhi, Adil, Tifratene, Faroug, Raies, Arwa Bin, Hungler, Arnaud, Uludag, Mahmut, Van Neste, Christophe, Bajic, Vladimir B., Gojobori, Takashi, and Essack, Magbubah
- Subjects
- *
AUTOIMMUNE thyroiditis , *LEUCOCYTES , *DISEASE risk factors , *GRAFT rejection , *DRUGS - Abstract
T-cells are a subtype of white blood cells circulating throughout the body, searching for infected and abnormal cells. They have multifaceted functions that include scanning for and directly killing cells infected with intracellular pathogens, eradicating abnormal cells, orchestrating immune response by activating and helping other immune cells, memorizing encountered pathogens, and providing long-lasting protection upon recurrent infections. However, T-cells are also involved in immune responses that result in organ transplant rejection, autoimmune diseases, and some allergic diseases. To support T-cell research, we developed the DES-Tcell knowledgebase (KB). This KB incorporates text- and data-mined information that can expedite retrieval and exploration of T-cell relevant information from the large volume of published T-cell-related research. This KB enables exploration of data through concepts from 15 topic-specific dictionaries, including immunology-related genes, mutations, pathogens, and pathways. We developed three case studies using DES-Tcell, one of which validates effective retrieval of known associations by DES-Tcell. The second and third case studies focus on concepts that are common to Graves' disease (GD) and Hashimoto's thyroiditis (HT). Several reports have shown that up to 20% of GD patients treated with antithyroid medication develop HT, thus suggesting a possible conversion or shift from GD to HT disease. DES-Tcell found that miR-4442 links to both GD and HT, and that miR-4442 possibly targets the autoimmune disease risk factor CD6, which provides potential new knowledge derived through the use of DES-Tcell. According to our understanding, DES-Tcell is the first KB dedicated to exploring T-cell-relevant information via literature-mining, data-mining, and topic-specific dictionaries. [ABSTRACT FROM AUTHOR]
- Published
- 2021
28. DTi2Vec: Drug–target interaction prediction using network embedding and ensemble learning.
- Author
-
Thafar, Maha A., Olayan, Rawan S., Albaradei, Somayah, Bajic, Vladimir B., Gojobori, Takashi, Essack, Magbubah, and Gao, Xin
- Subjects
- *
SCIENTIFIC literature , *DRUG repositioning , *SCIENCE databases , *DRUG utilization , *FORECASTING , *VIRTUAL networks , *FEATURE extraction - Abstract
Drug–target interaction (DTI) prediction is a crucial step in drug discovery and repositioning as it reduces experimental validation costs if done right. Thus, developing in-silico methods to predict potential DTI has become a competitive research niche, with one of its main focuses being improving the prediction accuracy. Using machine learning (ML) models for this task, specifically network-based approaches, is effective and has shown great advantages over the other computational methods. However, ML model development involves upstream hand-crafted feature extraction and other processes that impact prediction accuracy. Thus, network-based representation learning techniques that provide automated feature extraction combined with traditional ML classifiers dealing with downstream link prediction tasks may be better-suited paradigms. Here, we present such a method, DTi2Vec, which identifies DTIs using network representation learning and ensemble learning techniques. DTi2Vec constructs the heterogeneous network, and then it automatically generates features for each drug and target using the nodes embedding technique. DTi2Vec demonstrated its ability in drug–target link prediction compared to several state-of-the-art network-based methods, using four benchmark datasets and large-scale data compiled from DrugBank. DTi2Vec showed a statistically significant increase in the prediction performances in terms of AUPR. We verified the "novel" predicted DTIs using several databases and scientific literature. DTi2Vec is a simple yet effective method that provides high DTI prediction performance while being scalable and efficient in computation, translating into a powerful drug repositioning tool. [ABSTRACT FROM AUTHOR]
- Published
- 2021
29. KAUST Metagenomic Analysis Platform (KMAP), enabling access to massive analytics of re-annotated metagenomic data.
- Author
-
Alam, Intikhab, Kamau, Allan Anthony, Ngugi, David Kamanda, Gojobori, Takashi, Duarte, Carlos M., and Bajic, Vladimir B.
- Subjects
- *
METAGENOMICS , *DATA analysis , *BIOINFORMATICS , *DATABASE management , *DATA integration - Abstract
The exponential rise of metagenomics sequencing is delivering massive functional environmental genomics data. However, this also generates a procedural bottleneck for ongoing re-analysis: as reference databases grow and methods improve, analyses need to be updated for consistency, which requires access to increasingly demanding bioinformatic and computational resources. Here, we present the KAUST Metagenomic Analysis Platform (KMAP), a new integrated open web-based tool for the comprehensive exploration of shotgun metagenomic data. We illustrate the capacities KMAP provides through the re-assembly of ~27,000 public metagenomic samples captured in ~450 studies sampled across ~77 diverse habitats. A small subset of these metagenomic assemblies is used in this pilot study, grouped into 36 new habitat-specific gene catalogs, all based on full-length (complete) genes. Extensive taxonomic and gene annotations are stored in Gene Information Tables (GITs), a simple tractable data integration format useful for analysis through the command line or for database management. The KMAP pilot study provides the exploration and comparison of microbial GITs across different habitats with over 275 million genes. KMAP access to data and analyses is available at https://www.cbrc.kaust.edu.sa/aamg/kmap.start. [ABSTRACT FROM AUTHOR]
- Published
- 2021
30. DTiGEMS+: drug–target interaction prediction using graph embedding, graph mining, and similarity-based techniques.
- Author
-
Thafar, Maha A., Olayan, Rawan S., Ashoor, Haitham, Albaradei, Somayah, Bajic, Vladimir B., Gao, Xin, Gojobori, Takashi, and Essack, Magbubah
- Subjects
- *
FORECASTING , *SIMILARITY (Geometry) , *EMBEDDINGS (Mathematics) , *ERROR rates , *MINES & mineral resources - Abstract
In silico prediction of drug–target interactions is a critical phase in the sustainable drug development process, especially when the research focus is to capitalize on the repositioning of existing drugs. Developing such computational methods is not an easy task, but it is much needed, as current methods that predict potential drug–target interactions suffer from high false-positive rates. Here we introduce DTiGEMS+, a computational method that predicts Drug–Target interactions using Graph Embedding, graph Mining, and Similarity-based techniques. DTiGEMS+ combines similarity-based as well as feature-based approaches, and models the identification of novel drug–target interactions as a link prediction problem in a heterogeneous network. DTiGEMS+ constructs the heterogeneous network by augmenting the known drug–target interactions graph with two other complementary graphs, namely drug–drug similarity and target–target similarity. DTiGEMS+ combines different computational techniques to provide the final drug–target prediction; these techniques include graph embeddings, graph mining, and machine learning. DTiGEMS+ integrates multiple drug–drug similarities and target–target similarities into the final heterogeneous graph construction after applying a similarity selection procedure as well as a similarity fusion algorithm. Using four benchmark datasets, we show that DTiGEMS+ substantially improves prediction performance compared to other state-of-the-art in silico methods developed to predict drug–target interactions, achieving the highest average AUPR across all datasets (0.92), which reduces the error rate by 33.3% relative to the second-best performing model in the state-of-the-art methods comparison. [ABSTRACT FROM AUTHOR]
- Published
- 2020
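The 33.3% figure in the DTiGEMS+ abstract is a relative reduction in error, not an absolute gain in AUPR. The arithmetic can be sketched as follows, assuming the error is defined as 1 − AUPR (the abstract does not state the definition). The second-best AUPR of about 0.88 is not quoted in the abstract; it is only what the two reported numbers would imply under that assumption. Both function names are hypothetical.

```python
def relative_error_reduction(best_score, baseline_score):
    """Relative drop in error (1 - score) when moving from baseline to best."""
    best_err = 1.0 - best_score
    base_err = 1.0 - baseline_score
    if base_err <= 0:
        raise ValueError("baseline error must be positive")
    return (base_err - best_err) / base_err


def implied_baseline_score(best_score, reduction):
    """Baseline score that would yield the given relative error reduction."""
    return 1.0 - (1.0 - best_score) / (1.0 - reduction)


# 0.92 and 33.3% come from the abstract; the baseline value is derived, not reported.
implied = implied_baseline_score(0.92, 0.333)
print(round(implied, 2))
print(round(relative_error_reduction(0.92, 0.88), 3))
```

Read this way, an error of 0.12 shrinking to 0.08 is a one-third reduction even though the AUPR itself moves by only 0.04.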
31. HbA1C as a marker of retrograde glycaemic control in diabetes patient with co‐existed beta‐thalassaemia: A case report and a literature review.
- Author
-
Gluvic, Zoran, Obradovic, Milan, Lackovic, Milena, Samardzic, Vladimir, Tica Jevtic, Jelena, Essack, Magbubah, Bajic, Vladimir B., and Isenovic, Esma R.
- Subjects
- *
DIABETES prevention , *INSULIN therapy , *ULCER treatment , *BIOMARKERS , *GLYCOSYLATED hemoglobin , *HYPERBARIC oxygenation , *COMORBIDITY , *TREATMENT effectiveness , *BETA-Thalassemia , *GLYCEMIC control - Abstract
What is known and objective: The HbA1C marker used in assessing diabetes control quality is not sufficient in diabetes patients with thalassaemia. Case description: A male diabetic patient with thalassaemia was hospitalized due to distal neuropathic pain, right toe trophic ulcer, unacceptable five‐point glycaemic profile and recommended HbA1C value. After simultaneously initiated insulin therapy and management of ulcer by hyperbaric oxygen, the patient showed improved glycaemic control and ulcer healing, which led to the patient's discharge. What is new and conclusion: In thalassaemia and haemoglobinopathies, due to discrepancies in the five‐point glycaemic profile and HbA1C values, it is necessary to measure HbA1C with a different method or to determine HbA1C and fructosamine simultaneously. [ABSTRACT FROM AUTHOR]
- Published
- 2020
32. Redox control of vascular biology.
- Author
-
Obradovic, Milan, Essack, Magbubah, Zafirovic, Sonja, Sudar‐Milovanovic, Emina, Bajic, Vladan P., Van Neste, Christophe, Trpkovic, Andreja, Stanimirovic, Julijana, Bajic, Vladimir B., and Isenovic, Esma R.
- Subjects
- *
VASCULAR smooth muscle , *CARDIOVASCULAR system , *OXIDATION-reduction reaction , *VASCULAR endothelial cells , *BIOLOGY , *GLYCOCALYX , *ENDOTHELIUM - Abstract
Redox control is lost when the antioxidant defense system cannot remove abnormally high concentrations of signaling molecules, such as reactive oxygen species (ROS). Chronically elevated levels of ROS cause oxidative stress that may eventually lead to cancer and cardiovascular and neurodegenerative diseases. In this review, we focus on redox effects in the vascular system. We pay close attention to the subcompartments of the vascular system (endothelium, smooth muscle cell layer) and give an overview of how redox changes influence those different compartments. We also review the core aspects of redox biology, cardiovascular physiology, and pathophysiology. Moreover, the topic‐specific knowledgebase DES‐RedoxVasc was used to develop two case studies, one focused on endothelial cells and the other on the vascular smooth muscle cells, as a starting point to possibly extend our knowledge of redox control in vascular biology. [ABSTRACT FROM AUTHOR]
- Published
- 2020
33. Proteome-level assessment of origin, prevalence and function of leucine-aspartic acid (LD) motifs.
- Author
-
Alam, Tanvir, Alazmi, Meshari, Naser, Rayan, Huser, Franceline, Momin, Afaque A, Astro, Veronica, Hong, SeungBeom, Walkiewicz, Katarzyna W, Canlas, Christian G, Huser, Raphaël, Ali, Amal J, Merzaban, Jasmeen, Adamo, Antonio, Jaremko, Mariusz, Jaremko, Łukasz, Bajic, Vladimir B, Gao, Xin, and Arold, Stefan T
- Subjects
- *
LEUCINE , *CELL adhesion , *PROTEOMICS , *MACHINE learning , *SOURCE code - Abstract
Motivation: Leucine-aspartic acid (LD) motifs are short linear interaction motifs (SLiMs) that link paxillin family proteins to factors controlling cell adhesion, motility and survival. The existence and importance of LD motifs beyond the paxillin family is poorly understood. Results: To enable a proteome-wide assessment of LD motifs, we developed an active learning based framework (LD Motif Finder; LDMF) that iteratively integrates computational predictions with experimental validation. Our analysis of the human proteome revealed a dozen new proteins containing LD motifs. We found that LD motif signalling evolved in unicellular eukaryotes more than 800 Myr ago, with paxillin and vinculin as core constituents, and nuclear export signal as a likely source of de novo LD motifs. We show that LD motif proteins form a functionally homogenous group, all being involved in cell morphogenesis and adhesion. This functional focus is recapitulated in cells by GFP-fused LD motifs, suggesting that it is intrinsic to the LD motif sequence, possibly through their effect on binding partners. Our approach elucidated the origin and dynamic adaptations of an ancestral SLiM, and can serve as a guide for the identification of other SLiMs for which only a few representatives are known. Availability and implementation: LDMF is freely available online at www.cbrc.kaust.edu.sa/ldmf; source code is available at https://github.com/tanviralambd/LD/. Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2020
34. Characterization and identification of long non-coding RNAs based on feature relationship.
- Author
-
Wang, Guangyu, Yin, Hongyan, Li, Boyang, Yu, Chunlei, Wang, Fan, Xu, Xingjian, Cao, Jiabao, Bao, Yiming, Wang, Liguo, Abbasi, Amir A, Bajic, Vladimir B, Ma, Lina, and Zhang, Zhang
- Subjects
- *
NON-coding RNA , *INTERNET servers , *IDENTIFICATION , *NUMBERS of species , *PRIOR learning , *RNA - Abstract
Motivation: The significance of long non-coding RNAs (lncRNAs) in many biological processes and diseases has attracted intense interest over the past several years. However, computational identification of lncRNAs in a wide range of species remains challenging; it requires prior knowledge of well-established sequences and annotations or species-specific training data, but the reality is that only a limited number of species have high-quality sequences and annotations. Results: Here we first characterize lncRNAs in contrast to protein-coding RNAs based on feature relationship and find that the feature relationship between open reading frame length and guanine-cytosine (GC) content presents universally substantial divergence in lncRNAs and protein-coding RNAs, as observed in a broad variety of species. Based on this feature relationship, we further present LGC, a novel algorithm for identifying lncRNAs that is able to accurately distinguish lncRNAs from protein-coding RNAs in a cross-species manner without any prior knowledge. As validated on large-scale empirical datasets, comparative results show that LGC outperforms existing algorithms by achieving higher accuracy and well-balanced sensitivity and specificity, and is robustly effective (>90% accuracy) in discriminating lncRNAs from protein-coding RNAs across diverse species that range from plants to mammals. To our knowledge, this study, for the first time, differentially characterizes lncRNAs and protein-coding RNAs based on feature relationship, which is further applied in computational identification of lncRNAs. Taken together, our study represents a significant advance in characterization and identification of lncRNAs, and LGC thus bears broad potential utility for computational analysis of lncRNAs in a wide range of species. Availability and implementation: The LGC web server is publicly available at http://bigd.big.ac.cn/lgc/calculator. The scripts and data can be downloaded at http://bigd.big.ac.cn/biocode/tools/BT000004. Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2019
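The two features named in the LGC abstract, open reading frame (ORF) length and GC content, can be computed with a short sketch. This only illustrates feature extraction and is not the LGC algorithm, whose scoring of the relationship between the two features is not described in the abstract. Assumptions made here: the input is a DNA-alphabet transcript, only the forward strand is scanned, and an ORF runs from an ATG to the first in-frame stop codon. The function names and the toy sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq):
    """Fraction of G and C bases in the sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def longest_orf_length(seq):
    """Length in nucleotides of the longest ATG-to-stop ORF on the forward strand.

    The stop codon is included in the length; ORFs without a stop are ignored.
    """
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon of a new candidate ORF
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None  # look for the next ORF in this frame
    return best


# Invented toy transcript, for illustration only.
toy_transcript = "CCATGGCTGCATAAGG"
print(longest_orf_length(toy_transcript), gc_content(toy_transcript))
```

A classifier built on these features would still need a rule or model relating ORF length to GC content; that part is deliberately left out because the abstract does not specify it.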
35. Hybrid model for efficient prediction of poly(A) signals in human genomic DNA.
- Author
-
Albalawi, Fahad, Chahid, Abderrazak, Guo, Xingang, Albaradei, Somayah, Magana-Mora, Arturo, Jankovic, Boris R., Uludag, Mahmut, Van Neste, Christophe, Essack, Magbubah, Laleg-Kirati, Taous-Meriem, and Bajic, Vladimir B.
- Subjects
- *
HUMAN DNA , *PREDICTION models , *GENETIC regulation , *LOGISTIC regression analysis , *NUCLEOTIDE sequence - Abstract
• New hybrid model for poly(A) signal prediction in human DNA is developed.
• The hybrid model contains 8 deep neural networks and 4 logistic regression models.
• The hybrid model contains a separate prediction model for each of the 12 PAS types.
• A novel feature generation method converts DNA sequences into input signals.
• The prediction error of poly(A) signal is reduced by 30.29%.
Polyadenylation signals (PAS) are found in most protein-coding and some non-coding genes in eukaryotes. Their accurate recognition improves understanding of gene regulation mechanisms and recognition of the 3′-end of transcribed gene regions, where premature or alternate transcription ends may lead to various diseases. Although different methods and tools for in-silico prediction of genomic signals have been proposed, the correct identification of PAS in genomic DNA remains challenging due to a vast number of non-relevant hexamers identical to PAS hexamers. In this study, we developed a novel method for PAS recognition. The method is implemented in a hybrid PAS recognition model (HybPAS), which is based on deep neural networks (DNNs) and logistic regression models (LRMs). One such model is developed for each of the 12 most frequent human PAS hexamers. DNN models performed best for eight PAS types (including the two most frequent PAS hexamers), while LRMs performed best for the remaining four PAS types. The new models use different combinations of signal processing-based, statistical, and sequence-based features as input. The results obtained on human genomic data show that HybPAS outperforms the well-tuned state-of-the-art Omni-PolyA models, reducing the classification error for different PAS hexamers by up to 57.35% for 10 out of 12 PAS types, with Omni-PolyA models being better for two PAS types. For the most frequent PAS types, 'AATAAA' and 'ATTAAA', HybPAS reduced the error rate by 35.14% and 34.48%, respectively. On average, HybPAS reduces the error by 30.29%. HybPAS is implemented partly in Python and partly in MATLAB and is available at https://github.com/EMANG-KAUST/PolyA_Prediction_LRM_DNN. [ABSTRACT FROM AUTHOR]
- Published
- 2019
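The difficulty described in the HybPAS abstract, that many hexamers in genomic DNA are identical to PAS hexamers, is easy to see by enumerating candidates before any classification takes place. The sketch below covers only that candidate-extraction step and is not the HybPAS model. It uses just the two hexamers named in the abstract; the other ten PAS types are not listed there and are left out. The function name, the `flank` window size, and the toy sequence are assumptions.

```python
# Only the two hexamers named in the abstract; the remaining ten are not listed there.
PAS_HEXAMERS = ("AATAAA", "ATTAAA")


def find_pas_candidates(seq, flank=10):
    """Return (position, hexamer, context) for every PAS-like hexamer in seq.

    `flank` (an assumed window size) controls how much surrounding sequence is
    kept as context for a downstream classifier.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 5):
        hexamer = seq[i:i + 6]
        if hexamer in PAS_HEXAMERS:
            left = max(0, i - flank)
            hits.append((i, hexamer, seq[left:i + 6 + flank]))
    return hits


# Invented toy sequence, for illustration only.
toy_dna = "GGCAATAAAGTCCATTAAACG"
pas_hits = find_pas_candidates(toy_dna)
for position, hexamer, context in pas_hits:
    print(position, hexamer, context)
```

Every hit is only a candidate: deciding which of them are true polyadenylation signals is the classification task that the per-hexamer models in the abstract address.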
36. DeepGSR: an optimized deep-learning structure for the recognition of genomic signals and regions.
- Author
-
Kalkatawi, Manal, Magana-Mora, Arturo, Jankovic, Boris, and Bajic, Vladimir B
- Subjects
- *
GENOMIC imprinting , *GENE expression , *GENE amplification , *POLYMERASE chain reaction , *MAMMALS - Abstract
Motivation: Recognition of different genomic signals and regions (GSRs) in DNA is crucial for understanding genome organization, gene regulation, and gene function, which in turn generate better genome and gene annotations. Although many methods have been developed to recognize GSRs, their pure computational identification remains challenging. Moreover, various GSRs usually require a specialized set of features for developing robust recognition models. Recently, deep-learning (DL) methods have been shown to generate more accurate prediction models than 'shallow' methods without the need to develop specialized features for the problems in question. Here, we explore the potential use of DL for the recognition of GSRs. Results: We developed DeepGSR, an optimized DL architecture for the prediction of different types of GSRs. The performance of the DeepGSR structure is evaluated on the recognition of polyadenylation signals (PAS) and translation initiation sites (TIS) of different organisms: human, mouse, bovine and fruit fly. The results show that DeepGSR outperformed the state-of-the-art methods, reducing the classification error rate of the PAS and TIS prediction in the human genome by up to 29% and 86%, respectively. Moreover, the cross-organism and genome-wide analyses we performed confirmed the robustness of DeepGSR and provided new insights into the conservation of examined GSRs across species. Availability and implementation: DeepGSR is implemented in Python using Keras API; it is available as open-source software and can be obtained at https://doi.org/10.5281/zenodo.1117159. Supplementary information: Supplementary data are available at Bioinformatics online. [ABSTRACT FROM AUTHOR]
- Published
- 2019
37. CpG traffic lights are markers of regulatory regions in human genome.
- Author
-
Lioznova, Anna V., Khamis, Abdullah M., Artemov, Artem V., Besedina, Elizaveta, Ramensky, Vasily, Bajic, Vladimir B., Kulakovskiy, Ivan V., and Medvedeva, Yulia A.
- Subjects
- *
DNA methylation , *BIOLOGICAL tags , *HUMAN genome , *NUCLEOTIDE sequencing , *GENETIC transcription - Abstract
Background: DNA methylation is involved in the regulation of gene expression. Although bisulfite-sequencing based methods profile DNA methylation at a single CpG resolution, methylation levels are usually averaged over genomic regions in the downstream bioinformatic analysis. Results: We demonstrate that on the genome level a single CpG methylation can serve as a more accurate predictor of gene expression than an average promoter / gene body methylation. We define CpG traffic lights (CpG TL) as CpG dinucleotides with a significant correlation between methylation and expression of a gene nearby. CpG TL are enriched in all regulatory regions. Among all promoters, CpG TL are especially enriched in poised ones, suggesting involvement of DNA methylation in their regulation. Yet, binding of only a handful of transcription factors, such as NRF1, ETS, STAT and IRF-family members, could be regulated by direct methylation of transcription factor binding sites (TFBS) or their close proximity. For the majority of TFs, an alternative scenario is more likely: methylation and inactivation of the whole regulatory element indirectly represses functional TF binding, with a CpG TL being a reliable marker of such inactivation. Conclusions: CpG TL provide a promising insight into mechanisms of enhancer activity and gene regulation, linking methylation of a single CpG to gene expression. CpG TL methylation can be used as a reliable marker of enhancer activity and gene expression in applications, e.g. in the clinic, where measuring DNA methylation is easier than directly measuring gene expression due to the more stable nature of DNA. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
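Note on record 37: the abstract defines a CpG traffic light as a CpG whose methylation correlates significantly with the expression of a nearby gene. The idea can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the choice of Pearson correlation, the `min_abs_r` cutoff, and the names `pearson` and `flag_candidate_cpgs` are all assumptions made for illustration, and a real analysis would add a proper significance test with multiple-testing correction.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no correlation signal
    return cov / sqrt(var_x * var_y)


def flag_candidate_cpgs(methylation, expression, min_abs_r=0.8):
    """Return {cpg_id: r} for CpGs whose methylation across samples tracks
    the expression of one nearby gene. min_abs_r is a hypothetical cutoff
    standing in for a real significance test."""
    flagged = {}
    for cpg_id, levels in methylation.items():
        r = pearson(levels, expression)
        if abs(r) >= min_abs_r:
            flagged[cpg_id] = r
    return flagged


# Toy data: methylation fractions of two CpGs across four samples,
# and the expression of one nearby gene in the same samples.
methylation = {
    "cpg_a": [0.9, 0.7, 0.5, 0.2],
    "cpg_b": [0.5, 0.5, 0.6, 0.5],
}
expression = [1.0, 2.0, 3.0, 4.5]
candidates = flag_candidate_cpgs(methylation, expression)
```

In this toy input only `cpg_a` is flagged, because its methylation falls steadily as expression rises, while `cpg_b` barely varies.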
38. DDR: efficient computational method to predict drug–target interactions using graph mining and machine learning approaches.
- Author
-
Olayan, Rawan S, Ashoor, Haitham, and Bajic, Vladimir B
- Subjects
- *
TARGETED drug delivery , *DATA mining - Abstract
A correction is presented to the article "DDR: efficient computational method to predict drug-target interactions using graph mining and machine learning approaches", which appeared in the issue of November 24, 2017.
- Published
- 2018
- Full Text
- View/download PDF
39. BioPS: System for screening and assessment of biofuel-production potential of cyanobacteria.
- Author
-
Essack, Magbubah, Salhi, Adil, Hanks, John, Bajic, Vladimir B., Motwalli, Olaa, and Mijakovic, Ivan
- Subjects
- *
FREE fatty acids , *BIOMASS energy , *CYANOBACTERIA , *PROTEOMICS , *GENOMICS , *CHARTS, diagrams, etc. - Abstract
Background: Cyanobacteria are one of the target groups of organisms explored for production of free fatty acids (FFAs) as biofuel precursors. Experimental evaluation of cyanobacterial potential for FFA production is costly and time consuming. Thus, computational approaches for comparing and ranking cyanobacterial strains for their potential to produce biofuel based on the characteristics of their predicted proteomes can be of great importance. Results: To enable such comparison and ranking, and to assist biotechnology developers and researchers in selecting strains more likely to be successfully engineered for FFA production, we developed the Biofuel Producer Screen (BioPS) platform. BioPS relies on the estimation of the predicted proteome makeup of cyanobacterial strains to produce and secrete FFAs, based on the analysis of well-studied cyanobacterial strains with known FFA production profiles. The system links results back to various external repositories such as KEGG, UniProt and GOLD, making it easier for users to explore additional related information. Conclusion: To our knowledge, BioPS is the first tool that screens and evaluates cyanobacterial strains for their potential to produce and secrete FFAs based on a strain's predicted proteome characteristics, and ranks strains based on that assessment. We believe that the availability of such a platform (comprising both a prediction tool and a repository of pre-evaluated strains) would be of interest to biofuel researchers. The BioPS system will be updated annually with information obtained from newly sequenced cyanobacterial genomes as they become available, as well as with new genes that impact FFA production or secretion. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
40. In silico exploration of Red Sea Bacillus genomes for natural product biosynthetic gene clusters.
- Author
-
Othoum, Ghofran, Bougouffa, Salim, Razali, Rozaimi, Bokhari, Ameerah, Alamoudi, Soha, Antunes, André, Gao, Xin, Hoehndorf, Robert, Arold, Stefan T., Gojobori, Takashi, Hirt, Heribert, Mijakovic, Ivan, Bajic, Vladimir B., Lafi, Feras F., and Essack, Magbubah
- Subjects
- *
BACILLUS licheniformis , *PATHOGENIC microorganisms , *BIOINFORMATICS , *ANTI-infective agents , *BIOSYNTHESIS - Abstract
Background: The increasing spectrum of multidrug-resistant bacteria is a major global public health concern, necessitating discovery of novel antimicrobial agents. Here, members of the genus Bacillus are investigated as a potentially attractive source of novel antibiotics due to their broad spectrum of antimicrobial activities. We specifically focus on a computational analysis of the distinctive biosynthetic potential of Bacillus paralicheniformis strains isolated from the Red Sea, an ecosystem exposed to adverse, highly saline and hot conditions. Results: We report the complete circular and annotated genomes of two Red Sea strains, B. paralicheniformis Bac48 isolated from mangrove mud and B. paralicheniformis Bac84 isolated from microbial mat collected from Rabigh Harbor Lagoon in Saudi Arabia. Comparing the genomes of B. paralicheniformis Bac48 and B. paralicheniformis Bac84 with nine publicly available complete genomes of B. licheniformis and three genomes of B. paralicheniformis revealed that all of the B. paralicheniformis strains in this study are more enriched in nonribosomal peptides (NRPs). We further report the first computationally identified trans-acyltransferase (trans-AT) nonribosomal peptide synthetase/polyketide synthase (PKS/NRPS) cluster in strains of this species. Conclusions: B. paralicheniformis species have more genes associated with biosynthesis of antimicrobial bioactive compounds than other previously characterized species of B. licheniformis, which suggests that these species are better potential sources for novel antibiotics. Moreover, the genome of the Red Sea strain B. paralicheniformis Bac48 is more enriched in modular PKS genes compared to B. licheniformis strains and other B. paralicheniformis strains. This may be linked to adaptations that strains surviving in the Red Sea underwent to survive in the relatively hot and saline ecosystems. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
41. DES-ncRNA: A knowledgebase for exploring information about human micro and long noncoding RNAs based on literature-mining.
- Author
-
Salhi, Adil, Essack, Magbubah, Alam, Tanvir, Bajic, Vladan P., Ma, Lina, Radovanovic, Aleksandar, Marchand, Benoit, Schmeier, Sebastian, Zhang, Zhang, and Bajic, Vladimir B.
- Published
- 2017
- Full Text
- View/download PDF
42. Genomic characterization of two novel SAR11 isolates from the Red Sea, including the first strain of the SAR11 Ib clade.
- Author
-
Jimenez-Infante, Francy, Kamanda Ngugi, David, Vinu, Manikandan, Blom, Jochen, Alam, Intikhab, Bajic, Vladimir B., and Stingl, Ulrich
- Subjects
- *
MARINE bacteria , *PROTEOBACTERIA , *GENOMIC imprinting , *NUCLEOTIDE sequencing , *STORM drains - Abstract
The SAR11 clade (Pelagibacterales) is a diverse group that forms a monophyletic clade within the Alphaproteobacteria, and constitutes up to one third of all prokaryotic cells in the photic zone of most oceans. Pelagibacterales are very abundant in the warm and highly saline surface waters of the Red Sea, raising the question of adaptive traits of SAR11 populations in this water body and warmer oceans throughout the world. In this study, two pure cultures were successfully obtained from surface waters of the Red Sea: one isolate of subgroup Ia and one of the previously uncultured SAR11 Ib lineage. The novel genomes were very similar to each other and to genomes of isolates of SAR11 subgroup Ia (Ia pan-genome), both in terms of gene content and synteny. Among the genes that were not present in the Ia pan-genome, 108 (RS39, Ia) and 151 genes (RS40, Ib) were strain specific. Detailed analyses showed that only 51 (RS39, Ia) and 55 (RS40, Ib) of these strain-specific genes had not been reported before on genome fragments of Pelagibacterales. Further analyses revealed the potential production of phosphonates by some SAR11 members and possible adaptations for oligotrophic life, including pentose sugar utilization and adhesion to marine particulate matter. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
43. Semantic prioritization of novel causative genomic variants.
- Author
-
Boudellioua, Imane, Mahamad Razali, Rozaimi B., Kulmanov, Maxat, Hashish, Yasmeen, Bajic, Vladimir B., Goncalves-Serra, Eva, Schoenmakers, Nadia, Gkoutos, Georgios V., Schofield, Paul N., and Hoehndorf, Robert
- Subjects
- *
GENOMICS , *GENETIC mutation , *MACHINE learning , *GENETIC disorders , *CONGENITAL disorders , *HYPOTHYROIDISM - Abstract
Discriminating the causative disease variant(s) for individuals with inherited or de novo mutations presents one of the main challenges faced by the clinical genetics community today. Computational approaches for variant prioritization include machine learning methods utilizing a large number of features, including molecular information, interaction networks, or phenotypes. Here, we demonstrate the PhenomeNET Variant Predictor (PVP) system that exploits semantic technologies and automated reasoning over genotype-phenotype relations to filter and prioritize variants in whole exome and whole genome sequencing datasets. We demonstrate the performance of PVP in identifying causative variants on a large number of synthetic whole exome and whole genome sequences, covering a wide range of diseases and syndromes. In a retrospective study, we further illustrate the application of PVP for the interpretation of whole exome sequencing data in patients suffering from congenital hypothyroidism. We find that PVP accurately identifies causative variants in whole exome and whole genome sequencing datasets and provides a powerful resource for the discovery of causal variants. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
44. bTSSfinder: a novel tool for the prediction of promoters in cyanobacteria and Escherichia coli.
- Author
-
Shahmuradov, Ilham Ayub, Razali, Rozaimi Mohamad, Bougouffa, Salim, Radovanovic, Aleksandar, and Bajic, Vladimir B.
- Abstract
Motivation: The computational search for promoters in prokaryotes remains an attractive problem in bioinformatics. Despite the attention it has received for many years, the problem has not been addressed satisfactorily. In any bacterial genome, the transcription start site is chosen mostly by the sigma (σ) factor proteins, which control the gene activation. The majority of published bacterial promoter prediction tools target σ70 promoters in Escherichia coli. Moreover, no σ-specific classification of promoters is available for prokaryotes other than for E. coli. Results: Here, we introduce bTSSfinder, a novel tool that predicts putative promoters for five classes of σ factors in Cyanobacteria (σA, σC, σH, σG and σF) and for five classes of sigma factors in E. coli (σ70, σ38, σ32, σ28 and σ24). Compared to currently available tools, bTSSfinder achieves higher accuracy (MCC = 0.86, F1-score = 0.93, compared to the next best tool with MCC = 0.59, F1-score = 0.79) and covers multiple classes of promoters. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
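Note on record 44: the abstract reports accuracy as MCC (Matthews correlation coefficient) and F1-score. For readers unfamiliar with these measures, the short Python sketch below computes both from the counts of a binary confusion matrix using their standard definitions. The function names and the example counts are illustrative only and are not taken from the paper.

```python
from math import sqrt


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; ranges from -1 to 1,
    with 0 corresponding to chance-level prediction."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for a promoter / non-promoter classifier.
tp, fp, fn, tn = 90, 10, 10, 90
example_f1 = f1_score(tp, fp, fn)
example_mcc = mcc(tp, fp, fn, tn)
```

Unlike F1, MCC also takes true negatives into account, which is one reason the two values can differ noticeably for the same predictor.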
45. In silico screening for candidate chassis strains of free fatty acid-producing cyanobacteria.
- Author
-
Olaa Motwalli, Essack, Magbubah, Jankovic, Boris R., Boyang Ji, Xinyao Liu, Ansari, Hifzur Rahman, Hoehndorf, Robert, Xin Gao, Arold, Stefan T., Mineta, Katsuhiko, Archer, John A. C., Gojobori, Takashi, Mijakovic, Ivan, and Bajic, Vladimir B.
- Subjects
- *
FREE fatty acids , *SYNECHOCOCCUS , *ENERGY density , *BIOMASS energy , *CARBON dioxide , *PROCHLOROCOCCUS , *OXIDATIVE stress , *BIOINFORMATICS - Abstract
Background: Finding a source from which high-energy-density biofuels can be derived at an industrial scale has become an urgent challenge for renewable energy production. Some microorganisms can produce free fatty acids (FFA) as precursors towards such high-energy-density biofuels. In particular, photosynthetic cyanobacteria are capable of directly converting carbon dioxide into FFA. However, current engineered strains need several rounds of engineering to reach the level of production of FFA to be commercially viable; thus new chassis strains that require less engineering are needed. Although more than 120 cyanobacterial genomes are sequenced, the natural potential of these strains for FFA production and excretion has not been systematically estimated. Results: Here we present the FFA SC (FFASC), an in silico screening method that evaluates the potential for FFA production and excretion of cyanobacterial strains based on their proteomes. A literature search allowed for the compilation of 64 proteins, most of which influence FFA production and a few of which affect FFA excretion. The proteins are classified into 49 orthologous groups (OGs) that helped create rules used in the scoring/ranking of algorithms developed to estimate the potential for FFA production and excretion of an organism. Among 125 cyanobacterial strains, FFASC identified 20 candidate chassis strains that rank in their FFA producing and excreting potential above the specifically engineered reference strain, Synechococcus sp. PCC 7002. We further show that the top ranked cyanobacterial strains are unicellular and primarily include Prochlorococcus (order Prochlorales) and marine Synechococcus (order Chroococcales) that cluster phylogenetically. Moreover, two principal categories of enzymes were shown to influence FFA production the most: those ensuring precursor availability for the biosynthesis of lipids, and those involved in handling the oxidative stress associated with FFA synthesis. Conclusion: To our knowledge, FFASC is the first in silico method to screen cyanobacteria proteomes for their potential to produce and excrete FFA, as well as the first attempt to parameterize the criteria derived from genetic characteristics that are favorable/non-favorable for this purpose. Thus, FFASC helps focus experimental evaluation only on the most promising cyanobacteria. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
46. Metagenomics as a preliminary screen for antimicrobial bioprospecting.
- Author
-
Al-Amoudi, Soha, Razali, Rozaimi, Essack, Magbubah, Amini, Mohammad Shoaib, Bougouffa, Salim, Archer, John A.C., Lafi, Feras F., and Bajic, Vladimir B.
- Subjects
- *
BACTERIAL contamination , *ANTI-infective agents , *METAGENOMICS , *BIOPROSPECTING , *DRUG use testing , *BACTERIAL diversity , *SEDIMENTS - Abstract
Since the composition of soil directs the diversity of the contained microbiome and its potential to produce bioactive compounds, many studies have focused on sediment types with unique features characteristic of extreme environments. However, not much is known about the potential of microbiomes that inhabit the highly saline and hot Red Sea lagoons. This case study explores mangrove mud and the microbial mat of sediments collected from the Rabigh harbor lagoon and Al Kharrar lagoon for antimicrobial bioprospecting. Rabigh harbor lagoon appears the better location, and the best sediment type for this purpose is mangrove mud. On the other hand, Al Kharrar lagoon displayed increased anaerobic hydrocarbon degradation and an abundance of bacterial DNA associated with antibiotic resistance. Moreover, our findings show an identical shift in phyla associated with historic hydrocarbon contamination exposure reported in previous studies (that is, enrichment of Gamma- and Delta-proteobacteria), but we also report that bacterial DNA sequences associated with antibiotic synthesis enzymes are derived from Gamma-, Delta- and Alpha-proteobacteria. This suggests that selection pressure associated with hydrocarbon contamination tends to enrich the DNA of bacterial classes associated with antibiotic synthesis enzymes. Although Actinobacteria tends to be the common target for research when it comes to antimicrobial bioprospecting, our study suggests that Firmicutes (Bacilli and Clostridia), Bacteroidetes, Cyanobacteria, and Proteobacteria should be antimicrobial bioprospecting targets as well. To the best of our knowledge, this is the first metagenomic study that analyzed the microbiomes in Red Sea lagoons for antimicrobial bioprospecting. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
47. Genome Sequence of Salinisphaera shabanensis, a Gammaproteobacterium from the Harsh, Variable Environment of the Brine-Seawater Interface of the Shaban Deep in the Red Sea.
- Author
-
Antunes, André, Alam, Intikhab, Bajic, Vladimir B., and Stingl, Ulrich
- Subjects
- *
BACTERIA , *SEAWATER , *SALT , *HEAVY metals , *BETA-hydroxy-beta-methylbutyrate , *SIDEROPHORES , *MICROBIAL metabolites - Abstract
We present the genome of Salinisphaera shabanensis, isolated from a brine-seawater interface and representing a new order within the Gammaproteobacteria. Its adaptations to physicochemical and nutrient availability fluctuations include six genes encoding heavy metal-translocating P-type ATPases and multiple genes involved in iron uptake, siderophore production, and poly-β-hydroxybutyrate synthesis. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
48. Genome Sequence of Halorhabdus tiamatea, the First Archaeon Isolated from a Deep-Sea Anoxic Brine Lake.
- Author
-
Antunes, André, Alam, Intikhab, Bajic, Vladimir B., and Stingl, Ulrich
- Subjects
- *
ARCHAEBACTERIA , *SALT , *GENOMES , *NUCLEOTIDE sequence , *LACTATE dehydrogenase - Abstract
We present the draft genome of Halorhabdus tiamatea, the first member of the Archaea ever isolated from a deep-sea anoxic brine. Genome comparison with Halorhabdus utahensis revealed some striking differences, including a marked increase in genes associated with transmembrane transport and putative genes for a trehalose synthase and a lactate dehydrogenase. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
49. DASPfind: new efficient method to predict drug-target interactions.
- Author
-
Ba-alawi, Wail, Soufan, Othman, Essack, Magbubah, Kalnis, Panos, and Bajic, Vladimir B.
- Subjects
- *
PHARMACEUTICAL research , *PROTEINS , *BIOLOGICAL databases , *DIBUCAINE , *AMIDES - Abstract
Background: Identification of novel drug-target interactions (DTIs) is important for drug discovery. Experimental determination of such DTIs is costly and time consuming, hence it necessitates the development of efficient computational methods for the accurate prediction of potential DTIs. To date, many computational methods have been proposed for this purpose, but they suffer the drawback of a high rate of false positive predictions. Results: Here, we developed a novel computational DTI prediction method, DASPfind. DASPfind uses simple paths of particular lengths inferred from a graph that describes DTIs, similarities between drugs, and similarities between the protein targets of drugs. We show that on average, over the four gold standard DTI datasets, DASPfind significantly outperforms other existing methods when the single top-ranked predictions are considered, resulting in 46.17% of these predictions being correct, and it achieves 49.22% correct single top-ranked predictions when the set of all DTIs for a single drug is tested. Furthermore, we demonstrate that our method is best suited for predicting DTIs in cases of drugs with no known targets or with few known targets. We also show the practical use of DASPfind by generating novel predictions for the Ion Channel dataset and validating them manually. Conclusions: DASPfind is a computational method for finding reliable new interactions between drugs and proteins. We show over six different DTI datasets that DASPfind outperforms other state-of-the-art methods when the single top-ranked predictions are considered, or when a drug with no known targets or with few known targets is considered. We illustrate the usefulness and practicality of DASPfind by predicting novel DTIs for the Ion Channel dataset. The validated predictions suggest that DASPfind can be used as an efficient method to identify correct DTIs, thus reducing the cost of necessary experimental verifications in the process of drug discovery. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
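Note on record 49: the abstract says DASPfind works with simple paths of particular lengths in a graph that combines known DTIs, drug-drug similarities, and target-target similarities. The Python sketch below shows one way such a path-based ranking could look. It is an illustration under stated assumptions, not the published algorithm: the scoring rule (sum over paths of the product of edge weights), the `max_edges` limit, the toy graph, and every function name are hypothetical.

```python
def add_edge(graph, a, b, weight):
    """Add an undirected weighted edge to an adjacency-dict graph."""
    graph.setdefault(a, {})[b] = weight
    graph.setdefault(b, {})[a] = weight


def simple_paths(graph, start, goal, max_edges):
    """Yield every simple path (no repeated node) from start to goal
    that uses at most max_edges edges."""
    stack = [(start, [start])]
    while stack:
        node, path = stack.pop()
        if node == goal and len(path) > 1:
            yield path
            continue
        if len(path) - 1 == max_edges:
            continue  # path is already as long as allowed
        for neighbor in graph.get(node, {}):
            if neighbor not in path:
                stack.append((neighbor, path + [neighbor]))


def path_score(graph, path):
    """Assumed scoring rule: product of the edge weights along the path."""
    score = 1.0
    for a, b in zip(path, path[1:]):
        score *= graph[a][b]
    return score


def rank_targets(graph, drug, targets, max_edges=3):
    """Rank candidate targets for a drug by the summed score of all
    short simple paths connecting them (highest first)."""
    scores = {
        t: sum(path_score(graph, p) for p in simple_paths(graph, drug, t, max_edges))
        for t in targets
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy graph: two drugs, two targets.
graph = {}
add_edge(graph, "d1", "d2", 0.8)   # drug-drug similarity
add_edge(graph, "t1", "t2", 0.5)   # target-target similarity
add_edge(graph, "d2", "t1", 1.0)   # known interaction
add_edge(graph, "d1", "t2", 1.0)   # known interaction
ranking = rank_targets(graph, "d1", ["t1"])
```

Here the unknown pair (d1, t1) is supported by two paths, d1-d2-t1 and d1-t2-t1, so its score combines evidence from a similar drug and from a similar target.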
50. Rhizosphere microbiome metagenomics of gray mangroves (Avicennia marina) in the Red Sea.
- Author
-
Alzubaidy, Hanin, Essack, Magbubah, Malas, Tareq B., Bokhari, Ameerah, Motwalli, Olaa, Kamanu, Frederick Kinyua, Jamhor, Suhaiza Ahmad, Mokhtar, Noor Azlin, Antunes, André, Simões, Marta Filipa, Alam, Intikhab, Bougouffa, Salim, Lafi, Feras F., Bajic, Vladimir B., and Archer, John A.C.
- Subjects
- *
RHIZOSPHERE microbiology , *METAGENOMICS , *MANGROVE plants , *AVICENNIA , *COASTAL ecology , *MICROBIAL ecology , *SEDIMENTS - Abstract
Mangroves are unique, and endangered, coastal ecosystems that play a vital role in the tropical and subtropical environments. A comprehensive description of the microbial communities in these ecosystems is currently lacking, and additional studies are required to have a complete understanding of the functioning and resilience of mangroves worldwide. In this work, we carried out a metagenomic study by comparing the microbial community of mangrove sediment with the rhizosphere microbiome of Avicennia marina, in northern Red Sea mangroves, along the coast of Saudi Arabia. Our results revealed that rhizosphere samples presented similar profiles at the taxonomic and functional levels and differentiated from the microbiome of bulk soil controls. Overall, samples showed predominance by Proteobacteria, Bacteroidetes and Firmicutes, with high abundance of sulfate reducers and methanogens, although specific groups were selectively enriched in the rhizosphere. Functional analysis showed significant enrichment in ‘metabolism of aromatic compounds’, ‘mobile genetic elements’, ‘potassium metabolism’ and ‘pathways that utilize osmolytes’ in the rhizosphere microbiomes. To our knowledge, this is the first metagenomic study on the microbiome of mangroves in the Red Sea, and the first application of unbiased 454-pyrosequencing to study the rhizosphere microbiome associated with A. marina. Our results provide the first insights into the range of functions and microbial diversity in the rhizosphere and soil sediments of gray mangrove (A. marina) in the Red Sea. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
Discovery Service for Jio Institute Digital Library