23 results on '"Kalita JK"'
Search Results
2. scDiffCoAM: A complete framework to identify potential biomarkers for esophageal squamous cell carcinoma using scRNA-Seq data analysis.
- Author
- Saikia M, Bhattacharyya DK, and Kalita JK
- Subjects
- Humans, Gene Expression Profiling methods, Gene Regulatory Networks genetics, RNA-Seq methods, Single-Cell Gene Expression Analysis, Esophageal Squamous Cell Carcinoma genetics, Esophageal Squamous Cell Carcinoma pathology, Single-Cell Analysis methods, Biomarkers, Tumor genetics, Esophageal Neoplasms genetics, Esophageal Neoplasms pathology, Gene Expression Regulation, Neoplastic, Sequence Analysis, RNA methods
- Abstract
Single-cell RNA sequencing (scRNA-Seq) technology provides the scope to gain insight into the interplay between intrinsic cellular processes as well as transcriptional and behavioral changes in gene-gene interactions across varying conditions. The high level of scarcity of scRNA-seq data, however, poses a significant challenge for analysis. We propose a complete differential co-expression (DCE) analysis framework for scRNA-Seq data to extract network modules and identify hub-genes. The performance of our method has been shown to be satisfactory after validation using an scRNA-Seq esophageal squamous cell carcinoma (ESCC) dataset. From comparison with four other existing hub-gene finding methods, it has been observed that our method performs better in the majority of cases and has the ability to identify unique potential biomarkers that were not detected by the other methods. The potential biomarker genes identified by our framework, differential co-expression analysis method for single-cell RNA sequencing data (scDiffCoAM), have been validated both statistically and biologically.
- Published
- 2024
3. Identification of Potential Biomarkers Using Integrative Approach: A Case Study of ESCC.
- Author
- Saikia M, Bhattacharyya DK, and Kalita JK
- Abstract
This paper presents a consensus-based approach that incorporates three microarray and three RNA-Seq methods for unbiased and integrative identification of differentially expressed genes (DEGs) as potential biomarkers for critical disease(s). The proposed method performs satisfactorily on two microarray datasets (GSE20347 and GSE23400) and one RNA-Seq dataset (GSE130078) for esophageal squamous cell carcinoma (ESCC). Based on the input dataset, our framework employs specific DE methods to detect DEGs independently. A consensus-based function that first considers DEGs common to all three methods for further downstream analysis has been introduced. The consensus function employs other parameters to overcome information loss. Differential co-expression (DCE) and preservation analysis of DEGs facilitates the study of behavioral changes in interactions among DEGs under normal and diseased circumstances. Considering hub genes in biologically relevant modules and most GO and pathway enriched DEGs as candidates for potential biomarkers of ESCC, we perform further validation through biological analysis as well as literature evidence. We have identified 25 DEGs that have strong biological relevance to their respective datasets and have previous literature establishing them as potential biomarkers for ESCC. We have further identified 8 additional DEGs as probable potential biomarkers for ESCC, but recommend further in-depth analysis. Competing Interests: On behalf of all authors, the corresponding author states that there is no conflict of interest. (© The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd 2022.)
- Published
- 2023
4. DEGnext: classification of differentially expressed genes from RNA-seq data using a convolutional neural network with transfer learning.
- Author
- Kakati T, Bhattacharyya DK, Kalita JK, and Norden-Krichmar TM
- Subjects
- Humans, Machine Learning, RNA-Seq, Support Vector Machine, Neoplasms, Neural Networks, Computer
- Abstract
Background: A limitation of traditional differential expression analysis on small datasets involves the possibility of false positives and false negatives due to sample variation. Considering the recent advances in deep learning (DL) based models, we wanted to expand the state-of-the-art in disease biomarker prediction from RNA-seq data using DL. However, application of DL to RNA-seq data is challenging due to absence of appropriate labels and smaller sample size as compared to number of genes. Deep learning coupled with transfer learning can improve prediction performance on novel data by incorporating patterns learned from other related data. With the emergence of new disease datasets, biomarker prediction would be facilitated by having a generalized model that can transfer the knowledge of trained feature maps to the new dataset. To the best of our knowledge, there is no Convolutional Neural Network (CNN)-based model coupled with transfer learning to predict the significant upregulating (UR) and downregulating (DR) genes from both trained and untrained datasets. Results: We implemented a CNN model, DEGnext, to predict UR and DR genes from gene expression data obtained from The Cancer Genome Atlas database. DEGnext uses biologically validated data along with logarithmic fold change values to classify differentially expressed genes (DEGs) as UR and DR genes. We applied transfer learning to our model to leverage the knowledge of trained feature maps to untrained cancer datasets. DEGnext's results were competitive (ROC scores between 88 and 99%) with those of five traditional machine learning methods: Decision Tree, K-Nearest Neighbors, Random Forest, Support Vector Machine, and XGBoost. DEGnext was robust and effective in terms of transferring learned feature maps to facilitate classification of unseen datasets. Additionally, we validated that the predicted DEGs from DEGnext were mapped to significant Gene Ontology terms and pathways related to cancer. Conclusions: DEGnext can classify DEGs into UR and DR genes from RNA-seq cancer datasets with high performance. This type of analysis, using biologically relevant fine-tuning data, may aid in the exploration of potential biomarkers and can be adapted for other disease datasets. (© 2021. The Author(s).)
- Published
- 2022
5. UICPC: Centrality-based clustering for scRNA-seq data analysis without user input.
- Author
- Chowdhury HA, Bhattacharyya DK, and Kalita JK
- Subjects
- Algorithms, Cluster Analysis, Data Analysis, Gene Expression Profiling, Sequence Analysis, RNA, Single-Cell Analysis, Software, RNA, Small Cytoplasmic
- Abstract
scRNA-seq data analysis enables new possibilities for identification of novel cells, specific characterization of known cells and study of cell heterogeneity. The performance of most clustering methods especially developed for scRNA-seq is greatly influenced by user input. We propose a centrality-clustering method named UICPC and compare its performance with 9 state-of-the-art clustering methods on 11 real-world scRNA-seq datasets to demonstrate its effectiveness and usefulness in discovering cell groups. Our method does not require user input. However, it requires threshold settings, which are benchmarked after performing extensive experiments. We observe that most compared approaches show poor performance due to high heterogeneity and large dataset dimensions. However, UICPC shows excellent performance in terms of NMI, Purity and ARI. UICPC is available as an R package and can be downloaded by clicking the link https://sites.google.com/view/hussinchowdhury/software. (Copyright © 2021 Elsevier Ltd. All rights reserved.)
- Published
- 2021
6. Rank-preserving biclustering algorithm: a case study on miRNA breast cancer.
- Author
- Mandal K, Sarmah R, Bhattacharyya DK, Kalita JK, and Borah B
- Subjects
- Algorithms, Biomarkers, Tumor genetics, Female, Gene Expression Profiling, Humans, Breast Neoplasms genetics, MicroRNAs genetics
- Abstract
Effective biomarkers aid in the early diagnosis and monitoring of breast cancer and thus play an important role in the treatment of patients suffering from the disease. Growing evidence indicates that alteration of expression levels of miRNA is one of the principal causes of cancer. We analyze breast cancer miRNA data to discover a list of biclusters as well as breast cancer miRNA biomarkers which can help to better understand this critical disease and to take important clinical decisions for treatment and diagnosis. In this paper, we propose a pattern-based parallel biclustering algorithm termed Rank-Preserving Biclustering (RPBic). The key strategy is to identify rank-preserved rows under a subset of columns based on a modified version of all substrings common subsequence (ALCS) framework. To illustrate the effectiveness of the RPBic algorithm, we consider synthetic datasets and show that RPBic outperforms relevant biclustering algorithms in terms of relevance and recovery. For breast cancer data, we identify 68 biclusters and establish that they have strong clinical characteristics among the samples. The differentially co-expressed miRNAs are found to be involved in KEGG cancer related pathways. Moreover, we identify frequency-based biomarkers (hsa-miR-410, hsa-miR-483-5p) and network-based biomarkers (hsa-miR-454, hsa-miR-137) which we validate to have strong connectivity with breast cancer. The source code and the datasets used can be found at http://agnigarh.tezu.ernet.in/~rosy8/Bioinformatics_RPBic_Data.rar.
- Published
- 2021
7. A Survey of the Usages of Deep Learning for Natural Language Processing.
- Author
- Otter DW, Medina JR, and Kalita JK
- Subjects
- Computer Systems, Humans, Linguistics, Neural Networks, Computer, Surveys and Questionnaires, Deep Learning, Natural Language Processing
- Abstract
Over the last several years, the field of natural language processing has been propelled forward by an explosion in the use of deep learning models. This article provides a brief introduction to the field and a quick overview of deep learning architectures and methods. It then sifts through the plethora of recent studies and summarizes a large assortment of relevant contributions. Analyzed research areas include several core linguistic processing issues in addition to many applications of computational linguistics. A discussion of the current state of the art is then provided along with recommendations for future research in the field.
- Published
- 2021
8. Prioritizing disease biomarkers using functional module based network analysis: A multilayer consensus driven scheme.
- Author
- Jha M, Roy S, and Kalita JK
- Subjects
- Biomarkers, Consensus, Gene Expression Profiling, Humans, Alzheimer Disease genetics, Gene Regulatory Networks genetics
- Abstract
Many complex diseases occur due to genetic factors. A perturbation in the pathway of gene interactions leads to such disorders. Even though a group of genes is responsible, a few significant genes act as biomarkers for disease, perturbing the healthy network. Identifying such marker genes or a set of genes that play a pivotal role in diseases helps drug prioritization. We propose a scheme for finding potential biomarkers using a multi-layer consensus-driven approach. We reconstruct a functional module guided disease sub-network, followed by a multi-step consensus of network inference methods and shared ontological terms. We perform centrality analysis on the sub-networks under consideration and report hub genes as potentially key players in the target disease. To establish our scheme's effectiveness, we use Alzheimer's Disease (AD) and Breast Cancer as candidate diseases for experimentation. We evaluate the significance of prioritized genes based on reported evidence. We observe that BRCA1, BRCA2, and PTEN are the essential genes for Breast Cancer, whereas MAPK1, APP, and CASP7 are the essential genes playing an important role during AD. (Copyright © 2020 Elsevier Ltd. All rights reserved.)
- Published
- 2020
9. (Differential) Co-Expression Analysis of Gene Expression: A Survey of Best Practices.
- Author
- Chowdhury HA, Bhattacharyya DK, and Kalita JK
- Subjects
- Animals, Gene Regulatory Networks genetics, Humans, Oligonucleotide Array Sequence Analysis, RNA-Seq, Transcriptome genetics, Gene Expression Profiling methods, Gene Expression Profiling standards
- Abstract
Analysis of gene expression data is widely used in transcriptomic studies to understand functions of molecules inside a cell and interactions among molecules. Differential co-expression analysis studies diseases and phenotypic variations by finding modules of genes whose co-expression patterns vary across conditions. We review the best practices in gene expression data analysis in terms of analysis of (differential) co-expression, co-expression network, differential networking, and differential connectivity considering both microarray and RNA-seq data along with comparisons. We highlight hurdles in RNA-seq data analysis using methods developed for microarrays. We include discussion of necessary tools for gene expression analysis throughout the paper. In addition, we shed light on scRNA-seq data analysis by including preprocessing and scRNA-seq in co-expression analysis along with useful tools specific to scRNA-seq. To get insights, biological interpretation and functional profiling is included. Finally, we provide guidelines for the analyst, along with research issues and challenges which should be addressed.
- Published
- 2020
10. Differential Expression Analysis of RNA-seq Reads: Overview, Taxonomy, and Tools.
- Author
- Chowdhury HA, Bhattacharyya DK, and Kalita JK
- Subjects
- High-Throughput Nucleotide Sequencing methods, Humans, Quality Control, Sequence Analysis, RNA standards, Software, Transcriptome genetics, Computational Biology methods, Gene Expression Profiling methods, RNA genetics, Sequence Analysis, RNA methods
- Abstract
Analysis of RNA-sequence (RNA-seq) data is widely used in transcriptomic studies and it has many applications. We review RNA-seq data analysis from RNA-seq reads to the results of differential expression analysis. In addition, we perform a descriptive comparison of tools used in each step of RNA-seq data analysis along with a discussion of important characteristics of these tools. A taxonomy of tools is also provided. A discussion of issues in quality control and visualization of RNA-seq data is also included along with useful tools. Finally, we provide some guidelines for the RNA-seq data analyst, along with research issues and challenges which should be addressed.
- Published
- 2020
11. X-Module: A novel fusion measure to associate co-expressed gene modules from condition-specific expression profiles.
- Author
- Kakati T, Bhattacharyya DK, and Kalita JK
- Subjects
- Algorithms, Alzheimer Disease genetics, Alzheimer Disease metabolism, Animals, Databases, Genetic, Humans, Pan troglodytes, Parkinson Disease genetics, Parkinson Disease metabolism, Gene Expression Regulation, Gene Regulatory Networks physiology, Transcriptome
- Abstract
A gene co-expression network (CEN) is of biological interest, since co-expressed genes share common functions and biological processes or pathways. Finding relationships among modules can reveal inter-modular preservation, and similarity in transcriptome, functional, and biological behaviors among modules of the same or two different datasets. There is no method which explores the one-to-one relationships and one-to-many relationships among modules extracted from control and disease samples based on both topological and semantic similarity using both microarray and RNA-seq data. In this work, we propose a novel fusion measure to detect mapping between modules from two sets of co-expressed modules extracted from control and disease stages of Alzheimer's disease (AD) and Parkinson's disease (PD) datasets. Our measure considers both topological and biological information of a module and is an estimation of four parameters, namely, semantic similarity, eigengene correlation, degree difference, and the number of common genes. We analyze the consensus modules shared between both control and disease stages in terms of their association with diseases. We also validate the close associations between human and chimpanzee modules and compare with the state-of-the-art method. Additionally, we propose two novel observations on the relationships between modules for further analysis.
- Published
- 2020
12. Comparison of Methods for Differential Co-expression Analysis for Disease Biomarker Prediction.
- Author
- Kakati T, Bhattacharyya DK, Barah P, and Kalita JK
- Subjects
- Biomarkers metabolism, Humans, Alzheimer Disease genetics, Alzheimer Disease metabolism, Databases, Genetic, Gene Expression Profiling, Gene Regulatory Networks, Parkinson Disease genetics, Parkinson Disease metabolism, Transcriptome
- Abstract
In the recent past, a number of methods have been developed for analysis of biological data. Among these methods, gene co-expression networks have the ability to mine functionally related genes with similar co-expression patterns, because of which such networks have been most widely used. However, gene co-expression networks cannot identify genes that undergo condition-specific changes in their relationships with other genes. In contrast, differential co-expression analysis enables finding co-expressed genes exhibiting significant changes across disease conditions. In this paper, we present some significant outcomes of a comparative study of four co-expression network module detection techniques, namely, THD-Module Extractor, DiffCoEx, MODA, and WGCNA, which can perform differential co-expression analysis on both gene and miRNA expression data (microarray and RNA-seq) and discuss the applications to Alzheimer's disease and Parkinson's disease research. Our observations reveal that compared to other methods, THD-Module Extractor is the most effective in finding modules with higher functional relevance and biological significance. (Copyright © 2019. Published by Elsevier Ltd.)
- Published
- 2019
13. Intrinsic-overlapping co-expression module detection with application to Alzheimer's Disease.
- Author
- Manners HN, Roy S, and Kalita JK
- Subjects
- Algorithms, Cluster Analysis, Gene Expression Regulation, Genomics methods, Humans, Phenotype, Transcriptome, Alzheimer Disease genetics, Gene Regulatory Networks
- Abstract
Genes interact with each other and may cause perturbation in the molecular pathways leading to complex diseases. Often, instead of any single gene, a subset of genes interact, forming a network, to share common biological functions. Such a subnetwork is called a functional module or motif. Identifying such modules and central key genes in them, that may be responsible for a disease, may help design patient-specific drugs. In this study, we consider the neurodegenerative Alzheimer's Disease (AD) and identify potentially responsible genes from functional motif analysis. We start from the hypothesis that central genes in genetic modules are more relevant to a disease that is under investigation and identify hub genes from the modules as potential marker genes. Motifs or modules are often non-exclusive or overlapping in nature. Moreover, they sometimes show intrinsic or hierarchical distributions with overlapping functional roles. To the best of our knowledge, no prior work handles both the situations in an integrated way. We propose a non-exclusive clustering approach, CluViaN (Clustering Via Network) that can detect intrinsic as well as overlapping modules from gene co-expression networks constructed using microarray expression profiles. We compare our method with existing methods to evaluate the quality of modules extracted. CluViaN reports the presence of intrinsic and overlapping motifs in different species not reported by any other research. We further apply our method to extract significant AD specific modules using CluViaN and rank them based on the number of genes from a module involved in the disease pathways. Finally, top central genes are identified by topological analysis of the modules. We use two different AD phenotype data for experimentation. We observe that central genes, namely PSEN1, APP, NDUFB2, NDUFA1, UQCR10, PPP3R1 and a few more, play significant roles in AD. Interestingly, our experiments also find a hub gene, PML, which has recently been reported to play a role in plasticity, circadian rhythms and the response to proteins which can cause neurodegenerative disorders. MUC4, another hub gene that we find experimentally, is yet to be investigated for its potential role in AD. A software implementation of CluViaN in Java is available for download at https://sites.google.com/site/swarupnehu/publications/resources/CluViaN Software.rar. (Copyright © 2018 Elsevier Ltd. All rights reserved.)
- Published
- 2018
14. THD-Tricluster: A robust triclustering technique and its application in condition specific change analysis in HIV-1 progression data.
- Author
- Kakati T, Ahmed HA, Bhattacharyya DK, and Kalita JK
- Subjects
- Cluster Analysis, HIV-1 isolation & purification, Humans, Oligonucleotide Array Sequence Analysis, Algorithms, HIV-1 genetics
- Abstract
Developing a cost-effective and robust triclustering algorithm that can identify triclusters of high biological significance in the gene-sample-time (GST) domain is a challenging task. Most existing triclustering algorithms can detect shifting and scaling patterns in isolation, but they are not able to handle co-occurring shifting-and-scaling patterns. This paper makes an attempt to address this issue. It introduces a robust triclustering algorithm called THD-Tricluster to identify triclusters over the GST domain. In addition to being validated on several benchmark datasets, the proposed THD-Tricluster algorithm was applied to HIV-1 progression data to identify disease-specific genes. THD-Tricluster could identify the 38 genes most responsible for the disease, including GATA3, EGR1, JUN, ELF1, AGFG1, AGFG2, CX3CR1, CXCL12, CCR5, CCR2, and many others. The results are validated using GeneCard and other established results. (Copyright © 2018 Elsevier Ltd. All rights reserved.)
- Published
- 2018
15. Detecting protein complexes based on a combination of topological and biological properties in protein-protein interaction network.
- Author
- Sharma P, Bhattacharyya DK, and Kalita JK
- Abstract
Protein complexes are known to play a major role in controlling cellular activity in a living being. Identifying complexes from raw protein-protein interactions (PPIs) is an important area of research. Earlier work has been limited mostly to yeast. Such protein complex identification methods, when applied to large human PPIs, often give poor performance. We introduce a novel method called CSC to detect protein complexes. The method is evaluated in terms of positive predictive value, sensitivity and accuracy using datasets for the model organism yeast and for humans. CSC outperforms several other competing algorithms for both organisms. Further, we present a framework to establish the usefulness of CSC in analyzing the influence of a given disease gene in a complex topologically as well as biologically considering eight major association factors.
- Published
- 2018
16. Protein complex finding and ranking: An application to Alzheimer's disease.
- Author
- Sharma P, Bhattacharyya DK, and Kalita JK
- Subjects
- Alzheimer Disease diagnosis, Alzheimer Disease pathology, Databases, Protein, Humans, Protein Binding, Algorithms, Alzheimer Disease metabolism, Computational Biology methods, Protein Interaction Mapping statistics & numerical data
- Abstract
Protein complexes are known to play a major role in controlling cellular activity in a living being. Identifying complexes from raw protein-protein interactions (PPIs) is an important area of research. Earlier work has been limited mostly to yeast and a few other model organisms. Such protein complex identification methods, when applied to large human PPIs often give poor performance. We introduce a novel method called ComFiR to detect such protein complexes and further rank diseased complexes based on a query disease. We have shown that it has better performance in identifying protein complexes from human PPI data. This method is evaluated in terms of positive predictive value, sensitivity and accuracy. We have introduced a ranking approach and showed its application on Alzheimer's disease.
- Published
- 2017
17. Analysis of Gene Expression Patterns Using Biclustering.
- Author
- Roy S, Bhattacharyya DK, and Kalita JK
- Subjects
- Animals, Data Mining methods, Gene Expression Regulation, Humans, Oligonucleotide Array Sequence Analysis methods, Reproducibility of Results, Cluster Analysis, Computational Biology methods, Gene Expression Profiling methods
- Abstract
Mining microarray data to unearth interesting expression profile patterns for discovery of in silico biological knowledge is an emerging area of research in computational biology. A group of functionally related genes may have similar expression patterns under a set of conditions or at some time points. Biclustering is an important data mining tool that has been successfully used to analyze gene expression data for biologically significant cluster discovery. The purpose of this chapter is to introduce interesting patterns that may be observed in expression data and discuss the role of biclustering techniques in detecting interesting functional gene groups with similar expression patterns.
- Published
- 2016
18. Core and peripheral connectivity based cluster analysis over PPI network.
- Author
- Ahmed HA, Bhattacharyya DK, and Kalita JK
- Subjects
- Cluster Analysis, Databases, Protein, Protein Binding, Protein Interaction Mapping, Reproducibility of Results, Protein Interaction Maps, Proteins chemistry
- Abstract
A number of methods have been proposed in the literature of protein-protein interaction (PPI) network analysis for detection of clusters in the network. Clusters are identified by these methods using various graph theoretic criteria. Most of these methods have been found time consuming due to involvement of preprocessing and post-processing tasks. In addition, they do not achieve high precision and recall consistently and simultaneously. Moreover, the existing methods do not employ the idea of core-periphery structural pattern of protein complexes effectively to extract clusters. In this paper, we introduce a clustering method named CPCA based on a recent observation by researchers that a protein complex in a PPI network is arranged as a relatively dense core region and additional proteins weakly connected to the core. CPCA uses two connectivity criterion functions to identify core and peripheral regions of the cluster. To locate the initial node of a cluster we introduce a measure called DNQ (Degree based Neighborhood Qualification) index that evaluates the tendency of the node to be part of a cluster. CPCA performs well when compared with well-known counterparts. Along with protein complex gold standards, a co-localization dataset has also been used for validation of the results. (Copyright © 2015 Elsevier Ltd. All rights reserved.)
- Published
- 2015
19. Shifting-and-Scaling Correlation Based Biclustering Algorithm.
- Author
- Ahmed HA, Mahanta P, Bhattacharyya DK, and Kalita JK
- Subjects
- Databases, Genetic, Algorithms, Cluster Analysis, Computational Biology methods, Gene Expression Profiling methods
- Abstract
The existence of various types of correlations among the expressions of a group of biologically significant genes poses challenges in developing effective methods of gene expression data analysis. The initial focus of computational biologists was to work with only absolute and shifting correlations. However, researchers have found that the ability to handle shifting-and-scaling correlation enables them to extract more biologically relevant and interesting patterns from gene microarray data. In this paper, we introduce an effective shifting-and-scaling correlation measure named Shifting and Scaling Similarity (SSSim), which can detect highly correlated gene pairs in any gene expression data. We also introduce a technique named Intensive Correlation Search (ICS) biclustering algorithm, which uses SSSim to extract biologically significant biclusters from a gene expression data set. The technique performs satisfactorily with a number of benchmarked gene expression data sets when evaluated in terms of functional categories in Gene Ontology database.
- Published
- 2014
20. Reconstruction of gene co-expression network from microarray data using local expression patterns.
- Author
- Roy S, Bhattacharyya DK, and Kalita JK
- Subjects
- Algorithms, Computer Simulation, Down-Regulation, Gene Expression, Humans, Models, Genetic, Gene Expression Profiling methods, Gene Regulatory Networks, Oligonucleotide Array Sequence Analysis methods
- Abstract
Background: Biological networks connect genes and gene products to one another. A network of co-regulated genes may form gene clusters that can encode proteins and take part in common biological processes. A gene co-expression network describes inter-relationships among genes. Existing techniques generally depend on proximity measures based on global similarity to draw the relationship between genes. It has been observed that expression profiles share local similarity rather than global similarity. We propose an expression pattern based method called GeCON to extract Gene CO-expression Network from microarray data. Pair-wise supports are computed for each pair of genes based on changing tendencies and regulation patterns of the gene expression. Gene pairs showing negative or positive co-regulation under a given number of conditions are used to construct such gene co-expression network. We construct co-expression network with signed edges to reflect up- and down-regulation between pairs of genes. Most existing techniques do not emphasize computational efficiency. We exploit a fast correlogram matrix based technique for capturing the support of each gene pair to construct the network. Results: We apply GeCON to both real and synthetic gene expression data. We compare our results using the DREAM (Dialogue for Reverse Engineering Assessments and Methods) Challenge data with three well known algorithms, viz., ARACNE, CLR and MRNET. Our method outperforms other algorithms based on in silico regulatory network reconstruction. Experimental results show that GeCON can extract functionally enriched network modules from real expression data. Conclusions: In view of the results over several in-silico and real expression datasets, the proposed GeCON shows satisfactory performance in predicting co-expression network in a computationally inexpensive way. We further establish that a simple expression pattern matching is helpful in finding biologically relevant gene network. In the future, we aim to introduce an enhanced GeCON to identify Protein-Protein interaction network complexes by incorporating variable density concept.
- Published
- 2014
21. An effective method for network module extraction from microarray data.
- Author
- Mahanta P, Ahmed HA, Bhattacharyya DK, and Kalita JK
- Subjects
- Algorithms, Databases, Genetic statistics & numerical data, Gene Expression, Computational Biology methods, Data Interpretation, Statistical, Gene Expression Profiling statistics & numerical data, Gene Regulatory Networks, Oligonucleotide Array Sequence Analysis statistics & numerical data
- Abstract
Background: The development of high-throughput microarray technologies has provided various opportunities to systematically characterize diverse types of computational biological networks. Co-expression networks have become popular in the analysis of microarray data, such as for detecting functional gene modules. Results: This paper presents a method to build a co-expression network (CEN) and to detect network modules from the built network. We use an effective gene expression similarity measure called NMRS (Normalized mean residue similarity) to construct the CEN. We have tested our method on five publicly available benchmark microarray datasets. The network modules extracted by our algorithm have been biologically validated in terms of Q value and p value. Conclusions: Our results show that the technique is capable of detecting biologically significant network modules from the co-expression network. Biologists can use this technique to find groups of genes with similar functionality based on their expression information.
- Published
- 2012
22. An effective graph-based clustering technique to identify coherent patterns from gene expression data.
- Author
- Priyadarshini G, Sarmah R, Chakraborty B, Bhattacharyya DK, and Kalita JK
- Subjects
- Databases, Genetic, Gene Expression Profiling, Oligonucleotide Array Sequence Analysis methods, Algorithms, Cluster Analysis, Gene Expression
- Abstract
This paper presents an effective parameter-less graph based clustering technique (GCEPD). GCEPD produces highly coherent clusters in terms of various cluster validity measures. The technique finds highly coherent patterns containing genes with high biological relevance. Experiments with real life datasets establish that the method produces clusters that are significantly better than other similar algorithms in terms of various quality measures.
- Published
- 2012
23. Computational modelling and simulation of the immune system.
- Author
- Kalita JK, Chandrashekar K, Hans R, and Selvam P
- Subjects
- Animals, Complement System Proteins metabolism, Computer Simulation, Cytoplasm metabolism, Dendritic Cells cytology, Humans, Immune System, Killer Cells, Natural metabolism, Macrophages metabolism, Models, Biological, Neutrophils metabolism, Programming Languages, Software, Computational Biology methods
- Abstract
We have developed a software system called SIMISYS that models and simulates aspects of the human immune system based on the computational framework of cellular automata. We model tens of thousands of cells as exemplars of the significant players in the functioning of the immune system, and simulate normal and simple disease situations by interpreting interactions among the cells. SIMISYS 0.3, the current version, models and simulates the innate and adaptive components of the immune system. The specific players we model are the macrophages, dendritic cells, neutrophils, natural killer cells, B cells, T helper cells, complement proteins and pathogenic bacteria.
- Published
- 2006
Discovery Service for Jio Institute Digital Library