Search Results
2. The cold responsive mechanism of the paper mulberry: decreased photosynthesis capacity and increased starch accumulation.
- Author
- Xianjun Peng, Linhong Teng, Xueqing Yan, Meiling Zhao, and Shihua Shen
- Subjects
- *PAPER mulberry, *PHOTOSYNTHESIS, *STARCH metabolism, *EFFECT of cold on plants, *ABIOTIC stress, *PLANT adaptation, *TRANSMISSION electron microscopy
- Abstract
Background: Most studies on the paper mulberry have focused mainly on its medicinal and pharmacological uses, fiber quality, and the development of its leaves as feed; little is known about its mechanism of adaptability to abiotic stress. Physiological measurement, transcriptomics and proteomic analysis were employed to understand its response to cold stress in this study. Methods: The second to fourth fully expanded leaves from top to bottom were harvested at different stress time points for transmission electron microscope (TEM) observation. Physiological characteristics measured included the relative electrolyte leakage (REL), SOD activity assay, soluble sugar content, and chlorophyll fluorescence parameters. For screening of differentially expressed genes, the expression level of every transcript in each sample was calculated by quantifying the number of Illumina reads. To identify the differentially expressed proteins, leaves of plants under 0, 6, 12, 24, 48 and 72 h cold stress were harvested for proteomic analysis. Finally, real time PCR was used to verify the DEG results of the RNA-seq and the proteomics data. Results: At the beginning of cold stress, respiratory metabolism was decreased and the transportation and hydrolysis of photosynthetic products were inhibited, leading to an accumulation of starch in the chloroplasts. A total of 5800 unigenes and 38 proteins were affected, including the repressed expression of photosynthesis and the enhanced expression in signal transduction, stress defense pathway as well as secondary metabolism. Although the transcriptional level of a large number of genes had been restored after 12 h, sustained cold stress brought more serious injury to the leaf cells, including a sharp rise in the relative electrolyte leakage, a decline in the Fv/Fm value, swollen chloroplasts and a disintegrated membrane system. Conclusion: Starch accumulation and photoinhibition might be the main adaptive mechanisms of the paper mulberry in response to cold stress. Most importantly, enhancing the transport and hydrolysis of photosynthetic products could be a potential target for improving the cold tolerance of the paper mulberry. [ABSTRACT FROM AUTHOR]
- Published
- 2015
3. Characterization of metallothionein genes from Broussonetia papyrifera: metal binding and heavy metal tolerance mechanisms
- Author
- Xu, Zhenggang, Yang, Shen, Li, Chenhao, Xie, Muhong, He, Yi, Chen, Sisi, Tang, Yan, Li, Dapei, Wang, Tianyu, and Yang, Guiyan
- Published
- 2024
4. Shared genes related to aggression, rather than chemical communication, are associated with reproductive dominance in paper wasps (Polistes metricus).
- Author
- Toth, Amy L., Tooker, John F., Radhakrishnan, Srihari, Minard, Robert, Henshaw, Michael T., and Grozinger, Christina M.
- Subjects
- *PAPER wasps, *INSECT aggregation, *GENE expression, *INSECT genetics, *INSECT communication, *ANIMAL social behavior, *REPRODUCTION, *INSECTS
- Abstract
Background: In social groups, dominant individuals may socially inhibit reproduction of subordinates using aggressive interactions or, in the case of highly eusocial insects, pheromonal communication. It has been hypothesized these two modes of reproductive inhibition utilize conserved pathways. Here, we use a comparative framework to investigate the chemical and genomic underpinnings of reproductive dominance in the primitively eusocial wasp Polistes metricus. Our goals were to first characterize transcriptomic and chemical correlates of reproductive dominance and second, to test whether dominance-associated mechanisms in paper wasps overlapped with aggression or pheromone-related gene expression patterns in other species. To explore whether conserved molecular pathways relate to dominance, we compared wasp transcriptomic data to previous studies of gene expression associated with pheromonal communication and queen-worker differences in honey bees, and aggressive behavior in bees, Drosophila, and mice. Results: By examining dominant and subordinate females from queen and worker castes in early and late season colonies, we found that cuticular hydrocarbon profiles and genome-wide patterns of brain gene expression were primarily associated with season/social environment rather than dominance status. In contrast, gene expression patterns in the ovaries were associated primarily with caste and ovary activation. Comparative analyses suggest genes identified as differentially expressed in wasp brains are not related to queen pheromonal communication or caste in bees, but were significantly more likely to be associated with aggression in other insects (bees, flies), and even a mammal (mice). Conclusions: This study provides the first comprehensive chemical and molecular analysis of reproductive dominance in paper wasps. We found little evidence for a chemical basis for reproductive dominance in P. metricus, and our transcriptomic analyses suggest that different pathways regulate dominance in paper wasps and pheromone response in bees. Furthermore, there was a substantial impact of season/social environment on gene expression patterns, indicating the important role of external cues in shaping the molecular processes regulating behavior. Interestingly, genes associated with dominance in wasps were also associated with aggressive behavior in bees, solitary insects and mammals. Thus, genes involved in social regulation of reproduction in Polistes may have conserved functions associated with aggression in insects and other taxa. [ABSTRACT FROM AUTHOR]
- Published
- 2014
5. The cold responsive mechanism of the paper mulberry: decreased photosynthesis capacity and increased starch accumulation.
- Author
- Peng X, Teng L, Yan X, Zhao M, and Shen S
- Subjects
- Acclimatization genetics, Chloroplasts genetics, Chloroplasts metabolism, Cold Temperature, Gene Expression Regulation, Plant, Morus growth & development, Photosynthesis genetics, Plant Leaves genetics, Plant Leaves metabolism, Proteomics, Starch genetics, Morus genetics, Plant Proteins biosynthesis, Starch metabolism, Stress, Physiological genetics
- Abstract
Background: Most studies on the paper mulberry have focused mainly on its medicinal and pharmacological uses, fiber quality, and the development of its leaves as feed; little is known about its mechanism of adaptability to abiotic stress. Physiological measurement, transcriptomics and proteomic analysis were employed to understand its response to cold stress in this study. Methods: The second to fourth fully expanded leaves from top to bottom were harvested at different stress time points for transmission electron microscope (TEM) observation. Physiological characteristics measured included the relative electrolyte leakage (REL), SOD activity assay, soluble sugar content, and chlorophyll fluorescence parameters. For screening of differentially expressed genes, the expression level of every transcript in each sample was calculated by quantifying the number of Illumina reads. To identify the differentially expressed proteins, leaves of plants under 0, 6, 12, 24, 48 and 72 h cold stress were harvested for proteomic analysis. Finally, real time PCR was used to verify the DEG results of the RNA-seq and the proteomics data. Results: At the beginning of cold stress, respiratory metabolism was decreased and the transportation and hydrolysis of photosynthetic products were inhibited, leading to an accumulation of starch in the chloroplasts. A total of 5800 unigenes and 38 proteins were affected, including the repressed expression of photosynthesis and the enhanced expression in signal transduction, stress defense pathway as well as secondary metabolism. Although the transcriptional level of a large number of genes had been restored after 12 h, sustained cold stress brought more serious injury to the leaf cells, including a sharp rise in the relative electrolyte leakage, a decline in the Fv/Fm value, swollen chloroplasts and a disintegrated membrane system. Conclusion: Starch accumulation and photoinhibition might be the main adaptive mechanisms of the paper mulberry in response to cold stress. Most importantly, enhancing the transport and hydrolysis of photosynthetic products could be a potential target for improving the cold tolerance of the paper mulberry.
- Published
- 2015
6. Interpretation knowledge extraction for genetic testing via question-answer model.
- Author
- Wang, Wenjun, Chen, Huanxin, Wang, Hui, Fang, Lin, Wang, Huan, Ding, Yi, Lu, Yao, and Wu, Qingyao
- Abstract
Background: Sequencing-based genetic testing is widely used in biomedical research, including pathogenic microorganism detection with metagenomic next-generation sequencing (mNGS). The application of sequencing results to clinical diagnosis and treatment relies on various interpretation knowledge bases. Currently, the existing knowledge bases are primarily built through manual knowledge extraction. This method requires professionals to read extensive literature and extract relevant knowledge from it, which is time-consuming and costly. Furthermore, manual extraction unavoidably introduces subjective biases. In this study, we aimed to automatically extract knowledge for interpreting mNGS results. Method: We propose a novel approach to automatically extract pathogenic microorganism knowledge based on the question-answer (QA) model. First, we construct a MicrobeDB dataset since there is no available pathogenic microorganism QA dataset for training the model. The created dataset contains 3,161 samples from 618 published papers covering 224 pathogenic microorganisms. Then, we fine-tune the selected baseline model based on MicrobeDB. Finally, we utilize ChatGPT to enhance the diversity of training data, and employ data expansion to increase training data volume. Results: Our method achieves an Exact Match (EM) and F1 score of 88.39% and 93.18%, respectively, on the MicrobeDB test set. We also conduct ablation studies on the proposed data augmentation method. In addition, we perform comparative experiments with the ChatPDF tool based on the ChatGPT API to demonstrate the effectiveness of the proposed method. Conclusions: Our method is effective and valuable for extracting pathogenic microorganism knowledge. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Summary of talks and papers at ISCB-Asia/SCCG 2012.
- Author
- Tretyakov, Konstantin, Goldberg, Tatyana, Jin, Victor X., and Horton, Paul
- Subjects
- *COMPUTATIONAL biology, *MESSENGER RNA, *GENOMES, *GENOMICS, *BIOINFORMATICS
- Abstract
The second ISCB-Asia conference of the International Society for Computational Biology took place December 17-19, 2012, in Shenzhen, China. The conference was co-hosted by BGI as the first Shenzhen Conference on Computational Genomics (SCCG). 45 talks were presented at ISCB-Asia/SCCG 2012. The topics covered included software tools, reproducible computing, next-generation sequencing data analysis, transcription and mRNA regulation, protein structure and function, cancer genomics and personalized medicine. Nine of the proceedings track talks are included as full papers in this supplement. In this report we first give a short overview of the conference by listing some statistics and visualizing the talk abstracts as word clouds. Then we group the talks by topic and briefly summarize each one, providing references to related publications whenever possible. Finally, we close with a few comments on the success of this conference [ABSTRACT FROM AUTHOR]
- Published
- 2013
8. Shared genes related to aggression, rather than chemical communication, are associated with reproductive dominance in paper wasps (Polistes metricus)
- Author
- Christina M. Grozinger, Robert D. Minard, Srihari Radhakrishnan, Michael T. Henshaw, John F. Tooker, and Amy L. Toth
- Subjects
- Wasps, Zoology, Pheromones, Chemical communication, Polistes metricus, medicine, Genetics, Animals, Animal communication, Social behavior, Dominance (genetics), Genome, biology, Aggression, Gene Expression Profiling, Reproduction, fungi, Brain, Genomics, Bees, biology.organism_classification, Eusociality, Hydrocarbons, Gene expression profiling, Animal Communication, Sex pheromone, Insect Proteins, Female, Polistes, medicine.symptom, Biotechnology, Research Article
- Abstract
Background: In social groups, dominant individuals may socially inhibit reproduction of subordinates using aggressive interactions or, in the case of highly eusocial insects, pheromonal communication. It has been hypothesized these two modes of reproductive inhibition utilize conserved pathways. Here, we use a comparative framework to investigate the chemical and genomic underpinnings of reproductive dominance in the primitively eusocial wasp Polistes metricus. Our goals were to first characterize transcriptomic and chemical correlates of reproductive dominance and second, to test whether dominance-associated mechanisms in paper wasps overlapped with aggression or pheromone-related gene expression patterns in other species. To explore whether conserved molecular pathways relate to dominance, we compared wasp transcriptomic data to previous studies of gene expression associated with pheromonal communication and queen-worker differences in honey bees, and aggressive behavior in bees, Drosophila, and mice. Results: By examining dominant and subordinate females from queen and worker castes in early and late season colonies, we found that cuticular hydrocarbon profiles and genome-wide patterns of brain gene expression were primarily associated with season/social environment rather than dominance status. In contrast, gene expression patterns in the ovaries were associated primarily with caste and ovary activation. Comparative analyses suggest genes identified as differentially expressed in wasp brains are not related to queen pheromonal communication or caste in bees, but were significantly more likely to be associated with aggression in other insects (bees, flies), and even a mammal (mice). Conclusions: This study provides the first comprehensive chemical and molecular analysis of reproductive dominance in paper wasps. We found little evidence for a chemical basis for reproductive dominance in P. metricus, and our transcriptomic analyses suggest that different pathways regulate dominance in paper wasps and pheromone response in bees. Furthermore, there was a substantial impact of season/social environment on gene expression patterns, indicating the important role of external cues in shaping the molecular processes regulating behavior. Interestingly, genes associated with dominance in wasps were also associated with aggressive behavior in bees, solitary insects and mammals. Thus, genes involved in social regulation of reproduction in Polistes may have conserved functions associated with aggression in insects and other taxa.
- Published
- 2014
9. Proceedings from Asia Pacific Bioinformatics Network (APBioNet) Eighth International Conference on Bioinformatics (InCoB2009), Singapore, 7-11 September 2009.
- Subjects
- *CONFERENCE papers, *BIOINFORMATICS, *GENOMICS, *GENE expression, *GENETIC engineering
- Abstract
The article presents papers from the Asia Pacific Bioinformatics Network (APBioNet) Eighth International Conference on Bioinformatics (InCoB2009) in Singapore from September 7-11, 2009 including "Genome-Wide Analysis of Alternative Splicing in Cow: Implications in Bovine as a Model for Human Diseases," "MapNext: A Software Tool for Spliced and Unspliced Alignments and SNP Detection of Short Sequence Reads," and "Measuring Similarity Between Gene Expression Profiles: A Bayesian Approach."
- Published
- 2009
10. Mitochondrial genome plasticity of mammalian species.
- Author
- Biró, Bálint, Gál, Zoltán, Fekete, Zsófia, Klecska, Eszter, and Hoffmann, Orsolya Ivett
- Subjects
- MITOCHONDRIAL DNA, MACHINE learning, GENOMES
- Abstract
There is an ongoing process in which mitochondrial sequences are being integrated into the nuclear genome. The importance of these sequences has already been revealed in cancer biology, forensic, phylogenetic studies and in the evolution of the eukaryotic genetic information. Human and numerous model organisms' genomes were described from those sequences point of view. Furthermore, recent studies were published on the patterns of these nuclear localised mitochondrial sequences in different taxa. However, the results of the previously released studies are difficult to compare due to the lack of standardised methods and/or using few numbers of genomes. Therefore, in this paper our primary goal is to establish a uniform mining pipeline to explore these nuclear localised mitochondrial sequences. Our results show that the frequency of several repetitive elements is higher in the flanking regions of these sequences than expected. A machine learning model reveals that the flanking regions' repetitive elements and different structural characteristics are highly influential during the integration process. In this paper, we introduce a general mining pipeline for all mammalian genomes. The workflow is publicly available and is believed to serve as a validated baseline for future research in this field. We confirm the widespread opinion, on - as to our current knowledge - the largest dataset, that structural circumstances and events corresponding to repetitive elements are highly significant. An accurate model has also been trained to predict these sequences and their corresponding flanking regions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Comparative plastome analysis of Arundinelleae (Poaceae, Panicoideae), with implications for phylogenetic relationships and plastome evolution.
- Author
- Jiang, Li-Qiong, Drew, Bryan T., Arthan, Watchara, Yu, Guo-Ying, Wu, Hong, Zhao, Yue, Peng, Hua, and Xiang, Chun-Lei
- Subjects
- MICROSATELLITE repeats, WHOLE genome sequencing, CHLOROPLAST DNA, BASE pairs, TRANSFER RNA
- Abstract
Background: Arundinelleae is a small tribe within the Poaceae (grass family) possessing a widespread distribution that includes Asia, the Americas, and Africa. Several species of Arundinelleae are used as natural forage, feed, and raw materials for paper. The tribe is taxonomically cumbersome due to a paucity of clear diagnostic morphological characters. There has been scant genetic and genomic research conducted for this group, and as a result the phylogenetic relationships and species boundaries within Arundinelleae are poorly understood. Results: We compared and analyzed 11 plastomes of Arundinelleae, of which seven plastomes were newly sequenced. The plastomes range from 139,629 base pairs (bp) (Garnotia tenella) to 140,943 bp (Arundinella barbinodis), with a standard four-part structure. The average GC content was 38.39%, but varied in different regions of the plastome. In all, 110 genes were annotated, comprising 76 protein-coding genes, 30 tRNA genes, and four rRNA genes. Furthermore, 539 simple sequence repeats, 519 long repeats, and 10 hyper-variable regions were identified from the 11 plastomes of Arundinelleae. A phylogenetic reconstruction of Panicoideae based on 98 plastomes demonstrated the monophyly of Arundinella and Garnotia, but the circumscription of Arundinelleae remains unresolved. Conclusion: Complete chloroplast genome sequences can improve phylogenetic resolution relative to single marker approaches, particularly within taxonomically challenging groups. All phylogenetic analyses strongly support the monophyly of Arundinella and Garnotia, respectively, but the monophyly of Arundinelleae was not well supported. The intergeneric phylogenetic relationships within Arundinelleae require clarification, indicating that more data is necessary to resolve generic boundaries and evaluate the monophyly of Arundinelleae. A comprehensive taxonomic revision for the tribe is necessary. In addition, the identified hyper-variable regions could function as molecular markers for clarifying phylogenetic relationships and potentially as barcoding markers for species identification in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Mobilome impacts on physiology in the widely used non-toxic mutant Microcystis aeruginosa PCC 7806 ΔmcyB and toxic wildtype.
- Author
- Stark, Gwendolyn F., Truchon, Alexander R., and Wilhelm, Steven W.
- Abstract
The Microcystis mobilome is a well-known but understudied component of this bloom-forming cyanobacterium. Through genomic and transcriptomic comparisons, we found five families of transposases that altered the expression of genes in the well-studied toxigenic type-strain, Microcystis aeruginosa PCC 7806, and a non-toxigenic genetic mutant, Microcystis aeruginosa PCC 7806 ΔmcyB. Since its creation in 1997, the ΔmcyB strain has been used in comparative physiology studies against the wildtype strain by research labs throughout the world. Some differences in gene expression between what were thought to be otherwise genetically identical strains have appeared due to insertion events in both intra- and intergenic regions. In our ΔmcyB isolate, a sulfate transporter gene cluster (sbp-cysTWA) showed differential expression from the wildtype, which may have been caused by the insertion of a miniature inverted repeat transposable element (MITE) in the sulfate-binding protein gene (sbp). Differences in growth in sulfate-limited media were also observed between the two isolates. This paper highlights how Microcystis strains continue to “evolve” in lab conditions and illustrates the importance of insertion sequences / transposable elements in shaping genomic and physiological differences between Microcystis strains thought otherwise identical. This study forces the necessity of knowing the complete genetic background of isolates in comparative physiological experiments, to facilitate the correct conclusions (and caveats) from experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. From CFTR to a CF signalling network: a systems biology approach to study Cystic Fibrosis.
- Author
- Najm, Matthieu, Martignetti, Loredana, Cornet, Matthieu, Kelly-Aubert, Mairead, Sermet, Isabelle, Calzone, Laurence, and Stoven, Véronique
- Subjects
- CYSTIC fibrosis transmembrane conductance regulator, MEMBRANE proteins, CYSTIC fibrosis, SYSTEMS biology, CELLULAR signal transduction, CHLORIDE channels
- Abstract
Background: Cystic Fibrosis (CF) is a monogenic disease caused by mutations in the gene coding the Cystic Fibrosis Transmembrane Regulator (CFTR) protein, but its overall physio-pathology cannot be solely explained by the loss of the CFTR chloride channel function. Indeed, CFTR belongs to a yet not fully deciphered network of proteins participating in various signalling pathways. Methods: We propose a systems biology approach to study how the absence of the CFTR protein at the membrane leads to perturbation of these pathways, resulting in a panel of deleterious CF cellular phenotypes. Results: Based on publicly available transcriptomic datasets, we built and analyzed a CF network that recapitulates signalling dysregulations. The CF network topology and its resulting phenotypes were found to be consistent with CF pathology. Conclusion: Analysis of the network topology highlighted a few proteins that may initiate the propagation of dysregulations, those that trigger CF cellular phenotypes, and suggested several candidate therapeutic targets. Although our research is focused on CF, the global approach proposed in the present paper could also be followed to study other rare monogenic diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
14. Unveiling the Brazilian kefir microbiome: discovery of a novel Lactobacillus kefiranofaciens (LkefirU) genome and in silico prospection of bioactive peptides with potential anti-Alzheimer properties.
- Author
- Silva, Matheus H., Batista, Letícia L., Malta, Serena M., Santos, Ana C. C., Mendes-Silva, Ana P., Bonetti, Ana M., Ueira-Vieira, Carlos, and dos Santos, Anderson R.
- Subjects
- DIETARY bioactive peptides, ALZHEIMER'S disease, PAN-genome, MOLECULAR docking, KEFIR
- Abstract
Background: Kefir is a complex microbial community that plays a critical role in the fermentation and production of bioactive peptides, and has health-improving properties. The composition of kefir can vary by geographic localization and weather, and this paper focuses on a Brazilian sample and continues previous work that has successful anti-Alzheimer properties. In this study, we employed shotgun metagenomics and peptidomics approaches to characterize Brazilian kefir further. Results: We successfully assembled the novel genome of Lactobacillus kefiranofaciens (LkefirU) and conducted a comprehensive pangenome analysis to compare it with other strains. Furthermore, we performed a peptidome analysis, revealing the presence of bioactive peptides encrypted by L. kefiranofaciens in the Brazilian kefir sample, and utilized in silico prospecting and molecular docking techniques to identify potential anti-Alzheimer peptides, targeting β-amyloid (fibril and plaque), BACE, and acetylcholinesterase. Through this analysis, we identified two peptides that show promise as compounds with anti-Alzheimer properties. Conclusions: These findings not only provide insights into the genome of L. kefiranofaciens but also serve as a promising prototype for the development of novel anti-Alzheimer compounds derived from Brazilian kefir. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Expression profile of small RNAs in Acacia mangium secondary xylem tissue with contrasting lignin content - potential regulatory sequences in monolignol biosynthetic pathway.
- Author
- Ong, Seong Siang and Wickneswari, Ratnam
- Subjects
- PLANT cells & tissues, LIGNINS, CROSSLINKED polymers, PAPER mills, BIOCHEMICAL engineering
- Abstract
Background: Lignin, after cellulose, is the second most abundant biopolymer accounting for approximately 15-35% of the dry weight of wood. As an important component during wood formation, lignin is indispensable for plant structure and defense. However, it is an undesirable component in the pulp and paper industry. Removal of lignin from cellulose is a costly and environmentally hazardous process. Tremendous efforts have been devoted to understanding the role of enzymes and genes in controlling the amount and composition of lignin to be deposited in the cell wall. However, studies on the impact of downregulation and overexpression of monolignol biosynthesis genes in model species on lignin content, plant fitness and viability have been inconsistent. Recently, non-coding RNAs have been discovered to play an important role in regulating the entire monolignol biosynthesis pathway. As small RNAs have critical functions in various biological processes during wood formation, small RNA profiling is an important tool for the identification of the complete set of differentially expressed small RNAs between low lignin and high lignin secondary xylem. Results: In line with this, we have generated two small RNA libraries from samples with contrasting lignin content using Illumina GAII sequencer. About 10 million sequence reads were obtained in secondary xylem of Am48 with high lignin content (41%) and a corresponding 14 million sequence reads were obtained in secondary xylem of Am54 with low lignin content (21%). Our results suggested that A. mangium small RNAs are composed of a set of 12 highly conserved miRNA families found in the plant miRNA database, 82 novel miRNAs and a large proportion of non-conserved small RNAs with low expression levels. The predicted target genes of those differentially expressed conserved and non-conserved miRNAs include transcription factors associated with regulation of the lignin biosynthetic pathway genes. Some of these small RNAs play an important role in epigenetic silencing. Differential expression of the small RNAs between secondary xylem tissues with contrasting lignin content suggests that a cascade of miRNAs play an interconnected role in regulating the lignin biosynthetic pathway in Acacia species. Conclusions: Our study critically demonstrated the roles of small RNAs during secondary wall formation. Comparison of the expression pattern of small RNAs between secondary xylem tissues with contrasting lignin content strongly indicated that small RNAs play a key regulatory role during lignin biosynthesis. Our analyses suggest an evolutionary mechanism for miRNA targets on the basis of the length of their 5' and 3' UTRs and their cellular roles. The results obtained can be used to better understand the roles of small RNAs during lignin biosynthesis and for the development of gene constructs for silencing of specific genes involved in monolignol biosynthesis with minimal effect on plant fitness and viability. For the first time, small RNAs were proven to play an important regulatory role during lignin biosynthesis in A. mangium. [ABSTRACT FROM AUTHOR]
- Published
- 2011
16. Time series-based hybrid ensemble learning model with multivariate multidimensional feature coding for DNA methylation prediction.
- Author
- Yan, Wu, Tan, Li, Mengshan, Li, Weihong, Zhou, Sheng, Sheng, Jun, Wang, and Fu-an, Wu
- Subjects
- DNA methylation, PARTICLE swarm optimization, BLENDED learning, MACHINE learning, TIME series analysis, DNA methyltransferases
- Abstract
Background: DNA methylation is a form of epigenetic modification that impacts gene expression without modifying the DNA sequence, thereby exerting control over gene function and cellular development. The prediction of DNA methylation is vital for understanding and exploring gene regulatory mechanisms. Currently, machine learning algorithms are primarily used for model construction. However, several challenges remain to be addressed, including limited prediction accuracy, constrained generalization capability, and insufficient learning capacity. Results: In response to the aforementioned challenges, this paper leverages the similarities between DNA sequences and time series to introduce a time series-based hybrid ensemble learning model, called Multi2-Con-CAPSO-LSTM. The model utilizes a multivariate and multidimensional encoding approach, combining three types of time series encodings with three kinds of genetic feature encodings, resulting in a total of nine types of feature encoding matrices. Convolutional Neural Networks are utilized to extract features from DNA sequences, including temporal, positional, physicochemical, and genetic information, thereby creating a comprehensive feature matrix. The Long Short-Term Memory model is then optimized using the Chaotic Accelerated Particle Swarm Optimization algorithm for predicting DNA methylation. Conclusions: Through cross-validation experiments conducted on 17 species involving three types of DNA methylation (6mA, 5hmC, and 4mC), the results demonstrate the robust predictive capabilities of the Multi2-Con-CAPSO-LSTM model in DNA methylation prediction across various types and species. Compared with other benchmark models, the Multi2-Con-CAPSO-LSTM model demonstrates significant advantages in sensitivity, specificity, accuracy, and correlation. The model proposed in this paper provides valuable insights and inspiration across various disciplines, including sequence alignment, genetic evolution, time series analysis, and structure–activity relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
17. GBDT_KgluSite: An improved computational prediction model for lysine glutarylation sites based on feature fusion and GBDT classifier.
- Author
-
Liu, Xin, Zhu, Bao, Dai, Xia-Wei, Xu, Zhi-Ao, Li, Rui, Qian, Yuting, Lu, Ya-Ping, Zhang, Wenqing, Liu, Yong, and Zheng, Junnian
- Subjects
PREDICTION models ,POST-translational modification ,CELL physiology ,AMINO acid sequence ,LYSINE - Abstract
Background: Lysine glutarylation (Kglu) is one of the most important post-translational modifications (PTMs), which plays significant roles in various cellular functions, including metabolism, mitochondrial processes, and translation. Therefore, accurate identification of the Kglu site is important for elucidating protein molecular function. Because traditional biological experiments are time-consuming and expensive, computational Kglu site prediction research is gaining more and more attention. Results: In this paper, we proposed GBDT_KgluSite, a novel Kglu site prediction model based on GBDT and appropriate feature combinations, which achieved satisfactory performance. Specifically, seven features including sequence-based features, physicochemical property-based features, structural-based features, and evolutionary-derived features were used to characterize proteins. NearMiss-3 and Elastic Net were applied to address data imbalance and feature redundancy issues, respectively. The experimental results show that GBDT_KgluSite has good robustness and generalization ability, with accuracy and AUC values of 93.73% and 98.14% on five-fold cross-validation and 90.11% and 96.75% on the independent test dataset, respectively. Conclusion: GBDT_KgluSite is an effective computational method for identifying Kglu sites in protein sequences. It has good stability and generalization ability and could be useful for the identification of new Kglu sites in the future. The relevant code and dataset are available at https://github.com/flyinsky6/GBDT_KgluSite. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
18. Metatranscriptomic profiles of Eastern subterranean termites, Reticulitermes flavipes (Kollar) fed on second generation feedstocks.
- Author
-
Rajarapu, Swapna Priya, Shreve, Jacob T., Bhide, Ketaki P., Thimmapuram, Jyothi, and Scharf, Michael E.
- Subjects
LIGNOCELLULOSE ,LIGNINS ,FEEDSTOCK ,BIOMASS energy research ,RENEWABLE energy source research - Abstract
Background: Second generation lignocellulosic feedstocks are being considered as an alternative to first generation biofuels that are derived from grain starches and sugars. However, the current pre-treatment methods for second generation biofuel production are inefficient and expensive due to the recalcitrant nature of lignocellulose. In this study, we used the lower termite Reticulitermes flavipes (Kollar) as a model to identify potential pretreatment genes/enzymes specifically adapted for use against agricultural feedstocks. Results: Metatranscriptomic profiling was performed on worker termite guts after feeding on corn stover (CS), soybean residue (SR), or 98% pure cellulose (paper) to identify (i) microbial community, (ii) pathway level and (iii) gene-level responses. Microbial community profiles after CS and SR feeding were different from the paper feeding profile, and protist symbiont abundance decreased significantly in termites feeding on SR and CS relative to paper. Functional profiles after CS feeding were similar to paper and SR, whereas paper and SR showed different profiles. Amino acid and carbohydrate metabolism pathways were downregulated in termites feeding on SR relative to paper and CS. Gene expression analyses showed more significant downregulation of genes after SR feeding relative to paper and CS. Stereotypical lignocellulase genes/enzymes were not differentially expressed, but rather were among the most abundant/constitutively-expressed genes. Conclusions: These results suggest that the effect of CS and SR feeding on termite gut lignocellulase composition is minimal and thus, the most abundantly expressed enzymes appear to encode the best candidate catalysts for use in saccharification of these and related second-generation feedstocks.
Further, based on these findings we hypothesize that the most abundantly expressed lignocellulases, rather than those that are differentially expressed have the best potential as pretreatment enzymes for CS and SR feedstocks. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
19. Sequencing by binding rivals SMOR error-corrected sequencing by synthesis technology for accurate detection and quantification of minor (< 0.1%) subpopulation variants.
- Author
-
Allender, Christopher J., Wike, Candice L., Porter, W. Tanner, Ellis, Dean, Lemmer, Darrin, Pond, Stephanie J. K., and Engelthaler, David M.
- Subjects
WHOLE genome sequencing ,ERROR rates ,SINGLE molecules ,NUCLEOTIDE sequencing ,BASIC needs - Abstract
Background: Detecting very minor (< 1%) subpopulations using next-generation sequencing is a critical need for multiple applications, including the detection of drug-resistant pathogens and somatic variant detection in oncology. A recently available sequencing approach termed 'sequencing by binding (SBB)' claims to have higher base calling accuracy data "out of the box." This paper evaluates the utility of using SBB for the detection of ultra-rare drug-resistant subpopulations in Mycobacterium tuberculosis (Mtb) using a targeted amplicon assay and compares the performance of SBB to single molecule overlapping reads (SMOR) error-corrected sequencing by synthesis (SBS) data. Results: SBS displayed an elevated error rate when compared to SMOR error-corrected SBS and SBB techniques. SMOR error-corrected SBS and SBB technologies performed similarly within the linear range studies and error rate studies. Conclusions: With lower sequencing error rates within SBB sequencing, this technique looks promising for both targeted and unbiased whole genome sequencing, leading to the identification of minor (< 1%) subpopulations without the need for error correction methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Exploring crop genomes: assembly features, gene prediction accuracy, and implications for proteomics studies.
- Author
-
Abbas, Qussai, Wilhelm, Mathias, Kuster, Bernhard, Poppenberger, Brigitte, and Frishman, Dmitrij
- Subjects
PLANT genomes ,PROTEOMICS ,GENOMES ,GENES ,GENOMICS - Abstract
Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
21. Drug-target binding affinity prediction using message passing neural network and self supervised learning.
- Author
-
Xia, Leiming, Xu, Lei, Pan, Shourun, Niu, Dongjiang, Zhang, Beiyi, and Li, Zhen
- Subjects
SUPERVISED learning ,MESSAGE passing (Computer science) ,DEEP learning ,DRUG discovery ,REPRESENTATIONS of graphs ,MOLECULAR graphs ,AMINO acid sequence - Abstract
Background: Drug-target binding affinity (DTA) prediction is important for the rapid development of drug discovery. Compared to traditional methods, deep learning methods provide a new way for DTA prediction to achieve good performance without much knowledge of the biochemical background. However, there is still room for improvement in DTA prediction: (1) only focusing on the information of the atom leads to an incomplete representation of the molecular graph; (2) the self-supervised learning method could be introduced for protein representation. Results: In this paper, a DTA prediction model using the deep learning method is proposed, which uses an undirected-CMPNN for molecular embedding and combines CPCProt and MLM models for protein embedding. An attention mechanism is introduced to discover the important part of the protein sequence. The proposed method is evaluated on the datasets Ki and Davis, and the model outperformed other deep learning methods. Conclusions: The proposed model improves the performance of the DTA prediction, which provides a novel strategy for deep learning-based virtual screening methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
22. RNA methylation patterns, immune characteristics, and autophagy-related mechanisms mediated by N6-methyladenosine (m6A) regulatory factors in venous thromboembolism
- Author
-
Zhang, Deshuai, Fu, Wenxia, Zhu, Shiwei, Pan, Yitong, and Li, Ruogu
- Published
- 2024
- Full Text
- View/download PDF
23. Computational dissection of genetic variation modulating the response of multiple photosynthetic phenotypes to the light environment
- Author
-
Gong, Huiying, Zhou, Ziyang, Bu, Chenhao, Zhang, Deqiang, Fang, Qing, Zhang, Xiao-Yu, and Song, Yuepeng
- Published
- 2024
- Full Text
- View/download PDF
24. Glutamine metabolism-related genes and immunotherapy in nonspecific orbital inflammation were validated using bioinformatics and machine learning
- Author
-
Wu, Zixuan, Li, Na, Gao, Yuan, Cao, Liyuan, Yao, Xiaolei, and Peng, Qinghua
- Published
- 2024
- Full Text
- View/download PDF
25. Solanum aculeatissimum and Solanum torvum chloroplast genome sequences: a comparative analysis with other Solanum chloroplast genomes.
- Author
-
Zhang, Longhao, Yi, Chengqi, Xia, Xin, Jiang, Zheng, Du, Lihui, Yang, Shixin, and Yang, Xu
- Subjects
CHLOROPLAST DNA ,SOLANUM ,SEQUENCE analysis ,MICROSATELLITE repeats ,PLANT classification ,BASE pairs - Abstract
Background: Solanum aculeatissimum and Solanum torvum belong to the Solanum species, and they are essential plants known for their high resistance to diseases and adverse conditions. They are frequently used as rootstocks for grafting and are often crossbred with other Solanum species to leverage their resistance traits. However, the phylogenetic relationship between S. aculeatissimum and S. torvum within the Solanum genus remains unclear. Therefore, this paper aims to sequence the complete chloroplast genomes of S. aculeatissimum and S. torvum and analyze them in comparison with 29 other previously published chloroplast genomes of Solanum species. Results: We observed that the chloroplast genomes of S. aculeatissimum and S. torvum possess typical tetrameric structures, consisting of one Large Single Copy (LSC) region, two reverse-symmetric Inverted Repeats (IRs), and one Small Single Copy (SSC) region. The total length of these chloroplast genomes ranged from 154,942 to 156,004 bp, with minimal variation. The highest GC content was found in the IR region, while the lowest was in the SSC region. Regarding gene content, the total number of chloroplast genes and CDS genes remained relatively consistent, ranging from 128 to 134 and 83 to 91, respectively. Nevertheless, there was notable variability in the number of tRNA genes and rRNAs. Relative synonymous codon usage (RSCU) analysis revealed that both S. aculeatissimum and S. torvum preferred codons that utilized A and U bases. Analysis of the IR boundary regions indicated that contraction and expansion primarily occurred at the junction between SSC and IR regions. Nucleotide polymorphism analysis and structural variation analysis demonstrated that chloroplast variation in Solanum species mainly occurred in the LSC and SSC regions. 
Repeat sequence analysis revealed that A/T was the most frequent base pair in simple sequence repeats (SSR), while palindromic and forward repeats were more common in long sequence repeats (LSR), with reverse and complement repeats being less frequent. Phylogenetic analysis indicated that S. aculeatissimum and S. torvum belonged to the same meristem and were more closely related to cultivated eggplant. Conclusion: These findings enhance our comprehension of chloroplast genomes within the Solanum genus, offering valuable insights for plant classification, evolutionary studies, and potential molecular markers for species identification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. scFSNN: a feature selection method based on neural network for single-cell RNA-seq data.
- Author
-
Peng, Minjiao, Lin, Baoqin, Zhang, Jun, Zhou, Yan, and Lin, Bingqing
- Subjects
FEATURE selection ,ARTIFICIAL neural networks ,FALSE discovery rate ,RNA sequencing ,GENE expression - Abstract
While single-cell RNA sequencing (scRNA-seq) allows researchers to analyze gene expression in individual cells, its unique characteristics, such as over-dispersion, zero-inflation, high gene-gene correlation, and large data volume with many features, pose challenges for most existing feature selection methods. In this paper, we present a feature selection method based on a neural network (scFSNN) to solve the classification problem for scRNA-seq data. scFSNN is an embedded method that can automatically select features (genes) during model training, control the false discovery rate of selected features and adaptively determine the number of features to be eliminated. Extensive simulation and real data studies demonstrate its excellent feature selection ability and predictive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Deciphering the regulatory role of PheSnRK genes in Moso bamboo: insights into hormonal, energy, and stress responses.
- Author
-
Huifang Zheng, Yali Xie, Changhong Mu, Wenlong Cheng, Yucong Bai, and Jian Gao
- Abstract
The SnRK (sucrose non-fermentation-related protein kinase) plays an important role in regulating various signals in plants. However, the response mechanism of PheSnRK to hormones, low energy and stress in Phyllostachys edulis, an important bamboo shoot and wood species, remains unclear. In this paper, we focused on the structure, expression, and response of SnRK to hormones and sugars. We identified 75 PheSnRK genes from the Moso bamboo genome, which can be divided into three groups according to their evolutionary relationships. Cis-element analysis showed that the PheSnRK genes can respond to various hormones, light, and stress. The PheSnRK2.9 proteins were localized in the nucleus and cytoplasm. Transgenic experiments in Arabidopsis showed that overexpression of PheSnRK2.9 inhibited root development, and that the transgenic plants were salt-tolerant and exhibited slowed starch consumption in the dark. The results of yeast one-hybrid and dual luciferase assays showed that PheIAAs and PheNACs can regulate PheSnRK2.9 gene expression by binding to the promoter of PheSnRK2.9. This study provides a comprehensive understanding of the PheSnRK genes of Moso bamboo and valuable information for further research on the energy regulation mechanism and stress response during the growth and development of Moso bamboo. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Modified Northern blot protocol for easy detection of mRNAs in total RNA using radiolabeled probes.
- Author
-
Yang, Tao, Zhang, Mingdi, and Zhang, Nianhui
- Subjects
RNA ,DNA probes ,MOLECULAR biology ,GENE expression ,RADIOACTIVITY - Abstract
Background: Northern blotting is still used as a gold standard for validation of the data obtained from high-throughput whole transcriptome-based methods. However, its lower sensitivity, labor-intensive operation, and requirement for higher-quality RNA limit its use in a routine molecular biology laboratory for monitoring gene expression at the RNA level. Therefore, it is necessary to optimize the traditional Northern protocol to make the technique more applicable for standard use. Results: In this paper, we report modifications and tips used to improve the traditional Northern protocol for the detection of mRNAs in total RNA. To maximize the retention of specifically bound radiolabeled probes on the blot, posthybridization washes were performed under moderate stringency only, until the level of radioactivity retained on the filter decreased to 20 to 50 counts per second, rather than sequentially under high and low stringency for a scheduled time or under high stringency only. Successful detection of the low-expression gene using heterologous DNA probes in 20 µg of total RNA after a two-day exposure suggested an improvement in detection sensitivity. Quantitatively controlled posthybridization washes were combined with an ethidium bromide RNA-prestaining procedure that directly visualizes prestained RNA bands at any time during electrophoresis or immediately after it; this allowed the progress of the Northern procedure to be monitored and evaluated step by step, making the experiment reliable and controllable. We also report tips used in the modified Northern protocol, including the moderate concentration of formaldehyde in the gel, the accessory capillary setup, and the staining jar placed into an enamel square tray with a lid used for hybridization. Using our modified Northern protocol, eight rounds of rehybridization could be performed on a single blot.
The modifications made and tips used ensured that the experiment proceeded efficiently and performed well, without special reagents or equipment. Conclusions: The modified Northern protocol improved detection sensitivity and made the experiment easy, less expensive, reliable, and controllable, and can be employed in a routine molecular biology laboratory to detect low-expressed mRNAs with heterologous DNA probes in total RNA. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
29. OTSUCNV: an adaptive segmentation and OTSU-based anomaly classification method for CNV detection using NGS data.
- Author
-
Xie, Kun, Ge, Xiaojun, Alvi, Haque A.K., Liu, Kang, Song, Jianfeng, and Yu, Qiang
- Subjects
WHOLE genome sequencing ,NUCLEOTIDE sequencing ,HUMAN evolution ,COMPUTATIONAL complexity - Abstract
Copy-number variations (CNVs), which refer to deletions and duplications of chromosomal segments, represent a significant source of variation among individuals, contributing to human evolution and being implicated in various diseases ranging from mental illness and developmental disorders to cancer. Despite the development of several methods for detecting copy number variations based on next-generation sequencing (NGS) data, achieving robust detection performance for CNVs with arbitrary coverage and amplitude remains challenging due to the inherent complexity of sequencing samples. In this paper, we propose an alternative method called OTSUCNV for CNV detection on whole genome sequencing (WGS) data. This method utilizes a newly designed adaptive sequence segmentation algorithm and an OTSU-based CNV prediction algorithm, which does not rely on any distribution assumptions or involve complex outlier factor calculations. As a result, the effective detection of CNVs is achieved with lower computational complexity. The experimental results indicate that the proposed method demonstrates outstanding performance, and hence it may be used as an effective tool for CNV detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. Response to correspondence on "B chromosomes of multiple species have intense evolutionary dynamics and accumulated genes related to important biological processes".
- Author
-
Ahmad, Syed F., Valente, Guilherme T., and Martins, Cesar
- Subjects
GENES ,SPECIES ,KARYOTYPES ,DATA analysis ,CHROMOSOMES - Abstract
This document is a response to concerns raised about a previous article on B chromosomes and their evolutionary dynamics. The authors acknowledge the valid concerns and provide corrections and clarifications. They address issues with the availability of sequenced data and inaccuracies in supplementary figures and tables. The authors also discuss errors and corrections in a study on B chromosomes in different species, including mistakes in referencing, figure citation, and data analysis. They emphasize the need for standardized protocols and bioinformatic tools to improve the study of B chromosomes and acknowledge the challenges and gaps in understanding their genomic content. [Extracted from the article]
- Published
- 2023
- Full Text
- View/download PDF
31. Complete genome sequencing and comparison of two nitrogen-metabolizing bacteria isolated from Antarctic deep-sea sediment.
- Author
-
Liu, Wenqi, Cong, Bailin, Lin, Jing, Zhao, Linlin, and Liu, Shenghao
- Abstract
Background: Bacteria are an essential component of the Earth's biota and affect the circulation of matter through their metabolic activity. They also play an important role in the carbon and nitrogen cycle in the deep-sea environment. In this paper, two strains from deep-sea sediments were investigated in order to understand nitrogen cycling in the deep-sea environment. Results: In this paper, the basic genomic information of two strains was obtained by whole genome sequencing. The Cobetia amphilecti N-80 and Halomonas profundus 13 genome sizes are 4,160,095 bp with a GC content of 62.5% and 5,251,450 bp with a GC content of 54.84%, respectively. Through a comparison of functional analyses, we predicted the possible C and N metabolic pathways of the two strains and determined that Halomonas profundus 13 could use more carbon sources than Cobetia amphilecti N-80. The main genes associated with N metabolism in Halomonas profundus 13 are narG, narY, narI, nirS, norB, norC, nosZ, and nirD. In contrast, nirD, using NH4+ for energy, plays the main role in Cobetia amphilecti N-80. Both of them have the same genes for fixing inorganic carbon: icd, ppc, fdhA, accC, accB, accD, and accA. Conclusion: In this study, the whole genomes of two strains were sequenced to clarify the basic characteristics of their genomes, laying the foundation for further studying nitrogen-metabolizing bacteria. Halomonas profundus 13 can utilize more carbon sources than Cobetia amphilecti N-80, as indicated by API as well as COG and KEGG prediction results. Finally, through the analysis of the nitrification and denitrification abilities as well as the inorganic carbon fixation ability of the two strains, the related genes were identified, and the possible metabolic pathways were predicted. Together, these results provide molecular markers and theoretical support for the mechanisms of inorganic carbon fixation by deep-sea microorganisms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
32. Retraction Note: TaWRKY40 transcription factor positively regulate the expression of TaGAPC1 to enhance drought tolerance.
- Author
-
Zhang, Lin, Xu, Zhiyong, Ji, Haikun, Zhou, Ye, and Yang, Shushen
- Subjects
DROUGHT tolerance ,TRANSCRIPTION factors ,DROUGHTS ,DROUGHT management - Abstract
RETRACTED ARTICLE: The specific MYB binding sites bound by TaMYB in the GAPCp2/3 promoters are involved in the drought stress response in wheat. 5A (WT; drought 25d and 7d after re-watering) of their BMC Plant Biology paper [2]. [Extracted from the article]
- Published
- 2023
- Full Text
- View/download PDF
33. Long-term TE persistence even without beneficial insertion.
- Author
-
Kremer, Stefan C., Linquist, Stefan, Saylor, Brent, Elliott, Tyler A., Gregory, T. Ryan, and Cottenie, Karl
- Subjects
GENOMICS ,CRITICISM ,COMPREHENSION - Abstract
This correspondence responds to the critique by Butler et al. (BMC Genomics 22:241, 2021) of our recent paper on transposable element (TE) persistence. We address the three main objections raised by Butler et al. After running a series of additional simulations that were inspired by the authors' criticisms, we are able to present a more nuanced understanding of the conditions that generate long-term persistence. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
34. ETGPDA: identification of piRNA-disease associations based on embedding transformation graph convolutional network
- Author
-
Meng, Xianghan, Shang, Junliang, Ge, Daohui, Yang, Yi, Zhang, Tongdui, and Liu, Jin-Xing
- Published
- 2023
- Full Text
- View/download PDF
35. Small open reading frames: a comparative genetics approach to validation
- Author
-
Jain, Niyati, Richter, Felix, Adzhubei, Ivan, Sharp, Andrew J., and Gelb, Bruce D.
- Published
- 2023
- Full Text
- View/download PDF
36. Prediction of lncRNA functions using deep neural networks based on multiple networks.
- Author
-
Deng, Lei, Ren, Shengli, and Zhang, Jingpu
- Subjects
ARTIFICIAL neural networks ,LINCRNA ,BIOLOGICAL databases ,GENE ontology - Abstract
Background: More and more studies show that lncRNAs are widely involved in various physiological processes of the organism. However, the functions of the vast majority of them remain unknown. In addition, data related to lncRNAs in biological databases are constantly increasing. Therefore, it is quite urgent to develop a computational method to make the most of these data. Results: In this paper, we propose a new computational method based on global heterogeneous networks to predict the functions of lncRNAs, called DNGRGO. DNGRGO first calculates the similarities among proteins, miRNAs, and lncRNAs, and annotates the functions of lncRNAs according to their similar protein-coding genes, which have been labeled with gene ontology (GO). To evaluate the performance of DNGRGO, we manually annotated GO terms to lncRNAs and implemented our method on these data. Compared with the existing methods, the results of DNGRGO show superior predictive performance of maximum F-measure and coverage. Conclusions: DNGRGO is able to annotate lncRNAs through capturing the low-dimensional features of the heterogeneous network. Moreover, the experimental results show that integrating miRNA data can help to improve the predictive performance of DNGRGO. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
37. A new and effective two-step clustering approach for single cell RNA sequencing data.
- Author
-
Li, Ruiyi, Guan, Jihong, Wang, Zhiye, and Zhou, Shuigeng
- Subjects
RNA sequencing ,HIERARCHICAL clustering (Cluster analysis) ,NATURAL immunity ,DRUG resistance ,CLUSTER analysis (Statistics) - Abstract
Background: The rapid development of single cell RNA sequencing (scRNA-seq) technology has led to huge amounts of scRNA-seq data, which greatly advance research in many biomedical fields involving tissue heterogeneity, pathogenesis of disease, drug resistance, etc. One major task in scRNA-seq data analysis is to cluster cells in terms of their expression characteristics. Up to now, a number of methods have been proposed to infer cell clusters, yet there is still much room to improve their performance. Results: In this paper, we develop a new two-step clustering approach to effectively cluster scRNA-seq data, called TSC (Two-Step Clustering). Specifically, we divide all cells into two types: core cells (those possibly lying around the centers of clusters) and non-core cells (those located in the boundary areas of clusters). We first cluster the core cells by hierarchical clustering (the first step) and then assign the non-core cells to the corresponding nearest clusters (the second step). Extensive experiments on 12 real scRNA-seq datasets show that TSC outperforms the state-of-the-art methods. Conclusion: TSC is an effective clustering method due to its two-step clustering strategy, and it is a useful tool for scRNA-seq data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
38. Functional analysis of differentially expressed circular RNAs in sheep subcutaneous fat.
- Author
-
Liu, Tian-yi, Feng, Hui, Yousuf, Salsabeel, Xie, Ling-li, and Miao, Xiang-yang
- Subjects
CIRCULAR RNA ,FUNCTIONAL analysis ,ADIPOSE tissues ,SHEEP ,FAT ,AMP-activated protein kinases - Abstract
Background: Circular RNAs (circRNAs), as important non-coding RNAs (ncRNAs), are involved in many biological activities. However, the exact chemical mechanism behind fat accumulation is unknown. In this paper, we obtained the expression profiles of circRNAs using high-throughput sequencing and investigated their differential expression in subcutaneous fat tissue of Duolang and Small Tail Han sheep. Results: From the transcriptomic analysis, 141 differentially expressed circRNAs were identified, comprising 61 up-regulated circRNAs and 80 down-regulated circRNAs. Their host genes were primarily enriched in the MAPK and AMPK signaling pathways, which are closely associated with fat deposition regulation. We identified circRNA812, circRNA91, and circRNA388 as vital genes in fat deposition by miRNA-circRNA target gene prediction. The functional annotation results of target genes of key circRNAs showed that the signaling pathways mainly included PI3K-Akt and AMPK. We constructed the competing endogenous RNA (ceRNA) regulatory network to study the role of circRNAs in sheep lipid deposition, and circRNA812, circRNA91, and circRNA388 can adsorb more miRNAs. NC_040253.1_5757, as the source of miRNA response element (MRE) among the three, may play an important role during the process of sheep fat deposition. Conclusions: Our study gives a systematic examination of the circRNA profiles expressed in sheep subcutaneous fat. The results of this study provide a new basis for understanding circRNA function and sheep fat metabolism. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. The NIH Comparative Genomics Resource: addressing the promises and challenges of comparative genomics on human health.
- Author
-
Bornstein, Kristin, Gryan, Gary, Chang, E. Sally, Marchler-Bauer, Aron, and Schneider, Valerie A.
- Subjects
COMPARATIVE genomics ,BIOLOGICAL evolution ,GENOMICS ,ZOONOSES ,DRUG target ,SPANNING trees - Abstract
Comparative genomics is the comparison of genetic information within and across organisms to understand the evolution, structure, and function of genes, proteins, and non-coding regions (Sivashankari and Shanmughavel, Bioinformation 1:376-8, 2007). Advances in sequencing technology and assembly algorithms have resulted in the ability to sequence large genomes and provided a wealth of data that are being used in comparative genomic analyses. Comparative analysis can be leveraged to systematically explore and evaluate the biological relationships and evolution between species, aid in understanding the structure and function of genes, and gain a better understanding of disease and potential drug targets. As our knowledge of genetics expands, comparative genomics can help identify emerging model organisms among a broader span of the tree of life, positively impacting human health. This impact includes, but is not limited to, zoonotic disease research, therapeutics development, microbiome research, xenotransplantation, oncology, and toxicology. Despite advancements in comparative genomics, new challenges have arisen around the quantity, quality assurance, annotation, and interoperability of genomic data and metadata. New tools and approaches are required to meet these challenges and fulfill the needs of researchers. This paper focuses on how the National Institutes of Health (NIH) Comparative Genomics Resource (CGR) can address both the opportunities for comparative genomics to further impact human health and confront an increasingly complex set of challenges facing researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2023
40. Multiomics comparative analysis of the maize large grain mutant tc19 identified pathways related to kernel development.
- Author
Cai, Qing, Jiao, Fuchao, Wang, Qianqian, Zhang, Enying, Song, Xiyun, Pei, Yuhe, Li, Jun, Zhao, Meiai, and Guo, Xinmei
- Subjects
MULTIOMICS, CORN breeding, COMPARATIVE studies, GRAIN yields, GRAIN, PHENYLPROPANOIDS, CORN
- Abstract
Background: The mechanism of grain development in elite maize breeding lines has not been fully elucidated. Grain length, grain width and grain weight are key components of maize grain yield. Previously, using the Chinese elite maize breeding line Chang7-2 and its large grain mutant tc19, we characterized the grain size developmental difference between Chang7-2 and tc19 and performed transcriptomic analysis. Results: In this paper, using Chang7-2 and tc19, we performed comparative transcriptomic, proteomic and metabolomic analyses at different grain development stages. Through proteomics analyses, we found 2884, 505 and 126 differentially expressed proteins (DEPs) at 14, 21 and 28 days after pollination, respectively. Through metabolomics analysis, we identified 51, 32 and 36 differentially accumulated metabolites (DAMs) at 14, 21 and 28 days after pollination, respectively. Through multiomics comparative analysis, we showed that the phenylpropanoid pathways are influenced at transcriptomic, proteomic and metabolomic levels in all the three grain developmental stages. Conclusion: We identified several genes in phenylpropanoid biosynthesis, which may be related to the large grain phenotype of tc19. In summary, our results provided new insights into maize grain development. [ABSTRACT FROM AUTHOR]
- Published
- 2023
41. SDNN-PPI: self-attention with deep neural network effect on protein-protein interaction prediction.
- Author
Li, Xue, Han, Peifu, Wang, Gan, Chen, Wenqi, Wang, Shuang, and Song, Tao
- Subjects
ARTIFICIAL neural networks, NETWORK effect, PROTEIN-protein interactions, MICE, DEEP learning, GENETIC transcription regulation, CAENORHABDITIS
- Abstract
Background: Protein-protein interactions (PPIs) dominate intracellular molecules to perform a series of tasks such as transcriptional regulation, information transduction, and drug signalling. The traditional wet-experiment method of obtaining PPI information is costly and time-consuming. Result: In this paper, SDNN-PPI, a PPI prediction method based on self-attention and deep learning, is proposed. The method adopts amino acid composition (AAC), conjoint triad (CT), and auto covariance (AC) to extract global and local features of protein sequences, and leverages self-attention to enhance DNN feature extraction so as to predict PPIs more effectively. To verify the generalization ability of SDNN-PPI, 5-fold cross-validation on the intraspecific interaction datasets of Saccharomyces cerevisiae (core subset) and human is used to measure the model, with accuracy reaching 95.48% and 98.94%, respectively. Accuracies of 93.15% and 88.33% are obtained on the interspecific interaction datasets of human-Bacillus anthracis and human-Yersinia pestis, respectively. On the independent datasets of Caenorhabditis elegans, Escherichia coli, Homo sapiens, and Mus musculus, prediction accuracy is 100% in every case, which is higher than that of previous PPI prediction methods. To further evaluate the advantages and disadvantages of the model, the one-core and crossover networks are used to predict PPIs, and the data show that the model correctly predicts the interaction pairs in the networks. Conclusion: In this paper, the AAC, CT and AC methods are used to encode the sequence, and the SDNN-PPI method is proposed to predict PPIs based on a self-attention deep learning neural network. Satisfactory results are obtained on interspecific and intraspecific datasets, and good performance is also achieved in cross-species prediction.
It can also correctly predict the protein interactions of cell and tumor information contained in the one-core network and crossover network. The SDNN-PPI proposed in this paper not only explores the mechanism of protein-protein interaction, but also provides new ideas for drug design and disease prevention. [ABSTRACT FROM AUTHOR]
- Published
- 2022
42. The International Conference on Intelligent Biology and Medicine (ICIBM) 2020: Scalable techniques and algorithms for computational genomics.
- Author
Zhang, Wei, Zhao, Zhongming, Wang, Kai, Shen, Li, and Shi, Xinghua
- Subjects
HORIZONTAL gene transfer, COMPUTATIONAL biology, GENOMICS, BIOLOGY, CONFERENCES & conventions, MEDICAL informatics
- Abstract
In this introduction article, we summarize the 2020 International Conference on Intelligent Biology and Medicine (ICIBM 2020) conference which was held on August 9–10, 2020 (virtual conference). We then briefly describe the nine research articles included in this supplement issue. ICIBM 2020 hosted four scientific sections covering current topics in bioinformatics, computational biology, genomics, biomedical informatics, among others. A total of 75 original manuscripts were submitted to ICIBM 2020. All the papers were under rigorous review (at least three reviewers), and highly ranked manuscripts were selected for oral presentation and supplement issues. This genomics supplement issue included nine manuscripts. These articles cover methods and applications for single cell RNA sequencing, multi-omics data integration for gene regulation, gene fusion detection from long-read RNA sequencing, gene co-expression analysis of metabolic pathways in cancer, integrative genome-wide association studies (GWAS) of subcortical imaging phenotype in Alzheimer's disease, as well as deep learning methods for protein structure prediction, metabolic pathway membership inference, and horizontal gene transfer (HGT) insertion sites prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2020
43. normGAM: an R package to remove systematic biases in genome architecture mapping data.
- Author
Liu, Tong and Wang, Zheng
- Subjects
GENE mapping, DATA mapping, FLUORESCENCE in situ hybridization, RESTRICTION fragment length polymorphisms, LINKAGE disequilibrium, SOURCE code
- Abstract
Background: The genome architecture mapping (GAM) technique can capture genome-wide chromatin interactions. However, besides the known systematic biases in the raw GAM data, we have found a new type of systematic bias. It is necessary to develop and evaluate effective normalization methods to remove all systematic biases in the raw GAM data. Results: We have detected a new type of systematic bias, the fragment length bias, in the genome architecture mapping (GAM) data, which is significantly different from the bias of window detection frequency previously mentioned in the paper introducing the GAM method but is similar to the bias of distances between restriction sites existing in raw Hi-C data. We have found that the normalization method (a normalized variant of the linkage disequilibrium) used in the GAM paper is not able to effectively eliminate the new fragment length bias at 1 Mb resolution (slightly better at 30 kb resolution). We have developed an R package named normGAM for eliminating the new fragment length bias together with the other three biases existing in raw GAM data, which are the biases related to window detection frequency, mappability, and GC content. Five normalization methods have been implemented and included in the R package including Knight-Ruiz 2-norm (KR2, newly designed by us), normalized linkage disequilibrium (NLD), vanilla coverage (VC), sequential component normalization (SCN), and iterative correction and eigenvector decomposition (ICE). Conclusions: Based on our evaluations, the five normalization methods can eliminate the four biases existing in raw GAM data, with VC and KR2 performing better than the others. We have observed that the KR2-normalized GAM data have a higher correlation with the KR-normalized Hi-C data on the same cell samples indicating that the KR-related methods are better than the others for keeping the consistency between the GAM and Hi-C experiments. 
Compared with the raw GAM data, the normalized GAM data are more consistent with the normalized distances from the fluorescence in situ hybridization (FISH) experiments. The source code of normGAM can be freely downloaded from http://dna.cs.miami.edu/normGAM/. [ABSTRACT FROM AUTHOR]
- Published
- 2019
44. Explore potential disease related metabolites based on latent factor model.
- Author
Yongtian Wang, Liran Juan, Jiajie Peng, Tao Wang, Tianyi Zang, and Yadong Wang
- Subjects
METABOLITES, MATRIX decomposition, DECOMPOSITION method, METABOLIC models, LATENT infection, FRUIT rots
- Abstract
Background: In biological systems, metabolomics can not only contribute to the discovery of metabolic signatures for disease diagnosis, but is very helpful to illustrate the underlying molecular disease-causing mechanism. Therefore, identification of disease-related metabolites is of great significance for comprehensively understanding the pathogenesis of diseases and improving clinical medicine. Results: In the paper, we propose a disease and literature driven metabolism prediction model (DLMPM) to identify the potential associations between metabolites and diseases based on latent factor model. We build the disease glossary with disease terms from different databases and an association matrix based on the mapping between diseases and metabolites. The similarity of diseases and metabolites is used to complete the association matrix. Finally, we predict potential associations between metabolites and diseases based on the matrix decomposition method. In total, 1,406 direct associations between diseases and metabolites are found. There are 119,206 unknown associations between diseases and metabolites predicted with a coverage rate of 80.88%. Subsequently, we extract training sets and testing sets based on data increment from the database of disease-related metabolites and assess the performance of DLMPM on 19 diseases. As a result, DLMPM is proven to be successful in predicting potential metabolic signatures for human diseases with an average AUC value of 82.33%. Conclusion: In this paper, a computational model is proposed for exploring metabolite-disease pairs and has good performance in predicting potential metabolites related to diseases through adequate validation. The results show that DLMPM has a better performance in prioritizing candidate diseases-related metabolites compared with the previous methods and would be helpful for researchers to reveal more information about human diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
45. Expression profile and bioinformatics analysis of circRNA and its associated ceRNA networks in longissimus dorsi from Lufeng cattle and Leiqiong cattle.
- Author
Yang, Chuang, Wu, Longfei, Guo, Yongqing, Li, Yaokun, Deng, Ming, Liu, Dewu, Liu, Guangbin, and Sun, Baoli
- Subjects
GENE expression, ERECTOR spinae muscles, CIRCULAR RNA, RNA splicing, RNA sequencing, CATTLE breeds, CATTLE breeding
- Abstract
This paper aims to explore the role of circRNA expression profiles and circRNA-associated ceRNA networks in the regulation of myogenesis in the longissimus dorsi of cattle breeds surviving under subtropical conditions in southern China by RNA sequencing and bioinformatics analysis. It also aims to provide comprehensive understanding of the differences in muscle fibers in subtropical cattle breeds and to expand the knowledge of the molecular networks that regulate myogenesis. With regard to meat quality indicators, results showed that the longissimus dorsi of LQC had lower pH (P < 0.0001), lower redness (P < 0.01), lower shear force (P < 0.05), and higher brightness (P < 0.05) than the longissimus dorsi of LFC. With regard to muscle fiber characteristics, the longissimus dorsi of LQC had a smaller diameter (P < 0.0001) and higher density of muscle fibers (P < 0.05). The analysis results show that the function of many circRNA-targeted mRNAs was related to myogenesis and metabolic regulation. Furthermore, in the analysis of the function of circRNA source genes, we hypothesized that btacirc_00497 and btacirc_034497 may regulate the function and type of myofibrils by affecting the expression of MYH6, MYH7, and NEB through competitive linear splicing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
46. iEnhancer-DCSA: identifying enhancers via dual-scale convolution and spatial attention.
- Author
Wang, Wenjun, Wu, Qingyao, and Li, Chunshan
- Subjects
DEEP learning, TEST methods
- Abstract
Background: Due to the dynamic nature of enhancers, identifying enhancers and their strength are major bioinformatics challenges. With the development of deep learning, several models have facilitated enhancers detection in recent years. However, existing studies either neglect different length motifs information or treat the features at all spatial locations equally. How to effectively use multi-scale motifs information while ignoring irrelevant information is a question worthy of serious consideration. In this paper, we propose an accurate and stable predictor iEnhancer-DCSA, mainly composed of dual-scale fusion and spatial attention, automatically extracting features of different length motifs and selectively focusing on the important features. Results: Our experimental results demonstrate that iEnhancer-DCSA is remarkably superior to existing state-of-the-art methods on the test dataset. Especially, the accuracy and MCC of enhancer identification are improved by 3.45% and 9.41%, respectively. Meanwhile, the accuracy and MCC of enhancer classification are improved by 7.65% and 18.1%, respectively. Furthermore, we conduct ablation studies to demonstrate the effectiveness of dual-scale fusion and spatial attention. Conclusions: iEnhancer-DCSA will be a valuable computational tool in identifying and classifying enhancers, especially for those not included in the training dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2023
47. Re-examination of chimp protein kinases suggests "novel architectures" are gene prediction artifacts.
- Author
Robison, Keith
- Subjects
PROTEIN kinases, CHIMPANZEES, GENOMES, HUMAN genome, PHOSPHOTRANSFERASES
- Abstract
Background: Anamika et al. [1] recently published in this journal a sequence alignment analysis of protein kinases encoded by the chimpanzee genome in comparison to those in the human genome. From this analysis they concluded that several chimpanzee kinases have unusual domain arrangements. Results: Re-examination of these kinases reveals that the claimed novel arrangements cannot withstand scrutiny; each is either not novel or represents over-analysis of weakly confident computer-generated gene models. Additional sequence evidence available at the time of the paper's submission either directly contradicts the gene models or suggests alternate gene models. These alternate models would minimize or eliminate the observed differences between human and chimp kinases. Conclusion: None of the proposed novel chimpanzee kinase architectures is supported by experimental evidence. Guidelines to prevent such erroneous conclusions in similar papers are proposed. [ABSTRACT FROM AUTHOR]
- Published
- 2010
48. Promoting synergistic research and education in genomics and bioinformatics.
- Author
Yang, Jack Y., Qu Yang, Mary, Zhu, Mengxia (Michelle), Arabnia, Hamid R., and Youping Deng
- Subjects
CONFERENCES & conventions, BIOINFORMATICS, GENOMICS, COMPUTATIONAL biology
- Abstract
Bioinformatics and Genomics are closely related disciplines that hold great promise for the advancement of research and development in complex biomedical systems, as well as public health, drug design, comparative genomics, personalized medicine and so on. Research and development in these two important areas are impacting science and technology. High throughput sequencing and molecular imaging technologies marked the beginning of a new era for modern translational medicine and personalized healthcare. The impact of having the human sequence and personalized digital images in hand has also created tremendous demand for developing powerful supercomputing, statistical learning and artificial intelligence approaches to handle the massive bioinformatics and personalized healthcare data, which will obviously have a profound effect on how biomedical research will be conducted toward the improvement of human health and prolonging of human life in the future. The International Society of Intelligent Biological Medicine (http://www.isibm.org) and its official journals, the International Journal of Functional Informatics and Personalized Medicine (http://www.inderscience.com/ijfipm) and the International Journal of Computational Biology and Drug Design (http://www.inderscience.com/ijcbdd), in collaboration with the International Conference on Bioinformatics and Computational Biology (Biocomp), touch tomorrow's bioinformatics and personalized medicine through today's efforts in promoting the research, education and awareness of the upcoming integrated inter/multidisciplinary field. The 2007 International Conference on Bioinformatics and Computational Biology (BIOCOMP07) was held in Las Vegas, United States of America, on June 25-28, 2007. The conference attracted over 400 papers, covering broad research areas in genomics, biomedicine and bioinformatics.
The Biocomp 2007 provides a common platform for the cross fertilization of ideas, and to help shape knowledge and scientific achievements by bridging these two very important disciplines into an interactive and attractive forum. Keeping this objective in mind, Biocomp 2007 aims to promote interdisciplinary and multidisciplinary education and research. 25 high quality peer-reviewed papers were selected from 400+ submissions for this supplementary issue of BMC Genomics. Those papers contributed to a wide-range of important research fields including gene expression data analysis and applications, high-throughput genome mapping, sequence analysis, gene regulation, protein structure prediction, disease prediction by machine learning techniques, systems biology, database and biological software development. We always encourage participants submitting proposals for genomics sessions, special interest research sessions, workshops and tutorials to Professor Hamid R. Arabnia (hra@cs.uga.edu) in order to ensure that Biocomp continuously plays the leadership role in promoting inter/multidisciplinary research and education in the fields. Biocomp received top conference ranking with a high score of 0.95/1.00. Biocomp is academically cosponsored by the International Society of Intelligent Biological Medicine and the Research Laboratories and Centers of Harvard University -- Massachusetts Institute of Technology, Indiana University - Purdue University, Georgia Tech -- Emory University, UIUC, UCLA, Columbia University, University of Texas at Austin and University of Iowa etc. Biocomp - Worldcomp brings leading scientists together across the nation and all over the world and aims to promote synergistic components such as keynote lectures, special interest sessions, workshops and tutorials in response to the advances of cutting-edge research. [ABSTRACT FROM AUTHOR]
- Published
- 2008
49. CNARA: reliability assessment for genomic copy number profiles
- Author
Haoyang Cai, Michael Baudis, Caius Solovan, Ni Ai, University of Zurich, and Ai, Ni
- Subjects
0301 basic medicine, DNA Copy Number Variations, Copy number analysis, Reliability assessment, Gene Dosage, Genomics, Computational biology, Biology, Web Browser, Gene dosage, Genome, 03 medical and health sciences, 1311 Genetics, Genetics, Preprocessor, Profiling (information science), Computer Simulation, Microarray platform, Original Paper, Computational Biology, Reproducibility of Results, CNA, 10124 Institute of Molecular Life Sciences, 030104 developmental biology, 1305 Biotechnology, 570 Life sciences, biology, Copy number profile, DNA microarray, Algorithms, Biotechnology
- Abstract
Background: DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. Besides, large derivative log ratio spread (DLRS) of the probes correlating with poor DNA quality is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly. Results: Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform and may be adapted for those sequencing platforms from which copy number estimates could be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA. Conclusions: We have developed a method for the assessment of genomic copy number profiling data, and suggest to apply the method in addition to and after other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and support the reliable functional attribution of copy number aberrations especially in cancer research. Electronic supplementary material: The online version of this article (doi:10.1186/s12864-016-3074-7) contains supplementary material, which is available to authorized users.
- Published
- 2016
50. Metagenome-mining indicates an association between bacteriocin presence and strain diversity in the infant gut.
- Author
Ormaasen, Ida, Rudi, Knut, Diep, Dzung B., and Snipen, Lars
- Subjects
INFANTS, GUT microbiome, HUMAN microbiota, ROLE conflict, ANTIMICROBIAL peptides
- Abstract
Background: Our knowledge about the ecological role of bacterial antimicrobial peptides (bacteriocins) in the human gut is limited, particularly in relation to their role in the diversification of the gut microbiota during early life. The aim of this paper was therefore to address associations between bacteriocins and bacterial diversity in the human gut microbiota. To investigate this, we did an extensive screening of 2564 healthy human gut metagenomes for the presence of predicted bacteriocin-encoding genes, comparing bacteriocin gene presence to strain diversity and age. Results: We found that the abundance of bacteriocin genes was significantly higher in infant-like metagenomes (< 2 years) compared to adult-like metagenomes (2–107 years). By comparing infant-like metagenomes with and without a given bacteriocin, we found that bacteriocin presence was associated with increased strain diversities. Conclusions: Our findings indicate that bacteriocins may play a role in the strain diversification during the infant gut microbiota establishment. [ABSTRACT FROM AUTHOR]
- Published
- 2023