9,888 results
Search Results
2. First comprehensive analysis of lysine succinylation in paper mulberry (Broussonetia papyrifera)
- Author
-
Yibo Dong, Ping Li, and Chao Chen
- Subjects
Paper mulberry ,Lysine succinylation ,Posttranslational modification ,Photosynthesis ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470
- Abstract
Background: Lysine succinylation is a naturally occurring post-translational modification (PTM) that is ubiquitous in organisms. Lysine succinylation plays important roles in regulating protein structure and function as well as cellular metabolism. Global lysine succinylation at the proteomic level has been identified in a variety of species; however, limited information on lysine succinylation in plant species, especially paper mulberry, is available. Paper mulberry is not only an important plant in traditional Chinese medicine, but it is also a tree species with significant economic value. Paper mulberry is found in the temperate and tropical zones of China. The present study analyzed the effects of lysine succinylation on the growth, development, and physiology of paper mulberry. Results: A total of 2097 lysine succinylation sites were identified in 935 proteins associated with the citric acid cycle (TCA cycle), glyoxylic acid and dicarboxylic acid metabolism, ribosomes and oxidative phosphorylation; these pathways play a role in carbon fixation in photosynthetic organisms and may be regulated by lysine succinylation. The modified proteins were distributed in multiple subcellular compartments and were involved in a wide variety of biological processes, such as photosynthesis and the Calvin-Benson cycle. Conclusion: Lysine-succinylated proteins may play key regulatory roles in metabolism, primarily in photosynthesis and oxidative phosphorylation, as well as in many other cellular processes. In addition to the large number of succinylated proteins associated with photosynthesis and oxidative phosphorylation, some proteins associated with the TCA cycle are succinylated. Our study can serve as a reference for further proteomics studies of the downstream effects of succinylation on the physiology and biochemistry of paper mulberry.
- Published
- 2021
3. The cold responsive mechanism of the paper mulberry: decreased photosynthesis capacity and increased starch accumulation.
- Author
-
Xianjun Peng, Linhong Teng, Xueqing Yan, Meiling Zhao, and Shihua Shen
- Subjects
*PAPER mulberry , *PHOTOSYNTHESIS , *STARCH metabolism , *EFFECT of cold on plants , *ABIOTIC stress , *PLANT adaptation , *TRANSMISSION electron microscopy
- Abstract
Background: Most studies on the paper mulberry are mainly focused on the medicated and pharmacology, fiber quality, leaves feed development, little is known about its mechanism of adaptability to abiotic stress. Physiological measurement, transcriptomics and proteomic analysis were employed to understand its response to cold stress in this study. Methods: The second to fourth fully expanded leaves from up to down were harvested at different stress time points for the transmission electron microscope (TEM) observation. Physiological characteristics measurement included the relative electrolyte leakage (REL), SOD activity assay, soluble sugar content, and Chlorophyll fluorescence parameter measurement. For screening of differentially expressed genes, the expression level of every transcript in each sample was calculated by quantifying the number of Illumina reads. To identify the differentially expressed protein, leaves of plants under 0, 6, 12, 24, 48 and 72 h cold stress were harvested for proteomic analysis. Finally, real time PCR was used to verify the DEG results of the RNA-seq and the proteomics data. Results: Results showed that at the beginning of cold stress, respiratory metabolism was decreased and the transportation and hydrolysis of photosynthetic products was inhibited, leading to an accumulation of starch in the chloroplasts. Total of 5800 unigenes and 38 proteins were affected, including the repressed expression of photosynthesis and the enhanced expression in signal transduction, stress defense pathway as well as secondary metabolism. Although the transcriptional level of a large number of genes has been restored after 12 h, sustained cold stress brought more serious injury to the leaf cells, including the sharp rise of the relative electrolyte leakage, the declined Fv/Fm value, swelled chloroplast and the disintegrated membrane system.
Conclusion: The starch accumulation and the photoinhibition might be the main adaptive mechanism of the paper mulberry in response to cold stress. Most importantly, enhancing the transport and hydrolysis of photosynthetic products could be the potential targets for improving the cold tolerance of the paper mulberry. [ABSTRACT FROM AUTHOR]
- Published
- 2015
4. Characterization of metallothionein genes from Broussonetia papyrifera: metal binding and heavy metal tolerance mechanisms
- Author
-
Zhenggang Xu, Shen Yang, Chenhao Li, Muhong Xie, Yi He, Sisi Chen, Yan Tang, Dapei Li, Tianyu Wang, and Guiyan Yang
- Subjects
Paper mulberry ,Metallothionein ,Expression analysis ,Yeast transformation ,Site-directed mutagenesis ,Heavy metal transfer ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470
- Abstract
Background: Broussonetia papyrifera is an economically significant tree with high utilization value, yet its cultivation is often constrained by soil contamination with heavy metals (HMs). Effective scientific cultivation management, which enhances the yield and quality of B. papyrifera, necessitates an understanding of its regulatory mechanisms in response to HM stress. Results: Twelve Metallothionein (MT) genes were identified in B. papyrifera. Their open reading frames ranged from 186 to 372 bp, encoding proteins of 61 to 123 amino acids with molecular weights between 15,473.77 and 29,546.96 Da, and theoretical isoelectric points from 5.24 to 5.32. Phylogenetic analysis classified these BpMTs into three subclasses: MT1, MT2, and MT3, with MT2 containing seven members and MT3 only one. The expression of most BpMT genes was inducible by Cd, Mn, Cu, Zn, and abscisic acid (ABA) treatments, particularly BpMT2e, BpMT2d, BpMT2c, and BpMT1c, which showed significant responses and warrant further study. Yeast cells expressing these BpMT genes exhibited enhanced tolerance to Cd, Mn, Cu, and Zn stresses compared to control cells. Yeasts harboring BpMT1c, BpMT2e, and BpMT2d demonstrated higher accumulation of Cd, Cu, Mn, and Zn, suggesting a chelation and binding capacity of BpMTs towards HMs. Site-directed mutagenesis of cysteine (Cys) residues indicated that mutations in the C domain of type 1 BpMT led to increased sensitivity to HMs and reduced HM accumulation in yeast cells, while in type 2 BpMTs, the contribution of the N and C domains to HM chelation was possibly correlated with the quantity of Cys residues. Conclusion: The BpMT genes are crucial in responding to diverse HM stresses and are involved in ABA signaling. The Cys-rich domains of BpMTs are pivotal for HM tolerance and chelation.
This study offers new insights into the structure-function relationships and metal-binding capabilities of type-1 and -2 plant MTs, enhancing our understanding of their roles in plant adaptation to HM stresses.
- Published
- 2024
5. The cold responsive mechanism of the paper mulberry: decreased photosynthesis capacity and increased starch accumulation.
- Author
-
Peng X, Teng L, Yan X, Zhao M, and Shen S
- Subjects
- Acclimatization genetics, Chloroplasts genetics, Chloroplasts metabolism, Cold Temperature, Gene Expression Regulation, Plant, Morus growth & development, Photosynthesis genetics, Plant Leaves genetics, Plant Leaves metabolism, Proteomics, Starch genetics, Morus genetics, Plant Proteins biosynthesis, Starch metabolism, Stress, Physiological genetics
- Abstract
Background: Most studies on the paper mulberry are mainly focused on the medicated and pharmacology, fiber quality, leaves feed development, little is known about its mechanism of adaptability to abiotic stress. Physiological measurement, transcriptomics and proteomic analysis were employed to understand its response to cold stress in this study. Methods: The second to fourth fully expanded leaves from up to down were harvested at different stress time points for the transmission electron microscope (TEM) observation. Physiological characteristics measurement included the relative electrolyte leakage (REL), SOD activity assay, soluble sugar content, and Chlorophyll fluorescence parameter measurement. For screening of differentially expressed genes, the expression level of every transcript in each sample was calculated by quantifying the number of Illumina reads. To identify the differentially expressed protein, leaves of plants under 0, 6, 12, 24, 48 and 72 h cold stress were harvested for proteomic analysis. Finally, real time PCR was used to verify the DEG results of the RNA-seq and the proteomics data. Results: Results showed that at the beginning of cold stress, respiratory metabolism was decreased and the transportation and hydrolysis of photosynthetic products was inhibited, leading to an accumulation of starch in the chloroplasts. Total of 5800 unigenes and 38 proteins were affected, including the repressed expression of photosynthesis and the enhanced expression in signal transduction, stress defense pathway as well as secondary metabolism.
Although the transcriptional level of a large number of genes has been restored after 12 h, sustained cold stress brought more serious injury to the leaf cells, including the sharp rise of the relative electrolyte leakage, the declined Fv/Fm value, swelled chloroplast and the disintegrated membrane system. Conclusion: The starch accumulation and the photoinhibition might be the main adaptive mechanism of the paper mulberry in response to cold stress. Most importantly, enhancing the transport and hydrolysis of photosynthetic products could be the potential targets for improving the cold tolerance of the paper mulberry.
- Published
- 2015
6. Mitochondrial genome plasticity of mammalian species.
- Author
-
Biró, Bálint, Gál, Zoltán, Fekete, Zsófia, Klecska, Eszter, and Hoffmann, Orsolya Ivett
- Subjects
MITOCHONDRIAL DNA ,MACHINE learning ,GENOMES
- Abstract
There is an ongoing process in which mitochondrial sequences are being integrated into the nuclear genome. The importance of these sequences has already been revealed in cancer biology, forensic, phylogenetic studies and in the evolution of the eukaryotic genetic information. Human and numerous model organisms' genomes were described from those sequences point of view. Furthermore, recent studies were published on the patterns of these nuclear localised mitochondrial sequences in different taxa. However, the results of the previously released studies are difficult to compare due to the lack of standardised methods and/or using few numbers of genomes. Therefore, in this paper our primary goal is to establish a uniform mining pipeline to explore these nuclear localised mitochondrial sequences. Our results show that the frequency of several repetitive elements is higher in the flanking regions of these sequences than expected. A machine learning model reveals that the flanking regions' repetitive elements and different structural characteristics are highly influential during the integration process. In this paper, we introduce a general mining pipeline for all mammalian genomes. The workflow is publicly available and is believed to serve as a validated baseline for future research in this field. We confirm the widespread opinion, on - as to our current knowledge - the largest dataset, that structural circumstances and events corresponding to repetitive elements are highly significant. An accurate model has also been trained to predict these sequences and their corresponding flanking regions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Comparative plastome analysis of Arundinelleae (Poaceae, Panicoideae), with implications for phylogenetic relationships and plastome evolution.
- Author
-
Jiang, Li-Qiong, Drew, Bryan T., Arthan, Watchara, Yu, Guo-Ying, Wu, Hong, Zhao, Yue, Peng, Hua, and Xiang, Chun-Lei
- Abstract
Background: Arundinelleae is a small tribe within the Poaceae (grass family) possessing a widespread distribution that includes Asia, the Americas, and Africa. Several species of Arundinelleae are used as natural forage, feed, and raw materials for paper. The tribe is taxonomically cumbersome due to a paucity of clear diagnostic morphological characters. There has been scant genetic and genomic research conducted for this group, and as a result the phylogenetic relationships and species boundaries within Arundinelleae are poorly understood. Results: We compared and analyzed 11 plastomes of Arundinelleae, of which seven plastomes were newly sequenced. The plastomes range from 139,629 base pairs (bp) (Garnotia tenella) to 140,943 bp (Arundinella barbinodis), with a standard four-part structure. The average GC content was 38.39%, but varied in different regions of the plastome. In all, 110 genes were annotated, comprising 76 protein-coding genes, 30 tRNA genes, and four rRNA genes. Furthermore, 539 simple sequence repeats, 519 long repeats, and 10 hyper-variable regions were identified from the 11 plastomes of Arundinelleae. A phylogenetic reconstruction of Panicoideae based on 98 plastomes demonstrated the monophyly of Arundinella and Garnotia, but the circumscription of Arundinelleae remains unresolved. Conclusion: Complete chloroplast genome sequences can improve phylogenetic resolution relative to single marker approaches, particularly within taxonomically challenging groups. All phylogenetic analyses strongly support the monophyly of Arundinella and Garnotia, respectively, but the monophyly of Arundinelleae was not well supported. The intergeneric phylogenetic relationships within Arundinelleae require clarification, indicating that more data is necessary to resolve generic boundaries and evaluate the monophyly of Arundinelleae. A comprehensive taxonomic revision for the tribe is necessary.
In addition, the identified hyper-variable regions could function as molecular markers for clarifying phylogenetic relationships and potentially as barcoding markers for species identification in the future. [ABSTRACT FROM AUTHOR]
- Published
- 2024
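The GC content figure reported in the Arundinelleae abstract above (38.39% on average) is a simple base-composition ratio. A minimal sketch of how such a value is typically computed follows; the function name `gc_content` and the toy sequence are illustrative assumptions, not taken from the paper, and the authors' actual tooling is not stated in the abstract.

```python
def gc_content(sequence: str) -> float:
    """Return the percentage of G and C among unambiguous bases (A, C, G, T).

    Hypothetical helper for illustration; ambiguous symbols such as N are
    ignored rather than counted in the denominator.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, or T bases")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy sequence (not real plastome data): 4 of 10 bases are G or C.
    print(gc_content("ATGCGCATTA"))
```

Applied per region (for example to each single-copy or inverted-repeat segment separately), the same ratio would show the regional variation the abstract mentions.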
8. Mobilome impacts on physiology in the widely used non-toxic mutant Microcystis aeruginosa PCC 7806 ΔmcyB and toxic wildtype.
- Author
-
Stark, Gwendolyn F., Truchon, Alexander R., and Wilhelm, Steven W.
- Abstract
The Microcystis mobilome is a well-known but understudied component of this bloom-forming cyanobacterium. Through genomic and transcriptomic comparisons, we found five families of transposases that altered the expression of genes in the well-studied toxigenic type-strain, Microcystis aeruginosa PCC 7806, and a non-toxigenic genetic mutant, Microcystis aeruginosa PCC 7806 ΔmcyB. Since its creation in 1997, the ΔmcyB strain has been used in comparative physiology studies against the wildtype strain by research labs throughout the world. Some differences in gene expression between what were thought to be otherwise genetically identical strains have appeared due to insertion events in both intra- and intergenic regions. In our ΔmcyB isolate, a sulfate transporter gene cluster (sbp-cysTWA) showed differential expression from the wildtype, which may have been caused by the insertion of a miniature inverted repeat transposable element (MITE) in the sulfate-binding protein gene (sbp). Differences in growth in sulfate-limited media were also observed between the two isolates. This paper highlights how Microcystis strains continue to “evolve” in lab conditions and illustrates the importance of insertion sequences / transposable elements in shaping genomic and physiological differences between Microcystis strains thought otherwise identical. This study forces the necessity of knowing the complete genetic background of isolates in comparative physiological experiments, to facilitate the correct conclusions (and caveats) from experiments. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. From CFTR to a CF signalling network: a systems biology approach to study Cystic Fibrosis.
- Author
-
Najm, Matthieu, Martignetti, Loredana, Cornet, Matthieu, Kelly-Aubert, Mairead, Sermet, Isabelle, Calzone, Laurence, and Stoven, Véronique
- Subjects
CYSTIC fibrosis transmembrane conductance regulator ,MEMBRANE proteins ,CYSTIC fibrosis ,SYSTEMS biology ,CELLULAR signal transduction ,CHLORIDE channels
- Abstract
Background: Cystic Fibrosis (CF) is a monogenic disease caused by mutations in the gene coding the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein, but its overall physio-pathology cannot be solely explained by the loss of the CFTR chloride channel function. Indeed, CFTR belongs to a yet not fully deciphered network of proteins participating in various signalling pathways. Methods: We propose a systems biology approach to study how the absence of the CFTR protein at the membrane leads to perturbation of these pathways, resulting in a panel of deleterious CF cellular phenotypes. Results: Based on publicly available transcriptomic datasets, we built and analyzed a CF network that recapitulates signalling dysregulations. The CF network topology and its resulting phenotypes were found to be consistent with CF pathology. Conclusion: Analysis of the network topology highlighted a few proteins that may initiate the propagation of dysregulations, those that trigger CF cellular phenotypes, and suggested several candidate therapeutic targets. Although our research is focused on CF, the global approach proposed in the present paper could also be followed to study other rare monogenic diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Unveiling the Brazilian kefir microbiome: discovery of a novel Lactobacillus kefiranofaciens (LkefirU) genome and in silico prospection of bioactive peptides with potential anti-Alzheimer properties.
- Author
-
Silva, Matheus H., Batista, Letícia L., Malta, Serena M., Santos, Ana C. C., Mendes-Silva, Ana P., Bonetti, Ana M., Ueira-Vieira, Carlos, and dos Santos, Anderson R.
- Subjects
DIETARY bioactive peptides ,ALZHEIMER'S disease ,PAN-genome ,MOLECULAR docking ,KEFIR
- Abstract
Background: Kefir is a complex microbial community that plays a critical role in the fermentation and production of bioactive peptides, and has health-improving properties. The composition of kefir can vary by geographic localization and weather, and this paper focuses on a Brazilian sample and continues previous work that has successful anti-Alzheimer properties. In this study, we employed shotgun metagenomics and peptidomics approaches to characterize Brazilian kefir further. Results: We successfully assembled the novel genome of Lactobacillus kefiranofaciens (LkefirU) and conducted a comprehensive pangenome analysis to compare it with other strains. Furthermore, we performed a peptidome analysis, revealing the presence of bioactive peptides encrypted by L. kefiranofaciens in the Brazilian kefir sample, and utilized in silico prospecting and molecular docking techniques to identify potential anti-Alzheimer peptides, targeting β-amyloid (fibril and plaque), BACE, and acetylcholinesterase. Through this analysis, we identified two peptides that show promise as compounds with anti-Alzheimer properties. Conclusions: These findings not only provide insights into the genome of L. kefiranofaciens but also serve as a promising prototype for the development of novel anti-Alzheimer compounds derived from Brazilian kefir. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Time series-based hybrid ensemble learning model with multivariate multidimensional feature coding for DNA methylation prediction.
- Author
-
Yan, Wu, Tan, Li, Mengshan, Li, Weihong, Zhou, Sheng, Sheng, Jun, Wang, and Fu-an, Wu
- Subjects
DNA methylation ,PARTICLE swarm optimization ,BLENDED learning ,MACHINE learning ,TIME series analysis ,DNA methyltransferases
- Abstract
Background: DNA methylation is a form of epigenetic modification that impacts gene expression without modifying the DNA sequence, thereby exerting control over gene function and cellular development. The prediction of DNA methylation is vital for understanding and exploring gene regulatory mechanisms. Currently, machine learning algorithms are primarily used for model construction. However, several challenges remain to be addressed, including limited prediction accuracy, constrained generalization capability, and insufficient learning capacity. Results: In response to the aforementioned challenges, this paper leverages the similarities between DNA sequences and time series to introduce a time series-based hybrid ensemble learning model, called Multi2-Con-CAPSO-LSTM. The model utilizes a multivariate and multidimensional encoding approach, combining three types of time series encodings with three kinds of genetic feature encodings, resulting in a total of nine types of feature encoding matrices. Convolutional Neural Networks are utilized to extract features from DNA sequences, including temporal, positional, physicochemical, and genetic information, thereby creating a comprehensive feature matrix. The Long Short-Term Memory model is then optimized using the Chaotic Accelerated Particle Swarm Optimization algorithm for predicting DNA methylation. Conclusions: Through cross-validation experiments conducted on 17 species involving three types of DNA methylation (6mA, 5hmC, and 4mC), the results demonstrate the robust predictive capabilities of the Multi2-Con-CAPSO-LSTM model in DNA methylation prediction across various types and species. Compared with other benchmark models, the Multi2-Con-CAPSO-LSTM model demonstrates significant advantages in sensitivity, specificity, accuracy, and correlation.
The model proposed in this paper provides valuable insights and inspiration across various disciplines, including sequence alignment, genetic evolution, time series analysis, and structure–activity relationships. [ABSTRACT FROM AUTHOR]
- Published
- 2023
12. GBDT_KgluSite: An improved computational prediction model for lysine glutarylation sites based on feature fusion and GBDT classifier.
- Author
-
Liu, Xin, Zhu, Bao, Dai, Xia-Wei, Xu, Zhi-Ao, Li, Rui, Qian, Yuting, Lu, Ya-Ping, Zhang, Wenqing, Liu, Yong, and Zheng, Junnian
- Subjects
PREDICTION models ,POST-translational modification ,CELL physiology ,AMINO acid sequence ,LYSINE
- Abstract
Background: Lysine glutarylation (Kglu) is one of the most important Post-translational modifications (PTMs), which plays significant roles in various cellular functions, including metabolism, mitochondrial processes, and translation. Therefore, accurate identification of the Kglu site is important for elucidating protein molecular function. Due to the time-consuming and expensive limitations of traditional biological experiments, computational-based Kglu site prediction research is gaining more and more attention. Results: In this paper, we proposed GBDT_KgluSite, a novel Kglu site prediction model based on GBDT and appropriate feature combinations, which achieved satisfactory performance. Specifically, seven features including sequence-based features, physicochemical property-based features, structural-based features, and evolutionary-derived features were used to characterize proteins. NearMiss-3 and Elastic Net were applied to address data imbalance and feature redundancy issues, respectively. The experimental results show that GBDT_KgluSite has good robustness and generalization ability, with accuracy and AUC values of 93.73%, and 98.14% on five-fold cross-validation as well as 90.11%, and 96.75% on the independent test dataset, respectively. Conclusion: GBDT_KgluSite is an effective computational method for identifying Kglu sites in protein sequences. It has good stability and generalization ability and could be useful for the identification of new Kglu sites in the future. The relevant code and dataset are available at https://github.com/flyinsky6/GBDT_KgluSite. [ABSTRACT FROM AUTHOR]
- Published
- 2023
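The GBDT_KgluSite abstract above reports accuracy and AUC as its headline metrics. As a reminder of what those two numbers measure, here is a minimal standard-library sketch; the function names, toy labels, and toy scores are illustrative assumptions and are unrelated to the paper's data or code (which the abstract says is available at its GitHub repository).

```python
from typing import Sequence


def accuracy(labels: Sequence[int], predictions: Sequence[int]) -> float:
    """Fraction of predictions that match the true labels."""
    if len(labels) != len(predictions) or not labels:
        raise ValueError("labels and predictions must be non-empty and equal length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive receives a higher
    score than a randomly chosen negative, with ties counted as half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data (hypothetical): 1 = modified site, 0 = unmodified site.
    y_true = [1, 1, 0, 0, 1, 0]
    y_score = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold
    print(accuracy(y_true, y_pred), auc(y_true, y_score))
```

Accuracy depends on the chosen decision threshold, whereas AUC summarizes ranking quality across all thresholds, which is why abstracts like this one usually report both.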
13. Metatranscriptomic profiles of Eastern subterranean termites, Reticulitermes flavipes (Kollar) fed on second generation feedstocks.
- Author
-
Rajarapu, Swapna Priya, Shreve, Jacob T., Bhide, Ketaki P., Thimmapuram, Jyothi, and Scharf, Michael E.
- Subjects
LIGNOCELLULOSE ,LIGNINS ,FEEDSTOCK ,BIOMASS energy research ,RENEWABLE energy source research
- Abstract
Background: Second generation lignocellulosic feedstocks are being considered as an alternative to first generation biofuels that are derived from grain starches and sugars. However, the current pre-treatment methods for second generation biofuel production are inefficient and expensive due to the recalcitrant nature of lignocellulose. In this study, we used the lower termite Reticulitermes flavipes (Kollar), as a model to identify potential pretreatment genes/enzymes specifically adapted for use against agricultural feedstocks. Results: Metatranscriptomic profiling was performed on worker termite guts after feeding on corn stover (CS), soybean residue (SR), or 98% pure cellulose (paper) to identify (i) microbial community, (ii) pathway level and (iii) gene-level responses. Microbial community profiles after CS and SR feeding were different from the paper feeding profile, and protist symbiont abundance decreased significantly in termites feeding on SR and CS relative to paper. Functional profiles after CS feeding were similar to paper and SR; whereas paper and SR showed different profiles. Amino acid and carbohydrate metabolism pathways were downregulated in termites feeding on SR relative to paper and CS. Gene expression analyses showed more significant down regulation of genes after SR feeding relative to paper and CS. Stereotypical lignocellulase genes/enzymes were not differentially expressed, but rather were among the most abundant/constitutively-expressed genes. Conclusions: These results suggest that the effect of CS and SR feeding on termite gut lignocellulase composition is minimal and thus, the most abundantly expressed enzymes appear to encode the best candidate catalysts for use in saccharification of these and related second-generation feedstocks. 
Further, based on these findings we hypothesize that the most abundantly expressed lignocellulases, rather than those that are differentially expressed have the best potential as pretreatment enzymes for CS and SR feedstocks. [ABSTRACT FROM AUTHOR]
- Published
- 2015
14. Sequencing by binding rivals SMOR error-corrected sequencing by synthesis technology for accurate detection and quantification of minor (< 0.1%) subpopulation variants.
- Author
-
Allender, Christopher J., Wike, Candice L., Porter, W. Tanner, Ellis, Dean, Lemmer, Darrin, Pond, Stephanie J. K., and Engelthaler, David M.
- Subjects
WHOLE genome sequencing ,ERROR rates ,SINGLE molecules ,NUCLEOTIDE sequencing ,BASIC needs
- Abstract
Background: Detecting very minor (< 1%) subpopulations using next-generation sequencing is a critical need for multiple applications, including the detection of drug resistant pathogens and somatic variant detection in oncology. A recently available sequencing approach termed 'sequencing by binding (SBB)' claims to have higher base calling accuracy data "out of the box." This paper evaluates the utility of using SBB for the detection of ultra-rare drug resistant subpopulations in Mycobacterium tuberculosis (Mtb) using a targeted amplicon assay and compares the performance of SBB to single molecule overlapping reads (SMOR) error corrected sequencing by synthesis (SBS) data. Results: SBS displayed an elevated error rate when compared to SMOR error-corrected SBS and SBB techniques. SMOR error-corrected SBS and SBB technologies performed similarly within the linear range studies and error rate studies. Conclusions: With lower sequencing error rates within SBB sequencing, this technique looks promising for both targeted and unbiased whole genome sequencing, leading to the identification of minor (< 1%) subpopulations without the need for error correction methods. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Exploring crop genomes: assembly features, gene prediction accuracy, and implications for proteomics studies.
- Author
-
Abbas, Qussai, Wilhelm, Mathias, Kuster, Bernhard, Poppenberger, Brigitte, and Frishman, Dmitrij
- Subjects
PLANT genomes ,PROTEOMICS ,GENOMES ,GENES ,GENOMICS
- Abstract
Plant genomics plays a pivotal role in enhancing global food security and sustainability by offering innovative solutions for improving crop yield, disease resistance, and stress tolerance. As the number of sequenced genomes grows and the accuracy and contiguity of genome assemblies improve, structural annotation of plant genomes continues to be a significant challenge due to their large size, polyploidy, and rich repeat content. In this paper, we present an overview of the current landscape in crop genomics research, highlighting the diversity of genomic characteristics across various crop species. We also assessed the accuracy of popular gene prediction tools in identifying genes within crop genomes and examined the factors that impact their performance. Our findings highlight the strengths and limitations of BRAKER2 and Helixer as leading structural genome annotation tools and underscore the impact of genome complexity, fragmentation, and repeat content on their performance. Furthermore, we evaluated the suitability of the predicted proteins as a reliable search space in proteomics studies using mass spectrometry data. Our results provide valuable insights for future efforts to refine and advance the field of structural genome annotation. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Drug-target binding affinity prediction using message passing neural network and self supervised learning.
- Author
-
Xia, Leiming, Xu, Lei, Pan, Shourun, Niu, Dongjiang, Zhang, Beiyi, and Li, Zhen
- Subjects
SUPERVISED learning ,MESSAGE passing (Computer science) ,DEEP learning ,DRUG discovery ,REPRESENTATIONS of graphs ,MOLECULAR graphs ,AMINO acid sequence
- Abstract
Background: Drug-target binding affinity (DTA) prediction is important for the rapid development of drug discovery. Compared to traditional methods, deep learning methods provide a new way for DTA prediction to achieve good performance without much knowledge of the biochemical background. However, there is still room for improvement in DTA prediction: (1) only focusing on the information of the atom leads to an incomplete representation of the molecular graph; (2) the self-supervised learning method could be introduced for protein representation. Results: In this paper, a DTA prediction model using the deep learning method is proposed, which uses an undirected-CMPNN for molecular embedding and combines CPCProt and MLM models for protein embedding. An attention mechanism is introduced to discover the important part of the protein sequence. The proposed method is evaluated on the datasets Ki and Davis, and the model outperformed other deep learning methods. Conclusions: The proposed model improves the performance of the DTA prediction, which provides a novel strategy for deep learning-based virtual screening methods. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
17. Solanum aculeatissimum and Solanum torvum chloroplast genome sequences: a comparative analysis with other Solanum chloroplast genomes.
- Author
-
Zhang, Longhao, Yi, Chengqi, Xia, Xin, Jiang, Zheng, Du, Lihui, Yang, Shixin, and Yang, Xu
- Subjects
CHLOROPLAST DNA ,SOLANUM ,SEQUENCE analysis ,MICROSATELLITE repeats ,PLANT classification ,BASE pairs - Abstract
Background: Solanum aculeatissimum and Solanum torvum belong to the genus Solanum and are important plants known for their high resistance to diseases and adverse conditions. They are frequently used as rootstocks for grafting and are often crossbred with other Solanum species to leverage their resistance traits. However, the phylogenetic relationship between S. aculeatissimum and S. torvum within the Solanum genus remains unclear. Therefore, this paper aims to sequence the complete chloroplast genomes of S. aculeatissimum and S. torvum and analyze them in comparison with 29 other previously published chloroplast genomes of Solanum species. Results: We observed that the chloroplast genomes of S. aculeatissimum and S. torvum possess typical quadripartite structures, consisting of one Large Single Copy (LSC) region, two reverse-symmetric Inverted Repeats (IRs), and one Small Single Copy (SSC) region. The total length of these chloroplast genomes ranged from 154,942 to 156,004 bp, with minimal variation. The highest GC content was found in the IR region, while the lowest was in the SSC region. Regarding gene content, the total number of chloroplast genes and CDS genes remained relatively consistent, ranging from 128 to 134 and 83 to 91, respectively. Nevertheless, there was notable variability in the number of tRNA and rRNA genes. Relative synonymous codon usage (RSCU) analysis revealed that both S. aculeatissimum and S. torvum preferred codons that utilized A and U bases. Analysis of the IR boundary regions indicated that contraction and expansion primarily occurred at the junction between SSC and IR regions. Nucleotide polymorphism analysis and structural variation analysis demonstrated that chloroplast variation in Solanum species mainly occurred in the LSC and SSC regions. Repeat sequence analysis revealed that A/T was the most frequent base pair in simple sequence repeats (SSRs), while palindromic and forward repeats were more common in long sequence repeats (LSRs), with reverse and complement repeats being less frequent. Phylogenetic analysis indicated that S. aculeatissimum and S. torvum belonged to the same clade and were more closely related to cultivated eggplant. Conclusion: These findings enhance our comprehension of chloroplast genomes within the Solanum genus, offering valuable insights for plant classification, evolutionary studies, and potential molecular markers for species identification. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
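Note on entry 17: the relative synonymous codon usage (RSCU) analysis mentioned in the abstract can be illustrated with a minimal sketch. RSCU for a codon is its observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The function name `rscu`, the deliberately partial two-family codon table, and the toy sequence are assumptions for illustration only; the authors' actual tooling is not described in the abstract.

```python
from collections import Counter

# Hypothetical, deliberately partial table: only two synonymous families
# (standard genetic code, DNA alphabet) to keep the sketch short.
SYNONYMOUS = {
    "Phe": ["TTT", "TTC"],
    "Lys": ["AAA", "AAG"],
}

def rscu(cds):
    """Return {codon: RSCU} for the codon families listed in SYNONYMOUS."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    result = {}
    for family in SYNONYMOUS.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue                          # amino acid absent: RSCU undefined
        expected = total / len(family)        # equal-usage expectation
        for codon in family:
            result[codon] = counts[codon] / expected
    return result

# Toy coding sequence: TTT TTT TTC AAA AAG
values = rscu("TTTTTTTTCAAAAAG")
```

An RSCU above 1 means the codon is used more often than expected under equal usage; a full analysis would cover every amino acid in the appropriate genetic code table.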
18. scFSNN: a feature selection method based on neural network for single-cell RNA-seq data.
- Author
-
Peng, Minjiao, Lin, Baoqin, Zhang, Jun, Zhou, Yan, and Lin, Bingqing
- Subjects
FEATURE selection ,ARTIFICIAL neural networks ,FALSE discovery rate ,RNA sequencing ,GENE expression - Abstract
While single-cell RNA sequencing (scRNA-seq) allows researchers to analyze gene expression in individual cells, its unique characteristics, such as over-dispersion, zero-inflation, high gene-gene correlation, and large data volume with many features, pose challenges for most existing feature selection methods. In this paper, we present a feature selection method based on a neural network (scFSNN) to solve classification problems for scRNA-seq data. scFSNN is an embedded method that can automatically select features (genes) during model training, control the false discovery rate of selected features and adaptively determine the number of features to be eliminated. Extensive simulation and real data studies demonstrate its excellent feature selection ability and predictive performance. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
19. Deciphering the regulatory role of PheSnRK genes in Moso bamboo: insights into hormonal, energy, and stress responses.
- Author
-
Huifang Zheng, Yali Xie, Changhong Mu, Wenlong Cheng, Yucong Bai, and Jian Gao
- Abstract
SnRK (sucrose non-fermentation-related protein kinase) plays an important role in regulating various signals in plants. However, in Phyllostachys edulis (Moso bamboo), an important bamboo shoot and wood species, the response mechanism of PheSnRK to hormones, low energy and stress remains unclear. In this paper, we focused on the structure, expression, and response of SnRK to hormones and sugars. We identified 75 PheSnRK genes from the Moso bamboo genome, which can be divided into three groups according to their evolutionary relationships. Cis-element analysis showed that PheSnRK genes can respond to various hormones, light, and stress. The PheSnRK2.9 protein was localized in the nucleus and cytoplasm. Transgenic experiments in Arabidopsis showed that overexpression of PheSnRK2.9 inhibited root development, and that the transgenic plants were salt-tolerant and exhibited slowed starch consumption in the dark. The results of yeast one-hybrid and dual luciferase assays showed that PheIAAs and PheNACs can regulate PheSnRK2.9 gene expression by binding to the promoter of PheSnRK2.9. This study provides a comprehensive understanding of the PheSnRK genes of Moso bamboo and valuable information for further research on the energy regulation mechanism and stress response during the growth and development of Moso bamboo. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
20. Modified Northern blot protocol for easy detection of mRNAs in total RNA using radiolabeled probes.
- Author
-
Yang, Tao, Zhang, Mingdi, and Zhang, Nianhui
- Subjects
RNA ,DNA probes ,MOLECULAR biology ,GENE expression ,RADIOACTIVITY - Abstract
Background: Northern blotting is still used as a gold standard for validation of the data obtained from high-throughput whole transcriptome-based methods. However, its disadvantages of lower sensitivity, labor-intensive operation, and the higher quality of RNA required limit its utilization in a routine molecular biology laboratory to monitor gene expression at the RNA level. Therefore, it is necessary to optimize the traditional Northern protocol to make the technique more applicable for standard use. Results: In this paper, we report modifications and tips used to improve the traditional Northern protocol for the detection of mRNAs in total RNA. To maximize the retention of specifically bound radiolabeled probes on the blot, posthybridization washes were performed only under moderate stringency until the level of radioactivity retained on the filter decreased to 20-50 counts per second, rather than, as is usual, under high and low stringency sequentially for a scheduled time or under only high-stringency conditions. Successful detection of a low-expression gene using heterologous DNA probes in 20 µg of total RNA after a two-day exposure suggested an improvement in detection sensitivity. Quantitatively controlled posthybridization washes were combined with an ethidium bromide-prestaining RNA procedure that directly visualizes prestained RNA bands at any time during or immediately after electrophoresis; this allowed the progress of the Northern procedure to be monitored and evaluated step by step, thereby making the experiment reliable and controllable. We also report tips used in the modified Northern protocol, including the moderate concentration of formaldehyde in the gel, the accessory capillary setup, and the staining jar placed into an enamel square tray with a lid used for hybridization. Using our modified Northern protocol, eight rounds of rehybridization could be performed on a single blot. The modifications made and tips used ensured that the experiment proceeded efficiently and performed well, without special reagents or equipment. Conclusions: The modified Northern protocol improved detection sensitivity and made the experiment easy, less expensive, reliable, and controllable, and can be employed in a routine molecular biology laboratory to detect low-expressed mRNAs with heterologous DNA probes in total RNA. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
21. OTSUCNV: an adaptive segmentation and OTSU-based anomaly classification method for CNV detection using NGS data.
- Author
-
Xie, Kun, Ge, Xiaojun, Alvi, Haque A.K., Liu, Kang, Song, Jianfeng, and Yu, Qiang
- Subjects
WHOLE genome sequencing ,NUCLEOTIDE sequencing ,HUMAN evolution ,COMPUTATIONAL complexity - Abstract
Copy-number variations (CNVs), which refer to deletions and duplications of chromosomal segments, represent a significant source of variation among individuals, contributing to human evolution and being implicated in various diseases ranging from mental illness and developmental disorders to cancer. Despite the development of several methods for detecting copy number variations based on next-generation sequencing (NGS) data, achieving robust detection performance for CNVs with arbitrary coverage and amplitude remains challenging due to the inherent complexity of sequencing samples. In this paper, we propose an alternative method called OTSUCNV for CNV detection on whole genome sequencing (WGS) data. This method utilizes a newly designed adaptive sequence segmentation algorithm and an OTSU-based CNV prediction algorithm, which does not rely on any distribution assumptions or involve complex outlier factor calculations. As a result, the effective detection of CNVs is achieved with lower computational complexity. The experimental results indicate that the proposed method demonstrates outstanding performance, and hence it may be used as an effective tool for CNV detection. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
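Note on entry 21: the OTSU-based classification step named in the abstract can be illustrated with a minimal sketch of Otsu's thresholding applied to per-bin read depths. Otsu's method picks the cut that maximizes the between-class variance of a histogram, so it needs no distributional assumption. This is not the OTSUCNV implementation: the function name `otsu_threshold`, the histogram size, and the toy depths are assumptions, and the adaptive segmentation step is not shown.

```python
def otsu_threshold(values, bins=64):
    """Return the cut that maximizes between-class variance (Otsu's method)."""
    lo, hi = min(values), max(values)
    if lo == hi:
        return lo                              # no spread, nothing to split
    width = (hi - lo) / bins
    hist = [0] * bins
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        hist[idx] += 1
    centers = [lo + (i + 0.5) * width for i in range(bins)]
    total = len(values)
    sum_all = sum(c * h for c, h in zip(centers, hist))

    best_var, best_cut = -1.0, lo
    w0, sum0 = 0, 0.0                          # size and weighted sum of the lower class
    for i in range(bins - 1):
        w0 += hist[i]
        sum0 += centers[i] * hist[i]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:                 # keep the first maximum
            best_var, best_cut = between, lo + (i + 1) * width
    return best_cut

# Toy read depths: mostly ~30x with a run of ~60x bins (a candidate duplication).
depths = [30, 31, 29, 32, 30, 28, 31, 60, 62, 59, 61, 30, 29]
cut = otsu_threshold(depths)
candidate_gain = [d for d in depths if d > cut]
```

In practice the threshold would be applied to segment-level statistics rather than raw bins, and deletions would need a second, lower cut; both are left out of this sketch.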
22. Response to correspondence on "B chromosomes of multiple species have intense evolutionary dynamics and accumulated genes related to important biological processes".
- Author
-
Ahmad, Syed F., Valente, Guilherme T., and Martins, Cesar
- Subjects
GENES ,SPECIES ,KARYOTYPES ,DATA analysis ,CHROMOSOMES - Abstract
This document is a response to concerns raised about a previous article on B chromosomes and their evolutionary dynamics. The authors acknowledge the valid concerns and provide corrections and clarifications. They address issues with the availability of sequenced data and inaccuracies in supplementary figures and tables. The authors also discuss errors and corrections in a study on B chromosomes in different species, including mistakes in referencing, figure citation, and data analysis. They emphasize the need for standardized protocols and bioinformatic tools to improve the study of B chromosomes and acknowledge the challenges and gaps in understanding their genomic content. [Extracted from the article]
- Published
- 2023
- Full Text
- View/download PDF
23. Complete genome sequencing and comparison of two nitrogen-metabolizing bacteria isolated from Antarctic deep-sea sediment.
- Author
-
Liu, Wenqi, Cong, Bailin, Lin, Jing, Zhao, Linlin, and Liu, Shenghao
- Abstract
Background: Bacteria are an essential component of the Earth's biota and affect the cycling of matter through their metabolic activity. They also play an important role in the carbon and nitrogen cycle in the deep-sea environment. In this paper, two strains from deep-sea sediments were investigated in order to understand nitrogen cycling in the deep-sea environment. Results: In this paper, the basic genomic information of two strains was obtained by whole genome sequencing. The Cobetia amphilecti N-80 and Halomonas profundus 13 genome sizes are 4,160,095 bp with a GC content of 62.5% and 5,251,450 bp with a GC content of 54.84%, respectively. Through a comparison of functional analyses, we predicted the possible C and N metabolic pathways of the two strains and determined that Halomonas profundus 13 could use more carbon sources than Cobetia amphilecti N-80. The main genes associated with N metabolism in Halomonas profundus 13 are narG, narY, narI, nirS, norB, norC, nosZ, and nirD. In contrast, nirD, using NH4+ for energy, plays a main role in Cobetia amphilecti N-80. Both of them have the same genes for fixing inorganic carbon: icd, ppc, fdhA, accC, accB, accD, and accA. Conclusion: In this study, the whole genomes of two strains were sequenced to clarify the basic characteristics of their genomes, laying the foundation for further studying nitrogen-metabolizing bacteria. Halomonas profundus 13 can utilize more carbon sources than Cobetia amphilecti N-80, as indicated by API as well as COG and KEGG prediction results. Finally, through the analysis of the nitrification and denitrification abilities as well as the inorganic carbon fixation ability of the two strains, the related genes were identified, and the possible metabolic pathways were predicted. Together, these results provide molecular markers and theoretical support for the mechanisms of inorganic carbon fixation by deep-sea microorganisms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
24. Retraction Note: TaWRKY40 transcription factor positively regulate the expression of TaGAPC1 to enhance drought tolerance.
- Author
-
Zhang, Lin, Xu, Zhiyong, Ji, Haikun, Zhou, Ye, and Yang, Shushen
- Subjects
DROUGHT tolerance ,TRANSCRIPTION factors ,DROUGHTS ,DROUGHT management - Abstract
RETRACTED ARTICLE: The specific MYB binding sites bound by TaMYB in the GAPCp2/3 promoters are involved in the drought stress response in wheat. 5A (WT; drought 25d and 7d after re-watering) of their BMC Plant Biology paper [2]. [Extracted from the article]
- Published
- 2023
- Full Text
- View/download PDF
25. Long-term TE persistence even without beneficial insertion.
- Author
-
Kremer, Stefan C., Linquist, Stefan, Saylor, Brent, Elliott, Tyler A., Gregory, T. Ryan, and Cottenie, Karl
- Subjects
GENOMICS ,CRITICISM ,COMPREHENSION - Abstract
This correspondence responds to the critique by Butler et al. (BMC Genomics 22:241, 2021) of our recent paper on transposable element (TE) persistence. We address the three main objections raised by Butler et al. After running a series of additional simulations that were inspired by the authors' criticisms, we are able to present a more nuanced understanding of the conditions that generate long-term persistence. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
26. ETGPDA: identification of piRNA-disease associations based on embedding transformation graph convolutional network
- Author
-
Meng, Xianghan, Shang, Junliang, Ge, Daohui, Yang, Yi, Zhang, Tongdui, and Liu, Jin-Xing
- Published
- 2023
- Full Text
- View/download PDF
27. Small open reading frames: a comparative genetics approach to validation
- Author
-
Jain, Niyati, Richter, Felix, Adzhubei, Ivan, Sharp, Andrew J., and Gelb, Bruce D.
- Published
- 2023
- Full Text
- View/download PDF
28. A new and effective two-step clustering approach for single cell RNA sequencing data.
- Author
-
Li, Ruiyi, Guan, Jihong, Wang, Zhiye, and Zhou, Shuigeng
- Subjects
RNA sequencing ,HIERARCHICAL clustering (Cluster analysis) ,NATURAL immunity ,DRUG resistance ,CLUSTER analysis (Statistics) - Abstract
Background: The rapid development of single cell RNA sequencing (scRNA-seq) technology has led to huge amounts of scRNA-seq data, which greatly advance research in many biomedical fields involving tissue heterogeneity, pathogenesis of disease, drug resistance, etc. One major task in scRNA-seq data analysis is to cluster cells in terms of their expression characteristics. Up to now, a number of methods have been proposed to infer cell clusters, yet there is still much room to improve their performance. Results: In this paper, we develop a new two-step clustering approach to effectively cluster scRNA-seq data, which is called TSC (short for Two-Step Clustering). Specifically, we divide all cells into two types: core cells (those possibly lying around the centers of clusters) and non-core cells (those located in the boundary areas of clusters). We first cluster the core cells by hierarchical clustering (the first step) and then assign the non-core cells to their nearest clusters (the second step). Extensive experiments on 12 real scRNA-seq datasets show that TSC outperforms state-of-the-art methods. Conclusion: TSC is an effective clustering method due to its two-step clustering strategy, and it is a useful tool for scRNA-seq data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
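Note on entry 28: the two-step idea in the abstract (cluster core cells hierarchically, then attach non-core cells to the nearest cluster) can be sketched as follows. The abstract does not say how core cells are identified, which linkage is used, or how "nearest cluster" is measured, so the neighbour-count criterion, single linkage, and nearest-core-point rule here are all assumptions, as are the names `two_step_cluster`, `radius`, and `min_neighbors`. This is not the TSC implementation.

```python
import math

def two_step_cluster(points, k, radius, min_neighbors):
    """Return (labels, core_indices) for a toy two-step clustering."""
    n = len(points)
    # Assumed core criterion: at least `min_neighbors` other points within `radius`.
    core = [
        i for i in range(n)
        if sum(1 for j in range(n)
               if j != i and math.dist(points[i], points[j]) <= radius) >= min_neighbors
    ]
    if not core:
        raise ValueError("no core points; relax radius or min_neighbors")

    # Step 1: single-linkage agglomerative clustering of core points down to k clusters.
    clusters = [[i] for i in core]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = min(math.dist(points[i], points[j])
                        for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a].extend(clusters[b])
        del clusters[b]

    labels = [None] * n
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label

    # Step 2: each non-core point takes the label of its nearest core point.
    for i in range(n):
        if labels[i] is None:
            nearest = min(core, key=lambda j: math.dist(points[i], points[j]))
            labels[i] = labels[nearest]
    return labels, core

# Two tight groups of "cells" plus one boundary point in a 2-D embedding.
cells = [(0, 0), (0, 1), (1, 0), (1, 1),
         (10, 10), (10, 11), (11, 10), (11, 11),
         (5, 4)]
labels, core = two_step_cluster(cells, k=2, radius=1.5, min_neighbors=3)
```

Restricting the quadratic-cost hierarchical step to core cells is what keeps the boundary cells from distorting the cluster structure; real scRNA-seq data would first be reduced to a low-dimensional embedding.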
29. Prediction of lncRNA functions using deep neural networks based on multiple networks.
- Author
-
Deng, Lei, Ren, Shengli, and Zhang, Jingpu
- Subjects
ARTIFICIAL neural networks ,LINCRNA ,BIOLOGICAL databases ,GENE ontology - Abstract
Background: More and more studies show that lncRNAs are widely involved in various physiological processes of the organism. However, the functions of the vast majority of them remain unknown. In addition, data related to lncRNAs in biological databases are constantly increasing. Therefore, it is urgent to develop a computational method to make the most of these data. Results: In this paper, we propose a new computational method based on global heterogeneous networks to predict the functions of lncRNAs, called DNGRGO. DNGRGO first calculates the similarities among proteins, miRNAs, and lncRNAs, and annotates the functions of lncRNAs according to their similar protein-coding genes, which have been labeled with gene ontology (GO) terms. To evaluate the performance of DNGRGO, we manually annotated GO terms to lncRNAs and implemented our method on these data. Compared with the existing methods, the results of DNGRGO show superior predictive performance in terms of maximum F-measure and coverage. Conclusions: DNGRGO is able to annotate lncRNAs by capturing the low-dimensional features of the heterogeneous network. Moreover, the experimental results show that integrating miRNA data can help to improve the predictive performance of DNGRGO. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
30. Functional analysis of differentially expressed circular RNAs in sheep subcutaneous fat.
- Author
-
Liu, Tian-yi, Feng, Hui, Yousuf, Salsabeel, Xie, Ling-li, and Miao, Xiang-yang
- Subjects
CIRCULAR RNA ,FUNCTIONAL analysis ,ADIPOSE tissues ,SHEEP ,FAT ,AMP-activated protein kinases - Abstract
Background: Circular RNAs (circRNAs), as important non-coding RNAs (ncRNAs), are involved in many biological activities. However, the exact chemical mechanism behind fat accumulation is unknown. In this paper, we obtained the expression profiles of circRNAs using high-throughput sequencing and investigated their differential expression in subcutaneous fat tissue of Duolang and Small Tail Han sheep. Results: From the transcriptomic analysis, 141 differentially expressed circRNAs were identified, comprising 61 up-regulated circRNAs and 80 down-regulated circRNAs. Their host genes were primarily enriched in the MAPK and AMPK signaling pathways, which are closely associated with fat deposition regulation. We identified circRNA812, circRNA91, and circRNA388 as vital genes in fat deposition by miRNA-circRNA target gene prediction. The functional annotation results of target genes of key circRNAs showed that the signaling pathways mainly included PI3K-Akt and AMPK. We constructed the competing endogenous RNA (ceRNA) regulatory network to study the role of circRNAs in sheep lipid deposition, and circRNA812, circRNA91, and circRNA388 can adsorb more miRNAs. NC_040253.1_5757, as the source of miRNA response element (MRE) among the three, may play an important role during the process of sheep fat deposition. Conclusions: Our study gives a systematic examination of the circRNA profiles expressed in sheep subcutaneous fat. The results of this study provide a new basis for understanding circRNA function and sheep fat metabolism. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. The NIH Comparative Genomics Resource: addressing the promises and challenges of comparative genomics on human health.
- Author
-
Bornstein, Kristin, Gryan, Gary, Chang, E. Sally, Marchler-Bauer, Aron, and Schneider, Valerie A.
- Subjects
COMPARATIVE genomics ,BIOLOGICAL evolution ,GENOMICS ,ZOONOSES ,DRUG target ,SPANNING trees - Abstract
Comparative genomics is the comparison of genetic information within and across organisms to understand the evolution, structure, and function of genes, proteins, and non-coding regions (Sivashankari and Shanmughavel, Bioinformation 1:376-8, 2007). Advances in sequencing technology and assembly algorithms have resulted in the ability to sequence large genomes and provided a wealth of data that are being used in comparative genomic analyses. Comparative analysis can be leveraged to systematically explore and evaluate the biological relationships and evolution between species, aid in understanding the structure and function of genes, and gain a better understanding of disease and potential drug targets. As our knowledge of genetics expands, comparative genomics can help identify emerging model organisms among a broader span of the tree of life, positively impacting human health. This impact includes, but is not limited to, zoonotic disease research, therapeutics development, microbiome research, xenotransplantation, oncology, and toxicology. Despite advancements in comparative genomics, new challenges have arisen around the quantity, quality assurance, annotation, and interoperability of genomic data and metadata. New tools and approaches are required to meet these challenges and fulfill the needs of researchers. This paper focuses on how the National Institutes of Health (NIH) Comparative Genomics Resource (CGR) can address both the opportunities for comparative genomics to further impact human health and confront an increasingly complex set of challenges facing researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
32. Multiomics comparative analysis of the maize large grain mutant tc19 identified pathways related to kernel development.
- Author
-
Cai, Qing, Jiao, Fuchao, Wang, Qianqian, Zhang, Enying, Song, Xiyun, Pei, Yuhe, Li, Jun, Zhao, Meiai, and Guo, Xinmei
- Subjects
MULTIOMICS ,CORN breeding ,COMPARATIVE studies ,GRAIN yields ,GRAIN ,PHENYLPROPANOIDS ,CORN - Abstract
Background: The mechanism of grain development in elite maize breeding lines has not been fully elucidated. Grain length, grain width and grain weight are key components of maize grain yield. Previously, using the Chinese elite maize breeding line Chang7-2 and its large grain mutant tc19, we characterized the grain size developmental difference between Chang7-2 and tc19 and performed transcriptomic analysis. Results: In this paper, using Chang7-2 and tc19, we performed comparative transcriptomic, proteomic and metabolomic analyses at different grain development stages. Through proteomics analyses, we found 2884, 505 and 126 differentially expressed proteins (DEPs) at 14, 21 and 28 days after pollination, respectively. Through metabolomics analysis, we identified 51, 32 and 36 differentially accumulated metabolites (DAMs) at 14, 21 and 28 days after pollination, respectively. Through multiomics comparative analysis, we showed that the phenylpropanoid pathways are influenced at transcriptomic, proteomic and metabolomic levels in all the three grain developmental stages. Conclusion: We identified several genes in phenylpropanoid biosynthesis, which may be related to the large grain phenotype of tc19. In summary, our results provided new insights into maize grain development. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. SDNN-PPI: self-attention with deep neural network effect on protein-protein interaction prediction.
- Author
-
Li, Xue, Han, Peifu, Wang, Gan, Chen, Wenqi, Wang, Shuang, and Song, Tao
- Subjects
ARTIFICIAL neural networks ,NETWORK effect ,PROTEIN-protein interactions ,MICE ,DEEP learning ,GENETIC transcription regulation ,CAENORHABDITIS - Abstract
Background: Protein-protein interactions (PPIs) dominate intracellular molecules to perform a series of tasks such as transcriptional regulation, information transduction, and drug signalling. The traditional wet experiment method to obtain PPI information is costly and time-consuming. Result: In this paper, SDNN-PPI, a PPI prediction method based on self-attention and deep learning, is proposed. The method adopts amino acid composition (AAC), conjoint triad (CT), and auto covariance (AC) to extract global and local features of protein sequences, and leverages self-attention to enhance DNN feature extraction to more effectively accomplish the prediction of PPIs. In order to verify the generalization ability of SDNN-PPI, 5-fold cross-validation on the intraspecific interaction datasets of Saccharomyces cerevisiae (core subset) and human is used to measure our model, in which the accuracy reaches 95.48% and 98.94%, respectively. Accuracies of 93.15% and 88.33% are obtained on the interspecific interaction datasets of human-Bacillus anthracis and human-Yersinia pestis, respectively. On the independent datasets for Caenorhabditis elegans, Escherichia coli, Homo sapiens, and Mus musculus, the prediction accuracy is 100% in every case, which is higher than that of previous PPI prediction methods. To further evaluate the advantages and disadvantages of the model, the one-core and crossover networks are used to predict PPIs, and the data show that the model correctly predicts the interaction pairs in the networks. Conclusion: In this paper, AAC, CT and AC methods are used to encode the sequence, and the SDNN-PPI method is proposed to predict PPIs based on a self-attention deep learning neural network. Satisfactory results are obtained on interspecific and intraspecific data sets, and good performance is also achieved in cross-species prediction. It can also correctly predict the protein interaction of cell and tumor information contained in the one-core network and crossover network. The SDNN-PPI proposed in this paper not only explores the mechanism of protein-protein interaction, but also provides new ideas for drug design and disease prevention. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
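Note on entry 33: of the three sequence encodings named in the abstract, amino acid composition (AAC) is the simplest and can be sketched directly: it is the fraction of each of the 20 standard amino acids in a sequence. The function names `amino_acid_composition` and `pair_features`, the choice to skip non-standard residues, and the concatenation of the two vectors are assumptions for illustration; the conjoint triad, auto covariance, and self-attention parts of SDNN-PPI are not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

def amino_acid_composition(sequence):
    """Return a 20-dimensional vector of residue fractions (AAC)."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard amino acids")
    n = len(residues)
    return [residues.count(aa) / n for aa in AMINO_ACIDS]

def pair_features(seq_a, seq_b):
    """Assumed pair encoding: concatenate the AAC vectors of both proteins."""
    return amino_acid_composition(seq_a) + amino_acid_composition(seq_b)

# Toy peptide fragments, not real proteins.
features = pair_features("ACCA", "MKVLAAGIVGL")
```

AAC discards residue order entirely, which is why order-aware encodings such as conjoint triad and auto covariance are typically combined with it.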
34. The International Conference on Intelligent Biology and Medicine (ICIBM) 2020: Scalable techniques and algorithms for computational genomics.
- Author
-
Zhang, Wei, Zhao, Zhongming, Wang, Kai, Shen, Li, and Shi, Xinghua
- Subjects
HORIZONTAL gene transfer ,COMPUTATIONAL biology ,GENOMICS ,BIOLOGY ,CONFERENCES & conventions ,MEDICAL informatics - Abstract
In this introduction article, we summarize the 2020 International Conference on Intelligent Biology and Medicine (ICIBM 2020), which was held on August 9–10, 2020 (virtual conference). We then briefly describe the nine research articles included in this supplement issue. ICIBM 2020 hosted four scientific sections covering current topics in bioinformatics, computational biology, genomics, biomedical informatics, among others. A total of 75 original manuscripts were submitted to ICIBM 2020. All the papers underwent rigorous review (at least three reviewers), and highly ranked manuscripts were selected for oral presentation and supplement issues. This genomics supplement issue included nine manuscripts. These articles cover methods and applications for single cell RNA sequencing, multi-omics data integration for gene regulation, gene fusion detection from long-read RNA sequencing, gene co-expression analysis of metabolic pathways in cancer, integrative genome-wide association studies (GWAS) of subcortical imaging phenotype in Alzheimer's disease, as well as deep learning methods for protein structure prediction, metabolic pathway membership inference, and horizontal gene transfer (HGT) insertion sites prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
35. Genome-wide association study reveals putative role of gga-miR-15a in controlling feed conversion ratio in layer chickens
- Author
-
Ning Yang, Jingwei Yuan, Sirui Chen, Congjiao Sun, Shi Fengying, Guiqin Wu, and Aiqiao Liu
- Subjects
0301 basic medicine ,Genome-wide association study ,Linkage disequilibrium ,lcsh:QH426-470 ,Animal feed ,lcsh:Biotechnology ,Feed efficiency ,Single-nucleotide polymorphism ,Biology ,Feed conversion ratio ,Linkage Disequilibrium ,Gga-miR-15a ,Late laying period ,03 medical and health sciences ,lcsh:TP248.13-248.65 ,Genetics ,Animals ,Original Paper ,0402 animal and dairy science ,04 agricultural and veterinary sciences ,Heritability ,Animal husbandry ,Animal Feed ,040201 dairy & animal science ,lcsh:Genetics ,MicroRNAs ,Phenotype ,030104 developmental biology ,Gene Expression Regulation ,Residual feed intake ,Chickens ,Biotechnology - Abstract
Background Efficient use of feed resources for farm animals is a critical concern in animal husbandry. Numerous genetic and nutritional studies have been conducted to investigate feed efficiency during the regular laying cycle of chickens. However, by prolonging the laying period of layers, the performance of feed utilization in the late-laying period becomes increasingly important. In the present study, we measured daily feed intake (FI), residual feed intake (RFI) and feed conversion ratio (FCR) of 808 hens during 81–82 weeks of age to evaluate genetic properties and then used a genome-wide association study (GWAS) to reveal the genetic determinants. Results The heritability estimates for the investigated traits were medium and between 0.15 and 0.28 in both pedigree- and genomic-based estimates, whereas the genetic correlations among these traits were high and ranged from 0.49 to 0.90. Three genome-wide significant SNPs located on chromosome 1 (GGA1) were detected for FCR. Linkage disequilibrium (LD) and conditional GWA analysis indicated that these 3 SNPs were highly correlated with one another, located at 13.55–45.16 Kb upstream of gga-miR-15a. Results of quantitative real-time polymerase chain reaction (qRT-PCR) analysis in liver tissue showed that the expression of gga-miR-15a was significantly higher in the high FCR birds than that in the medium or low FCR birds. Bioinformatics analysis further revealed that gga-mir-15a could act on many target genes, such as forkhead box O1 (FOXO1) that is involved in the insulin-signaling pathway, which influences nutrient metabolism in many organisms. Additionally, some suggestively significant variants, located on GGA3 and GGA9, were identified to associate with FI and RFI. Conclusions This GWA analysis was conducted on feed intake and efficiency traits for chickens and was innovative for application in the late laying period. 
Our findings can be used as a reference in the genomic breeding programs for increasing the efficiency performance of old hens and to improve our understanding of the molecular determinants for feed efficiency. Electronic supplementary material The online version of this article (10.1186/s12864-017-4092-9) contains supplementary material, which is available to authorized users.
- Published
- 2017
36. normGAM: an R package to remove systematic biases in genome architecture mapping data.
- Author
-
Liu, Tong and Wang, Zheng
- Subjects
GENE mapping ,DATA mapping ,FLUORESCENCE in situ hybridization ,RESTRICTION fragment length polymorphisms ,LINKAGE disequilibrium ,SOURCE code - Abstract
Background: The genome architecture mapping (GAM) technique can capture genome-wide chromatin interactions. However, besides the known systematic biases in the raw GAM data, we have found a new type of systematic bias. It is necessary to develop and evaluate effective normalization methods to remove all systematic biases in the raw GAM data. Results: We have detected a new type of systematic bias, the fragment length bias, in the genome architecture mapping (GAM) data, which is significantly different from the bias of window detection frequency previously mentioned in the paper introducing the GAM method but is similar to the bias of distances between restriction sites existing in raw Hi-C data. We have found that the normalization method (a normalized variant of the linkage disequilibrium) used in the GAM paper is not able to effectively eliminate the new fragment length bias at 1 Mb resolution (slightly better at 30 kb resolution). We have developed an R package named normGAM for eliminating the new fragment length bias together with the other three biases existing in raw GAM data, which are the biases related to window detection frequency, mappability, and GC content. Five normalization methods have been implemented and included in the R package including Knight-Ruiz 2-norm (KR2, newly designed by us), normalized linkage disequilibrium (NLD), vanilla coverage (VC), sequential component normalization (SCN), and iterative correction and eigenvector decomposition (ICE). Conclusions: Based on our evaluations, the five normalization methods can eliminate the four biases existing in raw GAM data, with VC and KR2 performing better than the others. We have observed that the KR2-normalized GAM data have a higher correlation with the KR-normalized Hi-C data on the same cell samples indicating that the KR-related methods are better than the others for keeping the consistency between the GAM and Hi-C experiments. 
Compared with the raw GAM data, the normalized GAM data are more consistent with the normalized distances from the fluorescence in situ hybridization (FISH) experiments. The source code of normGAM can be freely downloaded from http://dna.cs.miami.edu/normGAM/. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
37. Explore potential disease related metabolites based on latent factor model.
- Author
-
Yongtian Wang, Liran Juan, Jiajie Peng, Tao Wang, Tianyi Zang, and Yadong Wang
- Subjects
METABOLITES ,MATRIX decomposition ,DECOMPOSITION method ,METABOLIC models ,LATENT infection ,FRUIT rots - Abstract
Background: In biological systems, metabolomics can not only contribute to the discovery of metabolic signatures for disease diagnosis, but is very helpful to illustrate the underlying molecular disease-causing mechanism. Therefore, identification of disease-related metabolites is of great significance for comprehensively understanding the pathogenesis of diseases and improving clinical medicine. Results: In the paper, we propose a disease and literature driven metabolism prediction model (DLMPM) to identify the potential associations between metabolites and diseases based on latent factor model. We build the disease glossary with disease terms from different databases and an association matrix based on the mapping between diseases and metabolites. The similarity of diseases and metabolites is used to complete the association matrix. Finally, we predict potential associations between metabolites and diseases based on the matrix decomposition method. In total, 1,406 direct associations between diseases and metabolites are found. There are 119,206 unknown associations between diseases and metabolites predicted with a coverage rate of 80.88%. Subsequently, we extract training sets and testing sets based on data increment from the database of disease-related metabolites and assess the performance of DLMPM on 19 diseases. As a result, DLMPM is proven to be successful in predicting potential metabolic signatures for human diseases with an average AUC value of 82.33%. Conclusion: In this paper, a computational model is proposed for exploring metabolite-disease pairs and has good performance in predicting potential metabolites related to diseases through adequate validation. The results show that DLMPM has a better performance in prioritizing candidate diseases-related metabolites compared with the previous methods and would be helpful for researchers to reveal more information about human diseases. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
38. Expression profile and bioinformatics analysis of circRNA and its associated ceRNA networks in longissimus dorsi from Lufeng cattle and Leiqiong cattle.
- Author
-
Yang, Chuang, Wu, Longfei, Guo, Yongqing, Li, Yaokun, Deng, Ming, Liu, Dewu, Liu, Guangbin, and Sun, Baoli
- Subjects
GENE expression ,ERECTOR spinae muscles ,CIRCULAR RNA ,RNA splicing ,RNA sequencing ,CATTLE breeds ,CATTLE breeding - Abstract
This paper aims to explore the role of circRNA expression profiles and circRNA-associated ceRNA networks in the regulation of myogenesis in the longissimus dorsi of cattle breeds surviving under subtropical conditions in southern China by RNA sequencing and bioinformatics analysis. It also aims to provide comprehensive understanding of the differences in muscle fibers in subtropical cattle breeds and to expand the knowledge of the molecular networks that regulate myogenesis. With regard to meat quality indicators, results showed that the longissimus dorsi of LQC had lower pH (P < 0.0001), lower redness (P < 0.01), lower shear force (P < 0.05), and higher brightness (P < 0.05) than the longissimus dorsi of LFC. With regard to muscle fiber characteristics, the longissimus dorsi of LQC had a smaller diameter (P < 0.0001) and higher density of muscle fibers (P < 0.05). The analysis results show that the function of many circRNA-targeted mRNAs was related to myogenesis and metabolic regulation. Furthermore, in the analysis of the function of circRNA source genes, we hypothesized that btacirc_00497 and btacirc_034497 may regulate the function and type of myofibrils by affecting the expression of MYH6, MYH7, and NEB through competitive linear splicing. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
39. iEnhancer-DCSA: identifying enhancers via dual-scale convolution and spatial attention.
- Author
-
Wang, Wenjun, Wu, Qingyao, and Li, Chunshan
- Subjects
DEEP learning ,TEST methods - Abstract
Background: Due to the dynamic nature of enhancers, identifying enhancers and their strength are major bioinformatics challenges. With the development of deep learning, several models have facilitated enhancers detection in recent years. However, existing studies either neglect different length motifs information or treat the features at all spatial locations equally. How to effectively use multi-scale motifs information while ignoring irrelevant information is a question worthy of serious consideration. In this paper, we propose an accurate and stable predictor iEnhancer-DCSA, mainly composed of dual-scale fusion and spatial attention, automatically extracting features of different length motifs and selectively focusing on the important features. Results: Our experimental results demonstrate that iEnhancer-DCSA is remarkably superior to existing state-of-the-art methods on the test dataset. Especially, the accuracy and MCC of enhancer identification are improved by 3.45% and 9.41%, respectively. Meanwhile, the accuracy and MCC of enhancer classification are improved by 7.65% and 18.1%, respectively. Furthermore, we conduct ablation studies to demonstrate the effectiveness of dual-scale fusion and spatial attention. Conclusions: iEnhancer-DCSA will be a valuable computational tool in identifying and classifying enhancers, especially for those not included in the training dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
40. CNARA: reliability assessment for genomic copy number profiles
- Author
-
Haoyang Cai, Michael Baudis, Caius Solovan, Ni Ai, University of Zurich, and Ai, Ni
- Subjects
0301 basic medicine ,DNA Copy Number Variations ,Copy number analysis ,Reliability assessment ,Gene Dosage ,Genomics ,Computational biology ,Biology ,Web Browser ,Gene dosage ,Genome ,03 medical and health sciences ,1311 Genetics ,Genetics ,Preprocessor ,Profiling (information science) ,Computer Simulation ,Microarray platform ,Original Paper ,Computational Biology ,Reproducibility of Results ,CNA ,10124 Institute of Molecular Life Sciences ,030104 developmental biology ,1305 Biotechnology ,570 Life sciences ,biology ,Copy number profile ,DNA microarray ,Algorithms ,Biotechnology - Abstract
Background DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. Besides, large derivative log ratio spread (DLRS) of the probes correlating with poor DNA quality is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly. Results Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform and may be adapted for those sequencing platforms from which copy number estimates could be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA. Conclusions We have developed a method for the assessment of genomic copy number profiling data, and suggest to apply the method in addition to and after other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and support the reliable functional attribution of copy number aberrations especially in cancer research. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-3074-7) contains supplementary material, which is available to authorized users.
- Published
- 2016
41. Metagenome-mining indicates an association between bacteriocin presence and strain diversity in the infant gut.
- Author
-
Ormaasen, Ida, Rudi, Knut, Diep, Dzung B., and Snipen, Lars
- Subjects
INFANTS ,GUT microbiome ,HUMAN microbiota ,ROLE conflict ,ANTIMICROBIAL peptides - Abstract
Background: Our knowledge about the ecological role of bacterial antimicrobial peptides (bacteriocins) in the human gut is limited, particularly in relation to their role in the diversification of the gut microbiota during early life. The aim of this paper was therefore to address associations between bacteriocins and bacterial diversity in the human gut microbiota. To investigate this, we did an extensive screening of 2564 healthy human gut metagenomes for the presence of predicted bacteriocin-encoding genes, comparing bacteriocin gene presence to strain diversity and age. Results: We found that the abundance of bacteriocin genes was significantly higher in infant-like metagenomes (< 2 years) compared to adult-like metagenomes (2–107 years). By comparing infant-like metagenomes with and without a given bacteriocin, we found that bacteriocin presence was associated with increased strain diversities. Conclusions: Our findings indicate that bacteriocins may play a role in the strain diversification during the infant gut microbiota establishment. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
42. Two simple methods to improve the accuracy of the genomic selection methodology.
- Author
-
Montesinos-López, Osval A., Kismiantini, and Montesinos-López, Abelardo
- Abstract
Background: Genomic selection (GS) is revolutionizing plant and animal breeding. However, its practical implementation is still challenging, since it is affected by many factors that, when not under control, make the methodology ineffective. Also, because it is generally formulated as a regression problem, it has low sensitivity for selecting the best candidate individuals, since a top percentage is selected according to a ranking of predicted breeding values. Results: For this reason, in this paper we propose two methods to improve the prediction accuracy of this methodology. One of the methods consists in reformulating the GS methodology (nowadays formulated as a regression problem) as a binary classification problem. The other consists only of a postprocessing step that adjusts the threshold used for classification of the lines predicted in the original scale (continuous scale) to guarantee similar sensitivity and specificity. The postprocessing method is applied to the predictions obtained from the conventional regression model. Both methods assume that a threshold is defined in advance to divide the training data into top lines and non-top lines, and this threshold can be decided in terms of a quantile (for example 80%, 90%, etc.) or as the average (or maximum) of the performance of the checks. In the reformulation method, lines in the training set that are equal to or larger than the specified threshold are labeled as one, and as zero otherwise. Then we train a binary classification model with the conventional inputs, but using the binary response variable in place of the continuous response variable. The binary classifier should be trained so as to yield more similar sensitivity and specificity, and thus a reasonable probability of classifying the top lines correctly. 
Conclusions: We evaluated the proposed models in seven data sets and found that the two proposed methods outperformed the conventional regression model by a large margin (by 402.9% in terms of sensitivity, by 110.04% in terms of F1 score and by 70.96% in terms of Kappa coefficient, with the postprocessing methods). However, between the two proposed methods, the postprocessing method was better than the reformulation as a binary classification model. The simple postprocessing method improves the accuracy of conventional genomic regression models and avoids the need to reformulate them as binary classification models, with similar or better performance, significantly improving the selection of the top candidate lines. In general, both proposed methods are simple and can easily be adopted in practical breeding programs, with the guarantee that they will significantly improve the selection of the top candidate lines. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
43. Genetic diversity and signatures of selection in BoHuai goat revealed by whole-genome sequencing.
- Author
-
Yao, Zhi, Zhang, Shunjin, Wang, Xianwei, Guo, Yingwei, Xin, Xiaoling, Zhang, Zijing, Xu, Zejun, Wang, Eryao, Jiang, Yu, and Huang, Yongzhen
- Subjects
GENETIC variation ,GOAT breeds ,GOATS ,BREEDING ,NATURAL immunity ,LIPID metabolism ,ANIMAL breeding - Abstract
Background: Cross breeding is an important way to improve livestock performance. As an important livestock and poultry resource in Henan Province of China, the BoHuai goat was formed by crossing the Boer goat and the Huai goat. After more than 20 years of breeding, BoHuai goats show many advantages, such as fast growth, good reproductive performance, and high meat yield. In order to better develop and protect BoHuai goats, we sequenced the whole genomes of 30 BoHuai goats and 5 Huai goats to analyze the genetic diversity, population structure and genomic regions under selection in the BoHuai goat. Furthermore, we used 126 published genomes of goats from around the world to characterize the genomic variation of the BoHuai goat. Results: The results showed that the nucleotide diversity of BoHuai goats was lower and the degree of linkage disequilibrium was higher than in other breeds. The analysis of population structure showed that BoHuai goats differ clearly from other goat breeds. In addition, the BoHuai goat is more closely related to the Boer goat than to the Huai goat and is highly similar to the Boer goat. In the selection signal analysis of the BoHuai goat, we found that one region on chromosome 7 shows a very strong selection signal, which suggests that it may well be a region under intense artificial selection. Through selective sweeps, we detected some genes related to important traits such as lipid metabolism (LDLR, STAR, ANGPTL8), fertility (STAR), and disease resistance (CD274, DHPS, PDCD1LG2). Conclusion: In this paper, we elucidated the genomic variation, ancestry composition, and selective signals related to important economic traits in BoHuai goats. Our studies on the genome of BoHuai goats will not only help to understand the characteristics of the crossbred but also provide a basis for the improvement of cross-breeding programs. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Mitochondrial genome plasticity of mammalian species
- Author
-
Bálint Biró, Zoltán Gál, Zsófia Fekete, Eszter Klecska, and Orsolya Ivett Hoffmann
- Subjects
NUMT ,Mammals ,Genome ,Bioinformatics ,Machine learning ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Abstract There is an ongoing process in which mitochondrial sequences are being integrated into the nuclear genome. The importance of these sequences has already been revealed in cancer biology, forensics, phylogenetic studies and in the evolution of the eukaryotic genetic information. The genomes of human and numerous model organisms have been described from the point of view of those sequences. Furthermore, recent studies were published on the patterns of these nuclear localised mitochondrial sequences in different taxa. However, the results of the previously released studies are difficult to compare due to the lack of standardised methods and/or the use of only a small number of genomes. Therefore, in this paper our primary goal is to establish a uniform mining pipeline to explore these nuclear localised mitochondrial sequences. Our results show that the frequency of several repetitive elements is higher in the flanking regions of these sequences than expected. A machine learning model reveals that the flanking regions’ repetitive elements and different structural characteristics are highly influential during the integration process. In this paper, we introduce a general mining pipeline for all mammalian genomes. The workflow is publicly available and is believed to serve as a validated baseline for future research in this field. On what is, to our current knowledge, the largest dataset, we confirm the widespread opinion that structural circumstances and events corresponding to repetitive elements are highly significant. An accurate model has also been trained to predict these sequences and their corresponding flanking regions.
- Published
- 2024
- Full Text
- View/download PDF
45. The International Conference on Intelligent Biology and Medicine (ICIBM) 2018: genomics with bigger data and wider applications.
- Author
-
Wu, Zhijin, Yan, Jingwen, Wang, Kai, Liu, Xiaoming, Guo, Yan, Zhi, Degui, Ruan, Jianhua, and Zhao, Zhongming
- Subjects
BIOLOGY conferences ,MEDICAL conferences ,GENOMES ,ARTIFICIAL intelligence ,INDIVIDUALIZED medicine - Abstract
The sixth International Conference on Intelligent Biology and Medicine (ICIBM) took place in Los Angeles, California, USA on June 10–12, 2018. This conference featured eleven regular scientific sessions, four tutorials, one poster session, four keynote talks, and four eminent scholar talks. The scientific program covered a wide range of topics from bench to bedside, including 3D Genome Organization, reconstruction of large scale evolution of genomes and gene functions, artificial intelligence in biological and biomedical fields, and precision medicine. Both method development and application in genomic research continued to be a main component in the conference, including studies on genetic variants, regulation of transcription, genetic-epigenetic interaction at both single cell and tissue level and artificial intelligence. Here, we write a summary of the conference and also briefly introduce the four high quality papers selected to be published in BMC Genomics that cover novel methodology development or innovative data analysis. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
46. DNAscent v2: detecting replication forks in nanopore sequencing data with deep learning.
- Author
-
Boemo, Michael A.
- Subjects
GENOMICS ,DNA analysis ,DNA sequencing ,DNA repair ,SEQUENCE alignment ,DEEP learning ,DNA replication - Abstract
Background: Measuring DNA replication dynamics with high throughput and single-molecule resolution is critical for understanding both the basic biology behind how cells replicate their DNA and how DNA replication can be used as a therapeutic target for diseases like cancer. In recent years, the detection of base analogues in Oxford Nanopore Technologies (ONT) sequencing reads has become a promising new method to supersede existing single-molecule methods such as DNA fibre analysis: ONT sequencing yields long reads with high throughput, and sequenced molecules can be mapped to the genome using standard sequence alignment software. Results: This paper introduces DNAscent v2, software that uses a residual neural network to achieve fast, accurate detection of the thymidine analogue BrdU with single-nucleotide resolution. DNAscent v2 also comes equipped with an autoencoder that interprets the pattern of BrdU incorporation on each ONT-sequenced molecule into replication fork direction to call the location of replication origins and termination sites. DNAscent v2 surpasses previous versions of DNAscent in BrdU calling accuracy, origin calling accuracy, speed, and versatility across different experimental protocols. Unlike NanoMod, DNAscent v2 positively identifies BrdU without the need for sequencing unmodified DNA. Unlike RepNano, DNAscent v2 calls BrdU with single-nucleotide resolution and detects more origins than RepNano from the same sequencing data. DNAscent v2 is open-source and available at https://github.com/MBoemo/DNAscent. Conclusions: This paper shows that DNAscent v2 is the new state-of-the-art in the high-throughput, single-molecule detection of replication fork dynamics. These improvements in DNAscent v2 mark an important step towards measuring DNA replication dynamics in large genomes with single-molecule resolution. 
Looking forward, the increase in accuracy in single-nucleotide resolution BrdU calls will also allow DNAscent v2 to branch out into other areas of genome stability research, particularly the detection of DNA repair. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
47. Transcriptome sequencing and gene expression analysis revealed early ovule abortion of Paeonia ludlowii.
- Author
-
Chen, Ting-qiao, Sun, Yue, and Yuan, Tao
- Subjects
OVULES ,ABORTION ,GENE expression ,TRANSCRIPTOMES ,ENDANGERED species ,PEONIES - Abstract
Background: Paeonia ludlowii (Stern & G. Taylor D.Y. Hong) belongs to the peony group of the genus Paeonia in the Paeoniaceae family and is now classified as a "critically endangered species" in China. Reproduction is important for this species, and its low fruiting rate has become a critical factor limiting both the expansion of its wild population and its domestic cultivation. Results: In this study, we investigated possible causes of the low fruiting rate and ovule abortion in Paeonia ludlowii. We clarified the characteristics of ovule abortion and the specific time of abortion in Paeonia ludlowii, and used transcriptome sequencing to investigate the mechanism of abortion of ovules in Paeonia ludlowii. Conclusions: In this paper, the ovule abortion characteristics of Paeonia ludlowii were systematically studied for the first time and provide a theoretical basis for the optimal breeding and future cultivation of Paeonia ludlowii. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Computational dissection of genetic variation modulating the response of multiple photosynthetic phenotypes to the light environment
- Author
-
Huiying Gong, Ziyang Zhou, Chenhao Bu, Deqiang Zhang, Qing Fang, Xiao-Yu Zhang, and Yuepeng Song
- Subjects
Photosynthesis ,Electron transport rate ,Photochemical quenching ,Nonphotochemical quenching ,Light environment ,Genetic variation ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Abstract Background The expression of biological traits is modulated by genetics as well as the environment, and the level of influence exerted by the latter may vary across characteristics. Photosynthetic traits in plants are complex quantitative traits that are regulated by both endogenous genetic factors and external environmental factors such as light intensity and CO2 concentration. The specific processes impacted occur dynamically and continuously as the growth of plants changes. Although studies have been conducted to explore the genetic regulatory mechanisms of individual photosynthetic traits or to evaluate the effects of certain environmental variables on photosynthetic traits, the systematic impact of environmental variables on the dynamic process of integrated plant growth and development has not been fully elucidated. Results In this paper, we proposed a research framework to investigate the genetic mechanism of high-dimensional complex photosynthetic traits in response to the light environment at the genome level. We established a set of high-dimensional equations incorporating environmental regulators to integrate functional mapping and dynamic screening of gene‒environment complex systems to elucidate the process and pattern of intrinsic genetic regulatory mechanisms of three types of photosynthetic phenotypes of Populus simonii that varied with light intensity. Furthermore, a network structure was established to elucidate the crosstalk among significant QTLs that regulate photosynthetic phenotypic systems. Additionally, the detection of key QTLs governing the response of multiple phenotypes to the light environment, coupled with the intrinsic differences in genotype expression, provides valuable insights into the regulatory mechanisms that drive the transition of photosynthetic activity and photoprotection in the face of varying light intensity gradients. 
Conclusions This paper offers a comprehensive approach to unraveling the genetic architecture of multidimensional variations in photosynthetic phenotypes, considering the combined impact of integrated environmental factors from multiple perspectives.
- Published
- 2024
- Full Text
- View/download PDF
49. Time series-based hybrid ensemble learning model with multivariate multidimensional feature coding for DNA methylation prediction
- Author
-
Wu Yan, Li Tan, Li Mengshan, Zhou Weihong, Sheng Sheng, Wang Jun, and Wu Fu-an
- Subjects
DNA methylation ,Time sequences ,Feature coding ,Ensemble learning ,Biotechnology ,TP248.13-248.65 ,Genetics ,QH426-470 - Abstract
Abstract Background DNA methylation is a form of epigenetic modification that impacts gene expression without modifying the DNA sequence, thereby exerting control over gene function and cellular development. The prediction of DNA methylation is vital for understanding and exploring gene regulatory mechanisms. Currently, machine learning algorithms are primarily used for model construction. However, several challenges remain to be addressed, including limited prediction accuracy, constrained generalization capability, and insufficient learning capacity. Results In response to the aforementioned challenges, this paper leverages the similarities between DNA sequences and time series to introduce a time series-based hybrid ensemble learning model, called Multi2-Con-CAPSO-LSTM. The model utilizes multivariate and multidimensional encoding approach, combining three types of time series encodings with three kinds of genetic feature encodings, resulting in a total of nine types of feature encoding matrices. Convolutional Neural Networks are utilized to extract features from DNA sequences, including temporal, positional, physicochemical, and genetic information, thereby creating a comprehensive feature matrix. The Long Short-Term Memory model is then optimized using the Chaotic Accelerated Particle Swarm Optimization algorithm for predicting DNA methylation. Conclusions Through cross-validation experiments conducted on 17 species involving three types of DNA methylation (6 mA, 5hmC, and 4mC), the results demonstrate the robust predictive capabilities of the Multi2-Con-CAPSO-LSTM model in DNA methylation prediction across various types and species. Compared with other benchmark models, the Multi2-Con-CAPSO-LSTM model demonstrates significant advantages in sensitivity, specificity, accuracy, and correlation. 
The model proposed in this paper provides valuable insights and inspiration across various disciplines, including sequence alignment, genetic evolution, time series analysis, and structure–activity relationships.
- Published
- 2023
- Full Text
- View/download PDF
50. The reported colour formation mechanism in pitaya fruit through co-accumulation of anthocyanins and betalains is inconsistent and fails to establish the co-accumulation.
- Author
-
Khan, Mohammad Imtiyaj
- Subjects
ANTHOCYANINS ,BETALAINS ,COLOR of fruit ,AMINO compounds ,MOLECULAR structure ,BOTANICAL chemistry - Abstract
This cannot be reconciled with the betalain biosynthetic pathway, and also not supported by the metabolite profile provided in Table S9 in [[1]], i.e. RR pulp has more than 800 times total betacyanins than GW pulp, and RR peel has more than 5 times than GW peel. 3 [[1]] have lower or similar expressions in YW pulp and GW pulp compared to YW peel and GW peel. For example, RR peel has about ten times more total anthocyanins (all the differentially expressed anthocyanins taken together) than RR pulp (Table S9 in [[1]]), however, the ANS expression in both of them was not significantly different (Fig. Keywords: Anthocyanins; Betalains; Gene expression; Amaranthin; Gomphrenin-I Background The premise of the paper authored by Zhou et al. [[1]] published in BMC Genomics is that, in pitayas Hylocereus undatus (red peel-red pulp or RR; green peel-white pulp or GW) and H. megalanthus (yellow peel-white pulp or YW, also called Selenicereus megalanthus, http://legacy.tropicos.org/Name/50251405?tab=acceptednames, accessed on 14/11/2020) [[2]]), anthocyanins and betalains co-accumulate, and hence both contribute to peel and pulp colour formation. [Extracted from the article]
- Published
- 2022
- Full Text
- View/download PDF