5 results for "Zhao, Duojun"
Search Results
2. PacBio full‐length cDNA sequencing integrated with RNA‐seq reads drastically improves the discovery of splicing transcripts in rice.
- Author
- Zhang, Guoqiang; Sun, Min; Wang, Jianfeng; Lei, Meng; Li, Chenji; Zhao, Duojun; Huang, Jun; Li, Wenjie; Li, Shuangli; Li, Jing; Yang, Jin; Luo, Yingfeng; Hu, Songnian; and Zhang, Bing
- Subjects
- ANTISENSE DNA, NUCLEOTIDE sequence, RICE genetics, ALTERNATIVE RNA splicing, ERROR correction (Information theory)
- Abstract
SUMMARY: In eukaryotes, alternative splicing (AS) greatly expands the diversity of transcripts. However, it is challenging to accurately determine full‐length splicing isoforms. Recently, more studies have taken advantage of Pacific Biosciences (PacBio) long‐read sequencing to identify full‐length transcripts. Nevertheless, the high error rate of PacBio reads seriously offsets the advantages of long reads, especially for accurately identifying splicing junctions. To best capitalize on the features of long reads, we used Illumina RNA‐seq reads to improve PacBio circular consensus sequence (CCS) quality and to validate splicing patterns in the rice transcriptome. We evaluated the impact of CCS accuracy on the number and the validation rate of splicing isoforms, and integrated a comprehensive pipeline of splicing transcripts analysis by Iso‐Seq and RNA‐seq (STAIR) to identify the full‐length multi‐exon isoforms in the rice seedling transcriptome (Oryza sativa L. ssp. japonica). STAIR discovered 11 733 full‐length multi‐exon isoforms, 6599 more than the SMRT Portal RS_IsoSeq pipeline did. Of these splicing isoforms identified, 4453 (37.9%) were missed in assembled transcripts from RNA‐seq reads, and 5204 (44.4%), including 268 multi‐exon long non‐coding RNAs (lncRNAs), were not reported in the MSU_osa1r7 annotation. Some randomly selected unreported splicing junctions were verified by polymerase chain reaction (PCR) amplification. In addition, we investigated alternative polyadenylation (APA) events in transcripts and identified 829 major polyadenylation [poly(A)] site clusters (PACs). The analysis of splicing isoforms and APA events will facilitate the annotation of the rice genome and studies on the expression and polyadenylation of AS genes in different developmental stages or growth conditions of rice.
Significance statement: In this study, we integrated PacBio full‐length cDNA sequencing and RNA‐seq into a STAIR pipeline to improve the discovery of splicing isoforms in rice, and further investigated alternative polyadenylation events in transcripts. The analysis of splicing isoforms and alternative polyadenylation events will facilitate the annotation of the rice genome and the understanding of the expression of alternative splicing genes in rice. [ABSTRACT FROM AUTHOR]
- Published
- 2019
3. Genome sequence of the date palm Phoenix dactylifera L.
- Author
- Al-Mssallem IS, Hu S, Zhang X, Lin Q, Liu W, Tan J, Yu X, Liu J, Pan L, Zhang T, Yin Y, Xin C, Wu H, Zhang G, Ba Abdullah MM, Huang D, Fang Y, Alnakhli YO, Jia S, Yin A, Alhuzimi EM, Alsaihati BA, Al-Owayyed SA, Zhao D, Zhang S, Al-Otaibi NA, Sun G, Majrashi MA, Li F, Tala, Wang J, Yun Q, Alnassar NA, Wang L, Yang M, Al-Jelaify RF, Liu K, Gao S, Chen K, Alkhaldi SR, Liu G, Zhang M, Guo H, and Yu J
- Subjects
- Base Sequence, Carbohydrate Metabolism genetics, Chromosomes, Plant genetics, Gene Duplication genetics, Gene Expression Profiling, Gene Expression Regulation, Plant, Genes, Plant genetics, Molecular Sequence Annotation, Multigene Family genetics, Phylogeny, Polymorphism, Single Nucleotide genetics, Reproducibility of Results, Synteny genetics, Arecaceae genetics, Genome, Plant genetics
- Abstract
Date palm (Phoenix dactylifera L.) is a cultivated woody plant species with agricultural and economic importance. Here we report a genome assembly for an elite variety (Khalas), which is 605.4 Mb in size and covers >90% of the genome (~671 Mb) and >96% of its genes (~41,660 genes). Genomic sequence analysis demonstrates that P. dactylifera experienced a clear genome-wide duplication after either ancient whole genome duplications or massive segmental duplications. Genetic diversity analysis indicates that its stress resistance and sugar metabolism-related genes tend to be enriched in the chromosomal regions where the density of single-nucleotide polymorphisms is relatively low. Using transcriptomic data, we also illustrate the date palm's unique sugar metabolism that underlies fruit development and ripening. Our large-scale genomic and transcriptomic data pave the way for further genomic studies not only on P. dactylifera but also other Arecaceae plants.
- Published
- 2013
4. A pangenomic study of Bacillus thuringiensis.
- Author
- Fang Y, Li Z, Liu J, Shu C, Wang X, Zhang X, Yu X, Zhao D, Liu G, Hu S, Zhang J, Al-Mssallem I, and Yu J
- Subjects
- Bacillus thuringiensis classification, Bacillus thuringiensis Toxins, Bacterial Proteins genetics, DNA Mutational Analysis, Endotoxins genetics, Genetic Variation, Genomics, Hemolysin Proteins genetics, Phylogeny, Plasmids, Sequence Analysis, DNA, Bacillus thuringiensis genetics, Genome, Bacterial
- Abstract
Bacillus thuringiensis (B. thuringiensis) is a soil-dwelling Gram-positive bacterium, and its plasmid-encoded toxins (Cry) are commonly used as biological alternatives to pesticides. In a pangenomic study, we sequenced seven B. thuringiensis isolates at both high coverage and high base quality using the next-generation sequencing platform. The B. thuringiensis pangenome was extrapolated to have 4196 core genes and an asymptotic value of 558 unique genes when a new genome is added. Compared to the pangenomes of its closely related species of the same genus, the B. thuringiensis pangenome shows an open characteristic, similar to B. cereus but not to B. anthracis; the latter has a closed pangenome. We also found extensive divergence among the seven B. thuringiensis genome assemblies, which harbor ample repeats and single nucleotide polymorphisms (SNPs). The identities among orthologous genes are greater than 84.5%, and the hotspots for the genome variations were discovered in genomic regions of 2.3-2.8 Mb and 5.0-5.6 Mb. We concluded that high-coverage sequence assemblies from multiple strains, before all the gaps are closed, are very useful for pangenomic studies. (Copyright © 2011. Published by Elsevier Ltd.)
- Published
- 2011
5. The complete chloroplast genome sequence of date palm (Phoenix dactylifera L.).
- Author
- Yang M, Zhang X, Liu G, Yin Y, Chen K, Yun Q, Zhao D, Al-Mssallem IS, and Yu J
- Subjects
- Arecaceae chemistry, Base Sequence, Chloroplasts chemistry, Gene Expression Regulation, Plant, Genome, Plant, Molecular Sequence Data, Nucleic Acid Conformation, RNA, Plant chemistry, RNA, Plant genetics, Arecaceae genetics, Chloroplasts genetics, Genome, Chloroplast
- Abstract
Background: Date palm (Phoenix dactylifera L.), a member of the Arecaceae family, is one of the three major economically important woody palms--the two other palms being oil palm and coconut tree--and its fruit is a staple food among Middle East and North African nations, as well as many other tropical and subtropical regions. Here we report a complete sequence of the date palm chloroplast (cp) genome based on pyrosequencing.
Methodology/Principal Findings: After extracting 369,022 cp sequencing reads from our whole-genome-shotgun data, we put together an assembly and validated it with intensive PCR-based verification, coupled with PCR product sequencing. The date palm cp genome is 158,462 bp in length and has a typical quadripartite structure of the large (LSC, 86,198 bp) and small single-copy (SSC, 17,712 bp) regions separated by a pair of inverted repeats (IRs, 27,276 bp). Similar to what has been found among most angiosperms, the date palm cp genome harbors 112 unique genes and 19 duplicated fragments in the IR regions. The junctions between LSC/IRs and SSC/IRs show different features of sequence expansion in evolution. We identified 78 SNPs as major intravarietal polymorphisms within the population of a specific cp genome, most of which were located in genes with vital functions. Based on RNA-sequencing data, we also found 18 polycistronic transcription units and three highly expression-biased genes--atpF, trnA-UGC, and rrn23.
Conclusions: Unlike most monocots, date palm has a typical cp genome similar to that of tobacco--with little rearrangement and gene loss or gain. High-throughput sequencing technology facilitates the identification of intravarietal variations in cp genomes among different cultivars. Moreover, transcriptomic analysis of cp genes provides clues for uncovering regulatory mechanisms of transcription and translation in chloroplasts.
- Published
- 2010
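The region lengths quoted in the chloroplast genome abstract (result 5) can be cross-checked with simple arithmetic. The following is a minimal Python sketch, assuming that the reported IR length of 27,276 bp is per copy and that the circular genome consists of exactly one LSC, one SSC, and two IR copies. The function and constant names are illustrative and do not come from the paper.

```python
# Region lengths in bp, as quoted in the abstract of result 5.
LSC_BP = 86_198           # large single-copy region
SSC_BP = 17_712           # small single-copy region
IR_BP = 27_276            # one inverted repeat (assumed to be the per-copy length)
REPORTED_TOTAL_BP = 158_462


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total length of a genome made of one LSC, one SSC and two IR copies."""
    return lsc + ssc + 2 * ir


if __name__ == "__main__":
    total = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
    # Compare the sum of the parts against the genome size given in the abstract.
    print(f"computed total: {total} bp")
    print(f"matches reported total: {total == REPORTED_TOTAL_BP}")
```

Under these assumptions the four regions sum to the reported genome length, which supports reading the IR figure as the size of a single repeat copy.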