Genome Reference and Sequence Variation in the Large Repetitive Central Exon of Human MUC5AC
- Source :
- American Journal of Respiratory Cell and Molecular Biology. 50:223-232
- Publication Year :
- 2014
- Publisher :
- American Thoracic Society, 2014.
Abstract
- Despite modern sequencing efforts, the difficulty in assembly of highly repetitive sequences has prevented resolution of human genome gaps, including some in the coding regions of genes with important biological functions. One such gene, MUC5AC, encodes a large, secreted mucin, which is one of the two major secreted mucins in human airways. The MUC5AC region contains a gap in the human genome reference (hg19) across the large, highly repetitive, and complex central exon. This exon is predicted to contain imperfect tandem repeat sequences and multiple conserved cysteine-rich (CysD) domains. To resolve the MUC5AC genomic gap, we used high-fidelity long PCR followed by single molecule real-time (SMRT) sequencing. This technology yielded long sequence reads and robust coverage that allowed for de novo sequence assembly spanning the entire repetitive region. Furthermore, we used SMRT sequencing of PCR amplicons covering the central exon to identify genetic variation in four individuals. The results demonstrated the presence of segmental duplications of CysD domains, insertions/deletions (indels) of tandem repeats, and single nucleotide variants. Additional studies demonstrated that one of the identified tandem repeat insertions is tagged by nonexonic single nucleotide polymorphisms. Taken together, these data illustrate the successful utility of SMRT sequencing long reads for de novo assembly of large repetitive sequences to fill the gaps in the human genome. Characterization of the MUC5AC gene and the sequence variation in the central exon will facilitate genetic and functional studies for this critical airway mucin.
- Subjects :
- Pulmonary and Respiratory Medicine
Genetics
Genome, Human
Clinical Biochemistry
Mucins
Sequence assembly
Hybrid genome assembly
Exons
Sequence Analysis, DNA
Cell Biology
Mucin 5AC
Biology
Polymorphism, Single Nucleotide
Genome
Linkage Disequilibrium
Exon
Tandem repeat
Cot analysis
Humans
Human genome
Molecular Biology
Repetitive Sequences, Nucleic Acid
Original Research
Single molecule real time sequencing
Details
- ISSN :
- 1535-4989 and 1044-1549
- Volume :
- 50
- Database :
- OpenAIRE
- Journal :
- American Journal of Respiratory Cell and Molecular Biology
- Accession number :
- edsair.doi.dedup.....bf3e7c3bece9105b4c46ce05350d1bdd