Classification of intra-genomic helitrons based on features extracted from different orders of FCGS
- Source :
- Informatics in Medicine Unlocked, Vol 18 (2020)
- Publication Year :
- 2020
- Publisher :
- Elsevier BV, 2020.
Abstract
- Helitrons, eukaryotic transposable elements (TEs), were discovered 18 years ago in various genomes. In the Caenorhabditis elegans (C. elegans) genome, helitron sequences vary widely in size, from 11 to 8965 base pairs (bp). These TEs are not uniformly dispersed, and they can mobilize within a genome by a rolling-circle mechanism. This ability to move and reproduce in genomes enables these elements to play a major role in genomic evolution. To follow this evolution, we predicted helitron families (10 classes) in the C. elegans genome by combining features extracted from signals corresponding to DNA sequences with a Support Vector Machine (SVM) classifier. In our classification system, the features extracted from the signals proved efficient for automatically predicting helitron sequences. The Gaussian radial kernel with 100-fold cross-validation gave the best accuracy rates, ranging from 68% to 97%, with an overall mean score of 83.7%, and we identified the Helitron Y1A class with an accuracy rate of 100% for a specific value of c and gamma. In addition, other notable helitrons (NDNAX2, NDNAX3, Helitron_Y2) were predicted with interesting accuracy rates. Keywords: Helitrons classification, Signal, FCGS coding technique, Machine learning, SVM
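The abstract does not spell out how the FCGS signals or their features are built, so the following is only a minimal sketch under stated assumptions. It assumes that an FCGS signal of order k replaces each position of a DNA sequence with the relative frequency of the k-mer starting there. The function names (`kmer_frequencies`, `fcgs_signal`, `summary_features`) and the choice of mean, variance, and maximum as features are hypothetical illustrations, not the authors' method. In the paper, features of this kind are passed to an SVM with a Gaussian radial kernel; that step is not shown here.

```python
from collections import Counter

VALID_BASES = set("ACGT")

def kmer_frequencies(seq, k):
    """Relative frequency of each k-mer in seq (k-mers with non-ACGT symbols are skipped)."""
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    kmers = [m for m in kmers if set(m) <= VALID_BASES]
    counts = Counter(kmers)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {m: c / total for m, c in counts.items()}

def fcgs_signal(seq, k):
    """Assumed FCGS-style signal: each position holds the frequency of the k-mer starting there."""
    freqs = kmer_frequencies(seq, k)
    seq = seq.upper()
    return [freqs.get(seq[i:i + k], 0.0) for i in range(len(seq) - k + 1)]

def summary_features(signal):
    """Hypothetical feature vector: (mean, variance, maximum) of a signal."""
    if not signal:
        return (0.0, 0.0, 0.0)
    mean = sum(signal) / len(signal)
    var = sum((x - mean) ** 2 for x in signal) / len(signal)
    return (mean, var, max(signal))

# Example: features from several FCGS orders concatenated into one vector.
example_seq = "ACGTAC"  # toy sequence, not real helitron data
features = []
for order in (1, 2, 3):
    features.extend(summary_features(fcgs_signal(example_seq, order)))
```

Concatenating the per-order features gives one fixed-length vector per sequence, which matters here because the helitron sequences themselves range from 11 to 8965 bp.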
- Subjects :
- 0301 basic medicine
Transposable element
Algebraic interior
Base pair
Health Informatics
Computational biology
Biology
lcsh:Computer applications to medicine. Medical informatics
Genome
DNA sequencing
Support vector machine
03 medical and health sciences
030104 developmental biology
0302 clinical medicine
030220 oncology & carcinogenesis
Helitron
lcsh:R858-859.7
Classifier (UML)
Details
- ISSN :
- 23529148
- Volume :
- 18
- Database :
- OpenAIRE
- Journal :
- Informatics in Medicine Unlocked
- Accession number :
- edsair.doi.dedup.....3788f251a7a6f99b2cd506b9a3e35839
- Full Text :
- https://doi.org/10.1016/j.imu.2019.100271