1. From first base: the sequence of the tip of the X chromosome of Drosophila melanogaster, a comparison of two sequencing strategies.
- Author
- Benos, P V, Gatt, M K, Murphy, L, Harris, D, Barrell, B, Ferraz, C, Vidal, S, Brun, C, Demaille, J, Cadieu, E, Dreano, S, Gloux, S, Lelaure, V, Mottier, S, Galibert, F, Borkova, D, Miñana, B, Kafatos, F C, Bolshakov, S, Sidén-Kiamos, I, Papagiannakis, G, Spanos, L, Louis, C, Madueño, E, de Pablos, B, Modolell, J, Peter, A, Schöttler, P, Werner, M, Mourkioti, F, Beinert, N, Dowe, G, Schäfer, U, Jäckle, H, Bucheton, A, Callister, D, Campbell, L, Henderson, N S, McMillan, P J, Salles, C, Tait, E, Valenti, P, Saunders, R D, Billaud, A, Pachter, L, Glover, D M, and Ashburner, M
- Abstract
We present the sequence of a contiguous 2.63 Mb of DNA extending from the tip of the X chromosome of Drosophila melanogaster. Within this sequence, we predict 277 protein coding genes, of which 94 had been sequenced already in the course of studying the biology of their gene products, and examples of 12 different transposable elements. We show that an interval between bands 3A2 and 3C2, believed in the 1970s to show a correlation between the number of bands on the polytene chromosomes and the 20 genes identified by conventional genetics, is predicted to contain 45 genes from its DNA sequence. We have determined the insertion sites of P-elements from 111 mutant lines, about half of which are in a position likely to affect the expression of novel predicted genes, thus representing a resource for subsequent functional genomic analysis. We compare the European Drosophila Genome Project sequence with the corresponding part of the independently assembled and annotated Joint Sequence determined through "shotgun" sequencing. Discounting differences in the distribution of known transposable elements between the strains sequenced in the two projects, we detected three major sequence differences, two of which are probably explained by errors in assembly; the origin of the third major difference is unclear. In addition there are eight sequence gaps within the Joint Sequence. At least six of these eight gaps are likely to be sites of transposable elements; the other two are complex. Of the 275 genes in common to both projects, 60% are identical within 1% of their predicted amino-acid sequence and 31% show minor differences such as in choice of translation initiation or termination codons; the remaining 9% show major differences in interpretation.
- Published
- 2001