how much has been sequenced?

Sima Misra sima at
Fri May 5 09:39:26 EST 2000

Dear Rock,

Sorry that the numbers on the BDGP pages are unclear about the state of
the genome sequencing.  We are currently updating our pages, so this
should be made clearer soon.

The 112 Mb figure is how much the BDGP has sequenced; it does not include
what Celera sequenced.  The euchromatic portion of the genome (~120 Mb)
is fairly complete; the BDGP is working on about 1300 small gaps in the
euchromatin.  The heterochromatic portion of the genome (~60 Mb), which
consists mainly of repetitive sequences, is not complete because of the
technical difficulty of assembling sequence over repetitive regions.

To answer your question about how many of the genes might be missing:
about 97% of known genes with substantial sequence information were
found before gap filling.

You can read more of the details in Adams et al. (2000), Science
287:2185-2195.

Please let me know if I can be of further assistance.

Sima Misra

> Subject: how much has been sequenced?
> from this page,
> it appears that about 112 MB have been sequenced.
> ("As of April 29, 2000, the BDGP has sequenced:
> 111.4Mb total of Drosophila genomic DNA (including clones in progress)")
> and, there are 1500 gaps.
> and, from this page,
> it appears that the estimated size of the genome is 180 MB.
> ("How big is the Drosophila genome? How many genes does it contain?
> The genome of Drosophila melanogaster is now estimated to be 180
> megabases (Mb): the euchromatic genome is 140Mb in size, while the
> other 40Mb is heterochromatic (non-coding). ")
> so is it correct to say that there is still 28 MB of euchromatic genomic
> to sequence (and 68 MB of total genomic DNA to sequence)?
> 28/140 is 20%.  so 20% of the gene-rich regions hasn't been sequenced?
> 68/180 is about 38%.  so 38% of the total genome hasn't been sequenced?
> am i using the right numbers here?
> also, does anyone know what fraction of the previously cloned genes
> never showed up in the celera 12X coverage of the genome?  or how
> many drosophila ESTs didn't get found in the Celera sequences?  that
> might be another way of guessing how many more genes there are out
> there.
> thanks.
> rock pulak
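
The arithmetic in the quoted question can be rechecked with a short
sketch.  It uses only the figures the question quotes from the BDGP
pages (112 Mb sequenced, 140 Mb euchromatin, 180 Mb total); the function
name `unsequenced` is made up for illustration.  As the reply notes, the
112 Mb figure excludes Celera's sequence and the reply puts the
euchromatin at ~120 Mb, so these numbers likely overstate what actually
remained.

```python
def unsequenced(total_mb, sequenced_mb):
    """Return (remaining Mb, remaining as a percentage of the total)."""
    remaining = total_mb - sequenced_mb
    return remaining, 100.0 * remaining / total_mb


# Figures as quoted in the question (BDGP-only sequence, rounded to 112 Mb).
SEQUENCED_MB = 112

# Euchromatin, using the 140 Mb estimate quoted in the question.
euchromatin_remaining, euchromatin_pct = unsequenced(140, SEQUENCED_MB)

# Whole genome, using the 180 Mb estimate quoted in the question.
genome_remaining, genome_pct = unsequenced(180, SEQUENCED_MB)

print(f"euchromatin: {euchromatin_remaining} Mb ({euchromatin_pct:.1f}%)")
print(f"whole genome: {genome_remaining} Mb ({genome_pct:.1f}%)")
```

Swapping in the reply's ~120 Mb euchromatin estimate, or a sequenced
total that includes Celera's data, only requires changing the arguments.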

