Is there a database available including all non-repetitive sequences of human genome?

Simon Andrews simon.andrews at bbsrc.ac.uk
Wed Aug 22 05:57:56 EST 2001


newgene wrote:
> 
> As title.
> I know the repetitive sequences in the human genome can be masked by
> RepeatMasker. Has someone masked the entire human genome and made
> available the human genome sequences without repetitive sequences?

You could get this information from Ensembl (www.ensembl.org).  They
have an assembled human sequence, and have mapped repeats onto it (the
repeats are not shown by default, but can be added through the features
menu).

Getting a masked version of the genome would not be trivial, though.
You can mirror their SQL database, which holds all of the sequences and
features, and you would then have to write scripts which extracted all
of the sequence not covered by a repeat.  You could then write this to a
conventional sequence database format.  This would require a lot of
computing power and time as well as programming and database experience.
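
To give a rough idea of the extraction step, here is a minimal Python
sketch.  It is not Ensembl code and does not touch their database: it
assumes you have already pulled out one sequence as a string and its
repeats as a list of (start, end) pairs.  The function names and the
0-based, half-open coordinates are my own assumptions for illustration,
so coordinates taken from a real database may need converting first.

```python
def unmasked_segments(sequence, repeats):
    """Yield (start, end, subsequence) for each stretch not covered by a repeat.

    Coordinates are assumed to be 0-based and half-open; `repeats` is a
    list of (start, end) pairs that may overlap and need not be sorted.
    """
    length = len(sequence)
    position = 0  # first base not yet known to be inside a repeat
    for start, end in sorted(repeats):
        # Clamp each repeat to the bounds of the sequence.
        start = min(max(start, 0), length)
        end = min(max(end, 0), length)
        if start > position:
            yield position, start, sequence[position:start]
        position = max(position, end)
    if position < length:
        yield position, length, sequence[position:]


def to_fasta(name, sequence, repeats, width=60):
    """Return the non-repeat stretches as FASTA text, one record per stretch."""
    lines = []
    for start, end, segment in unmasked_segments(sequence, repeats):
        lines.append(f">{name}:{start}-{end}")
        lines.extend(segment[i:i + width] for i in range(0, len(segment), width))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Toy example: two overlapping repeats covering bases 4 to 10.
    print(to_fasta("chrTest", "AAAACCCCGGGGTTTT", [(4, 8), (6, 10)]), end="")
```

Each record header carries the coordinates of the stretch, so the
pieces can still be placed back on the assembly.  On a whole genome the
same loop would have to run per chromosome or per contig, which is
where the computing time goes.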

What was it that you were trying to do with such a database?  You may
find there is a better alternative than going through the steps outlined
above.

	TTFN

	Simon.




More information about the Bio-www mailing list