Genefinder for the Masses -- Possibility or Hopeless Dream?

Steve Jones sjj at
Thu Aug 15 04:29:29 EST 1996

Erich Schwarz wrote:
>     I've been trying to get BLAST data from a partially sequenced
> cosmid that's of interest to our lab, and I've been forced to the
> following realization:
>     The public BLAST servers *really* don't like you if you try
> to search reasonably sized chunks of DNA.  I got NCBI to search
> 9 kb exactly once, but it took God's own time, and the output
> got yanked away between when I saw it came up and when I would
> have been able to download it.  Fooey on that.

	I suppose nowadays 9 kb is not that large an amount of sequence
to be searching, and I presume that if the NCBI does not currently
let you search this amount it soon will. Until then you will probably
have to set up your own databases and run BLAST locally.
	But the real issue is, even if you can search 9 kb, do you
really want to wade through what can be quite voluminous output
from the BLAST search?
	If you do end up with large BLAST output files,
you may find useful a program called MSPcrunch, developed here
at the Sanger Centre by Erik Sonnhammer, which gives
a useful analysis of significant BLAST hits. The -P option gives
a very useful overview in seeing what is hitting your sequence and 
	MSPcrunch can be obtained from

>     So.  Does anybody out there know if there's either a publicly
> available server or a transferrable version of _C. elegans_
> Genefinder that we could use here at Columbia on cosmid sequences?
> If there was just a way to distill 40 kb of DNA to a few
> possible proteins, BLAST searching the residue would be a joy.
>     Thank you for any advice.  I'm sure there's an obvious
> answer, but this seems to be my day to be computer illiterate.
> --Erich Schwarz
>   schwarz at

	There is, as Thomas Burglin pointed out as well, a version
of genefinder within acedb. There is also a short tutorial/guide
on the Sanger Web site covering how to import and view sequences
in acedb and then run genefinder analyses on them.


This should allow you to build up your predicted 
genes and export their predicted protein sequences. 

Also, by specifying the coordinates in the active zone
box, you can export segments of sequence which will be
more amenable to BLAST searching at the NCBI.
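
If you would rather cut the sequence up outside acedb, the same idea
can be roughed out in a few lines of Python. This is only a sketch
under my own assumptions: the function names, the 2000-base window and
the 200-base overlap are made up for illustration and have nothing to
do with acedb or with any limit the NCBI actually enforces.

```python
def split_sequence(seq, window=2000, overlap=200):
    """Yield (start, end, segment) tuples covering seq.

    Coordinates are 1-based and inclusive, so they can be matched
    back to positions in the original cosmid. The window and overlap
    defaults are illustrative assumptions, not server limits.
    """
    if window <= 0 or not 0 <= overlap < window:
        raise ValueError("need window > 0 and 0 <= overlap < window")
    step = window - overlap
    for offset in range(0, len(seq), step):
        segment = seq[offset:offset + window]
        yield offset + 1, offset + len(segment), segment
        # Stop once a window has reached the end of the sequence,
        # so no redundant trailing fragment is produced.
        if offset + window >= len(seq):
            break


def to_fasta(name, seq, window=2000, overlap=200, width=60):
    """Return the segments of seq as one multi-record FASTA string."""
    records = []
    for start, end, segment in split_sequence(seq, window, overlap):
        lines = [segment[i:i + width] for i in range(0, len(segment), width)]
        records.append(">%s_%d-%d\n%s\n" % (name, start, end, "\n".join(lines)))
    return "".join(records)
```

Each record header carries the coordinates of its segment, so a hit
in any piece can be placed back on the full sequence. The overlap is
there so that a match straddling a cut point still appears whole in
at least one segment, provided it is shorter than the overlap.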

			Steve Jones 

The Sanger Centre
