Genefinder for the Masses -- Possibility or Hopeless Dream?
Steve Jones
sjj at sanger.ac.uk
Thu Aug 15 04:29:29 EST 1996
Erich Schwarz wrote:
>
> I've been trying to get BLAST data from a partially sequenced
> cosmid that's of interest to our lab, and I've been forced to the
> following realization:
>
> The public BLAST servers *really* don't like you if you try
> to search reasonably sized chunks of DNA. I got NCBI to search
> 9 kb exactly once, but it took God's own time, and the output
> got yanked away between when I saw it came up and when I would
> have been able to download it. Fooey on that.
I suppose nowadays 9 kb is not that large an amount of sequence
to be searching, and I presume that if the NCBI does not currently
let you search this amount it soon will. Until then you will probably
have to set up your own databases and run BLAST locally.
But the real issue is, even if you can search 9 kb, do you
really want to wade through what can be quite voluminous output
from the BLAST search?
If you do end up with large BLAST output files, you may find
useful a program called MSPcrunch, developed here at the Sanger
Centre by Erik Sonnhammer, which gives an analysis of the
significant BLAST hits. The -P option gives a very useful overview
of what is hitting your sequence and where.
MSPcrunch can be obtained from http://www.sanger.ac.uk/~esr/MSPcrunch.html
>
> So. Does anybody out there know if there's either a publicly
> available server or a transferrable version of _C. elegans_
> Genefinder that we could use here at Columbia on cosmid sequences?
> If there was just a way to distill 40 kb of DNA to a few
> possible proteins, BLAST searching the residue would be a joy.
>
> Thank you for any advice. I'm sure there's an obvious
> answer, but this seems to be my day to be computer illiterate.
>
> --Erich Schwarz
> schwarz at cubsps.bio.columbia.edu
There is, as Thomas Burglin also pointed out, a version
of genefinder within acedb. There is also a short tutorial/guide
on the Sanger Web site on how to import and look at sequences in
acedb and subsequently perform genefinder analyses:
http://www.sanger.ac.uk/~sjj/RUNGENE.html
This should allow you to build up your predicted
genes and export their predicted protein sequences.
Also, by specifying the coordinates in the active zone
box, you can export segments of sequence which will be
more amenable to BLAST searching at the NCBI.
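
If you would rather cut out a segment outside acedb, the same idea
can be sketched in a few lines of Python. This is only a rough
illustration and not how acedb does it: the function names
(extract_segment, to_fasta), the toy sequence and the assumption
that coordinates are 1-based and inclusive are all mine.

```python
def extract_segment(sequence: str, start: int, end: int) -> str:
    """Return bases start..end of sequence (1-based, inclusive; an assumption)."""
    if start < 1 or end > len(sequence) or start > end:
        raise ValueError("coordinates fall outside the sequence")
    # Python slices are 0-based and end-exclusive, hence start - 1.
    return sequence[start - 1:end]


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record wrapped at `width` bases per line."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy 40-base stand-in for a cosmid; a real one would be read from a file.
    cosmid = "ACGT" * 10
    segment = extract_segment(cosmid, 5, 12)
    print(to_fasta("toy_cosmid_5_12", segment), end="")
```

Putting the coordinates in the record name makes it easier to map
any BLAST hits on the segment back to the full cosmid.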
Steve Jones
The Sanger Centre
http://www.sanger.ac.uk