FASTA moans and questions
wrp at cyclops.micr.Virginia.EDU
Fri May 15 11:43:08 EST 1992
In article <9205151608.AA07714 at gserv1> JAB5 at UK.AC.YORK.VAXA writes:
> I frequently use FASTA in my work. The latest version has some useful
>improvements, eg. the whole sequence entry is not listed in alignments
>when only a short match is found. However, I'd like to have a bit of a
>moan and to ask some questions...
>1. You used to be able to search sub-sets of the database, eg. B=
> Bacterial sequences. I found this most useful.
This has not changed; you can still search subsets of the database
if they are available in separate files. The program must be installed
correctly (the FASTLIBS file must be set up properly) for this to work.
>2. What is the meaning of the message "ignoring..(list of entry codes)"?
Some database entries are very short (1 amino acid, 3
nucleotides). The program ignores them and lists their entry codes in
this message.
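As a rough illustration of this kind of filtering, here is a minimal Python sketch. It is not FASTA's code: the cutoff values in `MIN_LENGTH`, the function name `split_short_entries`, and the sample entries are all assumptions chosen so that a 1-residue protein or a 3-base DNA entry is skipped.

```python
# Hypothetical cutoffs; the reply only says such entries are "very short".
MIN_LENGTH = {"protein": 2, "dna": 4}

def split_short_entries(entries, kind):
    """Split (code, sequence) pairs into kept entries and ignored entry codes."""
    kept, ignored = [], []
    for code, seq in entries:
        if len(seq) < MIN_LENGTH[kind]:
            ignored.append(code)      # too short to search meaningfully
        else:
            kept.append((code, seq))
    return kept, ignored

# Invented entries for illustration only.
entries = [("A00001", "M"), ("A00002", "MKTAYIAKQR")]
kept, ignored = split_short_entries(entries, "protein")
print("ignoring..", " ".join(ignored))
```

The ignored codes are collected rather than silently dropped, which mirrors the idea of reporting them in a message.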
>3. The best matches are listed as entry codes only. You used to get a
> short descriptor. "M21579" does not tell you much!
This is another instance of the program not being installed
correctly. It sounds like you have a "type 5" (PIR format) file but
you are searching it as a "type 0" (FASTA format) file. When you do
this, you end up treating the descriptive line as amino acid sequence.
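To make the mix-up concrete, here is a small Python sketch, not taken from FASTA itself. It assumes the usual PIR layout (a `>P1;CODE` line, then a title line, then sequence ending in `*`) and the usual FASTA layout (a `>` header line followed only by sequence). The record and both parser functions are invented for illustration.

```python
def parse_as_fasta(text):
    """Naive FASTA reader: '>' starts a record, every other line is sequence."""
    records = []
    for line in text.splitlines():
        if line.startswith(">"):
            records.append({"header": line[1:].strip(), "seq": ""})
        elif records:
            records[-1]["seq"] += line.strip()
    return records

def parse_as_pir(text):
    """PIR-style reader: the '>' line holds the code, the next line the title."""
    records, lines = [], text.splitlines()
    i = 0
    while i < len(lines):
        if lines[i].startswith(">"):
            code = lines[i][1:].split(";", 1)[-1].strip()
            title = lines[i + 1].strip() if i + 1 < len(lines) else ""
            records.append({"code": code, "title": title, "seq": ""})
            i += 2                      # skip the title line as well
            continue
        if records:
            records[-1]["seq"] += lines[i].strip().rstrip("*")
        i += 1
    return records

# Invented record for illustration only.
PIR_TEXT = (">P1;EXAMPLE1\n"
            "example protein - hypothetical organism\n"
            "MKTAYIAKQR\nQISFVKSHFS*\n")
wrong = parse_as_fasta(PIR_TEXT)[0]
right = parse_as_pir(PIR_TEXT)[0]
print(wrong["header"], "|", wrong["seq"][:30])
print(right["code"], "|", right["title"])
```

Read as FASTA, the record keeps only the code as its header and the title text is folded into the sequence, which matches the symptom of seeing bare entry codes.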
>4. Is there a problem with the numbering of long sequences in alignments?
> For example, MIPACGA (EMBL) is over 100 kb. The numbering goes from 99990 to
> 10000 ! Whilst pretty obvious at present, this might prove to be an
> important fault as longer sequences or melded entries accumulate.
The latest version of FASTA is set up to handle sequences up
to 10,000,000 residues without the numbering problem.
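The reply does not say what caused the old behaviour. Purely as an illustration, the Python sketch below shows how a coordinate label limited to five characters would reproduce the reported jump from 99990 to 10000, and how many digits a label needs for a 10,000,000-residue sequence. Both function names are hypothetical and this is not FASTA's code.

```python
def label_fixed(position, width=5):
    """Hypothetical label routine that keeps only the first `width` characters."""
    return str(position)[:width]

def label_width_for(max_length):
    """Number of digits needed to print any coordinate up to max_length."""
    return len(str(max_length))

print(label_fixed(99990), label_fixed(100000))   # 99990 10000
print(label_width_for(10_000_000))               # 8
```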
>5. In local comparisons (LFasta) why should the order of comparison
> sometimes give different answers? ie. it is not intuitive why
> a compared to b should not be the same as b to a...
LFASTA uses a heuristic algorithm that is fast, but not as
predictable as one might like. That is why I recommend that LALIGN
(also included with FASTA) be used when practical.
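For contrast with a heuristic, the Python sketch below computes a full dynamic-programming (Smith-Waterman style) local alignment score. With a symmetric scoring scheme the best score cannot depend on which sequence comes first. This is a generic textbook recurrence with assumed match, mismatch, and gap values. It is not LFASTA's heuristic, and the post does not say that LALIGN uses exactly this form.

```python
def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score using a linear gap penalty (assumed values)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A cell never drops below 0, so an alignment can start anywhere.
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
            best = max(best, cur[j])
        prev = cur
    return best

# Invented sequences for illustration only.
a, b = "ACGTTGCA", "TTACGTAG"
print(local_score(a, b), local_score(b, a))
```

The two printed scores are always equal, because every cell of the table is examined regardless of the order of the arguments.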
>Sorry if this sounds a bit negative; in fact I find it a very useful
>and powerful tool but I would like to see the points above addressed.
>JAB5 at UK.AC.YORK.VAXA
>University of York, UK.