Computer readable FASTA output
Bill Pearson
wrp at avery.med.Virginia.EDU
Fri Oct 13 08:01:29 EST 1995
Several people have asked for a FASTA output format that would be more
easily parsed by other computer programs. Since FASTA already
supports multiple output formats, this is relatively easy to do. My
thought is to produce something like what is shown below, but I am
willing to listen to other suggestions. If you would like an
addition or change to the computer-readable format proposed below,
please send me email before November 1.
Proposed FASTA parseable output (-m 10):
>>LCBO prolactin precursor - bovine
; n1: 229
; initn: 442
; init1: 314
; opt: 501
; z-score: 600.7
; expect: 1.5e-27
; smith-waterman: 501
; ident: 0.365
; overlap: 222
; start_seq1: 1
; stop_seq1: 224
; start_seq2: 1
; stop_seq2: 229
>musplf ..
MLPSLIQPCSWILLLLLVNSSLLWKNVASFPMCAMRNGRCFMSFEDTFE
LAGSLSHNISIEVSELFTEFEKHYSNVSGLRDKSPMRCNTSFLPTPENKE
QARLTHYSALLKSGAMILDAWESPLDDLVSELSTIKNVPDIIISKATDIK
KKINAVRNGVNALMSTMLQNGDEEKKNPAWF....LQSDNEDARIHSLYG
MISCLDNDFKKVDIYLNVLKCYMLKIDNC
>LCBO ..
MDSKGSSQKGSRLLLLLVVSNLLLCQGVVSTPVCPNGPGNCQVSLRDLFD
RAVMVSHYIHDLSSEMFNEFDKRYAQGKGFITMALNSCHTSSLPTPEDKE
QAQQTHHEVLMSLILGLLRSWNDPLYHLVTEVRGMKGAPDAILSRAIEIE
EENKRLLEGMEMIFGQVIPGAKETEPYPVWSGLPSLQTKDEDARYSAFYN
LLHCLRRDSSKIDTYLKLLNCRIIYNNNC
>>LCPG prolactin precursor - pig (229 aa)
Each alignment record would start with ">>", each scoring
parameter line would start with ";", and each of the two aligned
sequences would start with ">".
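
To show how the three line prefixes could be consumed by another program, here is a minimal Python sketch of a reader for the proposed layout. It is only an illustration under stated assumptions, not part of FASTA: the function name parse_m10 and the dictionary layout are hypothetical, the sample text is abbreviated from the example above, and parameter values are kept as strings so the caller decides how to convert them.

```python
# Hypothetical reader for the proposed "-m 10" layout (not part of FASTA).
# Sample abbreviated from the example in this post; sequences are truncated.
SAMPLE = """\
>>LCBO prolactin precursor - bovine
; n1: 229
; opt: 501
; z-score: 600.7
; expect: 1.5e-27
>musplf ..
MLPSLIQPCSWILLLLLVNSSLLWKNVASFPMCAMRNGRCFMSFEDTFE
LAGSLSHNISIEVSELFTEFEKHYSNVSGLRDKSPMRCNTSFLPTPENKE
>LCBO ..
MDSKGSSQKGSRLLLLLVVSNLLLCQGVVSTPVCPNGPGNCQVSLRDLFD
RAVMVSHYIHDLSSEMFNEFDKRYAQGKGFITMALNSCHTSSLPTPEDKE
>>LCPG prolactin precursor - pig (229 aa)
"""


def parse_m10(text):
    """Split text in the proposed layout into a list of record dicts."""
    records = []
    current = None   # record currently being filled
    seq_name = None  # aligned sequence whose residue lines are being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">>"):
            # ">>" opens a new alignment record: identifier, then description.
            ident, _, description = line[2:].partition(" ")
            current = {
                "id": ident,
                "description": description,
                "params": {},
                "sequences": {},
            }
            records.append(current)
            seq_name = None
        elif current is None:
            # Ignore anything that appears before the first ">>" line.
            continue
        elif line.startswith(";"):
            # ";" lines carry one "name: value" scoring parameter each.
            key, _, value = line[1:].partition(":")
            current["params"][key.strip()] = value.strip()
        elif line.startswith(">"):
            # ">" opens one of the aligned sequences; keep its first token.
            parts = line[1:].split()
            seq_name = parts[0] if parts else ""
            current["sequences"][seq_name] = ""
        elif seq_name is not None:
            # Residue lines (gap characters included) are concatenated as-is.
            current["sequences"][seq_name] += line
    return records


if __name__ == "__main__":
    for record in parse_m10(SAMPLE):
        print(record["id"], record["params"], list(record["sequences"]))
```

The ">>" test has to come before the ">" test, since every record header also begins with ">". Gap characters inside the aligned sequences are left untouched, so the two strings in a record can be compared position by position.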
Bill Pearson
--
wrp at virginia.EDU
Dept. of Biochemistry #440
U. of Virginia
Charlottesville, VA 22908