Computer-readable FASTA output

Bill Pearson wrp at avery.med.Virginia.EDU
Fri Oct 13 08:01:29 EST 1995


Several people have asked for a FASTA output format that would be more
easily parsed by other computer programs.  Since FASTA already
supports multiple output formats, this is relatively easy to do.  My
thought is to produce something like what is shown below, but I am
willing to listen to other suggestions.  If you would like an
addition or change to the computer-readable format proposed below,
please send me email before November 1.

Proposed FASTA parseable output (-m 10):

>>LCBO prolactin precursor - bovine
; n1: 229
; initn:  442
; init1:  314
; opt: 501
; z-score: 600.7
; expect: 1.5e-27
; smith-waterman: 501
; ident: 0.365 
; overlap: 222
; start_seq1: 1
; stop_seq1: 224
; start_seq2: 1
; stop_seq2: 229
>musplf ..
 MLPSLIQPCSWILLLLLVNSSLLWKNVASFPMCAMRNGRCFMSFEDTFE
LAGSLSHNISIEVSELFTEFEKHYSNVSGLRDKSPMRCNTSFLPTPENKE
QARLTHYSALLKSGAMILDAWESPLDDLVSELSTIKNVPDIIISKATDIK
KKINAVRNGVNALMSTMLQNGDEEKKNPAWF....LQSDNEDARIHSLYG
MISCLDNDFKKVDIYLNVLKCYMLKIDNC
>LCBO ..
MDSKGSSQKGSRLLLLLVVSNLLLCQGVVSTPVCPNGPGNCQVSLRDLFD
RAVMVSHYIHDLSSEMFNEFDKRYAQGKGFITMALNSCHTSSLPTPEDKE
QAQQTHHEVLMSLILGLLRSWNDPLYHLVTEVRGMKGAPDAILSRAIEIE
EENKRLLEGMEMIFGQVIPGAKETEPYPVWSGLPSLQTKDEDARYSAFYN
LLHCLRRDSSKIDTYLKLLNCRIIYNNNC
>>LCPG prolactin precursor - pig                     (229 aa)


Each alignment record would start with a line beginning ">>", each
scoring parameter would be on its own line beginning ";", and each of
the two aligned sequences would start with a line beginning ">".
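
To show how another program might consume this layout, here is a
minimal Python sketch of a reader for it.  It is not part of FASTA; the
function name parse_m10 and the dictionary keys are made up for
illustration.  It assumes that every parameter line has the form
"; name: value", that the first word after ">" is the sequence name,
and that whitespace around sequence lines can be discarded.  Values are
kept as strings because the proposal does not define their types.

```python
def parse_m10(text):
    """Parse text in the proposed layout into a list of alignment records."""
    records = []
    record = None   # alignment record currently being filled
    seq = None      # aligned sequence currently being filled
    for line in text.splitlines():
        if line.startswith(">>"):
            # ">>" opens a new alignment record.
            record = {"header": line[2:].strip(), "params": {}, "sequences": []}
            records.append(record)
            seq = None
        elif record is None or not line.strip():
            # Skip blank lines and anything before the first record.
            continue
        elif line.startswith(";"):
            # "; name: value" scoring parameter (assumed form).
            key, _, value = line[1:].partition(":")
            record["params"][key.strip()] = value.strip()
        elif line.startswith(">"):
            # ">" opens one of the aligned sequences.
            parts = line[1:].split()
            seq = {"name": parts[0] if parts else "", "residues": ""}
            record["sequences"].append(seq)
        elif seq is not None:
            # Sequence data; surrounding whitespace is dropped (an assumption).
            seq["residues"] += line.strip()
    return records
```

Called on the example above, this would return two records, the second
with no parameters or sequences because only its ">>" line is shown.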

Bill Pearson
-- 
wrp at virginia.EDU
Dept. of Biochemistry #440
U. of Virginia
Charlottesville, VA 22908



