How to handle selenocysteine in alignments?

Gordon D. Pusch gdpusch at NO.xnet.SPAM.com
Mon Apr 14 23:28:42 EST 2003


I have recently found evidence that BLAST and FASTA do not properly handle
the official IUPAC single-letter code 'U' for selenocysteine, presumably
because it does not appear in either the PAM or BLOSUM matrices (although
I have not been able to rule out hard-coding as a cause).

Are substitution matrices available that include scores for selenocysteine?
If not, what is the least harmful way of handling the selenocysteine character?
Should it be changed to the code 'X' for an unknown amino acid?  Or should
it be changed to the code for another amino acid with similar chemical and
physical properties?  Would it be acceptable to change it to the extremely
rare but still 'legal' character 'Z' (the ambiguity code for glutamic acid
or glutamine)?  Any other suggestions?
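
To make the substitution options concrete, here is a minimal sketch of the
preprocessing step I have in mind, written in Python. The function name
mask_selenocysteine and its replacement parameter are hypothetical, made up
for illustration; they are not part of BLAST, FASTA, or any library. It
rewrites 'U' in the sequence lines of FASTA-formatted text and leaves the
'>' header lines alone. Whether 'X' or some standard residue such as 'C'
(cysteine, the sulfur analog of selenocysteine) is the better replacement
is exactly the open question, so the replacement is left as an argument.

```python
def mask_selenocysteine(fasta_text, replacement="X"):
    """Replace selenocysteine ('U'/'u') in FASTA sequence lines.

    Hypothetical helper for illustration only. Header lines (starting
    with '>') are passed through unchanged, so a 'U' inside an
    identifier or description is not touched.
    """
    if len(replacement) != 1 or not replacement.isalpha():
        raise ValueError("replacement must be a single letter")

    masked_lines = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Header line: keep as-is.
            masked_lines.append(line)
        else:
            # Sequence line: substitute both cases, preserving case.
            line = line.replace("U", replacement.upper())
            line = line.replace("u", replacement.lower())
            masked_lines.append(line)
    return "\n".join(masked_lines)
```

For example, mask_selenocysteine(text) would mask with 'X', while
mask_selenocysteine(text, "C") would substitute cysteine instead. Either
way the original positions of 'U' are lost unless they are recorded
separately before masking.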


-- Gordon D. Pusch   

perl -e '$_ = "gdpusch\@NO.xnet.SPAM.com\n"; s/NO\.//; s/SPAM\.//; print;'




