I have recently found evidence that BLAST and FASTA do not properly handle
the official IUPAC single-letter code 'U' for selenocysteine, presumably
because it does not appear in either the PAM or BLOSUM matrices (although
I have not been able to rule out hard-coding as a cause).
Are substitution matrices available that include scores for selenocysteine?
If not, what is the least harmful way of handling the selenocysteine character?
Should it be changed to the code 'X' for an unknown amino acid? Or should
it be changed to the code for another amino acid with similar chemical and
physical properties? Would it be acceptable to change it to the extremely
rare but still 'legal' character 'Z', the ambiguity code for glutamic acid
or glutamine? Any other suggestions?
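
To make the substitution options concrete, here is a minimal sketch of the
kind of preprocessing I have in mind, run on the sequences before they are
handed to the search program. The function names and the default replacement
are my own placeholders, not part of BLAST or FASTA, and the choice of
replacement character (for example 'X', or 'C' for cysteine as the sulfur
analog) is exactly the open question above.

```python
def mask_selenocysteine(seq: str, replacement: str = "X") -> str:
    """Return seq with every 'U' or 'u' swapped for the replacement residue.

    The replacement is an assumption chosen by the caller; case is preserved.
    """
    if len(replacement) != 1 or not replacement.isalpha():
        raise ValueError("replacement must be a single letter")
    table = str.maketrans({"U": replacement.upper(), "u": replacement.lower()})
    return seq.translate(table)


def mask_fasta(lines, replacement="X"):
    """Yield FASTA lines with 'U' replaced in sequence lines only.

    Header lines (those starting with '>') are passed through unchanged,
    so identifiers that happen to contain a 'U' are not altered.
    """
    for line in lines:
        if line.startswith(">"):
            yield line
        else:
            yield mask_selenocysteine(line, replacement)
```

Calling mask_fasta(open("proteins.fasta"), "C") would try the cysteine
substitution instead (the file name is only an example). The original
sequences should be kept, since the positions of the replaced residues
are lost after masking.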
-- Gordon D. Pusch
perl -e '$_ = "gdpusch\@NO.xnet.SPAM.com\n"; s/NO\.//; s/SPAM\.//; print;'