NCBI Result Search

Keith Robison robison1 at husc10.harvard.edu
Tue Apr 12 22:36:22 EST 1994


Again, I strongly recommend reading the recent Nature Genetics
review written by the NCBI gang (6:119-129, 1994).  It covers
the relevant topics very well.  There is also my hypertext guide at
http://twod.med.harvard.edu/seqanal/ (reachable with Mosaic, Lynx,
etc., but NOT by ftp!), though I _still_ haven't put in an example
BLAST report (maybe next week...).

A VERY important thing to remember about the p values given by BLAST is
that they are calculated from a statistical model which makes certain
assumptions.  _Many_ biological sequences violate these assumptions,
especially naive translations of DNA.  The usual offenders are sequences
enriched in a few amino acids or containing simple repeat patterns.
You should consider filtering such regions from your query with the SEG
and/or XNU options to the BLAST software (see the E-mail server help
file for details).  These options do not always clear things out, so be
wary of an alignment if most of its matching positions are the same
residue.  Some packages (such as FASTA) contain programs which use
other methods to test for alignments driven by composition bias.
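
As a rough illustration of that last warning, here is a minimal Python
sketch of the "are most of the matches the same residue?" check.  It is
not part of BLAST, SEG, XNU or FASTA.  The function names, the input
format (the two gapped, equal-length aligned strings copied from a
report) and the 0.5 cutoff for "most" are all assumptions made for the
example.

```python
from collections import Counter


def dominant_match_residue(query_aln, subject_aln):
    """Return (residue, fraction) for the most common residue among
    the identical columns of a pairwise alignment.

    Both arguments are hypothetical inputs: the gapped, equal-length
    aligned strings as they appear in a report ('-' marks a gap).
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned strings must have equal length")
    # Count only columns where both sequences carry the same residue;
    # gaps and masked 'X' positions are not counted as matches.
    matches = Counter(
        q
        for q, s in zip(query_aln.upper(), subject_aln.upper())
        if q == s and q not in "-X"
    )
    total = sum(matches.values())
    if total == 0:
        return None, 0.0
    residue, count = matches.most_common(1)[0]
    return residue, count / total


def looks_composition_driven(query_aln, subject_aln, threshold=0.5):
    """Flag an alignment when one residue supplies more than `threshold`
    of the identities.  The 0.5 default is an arbitrary assumption."""
    _, fraction = dominant_match_residue(query_aln, subject_aln)
    return fraction > threshold


if __name__ == "__main__":
    # A glutamine-rich match: every identity is a Q, so it is flagged.
    print(looks_composition_driven("QQQQQAQQQQ-QQ", "QQQQQSQQQQPQQ"))
```

A flag from a check like this is only a prompt to look harder at the
alignment (and to try filtering the query), not a verdict on whether
the similarity is real.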



Good luck!

Keith Robison
Harvard University
Department of Cellular and Developmental Biology
Department of Genetics / HHMI

krobison at nucleus.harvard.edu 






