NCBI Result Search
Keith Robison
robison1 at husc10.harvard.edu
Tue Apr 12 22:36:22 EST 1994
Again, I strongly recommend reading the recent Nature Genetics
review written by the NCBI gang (6:119-129, 1994). It covers
the relevant topics very well. Also, there is my hypertext guide at
http://twod.med.harvard.edu/seqanal/ (use Mosaic, Lynx, etc., but NOT ftp!),
though I _still_ haven't put in an example BLAST report (maybe next week...).
A VERY important thing to remember about the p values given by BLAST is
that they are calculated based on a statistical model which makes certain
assumptions. _Many_ biological sequences violate these assumptions,
especially naive translations of DNA. In particular, sequences
enriched in certain amino acids or in simple repeat patterns violate
the assumptions of the model. You should consider filtering such
regions from your query with the SEG and/or XNU options to the BLAST
software (see the E-mail server help file for details). These options
do not always clear things out, so be wary of any alignment in which
most of the matching positions are the same residue. Some packages (such
as FASTA) contain programs which test for alignments driven by
composition bias using other methods.
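
If you want a quick sanity check on a single alignment, something like the
Python sketch below may help. It is only an illustration of the "same
residue" warning above, not part of BLAST or FASTA: the function names, the
gap/mask characters ("-" and "X"), and the 0.5 cutoff (my reading of "most")
are all assumptions of mine, and it expects the two aligned strings copied
out of a report, gaps included.

```python
from collections import Counter


def dominant_identity_fraction(query_aln, subject_aln):
    """Return (residue, fraction) for the most common identically matched residue.

    Both arguments are the aligned strings from one alignment, gaps included.
    Positions that are gaps ('-') or masked ('X') are ignored (an assumption
    about how the report marks them).
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have the same length")

    # Count only the positions where both sequences carry the same residue.
    identities = Counter(
        q
        for q, s in zip(query_aln.upper(), subject_aln.upper())
        if q == s and q not in "-X"
    )
    total = sum(identities.values())
    if total == 0:
        return None, 0.0

    residue, count = identities.most_common(1)[0]
    return residue, count / total


def looks_composition_biased(query_aln, subject_aln, threshold=0.5):
    """True if a single residue accounts for more than `threshold` of the identities.

    The 0.5 default is a hypothetical cutoff, not a published value.
    """
    _, fraction = dominant_identity_fraction(query_aln, subject_aln)
    return fraction > threshold


if __name__ == "__main__":
    # A made-up glutamine-rich alignment: every identity is a Q.
    print(dominant_identity_fraction("QQQQAQQQ", "QQQQGQQQ"))
    print(looks_composition_biased("QQQQAQQQ", "QQQQGQQQ"))
```

A flag from this check does not prove the hit is spurious; it only says the
alignment deserves a closer look (or a rerun with filtering turned on).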
Good luck!
Keith Robison
Harvard University
Department of Cellular and Developmental Biology
Department of Genetics / HHMI
krobison at nucleus.harvard.edu