SEARCH RESULT FOR BLASTp from NCBI
Mike Cherry
cherry at stout.Stanford.EDU
Mon Mar 28 11:10:06 EST 1994
In article <Pine.3.89.9403281600.A110-0100000 at kb.usm.my>,
Zainul Zain <zainul at KB.USM.MY> wrote:
>Hello, my name is Nazar. I'm stuck on the problem of interpreting
>the search result that was sent back to me from NCBI. Could anybody give
>me some clue about what the "High Score" means, etc.?
>
>For Example:
>                                                                   Smallest
>                                                            High   Poisson
>Sequences producing High-scoring Segment Pairs:             Score  Probability
>                                                                   P(N)
>sp|P11166|GTR1_HUMAN GLUCOSE TRANSPORTER TYPE 1, ERYTHR...   2505  0.0
>sp|P13181|GAL2_YEAST GALACTOSE TRANSPORTER (GALACTOSE P...    157  2.6e-25
The High Score is simply the sum of the BLOSUM62 scoring matrix values
over the aligned residues. The reference for the BLOSUM62 matrix is:
    Henikoff, Steven and Jorja G. Henikoff (1992). Amino acid
    substitution matrices from protein blocks. Proc. Natl. Acad.
    Sci. USA 89:10915-19.
For every pair of aligned amino acids, the value from the matrix is
looked up and added to the running total. So a higher score typically
means a better match.
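As a rough illustration of that lookup-and-sum, here is a minimal Python sketch. The dictionary holds only a handful of BLOSUM62 entries, not the full matrix, and the two four-residue segments, along with the names pair_score and segment_score, are made up for the example. It is not BLAST's own code.

```python
# A few BLOSUM62 entries (Henikoff & Henikoff 1992). This is an excerpt
# chosen for the example below, not the full 20x20 matrix.
BLOSUM62_EXCERPT = {
    ("W", "W"): 11,
    ("A", "A"): 4,
    ("L", "I"): 2,
    ("K", "R"): 2,
    ("S", "T"): 1,
}


def pair_score(a, b, matrix=BLOSUM62_EXCERPT):
    """Look up the score for one aligned pair; the matrix is symmetric."""
    if (a, b) in matrix:
        return matrix[(a, b)]
    return matrix[(b, a)]


def segment_score(query, subject, matrix=BLOSUM62_EXCERPT):
    """Sum the matrix values over an ungapped aligned segment."""
    if len(query) != len(subject):
        raise ValueError("ungapped segments must have equal length")
    return sum(pair_score(a, b, matrix) for a, b in zip(query, subject))


# Hypothetical segments: W/W + L/I + K/R + S/T = 11 + 2 + 2 + 1
print(segment_score("WLKS", "WIRT"))  # 16
```

Identical residues score highest (W/W is 11), conservative substitutions score a little, and a long run of good pairs is what pushes a score into the thousands.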
The Poisson probability is perhaps misleading in this case because the
first match is so good. The P(N) value of 0.0 is really a number less
than 1.0e-320. That means there is less than a 1 in 1.0e320 chance
that a match scoring 2505 or better would turn up by chance in a
search of the database used above. The probability is based on the
work of Karlin and Altschul:
    Karlin, Samuel and Stephen F. Altschul (1990). Methods for
    assessing the statistical significance of molecular sequence
    features by using general scoring schemes. Proc. Natl. Acad.
    Sci. USA 87:2264-68.
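To see why such a probability can print as 0.0, here is a small Python sketch of the single-segment Karlin-Altschul estimate: the expected number of chance segments scoring at least S is E = K*m*n*exp(-lambda*S), and P = 1 - exp(-E). The values of LAMBDA, K, QUERY_LEN and DB_LEN below are illustrative assumptions, not figures taken from the report above, and the Poisson P(N) that BLAST prints can combine several segment pairs, so this will not reproduce the reported 2.6e-25.

```python
import math


def expected_hits(score, query_len, db_len, lam, k):
    """E = K * m * n * exp(-lambda * S): chance segments scoring >= S."""
    return k * query_len * db_len * math.exp(-lam * score)


def p_value(score, query_len, db_len, lam, k):
    """P = 1 - exp(-E); expm1 keeps precision when E is tiny."""
    return -math.expm1(-expected_hits(score, query_len, db_len, lam, k))


# Assumed, illustrative parameters (not from the original report).
LAMBDA = 0.318        # scale parameter for the scoring system
K = 0.134             # Karlin-Altschul K constant
QUERY_LEN = 492       # hypothetical query length in residues
DB_LEN = 10_000_000   # hypothetical database size in residues

for s in (157, 2505):
    print(s, p_value(s, QUERY_LEN, DB_LEN, LAMBDA, K))
```

With these assumed parameters, the score of 157 gives a very small but representable probability, while for 2505 the exponential term underflows double precision and the result comes out as exactly 0.0.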
More information on the BLAST program and many references on scoring
matrices are available from the NCBI. Check out the information
returned from blast at ncbi.nlm.nih.gov when you send an e-mail with just
"help" as the body of the message.
--
J. Michael Cherry                       Internet: cherry at genome.stanford.edu
Project Manager                         Saccharomyces Genome Database
Stanford DNA Sequence & Tech. Center    Department of Genetics
Stanford University School of Medicine  Stanford, CA 94305-5120