[Bio-software] Re: NCBI Blast Report - Expect(2)

Kevin Karplus karplus at cheep.cse.ucsc.edu
Wed Mar 22 20:02:44 EST 2006


On 2006-03-23, jtmodel at gmail.com <jtmodel at gmail.com> wrote
[I trimmed his message a little]:
> Specifically, mixed in with the E-values I understand, there appear to
> be aberrant E-values labeled "Expect(2) n.nnn", where n is some number,
> lower than what it should be for a given Raw Score, K, m, n, Lambda and H.
>
> for example, here are two entries from the BLAST report...
> Score = 29.1 bits (55), Expect = 4.5
>  Identities = 9/25 (36%), Positives = 14/25 (56%)
>
> Query: 47      LGNKCYLQTDSIGEPPGQGPGRAWG 71
>                L + C   T+ +G PPG  P ++ G
> Sbjct: 3434216 LASSCGAATEQLGSPPGPSPSQSAG 3434240
>
>
>
>  Score = 26.2 bits (49), Expect(2) = 0.068
>  Identities = 7/16 (43%), Positives = 13/16 (81%)
>
> Query: 14      VGTILLVVAMATDHWM 29
>                V  ++ +VA++TDHW+
> Sbjct: 6868559 VSLLIWIVAISTDHWI 6868574
> ----------------------------------------------------------------------------
> Lambda     K      H
>    0.331    0.141    0.483
> -------------------------------------------------------------------------
> The Raw scores appear to be "55" and "49", yet the Expect value for the
> first HSP is MUCH higher than that for the second.
>
> The calculation of the Expect value for the first sequence is good-old
> E=(Kmn*exp(-Lam*S))/H
> For the second, I'm not really sure at all what Expect(2) means.  It
> doesn't seem to relate to the E=(Kmn*exp(-Lam*S))/H equation at all.

I believe that there is a subsequent paper explaining how multiple
hits in the same sequence are combined.  A single hit is scored as you
have figured out, but additional hits to the same sequence make the
match stronger (E-value smaller), which is presumably what the "(2)" in
Expect(2) counts.  I'm not sure, but I think the paper you want
to read is

SF Altschul, TL Madden, AA Schaffer, J Zhang, Z Zhang, W Miller and
DJ Lipman (1997).
Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs.
Nucleic Acids Research 25(17):3389-3402.

http://nar.oxfordjournals.org/cgi/content/abstract/25/17/3389
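
As a sanity check on the single-hit numbers, here is a minimal Python
sketch that converts the raw scores from the report into bit scores
using the Lambda and K values quoted above.  It assumes the standard
Karlin-Altschul form E = K*m*n*exp(-Lambda*S), with no division by H
(as far as I know, H only enters through the effective-length
correction).  The function names and the m and n values are made up
for illustration; the real effective lengths are not given in the
report excerpt.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def single_hit_evalue(raw_score, lam, k, m, n):
    """E-value for one HSP; m and n are the effective query and
    database lengths (assumed inputs, not taken from the report)."""
    return k * m * n * math.exp(-lam * raw_score)

LAMBDA, K = 0.331, 0.141   # from the report footer quoted above

for raw in (55, 49):
    print(raw, round(bit_score(raw, LAMBDA, K), 1))
# prints:
# 55 29.1
# 49 26.2

# Hypothetical effective lengths, for illustration only.
m, n = 60, 5_000_000
for raw in (55, 49):
    print(raw, single_hit_evalue(raw, LAMBDA, K, m, n))
```

The bit scores agree with the 29.1 and 26.2 shown in the report.
Whatever m and n are, the single-hit formula gives the raw score of 49
a larger E-value than the raw score of 55, so an Expect(2) of 0.068
cannot come from that HSP alone.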


------------------------------------------------------------
Kevin Karplus 	karplus at soe.ucsc.edu	http://www.soe.ucsc.edu/~karplus
Professor of Biomolecular Engineering, University of California, Santa Cruz
Undergraduate and Graduate Director, Bioinformatics
(Senior member, IEEE)	(Board of Directors & Chair of Education Committee, ISCB)
life member (LAB, Adventure Cycling, American Youth Hostels)
Effective Cycling Instructor #218-ck (lapsed)
Affiliations for identification only.


