Evaluating low identity scores

Iddo Friedberg idoerg at cc.huji.ac.il
Fri Jan 30 11:22:39 EST 1998


Sean Eddy wrote:
> 
> In article <6anmv7$osi at net.bio.net> Iddo Friedberg <idoerg at cc.huji.ac.il> writes:
>   >This Monte-Carlo strategy of evaluating alignment scores is being used
>   >routinely in the GCG sequence alignment programs. Basically, the idea is
>   >as you stated it. Once you make, say, 100 randomizations, you get a
>   >normal distribution of scores (vs. the random) with a given mean, and
>    ^^^^^^^^^^^^^^^^^^^

[snip...]

> And it's since been shown (papers by Karlin, Altschul, and others)
> that the reason for this is that the score distribution for local
> alignments is not a normal distribution. Z-scoring is unreliable,
> giving overestimates of how significant a score is. The score
> distribution is instead closer to an extreme value distribution, with
> a longer tail than the Gaussian. Bill Pearson's FASTA/SSEARCH software
> package is an example of a package that lets you do Monte Carlo
> estimation of alignment significance using the extreme value
> distribution.
> 

I stand corrected... the tail is indeed longer than Gaussian, hence what
follows in Sean's post.
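
To make the difference concrete, here is a minimal sketch of the Monte
Carlo idea discussed above: shuffle one sequence, collect local alignment
scores, and compare the tail probability under a Gaussian with the one
under an extreme value (Gumbel) distribution. Everything named here is an
assumption for illustration: the function names, the match/mismatch/gap
values, and the method-of-moments Gumbel fit. It is not the procedure used
by GCG or by FASTA/SSEARCH.

```python
import math
import random
import statistics


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best Smith-Waterman local alignment score (linear gap penalty).

    The scoring values are illustrative assumptions, not a real matrix.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best


def shuffled_scores(a, b, n=100, seed=0):
    """Scores of `a` against `n` random shuffles of `b`."""
    rng = random.Random(seed)
    letters = list(b)
    scores = []
    for _ in range(n):
        rng.shuffle(letters)
        scores.append(local_score(a, "".join(letters)))
    return scores


def normal_tail(x, mean, sd):
    """P(S >= x) if the scores were Gaussian (the Z-score view)."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2)))


def gumbel_tail(x, mean, sd):
    """P(S >= x) for a Gumbel fitted by the method of moments."""
    beta = sd * math.sqrt(6) / math.pi   # scale
    mu = mean - 0.5772156649 * beta      # location (Euler-Mascheroni constant)
    return -math.expm1(-math.exp(-(x - mu) / beta))


if __name__ == "__main__":
    rng = random.Random(1)
    # Two unrelated random DNA-like sequences, made up for the demo.
    seq_a = "".join(rng.choice("ACGT") for _ in range(60))
    seq_b = "".join(rng.choice("ACGT") for _ in range(60))
    scores = shuffled_scores(seq_a, seq_b, n=100)
    mean, sd = statistics.mean(scores), statistics.stdev(scores)
    x = mean + 4 * sd                    # a score four SDs above the mean
    print("Gaussian tail:", normal_tail(x, mean, sd))
    print("Gumbel tail:  ", gumbel_tail(x, mean, sd))
```

With the same mean and standard deviation, the Gumbel tail probability a
few standard deviations out is much larger than the Gaussian one, which is
the sense in which Z-scores overestimate significance.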

Iddo


-- 
Iddo Friedberg
Phone: (972)-2-6758647
email: idoerg at cc.huji.ac.il
web: http://www.ls.huji.ac.il/~idoerg
More info: finger idoerg at cc.huji.ac.il