
Evaluating low identity scores

Iddo Friedberg idoerg at cc.huji.ac.il
Fri Jan 30 11:22:39 EST 1998

Sean Eddy wrote:
> In article <6anmv7$osi at net.bio.net> Iddo Friedberg <idoerg at cc.huji.ac.il> writes:
>   >This Monte-Carlo strategy of evaluating alignment scores is being used
>   >routinely in the GCG sequence alignment programs. Basically, the idea is
>   >as you stated it. Once you make, say, 100 randomizations, you get a
>   >normal distribution of scores (vs. the random) with a given mean, and
>    ^^^^^^^^^^^^^^^^^^^


> And it's since been shown (papers by Karlin, Altschul, and others)
> that the reason for this is that the score distribution for local
> alignments is not a normal distribution. Z-scoring is unreliable,
> giving overestimates of how significant a score is. The score
> distribution is instead closer to an extreme value distribution, with
> a longer tail than the Gaussian. Bill Pearson's FASTA/SSEARCH software
> package is an example of a package that lets you do Monte Carlo
> estimation of alignment significance using the extreme value
> distribution.

I stand corrected... the tail is indeed longer than Gaussian, hence what
follows in Sean's post.
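
To make the difference concrete, here is a rough sketch of the shuffle test with both tail estimates side by side. It is not what GCG or FASTA/SSEARCH actually do: the scoring scheme (match +1, mismatch -1, linear gap -2), the function names, and the method-of-moments fit of the extreme value (Gumbel) parameters are all assumptions made for illustration. Real programs use substitution matrices, affine gaps, and more careful fitting.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329

def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score (toy scoring, linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cur.append(max(0, diag, prev[j] + gap, cur[j - 1] + gap))
        best = max(best, max(cur))
        prev = cur
    return best

def shuffled_scores(a, b, n=100, seed=0):
    """Score a against n random shuffles of b (the Monte Carlo null)."""
    rng = random.Random(seed)
    letters = list(b)
    scores = []
    for _ in range(n):
        rng.shuffle(letters)
        scores.append(local_score(a, "".join(letters)))
    return scores

def normal_tail(z):
    """P(Z >= z) under a standard normal: the Z-score reading."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def gumbel_tail(score, mean, sd):
    """P(S >= score) under a Gumbel fitted by method of moments."""
    beta = sd * math.sqrt(6) / math.pi   # scale from the sample sd
    mu = mean - EULER_GAMMA * beta       # location from the sample mean
    return -math.expm1(-math.exp(-(score - mu) / beta))

def compare(a, b, n=100, seed=0):
    """Return (observed score, Z, normal p-value, Gumbel p-value)."""
    null = shuffled_scores(a, b, n, seed)
    mean, sd = statistics.mean(null), statistics.stdev(null)
    if sd == 0:
        raise ValueError("shuffled scores have zero variance")
    observed = local_score(a, b)
    z = (observed - mean) / sd
    return observed, z, normal_tail(z), gumbel_tail(observed, mean, sd)
```

For a score five standard deviations above the shuffled mean, normal_tail(5) is on the order of 1e-7, while gumbel_tail(5, 0, 1) is on the order of 1e-3. That gap is the overestimate of significance Sean describes.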


Iddo Friedberg
Phone: (972)-2-6758647
email: idoerg at cc.huji.ac.il
web: http://www.ls.huji.ac.il/~idoerg
More info: finger idoerg at cc.huji.ac.il
