Scoring Literal Alignments

ouzounis at embl-heidelberg.de
Mon Sep 13 12:33:09 EST 1993


In article <271rjf$5ui at terminator.rs.itd.umich.edu>, cash at geneva.csmil.umich.edu (Howard Cash) writes:
> I'd like to bring some more minds in on an ongoing discussion here:
> What makes a good match between NUCLEIC ACIDS?  I ask about DNA
> to eliminate the discussion of PAM scores and likely mutations.
> 
> If one is doing an error-tolerant comparison of strings that
> SHOULD match exactly (as is the case when doing plain text searches
> or sequencing fragment assembly) how should one balance length
> of match against percent match?  Is an exact match of 20 bases
> better than a 96% exact match of 25 bases?  I have seen heuristics
> used for this decision, but have never seen any of them backed
> up with much discussion.
> 
> If you want to mail to me directly, I will post a summary to the
> net.
> 
> -Hobie
> 
> cash at csmil.umich.edu
> 

Arratia & Waterman published a series of very interesting papers in the mid-1980s on the
statistical significance of DNA sequence similarity (4-letter alphabets).
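
To make the question above concrete, here is a rough sketch of one way to compare the two
cases by chance probability. It is not taken from those papers. It assumes independent,
equally frequent bases (match probability 0.25 per position), no gaps, and a single fixed
window; the function name and parameters are made up for illustration.

```python
from math import comb


def chance_match_prob(length, max_mismatches, p_match=0.25):
    """Probability that one fixed, ungapped window of `length` positions
    between two random sequences shows at most `max_mismatches` mismatches.

    Assumption: positions are independent and each matches with
    probability `p_match` (0.25 for four equally frequent bases).
    """
    p_mismatch = 1.0 - p_match
    return sum(
        comb(length, k) * p_match ** (length - k) * p_mismatch ** k
        for k in range(max_mismatches + 1)
    )


# Exact match of 20 bases versus 24 of 25 bases (96%) matching.
exact_20 = chance_match_prob(20, 0)
near_25 = chance_match_prob(25, 1)

print(f"20/20 by chance: {exact_20:.3e}")
print(f"24/25 by chance: {near_25:.3e}")
print(f"ratio: {exact_20 / near_25:.2f}")
```

Under these assumptions the 24-of-25 match is the less likely of the two to arise by chance,
by a factor of 1024/76, roughly 13. Multiplying either probability by the number of window
positions searched gives a crude expected count of chance hits; skewed base composition or
allowing gaps would change the numbers.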

C A Ouzounis
EMBL
Heidelberg



