[Bio-software] Re: Pairwise alignments versus multiple alignment for distance matrix

Des Higgins via bio-soft@net.bio.net (by dazzhiggins from hotmail.com)
Wed Mar 7 09:22:14 EST 2007


"Ahmed Moustafa" <ahmed from pobox.com> wrote in message 
news:mailman.165.1172707327.5139.bio-soft from net.bio.net...
> On 2/28/2007 7:45 AM, Kevin Karplus wrote:
>> On 2007-02-28, Ahmed Moustafa <ahmed from pobox.com> wrote:
>>> But MSA methods perform the pairwise alignments anyway as an initial 
>>> step to cluster or join the sequences in the multiple alignment. Also, 
>>> MSA methods are approximations while pairwise alignment methods compute 
>>> the optimal alignments, so they give absolute values representing the 
>>> relatedness (or distances) between the sequences.
>>
>> Only very slow (and ancient) MSA methods start with full pairwise 
>> alignments.
>>
>> Multiple sequence alignments are better alignments than independent
>> pairwise ones, because they can better disambiguate alignments where
>> the signal is weak.  The evolutionary distance measures from MSAs are
>> crude, but the ones from pairwise alignments are often even
>> cruder.
>
> Regardless of the evolutionary distance, my guess is that for closely 
> related sequences, pairwise alignments would be more sensitive than 
> multiple alignment and resolve the relationships between the sequences and 
> cluster them accurately.
>

That will only begin to be approximately true in cases where the sequences 
are so close that you get the same answer by either route.

Pairwise alignment with dynamic programming is guaranteed to give the 
optimal score. That is correct. What is not correct is how you use that 
score. You wish to use it to estimate an evolutionary distance. Those 
estimates will be better if they come from the multiple alignment.
If the sequences are very distantly related (as Kevin said already), those 
estimates will be terrible if they come from the pairwise alignments.

If you take two very distant sequences and align them, there is huge 
uncertainty as to where exactly the gaps go. That uncertainty is slightly 
less if you do a multiple alignment first. You can get a guaranteed optimal 
alignment between the two sequences just by doing it pairwise, but it will 
not mean much, and it is only optimal with respect to the particular 
parameters (substitution scores and gap penalties) you choose anyway.
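
To make the point about parameters concrete, here is a minimal sketch of 
global pairwise alignment by dynamic programming (Needleman-Wunsch with a 
linear gap penalty). The function name, the default scores and the example 
sequences are illustrative assumptions, not taken from any particular 
program. Each result is optimal only for the match, mismatch and gap values 
passed in; change them and both the score and the gap placement can change.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Return (score, aligned_a, aligned_b) for a global alignment.

    Illustrative sketch: linear gap penalty, simple match/mismatch scoring.
    The result is optimal only for the parameters given.
    """
    n, m = len(a), len(b)

    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back one optimal path (ties are broken in a fixed order, so
    # other equally good alignments may exist and are not reported).
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    # Hypothetical example sequences; compare two gap penalties.
    x, y = "GATTACAGGT", "GCATGCTGT"
    for gap in (-1, -4):
        s, ax, ay = needleman_wunsch(x, y, gap=gap)
        print(f"gap={gap}: score={s}")
        print(ax)
        print(ay)
```

Note that the traceback reports only one of possibly many equally scoring 
alignments, which is one more reason not to read too much into the exact 
gap positions for distant sequences.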

Des



