
combining distances

Andrew J. Roger roger at evol5.mbl.edu
Wed Sep 10 13:50:42 EST 1997


I wonder if someone can tell me whether the following is correct.

The estimate of distance between two sequences has 
a variance associated with it that is a function of
the sequence length.  For short sequences, therefore,
the variance of the distance estimate is large.  So
if one wanted to correct this situation one could
get more sequence and calculate the distance for
the larger sequences.  This is similar (identical) to the practice
of concatenating datasets to improve the distance 
estimate (assuming that they are all evolving in a similar manner).
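
To make the dependence on sequence length concrete, here is a minimal
sketch. It assumes the Jukes-Cantor (JC69) correction, which the post does
not name, and uses the standard large-sample (delta-method) approximation
for the variance of the JC69 distance. The function names and the example
value p = 0.2 are illustrative only.

```python
import math


def jc69_distance(p):
    """Jukes-Cantor distance from the observed proportion p of differing sites."""
    if not 0.0 <= p < 0.75:
        raise ValueError("JC69 distance is undefined for p outside [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def jc69_variance(p, n_sites):
    """Approximate variance of the JC69 distance for a sequence of n_sites sites.

    Large-sample approximation: p(1 - p) / (n_sites * (1 - 4p/3)^2).
    """
    return p * (1.0 - p) / (n_sites * (1.0 - 4.0 * p / 3.0) ** 2)


if __name__ == "__main__":
    p = 0.2  # assumed proportion of differing sites, for illustration only
    for n_sites in (100, 1000, 10000):
        print(n_sites, jc69_distance(p), jc69_variance(p, n_sites))
```

Under this approximation the variance is inversely proportional to the
number of sites, so doubling the sequence length halves the variance.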

However, in a lot of cases, I have noticed that the distances
between a pair of taxa will be separately estimated
from each individual gene and then the distances from many 
genes will be averaged somehow.

My question is, does the averaging of distances over 
many different genes DECREASE the variance of the final
distance estimate between two taxa in the same way that
concatenating the sequences would?

Intuitively, I would think that the averaging of distances
will, in the end, only lead to an average variance that is
comparable to the variance of any of the original datasets.
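
One way to probe this question is a small simulation. The sketch below is
only illustrative and rests on assumptions that are not in the post: the
Jukes-Cantor correction, genes of equal length, independent sites, and the
same true proportion of differences in every gene. All names and parameter
values are hypothetical. It compares the variance, across replicates, of a
single-gene distance, of the average of the per-gene distances, and of the
distance computed from the concatenated genes.

```python
import math
import random
import statistics


def jc69(p_hat):
    """Jukes-Cantor distance from an observed proportion of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p_hat / 3.0)


def count_diffs(n_sites, p, rng):
    """Number of differing sites, drawn as independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n_sites))


def simulate(n_genes=5, n_sites=200, p=0.2, reps=2000, seed=1):
    """Return the across-replicate variance of three distance estimates."""
    rng = random.Random(seed)
    single, averaged, concatenated = [], [], []
    for _ in range(reps):
        diffs = [count_diffs(n_sites, p, rng) for _ in range(n_genes)]
        per_gene = [jc69(k / n_sites) for k in diffs]
        single.append(per_gene[0])                   # one gene on its own
        averaged.append(statistics.mean(per_gene))   # mean of per-gene distances
        # Pool all sites first, then apply the correction once.
        concatenated.append(jc69(sum(diffs) / (n_genes * n_sites)))
    return {
        "single": statistics.variance(single),
        "averaged": statistics.variance(averaged),
        "concatenated": statistics.variance(concatenated),
    }


if __name__ == "__main__":
    for name, var in simulate().items():
        print(name, var)
```

With independent genes that share the same underlying distance, the
variance of the averaged estimate should shrink roughly in proportion to
the number of genes, much like the concatenated estimate. Genes evolving at
different rates or with different lengths would need a different setup.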

I hope someone can explain this to me!!!!

Andrew J. Roger

Marine Biological Laboratory
Woods Hole, MA

More information about the Mol-evol mailing list
